How to create outputs for the key points of bounding boxes on an image in a neural network in Python

By Ritesh Sahu - April 29, 2021

I've built a convolutional neural network that should classify a single object in a 220×220 image. There are only two object classes: dogs and cats. I have 3385 '.jpg' pictures of dogs and cats on my PC. For each picture I have the class of the object (1 or 0) and the coordinates of two corners of the bounding box around the object, normalized to the range 0.0 to 1.0 (0.0 is the left/top corner of the image and 1.0 is the right/bottom corner). The network must therefore have 5 outputs: 1 for the class and 4 for the two corners (x1, y1, x2, y2).

I have 3047 training images in 'x_train' with shape (3047, 220, 220, 3) and their labels in 'y_train' with shape (3047, 5), plus 338 test images in 'x_test' with shape (338, 220, 220, 3) and test labels in 'y_test'. For example, y_train[0] is array([1. , 0.555 , 0.18 , 0.70833333, 0.395 ]).

However, the network's predictions look like this: prediction[0] is array([1., 0., 0., 0., 0.], dtype=float32), and np.sum(prediction) is 338.0, which is exactly the number of test images. During training the reported values went from 'loss: 4.8534 - accuracy: 0.5938' to 'loss: 533785.6875 - accuracy: 1.0000'. Could anyone advise me on what I am doing wrong?
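One likely reading of the np.sum(prediction) == 338.0 symptom is that a softmax over all 5 outputs forces every prediction row to sum to 1, so the total equals the number of test images regardless of what the network has learned. The sketch below illustrates this with plain NumPy; the random logits are a made-up stand-in for the network's raw outputs, not data from the question.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, stabilised by subtracting each row's maximum."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical raw outputs for 338 test images with 5 output units each.
rng = np.random.default_rng(0)
logits = rng.normal(size=(338, 5))
probs = softmax(logits)

# Every row sums to 1, so the grand total is the number of rows.
print(probs.sum(axis=1)[:3])
print(probs.sum())

# The label vector from the question is not a probability distribution:
# its elements sum to well over 1, so a softmax output can never match it.
target = np.array([1.0, 0.555, 0.18, 0.70833333, 0.395])
print(target.sum())
```

If this reading is right, the class value and the four coordinates need outputs that are not tied together by a shared softmax.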

#Summary:
#Model: "sequential_1"
#conv2d_2 (Conv2D)            (None, 220, 220, 32)      896       
#max_pooling2d_2 (MaxPooling2 (None, 110, 110, 32)      0 
#conv2d_3 (Conv2D)            (None, 110, 110, 64)      18496     
#max_pooling2d_3 (MaxPooling2 (None, 55, 55, 64)        0         
#flatten_1 (Flatten)          (None, 193600)            0         
#dense_2 (Dense)              (None, 128)               24780928  
#dense_3 (Dense)              (None, 5)                 645 
#Total params: 24,800,965
#Trainable params: 24,800,965
#Non-trainable params: 0
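The parameter counts in the summary can be reproduced by hand, which also shows that almost all weights sit in the first Dense layer after Flatten. This is a small arithmetic sketch; the helper names conv2d_params and dense_params are made up for illustration and are not Keras functions.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return inputs * units + units

# Flatten output: 55 x 55 feature map with 64 channels.
flat = 55 * 55 * 64

counts = {
    'conv2d_2': conv2d_params(3, 3, 3, 32),
    'conv2d_3': conv2d_params(3, 3, 32, 64),
    'dense_2': dense_params(flat, 128),
    'dense_3': dense_params(128, 5),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f'{name}: {n:,}')
print(f'total: {total:,}')
```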

import os
import cv2
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras.layers import Dense, Flatten, Dropout, Conv2D, MaxPooling2D
import time

model = keras.Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(220, 220, 3)),
    MaxPooling2D((2, 2), strides=2),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    MaxPooling2D((2, 2), strides=2),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(5, activation='softmax')
])

model.summary()  # summary() prints the table itself and returns None
model.compile(optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'])
his = model.fit(x_train, y_train, batch_size=32, epochs=2, validation_split=0.5)
model.evaluate(x_test, y_test)
prediction = model.predict(x_test)
print(np.sum(prediction))
# >>> 338.0
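One common way to handle a class label and box coordinates together is to give the network two output heads, each with its own activation and loss. The following is a minimal sketch under assumptions, not a confirmed fix for this exact dataset: the names build_two_head_model, split_labels, class_out and box_out are hypothetical, and binary cross-entropy for the class plus mean squared error for the box is only one reasonable choice of losses. It assumes column 0 of the labels is the class and columns 1 to 4 are x1, y1, x2, y2, as described in the question.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_two_head_model(input_shape=(220, 220, 3)):
    """Same convolutional trunk as above, with separate class and box heads."""
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inputs)
    x = layers.MaxPooling2D((2, 2), strides=2)(x)
    x = layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x = layers.MaxPooling2D((2, 2), strides=2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation='relu')(x)

    # One sigmoid unit for the binary class (dog vs. cat).
    class_out = layers.Dense(1, activation='sigmoid', name='class_out')(x)
    # Four independent sigmoid units for coordinates normalised to [0, 1].
    box_out = layers.Dense(4, activation='sigmoid', name='box_out')(x)

    model = keras.Model(inputs=inputs, outputs=[class_out, box_out])
    # Losses and metrics are listed in the same order as the outputs.
    model.compile(optimizer='adam',
                  loss=['binary_crossentropy', 'mse'],
                  metrics=[['accuracy'], ['mae']])
    return model

def split_labels(y):
    """Split (N, 5) labels into a (N, 1) class target and a (N, 4) box target."""
    return y[:, :1], y[:, 1:]

# Usage, assuming x_train and y_train as described in the question:
# model = build_two_head_model()
# y_class, y_box = split_labels(y_train)
# model.fit(x_train, [y_class, y_box], batch_size=32, epochs=2,
#           validation_split=0.5)
# pred_class, pred_box = model.predict(x_test)
```

With separate heads the outputs no longer have to sum to 1, and the accuracy metric is computed only on the class output, where it is meaningful.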

from Recent Questions - Stack Overflow https://ift.tt/2PzC2j2
