tensorflow.python.framework.errors_impl.InvalidArgumentError: Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero #281

ximik666 · 2019-08-09T20:42:18Z

Hello. I am trying to train the model from the HoloLens example, but this error appears during training. I downloaded the hololens dataset and used this code:

from imageai.Detection.Custom import DetectionModelTrainer

trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory="hololens")
trainer.setTrainConfig(object_names_array=["hololens"], batch_size=1, num_experiments=20, train_from_pretrained_model="pretrained-yolov3.h5") #download pre-trained model via https://github.com/OlafenwaMoses/ImageAI/releases/download/essential-v4/pretrained-yolov3.h5
trainer.trainModel()

I downloaded pretrained-yolov3.h5 and put it in the example directory. What could be the problem?

Using TensorFlow backend.
Generating anchor boxes for training images and annotation...
Average IOU for 9 anchors: 0.88
Anchor Boxes generated.
Detection configuration saved in hololens/json/detection_config.json
Training on: ['hololens']
Training with Batch Size: 1
Number of Experiments: 20
WARNING:tensorflow:From /usr/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/lib/python3.7/site-packages/imageai-2.1.3-py3.7.egg/imageai/Detection/Custom/yolo.py:24: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Training with transfer learning from pretrained Model
/usr/lib/python3.7/site-packages/keras/callbacks.py:1065: UserWarning: epsilon argument is deprecated and will be removed, use min_delta instead.
warnings.warn('epsilon argument is deprecated and '
WARNING:tensorflow:From /usr/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 1/20
Traceback (most recent call last):
File "custom_detection_train.py", line 7, in <module>
trainer.trainModel()
File "/usr/lib/python3.7/site-packages/imageai-2.1.3-py3.7.egg/imageai/Detection/Custom/__init__.py", line 286, in trainModel
max_queue_size=4
File "/usr/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/usr/lib/python3.7/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
class_weight=class_weight)
File "/usr/lib/python3.7/site-packages/keras/engine/training.py", line 1217, in train_on_batch
outputs = self.train_function(ins)
File "/usr/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/usr/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/usr/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
[[{{node replica_0/model_1/yolo_layer_1/Reshape}}]]
[[{{node training/Adam/gradients/replica_0/model_1/bnorm_25/FusedBatchNorm_grad/FusedBatchNormGrad}}]]
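
For anyone puzzled by the message itself: a reshape with one `-1` dimension infers that dimension by dividing the element count by the product of the other sizes. The sketch below mimics that rule in plain Python. `infer_reshape` is a hypothetical helper written for illustration, not TensorFlow's implementation, and it assumes the rule behaves as the error text describes.

```python
from functools import reduce
from operator import mul


def infer_reshape(num_elements, shape):
    """Resolve a single -1 in `shape` the way the error message describes.

    Hypothetical illustration only; this is not TensorFlow code.
    """
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension may be -1")

    # Product of the explicitly specified sizes (1 if there are none).
    known = reduce(mul, (dim for dim in shape if dim != -1), 1)

    if not unknown:
        if known != num_elements:
            raise ValueError("element count does not match the requested shape")
        return tuple(shape)

    if known == 0:
        # The missing size would be num_elements / 0, which cannot be inferred.
        raise ValueError("cannot infer the missing size: a specified size is zero")
    if num_elements % known != 0:
        raise ValueError("element count is not divisible by the specified sizes")

    resolved = list(shape)
    resolved[unknown[0]] = num_elements // known
    return tuple(resolved)


print(infer_reshape(12, [-1, 4]))  # a normal tensor: the -1 becomes 3
print(infer_reshape(0, [-1, 4]))   # an empty tensor is fine if the other sizes are non-zero
try:
    infer_reshape(0, [0, -1, 4])   # empty tensor plus a zero size: nothing to infer from
except ValueError as err:
    print("error:", err)
```

In the log above the failing node is `yolo_layer_1/Reshape`, which would be consistent with that layer receiving an empty tensor, although nothing in this thread confirms the cause.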


OlafenwaMoses · 2019-08-10T09:34:47Z

I will review this. In the meantime:

Why are you using a batch size of 1 and not 2, 4, etc.?
What version of TensorFlow do you have installed?

ximik666 · 2019-08-10T15:43:13Z

I use tensorflow-gpu 1.13.2.
I have only 2GB of GPU memory, and when I run the script with a batch size of 2 or more, I get tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1,44,44,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
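
A quick way to read an OOM message like this one is to work out how large the failing allocation actually is. The sketch below pulls the `shape[...]` part out of the message and multiplies by 4 bytes per element, assuming `type float` means 32-bit floats. `oom_tensor_bytes` is a hypothetical helper, not part of TensorFlow or ImageAI.

```python
import re


def oom_tensor_bytes(message, bytes_per_element=4):
    """Return the size in bytes of the tensor named in an OOM message.

    Assumes 4 bytes per element (float32) unless told otherwise.
    """
    match = re.search(r"shape\[([\d,\s]+)\]", message)
    if match is None:
        raise ValueError("no shape[...] found in message")
    count = 1
    for dim in match.group(1).split(","):
        count *= int(dim)
    return count * bytes_per_element


msg = "OOM when allocating tensor with shape[1,44,44,256] and type float"
size = oom_tensor_bytes(msg)
print(f"{size} bytes = {size / 2**20:.2f} MiB")  # 1982464 bytes = 1.89 MiB
```

The failing tensor is under 2 MiB, which suggests the 2GB card was already almost full when this allocation was attempted, rather than this one tensor being too large on its own.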

ximik666 · 2019-08-10T16:00:37Z

If I try this example, everything works:
from imageai.Prediction.Custom import ModelTraining

model_trainer = ModelTraining()
model_trainer.setModelTypeAsResNet()
model_trainer.setDataDirectory("idenprof")
model_trainer.trainModel(num_objects=10, num_experiments=200, enhance_data=True, batch_size=2, show_network_summary=True)

OlafenwaMoses · 2019-08-10T20:12:40Z

I would advise using Google Colab for this training, as it offers 15GB of GPU memory. Training an object detection model is very compute-intensive, and a batch size of 1 is not viable.

ximik666 · 2019-08-11T09:23:03Z

OK, now I am using Google Colab with the hololens dataset and tensorflow-gpu 1.13, and I get this error:

from imageai.Detection.Custom import DetectionModelTrainer
trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory="hololens")
trainer.setTrainConfig(object_names_array=["hololens"], batch_size=2, num_experiments=200, train_from_pretrained_model="pretrained-yolov3.h5")
trainer.trainModel()

Generating anchor boxes for training images and annotation...
Average IOU for 9 anchors: 0.78
Anchor Boxes generated.
Detection configuration saved in hololens/json/detection_config.json
Training on: ['hololens']
Training with Batch Size: 2
Number of Experiments: 200
Training with transfer learning from pretrained Model

/usr/local/lib/python3.6/dist-packages/keras/callbacks.py:1065: UserWarning: epsilon argument is deprecated and will be removed, use min_delta instead.
warnings.warn('epsilon argument is deprecated and '

Epoch 1/200

ResourceExhaustedError Traceback (most recent call last)

in ()
5 trainer.setDataDirectory(data_directory="hololens")
6 trainer.setTrainConfig(object_names_array=["hololens"], batch_size=2, num_experiments=200, train_from_pretrained_model="pretrained-yolov3.h5")
----> 7 trainer.trainModel()

8 frames

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
526 None, None,
527 compat.as_text(c_api.TF_Message(self.status.status)),
--> 528 c_api.TF_GetCode(self.status.status))
529 # Delete the underlying status object from memory otherwise it stays alive
530 # as there is a reference to status from this from the traceback due to

ResourceExhaustedError: OOM when allocating tensor with shape[1,416,416,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node training_2/Adam/gradients/zeros_573-0-1-TransposeNCHWToNHWC-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node training_2/Adam/gradients/replica_1_2/model_7/bnorm_38/cond/FusedBatchNorm_grad/FusedBatchNormGrad}}]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

What's wrong?

Meulen92 · 2019-08-19T12:57:05Z

Seems like you ran out of video memory to support a batch_size of 2 on this specific dataset. How much GPU memory do you have available?

Unfortunately, a batch_size of 1 will never work.
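
One way to answer the memory question is to ask `nvidia-smi`, which ships with the NVIDIA driver. A minimal sketch, assuming `nvidia-smi` is on the PATH and reports plain integers in MiB; `parse_gpu_memory` and `query_gpu_memory` are hypothetical helper names, not ImageAI or TensorFlow functions.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse lines of 'total, used' (MiB), one line per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        total, used = (int(field) for field in line.split(","))
        gpus.append({"total_mib": total, "used_mib": used, "free_mib": total - used})
    return gpus


def query_gpu_memory():
    """Return per-GPU memory figures, or an empty list if they are unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    command = [
        "nvidia-smi",
        "--query-gpu=memory.total,memory.used",
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(
            command, stdout=subprocess.PIPE, universal_newlines=True, check=True
        )
        return parse_gpu_memory(result.stdout)
    except (subprocess.CalledProcessError, OSError, ValueError):
        # The tool failed or printed something other than integers.
        return []


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {gpu['free_mib']} MiB free of {gpu['total_mib']} MiB")
```

Running this right before training shows how much memory is already in use by other processes or by an earlier notebook session that was never restarted.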

Yejing-Lai · 2019-11-07T05:58:55Z

Hello, I have encountered a similar error.

I changed the batch size to 2 and to 4, and I get the same error either way. Have you solved this problem?

rashminair1986 · 2020-04-16T18:00:08Z

I don't understand how to resolve this error, even after changing the batch size to 2.
It works well when the batch size is 1, but the loss computation is wrong.

Generating anchor boxes for training images and annotation...
Average IOU for 9 anchors: 0.87
Anchor Boxes generated.
Detection configuration saved in fishes1_2\json\detection_config.json
Training on: ['Dascyllus', 'Myripristis', 'Plectroglyphidodon']
Training with Batch Size: 2
Number of Experiments: 100
WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\imageai\Detection\Custom\yolo.py:24: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\imageai\Detection\Custom\yolo.py:149: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

Training with transfer learning from pretrained Model
D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\callbacks\callbacks.py:998: UserWarning: epsilon argument is deprecated and will be removed, use min_delta instead.
warnings.warn('epsilon argument is deprecated and '
WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py:431: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py:438: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\callbacks\tensorboard_v1.py:200: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

WARNING:tensorflow:From D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\callbacks\tensorboard_v1.py:203: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

Epoch 1/100
1/2592 [..............................] - ETA: 25:52:34 - loss: 128.3459 - yolo_layer_1_loss: 19.1331 - yolo_layer_2_
2/2592 [..............................] - ETA: 13:25:01 - loss: 129.0509 - yolo_layer_1_loss: 19.6597 - yolo_layer_2_
3/2592 [..............................] - ETA: 9:12:23 - loss: 128.5705 - yolo_layer_1_loss: 19.6130 - yolo_layer_2_l
4/2592 [..............................] - ETA: 7:07:19 - loss: 127.7652 - yolo_layer_1_loss: 19.5261 - yolo_layer_2_loss: 37.1629 - yolo_layer_3_loss: 71.0762
Traceback (most recent call last):
File "fish_train.py", line 7, in <module>
trainer.trainModel()
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\imageai\Detection\Custom\__init__.py", line 291, in trainModel
max_queue_size=8
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1732, in fit_generator
initial_epoch=initial_epoch)
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\engine\training_generator.py", line 220, in fit_generator
reset_metrics=False)
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1514, in train_on_batch
outputs = self.train_function(ins)
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\keras\backend.py", line 3292, in __call__
run_metadata=self.run_metadata)
File "D:\deeplearningvideos\anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1458, in __call__
run_metadata_ptr)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[1024,512,3,3] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node training/Adam/gradients/replica_1/model_1/conv_80/convolution_grad/Conv2DBackpropInput}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[training/Adam/gradients/replica_0/model_1/bnorm_32/cond/FusedBatchNorm_grad/FusedBatchNormGrad/_5957]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[1024,512,3,3] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node training/Adam/gradients/replica_1/model_1/conv_80/convolution_grad/Conv2DBackpropInput}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

IshinEV mentioned this issue Aug 12, 2019

Custom object detection ... #283

Closed
