I have trained an object detection model using transfer learning from Mask R-CNN Inception ResNet V2 1024x1024, and after converting the model to TensorFlow.js I get the error: "The dtype of dict['input_tensor'] provided in model.execute(dict) must be int32, but was float32". Here are the steps I took to create the model.

1- Created the training.json, validation.json, testing.json annotation files along with the label_map.txt files from my images. I have also pre-processed the images to fit the 1024 * 1024 size.

2- Used the create_coco_tf_record.py script provided by TensorFlow to generate the tfrecord files. The only alteration I made to create_coco_tf_record.py was changing include_masks to True:

tf.flags.DEFINE_boolean(
    'include_masks', True,  # was False
    'Whether to include instance segmentations masks '

I then ran the following command from a conda prompt:

python create_coco_tf_record.py ^
--logtostderr ^
--train_image_dir=C:/model/ai_container/training ^
--val_image_dir=C:/model/ai_container/validation ^
--test_image_dir=C:/model/ai_container/testing ^
--train_annotations_file=C:/model/ai_container/training/training.json ^
--val_annotations_file=C:/model/ai_container/validation/coco_validation.json ^
--testdev_annotations_file=C:/model/ai_container/testing/coco_testing.json ^
--output_dir=C:/model/ai_container/tfrecord

3- I then trained the model. Below is the modified portion of my config file, which is based on the base Mask R-CNN config file. batch_size and num_steps are set to 1 just so I could quickly train the model and test the results.

train_config: {
  batch_size: 1
  num_steps: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 0.008
          total_steps: 200000
          warmup_learning_rate: 0.0
          warmup_steps: 5000
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint: "C:/ObjectDetectionAPI/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  label_map_path: "C:/model/ai_container/label_map.txt"
  tf_record_input_reader {
    input_path: "C:/model/ai_container/tfrecord/coco_train.record*"
  }
  load_instance_masks: true
  mask_type: PNG_MASKS
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  metrics_set: "coco_mask_metrics"
  eval_instance_masks: true
  use_moving_averages: false
  batch_size: 1
  include_metrics_per_category: false
}

eval_input_reader: {
  label_map_path: "C:/model/ai_container/label_map.txt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "C:/model/ai_container/tfrecord/coco_val.record*"
  }
  load_instance_masks: true
  mask_type: PNG_MASKS
}

I then ran the training command:

python object_detection/model_main_tf2.py ^
--pipeline_config_path=C:/ObjectDetectionAPI/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.config ^
--model_dir=C:/TensoFlow/training_process_2 ^
--alsologtostderr

4- Ran the validation command (I might be doing this wrong)

python object_detection/model_main_tf2.py ^
--pipeline_config_path=C:/ObjectDetectionAPI/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.config ^
--model_dir=C:/TensoFlow/training_process_2 ^
--checkpoint_dir=C:/TensoFlow/training_process_2 ^
--sample_1_of_n_eval_examples=1 ^
--alsologtostderr

5- Export Model

python object_detection/exporter_main_v2.py ^
--input_type="image_tensor" ^
--pipeline_config_path=C:/ObjectDetectionAPI/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.config ^
--trained_checkpoint_dir=C:/TensoFlow/training_process_2 ^
--output_directory=C:/TensoFlow/training_process_2/generatedModel

6- Convert Model to TensorFlow.js

tensorflowjs_converter ^
--input_format=tf_saved_model ^
--output_format=tfjs_graph_model  ^
--signature_name=serving_default  ^
--saved_model_tags=serve ^
C:/TensoFlow/training_process_2/generatedModel/saved_model C:/TensoFlow/training_process_2/generatedModel/jsmodel

7- I then attempt to load the model into my Angular project. I placed the converted model's bin and json files in my assets folder.

npm install @tensorflow/tfjs 

ngAfterViewInit() {
   tf.loadGraphModel('/assets/tfmodel/model1/model.json').then((model) => {
     this.model = model;
     this.model.executeAsync(tf.zeros([1, 256, 256, 3])).then((result) => {
       this.loadeModel = true;
     });
   });
}

I then get the error

    tf.min.js:17 ERROR Error: Uncaught (in promise): Error: The dtype of dict['input_tensor'] provided in model.execute(dict) must be int32, but was float32
    Error: The dtype of dict['input_tensor'] provided in model.execute(dict) must be int32, but was float32
    at F$ (util_base.js:153:11)
    at graph_executor.js:721:9
    at Array.forEach (<anonymous>)
    at e.value (graph_executor.js:705:25)
    at e.<anonymous> (graph_executor.js:467:12)
    at h (tf.min.js:17:2100)
    at Generator.<anonymous> (tf.min.js:17:3441)
    at Generator.next (tf.min.js:17:2463)
    at u (tf.min.js:17:8324)
    at o (tf.min.js:17:8527)
    at resolvePromise (zone.js:1211:31)
    at resolvePromise (zone.js:1165:17)
    at zone.js:1278:17
    at _ZoneDelegate.invokeTask (zone.js:406:31)
    at Object.onInvokeTask (core.mjs:26343:33)
    at _ZoneDelegate.invokeTask (zone.js:405:60)
    at Zone.runTask (zone.js:178:47)
    at drainMicroTaskQueue (zone.js:585:35)

I'm using Angular. I have also tried a few online solutions with no success. If anyone could give me any information on how to possibly solve this issue, I would be grateful. Thanks.

Hozeis

1 Answer

The error message "The dtype of dict['input_tensor'] provided in model.execute(dict) must be int32, but was float32" indicates that the input tensor that you are providing to your model in the model.executeAsync(tf.zeros([1, 256, 256, 3])) call has the wrong data type.

The model expects an integer tensor as input, but tf.zeros() creates a float32 tensor by default. As far as I know, models exported from the TensorFlow Object Detection API with --input_type="image_tensor" take a uint8 image with pixel values in [0, 255], and TensorFlow.js represents that input as int32. The input therefore needs integer pixel values rather than floats.

For comparison, here is a PyTorch example where the dtype is stated explicitly:

boxes = torch.zeros([num_objs,4], dtype=torch.float32)

You could therefore try adapting your code to specify the dtype (using tf.zeros from the TensorFlow.js API):

// Load your model as before
tf.loadGraphModel('/assets/tfmodel/model1/model.json').then((model) => {
    this.model = model;

    // Create a tensor of zeros and preprocess it as required by your model
    let inputTensor = tf.zeros([1, 256, 256, 3], 'int32');  // dtype is passed as a string in TensorFlow.js
    inputTensor = inputTensor.mul(255).toInt();  // rescale to [0, 255] and convert to int32

    this.model.executeAsync(inputTensor).then((result) => {
        this.loadeModel = true;
    });
});

Note: This is just an example. You need to adjust the preprocessing to match what your specific model requires.
This also assumes the cause is not something simpler, like a Python 2 to 3 issue.
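
As a rough sketch of what such preprocessing might look like for a real image rather than zeros: the helper below is hypothetical (rgbaToInt32Rgb is not part of TensorFlow.js) and assumes the pixels arrive as RGBA bytes, as they do from a canvas getImageData() call. It drops the alpha channel and returns int32 values in [0, 255], laid out for a [1, height, width, 3] tensor.

```typescript
// Hypothetical helper (not a TensorFlow.js API): convert RGBA bytes, e.g. from
// CanvasRenderingContext2D.getImageData(...).data, into int32 RGB values
// in [0, 255], ordered row by row for a [1, height, width, 3] tensor.
function rgbaToInt32Rgb(
  rgba: Uint8ClampedArray,
  width: number,
  height: number
): Int32Array {
  const pixelCount = width * height;
  if (rgba.length !== pixelCount * 4) {
    throw new Error(`Expected ${pixelCount * 4} RGBA bytes, got ${rgba.length}`);
  }
  const rgb = new Int32Array(pixelCount * 3);
  for (let pixel = 0; pixel < pixelCount; pixel++) {
    rgb[pixel * 3] = rgba[pixel * 4];         // red
    rgb[pixel * 3 + 1] = rgba[pixel * 4 + 1]; // green
    rgb[pixel * 3 + 2] = rgba[pixel * 4 + 2]; // blue (alpha at +3 is skipped)
  }
  return rgb;
}
```

The result could then be wrapped with tf.tensor4d(rgb, [1, height, width, 3], 'int32'). If the image is already in an img or canvas element, tf.browser.fromPixels() should give an int32 [height, width, 3] tensor directly, which would only need a batch dimension added.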


From the comments:

This indeed fixed the error, but now I get another error:

"The new shape (NaN,0) has NaN elements and the old shape (0) has 0 elements"

I also noticed that if I check this.model.inputs[0].shape prior to executeAsync, I get [1, -1, -1, 3]. I trained my model with images 1024 x 1024. Should my shape be [1, 1024, 1024, 3]?

The input shape should generally match the shape of the images used for training. If you trained your model with images of size 1024x1024, the input you pass should indeed have the shape [1, 1024, 1024, 3].

The shape [1, -1, -1, 3] indicates that the model can accept any size of image, because -1 is often used as a placeholder for "size inference" in TensorFlow. However, this flexibility in the model's input size might not actually work in practice if the model architecture includes layers that require a fixed input size, which is usually the case for CNN-based models like Mask R-CNN.
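
To make the meaning of -1 concrete, here is a minimal sketch of how a signature shape with unspecified dimensions can be compared against a concrete tensor shape. The function name isShapeCompatible is made up for illustration and is not a TensorFlow.js API.

```typescript
// Hypothetical helper: check a concrete tensor shape against a signature
// shape in which -1 (or null) marks a dimension of unspecified size.
function isShapeCompatible(
  signature: Array<number | null>,
  actual: number[]
): boolean {
  if (signature.length !== actual.length) {
    return false; // different rank, e.g. a missing batch dimension
  }
  return signature.every(
    (dim, i) => dim === null || dim === -1 || dim === actual[i]
  );
}
```

Under this check, [1, -1, -1, 3] accepts both [1, 256, 256, 3] and [1, 1024, 1024, 3], which is why the declared signature alone does not tell you which image size the model actually handles well.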

The error message "The new shape (NaN,0) has NaN elements and the old shape (0) has 0 elements" indicates that there is an issue with reshaping the input tensor to the expected input shape of the model. This could be due to the model trying to reshape the input tensor to an incorrect shape, possibly because of the -1 values in the input shape.

Try the following steps:

  1. Create an input tensor of the correct shape (i.e., [1, 1024, 1024, 3]) when calling model.executeAsync().
    For example:

    let inputTensor = tf.zeros([1, 1024, 1024, 3]);
    inputTensor = inputTensor.mul(255).toInt();  // adjust this line as needed based on your model's expected input
    
    this.model.executeAsync(inputTensor).then((result) => {
        this.loadeModel = true;
    });
    
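  2. If the model still throws an error, log inputTensor.shape and this.model.inputs[0].shape just before the call to confirm that the tensor you pass has the shape you intend. As far as I know, TensorFlow.js has no function for changing the input shape of a loaded graph model (there is no tf.util.setShape()), so bring the image to the expected size instead, for example with tf.image.resizeBilinear():

    inputTensor = tf.image.resizeBilinear(imageTensor, [1024, 1024]).toInt();  // imageTensor: your [1, height, width, 3] image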
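
To see which dtype and shape the converted model declares, one option is to inspect the model.json file written by tensorflowjs_converter. The sketch below assumes the file has a top-level "signature" section with "inputs" entries carrying "dtype" and "tensorShape.dim[].size" fields; that layout is an assumption, so check it against your own file. The describeInputs function and the example object are made up for illustration.

```typescript
// Assumed (unverified) layout of the "signature" section of model.json.
interface SignatureTensor {
  name?: string;
  dtype?: string; // a string such as "DT_UINT8" or "DT_FLOAT" in the assumed layout
  tensorShape?: { dim?: Array<{ size: string }> };
}

interface ModelJson {
  signature?: { inputs?: Record<string, SignatureTensor> };
}

// Hypothetical helper: summarize each declared input as "name: dtype, shape".
function describeInputs(modelJson: ModelJson): string[] {
  const inputs: Record<string, SignatureTensor> =
    (modelJson.signature && modelJson.signature.inputs) || {};
  return Object.keys(inputs).map((key) => {
    const tensor = inputs[key];
    const dims = ((tensor.tensorShape && tensor.tensorShape.dim) || []).map(
      (d) => Number(d.size)
    );
    const dtype = tensor.dtype || "unknown";
    return `${key}: dtype=${dtype}, shape=[${dims.join(", ")}]`;
  });
}

// Hand-written example object following the assumed layout (not real output).
const exampleModelJson: ModelJson = {
  signature: {
    inputs: {
      input_tensor: {
        name: "input_tensor:0",
        dtype: "DT_UINT8",
        tensorShape: {
          dim: [{ size: "1" }, { size: "-1" }, { size: "-1" }, { size: "3" }],
        },
      },
    },
  },
};

console.log(describeInputs(exampleModelJson));
```

In an Angular project the real file could be fetched from the assets folder, parsed as JSON and passed to the same function.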
    
VonC
  • This indeed fixed the error, but now I get another one ``The new shape (NaN,0) has NaN elements and the old shape (0) has 0 elements`` – Hozeis May 30 '23 at 00:31
  • I also noticed that if I check ``this.model.inputs[0].shape`` prior to ``executeAsync`` I get ``[1, -1, -1, 3]``. I trained my model with images 1024 x 1024. Should my shape be ``[1, 1024, 1024, 3]``? – Hozeis May 30 '23 at 00:48
  • @Hozeis I have edited the answer to address your comments. – VonC May 30 '23 at 18:22