I am using Microsoft OnnxRuntime to detect and classify objects in images and I want to apply it to real-time video. To do that, I have to convert each frame into an OnnxRuntime Tensor. Right now I have implemented a method that takes around 300ms:
// Requires: using System.Drawing; using Microsoft.ML.OnnxRuntime.Tensors;
public Tensor<float> ConvertImageToFloatTensor(Bitmap image)
{
    // Create the tensor with the appropriate dimensions for the NN.
    // The layout is [batch, height, width, channels], matching the [0, y, x, c] indexing below.
    Tensor<float> data = new DenseTensor<float>(new[] { 1, image.Height, image.Width, 3 });
    // Iterate over the bitmap width and height and copy each pixel
    for (int x = 0; x < image.Width; x++)
    {
        for (int y = 0; y < image.Height; y++)
        {
            // GetPixel is called once per pixel
            Color color = image.GetPixel(x, y);
            // Normalize each channel from [0, 255] to [0, 1]
            data[0, y, x, 0] = color.R / 255f;
            data[0, y, x, 1] = color.G / 255f;
            data[0, y, x, 2] = color.B / 255f;
        }
    }
    return data;
}
I need this code to run as fast as possible, since I am drawing the detector's output bounding boxes as a layer on top of the video. Does anyone know a faster way of doing this conversion?
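One direction that is often suggested for this kind of per-pixel loop is to lock the bitmap once with `Bitmap.LockBits`, copy the raw bytes out in a single `Marshal.Copy` call, and fill a flat `float[]` that is then wrapped in a `DenseTensor<float>`. The following is only a minimal sketch under some assumptions, not a measured or definitive solution: the class name `FrameConverter` and the method name `ConvertImageToFloatTensorFast` are made up for illustration, the bitmap is assumed to be readable as 24bpp RGB with a positive stride, and the model is assumed to expect the same NHWC, RGB, [0, 1] input as the method above.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using Microsoft.ML.OnnxRuntime.Tensors;

// Hypothetical helper class, named only for this example.
public static class FrameConverter
{
    public static Tensor<float> ConvertImageToFloatTensorFast(Bitmap image)
    {
        int width = image.Width;
        int height = image.Height;

        // Lock the whole bitmap once and request 24bpp data.
        // In this format each pixel is stored in memory as B, G, R.
        BitmapData bmpData = image.LockBits(
            new Rectangle(0, 0, width, height),
            ImageLockMode.ReadOnly,
            PixelFormat.Format24bppRgb);

        // Stride is the number of bytes per row, including padding.
        // Assumption: the stride is positive (top-down bitmap).
        int stride = bmpData.Stride;
        byte[] pixels = new byte[stride * height];
        try
        {
            // Copy all pixel bytes into managed memory in one call.
            Marshal.Copy(bmpData.Scan0, pixels, 0, pixels.Length);
        }
        finally
        {
            image.UnlockBits(bmpData);
        }

        // Flat buffer in [height, width, channel] order (batch size is 1).
        float[] values = new float[height * width * 3];
        int dst = 0;
        for (int y = 0; y < height; y++)
        {
            int src = y * stride; // start of row y, skipping any row padding
            for (int x = 0; x < width; x++)
            {
                values[dst++] = pixels[src + 2] / 255f; // R
                values[dst++] = pixels[src + 1] / 255f; // G
                values[dst++] = pixels[src] / 255f;     // B
                src += 3;
            }
        }

        // Wrap the buffer without another copy; dimensions are [1, H, W, 3].
        return new DenseTensor<float>(values, new[] { 1, height, width, 3 });
    }
}
```

It would be called the same way as the original method, for example `Tensor<float> input = FrameConverter.ConvertImageToFloatTensorFast(frame);`, where `frame` is the current video frame as a `Bitmap`. The idea is that the managed-to-GDI+ call happens once per frame instead of once per pixel, and that the tensor indexer is bypassed in the inner loop; how much time this actually saves would need to be measured on the real frames.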