I am working on an embedded deep-learning inference C++ project using TensorRT. For my model it is necessary to subtract the mean image.
The API that I'm using lets me define a mean image with the following data structure for RGB images:
uint8_t *data[DW_MAX_IMAGE_PLANES]; // raw image data
size_t pitch; // pitch of the image in bytes
uint32_t height; // height of the image in px
uint32_t width; // image width in px
uint32_t planeCount; // plane count of the image
So far I have found the library LodePNG, which seems quite useful for this task. It can load PNGs in just a few lines:
// Load file and decode image. Request packed RGB explicitly,
// since LodePNG's default output format is RGBA.
std::vector<unsigned char> image;
unsigned width, height;
unsigned error = lodepng::decode(image, width, height, filename, LCT_RGB, 8);
The question now is: how do I convert the std::vector<unsigned char> to uint8_t *[DW_MAX_IMAGE_PLANES], and how do I calculate the pitch and planeCount values?
Since I'm working with RGB images, DW_MAX_IMAGE_PLANES equals 3.