I am trying to calculate the intersection over union (IoU) of predicted bounding boxes with their corresponding ground-truth boxes. The problem is that the model crops the image in order to locate the object (I cannot change that). So the images I get have different sizes than the originals, and the coordinates of the predicted bounding boxes are expressed with respect to the cropped image, not the original. What is the best way to calculate the intersection over union with the ground truth in this situation? I tried rescaling the cropped image back to the original size (and rescaling the predicted coordinates accordingly), but some boxes become so small that they collapse into a line (ymin and ymax end up with the same value). So what should I do, or how should I proceed?
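For context, this is roughly how I compute the IoU, assuming both boxes are `[xmin, ymin, xmax, ymax]` in the same coordinate system (a minimal sketch, not my exact code). Note that a degenerate "line" box has zero area, so its IoU with anything is 0:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as [xmin, ymin, xmax, ymax]."""
    # intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou([0, 0, 10, 10], [5, 0, 15, 10]))  # overlap 50, union 150
print(iou([0, 0, 10, 0], [0, 0, 10, 10]))   # collapsed box -> IoU 0.0
```

This is why the collapsed boxes are a problem: once ymin == ymax, the predicted box contributes zero area and the IoU is always 0.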
I edited my question for clarification: there is no fixed size for the original images nor for the cropped ones. Each original image contains tables, and the model crops those tables. The ground truth is the coordinates of the table cells in the original image, and the predicted boxes are the coordinates of the cells in each cropped table. I opted for linear interpolation to map the predicted coordinates back to where they would be in the original image, but because the original image is small (for example 594 x 845), the computed coordinates become very small. For example, a predicted box of [696, 0, 1414, 48] becomes [414, 0, 888, 0] after linear interpolation, so it is now a line, not a rectangle. The image produced by the model in this case has size 1000 x 1048.
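The rescaling step I'm doing is essentially the following sketch: scale each coordinate by the ratio of target to source dimensions and round to integer pixels. The target size in the demo call is a made-up value (a very short table region) just to show how a 48-px-tall box collapses when the vertical scale factor is small:

```python
def rescale_box(box, src_size, dst_size):
    """Map [xmin, ymin, xmax, ymax] from src (cropped) pixel space
    to dst (original) pixel space by pure linear scaling.
    src_size and dst_size are (width, height)."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return [round(box[0] * sx), round(box[1] * sy),
            round(box[2] * sx), round(box[3] * sy)]

# Hypothetical: cropped table is 1000 x 1048 but maps to a region
# only 10 px tall, so ymax = 48 rounds down to 0 -> a line, not a box.
print(rescale_box([696, 0, 1414, 48], (1000, 1048), (594, 10)))
# -> [413, 0, 840, 0]
```

The collapse happens whenever the scaled height (or width) rounds below 1 px, which is exactly the symptom I'm seeing.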