I have a dataframe with two columns: ImageData and Label. Each value in the ImageData column is a 2-D array, and the dimensions vary from row to row. The Label column is boolean (True/False).
I'm trying to convert each array in the "ImageData" column to a 128x128 shape (along with a few other minor transformations), so I'm doing the following:
import cv2 as cv
import numpy as np
import pandas as pd
import tensorflow as tf

def convert_image_to_binary_image(img: np.ndarray, threshold: int = 1, max_value: int = 1) -> np.ndarray:
    # Pixels above `threshold` become `max_value`; all others become 0.
    ret, bin_img = cv.threshold(img, thresh=threshold, maxval=max_value, type=cv.THRESH_BINARY)
    bin_img = bin_img.astype('float32')
    return bin_img

def transform_img_dimension(img: np.ndarray, target_width: int = 128, target_height: int = 128) -> np.ndarray:
    img = img.astype('uint8')
    bin_image = convert_image_to_binary_image(img)
    # resize_with_pad expects a channel axis, so add one and remove it afterwards.
    bin_3dimg = tf.expand_dims(input=bin_image, axis=2)
    bin_img_reshaped = tf.image.resize_with_pad(image=bin_3dimg, target_width=target_width, target_height=target_height, method="bilinear")
    xformed_img = np.squeeze(bin_img_reshaped, axis=2)
    # return xformed_img.copy()
    return xformed_img
I'm calling apply as follows:
testDF["ImageData"] = testDF.apply(lambda row: transform_img_dimension(row["ImageData"]), axis=1)
But that raises a SettingWithCopyWarning.
I tried defining a wrapper function (instead of the lambda) as follows:
def transform_dimension(row: pd.Series, target_width: int = 128, target_height: int = 128) -> np.ndarray:
    copy_row = row.copy(deep=True)
    xformed_data = transform_img_dimension(copy_row["ImageData"], target_width=target_width, target_height=target_height)
    del copy_row
    return xformed_data
And I updated the call to apply as follows:
testDF["ImageData"] = testDF.apply(transform_dimension, axis=1)
However, this does not resolve the problem. How do I fix this warning in my case?
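For reference, here is a minimal sketch of one commonly suggested approach. It assumes testDF was itself produced by filtering or slicing another dataframe; the code that creates testDF is not shown above, so that is only an assumption. The parent dataframe `full_df` and the stand-in transform `pad_to_square` are hypothetical names used for illustration, so the sketch runs without OpenCV or TensorFlow.

```python
import numpy as np
import pandas as pd

# Hypothetical parent dataframe; the real one is not shown in the question.
full_df = pd.DataFrame({
    "ImageData": [np.ones((4, 6)), np.zeros((3, 3)), np.ones((5, 2))],
    "Label": [True, False, True],
})

def pad_to_square(img: np.ndarray, size: int = 8) -> np.ndarray:
    """Stand-in for transform_img_dimension: zero-pad img to size x size."""
    out = np.zeros((size, size), dtype="float32")
    h, w = img.shape
    out[:h, :w] = img
    return out

# Taking an explicit copy when the subset is created makes testDF an
# independent dataframe, so the later column assignment is unambiguous.
testDF = full_df[full_df["Label"]].copy()
testDF["ImageData"] = testDF["ImageData"].apply(pad_to_square)
```

If the assumption holds, the copy is taken once where the subset is created, and copying rows inside the function passed to apply would not be needed.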
Update 1:
If I rewrite it as follows, I don't get the warning:
testDF2 = testDF.copy(deep=True)
testDF2["ImageData"] = testDF.apply(lambda row: transform_img_dimension(row["ImageData"]), axis=1)
Doesn't holding two dataframes add memory overhead? Should I delete the original dataframe, testDF, now?
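As a rough illustration of the memory side, the sketch below uses a small hypothetical stand-in for testDF. It relies on the documented behaviour that a deep copy of a dataframe does not recursively copy Python objects stored in an object column, so the copied ImageData column initially refers to the same arrays as the original. The dtype conversion is only a placeholder for the real transform.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for testDF with two small "images".
testDF = pd.DataFrame({
    "ImageData": [np.ones((4, 6)), np.zeros((3, 3))],
    "Label": [True, False],
})

testDF2 = testDF.copy(deep=True)

# The object column holds references, so the copy points at the same arrays.
shares_arrays = testDF2["ImageData"].iloc[0] is testDF["ImageData"].iloc[0]

# Replace the column in the copy (placeholder transform), then drop the
# original name so its arrays can be reclaimed once nothing else refers to them.
testDF2["ImageData"] = testDF2["ImageData"].apply(lambda img: img.astype("float32"))
del testDF
```

Under these assumptions the copy itself is cheap relative to the image data, and the larger cost is holding both the original and the transformed arrays until the original dataframe is released.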