
I have a dataframe with two columns: ImageData and Label. Each entry in the ImageData column is a 2-D array, and the dimensions vary from row to row. The Label column is boolean (True/False).

I'm trying to convert each entry in the "ImageData" column to a 128x128 shape (along with a few other minor transformations), so I'm doing the following:

import cv2 as cv
import numpy as np
import pandas as pd
import tensorflow as tf

def convert_image_to_binary_image(img: np.ndarray, threshold: int = 1, max_value: int = 1) -> np.ndarray:
    ret, bin_img = cv.threshold(img, thresh=threshold, maxval=max_value, type=cv.THRESH_BINARY)
    bin_img = bin_img.astype('float32')
    return bin_img

def transform_img_dimension(img: np.ndarray, target_width: int = 128, target_height: int = 128) -> np.ndarray:
    img = img.astype('uint8')

    bin_image = convert_image_to_binary_image(img)
    bin_3dimg = tf.expand_dims(input=bin_image, axis=2)
    bin_img_reshaped = tf.image.resize_with_pad(image=bin_3dimg, target_width=target_width, target_height=target_height, method="bilinear")

    xformed_img = np.squeeze(bin_img_reshaped, axis=2)

    # return xformed_img.copy()
    return xformed_img

I'm calling apply as follows:

testDF["ImageData"] = testDF.apply(lambda row: transform_img_dimension(row["ImageData"]), axis=1)

But that raises a SettingWithCopyWarning.

I tried defining a wrapper function (instead of the lambda) as follows:

def transform_dimension(row: pd.Series, target_width: int = 128, target_height: int = 128) -> np.ndarray:
    copy_row = row.copy(deep=True)
    xformed_data = transform_img_dimension(copy_row["ImageData"], target_width=target_width, target_height=target_height)
    del copy_row
    return xformed_data

And I updated the call to apply as follows:

testDF["ImageData"] = testDF.apply(transform_dimension, axis=1)

However, this does not resolve the problem. How do I fix this warning in my case?

Update 1:

If I rewrite it as follows, I don't get the warning:

testDF2 = testDF.copy(deep=True)
testDF2["ImageData"] = testDF.apply(lambda row: transform_img_dimension(row["ImageData"]), axis=1)

Isn't holding two dataframes a memory overhead now? Should I delete the original dataframe, testDF?
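On the memory question, a small sketch may help. It assumes a toy stand-in for testDF (the arrays and labels below are made up) and only checks what a deep copy does with an object column: pandas copies the column's array of references, but the ndarray objects stored in it are not copied recursively.

```python
import numpy as np
import pandas as pd

# Toy stand-in for testDF; the real image data is not shown in the question.
testDF = pd.DataFrame({
    "ImageData": [np.ones((4, 6)), np.zeros((3, 3))],
    "Label": [True, False],
})

testDF2 = testDF.copy(deep=True)

# The copied object column refers to the very same ndarray objects,
# so the image buffers themselves are not duplicated by the copy.
same_object = testDF2["ImageData"].iloc[0] is testDF["ImageData"].iloc[0]
print(same_object)

# Rebinding the original name avoids keeping a second variable around.
testDF = testDF.copy(deep=True)
```

Under these assumptions, the extra cost of the copy is mostly the index, the Label column, and one reference per image, and rebinding the same name makes an explicit del of the old frame unnecessary.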

soumeng78
  • adding .copy before you take the sub df – BENY Aug 23 '22 at 21:32
  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – Tim Aug 23 '22 at 21:33
  • @BENY - do you mean testDF.copy(deep=True).apply()? Otherwise, if I understand correctly, I've attempted copy on *row* in transform_dimension(). – soumeng78 Aug 23 '22 at 23:11
  • @Tim I tried this: testDF2 = testDF.copy(deep=True); testDF2["ImageData"] = testDF.apply(lambda arr: transform_img_dimension(arr["ImageData"]), axis=1). This didn't cause the warning. Isn't holding two dataframes a memory overhead now? Or did I misunderstand something? – soumeng78 Aug 23 '22 at 23:24
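The first comment's suggestion could look roughly like the sketch below. It assumes testDF was created by filtering a larger frame, which the question does not show; fullDF and to_float32 are hypothetical names, and to_float32 merely stands in for transform_img_dimension so the example runs without cv or tf.

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame; how testDF was actually built is not shown.
fullDF = pd.DataFrame({
    "ImageData": [np.ones((4, 6)), np.zeros((3, 3)), np.ones((5, 2))],
    "Label": [True, False, True],
})

# Copy at the moment the subset is taken, so testDF is an independent frame
# and a later column assignment to it is unambiguous.
testDF = fullDF[fullDF["Label"]].copy()

def to_float32(img: np.ndarray) -> np.ndarray:
    # Stand-in for transform_img_dimension (assumption, keeps the sketch light).
    return img.astype("float32")

# Series.apply on the one column replaces the row-wise DataFrame.apply.
testDF["ImageData"] = testDF["ImageData"].apply(to_float32)
```

Copying once where the subset is created means only one working frame exists afterwards, and fullDF is left untouched by the assignment.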
