I tried the suggestion in this answer, and it appears that converting to string before dropping duplicates causes the truncated representation to be compared. It seems to me that the output of dataframe.astype(str) already contains this truncation. How do I stop this from happening?
What I have tried:
The following code
import pandas as pd
import numpy as np
A = np.zeros((100,100))
B = np.zeros((100,100))
A[50][50] = 1
B[50][51] = 1
data = [[A, 0], [B, 0]]
dataframe = pd.DataFrame(data)
dataframe2 = dataframe.astype(str).drop_duplicates(keep=False)
results in dataframe2 being an empty dataframe, whereas this
import pandas as pd
import numpy as np
C = np.zeros((2,2))
D = np.zeros((2,2))
C[0][0] = 1
D[1][1] = 1
data = [[C, 0], [D, 0]]
dataframe = pd.DataFrame(data)
dataframe2 = dataframe.astype(str).drop_duplicates(keep=False)
leaves dataframe2 identical to dataframe. I would expect the same result in the first case too.
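The difference between the two cases seems to come from NumPy rather than pandas: by default, NumPy summarizes the string form of any array with more than 1000 elements, replacing the middle with "...". A minimal check of that, using only NumPy (the variable names same_summary, same_full and same_bytes are just for illustration):

```python
import numpy as np

A = np.zeros((100, 100))
B = np.zeros((100, 100))
A[50, 50] = 1
B[50, 51] = 1

# str() summarizes arrays above NumPy's print threshold (1000 elements by
# default), so the differing elements in the middle are hidden behind "..."
same_summary = str(A) == str(B)

# Raising the threshold above the array size produces the full text instead
full_A = np.array2string(A, threshold=A.size + 1)
full_B = np.array2string(B, threshold=B.size + 1)
same_full = full_A == full_B

# The raw buffers differ regardless of any print settings
same_bytes = A.tobytes() == B.tobytes()

print(same_summary, same_full, same_bytes)
```

The 2x2 arrays in the second example have only 4 elements, so they are never summarized and their strings differ as expected.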
I also tried adding pd.set_option('display.max_colwidth', None), but that didn't help.
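One possible workaround, sketched here under the assumption that exact element-wise equality is what should count as a duplicate, is to avoid the string conversion entirely and compare a hashable key built from each array's shape, dtype and raw bytes. The helper to_key and the keys series are hypothetical names introduced for this sketch, not part of pandas:

```python
import numpy as np
import pandas as pd

A = np.zeros((100, 100))
B = np.zeros((100, 100))
A[50, 50] = 1
B[50, 51] = 1
dataframe = pd.DataFrame([[A, 0], [B, 0]])


def to_key(value):
    # Hypothetical helper: arrays become (shape, dtype, raw bytes), which is
    # hashable and unaffected by print settings; other values are kept as-is
    if isinstance(value, np.ndarray):
        return (value.shape, value.dtype.str, value.tobytes())
    return value


# One tuple key per row, aligned with the original index
keys = pd.Series(
    [tuple(to_key(v) for v in row)
     for row in dataframe.itertuples(index=False, name=None)],
    index=dataframe.index,
)

# Same semantics as drop_duplicates(keep=False): drop every row whose key repeats
dataframe2 = dataframe[~keys.duplicated(keep=False)]
print(len(dataframe2))
```

Unlike the astype(str) version, dataframe2 here still holds the original arrays rather than their string forms.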