Suppose I have a large dataframe that I read in chunks:
import pandas as pd
df = pd.DataFrame({'A':[1, 2, 1, 1, 1, 1], 'B': [1, 1, 1, 1, 2, 3]})
df.to_csv("tmp.csv", sep="|", index=False)
df = pd.read_csv("tmp.csv", sep="|", chunksize=3)
How can I remove all duplicate rows, even across different chunks? That is, if the row (1, 1) appears in the first chunk, no later chunk may contain it.
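A sketch of the behaviour I mean, keeping a set of rows already seen in earlier chunks (the names `seen` and `pieces` are just for illustration, and this assumes the distinct rows fit in memory):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1, 1, 1, 1], 'B': [1, 1, 1, 1, 2, 3]})
df.to_csv("tmp.csv", sep="|", index=False)

seen = set()    # rows observed in this or earlier chunks
pieces = []
for chunk in pd.read_csv("tmp.csv", sep="|", chunksize=3):
    chunk = chunk.drop_duplicates()  # dedupe within the chunk first
    # keep only rows whose tuple of values has not been seen before
    mask = [tuple(row) not in seen
            for row in chunk.itertuples(index=False)]
    chunk = chunk[mask]
    seen.update(tuple(row) for row in chunk.itertuples(index=False))
    pieces.append(chunk)

result = pd.concat(pieces, ignore_index=True)
print(result.values.tolist())  # [[1, 1], [2, 1], [1, 2], [1, 3]]
```

Here the first chunk contributes (1, 1) and (2, 1), so the second occurrence of (1, 1) in the later chunk is dropped. Is there a more idiomatic pandas way to do this?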