I want to find the set of all unique characters contained within a pandas DataFrame. One solution that works is given below:
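from functools import reduce  # reduce is a builtin on Python 2; this import is needed on Python 3
from operator import add
# Concatenate the unicode form of every cell into one string, then take its distinct characters
set(reduce(add, map(unicode, df.values.flatten())))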
However, the solution above takes a long time with large DataFrames. What are more efficient ways of doing this?
I am trying to find all unique characters in a pandas DataFrame so I can choose an appropriate delimiter when writing the DataFrame to disk as a CSV file.
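To make the goal concrete, here is a rough sketch of the kind of thing I am after. It updates a set one cell at a time instead of first building one giant concatenated string, then picks the first candidate delimiter that does not appear in the data. The names unique_chars and pick_delimiter and the candidate list are made up for illustration, and I have not benchmarked this. Like my original attempt, it only looks at the cell values, not at the column names or the index.

```python
import pandas as pd


def unique_chars(df):
    """Return the set of characters in the text form of every cell value."""
    to_text = type(u"")  # unicode on Python 2, str on Python 3
    chars = set()
    for value in df.values.ravel():
        # set.update adds each character of the string individually
        chars.update(to_text(value))
    return chars


def pick_delimiter(df, candidates=(",", ";", "\t", "|")):
    """Return the first candidate not used in any cell, or None if all are used."""
    used = unique_chars(df)
    for candidate in candidates:
        if candidate not in used:
            return candidate
    return None


# Small made-up example: the comma appears in the data, so it is skipped.
df = pd.DataFrame({"a": ["x,y", "z"], "b": [1, 2]})
delimiter = pick_delimiter(df)
```

The chosen delimiter could then be passed as the sep argument of df.to_csv, assuming at least one candidate is unused.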