I am trying to bin a Pandas dataframe in Python 3 in order to have more efficient grouping over a large dataset. Currently the performance bottleneck is in iterating over the dataframe using the .apply() method.
All entries in the column are hexadecimal strings, so it seems like the pd.to_numeric function should do exactly what I want.
I've tried a variety of options, but so far nothing has worked:
# With errors='coerce' every value becomes NaN; with errors='raise' I get 'Unable to parse string'.
dataframe[bin] = pd.to_numeric(dataframe[to_bin], errors='coerce') % __NUM_BINS__
# TypeError: int() can't convert non-string with explicit base
dataframe[bin] = int(dataframe[to_bin].astype(str), 16) % __NUM_BINS__
# ValueError: invalid literal for int() with base 10: 'ffffffffff'
dataframe[bin] = dataframe.astype(np.int64) % __NUM_BINS__
Any suggestions? This seems like something that people would have to have tackled in the past.
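For reference, here is a minimal sketch of the result I'm after. As far as I can tell, pd.to_numeric has no base argument, so this version parses each string with the built-in int(x, 16) through Series.map instead. The column names, the sample values, the helper name hex_to_bin, and NUM_BINS (standing in for __NUM_BINS__) are placeholders I made up for illustration, and it assumes the column contains only valid hex strings with no missing values.

```python
import pandas as pd

NUM_BINS = 16  # hypothetical bin count, standing in for __NUM_BINS__


def hex_to_bin(series: pd.Series, num_bins: int) -> pd.Series:
    """Parse hex strings element-wise and reduce them modulo num_bins."""
    # int(x, 16) parses base-16 text (an optional '0x' prefix is accepted).
    # The modulus is taken on the Python int, before conversion to int64,
    # so very long hex strings cannot overflow a 64-bit integer.
    return series.map(lambda x: int(x, 16) % num_bins).astype("int64")


# Made-up sample data; assumes every entry is a valid hex string.
df = pd.DataFrame({"to_bin": ["ffffffffff", "1a", "0x10", "0"]})
df["bin"] = hex_to_bin(df["to_bin"], NUM_BINS)
print(df)
```

This still calls a Python function once per element, so it is not truly vectorized, but it only touches a single column rather than iterating over whole rows.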