I have a matrix of data ( 55K X8.5k) with counts. Most of them are zeros, but few of them would be like any count. Lets say something like this:
a b c
0 4 3 3
1 1 2 1
2 2 1 0
3 2 0 1
4 2 0 4
I want to binaries the cell values.
I did the following:
df_preference=df_recommender.applymap(lambda x: np.where(x >0, 1, 0))
While the code works fine, but it takes a lot of time to run.
Why is that?
Is there a faster way?
Thanks
Edit:
Error when doing df.to_pickle
df_preference.to_pickle('df_preference.pickle')
I get this:
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-16-3fa90d19520a> in <module>()
1 # Pickling the data to the disk
2
----> 3 df_preference.to_pickle('df_preference.pickle')
\\dwdfhome01\Anaconda\lib\site-packages\pandas\core\generic.pyc in to_pickle(self, path)
1170 """
1171 from pandas.io.pickle import to_pickle
-> 1172 return to_pickle(self, path)
1173
1174 def to_clipboard(self, excel=None, sep=None, **kwargs):
\\dwdfhome01\Anaconda\lib\site-packages\pandas\io\pickle.pyc in to_pickle(obj, path)
13 """
14 with open(path, 'wb') as f:
---> 15 pkl.dump(obj, f, protocol=pkl.HIGHEST_PROTOCOL)
16
17
SystemError: error return without exception set