I would like to encode integer masks stored in pandas dataframe column into respective binary features which correspond to bit positions in those integers. For example, given 4-bit integers, and a decimal value of 11 I would like to derive 4 columns with values 1, 0, 1, 1, and so on across entire column.
Asked
Active
Viewed 867 times
1
-
1Please provide a sample data. – YOLO Mar 20 '18 at 14:04
-
Please show some of your work plus share with us a [mcve](https://stackoverflow.com/help/mcve), [mcve2](http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports) – rpanai Mar 20 '18 at 14:12
1 Answers
4
You can use:
df = pd.DataFrame([list('{0:04b}'.format(x)) for x in df['col']], index=df.index).astype(int)
Thank you, @pir for python 3.6+ solution:
df = pd.DataFrame([list(f'{i:04b}') for i in df['col'].values], df.index)
Numpy
Convert array to DataFrame
- solution from this answer, also added slicing for swap values per rows:
d = df['col'].values
m = 4
df = pd.DataFrame((((d[:,None] & (1 << np.arange(m)))) > 0)[:, ::-1].astype(int))
#alternative
#df = pd.DataFrame((((d[:,None] & (1 << np.arange(m-1,-1,-1)))) > 0).astype(int))
Or:
df = pd.DataFrame(np.unpackbits(d[:,None].astype(np.uint8), axis=1)[:,-m:])

jezrael
- 822,522
- 95
- 1,334
- 1,252
-
3Python 3.6+ you can use f-strings `s = pd.Series(range(16)); pd.DataFrame([list(f'{i:04b}') for i in s.values], s.index)` – piRSquared Mar 20 '18 at 14:14
-
1