python pandas - convert values to column names and fill with zeros and ones

Question

I need to convert categorical values to column names and fill with zeros and ones.

import pandas as pd
x = pd.DataFrame({'province' : ['Ontario', 'Manitoba', 'Quebec'], 'species' : ['a', 'b', 'c']})

   province species
0   Ontario       a
1  Manitoba       b
2    Quebec       c

I want to reshape the data frame above so that the values in species turn into column names, and the values of the new columns indicate presence or absence. The new data frame should look like this:

   province  a  b  c
0   Ontario  1  0  0
1  Manitoba  0  1  0
2    Quebec  0  0  1

`x = pd.get_dummies(x, columns=['species'])` like [this answer](https://stackoverflow.com/a/36285489) or `x = pd.get_dummies(x, columns=['species'], prefix='', prefix_sep='')` for exact output. — Henry Ecker, Jan 15 '22 at 21:45
@Henry, I didn't find an option not to add the prefix though — mozway, Jan 15 '22 at 21:46
@mozway The second option with `, prefix='', prefix_sep='')` works fine no? Like [here](https://stackoverflow.com/a/62902495/15497888) — Henry Ecker, Jan 15 '22 at 21:46

mozway · Accepted Answer · 2022-01-15T21:48:16.310

You can use crosstab:

(pd.crosstab(x['province'], x['species'])
   .reset_index().rename_axis(None, axis=1)
)

output:

   province  a  b  c
0  Manitoba  0  1  0
1   Ontario  1  0  0
2    Quebec  0  0  1
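
Note that the rows above come back sorted by province, whereas the desired output in the question keeps the original order. If that matters, one possible approach is to reindex on the order of first appearance before resetting the index. This is only a sketch; the names `order` and `out` are made up for illustration:

```python
import pandas as pd

x = pd.DataFrame({'province': ['Ontario', 'Manitoba', 'Quebec'],
                  'species': ['a', 'b', 'c']})

# Provinces in order of first appearance; the name is set explicitly
# so that reset_index() still produces a 'province' column
order = pd.Index(x['province'].unique(), name='province')

out = (pd.crosstab(x['province'], x['species'])
         .reindex(order)                      # restore the original row order
         .reset_index().rename_axis(None, axis=1)
)
print(out)
```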

NB. crosstab counts occurrences, so if a province/species pair appears more than once, the cell will contain 2, 3, etc. instead of 1.
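
To illustrate the point about duplicates, here is a small sketch with made-up data in which the Ontario/a pair appears twice. Capping the counts with `clip(upper=1)` is one way (among several) to get back to plain presence/absence flags; the names `df`, `counts` and `flags` are hypothetical:

```python
import pandas as pd

# Hypothetical data: the ('Ontario', 'a') pair occurs twice
df = pd.DataFrame({'province': ['Ontario', 'Ontario', 'Manitoba', 'Quebec'],
                   'species': ['a', 'a', 'b', 'c']})

# Raw crosstab: cells hold occurrence counts, so Ontario/a is 2
counts = pd.crosstab(df['province'], df['species'])

# Cap every count at 1 to turn counts into 0/1 indicators
flags = counts.clip(upper=1).reset_index().rename_axis(None, axis=1)
print(flags)
```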

or get_dummies:

pd.get_dummies(x, columns=['species'], prefix='', prefix_sep='')

output:

   province  a  b  c
0   Ontario  1  0  0
1  Manitoba  0  1  0
2    Quebec  0  0  1
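
One caveat for newer installations: since pandas 2.0 the dummy columns default to `bool`, so the same call prints True/False rather than 1/0. Passing `dtype=int` should give integer indicators on both old and new versions. A minimal sketch, which also shows the default prefixed column names mentioned in the comments (the names `prefixed` and `result` are just for illustration):

```python
import pandas as pd

x = pd.DataFrame({'province': ['Ontario', 'Manitoba', 'Quebec'],
                  'species': ['a', 'b', 'c']})

# Default naming: new columns are prefixed with the source column name
prefixed = pd.get_dummies(x, columns=['species'], dtype=int)
print(prefixed.columns.tolist())

# Empty prefix and separator give bare 'a', 'b', 'c' columns;
# dtype=int requests 0/1 integers explicitly instead of booleans
result = pd.get_dummies(x, columns=['species'], prefix='', prefix_sep='', dtype=int)
print(result)
```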
