
I'm in the process of migrating my pandas operations to dask. In pandas, the following line worked successfully: `triggers = df.triggers.str.get_dummies(',')`. It splits each string at the commas and turns the resulting pieces into dummy variables.

For example, if `df.triggers` had these three rows:

["a, b, c", 
 "a", 
 "b, c"]

this would output the values:

a | b | c
1 | 1 | 1
1 | 0 | 0
0 | 1 | 1
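
A minimal reproducible version of the pandas behavior described above. It assumes the separator is `", "` (comma plus space) so that the labels come out without leading spaces; splitting on `','` alone would keep the spaces in the column names.

```python
import pandas as pd

# Toy frame mirroring the three example rows.
df = pd.DataFrame({"triggers": ["a, b, c", "a", "b, c"]})

# Split each string on ", " and build one indicator column per label.
triggers = df.triggers.str.get_dummies(sep=", ")
print(triggers)
```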

However, the same command fails in dask with the error `AttributeError: get_dummies`. When I try `dd.get_dummies` instead, it asks me to categorize the column first. But the individual categories only exist after each string has been split at the commas.

Any thoughts on how to get around this?

sachinruk
  • Does this answer your question? [Create dummies from column with multiple values in dask](https://stackoverflow.com/questions/64237300/create-dummies-from-column-with-multiple-values-in-dask) – Santosh Kumar Janumahanthi Nov 03 '21 at 03:52
