I'm in the process of migrating my pandas operations to dask. With pandas, the following line worked successfully:
triggers = df.triggers.str.get_dummies(',')
It splits each string at the commas and then turns the resulting pieces into dummy variables.
For example, if df.triggers had three rows such as:
["a,b,c",
"a",
"b,c"]
this would output the values:
a | b | c
1 | 1 | 1
1 | 0 | 0
0 | 1 | 1
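
To make this reproducible, here is a minimal sketch of the pandas behaviour, assuming the values are separated by bare commas with no spaces (with a space after each comma, the pieces would keep that leading space unless the separator were ', '). The toy DataFrame is made up for illustration:

```python
import pandas as pd

# Toy data matching the three example rows (bare commas, no spaces).
df = pd.DataFrame({"triggers": ["a,b,c", "a", "b,c"]})

# Split each string at "," and build one 0/1 indicator column
# per distinct piece found across all rows.
triggers = df.triggers.str.get_dummies(",")
print(triggers)
```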
However, I cannot use the same command in dask; it raises AttributeError: get_dummies. When I try to use dd.get_dummies instead, it asks me to categorize the strings first. However, the individual categories only exist after each string has been split at the commas.
Any thoughts on how to get around this?