I have a pandas DataFrame with a column whose values look like a,b,c, i.e. strings of items separated by ','. I want to create one new column per item: for a,b,c there would be a new column a, column b and column c. A row containing a,b,c would then get True in all three columns, while a row containing a,b,e would get True in columns a and b but False in c. How can I do this?
Possible duplicate of https://stackoverflow.com/questions/49182255/executing-a-function-that-adds-columns-and-populates-them-dependig-on-other-colu?noredirect=1&lq=1 – Zero Mar 23 '18 at 09:20
Possible duplicate of [Converting pandas column of comma-separated strings into dummy variables](https://stackoverflow.com/questions/46867201/converting-pandas-column-of-comma-separated-strings-into-dummy-variables) – jpp Mar 23 '18 at 09:25
1 Answer
Use str.get_dummies, cast the result to bool with astype, and add column B with join:
df1 = df['A'].str.get_dummies(',').astype(bool).join(df['B'])
print(df1)
       a     b     c      f  B
0   True  True  True  False  3
1  False  True  True   True  4
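The one-liner above assumes df already exists. A minimal self-contained sketch of the same steps, split up so each intermediate result can be inspected; the sample frame and the names dummies and flags are assumptions made for illustration:

```python
import pandas as pd

# Sample data assumed for illustration only.
df = pd.DataFrame({'A': ['a,b,c', 'b,c,f'], 'B': [3, 4]})

# One indicator column per distinct item found after splitting on ','.
dummies = df['A'].str.get_dummies(',')

# Cast the indicators to bool so the columns hold True/False.
flags = dummies.astype(bool)

# Attach the original column B, aligned on the row index.
df1 = flags.join(df['B'])
print(df1)
```

The indicator columns come out in sorted order of the item names, followed by B.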
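A more general solution uses pop to extract column A. pop removes A from df and returns it, so joining with df afterwards brings along all the remaining columns: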
import pandas as pd

df = pd.DataFrame({'A':['a,b,c','b,c,f'], 'B':[3,4], 'C':[7,3]})
print(df)
       A  B  C
0  a,b,c  3  7
1  b,c,f  4  3
df1 = df.pop('A').str.get_dummies(',').astype(bool).join(df)
print(df1)
       a     b     c      f  B  C
0   True  True  True  False  3  7
1  False  True  True   True  4  3
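The question also asks for a fixed set of columns (a, b, c), with a row such as a,b,e getting False in c. One possible way to get that, not part of the original answer, is to reindex the dummy columns to the wanted list before casting. The list wanted and the sample rows are assumptions taken from the question's wording:

```python
import pandas as pd

# Sample rows assumed from the question: 'a,b,c' and 'a,b,e'.
df = pd.DataFrame({'A': ['a,b,c', 'a,b,e'], 'B': [3, 4]})

# Hypothetical fixed list of items to keep as columns.
wanted = ['a', 'b', 'c']

flags = (
    df['A'].str.get_dummies(',')
    # Keep only the wanted columns; any wanted item that never
    # occurs is added as a column of zeros.
    .reindex(columns=wanted, fill_value=0)
    .astype(bool)
)

df1 = flags.join(df['B'])
print(df1)
```

Items outside the wanted list (here e) are dropped rather than given their own column.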

jezrael