Create new columns using unique values in other columns in Python

Question

I'd like to create new columns in my dataframe using the unique values from another column. For example:

Column 1 has the following values:

Apple
Apple
Banana
Strawberry
Strawberry
Strawberry

When I check the unique values in Column 1, the output is:

Apple
Banana
Strawberry

Now I want to use these three values to create columns named "Apple", "Banana", and "Strawberry", and I want to keep the code dynamic so that it adapts to however many unique values are present in Column 1.

I'm new to Python; any help would be appreciated!

So far, I've been getting that output by manually creating new columns in the dataset. I need this to happen automatically, depending on the unique values in Column 1.

Please provide minimal reproducible code in text format (no screenshots) — eshirvana, Nov 23 '22 at 16:28
If this is a [tag:pandas] dataframe, please add that tag to your question — Pranav Hosangadi, Nov 23 '22 at 16:33
Does this answer your question? [Pandas Python : how to create multiple columns from a list](https://stackoverflow.com/questions/51495800/pandas-python-how-to-create-multiple-columns-from-a-list) (I know this question asks about adding columns from a _list_, but the idea is the same for any iterable) — Pranav Hosangadi, Nov 23 '22 at 16:34
Here's an example of the data and code. My original column ('Rating') has two values, "Agree" and "Disagree". I'm manually creating new columns like this:

data['Agree'] = np.where(data['Rating'] == 'Agree', 1, 0)
data['Disagree'] = np.where(data['Rating'] == 'Disagree', 1, 0)
data['Total'] = data[['Agree', 'Disagree']].sum(axis=1)

I want to do the same without having to do it manually, irrespective of how many unique values are present in the 'Rating' column. — DG572, Nov 23 '22 at 16:48

LoneWanderer · Accepted Answer · 2022-11-23T17:15:22.900

Extract the unique values, then iterate over them to create the columns and fill in the data.

Here I only put boolean values, based on whether each row matches the col1 value.

import pandas as pd

df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})

data:

        col1
0      apple
1      apple
2     banana
3  pineapple
4     banana
5      apple

transform:

unique_col1_val = df["col1"].unique().tolist()
for u in unique_col1_val:
    # Decide how to fill each new column; here it is a bool indicating
    # whether the row's col1 value matches the new column's name.
    df[u] = df["col1"] == u
    # For an integer 1/0 instead of True/False, use:
    # df[u] = (df["col1"] == u).astype(int)

In [72]: df
Out[72]:
        col1  apple  banana  pineapple
0      apple   True   False      False
1      apple   True   False      False
2     banana  False    True      False
3  pineapple  False   False       True
4     banana  False    True      False
5      apple   True   False      False

Using df[u] = (df["col1"] == u).astype(int) instead:

        col1  apple  banana  pineapple
0      apple      1       0          0
1      apple      1       0          0
2     banana      0       1          0
3  pineapple      0       0          1
4     banana      0       1          0
5      apple      1       0          0
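
As a possible alternative to the explicit loop, pandas' built-in pd.get_dummies can build the same indicator columns in one call. This is only a sketch on the same sample data, not part of the original answer; the names dummies and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})

# One indicator column per unique value in col1;
# dtype=int yields 1/0 rather than True/False.
dummies = pd.get_dummies(df["col1"], dtype=int)

# Attach the indicator columns next to the original column.
result = pd.concat([df, dummies], axis=1)
print(result)
```

Because the columns are derived from the data, this adapts to however many unique values col1 contains, just as the loop does.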

Thanks so much, this is exactly what I was looking for. May I ask, instead of using "True" & "False", how can I assign 1 & 0 to the same? — DG572, Nov 23 '22 at 17:05
(btw you can mark answer as accepted if you think it is the case.) — LoneWanderer, Nov 23 '22 at 17:15
