I'd like to create new columns in my dataframe from the unique values of another column. For example:

Column 1 has the following values:

Apple
Apple
Banana
Strawberry
Strawberry
Strawberry

When I check the unique values in Column 1, the output is:

Apple
Banana
Strawberry

Now I want to use these three values to create columns named "Apple", "Banana", and "Strawberry", and I want the code to stay dynamic so that it adapts to however many unique values are present in Column 1.

I'm new to Python, so any help will be appreciated!

So far, I've been getting that output by manually creating new columns in the dataset. I need this to happen automatically, depending on the unique values in Column 1.

DG572
  • Please provide minimal reproducible code in text format (no screenshots) – eshirvana Nov 23 '22 at 16:28
  • If this is a [tag:pandas] dataframe, please add that tag to your question – Pranav Hosangadi Nov 23 '22 at 16:33
  • Does this answer your question? [Pandas Python : how to create multiple columns from a list](https://stackoverflow.com/questions/51495800/pandas-python-how-to-create-multiple-columns-from-a-list) (I know this question asks about adding columns from a _list_, but the idea is the same for any iterable) – Pranav Hosangadi Nov 23 '22 at 16:34
  • Here's an example of the data and code. My original column ('Rating') has two values, "Agree" and "Disagree". I'm manually creating new columns like this: data['Agree'] = np.where(data['Rating'] == 'Agree', 1, 0); data['Disagree'] = np.where(data['Rating'] == 'Disagree', 1, 0); data['Total'] = data[['Agree', 'Disagree']].sum(axis=1). I want to do the same without having to do it manually, regardless of how many unique values are present in the 'Rating' column – DG572 Nov 23 '22 at 16:48

1 Answer

Extract the unique values, then iterate over them to create the columns and fill in the data.

Here I only put boolean values, based on whether each row's col1 value matches the new column's name:

import pandas as pd

df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})

data=

        col1
0      apple
1      apple
2     banana
3  pineapple
4     banana
5      apple

transform:

unique_col1_val = df["col1"].unique().tolist()
for u in unique_col1_val:
    # You need to decide how to fill these new columns.
    # Here we store a bool: True where col1 equals the new column's name.
    df[u] = df["col1"] == u
    # To store 1/0 integers instead of booleans, use:
    # df[u] = (df["col1"] == u).astype(int)

In [72]: df
Out[72]:
        col1  apple  banana  pineapple
0      apple   True   False      False
1      apple   True   False      False
2     banana  False    True      False
3  pineapple  False   False       True
4     banana  False    True      False
5      apple   True   False      False

Using df[u] = (df["col1"] == u).astype(int) instead gives:

        col1  apple  banana  pineapple
0      apple      1       0          0
1      apple      1       0          0
2     banana      0       1          0
3  pineapple      0       0          1
4     banana      0       1          0
5      apple      1       0          0
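
As an alternative to the explicit loop, pandas.get_dummies builds one indicator column per unique value in a single call. The following is only a minimal sketch, assuming data shaped like the 'Rating' column described in the question's comments; the sample values and the names data and dummies are illustrative and do not come from the asker's dataset.

```python
import pandas as pd

# Hypothetical sample data modelled on the asker's 'Rating' column.
data = pd.DataFrame({"Rating": ["Agree", "Disagree", "Agree", "Agree"]})

# One indicator column per unique value; dtype=int gives 1/0 instead of booleans.
dummies = pd.get_dummies(data["Rating"], dtype=int)

# Attach the indicator columns to the original frame (aligned on the index).
data = data.join(dummies)

# Row-wise total across the generated columns, mirroring the manual version.
data["Total"] = data[dummies.columns].sum(axis=1)

print(data)
```

Because the column names come from the data itself, nothing needs to change when 'Rating' gains more unique values. Note that a unique value that collides with an existing column name would make the join fail, so this assumes the values differ from the existing column names.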
LoneWanderer