Is there a way to transform all unique values into a new dataframe using a loop and at the same time create additional columns?

Question

My problem is that I have a dataframe like this:

# for demonstration
import pandas as pd

example = {
    "ID": [1, 1, 2, 2, 2, 3],
    "place": ["Maryland", "Maryland", "Washington", "Washington", "Washington", "Los Angeles"],
    "type": ["condition", "symptom", "condition", "condition", "sky", "condition"],
    "name": ["depression", "cough", "fatigue", "depression", "blue", "fever"],
}

# load into df:
example = pd.DataFrame(example)

print(example)

I want to reorganize it so that there is one row per unique ID, like this:

# for demonstration
import pandas as pd

result = {
    "ID": [1, 2, 3],
    "place": ["Maryland", "Washington", "Los Angeles"],
    "condition": ["depression", "fatigue", "fever"],
    "condition1": ["no", "depression", "no"],
    "symptom": ["cough", "no", "no"],
    "sky": ["no", "blue", "no"],
}

# load into df:
result = pd.DataFrame(result)

print(result)

I tried to group it like this:

example.nunique()

df_names = dict()
for k, v in example.groupby('ID'):
    df_names[k] = v

However, this gives me back a dictionary of sub-dataframes, one per ID, and not a single dataframe organized the way I need.

Is there a way to do it with a loop, so that for each unique ID a new column is created for every type it has (condition, sky, or others)? If an ID has several conditions, the next one should go into condition1. Could you please help me if you know how to do this?

score 1 · Accepted Answer · answered Jun 23 '22 at 18:14

This should give you the result you need. It is a combination of cumcount() and pivot().

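import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2, 3],
    "place": ["Maryland", "Maryland", "Washington", "Washington", "Washington", "Los Angeles"],
    "type": ["condition", "symptom", "condition", "condition", "sky", "condition"],
    "name": ["depression", "cough", "fatigue", "depression", "blue", "fever"],
})
# Number repeated types within each ID/place (0, 1, ...) and append the counter to the type.
# ID is included in the keys so that two IDs sharing a place are counted separately.
df['type'] = df['type'].astype(str) + '_' + df.groupby(['ID', 'place', 'type']).cumcount().astype(str)
# One row per ID/place, one column per numbered type
df = df.pivot(index=['ID', 'place'], columns='type', values='name').reset_index()
df = df.fillna('no')
# Drop the suffix from first occurrences: condition_0 -> condition
df.columns = df.columns.str.replace('_0', '', regex=False)
df = df[['ID', 'place', 'condition', 'condition_1', 'symptom', 'sky']]
print(df)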
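
To see what the cumcount() step contributes, here is a minimal sketch on the question's data. It assumes repeats should be counted per ID, and the names occurrence, suffix and wide are illustrative only. Unlike the code above, it does not hard-code the final column list, so the columns come out in the alphabetical order that pivot() produces.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2, 3],
    "place": ["Maryland", "Maryland", "Washington", "Washington", "Washington", "Los Angeles"],
    "type": ["condition", "symptom", "condition", "condition", "sky", "condition"],
    "name": ["depression", "cough", "fatigue", "depression", "blue", "fever"],
})

# Position of each row among rows with the same ID and type:
# 0 for the first occurrence, 1 for the second, and so on.
occurrence = df.groupby(["ID", "type"]).cumcount()
print(occurrence.tolist())  # [0, 0, 0, 1, 0, 0]

# Add a suffix only to repeats, so first occurrences keep their plain name.
suffix = occurrence.map(lambda n: f"_{n}" if n else "")

wide = (
    df.assign(type=df["type"] + suffix)
      .pivot(index=["ID", "place"], columns="type", values="name")
      .fillna("no")
      .reset_index()
)
wide.columns.name = None  # drop the leftover "type" label on the column axis
print(wide)
```

Only the second condition for ID 2 gets a counter of 1, which is why it ends up in its own condition_1 column while every other value lands in the plain column for its type.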
