
I have a dataframe with the following columns in it.

df_levels = ["No","window1_level","window2_level","window3_level","window4_level", "window5_level"]

Source dataframe:

df_source = pd.DataFrame([[111111,123,103,"window1","window2"], [333333,104,123,"window3","window4"], [678698,111,144,"window1","window5"]], columns=['No','level1','level2','value1','value2'])


No       level1        level2       value1      value2
111111      123          103        window1     window2
333333      104          123        window3     window4
678698      111          144        window1     window5

Result dataframe:

df_result:

No      window1_level  window2_level  window3_level  window4_level  window5_level
111111  123            103
333333                                104            123
678698  111                                                         144

I am looping through the source dataframe and assigning values to the result dataframe (df_result). After the assignment, I am not sure how to append these values to df_result, because the columns named by row['value1'] and row['value2'] change depending on the values in df_source.

for index, row in df_source.iterrows():
    df_result["No"] = row['No']
    df_result[str(row['value1'])+"_level"] = row['level1']
    df_result[str(row['value2'])+"_level"] = row['level2']

Any suggestions would be appreciated!

hydesingh

2 Answers


Don't iterate over rows in pandas if a vectorized alternative exists.

Use wide_to_long with DataFrame.pivot:

df = pd.wide_to_long(df_source.reset_index(), 
                     stubnames=['level', 'value'],
                     i=['index','No'],
                     j='tmp').reset_index()

# Keyword arguments are required by pivot in pandas 2.0 and later
df = df.pivot(index='No', columns='value', values='level')
print(df)
value   window1  window2  window3  window4  window5
No                                                 
111111    123.0    103.0      NaN      NaN      NaN
333333      NaN      NaN    104.0    123.0      NaN
678698    111.0      NaN      NaN      NaN    144.0
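
The pivoted frame above has columns named window1 ... window5 and keeps No as the index, while the question asks for window1_level ... window5_level with No as a regular column. A minimal sketch of one way to get there, assuming the same sample data as in the question (the names long_df and wide are only illustrative):

```python
import pandas as pd

# Sample data copied from the question
df_source = pd.DataFrame(
    [[111111, 123, 103, "window1", "window2"],
     [333333, 104, 123, "window3", "window4"],
     [678698, 111, 144, "window1", "window5"]],
    columns=["No", "level1", "level2", "value1", "value2"],
)

# Reshape level1/level2 and value1/value2 into one (level, value) pair per row
long_df = pd.wide_to_long(
    df_source.reset_index(),
    stubnames=["level", "value"],
    i=["index", "No"],
    j="tmp",
).reset_index()

# One row per No, one column per window name
wide = long_df.pivot(index="No", columns="value", values="level")

# Append "_level" to each column label, drop the "value" axis name,
# and turn No back into an ordinary column
df_result = wide.add_suffix("_level").rename_axis(columns=None).reset_index()
print(df_result)
```

Windows that a given No does not use stay NaN, as in the output above.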
jezrael

I would loop through df_source and append the new values to a separate list, then concatenate the values in the list to get df_result. You can try doing this:

result = []
for _, row in df_source.iterrows():  # Ignore index because not needed
    value1, level1 = row['value1'], row['level1']
    value2, level2 = row['value2'], row['level2']
    
    # Append as a list of dictionaries
    result.append({
        "No": row['No'],
        value1 + "_level": level1,
        value2 + "_level": level2
    })    

# Concatenate everything into a DataFrame
df_result = pd.DataFrame(result)

This results in:

    No      window1_level   window2_level   window3_level   window4_level   window5_level
0   111111  123.0           103.0           NaN             NaN             NaN
1   333333  NaN             NaN             104.0           123.0           NaN
2   678698  111.0           NaN             NaN             NaN             144.0

Depending on what you need, you can replace the NaN with whatever values you need, e.g. 0:

# fillna returns a new DataFrame, so assign the result back
df_result = df_result.fillna(0)
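
If the result should always contain exactly the columns listed in df_levels from the question, even when some window never appears in the source rows, reindexing on that list is one option. This is only a sketch: the two rows below are an illustrative subset of the question's data, and df_filled is a made-up name. It also assumes 0 is an acceptable placeholder, which allows casting back to integers.

```python
import pandas as pd

# Target column order from the question
df_levels = ["No", "window1_level", "window2_level", "window3_level",
             "window4_level", "window5_level"]

# Illustrative subset of the question's rows: window3 and window4 never occur
rows = [
    {"No": 111111, "window1_level": 123, "window2_level": 103},
    {"No": 678698, "window1_level": 111, "window5_level": 144},
]
df_result = pd.DataFrame(rows)

# Add any missing columns (filled with NaN) and enforce the column order
df_result = df_result.reindex(columns=df_levels)

# Replace NaN with 0, then cast the float columns back to integers
df_filled = df_result.fillna(0).astype(int)
print(df_filled)
```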
Daren