I have a pandas DataFrame that looks like this:

    train   val     score
1   0.6125  0.0827  loss
2   0.8565  0.9845  precision
3   0.7596  0.982   recall
4   0.0466  0.0454  loss
5   0.9897  0.9949  precision
6   0.9884  0.9949  recall

and I want to convert it to something like this:

    train_loss  train_precision  train_recall  val_loss  val_precision  val_recall
1   0.6125      0.8565           0.7596        0.0827    0.9845         0.982       
2   0.0466      0.9897           0.9884        0.0454    0.9949         0.9949  
  • `df.assign(index=lambda d: d.groupby('score').cumcount()).pivot(index='index', columns='score')`, then optionally flatten the MultiIndex – mozway Jun 12 '23 at 17:48
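The one-liner in the comment above can be expanded into a minimal sketch, assuming the sample data from the question. The `wide` name, the list comprehension that flattens the MultiIndex, and the final `reset_index(drop=True)` are illustrative additions, not part of the original comment.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'train': [0.6125, 0.8565, 0.7596, 0.0466, 0.9897, 0.9884],
    'val': [0.0827, 0.9845, 0.982, 0.0454, 0.9949, 0.9949],
    'score': ['loss', 'precision', 'recall'] * 2,
})

# Number each occurrence of a score (0 for the first loss/precision/recall,
# 1 for the second, ...) and use that counter as the row key for the pivot.
wide = (
    df.assign(index=lambda d: d.groupby('score').cumcount())
      .pivot(index='index', columns='score')
)

# The pivot yields MultiIndex columns such as ('train', 'loss');
# join the two levels into flat names such as 'train_loss'.
wide.columns = [f'{split}_{score}' for split, score in wide.columns]

# Drop the helper counter so the result has a plain 0-based index
wide = wide.reset_index(drop=True)
print(wide)
```

This avoids explicit Python loops and does not hard-code the score names, so it should keep working if further metrics are added, provided every metric appears the same number of times.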

2 Answers

1

You can use something like the following:

# Reshape a long pandas dataframe into one column per (split, score) pair
import pandas as pd

# Create a dataframe
df = pd.DataFrame({'train': [0.6125, 0.8565, 0.7596, 0.0466, 0.9897, 0.9884], 'val': [0.0827, 0.9845, 0.982, 0.0454, 0.9949, 0.9949], 'score': ['loss', 'precision', 'recall']*2})

# Build one empty list per output column, e.g. 'train_loss', 'val_recall'
new_cols = {f'{i}_{j}': [] for i in ['train', 'val'] for j in df.score.unique()}

# Iterate over the dataframe and append values to the new_cols dictionary
for _, row in df.iterrows():
    # Access by column label; positional row[0] / row[1] is deprecated on a labelled Series
    new_cols[f'train_{row["score"]}'].append(row['train'])
    new_cols[f'val_{row["score"]}'].append(row['val'])

# Create a new dataframe from the new_cols dictionary
new_df = pd.DataFrame(new_cols)
print(new_df)

This code prints the requested layout (with a 0-based index):

   train_loss  train_precision  train_recall  val_loss  val_precision  val_recall
0      0.6125           0.8565        0.7596    0.0827         0.9845      0.9820       
1      0.0466           0.9897        0.9884    0.0454         0.9949      0.9949       
SaptakD625
0
import pandas as pd

def transform(dataframe):
  # One list per output column
  train_loss, train_precision, train_recall, val_loss, val_precision, val_recall = ([] for _ in range(6))

  # Route each row's train/val values to the lists matching its score
  for idx in dataframe.index:
    score = dataframe['score'][idx]
    if score == 'loss':
      train_loss.append(dataframe['train'][idx])
      val_loss.append(dataframe['val'][idx])
    elif score == 'precision':
      train_precision.append(dataframe['train'][idx])
      val_precision.append(dataframe['val'][idx])
    elif score == 'recall':
      train_recall.append(dataframe['train'][idx])
      val_recall.append(dataframe['val'][idx])

  return train_loss, train_precision, train_recall, val_loss, val_precision, val_recall


# `dataframe` is the original frame from the question
df = pd.DataFrame()
df['train_loss'], df['train_precision'], df['train_recall'], df['val_loss'], df['val_precision'], df['val_recall'] = transform(dataframe)