create pandas column using cell values of another multi indexed data frame

Question

I have a multi indexed data frame as the attached image. Now I need to create a new column on a different data frame where each column name will be the unique room numbers.

For example, one expected output from the code will be as following:

N.B. I want to avoid for loops to save memory space and time. What would be the optimal way to get desired output ?

I have tried using for loops and could get desired output but I am not sure if it s a good idea for a large dataset. Here is the code snippet :

import numpy as np
import pandas as pd
d = np.array(['624: COUPLE , 507: DELUXE+ ,301: HONEYMOON','624:FAMILY , 
      507: FAMILY+','621:FAMILY , 517: FAMILY+','696:FAMILY , 585: 
      FAMILY+,624:FAMILY , 507: DELUXE'])
df = pd.Series(d)
df= df.str.extractall(r'(?P<room>[0-9]+):\s*(?P<grd>[^\s,]+)')

gh = df[df['room'] == '507'].index

rf = pd.DataFrame(index=range(0,4),columns=['room#507','room#624'], 
      dtype='float')


for i in range(0,rf.shape[0]):
   for j in range(0,gh.shape[0]):
     if (i == gh[j][0]):
        rf['room#507'][i] = df.grd[gh[j][0]][gh[j][1]]

In order for us to help you, it is necessary that you show your effort and submit data to be used to reproduce your problem. While providing an image is helpful, it doesn't allow for reproducing the issue. Please edit your question to show a minimal reproducible set. See [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example "Minimal Reproducible Example") for details. — itprorh66, Feb 21 '21 at 20:35
Does this answer your question? [How to pivot a dataframe?](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) — Wiktor Stribiżew, Feb 22 '21 at 09:51

score 1 · Accepted Answer · answered Feb 22 '21 at 06:56

Use DataFrame.reset_index with DataFrame.pivot:

df= df.str.extractall(r'(?P<room>[0-9]+):\s*(?P<grd>[^\s,]+)')

df = df.reset_index(level=1, drop=True).reset_index()

df = df.pivot('index','room','grd').add_prefix('room_')
print (df)
room    room_301 room_507 room_517 room_585 room_621 room_624 room_696
index                                                                 
0      HONEYMOON  DELUXE+      NaN      NaN      NaN   COUPLE      NaN
1            NaN  FAMILY+      NaN      NaN      NaN   FAMILY      NaN
2            NaN      NaN  FAMILY+      NaN   FAMILY      NaN      NaN
3            NaN   DELUXE      NaN  FAMILY+      NaN   FAMILY   FAMILY

Or DataFrame.set_index with Series.unstack:

df= df.str.extractall(r'(?P<room>[0-9]+):\s*(?P<grd>[^\s,]+)')

df = (df.reset_index(level=1, drop=True)
        .set_index('room', append=True)['grd']
        .unstack()
        .add_prefix('room_'))

print (df)
room   room_301 room_507 room_517 room_585 room_621 room_624 room_696
0     HONEYMOON  DELUXE+      NaN      NaN      NaN   COUPLE      NaN
1           NaN  FAMILY+      NaN      NaN      NaN   FAMILY      NaN
2           NaN      NaN  FAMILY+      NaN   FAMILY      NaN      NaN
3           NaN   DELUXE      NaN  FAMILY+      NaN   FAMILY   FAMILY

create pandas column using cell values of another multi indexed data frame

1 Answers1

Linked