I have a multi-indexed DataFrame, as shown in the attached image. I need to create new columns in a different DataFrame, with one column named after each unique room number.
For example, one expected output of the code would be as follows:
N.B. I want to avoid for loops to save memory and time. What would be the optimal way to get the desired output?
I have tried using for loops and could get the desired output, but I am not sure it is a good idea for a large dataset. Here is the code snippet:
import numpy as np
import pandas as pd

# Each element lists the "room: grade" pairs for one row
d = np.array(['624: COUPLE , 507: DELUXE+ ,301: HONEYMOON',
              '624:FAMILY , 507: FAMILY+',
              '621:FAMILY , 517: FAMILY+',
              '696:FAMILY , 585: FAMILY+,624:FAMILY , 507: DELUXE'])
df = pd.Series(d)
# One row per (room, grade) match, indexed by (original row, match number)
df = df.str.extractall(r'(?P<room>[0-9]+):\s*(?P<grd>[^\s,]+)')
gh = df[df['room'] == '507'].index
# object dtype, because the cells will hold grade strings
rf = pd.DataFrame(index=range(0, 4), columns=['room#507', 'room#624'],
                  dtype='object')
for i in range(0, rf.shape[0]):
    for j in range(0, gh.shape[0]):
        if i == gh[j][0]:
            # .loc avoids chained assignment on rf
            rf.loc[i, 'room#507'] = df.loc[gh[j], 'grd']
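
For comparison, here is a minimal loop-free sketch of what I am after, not necessarily the optimal way. It assumes each room number appears at most once per original row (true for the sample data above); if a room repeated within a row, `unstack` would raise on the duplicate entries. The names `s`, `pairs`, and `wide` are only illustrative.

```python
import pandas as pd

s = pd.Series([
    '624: COUPLE , 507: DELUXE+ ,301: HONEYMOON',
    '624:FAMILY , 507: FAMILY+',
    '621:FAMILY , 517: FAMILY+',
    '696:FAMILY , 585: FAMILY+,624:FAMILY , 507: DELUXE',
])

# One row per (room, grade) match; the index is (original row, match number)
pairs = s.str.extractall(r'(?P<room>[0-9]+):\s*(?P<grd>[^\s,]+)')

# Drop the match number, then move 'room' from a column into the columns axis.
# Assumption: (original row, room) pairs are unique, otherwise unstack fails.
wide = (
    pairs.reset_index(level='match', drop=True)
         .set_index('room', append=True)['grd']
         .unstack('room')
         .reindex(s.index)       # keep rows that had no matches at all
         .add_prefix('room#')    # e.g. '507' becomes 'room#507'
)

print(wide[['room#507', 'room#624']])
```

This builds every `room#...` column in one pass instead of filling one room at a time, and rows without a given room are left as NaN.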