0

How to modify a data frame which looks like this:

col_name           No. Missing   row_number
ST_NUM                  2          2,4
ST_NAME                 0      
OWN_OCCUPIED            3         1,3,10
NUM_BEDROOMS            2         1,4

want to convert the above data frame as:

col_name           row_number
ST_NUM               2
ST_NUM               4
OWN_OCCUPIED         1
OWN_OCCUPIED         3
OWN_OCCUPIED         10
NUM_BEDROOMS         1
NUM_BEDROOMS         4
Uri Granta
  • 1,814
  • 14
  • 25
Sam Vamsi
  • 65
  • 8

2 Answers2

0

Based on this answer, you could try this :

lengths = [len(item) for item in df['row_number']]

result=pd.DataFrame( {"col_name" : np.repeat(df['col_name'].values,lengths),
                    "No_Missing" : np.repeat(df['No_Missing'].values,lengths),
                        "row_number" : np.hstack(df['row_number'])})

print(result)

       col_name  No_Missing  row_number
0        ST_NUM           2         2.0
1        ST_NUM           2         4.0
2  OWN_OCCUPIED           3         1.0
3  OWN_OCCUPIED           3         3.0
4  OWN_OCCUPIED           3        10.0
5  NUM_BEDROOMS           2         1.0
6  NUM_BEDROOMS           2         4.0
Sruthi
  • 2,908
  • 1
  • 11
  • 25
0

Assuming you're using pandas 0.25+, you can simply write:

df.row_number = df.row_number.str.split(",")
df.explode("row_number")[["col_name", "row_number"]]

You will then need to filter out rows corresponding to empty row_numbers, though how you do that will depend on how you represented those in the first place ("", nan, etc).

Uri Granta
  • 1,814
  • 14
  • 25