User Defined function for converting Dataframe Format?

Question

How to modify a data frame which looks like this:

col_name           No. Missing   row_number
ST_NUM                  2          2,4
ST_NAME                 0      
OWN_OCCUPIED            3         1,3,10
NUM_BEDROOMS            2         1,4

want to convert the above data frame as:

col_name           row_number
ST_NUM               2
ST_NUM               4
OWN_OCCUPIED         1
OWN_OCCUPIED         3
OWN_OCCUPIED         10
NUM_BEDROOMS         1
NUM_BEDROOMS         4

score 0 · Answer 1 · answered Oct 19 '19 at 06:25

Based on this answer, you could try this :

lengths = [len(item) for item in df['row_number']]

result=pd.DataFrame( {"col_name" : np.repeat(df['col_name'].values,lengths),
                    "No_Missing" : np.repeat(df['No_Missing'].values,lengths),
                        "row_number" : np.hstack(df['row_number'])})

print(result)

       col_name  No_Missing  row_number
0        ST_NUM           2         2.0
1        ST_NUM           2         4.0
2  OWN_OCCUPIED           3         1.0
3  OWN_OCCUPIED           3         3.0
4  OWN_OCCUPIED           3        10.0
5  NUM_BEDROOMS           2         1.0
6  NUM_BEDROOMS           2         4.0

score 0 · Accepted Answer · answered Oct 19 '19 at 06:27

0

Assuming you're using pandas 0.25+, you can simply write:

df.row_number = df.row_number.str.split(",")
df.explode("row_number")[["col_name", "row_number"]]

You will then need to filter out rows corresponding to empty row_numbers, though how you do that will depend on how you represented those in the first place ("", nan, etc).

answered Oct 19 '19 at 06:27

Uri Granta

1,814
14
25

explode is not available in python3 – Sam Vamsi Oct 20 '19 at 06:12
It is but it requires a recent version of pandas. If you installed pandas with pip you can run `pip install --upgrade pandas` – Uri Granta Oct 20 '19 at 08:06

User Defined function for converting Dataframe Format?

2 Answers2