I have a dataframe, `df_selangor`, in which some values in `Postcode` contain more than one postcode. I am trying to split each row that has multiple postcodes into separate rows and then append them back to the dataframe.

I already have the index of the rows that contain multiple postcodes, using the following code:

index_list = df_selangor[df_selangor['Postcode'].str.contains(' ')].index

This allows me to create a new dataframe and then split the values in `Postcode` like this:

df_selangor_split = df_selangor.copy()
df_selangor_split = df_selangor_split[df_selangor_split.index.isin(index_list)]
df_selangor_split['Postcode'] = df_selangor_split['Postcode'].str.split()

However, I am stuck after this step. I am not sure how to split the rows again so that each row has only one postcode, with the `Area` value copied to each of them.

Fahmieyz
  • This may be a possible duplicate of https://stackoverflow.com/a/15954301/7067946. Hope it solves your problem – Arihant Oct 02 '18 at 04:07

1 Answer

import pandas as pd
# One Series per row: index = each postcode, value = that row's Area
pd.concat([pd.Series(row['Area'], row['Postcode'].split(','))
           for _, row in dfx.iterrows()])

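Basically, we iterate over each row, split that row's `Postcode` value, and build a Series with one entry per postcode, each carrying the row's `Area`. The Series are then concatenated. The code splits on ','; if the postcodes are separated by spaces, as in the question, use `.split()` instead.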
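To make the shape of the result concrete, here is a minimal sketch of the same approach on made-up data. The area names and postcodes are hypothetical (the real dataframe is not shown in the post), and the comma separator is assumed to match the answer's code.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the post.
dfx = pd.DataFrame({
    "Area": ["Area A", "Area B"],
    "Postcode": ["40000,40100", "43000"],
})

# One Series per row: index = each postcode, value = the row's Area.
# Passing a scalar with an index repeats the scalar for every label.
result = pd.concat(
    [pd.Series(row["Area"], index=row["Postcode"].split(","))
     for _, row in dfx.iterrows()]
)

print(result)
```

Note that the result is a Series, not a DataFrame: the postcodes end up in the index and the areas are the values.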

Pang
Ajay Shah
  • This code solves the problem, but it kind of messes with my DataFrame layout. Thanks – Fahmieyz Oct 02 '18 at 04:25
  • Kind of messes? Can you explain? Also, you can assign the new dataframe to a different variable. – Ajay Shah Oct 02 '18 at 04:40
  • Well, the code puts the `Postcode` values in the index, so when I reset the index of the dataframe, `Postcode` becomes the first column. It is OK, I just need to rearrange the columns back. No biggie – Fahmieyz Oct 02 '18 at 10:36
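The layout issue described in the comments can be handled in a couple of ways. The sketch below is one possibility, not the answerer's code: the sample data is hypothetical and space-separated as in the question. Option 1 keeps the approach from the answer and moves the index back into a column. Option 2 assumes pandas 0.25 or later, where `DataFrame.explode` is available.

```python
import pandas as pd

# Hypothetical sample data, space-separated as in the question.
df_selangor = pd.DataFrame({
    "Area": ["Area A", "Area B"],
    "Postcode": ["40000 40100", "43000"],
})

# Option 1: the answer's approach, then restore the original layout.
as_series = pd.concat(
    [pd.Series(row["Area"], index=row["Postcode"].split())
     for _, row in df_selangor.iterrows()]
)
restored = (
    as_series.rename_axis("Postcode")   # give the index a name
             .reset_index(name="Area")  # index -> column, values -> "Area"
             [["Area", "Postcode"]]     # put the columns back in order
)

# Option 2 (assumes pandas >= 0.25): split into lists, then explode.
exploded = (
    df_selangor.assign(Postcode=df_selangor["Postcode"].str.split())
               .explode("Postcode")     # one row per list element
               .reset_index(drop=True)  # drop the repeated original index
)

print(restored)
print(exploded)
```

Both options should give one row per postcode with `Area` first and `Postcode` second. Option 2 also keeps any other columns the dataframe has, which the Series-based approach does not.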