I am currently working on a pandas DataFrame. I am reformatting the data so that it is easier to understand when running analysis. The default data in the column is a string that looks like this: something | something. An example is Accident | repairable-damage.
I want to create two new columns in the dataframe that split the string into 2 different strings and assign different parts of the split string to different columns.
Incident_Category |
------------------------------
Accident | repairable-damage
Accident | repairable-damage
Accident | hull-loss
This is what the expected output is:
Incident_Category            | Incident_Type | Incident_Damage   |
----------------------------------------------------------------
Accident | repairable-damage | Accident      | repairable-damage
Accident | repairable-damage | Accident      | repairable-damage
Accident | hull-loss         | Accident      | hull-loss
This is the code that I currently have:
print(dropped_dataset['Incident_Category'].unique())
dropped_dataset['Incident_type_array'] = dropped_dataset['Incident_Category'].str.split("|")
dropped_dataset['Incident_type'] = dropped_dataset['Incident_type_array'][0][0]
dropped_dataset['Incident_damage'] = dropped_dataset['Incident_type_array'][[1]]
dropped_dataset.head(7)
It is currently grabbing the first record and assigning the first row's details to every row of the new columns.
I want each row's Incident_Category to be split and the parts assigned to that same row.
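For reference, here is a minimal sketch of the per-row behaviour I am after. It assumes the column always contains exactly one "|" separator, and the sample rows are made up to mirror the table above. As far as I can tell, `dropped_dataset['Incident_type_array'][0][0]` picks a single scalar from the first row, which pandas then broadcasts to every row; splitting with `expand=True` instead keeps one result per row. This is only one possible approach, not necessarily the best one.

```python
import pandas as pd

# Hypothetical sample data mirroring the table in the question
dropped_dataset = pd.DataFrame({
    "Incident_Category": [
        "Accident | repairable-damage",
        "Accident | repairable-damage",
        "Accident | hull-loss",
    ]
})

# Split every row on the first "|" only; expand=True returns a DataFrame
# with one column per part instead of a single column of lists
parts = dropped_dataset["Incident_Category"].str.split("|", n=1, expand=True)

# Assign each part row by row, trimming the spaces around the separator
dropped_dataset["Incident_Type"] = parts[0].str.strip()
dropped_dataset["Incident_Damage"] = parts[1].str.strip()

print(dropped_dataset.head(7))
```

If some rows had no "|" at all, the second part would come back missing for those rows, so that case would need separate handling.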