I have a column whose values I want to parse and then spread across several other columns. The best I have managed so far uses the apply method:
    def parse_parent_names(row):
        # [1:-1] drops the empty strings produced by the leading and trailing pipes
        return row.person_with_parent_names.split('|')[1:-1]

    train_data['parsed'] = train_data.apply(parse_parent_names, axis=1)
The data is a pandas DataFrame with a column whose values are names separated by a pipe (|):
'person_with_parent_names'
|John|Doe|Bobba|
|Fett|Bobba|
|Abe|Bea|Cosby|
The rightmost name is the person and the leftmost is the "grandest parent". I'd like to transform this into three columns, like:
'grandfather'  'father'  'person'
John           Doe       Bobba
               Fett      Bobba
Abe            Bea       Cosby
But with apply, the best I can get is a single column of lists:
'parsed'
[John, Doe, Bobba]
[Fett, Bobba]
[Abe, Bea, Cosby]
I could call apply three times, once per target column, but reading the entire dataset three times would be inefficient.
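For illustration, here is a minimal sketch of the kind of single-pass result I am after. It rebuilds the sample data above and assumes every row has at most three names. The helper `split_names` and the left-padding with `None` are hypothetical names and choices of mine, not part of pandas.

```python
import pandas as pd

# Sample data copied from the question; `train_data` stands in for the real frame.
train_data = pd.DataFrame({
    "person_with_parent_names": [
        "|John|Doe|Bobba|",
        "|Fett|Bobba|",
        "|Abe|Bea|Cosby|",
    ]
})

columns = ["grandfather", "father", "person"]


def split_names(value):
    """Hypothetical helper: split one pipe-delimited string into three slots."""
    # Strip the leading/trailing pipes so no empty strings are produced.
    names = value.strip("|").split("|")
    # Left-pad with None so the person always lands in the last slot.
    # Assumption: no row has more than three names.
    return [None] * (len(columns) - len(names)) + names


# One pass over the column builds all three target columns at once.
parsed = pd.DataFrame(
    [split_names(v) for v in train_data["person_with_parent_names"]],
    columns=columns,
    index=train_data.index,
)
train_data = train_data.join(parsed)
print(train_data[columns])
```

Padding on the left keeps the rightmost name in 'person' even when the grandfather is missing, which a plain left-aligned split would not do.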