Suppose I have a pandas DataFrame raw_corpus with columns 'unique_ID' and 'tokenized_recipes', as follows:
unique_ID tokenized_recipes
0 11530 ['photo', 'video', '500px', 'new', 'photo', 'from', 'anyone', 'tagged', 'with', 'phrase', 'change', 'new', 'tab', 'background', 'google', 'chrome', 'other']
1 17176 ['environment', 'control', 'monitoring', 'nest', 'protect', 'smoke', 'alarm', 'warning', 'activate', 'shortcut', 'wink', 'shortcuts', 'smart', 'hubs', 'systems']
2 6984 ['security', 'monitoring', 'systems', 'dlink', 'motion', 'sensor', 'motion', 'detected', 'post', 'to', 'channel', 'slack', 'communication']
I would like to reshape this data so that each token gets its own row, and then write it to a tab-delimited file so it looks like this:
unique_ID tokenized_recipes
11530 'photo'
11530 'video'
11530 '500px'
11530 'new'
...
17176 'environment'
17176 'control'
...
I tried two of the solutions from the question linked above, which has 11 responses. I reordered the columns of my DataFrame to match the order those solutions expect.
Note that my 'tokenized_recipes' column already holds lists, not strings.
The more complicated, generic solution fails with an error saying that I have a zero-dimensional array.
I then tried to explode the DataFrame id_token with the code below, but it raises NameError: name 'Series' is not defined.
#now explode the dataframe id_token string entry to separate rows
pd.concat([Series(row['unique_ID'],
row['tokenized_recipes'].split(','))
for _, row in id_token.iterrows()]).reset_index()
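
For reference, here is a minimal sketch of one way the reshaping described above could be done. It assumes that 'tokenized_recipes' really holds Python lists (as stated) and that the installed pandas provides DataFrame.explode (version 0.25 or later). The small raw_corpus built here is a shortened stand-in for the real data, and the names long_form and tsv_text are made up for illustration. As for the snippet above: the NameError arises because Series is referenced without the pd. prefix (pd.Series), and .split(',') is a string method, so it would not work on a column that already contains lists.

```python
import pandas as pd

# Shortened stand-in for raw_corpus (illustrative data only).
raw_corpus = pd.DataFrame({
    "unique_ID": [11530, 17176],
    "tokenized_recipes": [
        ["photo", "video", "500px"],
        ["environment", "control"],
    ],
})

# One row per token; unique_ID is repeated for every element of its list.
long_form = raw_corpus.explode("tokenized_recipes").reset_index(drop=True)

# With no path argument, to_csv returns the text instead of writing a file.
# Pass a filename as the first argument to write the tab-delimited file.
tsv_text = long_form.to_csv(sep="\t", index=False)
print(tsv_text)
```

If the column turns out to contain strings that merely look like lists, they would need to be parsed into real lists before this approach applies.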