List of list of strings in pandas dataframe change to long form

Question

I could not reproduce my example exactly; please imagine that every element of each list is wrapped in quotes and is a string. I'm not sure whether that affects the answer, but I wanted to mention it.

Given:

lis1= [['apples'],['bananas','oranges','cinnamon'],['pears','juice']]
lis2= [['john'],['stacy'],['ron']]
lis3= [['2015-11-24'], ['2014-02-23','2014-03-25', '2014-03-29'],['2018-02-01','2018-03-27']]
lis4= [['smells good'],['saweet','sour as hell','spicey is goody'],['it bites back','so good']]


df = pd.DataFrame({'fruits':lis1,'users':lis2, 'date': lis3, 'review': lis4})

I need:

lis1= ['apples','bananas','oranges','cinnamon','pears','juice']
lis2= ['john','stacy', 'stacy','stacy','ron','ron']
lis3= ['2015-11-24', '2014-02-23', '2014-03-25', '2014-03-29','2018-02-01', '2018-03-27']
lis4= ['smells good','saweet','sour as hell','spicey is goody','it bites back','so good']


pd.DataFrame({'fruits':lis1,'users':lis2, 'date': lis3, 'review': lis4})

I've tried to adapt an itertools example but can't figure out how to do this with 4 columns.

jezrael · Answer 1 · 2018-09-20T13:08:35.153

I think you need product, with flattening in a list comprehension:

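import ast
from itertools import product

import pandas as pd

# If the cells are string representations of lists, convert them to real lists first:
# df = df.applymap(ast.literal_eval)

# Take the Cartesian product of the lists in each row, then flatten all results into rows
df1 = pd.DataFrame([j for i in df.values for j in product(*i)], columns=df.columns)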

print (df1)
      fruits  users        date           review
0     apples   john  2015-11-24      smells good
1    bananas  stacy  2014-02-23           saweet
2    bananas  stacy  2014-02-23     sour as hell
3    bananas  stacy  2014-02-23  spicey is goody
4    bananas  stacy  2014-03-25           saweet
5    bananas  stacy  2014-03-25     sour as hell
6    bananas  stacy  2014-03-25  spicey is goody
7    bananas  stacy  2014-03-29           saweet
8    bananas  stacy  2014-03-29     sour as hell
9    bananas  stacy  2014-03-29  spicey is goody
10   oranges  stacy  2014-02-23           saweet
11   oranges  stacy  2014-02-23     sour as hell
12   oranges  stacy  2014-02-23  spicey is goody
13   oranges  stacy  2014-03-25           saweet
14   oranges  stacy  2014-03-25     sour as hell
15   oranges  stacy  2014-03-25  spicey is goody
16   oranges  stacy  2014-03-29           saweet
17   oranges  stacy  2014-03-29     sour as hell
18   oranges  stacy  2014-03-29  spicey is goody
19  cinnamon  stacy  2014-02-23           saweet
20  cinnamon  stacy  2014-02-23     sour as hell
21  cinnamon  stacy  2014-02-23  spicey is goody
22  cinnamon  stacy  2014-03-25           saweet
23  cinnamon  stacy  2014-03-25     sour as hell
24  cinnamon  stacy  2014-03-25  spicey is goody
25  cinnamon  stacy  2014-03-29           saweet
26  cinnamon  stacy  2014-03-29     sour as hell
27  cinnamon  stacy  2014-03-29  spicey is goody
28     pears    ron  2018-02-01    it bites back
29     pears    ron  2018-02-01          so good
30     pears    ron  2018-03-27    it bites back
31     pears    ron  2018-03-27          so good
32     juice    ron  2018-02-01    it bites back
33     juice    ron  2018-02-01          so good
34     juice    ron  2018-03-27    it bites back
35     juice    ron  2018-03-27          so good
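
The row count above follows from how itertools.product works: each input row contributes one output row per combination of its list elements, so the number of output rows is the product of the list lengths. A small sketch using only the second row of the example data (the variable names are illustrative) shows where its 27 rows come from; across all three input rows this gives 1 + 27 + 8 = 36 rows.

```python
from itertools import product
from math import prod

# Second row of the example frame: 3 fruits, 1 user, 3 dates, 3 reviews
row = [
    ['bananas', 'oranges', 'cinnamon'],
    ['stacy'],
    ['2014-02-23', '2014-03-25', '2014-03-29'],
    ['saweet', 'sour as hell', 'spicey is goody'],
]

# product(*row) yields every combination of one element from each list
combos = list(product(*row))

# The number of combinations equals the product of the list lengths: 3 * 1 * 3 * 3
expected = prod(len(cell) for cell in row)

print(len(combos), expected)  # 27 27
```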

There are too many duplicates here. In the example I provided, if a row has 3 fruit names, then it has 3 reviews of them, but only one user name. So the question is really about how to unpack these elements from the lists while repeating the user name to match the number of fruits in that row. - D500, Sep 20 '18 at 14:04
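
One possible way to get the position-by-position alignment described in this comment is to zip the lists within each row instead of taking their product, repeating any single-element list so it matches the longest list in that row. This is only a minimal sketch, not part of the answer above: it assumes every list in a row has either length 1 or the same common length, and the helper name explode_row is hypothetical.

```python
import pandas as pd

lis1 = [['apples'], ['bananas', 'oranges', 'cinnamon'], ['pears', 'juice']]
lis2 = [['john'], ['stacy'], ['ron']]
lis3 = [['2015-11-24'], ['2014-02-23', '2014-03-25', '2014-03-29'],
        ['2018-02-01', '2018-03-27']]
lis4 = [['smells good'], ['saweet', 'sour as hell', 'spicey is goody'],
        ['it bites back', 'so good']]

df = pd.DataFrame({'fruits': lis1, 'users': lis2, 'date': lis3, 'review': lis4})


def explode_row(row):
    """Hypothetical helper: return one tuple per list position in the row."""
    n = max(len(cell) for cell in row)
    # Assumption: each list has length 1 or n; any other length is ambiguous
    if any(len(cell) not in (1, n) for cell in row):
        raise ValueError('lists in a row must have length 1 or a common length')
    # Repeat single-element lists n times so every list has length n
    padded = [cell * n if len(cell) == 1 else cell for cell in row]
    # zip pairs elements by position, unlike product, which pairs every combination
    return zip(*padded)


long_df = pd.DataFrame(
    [t for row in df.values for t in explode_row(row)],
    columns=df.columns,
)

print(long_df)
```

With the example data this should give six rows, one per fruit, with the user name repeated for each fruit in its original row.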
