My question is similar to this one:

pandas: When cell contents are lists, create a row for each element in the list

I want to create one row for each element of the list in samples, but each list is currently stored as a single string, something like:

                           samples  subject  trial_num
0   '[string1, string21, string3]'        1          1
1   '[string3, string24, string3]'        1          2
2   '[string4, string24, string4]'        1          3
3   '[string5, string24, string5]'        2          1
4  '[string13, string24, string6]'        2          2
5  '[string16, string24, string6]'        2          3

Thank you

user3620915

1 Answer

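from ast import literal_eval

import numpy as np

# Parse each quoted string into a real list, then repeat every row once per
# list element and replace the list column with the flattened values.
df.assign(samples=df.samples.str.strip("'").apply(literal_eval)).pipe(
    lambda d: d.loc[d.index.repeat(d.samples.str.len())].assign(
        samples=np.concatenate(d.samples)
    )
)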

   samples  subject  trial_num
0     0.57        1          1
0    -0.83        1          1
0     1.44        1          1
1    -0.01        1          2
1     1.13        1          2
1     0.36        1          2
2     1.18        1          3
2    -1.46        1          3
2    -0.94        1          3
3    -0.08        2          1
3    -4.22        2          1
3    -2.05        2          1
4     0.72        2          2
4     0.79        2          2
4     0.53        2          2
5     0.40        2          3
5    -0.32        2          3
5    -0.13        2          3
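
For anyone who wants to run the chain above end to end, here is a minimal sketch with each step split out. The sample values are made up for illustration (they are not the question's data), and it assumes every cell is a string holding a valid Python list literal wrapped in an extra pair of single quotes.

```python
from ast import literal_eval

import numpy as np
import pandas as pd

# Hypothetical sample data: each cell is a quoted string holding a list literal.
df = pd.DataFrame({
    "samples": ["'[0.5, -1.0]'", "'[2.0, 3.5, 4.0]'"],
    "subject": [1, 1],
    "trial_num": [1, 2],
})

# Step 1: drop the outer quotes and parse each string into a real list.
parsed = df.assign(samples=df["samples"].str.strip("'").apply(literal_eval))

# Step 2: repeat each row once per list element (index labels repeat too).
repeated = parsed.loc[parsed.index.repeat(parsed["samples"].str.len())]

# Step 3: overwrite the list column with the flattened values, in row order.
result = repeated.assign(samples=np.concatenate(parsed["samples"].tolist()))
print(result)
```

The lists do not need to have the same length, since each row is repeated by its own element count.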
piRSquared
  • I've just edited the question to put strings in the samples column (that is my actual case). With your solution I get a "malformed string" error. – user3620915 Sep 21 '17 at 10:57
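
The "malformed string" error is expected with the edited data: literal_eval only parses valid Python literals, and [string1, string21, string3] contains bare names rather than quoted strings. One possible workaround, sketched below, is to split the text directly instead of evaluating it. It assumes the elements are separated by ", " and that no element contains a comma, and it uses DataFrame.explode, which needs pandas 0.25 or later (newer than this thread). On older versions the repeat/concatenate pattern from the answer should work on the split lists as well.

```python
import pandas as pd

# Sample data mirroring the question: list-like strings with unquoted elements.
df = pd.DataFrame({
    "samples": ["'[string1, string21, string3]'", "'[string3, string24, string3]'"],
    "subject": [1, 1],
    "trial_num": [1, 2],
})

# Strip the surrounding quotes and brackets, then split on the ", " separator.
# Assumption: no element contains a comma itself.
as_lists = df["samples"].str.strip("'[]").str.split(", ")

# explode emits one row per list element and repeats the other columns.
result = df.assign(samples=as_lists).explode("samples")
print(result)
```

As in the answer's output, the original index labels are repeated; call reset_index(drop=True) on the result if a fresh index is preferred.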