I want to do this:

# input:
        A   B
0  [1, 2]  10
1  [5, 6] -20
# output:
   A   B
0  1  10
1  2  10
2  5 -20
3  6 -20

Every value in column A is a list.

import pandas as pd

df = pd.DataFrame({'A':[[1,2],[5,6]],'B':[10,-20]})
df = pd.DataFrame([[item]+list(df.loc[line,'B':]) for line in df.index for item in df.loc[line,'A']],
                  columns=df.columns)

The code above works, but it is very slow.

Is there a faster or more idiomatic way to do this?

Thank you

Zhang Tong
  • refer to: http://stackoverflow.com/questions/32468402/how-to-explode-a-list-inside-a-dataframe-cell-into-separate-rows – hangc Jul 18 '16 at 06:55
  • With recent pandas use `DataFrame.explode`: `df = pd.DataFrame({'A':[[1,2],[5,6]],'B':[10,-20]}); df.explode('A')` – Maciej Skorski Jan 28 '22 at 16:42

1 Answer

Method 1 (OP)

pd.DataFrame([[item]+list(df.loc[line,'B':]) for line in df.index for item in df.loc[line,'A']],
             columns=df.columns)

Method 2 (pir)

df1 = df.A.apply(pd.Series).stack().rename('A')
df2 = df1.to_frame().reset_index(1, drop=True)
df2.join(df.B).reset_index(drop=True)

Method 3 (pir)

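# Requires every list in A to have the same length.
# pd.Panel only exists in older pandas releases.
A = np.asarray(df.A.values.tolist())
B = np.stack([df.B for _ in range(A.shape[1])]).T
P = np.stack([A, B])
pd.Panel(P, items=['A', 'B']).to_frame().reset_index(drop=True)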
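
On pandas versions that no longer ship pd.Panel, the same NumPy idea can probably be expressed without it. The following is only a sketch, not the original Method 3: it flattens the lists with np.concatenate and repeats each B value once per list item with np.repeat. The names `lengths` and `result` are illustrative, and unlike the Panel version it does not assume equal-length lists.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [[1, 2], [5, 6]], 'B': [10, -20]})

# Number of items in each list of column A
lengths = df['A'].map(len).to_numpy()

result = pd.DataFrame({
    # Flatten all lists into one 1-D array, in row order
    'A': np.concatenate(df['A'].tolist()),
    # Repeat each B value once per item in the matching list
    'B': np.repeat(df['B'].to_numpy(), lengths),
})
print(result)
```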

Thanks @user113531 for the reference to Alexander's answer. I had to modify it to work.

Method 4 (@Alexander) LINKED ANSWER

(Follow link and Up Vote if this was helpful)

rows = []
for i, row in df.iterrows():
    for a in row.A:
        rows.append([a, row.B])

pd.DataFrame(rows, columns=df.columns)
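
As the comment on the question points out, recent pandas versions provide DataFrame.explode, which covers this case directly. A minimal sketch, assuming a pandas version that includes explode; the reset_index call is only there to get the 0..3 index shown in the question's expected output.

```python
import pandas as pd

df = pd.DataFrame({'A': [[1, 2], [5, 6]], 'B': [10, -20]})

# Each list item in A becomes its own row; B is repeated alongside it.
# explode keeps the original row labels, so reset the index afterwards.
result = df.explode('A').reset_index(drop=True)
print(result)
```

Note that the exploded column A comes back with object dtype, so a cast such as astype(int) may be wanted afterwards.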

Timings

Method 4 (Alexander's) is the fastest, followed by Method 3.
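
To check timings on your own machine, something like the following sketch could be used. It wraps Method 1 and Method 4 in functions and times them with the standard-library timeit module. The function names, the 500x replication factor, and the repeat count are arbitrary choices for illustration, and the numbers will vary with hardware and pandas version.

```python
import timeit
import pandas as pd

df = pd.DataFrame({'A': [[1, 2], [5, 6]], 'B': [10, -20]})

def method_1(df):
    # OP's list comprehension with repeated .loc lookups
    return pd.DataFrame(
        [[item] + list(df.loc[line, 'B':])
         for line in df.index for item in df.loc[line, 'A']],
        columns=df.columns)

def method_4(df):
    # Alexander's iterrows loop
    rows = []
    for _, row in df.iterrows():
        for a in row.A:
            rows.append([a, row.B])
    return pd.DataFrame(rows, columns=df.columns)

# Larger frame so the differences are measurable (size is arbitrary)
big = pd.concat([df] * 500, ignore_index=True)

timings = {}
for name, func in [('method_1', method_1), ('method_4', method_4)]:
    timings[name] = timeit.timeit(lambda: func(big), number=3)
    print(name, timings[name])
```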

piRSquared