
I don't think this is a typical wide-to-long question, because the items I want to turn into long format are nested in list fields.

I have a `uid` field that holds a list of integers, and an `is_left` field that holds a list of booleans corresponding element by element to the `uid` values. I'd like to turn these array records into long records instead.

My data frame looks like this:

    name          uid         is_left  colC colD colE ...
record01        [885]          [True]    ..   ..   .. ...
record02        [981]         [False]    ..   ..   .. ...
record03   [713, 981]   [False, True]    ..   ..   .. ...
record04        [713]          [True]    ..   ..   .. ...
record05        [126]          [True]    ..   ..   .. ...

I'd like to unwind it to:

    name          uid         is_left  colC colD colE ...
record01          885            True    ..   ..   .. ...
record02          981           False    ..   ..   .. ...
record03          713           False    ..   ..   .. ...
record03          981            True    ..   ..   .. ...
record04          713            True    ..   ..   .. ...
record05          126            True    ..   ..   .. ...

You can see record03 now has 2 entries, one for (713, False) and one for (981, True).

Mittenchops

1 Answer


Better Answer:

In pandas 1.3 and later you can use multi-column explode:

df.explode(['uid','is_left'])
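
As a minimal sketch of the multi-column form, assuming a toy frame modelled on the question (the `colC` values and the name `long_df` are made up for illustration, and pandas 1.3 or later is assumed):

```python
import pandas as pd

# Toy frame modelled on the question; colC stands in for the other scalar columns.
df = pd.DataFrame({
    "name": ["record01", "record02", "record03", "record04", "record05"],
    "uid": [[885], [981], [713, 981], [713], [126]],
    "is_left": [[True], [False], [False, True], [True], [True]],
    "colC": ["a", "b", "c", "d", "e"],
})

# Multi-column explode (assumes pandas >= 1.3): the lists are unpacked in
# lockstep, so each uid stays paired with its own is_left value.
# ignore_index=True renumbers the rows instead of repeating the old labels.
long_df = df.explode(["uid", "is_left"], ignore_index=True)
print(long_df)
```

The exploded columns come back with object dtype, so something like `long_df.astype({"uid": "int64", "is_left": "bool"})` may be wanted afterwards.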

For older versions, explode each column individually:

df.apply(pd.Series.explode)
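
For versions before 1.3, one defensive variant is sketched below. It assumes every row's lists have matching lengths, and it moves the scalar columns into the index first so that only the list columns are exploded and they end up with the same repeated index. The names `list_cols`, `scalar_cols`, and `long_df`, and the `colC` values, are illustrative and not from the question.

```python
import pandas as pd

# Toy frame modelled on the question; colC stands in for the other scalar columns.
df = pd.DataFrame({
    "name": ["record01", "record02", "record03", "record04", "record05"],
    "uid": [[885], [981], [713, 981], [713], [126]],
    "is_left": [[True], [False], [False, True], [True], [True]],
    "colC": ["a", "b", "c", "d", "e"],
})

list_cols = ["uid", "is_left"]  # columns holding lists
scalar_cols = [c for c in df.columns if c not in list_cols]

# Park the scalar columns in the index, explode each remaining (list) column
# on its own, then bring the scalar columns back as ordinary columns.
long_df = (
    df.set_index(scalar_cols)
      .apply(pd.Series.explode)
      .reset_index()
)
print(long_df)
```

Because each list column is exploded against the same index, the values line up row by row only when the paired lists have equal lengths.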

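Old Answer (as the comments point out, this produces a cartesian product of the two lists rather than paired rows):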

You can use the explode method:

df.explode("uid").explode("is_left")

`explode` takes the name of the column whose list elements should be converted into new rows.
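
A quick check on a one-row toy frame (an assumption modelled on record03 from the question) suggests that chaining two single-column explodes does not keep the pairs aligned, since the second call unpacks the full `is_left` list once per already-exploded `uid` row:

```python
import pandas as pd

# One-row toy frame modelled on record03 from the question.
df = pd.DataFrame({
    "name": ["record03"],
    "uid": [[713, 981]],
    "is_left": [[False, True]],
})

# First explode: two rows, each still carrying the whole is_left list.
# Second explode: each of those rows is unpacked again, giving 2 x 2 rows.
chained = df.explode("uid").explode("is_left")
pairs = list(zip(chained["uid"], chained["is_left"]))
print(pairs)  # [(713, False), (713, True), (981, False), (981, True)]
```

Only `(713, False)` and `(981, True)` are wanted here, so the chained form yields two extra, mismatched rows.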

C. Braun
  • You should end up with mismatches if you explode twice (`713, False`, `713, True`, `981, False`, and `981, True`). – ifly6 Apr 15 '22 at 18:54
  • This seems to work, actually. This is the right answer. The question this links to uses a list syntax that fails. – Mittenchops Apr 15 '22 at 19:07
  • Oh yeah, I'm seeing the mismatches now. I'd recommend upgrading to pandas 1.3.0 and using the multi-column explode as said in the comments above. – C. Braun Apr 15 '22 at 19:12
  • Right, this does a cartesian product; it does not pair the exploded values. What's the solution if I cannot upgrade as the linked question shows? – Mittenchops Apr 16 '22 at 18:59
  • @Mittenchops see my new answer above - you can `explode` each column individually. – C. Braun Apr 17 '22 at 19:39