
I have a data set with a uid field that holds a list of integers, and an is_left field that holds a list of booleans aligned element-for-element with uid. I'd like to turn these array records into long records instead, exploding the /2/ aligned columns together.

    name          uid         is_left  colC colD colE ...
record01        [885]          [True]    ..   ..   .. ...
record02        [981]         [False]    ..   ..   .. ...
record03   [713, 981]   [False, True]    ..   ..   .. ...
record04        [713]          [True]    ..   ..   .. ...
record05        [126]          [True]    ..   ..   .. ...

I understand that from pandas 1.3.0 onward I can use this syntax for a multi-column explode:

df.explode(['uid', 'is_left'])

However, I am stuck on a lower version, 1.25.0, and cannot use the simple multi-column explode syntax from 1.3.0. What is the correct way to go from what I have to:

    name          uid         is_left  colC colD colE ...
record01          885            True    ..   ..   .. ...
record02          981           False    ..   ..   .. ...
record03          713           False    ..   ..   .. ...
record03          981            True    ..   ..   .. ...
record04          713            True    ..   ..   .. ...
record05          126            True    ..   ..   .. ...

You can see record03 now has 2 entries, one for (713, False) and one for (981, True), and it is NOT the Cartesian explosion you'd get from applying explode twice:

(713, False)
(713, True)
(981, False)
(981, True)
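
To make the difference concrete, here is a minimal sketch that reproduces the unwanted Cartesian result for record03 by chaining two single-column explodes. The frame is a toy reconstruction of the table above; `colC` and its values are placeholders I made up to stand in for the extra columns.

```python
import pandas as pd

# Toy frame mirroring the question; colC is a hypothetical stand-in
# for the additional columns that should not be exploded.
df = pd.DataFrame({
    "name": ["record01", "record02", "record03", "record04", "record05"],
    "uid": [[885], [981], [713, 981], [713], [126]],
    "is_left": [[True], [False], [False, True], [True], [True]],
    "colC": ["a", "b", "c", "d", "e"],
})

# Exploding one column at a time pairs every uid with every is_left value.
# The index is reset between the two steps so the second explode does not
# have to deal with the duplicate labels left by the first one.
cartesian = (
    df.explode("uid")
      .reset_index(drop=True)
      .explode("is_left")
)

rec03 = cartesian[cartesian["name"] == "record03"]
pairs = list(zip(rec03["uid"], rec03["is_left"]))
print(pairs)
```

For record03 this yields four rows instead of the two aligned pairs I actually want.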

Mittenchops
  • `df = df.set_index('name').apply(pd.Series.explode)` should work – Rodalm Apr 16 '22 at 19:13
  • how is the linked duplicate not answering your question? – mozway Apr 16 '22 at 19:22
  • Thanks, @mozway. One suggested upgrading to 1.3, which is not possible for me, and another only applied in the case where /the only columns that existed in the data frame/ were exploded, so not including additional columns that are not to be exploded as I have in colC, colD, etc. Were you able to get their solution to run on this shape? – Mittenchops Apr 16 '22 at 19:30
  • Oh and another /explicitly/ did the Cartesian product that I am not looking for, rather than pairing each record in uid with the same record in the is_left array. – Mittenchops Apr 16 '22 at 19:31
  • Thanks, @Rodalm. I might be misunderstanding, but I don't see how your suggestion limits this only to the 2 paired columns and keeps the lists aligned. – Mittenchops Apr 16 '22 at 19:33
  • @Mittenchops you have to be clearer. Why doesn't my suggestion work? What do you mean by "limits this only to the 2 paired columns"? It keeps the lists aligned. Actually, there is no need for the `set_index`; I meant just `df.apply(pd.Series.explode)`. What is the output of this code, and what output were you expecting? – Rodalm Apr 16 '22 at 19:48
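
The `apply(pd.Series.explode)` idea from the comments can be sketched as below for a frame that also has columns which should not be exploded. This is only an illustration under some assumptions: the two lists in each row have equal length, the remaining columns hold hashable scalar values (so they can be moved into the index), and `colC` with its values is a made-up placeholder for the real extra columns. `Series.explode` is available in pandas versions older than 1.3.0, so no multi-column `DataFrame.explode` is needed.

```python
import pandas as pd

# Toy frame mirroring the question; colC is a hypothetical stand-in
# for the additional columns that should not be exploded.
df = pd.DataFrame({
    "name": ["record01", "record02", "record03", "record04", "record05"],
    "uid": [[885], [981], [713, 981], [713], [126]],
    "is_left": [[True], [False], [False, True], [True], [True]],
    "colC": ["a", "b", "c", "d", "e"],
})

list_cols = ["uid", "is_left"]
# Every column that is NOT a list column is moved into the index,
# so that apply() only ever sees the list columns.
keep_cols = [c for c in df.columns if c not in list_cols]

aligned = (
    df.set_index(keep_cols)
      .apply(pd.Series.explode)   # explode each list column on its own
      .reset_index()              # bring the untouched columns back
)
# Restore the original column order.
aligned = aligned[df.columns.tolist()]

rec03 = aligned[aligned["name"] == "record03"]
aligned_pairs = list(zip(rec03["uid"], rec03["is_left"]))
print(aligned)
```

Each list column is exploded separately, and because both produce the same repeated index, the values line up row by row rather than forming a Cartesian product. The exploded columns come back with object dtype, so a follow-up `astype` may be wanted.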
