Array in a dataframe to n dataframe rows

Question

I have a 4-column dataframe:

PERSON_ID [int - key]
PERSON_NAME [varchar]
PERSON_POINTS [numpy 1d array of int]
PERSON_DISTANCES [numpy 1d array of int]

As you can see, the granularity is at the person level. I want to create a new dataframe with a finer granularity, Person_Points, also with 4 columns:

PERSON_ID [int - key]
PERSON_NAME [varchar]
PERSON_POINT [int - key]
PERSON_DISTANCE [int]

The new dataframe should have as many rows per person as there are elements in that person's 1d numpy arrays.

Example of the PERSON dataframe:

PERSON_ID PERSON_NAME PERSON_POINTS PERSON_DISTANCES
1         A           [1 2 3]       [2 4 6]
2         B           [4 5 6 7]     [2 4 6 8]
3         C           [8]           [6]
4         D           [9 10]        [4 8]

Example of the PERSON_Points dataframe:

PERSON_ID PERSON_NAME PERSON_POINT PERSON_DISTANCE
1         A           1             2
1         A           2             4
1         A           3             6
2         B           4             2
2         B           5             4
2         B           6             6
2         B           7             8
3         C           8             6
4         D           9             4
4         D           10            8

As my PERSON dataframe is very large, I am not sure of the best way to do this, and I can't find many examples. Can someone help? Thanks in advance.

score 0 · Accepted Answer · answered Dec 12 '22 at 17:52

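You can use explode, as Bruno pointed out. Pass both array columns in a single call so that each point stays paired with its distance; exploding the two columns one after the other would instead produce every point/distance combination for each person.

# new df with one row per element of PERSON_POINTS / PERSON_DISTANCES
# (multi-column explode needs pandas >= 1.3; both arrays in a row must have the same length)
df = df.explode(['PERSON_POINTS', 'PERSON_DISTANCES'])

print(df)

The above is a short example of how to do it.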
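For reference, here is a minimal end-to-end sketch using the sample data from the question. It assumes pandas 1.3 or later (needed for exploding several columns at once) and that the two arrays in each row have the same length. The variable names `person` and `person_points` are only illustrative, and the rename and integer cast are optional steps added to match the target layout in the question.

```python
import numpy as np
import pandas as pd

# Sample PERSON dataframe from the question; each array cell holds a 1-D numpy array.
person = pd.DataFrame({
    "PERSON_ID": [1, 2, 3, 4],
    "PERSON_NAME": ["A", "B", "C", "D"],
    "PERSON_POINTS": [np.array([1, 2, 3]), np.array([4, 5, 6, 7]),
                      np.array([8]), np.array([9, 10])],
    "PERSON_DISTANCES": [np.array([2, 4, 6]), np.array([2, 4, 6, 8]),
                         np.array([6]), np.array([4, 8])],
})

person_points = (
    # Explode both columns together so the i-th point stays with the i-th distance.
    person.explode(["PERSON_POINTS", "PERSON_DISTANCES"], ignore_index=True)
    # Singular column names, as in the target layout.
    .rename(columns={"PERSON_POINTS": "PERSON_POINT",
                     "PERSON_DISTANCES": "PERSON_DISTANCE"})
    # explode leaves object dtype behind; cast back to integers.
    .astype({"PERSON_POINT": "int64", "PERSON_DISTANCE": "int64"})
)

print(person_points)
```

If the arrays in a row can differ in length, the multi-column explode raises a ValueError, so that case would need to be cleaned up beforehand.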

Kartik Shandilya
