I have a Pandas DataFrame with two columns, id and image. The ids are duplicated, since one product (id) can have multiple images.
I'm trying to convert this DataFrame into one with a unique id column and an image column whose values are lists of image URLs.
id image
0 10 https://s3-eu-west-1.amazonaws.com/1
1 10 https://s3-eu-west-1.amazonaws.com/2
2 10 https://s3-eu-west-1.amazonaws.com/3
3 20 https://s3-eu-west-1.amazonaws.com/4
4 20 https://s3-eu-west-1.amazonaws.com/5
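To make this reproducible, the sample data above can be built roughly like this (the names `base` and `df` are just my own choices for illustration):

```python
import pandas as pd

# Common URL prefix taken from the sample rows above.
base = "https://s3-eu-west-1.amazonaws.com/"

# Five rows: product 10 has three images, product 20 has two.
df = pd.DataFrame({
    "id": [10, 10, 10, 20, 20],
    "image": [base + str(n) for n in range(1, 6)],
})
print(df)
```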
I want to convert the DataFrame above to this format:
id image
0 10 ['https://s3-eu-west-1.amazonaws.com/1','https://s3-eu-west-1.amazonaws.com/2','https://s3-eu-west-1.amazonaws.com/3']
1 20 ['https://s3-eu-west-1.amazonaws.com/4','https://s3-eu-west-1.amazonaws.com/5']
I could use a loop, but maybe there is a simpler or more efficient way to do this, since DataFrame has a groupby method.
Do you know how to do that?
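One possible direction, as a minimal sketch rather than a definitive answer: group the rows by id and aggregate the image column with the built-in `list`. The variable names `base`, `df`, and `result` are assumptions made for illustration, and the data is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the sample data from the question (names are illustrative).
base = "https://s3-eu-west-1.amazonaws.com/"
df = pd.DataFrame({
    "id": [10, 10, 10, 20, 20],
    "image": [base + str(n) for n in range(1, 6)],
})

# Group rows by id and collect each group's image values into a Python list.
# reset_index() turns the id group keys back into an ordinary column.
result = df.groupby("id")["image"].agg(list).reset_index()
print(result)
```

This should give one row per id, with each image cell holding a list of that product's URLs. Whether it is noticeably faster than a loop would depend on the size of the data.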