I am trying to split the lists in a column into individual values and map each value to its user.

So far I have been able to get the data frame to look like this.

df1:
   user      weight
0  james     [100.0]
1  brandon   [60.0, 40.0]
2  brandon   [60.0, 40.0]
3  chris     [100.0]
4  james     [80.0, 5.0, 15.0]      
5  james     [80.0, 5.0, 15.0]
6  james.    [80.0, 5.0, 15.0]

I tried the split-and-explode approach, but it creates duplicate rows.

What is the best way to go through the lists and update the data frame so that it matches this desired output?

df2:
0 james    [100.0]
1 brandon  [60.0]
2 brandon  [40.0]
3 chris    [100.0]
4 james    [80.0]
5 james    [5.0]
6 james    [15.0]

Thank you.

bbaskets
  • Do you want the `weight` column to contain lists with a single entry (`[100.0]`) or numerical values (`100.0`)? – FirefoxMetzger Jul 04 '22 at 18:16
  • `weight` is a list or string looks like list? – Corralien Jul 04 '22 at 18:18
  • They are all weights that add up to `100`. If there is only one value, it would be 100. But in this example, all `james` values equal 100. So yes, I think they should be numerical values. – bbaskets Jul 04 '22 at 18:22

1 Answer

You can use `drop_duplicates` and `explode`:

>>> df.drop_duplicates('user').explode('weight')
      user weight
0    james  100.0
1  brandon   60.0
1  brandon   40.0
3    chris  100.0
6   james.   80.0
6   james.    5.0
6   james.   15.0
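
Note that `drop_duplicates('user')` keeps only the first row per user, so a user who appears with two different lists would lose the second one. A minimal sketch of a variant that deduplicates on the `(user, weight)` pair instead is below. It assumes `weight` holds real Python lists and that `james.` in row 6 is a typo for `james`; both are assumptions, not something confirmed in the question. The sample frame is rebuilt by hand for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: "james." is a typo for "james").
df = pd.DataFrame({
    "user": ["james", "brandon", "brandon", "chris", "james", "james", "james"],
    "weight": [
        [100.0],
        [60.0, 40.0],
        [60.0, 40.0],
        [100.0],
        [80.0, 5.0, 15.0],
        [80.0, 5.0, 15.0],
        [80.0, 5.0, 15.0],
    ],
})

# Lists are unhashable, so convert each one to a tuple to get a key
# that duplicated() can compare.
key = df["weight"].apply(tuple)
is_dup = df.assign(weight=key).duplicated()

out = (
    df[~is_dup]              # keep the first row of each (user, weight) pair
    .explode("weight")       # one row per list element
    .reset_index(drop=True)
)
out["weight"] = out["weight"].astype(float)  # explode leaves object dtype

print(out)
```

With these assumptions the result has seven rows, one per individual weight, and `james` keeps both his `100.0` row and his `80.0`, `5.0`, `15.0` rows.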
Corralien