I have extracted data (with the help of Stack :) ) from a really clunky text file and converted the lists to a pandas DataFrame. Any suggestions on how I can unpack df['Wnd']? The output is currently formatted as shown below.

                       Regions  \
0                   EAST COAST   
1              NORTHEAST COAST   
2             FUNK ISLAND BANK   
3         NORTHERN GRAND BANKS   
4                  SOUTH COAST   
...                        ...   
2390          FUNK ISLAND BANK   
2391      NORTHERN GRAND BANKS   
2392               SOUTH COAST   
2393  SOUTHEASTERN GRAND BANKS   
2394  SOUTHWESTERN GRAND BANKS   

                                                    Wnd  
0     [01/06Z NW25,  01/15Z SW15,  02/00Z SE25,  02/...  
1     [01/06Z NW25,  01/15Z S15,  02/00Z SE25,  02/1...  
2     [01/06Z NW25-35,  01/12Z W25,  02/00Z SW15,  0...  
3     [01/06Z NW25-35,  01/12Z NW15-20,  02/00Z SE20...  
4     [01/06Z W15,  01/18Z S20,  02/00Z SE35-45,  02...  
...                                                 ...  
2390  [30/06Z N45,  31/03Z N35,  31/12Z N25,  01/00Z...  
2391  [30/06Z NW35-45,  30/12Z NW25-35,  31/04Z N35 ...  
2392  [30/06Z NW25,  30/18Z N30,  31/06Z N20,  31/15...  
2393  [30/06Z W25 XCPT W35 OVER NORTHERN SECTIONS,  ...  
2394  [30/06Z NW15-20,  31/00Z N25,  31/15Z N15-20, ... 

My desired output is df['Wnd'] unpacked so that each element of each list gets its own row, paired with its region. For example, all elements in the first row of df['Wnd'] are assigned to the region EAST COAST:

                       Regions  \
0                   EAST COAST   
1                   EAST COAST    
2                   EAST COAST   
...                        ...

              Wnd                                             
0     01/06Z NW25  
1     01/15Z SW15  
2     02/00Z SE25 
...           ...

I will have to split the dates, directions and wind speeds later, but my main issue is unpacking/formatting the elements properly first.

1 Answer

It's a bit unclear what you mean by unpacking, since this could be done in many ways. If you simply want each list element to become its own row, you can do this:

from random import randint, choice
import pandas as pd

df = pd.DataFrame({
    'Wnd': [[randint(0, 10) for _ in range(randint(1,5))] for _ in range(1000)],
    'col2': [choice('abcdefg') for _ in range(1000)]
})

print(df.head())

               Wnd col2
0        [6, 0, 5]    c
1           [8, 2]    g
2  [1, 6, 2, 2, 8]    e
3   [10, 1, 10, 9]    a
4       [10, 4, 2]    g
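
# The important bit: explode() gives every list element its own row
# and repeats the original index and the other columns.
# (The output below comes from a separate run, so the random values differ from head() above.)
df.explode('Wnd')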

    Wnd col2
0     2    b
0     7    b
0     2    b
0     2    b
1     4    a
..   ..  ...
998   0    a
998   1    a
999   6    g
999   0    g
999   4    g
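
Applied to the layout in the question, the same idea might look like the sketch below. The two sample rows are made up to mimic the printed output, and it assumes each cell of Wnd is a real Python list of strings (not one long string) and that entries follow the `DD/HHZ <direction><speed>` pattern. The regex and the column names day, hour, direction and speed are illustrative assumptions, not something taken from the original file.

```python
import pandas as pd

# Hypothetical sample mimicking the question's layout.
df = pd.DataFrame({
    "Regions": ["EAST COAST", "NORTHEAST COAST"],
    "Wnd": [
        ["01/06Z NW25", " 01/15Z SW15", " 02/00Z SE25"],
        ["01/06Z NW25-35", " 01/12Z W25"],
    ],
})

# One row per list element; the region is repeated for each element.
# reset_index(drop=True) replaces the repeated index with 0..n-1.
long_df = df.explode("Wnd").reset_index(drop=True)

# Strip stray whitespace that may be left over from the original parsing.
long_df["Wnd"] = long_df["Wnd"].str.strip()

# Optional later step: split each entry into named parts.
# Assumed pattern: day/hour, then letters for direction, then a speed or speed range.
pattern = r"^(?P<day>\d{2})/(?P<hour>\d{2})Z\s+(?P<direction>[A-Z]+)(?P<speed>\d+(?:-\d+)?)"
parts = long_df["Wnd"].str.extract(pattern)
result = pd.concat([long_df, parts], axis=1)

print(result)
```

Entries with extra wording, such as "W25 XCPT W35 OVER NORTHERN SECTIONS", would only have their leading direction and speed captured by this pattern, so the original Wnd column is kept alongside the extracted parts.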