Is there a nice way to do the below?

This is exactly the same question as here: Split pandas dataframe string entry to separate rows

But that post is pretty old, and I am wondering whether there is a better method using newer pandas features.

I have managed to reproduce that approach with my data, as shown below, but I am not sure how to incorporate more than 2 columns. In other words, var3 should be treated the same way as var2 and replicated across the rows.

I sort of get the logic of row[val]:

row['var2'], row['var3'], row['var1'].split(',')
produces:
(99999, 1403298300, [u'08241', u'08215', u'08217'])

But I am still not sure how to extend this to more than 2 columns.

Out[104]:
                                  var1   var2        var3
0                          47429,47404  10700  1403298300
1  23030,23831,23147,23836,23860,23875  99999  1403297100
2  72930,72951,72832,72820,72949,72821  10200  1403298300
3              56522,58030,56583,56565  99999  1403295900
4        59824,59831,59821,59863,59865  99999  1403294700


pd.concat([pd.Series(row['var2'], row['var1'].split(','))\
    for _, row in testdf.iterrows()]).reset_index()[:5]

 index      0
0  47429  10700
1  47404  10700
2  23030  99999
3  23831  99999
4  23147  99999
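
For reference, here is a rough sketch of how the same row-by-row idea might be extended to carry more than one column along. It builds a small DataFrame per row instead of a Series, so each scalar (var2, var3) is repeated for every split value. The two-row testdf and the keep_cols list are illustrative assumptions, not the real data.

```python
import pandas as pd

# First two rows of the frame shown above, rebuilt by hand for illustration.
testdf = pd.DataFrame({
    "var1": ["47429,47404", "23030,23831,23147,23836,23860,23875"],
    "var2": [10700, 99999],
    "var3": [1403298300, 1403297100],
})

# Hypothetical list of the columns to replicate across the split rows.
keep_cols = ["var2", "var3"]

pieces = []
for _, row in testdf.iterrows():
    values = row["var1"].split(",")
    # One column holds the split values; the scalars are broadcast to match.
    data = {"var1": values}
    for col in keep_cols:
        data[col] = row[col]
    pieces.append(pd.DataFrame(data, index=range(len(values))))

# Stack the per-row frames and renumber the rows from 0.
result = pd.concat(pieces, ignore_index=True)
print(result.head())
```

This keeps the iterrows() loop, so it will probably be slow on large frames.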

Example provided by the older post:

In [7]: a

Out[7]: 
    var1  var2
0  a,b,c     1
1  d,e,f     2

In [8]: b

Out[8]: 
  var1  var2
0    a     1
1    b     1
2    c     1
3    d     2
4    e     2
5    f     2
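
One possible route with newer pandas, sketched on the small example above: split the strings into lists with Series.str.split, then use DataFrame.explode to give each list element its own row. This assumes pandas 0.25 or later, where explode was added. I have not checked it against my full data.

```python
import pandas as pd

# The small frame `a` from the older post.
a = pd.DataFrame({"var1": ["a,b,c", "d,e,f"], "var2": [1, 2]})

# Turn each comma-separated string into a list, then expand every list
# element into its own row. Requires pandas 0.25 or later.
b = (
    a.assign(var1=a["var1"].str.split(","))
     .explode("var1")
     .reset_index(drop=True)  # explode repeats the original index labels
)
print(b)
```

Because explode repeats every other column for each new row, additional columns such as var3 should be carried along without extra code.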
horatio1701d