Current data frame: I have a pandas data frame where each employee has a text code(all codes start with T) and an associated frequency right next to the code. All text codes have 8 characters.
+----------+-------------------------------------------------------------+
| emp_id | text |
+----------+-------------------------------------------------------------+
| E0001 | [T0431516,-8,T0401531,-12,T0517519,12] |
| E0002 | [T0701540,-1,T0431516,-2] |
| E0003 | [T0517519,-1,T0421531,-7,T0516319,9,T0500371,-6,T0309711,-3]|
| E0004 | [T0516319,-3] |
| E0005 | [T0431516,2] |
+----------+-------------------------------------------------------------+
Expected data frame: I am trying to make the text codes present in the data frame as individual columns and if an employee has a frequency for that code then populate frequency else 0.
+----------+----------------------------------------------------------------------------------------+
| emp_id | T0431516 | T0401531 | T0517519 | T0701540 | T0421531 | T0516319 | T0500371 | T0309711 |
+----------+----------------------------------------------------------------------------------------+
| E0001 | -8 | -12 | 12 | 0 | 0 | 0 | 0 | 0 |
| E0002 | -2 | 0 | 0 | -1 | 0 | 0 | 0 | 0 |
| E0003 | 0 | 0 | -1 | 0 | -7 | 9 | -6 | -3 |
| E0004 | 0 | 0 | 0 | 0 | 0 | -3 | 0 | 0 |
| E0005 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+----------+----------------------------------------------------------------------------------------+
Sample data:
pd.DataFrame({'emp_id' : {0: 'E0001', 1: 'E0002', 2: 'E0003', 3: 'E0004', 4: 'E0005'},
'text' : {0: '[T0431516,-8,T0401531,-12,T0517519,12]', 1: '[T0701540,-1,T0431516,-2]', 2: '[T0517519,-1,T0421531,-7,T0516319,9,T0500371,-6,T0309711,-3]', 3: '[T0516319,-3]', 4: '[T0431516,2]'}
})
So, far my attempts were unsuccessful. Any pointers/help is much appreciated!