I am kind of new to data frames and been trying to find a solution to shift data in specific columns to another row if ID
and CODE
matches.
df = pd.DataFrame({'ID': ['123', '123', '123', '154', '167', '167'],
'NAME': ['Adam', 'Adam', 'Adam', 'Bob', 'Charlie', 'Charlie'],
'CODE': ['1001', '1001', '1011', '1002', 'A0101', 'A0101'],
'TAG1': ['A123', 'B123', 'K123', 'D123', 'E123', 'G123'],
'TAG2': [ np.NaN, 'C123', 'L123', np.NaN, 'F123', 'H123'],
'TAG3': [ np.NaN, 'M123', np.NaN, np.NaN, np.NaN, np.NaN]})
ID NAME CODE TAG1 TAG2 TAG3
0 123 Adam 1001 A123 NaN NaN
1 123 Adam 1001 B123 C123 M123
2 123 Adam 1011 K123 L123 NaN
3 154 Bob 1002 D123 NaN NaN
4 167 Charlie A0101 E123 F123 NaN
5 167 Charlie A0101 G123 H123 NaN
So above I have added the code which depicts the initial data frame, now we can see that ID='123'
has two rows with the same codes and the values in 'TAG'
columns vary, I would like to shift the 'TAG'
data in the second row to the first row after 'TAG1'
or in 'TAG1'
if it is empty and delete the second row completely. It should be the same for ID='167'
.
Below is another code with a sample data frame I have added manually which depicts the final result, any suggestions would be great. I tried one thing, it exactly did not work the way I wanted it to.
df_result = pd.DataFrame({'ID': ['123', '123', '154', '167'],
'NAME': ['Adam', 'Adam', 'Bob', 'Charlie',],
'CODE': ['1001', '1011', '1002', 'A0101'],
'TAG1': ['A123', 'K123', 'D123', 'E123'],
'TAG2': ['B123', 'L123', np.NaN, 'F123'],
'TAG3': ['C123', np.NaN, np.NaN, 'G123'],
'TAG4': ['M123', np.NaN, np.NaN, 'H123']})
ID NAME CODE TAG1 TAG2 TAG3 TAG4
0 123 Adam 1001 A123 B123 C123 M123
1 123 Adam 1011 K123 L123 NaN NaN
2 154 Bob 1002 D123 NaN NaN NaN
3 167 Charlie A0101 E123 F123 G123 H123
The code that I have tried is below to kind of get the result. But it did not get exactly the output I wanted
df2=pd.pivot_table(test_shift, index=['ID', 'NAME', 'CODE'],
columns=test_shift.groupby(['ID', 'CODE']).cumcount().add(1),
values=['TAG1'], aggfunc='sum')
NOTE: Sorry for the bad posting the first time, I tried adding the dataframe as code to help you visually. But I 'FAILED'
. I will try to learn that over the coming days and be a better member of 'Stack Overflow
.
Thank you for the help.