I have a DataFrame df filled with rows and columns where there are duplicate Id's:
Index A B
0 0.00 0.00
1 0.00 0.00
29 0.50 105.00
36 0.80 167.00
37 0.80 167.00
42 1.00 209.00
44 0.50 105.00
45 0.50 105.00
46 0.50 105.00
50 0.00 0.00
51 0.00 0.00
52 0.00 0.00
53 0.00 0.00
When I use:
df.drop_duplicates(subset=['A'], keep='last')
I get:
Index A B
37 0.80 167.00
42 1.00 209.00
46 0.50 105.00
53 0.00 0.00
Which makes sense, that's what the function does. However, what I actually would like to achieve is something like:
Index A B
1 0.00 0.00
29 0.50 105.00
37 0.80 167.00
42 1.00 209.00
46 0.50 105.00
53 0.00 0.00
Basically from each subpart of column A (0,0), (0.80, 0.80), etc. To pick the last value.
It is also important the values in column A stay in this order 0; 0.5; 0.8; 1; 0.5;0 and they do not get mixed.