I have a humongous dataframe with multiple types of columns: string, boolean, integer, float. (This is important, because it means that I can't use np.repeat for this problem, which is why I'm asking my own question: I believe similar solutions on here don't work for me. Either that, or I don't know how!) Well, one of my columns is an ID number, and for some reason some rows have multiple numbers listed under ID. Something like this:
i ID Name Boolean1 Boolean2 etc
0 2755 Blahblah1 True False ...
1 2894, 4755 PainInMy2 True True ...
2 331 Blehblue False False ...
I want to split this painful row so that each ID number is on a separate row and all other values get duplicated under it, i.e.:
i ID Name Boolean1 Boolean2 etc
0 2755 Blahblah1 True False ...
1 2894 PainInMy2 True True ...
2 4755 PainInMy2 True True ...
3 331 Blehblue False False ...
What is an elegant way I can achieve this? Keep in mind this is a huge pandas df with hundreds of thousands of rows and a dozen columns of different types, and I would like to keep most, if not all, of the pandas df metadata for it. I can butcher it with a series of fors and ifs, but I feel like there should be an easier way to do this, possibly in just a couple of lines. Maybe with split(',') or something similar. But I can't figure out how. Thanks!
(Please don't mark this as a duplicate. I couldn't find any answer that worked for me!)
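
For reference, here is a minimal sketch of the split(',') idea on the sample rows above. It is one possible approach, not necessarily the best one for a frame this size. It assumes pandas 0.25 or newer (where DataFrame.explode exists), that the multiple IDs are stored as a single comma-separated string, and that every ID parses as an integer. The variable names df and result are just for illustration.

```python
import pandas as pd

# Sample frame mirroring the rows shown in the question (values are illustrative).
df = pd.DataFrame({
    "ID": ["2755", "2894, 4755", "331"],
    "Name": ["Blahblah1", "PainInMy2", "Blehblue"],
    "Boolean1": [True, True, False],
    "Boolean2": [False, True, False],
})

# 1. Turn each ID cell into a list of ID strings (astype(str) guards against
#    cells that were already parsed as numbers).
# 2. explode() emits one row per list element and repeats the other columns.
# 3. Rebuild a clean 0..n-1 index, then strip whitespace and convert IDs to int.
result = (
    df.assign(ID=df["ID"].astype(str).str.split(","))
      .explode("ID")
      .reset_index(drop=True)
)
result["ID"] = result["ID"].str.strip().astype(int)

print(result)
```

The other columns are carried along untouched, so no per-column handling with np.repeat is needed. If some IDs might not be numeric, the final astype(int) step would have to be dropped or replaced.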