I have a huge CSV file. A test script I wrote to count the rows reports about 24 million of them. I want to count how many rows share each CIK number and write those counts to a separate CSV file.
So the desired output in the other file would be:
CIK number: number of IPs with that CIK number.
I had some ideas, but they weren't efficient enough, so the script was useless: it took ages to go through the CSV. Has anyone come across a similar problem?
Should I use Pandas for this? Any suggestion would be a huge help!
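To make the goal concrete, here is a minimal sketch of the counting described above. It uses only the standard library and reads the file one row at a time, so the 24 million rows never have to sit in memory at once. The column name `cik`, the function name `count_rows_per_cik`, and the file names are assumptions for illustration, not taken from the real file.

```python
import csv
from collections import Counter


def count_rows_per_cik(in_path, out_path, cik_column="cik"):
    """Count rows per CIK value and write the counts to a new CSV.

    cik_column is an assumed header name; adjust it to the real file.
    """
    counts = Counter()

    # Stream the input row by row instead of loading it all at once.
    with open(in_path, newline="") as src:
        for row in csv.DictReader(src):
            counts[row[cik_column]] += 1

    # Write one line per CIK, most frequent first.
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow([cik_column, "count"])
        writer.writerows(counts.most_common())

    return counts
```

Calling it would look like `count_rows_per_cik("input.csv", "cik_counts.csv")`, where both file names are placeholders. Only one counter entry per distinct CIK is kept, so memory use depends on the number of distinct CIKs rather than the number of rows.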
Example of the CSV: