I have a 12 GB CSV file. I want to extract only some of its columns and write them to a new CSV that I can load into R for analysis.
The problem is that I get a memory error, because my code loads the entire file at once before writing the new CSV. How can I parse the data row by row and write the selected columns to a new CSV?
Here is what I have so far:
import pandas
colnames = ['contributor name', 'recipient name', 'recipient party', 'contributor cfscore', 'candidate cfscore', 'amount']
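# Raw strings keep the backslash in the path from being read as an escape (\f is a form feed)
DATA = pandas.read_csv(r'pathname\filename.csv', names=colnames)
# This is where it fails: read_csv has already loaded the whole file into memory
DATA.to_csv(r'pathname\filename.csv', columns=colnames)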
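
For reference, this is roughly the row-by-row approach I have in mind, as a minimal sketch using the standard library csv module instead of pandas. The function name extract_columns and its parameters are hypothetical, and it assumes the first row of the file is a header containing the column names I want, which may not match my actual file.

```python
import csv

def extract_columns(src_path, dst_path, wanted):
    """Stream src_path one row at a time, writing only the wanted columns."""
    # newline='' lets the csv module handle line endings itself
    with open(src_path, newline='') as src, open(dst_path, 'w', newline='') as dst:
        # Assumption: the first row is a header naming every column
        reader = csv.DictReader(src)
        # extrasaction='ignore' drops any column not listed in wanted
        writer = csv.DictWriter(dst, fieldnames=wanted, extrasaction='ignore')
        writer.writeheader()
        for row in reader:
            # Only one row is held in memory at a time
            writer.writerow(row)
```

I would call it as extract_columns(input_path, output_path, colnames), with the output path different from the input path so the source file is not overwritten while it is being read.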