I have a big CSV with 116+ million observations. I need to break it down into lots of small CSVs so that I can run it through a different program that has a restrictive file size limit. Is there a way to do something like this:

read df <- bigfile.csv
split df into 30 by rows, keeping headers
write out all 30 CSVs as littlefile_1.csv, littlefile_2.csv, etc.
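
Something like this sketch, maybe (assuming the whole file even fits in memory, which may be optimistic at 116+ million rows):

```r
# Minimal sketch -- assumes the full file fits in memory
df <- read.csv("bigfile.csv")

n_chunks <- 30
# assign each row to one of 30 roughly equal groups, then split by group
groups <- cut(seq_len(nrow(df)), breaks = n_chunks, labels = FALSE)
chunks <- split(df, groups)

for (i in seq_along(chunks)) {
  # write.csv writes the header row by default, so every file keeps it
  write.csv(chunks[[i]], sprintf("littlefile_%d.csv", i), row.names = FALSE)
}
```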

I know this is pretty rudimentary, but I am pretty new to R.

tchoup
  • Start with `?read.csv` as you can skip rows and limit the number of rows read. Then check `?write.csv` as there is an append option if you're writing observations in batches (a sketch of this approach appears after these comments). – manotheshark Nov 11 '20 at 23:02
  • Thanks for that suggestion - I shall try it. Is there any way to just tell it to split by a specific number? Say I want it to divide into 30 batches, is there a command for that? – tchoup Nov 11 '20 at 23:30
  • I would look at this question as it has several examples: https://stackoverflow.com/questions/6119667/in-r-how-do-i-read-a-csv-file-line-by-line-and-have-the-contents-recognised-as-t – manotheshark Nov 12 '20 at 12:57
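
For completeness, here is a minimal sketch of the batched `skip`/`nrows` approach manotheshark suggests; the chunk size is an assumption, so pick one that keeps each output file under the other program's size limit:

```r
# Batched sketch: read a fixed number of rows at a time with skip/nrows
# instead of loading all 116M rows at once. chunk_rows is an assumption.
header <- names(read.csv("bigfile.csv", nrows = 1))
chunk_rows <- 4e6
i <- 1
repeat {
  batch <- tryCatch(
    read.csv("bigfile.csv",
             skip      = 1 + (i - 1) * chunk_rows,  # +1 skips the header line
             nrows     = chunk_rows,
             header    = FALSE,
             col.names = header),
    error = function(e) NULL)  # reading past end of file signals we're done
  if (is.null(batch) || nrow(batch) == 0) break
  write.csv(batch, sprintf("littlefile_%d.csv", i), row.names = FALSE)
  i <- i + 1
}
```

Note that each pass re-scans all the skipped lines, so a single pass over an open connection would scale better for a file this size, but this stays closest to the `?read.csv` suggestion from the comments.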

0 Answers