
I have a large CSV file, about 1 GB. When I open it in Excel to edit it and do some preprocessing on the data, I run into issues.

Namely, after I finish editing and save the file, the majority of the file is clipped off when I reopen it; only a small portion of the data remains in the newly saved file.

How can I edit a large CSV file without losing a large portion of the data?

Mustard Tiger
  • 20 million rows X 6 columns – Mustard Tiger Apr 17 '17 at 01:46
  • You can use VBA and something like `Open strFileToImport For Input Access Read Lock Read As #FreeFile()` to open the file row by row. That way you don't have to load the entire file into memory at once, and there is no limit such as the one-million-row sheet limit. Simultaneously you can create a new file for output and process the data line by line. Sounds slow. I know. But it isn't. It's actually pretty fast: http://stackoverflow.com/questions/35434220/vba-replacing-commas-in-csv-not-inside-qoutes/35440236#35440236 (350 MB in one minute should translate to three minutes for a 1 GB file). – Ralph Apr 17 '17 at 08:06
  • Possible duplicate of [VBA - Replacing commas in CSV not inside qoutes](http://stackoverflow.com/questions/35434220/vba-replacing-commas-in-csv-not-inside-qoutes) – Ralph Apr 17 '17 at 08:09
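The streaming approach Ralph describes in VBA can be sketched in any language. Purely as an illustration (not the commenter's code), here is the same pattern in Python: read one row at a time and write the transformed row out immediately, so the full 1 GB file never has to fit in memory or in Excel's 1,048,576-row sheet limit. The function name, paths, and `transform` callback are hypothetical:

```python
import csv

def process_large_csv(src_path, dst_path, transform):
    """Stream a CSV row by row: read, transform, write.
    The whole file is never held in memory at once.
    (Hypothetical helper illustrating the line-by-line approach.)"""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            writer.writerow(transform(row))
```

For example, `process_large_csv("in.csv", "out.csv", lambda row: row)` copies a file unchanged, and any per-row edit can be dropped into the `transform` callback.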

1 Answer

  1. I would not use Excel as Excel :)... you gain nothing by loading the CSV into an Excel sheet. In the end it is a CSV, not an Excel workbook.

  2. What you can do is use VBA as a pure programming language: load some data into an array or another data structure, do your magic (data manipulation), and then save the changes back to the same file or to a new file. Please note that in C# and other "professional" languages there are dedicated libraries that will help you deal with problematic CSV characters, etc.
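As a minimal sketch of point 2 (shown here with Python's standard `csv` module rather than VBA, purely for illustration): load the rows into an in-memory structure, manipulate them, and write them back. A proper CSV parser also handles the "problematic characters" mentioned above, such as commas inside quoted fields, which naive string splitting would break on. The sample data below is invented:

```python
import csv
import io

# Load: parse the CSV into a list of rows (a quoted field keeps its comma)
data = 'name,notes\nAlice,"likes commas, and quotes"\nBob,plain\n'
rows = list(csv.reader(io.StringIO(data)))

# Manipulate: edit the in-memory rows
rows[2][1] = rows[2][1].upper()

# Save: write the rows back out; fields containing commas are re-quoted
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
```

With a real file you would replace the `io.StringIO` objects with `open(path, newline="")` handles; the load-manipulate-save shape stays the same.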

Asaf