0

I have a .csv file with 15 million rows. Some of its rows contain nothing but hyphens. The file is too large to open in Excel, Notepad, or Notepad++, so I would like to process it in C#: read it in, then write out a new file that leaves out the hyphen-only rows.

What is the easiest way to code this?

blackcornail
  • 2
    https://msdn.microsoft.com/en-us/library/aa287535(v=vs.71).aspx – lordkain Mar 06 '17 at 15:19
  • 5
    Why do you have a 15 million row csv file? Who do you expect to make sense of that? – Steve Mar 06 '17 at 15:20
  • OK, it depends on how you want to modify it, but you can read any file line by line, write out a new one, and then move the new file over the old one. – BugFinder Mar 06 '17 at 15:21
  • There are libraries for opening CSV files (like https://joshclose.github.io/CsvHelper/)... You'll have to do it line by line and you'll need much patience and an SSD disk :-) Clearly you can even simply read it line-by-line as a text file (in the end you want to simply rewrite it minus some lines) – xanatos Mar 06 '17 at 15:22
  • See this post ► [**Reading large text files with streams in C#**](http://stackoverflow.com/questions/2161895/reading-large-text-files-with-streams-in-c-sharp) One of the answers further down shows what they used to process a 19GB file. – Nope Mar 06 '17 at 15:22
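
The line-by-line approach suggested in the comments above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation. The file names `input.csv` and `output.csv`, the class name `RemoveHyphenRows`, and the helper `IsHyphenOnly` are all hypothetical. It also assumes that a "hyphen-only" row is one that contains at least one `-` and otherwise only commas, semicolons, or whitespace, and that no quoted field spans multiple lines.

```csharp
using System;
using System.IO;

public static class RemoveHyphenRows
{
    // Assumed rule (not from the original question): a row is "hyphen-only"
    // if it has at least one '-' and nothing else except ',' ';' or whitespace.
    public static bool IsHyphenOnly(string line)
    {
        bool sawHyphen = false;
        foreach (char c in line)
        {
            if (c == '-')
            {
                sawHyphen = true;
            }
            else if (c != ',' && c != ';' && !char.IsWhiteSpace(c))
            {
                return false; // any other character means this is a real data row
            }
        }
        return sawHyphen;
    }

    public static void Main(string[] args)
    {
        // Hypothetical default paths; pass real ones as command-line arguments.
        string input = args.Length > 0 ? args[0] : "input.csv";
        string output = args.Length > 1 ? args[1] : "output.csv";

        long removed = 0;

        // File.ReadLines enumerates the file lazily, so the whole file
        // is never loaded into memory at once.
        using (var writer = new StreamWriter(output))
        {
            foreach (string line in File.ReadLines(input))
            {
                if (IsHyphenOnly(line))
                {
                    removed++;
                }
                else
                {
                    writer.WriteLine(line);
                }
            }
        }

        Console.WriteLine("Removed " + removed + " hyphen-only rows.");
    }
}
```

Writing to a separate output file keeps the original untouched until the result has been checked; after that, the new file can replace the old one. If the file uses an encoding other than UTF-8, both the reader and the writer would need that encoding passed explicitly.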

1 Answer

6

Consider importing the CSV file into a SQL database and then deleting the offending rows there. CSV is not an efficient format for working with a data set of this size.

Qrchack
  • 1
    Note you can always export back to CSV when you're done if you insist on using flat files – Qrchack Mar 06 '17 at 15:22
  • If you want to really go hard and keep using CSV, there's a Python module for that: https://docs.python.org/2/library/csv.html – Qrchack Mar 06 '17 at 15:25