
I have a 75 MB CSV file containing about 2 million rows. I want to replace all ; characters with commas. Each line contains six ; characters, so the total number of characters to replace is around 12 million. What is the best tool for this? I am working on Windows and tried the Notepad++ 'Replace All' functionality, but it keeps freezing due to the size of the file. Any suggestions?

Thanks in advance

Peter
  • Are there "embedded; semicolons" that should not be replaced? What tools do you have that you are willing to use? With no embedded delimiters, this is easy work for awk or sed; if you do need to parse the csv, perl, python, ruby, etc have good parsers. – dawg Dec 16 '21 at 14:26
  • You also need to make sure the data currently doesn't contain any comma `,` like for the decimals, that would mess up the resulting file structure. Btw Notepad++ has a **CSV Lint** plugin with a re-format function to change the separator characters https://github.com/BdR76/CSVLint/ but I think that will probably also freeze on such a large file – BdR Dec 16 '21 at 15:32
  • @dawg No, there aren't any embedded semicolons. Do sed or awk also work on Windows? I'm willing to use any Windows tools as well as Python – Peter Dec 16 '21 at 15:59
  • @BdR No there aren't any commas already in the file – Peter Dec 16 '21 at 16:00
  • If you have Python running, just use the CSV module. Read the `;` delimited file line-by-line. Set the output csv writer to `,` delimiter and write line by line to a new file. Have a beer. – dawg Dec 16 '21 at 16:03
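A minimal sketch of the csv-module approach suggested in the comment above. The function name `convert_delimiter`, the file names in the usage line, and the UTF-8 encoding are assumptions for illustration, not details from the question. Rows are streamed one at a time, so the whole 75 MB file is never held in memory.

```python
import csv


def convert_delimiter(src_path, dst_path, in_delim=";", out_delim=","):
    """Copy src_path to dst_path row by row, changing the field delimiter."""
    # newline="" lets the csv module handle line endings itself.
    # encoding="utf-8" is an assumption; adjust it to match the real file.
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=in_delim)
        writer = csv.writer(dst, delimiter=out_delim)
        for row in reader:
            # Fields that contain the new delimiter are quoted automatically.
            writer.writerow(row)


# Hypothetical usage:
# convert_delimiter("input.csv", "output.csv")
```

Because the file is parsed rather than edited as text, this should also cope with quoted fields that happen to contain a semicolon or a comma.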

2 Answers


If your CSV is well-formed, you could use the Windows beta version of Miller 6.

For example, if you have

fieldA;fieldB
1;a
2;"A sample, text"

the command is

mlr.exe --csv --ifs ";" --ofs "," cat input.csv > output.csv

The output is

fieldA,fieldB
1,a
2,"A sample, text"
aborruso

Try writing a script that reads the file as a string, perhaps only 100 rows at a time (in a for loop), and does the replacement on each chunk. Maybe this works.

I have never tried something like this, though.

  • You could create a PowerShell script to read the 75MB file line by line and write it (also line by line) to a new file https://stackoverflow.com/a/65841115/1745616 – BdR Dec 16 '21 at 15:47
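Since Python is also an option for the asker, the same line-by-line idea could be sketched there as a plain text replacement. This relies on the asker's statements that the file has no embedded semicolons and no existing commas. The function name `replace_semicolons`, the file names, and the UTF-8 encoding are assumptions.

```python
def replace_semicolons(src_path, dst_path):
    """Stream src_path line by line, writing each line with ; replaced by ,."""
    # newline="" on both files keeps the original line endings untouched.
    # encoding="utf-8" is an assumption; adjust it to match the real file.
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            # Plain text replacement: only safe when no field contains ; or ,
            dst.write(line.replace(";", ","))


# Hypothetical usage:
# replace_semicolons("input.csv", "output.csv")
```

Only one line is held in memory at a time, so the size of the file should not be a problem the way it is in an editor.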