
I am trying to read a CSV file larger than 4 GB. However, when I use the fread command, it produces an error:

library(data.table)
csv1 <- fread("cleaned.csv",sep = ",",colClasses = "character",showProgress = TRUE)

Error: embedded nul in string: '\0'

After some searching, I found that you could use sed, as in this Stack Overflow question, but I have no clue how to use it in my scenario. Please help!

UPDATE: I have attempted to use sed as described in the comments below; however, it throws an error:

sed couldn't flush stdout no space left on device

UPDATE2: I have solved it with the help of some colleagues. However, I am still looking to automate this, since I had to repeat the process for each file. The automation could be done either from within R or with a Bash script. Any suggestions?

Shoaibkhanz

1 Answer


The CSV files were populated with ^@ (NUL) characters, which had been placed in the blank values. For some reason I could not search for or replace them with sed commands, so I used the following solution instead.

In Linux, go to the file's directory and open the file with vim, like this:

vim filename.csv

:%s/CTRL+2//g

ESC #TO SWITCH FROM INSERT MODE

:wq # TO SAVE THE FILE

I had to do this manually for every file. However, I am still looking for a way to automate this, either within R or with a Bash script.
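One possible way to automate the steps above from Bash is sketched below. It is not the vim method I used: it swaps in `tr -d '\000'`, which deletes NUL bytes from a stream. The function name `strip_nuls` and the `./*.csv` glob are hypothetical, chosen only for illustration, and I have not run this on the original files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove every NUL byte from one file, in place.
strip_nuls() {
    local src="$1"
    local tmp
    # Create the temp file next to the source so the final mv stays on one filesystem.
    tmp="$(mktemp "${src}.XXXXXX")" || return 1
    # tr streams the data, so the file never has to fit in memory.
    if tr -d '\000' < "$src" > "$tmp"; then
        mv -- "$tmp" "$src"
    else
        # Clean up the partial copy if tr failed (for example, disk full).
        rm -f -- "$tmp"
        return 1
    fi
}

# Clean every CSV file in the current directory (the glob is an assumption).
for f in ./*.csv; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    strip_nuls "$f"
done
```

Note that this writes a full temporary copy of each file, so it needs free disk space roughly equal to the size of the largest file, and the cleaned file takes on the permissions of the temporary file. After it runs, the same fread call from the question can be tried again on each cleaned file.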

Shoaibkhanz
    Vim scripts (gvim, vim) can be pretty easy to adapt: `vi -s edit.vim filename.txt`, where edit.vim contains `:%s/CTRL+2//g :wq` (the :wq is optional). You can also use the `:argdo` command to run a command on all files in the argument list. – scribbles Aug 07 '15 at 19:46