I'm reading a file into R using fread, like so:

test.set = fread("file.csv", header=FALSE, fill=TRUE, blank.lines.skip=TRUE)

My CSV has 6 columns. An example row from the file is:

"2014-07-03 11:25:56","61073a09d113d3d3a2af6474c92e7d1e2f7e2855","Securenet Systems Radio Playlist Update","Your Love","Fred Hammond & Radical for Christ","50fcfb08424fe1e2c653a87a64ee92d7"

However, some rows contain a comma inside one of the cells, together with backslash-escaped quotes. For instance,

"2014-07-03 11:25:59","37780f2e40f3af8752e0d66d50c9363279c55be6","Spotify","\"Hello\", He Lied","Red Box","b226ff30a0b83006e5e06582fbb0afd3"

produces an error of the sort

Expecting 6 cols, but line 5395818 contains text after processing all         
cols. Try again with fill=TRUE. Another reason could be that fread's    
logic in distinguishing one or more fields having embedded sep=','      
and/or (unescaped) '\n' characters within unbalanced unescaped quotes    
has failed. If quote='' doesn't help, please file an issue to figure 
out if the logic could be improved.

As you can see, the value causing the error is "\"Hello\", He Lied", which I want fread to read as "Hello, He Lied". I'm not sure how to account for this. I've tried fill=TRUE and quote="" as the message suggests, but the error keeps coming up. It's probably just a matter of finding the right parameter(s) for fread; does anyone know what those might be?

www

1 Answer

With read.table() from base R, this issue is solvable.

See the related question "Import data into R with an unknown number of columns?"
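
As a rough illustration of one base-R route (a minimal sketch, not taken from the linked question): read the raw lines, strip the backslash-escaped quotes, and parse the cleaned text with read.csv(). The two sample lines are copied from the question; the variable names lines and cleaned are made up for this example. It assumes that every \" in the file is an escaped quote inside a field, and that no field legitimately ends with a backslash.

```r
# Sample lines in the same shape as the rows in the question.
# For the real data this would be: lines <- readLines("file.csv")
lines <- c(
  '"2014-07-03 11:25:56","61073a09d113d3d3a2af6474c92e7d1e2f7e2855","Securenet Systems Radio Playlist Update","Your Love","Fred Hammond & Radical for Christ","50fcfb08424fe1e2c653a87a64ee92d7"',
  '"2014-07-03 11:25:59","37780f2e40f3af8752e0d66d50c9363279c55be6","Spotify","\\"Hello\\", He Lied","Red Box","b226ff30a0b83006e5e06582fbb0afd3"'
)

# Remove every backslash-escaped quote (\") so that only the
# field-delimiting quotes remain. fixed = TRUE matches the two
# characters literally instead of treating them as a regex.
cleaned <- gsub('\\"', '', lines, fixed = TRUE)

# Parse the cleaned text; the embedded comma stays inside its quoted field.
test.set <- read.csv(text = cleaned, header = FALSE, stringsAsFactors = FALSE)

print(ncol(test.set))
print(test.set$V4)
```

Note that readLines() loads the whole file into memory, which may be slow for a file with several million rows. The cleaned lines could also be written back to disk with writeLines() and then read from there.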

With fread() from data.table, this is currently not possible.

An issue has been logged for this: https://github.com/Rdatatable/data.table/issues/2669

Joseph Wood
Soumya Boral