
The first 5 rows of my csv file hold information about the file, which I do not need.

These information rows have only 2 columns, while the header and the rows of data (from row 6 onwards) have 8. This appears to be the cause of the issue.

I have tried using the skip argument of read.csv to skip these lines, and the same with read.table:

df = read.csv("myfile.csv", skip=5)
df = read.table("myfile.csv", skip=5)

but this still gives me the same error message, which is:

Error in read.table("myfile.csv",  :empty beginning of file

In addition: Warning messages:

1: In readLines(file, skip) : line 1 appears to contain an embedded nul
2: In readLines(file, skip) : line 2 appears to contain an embedded nul
...
5: In readLines(file, skip) : line 5 appears to contain an embedded nul

How can I read this .csv into R without the embedded nuls in the first 5 rows causing this error?

thelatemail
datavoredan
  • There was a file type error. My csv was apparently stored as 'Unicode', even though it said "Microsoft Excel Comma Seper..." under Type in the folder. – datavoredan Apr 09 '14 at 02:57
  • This may be of use to you: the `fread` function (from the data.table package) reads a csv file after automatically detecting the number of rows to skip (http://stackoverflow.com/questions/15332195/reading-in-multiple-csvs-with-different-numbers-of-lines-to-skip-at-start-of-fil/15333597#15333597 ) – Jealie Apr 09 '14 at 04:19
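
If the file really was saved as 'Unicode' (for Excel this most likely means UTF-16), every ASCII character is stored together with a zero byte, which would explain the embedded nul warnings. Below is a minimal sketch of declaring the encoding to read.csv, assuming UTF-16LE. The temporary file, its contents, and the variable names are made up for illustration, and it uses 3 data columns instead of 8.

```r
# Build a small stand-in file encoded as UTF-16LE (hypothetical contents,
# no byte-order mark for simplicity): 5 info rows with 2 columns, then the table.
sample_lines <- c(
  "Report,demo", "Created,2014-04-09", "Source,example",
  "Owner,nobody", "Notes,none",
  "id,value,label",
  "1,10.5,a",
  "2,20.1,b"
)
txt <- paste0(paste(sample_lines, collapse = "\n"), "\n")
utf16_file <- tempfile(fileext = ".csv")
# Write raw bytes so no platform-specific line-ending conversion interferes.
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], utf16_file)

# Declare the encoding so the bytes are decoded before parsing,
# then skip the 5 information rows as in the question.
df_enc <- read.csv(utf16_file, skip = 5, fileEncoding = "UTF-16LE")
print(df_enc)
```

If the real file uses a different encoding, the value passed to fileEncoding would need to change accordingly.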

2 Answers


You could try:

read.csv(text=readLines('myfile.csv')[-(1:5)])

This first reads every line of the file into a character vector, one element per line, then drops the first five elements and parses the rest as csv.
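
A self-contained sketch of the same idea on a plain-text file, so each step can be inspected. The temporary file and its contents are hypothetical, with 3 data columns instead of 8.

```r
# Hypothetical sample file: 5 two-column info rows, then a header and two data rows.
demo_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Report,demo", "Created,2014-04-09", "Source,example",
  "Owner,nobody", "Notes,none",
  "id,value,label",
  "1,10.5,a",
  "2,20.1,b"
), demo_file)

all_lines  <- readLines(demo_file)      # one vector element per line
body_lines <- all_lines[-(1:5)]         # drop the 5 information rows
df_text    <- read.csv(text = body_lines)  # parse the remaining lines as csv
print(df_text)
```

Because the lines are held in memory before parsing, this may be slow for very large files.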

jbaums

You can get rid of the warning messages by using the 'skipNul' argument of readLines:

text=readLines('myfile.csv', skipNul=TRUE)
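
A sketch of combining skipNul with the read.csv(text=...) approach, assuming the nuls come from a UTF-16LE-style file. The sample file and variable names are hypothetical. Dropping the nul bytes only gives clean text here because the data are plain ASCII; for non-ASCII content, re-encoding the file is the safer route.

```r
# Hypothetical stand-in file containing embedded nuls (ASCII text stored as UTF-16LE).
nul_lines <- c(
  "Report,demo", "Created,2014-04-09", "Source,example",
  "Owner,nobody", "Notes,none",
  "id,value,label",
  "1,10.5,a",
  "2,20.1,b"
)
nul_txt  <- paste0(paste(nul_lines, collapse = "\n"), "\n")
nul_file <- tempfile(fileext = ".csv")
writeBin(iconv(nul_txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], nul_file)

# skipNul = TRUE discards the zero bytes while reading the lines.
clean_lines <- readLines(nul_file, skipNul = TRUE)

# Drop the 5 information rows and parse the rest as csv.
df_nul <- read.csv(text = clean_lines[-(1:5)])
print(df_nul)
```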
Mayank Jain
Nasir Mahmood