How to skip comment lines in a data file I want to import with R

Question

I have many string files (.str) that I want to import into R (looping over the files). The problem is that the first line is neither the column names nor the beginning of the matrix. It is a comment line. The same goes for the last line. The matrix I want to import sits between those two lines. How can I do that?

Thanks

Welcome to SO. Please read this on how to create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). In this case, for example, you should add part of your text file and show what you have tried. — agstudy, Jul 05 '13 at 09:26
Read `?read.table`. The parameters `skip`, `nrows`, and `comment.char` might be relevant to you. — Roland, Jul 05 '13 at 09:28
If the files don't all have identical structure, you can always read in with `readLines` and then use regexp functions to remove lines you don't want before converting to your intended data structure. — Thomas, Jul 05 '13 at 09:41
Thank you, guys. Roland, I can't use `nrows`: the number of rows varies from file to file. — , Jul 05 '13 at 09:41
@user2551551 But if the ***first*** line is the one you want to skip, just use `skip = 1` in `read.table` to skip the first line and carry on as normal, e.g. `read.table( "myfile.txt" , skip = 1 , header = TRUE )` — Simon O'Hanlon, Jul 05 '13 at 10:11
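
The `readLines` plus regexp idea from the comments can be sketched as follows. This is only an illustration: the file contents and the `//` comment marker are assumptions, chosen because `comment.char` accepts a single character and so cannot express a two-character marker.

```r
# Hypothetical file contents, already read in with readLines().
# The "//" comment marker is an assumption for illustration only.
lines <- c("// produced by some tool",
           "id value",
           "a 1",
           "b 2",
           "// end of file")

# Keep only the lines that do not start with "//" (ignoring leading whitespace).
keep <- !grepl("^[[:space:]]*//", lines)

# Parse the remaining lines as a whitespace-separated table with a header row.
dat <- read.table(text = lines[keep], header = TRUE)
```

Because the filter works on content rather than position, it does not matter how many data rows each file has.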

score 6 · Answer 1 · answered Jul 05 '13 at 11:53

If you want to skip the first and last lines in a file, you can do it as follows. Use `readLines` to read the file into a character vector, and then pass it to `read.csv`.

strs <- readLines("filename.csv")
dat <- read.csv(text=strs,             # read from an R object rather than a file
                skip=1,                # skip the first line
                nrows=length(strs) - 3 # skip the last line
                )

The `- 3` is there because the number of rows of data is 3 less than the number of lines of text in the file: 1 skipped line at the beginning, 1 line of column headers, and 1 skipped line at the end. Of course, you could also just omit the `nrows` argument and delete the nonsense row from your data frame after the import.
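
To make the arithmetic concrete, here is a small check on an in-memory stand-in for a file. The five lines below are made up for illustration; a real file would be read with `readLines` as shown above.

```r
# Hypothetical five-line "file": comment, header, two data rows, comment.
strs <- c("this is a leading comment line",
          "x,y",
          "1,2",
          "3,4",
          "this is a trailing comment line")

dat <- read.csv(text = strs,
                skip = 1,                  # drop the leading comment line
                nrows = length(strs) - 3)  # 5 lines - 3 = 2 data rows

# Number of data rows actually read.
n_rows <- nrow(dat)
```

Since `nrows` is computed from `length(strs)`, the same call works for files with different numbers of rows.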

score 6 · Answer 2 · answered Jul 28 '15 at 07:01

You can put comments anywhere in the data file, in the same way that you put comments in an R script. For example, if I have a `data.txt` like this:

# comment 1
str1
str2
# comment 2
str3
# comment 3
str4
str5# comment 4
str6
str7
# comment 5

Then you don't need to do anything to skip the comments:

> x<-read.table("data.txt", header=FALSE)
> x
    V1
1 str1
2 str2
3 str3
4 str4
5 str5
6 str6
7 str7
>

Note that comment 4 is not read. You can change the comment character `#` by using the `comment.char` option.
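
As a minimal sketch of changing the comment character, assume the comments are marked with `%` instead of `#`. The data below are hypothetical and passed via `text =` so the example needs no file on disk.

```r
# Hypothetical data in which "%" (an assumption) marks comments.
txt <- c("% comment 1",
         "str1",
         "str2% trailing comment",
         "% comment 2",
         "str3")

# comment.char takes a single character; everything from it to the
# end of the line is ignored, including whole-line comments.
x <- read.table(text = txt, header = FALSE, comment.char = "%")
```

Setting `comment.char = ""` turns comment handling off altogether.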

score 0 · Answer 3 · answered Jun 16 '17 at 13:57

You can skip arbitrary lines anywhere in the file if you combine the `readLines` approach Hong Ooi gives with negative indexing. Here's an example which skips lines 2-5 in a file that has headers but also a number of lines of annotation/meta info:

lines <- readLines('myfile.txt')
# drop lines 2 to 5, keep the header on line 1 and the data after line 5
mytable <- read.table(text = lines[-(2:5)], sep = '\t', header = TRUE)

posdef

This does not skip the lines. It reads all the lines, and then removes some of them. If the files are large, this is a bad approach. – CoderGuy123 Aug 28 '20 at 01:36
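
Applied to the original question (drop the first and last line of each of many files), the negative-indexing idea might look like the sketch below. The helper name `read_trimmed`, the temporary files, and the whitespace-separated layout are all assumptions for illustration. As the comment above points out, each file is read into memory in full, so this suits small to moderate files.

```r
# Hypothetical helper: drop the first and last lines of a file,
# then parse whatever is left. Extra arguments go to read.table().
read_trimmed <- function(path, ...) {
  lines <- readLines(path)
  read.table(text = lines[-c(1, length(lines))], ...)
}

# Two temporary example files stand in for the real .str files.
paths <- file.path(tempdir(), c("a.str", "b.str"))
writeLines(c("leading comment", "1 2", "3 4", "trailing comment"), paths[1])
writeLines(c("leading comment", "5 6", "trailing comment"), paths[2])

# One data frame per file; the number of rows may differ between files.
tables <- lapply(paths, read_trimmed, header = FALSE)
```

For real data, `paths` could come from something like `list.files(pattern = "\\.str$", full.names = TRUE)`.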