Skip rows while use read_excel or read.excel in R

Question

I have a excel file like this:

I try to read it in read.xlsx or read_excel by skipping the second row:

library(xlsx)
df <- read.xlsx('./data.xls', 'Sheet1')

library(readxl)
df <- read_excel("./data.xls", sheet = 'Sheet0', skip = 2, col_names = TRUE)

The first one (read.xlsx), I didn't find parameters for skip rows, the second one give a df without headers.

Where did I do wrong in the code above and how to read it correctly? Thanks.

I think you are looking for https://stackoverflow.com/questions/15860071/read-csv-header-on-first-line-skip-second-line — Ronak Shah, Jun 23 '20 at 07:53
I'm not sure if we can read `.xlsx` file with `read_csv`, I get an error like this when I'm trying to do so: `line 1 appears to contain embedded nullsline 2 appears to contain embedded nullsline 3 appears to contain embedded nullsline 4 appears to contain embedded nullsline 5 appears to contain embedded nullsError in make.names(col.names, unique = TRUE) : ` — ah bon, Jun 23 '20 at 07:57
I see, you are right. I think zx8754's answer below would help. I would just read the excel as it is and then remove the second row. — Ronak Shah, Jun 23 '20 at 08:12

score 12 · Accepted Answer · answered Jun 23 '20 at 07:59

Read it twice: once for column names, then for the data:

library(readxl)
myCols <- as.character(read_excel("./test123.xlsx", n_max = 1, col_names = FALSE))
myDF <- read_excel("./test123.xlsx", skip = 2, col_names = myCols)

myDF
# # A tibble: 3 x 2
#   colAtitle colBtitle
#       <dbl>     <dbl>
# 1         1         5
# 2         2         6
# 3         3         7

Example input: test123.xlsx

Skip rows while use read_excel or read.excel in R

1 Answers1

Linked

Related