7

I have a excel file like this:

enter image description here

I try to read it in read.xlsx or read_excel by skipping the second row:

library(xlsx)
df <- read.xlsx('./data.xls', 'Sheet1')

library(readxl)
df <- read_excel("./data.xls", sheet = 'Sheet0', skip = 2, col_names = TRUE)

The first one (read.xlsx), I didn't find parameters for skip rows, the second one give a df without headers.

Where did I do wrong in the code above and how to read it correctly? Thanks.

ah bon
  • 9,293
  • 12
  • 65
  • 148
  • I think you are looking for https://stackoverflow.com/questions/15860071/read-csv-header-on-first-line-skip-second-line – Ronak Shah Jun 23 '20 at 07:53
  • I'm not sure if we can read `.xlsx` file with `read_csv`, I get an error like this when I'm trying to do so: `line 1 appears to contain embedded nullsline 2 appears to contain embedded nullsline 3 appears to contain embedded nullsline 4 appears to contain embedded nullsline 5 appears to contain embedded nullsError in make.names(col.names, unique = TRUE) : ` – ah bon Jun 23 '20 at 07:57
  • 1
    I see, you are right. I think zx8754's answer below would help. I would just read the excel as it is and then remove the second row. – Ronak Shah Jun 23 '20 at 08:12

1 Answers1

12

Read it twice: once for column names, then for the data:

library(readxl)
myCols <- as.character(read_excel("./test123.xlsx", n_max = 1, col_names = FALSE))
myDF <- read_excel("./test123.xlsx", skip = 2, col_names = myCols)

myDF
# # A tibble: 3 x 2
#   colAtitle colBtitle
#       <dbl>     <dbl>
# 1         1         5
# 2         2         6
# 3         3         7

Example input: test123.xlsx

enter image description here

zx8754
  • 52,746
  • 12
  • 114
  • 209