I am trying to read an Excel file with multiple tabs. For that, I use the code provided here. The problem is that each tab has a different number of empty rows before the actual data begins. For example, the first tab has two empty rows, the second tab has three empty rows, and so on.

Normally, I would use the `skip` parameter of the `read_excel` function to indicate the number of empty rows to skip. But how do I do that for multiple tabs, each with a different number of rows to skip?

Oleg Ivanytskyi

1 Answer

Perhaps the easiest solution is to read each tab as it is and then remove the empty rows, i.e. `yourdata <- yourdata[!is.na(yourdata$columname), ]`. This works if you don't expect any NAs in a particular column, such as an id column. If you have data gaps everywhere, you can instead test for rows that are NA in all of several columns. Let me know if that's what you need.
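
As a rough sketch of the "all NAs in multiple columns" variant, something like the following might work. The helper name `drop_empty_rows` and the toy data frames are made up for illustration; they only stand in for tabs that were read without skipping anything.

```r
# Hypothetical helper (not from any package): drop the rows in which
# every one of the chosen columns is NA.
drop_empty_rows <- function(df, cols = names(df)) {
  # TRUE for rows that have no non-NA value in any of `cols`
  all_na <- rowSums(!is.na(df[, cols, drop = FALSE])) == 0
  df[!all_na, , drop = FALSE]
}

# Toy stand-ins for two tabs with different numbers of leading empty rows
tabs <- list(
  first  = data.frame(id = c(NA, NA, 1, 2),  value = c(NA, NA, "a", "b")),
  second = data.frame(id = c(NA, NA, NA, 3), value = c(NA, NA, NA, "c"))
)

# Apply the same cleanup to every tab, whatever its number of empty rows
cleaned <- lapply(tabs, drop_empty_rows)
```

With a real workbook, the `tabs` list could presumably be built with `lapply(readxl::excel_sheets(path), function(s) readxl::read_excel(path, sheet = s))`, where `path` is your file. Because the cleanup does not depend on how many empty rows each tab has, no per-tab `skip` value is needed.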

user3685724
  • Yes, this should work, thank you! I was just wondering if there is an already implemented and more elegant way to do that, but this approach is fine too, I guess – Oleg Ivanytskyi May 19 '22 at 07:40
  • Sorry, I wondered the same thing just now. Apparently, if you use the `read.xls` function, you can pass the `blank.lines.skip = TRUE` parameter for that – user3685724 May 19 '22 at 07:43
  • I somehow cannot make it work:( I get `Error in file.exists(tfn) : invalid 'file' argument`, even though the usual `readxl::read_excel` works perfectly well – Oleg Ivanytskyi May 19 '22 at 08:11
  • Yeah, sorry, I should have made it more obvious: that function is from a different package for reading Excel files, gdata – user3685724 May 19 '22 at 09:45
  • Yeah, I did import it. This error is related to something else – Oleg Ivanytskyi May 19 '22 at 11:42
  • I think you just specify the file path without naming the `file` argument, or use `df1 <- read.xlsx(xlsxFile = xlsxFile, sheet = 1, skipEmptyRows = FALSE)`; the docs call the argument `xlsxFile` rather than `file` – user3685724 May 19 '22 at 14:03