I'm importing an Excel file where a single sheet has multiple tables.
When importing into R using read_excel()
, I noticed each table can be delimited from the rest by looking at col1
: if it contains 2-4 rows of NA
s followed by a row with a header
, then what follows is a new table that I'd like to store as a data frame.
For example, the pattern below repeats across the file, each a separate table:
col1 col2 col3 col4
NA header2 header3 header4 <-- use these headers and
NA header2a header3a header4a <-- append these to the above
header Number Number Number <-- omit this row
attribute1 Number Number Number
attribute2 Number Number Number
attribute3 Number Number Number
NA NA NA NA
NA NA NA NA
From the above, I would like to pull everything below the headers
, and omit NA
only rows (which indicate the end of that table). Sample output:
whatever header2_header2a header3_header3a header4_header4a
attribute1 Number Number Number
attribute2 Number Number Number
attribute3 Number Number Number
Finally, I'd like to apply that to the entire file, so that each table gets stored as its own data frame.
Is this even possible?