The which
argument has to be a vector of sheet names. That is, it needs to be in this format, using the c
function:
which = c("201901,Xizhi,PM10", "201902,Xizhi,PM10", "201902,Wanli,PM10")
or whatever sheets you want to import.
The which
argument doesn't appear to support regular expression, so importing sheets using wildcards is not possible. But you can import all sheets and then use the _file
column to select the data to individually export (if you know which sheet number is which since the sheet names are not saved, only the number).
In your data, you have all the information in the first row, but you don't import this because you specify the range as "A2:Y33". If you select "A1:Y33" then you'll get "characters" for all columns, and you don't want that since the column names start in row 2.
One solution is to import all sheets, range="A2:Y33", as you did, and then repeat for range "A1:Y2", saving the result as d2
. This range contains the information you need to select. You can then merge d
with d2
on _file
and export individual csv files based on the information in d2
.
library(rio)
d1 <- import_list("sample_data.xlsx",
setclass="data.table",
range = "A2:Y33", rbind = T,
col_names=T,
rbind_label = "_file",
rbind_fill = TRUE)
For d2
we import only range "A1:Y1", then subset columns 1, 7, 13 and 26, and set the names appropriately.
d2 <- setNames(
subset(
import_list("sample_data.xlsx",
setclass="data.table",
range = "A1:Y1", rbind = T,
col_names=F,
rbind_label = "_file"),
select=c(1,7,13,26)),
c("Site","Param","YM","_file"))
Then merge the two datasets on the _file
column.
d3 <- merge(d1, d2, by="_file")
Check the data.
table(d3$Site, d3$Param, d3$YM)
, , = YM: 2019/01
Param: AMB_TEMP(℃) Param: PM10(μg/m3)
Site: Wanli 31 31
Site: Xizhi 31 31
, , = YM: 2019/02
Param: AMB_TEMP(℃) Param: PM10(μg/m3)
Site: Wanli 31 31
Site: Xizhi 31 31
So there's eight sheets each containing 31 records. But we have to clean these names because if we use these for the filenames, R will complain.
d3$Site <- sub(".*: (.+)", "\\1", d3$Site)
d3$Param <- sub(".*: (.+)\\(.+", "\\1", d3$Param)
d3$YM <- sub(".*: (\\d{4})\\/(.+)", "\\1_\\2", d3$YM)
Now split on the interaction of all three contextual variables
f <- with(d3, split(d3, list(Site, Param, YM)))
And save to individual CSV files.
lapply(names(f), function(x) write.csv(f[[x]], file=paste(x, ".csv", sep="")))
dir()
[1] "Wanli.AMB_TEMP.2019_01.csv" "Wanli.AMB_TEMP.2019_02.csv" "Wanli.PM10.2019_01.csv"
[4] "Wanli.PM10.2019_02.csv" "Xizhi.AMB_TEMP.2019_01.csv" "Xizhi.AMB_TEMP.2019_02.csv"
[7] "Xizhi.PM10.2019_01.csv" "Xizhi.PM10.2019_02.csv"
You can then easily choose to export only the data you want using a subset of f
or d3
.