I want to subset large .csv files by date so that I only extract certain years.
So far I have read the whole .csv file using fread and then subset by date.
Below is an example (note that I have generated some example data rather than reading it in using fread):
# Example data.table created after reading in from fread
library("data.table")
DT <- data.table(Date = seq(as.Date("1999-01-01"), as.Date("2009-01-01"), by = "day"))
DT[, Var := sample(1000, size = .N, replace = TRUE)]
# Subset to extract data for the year 2004 (explicit range comparison on the Date column)
DT_2004 <- DT[Date >= as.Date("2004-01-01") & Date <= as.Date("2004-12-31")]
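For completeness, the read-then-subset workflow I describe looks roughly like the sketch below. The temporary file and the names tmp, example, full and full_2004 are placeholders for illustration only; the real files are much larger, and the as.Date() step is an assumption to cover whatever column type fread detects.

```r
library(data.table)

# Write example data to a temporary file (a stand-in for the real, large .csv)
tmp <- tempfile(fileext = ".csv")
example <- data.table(Date = seq(as.Date("1999-01-01"), as.Date("2009-01-01"), by = "day"))
example[, Var := sample(1000, size = .N, replace = TRUE)]
fwrite(example, tmp)

# Read the whole file first ...
full <- fread(tmp)

# ... make sure Date really is a Date, whatever type fread detected ...
full[, Date := as.Date(Date)]

# ... and only then keep the rows for 2004
full_2004 <- full[Date >= as.Date("2004-01-01") & Date <= as.Date("2004-12-31")]

# Clean up the temporary file
unlink(tmp)
```

Every row is parsed here before any filtering happens, which is the step I would like to avoid.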
This works, but it requires me to read in the whole .csv file first, which is quite time-consuming for very large files. Is there a way to subset the .csv file within fread so that I only read in the dates I want?
Thank you.