I am working on a 5 GB file (19M rows and 16 columns). One of the fields in this file is in YearQtr format, for example 2014Q1. I followed the "Extract year from date" thread to extract the year:
library(zoo)
x <- "2014Q1"
d <- as.factor(format(as.yearqtr(x), "%Y"))
This works, but with about 19M rows it takes forever for RStudio to process. For instance, fread takes about 45 seconds
to read the file, but extracting the year takes 10 minutes! Is there any way I can make this faster? I even tried as.Date(),
but there was no improvement. I'd appreciate any thoughts.
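
One direction that might avoid the date parsing altogether: since the year is just the first four characters of a string like 2014Q1, plain substring extraction should be enough. Below is a minimal sketch, assuming the field is a character vector; the vector `x` here is small synthetic data standing in for the real column, and the names `year_chr` and `year_fac` are only illustrative. It has not been timed against the full 19M-row file.

```r
# Synthetic stand-in for the real YearQtr column (hypothetical data)
x <- rep(c("2013Q4", "2014Q1", "2014Q2"), times = 1000)

# substr() is vectorized in base R: take characters 1 to 4 of every element,
# without converting anything to a yearqtr or Date object first
year_chr <- substr(x, 1, 4)

# Convert to a factor only once, after the extraction
year_fac <- factor(year_chr)
```

This relies on every value having the four-digit year at the start of the string; values in a different layout would need checking first.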