The dataset consists of a company identifier (a string), three columns of data values, and a date (yyyymmdd). There are around 25,500 unique company identifiers, and each has daily values from Jan. 1, 1973 to the present, for a total of around 700,000 rows. What I would like to do is calculate some statistics (e.g. range, mean, median, SD) for each date in the dataset. The data was originally a CSV file and was imported into R as a data frame. My first attempt is below, but I was wondering whether there is a more efficient method than looping over every unique date:
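# "Value1" is a placeholder for one of the three value columns; substitute the real column name.
uniq <- unique(data$Date)
# Preallocate one row per date instead of growing the data frame inside the loop.
stat <- data.frame(Date=uniq, Mean=NA_real_, SD=NA_real_, Quant_75=NA_real_, Quant_25=NA_real_, Range=NA_real_)
for (i in seq_along(uniq)) {
  x <- data$Value1[data$Date == uniq[i]]  # values recorded on this date
  # Range is stored as max - min because range() itself returns two numbers.
  stat[i, -1] <- c(mean(x), sd(x), quantile(x, 0.75), quantile(x, 0.25), diff(range(x)))
}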
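
For comparison, a loop-free version of the same per-date calculation might look like the sketch below. It is only an illustration under stated assumptions: the toy `data` frame, the column name `Value1`, and the helper `date_stats` are all hypothetical stand-ins, and `Range` is taken to mean max minus min. The idea is to group the values by date once with `split()` and then apply a single summary function to each group, rather than subsetting the whole data frame on every iteration. Whether this is actually faster on the real data would need to be timed.

```r
# Toy stand-in for the real data; "Value1" is a hypothetical column name.
data <- data.frame(
  ID     = c("A", "B", "C", "A", "B", "C"),
  Date   = c(19730101, 19730101, 19730101, 19730102, 19730102, 19730102),
  Value1 = c(1, 2, 3, 10, 20, 60)
)

# Summary statistics for the values observed on a single date (hypothetical helper).
date_stats <- function(x) {
  c(Mean     = mean(x),
    SD       = sd(x),
    Quant_75 = unname(quantile(x, 0.75)),
    Quant_25 = unname(quantile(x, 0.25)),
    Range    = diff(range(x)))  # max - min
}

# split() groups the values by date in one pass; sapply() summarises each group.
by_date <- split(data$Value1, data$Date)
res <- t(sapply(by_date, date_stats))  # one row per date, one column per statistic

# Dates come back as the group labels (character strings).
stat <- data.frame(Date = names(by_date), res, row.names = NULL)
```

The same pattern extends to the other value columns by calling `split()` on each of them in turn, and a `Median = median(x)` entry can be added to `date_stats` if needed.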