I have my data in a space-separated text file; each line represents a data point contained in a given month:

Jan/2012 1000
Jan/2012 1500
Jan/2012 1200
Feb/2012 1300
Feb/2012 1400
Feb/2012 1000
...
Dec/2012 1300
Dec/2012 1400
Dec/2012 1000

I'd like to generate, for each month, the min, max, median, mean, standard deviation, and 95th percentile. I'd also like to generate a boxplot for the entire year, with one box per month. How can I do this in R? I can load the data with `mydata = read.table(file="mydata.txt", sep=" ")`, but `summary` summarizes the whole file rather than each month:

      month              time          
 Aug/2012: 229357   Min.   :    31100  
 Oct/2012: 223158   1st Qu.:    91267  
 Mar/2012: 221986   Median :   124048  
 Apr/2012: 215368   Mean   :   199639  
 Jul/2012: 213956   3rd Qu.:   176766  
 May/2012: 200920   Max.   :150018802  
 (Other) :1146616                      

I don't have any experience generating boxplots; guidance is welcome.

Eric R. Rath
  • Aggregating / grouping questions are among the most common within the [r] tag. [This](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) is the canonical question; however, you may find [this](http://stackoverflow.com/questions/14758566/how-can-i-use-functions-returning-vectors-like-fivenum-with-ddply-or-aggregate/14804827#14804827) pertinent. A quick search for `[r] boxplot` would get you some [pointers for boxplots](http://stackoverflow.com/questions/7147836/how-to-generate-boxplot). – mnel Feb 12 '13 at 23:14
  • Also, especially with dates, please use `dput()` to provide your data. – user1317221_G Feb 12 '13 at 23:15

1 Answer

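# Per-month summary (min, quartiles, median, mean, max); dfrm is the data frame (mydata in the question)
tapply(dfrm$time, substr(dfrm$month, 1,3), summary)
# describe() from Hmisc reports further quantiles, including the 95th
library(Hmisc)
tapply(dfrm$time, substr(dfrm$month, 1,3), describe)
# One box per month for the whole year
boxplot(time~month, data=dfrm)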
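
The `summary` call above does not report the standard deviation or the 95th percentile. A minimal base-R sketch that collects all six requested statistics in one table might look like the following. The sample data, the `mon` column, and the `month_stats` helper are hypothetical names introduced for illustration, and the columns are assumed to be named `month` and `time` as in the question's `summary` output.

```r
# Hypothetical sample data in the question's layout (columns: month, time).
dfrm <- read.table(text = "
Jan/2012 1000
Jan/2012 1500
Jan/2012 1200
Feb/2012 1300
Feb/2012 1400
Feb/2012 1000
", col.names = c("month", "time"), stringsAsFactors = FALSE)

# Order months chronologically rather than alphabetically.
# month.abb is base R's built-in vector c("Jan", "Feb", ..., "Dec").
dfrm$mon <- factor(substr(dfrm$month, 1, 3), levels = month.abb)
dfrm$mon <- droplevels(dfrm$mon)  # drop months absent from the data

# Hypothetical helper: all requested statistics as one named vector.
month_stats <- function(x) {
  c(min = min(x), max = max(x), median = median(x), mean = mean(x),
    sd = sd(x), q95 = unname(quantile(x, 0.95)))
}

# tapply returns one vector per month; rbind stacks them into a matrix
# with one row per month and one column per statistic.
stats <- do.call(rbind, tapply(dfrm$time, dfrm$mon, month_stats))
print(stats)

# One box per month, in calendar order.
boxplot(time ~ mon, data = dfrm, xlab = "Month", ylab = "Time")
```

Converting the month to a factor with `month.abb` levels is a design choice: without it, both the table rows and the boxes would appear in alphabetical order (Apr, Aug, Dec, ...). This assumes the file covers a single year, since the year part of the label is discarded.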
IRTFM