
I have a London crime dataset that contains data from Jan 2009 to Dec 2018, broken down by month, and I want to aggregate all the months by borough. Currently I am passing each month to the aggregate method, but I don't think that is efficient at all. Is there another way to do this?

Below is what I have tried.

Borough_avg_Crime<- aggregate((X200801+X200802+X200803+X200804+X200805+X200806+X200807+X200808+X200809+X200810+X200811+X200812
                               +X200901+X200902+X200903+X200904+X200905+X200906+X200907+X200908+X200909+X200910+X200911+X200912
                               +X201001+X201002+X201003+X201004+X201005+X201006+X201007+X201008+X201009+X201010+X201011+X201012
                               +X201101+X201102+X201103+X201104+X201105+X201106+X201107+X201108+X201109+X201110+X201111+X201112
                               +X201201+X201202+X201203+X201204+X201205+X201206+X201207+X201208+X201209+X201210+X201211+X201212
                               +X201301+X201302+X201303+X201304+X201305+X201306+X201307+X201308+X201309+X201310+X201311+X201312
                               +X201401+X201402+X201403+X201404+X201405+X201406+X201407+X201408+X201409+X201410+X201411+X201412
                               +X201501+X201502+X201503+X201504+X201505+X201506+X201507+X201508+X201509+X201510+X201511+X201512
                               +X201601+X201602+X201603+X201604+X201605+X201606+X201607+X201608+X201609+X201610+X201611+X201612
                               +X201701+X201702+X201703+X201704+X201705+X201706+X201707+X201708+X201709+X201710+X201711+X201712
                               +X201801+X201802+X201803+X201804+X201805+X201806+X201807+X201808+X201809+X201810+X201811+X201812)
~Borough,data = data_crime,FUN = "mean")

It gives me the required result, but I was wondering whether there is a more efficient way to do this.
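
For reference, the same quantity (the mean of the per-row totals within each borough) can be computed without typing out every column. This is only a sketch on a toy data frame: the `Total` column and the `month_cols` helper are names introduced here for illustration, and it assumes every month column is named `X` followed by six digits, as in the call above.

```r
# Toy stand-in for data_crime; the real data has one X<yyyymm> column per month.
data_crime <- data.frame(
  Borough = c("Camden", "Camden", "Hackney", "Hackney"),
  X200801 = c(1, 3, 5, 7),
  X200802 = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

# Pick the month columns by name pattern instead of listing them by hand.
month_cols <- grep("^X[0-9]{6}$", names(data_crime), value = TRUE)

# Row-wise total across all months (hypothetical helper column "Total"),
# then the mean of those totals per borough.
data_crime$Total <- rowSums(data_crime[month_cols])
Borough_avg_Crime <- aggregate(Total ~ Borough, data = data_crime, FUN = mean)
print(Borough_avg_Crime)
```

`rowSums()` replaces the long chain of `+` operators, so the call stays the same length no matter how many months the dataset covers.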

Ahsan Hasan
  • Yes, just do `aggregate(. ~ Borough, data = data_crime[subsetColumns], mean)`, where `subsetColumns <- names(data_crime)[yourindexofcolumns]` – akrun Apr 16 '19 at 12:52
  • @akrun Hi, thank you for your quick reply, but I am getting this error: "Error in model.frame.default(formula = cbind(X200801, X200802, X200803, : invalid type (list) for variable 'Borough'" – Ahsan Hasan Apr 16 '19 at 13:01
  • This is what I did as per your instructions: `subsetColumns <- names(data_crime)[4:135]`, then `Borough_avg_Crime <- aggregate(. ~ Borough, data = data_crime[subsetColumns], FUN = "mean")` – Ahsan Hasan Apr 16 '19 at 13:02
  • It is difficult to understand the issue without a reproducible example. You can check the output of `aggregate(. ~ Species, iris, mean)` and see how it differs from your data – akrun Apr 16 '19 at 13:03
  • So much easier if you reshape your data from [wide-to-long](https://stackoverflow.com/q/2185252/680068) then aggregate. – zx8754 Apr 16 '19 at 13:06
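
The two suggestions in the comments can be sketched on a toy data frame as follows. The data, the `Category` column, and the names `month_cols`, `monthly_means`, `long`, and `monthly_totals` are all made up for illustration. One possible cause of the `invalid type (list)` error is that `data_crime[4:135]` no longer contains `Borough`, so the formula picks up some other object with that name; this is a guess, since the actual data was never posted. The sketch therefore keeps `Borough` in the subset explicitly. Note that `. ~ Borough` returns one mean per month column, which is not the same quantity as the mean of the summed months in the question.

```r
# Toy stand-in for data_crime (hypothetical values and Category column).
data_crime <- data.frame(
  Borough  = c("Camden", "Camden", "Hackney", "Hackney"),
  Category = c("a", "b", "a", "b"),
  X200801  = c(1, 3, 5, 7),
  X200802  = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

# Month columns selected by name; Borough is added back to the subset so the
# formula can find it inside the data frame.
month_cols <- grep("^X[0-9]{6}$", names(data_crime), value = TRUE)
subsetColumns <- c("Borough", month_cols)

# Dot-formula approach: mean of every month column per borough.
monthly_means <- aggregate(. ~ Borough, data = data_crime[subsetColumns], FUN = mean)

# Wide-to-long approach using base R's stack(): one row per borough/month
# value, with the month name in column "ind" and the count in "values".
long <- data.frame(
  Borough = rep(data_crime$Borough, times = length(month_cols)),
  stack(data_crime[month_cols]),
  stringsAsFactors = FALSE
)

# With long data, any grouping is a short formula, e.g. totals per borough
# and month.
monthly_totals <- aggregate(values ~ Borough + ind, data = long, FUN = sum)
```

Once the data is in long form, changing the grouping (per borough, per month, or both) only means changing the right-hand side of the formula.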
