I am trying to analyze a dataset of homicide rates in Brazil. The data is simple, but I am still learning R, so it is not so simple for me. After creating subsets that group the data by year, state and region, I still can't figure out how to combine the regional subsets into one larger data set (states grouped by region). I would like to put all the regions into a single data set so that the plot shows one line per region instead of one line per state. It is probably simple, but I have spent a couple of hours googling and trying different code, and nothing has worked so far.
# One subset per region, selected by state code
North <- subset(Homicides, State %in% c('AM', 'RR', 'AP', 'PA', 'TO', 'RO', 'AC'))
Northeast <- subset(Homicides, State %in% c('MA', 'PI', 'CE', 'RN', 'PE', 'PB', 'SE', 'AL', 'BA'))
Midwest <- subset(Homicides, State %in% c('MT', 'MS', 'GO', 'DF'))
Southeast <- subset(Homicides, State %in% c('SP', 'RJ', 'ES', 'MG'))
South <- subset(Homicides, State %in% c('PR', 'RS', 'SC'))
AllRegions <- # How to group them so I can plot correctly?
And here is the plot code:
library(ggplot2)
library(ggthemes) # provides theme_hc() and scale_colour_hc()
ggplot(Homicides, aes(x = Year, y = TotalRate, group = State, color = State)) + # State should be replaced by the region here
  geom_line() +
  geom_point(size = 1) +
  ggtitle("Total Homicides") +
  theme_hc() +
  scale_colour_hc()
What the dataset looks like (for reference):
  State Year TotalRate FirearmsRate
1    AC 1979        34           13
2    AC 1980        26           12
3    AC 1981        28            8
4    AC 1982        41           18
5    AC 1983        33           12
6    AC 1984        36           13
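To make the goal concrete, here is a minimal sketch of one possible way to get there in base R. Rather than combining five separate subsets, it adds a Region column to the full data frame through a state-to-region lookup and then aggregates by region and year. Several things here are assumptions and not part of the original data or code: the names `regions`, `state_to_region` and `ByRegion` are hypothetical, the AM and SP rows in the toy data frame are made-up values used only so the snippet runs on its own, and taking the plain mean of the state rates is just one possible choice (it is not population-weighted, so it may not be the right regional rate).

```r
# Toy data in the same shape as Homicides. The AC rows come from the sample
# above; the AM and SP rows are invented purely for illustration.
Homicides <- data.frame(
  State        = c("AC", "AC", "AM", "AM", "SP", "SP"),
  Year         = c(1979, 1980, 1979, 1980, 1979, 1980),
  TotalRate    = c(34, 26, 20, 30, 10, 12),
  FirearmsRate = c(13, 12, 5, 9, 4, 6),
  stringsAsFactors = FALSE
)

# Region membership, taken from the subset() calls in the question
regions <- list(
  North     = c("AM", "RR", "AP", "PA", "TO", "RO", "AC"),
  Northeast = c("MA", "PI", "CE", "RN", "PE", "PB", "SE", "AL", "BA"),
  Midwest   = c("MT", "MS", "GO", "DF"),
  Southeast = c("SP", "RJ", "ES", "MG"),
  South     = c("PR", "RS", "SC")
)

# Named lookup vector: names are state codes, values are region names
state_to_region <- setNames(
  rep(names(regions), lengths(regions)),
  unlist(regions, use.names = FALSE)
)

# Add a Region column; as.character() guards against State being a factor
Homicides$Region <- unname(state_to_region[as.character(Homicides$State)])

# One row per region and year; mean() is an assumed summary, not a given
ByRegion <- aggregate(TotalRate ~ Region + Year, data = Homicides, FUN = mean)
print(ByRegion)
```

With a data frame like `ByRegion`, the plot call in the question would presumably only need `ByRegion` in place of `Homicides` and `group = Region, color = Region` in place of `State`.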