Good afternoon,
After several attempts, I cannot get R to sum the data shown below. As the sample of my data shows, zip code 33024 appears in four rows. R keeps reporting that 33024 has only 2 injuries instead of adding up the injuries across all of its rows. Any help on this?
Edit: This should help as well. In the summary below, the Max for Injury stays at 3 and does not increase with the number of rows for a zip code that have an injury.
ZipCode      Age       Fatality     Injury          Year
33065 : 24   15 :28    Min.   :1    Min.   :1.000   2015:92
33313 : 18   18 :27    1st Qu.:1    1st Qu.:1.000   2016:67
33317 : 14   13 :21    Median :1    Median :1.000   2017:35
33076 : 13   17 :19    Mean   :1    Mean   :1.083
33026 : 11   12 :18    3rd Qu.:1    3rd Qu.:1.000
33311 : 11   14 :18    Max.   :1    Max.   :3.000
  ZipCode Age Fatality Injury Year
1   33023  17       NA      1 2015
2   33024   6       NA      1 2015
3   33024   8       NA      2 2015
4   33024  13       NA      1 2015
5   33024  13       NA      1 2015
6   33026  14       NA      1 2015
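# Load the CSV chosen interactively and inspect it
BCD = read.csv(file.choose())
BCD
head(BCD)
tail(BCD)
library(ggplot2)
str(BCD)

# Give the columns readable names
colnames(BCD) = c("ZipCode", "Age", "Fatality", "Injury", "Year")
head(BCD)
list(BCD$Injury)
list(BCD$ZipCode)

# These two calls only print the factors; they do not modify BCD
factor(BCD$Year)
factor(BCD$ZipCode)

# Convert the columns in place: Year, ZipCode, Age to factors; Injury, Fatality to numeric
BCD$Year = factor(BCD$Year)
BCD$ZipCode = factor(BCD$ZipCode)
BCD$Age = factor(BCD$Age)
BCD$Injury = as.numeric(BCD$Injury)
BCD$Fatality = as.numeric(BCD$Fatality)
str(BCD)
head(BCD)
summary(BCD)

# Plot the rows of BCD: Injury on x, ZipCode on y, colored by Age, sized by Year
BCD2 = ggplot(data = BCD, aes(x = Injury, y = ZipCode, color = Age, size = Year))
BCD2 + geom_point() + geom_smooth()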
This is my code so far. I am trying to produce a ggplot based on year, age, zip code, and the number of injuries that occurred in each zip code.
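To make the goal concrete, here is a minimal sketch of the kind of per-zip-code total described above, assuming base R's aggregate() is an acceptable way to do the summing. BCD_sample is a hypothetical data frame holding only the six rows from the head(BCD) output, and injury_by_zip and injury_by_zip_year are illustrative names that are not part of the script above.

```r
# Hypothetical sample: the six rows shown in the head(BCD) output
BCD_sample <- data.frame(
  ZipCode  = factor(c(33023, 33024, 33024, 33024, 33024, 33026)),
  Age      = c(17, 6, 8, 13, 13, 14),
  Fatality = NA_real_,
  Injury   = c(1, 1, 2, 1, 1, 1),
  Year     = factor(2015)
)

# Sum Injury within each zip code (one output row per ZipCode)
injury_by_zip <- aggregate(Injury ~ ZipCode, data = BCD_sample, FUN = sum)
print(injury_by_zip)

# Same idea, but one output row per ZipCode and Year combination
injury_by_zip_year <- aggregate(Injury ~ ZipCode + Year,
                                data = BCD_sample, FUN = sum)
print(injury_by_zip_year)
```

On these six rows the total for 33024 comes to 5 (1 + 2 + 1 + 1), which is the kind of number I would expect to see for that zip code. A data frame summed this way, rather than the raw rows, is presumably what should go into ggplot.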