I have this data set:
## fips SCC Pollutant Emissions type year
## 4 09001 10100401 PM25-PRI 15.714 POINT 1999
## 8 09001 10100404 PM25-PRI 234.178 POINT 1999
## 12 09001 10100501 PM25-PRI 0.128 POINT 1999
## 16 09001 10200401 PM25-PRI 2.036 POINT 1999
## 20 09001 10200504 PM25-PRI 0.388 POINT 1999
## 24 09001 10200602 PM25-PRI 1.490 POINT 1999
'data.frame': 2096 obs. of 6 variables:
$ fips : chr "24510" "24510" "24510" "24510" ...
$ SCC : chr "10100601" "10200601" "10200602" "30100699" ...
$ Pollutant: chr "PM25-PRI" "PM25-PRI" "PM25-PRI" "PM25-PRI" ...
$ Emissions: int 6 78 0 10 10 83 6 28 24 40 ...
$ type : chr "POINT" "POINT" "POINT" "POINT" ...
$ year : int 1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
fips: A five-digit number (represented as a string) indicating the U.S. county
SCC: The name of the source as indicated by a digit string (see source code classification table)
Pollutant: A string indicating the pollutant
Emissions: Amount of PM2.5 emitted, in tons
type: The type of source (point, non-point, on-road, or non-road)
year: The year of emissions recorded
I am trying to make a plot in ggplot to see whether emissions have increased or decreased over the years for each type of source. I would also like to add a linear model to show the trend.
This is what I've done so far:
GGplotGraph <- ggplot(PM25Baltimore, aes(x = year, y = Emissions, group = year, colour = type))
GGplotGraph <- GGplotGraph + geom_line() + facet_wrap(~ type) + theme(legend.position = "none")
GGplotGraph <- GGplotGraph + geom_smooth(method = "lm", formula = Emissions ~ year, se = FALSE, aes(group = 1))
This is the graph I get, but I would like the lines to be continuous from 1999 to 2008.
After doing some research on the topic, I understood that this happens because the grouping is wrong. I tried various combinations and converted the type column to a factor, but it still did not work.
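To make the grouping point concrete, here is a minimal sketch on made-up numbers. The `toy` data frame, its values, and the name `totals` are hypothetical stand-ins, not the real PM25Baltimore data. It assumes the goal is one line per source type showing the yearly total, so it sums Emissions per year and type first and then groups by type. With group = year, geom_line() only connects points that share the same year, which I believe is why the lines come out as separate vertical segments.

```r
library(ggplot2)

# Hypothetical stand-in for PM25Baltimore: made-up values, same column names,
# with several rows per year and type (as in the real data, one per SCC).
toy <- data.frame(
  year      = rep(c(1999L, 2002L, 2005L, 2008L), each = 4),
  type      = rep(c("POINT", "POINT", "NONPOINT", "NONPOINT"), times = 4),
  Emissions = c(6, 78, 10, 83,  28, 24, 40, 12,  15, 30, 22, 9,  5, 17, 11, 8)
)

# Collapse to one total per year and type, so each type has one point per year.
totals <- aggregate(Emissions ~ year + type, data = toy, FUN = sum)

# Group by type (not year) so geom_line() joins the years within each type.
p <- ggplot(totals, aes(x = year, y = Emissions, colour = type, group = type)) +
  geom_line() +
  facet_wrap(~ type) +
  theme(legend.position = "none")

print(p)
```

Whether summing is the right summary is an assumption on my part; a mean or median per year would go through the same aggregate() call with a different FUN.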
The other problem I have is with the linear model. I receive this error:
Error in model.frame.default(formula = formula, data = data, weights = weight, :
variable lengths differ (found for '(weights)')
Error in if (nrow(layer_data) == 0) return() : argument is of length zero
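For comparison, here is a minimal sketch of the smoothing layer on made-up totals. The `toy_totals` data frame and its values are hypothetical. As far as I can tell from the ggplot2 documentation, the formula argument of geom_smooth() is written in terms of the aesthetics x and y (for example y ~ x), not the original column names, so Emissions ~ year may be what triggers the error above. I have not confirmed that this is the only cause.

```r
library(ggplot2)

# Hypothetical yearly totals per source type (made-up values).
toy_totals <- data.frame(
  year      = rep(c(1999L, 2002L, 2005L, 2008L), times = 2),
  type      = rep(c("POINT", "NONPOINT"), each = 4),
  Emissions = c(84, 52, 45, 22, 93, 52, 31, 19)
)

# The formula refers to the mapped aesthetics (y ~ x), not to Emissions ~ year.
# Mapping colour to the discrete type column also groups the data by type.
p <- ggplot(toy_totals, aes(x = year, y = Emissions, colour = type)) +
  geom_line() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ type) +
  theme(legend.position = "none")

print(p)
```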
I found some explanations here, but my skills with debug, traceback, or recover are very limited.
I would like some advice on how to proceed or what to try next.