Whenever I try to create a line graph in R, the plot does not use the data I supply for one of the axes. My goal is to show the years on the x axis and the number of deaths in each of those years on the y axis. However, when I plot it that way, R changes the death values; if I swap the axes, the values are displayed correctly.

I have tried removing some non-essential information and swapping the positions of the data.

totals = 21,945 22,321 22,849 22,425 22,856 23,968 23,500 24,514 23,584 24,473 25,801 27,335 26,316 27,289 27,414 28,300 27,901 28,704 29,782 29,690 31,555
years = 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017

plot(totals, years, type="l", xlab = "Years", ylab = "Deaths", main = "Deaths Over the Years")
plot(years, totals, type="l", xlab = "Years", ylab = "Deaths", main = "Deaths Over the Years")

The first line of code produces a graph with the axes the opposite way round from what I want and in the wrong style, although it uses the correct data: [inverted graph]

The second line uses the correct style and is supposed to show the data the right way round. However, while the Years axis is correct, the values on the Deaths axis are far smaller than they should be: [correctly formatted graph (corrupted data)]

1 Answer

Please provide a reproducible example with the actual data. That really helps you to get help. From your examples, I think you need to understand the type of variables you are working with. In this case, differences between character and numeric variables and factors matter.

I can reproduce your plots with the following code where the values you posted are copied and then entered with the scan function. Your data have been placed in a data frame.

Years <- scan() # copy and paste 'years' data
Deaths <- scan(what = "", sep = " ") # copy and paste 'totals' (deaths)
df <- data.frame(Years, Deaths)

I am using the formula interface for base plots where plot(y ~ x) is the same as plot(x, y). Notice what has been done to reproduce your graphs.

opar <- par(mfrow = c(1, 2))
main <- "Deaths Over the Years"
plot(Years ~ Deaths, df, main = main, ylab = "Deaths", xlab = "Years")
plot(as.numeric(Deaths) ~ Years, df, type = "l", main = main, ylab = "Deaths")
par(opar)

[Figure: reproduction of plots]

The values for 'Deaths' retained the comma. Because of this, they were imported as character variables, which data.frame() converts to a factor (this was the default before R 4.0.0). When a factor is converted to a numeric type, the result is the integer codes giving each value's position among the factor levels, not the original values. For your example, a simple solution is to remove the comma and convert the character variable to a numeric variable.
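To make the factor behaviour concrete, here is a minimal sketch using only three of the posted totals. The variable names (deaths_chr, deaths_fct, codes, deaths_num) are made up for illustration, and the factor is built explicitly with factor() so the result does not depend on the R version's data.frame() default.

```r
# Three of the posted death totals, kept as text with their commas
deaths_chr <- c("21,945", "22,321", "31,555")

# Build the factor explicitly (older data.frame() calls did this implicitly)
deaths_fct <- factor(deaths_chr)

# as.numeric() on a factor returns the level codes, not the printed values
codes <- as.numeric(deaths_fct)
print(codes)

# Stripping the commas first recovers the real numbers
deaths_num <- as.numeric(gsub(",", "", deaths_chr))
print(deaths_num)
```

The level codes are small integers, which is why the Deaths axis in the second plot shows values far below the real totals.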

Deaths2 <- as.numeric(gsub(",", "", Deaths))
df$Deaths2 <- Deaths2 # add this variable to the data frame
plot(Deaths2 ~ Years, df, type = "l", main = main, ylab = "Deaths")
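For a version that runs without pasting into scan(), the same fix can be sketched with the posted values typed in directly. This is only one possible layout: the names years, totals_chr and totals are chosen here for illustration, and the years are assumed to be the consecutive run 1997 to 2017 shown in the question.

```r
# Years from the question: 1997 through 2017
years <- 1997:2017

# Death totals exactly as posted, still as text with thousands separators
totals_chr <- c("21,945", "22,321", "22,849", "22,425", "22,856", "23,968",
                "23,500", "24,514", "23,584", "24,473", "25,801", "27,335",
                "26,316", "27,289", "27,414", "28,300", "27,901", "28,704",
                "29,782", "29,690", "31,555")

# Remove the commas, then convert the text to numbers
totals <- as.numeric(gsub(",", "", totals_chr))

# Guard against a mismatch between the two vectors before plotting
stopifnot(length(totals) == length(years))

# Years on the x axis, deaths on the y axis
plot(years, totals, type = "l", xlab = "Years", ylab = "Deaths",
     main = "Deaths Over the Years")
```

Because totals is numeric before it reaches plot(), no factor conversion is involved and the y axis shows the real counts.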
David O