
This is my dataframe using dput():

structure(list(Year = 1900:1903, Top.10..income.share = structure(c(82L, 
81L, 76L, 75L), .Label = c("", "30,3", "30,65", "30,8", "31,3", 
"31,37", "31,38", "31,4", "31,5", "31,51", "31,52", "31,55", 
"31,62", "31,64", "31,66", "31,67", "31,69", "31,75", "31,77", 
"31,8", "31,81", "31,82", "31,85", "31,9", "31,98", "32,01", 
"32,03", "32,04", "32,05", "32,07", "32,11", "32,12", "32,2", 
"32,35", "32,36", "32,42", "32,43", "32,44", "32,5", "32,62", 
"32,64", "32,67", "32,72", "32,82", "32,87", "33,02", "33,22", 
"33,4", "33,69", "33,72", "33,76", "33,87", "33,95", "34,25", 
"34,4", "34,57", "34,62", "34,71", "35,49", "36,3", "36,48", 
"37,26", "37,3", "37,73", "37,77", "37,78", "37,84", "37,92", 
"38,01", "38,1", "38,2", "38,38", "38,4", "38,47", "38,52", "38,59", 
"38,6", "38,63", "38,84", "38,91", "38,99", "39,13", "39,31", 
"39,48", "39,6", "39,82", "39,9", "40,29", "40,54", "40,59", 
"40,75", "41,02", "41,16", "41,52", "41,73", "41,98", "42,12", 
"42,23", "42,36", "42,67", "42,76", "42,86", "42,95", "43", "43,07", 
"43,11", "43,26", "43,35", "43,39", "43,64", "43,76", "44,07", 
"44,17", "44,4", "44,43", "44,57", "44,67", "44,77", "44,94", 
"45,03", "45,16", "45,47", "45,5", "45,67", "45,96", "46,09", 
"46,3", "46,35", "46,54"), class = "factor")), .Names = c("Year", 
"Top.10..income.share"), row.names = c(NA, 4L), class = "data.frame")

My sessionInfo():

R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] rstudio_0.97.248 tools_2.15.0  

What I want to do is to plot the Year column against the Top10 column.
But since there are quite a few missing values, the graph looks pretty awful.
Here is the code I wrote to plot this so far.

require(stats)

tid_a = read.csv("tid-a.csv", header=TRUE, sep=";")
germany <- tid_a[1:111, ]
usa <- tid_a[112:210, ]

germany_year <- germany[, 2]
germany_top10 <- na.omit(germany[, 3])

plot(germany_year, germany_top10, type="l")

I've tried a few examples I found online to omit the missing values, but I'm a noob and can't get it to work.

LukasKawerau

1 Answer

This question is not reproducible, but something like this should work:

gdat <- na.omit(subset(germany,select=c(Year,Top10)))
plot(Top10~Year,data=gdat,type="l")

(The use of subset in this way is fairly rare -- germany[,c("Year","Top10")] is more common -- but I like it because it's readable. A simpler version is plot(...,data=na.omit(germany)), but that will omit rows with NAs in any column of the data frame.)
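As a minimal sketch of the approach above, assuming a small made-up data frame: the column names Year, Top10 and Other and all values are hypothetical (in the dput() output the real column is called Top.10..income.share).

```r
# Hypothetical toy data; the values are invented for illustration only
germany <- data.frame(
  Year  = 1900:1905,
  Top10 = c(45.1, NA, 44.3, NA, 43.9, 43.2),
  Other = c(NA, 1, 2, 3, 4, 5)
)

# Keep only the two columns of interest, then drop rows with an NA in either
gdat <- na.omit(subset(germany, select = c(Year, Top10)))

# Both columns now have the same length, so the line plot is well defined
plot(Top10 ~ Year, data = gdat, type = "l")
```

With this toy data, gdat keeps four rows, whereas na.omit(germany) would keep only three, because the 1900 row has an NA in the unrelated Other column.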

Ben Bolker
  • Thanks Ben, that really helped :) But now I get an error saying "Grafikparameter "type" ist veraltet" (Graphics parameter "type" is old/deprecated), what should I do? – LukasKawerau Feb 10 '13 at 19:06
  • Give a reproducible example (as indicated in the link in my answer), or at least post the results of `sessionInfo()` -- I can't reproduce the error. – Ben Bolker Feb 10 '13 at 19:17
  • You need to use `read.csv2()` to read your data, or specify `dec=","`: see http://stackoverflow.com/questions/6123378/how-to-read-in-numbers-with-a-comma-as-decimal-separator . Because R isn't expecting a comma to be used as a decimal separator, your numeric values are being read in as factors. – Ben Bolker Feb 10 '13 at 19:44
  • You, Sir, are the Chief of Chiefs. Thank you. – LukasKawerau Feb 10 '13 at 19:51
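The decimal-separator fix from the comments can be sketched roughly as follows. The inline text is a hypothetical stand-in for tid-a.csv (its real layout is not shown in the question), and the column names Country, Year and Top10 are assumptions.

```r
# Hypothetical semicolon-separated content with comma decimals,
# standing in for the real tid-a.csv
csv_text <- "Country;Year;Top10\nGermany;1900;45,1\nGermany;1901;\nGermany;1902;44,3"

# read.csv() with only sep=";" expects "." as the decimal mark,
# so "45,1" is not parsed as a number
bad <- read.csv(text = csv_text, sep = ";")
is.numeric(bad$Top10)

# read.csv2() defaults to sep=";" and dec=",", so the column becomes numeric
# and the blank field becomes NA
good <- read.csv2(text = csv_text)
is.numeric(good$Top10)

# Equivalent: keep read.csv() but state the decimal separator explicitly
good2 <- read.csv(text = csv_text, sep = ";", dec = ",")
```

Once the column is numeric, na.omit() and plot(..., type="l") behave as in the answer.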