
In SAS, you can remove an observation with a simple if statement. For instance, if I wanted to delete the rows where Year is "#", I could write:

if year = "#" then delete;

Is there an equivalent way in R? I tried this:

DF <- DF[!(DF$Year == "#"), ]

Data set in R:

Year        Month
#           June
#           July 
#           August
2015        August

But when I run DF$Year, I still get # as a factor level, even though I thought it would be filtered out.

steppermotor
  • You're doing it right. It will still be a factor level, but that doesn't mean it actually exists in the vector `DF$Year`. You can drop the extra factor level if you need to with `droplevels` or by calling factor again: `DF$Year <- factor(DF$Year)`. – Brandon Bertelsen Jul 17 '15 at 23:20
  • @BrandonBertelsen, thanks. I ran `DF$Year <- factor(DF$Year)` and the # sign no longer shows up; it now shows 99 Levels: 1900 1901 1907 1916 1918 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 ... 2014. I'm trying to compute a correlation with an x variable that is numeric, and I thought the # sign was the only non-numeric value in my y-variable data. Perhaps I have a stray character somewhere, since the cor(x, y) function is warning me that y is not numeric. Is there a way to remove all non-numeric values? – steppermotor Jul 17 '15 at 23:27
  • Yeah, I ran is.numeric(DF2$Year) and it returned FALSE, so how do I get rid of those non-numeric values? Thanks – steppermotor Jul 17 '15 at 23:29
  • Well, a factor isn't numeric. It can be converted to numeric with `as.numeric(as.character(your_factor))`. Please open a new question for your next question, and include a reproducible example (see: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Brandon Bertelsen Jul 17 '15 at 23:30
  • Factors are factors, they are not numbers/numeric. Try translating them into numbers and see if it works: as.numeric(as.character(DF2$Year)) – Sharon Jul 17 '15 at 23:30
  • There's an echo in here... there's an echo in here... – Brandon Bertelsen Jul 17 '15 at 23:33

1 Answer


You're doing it right. `#` will still be listed as a factor level, but that doesn't mean it actually exists in the vector `DF$Year`: subsetting removes the rows, not the level definitions. You can drop the unused level if you need to with `droplevels`, or by calling factor again: `DF$Year <- factor(DF$Year)`.
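As a minimal sketch of the behaviour described above, the following rebuilds the question's four-row example (the data frame contents are taken from the question; storing both columns as factors via `stringsAsFactors = TRUE` is an assumption about how the data was read in):

```r
# Hypothetical reconstruction of the question's data; both columns are factors.
DF <- data.frame(
  Year  = c("#", "#", "#", "2015"),
  Month = c("June", "July", "August", "August"),
  stringsAsFactors = TRUE
)

# Remove the rows where Year is "#".
DF <- DF[DF$Year != "#", ]

levels(DF$Year)  # "#" is still listed as a level...
table(DF$Year)   # ...but no remaining row uses it (its count is 0)

# Drop unused levels from every factor column in the data frame.
DF <- droplevels(DF)
levels(DF$Year)
```

`droplevels(DF)` cleans up all factor columns at once, whereas `DF$Year <- factor(DF$Year)` only rebuilds the one column.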

If you need to convert a factor with numeric labels to a numeric vector, you can use:

as.numeric(        ## convert the labels to numbers
  as.character(    ## take the labels first, because a factor stores integer codes with labels
    your_factor
  )
)
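To illustrate why the `as.character` step matters, and how leftover non-numeric labels such as `#` could be located, here is a small sketch. The values in `year_factor` are made up for illustration, and filtering on `NA` is just one possible way to discard entries that fail to convert:

```r
# Hypothetical factor containing one non-numeric label.
year_factor <- factor(c("1900", "2014", "#", "1907"))

# Converting the factor directly returns the internal level codes, not the years.
as.integer(year_factor)

# Converting via character gives the real values; labels that are not numbers
# become NA (R would normally warn "NAs introduced by coercion").
years <- suppressWarnings(as.numeric(as.character(year_factor)))

# Show which original labels failed to convert.
year_factor[is.na(years)]

# Keep only the values that converted cleanly.
clean <- years[!is.na(years)]
```

When the values sit in a data frame, it is probably safer to drop the whole row where the conversion gives `NA`, so that the x and y columns stay aligned for the correlation.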
Brandon Bertelsen