I have performed this operation before, but this time it doesn't work and I don't know why. I have also not found a similar topic here which deals with factors.

I want to remove all rows from a dataset which have certain contents (see code below):

em_nineties <- data.frame(subset(em_df, !(Year == '1985-1987', '1985', 
'1986', '1987', '1988', '1989') ))

Error: unexpected ',' ...

Excerpt of data frame

     Year emissions Poll Country Sector
1553 1993   0.00000  CO2 Austria      6
1554 2006   0.00000  CO2 Austria      6
1555 2015   0.00000  CO2 Austria      6
2243 1998  12.07760  CO2 Austria      5
2400 1992  11.12720  CO2 Austria      5
2401 1995  11.11040  CO2 Austria      5
2402 2006  10.26000  CO2 Austria      5
2489 1998   0.00000  CO2 Austria      6

I don't understand why this is a problem now. I can't see any typos or ',' which shouldn't be there...

Does anybody have a quick idea?

Thanks a lot in advance!

Nordsee

  • For a start, you need to put the years in `c()` and test membership with `%in%`. Try this: `em_nineties <- data.frame(subset(em_df, !(Year %in% c('1985-1987', '1985', '1986', '1987', '1988', '1989'))))`. Also, isn't `em_df` already a data frame? Why wrap it again in `data.frame()`? – Shree Oct 12 '18 at 14:08
  • You probably need to use `%in%` and provide a vector of values, `c(...)`. Your code only checks `Year == '1985-1987'`, and R does not know what to do with the `,` that follows it. You can run this simple example: `subset(mtcars, !cyl %in% c(4,6))` – AntoniosK Oct 12 '18 at 14:09
  • @Shree much appreciated. It is working now. I need the "cleaned out" data frame for further processing – Nordsee Oct 12 '18 at 14:22
  • @AntoniosK, thank you also for your suggestions – Nordsee Oct 12 '18 at 14:23
  • Possible duplicate of [How can I subset rows in a data frame in R based on a vector of values?](https://stackoverflow.com/questions/15227887/how-can-i-subset-rows-in-a-data-frame-in-r-based-on-a-vector-of-values) – divibisan Oct 12 '18 at 15:39
  • You should also take a look at this question: https://stackoverflow.com/questions/9860090/why-is-better-than-subset – divibisan Oct 12 '18 at 15:39
  • @Shree I have tried to add another condition to the code you provided, but when I do, the restriction on the years is ignored: `m_nineties <- data.frame(subset(em_df, Pollutant_name == 'CO2', !(Year %in% c('1985-1987', '1985', '1986', '1987', '1988', '1989'))))` The CO2 restriction is applied, though. – Nordsee Oct 12 '18 at 16:19
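
A possible explanation for the last comment, offered as a sketch rather than a confirmed diagnosis of the real data: `subset()` takes one logical condition as its second argument, and a third positional argument is matched to `select`, which chooses columns. The year test there most likely evaluates to a single `TRUE` (keep all columns), so no rows are dropped by it. Joining the two conditions with `&` should give the intended filter. The toy `em_df` below is hypothetical (made-up values, `Year` assumed to be a factor, column name `Pollutant_name` taken from the comment), and `drop_years` is just an illustrative helper name.

```r
# Toy stand-in for em_df; the values are invented for illustration only.
em_df <- data.frame(
  Year = factor(c("1985-1987", "1986", "1989", "1993", "1998", "2006")),
  emissions = c(1.5, 2.0, 3.1, 0.0, 12.0776, 10.26),
  Pollutant_name = c("CO2", "CO2", "CH4", "CO2", "CH4", "CO2"),
  stringsAsFactors = FALSE
)

# Years to exclude, collected in one character vector.
drop_years <- c("1985-1987", "1985", "1986", "1987", "1988", "1989")

# Both conditions go into the single `subset` argument, joined with `&`.
# `%in%` compares factor values by their labels, so it works on a factor Year.
em_nineties <- subset(em_df, Pollutant_name == "CO2" & !(Year %in% drop_years))

# Year is a factor here, so remove the levels that no longer occur.
em_nineties <- droplevels(em_nineties)

print(em_nineties)
```

The same filter can be written with plain indexing, `em_df[em_df$Pollutant_name == "CO2" & !(em_df$Year %in% drop_years), ]`, which avoids the non-standard evaluation in `subset()` discussed in the question linked above.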
