
I am new to R and I'm trying to subset a data frame, but I don't know how to do it for my needs. Specifically, I have a panel data frame ranging from 1987 to 2017, but some of the information I need is observed only in 2005, 2009, 2013 and 2017. Since I can assume this information is constant over time, it is sufficient that an individual has been observed in at least one of these years. How can I subset the data frame to keep all years for every individual who was observed in at least one of 2005, 2009, 2013, 2017? Thank you.

The idea is the following:

pid   year 
101   1984
101   1985
101   1986
101   1987
102   1984
102   1985
102   1986
102   1987
..
102   2005
102   2006
103   1990
103   1991
103   1992
103   1993
...
103   2005

What I would like is to keep all the information and all years for each pid that has at least one observation in 2005, 2009, 2013 or 2017.

zx8754
  • Try - `yourdf[rowSums(!is.na(yourdf)) >= 1, ]` – Shree Aug 05 '19 at 12:45
  • Can you provide a sample of data with dput(yourdata) – tom Aug 05 '19 at 12:45
  • @Shree I've tried your suggestion, but it returns exactly the same data frame – Luca Giangregorio Aug 05 '19 at 12:58
  • @tom I'm having trouble because the dataset is very large, but I'll try to give you an example. With id and year: id 1 has years from 1984 to 1990, and id 2 has years from 1984 to 2013. I want to keep all the years of id 2, as it has at least one year from the set (2005, 2009, 2013, 2017). Hope this is helpful – Luca Giangregorio Aug 05 '19 at 12:59
  • Try - `with(yourdf, yourdf[as.logical(ave(value_col, id, year, FUN = function(x) !all(is.na(x)))), ])` – Shree Aug 05 '19 at 13:13
  • What does `value_col` stand for here? – Luca Giangregorio Aug 05 '19 at 13:26
  • `value_col` is the observation you are checking for. You should share the output of `dput(head(yourdf))` for more specific help. – Shree Aug 05 '19 at 13:30
  • Duplicate of: [Remove group from data.frame if at least one group member meets condition](https://stackoverflow.com/questions/31661704/remove-group-from-data-frame-if-at-least-one-group-member-meets-condition) – Shree Aug 05 '19 at 13:31

2 Answers


A guess with base R:

yearOk <- which(dat$year %in% c(2005, 2009, 2013, 2017)) # rows whose year is in the target set
idOk <- unique(dat$id[yearOk]) # ids observed in at least one of those years
datOk <- dat[which(dat$id %in% idOk), ] # keep all rows (all years) for those ids
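To see the two-step filter in action, here is a minimal sketch on a made-up panel. The data frame `dat`, its values, and the helper names `target_years`, `ids_ok` and `dat_ok` are assumptions for illustration only; the column is called `id` as in the code above, whereas the question's sample uses `pid`, so adjust the name to your data.

```r
# Toy panel (hypothetical values); columns `id` and `year` are assumed names.
dat <- data.frame(
  id   = c(101, 101, 102, 102, 102, 103, 103),
  year = c(1984, 1985, 1984, 2005, 2006, 1990, 2013)
)

# Years in which the extra information is observed (taken from the question).
target_years <- c(2005, 2009, 2013, 2017)

# Step 1: ids that appear in at least one target year.
ids_ok <- unique(dat$id[dat$year %in% target_years])

# Step 2: keep every row (all years) belonging to those ids.
dat_ok <- dat[dat$id %in% ids_ok, ]

print(dat_ok)
```

In this toy data, id 101 is dropped because it has no row in a target year, while ids 102 and 103 keep all of their rows, including the years outside the target set.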
tom
  • That works perfectly! Thank you. I was using only the first condition and not applying it to the id. Thanks! – Luca Giangregorio Aug 05 '19 at 13:37
  • @akrun, given how little information there was, and since the OP could not make your suggestion work, I thought an ugly but easy-to-understand index filter might do the trick – tom Aug 05 '19 at 13:41

Here's a way using ave from base R -

yourdf[with(yourdf, ave(year, id, FUN = function(x) any(x %in% c(2005, 2009, 2013, 2017))) == 1), ]
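One thing to watch with `ave`: it returns a vector of the same type as its first argument, so when `year` is numeric the per-group TRUE/FALSE comes back as 1/0. Indexing rows with a numeric 0/1 vector does not filter (0 is dropped and 1 selects the first row), so the flag should be turned into a logical before subsetting. A small sketch on invented data, where `yourdf`, `flag`, `keep` and `result` are illustrative names only:

```r
# Toy panel (hypothetical values); columns `id` and `year` are assumed names.
yourdf <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  year = c(1984, 1985, 1986, 1984, 2005, 2013)
)

# Per-id flag: does this id have at least one target year?
# ave() keeps the numeric type of `year`, so the flag is stored as 0/1.
flag <- with(yourdf, ave(year, id,
                         FUN = function(x) any(x %in% c(2005, 2009, 2013, 2017))))

# Convert to logical before using it as a row index.
keep <- as.logical(flag)
result <- yourdf[keep, ]

print(result)
```

Here id 1 has no target year and is removed entirely, while all three rows of id 2 are kept.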
Shree