Part of my data is shown below:
CUSIP yearmon datafqtr PRIMEXCH date PRC VOL RET
1: 00003210 Nov 1970 1970 Q4 A 1970-11-16 9.875 3400 -0.091954
2: 00003210 Nov 1970 1970 Q4 A 1970-11-17 8.750 4100 -0.113924
3: 00003210 Nov 1970 1970 Q4 A 1970-11-18 9.125 5400 0.042857
4: 00003210 Nov 1970 1970 Q4 A 1970-11-19 9.375 3600 0.027397
5: 00003210 Nov 1970 1970 Q4 A 1970-11-20 9.625 3100 0.026667
6: 00003210 Nov 1970 1970 Q4 A 1970-11-23 9.250 1500 -0.038961
SHROUT NUMTRD vwretd ceqq S A A0
1: 2655 NA -0.001385 10.544 24558.75 0.05144521 2.094781e-06
2: 2655 NA 0.000824 10.544 24558.75 0.05144521 2.094781e-06
3: 2655 NA -0.007519 10.544 24558.75 0.05144521 2.094781e-06
4: 2655 NA 0.001180 10.544 24558.75 0.05144521 2.094781e-06
5: 2655 NA 0.009683 10.544 24558.75 0.05144521 2.094781e-06
6: 2655 NA 0.006372 10.544 24558.75 0.05144521 2.094781e-06
Aplus Aminus Aplus.market Aminus.market BTM
1: 0.03421433 0.06293247 0.05269694 0.04643831 0.0004293378
2: 0.03421433 0.06293247 0.05269694 0.04643831 0.0004293378
3: 0.03421433 0.06293247 0.05269694 0.04643831 0.0004293378
4: 0.03421433 0.06293247 0.05269694 0.04643831 0.0004293378
5: 0.03421433 0.06293247 0.05269694 0.04643831 0.0004293378
6: 0.03421433 0.06293247 0.05269694 0.04643831 0.0004293378
RET.month MOM1 MOM2 MOM3 MOM4
1: -0.1724146 NA NA NA NA
2: -0.1724146 NA NA NA NA
3: -0.1724146 NA NA NA NA
4: -0.1724146 NA NA NA NA
5: -0.1724146 NA NA NA NA
6: -0.1724146 NA NA NA NA
Each combination of CUSIP and yearmon defines one group, and the observations are at daily frequency. I want to keep only the observations in groups that have no more than 5 missing values in the variable VOL. That is, if a specific CUSIP in a specific month (yearmon) has more than 5 missing values in VOL, all observations for that CUSIP in that month should be deleted from the data.
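To make the intended rule concrete, here is a minimal sketch of the filter on toy data, assuming the data is a data.table (the printout suggests so). The object names dt, kept and max_na are hypothetical, the VOL values are made up, and yearmon is stored as plain character here rather than as a zoo yearmon object.

```r
library(data.table)

# Toy data (hypothetical values): two CUSIP/yearmon groups of 8 daily rows each.
dt <- data.table(
  CUSIP   = rep(c("00003210", "00003211"), each = 8),
  yearmon = rep("Nov 1970", 16),
  VOL     = c(3400, 4100, NA, 3600, 3100, 1500, 2000, 2500,  # 1 NA in this group
              NA, NA, NA, NA, NA, NA, 1200, 900)             # 6 NAs in this group
)

# Threshold from the question: a group survives with at most 5 missing VOL values.
max_na <- 5

# For each CUSIP/yearmon group, count the NAs in VOL and return the group's
# rows (.SD) only when the count is within the threshold; otherwise return
# nothing, which drops the whole group.
kept <- dt[, if (sum(is.na(VOL)) <= max_na) .SD, by = .(CUSIP, yearmon)]
```

In this sketch the first group (one missing VOL) is kept in full, including its NA row, while the second group (six missing VOL) is removed entirely.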