I get some weird output when subsetting a data.frame in R.
Here are the files I used.
After reading the data into R, there are 194 observations of 13 variables.
> str(WHO)
'data.frame': 194 obs. of 13 variables:
$ Country : Factor w/ 194 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Region : Factor w/ 6 levels "Africa","Americas",..: 3 4 1 4 1 2 2 4 6 4 ...
$ Population : int 29825 3162 38482 78 20821 89 41087 2969 23050 8464 ...
$ Under15 : num 47.4 21.3 27.4 15.2 47.6 ...
$ Over60 : num 3.82 14.93 7.17 22.86 3.84 ...
$ FertilityRate : num 5.4 1.75 2.83 NA 6.1 2.12 2.2 1.74 1.89 1.44 ...
$ LifeExpectancy : int 60 74 73 82 51 75 76 71 82 81 ...
$ ChildMortality : num 98.5 16.7 20 3.2 163.5 ...
$ CellularSubscribers : num 54.3 96.4 99 75.5 48.4 ...
$ LiteracyRate : num NA NA NA NA 70.1 99 97.8 99.6 NA NA ...
$ GNI : num 1140 8820 8310 NA 5230 ...
$ PrimarySchoolEnrollmentMale : num NA NA 98.2 78.4 93.1 91.1 NA NA 96.9 NA ...
$ PrimarySchoolEnrollmentFemale: num NA NA 96.4 79.4 78.2 84.5 NA NA 97.5 NA ...
But the result of subsetting with the subset function differs from the result of indexing with df[rows, ], as in the example below.
> Outliers <- WHO[WHO$GNI > 10000 & WHO$FertilityRate > 2.5,]
> nrow(Outliers)
[1] 27
> Outliers
Country Region Population Under15 Over60 FertilityRate LifeExpectancy ChildMortality CellularSubscribers
NA <NA> <NA> NA NA NA NA NA NA NA
23 Botswana Africa 2004 33.75 5.63 2.71 66 53.3 142.82
NA.1 <NA> <NA> NA NA NA NA NA NA NA
NA.2 <NA> <NA> NA NA NA NA NA NA NA
(trimmed ...)
There are a lot of rows that consist entirely of NA.
Using the subset function, on the other hand, yields the correct result.
> Outliers <- subset(WHO, GNI > 10000 & FertilityRate > 2.5)
> nrow(Outliers)
[1] 7
> Outliers
Country Region Population Under15 Over60 FertilityRate LifeExpectancy ChildMortality CellularSubscribers
23 Botswana Africa 2004 33.75 5.63 2.71 66 53.3 142.82
56 Equatorial Guinea Africa 736 38.95 4.53 5.04 54 100.3 59.15
63 Gabon Africa 1633 38.49 7.38 4.18 62 62.0 117.32
83 Israel Europe 7644 27.53 15.15 2.92 82 4.2 121.66
88 Kazakhstan Europe 16271 25.46 10.04 2.52 67 18.7 155.74
131 Panama Americas 3802 28.65 10.13 2.52 77 18.5 188.60
150 Saudi Arabia Eastern Mediterranean 28288 29.69 4.59 2.76 76 8.6 191.24
(trimmed ...)
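The difference can be reproduced on a small made-up data frame, which suggests the extra rows come from NA values in the logical index rather than from the WHO data itself. The sketch below is only an illustration under that assumption: the data frame df and its columns id, x and y are hypothetical stand-ins for WHO, GNI and FertilityRate. It shows that df[cond, ] returns an all-NA row wherever the condition evaluates to NA, while subset() treats NA as FALSE and drops those rows.

```r
# Hypothetical toy data (not the WHO file): x and y each contain one NA.
df <- data.frame(
  id = 1:5,
  x  = c(12000, NA, 15000, 8000, 20000),
  y  = c(3.0, 4.0, NA, 3.5, 1.0)
)

# The condition is NA wherever the comparison cannot be decided.
cond <- df$x > 10000 & df$y > 2.5

# Logical indexing keeps one all-NA row for every NA in cond.
with_brackets <- df[cond, ]

# subset() treats NA in the condition as FALSE.
with_subset <- subset(df, x > 10000 & y > 2.5)

# Two ways to make bracket indexing drop the NA cases as well.
with_which <- df[which(cond), ]
with_isna  <- df[!is.na(cond) & cond, ]

print(nrow(with_brackets))
print(nrow(with_subset))
```

If the same thing is happening with WHO, then wrapping the condition in which(), or adding !is.na(...) to it, should make WHO[...] return the same 7 rows as subset().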