Error condition has length > 1 and only the first element will be used

Question

I am writing a function to count the number of NAs in the columns I am interested in. Below, pts_au is a balanced panel data set of country-years, PTS_A is the column whose NAs I want to count, and Country.string is the column I want to condition on.

> if (pts_au$Country.string == "China"){
+   sum(is.na(pts_au$PTS_A))
+ }
Warning message:
In if (pts_au$Country.string == "China") { :
  the condition has length > 1 and only the first element will be used

Basically I am trying to count the number of NAs in the PTS_A column, when the country name is China.

> unique(pts_au$Country.string)
[1] "Armenia"    "Azerbaijan" "Belarus"    "China"     
[5] "Kazakhstan" "Russia"     "Tunisia"    "Uganda"    
> str(pts_au$Country.string)
 chr [1:112] "Armenia" "Armenia" "Armenia" "Armenia" ...

What does this warning mean? Should I subset the data frame first and then apply is.na?

`if()` is used for program control flow, and its condition should evaluate to a single `TRUE` or `FALSE` value. Your `pts_au$Country.string == "China"`, however, returns a vector of `TRUE`/`FALSE` values, one per row. You should be able to subset to get your desired result: `sum(is.na(pts_au$PTS_A[pts_au$Country.string == "China"]))`, as you suggest. — thelatemail, Nov 07 '19 at 23:34
This question really needs to be added to the R FAQs. It is a duplicate of any of the following: https://stackoverflow.com/questions/14170778, https://stackoverflow.com/questions/58054034, https://stackoverflow.com/questions/26934710, https://stackoverflow.com/questions/34053043. Some answers lean towards vectorized solutions with `ifelse`, others towards subsetting the data so that the condition reduces to a single value. — r2evans, Nov 15 '19 at 17:47
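
To make the point in the comments above concrete, here is a minimal sketch of why the condition has more than one element. The `toy` data frame is a made-up stand-in for `pts_au`, not the asker's data. Note that in R 4.2.0 and later, passing such a vector to `if()` raises an error instead of the warning shown in the question.

```r
# Hypothetical stand-in for pts_au (values invented for illustration)
toy <- data.frame(
  Country.string = c("Armenia", "Armenia", "China", "China", "China"),
  PTS_A = c(2, NA, NA, 3, NA),
  stringsAsFactors = FALSE
)

# The comparison is vectorized: it yields one TRUE/FALSE per row
cond <- toy$Country.string == "China"
length(cond)

# if() wants exactly one value, so collapse the vector first
# when a single yes/no answer really is what is needed
if (any(cond)) {
  message("At least one China row is present")
}
```

Here `any()` (or `all()`) reduces the per-row vector to the single logical value that `if()` expects. For counting NAs within a group, though, subsetting is the more direct route.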

score 0 · Answer 1 · answered Nov 08 '19 at 11:23

Judging from what you are trying to achieve, I believe subsetting will do the job:

sum(is.na(pts_au$PTS_A[which(pts_au$Country.string == "China")]))

Or you can simply subset with the logical vector directly, without `which()`:

sum(is.na(pts_au$PTS_A[pts_au$Country.string == "China"]))

There is no need for an `if` statement to reach this goal.
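
Since the question mentions wrapping this in a function, the subsetting approach above could be sketched as follows. The helper name `count_na_for`, its arguments, and the `toy` data frame are all hypothetical and only serve as illustration. One difference between the two variants is worth knowing: if the country column itself contains NA, plain logical subsetting returns an NA element for each such row, which `is.na()` then counts, whereas `which()` drops those rows.

```r
# Hypothetical helper: count NAs in one column for a single country
count_na_for <- function(data, country,
                         value_col = "PTS_A",
                         country_col = "Country.string") {
  rows <- data[[country_col]] == country
  # which() keeps only the positions that are TRUE, so rows with a
  # missing country name do not inflate the count
  sum(is.na(data[[value_col]][which(rows)]))
}

# Made-up stand-in for pts_au (values invented for illustration)
toy <- data.frame(
  Country.string = c("Armenia", "Armenia", "China", "China", "China"),
  PTS_A = c(2, NA, NA, 3, NA),
  stringsAsFactors = FALSE
)

china_missing <- count_na_for(toy, "China")

# NA counts for every country at once, using base R only
missing_by_country <- tapply(is.na(toy$PTS_A), toy$Country.string, sum)
```

With this toy data, `china_missing` is 2, and `missing_by_country` holds one count per country name.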
