I have a data frame containing various columns, including sender_bank_flag. I ran the following two commands on it.

sum(s_50k_sample$sender_bank_flag, na.rm=TRUE)

sum(s_50k_sample$sender_bank, na.rm=TRUE)

I got the same output from both commands, even though my data frame has no column named sender_bank. I expected the second command to raise an error. I didn't know R had such functionality! Does anyone know what exactly this functionality is and how it can be put to better use?

starball
Harsh Khad
  • This is because of partial matching behind `$`. See `?Extract`. Try `s_50k_sample[["sender_bank_flag"]]` and `s_50k_sample[["sender_bank"]]` – Zheyuan Li Aug 28 '18 at 11:43
  • 李哲源, please post this as an answer. `options(warnPartialMatchDollar=TRUE)` might be of interest as well ... – Ben Bolker Aug 28 '18 at 12:47
  • More info in the Advanced R [chapter on subsetting](http://adv-r.had.co.nz/Subsetting.html#subsetting-operators), especially the part about the `$` operator, in the [R language definition 3.4](https://cran.r-project.org/doc/manuals/R-lang.html#Subset-assignment), and in this [argument matching post](https://stackoverflow.com/questions/14153904/partial-matching-of-function-argument#14155259) on SO, though this last one is more about matching function argument names. – phiver Aug 28 '18 at 13:31
  • Yes, thanks a lot to both of you. Really helpful. PS: Just had a look at this today, hence the delayed reply! :) – Harsh Khad Oct 05 '18 at 15:04

1 Answer

It is probably worthwhile to consolidate the comments into an answer.


Both my comment and Ben Bolker's point to the documentation page ?Extract:

Under Recursive (list-like) objects:

Both "[[" and "$" select a single element of the list. The main difference is that "$" does not allow computed indices, whereas "[[" does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of "[[" can be controlled using the exact argument.
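As a minimal sketch of that equivalence, the snippet below uses a small made-up data frame (the values and the `amount` column are assumptions for illustration, standing in for `s_50k_sample`). It compares `$`, `[[` with its default `exact = TRUE`, and `[[` with `exact = FALSE`.

```r
# Hypothetical stand-in for the asker's data frame; values are made up.
df <- data.frame(
  sender_bank_flag = c(1, 0, NA, 1),
  amount           = c(10, 20, 30, 40)
)

# `$` partially matches "sender_bank" to the only column starting with it.
via_dollar <- df$sender_bank

# `[[` matches exactly by default, so a missing name yields NULL.
via_exact <- df[["sender_bank"]]

# Relaxing `exact` makes `[[` behave like `$`.
via_loose <- df[["sender_bank", exact = FALSE]]

sum(via_dollar, na.rm = TRUE)     # sums the sender_bank_flag column
is.null(via_exact)                # TRUE: no column has exactly this name
identical(via_dollar, via_loose)  # TRUE: both return sender_bank_flag
```

Note that `sum(NULL, na.rm = TRUE)` returns 0 rather than raising an error, so switching to `[[` alone does not make a mistyped column name fail loudly in a call like the one in the question.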

Under Character indices:

Character indices can in some circumstances be partially matched (see ?pmatch) to the names or dimnames of the object being subsetted (but never for subassignment). Unlike S (Becker et al p. 358), R never uses partial matching when extracting by "[", and partial matching is not by default used by "[[" (see argument exact).
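The two "never" cases in this passage can be illustrated with a small sketch. The list below is a made-up example, not the asker's data: `[` looks for the exact name only, and assigning through `$` creates a new element instead of overwriting the partially matched one.

```r
# Hypothetical list with a single named element.
lst <- list(sender_bank_flag = c(1, 0, 1))

# `[` never partially matches: the result holds a NULL element,
# not the contents of sender_bank_flag.
by_bracket <- lst["sender_bank"]
is.null(by_bracket[[1]])  # TRUE

# Subassignment never partially matches either: this adds a new
# element called sender_bank and leaves sender_bank_flag untouched.
lst$sender_bank <- 0
names(lst)
lst$sender_bank_flag
```

After the assignment, `lst$sender_bank` is an exact match for the new element, so it no longer falls back to `sender_bank_flag`.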

Thus the default behaviour is to use partial matching only when extracting from recursive objects (except environments) by "$". Even in that case, warnings can be switched on by options(warnPartialMatchDollar = TRUE).
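To see the warning option in action, here is a hedged sketch on a made-up data frame. It switches `warnPartialMatchDollar` on, captures the warning raised by a partial `$` match as text, and then restores the previous option value. The exact wording of the message may differ between R versions, so the code only stores it in `msg`.

```r
# Hypothetical stand-in for the asker's data frame.
df <- data.frame(sender_bank_flag = c(1, 0, NA, 1))

# options() returns the previous values, which lets us restore them later.
old <- options(warnPartialMatchDollar = TRUE)

# Capture the warning message instead of letting it print;
# msg stays NA if no warning is raised.
msg <- tryCatch(
  {
    df$sender_bank
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

options(old)  # restore the earlier setting

msg
```

Setting the option once, for example in a project-level `.Rprofile`, is one way to surface accidental partial matches early. Whether that suits a given workflow is a judgment call.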

Note that the manual is rich in information, so make sure you digest it fully. I formatted the quoted content and linked Stack Overflow threads where relevant.


The links in phiver's comment are worth reading in the long term.

Community
Zheyuan Li