
I'm doing what seems to be a simple exercise. I'm supposed to create a histogram of the ages of the passengers on the Titanic. The data frame "titanic" is already specified. Since age is the sixth variable (I checked), I write the following:

hist(titanic[6])

This does not work for some reason. R generates an error message and tells me that "'x' must be numeric".

(The variable age is indeed numeric, I checked this both with the str()-function and by executing the titanic[6]-command outside of the function.)

Meanwhile I can write it like this without any problems:

hist(titanic$Age)

Why can I use one and not the other? When, in general, can I use brackets and when do I have to use the $-sign?

Magnus
  • 3
    *Very* closely related to https://stackoverflow.com/q/1169456, close enough that I'd consider closing as a dupe, though it is less detailed about the `$` operator. Bottom line: `$` *on a data frame* should always return a vector, `[` will always return a `list` or single-column `data.frame`, and `[[` will always return a vector. Two biggest differences: `[[` accepts a variable to indicate the column name (so `a <- "cyl"; mtcars[[a]]` works, `mtcars$a` does not here); and `$a` is the same as `mtcars[["a", exact=FALSE]]`, meaning `mtcars$cy` gives you the vec of `$cyl`. – r2evans Jun 02 '19 at 20:52
  • 1
    see `help("[")` – Rorschach Jun 02 '19 at 20:52
  • @duckmayr, I considered that ... but the `$`/`[` question has *GOT* to be a dupe somewhere ... – r2evans Jun 02 '19 at 20:55
  • The main problem with `$` is that it has partial matching, meaning that if you do `df$a` it will find the first thing that has "a" as its first letter. – Bruno Jun 02 '19 at 21:26
  • 1
    @Bruno - That's not completely correct. Partial matching will only occur when the match is unique (i.e. only a single variable begins with "a"). `NULL` will be returned if there are multiple partial matches. – Ritchie Sacramento Jun 02 '19 at 21:35
  • Interesting, I guess that makes it slightly less dangerous, still it breaks the minds of non R programmers. – Bruno Jun 02 '19 at 21:37
  • 2
    The fact that it does partial matching *as a default* is dangerous itself ... the fact that you cannot always rely on this behavior makes it even less predictable, not less dangerous (in my eyes). (Which is one reason I often -- but admittedly not always -- avoid `$` in some types of production code with user-provided data that is "supposed" to have certain columns.) – r2evans Jun 02 '19 at 21:42
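
The points made in the comments above can be checked with a small sketch. It assumes a made-up data frame, `titanic_demo`, standing in for the asker's `titanic` data (here `Age` is column 2, not column 6), and uses the built-in `mtcars` for the partial-matching part. `hist()` is called with `plot = FALSE` so nothing is drawn.

```r
# Hypothetical stand-in for the asker's data; names and values are invented.
titanic_demo <- data.frame(
  Name = c("A", "B", "C", "D", "E"),
  Age  = c(22, 38, 26, 35, 54)
)

# Single brackets keep the data frame wrapper, so the result is not numeric.
by_single <- titanic_demo[2]
class(by_single)        # "data.frame"
is.numeric(by_single)   # FALSE

# Double brackets and $ both extract the column itself as a vector.
by_double <- titanic_demo[[2]]
by_dollar <- titanic_demo$Age
is.numeric(by_double)   # TRUE

# hist() rejects the one-column data frame but accepts the vector.
single_result <- tryCatch(hist(by_single, plot = FALSE), error = function(e) e)
double_result <- hist(by_double, plot = FALSE)

# [[ accepts a column name stored in a variable; $ takes the name literally.
col <- "cyl"
via_variable <- mtcars[[col]]

# $ matches partially when the match is unique; [[ is exact by default.
partial_unique    <- mtcars$cy                     # only "cyl" starts with "cy"
partial_ambiguous <- mtcars$d                      # "disp" and "drat" both match
exact_default     <- mtcars[["cy"]]                # no column is named exactly "cy"
exact_off         <- mtcars[["cy", exact = FALSE]]
```

In this sketch `single_result` is an error condition whose message is the "'x' must be numeric" text from the question, while `partial_ambiguous` and `exact_default` are both `NULL`. For the original problem, `hist(titanic[[6]])` or `hist(titanic[, 6])` should behave like `hist(titanic$Age)`, assuming `titanic` is a plain `data.frame`.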

0 Answers