15

At work, I have a Windows 7 computer running R 3.1.2.

I have a file called packages.R. In this file, I have the following code:

library(dplyr)
library(sqlutils)
library(RODBC)

My .Rprofile contains a function called .First.

.First <- function() {
    source("R/packages.R")
}

When I load R, I get the following output:

Loading required package: roxygen2
Loading required package: stringr
Loading required package: DBI

Attaching package: 'dplyr'

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

If you look at this carefully, you will see the filter from stats is not masked.

But if I take the exact same setup, comment out the library(dplyr) statement in packages.R, save the file, restart R, and then type the following by hand:

library(dplyr)

Attaching package: 'dplyr'

The following object is masked from 'package:stats':

    filter

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Now it masks filter from package:stats.

I don't get it. I need to use the filter command from dplyr a lot for this project and I don't want to type dplyr::filter in order to use it. Could someone please help my weak mind understand why this is behaving this way? I have tried starting R in RStudio and ESS, and I get the exact same behavior in both. I also tried moving dplyr to the end of the packages.R file, with no difference to the results. I just want to mask stats::filter. Thanks.

Choens
  • When you load libraries in .Rprofile they get attached very early in the R startup process, before the stats package is attached. The other way, you're attaching dplyr after stats has already been loaded. I've seen Hadley recommend against loading packages in .Rprofile for this reason (discrepancies in package loading order). – joran Nov 14 '14 at 17:03
  • ...I suppose you could try adding `library(stats)` at the start of the script you're sourcing in .Rprofile. – joran Nov 14 '14 at 17:05
  • For grins and giggles, I added library(stats) to my packages.R file. But that seems like an unnecessarily complicated thing to have to do. I guess I will pull those two commands out of my .First() and move them to the front of all my analytic files. I hate having templates that are full of the same thing over and over again, but I guess in this instance it is the better option. – Choens Nov 14 '14 at 17:18
  • @joran - I can't give you credit for the answer since you just left a comment. If you post an answer to the effect of what you said, I'll check it off. – Choens Nov 14 '14 at 17:20
  • Couldn't you do `filter <- dplyr::filter` at the top of the script and they would essentially be reversed, calling `stats::filter` to get the stats version? Dirty, but it would work. – Rich Scriven Nov 14 '14 at 17:31
  • Yeah. That would work. But it is ugly / hackish. I liked Joran's solution better simply because I learned a little more about how R starts. – Choens Nov 18 '14 at 01:51

2 Answers

19

When you load libraries in .Rprofile, they get attached very early in the R startup process, before the stats package is attached. When you type library(dplyr) by hand instead, you're attaching dplyr after stats has already been attached, so dplyr lands ahead of stats on the search path and masks its filter. You can learn about R's startup process by typing ?Startup. There it says:

Note that when the site and user profile files are sourced only the base package is loaded, so objects in other packages need to be referred to by e.g. utils::dump.frames or after explicitly loading the package concerned.
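If you want to check which filter is currently winning, one way is to look at the search path. The snippet below is only a sketch: which_wins is a hypothetical helper name (not part of base R or dplyr) that wraps utils::find, which lists the attached packages that define a name in search-path order.

```r
# Hypothetical helper: report which attached package currently supplies
# a function name. Earlier entries on the search path take precedence.
which_wins <- function(name) {
  hits <- utils::find(name, mode = "function")  # matches, in search-path order
  if (length(hits) == 0) NA_character_ else hits[1]
}

search()              # attached packages, in lookup order
which_wins("filter")  # "package:dplyr" if dplyr masks stats, otherwise "package:stats"
```

Running this right after startup in both setups should show whether dplyr or stats comes first.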

I've seen Hadley recommend against loading packages in .Rprofile for this reason, i.e. the discrepancies in package loading order, although personally I don't have strong feelings about it.

One possible solution is to simply add library(stats) as the very first library call in your script, before loading dplyr.
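As a minimal sketch of that first option, assuming the .First() and R/packages.R layout from the question, the change could look like this. Putting library(stats) at the top of packages.R itself should have the same effect.

```r
# Sketch of a .Rprofile, assuming packages.R lives in R/ as in the question.
.First <- function() {
    # Attach stats explicitly so it is already on the search path
    # when dplyr is attached; dplyr::filter then masks stats::filter.
    library(stats)
    source("R/packages.R")  # contains library(dplyr), etc.
}
```

After restarting R, the startup messages should include the line saying filter is masked from 'package:stats'.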

Another (long term) option to avoid these sorts of issues more globally would be to transition your workflows from "a large collection of scripts" to one or more packages.

Salim B
joran
-1

I had exactly the same issue and it is so annoying. If you want to suppress the warning messages as I did :-), you can load with library(dplyr, warn.conflicts = FALSE).

Desta Haileselassie Hagos