1

I'm writing a package that imports (and uses) both SparkR::sql and dbplyr::sql. The other relevant questions involve unintentional collisions brought on my haphazard importing.

In my NAMESPACE I have:

importFrom(dbplyr, sql)
importFrom(SparkR, sql)

Both of these functions are used in the script and, being aware of the conflict, I'm sure to always prefix the package name:

dbplyr::sql(...)
SparkR::sql(...)

Nevertheless, I get an import replacement warning when building/checking the package:

Warning: replacing previous import ‘dbplyr::sql’ by ‘SparkR::sql’ when loading ‘my_pkg’

What I see in Writing R Extensions seems to be the following:

If a package only needs a few objects from another package it can use a fully qualified variable reference [foo::f] in the code instead of a formal import... This is slightly less efficient than a formal import and also loses the advantage of recording all dependencies in the NAMESPACE file (but they still need to be recorded in the DESCRIPTION file). Evaluating foo::f will cause package foo to be loaded, but not attached, if it was not loaded already—this can be an advantage in delaying the loading of a rarely used package.

Am I right that the takeaway of this / best practice is to:

  • Pick which function is used most often and add that to importFrom
  • Remove the "less-frequent" function from importFrom but keep that package in DESCRIPTION
  • Just use :: (perhaps preceded by require()) for the "less-frequent" function
zx8754
  • 52,746
  • 12
  • 114
  • 209
MichaelChirico
  • 33,841
  • 14
  • 113
  • 198
  • I think (especially in this case) that I'd always use the `::` notation, even if one is explicitly in `NAMESPACE` ... for clarity if nothing else. I agree that it seems counter-intuitive to be told "you should put it in `NAMESPACE`" and then get the warning when it's an intentional thing. Just my 2 cents. – r2evans Sep 06 '18 at 03:52

1 Answers1

3

I have always followed this advice:

It’s common for packages to be listed in Imports in DESCRIPTION, but not in NAMESPACE. In fact, this is what I recommend: list the package in DESCRIPTION so that it’s installed, then always refer to it explicitly with pkg::fun().

In your case:

  • Remove both importFrom
  • Keep both packages in Imports:
  • Use dbplyr::sql and SparkR::sql

My main motivation here is consistency: Even without any name clashes I want to always use the full name to make it clear when reading the code where some function comes from. If I do not use importForm and forget about using the full name at one place, R CMD ckeck will catch that. I value this code clarity higher than the (perceived) advantage of collecting dependencies in two places: DESCRIPTION and (more explicitly) NAMESPACE.

Ralf Stubner
  • 26,263
  • 3
  • 40
  • 75
  • This runs counter to the official advice, of course: "[Using `::`] is slightly less efficient than a formal import and also loses the advantage of recording all dependencies in the `NAMESPACE` file" – MichaelChirico Sep 06 '18 at 08:54
  • @MichaelChirico Agreed, but even in the case without name clashes I find readability of the source code more important, c.f. my expanded answer. – Ralf Stubner Sep 06 '18 at 09:28
  • I can just comment next to the code which function's being used & still get the efficiency gain of `importFrom` – MichaelChirico Sep 06 '18 at 09:36
  • @MichaelChirico For me such a comment would constitute a code smell. But that is just my opinion, so I will stop here. – Ralf Stubner Sep 06 '18 at 09:48