I am creating an R package that I plan to submit to CRAN. How can I check if any of my function names conflict with function names in packages already on CRAN? Before my package goes public, it's still easy to change the names of functions, and I'd like to be a good citizen and avoid conflicts where possible.

For instance, the packages MASS and dplyr both have functions called "select". I'd like to avoid that sort of collision.

Sam Firke
  • Related (but not exactly an answer): [List of functions of a package](http://stackoverflow.com/questions/22488645/list-of-functions-of-a-package) – Tensibai Aug 22 '16 at 15:30
  • It's not a huge deal in my opinion to have function names shared with other packages. Good programming techniques can easily avoid conflicts. Furthermore, the number of functions in CRAN packages is huge and checking them all could be very tedious. I'd just try to avoid very common words and the function names of the `base` package (and other basic packages like `stats`). I think it should be enough. – nicola Aug 22 '16 at 15:36
  • I agree generally with @nicola, I would only add that if there are a handful of packages that you think are likely to be loaded with yours, it probably is worth trying to avoid conflicts with functions in those packages. – joran Aug 22 '16 at 15:38
  • I usually go to [rdocumentation.org](http://rdocumentation.org) to search against **all** of CRAN and BioC at once. – Dirk Eddelbuettel Aug 22 '16 at 16:24
  • Do **not** do this. It leads to bad function names because the good ones are taken. Instead, encourage users to use properly namespace qualified names when such clashes occur. See [my corresponding explanation on reddit](https://www.reddit.com/r/rstats/comments/4y3uxz/namespace_best_practices_when_creating_a_new/d6l443p). – Konrad Rudolph Aug 22 '16 at 19:01

1 Answer

There are a lot of packages (9008 at the moment, Aug 2016), so it is almost certainly better to only look at a subset you want to avoid clashes with. Also, to re-emphasise some of the good advice in the comments (just for the record in case comments get deleted, or hidden):

  1. Sharing function names with other packages is not really a big problem, and not worth avoiding beyond, perhaps, clashes with common packages that are most likely to be loaded at the same time (thanks @Nicola and @Joran).
  2. Unnecessarily avoiding reuse of names "leads to bad function names because the good ones are taken" (@Konrad Rudolph).
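To illustrate the namespace-qualified approach recommended in the comments, here is a minimal sketch. The user-defined `filter` below is hypothetical and only stands in for a package function that reuses a name already exported by `stats`; it is not taken from any real package.

```r
# A hypothetical function that reuses a name exported by the stats package.
filter <- function(x, keep) x[keep]

x <- c(1, 2, 3, 4)

# The unqualified name resolves to whichever definition is found first on
# the search path (here, the one just defined in the global environment).
subset_result <- filter(x, x > 2)

# The qualified name always reaches the stats version, regardless of masking.
smoothed <- stats::filter(x, rep(1 / 2, 2))

print(subset_result)
print(smoothed)
```

Users who hit a clash can always disambiguate this way, which is why a shared name is more of an inconvenience than a breakage.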

But, if you really want to check all the packages, perhaps to at least know which packages use the same names as yours, you can get a vector of the package names by

    crans <- available.packages()[, "Package"]
    head(crans)
    #           A3        abbyyR           abc   ABCanalysis      abc.data      abcdeFBA 
    #         "A3"      "abbyyR"         "abc" "ABCanalysis"    "abc.data"    "abcdeFBA"
    length(crans)
    # [1] 9008

You can then install them in bulk using

    N <- 4  # only the first 4 packages here;
            # doing it for the whole lot will take a lot of time and disk space!
    install.packages(crans[1:N])

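Then you can get a list of the names each of these packages exports (these are the names that can actually clash with yours once a package is attached) with

    # named list: one character vector of exported names per package
    existing_functions <- lapply(crans[1:N], getNamespaceExports)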
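Once the names are collected, finding the clashes is a set intersection. The following is a minimal sketch, not a definitive implementation: `my_functions` is a hypothetical vector of names from the package under development, and `stats` and `utils` are used as the comparison set only because they ship with R, so nothing needs to be installed to run it.

```r
# Hypothetical function names from the package under development.
my_functions <- c("filter", "head", "clean_names")

# Packages to check against (assumption: base R packages, for a
# self-contained example; substitute the packages you care about).
pkgs <- c("stats", "utils")

# Named list: one character vector of exported names per package.
exports <- lapply(setNames(pkgs, pkgs), getNamespaceExports)

# For each package, keep only the exported names that also appear in
# my_functions, then drop packages with no overlap.
clashes <- lapply(exports, intersect, y = my_functions)
clashes <- clashes[lengths(clashes) > 0]

print(clashes)
```

The result is a named list keyed by package, so it shows both which names clash and where, which helps decide whether the other package is likely to be loaded alongside yours.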
Thomas
dww
  • Probably easier to crawl https://cloud.opencpu.org/ocpu/test/ which already has all packages installed and exposes them in an easy-to-crawl way, e.g. https://cloud.opencpu.org/ocpu/library/dplyr/R/ – Ruben Aug 09 '17 at 09:03