I am inheriting some code and I think there is room for housekeeping, which includes tidying up the packages it loads.

In particular, I want to see both what the loaded packages are used for and (most importantly) whether there are any loaded-but-unused packages.

The code is long and the packages are many, so I would definitely prefer to automate this process.

As a small example, if I had:

packageload <- c("ggplot2", "readxl")
lapply(packageload, library, character.only = TRUE)

ggplot(diamonds, aes(x = cut)) +
  geom_bar()

I would want some output telling me that ggplot() comes from ggplot2 (so that package is in use), and that readxl is currently not used in the project code.

Matteo

1 Answer

I had been looking for a clear answer to this and finally, building on the useful function pointed out by @eh21 here, I put together this small approach. It does the job in three lines of code and can be replicated with little effort by anyone, including inexperienced programmers like me.

The idea is to place these lines after the packages have been loaded and before the actual project code (which does not need to be run to get the desired information), as below:

# Load packages ----

packageload <- c("ggplot2", "readxl")
lapply(packageload, library, character.only = TRUE)


# Find which packages the used functions belong to ----

used.functions <- NCmisc::list.functions.in.file(filename = "thisfile.R", alphabetic = FALSE) |> print()


# Find which loaded packages are not used ----

used.packages <- used.functions |> names() |> grep(pattern = "package:", value = TRUE) |> gsub(pattern = "package:", replacement = "") |> print()

unused.packages <- packageload[!(packageload %in% used.packages)] |> print()


# Actual project code (no need to be run) ----

ggplot(diamonds, aes(x = cut)) +
  geom_bar()

The relevant outputs are:

> used.packages
[1] "base"    "ggplot2"

> used.functions
$`character(0)`
[1] "list.functions.in.file"

$`package:base`
[1] "c"      "lapply" "print"  "names"  "grep"   "gsub"   

$`package:ggplot2`
[1] "ggplot"   "aes"      "geom_bar"

> unused.packages
[1] "readxl"
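
The project code does not need to run because the function names are read from the parsed source rather than from an executed session. As a rough base-R sketch of that idea (not necessarily what NCmisc does internally; the temporary script and its contents are made up for illustration), the calls can be collected with utils::getParseData() and looked up on the search path with utils::find():

```r
# Write a tiny stand-in script to a temporary file (hypothetical content)
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(1, 2, 3)",
  "m <- mean(x)",
  "print(paste('mean:', m))"
), script)

# Parse the file without evaluating it, then keep the function-call tokens
tokens <- utils::getParseData(parse(script, keep.source = TRUE))
called <- unique(tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"])

# Look up where each name is found on the current search path;
# names that are not found anywhere give an empty string
origin <- vapply(
  called,
  function(f) paste(utils::find(f), collapse = ", "),
  character(1)
)
print(origin)
```

Because find() only sees attached packages, a function from a package that is not loaded comes back with an empty origin, which is why these lines belong after the package-loading section.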

Notes:

  • This requires install.packages("NCmisc"). I did not load that package (and used :: instead) so that it does not appear among the used.packages.
  • If using RStudio and wanting to apply this to multiple scripts, replacing "thisfile.R" with rstudioapi::getSourceEditorContext()$path in NCmisc::list.functions.in.file will be handy.
  • The approach above works when lapply() is used on a named object to load packages. If packages are instead loaded without a named object (e.g. with a series of library() or require() calls), the # Load packages ---- section of the code above can be modified as follows:

# Load packages ----

packageload <- search()

library(ggplot2)
library(readxl)

packageload <- search()[!(search() %in% packageload)] |> grep(pattern = "package:", value = TRUE) |> gsub(pattern = "package:", replacement = "")
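
One further caveat, judging from the output format: when a function name is found in more than one attached package, the list name appears to combine them, for example c("package:dplyr", "package:stats"), and the grep()/gsub() step then no longer yields clean package names. A hedged sketch that extracts every package: token instead, using a hand-written stand-in for used.functions (the dplyr entry is hypothetical and not part of the example above):

```r
# Stand-in for the list returned by NCmisc::list.functions.in.file();
# the last element imitates a function found in two attached packages
used.functions <- list(
  "character(0)"    = "list.functions.in.file",
  "package:base"    = c("c", "lapply", "print"),
  "package:ggplot2" = c("ggplot", "aes", "geom_bar"),
  'c("package:dplyr", "package:stats")' = "filter"
)
packageload <- c("ggplot2", "readxl", "dplyr")

# Pull out every "package:<name>" token, even from combined names
nm   <- names(used.functions)
hits <- regmatches(nm, gregexpr("package:[A-Za-z0-9.]+", nm))
used.packages <- unique(sub("^package:", "", unlist(hits)))

# Loaded packages that no function call was attributed to
unused.packages <- setdiff(packageload, used.packages)
print(unused.packages)
```

Note that a package listed only in a combined name is counted as used here, even though the call may actually resolve to the other package, so such cases are worth checking by hand.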
Matteo
  • Great approach, but this only works for me if I replace grep(pattern = "packages:", value = TRUE) with grep(pattern = "package:", value = TRUE). – Julian Oct 06 '22 at 07:32