I have a Python subprocess to call R:

import subprocess

cmd = ['Rscript', 'Rcode.R', 'file_to_process.txt']
out = subprocess.run(cmd, universal_newlines=True, stdout=subprocess.PIPE)
lines = out.stdout.splitlines()  # split stdout into lines

My R code first checks if the 'ape' package is installed before proceeding:

if (!require("ape")) install.packages("ape")
library(ape)
# do_R_stuff .......
# return_output_to_Python

Previously, the whole process from Python to R worked perfectly: R was called and the processed output was returned to Python. That changed when I added the first line (if (!require("ape")) install.packages("ape")). Now, when ape is uninstalled in R, Python reports: "there is no package called 'ape'". I have tried adding wait instructions to both the R and Python scripts, but I can't get it working. Run on its own, the R code works.

The full error output from Python is:

Traceback (most recent call last):

  File ~\Documents\GitHub\wolPredictor\wolPredictor_MANUAL_parallel.py:347 in <module>
    if __name__ == '__main__': main()

  File ~\Documents\GitHub\wolPredictor\wolPredictor_MANUAL_parallel.py:127 in main
    cophen, purge, pge_incr, z, _ = R_cophen('{}/{}'.format(dat_dir, tree), path2script) #get Dist Mat from phylogeny in R

  File ~\Documents\GitHub\wolPredictor\wolPredictor_MANUAL_parallel.py:214 in R_cophen
    purge = int(np.max(cophen) * 100) + 1 #max tree pw distance

  File <__array_function__ internals>:5 in amax

  File ~\anaconda3\lib\site-packages\numpy\core\fromnumeric.py:2754 in amax
    return _wrapreduction(a, np.maximum, 'max', axis, None, out,

  File ~\anaconda3\lib\site-packages\numpy\core\fromnumeric.py:86 in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)

ValueError: zero-size array to reduction operation maximum which has no identity


Loading required package: ape
Installing package into 'C:/Users/Windows/Documents/R/win-library/4.1'
(as 'lib' is unspecified)
Error in contrib.url(repos, "source") : 
  trying to use CRAN without setting a mirror
Calls: install.packages -> contrib.url
In addition: Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called 'ape'
Execution halted
furas
user3329732
  • Try adding dependencies=TRUE to the install packages parameters. Perhaps you don’t have one of ape’s requirements for installation or loading. – IRTFM Jun 26 '22 at 02:27
  • @IRTFM good idea but didn't work thanks – user3329732 Jun 26 '22 at 02:37
  • It should have produced more messages. I’m guessing you have not updated your question to include the entire system response. Note: The path to R libraries may not be the same when using Rscript. – IRTFM Jun 26 '22 at 02:40
  • As I said, the subprocess call worked before - it's not a path problem - output added in main body – user3329732 Jun 26 '22 at 02:48
  • The error says there was difficulty in finding the CRAN repository. – IRTFM Jun 26 '22 at 06:15
  • Thanks @IRTFM I had to set a new sub process in a new .R file specifically for the install, but the CRAN mirror was key, I never realised it would be an issue as it's not a requirement on my local machine (not sure why it becomes an issue thru subprocess) – user3329732 Jun 26 '22 at 08:54
  • *the R code works in isolation.*...How did you run the R code? With RStudio which runs with various global settings and options? Try using vanilla `Rscript` and you may reproduce the Python subprocess issue. When trying to automate R, always test scripts outside of IDEs. – Parfait Jun 26 '22 at 13:02
  • I see *Anaconda* in traceback. This suggests you may be running Python in a different environment than R. So, `ape` may be installed in library of R in base environment but not R in Anaconda where latter cannot reach CRAN version. – Parfait Jun 26 '22 at 13:06

2 Answers

Point 1: The path to R libraries when using R in a standalone mode may not be the same when using Rscript.

Point 2: The error says there was difficulty in finding the CRAN repository, so perhaps the options that set the repos were not set for the Rscript environment. They can be set with the repos argument in the call to install.packages() or with an options(repos = ...) call.
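Driving that from Python, passing the mirror directly in the install.packages() call might look like the minimal sketch below. The mirror URL https://cloud.r-project.org and the helper names build_install_cmd and ensure_r_package are assumptions for illustration; they are not part of the original scripts.

```python
import subprocess

# Assumption: any reachable CRAN mirror would do; this one is only an example.
CRAN_MIRROR = "https://cloud.r-project.org"


def build_install_cmd(package, repos=CRAN_MIRROR):
    """Build an Rscript command that installs `package` only if it is missing.

    The mirror is passed explicitly, so the call does not depend on a
    `repos` option having been set in an R startup file.
    """
    r_expr = (
        'if (!requireNamespace("{pkg}", quietly = TRUE)) '
        'install.packages("{pkg}", repos = "{repos}")'
    ).format(pkg=package, repos=repos)
    return ["Rscript", "-e", r_expr]


def ensure_r_package(package, repos=CRAN_MIRROR):
    """Run the install check and return the CompletedProcess for inspection."""
    return subprocess.run(
        build_install_cmd(package, repos),
        universal_newlines=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
```

Calling ensure_r_package("ape") before the main Rscript call, and looking at its returncode and stderr, should show whether the install step itself succeeded.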

The OP wrote: "Thanks @IRTFM I had to set a new sub process in a new .R file specifically for the install, but the CRAN mirror was key, I never realised it would be an issue as it's not a requirement on my local machine (not sure why it becomes an issue thru subprocess)."

The places to find more information are the ?Startup help page and the ?Rscript page. Rscript has many fewer defaults. Even the usual set of recommended packages may not get loaded by default. The Rscript help page includes these flags, which could be used for debugging and for setting a proper path to the desired libraries:

--verbose gives details of what Rscript is doing.

--default-packages=list where list is a comma-separated list of package names or NULL. Sets the environment variable R_DEFAULT_PACKAGES which determines the packages loaded on startup.
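As a rough sketch, these two flags could be added to the Python side of the call as below. The helper name build_debug_cmd and the default package list (utils and stats) are assumptions chosen for illustration; install.packages() lives in utils, which is why it is included.

```python
def build_debug_cmd(script, args=(), default_packages=("utils", "stats")):
    """Build an Rscript command with --verbose and an explicit package list.

    Rscript options must come before the script name; anything after the
    script name is passed through to the script as its own arguments.
    """
    cmd = ["Rscript", "--verbose"]
    if default_packages is not None:
        cmd.append("--default-packages=" + ",".join(default_packages))
    cmd.append(script)
    cmd.extend(args)
    return cmd


# Example: the command from the question, with the debugging flags added.
debug_cmd = build_debug_cmd("Rcode.R", ["file_to_process.txt"])
```

The resulting list can be handed to subprocess.run() in place of the original cmd.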

Here is a previous similar SO question with an answer that includes some options for construction of a proper working environment: Rscript: There is no package called ...?

There are a few R packages on CRAN that aid in overcoming some of the differences between programming for standalone R and Rscript.

getopt: https://cran.r-project.org/web/packages/getopt/index.html

optparse: https://cran.r-project.org/web/packages/optparse/index.html (styled after a similar Python package.)

argparse: https://cran.r-project.org/web/packages/argparse/index.html

IRTFM

I solved the issue (thanks to @IRTFM) by placing the if-then-install.packages code in a separate Rscript (including the CRAN mirror):

if (!require("ape")) install.packages("ape", repos='http://cran.us.r-project.org')

which I then called using a separate Python subprocess in my Python routine.
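A minimal sketch of that two-step arrangement is below. The file name install_ape.R and the helper names run_checked and install_then_process are hypothetical (my actual names differ). Checking the return code also means a failed R step raises immediately with R's own message, rather than surfacing later as the NumPy "zero-size array" error on empty output.

```python
import subprocess


def run_checked(cmd):
    """Run `cmd` and return its stdout lines; raise if it exits non-zero."""
    out = subprocess.run(
        cmd,
        universal_newlines=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if out.returncode != 0:
        # Include stderr so the underlying R error is visible in Python.
        raise RuntimeError("{} failed:\n{}".format(" ".join(cmd), out.stderr))
    return out.stdout.splitlines()


def install_then_process(input_file,
                         install_script="install_ape.R",  # hypothetical name
                         main_script="Rcode.R"):
    """First run the install-only script, then the main processing script."""
    run_checked(["Rscript", install_script])
    return run_checked(["Rscript", main_script, input_file])
```

With this, lines = install_then_process('file_to_process.txt') would replace the single subprocess.run() call from the question.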

user3329732