
I keep getting the error below when trying to build the vignette for {disk.frame}. I think it's caused by buggy behaviour in how {callr} handles NSE (non-standard evaluation).

Is it possible to not use {callr} with R Markdown? I think {callr} creates a fresh R session in the background, so if I build the document in the same session I should be fine. But I can't find where in the R Markdown documentation I can disable {callr}.

--- re-building 'intro-disk-frame.Rmd' using rmarkdown
Quitting from lines 230-235 (intro-disk-frame.Rmd) 
Error: processing vignette 'intro-disk-frame.Rmd' failed with diagnostics:
no applicable method for 'filter_' applied to an object of class "NULL"
--- failed re-building 'intro-disk-frame.Rmd'

SUMMARY: processing the following file failed:
  'intro-disk-frame.Rmd'

Error : Vignette re-building failed.

Error: <callr_status_error: callr subprocess failed: Vignette re-building failed.>
-->
<callr_remote_error: Vignette re-building failed.>
 in process 17276 

See `.Last.error.trace` for a stack trace.
Warning message:
In df_setup_vignette(excl = c("08-more-epic.Rmd", "06-vs-dask-juliadb.Rmd",  :
  NAs introduced by coercion

Update

Here is code you can use to reproduce the problem:

---
title: "Test"
output: rmarkdown::html_vignette
---

```{r setup, include = FALSE}
remotes::install_github("xiaodaigh/disk.frame", ref="development")
suppressPackageStartupMessages(library(disk.frame))
library(fst)
library(magrittr)
library(nycflights13)
library(dplyr)
library(data.table)

# you need to run this for multi-worker support
# limit to 2 cores if not running interactively; most likely on CRAN
# set-up disk.frame to use multiple workers
if(interactive()) {
  setup_disk.frame()
  # highly recommended; it is put inside interactive() because changing
  # user options is not allowed on CRAN
  options(future.globals.maxSize = Inf)  
} else {
  setup_disk.frame(2)
}


knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r asdiskframe, cache=TRUE}
library(nycflights13)
library(dplyr)
library(disk.frame)
library(data.table)

# convert the flights data to a disk.frame, store it in the folder
# "tmp_flights.df" under tempdir(), and overwrite any existing content
flights.df <- as.disk.frame(
  flights, 
  outdir = file.path(tempdir(), "tmp_flights.df"),
  overwrite = TRUE)

flights.df
```

```{r, dependson='asdiskframe'}
library(disk.frame)
flights.df %>%
  group_by(carrier) %>% # notice that hard_group_by needs to be set
  summarize(count = n(), mean_dep_delay = mean(dep_delay, na.rm=T)) %>%  # mean follows normal R rules
  collect %>% 
  arrange(carrier)
```
r2evans
xiaodai
  • You could a) show the code in `eval=FALSE` mode and then b) show the (precomputed, fixed) output. Sometimes these setups are fragile, and a more defensive approach may be appropriate. – Dirk Eddelbuettel Dec 15 '19 at 04:14
  • It appears that your problem is not that `callr` failed, it's that it returned an error ... it behaved correctly. `callr` is returning an error because your document is failing (see the error about *"no applicable method for 'filter_' ... NULL"*, likely because you did not instantiate data correctly in your document). (BTW: if it's `dplyr::filter_`, it's deprecated.) – r2evans Dec 15 '19 at 04:14
  • r2evans, but it runs when I run it in interactive mode. Anyway, disk.frame loads dplyr, so it shouldn't be missing. – xiaodai Dec 15 '19 at 04:16
  • *"it runs ... in interactive mode"* and not when rendering suggests that you are not defining data in your document that it needs. If you start a fresh workspace (nothing found in `ls(all.names=TRUE)`) and run the document piece by piece, does it still work without error? (It's important to start with an empty environment and run only the code present in the rmarkdown document. It's a common thing to inadvertently use data from your current session in your documents, making them not reproducible.) – r2evans Dec 15 '19 at 04:21
  • I edited my comment about *"no applicable method"*. The problem is that some object is not being defined properly, so you are effectively calling `x <- NULL; filter_(x, ...)`. – r2evans Dec 15 '19 at 04:22
  • @r2evans yes it does run! I have pasted the code. I've tried the obvious thing you suggested already before posting. – xiaodai Dec 15 '19 at 22:47
  • @r2evans the issue is the use of NSE and capturing global variables which callr isn't doing properly – xiaodai Dec 15 '19 at 22:47
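
The check suggested in the comments (run only the document's own code, starting from an empty environment) can be approximated with a sketch like the one below. It is not taken from the thread: `render_in_fresh_session` is a hypothetical helper name, and it assumes the {callr} and {rmarkdown} packages are installed. It knits the file in a new background R process, so nothing from the current workspace is available to the document.

```r
# Hypothetical helper: knit an R Markdown file in a fresh background R
# process so that objects in the current workspace cannot leak into it.
render_in_fresh_session <- function(rmd_file) {
  if (!file.exists(rmd_file)) {
    stop("file not found: ", rmd_file)
  }
  callr::r(
    # This function body runs in the new R process.
    function(path) rmarkdown::render(path),
    args = list(path = normalizePath(rmd_file))
  )
}

# Usage (the path is an assumption about where the vignette is stored):
# render_in_fresh_session(file.path("vignettes", "intro-disk-frame.Rmd"))
```

If the document knits interactively but fails here, the difference lies between the two sessions rather than in a missing object in the document itself.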

1 Answer


Based on the discussion at https://community.rstudio.com/t/error-with-callr-not-doing-nse-the-way-disk-frame-does-causing-issue-with-knitting-in-rmarkdown/47401

One can simply call `rmarkdown::render()` on the file directly, which knits it in the current R session and circumvents the issue.
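
A minimal sketch of that workaround follows. The helper name `render_in_session` is hypothetical, and the path in the usage line is an assumption about where the vignette lives; only `rmarkdown::render()` itself comes from the answer.

```r
# Hypothetical helper: knit one R Markdown file in the current R session.
render_in_session <- function(rmd_file) {
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("the rmarkdown package is required")
  }
  if (!file.exists(rmd_file)) {
    stop("file not found: ", rmd_file)
  }
  # envir = globalenv() evaluates the chunks in the global environment of
  # this session; this call does not start a background R process.
  rmarkdown::render(input = rmd_file, envir = globalenv())
}

# Usage (the path is an assumption about where the vignette is stored):
# render_in_session(file.path("vignettes", "intro-disk-frame.Rmd"))
```

The output format is left to the document's YAML header (`rmarkdown::html_vignette` in the example above), so no `output_format` argument is passed.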

xiaodai