
I have a Docker container on AWS that runs an R script "successfully" (i.e. the script runs to completion), but it then returns the following error:

Error: ignoring SIGPIPE signal  
Execution halted

Here is the R script (curated after I narrowed down where the issue occurred):

library(arrow)
library(naniar)
library(tidyr)
library(dplyr)

df <- arrow::read_parquet("s3://mybucket/myfile.parquet")

start_time <- Sys.time()

print("Calculate missing-shadow")
df %>%
  # slice(1:1000) %>%
  # Use naniar to bind a shadow matrix (i.e. a T/F matrix for whether a variable is NA or not)
  bind_shadow() %>%
  # Keep only ID and the variables that are shadows
  select(bet_cus_id,ends_with("_NA")) %>%
  # Gather so we can combine into one giant string
  gather(var,isna,-bet_cus_id) %>%
  # Change to TRUE/FALSE, because that is easier to read than !NA and NA
  mutate(isna = isna == "NA") %>%
  unite("Merged",var:isna,remove=TRUE) %>%
  # Sort so every string has the same ordering
  group_by(bet_cus_id) %>%
  arrange(Merged, .by_group = TRUE) %>%
  {.} -> df.player

print("Finished missing-shadow")
print(df.player %>% slice(1:3))
print(length(unique(df.player$bet_cus_id)))
print(length(df.player$bet_cus_id))

Sys.time()-start_time

And the Dockerfile:

FROM rstudio/r-base:4.0.4-focal

RUN apt-get update

RUN apt-get install -y --no-install-recommends git cmake

# arrow
RUN apt-get install -y libcurl4-openssl-dev
RUN apt-get install -y libssl-dev

ENV ARROW_S3=ON

RUN apt-get update

# h2o
RUN apt-get install -y default-jdk
RUN R CMD javareconf

ENV RENV_VERSION 0.13.2
RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN R -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"

RUN Rscript -e "install.packages('devtools', repos='https://packagemanager.rstudio.com/all/__linux__/focal/latest')"
RUN Rscript -e "devtools::install_version('h2o', version = '3.30.0.1', repos = 'https://packagemanager.rstudio.com/all/__linux__/focal/latest')"

WORKDIR /project
COPY renv.lock renv.lock
RUN R -e 'renv::restore(repos=c("https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'

RUN rm -rf /tmp/* \
  && apt-get remove --purge -y $BUILDDEPS \
  && apt-get autoremove -y

EXPOSE 3840
  
COPY test.R test.R

CMD ["Rscript","test.R"]

And here are the last lines of the log (from `journalctl -u docker.service`):

Jul 06 08:59:53 ip-XXX.eu-north-1.compute.internal 131e033f5fe2[4359]: Time difference of 3.678913 mins
Jul 06 08:59:53 ip-XXX.eu-north-1.compute.internal 131e033f5fe2[4359]: Error: ignoring SIGPIPE signal
Jul 06 08:59:53 ip-XXX.eu-north-1.compute.internal 131e033f5fe2[4359]: Execution halted
Jul 06 08:59:53 ip-XXX.eu-north-1.compute.internal dockerd[4359]: time="2021-07-06T08:59:53.867394818Z" level=info msg="ignoring event" container=131e033f5fe2cf5f0854ce97115544889a59f97a959509bdc7fef24d8ba08cd4 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

I cannot spot anything useful in the log above. However, this seems to be related to memory: the SIGPIPE / "Execution halted" message does not occur if I slice the data frame down to, for example, 1000 rows (the input data frame is rather big: ~45M rows / 3 columns). It might be that logging cannot keep up before the script ends because of the size of the data frame, hence the SIGPIPE. That is just a guess, and I do not know how to solve this anyway.
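One way to probe the memory hypothesis would be to log object sizes and R heap usage around the heavy steps. The following is only a minimal sketch in base R: `report_memory` is a hypothetical helper that is not part of the original script, and the toy data frame stands in for the real input.

```r
# Hypothetical helper (not in the original script): print the size of an
# object and the memory currently in use by the R session.
report_memory <- function(label, obj) {
  size_mb <- as.numeric(object.size(obj)) / 1024^2
  mem <- gc()               # one row each for Ncells and Vcells
  used_mb <- sum(mem[, 2])  # second column is the "(Mb)" currently in use
  cat(sprintf("[%s] object: %.1f MB, R heap in use: %.1f MB\n",
              label, size_mb, used_mb))
  invisible(size_mb)
}

# Toy stand-in for the real data frame (assumption: three columns, as in the question)
toy <- data.frame(bet_cus_id = 1:1000, a = NA_real_, b = 1)
report_memory("toy input", toy)
```

In the script above, this could be called as `report_memory("input", df)` right after `read_parquet()` and as `report_memory("shadow", df.player)` after the pipeline, to see in the container log how large the intermediate result gets.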

Update 06/07: I should have mentioned that the above R script runs fine on its own (that is, outside a Docker container).

Any help would be much appreciated.

Greg
  • I googled the error and found https://stackoverflow.com/q/28915838/3358272, possibly related? – r2evans Jul 06 '21 at 11:59
  • Thanks for checking @r2evans. I don't think the post you mentioned is directly related. I should have mentioned that the R script runs fine on its own (that is, outside a Docker container). So this is somehow Docker-related. – Greg Jul 06 '21 at 13:54
  • I think you missed the point: that error is from R, not from docker. The only thing that might be doing it is `read_parquet`; I suggest you reduce your code to stop immediately after `df <- arrow::read_parquet(.)` and see if the error persists; if so, then it might be worth a new issue to `arrow` in case its file-descriptors are not being properly closed (in a timely manner). The fact that it is happening in docker only is something I cannot explain. – r2evans Jul 06 '21 at 16:05
  • I do not get any error message if I reduce the code to only reading the parquet file. I have to get down to the `arrange()` function to get the execution halt message. This function is pretty demanding in terms of memory usage but my environment has plenty of it. – Greg Jul 06 '21 at 18:42
  • I found a workaround based on the link provided by @r2evans. Instead of submitting the R script directly, I now run it through another R script, which uses the `system()` function to ignore any stderr message: `system("Rscript auto01_macro_predict.R", intern = FALSE, ignore.stderr = TRUE)`. That neither explains where exactly the SIGPIPE error comes from nor fixes it, but I can live with that for now! Thanks for your help @r2evans – Greg Jul 07 '21 at 11:49
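The workaround in the last comment could be sketched roughly as below. This is an illustration under assumptions, not the asker's actual wrapper: `run_quietly` is a hypothetical name, and the path to `Rscript` is taken from `R.home("bin")` rather than from `PATH`. Because `ignore.stderr = TRUE` also hides genuine errors, the sketch reports the exit status instead of discarding it.

```r
# Hypothetical wrapper around the system() call quoted in the comment.
run_quietly <- function(script) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # ignore.stderr = TRUE discards everything the child writes to stderr,
  # so the exit status is the only remaining sign of a failure.
  status <- system(paste(shQuote(rscript), shQuote(script)),
                   intern = FALSE, ignore.stderr = TRUE)
  if (status != 0) {
    cat(sprintf("%s exited with status %d\n", script, status))
  }
  invisible(status)
}

# Demo with a throwaway script; in the setup described in the comment the
# argument would be "auto01_macro_predict.R".
demo_script <- tempfile(fileext = ".R")
writeLines('cat("done\\n")', demo_script)
run_quietly(demo_script)
```

Note that the child script would presumably still end with a non-zero status when the SIGPIPE error occurs; the wrapper only keeps the message out of the log.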

0 Answers