
I've built a package for interacting with HDFql from R. It relies on the R wrapper and DLLs/SOs provided by HDFql 2.1.0. The package works perfectly on Windows using the DLLs, but for some reason the HDFql shared objects fail to load in a Linux environment. I've tried this both on Travis and in a local Docker Linux/R container.

The relevant code contained in the function hql_load() is below. Assume that HDFql is extracted to a folder in the current directory "/hdfql-2.1.0", which means that

dllpath = c("/hdfql-2.1.0/lib/libHDFql.so", "/hdfql-2.1.0/wrapper/R/libHDFqlR.so")

I check that these paths exist using normalizePath(dllpath, mustWork = TRUE), and I also check that the objects load successfully and appear in getLoadedDLLs().

# ... starting at line 153 of connect.r ... #
  wrapper.file = tempfile(fileext = ".r")
  wrapper.lines = readLines(wrapperpath)
  writeLines(wrapper.lines[-grep("dyn\\.load", wrapper.lines)],
    wrapper.file)
  # load DLLs
  for (dll in dllpath) {
    dyn.load(dll, local = FALSE, now = TRUE)
    if (!dll %in% sapply(getLoadedDLLs(), function(x) normalizePath(x[["path"]], mustWork = FALSE))) {
      stop("Error loading HDFql shared library object ", dll)
    } 
  }
  # load wrapper
  wrapper = new.env(parent = .BaseNamespaceEnv)
  tryCatch(
    sys.source(wrapper.file, envir = wrapper, toplevel.env = packageName()),
    error = function(e) {
      stop("Failed to execute HDFql R wrapper.\n Additional Information:\n",
       e)
    }
  )
  assign("wrapper", wrapper, envir = hql)
  invisible(NULL)
}

The error occurs in the sys.source call to evaluate the code in the wrapper file provided by HDFql, and specifically in the initialization call. The wrapper contents are below; note that in my function above I remove the dyn.load calls from the wrapper before evaluating it (the libraries are loaded beforehand).

hdfql_operating_system = Sys.info()["sysname"]
if (hdfql_operating_system == "Windows")
{
    dyn.load("HDFqlR.dll")
    hdfql_shared_library <- "HDFqlR"
} else if (hdfql_operating_system == "Linux")
{
    dyn.load("libHDFqlR.so")
    hdfql_shared_library <- "libHDFqlR"
} else   # macOS
{
    dyn.load("libHDFqlR.dylib")
    hdfql_shared_library <- "libHDFqlR.dylib"
}
rm(hdfql_operating_system)



#===========================================================
# INITIALIZE HDFQL R WRAPPER SHARED LIBRARY
#===========================================================
hdfql_initialize_status = .Call("_hdfql_initialize", PACKAGE = hdfql_shared_library)

Error: Failed to execute HDFql R wrapper.

Additional Information:

Error in eval(parse(wrapper.file), envir = wrapper): Could not find/load HDFql shared library 'libHDFql.so'!

I have been troubleshooting this for weeks with little progress. Can anyone tell me why the library is not loading correctly in Linux systems?

mikeck
  • What is the error message? Have you added the location of the SO files to `LD_LIBRARY_PATH`? – Ralf Stubner Jul 30 '19 at 21:48
  • @RalfStubner I updated my post with the error message, which is quite vague. Can you explain the use of `LD_LIBRARY_PATH`? The HDFql wrapper instructions do not discuss this setting, and my understanding was that `LD_LIBRARY_PATH` is for DLLs that are included or built by the package. The HDFql library is not installed with the package; it is an external program. The package uses `dyn.load` with the full path to access the shared library. – mikeck Jul 30 '19 at 21:58
  • @RalfStubner actually I see now that `LD_LIBRARY_PATH` is discussed in the HDFql manual. I will review and see if I can resolve the issue. – mikeck Jul 30 '19 at 22:54

1 Answer


The suggestion by @RalfStubner was correct: the fix was to set LD_LIBRARY_PATH on the Linux machine so that it includes the HDFql library directories. Interestingly, I still had to provide the fully qualified path names to the shared objects in the dyn.load call, and I could not use Sys.setenv or the DLLpath argument to dyn.load to temporarily modify LD_LIBRARY_PATH. Instead, I had to add a line to my travis.yaml/Dockerfile to update LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=${HDFQL_DIR}/lib:${HDFQL_DIR}/wrapper/R:$LD_LIBRARY_PATH
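
A slightly more defensive variant of that line, as a sketch only. The default value of HDFQL_DIR below is an assumption based on the folder name used in the question; substitute wherever HDFql was actually extracted. The `${VAR:+...}` expansion avoids leaving a trailing ":" when LD_LIBRARY_PATH starts out unset, since the dynamic loader treats an empty entry as the current directory.

```shell
#!/bin/sh
# Assumed install location (hypothetical default); override by exporting
# HDFQL_DIR before running this snippet.
HDFQL_DIR="${HDFQL_DIR:-/hdfql-2.1.0}"

# Prepend the HDFql library and R wrapper directories. The old value and its
# ":" separator are appended only when LD_LIBRARY_PATH is already non-empty.
LD_LIBRARY_PATH="${HDFQL_DIR}/lib:${HDFQL_DIR}/wrapper/R${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# Show the resulting search path for inspection.
echo "$LD_LIBRARY_PATH"
```

This has to run in the shell that later starts R (or be expressed as an ENV instruction in the Dockerfile) so that the R process inherits the value.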

I still don't understand why I could not resolve the issue using e.g.

Sys.setenv(LD_LIBRARY_PATH = paste(dirname(dll), Sys.getenv("LD_LIBRARY_PATH"), sep = ":"))

or

dyn.load(basename(dll), DLLpath = dirname(dll))

since both of these commands should (in theory) modify the search path to include the HDFql directories. I would appreciate any answers that could explain this.
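
One possible workaround for the "temporary" case, offered as a sketch rather than a confirmed explanation: it assumes the Linux dynamic loader reads LD_LIBRARY_PATH only when a process starts, which would account for Sys.setenv having no effect inside an R session that is already running. Under that assumption, the variable can be set for a single child process at launch instead of globally. The helper name run_with_hdfql and the default HDFQL_DIR are hypothetical.

```shell
#!/bin/sh
# Assumed install location (hypothetical default).
HDFQL_DIR="${HDFQL_DIR:-/hdfql-2.1.0}"

# run_with_hdfql CMD [ARGS...]
# Runs CMD with the HDFql directories prepended to LD_LIBRARY_PATH for that
# child process only; the calling shell's environment is left untouched.
run_with_hdfql() {
    env LD_LIBRARY_PATH="${HDFQL_DIR}/lib:${HDFQL_DIR}/wrapper/R${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}

# Stand-in child process that just reports what it inherited. In practice the
# command would be the R invocation, e.g. run_with_hdfql Rscript my_script.R
# (my_script.R is a placeholder name).
run_with_hdfql sh -c 'echo "child sees: $LD_LIBRARY_PATH"'
```

The point of the wrapper function is that the modified search path exists before the child process starts, which is the situation the export line in the Travis/Docker configuration also creates.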

I was also surprised to find that once I updated LD_LIBRARY_PATH I was hit with environment binding and namespace issues that never came up on the Windows build, but that's a separate issue.

mikeck