
I'm trying to manipulate a large data.table (~37 MB), but in a special way: for other (unrelated) reasons I have implemented a hook-like structure, so the overall process is:

1) load the data.table from disk

2) fire a certain hook

3) the hook structure looks up this name and checks whether the user (= me :)) has bound a function to this hook; if so, that function is called

4) the data is processed further

The functions look like this:

data = readRDS(pathToFile)
data = data.table(data)
fireHook("After_data_read", data, [some other parameters])
some_more_processing(data)

and the region around fireHook looks like

.global.hooksRegistered = list(
  # each hook receives the hook name first, then the arguments passed to fireHook
  "After_data_read" = function(hookName, data, ...) {
    # do some stuff
  }
)

fireHook = function(hookName, ...) {
  for (hookNameRegistered in names(.global.hooksRegistered)) {
    if (hookName == hookNameRegistered) {
      func = .global.hooksRegistered[[hookName]]
      func(hookName, ...)
    }
  }
}

Note that an object that already is a data.table has to be converted into a data.table again after loading (otherwise pass-by-reference does not work); see "Adding new columns to a data.table by-reference within a function not always working" and "Pass by reference bug?".
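For illustration, here is a minimal sketch of the load-then-modify-by-reference pattern. It swaps in setDT() where the code above uses data.table(data); as far as I know, setDT() re-allocates the table in place, whereas data.table() makes a copy. The temporary file, the column names, and the add_column helper are made up for the example.

```r
library(data.table)

# Hypothetical round trip standing in for readRDS(pathToFile) above.
path <- tempfile(fileext = ".rds")
saveRDS(data.table(x = 1:3), path)
data <- readRDS(path)

# setDT() works in place and restores the internal over-allocation
# that is lost when a data.table is serialized to disk.
setDT(data)

# Illustrative hook body: adds a column by reference with `:=`.
add_column <- function(d) {
  d[, y := x * 2L]
  invisible(NULL)
}

add_column(data)
names(data)  # the new column should now be visible in the caller
```

Whether this removes the need for the extra data.table(data) call in a given setup is an assumption worth checking against the two linked questions.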


Problem: the line func(hookName, ...) takes forever (more than 5 minutes).
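To narrow this down, one could time the dispatch on its own for growing inputs. The following is only a sketch: hooks and fire_hook are simplified stand-ins for the structures above, the hook body is a trivial placeholder, and plain data.frames are used so that it runs without extra packages.

```r
# Simplified, hypothetical stand-ins for the hook registry and dispatcher.
hooks <- list(
  After_data_read = function(hookName, data, ...) nrow(data)
)

fire_hook <- function(hookName, ...) {
  func <- hooks[[hookName]]
  if (is.function(func)) func(hookName, ...)
}

# Time only the dispatch for increasing table sizes.
sizes <- c(1e3, 1e5, 1e6)
timings <- sapply(sizes, function(n) {
  d <- data.frame(x = runif(n), y = runif(n))
  system.time(fire_hook("After_data_read", d))[["elapsed"]]
})

print(data.frame(rows = sizes, seconds = timings))
```

If the elapsed time stays flat here but not in the real session, that would point to the environment (for example the debugger) rather than to the forwarding of ... itself.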


The debugger never actually steps into the function (so it is not the code inside the function that is slow), and with small data.tables everything works. I also noticed that the following seems to work:

fireHook = function(hookName, ...) {
  args = list(...)
  for (hookNameRegistered in names(.global.hooksRegistered)) {
    if (hookName == hookNameRegistered) {
      func = .global.hooksRegistered[[hookName]]
      func(hookName, args)
    }
  }
}

(Note that I replaced ... with list(...).) To me it looks as if R is trying to copy the whole table when ... is used. Is this correct/intended? Or am I using it wrong?
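One way to test the copy hypothesis is to compare the memory address of the table outside and inside the hook. This is a rough sketch that assumes data.table's address() helper; hooks and fire_hook are again simplified stand-ins, not the real implementation.

```r
library(data.table)

# Hypothetical hook that only reports where its `data` argument lives.
hooks <- list(
  After_data_read = function(hookName, data, ...) address(data)
)

fire_hook <- function(hookName, ...) {
  func <- hooks[[hookName]]
  if (is.function(func)) func(hookName, ...)
}

dt <- data.table(x = runif(1e6))

addr_outside <- address(dt)
addr_inside  <- fire_hook("After_data_read", dt)

# TRUE means the hook saw the very same object, i.e. no copy was made
# while forwarding the table through `...`.
identical(addr_outside, addr_inside)
```

If the addresses match, the slowdown is probably not caused by a copy of the table.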

regards,

FW

Fabian Werner
  • What version of R are you using? I have a foggy recollection that this may have changed in more recent versions. – ctbrown May 12 '15 at 10:37
  • `version` gives me the version string `R version 3.1.1 (2014-07-10)`... so it's not the newest, but also not too old, I guess... – Fabian Werner May 13 '15 at 11:54
