I am having trouble using the g2Test function of the Rfast package in R: it produces a segmentation fault even though the input parameters seem correct to me.
More specifically, I am able to run the example code from the manual page:
nvalues <- 3
nvars <- 10
nsamples <- 5000
data <- matrix( sample( 0:(nvalues - 1), nvars * nsamples, replace = TRUE ), nsamples, nvars )
dc <- rep(nvalues, nvars)
res<-g2Test( data, 1, 2, 3, c(3, 3, 3) )
However, I am not able to make it run on my own data. The function g2Test takes as input a numeric matrix, the indices of the two columns to test and of the column(s) to condition on (in the example, we study the dependence between the first and the second column conditioned on the third), and a vector with the number of unique values per column.
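To make these inputs concrete, here is a minimal base R sketch of the properties that the inputs in the manual example have. The helper `check_g2_inputs` is hypothetical (it is not part of Rfast), and whether g2Test actually requires these properties is an assumption on my part, read off the example: the values are coded from 0 (`sample(0:(nvalues - 1), ...)`) and the counts vector `dc` has one entry per column of the matrix (`rep(nvalues, nvars)`).

```r
# Hypothetical helper, not part of Rfast: reports whether a matrix and a
# counts vector follow the pattern of the manual example.
check_g2_inputs <- function(data, dc) {
  stopifnot(is.matrix(data), is.numeric(data))
  col_min <- apply(data, 2, min)
  col_max <- apply(data, 2, max)
  same_length <- length(dc) == ncol(data)
  list(
    dc_per_column = same_length,                        # one count per column, like rep(nvalues, nvars)
    zero_based    = all(col_min == 0),                  # codes start at 0, like sample(0:(nvalues - 1), ...)
    within_counts = same_length && all(col_max < dc)    # largest code stays below the declared count
  )
}

# Toy data built the same way as in the manual example.
set.seed(1)
nvalues <- 3
nvars <- 10
nsamples <- 5000
data <- matrix(sample(0:(nvalues - 1), nvars * nsamples, replace = TRUE), nsamples, nvars)
dc <- rep(nvalues, nvars)
ok <- check_g2_inputs(data, dc)

# For contrast: the same data recoded to start at 1, with a shorter counts vector.
shifted <- check_g2_inputs(data + 1, c(3, 3, 3))

str(ok)
str(shifted)
```

Running the same helper on a real matrix and counts vector before calling g2Test would show whether they differ from the example in any of these three respects.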
My code follows the same principle, reading the data from the ALARM CSV file:
library(readr)
library(Rfast)
# open the file
path <- "datasets/alarm.csv"
dataset <- read.csv(path)
# find the indices of the columns I'm interested in and the number of unique values in each of them
c1 <- "PVS"
c2 <- "ACO2"
s <- c("VALV", "VLNG", "VTUB", "VMCH")
n <- colnames(dataset)
col_c1 <- match(c1, n)
col_c2 <- match(c2, n)
cols_c3 <- c()
uni <- c(length(unique(dataset[[c1]])), length(unique(dataset[[c2]])))
if (s[1] != "()") {
  for (v in s) {
    idx <- match(v, n)
    cols_c3 <- append(cols_c3, idx)
    uni <- append(uni, length(unique(dataset[[v]])))
  }
}
# convert the data frame of strings into an integer matrix
for (nn in n) {
  dataset[nn] <- unclass(as.factor(dataset[nn][[1]]))
}
ds <- as.matrix(dataset)
colnames(ds) <- NULL
# running the G2 test
res <- g2Test(ds, col_c1, col_c2, cols_c3, uni)
But it results in a segmentation fault:
*** caught segfault ***
address 0x1f103f96a, cause 'memory not mapped'
Traceback:
1: g2Test(ds, col_c1, col_c2, cols_c3, uni)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
The same happens if I condition on just one variable instead of several.
I really don't understand why this happens, since my case seems to be the same as the example in the reference manual, just with different data. I would really appreciate any help with debugging this issue. Please tell me if I need to provide any further information.