
I am trying to run a QCA analysis of some crisp and some fuzzy sets. I have 50 variables (causes) that I am interested in using to produce a truth table. In doing this work, I've run into different errors when using superSubset() and minimize(). During troubleshooting I found that if I removed some of my variables, sometimes the errors would go away. This is unrelated to which variables I remove, however, so I'm thinking that (some of) the issue might lie in the number of causes and their 2^n possible configurations.

I've updated QCA and R to their most current versions.

The specific errors I've seen and actions I've taken are:

  1. When running superSubset(data, outcome = "variable", incl.cut = 0.7) on some of the 50 variables, I got the following error: "INTEGER() can only be applied to a 'integer', not a 'double'". I changed incl.cut to .6 and then the operation ran with no error. I added all variables to the data and ran superSubset(data, outcome = "variable", incl.cut = 0.6) but saw the error again. If I remove many variables from data (doesn't seem to matter which), I see no error.

  2. When running minimize(tt) with all variables included in the truth table, I got the warning "NAs introduced by coercion to integer range" followed by the error: "Conditions '0,1,2,3,4,5,6,7,8,9,X' do not match the set names from "snames" argument". If I remove many variables from the data (it doesn't seem to matter which), I see no error. This error has been reported by others, but, since I'm up to date on my packages and the QCA binaries appear current, it doesn't seem to have the same cause.

Do these errors seem related to you? Is there some reason a high number of causes would matter? Any suggestions on how to resolve the errors or troubleshooting approaches?

ETA: Below is some example data and code that reproduces the errors I'm talking about:

library(QCA)
library(magrittr) # provides %>%

# 114 cases, 52 binary conditions
big_sample_data <- sample(c(0, 1), 52 * 114, replace = TRUE)
big_example_df <- matrix(big_sample_data, nrow = 114, ncol = 52) %>%
  as.data.frame()

#yields INTEGER() error
superSubset(big_example_df, outcome="V1", incl.cut = .9)

#yields Conditions error
tt <- truthTable(big_example_df, outcome = "V1", complete = TRUE, show.cases = TRUE, sort.by = "incl")
minimize(tt)

But, if I reduce the size of the dataset, the errors don't occur:

little_sample_data <- sample(c(0,1), 50*10, replace = TRUE)
little_example_df <- matrix(little_sample_data, nrow = 50, ncol = 10) %>%
  as.data.frame() 

#works
superSubset(little_example_df, outcome="V1", incl.cut = .9)

#works
tt <- truthTable(little_example_df, outcome = "V1", complete = TRUE, show.cases = TRUE, sort.by = "incl")
minimize(tt)
jlcohoon
  • I am sorry but you will not get useful answers to this question without editing it. You should attach a sample of your data and a code that reproduces the error so that others can replicate it and see what is going wrong. For more details, see here https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Otto Kässi Jul 08 '20 at 07:17
  • thanks @OttoKässi! I've added some sample code/data – jlcohoon Jul 09 '20 at 17:52

2 Answers


If I have understood ?superSubset correctly, the algorithm has to go through up to K! combinations of columns, where K is the number of columns of its input matrix. Since 51! = 1.551119e+66, I am not surprised that your problem is too big for superSubset to handle.
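
As a rough illustration of that growth (taking the K! figure above at face value), you can check the counts directly in R; even the 2^n truth-table rows alone become unenumerable long before 51 conditions:

```r
# Factorial growth: fine for small K, astronomical at K = 51
factorial(10)  # 3628800
factorial(51)  # approx. 1.551119e+66

# Truth-table rows grow as 2^n with n conditions
2^10           # 1024
2^51           # approx. 2.25e+15
```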

As for minimize(tt): I happened to have QCA version 3.0 installed on my computer, and with that version minimize(tt) does not return an error.

On the other hand, with QCA 3.8.2 I managed to replicate your issue. I have no idea whether the result returned by QCA 3.0 is correct, though. You might need to get in touch with the QCA developers to understand what's going on.

To install QCA version 3.0, you can run:

library(devtools)
install_version('QCA', version='3.0', repos = "http://cran.us.r-project.org")
Otto Kässi

I also posted my question to the QCA Google group and got a response from Adrian Dusa, who wrote the package: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/qcawithr/5DvPLfbLczg/f3kH3R5YAgAJ

Essentially, the answer is yes: there are too many conditions for the package to handle.
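
One plausible mechanism for the "NAs introduced by coercion to integer range" warning (my own reading, not confirmed by the package author): R's integers are 32-bit, so the 2^51 possible configurations of 51 conditions cannot even be represented as an integer index:

```r
.Machine$integer.max         # 2147483647, the largest 32-bit integer in R
2^51                         # approx. 2.25e+15 configurations of 51 conditions
as.integer(2^51)             # NA, with "NAs introduced by coercion to integer range"
2^51 > .Machine$integer.max  # TRUE
```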

jlcohoon