Questions tagged [fpgrowth]
55 questions
4
votes
2 answers
PySpark :: FP-growth algorithm ( raise ValueError("Params must be either a param map or a list/tuple of param maps, ")
I am a beginner with PySpark. I am using FPGrowth to compute associations in PySpark. I followed the steps below.
Data Example
from pyspark.sql.session import SparkSession
spark = SparkSession.builder.getOrCreate()
# make some test data
columns =…

James Taylor
3
votes
1 answer
Convert StringType Column To ArrayType In PySpark
I have a dataframe with column "EVENT_ID" whose datatype is String.
I am running the FPGrowth algorithm, but it throws the error below:
Py4JJavaError: An error occurred while calling o1711.fit.
:java.lang.IllegalArgumentException: requirement failed:
The…

user3198755
2
votes
1 answer
spark.databricks.queryWatchdog.outputRatioThreshold Error for FPGrowth using Pyspark on Databricks
I'm working on Market Basket Analysis using Pyspark on Databricks.
The transactional dataset consists of a total of 5.4 Million transactions, with approx. 11,000 items.
I'm able to run FPGrowth on the dataset, but whenever I'm trying to either…

Gaurav Kamble
2
votes
1 answer
How to efficiently export association rule generated using pyspark in .CSV or .XLSX file in python
After resolving this issue:
How to limit FPGrowth itemsets to just 2 or 3
I am trying to export the association rule output of FPGrowth from PySpark to a .csv file in Python. After running for almost 8-10 hours, it gives an error.
My machine has…

Shubham Bajaj
2
votes
0 answers
How to get Antecedents/Consequents from FPGrowth Algorithm in Pyspark?
How am I misusing or misreading the FPGrowth algorithm in PySpark? I have output from an Apriori algorithm that I was hoping FPGrowth would match. Provided are my FPGrowth code, my Apriori output, and my FPGrowth output.
from pyspark.mllib.fpm import…

Mark McGown
2
votes
1 answer
fpgrowth error in R
I am trying to fit an fpgrowth model on a built-in data set called Adult. While fitting the model, I got the error shown below.
Error in .jcall(jPruning, "[[Ljava/lang/String;", "fpgrowth", support, :
method fpgrowth with signature…

789372u
1
vote
0 answers
Spark MLlib FPGrowth not working with 40+ items in Frequent Item set
Spark FPGrowth works well with millions of transactions (records) when the number of items in a frequent itemset is less than 25. Beyond 25 it runs into a computational limit (executor computing time keeps growing). For 40+ items in the Frequent…

Spark Guest
1
vote
2 answers
Compare the annual rates between groups
I am struggling to compare 'mortality' rates between two percentages over a time interval. My goal is to get the annual rates per group.
My values are already in percentages (start and end values), representing how much forest has been lost…

maycca
1
vote
0 answers
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent error using rCBA::fpgrowth
I have the following dataset for which I want to generate association rules using FP growth
> head(order_pairs)
# A tibble: 6 x 2
product_A product_B
1 Organic…

code-noob
1
vote
0 answers
How to use the consequent parameter in fpgrowth algorithm in the rCBA package in R?
The items column in the transactions I am passing to the fpgrowth method is of the form
{
Bag of Organic Bananas,
Cornbread Mix, …

kenneth-rebello
1
vote
1 answer
Error calling rCBA::fpgrowth: method fpgrowth with signature (DDI)[[Ljava/lang/String; not found
I wrote the R code below to mine with the FP-Growth algorithm:
fpgabdata <- read.csv('../Agen Biasa.csv', header = FALSE)
train <- sapply(fpgabdata, as.factor)
train <- data.frame(train, check.names = TRUE)
txns <-…

Mr Simple
1
vote
0 answers
FP-Growth cannot processing
I have a problem running the FP-Growth algorithm in RStudio.
This is my first time using R.
I wrote this code:
FpgConf = rCBA::fpgrowth(dataset, support = 0.1, confidence = 0.5, maxLength = 2, consequent = "Species", parallel = FALSE)
and then the system…

Mr Simple
1
vote
1 answer
How to interpret results of Mlxtend's association rule
I am using mlxtend to find association rules:
Here is the code:
df = apriori(dum_data, min_support=0.4, use_colnames=True)
rules = association_rules(df, metric="lift", min_threshold=1)
rules2=rules[ (rules['lift'] >= 1) & (rules['confidence'] >=…
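
For readers interpreting the `lift` and `confidence` columns filtered above, here is a minimal plain-Python sketch of how support, confidence, and lift are conventionally defined for a rule A -> C. The toy transactions and the helper names `support` and `rule_metrics` are hypothetical, chosen for illustration; they are not the asker's `dum_data` and not mlxtend's implementation.

```python
# Hypothetical toy transactions (assumption: not the asker's data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur at least once,
    so neither division is by zero.
    """
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a   # estimate of P(C | A)
    lift = confidence / s_c   # > 1 suggests A and C co-occur more than chance
    return {"support": s_ac, "confidence": confidence, "lift": lift}


metrics = rule_metrics({"bread"}, {"milk"}, transactions)
print(metrics)
```

In this toy data the rule bread -> milk has a lift below 1, so a filter such as `lift >= 1` would drop it even though its confidence is fairly high.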

MAC
1
vote
1 answer
pyspark--FPGrowth: how does transform work on unseen transactions?
I am using pyspark.ml.fpm.FPGrowth in Spark 2.4 and I have a question about how precisely transform works on new transactions.
My understanding is that model.transform will take each transaction X and find all Y such that
Conf(X-->Y) >…
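
As a rough illustration of the behaviour being asked about, here is a plain-Python sketch based on how the Spark documentation describes `transform`: a rule applies when the transaction contains all of its antecedent items, and the consequents of applicable rules that are not already in the transaction are collected as the prediction. The function `predict` and the toy `rules` list are hypothetical, and the sketch assumes the rules have already been filtered by the minimum confidence when the model was fitted. It is not Spark's actual implementation.

```python
def predict(transaction, rules):
    """Collect consequents of every rule whose antecedent is fully contained
    in `transaction`, skipping items the transaction already has."""
    items = set(transaction)
    predicted = []
    for antecedent, consequent in rules:
        if set(antecedent) <= items:  # rule is applicable
            for item in consequent:
                if item not in items and item not in predicted:
                    predicted.append(item)
    return predicted


# Hypothetical rules as (antecedent, consequent) pairs.
rules = [
    (("a",), ("b",)),
    (("a", "b"), ("c",)),
    (("d",), ("e",)),
]

print(predict(["a"], rules))
print(predict(["a", "b"], rules))
print(predict(["x"], rules))
```

Under this reading, a transaction made only of items the model never saw matches no antecedent and gets an empty prediction.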

Nick
1
vote
1 answer
Appending column name to column value using Spark
I have data in a comma-separated file, which I have loaded into a Spark data frame.
The data looks like:
A B C
1 2 3
4 5 6
7 8 9
I want to transform the above data frame in spark using pyspark as:
A B C
A_1 B_2 C_3
A_4 B_5 C_6
…
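
The transformation described above can be sketched in plain Python to make the intended result concrete. The `columns` and `rows` values mirror the sample data in the question; the variable name `prefixed` is hypothetical. This is only a sketch of the logic, not a Spark solution; in PySpark the same idea would presumably be expressed per column with functions such as `concat` and `lit` from `pyspark.sql.functions`.

```python
# Sample data mirroring the question's table.
columns = ["A", "B", "C"]
rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Prefix every value with its column name, joined by an underscore.
prefixed = [
    tuple(f"{name}_{value}" for name, value in zip(columns, row))
    for row in rows
]

for row in prefixed:
    print(" ".join(row))
```

Prefixing values this way keeps identical values from different columns distinct, which matters when each row is later treated as a set of items.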

MAC