Issue using MaximumLikelihoodEstimator.estimate_cpd() from pgmpy when the DataFrame has tuples as column names

Question

When the DataFrame's column names are plain strings, the code works fine. Now I need to use a DataFrame whose column names are tuples with the MaximumLikelihoodEstimator.estimate_cpd() method from pgmpy.estimators, but it raises KeyError: "None of [Index(['Consumption', 0], dtype='object')] are in the [columns]"

I am trying to use MaximumLikelihoodEstimator.estimate_cpd() on a standard Bayesian network to generate the CPDs for a Dynamic Bayesian Network (DBN), since that method is not yet implemented for DBNs. So I want to generate a CPD table that uses the node names DBNs use, which are tuples.

The altered DataFrame categorical_energy_production looks like this example table:

	('Consumption', 0)	('Production', 0)	('Nuclear', 0)	('Oil and Gas', 0)	('Hydroelectric', 0)
DateTime
2019-01-01 00:00:00	high	low	low	high	low
2019-01-01 01:00:00	high	high	low	high	high

Code:

from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator

# Building the network with tuple node names works.
energy_production_model = BayesianNetwork([(('Nuclear', 0), ('Production', 0)),
                                           (('Oil and Gas', 0), ('Production', 0)),
                                           (('Hydroelectric', 0), ('Production', 0))])

# Estimating the CPD for a tuple-named node fails with:
# KeyError: "None of [Index(['Consumption', 0], dtype='object')] are in the [columns]"
cpd_production = MaximumLikelihoodEstimator(energy_production_model, categorical_energy_production).estimate_cpd(('Production', 0))
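Since the same code works when the column names are plain strings, one possible workaround is to flatten the tuple names to strings before fitting and keep a lookup table for translating back afterwards. The sketch below shows only that renaming step in plain Python. The helper names flatten_name and flatten_edges and the "name_slice" string format are hypothetical choices made for this illustration, not part of pgmpy, and this is not a confirmed fix for the KeyError itself.

```python
def flatten_name(node):
    """Turn a DBN-style (name, time_slice) tuple into a plain string.

    Hypothetical helper: the "name_slice" format is an arbitrary choice.
    """
    name, time_slice = node
    return f"{name}_{time_slice}"


def flatten_edges(edges):
    """Apply flatten_name to both endpoints of every edge."""
    return [(flatten_name(u), flatten_name(v)) for u, v in edges]


# Edges and column names taken from the question.
tuple_edges = [
    (("Nuclear", 0), ("Production", 0)),
    (("Oil and Gas", 0), ("Production", 0)),
    (("Hydroelectric", 0), ("Production", 0)),
]
tuple_columns = [
    ("Consumption", 0),
    ("Production", 0),
    ("Nuclear", 0),
    ("Oil and Gas", 0),
    ("Hydroelectric", 0),
]

# String versions to use for the network edges and the DataFrame columns.
string_edges = flatten_edges(tuple_edges)
string_columns = [flatten_name(c) for c in tuple_columns]

# Lookup table for translating the string names back to the DBN tuples.
string_to_tuple = {flatten_name(c): c for c in tuple_columns}
```

With these in hand, the idea would be to assign string_columns to the columns of a copy of categorical_energy_production, build the BayesianNetwork from string_edges, call estimate_cpd("Production_0"), and then use string_to_tuple to recover the tuple names when building the DBN's CPDs.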

