I mean, you could always just sneak the data-making process in using merf
:P. The majority of the data generation below is taken from the manifoldai merf example:
from merf.utils import MERFDataGenerator
import numpy as np
from mlens.ensemble import SuperLearner
from sklearn.svm import SVR
from sklearn.linear_model import Lasso
from mlens.metrics.metrics import rmse
# generator parameters taken from the manifoldai example
dgm = MERFDataGenerator(m=0.6, sigma_b=np.sqrt(4.5), sigma_e=1)
num_clusters_each_size = 20
train_sizes = [1, 3, 5, 7, 9]
known_sizes = [9, 27, 45, 63, 81]
new_sizes = [10, 30, 50, 70, 90]
train_cluster_sizes = MERFDataGenerator.create_cluster_sizes_array(train_sizes, num_clusters_each_size)
known_cluster_sizes = MERFDataGenerator.create_cluster_sizes_array(known_sizes, num_clusters_each_size)
new_cluster_sizes = MERFDataGenerator.create_cluster_sizes_array(new_sizes, num_clusters_each_size)
# split into training data, test data on known clusters, and test data on brand-new clusters;
# ptev / prev are variance diagnostics reported by the generator
train, test_known, test_new, training_cluster_ids, ptev, prev = dgm.generate_split_samples(train_cluster_sizes, known_cluster_sizes, new_cluster_sizes)
X_train = train[['X_0', 'X_1', 'X_2']]  # fixed-effect features
Z_train = train[['Z']]                  # random-effect design column
clusters_train = train['cluster']       # cluster ids
y_train = train['y']                    # target
Then make the fit and prediction, with some modification of Flennerhag's mlens.ensemble superlearner.py (GitHub):
ensemble = SuperLearner()
ensemble.add([SVR(), Lasso()])   # base learners
ensemble.add_meta(SVR())         # meta learner stacked on top
pred = ensemble.fit(X_train, y_train).predict(X_train)
root = rmse(y_train, pred)       # note: RMSE on the training data itself, i.e. in-sample
print(root)
>>>
2.345318341087564
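Since that RMSE is computed on the training data, it is an in-sample number. The same generate_split_samples call already handed you held-out clusters, so an out-of-sample check costs two lines; a minimal sketch, assuming test_known carries the same columns as train (which is how the generator lays them out):
# out-of-sample check on the held-out known-cluster test set
X_known = test_known[['X_0', 'X_1', 'X_2']]
y_known = test_known['y']
pred_known = ensemble.predict(X_known)
print(rmse(y_known, pred_known))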
But of course, there is always a better overall method if you don't mind specifically tying merf and the ensemble together.
from keras.models import Sequential
from keras.layers import Dense
from keras import backend
import matplotlib.pyplot as plt

# custom RMSE metric (Keras only ships MSE as a loss by default)
def rmse(y_true, y_pred):
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))

X = X_train.to_numpy().flatten()  # flatten the three merf features into one 1-D array
model = Sequential()
model.add(Dense(2, input_dim=1, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam', metrics=[rmse])
# fit X against itself (an identity mapping), one full batch per epoch
history = model.fit(X, X, epochs=500, batch_size=len(X), verbose=2)
plt.plot(history.history['rmse'])
plt.title("keras rmse metric")
plt.show()
>>>
(plot of the rmse metric over the 500 epochs)
Do note that the X_train used here is from the previous merf code:
X_train = train[['X_0', 'X_1', 'X_2']]
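One caveat: the keras snippet above only learns an identity map on the flattened features. If what you actually want is the network predicting y from the merf features (my reading of using them "together", not something the snippet above does), a minimal sketch would be:
# assumption: predict y_train from the three merf features instead of reconstructing X
X3 = X_train.to_numpy()  # shape (n_samples, 3), features kept separate
model2 = Sequential()
model2.add(Dense(8, input_dim=3, activation='relu'))
model2.add(Dense(1))
model2.compile(loss='mse', optimizer='adam', metrics=[rmse])
model2.fit(X3, y_train.to_numpy(), epochs=500, batch_size=len(X3), verbose=0)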