Distributing tasks, functions, or computations across multiple machines/nodes can be done with several frameworks in Python. The most common and widely used are Ray, Dask, and PySpark, and which one to pick really depends on the use case.
For simple function/task distribution, you can use Ray: decorate the function with @ray.remote to distribute it and then call ray.get to collect the results back. The same can be done with Dask as well (see the sketch after the link below).
https://rise.cs.berkeley.edu/blog/modern-parallel-and-distributed-python-a-quick-tutorial-on-ray/
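A minimal sketch of that pattern (the square function and the input values are just placeholders for illustration):

import ray
from dask import delayed, compute

ray.init()  # connect to (or start a local) Ray cluster

@ray.remote
def square(x):
    # executes as a remote task on any available Ray worker
    return x * x

# launch the tasks in parallel; ray.get blocks and collects the results
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]

# the Dask equivalent: wrap the function with dask.delayed, then call compute()
lazy = [delayed(lambda x: x * x)(i) for i in range(4)]
print(compute(*lazy))  # (0, 1, 4, 9)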
I would prefer Spark/PySpark when you are dealing with a large dataset and want to perform some kind of ETL: the huge dataset is partitioned across multiple nodes and the transformations or operations run on those partitions. Note that the Spark/MapReduce model assumes you bring the computation to the data; it executes the same/similar task on different subsets of the data and finally performs some aggregation (which involves shuffling).
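A rough PySpark sketch of that kind of ETL flow (the file path and the column names are made up for illustration):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-example").getOrCreate()

# Spark splits the input into partitions that live on different executors
df = spark.read.csv("hdfs:///data/events.csv", header=True, inferSchema=True)

# transformations run where the data lives; the aggregation triggers a shuffle
result = (df.filter(F.col("amount") > 0)
            .groupBy("country")
            .agg(F.sum("amount").alias("total_amount")))
result.show()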
Spark/PySpark supports ensembles through its built-in random forest and gradient-boosted tree algorithms. But training separate models (random forest, gradient-boosted trees, logistic regression, etc.) on separate nodes/executors is currently not supported in Spark out of the box. It might be possible with customized Spark code, similar to what Spark already does internally for random forests (training multiple decision trees).
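For completeness, a minimal sketch of Spark's built-in distributed random forest (assuming train_df is a Spark DataFrame that already has the usual "features" vector and "label" columns):

from pyspark.ml.classification import RandomForestClassifier

# train_df is assumed to be a prepared Spark DataFrame with "features" and "label" columns
rf = RandomForestClassifier(featuresCol="features", labelCol="label", numTrees=100)
model = rf.fit(train_df)  # tree building is distributed across the executors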
The real-life ensembling scenario can be done easily with Dask and scikit-learn. Dask integrates well with scikit-learn, XGBoost, etc. to run the computation in parallel across the cluster's workers via the joblib context manager.
Now, for the ensemble scenario, you can take different scikit-learn models/algorithms (RandomForest, SGD, SVM, logistic regression) and use the VotingClassifier to combine these sub-estimators into a single model that is (ideally) stronger than any of the individual models alone, which is the basic idea of ensembling.
Using the Dask joblib backend, the individual sub-estimators/models are trained in parallel on different machines/workers in the cluster.
https://docs.dask.org/en/latest/use-cases.html
At a high level, the code looks like this:
import joblib
from dask.distributed import Client
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

client = Client()  # connect to your Dask cluster (defaults to a local cluster)
classifiers = [
    ('sgd', SGDClassifier(max_iter=1000)),
    ('logisticregression', LogisticRegression()),
    ('xgboost', XGBClassifier()),
    ('svc', SVC(gamma='auto')),
]
clf = VotingClassifier(classifiers, n_jobs=-1)

with joblib.parallel_backend("dask"):
    clf.fit(X, y)
** The above can also be achieved with other distributed frameworks like Ray, Spark, etc., but it will need more customized coding.
Hope this information helps you!