Questions tagged [neuraxle]

Neuraxle is a sklearn-like machine learning pipeline library for Python with improved functionalities for Deep Learning and complex pipelining. The project is open-source and commercially usable (Apache 2.0 license).

23 questions
149 votes, 4 answers

What exactly is sklearn.pipeline.Pipeline?

I can't figure out exactly how sklearn.pipeline.Pipeline works. There are a few explanations in the docs. For example, what do they mean by: Pipeline of transforms with a final estimator. To make my question clearer, what are steps? How do they…
farhawa
116 votes, 3 answers

Will scikit-learn utilize GPU?

Reading implementation of scikit-learn in TensorFlow: http://learningtensorflow.com/lesson6/ and scikit-learn: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html I'm struggling to decide which implementation to…
blue-sky
12 votes, 1 answer

How to combine features with different dimensions output using scikit-learn

I am using scikit-learn with Pipeline and FeatureUnion to extract features from different inputs. Each sample (instance) in my dataset refers to documents with different lengths. My goal is to compute the top tfidf for each document independently,…
Abrial
10 votes, 4 answers

Custom sklearn pipeline transformer giving "pickle.PicklingError"

I am trying to create a custom transformer for a Python sklearn pipeline based on guidance from this tutorial: http://danielhnyk.cz/creating-your-own-estimator-scikit-learn/ Right now my custom class/transformer looks like this: class…
Jed
6 votes, 3 answers

keras + scikit-learn wrapper, appears to hang when GridSearchCV with n_jobs >1

UPDATE: I have to re-write this question as after some investigation I realise that this is a different problem. Context: running keras in a gridsearch setting using the kerasclassifier wrapper with scikit learn. Sys: Ubuntu 16.04, libraries:…
Ziqi
3 votes, 1 answer

How to run 2 pipelines in parallel in scikit-learn or Neuraxle?

I want to create a simple pipeline with neuraxle (I know I can use other libraries but I want to use neuraxle) where I want to clean data, split it, train 2 models and compare them. I want my pipeline to do something like this: p = Pipeline([ …
kAch
2 votes, 1 answer

Using predict_proba() instead of predict() in Neuraxle Pipeline with OneVsRestClassifier

I'm trying to set up a Neuraxle Pipeline that uses sklearn's OneVsRestClassifier (OVR). Every valid step in a Neuraxle pipeline has to implement the fit() and transform() methods. In order to use sklearn's pipeline steps, Neuraxle uses a SKLearnWrapper…
Namnyef
2 votes, 1 answer

AutoML Pipelines: Label extraction from input data and sampling within Neuraxle or SKLearn Pipelines

I am working on a project that is looking for a lean Python AutoML pipeline implementation. As per project definition, data entering the pipeline is in the format of serialised business objects, e.g. (artificial example): property.json: { "area":…
sim
2 votes, 1 answer

Neuraxle simple pipeline trouble (StandardScaler -> LinearSVC)

I can't figure out why this Neuraxle pipeline doesn't work. I just want to scale the data and apply LinearSVC. What am I doing wrong? This is what I am trying to do: import numpy as np from sklearn.ensemble import GradientBoostingRegressor from…
alxkolm
2 votes, 1 answer

Neuraxle's RandomSearch() successor

I updated Neuraxle to the latest version (3.4). I noticed the whole auto_ml.py was redone. I checked the documentation but there is nothing about it. On git it seems the RandomSearch() method was replaced a long time ago by the AutoML() method. However the…
1 vote, 1 answer

Default hyperparameter values in Neuraxle

While implementing pipeline components in Neuraxle, I wonder if it is possible and/or advisable to have default values for hyperparameters. Looking at the code and documentation, my guess is that it is not supported, but I cannot find any mention of it in…
1 vote, 1 answer

Is it possible to combine multiple pipelines into a single estimator in Neuraxle or sklearn to create a multi-output classifier and fit it in one go?

I want to create a multi-output classifier. However, my problem is that the distribution of positive labels varies greatly between outputs, e.g. for output 1 there are 2% positive labels and for output 2 there are 20% positive labels. So, I want to…
1 vote, 1 answer

Error loading neuraxle pipeline with execution context

When I save a pipeline, which has an ExecutionContext associated with it, and try to load it again, I get the error shown below. from neuraxle.base import ExecutionContext, Identity from neuraxle.pipeline import Pipeline PIPELINE_NAME =…
eL_BaRTo
1 vote, 1 answer

Neuraxle Select Columns in Pandas DataFrame

What's the Neuraxle way to select a subset of columns from a dataset? This is how I am doing it via sklearn: class ColumnSelectTransformer(BaseEstimator, TransformerMixin): def __init__(self, columns): self.columns = columns def…
Simon Taylor
1 vote, 1 answer

How best to handle errors and/or missing data in a Neuraxle pipeline?

Let's assume you have a pipeline with steps that can fail for some input elements, for example: FetchSomeImagesFromIds -> Resize -> DoSomethingElse. In this case the 1st step downloads 10 out of 100 images and passes those to Resize. I'm looking…
hexa