Questions tagged [yellowbrick]

Yellowbrick is a Python visualization library for machine learning. It extends the Scikit-Learn API to provide visual diagnostic tools for classifiers, regressors, clusterers, transformers, pipelines, feature extraction tools and more. This tag should be used to ask questions about how to use visualizers, how to extend or modify visualizations, or how to interpret diagnostics. This tag is commonly used with the scikit-learn and matplotlib tags.

Yellowbrick (sometimes referred to as scikit-yellowbrick) is a Python library that extends the Scikit-Learn API to enhance the machine learning workflow with visual diagnostics built on matplotlib. The yellowbrick tag is therefore usually applied in combination with the scikit-learn, python, and matplotlib tags. Good questions for this tag include:

  1. Questions about how to work with or extend existing visualizers
  2. Questions about how to interpret visual results
  3. Questions about how to modify resulting figures or annotate them
  4. Questions about how to create new visualizers

The best questions will include a code example along with the figure being generated by Yellowbrick. To allow others to run your code, if the visualization is not data specific (i.e. the question is not the result of a specific input), please use one of the example datasets from the Yellowbrick tutorial or one of Scikit-Learn's dataset generation methods. It is also very helpful to include the version of Yellowbrick you're using, which can be found with print(yellowbrick.__version__).
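As a rough sketch of what such a preamble might look like (the sample sizes, seed, and the try/except around the import are illustrative choices, not a required template), a question could open with seeded synthetic data from Scikit-Learn's make_classification and a printout of the installed versions:

```python
import sklearn
from sklearn.datasets import make_classification

# Seeded synthetic data, so anyone answering can regenerate identical arrays.
# The sizes and random_state below are arbitrary illustrative values.
X, y = make_classification(
    n_samples=200,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    n_classes=2,
    random_state=42,
)
print("X shape:", X.shape, "y shape:", y.shape)

# Report the versions that answerers will need.
print("scikit-learn:", sklearn.__version__)
try:
    import yellowbrick
    print("yellowbrick:", yellowbrick.__version__)
except ImportError as exc:
    # For installation questions, the exact import error text is the key detail.
    print("yellowbrick import failed:", exc)
```

Wrapping the import is only there so the snippet still reports something useful when the installation itself is the problem; for other questions a plain import is enough.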

For result interpretation questions, please be as general as possible and focused on the content of the visualizer. For example, a good question is "what is the meaning of the macro-average curve on ROC/AUC with more than two classes?" A poor question is "how do I make my model have a higher F1 score?"

Finally, for both those asking questions and those answering them: Yellowbrick contributors and developers take respectful discourse seriously. In addition to "Be nice", the StackExchange code of conduct, Yellowbrick also follows the Python Software Foundation Code of Conduct.

Definitions

Yellowbrick extends the Scikit-Learn API with a new Estimator called a Visualizer. Visualizers are estimators, in that they can be fit with data in order to produce a visualization. Anything that produces a visualization in Yellowbrick is a Visualizer, though there are different types.

A FeatureVisualizer produces a representation of the feature space or data space. These are used to explore the input to models or the relationship of data to the model.

A ModelVisualizer produces a representation of the model space, describing how the model interacts with data or behaves. It does this in two ways: by describing internal parameters of the model, or by describing the model's relationship to test data with a ScoreVisualizer.

73 questions
9
votes
3 answers

ModuleNotFoundError installing yellowbrick in Python

I am having trouble installing yellowbrick. I am using Anaconda, hence I took advantage of using the "conda install". # set number of clusters kclusters = 5 pittsburgh_grouped_clustering = pittsburgh_grouped.drop('Neighborhood', 1) X =…
Mo Kaiser
9
votes
1 answer

Issues when using subplots with yellowbrick and losing legend and titles

I'm having issues when putting multiple yellowbrick charts into a subplot arrangement. The title and legend only show for the last chart. I've tried multiple ways to write the code but can't get all of them to show the legends and titles. I'm sure…
8
votes
3 answers

YellowBrick ImportError: cannot import name 'safe_indexing' from 'sklearn.utils'

I'm trying to plot a silhouette plot for a K-Means model I've run, however, I get the error: ImportError: cannot import name 'safe_indexing' from 'sklearn.utils'. I was initially getting the NotFoundError issue described in this post here however I…
softmax55
5
votes
2 answers

Pandas dataframe divide features to group of high correlation

I have a dataframe with over 280 features. I ran correlation map to detect groups of features that are highly correlated: Now, I want to divide the features to groups, such that each group will be a "red zone", meaning each group will have features…
Cranjis
4
votes
1 answer

Extracting k from Yellow brick KElbowVisualizer

I am trying to extract the value of k from Yellow brick KElbowVisualizer visualizer for further processing. I can see the k value on the visualization, but I cannot seem to extract it and put in a variable.
Marius
4
votes
0 answers

AttributeError: 'XGBRegressor' object has no attribute 'line_color'

Find below the code I used to create the residuals_plot from yellowbrick package. I ran the xgbregressor model, predicted the results and tried to create the residuals plot. The plot came out properly, but followed with the below error. I didn't use…
Dayakar Malgari
3
votes
2 answers

error 'RandomForestClassifier' object has no attribute 'target_type_'

when I run this piece of code: from yellowbrick.classifier import ROCAUC from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(**{"max_features": 0.4, "n_estimators":15,"min_samples_leaf":…
mab66
3
votes
1 answer

Yellowbrick change legend and add title

I created a graph with yellowbrick RadViz: visualizer = RadViz(classes=labels) visualizer.fit(X, y) visualizer.transform(X) visualizer.show() As you can see, the legend overrides some of the feature names: Moreover, I want to edit the title. I…
Cranjis
3
votes
0 answers

logistic regression residuals plot/distribution

I am trying to evaluate the logistic model with residual plot in Python. I searched on the internet and cannot get the info. It seems that we can calculate the deviance residual from this answer. from sklearn.metrics import log_loss def…
Peter Chen
3
votes
1 answer

Yellowbrick: Increasing font size on Yellowbrick generated charts

Is there a way to increase the font size on Yellowbrick generated charts? I find it difficult to read the text. I wasn't able to find anything on it in the documentation. I'm using Python 3.6, Yellowbrick 0.5 in a Jupyter Notebook.
Wessi
2
votes
1 answer

Attempting to see the Discrimination Threshold Plot for Fitted models

I'm trying to use the DiscriminationThreshold visualizer for my fitted models. They're all binary classifiers (logistic regression, lightgbm, and xgbclassifier); however, based on the documentation I am having a hard time producing the plot on…
2
votes
1 answer

Scikit-learn and Yellowbrick giving different scores

I am using sklearn to compute the average precision and roc_auc of a classifier and yellowbrick to plot the roc_auc and precision-recall curves. The problem is that the packages give different scores in both metrics and I do not know which one is…
2
votes
3 answers

The supplied model is not a clustering estimator in YellowBrick

I am trying to visualize an elbow plot for my data using YellowBrick's KElbowVisualizer and SKLearn's Expectation Maximization algorithm class: GaussianMixture. When I run this, I get the error in the title. (I have also tried ClassificationReport,…
2
votes
1 answer

How to set figure size in yellowbrick plots?

So I've just recently started to discover the power of yellowbrick library (thanks!) but is there a way I could set a figure size for the inline plots?
shiv_90
2
votes
1 answer

yellowbrick visualiser.fit() raises ValueError

I am trying to visualise a dispersion plot for my Twitter data. Here is the link to the dataset. This is the code: from yellowbrick.text import DispersionPlot text = combine['tweet'] target_words = ht_negative_unnest visualizer =…