Questions tagged [boosting]

Boosting is a machine learning ensemble meta-algorithm in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones. Also: boosting is the process of enhancing the relevancy of a document or field.

From the docs:

"Boosting" is a machine learning ensemble meta-algorithm for primarily reducing bias, and also variance, in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones.

Also:

From the docs:

Boosting is the process of enhancing the relevancy of a document or field. Field-level mapping allows you to define an explicit boost level on a specific field. The boost field mapping (applied on the root object) lets you designate a field whose content controls the boost level of the document.

181 questions
17
votes
3 answers

How to boost a Keras based neural network using AdaBoost?

Assuming I fit the following neural network for a binary classification problem: model = Sequential() model.add(Dense(21, input_dim=19, init='uniform', activation='relu')) model.add(Dense(80, init='uniform', activation='relu')) model.add(Dense(80,…
ishido
  • 4,065
  • 9
  • 32
  • 42
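Boosting a neural network with AdaBoost amounts to the usual AdaBoost reweighting loop with the network as the weak learner (scikit-learn's AdaBoostClassifier needs a base estimator whose fit accepts sample_weight, which a thin Keras wrapper can forward to model.fit). A minimal from-scratch sketch of that loop, using decision stumps as a stand-in weak learner; all names here are illustrative, not the asker's code:

```python
import math

def stump_fit(X, y, w):
    """Pick the decision stump (feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for pol in (1, -1):
                # stump predicts +1 when pol * (x[j] - t) >= 0, else -1
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (1 if pol * (xi[j] - t) >= 0 else -1) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def stump_predict(stump, x):
    _, j, t, pol = stump
    return 1 if pol * (x[j] - t) >= 0 else -1

def adaboost(X, y, n_rounds=5):
    """AdaBoost: refit the weak learner on reweighted samples each round."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = stump_fit(X, y, w)
        err = max(stump[0], 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)   # weight of this weak learner
        ensemble.append((alpha, stump))
        # up-weight misclassified samples, then renormalise
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(s, x) for a, s in ensemble)
    return 1 if score >= 0 else -1
```

To use a Keras model instead of the stump, the fit step would train the network with the current sample weights (e.g. via the sample_weight argument of model.fit), and the weighted-error and reweighting steps stay the same.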
14
votes
2 answers

How to use XGBoost algorithm for regression in R?

I was trying the XGBoost technique for prediction. As my dependent variable is continuous, I was doing the regression using XGBoost, but most of the references available in various portals are for classification. Though I know by using objective…
Amarjeet
  • 907
  • 2
  • 9
  • 14
14
votes
1 answer

How to access weighting of individual decision trees in xgboost?

I'm using xgboost for ranking with param = {'objective':'rank:pairwise', 'booster':'gbtree'} As I understand it, gradient boosting works by calculating the weighted sum of the learned decision trees. How can I access the weights that are assigned to…
саша
  • 521
  • 5
  • 20
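One point worth noting for this question: unlike AdaBoost, gradient boosting does not assign a separate learned weight to each tree. The ensemble prediction is just the base score plus the sum of each tree's leaf values, with the learning rate applied uniformly (and in xgboost the shrinkage is folded into the leaf values at training time, which you can inspect via Booster.get_dump()). A toy sketch of that additive structure, with made-up trees and values:

```python
# Each "tree" maps an input to a leaf value; the ensemble prediction is
# base_score + eta * sum of leaf values -- there is no per-tree weight
# to recover, because the learning rate (eta) applies uniformly.
eta = 0.3
base_score = 0.5
trees = [
    lambda x: 0.2 if x > 1 else -0.1,    # leaf values of tree 1 (illustrative)
    lambda x: 0.05 if x > 2 else -0.05,  # leaf values of tree 2 (illustrative)
]

def ensemble_predict(x):
    return base_score + eta * sum(t(x) for t in trees)
```

So to see what each tree contributes, you sum its leaf values along the path for a given input, rather than looking for a stored weight.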
12
votes
2 answers

How can I boost the field length norm in elasticsearch function score?

I know that elasticsearch takes into account the length of a field when computing the score of the documents retrieved by a query. The shorter the field, the higher the weight (see The field-length norm). I like this behaviour: when I search for…
Mario Trucco
  • 1,933
  • 3
  • 33
  • 45
10
votes
1 answer

R: implementing my own gradient boosting algorithm

I am trying to write my own gradient boosting algorithm. I understand there are existing packages like gbm and xgboost, but I wanted to understand how the algorithm works by writing my own. I am using the iris data set, and my outcome is…
Adrian
  • 9,229
  • 24
  • 74
  • 132
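The core of a hand-rolled gradient boosting regressor is short: start from the mean, repeatedly fit a weak learner to the residuals (the negative gradient of squared loss), and add its shrunken predictions. A minimal sketch with one-feature regression stumps; names and data are illustrative, not the asker's iris setup:

```python
def fit_stump(X, r):
    """Regression stump: split on a threshold, predict the mean residual in each half."""
    best = None
    for t in sorted(set(X)):
        left = [ri for xi, ri in zip(X, r) if xi <= t]
        right = [ri for xi, ri in zip(X, r) if xi > t]
        if not left or not right:
            continue
        ml = sum(left) / len(left)
        mr = sum(right) / len(right)
        sse = (sum((ri - ml) ** 2 for ri in left)
               + sum((ri - mr) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def gradient_boost(X, y, n_trees=20, lr=0.5):
    f0 = sum(y) / len(y)                 # initial prediction: the mean
    pred = [f0] * len(y)
    trees = []
    for _ in range(n_trees):
        # residuals = negative gradient of 0.5 * (y - pred)^2
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(X, residuals)
        trees.append(tree)
        pred = [pi + lr * tree(xi) for pi, xi in zip(pred, X)]
    return lambda x: f0 + lr * sum(t(x) for t in trees)
```

Swapping the residual line for the gradient of another loss (and using deeper trees) is essentially what gbm and xgboost generalise.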
10
votes
1 answer

Feature importance 'gain' in XGBoost

I want to understand how the feature importance in xgboost is calculated by 'gain'. From https://towardsdatascience.com/be-careful-when-interpreting-your-features-importance-in-xgboost-6e16132588e7: ‘Gain’ is the improvement in accuracy brought by…
nellng
  • 103
  • 1
  • 1
  • 5
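For context on this question: in XGBoost, the gain of a single split is the reduction in the regularised training objective, computed from the gradient sums (G) and hessian sums (H) of the two children; the "gain" feature importance then aggregates this quantity over every split made on a feature. The split-gain formula from the XGBoost paper, as a small function:

```python
def split_gain(GL, HL, GR, HR, lam=1.0, gamma=0.0):
    """XGBoost split gain: loss reduction from splitting a node whose
    gradient/hessian sums are (GL + GR, HL + HR) into two children.
    lam is the L2 leaf regularisation (lambda), gamma the split penalty."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma
```

A split that sends strongly positive and strongly negative gradients to different children scores high; a split whose children have the same gradient balance as the parent scores near zero.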
10
votes
1 answer

Dumping XGBClassifier model into text

I trained a multi-label classification model with XGBoost and want to code this model in another system. Is it possible to see the text output of my XGBClassifier model, as with dump_model on an XGB Booster? Edit: I found that…
Sabri Karagönen
  • 2,212
  • 1
  • 14
  • 28
10
votes
1 answer

"valid deviance" is nan for GBM model, What does this means and how to get rid of this?

I am using gradient boosting for classification. Though the results are improving, I am getting NaN for valid deviance. Model = gbm.fit( x = x_Train , y = y_Train , distribution = "bernoulli", n.trees = GBM_NTREES , shrinkage =…
Amarjeet
  • 907
  • 2
  • 9
  • 14
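A common cause of a NaN validation deviance: bernoulli deviance contains terms like y·log(p), and once the model's predicted probabilities saturate at exactly 0 or 1 (often from overfitting or too-large shrinkage), a 0·log(0) term evaluates to NaN. A sketch of the deviance with the clip that avoids it; this mirrors the usual bernoulli deviance definition, though gbm's exact scaling convention may differ:

```python
import math

def bernoulli_deviance(y, p, eps=1e-15):
    """-2 * mean Bernoulli log-likelihood. Clipping p away from 0 and 1
    avoids the log(0) / 0*log(0) terms that produce NaN or -Inf."""
    dev = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)   # without this clip, p == 0 or 1 blows up
        dev += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -2 * dev / len(y)
```

Practically, reducing shrinkage or n.trees (or adding regularisation) usually keeps the validation probabilities away from the saturating extremes.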
9
votes
2 answers

Gradient Boosting Variable Importance

I have fit my gradient boosting model and am trying to print variable importance. The same code works using random forest, but I keep getting an error when running varImp(). The error is the following. Error in…
Dustin Smith
  • 115
  • 1
  • 6
7
votes
1 answer

Need help implementing a custom loss function in lightGBM (Zero-inflated Log Normal Loss)

I'm trying to implement this zero-inflated log normal loss function, based on this paper, in lightGBM (https://arxiv.org/pdf/1912.07753.pdf) (page 5). But, admittedly, I just don't know how. I don't understand how to get the gradient and hessian of…
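A LightGBM custom objective must return the per-sample gradient and hessian of the loss with respect to the raw score. When deriving those by hand for a loss like the zero-inflated lognormal, it helps to have a finite-difference check to validate the derivation before training with it. A small sketch of that check (the loss passed in is whatever scalar loss you derived; the names here are illustrative):

```python
def numerical_grad_hess(loss, pred, target, step=1e-4):
    """Central finite differences for d(loss)/d(pred) and d2(loss)/d(pred)^2.
    Useful for sanity-checking a hand-derived gradient/hessian before
    plugging them into a LightGBM custom objective."""
    lp = loss(pred + step, target)
    lm = loss(pred - step, target)
    l0 = loss(pred, target)
    grad = (lp - lm) / (2 * step)
    hess = (lp - 2 * l0 + lm) / (step * step)
    return grad, hess
```

If the numerical values agree with your analytic formulas on a grid of (pred, target) pairs, the same formulas, vectorised over the prediction array, are what the objective callable should return.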
7
votes
1 answer

Catboost: what are reasonable values for l2_leaf_reg?

Running catboost on a large-ish dataset (~1M rows, 500 columns), I get: Training has stopped (degenerate solution on iteration 0, probably too small l2-regularization, try to increase it). How do I guess what the l2 regularization value should be?…
Guy Adini
  • 5,188
  • 5
  • 32
  • 34
6
votes
1 answer

What is the use of base_score in xgboost multiclass working?

I am trying to explore the working of XGBoost binary classification as well as multi-class. In the binary case, I observed that base_score is treated as the starting probability, and it also showed a major impact while calculating Gain and…
6
votes
1 answer

LightGBM ignore warning about "boost_from_average"

I use the LightGBM model (version 2.2.1). It shows the following warning during training: [LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true. This may cause significantly…
Mikhail_Sam
  • 10,602
  • 11
  • 66
  • 102
5
votes
1 answer

Upgrade from LatLonType to LatLonPointSpatialField

I am using Solr 6.5.1. LatLonType is now deprecated (https://lucene.apache.org/solr/guide/6_6/spatial-search.html) and I'm trying to use the LatLonPointSpatialField. I also need it to be multi-valued. My field is defined as follows:
Valentin V
  • 24,971
  • 33
  • 103
  • 152
5
votes
5 answers

Predicting probabilities of classes in case of Gradient Boosting Trees in Spark using the tree output

It is known that GBTs in Spark give you predicted labels as of now. I was thinking of trying to calculate predicted probabilities for a class (say, all the instances falling under a certain leaf). The code to build GBTs: import…
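The usual recipe for this: the GBT's raw margin is the weighted sum of per-tree outputs, and a probability is recovered with a logistic transform. The factor of 2 below matches the {-1, +1} label convention I believe Spark MLlib's LogLoss uses; treat it as an assumption to verify against your Spark version. A minimal sketch with plain numbers standing in for the tree outputs:

```python
import math

def gbt_probability(tree_outputs, tree_weights):
    """Positive-class probability from a GBT margin.
    margin = weighted sum of per-tree outputs; the factor of 2 assumes
    the {-1, +1} label convention of Spark MLlib's LogLoss (assumption)."""
    margin = sum(w * o for w, o in zip(tree_weights, tree_outputs))
    return 1.0 / (1.0 + math.exp(-2.0 * margin))
```

In Spark you would collect each tree's prediction and its tree weight from the trained model and feed them through this transform instead of taking the thresholded label.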