
When training a logistic regression model, it goes through an iterative process where at each step it calculates the weights of the x variables and the bias value so as to minimize the loss function.
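As a rough illustration of that iterative process (this is not sklearn's actual solver, which uses optimizers such as lbfgs; it is a plain gradient-descent sketch with made-up data):

import numpy as np

# Toy data: 4 samples, 2 features, binary target
X = np.array([[0.5, 1.2], [1.5, 0.3], [3.0, 2.2], [2.8, 3.1]])
y = np.array([0, 0, 1, 1])

w = np.zeros(X.shape[1])  # one weight per x variable
b = 0.0                   # bias / intercept
lr = 0.1                  # learning rate

for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of the linear score
    # Gradient of the mean log loss w.r.t. w and b
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b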

From the official scikit-learn source code for LogisticRegression, the class's fit method is as follows:

def fit(self, X, y, sample_weight=None):
    """
    Fit the model according to the given training data.

    Parameters
    ----------
    X : {array-like, sparse matrix} of shape (n_samples, n_features)
        Training vector, where n_samples is the number of samples and
        n_features is the number of features.

    y : array-like of shape (n_samples,)
        Target vector relative to X.

    sample_weight : array-like of shape (n_samples,) default=None
        Array of weights that are assigned to individual samples.
        If not provided, then each sample is given unit weight.

        .. versionadded:: 0.17
           *sample_weight* support to LogisticRegression.
    """
    ...  # (rest of the implementation omitted)

I am guessing that sample_weight is the weight of the x variables, which is set to 1 if not given. Is the bias value also 1?

haneulkim

1 Answer


You sound somewhat confused, perhaps looking for an analogy here with the weights & biases of a neural network. But this is not the case; sample_weight here has nothing to do with the weights of a neural network, even as a concept.

sample_weight is there so that, if the (business) problem requires it, we can give more weight (i.e. more importance) to some samples compared with others, and this importance directly affects the loss. It is sometimes used in cases of imbalanced data; quoting from the Tips on practical use section of the documentation (it is about decision trees, but the rationale is the same):

Class balancing can be done by sampling an equal number of samples from each class, or preferably by normalizing the sum of the sample weights (sample_weight) for each class to the same value.

and from a relevant thread at Cross Validated:

Sample weights are used to increase the importance of a single data-point (let's say, some of your data is more trustworthy, then they receive a higher weight). So: The sample weights exist to change the importance of data-points
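To make the class-balancing idea from the documentation quote concrete, here is a minimal sketch with made-up data; compute_sample_weight is scikit-learn's utility for exactly that per-class normalization:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_sample_weight

# Imbalanced toy data: 6 negatives, 2 positives
X = np.array([[0.], [1.], [2.], [3.], [4.], [5.], [8.], [9.]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# "balanced" normalizes the sum of sample weights per class to the same value
w = compute_sample_weight(class_weight="balanced", y=y)
# negatives get 8/(2*6) ≈ 0.67 each, positives get 8/(2*2) = 2.0 each

clf = LogisticRegression().fit(X, y, sample_weight=w)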

You can see a practical demonstration of how changing the weight of some samples changes the final model in the SO thread What does `sample_weight` do to the way a `DecisionTreeClassifier` works in sklearn? (again, it is about decision trees, but the rationale is the same).
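The same effect is easy to reproduce with LogisticRegression itself; a minimal sketch with synthetic data, where the two fits differ only in sample_weight:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Same data, different per-sample importance
uniform = LogisticRegression().fit(X, y)
weights = np.where(y == 1, 10.0, 1.0)  # positives count 10x in the loss
weighted = LogisticRegression().fit(X, y, sample_weight=weights)

print(uniform.coef_, uniform.intercept_)
print(weighted.coef_, weighted.intercept_)  # a different fitted model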

Having clarified that, it should now be apparent that there is no room here for any kind of "bias" parameter whatsoever. In fact, the introductory paragraph in your question is wrong: logistic regression does not compute such weights and biases; it returns coefficients and an intercept term (sometimes itself called bias), and these coefficients & intercept have nothing to do with sample_weight.
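A quick shape check makes the distinction explicit; in the hypothetical snippet below, coef_ has one entry per feature and intercept_ is a single term, while sample_weight has one entry per sample:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.RandomState(0).randn(50, 3)  # 50 samples, 3 features
y = (X.sum(axis=1) > 0).astype(int)

clf = LogisticRegression().fit(X, y, sample_weight=np.ones(50))

print(clf.coef_.shape)       # (1, 3): one coefficient per feature ("weights")
print(clf.intercept_.shape)  # (1,): the intercept term ("bias")
# sample_weight, by contrast, had shape (50,): one entry per sample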

desertnaut
  • I agree with your comment that "coef and int have nothing to do with sample_weights". But I am sure that LogisticRegression computes weights for the x variables; that is the whole point of training, to get a weight for each x variable. https://web.stanford.edu/~jurafsky/slp3/5.pdf check it out. – haneulkim Jan 31 '21 at 00:09
  • @Ambleu this is a matter of *terminology*; the terms *weights* and *bias* are **sometimes** used interchangeably for the *coefficients* and *intercept*, respectively, but this is far from standard. scikit-learn (which your question is specifically about) does **not** use the weights/biases terminology, but the coefficient (`coef_`) and intercept (`intercept_`) one ([docs](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html)) – desertnaut Jan 31 '21 at 00:22