Stochastic Gradient Descent (SGD) is an iterative algorithm used to find a minimum (local or global) of a differentiable function, using gradient estimates computed from randomly chosen samples or mini-batches.
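As a rough illustration of the definition above, here is a minimal sketch of SGD fitting a line y ≈ w*x + b with one randomly ordered sample per update. The function name `sgd_linear_fit`, the learning rate, the epoch count, and the synthetic data are all assumptions chosen for the example, not taken from any question below.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.01, epochs=200, seed=0):
    """Fit y ≈ w*x + b by SGD on squared error, one sample per update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # The "stochastic" part: visit the samples in a fresh random order.
        rng.shuffle(indices)
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b


# Hypothetical noise-free data generated from y = 3x + 1.
xs = [x / 10 for x in range(-20, 21)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)
```

Passing the whole dataset to a single update instead of one sample at a time would turn this into ordinary (full-batch) gradient descent; the randomness comes only from how samples are drawn.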
Questions tagged [stochastic-gradient]
35 questions
6
votes
1 answer
What is the default batch size of PyTorch SGD?
What does PyTorch SGD do if I feed it the whole dataset and do not specify a batch size? I don't see anything "stochastic" or "random" in that case.
For example, in the following simple code, I feed the whole data (x,y) into a model.
optimizer =…

Tony B
2
votes
1 answer
Why are the gradients not equivalent when using loss.backward() vs. torch.autograd.grad?
I ran into this weird behavior when trying to "manually" optimize a network's parameters via SGD. When attempting to update the model's parameters using the following way, it works just fine:
for _ in trange(epochs):
    for x, y in train_loader:
        …

Omar AlSuwaidi
2
votes
1 answer
Implement an SGD classifier with log loss and L2 regularization without using sklearn
X, y = make_classification(n_samples=50000, n_features=15, n_informative=10, n_redundant=5,
                           n_classes=2, weights=[0.7], class_sep=0.7, random_state=15)
initialize weights
def initialize_weights(dim):
    ''' In this…

maddy
1
vote
0 answers
Gradient Descent Cost Function Blows up after locating minimum value
I'm working on a multivariate optimization problem using the gradient descent algorithm. The algorithm does an okay job, but I noticed the cost function does not decrease monotonically as desired.
It blows up right after the minimal…

Young Wang
1
vote
0 answers
Visualize Stochastic Gradient Descent using Contour plot in Python
I tried to implement the stochastic gradient descent method and apply it to a dataset I built myself. The dataset follows a linear model (wx + b = y).
The process has also somehow converged towards the appropriate values. What causes me difficulties…

pivive
1
vote
1 answer
Encountering a TypeError: can't multiply sequence by non-int of type 'float' when creating an SGD algorithm
# We first define the observations as a list and then also as a table for the experienced worker's performance.
Observation1 = [2.0, 6.0, 2.0]
Observation2 = [1.0, 5.0, 7.0]
Observation3 = [5.0, 2.0, 1.0]
Observation4 = [2.0, 3.0, 8.0]
Observation5…

NSM_2000
1
vote
1 answer
SGD breaks down when encountering unseen values
This is my code:
from sklearn.linear_model import SGDClassifier, LogisticRegression
from sklearn.metrics import classification_report, accuracy_score
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from…

Outcast
1
vote
1 answer
How can I get my neural net to correctly do linear regression?
I used the code for the first neural net from the book Neural Networks and Deep Learning by Michael Nielsen, which was used for recognising handwritten digits. It uses stochastic gradient descent with mini-batches and the sigmoid activation function.…

thebasqueinterdisciplinarian
1
vote
1 answer
SGD classifier Precision-Recall curve
I'm working on a binary classification problem and I have an sgd classifier like so:
sgd = SGDClassifier(
    max_iter = 1000,
    tol = 1e-3,
    validation_fraction = 0.2,
    class_weight = {0:0.5, 1:8.99}
)
I fitted…

nz_21
1
vote
0 answers
Simple NN in python not working, problem with backpropagation algorithm perhaps?
I tried to code the classic XOR problem using a NN with 2 inputs, 2 hidden neurons, and 1 output neuron (using stochastic gradient descent). But whatever I do, my NN doesn't work correctly; I still get the wrong output and I really don't know where is the…

frederik kosa
1
vote
1 answer
How to make stochastic gradient regressor run up to 1000 epochs or yield better results?
I am running the stochastic gradient regressor from sklearn (docs).
Here are the parameters I used:
{"loss": "huber",
 "learning_rate": "adaptive",
 "penalty": "l1",
 "alpha": "0.001",
 "l1_ratio": "0.75",
 "early_stopping":…

JA-pythonista
1
vote
2 answers
Why am I getting a huge cost in my stochastic gradient descent implementation?
I've run into some problems while trying to implement Stochastic Gradient Descent, and basically what is happening is that my cost is growing like crazy and I don't have a clue why.
MSE implementation:
def mse(x,y,w,b):
    predictions = x @ w
    …

Dawid_C
1
vote
1 answer
Forward Pass calculation on current batch in "get_updates" method of Keras SGD Optimizer
I am trying to implement a stochastic Armijo rule in the get_gradient method of the Keras SGD optimizer.
Therefore, I need to calculate another forward pass to check if the learning_rate chosen was good. I don't want another calculation of the…

mreiners
1
vote
2 answers
Restarting with Adam
I am training my network with an early stopping strategy. I start with a higher learning rate, and based on validation loss, I need to restart training from an earlier snapshot.
I am able to save/load snapshot with model and optimizer state_dicts. No…

Xoul
1
vote
1 answer
How do sklearn SGDClassifier model thresholds relate to model scores?
I've trained a model and identified a 'threshold' that I'd like to deploy it at, but I'm having trouble understanding how the threshold relates to the score.
X = labeled_data[features].reset_index(drop=True)
Y =…

Tyler Wood