Questions tagged [backpropagation]

Backpropagation is a method of gradient computation, often used in artificial neural networks to perform gradient descent. It led to a “renaissance” in the field of artificial neural network research.

In most cases, it requires a teacher that knows, or can calculate, the desired output for any input in the training set. The term is an abbreviation for "backward propagation of errors".

1267 questions
957 votes • 18 answers

What is the role of the bias in neural networks?

I'm aware of gradient descent and the back-propagation algorithm. What I don't get is: when is using a bias important, and how do you use it? For example, when mapping the AND function, when I use two inputs and one output, it does not give the…
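
A minimal NumPy sketch of the usual answer (synthetic setup, not the asker's code): without a bias, a single neuron's decision boundary w1·x1 + w2·x2 = 0 must pass through the origin, so even the linearly separable AND function cannot be fit; the bias term shifts the boundary and gets its own gradient.

```python
import numpy as np

# A single sigmoid neuron learning AND. The bias b shifts the decision
# boundary away from the origin; without it, the input (0, 0) is stuck
# exactly on the boundary (output 0.5) forever.
def neuron(x, w, b):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w, b = np.zeros(2), 0.0
for _ in range(5000):                   # plain batch gradient descent
    p = neuron(X, w, b)
    grad = p - y                        # dL/dz for sigmoid + cross-entropy
    w -= 0.5 * X.T @ grad / len(X)
    b -= 0.5 * grad.mean()              # the bias is trained like any weight

print(np.round(neuron(X, w, b)))        # -> [0. 0. 0. 1.]
```
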
330 votes • 2 answers

Extremely small or NaN values appear in training neural network

I'm trying to implement a neural network architecture in Haskell, and use it on MNIST. I'm using the hmatrix package for linear algebra. My training framework is built using the pipes package. My code compiles and doesn't crash. But the problem is,…
Charles Langlois • 4,198 • 4 gold • 16 silver • 25 bronze
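
NaNs of this kind usually come from numerical overflow in the exponentials, and the standard stabilization is language-agnostic even though the question is about Haskell. A minimal sketch of the pattern in Python/NumPy:

```python
import numpy as np

# Naive softmax overflows for large logits: exp(1000) -> inf -> nan.
def softmax_naive(z):
    e = np.exp(z)
    return e / e.sum()

# Stable version: subtracting max(z) leaves the result unchanged
# (softmax is shift-invariant) but keeps every exponent <= 0.
def softmax_stable(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(z))    # [nan nan nan] plus overflow warnings
print(softmax_stable(z))   # [0.09003057 0.24472847 0.66524096]
```
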
78 votes • 2 answers

What does the parameter retain_graph mean in the Variable's backward() method?

I'm going through the neural transfer PyTorch tutorial and am confused about the use of retain_variable (deprecated, now referred to as retain_graph). The code example shows: class ContentLoss(nn.Module): def __init__(self, target, weight): …
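
A minimal sketch of what retain_graph controls, using a toy tensor rather than the tutorial's ContentLoss module: by default PyTorch frees the autograd graph after backward(), so a second backward through the same graph needs retain_graph=True on the first call.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = (x * x).sum()

# Without retain_graph=True, a second backward() raises
# "Trying to backward through the graph a second time".
y.backward(retain_graph=True)   # keep the graph alive
y.backward()                    # second pass works; gradients accumulate
print(x.grad)                   # tensor([4., 4., 4.]) = 2 * (dy/dx)
```
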
77 votes • 2 answers

How does keras handle multiple losses?

If I have something like: model = Model(inputs = input, outputs = [y1,y2]) l1 = 0.5 l2 = 0.3 model.compile(loss = [loss1,loss2], loss_weights = [l1,l2], ...) what does Keras do with the losses to obtain the final loss? Is it something…
jfga • 803 • 1 gold • 8 silver • 8 bronze
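
As far as I know, Keras reduces multiple losses to a single scalar as the weighted sum of the per-output losses (plus any regularization terms) and optimizes that. A minimal sketch with placeholder layers, using the question's weights:

```python
import tensorflow as tf

inp = tf.keras.Input(shape=(4,))
y1 = tf.keras.layers.Dense(1, name="y1")(inp)
y2 = tf.keras.layers.Dense(1, name="y2")(inp)
model = tf.keras.Model(inputs=inp, outputs=[y1, y2])

# Keras minimizes the weighted sum:
#   total_loss = 0.5 * loss1(y1_true, y1_pred) + 0.3 * loss2(y2_true, y2_pred)
model.compile(optimizer="adam",
              loss=["mse", "mae"],
              loss_weights=[0.5, 0.3])
```
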
61 votes • 3 answers

In which cases is the cross-entropy preferred over the mean squared error?

Although both of the above methods give a better score the closer the prediction is to the target, cross-entropy is still preferred. Is this true in every case, or are there particular scenarios where we prefer cross-entropy over MSE?
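
One standard answer, shown numerically: with a sigmoid output p = σ(z), the gradient of squared error with respect to the logit carries an extra σ′(z) = p(1−p) factor that vanishes when the unit saturates, while cross-entropy cancels it. A small worked example (the numbers are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One saturated, badly wrong prediction: target t = 1, logit z = -10.
z, t = -10.0, 1.0
p = sigmoid(z)

# Gradients w.r.t. the logit z, for loss = 0.5*(p-t)^2 vs cross-entropy:
grad_mse = (p - t) * p * (1 - p)   # extra sigma'(z) factor
grad_ce  = (p - t)                 # cross-entropy cancels sigma'(z)

print(grad_mse)   # ~ -4.5e-05 -> learning stalls on saturated units
print(grad_ce)    # ~ -0.99995 -> strong corrective signal
```
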
53 votes • 4 answers

What is the difference between SGD and back-propagation?

Can you please tell me the difference between Stochastic Gradient Descent (SGD) and back-propagation?
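
In short: back-propagation computes the gradients, and SGD is the update rule that consumes them. A minimal sketch on a hypothetical linear model with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 3)), rng.normal(size=100)
w = np.zeros(3)

for step in range(200):
    i = rng.integers(len(X))            # "stochastic": one random sample
    pred = X[i] @ w                     # forward pass
    # --- back-propagation: gradient of the loss w.r.t. w (chain rule) ---
    grad = 2 * (pred - y[i]) * X[i]     # d/dw of (pred - y)^2
    # --- SGD: the update rule that uses that gradient ---
    w -= 0.01 * grad
```
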
50 votes • 3 answers

Understanding Neural Network Backpropagation

Update: a better formulation of the issue. I'm trying to understand the backpropagation algorithm with an XOR neural network as an example. For this case there are 2 input neurons + 1 bias, 2 neurons in the hidden layer + 1 bias, and 1 output…
Kiril • 39,672 • 31 gold • 167 silver • 226 bronze
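
A minimal NumPy sketch of the setup the question describes (2 inputs, 2 hidden sigmoid units, 1 output, biases included). Convergence on XOR is sensitive to the random initialization, so other seeds may need more iterations:

```python
import numpy as np

rng = np.random.default_rng(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# 2 inputs -> 2 hidden sigmoid units -> 1 sigmoid output, with biases.
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)

for _ in range(20000):
    # forward pass
    h = sig(X @ W1 + b1)
    out = sig(h @ W2 + b2)
    # backward pass (chain rule, layer by layer), loss = 0.5*sum((out-y)^2)
    d_out = (out - y) * out * (1 - out)    # dL/dz at the output
    d_h = (d_out @ W2.T) * h * (1 - h)     # propagated to the hidden layer
    # gradient-descent updates
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(axis=0)

print(out.round(2).ravel())   # typically close to [0, 1, 1, 0]
```
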
47 votes • 2 answers

How to use k-fold cross validation in a neural network

We are writing a small ANN which is supposed to categorize 7000 products into 7 classes based on 10 input variables. In order to do this we have to use k-fold cross validation but we are kind of confused. We have this excerpt from the presentation…
Ortixx • 833 • 3 gold • 10 silver • 23 bronze
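
A minimal sketch with scikit-learn's KFold on placeholder random data matching the question's shape (7000 samples, 10 features, 7 classes); the model choice and fold count here are assumptions:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X = np.random.rand(7000, 10)             # 10 input variables
y = np.random.randint(0, 7, size=7000)   # 7 classes

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True,
                                random_state=0).split(X):
    model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=200)
    model.fit(X[train_idx], y[train_idx])     # train on k-1 folds
    scores.append(accuracy_score(y[val_idx],  # validate on the held-out fold
                                 model.predict(X[val_idx])))

print(np.mean(scores))   # average validation accuracy across the k folds
```
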
45 votes • 4 answers

What is the difference between back-propagation and feed-forward Neural Network?

What is the difference between back-propagation and feed-forward neural networks? From googling and reading, I found that in feed-forward there is only a forward direction, but in back-propagation we first need to do a forward propagation and then…
USB • 6,019 • 15 gold • 62 silver • 93 bronze
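
A short sketch of the distinction, assuming PyTorch: "feed-forward" describes the architecture and its inference-time forward pass, while back-propagation is the gradient computation layered on top of that forward pass during training.

```python
import torch
import torch.nn as nn

# Feed-forward network: data flows one way; inference is forward only.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x, t = torch.randn(1, 4), torch.randn(1, 1)
pred = net(x)                          # forward pass: this is inference

# Back-propagation is the training step added on top: it computes the
# gradient of the loss with respect to every weight in the network.
loss = nn.functional.mse_loss(pred, t)
loss.backward()
```
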
43 votes • 1 answer

What are forward and backward passes in neural networks?

What is the meaning of forward pass and backward pass in neural networks? Everybody is mentioning these expressions when talking about backpropagation and epochs. I understood that forward pass and backward pass together form an epoch.
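
A minimal NumPy sketch of the two passes for a single linear layer (synthetic data). Note that one forward-plus-backward over a batch is a training step; an epoch is one full sweep over the entire training set, not a single pair of passes.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 1))
X, y = rng.normal(size=(5, 3)), rng.normal(size=(5, 1))

# Forward pass: push inputs through the network, keeping intermediates.
pred = X @ W
loss = ((pred - y) ** 2).mean()

# Backward pass: apply the chain rule from the loss back to each weight,
# then update the weights with the resulting gradient.
dW = 2 * X.T @ (pred - y) / len(X)
W -= 0.1 * dW
```
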
41 votes • 6 answers

Neural network backpropagation with RELU

I am trying to implement a neural network with ReLU. input layer -> 1 hidden layer -> relu -> output layer -> softmax layer Above is the architecture of my neural network. I am confused about the backpropagation of this ReLU. For the derivative of ReLU, if…
Danny • 441 • 1 gold • 4 silver • 5 bronze
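
The usual answer, sketched in NumPy with illustrative numbers: in the backward pass, ReLU passes the incoming gradient through wherever its input was positive and zeroes it everywhere else.

```python
import numpy as np

z = np.array([-2.0, -0.5, 0.0, 1.5])        # pre-activations from forward
a = np.maximum(0.0, z)                      # ReLU forward

grad_a = np.array([0.1, -0.3, 0.2, 0.4])    # gradient arriving from above

# Backward through ReLU: keep the gradient where z > 0, zero it elsewhere.
grad_z = grad_a * (z > 0)
print(grad_z)                               # only the z > 0 entry survives
```
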
34 votes • 1 answer

How does the back-propagation algorithm deal with non-differentiable activation functions?

While digging through the topic of neural networks and how to efficiently train them, I came across the method of using very simple activation functions, such as the rectified linear unit (ReLU), instead of the classic smooth sigmoids. The…
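
For reference: ReLU is differentiable everywhere except at 0, a single point that gradient descent essentially never lands on exactly, so implementations simply pick a value from the subdifferential there:

```latex
% ReLU and the convention for its derivative at the kink:
\mathrm{ReLU}(x) = \max(0, x), \qquad
\mathrm{ReLU}'(x) =
\begin{cases}
  1 & x > 0 \\
  0 & x < 0 \\
  \text{any } g \in [0, 1] & x = 0 \quad \text{(subgradient; most libraries pick } 0\text{)}
\end{cases}
```
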
27 votes • 5 answers

ReLU derivative in backpropagation

I am about to implement backpropagation on a neural network that uses ReLU. In a previous project of mine, I did it on a network that was using the sigmoid activation function, but now I'm a little bit confused, since ReLU doesn't have a derivative. Here's an…
Gergely Papp • 800 • 1 gold • 7 silver • 12 bronze
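
A minimal sketch of the swap the question implies, with placeholder values: replace the sigmoid's local derivative a·(1−a) with ReLU's step function (the value at exactly z = 0 is just a convention).

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=4)             # pre-activations saved from the forward pass
grad_above = rng.normal(size=4)    # gradient arriving from the next layer

# Sigmoid network: the local derivative was a * (1 - a).
a = 1.0 / (1.0 + np.exp(-z))
grad_sigmoid = grad_above * a * (1.0 - a)

# ReLU network: the local derivative is a step function; ReLU *is*
# differentiable everywhere except the single point z == 0.
grad_relu = grad_above * np.where(z > 0, 1.0, 0.0)
```
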
27 votes • 3 answers

Difference in performance between numpy and matlab

I am computing the backpropagation algorithm for a sparse autoencoder. I have implemented it in Python using NumPy, and in MATLAB. The code is almost the same, but the performance is very different. The time MATLAB takes to complete the task is…
pabaldonedo • 947 • 1 gold • 7 silver • 14 bronze
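
One common cause of such gaps (an assumption, since the asker's code is not shown in full): per-element Python loops pay interpreter overhead on every iteration, which MATLAB's JIT largely avoids, while vectorized NumPy calls run in optimized C/BLAS. A rough timing sketch:

```python
import numpy as np
import time

X = np.random.rand(1000, 1000)

# Per-element Python loop: interpreter overhead on every single element.
t0 = time.perf_counter()
s = 0.0
for row in X:
    for v in row:
        s += v * v
t_loop = time.perf_counter() - t0

# Vectorized equivalent: one call into optimized C / BLAS code.
t0 = time.perf_counter()
s_vec = (X * X).sum()
t_vec = time.perf_counter() - t0

print(t_loop / t_vec)   # typically a speedup of orders of magnitude
```
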
26 votes • 1 answer

Pytorch ValueError: optimizer got an empty parameter list

When trying to create a neural network and optimize it using PyTorch, I am getting "ValueError: optimizer got an empty parameter list". Here is the code: import torch.nn as nn import torch.nn.functional as F from os.path import dirname from os import…
Gulzar • 23,452 • 27 gold • 113 silver • 201 bronze
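
The most common cause I know of (the question's actual code is truncated above): sub-modules stored in a plain Python list are not registered with the parent module, so Module.parameters() yields nothing and the optimizer receives an empty list. A minimal sketch:

```python
import torch
import torch.nn as nn

class Broken(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain list is invisible to Module.parameters().
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 1)]

class Fixed(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers each sub-module properly.
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 1)])

print(len(list(Broken().parameters())))   # 0 -> ValueError in the optimizer
print(len(list(Fixed().parameters())))    # 4 (two weights, two biases)
torch.optim.SGD(Fixed().parameters(), lr=0.1)   # works
```
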