Questions tagged [automatic-differentiation]

Also known as algorithmic differentiation, or AD for short. Techniques that take a procedure evaluating a numerical function and transform it into a procedure that additionally evaluates directional derivatives, gradients, or higher-order derivatives.

Techniques include

  • operator overloading for dual numbers (see the sketch after this list),
  • operator overloading to extract the operation sequence as a tape,
  • code analysis and transformation.
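
A minimal Python sketch of the dual-number approach, assuming nothing beyond operator overloading; the Dual class and the function f below are illustrative and not taken from any particular library.

    class Dual:
        """Value of the form a + b*eps with eps**2 == 0: carries f and f' together."""
        def __init__(self, val, dot=0.0):
            self.val = val   # function value
            self.dot = dot   # derivative propagated alongside the value

        def _wrap(self, other):
            return other if isinstance(other, Dual) else Dual(other)

        def __add__(self, other):
            other = self._wrap(other)
            return Dual(self.val + other.val, self.dot + other.dot)
        __radd__ = __add__

        def __mul__(self, other):
            other = self._wrap(other)
            # product rule: (u*v)' = u'*v + u*v'
            return Dual(self.val * other.val,
                        self.dot * other.val + self.val * other.dot)
        __rmul__ = __mul__

    def f(x):
        return 3 * x * x + 2 * x + 1   # f'(x) = 6*x + 2

    y = f(Dual(2.0, 1.0))   # seed the derivative direction dx/dx = 1
    print(y.val, y.dot)     # 17.0 14.0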

For a function with input of dimension n and output of dimension m, requiring L elementary operations for its evaluation, one directional derivative (forward mode) or the gradient of one scalar output (reverse mode) can be computed with about 3*L operations.

The accuracy of the derivative is, automatically, nearly as good as the accuracy of the function evaluation.

Other differentiation methods are

  • symbolic differentiation, where a closed-form expression for the derivative is built first, which can grow very large depending on the expression and the implementation, and
  • numerical differentiation by divided differences, which gives less accuracy for comparable effort, or comparable accuracy at higher effort (see the numeric illustration after this list).
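
As a small, self-contained Python illustration of the second point: central differences lose accuracy once the step gets too small, whereas AD has no step size to tune. The function, evaluation point, and step sizes below are arbitrary examples.

    import math

    def f(x):
        return math.sin(x)

    def central_diff(f, x, h):
        return (f(x + h) - f(x - h)) / (2 * h)

    x = 1.0
    exact = math.cos(x)                    # d/dx sin(x) = cos(x)
    for h in (1e-1, 1e-4, 1e-8, 1e-12):
        err = abs(central_diff(f, x, h) - exact)
        print(f"h={h:.0e}  error={err:.2e}")
    # Truncation error shrinks as h decreases, but rounding error in the
    # difference grows again for very small h; AD avoids this trade-off by
    # propagating exact derivative rules through the evaluation.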

See Wikipedia and autodiff.org.

192 questions
78
votes
2 answers

What does the parameter retain_graph mean in the Variable's backward() method?

I'm going through the neural transfer PyTorch tutorial and am confused about the use of retain_variable (deprecated, now referred to as retain_graph). The code example shows: class ContentLoss(nn.Module): def __init__(self, target, weight): …
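
Separate from the tutorial's ContentLoss class, a minimal hedged sketch of what retain_graph controls: a second backward() over the same graph only succeeds if the first call kept the intermediate buffers alive.

    import torch

    x = torch.ones(3, requires_grad=True)
    y = (x * x).sum()

    y.backward(retain_graph=True)   # keep the graph for another backward pass
    print(x.grad)                   # tensor([2., 2., 2.])

    x.grad.zero_()                  # gradients accumulate, so clear them first
    y.backward()                    # only works because retain_graph=True above
    print(x.grad)                   # tensor([2., 2., 2.])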
59
votes
3 answers

Why don't C++ compilers do better constant folding?

I'm investigating ways to speed up a large section of C++ code, which has automatic derivatives for computing jacobians. This involves doing some amount of work in the actual residuals, but the majority of the work (based on profiled execution time)…
36
votes
3 answers

Difference between symbolic differentiation and automatic differentiation?

I just cannot seem to understand the difference. To me it looks like both just go through an expression and apply the chain rule. What am I missing?
Moody
  • 1,297
  • 2
  • 12
  • 21
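
One hedged way to see the difference (using SymPy for the symbolic side and PyTorch's reverse-mode AD for the other; the expression is an arbitrary example): symbolic differentiation returns another expression, while AD only ever returns derivative values at a point.

    import sympy as sp
    import torch

    # Symbolic: the result is a formula, which can grow large for deep programs.
    x_sym = sp.symbols('x')
    expr = sp.exp(sp.sin(x_sym)) * x_sym
    print(sp.diff(expr, x_sym))      # something like x*exp(sin(x))*cos(x) + exp(sin(x))

    # AD: no formula is ever built; we get the derivative's value at x = 2.0.
    x = torch.tensor(2.0, requires_grad=True)
    y = torch.exp(torch.sin(x)) * x
    y.backward()
    print(x.grad)                    # the same derivative, evaluated at 2.0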
16
votes
1 answer

How to get more performance out of automatic differentiation?

I am having a hard time optimizing a program that relies on ad's conjugateGradientDescent function for most of its work. Basically my code is a translation of an old paper's code that is written in Matlab and C. I have not measured it, but that…
fho
  • 6,787
  • 26
  • 71
14
votes
7 answers

Automatic differentiation library in Scheme / Common Lisp / Clojure

I've heard that one of McCarthy's original motivations for inventing Lisp was to write a system for automatic differentiation. Despite this, my Google searches haven't yielded any libraries/macros for doing this. Are there any Scheme/Common…
14
votes
2 answers

How is backpropagation the same (or not) as reverse automatic differentiation?

The Wikipedia page for backpropagation has this claim: The backpropagation algorithm for calculating a gradient has been rediscovered a number of times, and is a special case of a more general technique called automatic differentiation in the…
12
votes
4 answers

Is there any working implementation of reverse mode automatic differentiation for Haskell?

The closest-related implementation in Haskell I have seen is the forward mode at http://hackage.haskell.org/packages/archive/fad/1.0/doc/html/Numeric-FAD.html. The closest related research appears to be reverse mode for another functional…
Ian Fiske
  • 10,482
  • 3
  • 21
  • 20
11
votes
1 answer

Optimize a list function that creates too much garbage (not stack overflow)

I have a Haskell function that's causing more than 50% of all the allocations of my program and causing 60% of my run time to be taken by the GC. I run with a small stack (-K10K) so there is no stack overflow, but can I make this function faster,…
10
votes
1 answer

Navigating the automatic differentiation ecosystem in Julia

Julia has a somewhat sprawling AD ecosystem, with perhaps by now more than a dozen different packages spanning, as far as I can tell, forward-mode (ForwardDiff.jl, ForwardDiff2.jl), reverse-mode (ReverseDiff.jl, Nabla.jl, AutoGrad.jl), and…
cbk
  • 4,225
  • 6
  • 27
10
votes
1 answer

How to apply gradients manually in PyTorch

Starting to learn pytorch and was trying to do something very simple, trying to move a randomly initialized vector of size 5 to a target vector of value [1,2,3,4,5]. But my distance is not decreasing!! And my vector x just goes crazy. No idea what I…
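
A minimal hedged sketch of manual gradient steps in PyTorch toward the target described in the question; the squared-distance loss, learning rate, and iteration count are arbitrary choices.

    import torch

    target = torch.tensor([1., 2., 3., 4., 5.])
    x = torch.randn(5, requires_grad=True)   # randomly initialised vector of size 5

    lr = 0.1
    for _ in range(200):
        loss = ((x - target) ** 2).sum()
        loss.backward()              # fills x.grad with d(loss)/dx
        with torch.no_grad():        # apply the update outside the autograd graph
            x -= lr * x.grad
        x.grad.zero_()               # gradients accumulate unless cleared

    print(x)                         # close to [1., 2., 3., 4., 5.]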
9
votes
1 answer

How does tensorflow handle non differentiable nodes during gradient calculation?

I understood the concept of automatic differentiation, but couldn't find any explanation of how TensorFlow calculates the error gradient for non-differentiable functions such as tf.where in my loss function or tf.cond in my graph. It works just…
Natjo
  • 2,005
  • 29
  • 75
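
A hedged sketch of the usual behaviour with tf.where: autodiff treats the node as piecewise differentiable, passing the incoming gradient through the selected branch and zero through the other. The variables below are illustrative.

    import tensorflow as tf

    x = tf.Variable([-1.0, 2.0])
    with tf.GradientTape() as tape:
        # piecewise definition; not differentiable at x == 0
        y = tf.where(x > 0.0, x * x, 3.0 * x)
        loss = tf.reduce_sum(y)

    print(tape.gradient(loss, x))    # [3., 4.]: the selected branch's derivative per element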
8
votes
3 answers

Where is Wengert List in TensorFlow?

TensorFlow uses reverse-mode automatic differentiation (reverse-mode AD), as shown in https://github.com/tensorflow/tensorflow/issues/675. Reverse-mode AD needs a data structure called a Wengert list - see…
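
A toy Python sketch of the idea of a Wengert list, i.e. a tape recording each operation together with its local partial derivatives, replayed backwards to accumulate adjoints; the helpers below are illustrative and say nothing about TensorFlow's internals.

    values, tape = [], []   # tape entry: (output_idx, [(input_idx, local_partial), ...])

    def push(v):
        values.append(v)
        return len(values) - 1

    def add(i, j):
        k = push(values[i] + values[j])
        tape.append((k, [(i, 1.0), (j, 1.0)]))
        return k

    def mul(i, j):
        k = push(values[i] * values[j])
        tape.append((k, [(i, values[j]), (j, values[i])]))   # d(xy)/dx = y, d(xy)/dy = x
        return k

    # f(a, b) = a*b + a, evaluated at a = 3, b = 4
    a, b = push(3.0), push(4.0)
    out = add(mul(a, b), a)

    # reverse sweep: walk the tape backwards, accumulating adjoints df/d(value_i)
    adjoint = [0.0] * len(values)
    adjoint[out] = 1.0
    for k, inputs in reversed(tape):
        for i, partial in inputs:
            adjoint[i] += adjoint[k] * partial

    print(adjoint[a], adjoint[b])   # 5.0 3.0  (df/da = b + 1, df/db = a)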
7
votes
2 answers

Numeric.AD and typing problem

I'm trying to work with Numeric.AD and a custom Expr type. I wish to calculate the symbolic gradient of a user-inputted expression. The first trial with a constant expression works nicely: calcGrad0 :: [Expr Double] calcGrad0 = grad df vars where …
aleator
  • 4,436
  • 20
  • 31
7
votes
3 answers

What is differentiable programming?

Native support for differentiable programming has been added to Swift for the Swift for TensorFlow project. Julia has something similar with Zygote. What exactly is differentiable programming? What does it enable? Wikipedia says the programs can be…
joel
  • 6,359
  • 2
  • 30
  • 55
7
votes
1 answer

PyTorch Autograd: what does the runtime error "grad can be implicitly created only for scalar outputs" mean?

I am trying to understand Pytorch autograd in depth; I would like to observe the gradient of a simple tensor after going through a sigmoid function as below: import torch from torch import autograd D = torch.arange(-8, 8, 0.1,…
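
A hedged sketch of the error and two common ways around it, mirroring the question's tensor D but otherwise illustrative.

    import torch

    D = torch.arange(-8, 8, 0.1, requires_grad=True)
    y = torch.sigmoid(D)             # y is a vector, not a scalar

    # y.backward()                   # RuntimeError: grad can be implicitly
    #                                # created only for scalar outputs

    # Option 1: reduce to a scalar first, e.g. by summing.
    y.sum().backward()
    print(D.grad[:3])                # sigmoid'(d) for the first entries

    # Option 2: pass explicit output weights (a vector-Jacobian product seed).
    D.grad = None
    y = torch.sigmoid(D)
    y.backward(torch.ones_like(y))   # equivalent to the sum() above here
    print(D.grad[:3])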