Given the product of a matrix and a vector

A.v

with A of shape (n, m) and v of dimension m, where n and m are symbols, I need to calculate the derivative with respect to the matrix elements. I haven't found a way to use a proper vector, so I started with two MatrixSymbol objects:

from sympy import symbols, tensor, MatrixSymbol, diff
n, m = symbols('n m')
j = tensor.Idx('j')
i = tensor.Idx('i')
l = tensor.Idx('l')
h = tensor.Idx('h')
A = MatrixSymbol('A', n, m)
B = MatrixSymbol('B', m, 1)
C = A*B

Now, if I try to differentiate with respect to one of A's elements using the indices, I get back the unevaluated expression:

diff(C, A[i,j])
>>>> Derivative(A*B, A[i, j])

If I also index C (it won't let me use only one index on the resulting vector), I get back the product expressed as a Sum:

C[l,h]
>>>> Sum(A[l, _k]*B[_k, h], (_k, 0, m - 1))

If I differentiate this with respect to the matrix element, I get 0 instead of an expression with the KroneckerDelta, which is the result that I would like to get:

diff(C[l,h], A[i,j])
>>>> 0

I wonder if maybe I shouldn't be using MatrixSymbols to start with. How should I go about implementing the behaviour that I want to get?

2 Answers

SymPy does not yet know matrix calculus; in particular, one cannot differentiate MatrixSymbol objects. You can do this sort of computation with Matrix objects filled with arrays of symbols; the drawback is that the matrix sizes must be explicit for this to work.

Example:

from sympy import *
A = Matrix(symarray('A', (4, 5)))
B = Matrix(symarray('B', (5, 3)))
C = A*B
print(C.diff(A[1, 2]))

outputs:

Matrix([[0, 0, 0], [B_2_0, B_2_1, B_2_2], [0, 0, 0], [0, 0, 0]])
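
To tie this back to the KroneckerDelta form asked about, here is a minimal sketch that compares the explicit-size derivative with delta(l, i) * B[j, h]. It assumes the same 4x5 and 5x3 shapes as above. The entries are built with Symbol inside a lambda rather than symarray (which relies on NumPy), and the names `expected` and `matches` are only illustrative.

```python
from sympy import Matrix, Symbol, KroneckerDelta

# Explicit-size matrices filled with symbols A_r_c and B_r_c
A = Matrix(4, 5, lambda r, c: Symbol(f'A_{r}_{c}'))
B = Matrix(5, 3, lambda r, c: Symbol(f'B_{r}_{c}'))
C = A * B

# Differentiate every entry of C with respect to the single element A[i, j]
i, j = 1, 2
dC = C.diff(A[i, j])

# Index form of the same derivative: d C[l, h] / d A[i, j] = delta(l, i) * B[j, h]
expected = Matrix(4, 3, lambda l, h: KroneckerDelta(l, i) * B[j, h])

# Compare the two matrices entry by entry
matches = dC == expected
print(matches)
```

Only row i of the derivative is nonzero, and that row is row j of B, which is what the delta expression encodes.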
user

The git version of SymPy (and the next version) handles this better:

In [55]: print(diff(C[l,h], A[i,j]))
Sum(KroneckerDelta(_k, j)*KroneckerDelta(i, l)*B[_k, h], (_k, 0, m - 1))
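
For a self-contained version to try on a recent SymPy release, something like the sketch below should work. It assumes plain integer symbols for the indices instead of tensor.Idx, and declares n and m as positive integers; both are choices made here for illustration, not part of the original question. The exact printed form (dummy index names, optional range arguments on the deltas) may differ between versions, so no output is shown.

```python
from sympy import MatrixSymbol, symbols, diff, KroneckerDelta

# Symbolic dimensions and indices (assumption: plain integer symbols, not Idx)
n, m = symbols('n m', positive=True, integer=True)
i, j, l, h = symbols('i j l h', integer=True)

A = MatrixSymbol('A', n, m)
B = MatrixSymbol('B', m, 1)
C = A * B

# Indexing the product gives a Sum over a dummy index
elem = C[l, h]

# Differentiate that Sum with respect to a single element of A
d = diff(elem, A[i, j])
print(d)
```

If the printed result is 0 or an unevaluated Derivative, the installed SymPy is probably too old for this approach, and the explicit Matrix workaround from the other answer still applies.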
asmeurer