
I have the following:

import numpy as np

XYZ_to_sRGB_mat_D50 = np.asarray([
    [3.1338561, -1.6168667, -0.4906146],
    [-0.9787684, 1.9161415, 0.0334540],
    [0.0719453, -0.2289914, 1.4052427],
])

XYZ_1 = np.asarray([0.25, 0.4, 0.1])
XYZ_2 = np.random.rand(100,100,3)

np.matmul(XYZ_to_sRGB_mat_D50, XYZ_1) # valid operation
np.matmul(XYZ_to_sRGB_mat_D50, XYZ_2) # makes no sense mathematically

How do I perform the same operation on XYZ_2 that I did on XYZ_1? Do I somehow reshape the array first?

Brandon Dube

2 Answers


It seems you are trying to sum-reduce the last axis of XYZ_to_sRGB_mat_D50 (axis=1) with the last one of XYZ_2 (axis=2). So, you can use np.tensordot like so -

np.tensordot(XYZ_2, XYZ_to_sRGB_mat_D50, axes=((2),(1)))

Related post to understand tensordot.
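As a quick sanity check (a minimal sketch, reusing the arrays from the question), the tensordot result matches an explicit per-pixel matrix-vector product:

out = np.tensordot(XYZ_2, XYZ_to_sRGB_mat_D50, axes=((2),(1)))
print(out.shape)  # (100, 100, 3)

# explicit per-pixel loop for comparison
ref = np.empty_like(XYZ_2)
for i in range(XYZ_2.shape[0]):
    for j in range(XYZ_2.shape[1]):
        ref[i, j] = XYZ_to_sRGB_mat_D50 @ XYZ_2[i, j]
print(np.allclose(out, ref))  # True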


For completeness, we can surely use np.matmul too after swapping the last two axes of XYZ_2, like so -

np.matmul(XYZ_to_sRGB_mat_D50, XYZ_2.swapaxes(1,2)).swapaxes(1,2)

This won't be as efficient as the tensordot one, though.
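As a sanity check (again assuming the arrays from the question), both routes give the same result:

via_matmul = np.matmul(XYZ_to_sRGB_mat_D50, XYZ_2.swapaxes(1, 2)).swapaxes(1, 2)
via_tensordot = np.tensordot(XYZ_2, XYZ_to_sRGB_mat_D50, axes=((2), (1)))
print(np.allclose(via_matmul, via_tensordot))  # True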


Runtime test -

In [158]: XYZ_to_sRGB_mat_D50 = np.asarray([
     ...:     [3.1338561, -1.6168667, -0.4906146],
     ...:     [-0.9787684, 1.9161415, 0.0334540],
     ...:     [0.0719453, -0.2289914, 1.4052427],
     ...: ])
     ...: 
     ...: XYZ_1 = np.asarray([0.25, 0.4, 0.1])
     ...: XYZ_2 = np.random.rand(100,100,3)

# @Julien's soln
In [159]: %timeit XYZ_2.dot(XYZ_to_sRGB_mat_D50.T)
1000 loops, best of 3: 450 µs per loop

In [160]: %timeit np.tensordot(XYZ_2, XYZ_to_sRGB_mat_D50, axes=((2),(1)))
10000 loops, best of 3: 73.1 µs per loop

Generally speaking, when it comes to sum-reductions on tensors, tensordot is much more efficient. Since the axis of sum-reduction is just one here, we can flatten the tensor to a 2D array with a reshape, use np.dot, and then reshape the result back to 3D, as sketched below.
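A minimal sketch of that reshape-and-dot route, assuming the arrays defined in the runtime test above (the trailing reshape restores the original leading dimensions):

XYZ_flat = XYZ_2.reshape(-1, 3)                 # (10000, 3): one row per pixel
out_2d = XYZ_flat.dot(XYZ_to_sRGB_mat_D50.T)    # a single 2D BLAS call
out = out_2d.reshape(XYZ_2.shape)               # back to (100, 100, 3)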

Divakar
  • Thank you -- this answer is excellent. It's interesting that tensordot beats regular dot with transpose here, speed-wise. – Brandon Dube Dec 05 '17 at 06:25
  • @BrandonDube My understanding is that for tensors, `dot` is looping at some level (most probably at C level), while `tensordot` avoids it. – Divakar Dec 05 '17 at 06:29
  • Wow! Didn't know that! Any insight as to why changing the shape gives a 10x speedup? That doesn't seem to make any sense... – Julien Dec 05 '17 at 06:35
  • @Julien You mean why the dot with reshape step : `XYZ_2.reshape(-1,3)...` is faster than `XYZ_2.dot(XYZ_to_sRGB_mat_D50.T)`? Because, with the reshaped 2D version, there's no looping for `np.dot`. – Divakar Dec 05 '17 at 06:38
  • but the cost of looping through the shape should be totally negligible compared to the actual computations, right? At the end of the day, 2 nested for loops of 100 should be the same as 1 long 10,000-iteration loop, no? – Julien Dec 05 '17 at 06:41
  • 1
    @Julien `np.dot` leverages BLAS instead of actual looping. So, for this case, without reshape, it makes 100 BLAS calls, whereas with 2D reshaped version just one BLAS call. `tensordot` probably reshapes to `2D` (probably at C level) and then leverages one BLAS call.The slowness comes with the 100 BLAS calls, that's my understanding. – Divakar Dec 05 '17 at 06:45
  • @Divakar -- I looked at this again this morning and made an edit to your answer -- tensordot scrambles the axes, but you can swap them around for free. – Brandon Dube Dec 05 '17 at 16:55
  • @BrandonDube Which axes are to be swapped? – Divakar Dec 05 '17 at 16:55
  • @Divakar -- see my proposed edit -- the output is of shape (3,100,100) and needs to have its axes circshifted. – Brandon Dube Dec 05 '17 at 16:57
  • @BrandonDube We just needed to swap the positions of the inputs there. Edited. – Divakar Dec 05 '17 at 17:10

You probably simply want this:

XYZ_2.dot(XYZ_to_sRGB_mat_D50.T)

np.matmul(XYZ_to_sRGB_mat_D50, XYZ_1) is equivalent to XYZ_1.dot(XYZ_to_sRGB_mat_D50.T), so you can simply broadcast the same operation over a 100 x 100 x 3 array.
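A quick check of that equivalence, and of the broadcast over the full array (a minimal sketch, reusing the arrays from the question):

# 1D case: matmul and dot-with-transpose give the same vector
print(np.allclose(np.matmul(XYZ_to_sRGB_mat_D50, XYZ_1),
                  XYZ_1.dot(XYZ_to_sRGB_mat_D50.T)))  # True

# 3D case: dot applies the same transform to every pixel
out = XYZ_2.dot(XYZ_to_sRGB_mat_D50.T)
print(out.shape)  # (100, 100, 3)
print(np.allclose(out[0, 0], XYZ_to_sRGB_mat_D50 @ XYZ_2[0, 0]))  # True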

Julien