
I've been dabbling in OpenGL and DirectX for the past while, and I've noticed that all transformations are done with matrix-by-matrix and matrix-by-vector multiplication. I think we can all admit that especially matrix by matrix multiplication is not intuitive, and when I learned that multiplying two 4x4 matrices involves 64 multiplications and 48 additions, I wasn't so hard on myself for not understanding them well.

Anyway, I know that matrix and vector multiplication on modern systems is done with SIMD or SSE instructions, reducing the number of operations (or calculations), but many calculations I've seen programmers make seem unnecessary.

For example, if you have a vertex you want to transform, let's just say we want to rotate 45 degrees and then translate (5, 5, 5) locally, the typical way I've seen is the following:

1: Get the identity matrix.

2: Multiply the identity matrix by the rotation matrix.

3: Multiply the resulting matrix by the translation matrix (order matters).

4: Multiply the resulting matrix by the point/vector you want to transform.
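The four steps above can be sketched in plain Python. This is only an illustration under assumptions the question does not state: the rotation is taken to be about the z axis, vectors are treated as columns (so the matrix sits on the left of the vector), and the helper names `identity`, `rotation_z`, `translation`, `mat_mul` and `transform` are made up for this example.

```python
import math

def identity():
    # 4x4 identity matrix as nested lists
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def rotation_z(degrees):
    # Rotation about the z axis (assumed axis; the question does not name one)
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0, 0.0],
            [s,  c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def translation(tx, ty, tz):
    # Column-vector convention: the translation lives in the last column
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def mat_mul(a, b):
    # Plain 4x4 product: 64 multiplications and 48 additions
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def transform(m, v):
    # 4x4 matrix times a 4-component column vector
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

m = identity()                          # step 1
m = mat_mul(m, rotation_z(45))          # step 2
m = mat_mul(m, translation(5, 5, 5))    # step 3
origin = [0.0, 0.0, 0.0, 1.0]
moved = transform(m, origin)            # step 4

# Swapping the order of steps 2 and 3 gives a different result.
other = transform(mat_mul(translation(5, 5, 5), rotation_z(45)), origin)
```

With these conventions `moved` ends up on the rotated axes (the offset (5, 5, 5) is itself rotated by 45 degrees), while `other` is simply (5, 5, 5), which is why the order in step 3 matters.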

If I wanted to translate an object in a certain direction, instead of multiplying its matrix by

{ 1  0  0  translationX }
{ 0  1  0  translationY }
{ 0  0  1  translationZ }
{ 0  0  0      1        }

...couldn't I just add the translations to the appropriate matrix indices, i.e. `matrix[3][0] += translationX;`?

The difference is 3 additions instead of 64 multiplications and 48 additions.

Likewise, say I wanted to translate locally, and not in world space, for example along an object's right vector. Couldn't I multiply the translation vector by the upper-left 3x3 part of the object's world or model matrix to get the translation along the object's local right vector? That would only be a 3x3 matrix times a vector.

So yeah, I've been thinking about this for a while, and I was just wondering if these big matrix by matrix multiplications are entirely unnecessary, at least for some things. Also, I'm aware that scaling adds some complexities, and haven't got my head around the concept of matrices that well yet.

Zebrafish
  • You can certainly optimize special cases, but the specific order of concatenation matters. For simple "rotate & translate" only transformations, it's often best to just use a quaternion and a translation rather than maintain a matrix, but it depends on how general you want your system to be. In practice what matters is not so much how expensive it is to compute the matrix as how many vertices you will be transforming by that matrix. For 'real-world' models, the vertex transformation cost usually dominates. – Chuck Walbourn Oct 31 '16 at 07:03
  • If you consider matrices unintuitive, see my attempt to explain them for rookies in [Understanding 4x4 homogenous transform matrices](http://stackoverflow.com/a/28084380/2521214); maybe that will help a bit. – Spektre Oct 31 '16 at 09:03

2 Answers


> I think we can all admit that especially matrix by matrix multiplication is not intuitive

I completely disagree. First and foremost, when thinking about linear transformations it doesn't make sense to think of matrices as "2D arrays of numbers". The proper way to think of matrices is as operators, in a very general sense.

> Anyway, I know that matrix and vector multiplication on modern systems is done with SIMD or SSE instructions, reducing the number of operations (or calculations), but many calculations I've seen programmers make seem unnecessary.

The rules of matrix multiplication and their necessity are fully determined by the rules of linear algebra. You start out with certain elementary rules for how transformations of vectors from one space to another should behave, and the rules of matrix multiplication arise from there.

What matters is that when you chain a series of transformations, the end result can be coalesced into one single matrix. That's the beauty of these things. No matter how convoluted and complex your transformation setup is, one single matrix does the job. Matrices give you the opportunity for precomputation!

> ...couldn't I just add the translations to the appropriate matrix indices, i.e. `matrix[3][0] += translationX;`?

Not in general. Only if the upper-left part is an identity transformation. The moment that part is non-identity, the translation gets modified by it as well.

I suggest you write out the following result by hand:

M = rot((0,0,1), 90°) · translate(1, 2, 3)

Hint:

                    |  0 -1  0  0 |
rot((0,0,1), 90°) = |  1  0  0  0 |
                    |  0  0  1  0 |
                    |  0  0  0  1 |
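The exercise can also be checked numerically. The sketch below is an illustration, not part of the original answer: it uses the same column-vector layout as the matrix above (translation in the last column), and the names `mat_mul`, `translation`, `full`, `naive`, `local` and `world` are made up for this example. It compares the full product with two shortcuts: adding the translation straight into the last column, and adding the translation after multiplying it by the upper-left 3x3 part.

```python
def mat_mul(a, b):
    # Plain 4x4 matrix product on nested lists
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translation(tx, ty, tz):
    # Translation in the last column (column-vector convention)
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

ROT_Z_90 = [[0, -1, 0, 0],
            [1,  0, 0, 0],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

t = (1, 2, 3)

# The product from the exercise: M = rot((0,0,1), 90°) · translate(1, 2, 3)
full = mat_mul(ROT_Z_90, translation(*t))

# Shortcut A: add t directly into the translation column (3 additions).
naive = [row[:] for row in ROT_Z_90]
for r in range(3):
    naive[r][3] += t[r]

# Shortcut B: multiply t by the upper-left 3x3 first, then add
# (9 multiplications, 9 additions).
local = [row[:] for row in ROT_Z_90]
for r in range(3):
    local[r][3] += sum(ROT_Z_90[r][k] * t[k] for k in range(3))

# Shortcut A is what you get from the opposite order: translate · rot.
world = mat_mul(translation(*t), ROT_Z_90)

print([row[3] for row in full])   # translation column of M
print(local == full, naive == full, naive == world)
```

Under these assumptions shortcut B reproduces `full`, whose translation column is (-2, 1, 3), while shortcut A yields (1, 2, 3), which matches `translate · rot` instead. So the plain addition is only interchangeable with the product on the right when the upper-left part is the identity.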

> So yeah, I've been thinking about this for a while, and I was just wondering if these big matrix by matrix multiplications are entirely unnecessary, at least for some things.

The funny thing is that once you hit a certain depth of transformation levels, matrices quickly win out over chaining individual base-vector operations.

But here's the thing: Don't think of matrices as 2D arrays of numbers. That's just a way to write down linear transformations.

datenwolf
  • Isn't that rotation matrix you've written wrong? Shouldn't that be a negative 1 in the first row second column? If so, then the last column shows the correct translation, or position (-2, 1, 3). – Zebrafish Oct 31 '16 at 14:41
  • @TitoneMaurice: uh, yes, that was indeed the wrong sign. Just assume that there was a scale(-1,1,1) somewhere there, too. I fixed it. But that doesn't change the fact, that when concatenating a translation with a rotation the translation has to be transformed with the rotation. And if you chain enough transformations it's getting tedious. – datenwolf Oct 31 '16 at 17:03
  • Thanks for your help. I understand that the beauty of matrices is that you can concatenate them as you go along and then multiply the final vertex, giving the same result as if you had done a separate transform at each step. But I have to say it is totally not tedious to use a simpler method of working another transformation into a matrix if it's going to save 40 or 50 multiplications each time you do it. Chuck mentioned that you are usually worried about how many vertices you have rather than how many times you calculate a transformation. – Zebrafish Nov 01 '16 at 00:56
  • But for a simulation or particle system I gotta say using a simple method of adding another transform to a matrix and saving 40 or 50 multiplications each time you do it is not an unimportant consideration. That is, provided it gives the same result. I just don't get that my question was downvoted when the point I made was a question about the unnecessary extra computation made in, well mainly D3D and OpenGL tutorials. And I still haven't been shown an example of when it wouldn't work. For example to add that scale transform would it seems to me still result in the correct position. – Zebrafish Nov 01 '16 at 01:04
  • I'm wrong, I got completely wrong results when adding the (-1, 1, 1) scale to it. Scaling seems to throw it off. – Zebrafish Nov 01 '16 at 01:32
1
  • It is a premature optimization. Small matrix multiplications (i.e. for 2D or 3D graphics) are cheap enough in most cases that we don't need to think about them.

  • No, matrices are not the only way to represent transformations. Another really nice representation, alas less popular, is a quaternion for rotation plus a vector for translation. It cannot express some of the transformations possible with a 3x4 (or 4x4) matrix, but it is more compact, more numerically stable, easier to interpolate and sometimes cheaper to work with. I'm surprised that you call matrices 'unintuitive', but if so quaternions might be even harder to grasp.

  • The point of those transformation representations (matrices or quaternions) is that they can be composed. What happens in practice is that you compute some transformation as a composition of multiple transformations and then apply that to every vertex of a model, say. Consider the case where a viewer in a flying helicopter looks at a tank with a rotating turret. To render the turret you have to apply at least three rotations and translations to transform the vertices from the model space of the turret to the viewer coordinates. Applying each of the transformations individually is costly compared to precomputing the whole chain into one matrix and then applying that to each vertex at a mere cost of 9 additions and 9 multiplications per vertex (this is the cost of a non-projective matrix-vector multiplication).
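The turret example can be sketched as follows. This is a minimal illustration, assuming the "3x3 matrix + translation vector" representation; all angles, offsets, vertices and helper names (`rot_z`, `apply`, `compose`) are hypothetical values chosen for the example.

```python
import math

def rot_z(degrees):
    # 3x3 rotation about the z axis
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply(xf, p):
    # Apply an affine transform (r, t) to a point:
    # 9 multiplications and 9 additions
    r, t = xf
    return tuple(sum(r[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))

def compose(outer, inner):
    # Build the single transform equivalent to outer(inner(p))
    ro, to = outer
    ri, ti = inner
    r = [[sum(ro[i][k] * ri[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = tuple(sum(ro[i][k] * ti[k] for k in range(3)) + to[i] for i in range(3))
    return (r, t)

# Hypothetical chain: turret -> tank -> world -> viewer
turret_to_tank = (rot_z(30), (0.0, 0.0, 1.5))
tank_to_world = (rot_z(90), (10.0, 20.0, 0.0))
world_to_view = (rot_z(-45), (-3.0, 4.0, -50.0))

vertices = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5), (-1.0, 1.0, 3.0)]

# Applying every transform to every vertex: three applications per vertex
slow = [apply(world_to_view, apply(tank_to_world, apply(turret_to_tank, v)))
        for v in vertices]

# Precomputing the chain once, then one application per vertex
turret_to_view = compose(world_to_view, compose(tank_to_world, turret_to_tank))
fast = [apply(turret_to_view, v) for v in vertices]
```

Both lists hold the same points up to floating-point rounding. The composition cost is paid once per object, while the per-vertex cost stays at a single application regardless of how long the chain is.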

Yakov Galka
  • Wrapping all the transforms in the one matrix, for example the model/world matrix by the camera/view matrix by the perspective/orthographic matrix and then multiplying by the vertex, is two matrix by matrix multiplications and one matrix by vector multiplication. Doing it separately ends up being 3 matrix by vector multiplications. That's 48 multiplications for doing it separately versus 144 multiplications for wrapping up all the transforms in one matrix beforehand. Is there something I'm missing? – Zebrafish Oct 31 '16 at 14:16
  • @TitoneMaurice: Yes. You almost always have more than one vertex to transform. So in effect it's `12*(M-1) + 9*N` versus `9*M*N` scalar multiplications, where `M` is the number of matrices I apply and `N` is the number of vertices, which can easily be in the thousands. This is assuming non-perspective matrices (I'm not sure how you got your numbers). – Yakov Galka Oct 31 '16 at 14:23
  • Oh sorry, that's right, because by premultiplying the world-view-projection matrix you only need to recompute it for each object. Yeah, I got mixed up. It's basically one WVP matrix (as it's named in my program) times the vector for each vertex. Now that is efficient. Actually I think I had split the view and projection matrices from the model/world matrix for when I needed the vertex's world position to do lighting. – Zebrafish Oct 31 '16 at 14:48
  • @TitoneMaurice: true, the projection matrix is usually applied separately for exactly that reason. And in fact you can get along with just two scalars (flenx, fleny) rather than the entire projection matrix. This is why all my above calculations assume a representation of 3x3 matrix + translation vector, because you simply don't need the projective part. BTW, it is usually called MVP (stands for MODEL, view, projection). – Yakov Galka Oct 31 '16 at 14:53
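The comparison in the comments above can be tabulated with a few lines of Python. This sketch takes the figures `12*(M-1) + 9*N` and `9*M*N` from the comment at face value; the constants are exposed as parameters (`compose_cost`, `per_vertex_cost`, both hypothetical names) so that other figures can be substituted.

```python
def precomputed_cost(num_matrices, num_vertices, compose_cost=12, per_vertex_cost=9):
    # Combine the chain once, then one matrix-vector product per vertex.
    # compose_cost=12 is the per-composition figure quoted in the comment.
    return compose_cost * (num_matrices - 1) + per_vertex_cost * num_vertices

def separate_cost(num_matrices, num_vertices, per_vertex_cost=9):
    # Apply every matrix to every vertex individually.
    return per_vertex_cost * num_matrices * num_vertices

for n in (1, 10, 1000):
    print(n, precomputed_cost(3, n), separate_cost(3, n))
```

With three matrices and these figures, a single vertex costs 33 multiplications precomputed versus 27 separately, but at 1000 vertices it is 9024 versus 27000, which is the point about vertex count dominating.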