Matrices - translation and rotation

22 comments, last by JohnnyCode 4 years, 10 months ago
I've been playing with graphics programming for years, but to this day, I really don't understand what part of a left handed 4x4 matrix represents a rotation or a translation and how they do it. I just know you multiply matrices together to transform coordinates from one system to another. What is the connection between the vector on the left side of the equation, the matrix and each of its individual elements, and the vector on the right side of the equation?
 
How does [x, y, z] * [x1, y1, z1, w1]
                     [x2, y2, z2, w2]
                     [x3, y3, z3, w3]
                     [x4, y4, z4, w4]
 
Make my spaceship turn and move? I just take it for granted that it does and now it's bothering me.
What does y3 actually represent? What does z1 represent, etc.?

I've read oodles of graphics programming and linear algebra books and I still really don't understand what magic happens, and how, when I multiply a vector by a matrix or multiply two matrices together.

Hi, I would suggest you have a look at this link; it includes a pretty good explanation of the math behind it

http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/

What really helped me to understand this was to think of a triangle with three points, and apply these matrices to each of them.

For instance if you apply the scaling matrix you will see that the result is the same triangle with different scale. I would read this article as well, having a better understanding of the homogeneous space is really helpful

https://www.tomdalling.com/blog/modern-opengl/explaining-homogenous-coordinates-and-projective-geometry/

Hope this helps

4 hours ago, Fleshbits said:
I've been playing with graphics programming for years, but to this day, I really don't understand what part of a left handed 4x4 matrix represents a rotation or a translation and how they do it.

A matrix by itself doesn't have a property of 'handedness' per se (at least not to my knowledge), and the example in your post doesn't include anything that would indicate handedness - perhaps you're thinking of row vs column vectors?

[Edit: To be clear, handedness can come into play when working with certain transforms in certain contexts, but I suspect the term is maybe misapplied in your post above.]

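In short, mat * vec (column vector on the right) is something like

new_vec.x = dot(vec, matrow1);

new_vec.y = dot(vec, matrow2);

etc. For vec * mat (row vector on the left), use the matrix columns instead of the rows.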

 

Depending on which notation you use, row-major or column-major, you have something like the following representation of rotations etc.:


    AIR_MATRIX[0]  = rr.x;   // right vector
    AIR_MATRIX[1]  = rr.y;
    AIR_MATRIX[2]  = rr.z;
    AIR_MATRIX[3]  = 0.0;
    AIR_MATRIX[4]  = ru.x;   // up vector
    AIR_MATRIX[5]  = ru.y;
    AIR_MATRIX[6]  = ru.z;
    AIR_MATRIX[7]  = 0.0;
    AIR_MATRIX[8]  = -rf.x;  // front vector (negated)
    AIR_MATRIX[9]  = -rf.y;
    AIR_MATRIX[10] = -rf.z;
    AIR_MATRIX[11] = 0.0;
    AIR_MATRIX[12] = 0.0;    // point (position); better not to use it
    AIR_MATRIX[13] = 0.0;
    AIR_MATRIX[14] = 0.0;
    AIR_MATRIX[15] = 1.0;

But it goes far beyond that.

There is also translation and scale information, but truth be told, you decide what you put where and what operation you perform. Anyway, someone smart enough already packed the data into matrices, so you can just use them...

Hi there,

the best thing you can do to understand stuff is to write down explicitly what is happening. So let's do that for a 3x3 matrix and corresponding vector.

If you multiply a matrix A and a vector x you get the result vector r:


A*x=r

If you write it down with values you would get:


|a0 b0 c0|   |x0|   |r0|
|a1 b1 c1| * |x1| = |r1|
|a2 b2 c2|   |x2|   |r2|

Or if you use the row notation it would be:


             | a0 a1 a2 |
             | b0 b1 b2 |
|x0 x1 x2| * | c0 c1 c2 | = | r0 r1 r2 |

Notice that the difference between both notations is, that you have row vectors instead of column vectors, the vector is now to the left of the matrix and the values of the matrix are flipped by exchanging rows and columns. If you see the vectors as 3x1 matrices, you have also just flipped rows and columns. This "flipping" is called transposing a matrix, but we won't bother with that anymore. It is just important for you to realize, that both notations I showed you give the exact same results if you compare the values of the equation. I favour the column notation and will use it, from now on.
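
To make the claim about the two notations concrete, here is a minimal C++ sketch. The names `Vec3`, `Mat3`, `mulColumn`, `mulRow` and `transposed` are made up for this illustration and do not come from any particular library. It multiplies in the column notation, then in the row notation with the transposed matrix, so both results can be compared component by component.

```cpp
#include <array>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // m[row][col]

// Column notation: r = A * x, with r[i] = sum over j of A[i][j] * x[j].
Vec3 mulColumn(const Mat3& a, const Vec3& x) {
    Vec3 r{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            r[i] += a[i][j] * x[j];
    return r;
}

// Row notation: r = x * B, with r[j] = sum over i of x[i] * B[i][j].
Vec3 mulRow(const Vec3& x, const Mat3& b) {
    Vec3 r{};
    for (std::size_t j = 0; j < 3; ++j)
        for (std::size_t i = 0; i < 3; ++i)
            r[j] += x[i] * b[i][j];
    return r;
}

// Exchange rows and columns.
Mat3 transposed(const Mat3& a) {
    Mat3 t{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            t[j][i] = a[i][j];
    return t;
}
```

For any matrix `a` and vector `x`, `mulColumn(a, x)` and `mulRow(x, transposed(a))` should hold the same three numbers; only the way they are written down differs.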

Okay, what you know so far is that you multiply a vector with a matrix to get a transformed vector. So let's have a look at what is actually calculated. As @_WeirdCat_ already mentioned, the matrix-vector (or matrix-matrix) product calculates the dot products of the rows of the left-hand matrix/vector with the columns of the right-hand matrix/vector. If you visualize it:


             |x0|
             |x1|
             |x2|

|a0 b0 c0|   |r0|   |dot(row0ofA, x)|   |a0*x0 + b0*x1 + c0*x2 |
|a1 b1 c1|   |r1| = |dot(row1ofA, x)| = |a1*x0 + b1*x1 + c1*x2 |
|a2 b2 c2|   |r2|   |dot(row2ofA, x)|   |a2*x0 + b2*x1 + c2*x2 |

If you remove some stuff you get


|r0|   |a0*x0 + b0*x1 + c0*x2|
|r1| = |a1*x0 + b1*x1 + c1*x2|
|r2|   |a2*x0 + b2*x1 + c2*x2|

These are basically just 3 equations for the 3 values of the new vector r:


r0 = a0*x0 + b0*x1 + c0*x2 
r1 = a1*x0 + b1*x1 + c1*x2 
r2 = a2*x0 + b2*x1 + c2*x2

Each value of the result vector is a weighted sum of all values from the original vector. All a matrix does is provide the weighting factors.

To get a better understanding, let us look at the basic operations that we use in computer graphics:

Let's say we want to scale an object by 2 along the y-Axis, then you would just multiply the y-component by 2:


r0 = x0
r1 = 2*x1 
r2 = x2

So writing it down as a weighted sum of all components gives:


r0 = 1*x0 + 0*x1 + 0*x2 
r1 = 0*x0 + 2*x1 + 0*x2 
r2 = 0*x0 + 0*x1 + 1*x2

Or as matrix notation:


|1 0 0|   |x0|   |r0|
|0 2 0| * |x1| = |r1|
|0 0 1|   |x2|   |r2|

Not so hard, right?
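
The weighted-sum view can also be written as a few lines of C++. This is only a sketch under the column notation used here; `Vec3`, `Mat3`, `dot`, `mul` and `scaleY2` are hypothetical names chosen for the example.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // m[i] is row i of the matrix

// Dot product: the weighted sum of one matrix row with the vector.
double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// r = A * x: each result component is dot(row i of A, x).
Vec3 mul(const Mat3& a, const Vec3& x) {
    return {dot(a[0], x), dot(a[1], x), dot(a[2], x)};
}

// The scaling matrix from above: factor 2 along the y-axis.
Mat3 scaleY2() {
    return {{{1, 0, 0},
             {0, 2, 0},
             {0, 0, 1}}};
}
```

Calling `mul(scaleY2(), x)` leaves the first and third components of `x` untouched and doubles the second one, which is exactly the three equations written out above.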

So, now you want to rotate around the z-axis. That means the z values remain constant, but the x and y values change. If you rotate by a certain angle P, you get the following formulas for each component:


r0 = cos(P)*x0 - sin(P)*x1 
r1 = sin(P)*x0 + cos(P)*x1  
r2 = x2

Just look up why you have to use these combinations of sine and cosine: https://en.wikipedia.org/wiki/Rotation_matrix

So as weighted sum we get:


r0 = cos(P)*x0 + -sin(P)*x1 + 0*x2 
r1 = sin(P)*x0 +  cos(P)*x1 + 0*x2 
r2 =      0*x0 +       0*x1 + 1*x2

Or as a matrix:


|cos(P) -sin(P) 0|   |x0|   |r0|
|sin(P)  cos(P) 0| * |x1| = |r1|
|     0      0  1|   |x2|   |r2|
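
As a rough C++ sketch of the same rotation (the names `rotationZ` and `mul` are assumptions for this example, and the angle is taken in radians):

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // m[i] is row i of the matrix

// Rotation about the z-axis by angle p (radians), column notation.
Mat3 rotationZ(double p) {
    const double c = std::cos(p);
    const double s = std::sin(p);
    return {{{c, -s, 0},
             {s,  c, 0},
             {0,  0, 1}}};
}

// r = A * x as three weighted sums.
Vec3 mul(const Mat3& a, const Vec3& x) {
    Vec3 r{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            r[i] += a[i][j] * x[j];
    return r;
}
```

Rotating the x-axis vector (1, 0, 0) by a quarter turn with `mul(rotationZ(pi / 2), {1, 0, 0})` lands on the y-axis, up to floating-point rounding, while the z component stays exactly as it was.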

I think you get the pattern now. But what about translations? Well, usually you just add the translation to the old vector:


r0 = x0 + t0
r1 = x1 + t1
r2 = x2 + t2

The problem here is, that this does not fit into the pattern we used so far. The translations are not multiplied by any component of x. So how do we include them into matrix-vector multiplication? Well, the trick is to extend the matrix dimension.

 


|a0 b0 c0 d0|   |x0|   |r0|
|a1 b1 c1 d1| * |x1| = |r1|
|a2 b2 c2 d2|   |x2|   |r2|
|a3 b3 c3 d3|   |x3|   |r3|

So your sum becomes:


r0 = a0*x0 + b0*x1 + c0*x2 + d0*x3
r1 = a1*x0 + b1*x1 + c1*x2 + d1*x3
r2 = a2*x0 + b2*x1 + c2*x2 + d2*x3
r3 = a3*x0 + b3*x1 + c3*x2 + d3*x3

Remember that you always had to set the fourth component of your vector (x3) to 1? Well then you get:


r0 = a0*x0 + b0*x1 + c0*x2 + d0
r1 = a1*x0 + b1*x1 + c1*x2 + d1
r2 = a2*x0 + b2*x1 + c2*x2 + d2
r3 = a3*x0 + b3*x1 + c3*x2 + d3

To make sure the fourth component of the new vector is also 1, the factors a3, b3 and c3 need to be zero (and d3 needs to be 1):

 


r0 = a0*x0 + b0*x1 + c0*x2 + d0
r1 = a1*x0 + b1*x1 + c1*x2 + d1
r2 = a2*x0 + b2*x1 + c2*x2 + d2
r3 = 0 *x0 + 0 *x1 + 0 *x2 + d3

 

So to create a translation matrix for


r0 = x0 + t0
r1 = x1 + t1
r2 = x2 + t2
r3 = x3        (= 1)

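you set the upper-left 3x3 part to the identity (a0 = b1 = c2 = 1, all other a, b and c factors 0), d3 = 1 and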


d0 = t0
d1 = t1
d2 = t2

So you get:


r0 = 1*x0 + 0*x1 + 0*x2 + t0*x3  // REMEMBER x3 = 1 !!!
r1 = 0*x0 + 1*x1 + 0*x2 + t1*x3
r2 = 0*x0 + 0*x1 + 1*x2 + t2*x3
r3 = 0*x0 + 0*x1 + 0*x2 + 1 *x3

Or as a matrix:


|1 0 0 t0|   |x0|   |r0|   // REMEMBER: x3 = 1!!!
|0 1 0 t1| * |x1| = |r1|
|0 0 1 t2|   |x2|   |r2|
|0 0 0 1 |   |x3|   |r3|
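
Here is a small C++ sketch of that translation matrix in action. `Vec4`, `Mat4`, `translation` and `mul` are illustrative names, not an existing API. The remark about a fourth component of 0 is an addition that is not part of the derivation above: with x3 = 0 the t terms are multiplied by zero, so the vector is not moved, which is how directions are commonly handled.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;  // m[i] is row i of the matrix

// Identity in the upper-left 3x3 part, translation in the last column.
Mat4 translation(double t0, double t1, double t2) {
    return {{{1, 0, 0, t0},
             {0, 1, 0, t1},
             {0, 0, 1, t2},
             {0, 0, 0, 1}}};
}

// r = A * x as four weighted sums.
Vec4 mul(const Mat4& a, const Vec4& x) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[i] += a[i][j] * x[j];
    return r;
}
```

With `translation(10, 20, 30)`, the point (1, 2, 3, 1) becomes (11, 22, 33, 1), while (1, 2, 3, 0) comes back unchanged.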

So that's basically all you need to know and I hope you understand it now.

 

Greetings

^^^^^ WOW. Thank you so much!

"Fortunately" I never learnt about matrices, I figured out a solution for rotating points that later turned out to be matrix multiplication.

You can imagine it by having 3 unit vectors aligned to the major axes. If you want to rotate a point, just rotate those 3 vectors, and then combine the corresponding vector components with the point's in question.

eg: take the X component of all the axis vectors and multiply them by the point's X component, and the sum will be the rotated X, and so on. res.x = mtx.x.x * pos.x + mtx.y.x * pos.y + mtx.z.x * pos.z;

This is a 3x3 rotation matrix. And yes they are not just arbitrary 9 numbers or a 3x3 array of floats. They represent 3 vectors, each of them pointing along the corresponding axis.

As long as they have unit length they simply rotate. And you can guess, if you scale the X axis by 2 that will result in scaling along X axis by 2 :)

Thus, I never used the float[4][4] representation because they are a kind of meaningless pack of numbers for me.
I use 3x3 rotation matrix, position and scaling separated. I've been using this format for more than 20 years and never had the idea to change :)

If you want to use 4x4 matrices for anything other than rendering, I'd encourage you not to do so. On the logic side they are just a pain.
For final rendering you must have a 4x4 matrix anyway, but from the format above you can build it in no time: just copy the values to the right place in the 4x4 matrix and you are done, and you probably have to do it only once.

If you look at a 4x4 matrix closely you can notice that the top-left 3x3 area is this rotation+scaling part. (3 scaled axis vectors)
The last row/column (depending on your preference) is the translation (position)

Just see an identity matrix:
[ 1 0 0 0 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ]
[ 0 0 0 1 ]


A matrix that rotates 90° around Y and moves 5 to the right(X), and 2 up (Y) :
[ 0 0 -1 0 ]
[ 0 1  0 0 ]
[ 1 0  0 0 ]
[ 5 2  0 1 ]

In both matrices the numbers in the top-left 3x3 block are the rotation vectors.

If you expand the transformation for a 4x4 matrix by a vector[4]

res.x = mtx.x.x * pos.x + mtx.y.x * pos.y + mtx.z.x * pos.z + mtx.w.x * pos.w;

The last multiplication (mtx.w.x * pos.w) is the translation part, the rest is the rotation. I hope I didn't screw it up :)
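
For reference, the expansion can be completed for all four components. The sketch below assumes a layout in which `mtx.x`, `mtx.y` and `mtx.z` are the axis vectors and `mtx.w` is the translation row, matching the `res.x` line above; the struct and function names are made up for this example.

```cpp
struct Vec4 { double x, y, z, w; };

// Row-vector layout: x, y, z are the (rotated) axis vectors,
// w is the translation row.
struct Mat4 { Vec4 x, y, z, w; };

// res = pos * mtx, written out component by component.
Vec4 transform(const Vec4& pos, const Mat4& mtx) {
    return {
        mtx.x.x * pos.x + mtx.y.x * pos.y + mtx.z.x * pos.z + mtx.w.x * pos.w,
        mtx.x.y * pos.x + mtx.y.y * pos.y + mtx.z.y * pos.z + mtx.w.y * pos.w,
        mtx.x.z * pos.x + mtx.y.z * pos.y + mtx.z.z * pos.z + mtx.w.z * pos.w,
        mtx.x.w * pos.x + mtx.y.w * pos.y + mtx.z.w * pos.z + mtx.w.w * pos.w,
    };
}

// The example matrix from above: 90 degrees around Y, then 5 in X and 2 in Y.
Mat4 exampleMatrix() {
    return {{0, 0, -1, 0},
            {0, 1,  0, 0},
            {1, 0,  0, 0},
            {5, 2,  0, 1}};
}
```

Feeding the point (1, 0, 0, 1) through `exampleMatrix()` gives (5, 2, -1, 1): the x-axis is rotated onto (0, 0, -1) and then the translation (5, 2, 0) is added.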

 

There are more benefits:

  • You can never tell if a float[4][4] matrix is row or column major, and always a riddle how to get the "up" vector.
  • If you don't put scale in the rotation and keep the position separated, you'll never need to decompose. eg: need a position 3 meters in front of the camera? : pos = camera.mtx.z * 3 + camera.pos  (I use Y-up because I'm old ;) )
  • If you use orthogonal 3x3 rotation matrices, you can forget about matrix inverse. In 99% of the cases the matrix inverse results in simple transpose that is FREE compared to any SSE optimized fancy stuff. Most math libraries don't implement a multiply by transpose function, so you must rely on inverse.
  • My representation is left-handed row-major, I use directx so I love it :)
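
A possible shape for such a "rotation + position" representation, sketched in C++. All names here (`Transform`, `toWorld`, `toLocal`) are hypothetical, and the axes are assumed to be orthonormal with no scale baked in. It shows both the "3 meters in front of the camera" case and why the inverse rotation reduces to dot products, i.e. a multiplication by the transpose.

```cpp
struct Vec3 { double x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Three unit axis vectors (assumed orthonormal, no scale) plus a position.
struct Transform {
    Vec3 x, y, z;  // local axes expressed in world space
    Vec3 pos;
};

// Local -> world: combine the axes weighted by the point's components,
// then add the position.
Vec3 toWorld(const Transform& t, Vec3 p) {
    return t.x * p.x + t.y * p.y + t.z * p.z + t.pos;
}

// World -> local: for orthonormal axes the inverse rotation is the
// transpose, which boils down to three dot products.
Vec3 toLocal(const Transform& t, Vec3 p) {
    const Vec3 d = p - t.pos;
    return {dot(d, t.x), dot(d, t.y), dot(d, t.z)};
}
```

With a camera transform `cam`, the point three units ahead is `cam.z * 3 + cam.pos`, and `toLocal(cam, thatPoint)` should give back (0, 0, 3) without any general matrix inverse.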

 

This is just a general purpose matrix, for projection you'll have to dig deeper.
And of course this is just my personal taste, I'd guess most people would argue.

50 minutes ago, bmarci said:

You can never tell if a float[4][4] matrix is row or column major, and always a riddle how to get the "up" vector.

My representation is left-handed row-major, I use directx so I love it

Just a minor point: is it possible you're talking about row vs column vectors rather than row- vs column-major? (Maybe it could be either, but row vs column vectors seems to make a little more sense in the given context.)

 

4 hours ago, bmarci said:

This is a 3x3 rotation matrix. And yes they are not just arbitrary 9 numbers or a 3x3 array of floats.

 

4 hours ago, bmarci said:

Thus, I never used the float[4][4] representation because they are a kind of meaningless pack of numbers for me.

 

4 hours ago, bmarci said:

If you look at a 4x4 matrix closely you can notice that the top-left 3x3 area is this rotation+scaling part.

Honestly, I got a little bit confused. A 3x3 matrix is totally logical for you and a 4x4 matrix is not, even though you know that the 4x4 is just an extended version of the same 3x3? I mean, even in your "vector thinking" format it is the same, just extended to a fourth dimension.

 

4 hours ago, bmarci said:

If you want to use 4x4 matrices for other than rendering I'd encourage you not to do so. In the logic side they are just pain.

While I agree, that it usually does not make sense to use more dimensions than necessary, there are a lot of mathematical tricks/optimizations that are based on adding another dimension. Just a little fun fact: In 4-dimensional space, every 3d translation can be represented by 2 rotations and 1 scaling (https://en.wikipedia.org/wiki/Singular_value_decomposition). It doesn't make sense here, but it is not like people are doing this stuff just for being fancy.;) If I remember correctly, NURBS are also some kind of projection from 4d to 3d. The key factor here is, that it is beneficial to use things only if you know why you are using them and not just because everybody else does it.

Regarding your benefits:

4 hours ago, bmarci said:

You can never tell if a float[4][4] matrix is row or column major, and always a riddle how to get the "up" vector.

How do you tell if a 3x3 matrix is a column or row-major? Treating a row-major matrix accidentally as column-major is like transposing the matrix. For rotations, this would flip the direction of the rotation.
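
The "flipped direction" effect is easy to check numerically. The following C++ sketch uses made-up helper names (`rotationZ`, `transposed`) and compares the transpose of a z-rotation by P with a z-rotation by -P.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Mat3 = std::array<std::array<double, 3>, 3>;  // m[row][col]

// Rotation about the z-axis by angle p (radians), column notation.
Mat3 rotationZ(double p) {
    const double c = std::cos(p);
    const double s = std::sin(p);
    return {{{c, -s, 0},
             {s,  c, 0},
             {0,  0, 1}}};
}

// Reading a row-major matrix as column-major has the same effect as this.
Mat3 transposed(const Mat3& a) {
    Mat3 t{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            t[j][i] = a[i][j];
    return t;
}
```

Up to rounding, `transposed(rotationZ(p))` and `rotationZ(-p)` contain the same entries, because transposing only swaps the two sine terms.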

4 hours ago, bmarci said:

If you don't put scale in the rotation and keep the position separated, you'll never need to decompose. eg: need a position 3meters in front of the camera? : pos = camera.mtx.z * 3 + camera.pos  (I use Y-up because I'm old ;) ) 

Like always, it depends on what you want to do. If you don't need the translation vector separately, this is not really a benefit but another variable you need to synchronize between GPU and CPU.

Another important aspect is performance. A single rotation with subsequent translation is slightly cheaper as a 3x3 multiplication plus an addition than as a 4x4 multiplication that combines both operations. But if another rotation and translation follow, the 4x4 matrix is superior because you can "stack" all operations: you compute the combined matrix once on the CPU and use it during multiple shader executions. With 3x3 matrices, this is not possible as soon as you get a single translation in your sequence (if it is not the last operation).
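
A minimal C++ sketch of this stacking, with hypothetical helper names (`transform`, `compose`, `rotationZ90`, `translation`) and the column notation: the sequence rotate, translate, rotate, translate is folded into one 4x4 matrix, which can then be applied to any number of points.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;  // m[row][col], column notation

// r = M * v
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// C = A * B, i.e. "apply B first, then A".
Mat4 compose(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// 90-degree rotation about z, written with exact entries (cos = 0, sin = 1).
Mat4 rotationZ90() {
    return {{{0, -1, 0, 0},
             {1,  0, 0, 0},
             {0,  0, 1, 0},
             {0,  0, 0, 1}}};
}

Mat4 translation(double t0, double t1, double t2) {
    return {{{1, 0, 0, t0},
             {0, 1, 0, t1},
             {0, 0, 1, t2},
             {0, 0, 0, 1}}};
}
```

For example, `compose(T2, compose(R, compose(T1, R)))` yields a single matrix whose effect on a point matches applying R, T1, R and T2 one after another. Note the order: in column notation the operation applied first sits rightmost.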

4 hours ago, bmarci said:

If you use orthogonal 3x3 rotation matrices, you can forget about matrix inverse.

You can do exactly the same with a pure 4x4 rotation matrix since it is also orthogonal. It is only a matter of not mixing the rotations that need to be inverted with other operations (just keep a copy of the rotations that need to be inverted). In the 4x4 world, it just requires more discipline because you can combine a rotation matrix with a translation matrix, which destroys orthogonality. But the same is true for combining scaling with rotations. It's just that scaling is used less frequently than translation, so the problem occurs much less often in the 3x3 world.

4 hours ago, bmarci said:

In 99% of the cases the matrix inverse results in simple transpose that is FREE compared to any SSE optimized fancy stuff.

I am not sure what SSE has to do with it. I have implemented my own LA library based on SSE but with serial fallback option. First of all: nothing is ever free, even if you think it is. If you mean, that multiplication by the Transposed is simply switching some indices in matrix-vector multiplication, then a benchmark might surprise you, that one version is slower than the other. The problem here is that the memory access pattern changes. This "might" still be better than a "naive" SSE implementation where the matrix is actually transposed before multiplication, but I think if you turn off auto-vectorization, the SSE version would still be faster.

4 hours ago, bmarci said:

Most math libraries don't implement a multiply by transpose function, so you must rely on inverse.

I don't think you have to. I can't speak for every library out there, but I guess many do not distinguish between row vectors and column vectors. If the vector is to the right of the matrix, it is treated as a column vector, and if it is to the left, it is treated as a row vector. Even if they do distinguish, transposing a vector should be a no-op in an optimized library. So you want to undo a rotation by performing:


x * transposed(A) = r

Then there are these mathematical properties:


A = transposed(transposed(A))

transposed(A*B) = transposed(B) * transposed(A)

so you get:


transposed(transposed(A)) * transposed(x)  = transposed(r)
--->
A * transposed(x)  = transposed(r)

In case your library does not distinguish between row and column vectors, you just switch the matrix and the vector positions to get your result. If it does, you additionally have to transpose 2 vectors, which, as I said, should be a no-op in an optimized library. By the way, an SSE library can also use this trick. ;)
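
As an illustration of the trick, here is a C++ sketch with made-up function names (`mulMatVec`, `mulVecMat`, `rotationZ`). It rotates a vector with the matrix on the left and then undoes the rotation by putting the same matrix on the right, so no transposed copy is ever built. It assumes A is a pure rotation, so that its transpose is its inverse.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // m[row][col]

// r = A * x: vector on the right, treated as a column vector.
Vec3 mulMatVec(const Mat3& a, const Vec3& x) {
    Vec3 r{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            r[i] += a[i][j] * x[j];
    return r;
}

// r = x * A: vector on the left, treated as a row vector.
// Component-wise this equals transposed(A) * x.
Vec3 mulVecMat(const Vec3& x, const Mat3& a) {
    Vec3 r{};
    for (std::size_t j = 0; j < 3; ++j)
        for (std::size_t i = 0; i < 3; ++i)
            r[j] += x[i] * a[i][j];
    return r;
}

// Rotation about the z-axis by angle p (radians).
Mat3 rotationZ(double p) {
    const double c = std::cos(p);
    const double s = std::sin(p);
    return {{{c, -s, 0},
             {s,  c, 0},
             {0,  0, 1}}};
}
```

So `mulVecMat(mulMatVec(A, x), A)` should return `x` again, up to rounding, for any rotation matrix `A`.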

4 hours ago, bmarci said:

And of course this is just my personal taste, I'd guess most people would argue.

Well, if you just look at the coding aspects, it might be a matter of taste. But many "standard" approaches were based on performance decisions; we just don't realize it anymore since modern computers and smartphones are so fast that most games won't run slower because your matrix-vector multiplications take 2 times longer. So as long as you're not aiming for maximal performance, do whatever you like. ;)

Greetings

 

21 hours ago, Zakwayda said:

Just a minor point: is it possible you're talking about row vs column vectors rather than row- vs column-major?

Row vs column major! It always confuses me if I see a matrix and don't know which 3 numbers represent a certain direction.
For me it's much more clear if I see a vector component in a transformation and I instantly know that points up in local space :)

 

17 hours ago, DerTroll said:

How do you tell if a 3x3 matrix is a column or row-major? Treating a row-major matrix accidentally as column-major is like transposing the matrix. For rotations, this would flip the direction of the rotation.

That's the point, I don't, I don't even want to care, I don't want "accidents", whenever I had to hunt for a bug, that has never been because I accidentally took a matrix in a wrong order :) In my vector thinking these kind of mistakes are simply not possible.

 

18 hours ago, DerTroll said:

Like always, it depends on what you want to do. If you don't need the translation vector separately, this is not really a benefit but another variable you need to synchronize between GPU and CPU.

Of course, if I have a matrix only for storage and I don't ever calculate anything with it, there is no point in separation.
But in most cases, even in AAA games, we had everything in 4x4 matrices because the sacred GPU likes it that way, and the game code looked like a mess, full of matrix decompositions and inverses. But at least the engine/render guys could blame the game coders for the overall bad performance :D

 

18 hours ago, DerTroll said:

But if another rotation and translation follow the 4x4 matrix is superior, because you can "stack" all operations. So you compute the result once on the CPU and use it during multiple shader executions. With 3x3 matrices, this is not possible as soon as you get a single translation in your sequence (if it is not the last operation).

Somehow it works for me, and exactly that's what I'm doing with my rotate + translate combo, and before I pass it to the render I pack them into a 4x4 matrix.

 

18 hours ago, DerTroll said:

You can do exactly the same with a pure 4x4 rotation matrix since it is also orthogonal. It is only a matter of not mixing the rotations that need to be inverted with other operations (Just keep a copy of the rotations that need to be inverted).

Yes, but why would I keep a copy if I can have everything at hand? :/
With a pure 4x4 rotation matrix half of the operations will be multiplying numbers by zeros and adding them together.

 

18 hours ago, DerTroll said:

If you mean, that multiplication by the Transposed is simply switching some indices in matrix-vector multiplication, then a benchmark might surprise you, that one version is slower than the other. The problem here is that the memory access pattern changes.

I agree. I haven't profiled it yet, though...

 

18 hours ago, DerTroll said:

Well, if you just look at the coding aspects, it might be a matter of taste. But many "standard" approaches were based on performance decisions; we just don't realize it anymore since modern computers and smartphones are so fast that most games won't run slower because your matrix-vector multiplications take 2 times longer. So as long as you're not aiming for maximal performance, do whatever you like. ;)

As I saw in the past, the performance doesn't always depend on the 2x faster low level function. I try to keep everything as simple as possible, that always helped me in the valley of darkness :)

 

 

If it's still not clear why that poor spaceship moves and rotates, I still believe the 3-vector rotation and position representation of a matrix is a good starting point ;)

 

 

 

 
