Because I'm always forgetting this stuff
Adding a fourth term, \(w\), makes it easier to work in 3D Euclidean space. The Cartesian point on the left is represented by the homogeneous coordinates on the right: $$({x \over w},{y \over w},{z \over w}) \Rightarrow (x,y,z,w)$$
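A minimal sketch of the conversion in both directions, in plain Python. The function names `to_cartesian` and `to_homogeneous` are made up for illustration, and treating \(w = 0\) as an error here is an assumption of this sketch (a zero \(w\) denotes a direction rather than a point).

```python
def to_cartesian(h):
    """Convert homogeneous (x, y, z, w) to Cartesian (x/w, y/w, z/w)."""
    x, y, z, w = h
    if w == 0:
        raise ValueError("w = 0 describes a direction, not a point")
    return (x / w, y / w, z / w)


def to_homogeneous(p, w=1.0):
    """Represent Cartesian (x, y, z) as homogeneous (x*w, y*w, z*w, w)."""
    x, y, z = p
    return (x * w, y * w, z * w, w)


print(to_cartesian((2.0, 4.0, 6.0, 2.0)))  # (1.0, 2.0, 3.0)
```

Any non-zero \(w\) describes the same Cartesian point, which is why `to_homogeneous` can take an arbitrary scale.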
Why? Here's a long list of reasons, but a couple key points:
For a point \(p\): $$ T(t)p = \begin{bmatrix} 1 & 0 & 0 & t_x\\ 0 & 1 & 0 & t_y\\ 0 & 0 & 1 & t_z\\ 0 & 0 & 0 & 1\\ \end{bmatrix} \begin{bmatrix} p_x\\ p_y\\ p_z\\ 1\\ \end{bmatrix} = \begin{bmatrix} p_x + t_x\\ p_y + t_y\\ p_z + t_z\\ 1\\ \end{bmatrix} $$
For a vector \(v\): $$ T(t)v = \begin{bmatrix} 1 & 0 & 0 & t_x\\ 0 & 1 & 0 & t_y\\ 0 & 0 & 1 & t_z\\ 0 & 0 & 0 & 1\\ \end{bmatrix} \begin{bmatrix} v_x\\ v_y\\ v_z\\ 0\\ \end{bmatrix} = \begin{bmatrix} v_x\\ v_y\\ v_z\\ 0\\ \end{bmatrix} $$
Inverse transform: \(T^{-1}(t) = T(-t)\)
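The same behaviour can be checked numerically. This is a small sketch in plain Python with nested lists; `translation` and `apply` are helper names invented for the example.

```python
def translation(tx, ty, tz):
    """Build the 4x4 homogeneous translation matrix T(t)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


T = translation(1, 2, 3)
point = [5, 5, 5, 1]   # w = 1: a position, so it moves
vector = [5, 5, 5, 0]  # w = 0: a direction, so it is unchanged

print(apply(T, point))   # [6, 7, 8, 1]
print(apply(T, vector))  # [5, 5, 5, 0]

# T(-t) undoes T(t).
print(apply(translation(-1, -2, -3), apply(T, point)))  # [5, 5, 5, 1]
```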
$$ S(s)p = \begin{bmatrix} s_x & 0 & 0 & 0\\ 0 & s_y & 0 & 0\\ 0 & 0 & s_z & 0\\ 0 & 0 & 0 & 1\\ \end{bmatrix} \begin{bmatrix} p_x\\ p_y\\ p_z\\ 1\\ \end{bmatrix} = \begin{bmatrix} s_xp_x\\ s_yp_y\\ s_zp_z\\ 1\\ \end{bmatrix} $$
When applying individual rotations (yaw, pitch, roll), the order of the rotation operations matters, because matrix multiplication does not commute. To get around this problem, use an axis-angle representation.
If you only need to rotate around a single axis, then applying the rotation transform can be straightforward. $$ R_z(\alpha) = \overset{\text{yaw}}{ \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 & 0\\ \sin\alpha & \cos\alpha & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}} \qquad R_y(\beta) = \overset{\text{pitch}}{ \begin{bmatrix} \cos\beta & 0 & \sin\beta & 0\\ 0 & 1 & 0 & 0\\ -\sin\beta & 0 & \cos\beta & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}} $$ $$ R_x(\gamma) = \overset{\text{roll}}{ \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma & 0\\ 0 & \sin\gamma & \cos\gamma & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}} $$ The inverse of a rotation matrix is its transpose: $$R_x^{-1} = R_x^T \qquad R_y^{-1} = R_y^T \qquad R_z^{-1} = R_z^T$$
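A quick numerical check of both claims (order matters, and the transpose is the inverse), sketched in plain Python. The helper names are invented for the example, and only \(R_z\) and \(R_x\) are built to keep it short.

```python
import math


def rot_z(alpha):
    """Yaw: rotation about the Z axis."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rot_x(gamma):
    """Roll: rotation about the X axis."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def matmul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


quarter = math.pi / 2
p = [1, 0, 0, 1]

# Yaw first, then roll (the rightmost matrix is applied first).
yaw_then_roll = apply(matmul(rot_x(quarter), rot_z(quarter)), p)
# Roll first, then yaw.
roll_then_yaw = apply(matmul(rot_z(quarter), rot_x(quarter)), p)

print([round(c, 6) for c in yaw_then_roll])  # lands on the +Z axis
print([round(c, 6) for c in roll_then_yaw])  # lands on the +Y axis

# R * R^T should be the identity, up to floating-point rounding.
should_be_identity = matmul(rot_z(0.3), transpose(rot_z(0.3)))
```

The same point ends up on different axes depending on the order, which is the problem the axis-angle representation avoids.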
Local space, World space, View space, Clip space, Screen space
perspective / orthographic projection. near plane / far plane. clipping.
You can use the Viewport Transform to get to Screen Space (actual pixels).
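As a rough sketch, the viewport transform maps normalized device coordinates in \([-1, 1]\) to pixel coordinates. The version below assumes an OpenGL-style convention with the origin at the bottom-left of the viewport; other APIs flip the Y axis. The function name and parameters are made up for illustration.

```python
def viewport_transform(ndc_x, ndc_y, width, height, x0=0.0, y0=0.0):
    """Map NDC in [-1, 1] to pixels in a viewport at (x0, y0).

    Assumes the origin is the bottom-left corner of the viewport.
    """
    screen_x = x0 + (ndc_x + 1.0) * 0.5 * width
    screen_y = y0 + (ndc_y + 1.0) * 0.5 * height
    return (screen_x, screen_y)


print(viewport_transform(0.0, 0.0, 800, 600))    # (400.0, 300.0)
print(viewport_transform(-1.0, -1.0, 800, 600))  # (0.0, 0.0)
print(viewport_transform(1.0, 1.0, 800, 600))    # (800.0, 600.0)
```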
To transform a vertex coordinate to clip coordinates: $$V_\text{clip} = M_\text{projection}\cdot{M_\text{view}}\cdot{M_\text{model}}\cdot{V_\text{local}}$$
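The chain above can be sketched in plain Python. Everything concrete here is an assumption for illustration: the model matrix is a simple translation, the view matrix is a camera at \(z = 5\) with no rotation, and the projection is an OpenGL-style perspective matrix (camera looking down \(-Z\)).

```python
import math


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def perspective(fovy, aspect, near, far):
    """OpenGL-style perspective matrix (assumed convention)."""
    f = 1.0 / math.tan(fovy / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]


def matmul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


model = translation(1, 0, 0)   # hypothetical object placed at x = 1
view = translation(0, 0, -5)   # hypothetical camera at z = 5, no rotation
projection = perspective(math.radians(90), 1.0, 1.0, 10.0)

# The rightmost matrix is applied first: local -> world -> view -> clip.
mvp = matmul(projection, matmul(view, model))
v_clip = apply(mvp, [0, 0, 0, 1])

# Perspective divide: clip space -> normalized device coordinates.
ndc = [c / v_clip[3] for c in v_clip[:3]]
print(v_clip)
print(ndc)
```

With these numbers the vertex sits 5 units in front of the camera, so the clip-space \(w\) is 5 and the NDC \(x\) comes out near 0.2.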
Coordinate systems we commonly reference for development.
$$ \begin{align} P & = \overbrace{K}^{\text{Intrinsic Matrix}} \times \overbrace{\left[R|\mathbf{t}\right]}^{\text{Extrinsic Matrix}} \\ & = \overbrace{ \underbrace{ \begin{bmatrix} 1 & 0 & x_0\\ 0 & 1 & y_0\\ 0 & 0 & 1\\ \end{bmatrix} }_{\text{2D Translation}} \times \underbrace{ \begin{bmatrix} f_x & 0 & 0\\ 0 & f_y & 0\\ 0 & 0 & 1\\ \end{bmatrix} }_{\text{2D Scaling}} \times \underbrace{ \begin{bmatrix} 1 & s/f_x & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{bmatrix} }_{\text{2D Shear}} }^{\text{Intrinsic Matrix}} \times \overbrace{ \underbrace{ \begin{bmatrix} 1 & 0 & 0 & t_x\\ 0 & 1 & 0 & t_y\\ 0 & 0 & 1 & t_z\\ \end{bmatrix} }_{\text{3D Translation}} \times \underbrace{ \begin{bmatrix} ... \end{bmatrix} }_{\text{3D Rotation}} }^{\text{Extrinsic Matrix}} \end{align} $$ Source: Kyle Simek's excellent computer vision blog
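To see how the three 2D factors combine, here is a small sketch that multiplies them out. It takes the 2D scaling factor to be \(\operatorname{diag}(f_x, f_y, 1)\), which is the form that yields the usual intrinsic matrix with \(f_x\), \(s\), \(x_0\) in the first row. The function names and the sample values are invented for the example.

```python
def matmul3(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def intrinsic(fx, fy, s, x0, y0):
    """Compose K = (2D translation) * (2D scaling) * (2D shear)."""
    translate = [[1, 0, x0], [0, 1, y0], [0, 0, 1]]
    scale = [[fx, 0, 0], [0, fy, 0], [0, 0, 1]]  # assumed: pure diagonal scaling
    shear = [[1, s / fx, 0], [0, 1, 0], [0, 0, 1]]
    return matmul3(translate, matmul3(scale, shear))


# Hypothetical camera parameters, in pixels.
K = intrinsic(fx=800.0, fy=600.0, s=2.0, x0=320.0, y0=240.0)
for row in K:
    print(row)
```

Up to floating-point rounding, the rows should come out as \((f_x, s, x_0)\), \((0, f_y, y_0)\), \((0, 0, 1)\).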
Column 2 of the view matrix (zero-indexed, so the third column) is the camera's -Z direction. $$ \begin{bmatrix} u_x & v_x & -n_x & -\text{eye}_x\\ u_y & v_y & -n_y & -\text{eye}_y\\ u_z & v_z & -n_z & -\text{eye}_z\\ 0 & 0 & 0 & 1\\ \end{bmatrix} $$ Here \(u\), \(v\), and \(n\) are the normalized basis vectors of the camera's reference frame. \(u\) is the up vector, \(n\) is the direction the camera is looking in, and \(v\) is perpendicular to both \(n\) and \(u\).