NOTE:
This is a big wall of text and I completely gloss over a lot of important stuff - but my intention here is just an overview... hopefully some of the terms/concepts here will lead you to better Googling for appropriate chunks on the web.
It helps if you walk your way through "Life as a point":
Here we are, a nice little 3-dimensional point:
var happyPoint = new Point(0, 0, 0);
And here is its buddy, defined in relation to its friend:
var friendlyPoint = new Point(1, 0, 0);
For now, let's call these two points our "model" - we'll use the term "model space" to talk about points within a single three-dimensional structure (like a house, monster, etc).
Models don't live in a vacuum, however... it's usually easier to keep "model space" and "world space" separate, which makes model tweaking simpler (otherwise, all your models would need to be in the same scale, have the same orientation, etc, etc, plus trying to work on them in a 3D modelling program would be friggin impossible).
So we'll define a "World Transform" for our "Model" (ok, 2 points is a lame model, but a model it remains).
What is a "World Transform"? Simply put:
- A world transform W = T * R * S, where
- T = translation - that is, sliding it along the X, Y, or Z axes
- R = rotation - turning the model with respect to an axis
- S = scaling - resizing a model (maintaining all the relative points within) along an axis
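To make the T/R/S idea a bit more concrete, here's a rough sketch in TypeScript. It isn't from any particular library: the helper names (`translation`, `rotationZ`, `scaling`, `multiply`, `transformPoint`) are made up for illustration. It also assumes column vectors, so W = T * R * S applies the scale first, then the rotation, then the translation; libraries that use row vectors flip that order.

```typescript
// Column-vector convention (an assumption): a point p is transformed as W * p.
type Vec4 = [number, number, number, number];
type Mat4 = [Vec4, Vec4, Vec4, Vec4]; // four rows of four numbers

// Hypothetical helper: matrix product a * b, written out longhand.
function multiply(a: Mat4, b: Mat4): Mat4 {
  const row = (r: Vec4): Vec4 => [
    r[0] * b[0][0] + r[1] * b[1][0] + r[2] * b[2][0] + r[3] * b[3][0],
    r[0] * b[0][1] + r[1] * b[1][1] + r[2] * b[2][1] + r[3] * b[3][1],
    r[0] * b[0][2] + r[1] * b[1][2] + r[2] * b[2][2] + r[3] * b[3][2],
    r[0] * b[0][3] + r[1] * b[1][3] + r[2] * b[2][3] + r[3] * b[3][3],
  ];
  return [row(a[0]), row(a[1]), row(a[2]), row(a[3])];
}

// Hypothetical helper: matrix * column vector.
function transformPoint(m: Mat4, p: Vec4): Vec4 {
  const dot = (r: Vec4): number =>
    r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] * p[3];
  return [dot(m[0]), dot(m[1]), dot(m[2]), dot(m[3])];
}

// T: slide along the X, Y and Z axes.
function translation(x: number, y: number, z: number): Mat4 {
  return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]];
}

// R: turn around the Z axis (rotations about X and Y look similar).
function rotationZ(radians: number): Mat4 {
  const c = Math.cos(radians);
  const s = Math.sin(radians);
  return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]];
}

// S: resize along each axis.
function scaling(x: number, y: number, z: number): Mat4 {
  return [[x, 0, 0, 0], [0, y, 0, 0], [0, 0, z, 0], [0, 0, 0, 1]];
}

// W = T * R * S: double the size, turn 90 degrees about Z, slide 5 along X.
const worldTransform = multiply(
  translation(5, 0, 0),
  multiply(rotationZ(Math.PI / 2), scaling(2, 2, 2)),
);

// friendlyPoint (1, 0, 0) with a trailing 1 so the translation takes effect.
const placed = transformPoint(worldTransform, [1, 0, 0, 1]);
console.log(placed); // approximately [5, 2, 0, 1]
```

Swapping the order (say, S * R * T) generally puts the point somewhere else, which is why the order gets spelled out.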
We'll take the easy out here, and just define our world transform as the Identity matrix - basically, this means we don't want it to translate, rotate, or scale:
world = [
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
];
I highly recommend you brush up on your matrix math, especially multiplication and vector->matrix operations - it's used ALL THE FREAKING TIME in 3D graphics.
So cleverly skipping over the actual matrix multiplication, I'll just tell you that multiplying our "world transform" and our model points just ends up with our model points again (albeit in this fun new 4-dimensional vector representation, which I won't touch here).
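If you'd rather see that than take my word for it, here's a tiny sketch in TypeScript. The `transformPoint` helper is made up for illustration, and it assumes the usual trick of tacking a 1 onto each point to get the 4-dimensional form:

```typescript
type Vec4 = [number, number, number, number];
type Mat4 = [Vec4, Vec4, Vec4, Vec4]; // four rows of four numbers

// The identity world transform from above.
const world: Mat4 = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 0],
  [0, 0, 0, 1],
];

// Hypothetical helper: matrix * column vector, written out longhand.
function transformPoint(m: Mat4, p: Vec4): Vec4 {
  const dot = (r: Vec4): number =>
    r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] * p[3];
  return [dot(m[0]), dot(m[1]), dot(m[2]), dot(m[3])];
}

// Our two model points, each with a trailing 1 (assumed convention).
const happyInWorld = transformPoint(world, [0, 0, 0, 1]);
const friendlyInWorld = transformPoint(world, [1, 0, 0, 1]);

console.log(happyInWorld); // [0, 0, 0, 1]
console.log(friendlyInWorld); // [1, 0, 0, 1]
```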
So we've got our points, and we've absolutely located them in "space"...now what?
Well, where are we looking at it from? This leads to the concept of View Transformations or Camera Projection - basically, it's just another matrix multiplication - observe:
Say we've got a point X, at...oh, (4 2) or so:
|
|
|
|
|   X
|
------------------------
From the perspective of the origin (0 0), X is at (4 2) - but say we put our camera off to the right?
|
|
|
|
|   X        >-camera
|
------------------------
What is the "position" of X, relative to the camera? Probably something closer to either (0 9) or (9 0), depending on what your camera's "up" and "right" directions are. This is what View transformations are - mapping one set of 3D points to another set of 3D points such that they are "correct" from the perspective of an observer. In your case of a top-down fixed camera, your observer would be some fixed position in the sky, and all the models would be transformed accordingly.
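Here's the same idea as a small 2D sketch in TypeScript. The camera position of (13 2) is my own assumption (nine units to the right of X), and `toCameraSpace` is a made-up helper, not a library call. All it does is measure X's offset from the camera along the camera's own "right" and "forward" directions:

```typescript
type Vec2 = { x: number; y: number };

const dot = (a: Vec2, b: Vec2): number => a.x * b.x + a.y * b.y;
const subtract = (a: Vec2, b: Vec2): Vec2 => ({ x: a.x - b.x, y: a.y - b.y });

// Hypothetical helper: re-express a world-space point relative to a camera.
// axisA and axisB are the camera's two unit-length, perpendicular directions.
function toCameraSpace(point: Vec2, cameraPosition: Vec2, axisA: Vec2, axisB: Vec2): Vec2 {
  const offset = subtract(point, cameraPosition);
  return { x: dot(offset, axisA), y: dot(offset, axisB) };
}

const pointX: Vec2 = { x: 4, y: 2 };
const cameraPosition: Vec2 = { x: 13, y: 2 }; // assumed, for illustration
const forward: Vec2 = { x: -1, y: 0 }; // camera looks left, toward X
const right: Vec2 = { x: 0, y: 1 }; // perpendicular to forward

// First coordinate = "right", second = "forward" (depth).
const seen = toCameraSpace(pointX, cameraPosition, right, forward);
console.log(seen.x, seen.y); // 0 9

// Swap which axis comes first and you get the other answer.
const swapped = toCameraSpace(pointX, cameraPosition, forward, right);
console.log(swapped.x, swapped.y); // 9 0
```

In 3D the view matrix does the same job with three axes (right, up, forward) plus the camera position.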
So let's draw!
Unfortunately, our screen isn't 3D (yet), so first we need to "project" this point onto a 2D surface. Projection is...well, it's basically a mapping that looks like:
(x, y, z) => (x, y)
The number of possible projections is nigh-infinite: for example, we could just shift over the X and Y coordinates by Z:
func(x, y, z) => new point2d(x + z, y + z);
Usually, however, you want this projection to mimic what the human retina does when looking at 3D scenes, so we bring in the concept of a View Projection. There are a few different view projections, like Orthographic, YawPitchRoll-defined, and Perspective/FOV-defined; each of these needs a couple of key bits of data to properly build the projection.
A Perspective/FOV based projection, for example, needs:
- The position of your "eyeball" (i.e., the screen)
- How far into the distance your "eyeball" is capable of focusing (the "far clipping plane")
- Your angular field of view (i.e., hold your arms out, just at the edges of your peripheral vision)
- The ratio of width to height for the "lens" you're looking through (typically your screen resolution)
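As a rough sketch of how those numbers turn into a matrix, here's one common layout in TypeScript. Assumptions on my part: it follows the OpenGL-style convention (eye at the origin looking down the negative Z axis, depth mapped to the range -1..1), it takes a near clipping distance as well as the far one, and the `perspective` and `toNdc` names are made up for illustration. The eye position itself is handled earlier, by the view transform.

```typescript
type Vec4 = [number, number, number, number];
type Mat4 = [Vec4, Vec4, Vec4, Vec4]; // four rows of four numbers

// Hypothetical helper: build an OpenGL-style perspective matrix.
// fovY is the vertical field of view in radians, aspect is width / height.
function perspective(fovY: number, aspect: number, near: number, far: number): Mat4 {
  const f = 1 / Math.tan(fovY / 2);
  return [
    [f / aspect, 0, 0, 0],
    [0, f, 0, 0],
    [0, 0, (far + near) / (near - far), (2 * far * near) / (near - far)],
    [0, 0, -1, 0],
  ];
}

// Hypothetical helper: matrix * column vector.
function transformPoint(m: Mat4, p: Vec4): Vec4 {
  const dot = (r: Vec4): number =>
    r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] * p[3];
  return [dot(m[0]), dot(m[1]), dot(m[2]), dot(m[3])];
}

// Project a camera-space point, then divide by w (the "perspective divide").
// Anything that ends up outside -1..1 on any axis is outside the frustum.
function toNdc(m: Mat4, x: number, y: number, z: number): [number, number, number] {
  const [cx, cy, cz, cw] = transformPoint(m, [x, y, z, 1]);
  return [cx / cw, cy / cw, cz / cw];
}

// 90 degree field of view, square "lens", near plane at 1, far plane at 100.
const projection = perspective(Math.PI / 2, 1, 1, 100);

const onNearPlane = toNdc(projection, 0, 0, -1); // depth comes out at about -1
const onFarPlane = toNdc(projection, 0, 0, -100); // depth comes out at about 1
const atRightEdge = toNdc(projection, 2, 0, -2); // x comes out at about 1
console.log(onNearPlane, onFarPlane, atRightEdge);
```

Most libraries ship a helper that builds this for you, and their conventions (handedness, depth range, row vs column vectors) vary, so check the docs before copying the layout.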
Once you've got these numbers, you create something called a "bounding frustum", which looks a bit like a pyramid with the top lopped off:
\-----------------/
 \               /
  \             /
   \           /
    \         /
     \-------/
Or from the front:
 ___________________
|   _____________   |
|  |             |  |
|  |             |  |
|  |             |  |
|  |             |  |
|  |             |  |
|  |_____________|  |
|___________________|
I won't do the matrix calculations here, since that's all well defined elsewhere - in fact, most libraries have helper methods that'll generate the corresponding matrices for you - but here's roughly how it works:
Let's say your happy little point lies in this frustum:
\-----------------/
 \               /
  \ o<-pt       /
   \           /
    \         /
     \-------/

 ___________________
|   _____________   |
|  |             |  |
|  |             |  |
|o |             |  |
|^---- pt        |  |
|  |             |  |
|  |_____________|  |
|___________________|
Notice it's way off to the side, so far that it's outside the "near clip plane" rectangle. What would it look like if you "looked into" the smaller end of the pyramid?
Much like looking into a prism (or a lens), the point would be "bent" into view:
 ___________________
|   _____________   |
|  |             |  |
|  |             |  |
|>>>o <- pt is    |  |
|  |  shifted    |  |
|  |             |  |
|  |_____________|  |
|___________________|
Put another way, if you had a bright light behind the frustum, where would the shadows from your points be "cast" upon the near clipping plane (the smaller rectangle)? That's all projection is - a mapping of one point to another, in this case, removing the Z component and altering the X and Y accordingly in a way that "makes sense" to our eyes.
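Stripped of the matrix machinery, the "shadow" idea boils down to similar triangles. Here's a minimal sketch in TypeScript, assuming the eye sits at the origin and depth is measured as a positive distance straight ahead of it (`projectToNearPlane` is a made-up name, not a library function):

```typescript
type Point3 = { x: number; y: number; z: number };
type Point2 = { x: number; y: number };

// Hypothetical helper: where does the line from the eye to the point
// cross the near plane? Similar triangles give x * near / z and y * near / z.
function projectToNearPlane(point: Point3, near: number): Point2 {
  if (point.z <= 0) {
    throw new Error("point must be in front of the eye (z > 0)");
  }
  const scale = near / point.z;
  return { x: point.x * scale, y: point.y * scale };
}

// Same X and Y, different depths: the farther point lands closer to the center.
const closer = projectToNearPlane({ x: 4, y: 2, z: 10 }, 1);
const farther = projectToNearPlane({ x: 4, y: 2, z: 20 }, 1);
console.log(closer.x, closer.y); // 0.4 0.2
console.log(farther.x, farther.y); // 0.2 0.1
```

That shrinking toward the center as Z grows is the "bending" in the picture above, and it's what makes distant things look smaller.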