In 1994 I published an article in the PC Game Programmers Encyclopedia about texture mapping. One element of that article is a set of formulae sometimes called "magic cross-products", which provide a method for computing the texture coordinates of a 2D pixel not by interpolating per-vertex texture coordinates, but by computing them directly from texture-coordinate basis vectors.
This method was used for performing texture mapping in the software renderers for Quake (e.g. here) and for Thief: The Dark Project (personal knowledge: I wrote it), and possibly others.
The purpose of this article is to re-introduce the method with better notation and modern coordinate systems and to show how it is derived—not because I expect people to perform software rendering now, but both for history's sake and because the technique has other applications; for example, I use it to perform hit-testing of a "3D" user-interface where the user interface elements lie in their own 2D coordinate system in a single arbitrary 3D plane.
I originally "invented" this method just to try to understand how to make texture mapping work from first principles, but it also had advantages over the traditional texture-coordinates-at-vertices approach for rendering polygons with more than 3 vertices (e.g. if you shared one set of texture gradients across the whole polygon, and the polygon contained collinear vertices introduced to avoid t-junctions, you had to be careful not to compute the gradients from three collinear vertices).
Suppose we have a 4x4 matrix M which is our "modelviewprojection" matrix; that is, it transforms from "object space" to "clip space", and the latter allows deriving "normalized device coordinates": a clip-space point (x, y, z, w) has normalized device coordinates (x/w, y/w, z/w).
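That relationship can be sketched in a few lines of Python (the function name and the use of numpy are my own, not from the original article):

```python
import numpy as np

def to_ndc(M, p_obj):
    """Transform an object-space point to normalized device coordinates."""
    clip = M @ np.append(p_obj, 1.0)   # clip space: (x, y, z, w)
    return clip[:3] / clip[3]          # NDC: (x/w, y/w, z/w)
```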
Suppose we have a non-degenerate triangle mapped non-degenerately with a 2D texture map (for example, by defining texture coordinates at the vertices, or defining a projection for each of the coordinates). Then there is a function F mapping texture coordinates (s,t) to points in object space, and we can define:

P' = F(0,0)
S' = F(1,0) - F(0,0)
T' = F(0,1) - F(0,0)
Thus F(s,t) = P' + s S' + t T'. (Yes, I just defined F in terms of F. Shhhh.)
If these vectors are not precomputed, they can be computed at run-time, but the point of this method is to assume they're calculated previously.
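For completeness, here is one way that run-time computation could look: recovering P', S', T' from a triangle's vertices and per-vertex texture coordinates by solving a 2x2 linear system on the edge differences. This is a sketch; the function name and the use of numpy are my own.

```python
import numpy as np

def texture_basis_from_triangle(verts, uvs):
    """Recover P', S', T' from three vertices and their (s, t) coordinates.

    Solves v_i = P' + s_i S' + t_i T', assuming the UV mapping is
    non-degenerate.
    """
    v0, v1, v2 = (np.asarray(v, dtype=float) for v in verts)
    (s0, t0), (s1, t1), (s2, t2) = uvs
    # Edge equations: (v1 - v0) = (s1-s0) S' + (t1-t0) T', likewise for v2 - v0.
    a, b = s1 - s0, t1 - t0
    c, d = s2 - s0, t2 - t0
    det = a * d - b * c              # zero iff the UV assignment is degenerate
    S = ((v1 - v0) * d - (v2 - v0) * b) / det
    T = ((v2 - v0) * a - (v1 - v0) * c) / det
    P = v0 - s0 * S - t0 * T         # walk back to the texture-space origin
    return P, S, T
```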
First we transform the texture map basis vectors to clip space (where they become 4-vectors):

P = M P'
S = M S'
T = M T'

(P' is a point, so its homogeneous w is 1; S' and T' are directions, so their w is 0 and they pick up no translation.)
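This setup step is small enough to write out directly (again a sketch with names of my choosing):

```python
import numpy as np

def texture_basis_to_clip(M, P_obj, S_obj, T_obj):
    """Transform the texture basis into clip space with a 4x4 MVP matrix M.

    P' is a point (homogeneous w = 1); S' and T' are directions (w = 0),
    so they are unaffected by translation.
    """
    P = M @ np.append(P_obj, 1.0)
    S = M @ np.append(S_obj, 0.0)
    T = M @ np.append(T_obj, 0.0)
    return P, S, T
```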
Finally, for each pixel we compute several dot products and two divides. Drop the z components (keeping (x, y, w)), let (x, y) be the pixel's normalized device coordinates, and let D = (x, y, 1); then:

s = D · (T × P) / D · (S × T)
t = D · (P × S) / D · (S × T)

The three cross products are constant across the polygon and can be precomputed.
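The per-pixel step can be sketched as follows (function names are my own; the inputs are the clip-space 4-vectors P, S, T from the setup step):

```python
import numpy as np

def precompute_magic_cross_products(P, S, T):
    """Precompute the per-polygon cross products from the clip-space basis.

    Drops z, keeping only the (x, y, w) components of each 4-vector.
    """
    p, s, t = (np.array([v[0], v[1], v[3]]) for v in (P, S, T))
    return np.cross(t, p), np.cross(p, s), np.cross(s, t)

def texcoords_at_pixel(x, y, A, B, C):
    """Texture coordinates at NDC position (x, y): dot products and divides."""
    D = np.array([x, y, 1.0])   # eye-ray direction through the pixel
    denom = D @ C               # D . (S x T)
    return (D @ A) / denom, (D @ B) / denom
```

In a real inner loop you would hoist the reciprocal of `denom` and step the dot products incrementally across the scanline, but the arithmetic is the same.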
To derive the per-pixel computation, consider a pixel at normalized device coordinates (x, y). The clip-space coordinates of that point are (xw, yw, zw, w), where w is also unknown. This defines a ray through the clip-space origin (which is the eye location).
Therefore, we know that (xw,yw,zw,w) = P + sS + tT.
Ignoring z, this gives us three unknowns (s, t, and w) and three equations:
[ Sx  Tx  -x ] [ s ]   [ -Px ]
[ Sy  Ty  -y ] [ t ] = [ -Py ]
[ Sw  Tw  -1 ] [ w ]   [ -Pw ]

Define D to be the eye ray direction, (x,y,1), and drop the z component from the vectors. Then we can solve for s and t using Cramer's rule ("det" means "determinant"):

s = det[ -P  T  -D ] / det[ S  T  -D ] = D · (T × P) / D · (S × T)
t = det[ S  -P  -D ] / det[ S  T  -D ] = D · (P × S) / D · (S × T)
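The Cramer's-rule result can be checked numerically against a direct linear solve (the numbers below are arbitrary test values of my choosing):

```python
import numpy as np

# P, S, T are clip-space (x, y, w) vectors; (x, y) is an NDC pixel position.
P = np.array([0.1, 0.2, 2.0])
S = np.array([1.0, 0.1, 0.3])
T = np.array([0.2, 1.1, -0.1])
x, y = 0.25, -0.4

# Direct solve of  [ S  T  -D ] (s, t, w)^T = -P  with D = (x, y, 1).
D = np.array([x, y, 1.0])
mat = np.column_stack([S, T, -D])
s_direct, t_direct, w_direct = np.linalg.solve(mat, -P)

# Cross-product form of the same solution.
s = (D @ np.cross(T, P)) / (D @ np.cross(S, T))
t = (D @ np.cross(P, S)) / (D @ np.cross(S, T))
```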
Note that this derivation gives no geometric explanation for the cross-products; the cross-products do not appear geometrically in the solution, but are just a consequence of how the determinant is computed when you favor isolating a particular column (in this case, because one column contains values that vary per pixel while the other columns are computationally constant). For a discussion of a geometric justification of ratios of determinants (as ratios of volumes), see Kensler and Shirley, "Optimizing Ray-Triangle Intersection via Automated Search".
Another way to view this is that we have three (non-orthogonal) basis vectors describing the space: S, T, and D. We're computing what weighted sum of D, -S, and -T must be combined to reach P; we use D to reach the plane, then walk "back" from that point along S and T (which stay in the plane) to reach the texture-coordinate origin.
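That change-of-basis view can be checked directly with scalar triple products: the weight on each basis vector is the triple product of P with the other two, divided by the triple product of all three (numbers below are arbitrary test values):

```python
import numpy as np

S = np.array([1.0, 0.1, 0.3])
T = np.array([0.2, 1.1, -0.1])
D = np.array([0.25, -0.4, 1.0])
P = np.array([0.1, 0.2, 2.0])

vol = D @ np.cross(S, T)           # signed volume of the S, T, D parallelepiped
w = (P @ np.cross(S, T)) / vol     # weight on D
s = (D @ np.cross(T, P)) / vol     # weight on -S
t = (D @ np.cross(P, S)) / vol     # weight on -T
# so P = w D - s S - t T, matching the cross-product formulas above
```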
This page by Christer Ericson discusses how the scalar triple product can be used to solve this non-orthogonal change-of-basis problem, which would let us skip all the derivation.
Thanks to Fabian Giesen for reviewing and helping me figure out why my first-draft derivation was crazy roundabout, and for providing the references.