2.  Mathematical Background

As discussed previously the two radiographs are at different magnifications, rotated with respect to one another, and the images are at different offsets in the film. Therefore before the two images can be compared, they must be brought into alignment by the application of a spatial transformation. A spatial transformation is a mapping function that establishes a relationship among all points in an image and its warped counterpart. To register the two images, a method of determining the spatial transformation involved is required.

In this section we discuss the mathematics behind the spatial transformations involved, and the possible methods available to infer the parameters of the transformation involved for two particular images.

2.1  The Affine Transformation

The three possible transformations of the radiograph (rotation, scale and translation) are special forms of the Euclidean Affine transformation, which is a transformation w:Â2®Â2, such that the transformation of the point (x,y) is described by the equation

 w(x,y)  =  (ax+by+e,cx+dy+f) (1)

where a,b,c,d,e and f are real numbers. This can be more conveniently represented in matrix notation as

w(x,y) =
é
ę
ë
 a b c d
ů
ú
ű
x + é
ę
ë
 e f
ů
ú
ű
W = A·x+b

where A represents a linear transformation and b a translation.

Lemma 1 The affine transformation has an important property, that is if w1:Â2®Â2 and w2:Â2®Â2 are both affine transformations then w3 = w1 ° w2 is also an affine transformation.

As the three translations that we wish to apply to our radiograph are particular forms of affine transformations, then by lemma 1 (Barnsley & Hurd 1993) there must exist a single affine transformation W that when applied to the radiograph will apply all three desired transformations in one operation. This reduces the problem of registering the two radiographs to finding the general affine transformation W that will apply a suitable scaling, rotation and translation to our image. Figure 1: An affine transformation is determined by its action on three points.

The affine transformation can be characterized by it's action on three points, as shown in figure 1. If we have three points (x1,y1), (x2,y2), (x3,y3), and the corresponding points (x~1,y~1), (x~2,y~2), (x~3,y~3) after the affine transformation W has been applied. Then if we substitute these into equation 1, we get a system of linear equations (2) that when solved will give us the coefficients a,b,c,d,e and f of W.

 x1a + y1b + e   = ~x1 x2a + y2b + e   = ~x2 x3a + y3b + e   = ~x3 x1c + y1d + f   = ~y1 x2c + y2d + f   = ~y2 x3c + y3d + f   = ~y3
(2)

Thus by selecting three points in one image and the corresponding points in the other image, by then solving the resultant system of linear equations we have a means by which we can calculate the correct general affine transformation required to register the two images.

Using this approach however has a drawback in that a general affine transformation has more degrees of freedom than we require. A general affine transformation is composed of a rotation, a magnification of different amounts in the directions of the x and y axis, a shear and a translation. We know however that any difference in the magnifications between the two images will be the same in both directions, and that is also unlikely for there to be any significant shearing between the two images.

Therefore what we require is to find a more specialized form of the affine transformation that only involves a translation (e,f), a rotation q and a scaling r, uniform in all directions. This special case form of the affine transformation is known as a similitude and has the form

w(x,y) = é
ę
ë
 r cos q -r sin q r sin q r cos q
ů
ú
ű
é
ę
ë
 x y
ů
ú
ű
+ é
ę
ë
 e f
ů
ú
ű
(3)

substituting a = r cosq and b = r sinq into equation 3 gives

w(x,y)  =
é
ę
ë
 a -b b a
ů
ú
ű
é
ę
ë
 x y
ů
ú
ű
+ é
ę
ë
 e f
ů
ú
ű
=   (ax -by+e, bx +ay+f)

This is now an equation in four unknowns, just two control points (x1,y1), (x2,y2), and the corresponding transformed points (x~1,y~1), (x~2,y~2) will provide sufficient equations to allow the solution for the four unknowns a, b, e and f.

 x1a - y1b + e   = ~x1 x2a - y2b + e   = ~x2 x1b + y1a + f   = ~y1 x2b + y2a + f   = ~y2
(4)

It would also be useful if we could determine from the overall affine transformation what the specific rotation, magnification and translation were. Determining the translation is trivial as the two components are separate unknowns in the simultaneous equations (e and f), however the rotation and magnification are more complicated as the rotation q and the magnification r are combined to give the two unknowns a and b, as

 a  =  r cos q (5) b  =  r sin q (6)

It is possible to separate r and q by solving this system of simultaneous equations. First re-arranging equation 5 to give

 r  = acosq
(7)

then by substituting equation 7 back into equation 6 we get

 b
= a
cosq
sinq
 b a
= sin q
cos q
 tan q
=
 b a
(8)

Hence using equation 8 we can work out the angle of rotation q, which can then be substituted into equation 7 to workout the magnification r. It should be noted that care has to be taken when working out the inverse tangent, due to the limited range of the inverse tangent function (tan¯ ¹q has the range [-½pp]). When carrying this out, using an inverse tangent function that takes the numerator and denominator of the fraction q as separate arguments is preferable. Inverse tangent functions of this form have the full range [-p,p] Figure 2: The geometry of the patient and treatment couch.

One final point to consider is the inability of affine transformations in the Euclidean plane to cope with off-axis rotations. This is shown by figure  2. From this we can see how movements produce the various transformations seen in the images. Shifts in the xy plane, result in translations in the image, movement along parallel to the z-axis will produce changes in magnification, and rotations around the z-axis will produce rotations in the image.

2.2  Overdetermined Linear Systems

As discussed in the previous section to find the correct transformation required to register the two images we are required to solve a system of simultaneous linear equations. After two pairs of fiducial points have been identified in each image, the coordinates can be substituted into equation 4 and the systems solved.

This however overlooks the fact that each point identified in the second image is unlikely to be in the true corresponding position. In reality when identifying a fiducial point there is likely to be some uncertainty (Dx,Dy) in identifying it's position. Additionally this uncertainty in the location of the fiducial point will vary between different attempts at identification and between different operators. So while in theory a set of two fiducial points is sufficient to calculate the correct transformation to register the two images, in reality the errors associated with their positions means that the transformation calculated will not be optimal.

A method to overcome the uncertainties in the placement of the fiducial points is therefore required. The positional uncertainty associated with each individual point will be normally distributed, randomly around it's true point.

One method then to overcome these positional uncertainties is simply to increase the number of points used in calculating the transformation. If the uncertainties are truly random, then this will reduce the overall error. This is because as the number of points is increased, the system moves from a random regime to a stochastic one. In the stochastic regime the errors balance themselves out, with an equal number of positive and negative errors. This is analogous to radioactive decay and as the number of fiducial points tends towards infinity the error tends to zero as shown in equation .

 lim n ® Ą nĺ i = 1 (Dxi, Dyi) = 0
(9)

Unfortunately if we have more than two sets of fiducial points, we will have a system of linear algebraic equations that has more equations than unknowns. Usually such systems have no unique solution, and the set of equations are said to be overdetermined.

2.2.1 The Normal System

The system of linear algebraic equations whose solution will yield the transformation required to register the two images is overdetermined, that is we have m equations with n unknowns with m>n. If all the fiducial points were perfectly placed a unique solution would exist, however due to the uncertainties in the placement of the points, a unique solution does not exist. What is required then is to seek a ``compromise'' solution to the system of equations A·x = b, which comes closest to satisfying all of the equations simultaneously.

The solution first proposed by Gauss, is to seek the parameters x, which would have the property that the sum of the squares of the differences between the left- and right-hand sides of the equation are minimized. This is known as the linear least-squares problem. The solution proposed by Gauss is to solve a reduced set of n equations in n unknowns, given by equation

 (AT ·A)· A =(AT ·b) (10)

where AT denotes the transpose of the matrix A. The system of equations 10 are known as the normal system, whose solution will yield a solution vector x that minimizes the sum of the squares of the residuals.

Unfortunately the condition number of matrix AT ·A is the square of the condition number of the matrix A. This means that if the residuals are small then any numerical solution of the normal system, will suffer a serious decline in accuracy due to round off errors. So while it may be tempting to solve the normal system directly by Cholesky decomposition, exploiting the symmetrical and positive definite properties of the matrix AT ·A, this will lead to a numerically unstable answer.

2.2.2 Singular Value Decomposition

The method of choice for solving a linear least squares problem, without prior knowledge of the system is Singular Value Decomposition or SVD. This is a powerful technique for dealing with equations which are singular, or close to. This is exactly what we have in our normal system. The technique of singular value decomposition relies on the theorem of linear algebra, shown in lemma 2 (Press, Teukolsky, Vetterint & Flannery 1992), the proof of which is beyond the scope of this document.

Lemma 2 Any m×n matrix A whose number of rows m is greater than or equal to its number of columns n, can be written as the product of an m×n column-orthogonal matrix U, an n ×n diagonal matrix W with positive or zero elements (the singular values), and the transpose of an n ×n orthogonal matrix V.

The technique works by avoiding the direct computation of the normal system. Considering our set of simultaneous equations

 A ·x = b (11)

where A is a m by n matrix, and x and b are vectors. Then if we then perform singular value decomposition on the matrix A to give us the matrixes U, W and V and substitute these into equation 11 according to lemma 2 we get

 U · [diag(wj)] ·VT·x = b (12)

Now as both U and V are orthogonal, then the products U&#middot;UT, and V·VT are both equal to the identity matrix I. Additionally as W is a diagonal matrix, then its inverse is the diagonal matrix whose elements are the reciprocals of the elements wj. From equation 12 it follows that the solution vector x is given by

 x = V· éęë diag ćçč 1wj ö÷ř ůúű ·UT ·b
(13)

This ``solution'' vector x constructed by equation 13, will not exactly solve equation 11. However it will be the solution, from the set of all possible solutions that minimizes the sum of the squares of the residuals between the left- and right-hand side, ie. the least squares solution.

Therefore by calculating the singular value decomposition of the matrix A formed from the set of linear simultaneous equations, resulting from the coordinates of the fiducial points we have a means by which we can calculate the ``best'' transformation to register the two images.