BioIngenium Group
Centro de Telemedicina
Universidad Nacional de Colombia
 
     
  Distributed Genetic Algorithm for Subtraction Radiography  
     
  1. Introduction  
 
Digital subtraction radiography detects tissue mass changes by subtracting two digital radiographs. The method has proven very useful in early diagnosis of disease and in follow-up examination. When subtracting two radiographs taken at different times, the image features common to both images are removed, and the small changes that remain can be amplified to highlight their presence.

For many years, digital subtraction radiography has been used in dentistry to qualitatively assess changes in radiographic density. Numerous authors have demonstrated the ability of this method to improve diagnostic performance in the detection of approximal dental caries, periapical pathology and periodontal disease. Digital subtraction radiography has also been shown to markedly improve the detection of periodontal bone destruction, as well as of secondary caries.

A large variety of odontological diseases result in the destruction of mineralized tissues, and these changes are relatively small in the initial stages of the disease. Reliable detection and follow-up therefore require a precise alignment of the two images for the tissue changes to be detectable. Different approaches have been proposed for correcting such geometrical distortions, ranging from manual correction to devices designed to ensure a consistent geometric projection which can be reliably reproduced over time.

In this research, an entirely automatic method is proposed for spatial radiographic alignment. The process starts by selecting either of the two images as the reference, while the other is considered the floating image. Afterward, illumination differences are eliminated by means of an equalization algorithm explained below. Consecutive affine transformations are then applied to the floating image, and each transformed image is compared to the reference using the correlation ratio as the similarity measure. An adaptive genetic algorithm (GA) is used to find the transformation that produces the best match. The process is robust, reliable and reproducible on the test group of images.
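As a rough sketch of this pipeline, the code below evolves affine parameters (rotation, translation, scale) with a plain genetic algorithm, scoring each candidate with a binned estimate of the correlation ratio. The helper functions, parameter bounds and GA operators are illustrative assumptions; the equalization step, the adaptive GA of the actual method, and its distributed evaluation are not reproduced here.

```python
import numpy as np
from scipy import ndimage

def correlation_ratio(x, y, bins=64):
    """eta(Y|X): fraction of Var(Y) explained by the intensity classes of X."""
    edges = np.linspace(x.min(), x.max(), bins + 1)
    idx = np.clip(np.digitize(x.ravel(), edges) - 1, 0, bins - 1)
    y = y.ravel()
    var_y = y.var()
    if var_y == 0:
        return 0.0
    resid = sum((idx == i).mean() * y[idx == i].var() for i in np.unique(idx))
    return 1.0 - resid / var_y

def apply_affine(img, params):
    """Rotate/translate/scale the image about its center (2D affine subset)."""
    angle, tx, ty, scale = params
    c, s = np.cos(angle), np.sin(angle)
    m = scale * np.array([[c, -s], [s, c]])
    center = np.array(img.shape) / 2.0
    offset = center - m @ center + np.array([ty, tx])
    return ndimage.affine_transform(img, m, offset=offset, order=1)

def register(reference, floating, pop_size=40, generations=60, seed=0):
    """Plain GA over (angle, tx, ty, scale); returns the fittest parameters."""
    rng = np.random.default_rng(seed)
    lo = np.array([-0.2, -10.0, -10.0, 0.9])   # assumed search bounds
    hi = np.array([ 0.2,  10.0,  10.0, 1.1])
    pop = rng.uniform(lo, hi, size=(pop_size, 4))

    def fitness(p):
        return correlation_ratio(reference, apply_affine(floating, p))

    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        elite = pop[np.argsort(scores)[::-1][:pop_size // 2]]        # selection
        pairs = elite[rng.integers(0, len(elite), (pop_size - len(elite), 2))]
        w = rng.uniform(size=(len(pairs), 1))
        kids = w * pairs[:, 0] + (1 - w) * pairs[:, 1]               # blend crossover
        kids += rng.normal(scale=0.02, size=kids.shape) * (hi - lo)  # mutation
        pop = np.clip(np.vstack([elite, kids]), lo, hi)
    scores = np.array([fitness(p) for p in pop])
    return pop[np.argmax(scores)]
```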
 
     
  Correlation Ratio  
  The task at hand involves the design of a similarity measure that is assumed to be maximal when the images are correctly aligned. The following analysis has been extracted from the research report "Multimodal Image Registration by Maximization of the Correlation Ratio", written by Alexis Roche et al., and referenced in [14].

* Least squares criterion (identical image intensities)
* Simple correlation (not identical, linearly correlated, good for monomodal images)
* Multimodal images:
   
  MR (Magnetic Resonance)
  CT (Computed Tomography)
  PET (Positron Emission Tomography)
  SPECT (Single-Photon Emission Computed Tomography)
   
  Woods criterion (manual segmentation)
  Robust estimators (SPECT-MR)
   
  Mutual Information
  Given two images X and Y, their joint probability density function (joint pdf) P( i, j ) can be defined by simple normalization of their 2D-histogram
   
  Given the marginal probability density functions Px(i) and Py(j),
   
  then the mutual information between X and Y is given by:

  $I(X, Y) = \sum_{i,j} P(i,j)\,\log\frac{P(i,j)}{P_x(i)\,P_y(j)}$
   
  The mutual information measure can be considered very general, since it makes very few assumptions regarding the relationship between the image intensities: it assumes neither linear correlation nor even functional correlation, but only statistical dependence.
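As a concrete sketch of this definition, the helper below estimates I(X, Y) by normalizing the 2D histogram of the image pair, as described above (the 64-bin quantization is an arbitrary assumption):

```python
import numpy as np

def mutual_information(x, y, bins=64):
    """I(X, Y) from the normalized 2D histogram of the image pair."""
    hist, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = hist / hist.sum()                # joint pdf P(i, j)
    px = pxy.sum(axis=1, keepdims=True)    # marginal Px(i)
    py = pxy.sum(axis=0, keepdims=True)    # marginal Py(j)
    nz = pxy > 0                           # terms with P(i, j) = 0 contribute 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px * py)[nz])))
```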

Pitfall: mutual information treats intensity values in a purely qualitative way, without considering any notion of proximity in the intensity space (nearby intensities convey spatial information).

In a real image, tissue is never represented by a single intensity value, but rather by a certain interval. Fig. 1 in [14] shows a synthetic situation in which mutual information is not well adapted.
   
  Images as Random Variables
  Statistical concepts have proven to be powerful tools for the design and computation of similarity measures; therefore, in this paper images are "artificially" considered as random variables: this corresponds to interpreting an image histogram as a probability density function (pdf). Moreover, the 2D-histogram of an image pair is considered as their joint pdf. This means that when randomly selecting a voxel in image X, the probability of obtaining an intensity i is proportional to the number of voxels Ni in X having the intensity i:
  $P(i) = \frac{N_i}{\sum_k N_k}$   (1)
 
  In order to define the joint pdf of an image pair, the authors consider two images (X,Y) and a spatial transformation T that maps the grid of Y, Ωy, to the grid of X, Ωx. X and Y take their intensity values from a finite set A which can be assumed to be the same for the two images. Typically A = {0 ... 255}.
  $X\colon \Omega_x \to A, \qquad Y\colon \Omega_y \to A$   (2)
 
  By applying transformation T to image Y, a new mapping is defined from the transformed positions of Y to A:
  $Y_T\colon T(\Omega_y) \to A, \qquad Y_T\big(T(\omega)\big) = Y(\omega)$   (3)
 
  The points of T(Ωy) that do not have eight neighbors in Ωx are rejected. T(Ωy)* denotes the subset of accepted points, and X_T denotes the interpolated values of X at those points. The image pair is then defined as the following random couple:
  $(X_T, Y_T)\colon T(\Omega_y)^* \to A^2$   (4)
 
  Then, the joint pdf of X and Y_T is defined in a similar way as was done for a single image in (1):
  $P_T(i,j) = \frac{N_{i,j}}{\sum_{k,l} N_{k,l}}$   (5)

  where Ni,j is the number of points of T(Ωy)* at which X_T = i and Y_T = j.
 
  The marginal pdf's of the images X and Y_T are entirely determined by the joint pdf P_T(i,j). However, they are not a priori equal to those obtained by considering the single images; due to interpolation, they depend on the transformation T:
  $P_{T,x}(i) = \sum_j P_T(i,j), \qquad P_{T,y}(j) = \sum_i P_T(i,j)$   (6)
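Equations (5) and (6) translate directly into a histogram computation. The sketch below assumes the two images have already been resampled onto the common grid T(Ωy)*; the interpolation and border-rejection steps are omitted:

```python
import numpy as np

def joint_pdf(x_t, y_t, A=256):
    """P_T(i, j) and its marginals for two images sampled on a common grid."""
    hist, _, _ = np.histogram2d(x_t.ravel(), y_t.ravel(),
                                bins=A, range=[[0, A], [0, A]])
    p_xy = hist / hist.sum()   # equation (5): normalized 2D histogram
    p_x = p_xy.sum(axis=1)     # equation (6): marginal pdf of X
    p_y = p_xy.sum(axis=0)     # equation (6): marginal pdf of Y_T
    return p_xy, p_x, p_y
```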
 
   
  Random variables geometry
  In the previous section the joint pdf of two images was defined. The task now is to find some kind of dependence between two random variables when their joint pdf is known. For this purpose, the geometry of the L2 space provides a simple method for quantifying the functional dependence between two random variables. L2 is defined as the space of square-integrable real random variables, that is, the variables which verify:

  $E(X^2) < \infty$

  where E denotes the expectation operator. L2 is a Hilbert space with respect to the dot product $\langle X, Y \rangle = E(XY)$. Thus, the corresponding norm of a variable is the square root of its second-order moment:
  $\|X\| = \sqrt{\langle X, X \rangle} = \sqrt{E(X^2)}$   (7)
 
  The L2 norm is closely related to the classical notions of expectation, variance and standard deviation; therefore, equation (7) can be rewritten as:
  $\|X\|^2 = E(X)^2 + \mathrm{Var}(X), \qquad \sigma(X) = \|X - E(X)\|$   (8)
 
  Due to its Hilbertian structure, L2 has interesting geometric properties. The notion of orthogonality between two variables can be defined as:
  $X \perp Y \iff E(XY) = 0$   (9)
 
  The meaning of orthogonality in L2 relates to the notion of independence, but in a less restrictive way. Two variables X and Y are said to be independent if their joint pdf is equal to the product of their marginal pdf's, that is, $P(i,j) = P_x(i)\,P_y(j)$. It can be shown that for two such variables $E(XY) = E(X)\,E(Y)$, and thus:

  $\big(X - E(X)\big) \perp \big(Y - E(Y)\big)$
 
  The converse, however, is false. Orthogonality in L2 is a weaker constraint than independence. It may be seen as a notion of independence on the average. In a general way, the angle between two variables X and Y is defined thanks to a basic property of dot products:
  $\cos\big(\widehat{X, Y}\big) = \frac{\langle X, Y \rangle}{\|X\|\,\|Y\|}$   (10)
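A quick numerical illustration of these notions (the uniform samples are an arbitrary choice): for two independently drawn variables, the dot product of the centered variables is close to zero, matching the reading of orthogonality as "independence on the average":

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(size=100_000)
Y = rng.uniform(size=100_000)          # drawn independently of X

# <X, Y> = E(XY) and ||X|| = sqrt(E(X^2)), equations (7) and (10)
dot = np.mean(X * Y)
cos_xy = dot / np.sqrt(np.mean(X**2) * np.mean(Y**2))

# For independent variables E(XY) = E(X)E(Y), so the centered
# variables are orthogonal up to sampling noise:
print(np.mean((X - X.mean()) * (Y - Y.mean())))   # ~ 0
```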
 
  Expectation
  L2 contains the one-dimensional space Δ of deterministic variables, i.e. variables which are constant on the state space Ω. Given a variable X, its expectation is:
  $E(X) = \sum_{i \in A} i\,P(i)$   (11)
 
  Therefore, E(X) is nothing but the orthogonal projection of X onto Δ. In the sense of the L2 norm, it is the constant variable which best approximates X (the classical notion of mean).
 
  Figure 1. Geometric interpretation of expectation. E(X) is the orthogonal projection of X onto the constant direction Δ.
   
  Correlation Coefficient
  A quick method to approximate the degree of dependence between two variables is to compute their correlation coefficient. Given two variables X and Y, it is defined as:
  $\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma(X)\,\sigma(Y)}$   (12)
 
  From a geometric point of view, we can write:
  $\rho(X, Y) = \frac{\langle X - E(X),\, Y - E(Y) \rangle}{\|X - E(X)\|\,\|Y - E(Y)\|}$   (13)
 
  Using equation (10), the correlation coefficient between X and Y can be interpreted in a geometric way. Let α denote the angle between X - E(X) and Y - E(Y). We have:
  $\rho(X, Y) = \cos\alpha$   (14)
 
  Figure 2. Geometric interpretation of the correlation coefficient. We have ρ(X,Y) = cos(α). The constant (or deterministic) direction is denoted by Δ.
   
  We see that ρ(X,Y) grows as the angle α becomes small, reaching 1 when X - E(X) and Y - E(Y) are collinear. That is to say, the correlation coefficient measures the linear dependence between two variables. Since we want to take into account general relationships between X and Y, possibly non-linear and non-invertible, it is not a good measure of functional dependence.
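The identity ρ(X,Y) = cos(α) is easy to check numerically; in the sketch below the linear model is an arbitrary example, and the cosine of the angle between the centered samples reproduces the usual correlation coefficient:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=50_000)
Y = 2.0 * X + rng.normal(scale=0.5, size=X.size)   # linear relation plus noise

Xc, Yc = X - X.mean(), Y - Y.mean()
cos_alpha = np.mean(Xc * Yc) / np.sqrt(np.mean(Xc**2) * np.mean(Yc**2))
print(np.isclose(cos_alpha, np.corrcoef(X, Y)[0, 1]))   # rho(X, Y) = cos(alpha)
```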
   
  Conditional Expectation
  Evaluating the functional dependence between two variables comes down to an interpolation problem with no constraints. Suppose we want to estimate a variable Y with another variable X. A natural approach would be: (1) find the function that best fits Y among all possible functions of X; (2) quantify the quality of the estimate with respect to Y. The notion of conditional expectation provides a straightforward method for such an evaluation, without having to test every possible function of X. If X and Y are not independent, knowing an event X = x should provide some new information about Y. Any event X = x induces a conditional pdf for Y, that is:
  $P(Y = j \mid X = i) = \frac{P(i,j)}{P_x(i)}$   (15)
 
  Then, the corresponding a posteriori expectation of Y is:
  $E(Y \mid X = i) = \sum_j j\,P(Y = j \mid X = i)$   (16)
 
  To any possible realization of X corresponds an a posteriori expectation of Y. Thus, a function of X can be defined, which is the conditional expectation of Y in terms of X:
  $E(Y \mid X)\colon \omega \mapsto E\big(Y \mid X = X(\omega)\big)$   (17)
 
  Notice that E(Y | X) is also a random variable. It is easy to verify that it is an unbiased estimate, i.e.:
  $E\big[E(Y \mid X)\big] = E(Y)$   (18)
 
  The conditional expectation's major interest is that it is the optimal approximator in the sense of the L2 norm. In [14], appendix B, A. Roche et al. show that E(Y | X) is the measurable function of X that has the smallest distance to Y:
  $\big\|Y - E(Y \mid X)\big\| = \min_{\Phi}\,\big\|Y - \Phi(X)\big\|$   (19)
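As an illustration, the sketch below estimates E(Y | X) by binning X (a discretization assumption) and checks both the unbiasedness property (18) and that, for a non-linear relation, the conditional expectation beats the best linear predictor, as (19) implies:

```python
import numpy as np

def cond_expectation(x, y, bins=64):
    """E(Y | X) estimated as the per-bin mean of y, evaluated at every sample."""
    edges = np.linspace(x.min(), x.max(), bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    means = np.array([y[idx == i].mean() if np.any(idx == i) else 0.0
                      for i in range(bins)])
    return means[idx]

rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=100_000)
Y = X**2 + rng.normal(scale=0.1, size=X.size)      # non-linear functional relation

E_Y_X = cond_expectation(X, Y)
print(np.isclose(E_Y_X.mean(), Y.mean()))          # unbiased, equation (18)
a, b = np.polyfit(X, Y, 1)                         # best linear predictor
print(np.mean((Y - E_Y_X)**2) < np.mean((Y - (a * X + b))**2))   # equation (19)
```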
 
   
  Total Variance Theorem
  A geometric interpretation of the conditional expectation is now presented. For this purpose, consider the subspace Lx of all possible functions Φ of X (provided that they remain in L2):
  $L_x = \big\{\, \Phi(X) \mid \Phi\colon \mathbb{R} \to \mathbb{R},\ \Phi(X) \in L^2 \,\big\}$   (20)
 
  Every constant variable is a (constant) function of X, so that:
  $\Delta \subset L_x \subset L^2$   (21)
 
  Figure 3. Geometric interpretation of the conditional expectation. It is the orthogonal projection onto Lx.
   
  As the conditional expectation E(Y | X) minimizes the distance between Y and Lx, E(Y | X) is the orthogonal projection of Y onto Lx; this is due to the Hilbertian structure of L2. This simple geometric property allows us to easily compute the distance between Y and Lx. Indeed, Y - E(Y | X) is orthogonal to any vector of Lx by definition of the orthogonal projection. In particular:
  $\big(Y - E(Y \mid X)\big) \perp \big(E(Y \mid X) - E(Y)\big)$   (22)
 
  Therefore, the triangle whose vertices are Y, E(Y) and E(Y | X) is right-angled at E(Y | X). Applying the Pythagorean theorem, we retrieve a result known as the total variance theorem:
  $\|Y - E(Y)\|^2 = \big\|Y - E(Y \mid X)\big\|^2 + \big\|E(Y \mid X) - E(Y)\big\|^2$   (23)
 
  since E[E(Y | X)] = E(Y), and since
  $\big\|Y - E(Y \mid X)\big\|^2 = E_x\big[\mathrm{Var}(Y \mid X = x)\big]$   (24)
 
  (so that $\|E(Y \mid X) - E(Y)\|^2 = \mathrm{Var}[E(Y \mid X)]$), equation (23) can be rewritten as:

  $\mathrm{Var}(Y) = \mathrm{Var}\big[E(Y \mid X)\big] + E_x\big[\mathrm{Var}(Y \mid X = x)\big]$   (25)
 
  where Ex is the operator defined by:
  $E_x[\phi] = \sum_i P_x(i)\,\phi(i)$   (26)
 
  This may be seen as an energy conservation equation. The variance of Y is decomposed as the sum of two "energy" terms (a numerical check of this decomposition is sketched after the list):
   
  1. Var[E(Y | X)], the variance of the conditional expectation E(Y | X). It measures the part of Y which is predicted by X.
   
  2. Conversely, the term Ex[Var(Y | X = x)], called the conditional variance, represents the squared distance of Y to the space Lx. It measures the part of Y which is functionally independent of X.
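A minimal numerical check of decomposition (25), using discrete intensity classes for X (the sinusoidal relation is an arbitrary test case):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.integers(0, 8, size=200_000)                       # 8 intensity classes
Y = np.sin(X.astype(float)) + rng.normal(scale=0.3, size=X.size)

classes = np.unique(X)
p = np.array([np.mean(X == i) for i in classes])           # Px(i)
m_i = np.array([Y[X == i].mean() for i in classes])        # E(Y | X = i)
v_i = np.array([Y[X == i].var() for i in classes])         # Var(Y | X = i)

explained = np.sum(p * (m_i - Y.mean())**2)                # Var[E(Y | X)]
residual = np.sum(p * v_i)                                 # Ex[Var(Y | X = x)]
print(np.isclose(Y.var(), explained + residual))           # equation (25)
```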
   
  Design of the Correlation Ratio
  Based on the previous analysis, a measure of functional dependence between X and Y can be designed. Given the interpretation of the total variance theorem in terms of energy, it seems natural to compare the "explained" energy of Y with its total energy. This leads to the definition of the correlation ratio between X and Y:
  $\eta(Y \mid X) = \frac{\mathrm{Var}\big[E(Y \mid X)\big]}{\mathrm{Var}(Y)}$   (27)
 
  The correlation ratio also has a simple geometric interpretation. Let θ denote the angle between Y - E(Y) and the space Lx. By definition, θ is also the angle between Y - E(Y) and E(Y | X) - E(Y) (Fig. 3). Then we have:
  $\eta(Y \mid X) = \cos^2\theta$   (28)
 
  Unlike the correlation coefficient, which measures the linear dependence between two variables, the correlation ratio measures their functional dependence. It takes values between 0 and 1; a value near 1 indicates a high functional dependence, while a value near 0 indicates a low one. The two extreme cases are:

  $\eta(Y \mid X) = 1 \iff \exists\,\Phi,\ Y = \Phi(X) \qquad\quad \eta(Y \mid X) = 0 \iff E(Y \mid X) = E(Y)$   (29)
 
  By nature, the correlation ratio is asymmetric, since the two variables fundamentally do not play the same role in the functional relationship. Thus, in general:

  $\eta(Y \mid X) \neq \eta(X \mid Y)$
  Properties of the correlation ratio:

  $0 \le \eta(Y \mid X) \le 1, \qquad \rho^2(X, Y) \le \eta(Y \mid X)$

  The last relation holds with equality if and only if E(Y | X) is linear with respect to X, i.e. if:

  $E(Y \mid X) = aX + b$
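These properties can be checked numerically. With Y = X² on a symmetric interval (an arbitrary example), ρ² is near zero while η(Y | X) is near one, and swapping the arguments exposes the asymmetry:

```python
import numpy as np

def corr_ratio(x, y, bins=64):
    """eta(Y | X) = Var[E(Y | X)] / Var(Y), estimated from binned samples."""
    edges = np.linspace(x.min(), x.max(), bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    occupied = [i for i in range(bins) if np.any(idx == i)]
    p = np.array([np.mean(idx == i) for i in occupied])
    m = np.array([y[idx == i].mean() for i in occupied])
    return np.sum(p * (m - y.mean())**2) / y.var()

rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=100_000)
Y = X**2 + rng.normal(scale=0.05, size=X.size)

rho2 = np.corrcoef(X, Y)[0, 1]**2
print(rho2 <= corr_ratio(X, Y))            # rho^2 <= eta(Y | X)
print(corr_ratio(X, Y), corr_ratio(Y, X))  # asymmetric: ~1 vs noticeably smaller
```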
 
   
  Computation of the Correlation Ratio
  In order to compute η(Y_T | X) for a given transformation T, it is practical to use the equivalent form:

  $1 - \eta(Y_T \mid X) = \frac{E_x\big[\mathrm{Var}(Y_T \mid X = x)\big]}{\mathrm{Var}(Y_T)}$

  Thus, having defined the joint pdf of X and Y_T, we compute $1 - \eta(Y_T \mid X)$ using:

  $1 - \eta(Y_T \mid X) = \frac{1}{\sigma^2} \sum_i P_{T,x}(i)\,\sigma_i^2$

  with:

  $m_i = \frac{1}{P_{T,x}(i)} \sum_j j\,P_T(i,j), \qquad \sigma_i^2 = \frac{1}{P_{T,x}(i)} \sum_j j^2\,P_T(i,j) - m_i^2$

  $m = \sum_j j\,P_{T,y}(j), \qquad \sigma^2 = \sum_j j^2\,P_{T,y}(j) - m^2$
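A direct transcription of these formulas, assuming the joint pdf has already been estimated (for instance with the joint_pdf sketch above) and taking intensity values equal to bin indices:

```python
import numpy as np

def one_minus_eta(p_xy):
    """1 - eta(Y_T | X) computed directly from the joint pdf P_T(i, j)."""
    j = np.arange(p_xy.shape[1], dtype=float)    # intensity values j of Y_T
    p_x = p_xy.sum(axis=1)                       # P_Tx(i)
    p_y = p_xy.sum(axis=0)                       # P_Ty(j)
    m = np.sum(j * p_y)                          # m = E(Y_T)
    sigma2 = np.sum(j**2 * p_y) - m**2           # sigma^2 = Var(Y_T)
    ok = p_x > 0                                 # skip empty intensity classes
    m_i = (p_xy[ok] @ j) / p_x[ok]               # m_i, conditional means
    s2_i = (p_xy[ok] @ j**2) / p_x[ok] - m_i**2  # sigma_i^2, conditional variances
    return float(np.sum(p_x[ok] * s2_i) / sigma2)
```

Maximizing η(Y_T | X) over T is then equivalent to minimizing this quantity, which is what the GA's fitness evaluation does.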
 
   
   
 
     