
EM and ML Source Location Estimation

For the MAP estimator, we must find $\theta$ to maximize $f( \theta \vert{\bf X})$. Using Bayes rule:
 
\begin{eqnarray*}
f( \theta \vert{\bf X}) & = & \frac {f({\bf X}\vert \theta ) \, f( \theta ) }{f({\bf X})} \\
 & = & K_4 \, f({\bf X}\vert \theta ) \, f( \theta )
 \; \doteq \; f({\bf X}\vert \theta ) \, f( \theta ) \; \; ,
\end{eqnarray*} (4)

where $K_4$ is a constant which accounts for the normalizing effect of $f({\bf X})$. If $f( \theta )$ is unknown or assumed to be uniform, then the ML estimator, which maximizes the likelihood function $f({\bf X}\vert \theta )$, can be used instead. All of the EM algorithms derived in the subsequent sections produce iterative solutions to this ML problem of maximizing $f({\bf X}\vert \theta )$. If $f( \theta )$ is known and non-uniform, it can easily be incorporated into the EM algorithms as an additive term in the cost function, producing a MAP estimator. The dependence of $f({\bf X}\vert \theta )$ on the random channel coefficients is:

\begin{displaymath}
f({\bf X}\vert \theta ) = E[\, f({\bf X}\vert{\bf S}, \theta ) \,]
 = \int_{\bf S} f({\bf X}\vert{\bf S}, \theta ) \, f({\bf S}) \, d{\bf S} \; ,
\end{displaymath} (5)

where $f({\bf X}\vert{\bf S}, \theta )$ is given by (3). Evaluating this multidimensional integral and then maximizing the result with respect to $\theta$ seems to be an intractable problem in general. Yet this is exactly the problem which the EM algorithms can solve iteratively. To maximize (5) using EM, define the auxiliary function [5]:
 
\begin{eqnarray*}
Q( \theta \vert \phi ) & = & E[\, \log ( f({\bf X},{\bf S}\vert \theta ) ) ~ \vert ~ {\bf X}, \phi \,] \\
 & = & \int_{\bf S} \log ( f({\bf X},{\bf S}\vert \theta ) ) \,
 f({\bf S}\vert{\bf X}, \phi ) \, d{\bf S} \; ,
\end{eqnarray*} (6)

where we desire the ML estimate of $\theta$, and $\phi$ represents the current estimate of $\theta$. In the terminology of EM, ${\bf X}$ is the observed data, ${\bf S}$ is the hidden data, and $({\bf X},{\bf S})$ is the complete data.
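A standard property of the auxiliary function (see, e.g., [5]) is that

\begin{displaymath}
Q( \theta \vert \phi ) \geq Q( \phi \vert \phi )
\;\;\; \Longrightarrow \;\;\;
f({\bf X}\vert \theta ) \geq f({\bf X}\vert \phi ) \; ,
\end{displaymath}

so any choice of $\theta$ that does not decrease $Q( \theta \vert \phi )$ also does not decrease the likelihood; this is what makes the iteration below produce a non-decreasing sequence of likelihood values.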
The general EM algorithm for this problem is (a sketch of this loop in code follows the list):

1. Initialize $\phi$ to an estimate of $\theta$.
2. E-step (Expectation): construct $Q( \theta \vert \phi )$.
3. M-step (Maximization): find $\theta$ to maximize $Q( \theta \vert \phi )$.
4. Set $\phi = \theta$ and repeat steps 2 and 3 until converged.
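As a minimal sketch of this loop only (not the specific estimators derived later), the following code assumes hypothetical e_step and m_step callables supplied by the user and a simple convergence test on the change in the location estimate; all names and the stopping rule are illustrative assumptions.

```python
import numpy as np

def em_localize(X, theta_init, e_step, m_step, tol=1e-6, max_iter=200):
    """Generic EM iteration for ML source localization (illustrative sketch).

    e_step(X, phi)   -> statistics defining Q(theta | phi)   (hypothetical)
    m_step(X, stats) -> theta maximizing Q(theta | phi)      (hypothetical)
    """
    phi = np.asarray(theta_init, dtype=float)      # step 1: initialize phi
    for _ in range(max_iter):
        stats = e_step(X, phi)                     # step 2: E-step
        theta = m_step(X, stats)                   # step 3: M-step
        if np.linalg.norm(theta - phi) < tol:      # step 4: converged?
            return theta
        phi = theta                                # set phi = theta and repeat
    return phi
```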


In general, to construct $Q( \theta \vert \phi )$, start with $f({\bf X},{\bf S}\vert \theta )$:

\begin{displaymath}
f({\bf X},{\bf S}\vert \theta ) =
 f({\bf X}\vert{\bf S}, \theta ) \, f({\bf S}\vert \theta ) \; .
\end{displaymath} (7)

We can drop the $f({\bf S}\vert \theta )$ term here, since ${\bf S}$ and $\theta$ are independent: $f({\bf S}\vert \theta )$ is not a function of $\theta$ and therefore will not affect the maximization in the M-step. $Q( \theta \vert \phi )$ from (6) may then be written as:
 
\begin{eqnarray*}
Q( \theta \vert \phi ) & \doteq & E[\, \log ( f({\bf X}\vert{\bf S}, \theta ) ) ~ \vert ~ {\bf X}, \phi \,] \\
 & = & \int_{\bf S} \log ( f({\bf X}\vert{\bf S}, \theta ) ) \,
 f({\bf S}\vert{\bf X}, \phi ) \, d{\bf S} \; .
\end{eqnarray*} (8)

To evaluate $Q( \theta \vert \phi )$, we need $f({\bf S}\vert{\bf X}, \phi )$, which can be written in terms of the a priori channel coefficient density function $f({\bf S})$ as:

\begin{displaymath}
f({\bf S}\vert{\bf X}, \phi ) =
 \frac {f({\bf X}\vert{\bf S}, \phi ) \, f({\bf S}\vert \phi )}
 {f({\bf X}\vert \phi )} \; ,
\end{displaymath} (9)

where the denominator term $f({\bf X}\vert \phi )$ can be treated as a constant, since it simply normalizes the distribution and is not a function of ${\bf S}$ or $\theta$, so it will not affect the maximization step. Also, $f({\bf S}\vert \phi ) = f({\bf S})$ since the signal amplitudes and source locations are independent. But note that, given ${\bf X}$ and $\phi$, $f({\bf S}\vert{\bf X}, \phi ) \neq f({\bf S})$. Therefore, for $f({\bf S}\vert{\bf X}, \phi )$ in (8) we may use:

\begin{displaymath}
f({\bf S}\vert{\bf X}, \phi ) \doteq
 f({\bf X}\vert{\bf S}, \phi ) \, f({\bf S}) \; .
\end{displaymath} (10)

In the specific cases considered subsequently, we will show that the E-step reduces to computing sufficient statistics of $Q( \theta \vert \phi )$, which are the mean and covariance of ${\bf S}$ conditioned on ${\bf X}$ and $\phi$. This enables the use of an ML-like estimator of $\theta$ in the M-step, which employs the sufficient statistics provided in the E-step. The E-step and M-step formulas will be derived in detail in subsequent sections for various specific signal amplitude distributions.
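To make the role of these statistics concrete, here is a minimal sketch of such an E-step under an assumed linear Gaussian model, ${\bf X} = A(\phi)\,{\bf S} + {\bf W}$ with ${\bf S} \sim N(0, C_s)$ and white noise of variance $\sigma^2$; this model, and every name in the code, is an illustrative assumption rather than one of the specific signal amplitude distributions treated in the following sections.

```python
import numpy as np

def e_step_gaussian(X, A_phi, Cs, sigma2):
    """Conditional mean and covariance of S given X under the current
    location estimate phi, for the assumed model X = A(phi) S + W."""
    M = A_phi.shape[0]
    # Covariance of the observations implied by the assumed model.
    Cx = A_phi @ Cs @ A_phi.conj().T + sigma2 * np.eye(M)
    # MMSE (Wiener) gain mapping X to the conditional mean of S.
    G = Cs @ A_phi.conj().T @ np.linalg.inv(Cx)
    s_mean = G @ X                   # E[ S | X, phi ]
    s_cov  = Cs - G @ A_phi @ Cs     # Cov[ S | X, phi ]
    return s_mean, s_cov
```

Under that same assumption, the M-step would choose $\theta$ to minimize $\Vert {\bf X} - A(\theta)\hat{\bf s} \Vert^2 + \mathrm{tr}( A(\theta) \hat{C} A(\theta)^H )$, where $\hat{\bf s}$ and $\hat{C}$ are the two statistics returned above.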