Note: this is an old internal report which is being superseded by various journal and conference publications in progress. ______________________________________________________________________________ EM ISI with time-varying channel coefficients R. Perry, 28 May 1999 ______________________________________________________________________________ 0. Introduction --------------- Following the report "EM ISI Examples", EM algorithms are derived for the case of time-varying channel coefficients. First the most general case is considered. Then three specific cases of channel coefficient distributions are considered: independent, independent Gaussian, and first-order Gauss-Markov. 1. General EM Setup ------------------- The received data at time k is: (1) r(k) = A(k)'H(k) + n(k), k=1..N A(k) is a column vector containing transmitted data a(k-i+1), i=1..L, and a(t) == 0 for t <= 0. H(k) is a column vector containing the channel impulse response coefficients (which are unknown) at time k. n(k) is the white Gaussian noise at time k with variance s. Letting H represent {H(1)..H(N)} and A represent {A(1)..A(N)}, we have: (2) f(r|H,A) = (1/((s 2pi)^(N/2))) product k=1..N { exp(-(r(k)-A(k)'H(k))^2/(2s)) } For the EM algorithm, define Q(A|B): (3) Q(A|B) = E[ -log f(r,H|A) | r,B] where we desire the ML estimate of A, B is the current estimate of A, r is the observed data, H is the hidden data, and (r,H) is the complete data. f(r,H|A) can be written as: f(r,H|A) = f(r|H,A) f(H|A) We can drop the f(H|A) term from this since H and A are independent so f(H|A) is not a function of A and will not affect the minimization in the M-step. Using f(r|H,A) from (2), and dropping constants which will not affect the minimization: (4) Q(A|B) = E[ sum k=1..N { (r(k)-A(k)'H(k))^2 } | r,B] = integral over H { dH f(H|r,B) sum k=1..N { (r(k)-A(k)'H(k))^2 } } Given an apriori density function f(H) for the channel coefficients, f(H|r,B) can be written as: (5) f(H|r,B) = f(r|H,B) f(H|B) / f(r|B) where the denominator term f(r|B) can be ignored since it simply serves to normalize the distribution, and is not a function of A, so will not affect the minimization step. Also, f(H|B) = f(H) since H and B are assumed to be independent. Using (2) for f(r|H,B), and combining constants into a normalizing factor K, (5) becomes: (6) f(H|r,B) = K f(H) product k=1..N { exp(-(r(k)-B(k)'H(k))^2/(2s)) } and K can be determined by setting the integral of f(H|r,B) to 1. 2. Independent Channel Coefficients ----------------------------------- If the elements of H are independent over time, then f(H) can be written as: (7) f(H) = product k=1..N { f(H(k)) } and f(H|r,B) becomes: (8) f(H|r,B) = K product k=1..N { f(H(k)|r,B) } with: (9) f(H(k)|r,B) = f(H(k)) exp(-(r(k)-B(k)'H(k))^2/(2s)) Using (8) and (9) in (4) we obtain: (10) Q(A|B) = E[ sum k=1..N { (r(k)-A(k)'H(k))^2 } | r,B] = sum k=1..N { E[ (r(k)-A(k)'H(k))^2 | r,B] } = sum k=1..N { integral over H(k) { dH(k) f(H(k)) exp(-(r(k)-B(k)'H(k))^2/(2s)) (r(k)-A(k)'H(k))^2 } } Define g(k) and G(K): (11) g(k) == E[H(k)|r,B] (12) G(k) == E[(H(k)-g(k))(H(k)-g(k))'|r,B] Following the derivation of (11) in "EM ISI Examples", the integral in (10) may be written as: (13) V(k) = (r(k) - A(k)'g(k))^2 + A(k)'G(k)A(k) and represents the Viterbi incremental cost function at time k. 3. Independent Gaussian Channel Coefficients -------------------------------------------- If f(H(k)) is Gaussian with mean d and covariance R, then similar to the derivation of (22)-(24) in "EM ISI Examples", we obtain: (14) f(H(k)|r,B) = K4 exp(-(H(k)-g(k))'inv(G(k))(H(k)-g(k))/2) where K4 is a normalizing constant and: (15) G(k) = inv(B(k)B(k)'/s+inv(R)) (16) g(k) = G(k) (B(k)r(k)/s+inv(R)d) So in this case, the distribution of H(k) given r and B is also Gaussian, and the EM algorithm is: 1. Initialize B to some estimate of A by initially estimating g(k) and G(k), k=1..N, then using the Viterbi algorithm to estimate A, then setting B=A. 2. E-step: use (15) and (16) to estimate G(k) and g(k), k=1..N 3. M-step: use the Viterbi algorithm with incremental cost function (13) to estimate A. 4. Set B=A and repeat steps 2 and 3 until converged. Some initial BER simulation results, using 1000 trials: EM algorithm N SNR deterministic h Gaussian h known channel 6 6 0.1785 0.13383 0.12967 10 6 0.1879 0.1234 0.1203 4. Gauss-Markov Channel Coefficients ------------------------------------ Let `a' represent the first-order Markov factor such that a = exp(-wT) where T is the sampling period and w/pi is the Doppler spread. Let the time-varying channel coefficients follow a Gauss-Markov distribution such that, at time k: (17) H(k) = a H(k-1) + v(k) where v(k) is white and Gaussian with mean 0 and autocorrelation I. In this case, f(H(k)|H(k-1)) is Gaussian with mean = a H(k-1) and autocorrelation I. And f(H(k)) is Gaussian with mean 0 and autocorrelation I/(1-a^2). Also, f(H(k-1)|H(k)) is Gaussian with mean = a H(k) and autocorrelation I. Q(A|B) from (4) can be written in terms of f(H(k)|r,B): (18) Q(A|B) = sum k=1..N { E[ (r(k)-A(k)'H(k))^2 | r,B] } = sum k=1..N { integral over H(k) { dH(k) f(H(k)|r,B) (r(k)-A(k)'H(k))^2 } } and the definitions of g(k), G(k), and V(k) from (11-13) still apply. f(H) can be written as: (19) f(H) = f(H(N)|H(N-1)) f(H(N-1)|H(N-2)) ... f(H(2)|H(1)) f(H(1)) so f(H|r,B) from (6) is: (20) f(H|r,B) = K f(H(N)|H(N-1)) f(H(N-1)|H(N-2)) ... f(H(2)|H(1)) f(H(1)) * product k=1..N { exp(-(r(k)-B(k)'H(k))^2/(2s)) } To determine f(H(k)|r,b), and thereby g(k) and G(k), we must integrate (20) over H(i), i=1..N, i!=k. The derivation of this integration result is shown in Appendix A. The result is that f(H(k)|r,B) is Gaussian, with g(k) and G(k) determined by the following algorithm: (21) Initialize: for k=N..2 G(k) = inv(I + B(k)B(k)'/s + a^2(I - G(k+1)) q(k) = r(k)B(k)/s + a G(k+1) q(k+1) (22) Iterate: for k=1..N G(k) = inv((1-a^2)I + B(k)B(k)'/s + a^2 (2I - G(k-1) - G(k+1))) q(k) = r(k)B(k)/s + a (G(k-1)q(k-1) + G(k+1)q(k+1)) g(k) = G(k)q(k) Use G(k) and g(k) in the Viterbi incremental cost function for time k. Update for next iteration: G(k) = inv(I + B(k)B(k)'/s + a^2 (I - G(k-1))) q(k) = r(k)B(k)/s + a G(k-1) q(k-1) with G(0) == G(N+1) == I and q(0) == q(N+1) == 0. It is apparent from the above algorithm that f(H(k)|r,B) at any time k depends on all of the received data and estimated symbols from time 1 to N. Initial simulation results indicate that the Gauss-Markov EM algorithm performs almost as well as the known channel. Appendix A ---------- To derive the formulas for g(k) and G(k) in the case of the Gauss-Markov channel coefficient distribution, first note that (19) can be written in terms of f(H(k)) for any value of k. (19) shows this for k=1. For k=2, we can rewrite f(H(2)|H(1)) as f(H(1)|H(2))f(H(2))/f(H(1)) to obtain: (A1) f(H) = f(H(N)|H(N-1)) ... f(H(3)|H(2)) f(H(2)) f(H(1)|H(2)) For k=3, rewrite f(H(3)|H(2)) as f(H(2)|H(3))f(H(3))/f(H(2)) to obtain: (A2) f(H) = f(H(N)|H(N-1)) ... f(H(4)|H(3)) f(H(3)) f(H(2)|H(3)) f(H(1)|H(2)) In general: (A3) f(H) = f(H(N)|H(N-1)) ... f(H(k+1)|H(k)) f(H(k)) f(H(k-1)|H(k)) f(H(k-2)|H(k-1)) ... f(H(1)|H(2)) To evaluate f(H(k)|r,B), we can use the above form of f(H) in (20) and integrate over H(i), i=1..N, i!=k. Consider two of these integrations, say the integrals over H1 and H2. By examining the form of this result, we will be able to produce the general result. Showing just the terms involving H1: (A4) I1 = integral over H1 { dH1 K1 exp( - (H1 - a H2)'(H1 - a H2)/2 - (r1-B1'H1)^2/(2s)) } where K1 is a constant. Writing the exponential part as exp(-E1/2) we have: (A5) E1 = (H1 - a H2)'(H1 - a H2) + (r1-B1'H1)^2/s = H1'H1 - 2 a H1'H2 + a^2 H2'H2 + r1^2/s - 2 r1 H1'B1/s + H1'B1B1'H1/s = (H1-g1)'inv(G1)(H1-g1) - g1'inv(G1)g1 + a^2 H2'H2 where we have dropped the constant r1^2/s term, and G1 and g1 will be defined below. Since the last two terms in E1 are not functions of H1, they can be factored out of the integral. And the integral over H1 involving the first term from E1 produces a constant which can be ignored. Thus, the result is: (A6) I1 = exp( -(-g1'inv(G1)g1 + a^2 H2'H2) / 2) with: (A7) G1 = inv(I + B1B1'/s) q1 = r1 B1/s g1 = G1 (a H2 + q1) Expanding the first term in I1: (A8) g1'inv(G1)g1 = a^2 H2' G1 H2 + 2 a H2' G1 q1 where we have dropped the constant q1' G1 q1 term. So we now have: (A9) I1 = exp( -(-a^2 H2' G1 H2 - 2 a H2' G1 q1 + a^2 H2'H2) / 2) This result shows how the integral over H1 produces terms involving H2 which must be included when performing the integral over H2. Showing just the terms involving H2 for this next integral: (A10) I2 = integral over H2 { dH2 K2 exp( - (H2 - a H3)'(H2 - a H3)/2 - (r2-B2'H2)^2/(2s) - (-a^2 H2' G1 H2 - 2 a H2' G1 q1 + a^2 H2'H2) / 2) ) } where K2 is a constant. Writing the exponential part as exp(-E2/2) we have: (A11) E2 = (H2 - a H3)'(H2 - a H3) + (r2-B2'H2)^2/s - a^2 H2' G1 H2 - 2 a H2' G1 q1 + a^2 H2'H2 = (H2-g2)'inv(G2)(H2-g2) - g2'inv(G2)g2 + a^2 H3'H3 where we have dropped the constant r2^2/s term, and G2 and g2 will be defined below. Since the last two terms in E2 are not functions of H2, they can be factored out of the integral. And the integral over H2 involving the first term from E2 produces a constant which can be ignored. Thus, the result is: (A12) I2 = exp( -(-g2'inv(G2)g2 + a^2 H3'H3) / 2) with: (A13) G2 = inv(I + B2B2'/s + a^2(I - G1)) q2 = r2 B2/s + a G1 q1 g2 = G2 (a H3 + q2) From the forms of (A6) and (A12) we can easily deduce the general form of the result of integrating over H(i). (A13) indicates the general form of the equations for G(i) and q(i). These equations have more terms than (A7) because in deriving (A7) there were no previous integration results to incorporate. However, if we define G0 = I and q0 = 0, then (A7) may also be written in the same form as (A13). The preceding examples used (A3) to integrate (20) over H1 and H2, and show the recursive forms of the update equations which appear at the end of (22). If we start at the "other end" of (A3), and integrate over H(N) first, we obtain similar recursive equations which appear in (21). Finally, to produce f(H(k)|r,B) after integrating over H(i), i=1..N, i!=k, we have: (A14) f(H(k)|r,B) = Kk exp(-Ek/2) (A15) Ek = (1-a^2)H(k)'H(k) + (r(k) - B(k)'H(k))^2/s + a^2 H(k)'H(k) - a^2 H(k)'G(k+1)H(k) - 2 a H(k)'G(k+1)q(k+1) + a^2 H(k)'H(k) - a^2 H(k)'G(k-1)H(k) - 2 a H(k)'G(k-1)q(k-1) where the second and third lines of (A15) show the cumulative contributions of the integrals over H(i) for i>k and i