TY - JOUR
T1 - Notes on the maximum likelihood estimation of haplotype frequencies
AU - Mano, Shuhei
AU - Yasuda, N.
AU - Katoh, T.
AU - Tounai, K.
AU - Inoko, H.
AU - Imanishi, T.
AU - Tamiya, Gen
AU - Gojobori, T.
PY - 2004/5/1
Y1 - 2004/5/1
N2 - Maximum likelihood estimation (MLE) is one of the most popular ways to estimate the haplotype frequencies of a population from genotype data whose linkage phases are unknown. The MLE is commonly implemented with the Expectation-Maximization (EM) algorithm. The EM algorithm is known to carry the risk that an estimator may converge erroneously to one of the local maxima or saddle points of the likelihood surface, resulting in serious errors in the MLE of haplotype frequencies. In this note, we present, by theoretical treatment, the necessary and sufficient conditions under which local maxima or saddle points appear on the likelihood surface. As a rule of thumb, a sufficient condition for the likelihood surface to be unimodal is that the difference between the coupling and repulsive haplotype frequencies in phase-known individuals is 3/2 times larger than the frequency of phase-ambiguous individuals. Moreover, we present the analytic solution to the biallelic two-locus problem and construct a general algorithm to obtain the global maximum.
AB - Maximum likelihood estimation (MLE) is one of the most popular ways to estimate the haplotype frequencies of a population from genotype data whose linkage phases are unknown. The MLE is commonly implemented with the Expectation-Maximization (EM) algorithm. The EM algorithm is known to carry the risk that an estimator may converge erroneously to one of the local maxima or saddle points of the likelihood surface, resulting in serious errors in the MLE of haplotype frequencies. In this note, we present, by theoretical treatment, the necessary and sufficient conditions under which local maxima or saddle points appear on the likelihood surface. As a rule of thumb, a sufficient condition for the likelihood surface to be unimodal is that the difference between the coupling and repulsive haplotype frequencies in phase-known individuals is 3/2 times larger than the frequency of phase-ambiguous individuals. Moreover, we present the analytic solution to the biallelic two-locus problem and construct a general algorithm to obtain the global maximum.
KW - Analytic solution
KW - EM algorithm
KW - Haplotype frequency estimation
KW - Likelihood surface
KW - Maximum likelihood estimation
UR - http://www.scopus.com/inward/record.url?scp=4444383439&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4444383439&partnerID=8YFLogxK
U2 - 10.1046/j.1529-8817.2003.00088.x
DO - 10.1046/j.1529-8817.2003.00088.x
M3 - Article
C2 - 15180706
AN - SCOPUS:4444383439
VL - 68
SP - 257
EP - 264
JO - Annals of Human Genetics
JF - Annals of Human Genetics
SN - 0003-4800
IS - 3
ER -