Estimating individual admixture from finite reference databases
The concept of individual admixture (IA) assumes that an individual's genome is composed of alleles inherited from K ancestral populations. Each copy of each allele has the same probability q_k of originating from population k. Together with the allele frequencies p in all populations, the admixture proportions q = (q_1, ..., q_K) constitute the admixture model, which is the basis for software such as STRUCTURE and ADMIXTURE. Here, we assume that p is given through a finite reference database and that q is estimated via maximum likelihood. We are primarily interested in efficient estimation of q and in the variance of the estimator that arises from the finiteness of the reference database, i.e. from the variance in p. We provide a central limit theorem for the maximum-likelihood estimator, give simulation results, and discuss applications in forensic genetics.
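To make the estimation step concrete, the following is a minimal sketch of maximum-likelihood estimation of q for a single individual when p is treated as known. It rests on assumptions not stated in the abstract: markers are biallelic, genotypes are coded as the number of reference-allele copies (0, 1 or 2), and the likelihood is maximized with a standard EM iteration. The abstract does not say which optimizer is used, and the name estimate_q and its parameters are hypothetical. The sketch ignores the sampling variance in p that the abstract is concerned with.

```python
def estimate_q(genotypes, freqs, n_iter=500, tol=1e-10, eps=1e-9):
    """EM sketch for the admixture proportions q of one individual.

    genotypes: list of M ints in {0, 1, 2}, reference-allele counts per marker
               (biallelic coding is an assumption of this sketch).
    freqs:     K lists of M floats, reference-allele frequency of each marker
               in each population, treated here as fixed and known.
    """
    K = len(freqs)
    M = len(genotypes)
    # Clip frequencies away from 0 and 1 so the divisions below are defined.
    p = [[min(max(f, eps), 1.0 - eps) for f in row] for row in freqs]
    q = [1.0 / K] * K  # start from equal proportions

    for _ in range(n_iter):
        new_q = [0.0] * K
        for m, g in enumerate(genotypes):
            # Probability that one allele copy at marker m is the reference allele.
            theta = sum(q[k] * p[k][m] for k in range(K))
            for k in range(K):
                # Expected number of the two allele copies that came from population k.
                new_q[k] += g * q[k] * p[k][m] / theta
                new_q[k] += (2 - g) * q[k] * (1.0 - p[k][m]) / (1.0 - theta)
        new_q = [v / (2.0 * M) for v in new_q]
        delta = max(abs(a - b) for a, b in zip(q, new_q))
        q = new_q
        if delta < tol:
            break
    return q


# Toy input (made-up frequencies): 20 markers, two populations.
q_hat = estimate_q([2] * 20, [[0.9] * 20, [0.1] * 20])
print(q_hat)
```

In this toy input the individual is homozygous for the allele that is common in the first population at every marker, so the estimate moves towards assigning nearly all ancestry to that population. Each EM update keeps the entries of q non-negative and summing to one, so no explicit constraint handling is needed.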