Replacing IBS with IBD: The MLS Method
Biostatistics 666, Lecture 15

Previous Lecture
- Analysis of Affected Relative Pairs
- Test for Increased Sharing at Marker
- Expected Amount of IBS Sharing

Previous Lecture: Expected IBS Sharing
- Calculated probability of IBS for each IBD state
- Probability of IBD state depends on relationship
  - Under the null hypothesis of no linkage

$$P(\text{IBS} = i \mid R) = \sum_{j=0}^{2} P(\text{IBS} = i \mid \text{IBD} = j)\, P(\text{IBD} = j \mid R)$$

Shortcomings of IBS Method
- All sharing is weighted equally
  - Sharing a rare allele
  - Sharing a common allele
  - Sharing homozygous genotype
  - Sharing heterozygous genotype
- Inefficient
  - Data contains additional information that is being ignored

Today
- A likelihood-based approach
- Evaluate linkage in fully informative pairs
- An E-M algorithm for practical settings
- MLS method, Risch (1990)

Simple Case
- If IBD could be observed
- Each pair of individuals scored as
  - IBD=0
  - IBD=1
  - IBD=2
- Evaluate likelihood for null and alternative hypothesis

The Model
- Depends on three parameters $z_0$, $z_1$, $z_2$
  - Probability of sharing 0, 1 and 2 alleles IBD
- Under the null, determined by relationship
- Under the alternative, determined by genetic model

Sib Pair Likelihood (Fully Informative Data)

Under the null hypothesis:

$$L = \left(\tfrac{1}{4}\right)^{n_{\text{IBD}0}} \left(\tfrac{1}{2}\right)^{n_{\text{IBD}1}} \left(\tfrac{1}{4}\right)^{n_{\text{IBD}2}}$$

Under the alternative hypothesis:

$$L = \hat{z}_0^{\,n_{\text{IBD}0}}\; \hat{z}_1^{\,n_{\text{IBD}1}}\; \hat{z}_2^{\,n_{\text{IBD}2}}$$

Testing for Linkage
- Evaluate likelihood at null hypothesis
- Evaluate likelihood at MLE
- Compare alternatives using likelihood ratio test

Commonly Used Test Statistics

$$\text{LOD} = \log_{10} \frac{L(\hat{z}_0, \hat{z}_1, \hat{z}_2)}{L(z_0 = \tfrac{1}{4},\, z_1 = \tfrac{1}{2},\, z_2 = \tfrac{1}{4})}$$

$$\chi^2 = 2 \ln \frac{L(\hat{z}_0, \hat{z}_1, \hat{z}_2)}{L(z_0 = \tfrac{1}{4},\, z_1 = \tfrac{1}{2},\, z_2 = \tfrac{1}{4})} = 2 \ln L(\hat{z}_0, \hat{z}_1, \hat{z}_2) - 2 \ln L(z_0 = \tfrac{1}{4},\, z_1 = \tfrac{1}{2},\, z_2 = \tfrac{1}{4})$$

Example
[Figure: two sib-pair pedigrees with marker genotypes, one with IBD=1 and one with IBD=2, each observed 5 times]

Example
- Assume that 10 sib-pairs are examined
  - 5 share 2 alleles IBD
  - 5 share 1 allele IBD
- Calculate likelihood for null
- Calculate MLEs
- Calculate LOD score
- Evaluate LOD for each pair

In real life...
- Markers are only partially informative
- IBD sharing is equivocal
  - Some uncertainty removed by examining relatives
- Need an alternative likelihood
  - Should allow for partially informative data

Desirable Properties
- Also depends on parameters $z_0$, $z_1$, $z_2$
  - Probability of sharing 0, 1 and 2 alleles IBD
- Can incorporate partial information on IBD sharing
- For fully informative data, equivalent to previous likelihood

For A Single Family

$$L_i = \sum_{j=0}^{2} P(\text{IBD} = j \mid \text{ASP})\, P(\text{Genotypes}_i \mid \text{IBD} = j) = \sum_{j=0}^{2} z_j w_{ij}$$

Risch (1990) defines

$$w_{ij} = P(\text{Genotypes}_i \mid \text{IBD} = j)$$

We only need the $w_{ij}$ up to a constant of proportionality within each family.

Likelihood and LOD Score

$$L(z_0, z_1, z_2) = \prod_i \sum_{j=0}^{2} z_j w_{ij}$$

$$\text{LOD} = \log_{10} \prod_i \frac{z_0 w_{i0} + z_1 w_{i1} + z_2 w_{i2}}{\tfrac{1}{4} w_{i0} + \tfrac{1}{2} w_{i1} + \tfrac{1}{4} w_{i2}}$$

The MLS statistic is the LOD evaluated at the MLEs of $z_0$, $z_1$, $z_2$.

Example: Scoring of $w_{ij}$
[Figure: two sib-pair pedigrees with marker genotypes]
In this case, only one of the weights is non-zero for each family.

More interesting examples: $w_{ij}$
[Figure: three sib-pair pedigrees with marker genotypes]
In these cases, multiple weights are non-zero (but equal) for each family.

More interesting examples: $w_{ij}$
[Figure: sib pair, both genotyped 2/2]
In this case, relative weights depend on allele frequency.

How to maximize likelihood?
- If all families are informative
  - Use sample proportions of IBD=0, 1, 2
- If some families are uninformative
  - Use an E-M algorithm
  - At each stage generate complete dataset with fractional counts
  - Iterate until estimates of LOD and z parameters are stable

Assigning Partial Counts in E-M

$$P(\text{IBD} = j \mid \text{Genotypes}_i) = \frac{P(\text{IBD} = j \mid \text{ASP})\, P(\text{Genotypes}_i \mid \text{IBD} = j)}{L_i} = \frac{P(\text{IBD} = j \mid \text{ASP})\, P(\text{Genotypes}_i \mid \text{IBD} = j)}{\sum_{k=0}^{2} P(\text{IBD} = k \mid \text{ASP})\, P(\text{Genotypes}_i \mid \text{IBD} = k)} = \frac{z_j w_{ij}}{\sum_{k=0}^{2} z_k w_{ik}}$$

Example
[Figure: 5 families with sib pair 2/2, 2/2 and IBD unknown (IBD=?); 5 families with IBD=2]
Assume a bi-allelic marker where the two alleles have identical frequencies.
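
The partial-count update above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture or from Risch (1990): the function names (`family_likelihood`, `em_step`, `lod`, `mls`) are hypothetical, and the weights for the equivocal pairs are an assumption. Specifically, it assumes both sibs are 2/2 with untyped parents and allele frequency p = 0.5, so that the weights are proportional to p^4 : p^3 : p^2 = 1 : 2 : 4. Pairs known to share two alleles get weights (0, 0, 1).

```python
import math

# Sib-pair IBD probabilities under the null hypothesis of no linkage.
NULL_Z = (0.25, 0.50, 0.25)


def family_likelihood(z, w):
    """L_i = sum_j z_j * w_ij for one family."""
    return sum(zj * wj for zj, wj in zip(z, w))


def em_step(z, weights):
    """E-step: fractional IBD counts per family. M-step: new proportions."""
    counts = [0.0, 0.0, 0.0]
    for w in weights:
        total = family_likelihood(z, w)
        for j in range(3):
            counts[j] += z[j] * w[j] / total
    return tuple(c / len(weights) for c in counts)


def lod(z, weights):
    """LOD of z against the null values, summed over families."""
    return sum(
        math.log10(family_likelihood(z, w) / family_likelihood(NULL_Z, w))
        for w in weights
    )


def mls(weights, tol=1e-8, max_iter=1000):
    """Iterate E-M from the null values until z is stable; return (z, LOD)."""
    z = NULL_Z
    for _ in range(max_iter):
        new_z = em_step(z, weights)
        converged = max(abs(a - b) for a, b in zip(z, new_z)) < tol
        z = new_z
        if converged:
            break
    return z, lod(z, weights)


# Assumed weights: 5 equivocal pairs (1:2:4) and 5 pairs known to be IBD=2.
weights = [(1, 2, 4)] * 5 + [(0, 0, 1)] * 5
first_update = em_step(NULL_Z, weights)
z_hat, mls_score = mls(weights)
print([round(v, 3) for v in first_update], round(mls_score, 2))
```

Under these assumptions the first update comes out near (0.056, 0.222, 0.722) and the LOD settles near 4.26 as the estimates approach (0, 0, 1). Passing fully informative weights, for example five pairs with (0, 1, 0) and five with (0, 0, 1), reduces the update to the sample proportions of IBD states, which matches the fully informative likelihood.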
Example of E-M Steps

Parameters             Equivocal Families       Other
z0     z1     z2       IBD=0  IBD=1  IBD=2      IBD=2    LOD    LODi   LODu
0.250  0.500  0.250    0.56   2.22   2.22       5        0.00   0.00   0.00
0.056  0.222  0.722    0.08   0.66   4.26       5        3.19   2.30   0.89
0.008  0.066  0.926    0.01   0.17   4.82       5        4.01   2.84   1.16
0.001  0.017  0.982    0.00   0.04   4.96       5        4.20   2.97   1.23
0.000  0.004  0.996    0.00   0.01   4.99       5        4.25   3.00   1.24
0.000  0.001  0.999    0.00   0.00   5.00       5        4.26   3.01   1.25
0.000  0.000  1.000    0.00   0.00   5.00       5        4.26   3.01   1.25

Properties of Pair Analyses Explored by Risch
- Effect of marker informativeness
- Effect of adding relative genotypes
- Size of genetic effect
- Degree of relationship

PIC: Measure of Marker Informativeness
- Probability that alleles of parent can be distinguished in offspring
  - Botstein et al, 1980
  - Markers that could track dominant alleles
- Probability that parent is heterozygous and informative in relation to spouse

PIC: Definition
- In general:

$$\text{PIC} = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2$$

- For $a$ equally frequent alleles:

$$\text{PIC} = \frac{a-1}{a} - \frac{a-1}{a^3}$$

- PIC <= Heterozygosity

Some PICs and Heterozygosities

Alleles   PIC    H
2         0.38   0.50
3         0.59   0.67
4         0.70   0.75
5         0.77   0.80
8         0.86   0.8
10        0.89   0.90
20        0.95   0.95

Marker Informativeness: Proportion of LOD Retained
[Figure: proportion of expected MLS (0.0 to 1.0) against marker informativeness (0.4 to 1.0); curves for sibs, 2nd degree and 3rd degree relatives]

Marker Informativeness: Gene of Modest Effect (λ_O = 3)
[Figure: expected MLS (0.0 to 7.5) against marker informativeness (0.4 to 1.0); curves for sibs, 2nd degree and 3rd degree relatives]

Marker Informativeness: Gene of Larger Effect (λ_O = 10)
[Figure: expected MLS (0 to 12) against marker informativeness (0.4 to 1.0); curves for sibs, 2nd degree and 3rd degree relatives]

Genotypes of Other Family Members
- Expected LOD score decreases
  - by < 33% if only sib-pairs are typed
  - by < 60% for second degree relatives
  - by < 70% for third degree relatives
- Genotyping effort decreases
  - by 50% if only sib-pairs are typed
  - by 60% if only second degree relatives are typed
  - by 75% if only third degree relatives are typed

Quick Comment on Literature
- Greenwood and Schork (2004) suggested that uninformative families could bias MLS
- However, their results use a poor estimate for MLEs
- If an E-M algorithm is used, there is no problem

Today...
- Describe a likelihood model based on IBD sharing for pairs of individuals
- Model accommodates partially informative families
- Maximum LOD score can be calculated using an E-M algorithm

Recommended Reading
- Risch (1990)
  - Linkage Strategies for Genetically Complex Traits. III. The Effect of Marker Polymorphism on Analysis of Affected Relative Pairs
  - Am J Hum Genet 46:242-253
- Introduces MLS method for linkage analysis
  - Still one of the best methods for the analysis of pair data
- Evaluates different sampling strategies
  - Results were later corrected by Risch (1992)

Recommended Reading
- Risch (1992)
  - Corrections to Linkage Strategies for Genetically Complex Traits. III. The Effect of Marker Polymorphism on Analysis of Affected Relative Pairs
  - Am J Hum Genet 51:673-675
- Evaluates utility of parental genotype data

Goncalo Microsoft PowerPoint - 666.15 -- The MLS Method