Importance sampling and the two-locus model with subdivided population structure.

*(English)* Zbl 1144.62092

Summary: The diffusion-generator approximation technique developed by M. De Iorio and R. C. Griffiths [Adv. Appl. Probab. 36, No. 2, 417–433 (2004; Zbl 1045.62111)] is a very useful method of constructing importance-sampling proposal distributions. Being based on general mathematical principles, the method can be applied to various models in population genetics. In this paper we extend the technique to the neutral coalescent model with recombination, thus obtaining novel sampling distributions for the two-locus model. We consider the case with subdivided population structure, as well as the classic case with only a single population. In the latter case we also consider the importance-sampling proposal distributions suggested by P. Fearnhead and P. Donnelly [Genetics 159, 1299–1318 (2001); see also J. R. Stat. Soc., Ser. B 64, No. 4, 657–680 (2002; Zbl 1067.62111)], and show that their two-locus distributions generally differ from ours. In the case of the infinitely-many-alleles model, our approximate sampling distributions are shown to be generally closer to the true distributions than are Fearnhead and Donnelly’s.
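For readers unfamiliar with the role a proposal distribution plays in importance sampling, the following is a minimal generic sketch. It is not the coalescent proposal of the paper or of Fearnhead and Donnelly. It assumes a toy problem chosen only for illustration: estimating the tail probability P(X > t) of a standard normal variable by drawing from a shifted normal proposal N(t, 1) and reweighting each draw by the density ratio. The function names `tail_probability_is` and `tail_probability_exact` are hypothetical.

```python
import math
import random


def tail_probability_is(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Illustrative assumption: the proposal is N(threshold, 1), so draws
    land near the rare region instead of near zero.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal q
        if x > threshold:
            # Importance weight p(x) / q(x) for target p = N(0, 1)
            # and proposal q = N(threshold, 1).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def tail_probability_exact(threshold):
    """Closed-form tail probability of the standard normal."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    estimate = tail_probability_is(4.0, 100_000, seed=1)
    exact = tail_probability_exact(4.0)
    print(f"importance-sampling estimate: {estimate:.3e}")
    print(f"exact value:                  {exact:.3e}")
```

The closer the proposal is to the target restricted to the event of interest, the lower the variance of the weighted average. That is the sense in which a proposal built from approximate sampling distributions that are closer to the true ones is preferable.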

##### MSC:

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

93E25 | Computational methods in stochastic control (MSC2010) |

60G40 | Stopping times; optimal stopping problems; gambling theory |

92D10 | Genetics and epigenetics |

62L15 | Optimal stopping in statistics |

##### Keywords:

coalescent process; recombination; diffusion process; importance sampling; migration; subdivided population

*R. C. Griffiths* et al., Adv. Appl. Probab. 40, No. 2, 473–500 (2008; Zbl 1144.62092)

##### References:

[1] | Bahlo, M. and Griffiths, R. C. (2000). Inference from gene trees in a subdivided population. Theoret. Pop. Biol. 57, 79–95. · Zbl 0984.92020 |

[2] | Beaumont, M. (1999). Detecting population expansion and decline using microsatellites. Genetics 153, 2013–2029. |

[3] | Cornuet, J. M. and Beaumont, M. A. (2007). A note on the accuracy of PAC-likelihood inference with microsatellite data. Theoret. Pop. Biol. 71, 12–19. · Zbl 1173.62333 |

[4] | De Iorio, M. and Griffiths, R. C. (2004a). Importance sampling on coalescent histories. I. Adv. Appl. Prob. 36, 417–433. · Zbl 1045.62111 |

[5] | De Iorio, M. and Griffiths, R. C. (2004b). Importance sampling on coalescent histories. II: subdivided population models. Adv. Appl. Prob. 36, 434–454. · Zbl 1124.62317 |

[6] | Ethier, S. N. and Griffiths, R. C. (1990). On the two-locus sampling distribution. J. Math. Biol. 29, 131–159. · Zbl 0729.92012 |

[7] | Fearnhead, P. and Donnelly, P. (2001). Estimating recombination rates from population genetic data. Genetics 159, 1299–1318. |

[8] | Fearnhead, P. and Smith, N. G. C. (2005). A novel method with improved power to detect recombination hotspots from polymorphism data reveals multiple hotspots in human genes. Amer. J. Human Genetics 77, 781–794. |

[9] | Golding, G. B. (1984). The sampling distribution of linkage disequilibrium. Genetics 108, 257–274. |

[10] | Griffiths, R. C. and Marjoram, P. (1996). Ancestral inference from samples of DNA sequences with recombination. J. Comput. Biol. 3, 479–502. |

[11] | Griffiths, R. C. and Tavaré, S. (1994a). Ancestral inference in population genetics. Statist. Sci. 9, 307–319. · Zbl 0955.62644 |

[12] | Griffiths, R. C. and Tavaré, S. (1994b). Sampling theory for neutral alleles in a varying environment. Proc. R. Soc. London B 344, 403–410. |

[13] | Griffiths, R. C. and Tavaré, S. (1994c). Simulating probability distributions in the coalescent. Theoret. Pop. Biol. 46, 131–159. · Zbl 0807.92015 |

[14] | Hudson, R. R. (2001). Two-locus sampling distributions and their application. Genetics 159, 1805–1817. |

[15] | Kuhner, M. K., Yamato, J. and Felsenstein, J. (1995). Estimating effective population size and mutation rate from sequence data using Metropolis–Hastings sampling. Genetics 140, 1421–1430. |

[16] | Kuhner, M. K., Yamato, J. and Felsenstein, J. (2000). Maximum likelihood estimation of recombination rates from population data. Genetics 156, 1393–1401. |

[17] | Li, N. and Stephens, M. (2003). Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165, 2213–2233. |

[18] | McVean, G., Awadalla, P. and Fearnhead, P. (2002). A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160, 1231–1241. |

[19] | McVean, G. et al. (2004). The fine-scale structure of recombination rate variation in the human genome. Science 304, 581–584. |

[20] | Myers, S. et al. (2005). A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324. |

[21] | Stephens, M. and Donnelly, P. (2000). Inference in molecular population genetics. J. R. Statist. Soc. Ser. B 62, 605–655. · Zbl 0962.62107 |

[22] | Wilson, I. J. and Balding, D. J. (1998). Genealogical inference from microsatellite data. Genetics 150, 499–510. · Zbl 0902.62037 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.