STA 290 Seminar: Julia Palacios

STA 290 Seminar Series

Wednesday, January 20th, 10:30am, MSB 1147 (Colloquium Room)

Speaker: Julia Palacios (Harvard University / Brown University)

Title: “Inference of Population Size Trajectories from Genomic Data”

Abstract: Inferring and interpreting changes in population size as a function of time from present-day samples of genomic data is a core goal of population biology. Modeling the sample’s ancestry and the mutation process allows such inference on time scales ranging from the deep past to the near present. Estimates of population size changes over time shed light on how historical events affect genetic diversity within a species, as well as on the genetic response of a species to large-scale climatic events and other environmental changes. Sophisticated inferential tools coupled with the coalescent model have recently emerged for estimating past population sizes from genomic data. However, recent methods that estimate population size changes over time make restrictive assumptions in order to gain computational tractability. In this talk, I present a Gaussian process-based Bayesian nonparametric approach to estimating population size changes as a function of time, either from a single non-recombining genomic segment or from recombining whole genomes, where sequences are correlated across sites. First, I summarize and discuss current approaches to the estimation of population sizes. Next, I explore the application of two computationally efficient Bayesian frameworks to this problem, one based on Hamiltonian Monte Carlo and the other on integrated nested Laplace approximations. I discuss the advantages and disadvantages of both frameworks and show that the proposed Gaussian process-based method outperforms recently developed methods on simulated data. I show applications of the method to viral sequences of hepatitis C and human influenza A, and to human populations with European ancestry and from Ibadan, Nigeria. Lastly, I discuss future extensions of the Bayesian nonparametric population size reconstruction, including the integration of other sources of information into the coalescent framework.