DSpace Repository

Adjusting for population stratification in longitudinal quantitative trait locus identification


dc.contributor.advisor Finch, Stephen J. en_US
dc.contributor.author Wang, Yifan en_US
dc.contributor.other Department of Applied Mathematics and Statistics en_US
dc.date.accessioned 2013-05-24T16:38:14Z
dc.date.accessioned 2015-04-24T14:45:36Z
dc.date.available 2013-05-24T16:38:14Z
dc.date.available 2015-04-24T14:45:36Z
dc.date.issued 2012-08-01 en_US
dc.identifier.uri http://hdl.handle.net/1951/60212 en_US
dc.identifier.uri http://hdl.handle.net/11401/71024 en_US
dc.description 164 pgs en_US
dc.description.abstract Genome-wide association studies (GWAS) are widely used to detect genotypes associated with complex diseases. GWAS of disease progression over time may be clinically significant. Longitudinal quantitative trait locus (LQTL) methods are used in these studies to simulate disease progression. However, population stratification (PS) can lead to false-positive or false-negative findings in a GWAS. PS is induced by a candidate marker's variation in allele frequency across ancestral populations. One approach to adjusting for PS in GWAS is global principal component analysis (PCA). In this thesis I examine the statistical properties of GWAS analysis procedures using principal component adjustments across the whole genome. I use additive risk allele models to test the association between rare genetic variants and the longitudinal quantitative phenotypes across the whole genome. The genotype data are taken from the HapMap 3 dataset for 1198 unrelated individuals. The simulated quantitative phenotype data are estimated using the Bayesian posterior probabilities (BPPs) that a participant belongs to a clinically important trajectory curve. The PCA method implemented in the EIGENSTRAT program is then used to reduce the data to ten variables containing most of the genetic variability information. The power and rejection rates are evaluated based on 1000 simulated replicates. The association test statistic follows a chi-square distribution with one degree of freedom under the null hypothesis of no association. The p-values of the test of the genotype coefficient, with and without a PC adjustment for PS, are documented.
For each disease gene, I select 25 matching SNPs (those whose allele frequencies are highly correlated with the disease gene's across populations) and 25 non-correlated SNPs (those whose allele frequencies have a low correlation with the disease gene's across populations). All SNPs considered are in overall Hardy-Weinberg equilibrium (HWE). The additive risk allele LQTL models have strong empirical power. The model with global PCA adjustment for PS consistently maintains correct false-positive rates. en_US
dc.description.sponsorship This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree. en_US
dc.format Monograph en_US
dc.format.medium Electronic Resource en_US
dc.language.iso en_US en_US
dc.publisher The Graduate School, Stony Brook University: Stony Brook, NY. en_US
dc.subject.lcsh Statistics en_US
dc.subject.other genome wide association study, longitudinal quantitative trait locus, population stratification, principal component analysis en_US
dc.title Adjusting for population stratification in longitudinal quantitative trait locus identification en_US
dc.type Dissertation en_US
dc.mimetype Application/PDF en_US
dc.contributor.committeemember Mendell, Nancy R. en_US
dc.contributor.committeemember Wu, Song en_US
dc.contributor.committeemember Gordon, Derek en_US

