Ended
Erwin Schrödinger

Analysis of rare variants from sequencing data

Principal Investigator

Name: Christian Fuchsberger
Role: Project lead
Institution: University of Michigan

Grant Details

Approval Date: 25 Nov 2012
Start Date: 15 Mar 2013
End Date: 14 Sept 2014
Approved Amount: € 67,174

Keywords & Classification

Keywords

Next generation sequencing, Rare Variants, Association Testing, Burden Tests, Gene Set Enrichment Analysis

Research Disciplines

Biology

Research Fields

Biology

Project Summary

Many large-scale sequencing studies are now under way, addressing one of the major questions in human genetics: how, and to what extent, can insights into disease etiology be advanced by studying low-frequency variants? The development of analytical tools, however, is barely keeping up with the deluge of human sequencing data. For example, single-SNP disease associations are commonly tested using logistic regression. This approach is powerful for common variants and is therefore broadly used in GWAS, but its power to detect association signals for rare variants is modest. One possibility is to assess the combined effects of specific sets of rare variants, for example all coding variants in a particular gene. These burden tests take into account the overall variant load within specified genomic regions of interest and are therefore better able to detect signals in the presence of multiple rare causal alleles. This is a very active area of research: within the last three years, more than 20 burden tests have been proposed. However, the properties of these tests are still not fully understood, and the comparisons provided in the original publications are often too simplistic or cover only a small range of genetic architectures. Furthermore, the few published method-neutral comparisons have used simulations that do not reflect the properties of real data (e.g. an excess of singletons beyond neutral expectations) or do not cover a wide range of methods. As a result, analysts of sequence data have to make best-guess decisions when choosing a rare-variant analysis method to address a given genetic hypothesis. Aim 1 of this project is to fill this gap by performing an extensive method-neutral evaluation of different burden tests based on realistic sequence data. Our results will guide investigators toward the most powerful approach for identifying rare variants associated with disease.
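To illustrate why pooling rare variants can help, the following is a minimal sketch of one simple collapsing-style burden test. It is not the project's own method or any of the specific tests the summary refers to. It assumes a toy case-control sample, genotypes coded as rare-allele counts per variant, and a one-sided Fisher exact test on carrier status. The function names (`carrier_status`, `fisher_exact_greater`, `burden_test`) and the data are hypothetical.

```python
from math import comb


def carrier_status(genotypes):
    """Collapse each person's rare-allele counts in a region into a 0/1 carrier flag."""
    return [1 if any(g > 0 for g in person) else 0 for person in genotypes]


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a = case carriers, b = case non-carriers,
    c = control carriers, d = control non-carriers.
    Returns P(case carriers >= a) under the hypergeometric null.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_cases)


def burden_test(case_genotypes, control_genotypes):
    """Test whether carriers of any rare allele are enriched among cases."""
    case_flags = carrier_status(case_genotypes)
    control_flags = carrier_status(control_genotypes)
    a = sum(case_flags)
    b = len(case_flags) - a
    c = sum(control_flags)
    d = len(control_flags) - c
    return fisher_exact_greater(a, b, c, d)


# Toy data (made up): rows are individuals, columns are 4 rare variants in one gene.
# Four cases each carry a different singleton; no control carries any variant.
cases = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
controls = [[0, 0, 0, 0] for _ in range(6)]

# Gene-level test: all four variants collapsed into one carrier flag.
p_burden = burden_test(cases, controls)

# Single-variant test: the same procedure restricted to the first variant only.
p_single = burden_test([[p[0]] for p in cases], [[p[0]] for p in controls])

print(f"burden p = {p_burden:.3f}")          # burden p = 0.030
print(f"single-variant p = {p_single:.3f}")  # single-variant p = 0.500
```

In this toy sample each singleton on its own gives no evidence of association, while the collapsed carrier count does. Real analyses would also have to handle covariates, variant weighting, and variants with opposite effect directions, none of which this sketch covers.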
One interesting feature of burden tests is the integration of functional information at the gene or locus level. A logical next step in mining genome-wide sequence data is to analyze them at the gene-set or even the pathway level. For common variants, gene set enrichment analysis (GSEA) is broadly used to test whether pathways are enriched. Aim 2 of the proposed project is to extend GSEA to take full advantage of sequence-specific properties, such as the extensive ascertainment of rare variants, and to compare its power with that of the extended burden test approach outlined above. In Aim 3, we will further extend the method proposed in Aim 2 by taking into account the a priori known relationships between genes and variants. Completion of these three aims will result in research tools of high strategic value and impact, and will enhance the value of many ongoing and future large-scale sequencing experiments.

Research Outputs (0)

No outputs linked

Further Funding (0)