April 9, 2018

How to get from variants to genes?

Genome-wide association studies (GWAS) are great at identifying genetic variants associated with diseases and other traits of interest. However, for most of these variants there is neither a clear candidate gene nor an alternative mechanistic explanation for how they exert their effect.

Some thoughts on this:
  1. The vast majority of significant haplotypes (blocks of variants) reported by a given GWAS are causative for the trait of interest, assuming that the GWAS followed best practices
  2. Only a small proportion of those variants are coding or otherwise likely to affect protein function
  3. The proportion of coding variants decreases even further after fine-mapping to exclude variants that are not likely to be causative
  4. Some of the remaining noncoding variants act by changing gene expression, i.e. they're eQTLs
  5. Since many eQTLs are cell-type and condition specific, and since data is not available for all cell types and conditions, it's unclear to what proportion of GWAS variants this applies
  6. There is a lack of understanding of how noncoding non-eQTL GWAS variants act mechanistically
  7. Some variants may not act directly through protein-coding genes at all. Instead, they may act through noncoding RNAs (e.g. lncRNAs) or some other unknown mechanism
  8. Software tools for identifying causative genes from noncoding non-eQTL GWAS hits have been proposed. Here's one
  9. Experimental follow-up for those variants is hardly ever (never?) done, making it uncertain how well these tools work
  10. A lot of people are putting a lot of thought into how to approach this problem, and I expect some best practices to crystallize in the next few years
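
To make the funnel in points 2 to 6 a bit more concrete, here is a minimal Python sketch of how one might bin fine-mapped GWAS variants into "coding", "noncoding eQTL" and "everything else". It is only an illustration under stated assumptions: the `Variant` class, its fields, the variant IDs and the gene name are all hypothetical toy values, not the output of any particular annotation tool or study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical annotations; a real pipeline would get these from a
    # variant annotator and from eQTL catalogues.
    variant_id: str
    is_coding: bool = False
    eqtl_genes: frozenset = frozenset()  # genes whose expression the variant is associated with


def classify(variant):
    """Assign a variant to one of three coarse mechanistic bins."""
    if variant.is_coding:
        return "coding"
    if variant.eqtl_genes:
        return "noncoding eQTL"
    # Absence of an eQTL annotation is not evidence of absence: the relevant
    # cell type or condition may simply not have been assayed (point 5).
    return "noncoding, no known eQTL"


def summarize(variants):
    """Return the proportion of variants falling into each bin."""
    counts = Counter(classify(v) for v in variants)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# Toy input, invented purely for illustration.
toy_variants = [
    Variant("var1", is_coding=True),
    Variant("var2", eqtl_genes=frozenset({"GENE_A"})),
    Variant("var3"),
    Variant("var4"),
]

proportions = summarize(toy_variants)
for category, share in proportions.items():
    print(f"{category}: {share:.0%}")
```

The last bin is the one points 6 to 9 are about: it mixes variants that really are not eQTLs with variants that would turn out to be eQTLs if the right cell type or condition had been measured, so its size depends heavily on which eQTL data you feed in.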
