Nature paper by Pritchard lab show how new molecular QTLs can help us understand…
A Nature paper from the Pritchard lab shows how new molecular QTLs can help us understand the influence of non-coding genetic variation: using DNase I sequencing data for 70 YRI cell lines, they identify 8902 dsQTLs.
13 Replies to “Nature paper by Pritchard lab show how new molecular QTLs can help us understand…”
Here's a quick question: How can these new dsQTL data be integrated into the genome?
What do you mean by "integrated"? These chromatin phenotypes are, at least that's how I look at them, a level higher than the genome. So these dsQTLs show that some of the genetic variation in non-coding regions of the genome is involved in open/closed chromatin. Would you agree with that +Albert Vilella?
Would be good to get insight from +Daniel Gaffney or +Joseph Pickrell on how far-reaching this conclusion is. They present the argument, based on power etc., that roughly p(variant affects chromatin | variant affects expression) = 0.5, and p(variant affects expression | variant affects chromatin) = 0.3. The former is nicely interpretable as enhancer accessibility affecting expression. The latter I find less straightforward, as the connection depends on the location of the variant. It's not like a third of all noncoding variants affecting DNase sensitivity affect expression; the variant must be in a regulatory region first. But in any case, great work leading towards a mechanistic understanding of gene expression variation in humans.
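As a rough illustration of how the two quoted conditional probabilities relate, here is a minimal Python sketch. It assumes both estimates refer to the same set of tested variants (which may not hold exactly, given the distance and power caveats), and the function name is hypothetical. By Bayes' rule, P(E|C) P(C) = P(C|E) P(E), so the two numbers fix the ratio P(C)/P(E).

```python
# Rough estimates quoted in the comment above; not recomputed from the paper's data.
p_chromatin_given_expression = 0.5  # p(variant affects chromatin | variant affects expression)
p_expression_given_chromatin = 0.3  # p(variant affects expression | variant affects chromatin)


def implied_prevalence_ratio(p_c_given_e, p_e_given_c):
    """Return P(C) / P(E) implied by Bayes' rule: P(E|C) * P(C) = P(C|E) * P(E).

    Assumption: both conditionals are defined over the same set of variants.
    """
    return p_c_given_e / p_e_given_c


ratio = implied_prevalence_ratio(
    p_chromatin_given_expression, p_expression_given_chromatin
)
print(f"Implied P(affects chromatin) / P(affects expression) = {ratio:.2f}")
```

Under that assumption, variants affecting chromatin would be about 1.67 times as common as variants affecting expression, which is one way to read the asymmetry between the two numbers.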
Good question; I defer to +Jacob Degner +Roger PR and +Athma Pai.
Yes, good question. I would say that this 2nd result stems from the fact that DNase sensitivity is a good indicator of the position of regulatory regions and may actually be caused by TF binding. Thus, when we detect a change in DNase sensitivity, we are often detecting a change in TF binding. Then, I think it makes sense that at least 1/3 of these would have an effect on transcription.
To follow up on Jack's answer, I would just add a couple of comments in case you are interested in more details. The observation that genetic variants in DNase hypersensitive regions are more likely to be eQTLs is explored in more detail in a recent paper from our group by Daniel Gaffney (http://www.ncbi.nlm.nih.gov/pubmed/22293038). The 0.3 is an estimate across all dsQTLs found within 100kb of a gene transcription start site, taking into account that we have a limited sample size. I think a more interpretable analysis can be found in the last part of our paper, which deals with the additional properties (location is one of them) that a variant affecting chromatin may need in order to affect gene expression (see Figure 4 of the paper). In addition to location, notice that CTCF also has an important role in determining whether a dsQTL will affect gene expression.
From Fig. 4, panel A, we can probably guess that most of the 0.3 comes from dsQTLs that are close to a gene TSS. This seems to be a general property of eQTLs – see Fig 2, 3 in http://www.ncbi.nlm.nih.gov/pubmed/18846210
The CTCF result is very satisfying. +Zhihao Ding in Richard Durbin's group is working on TF binding QTLs for CTCF in LCLs (CEU I think – Zhihao?). They have recently begun to detect CTCF QTLs (something like 400 or so but this was a first analysis). It would be really nice if we could overlap this with the dsQTLs e.g. search for the following scenario:
dsQTL – CTCF-QTL – promoter
We could then test whether the effect of the dsQTL on gene expression depends on the individual's genotype at the CTCF-QTL.
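A minimal sketch of what such a test could look like, using only simulated data: it estimates the dsQTL effect on expression (a least-squares slope) separately within each CTCF-QTL genotype class. The effect sizes, sample size, and function names are all assumptions made for illustration; nothing here comes from the paper or from the CTCF data mentioned above.

```python
import random


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def dsqtl_slope_by_ctcf_genotype(g_ds, g_ctcf, expr):
    """Estimate the dsQTL effect on expression within each CTCF-QTL genotype class."""
    slopes = {}
    for c in sorted(set(g_ctcf)):
        xs = [d for d, cc in zip(g_ds, g_ctcf) if cc == c]
        ys = [e for e, cc in zip(expr, g_ctcf) if cc == c]
        slopes[c] = slope(xs, ys)
    return slopes


# Simulated data (assumption): 63 individuals in a balanced design, genotypes
# coded as 0/1/2 allele counts, and a dsQTL effect that shrinks with each
# additional CTCF-QTL allele (0.5, 0.3, 0.1).
rng = random.Random(0)
g_ds, g_ctcf, expr = [], [], []
for _ in range(7):
    for d in (0, 1, 2):
        for c in (0, 1, 2):
            g_ds.append(d)
            g_ctcf.append(c)
            expr.append(1.0 + (0.5 - 0.2 * c) * d + rng.gauss(0, 0.1))

slopes = dsqtl_slope_by_ctcf_genotype(g_ds, g_ctcf, expr)
for c, s in slopes.items():
    print(f"CTCF-QTL genotype {c}: dsQTL effect on expression = {s:.2f}")
```

A formal version would more likely fit a single regression with a genotype-by-genotype interaction term and test that coefficient; the stratified slopes are just the simplest way to see whether the effect changes across CTCF-QTL genotypes.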
+Daniel Gaffney indeed, +Zhihao Ding is working on 60 CEU LCL lines (the same ones for which I have FAIRE chip data and for which Dermitzakis has published RNA-seq eQTLs). It's quite nice that we will soon have all this data side by side and can look at the overlap.
Yes, I've been working on it ! 😉
Commenting on +Sander Timmer's comment, here is what I think about dsQTLs: John Stam showed this plot at one of the epigenetics meetings in Hinxton last fall. It was about the distance of GWAS hits to dsQTL-like regions. I say dsQTL-like because this was as defined by their method, and I can't remember all the details right now. The plot was impressive: almost all GWAS hits that you can't tag to a nearby gene have "something to do" with relevant DNase I regions. I didn't know how important dsQTLs were before that. And then I saw his plot. Now I'm a believer. Not a trace. Of doubt in my mind. (Smash Mouth)
Noncoding QTLs can only tag a rare coding variant (unlikely) or affect transcript levels themselves, so it makes sense that GWAS hits are near dsQTLs.
We can build better models of mRNA expression variation by combining TF sequence specificities (lots of good in vitro work on this coming out soon), TF occupancies (ENCODE), nucleosome positioning, DNase accessibility etc. The goal of these models is to fill the gap between genotype and mRNA. However, we can already measure mRNA quite well, and there are models (e.g. Alexandra Nica's) to test whether a GWAS QTL and an eQTL are shared. What I'd like to know is whether people are already doing quantitative proteomics on the HapMap LCLs to understand the next level of effects of genetic variation!
Looking forward to discussing these topics with people at Biology of Genomes this year!
Leo – you should check out Ron Hause's recent work in Rich Jones' lab in Chicago. They've developed a micro-western assay to look at protein expression levels of 100s of TFs in YRI LCLs. He presented this at last year's ASHG:
http://www.ichg2011.org/cgi-bin/showdetail.pl?absno=11653
+Daniel Gaffney — is this published? hause r [au] gives me one entry: http://www.ncbi.nlm.nih.gov/pubmed?term=21999828