The distribution of fitness effects of new mutations in regulatory regions of the D. melanogaster genome
Daigle, A.; Marsh, J.; Kay, A.; Johri, P.
Abstract: Although non-coding regions play important roles in gene regulation and contribute to individual fitness, the precise distribution of fitness effects (DFE) of new mutations in these regions remains poorly understood. Here, we carefully compile experimentally validated regulatory regions in non-coding regions in the Drosophila melanogaster genome and identify putatively neutral sites near them. Incorporating a realistic genomic architecture that mimics the placement of the regulatory and coding regions, as well as a realistic heterogeneity in recombination and mutation rates across the genome, we use forward-in-time simulations to assess the power and accuracy of population genetics approaches that infer the DFE of new mutations in these regions. While the parameters of DFEs primarily comprising moderately and strongly deleterious mutations are estimated accurately, those of DFEs comprising mostly mildly deleterious mutations are misinferred. Applying these insights to three African D. melanogaster populations, we find that a large fraction of new mutations in functionally important non-coding regions are moderately deleterious, as opposed to strongly deleterious in coding regions. While the fraction of beneficial substitutions in regulatory regions (0.25-0.45) was also lower than in coding regions (~0.5), our results suggest that non-coding regions contribute a majority of new deleterious mutations and beneficial substitutions in D. melanogaster populations. By incorporating both the genomic distribution and the inferred DFE of non-coding regions, we demonstrate that the effects of background selection across the genome are more accurately captured than with coding regions alone, highlighting the importance of considering selection on non-coding regions when interpreting patterns of genomic variation.