Optimizing phenotype scale improves genetic analyses in large-scale biobanks

This paper is a preprint and has not been certified by peer review.

Authors

Huang, Z.; Costantino, M.; Dahl, A.

Abstract

Large-scale biobanks have enabled increasingly complicated genetic analyses across thousands of phenotypes. However, studies rarely consider the appropriate phenotype measurement scale, a problem that can drastically affect inferences on genetic architecture. Here, we introduce SIQReg, a practical solution to this classical problem, which learns a data-driven phenotype scale by minimizing heterogeneity across phenotype quantiles. Applied to complex traits in UK Biobank, SIQReg rejects the default scale for 24/25 traits. Generally, SIQReg scales lie between default and logarithmic, indicating that default-scale traits are neither purely additive nor purely multiplicative. We show that SIQReg improves both non-additive and additive genetic analyses. SIQReg eliminates most non-additive genetic signals (such as 97% of vQTL and 76% of quantile-dependent TWAS genes), indicating they may be statistical artifacts, while preserving biologically plausible non-additive signals. Simultaneously, SIQReg improves power to detect additive signals, increasing GWAS loci, TWAS genes, and PGS prediction accuracy by 11%, 13%, and 10%, respectively, and identifies 50% more high-risk individuals. These gains replicate across ancestry groups. Our results establish SIQReg as a principled approach to phenotype scale transformation that improves genetic analyses of complex traits.
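
To make the idea of a "phenotype scale" concrete, the following toy sketch simulates a trait whose genetic effect is purely multiplicative and then searches a small grid of Box-Cox power transforms for the scale on which the genotype effect is most similar across phenotype quantiles. This is not the SIQReg algorithm, whose details are not given in the abstract. The Box-Cox family, the heterogeneity score, the function names (`box_cox`, `quantile_heterogeneity`, `pick_scale`), and all simulation parameters are assumptions chosen only for illustration.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox power transform (assumed family): lam = 0 is the log scale,
    lam = 1 is the raw scale up to a shift."""
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def quantile_heterogeneity(y, g, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Illustrative score: spread of the genotype effect across phenotype
    quantiles, relative to its mean (so it does not depend on units)."""
    # Effect at each quantile = difference between the two homozygote groups.
    effects = np.quantile(y[g == 2], probs) - np.quantile(y[g == 0], probs)
    return np.std(effects) / abs(np.mean(effects))

def pick_scale(y, g, lambdas=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return the grid value with the lowest heterogeneity, plus all scores."""
    scores = {lam: quantile_heterogeneity(box_cox(y, lam), g) for lam in lambdas}
    return min(scores, key=scores.get), scores

# Simulated data (hypothetical parameters): one variant with allele
# frequency 0.3 acting multiplicatively, i.e. additively on the log scale.
rng = np.random.default_rng(0)
n = 200_000
g = rng.binomial(2, 0.3, size=n)
y = np.exp(0.3 * g + rng.normal(0.0, 0.5, size=n))

best_lam, scores = pick_scale(y, g)
```

On the raw scale of this simulated trait, the genotype effect grows from the lower to the upper quantiles, which is the kind of pattern a variance-QTL or quantile-dependent test would flag. On the log scale the effect is a constant shift, so the score is near its minimum there. A trait that mixes additive and multiplicative components would be expected to favor an intermediate value, in line with the abstract's observation that learned scales generally lie between default and logarithmic.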
