STARMAP: A 3D-informed framework for mapping functional regions in proteins to regulatory and cellular phenotypes
Shukla, K.; Castro, J.; Cheng, D.; Holley, L.; Brunk, E. C.
Abstract

Artificial Intelligence (AI) has transformed biology by revealing patterns in large-scale datasets and predicting regulatory relationships. Yet even the most advanced models often fail to distinguish biologically meaningful mechanisms from statistical associations. This limitation arises not from algorithmic capacity but from a lack of mechanistically grounded input features. Our structure-informed framework, Structure-based Topological Analysis of Regulatory and Molecular Activity Patterns (STARMAP), embeds protein three-dimensional structure and population-scale functional genomics data into a unified representation for mechanistic inference. By mapping over 1.5 million naturally occurring variants across ~1,700 cancer cell lines onto protein structures, STARMAP identifies spatial clusters of variation associated with shifts in transcriptional regulatory networks and drug response phenotypes. This approach turns natural genetic variation into a large-scale, structure-informed screen, enabling systematic discovery of regulatory relationships across the proteome and providing interpretable, testable models of cellular regulation.
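
To make the idea of a "spatial cluster of variation" concrete, the following is a minimal sketch of how variant positions might be grouped by 3D proximity rather than by sequence position. It is not STARMAP's algorithm, which the abstract does not specify. The function name `spatial_clusters`, the 8 Å cutoff, the use of C-alpha coordinates, single-linkage grouping, and the toy coordinates are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def spatial_clusters(coords, variant_residues, cutoff=8.0):
    """Group variant residues by 3D proximity (illustrative only).

    coords: dict mapping residue number -> (x, y, z) C-alpha position in angstroms.
    variant_residues: residue numbers that carry at least one variant.
    cutoff: assumed distance threshold; two residues within it are linked.
    Returns clusters as sorted lists of residue numbers (single linkage).
    """
    residues = sorted(set(variant_residues))
    parent = {r: r for r in residues}  # union-find forest

    def find(r):
        # Follow parent links to the cluster root, compressing the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    # Link every pair of variant residues closer than the cutoff.
    for a, b in combinations(residues, 2):
        if dist(coords[a], coords[b]) <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for r in residues:
        groups.setdefault(find(r), []).append(r)
    return sorted(groups.values())


# Hypothetical coordinates: residues 10 and 85 are far apart in sequence
# but close in space; residue 200 sits elsewhere in the fold.
toy_coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    85: (5.0, 3.0, 0.0),
    200: (40.0, 40.0, 40.0),
}
clusters = spatial_clusters(toy_coords, [10, 11, 85, 200])
print(clusters)
```

In this toy case residues 10, 11, and 85 fall into one cluster even though 85 is distant in sequence, which is the kind of grouping a sequence-only view would miss. Relating such clusters to expression or drug-response data would be a separate statistical step not shown here.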