DPAC: Prediction and Design of Protein-DNA Interactions via Sequence-Based Contrastive Learning

This paper is a preprint and has not been certified by peer review.

Authors

Chen, L. T.; Pulugurta, R.; Vure, P.; Chatterjee, P.

Abstract

Interactions between DNA and proteins are pivotal in natural biological processes, and designing proteins that bind DNA with high specificity is crucial for advancing genomic technologies. Existing state-of-the-art models for modeling and designing protein-DNA interactions rely primarily on structural information, which limits their scalability and efficiency in large-scale applications. Notable methods such as AlphaFold 3 and RoseTTAFold All-Atom exist, but they are inefficient and inherently struggle to model conformationally unstable proteins such as transcription factors, which arguably represent the most important class of DNA-binding proteins. Here, we present DPAC (DNA-Protein binding Alignment via Contrastive learning), which leverages pre-trained protein and DNA language models via a contrastive loss to align the two modalities in a high-dimensional shared latent space. DPAC not only significantly accelerates the design process compared to current structure-based methods but also demonstrates a strong ability to differentiate real binders from non-binders. Our model achieves an AUC score of 0.591 on a low-identity set, outperforming state-of-the-art structure-based methods. Additionally, DPAC integrates simulated annealing to design new protein sequences with optimized DNA binding affinity, successfully recovering binding affinity in engineered sequences by up to 20% when tested in silico. Our results highlight DPAC's potential for facilitating the design and discovery of sequence-specific DNA-binding proteins, paving the way for advancements in genomic research and biotechnology applications.
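
The abstract does not describe how the simulated annealing step is set up, so the following is only a minimal sketch of the general technique applied to protein sequences, not the authors' implementation. The names `anneal` and `toy_score`, the single-residue mutation move, the geometric cooling schedule, and all parameter values are assumptions made for illustration. In particular, `toy_score` is a hypothetical stand-in for a learned protein-DNA binding score and has no biological meaning.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_score(seq: str) -> float:
    """Hypothetical placeholder for a model-derived binding score.

    Returns the fraction of lysine/arginine residues. A real pipeline
    would score the sequence against a target DNA with a trained model.
    """
    return sum(aa in "KR" for aa in seq) / len(seq)


def anneal(seq, score_fn, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize score_fn over sequences with simulated annealing.

    Each step proposes one point mutation. Improvements are always
    accepted; worse proposals are accepted with probability
    exp(delta / temperature), which shrinks as the temperature cools.
    """
    rng = random.Random(seed)
    current, current_score = seq, score_fn(seq)
    best, best_score = current, current_score

    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        frac = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** frac

        # Propose a single-residue substitution at a random position.
        pos = rng.randrange(len(current))
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        cand_score = score_fn(candidate)

        # Metropolis acceptance rule for a maximization problem.
        delta = cand_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score

    return best, best_score


if __name__ == "__main__":
    start = "MSTAGEDLLQ"  # arbitrary example sequence
    designed, score = anneal(start, toy_score)
    print(start, toy_score(start))
    print(designed, score)
```

Because `anneal` tracks the best sequence seen so far, the returned score is never lower than that of the starting sequence. Swapping `toy_score` for a function that compares protein and DNA embeddings would be the natural way to connect a sketch like this to a contrastively trained model.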
