Multimolecular feature-based machine learning: system biology enhanced RF-QSAR modeling for the efficient prediction of the inhibitory potential of diverse SARS CoV-2 3CL Protease inhibitors

This paper is a preprint and has not been certified by peer review.

Authors

Manaithiya, A.; Bhowmik, R.; Ray, R.; Kumar, S.; Sharma, S.; Mathew, B.; Gong, W.; Parkkila, S.; Aspatwar, A.

Abstract

The COVID-19 pandemic has created an urgent need for effective treatments, given ongoing concerns about vaccine efficacy. Using the ChEMBL database, we focused on inhibitors of the SARS-CoV-2 3C-like protease (3CLpro), a crucial drug target encoded in the coronavirus genome. We deployed a multi-faceted strategy combining quantitative structure-activity relationship (QSAR) analysis, molecular docking, molecular dynamics simulations, and systems biology to discover promising antiviral compounds. This culminated in a cheminformatics pipeline that built machine learning-based QSAR models. These models analyzed 919 molecules and achieved correlation coefficients of 97.36% and 74.13%. The models were built on substructure fingerprints and 1D/2D molecular descriptors. Variance importance plots (VIPs) and correlation matrices were used to identify the features most important for inhibiting 3CL protease. A web tool, 3CLpro-pred -Selec-Pred (https://3clpro-pred.streamlit.app/), was developed for streamlined bioactivity prediction against the protease. Molecular docking and dynamics provided atomic-level insights, supporting the QSAR hypotheses and improving understanding of interaction dynamics. Hit compounds from the QSAR and molecular modeling studies were further evaluated by systems biology analysis to elucidate the mechanisms underlying their inhibitory effects on SARS-CoV-2 3CL protease. Gene ontology and KEGG pathway analysis identified the most relevant pathways for the involved genes. Detailed systems biology analysis highlighted TBK1, PIK3CA, IKBKB, GSK3B, and CASP3 as key nodes linking compounds, targets, and pathways. The identified molecular interactions call for more extensive experimental verification to validate their therapeutic promise in antiviral drug discovery.
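
The general shape of a random forest QSAR model on binary fingerprints can be illustrated with a minimal sketch. This is not the authors' pipeline: the fingerprint matrix and activity values below are synthetic, the three "planted" bits, the 64-bit width, the 80/20 split, and the hyperparameters are all assumptions chosen for illustration, and only the molecule count (919) is taken from the abstract. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a substructure fingerprint matrix:
# 919 "molecules" x 64 binary bits (bit width is an assumption).
n_molecules, n_bits = 919, 64
X = rng.integers(0, 2, size=(n_molecules, n_bits))

# Hypothetical activity values driven by three planted bits plus noise.
# Real models would use measured bioactivities instead.
y = (5.0 + 1.5 * X[:, 3] - 1.0 * X[:, 10] + 0.8 * X[:, 42]
     + rng.normal(0.0, 0.3, size=n_molecules))

# Hold out 20% of the molecules for evaluation (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted activities."""
    return float(np.corrcoef(observed, predicted)[0, 1])


r_train = pearson_r(y_train, model.predict(X_train))
r_test = pearson_r(y_test, model.predict(X_test))

# Impurity-based importances: the raw numbers behind an importance plot.
top_bits = np.argsort(model.feature_importances_)[::-1][:3]

print(f"train r = {r_train:.3f}, test r = {r_test:.3f}")
print("most important fingerprint bits:", sorted(top_bits.tolist()))
```

In this toy setting the forest should rank the three planted bits highest, which mirrors how feature importance analysis is used to point at substructures associated with activity. The correlation values it prints describe the synthetic data only and are unrelated to the figures reported in the abstract.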
