Sangam: A Confluence of Knowledge Streams

D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions

dc.contributor Massachusetts Institute of Technology. Department of Mathematics
dc.contributor Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.creator Sledzieski, Samuel
dc.creator Singh, Rohit
dc.creator Cowen, Lenore
dc.creator Berger, Bonnie
dc.date 2022-09-28T17:07:56Z
dc.date 2021
dc.date 2022-09-28T17:03:52Z
dc.date.accessioned 2023-03-01T18:12:22Z
dc.date.available 2023-03-01T18:12:22Z
dc.identifier https://hdl.handle.net/1721.1/145605
dc.identifier Sledzieski, Samuel, Singh, Rohit, Cowen, Lenore and Berger, Bonnie. 2021. "D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions." Cell Systems, 12 (10).
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/279150
dc.description We combine advances in neural language modeling and structurally motivated design to develop D-SCRIPT, an interpretable and generalizable deep-learning model, which predicts interaction between two proteins using only their sequence and maintains high accuracy with limited training data and across species. We show that a D-SCRIPT model trained on 38,345 human PPIs enables significantly improved functional characterization of fly proteins compared with the state-of-the-art approach. Evaluating the same D-SCRIPT model on protein complexes with known 3D structure, we find that the inter-protein contact map output by D-SCRIPT has significant overlap with the ground truth. We apply D-SCRIPT to screen for PPIs in cow (Bos taurus) at a genome-wide scale and, focusing on rumen physiology, identify functional gene modules related to metabolism and immune response. The predicted interactions can then be leveraged for function prediction at scale, addressing the genome-to-phenome challenge, especially in species where little data are available.
dc.format application/pdf
dc.language en
dc.publisher Elsevier BV
dc.relation 10.1016/J.CELS.2021.08.010
dc.relation Cell Systems
dc.rights Creative Commons Attribution 4.0 International license
dc.rights https://creativecommons.org/licenses/by/4.0/
dc.source Elsevier
dc.title D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions
dc.type Article
dc.type http://purl.org/eprint/type/JournalArticle


Files in this item

Files Size Format
1-s2.0-S2405471221003331-main.pdf 2.541Mb application/pdf
