The aging of the world’s population, the increasing burden of chronic and infectious diseases, and the emergence of new pathogens have made the need for new treatments more urgent than ever. However, discovering a new drug and bringing it to market is a long, arduous, and expensive journey marked by many failures and few successes.
Artificial intelligence has long been considered the answer to overcoming some of these obstacles due to its ability to analyze vast amounts of data, discover patterns and relationships, and predict effects.
Now, a multi-institutional team led by Harvard Medical School biomedical informatician Marinka Zitnik has launched a platform that aims to optimize AI-driven drug discovery by developing more realistic data sets and higher-fidelity algorithms.
Therapeutics Data Commons, described in a recent commentary in Nature Chemical Biology, is an open-access platform that serves as a bridge between computer scientists and machine learning researchers at one end, and biomedical researchers, biochemists, clinical researchers, and drug designers at the other end, communities that have traditionally worked in isolation from one another.
The platform offers both dataset curation and algorithm design and performance evaluation for multiple treatment modalities, including small-molecule drugs, antibodies, and cell and gene therapies, at all stages of drug development, from the identification of chemical compounds to the performance of drugs in clinical trials.
Zitnik, an assistant professor of biomedical informatics in the Blavatnik Institute at HMS, conceptualized the platform and now leads the collaborative work with researchers from MIT, Stanford University, Carnegie Mellon University, Georgia Tech, the University of Illinois Urbana-Champaign, and Cornell University.
Zitnik recently discussed Therapeutics Data Commons with Harvard Medicine News.
HMNews: What are the core challenges in drug discovery and how can AI help solve them?
Zitnik: Developing a drug from scratch that is both safe and effective is an incredible challenge. On average, it takes 11 to 16 years and $1 billion to $2 billion to do. Why is that?
It is very difficult to determine early on whether an initially promising chemical compound would produce results in human patients consistent with the results it shows in the laboratory. The number of small-molecule compounds is estimated at 10 to the power of 60; however, only a small fraction of this astronomically large chemical space has been explored for molecules with medicinal properties. Despite that, the impact of existing therapies in treating disease has been staggering. We believe that novel algorithms coupled with automation and new data sets can find many more molecules that can translate into improved human health.
AI algorithms can help us determine which of these molecules are most likely to be safe and effective human therapies. That is the fundamental problem that drug discovery and development suffers from. Our vision is that machine learning models can help filter and integrate vast amounts of biochemical data that we can connect more directly with molecular and genetic information and, ultimately, individualized patient outcomes.
HMNews: How close is AI to making this promise a reality?
Zitnik: We’re not there yet. There are a number of challenges, but I would say the biggest is understanding how well our current algorithms work and whether their performance translates to real world problems.
When we evaluate new AI models through computer modeling, we test them on reference data sets. Increasingly, we see in publications that those models are achieving near-perfect accuracy. If that’s the case, why aren’t we seeing widespread adoption of machine learning in drug discovery?
This is because there is a big gap between performing well on a reference dataset and being ready to transition to real-world implementation in a biomedical or clinical setting. The data with which these models are trained and tested is not indicative of the type of challenges these models are exposed to when used in real practice, so closing this gap is really important.
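One way to picture this gap is in how a data set is divided into training and test portions. The sketch below is an illustration only, not the platform's own code: it contrasts a random split, where test compounds may closely resemble training compounds, with a split that holds out whole groups of related compounds. The `family` field and the toy records are hypothetical stand-ins for real chemical groupings.

```python
import random
from collections import defaultdict

def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle and cut: test items may come from families seen in training."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def group_split(records, test_fraction=0.2, seed=0):
    """Hold out whole families so no test family appears in training."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["family"]].append(rec)
    names = sorted(groups)
    random.Random(seed).shuffle(names)
    n_test = max(1, int(len(records) * test_fraction))
    train, test = [], []
    for name in names:
        # Fill the test set first, one whole family at a time.
        target = test if len(test) < n_test else train
        target.extend(groups[name])
    return train, test

# Hypothetical toy records: 5 chemical families with 4 compounds each.
records = [
    {"id": f"cpd{f}{i}", "family": f"family{f}"}
    for f in range(5)
    for i in range(4)
]

train, test = group_split(records)
train_families = {r["family"] for r in train}
test_families = {r["family"] for r in test}
print("shared families:", train_families & test_families)
```

A model scored on the group-held-out test set has to generalize to compound families it has never seen, which is closer to the situation it would face in practice than a random split.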
HMNews: Where does the Therapeutics Data Commons platform come into this?
Zitnik: The goal of Therapeutics Data Commons is to address precisely those challenges. It serves as a meeting point between the machine learning community at one end and the biomedical community at the other end. It can help the machine learning community with algorithmic innovation and make these models more translatable to real world scenarios.
HMNews: Could you explain how it actually works?
Zitnik: First, keep in mind that the drug discovery process runs the gamut, from initial drug design based on data from chemistry and chemical biology, to preclinical research based on data from animal studies, to clinical research in human patients. The machine learning models that we train and test as part of the platform use different types of data to support the development process at all of these different stages.
For example, the machine learning models that support the design of small-molecule drugs are typically based on large data sets of molecular graphs: structures of chemical compounds and their molecular properties. These models find patterns in the known chemical space that relate parts of the chemical structure to the chemical properties necessary for drug safety and efficacy.
Once an AI model is trained to identify these telltale patterns in the known subset of chemicals, it can be deployed to search for the same patterns in the vast data sets of as-yet-untested chemicals and make predictions about how these chemicals would behave.
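The idea of a molecule as a graph, and of structural patterns within it, can be sketched in a few lines. This is a deliberately simplified illustration, not how the platform's models work: atoms are nodes, bonds are edges, hydrogens and bond orders are ignored, and the "patterns" are just counts of bonded element pairs. The data layout and function names are assumptions made for the example.

```python
from collections import Counter

# Toy molecular graphs (hydrogens and bond orders omitted):
# atoms are nodes, bonds are edges between atom indices.
ethanol = {"atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]}
acetic_acid = {
    "atoms": ["C", "C", "O", "O"],
    "bonds": [(0, 1), (1, 2), (1, 3)],
}

def bond_pattern_counts(molecule):
    """Count bonded element pairs, e.g. ('C', 'O'), as simple structural features."""
    atoms = molecule["atoms"]
    counts = Counter()
    for i, j in molecule["bonds"]:
        pair = tuple(sorted((atoms[i], atoms[j])))
        counts[pair] += 1
    return counts

def shared_patterns(mol_a, mol_b):
    """Number of bonded-pair patterns that two molecules have in common."""
    common = bond_pattern_counts(mol_a) & bond_pattern_counts(mol_b)
    return sum(common.values())

print(bond_pattern_counts(ethanol))
print(shared_patterns(ethanol, acetic_acid))
```

Real models learn far richer patterns than these counts, but the principle is the same: features derived from the graph structure are related to measured properties, then looked for in untested compounds.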
To design models that can help with late-stage drug discovery, we train them on data from animal studies. These models are trained to search for patterns that relate biological data to probable clinical outcomes in humans.
We can also ask if a model can look for molecular signatures in chemical compounds that correlate with patient information to identify which subset of patients is most likely to respond to a chemical compound.
HMNews: Who are the contributors and end users of this platform?
Zitnik: We have a team of volunteer students, scientists, and experts who come from partner universities and from industry, including small start-ups in the Boston area, as well as some large pharmaceutical companies in the United States and Europe. Computer and biomedical researchers contribute their expertise in the form of state-of-the-art machine learning models and curated, pre-processed data sets, which are standardized and published ready for others to use.
Therefore, the platform contains analysis-ready data sets and machine learning algorithms, along with robust measures that tell us how well a machine learning model performs on a specific data set.
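As a rough illustration of what such a performance measure looks like, the sketch below scores a model on a data set using mean absolute error, one common metric for numeric predictions. It is a minimal example under stated assumptions: the `evaluate` helper, the toy model, and the toy measurements are all hypothetical and are not taken from the platform.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def evaluate(model, dataset):
    """Score a model (any callable) on (input, measured_value) pairs."""
    inputs, targets = zip(*dataset)
    predictions = [model(x) for x in inputs]
    return mean_absolute_error(list(targets), predictions)

# Hypothetical toy data: (descriptor value, measured property).
toy_dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

def toy_model(x):
    # Hypothetical model that overshoots every measurement by 0.5.
    return 2.0 * x + 0.5

print(evaluate(toy_model, toy_dataset))
```

Reporting the same metric on the same standardized data set is what allows different models to be compared fairly.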
Our end users are researchers from all over the world. We host webinars to introduce new features, get feedback, and answer questions. We offer tutorials. This ongoing training and feedback is really crucial.
We have 4,000-5,000 active users every month, most of them from the US, Europe and Asia. Overall, we have seen over 65,000 downloads of our dataset/machine learning algorithm package. We have seen more than 160,000 downloads of standardized and harmonized data sets. The numbers are growing, and we expect them to continue to grow.
HMNews: What are the long-term goals of Therapeutics Data Commons?
Zitnik: Our mission is to support AI drug discovery on two fronts. First, in the design and testing of machine learning methods at all stages of drug discovery and development, from chemical compound identification and drug design to clinical research.
Second, to support the design and validation of machine learning algorithms in multiple therapeutic modalities, especially newer ones, including biologics, vaccines, antibodies, mRNA drugs, protein therapies, and gene therapies.
There’s a huge opportunity for machine learning to contribute to those novel therapies, and we haven’t yet seen the use of AI in those areas to the extent that we’ve seen it in small-molecule research, where much of the focus is today. This gap is mainly due to the paucity of AI-ready standardized data sets for those new therapeutic modalities, which we hope to address with Therapeutics Data Commons.
HMNews: What sparked your interest in this work?
Zitnik: I have always been interested in understanding and modeling the interactions between complex systems, which are systems with multiple components that interact with each other in a non-independent manner. As it turns out, many problems in therapeutic science are, by definition, just such complex systems.
We have a target protein that is a complex three-dimensional structure, we have a small-molecule compound that is a complex graph of atoms and bonds between those atoms, and then we have a patient, whose description and state of health are given in the form of a multiscale representation. This is a classic complex systems problem, and I really love searching for and finding ways to standardize and “tame” those complex interactions.
Therapeutic science is full of those kinds of problems that are ripe to benefit from machine learning. That’s what we’re chasing, that’s what we’re looking for.
Kexin Huang et al, Artificial intelligence foundation for therapeutic science, Nature Chemical Biology (2022). DOI: 10.1038/s41589-022-01131-2
Harvard Medical School
Citation: Can AI transform the way we discover new drugs? (2022, November 16) Accessed November 16, 2022 at https://phys.org/news/2022-11-ai-drugs.html