Among all possible chemical compounds, it is estimated that between $10^{20}$ and $10^{60}$ may hold potential as small-molecule drugs. Evaluating all of these compounds experimentally would be far too time-consuming for chemists. Recently, researchers have begun using artificial intelligence to identify compounds that could serve as good drug candidates. One such researcher is MIT Associate Professor Connor Coley, who has appointments in the departments of Chemical Engineering and Electrical Engineering and Computer Science, as well as the MIT Schwarzman College of Computing. His research sits at the intersection of chemical engineering and computer science: he develops computational models that can analyze vast numbers of potential chemical compounds, design new ones, and predict reaction pathways that could generate those compounds.
Coley states, "It’s a very general approach that could be applied to any application of organic molecules, but the primary application we think about is small-molecule drug discovery."
Coley's interest in science runs in his family. He mentions that his family includes more scientists than non-scientists, including his father, a radiologist; his mother, who earned a degree in molecular biophysics and biochemistry before attending the MIT Sloan School of Management; and his grandmother, a math professor. As a high school student in Dublin, Ohio, Coley participated in Science Olympiad competitions and graduated at the age of 16. He then attended Caltech, choosing chemical engineering as his major to combine his interests in science and math. During his undergraduate years, he also pursued computer science, working in a structural biology lab using Fortran to help solve protein crystal structures.
After graduating from Caltech, he continued in chemical engineering and came to MIT in 2014 to start his PhD. Advised by professors Klavs Jensen and William Green, Coley focused on optimizing automated chemical reactions. His work combined machine learning with cheminformatics (the application of computational methods to chemical data) to plan reaction pathways for new drug molecules. He also worked on designing hardware for running reactions automatically. Part of that work was funded by DARPA's Make-It program, which aimed to use machine learning and data science to improve the synthesis of medicines from simple building blocks.
Coley began applying for faculty positions while still a student and accepted an offer from MIT at age 25. He received mixed advice about taking a job at his alma mater but decided the opportunity was too enticing to turn down. "MIT is a very special place in terms of resources and fluidity across departments. It supports the intersection of AI and science, creating a vibrant ecosystem," he says.
Coley deferred his faculty position for a year to do a postdoc at the Broad Institute, gaining more experience in chemical biology and drug discovery. There, he worked on identifying small molecules from billions of candidates in DNA-encoded libraries that might bind to mutated proteins associated with diseases. After returning to MIT in 2020, he built his lab group to deploy AI not only to synthesize existing therapeutic compounds but also to design new molecules with desirable properties.
His lab has developed various computational approaches to achieve these goals. "We try to think about how to best pair a challenge in chemistry with a potential computational solution. Often, that pairing motivates the development of new methods," Coley says. One model his lab developed, known as ShEPhERD, evaluates potential new drug molecules based on how they interact with target proteins, using the drug molecules' three-dimensional shapes. This model is currently used by pharmaceutical companies to aid in drug discovery.
Coley emphasizes, "We’re trying to give more of a medicinal chemistry intuition to the generative model, so it is aware of the right criteria and considerations." In another project, his lab created a generative AI model called FlowER, which predicts the products that result from combining different chemical inputs. The researchers built in fundamental physical principles, such as the law of conservation of mass, and required the model to consider the feasibility of the intermediate steps needed to transform reactants into products. These constraints improved the model's predictive accuracy.
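To make the conservation-of-mass constraint concrete, here is a minimal Python sketch of an atom-balance check: a predicted reaction is rejected if any element appears in different amounts on the two sides. It illustrates the principle only and is not FlowER's implementation. The helper names (`atom_counts`, `side_counts`, `conserves_atoms`) are hypothetical, and the parser assumes flat, uncharged formulas such as `C2H6O` with no parentheses.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration; not part of FlowER.
# Matches an element symbol (e.g. "C", "Cl") and an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula: str) -> Counter:
    """Count atoms in a flat formula such as 'C2H6O' (no parentheses, no charges)."""
    counts = Counter()
    for element, number in _TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def side_counts(formulas) -> Counter:
    """Sum atom counts over every molecule on one side of a reaction."""
    total = Counter()
    for formula in formulas:
        total += atom_counts(formula)
    return total


def conserves_atoms(reactants, products) -> bool:
    """True if each element occurs equally often on both sides.

    Equal atom counts imply equal total mass, so this is a simple
    stand-in for a conservation-of-mass check.
    """
    return side_counts(reactants) == side_counts(products)


# Esterification: acetic acid + ethanol -> ethyl acetate + water
print(conserves_atoms(["C2H4O2", "C2H6O"], ["C4H8O2", "H2O"]))
# The same prediction with the water by-product left out
print(conserves_atoms(["C2H4O2", "C2H6O"], ["C4H8O2"]))
```

The first call is balanced and the second is not, because dropping the water loses two hydrogens and one oxygen. A check like this only filters finished predictions; building the constraint into the model itself, as the article describes, is a stronger guarantee.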
Coley notes, "Thinking about those intermediate steps, the mechanisms involved, and how the reaction evolves is something that chemists do very naturally. It’s how chemistry is taught, but it’s not something that models inherently consider." His students also work on various areas related to optimizing chemical reactions, including computer-aided structure elucidation, laboratory automation, and optimal experimental design. "Through these many different research threads, we hope to advance the frontier of AI in chemistry," Coley concludes.
Blogger's Review: This research shows how integrating AI into chemistry could reshape drug discovery. Models such as ShEPhERD and FlowER aim to make candidate screening more efficient and reaction predictions more physically grounded. If they hold up in practice, such tools could significantly accelerate the drug development process.