Sunday, March 8, 2015

Chemical Space Explorers

One startling omission from my Acc. Chem. Res. post? Jean-Louis Reymond's review on the vastness of his generated database GDB-17, a.k.a. The Chemical Space Project. With over 166 billion compounds, Reymond claims to have produced the largest virtual library ever assembled. The best part? It's 99.9% new-to-science compounds. As Derek has quipped, chemical space truly is "Big. Really big."

Trudging along through chemical space, using Dr. Reymond's MQN-browser.
(I realize there's no way some of these are stable - 49? 54? - but they sure do look cool!)

Of these innumerable options, how do we decide what to make next? It's like that old Wall Street saw about how "Buy low, sell high" sounds easy, but takes a lifetime to figure out. It seems straightforward to say that you've generated billions of druglike compounds in silico, but how do you find out which ones are actually drugs?

You have to start somewhere. I still recall the first "chemical space exploration" paper that truly caught my eye - a 2009 J. Med. Chem. paper scribed by Will Pitt and colleagues at UCB (I still keep a dog-eared copy in my file cabinet). Using machine learning, the team constructed a library (VEHICLe) containing synthetically feasible heterocyclic compounds, most of which had never been made.

A partial update to "Figure 6" from Will Pitt's 2009 J. Med. Chem. paper. I searched SciFinder for each ring system as a substructure of reaction products, allowing for certain substitutions (say, fused phenyl in place of endocyclic olefin) and considering tautomers. By my count: 10 down, 12 to go!

Pitt issued a challenge in the introduction:
"With this work, we aim to provide fresh stimulus to creative organic chemists by highlighting a small set of apparently simple ring systems that are predicted to be tractable but are, to the best of our knowledge, unconquered."
Heady stuff. So, who will step forward to try these tantalizing targets? Someone certainly should, as Prof. Reymond seems to suggest with his own forward-leaning graphic:

GDB-17 "nearest neighbors" - closely related to known drugs, but not yet synthesized.
(I couldn't find anything similar in SciFinder, either)
Source: J-L Reymond, 2015 Acc. Chem. Res.

Do you suppose an academic candidate could make a convincing case? I'd be tickled pink if something along these lines were sent off to the NIH R01 office:
"Dear [insert funding agency] - Listen, I really want to develop novel molecules to improve human health, but I'm not collecting plants or culturing microbes, and it's too tough to compete with industry head-on. But say, there's this guy who's looked at more compounds than any other human being alive, and he says there's some structures that look really close to existing drugs that nobody's ever tried making. Mind giving me some cash for that?"
Good luck, chemical space explorers.