Mapping the Unknown: How AI Is Revealing New Molecules Hidden in Nature

In a fusion of chemistry and artificial intelligence, scientists have developed an AI model that can flag molecules that have so far gone unrecognized by human researchers. The model, called DreaMS, is the product of a collaboration between IOCB Prague and the Czech Institute of Informatics, Robotics and Cybernetics (CIIRC CTU), and it has already revealed previously unknown compounds hiding in plain sight within mass spectrometry data.
The work, reported by Phys.org and published in Nature Biotechnology, could accelerate drug discovery, environmental science, and even the search for extraterrestrial life.
The Chemical Mystery of the Natural World
It’s estimated that the vast majority of naturally occurring molecules remain undescribed. Each of these molecules carries the potential to become a new drug, pesticide, or biochemical tool. However, the main bottleneck has been interpreting the massive datasets produced by mass spectrometry, the standard tool for analyzing molecular composition.
Mass spectrometry produces spectra, essentially numerical fingerprints of molecules, that are notoriously difficult to decode. Traditional analysis largely works by matching a measured spectrum against reference libraries of known compounds, so it falls short when a structure is unfamiliar or rare. That’s where AI steps in.
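To make the "numerical fingerprint" idea concrete, here is a minimal sketch of how a spectrum can be stored as a list of (m/z, intensity) peaks and compared with another one using cosine similarity, a common conventional measure. This is not DreaMS itself. The peak values, the 1.0 bin width, and the function names are all invented for illustration.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Collapse (m/z, intensity) peaks into a sparse {bin_index: intensity} vector."""
    binned = {}
    for mz, intensity in peaks:
        idx = int(mz // bin_width)
        binned[idx] = binned.get(idx, 0.0) + intensity
    return binned

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy peak lists (hypothetical values, not real measurements).
spectrum_a = [(91.05, 100.0), (119.08, 45.0), (147.04, 20.0)]
spectrum_b = [(91.06, 80.0), (119.09, 50.0), (163.03, 30.0)]
spectrum_c = [(57.07, 100.0), (71.09, 60.0)]

sim_ab = cosine_similarity(bin_spectrum(spectrum_a), bin_spectrum(spectrum_b))
sim_ac = cosine_similarity(bin_spectrum(spectrum_a), bin_spectrum(spectrum_c))
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

Spectra a and b share two peak positions and score high, while a and c share none and score zero. A comparison like this only helps when a similar reference spectrum already exists, which is the limitation described above.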
DreaMS: The ChatGPT of Molecular Spectra
Inspired by large language models like ChatGPT, the DreaMS model uses self-supervised learning to analyze millions of spectra and extract hidden structural features—without needing prior labels or explicit chemical definitions.
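One common way to learn without labels, borrowed from language models, is to hide part of the input and ask the model to predict it. The sketch below shows only the data-preparation half of that recipe applied to a spectrum: a fraction of peaks has its m/z value hidden, and the hidden values become the training targets. Treat it as an illustration of the general idea. The 30% mask fraction, the `mask_peaks` helper, and the toy peaks are assumptions, and the actual DreaMS training objective may differ in its details.

```python
import random

MASK = None  # placeholder standing in for a hidden m/z value

def mask_peaks(peaks, mask_fraction=0.3, rng=None):
    """Hide the m/z of a random subset of peaks.

    Returns (model_input, targets): model_input is the peak list with hidden
    m/z values replaced by MASK, and targets maps each masked position to the
    true m/z a model would be trained to predict.
    """
    rng = rng or random.Random()
    if not peaks:
        return [], {}
    n_mask = min(len(peaks), max(1, round(len(peaks) * mask_fraction)))
    masked_positions = sorted(rng.sample(range(len(peaks)), n_mask))
    model_input = list(peaks)
    targets = {}
    for pos in masked_positions:
        mz, intensity = peaks[pos]
        targets[pos] = mz                      # the "label" comes from the data itself
        model_input[pos] = (MASK, intensity)   # the model sees only the intensity
    return model_input, targets

# Toy spectrum with ten peaks (hypothetical values).
peaks = [(50.0 + 10.0 * i, float(100 - 5 * i)) for i in range(10)]
model_input, targets = mask_peaks(peaks, mask_fraction=0.3, rng=random.Random(0))
print(model_input)
print(targets)
```

No human annotation is involved at any point: every training example is generated from an unlabeled spectrum, which is what makes it feasible to train on millions of them.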
“Just as ChatGPT learns language patterns from raw text, DreaMS learns structural representations from raw spectra,” explains Dr. Josef Šivic of CIIRC CTU. “It can infer what molecular structures are likely to be responsible for specific patterns, even without direct human guidance.”
Trained on tens of millions of spectra from diverse biological and environmental sources—such as soil, microbes, plants, and food—the model uncovers latent connections between seemingly unrelated compounds.
The DreaMS Atlas: An Internet of Molecular Data
The researchers built an interactive network of spectra called the DreaMS Atlas. Each spectrum acts like a “node” linked to others with similar characteristics. Scientists can explore this virtual network to discover chemical relationships, for example between pesticides and compounds found on human skin, or between microbial compounds and plant metabolites.
One striking result from this network was a hypothesized link between certain pesticides and autoimmune diseases such as psoriasis, based on shared spectral features across disparate datasets.
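The network idea can be sketched as a nearest-neighbor graph: give each spectrum a numeric embedding, then link it to the spectra whose embeddings are most similar. The example below is a toy version under stated assumptions. The three-dimensional vectors, the sample names, and the `build_knn_graph` helper are invented for illustration, and how the real Atlas is constructed is not described in this article.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def build_knn_graph(embeddings, k=2):
    """Link every node to its k most similar other nodes."""
    graph = {}
    for name, vec in embeddings.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != name
        ]
        scored.sort(reverse=True)  # most similar first
        graph[name] = [other for _, other in scored[:k]]
    return graph

# Hypothetical embeddings; a real model would produce far longer vectors.
embeddings = {
    "soil_sample_1": [0.9, 0.1, 0.0],
    "soil_sample_2": [0.8, 0.2, 0.1],
    "plant_extract": [0.1, 0.9, 0.2],
    "food_sample": [0.2, 0.8, 0.3],
}

atlas = build_knn_graph(embeddings, k=1)
for node, neighbors in atlas.items():
    print(node, "->", neighbors)
```

Following edges in a graph like this is how spectra from very different sources can end up connected: two samples become neighbors because their embeddings are close, regardless of where each was collected.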
A Fluorine Surprise and a Vision for the Future
Another breakthrough was DreaMS’ ability to detect fluorine in molecules—a challenging task due to its subtle spectral footprint. After fine-tuning with just a few thousand fluorine-containing compounds, the AI successfully learned to identify them in new samples. This is a crucial advancement, as fluorine is present in roughly one-third of all pharmaceuticals and agrochemicals.
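As a rough illustration of adapting a pretrained model to a yes/no question such as "does this molecule contain fluorine?", the sketch below trains a small logistic-regression classifier on top of fixed embeddings. This is a deliberately simplified stand-in for fine-tuning, which would normally update the underlying network as well. The two-dimensional embeddings, the labels, and the learning rate are all assumptions made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(features, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def contains_fluorine(weights, bias, x):
    """Predict True when the estimated probability is at least 0.5."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5

# Hypothetical embeddings: label 1 = fluorinated, label 0 = not fluorinated.
features = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 0.8], [0.3, 1.0]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_head(features, labels)
print(contains_fluorine(weights, bias, [0.95, 0.15]))
print(contains_fluorine(weights, bias, [0.15, 0.95]))
```

The pretrained representation does most of the work here. If the embeddings already separate fluorinated from non-fluorinated compounds reasonably well, a relatively small labeled set can be enough to learn the final decision.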
The next frontier? Teaching DreaMS to reconstruct full molecular structures directly from spectra. If achieved, this could revolutionize how we catalog, synthesize, and interact with natural chemical diversity on Earth—and possibly beyond.
Applications Across Chemistry, Biology, and Beyond
This research is expected to impact multiple disciplines:
- 🧪 Drug discovery: Identifying novel bioactive molecules
- 🌱 Agrochemistry: Identifying candidates for safer or more sustainable pesticides
- 🌍 Environmental science: Monitoring pollutants and natural metabolites
- 👽 Astrobiology: Understanding the chemical basis of life elsewhere
According to Dr. Tomáš Pluskal, a Neuron Award-winning scientist and co-author of the study, “This is just the beginning. AI models like DreaMS could fundamentally change how we explore chemical space.”
Read the Original Article
Phys.org – Researchers Discover Unknown Molecules with the Help of AI
Nature Biotechnology DOI: 10.1038/s41587-025-02663-3