DAEMON is glad to announce its co-organization of the “MACHINE LEARNING FOR MATERIALS DISCOVERY (ML4MD)” workshop, which will take place on 05-08/05/2025 in Helsinki, Finland.
For more information about the event, we invite you to visit the CECAM-hosted event webpage.
In this workshop, we will focus on the following research questions:
How can humans or LLMs be integrated into material discovery tasks to achieve good out-of-domain performance with reduced data requirements? Recent progress in NLP has produced multiple novel language models, which have already assisted in data extraction, analysis, and the generation of molecular and inorganic materials [8][12][13]. New high-quality, open-access models can be fine-tuned for specific tasks, reducing time and data costs. Additionally, active learning and human-in-the-loop approaches can improve the learning process by sampling only the data that is needed [4].
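As a rough illustration of "sampling only the data that is needed", the sketch below shows one common active learning strategy, query by ensemble disagreement. It is not taken from the cited works: the function name `select_most_uncertain` and the example predictions are hypothetical, and the variance across an ensemble is only one of several possible uncertainty measures.

```python
from statistics import pvariance


def select_most_uncertain(ensemble_predictions, k=1):
    """Return indices of the k pool items whose ensemble predictions disagree most.

    ensemble_predictions[i] holds the property values predicted for candidate i
    by each model of a (hypothetical) ensemble.
    """
    scored = [(pvariance(preds), idx) for idx, preds in enumerate(ensemble_predictions)]
    # Highest variance first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Made-up predictions of some property from three models for four unlabeled candidates.
pool_predictions = [
    [1.10, 1.12, 1.09],  # models agree: labeling adds little information
    [0.40, 1.90, 1.00],  # models disagree strongly: worth labeling
    [2.50, 2.45, 2.55],
    [0.80, 1.20, 1.00],
]

# Candidates to send for labeling (by simulation, experiment, or a human expert).
queried = select_most_uncertain(pool_predictions, k=2)
print(queried)
```

In a full loop, the queried candidates would be labeled, added to the training set, and the ensemble retrained before the next round of selection.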
What is the state of the art in generative models for organic and inorganic material discovery, and where are the key bottlenecks? Generative material discovery is a rapidly evolving field. Methods for generating both organic and inorganic materials use similar ML architectures, such as generative adversarial networks, variational autoencoders or diffusion models [6][10]. Training such models requires large, diverse datasets if they are to generalize and provide accurate predictions [3]. Additionally, generative models still struggle with the synthesizability of generated materials.
How can generative models be paired with stability and synthesizability considerations, as well as structure-property relationships? Early generative methods relied on combinatorial and high-throughput screening. Instead of generating numerous varied structures, models should focus on producing optimal solutions considering both properties and synthesizability. Recent advances show that models trained on structural and property data can yield more optimized solutions. Adding synthesizability scoring functions, such as synthetic complexity or retrosynthetic accessibility, and integrating retrosynthesis pathway searches, improves the likelihood of obtaining synthesizable materials [11].
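One simple way to picture the pairing of property and synthesizability considerations is to filter generated candidates by a synthesizability score and then rank the remainder by a combined score. The sketch below is only an illustration under assumed inputs: the `Candidate` fields, the threshold, the weighting, and all numbers are hypothetical, and real scoring functions such as synthetic complexity or retrosynthetic accessibility would supply the `synthesizability` values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    property_score: float    # hypothetical predicted target property, scaled to [0, 1]
    synthesizability: float  # hypothetical score in [0, 1], higher means easier to make


def rank_candidates(candidates, min_synthesizability=0.5, weight=0.5):
    """Drop candidates below the synthesizability threshold, then rank the rest
    by a weighted sum of property score and synthesizability (best first)."""
    feasible = [c for c in candidates if c.synthesizability >= min_synthesizability]
    return sorted(
        feasible,
        key=lambda c: weight * c.property_score + (1 - weight) * c.synthesizability,
        reverse=True,
    )


# Made-up output of a generative model.
generated = [
    Candidate("cand_A", property_score=0.95, synthesizability=0.20),  # great property, hard to make
    Candidate("cand_B", property_score=0.80, synthesizability=0.70),
    Candidate("cand_C", property_score=0.55, synthesizability=0.90),
    Candidate("cand_D", property_score=0.85, synthesizability=0.55),
]

ranked_names = [c.name for c in rank_candidates(generated)]
print(ranked_names)
```

A weighted sum is the simplest choice here; a Pareto-front selection would avoid having to fix `weight` in advance.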
What is the state of the art in generative models for retrosynthesis-driven materials discovery? Retrosynthesis enables the synthesis of complex molecules by breaking targets down into simpler precursors, which requires careful multi-step planning. While ML models can aid in designing molecules with clear disconnections, they often lack a holistic view, leading to suboptimal routes. Recent state-of-the-art models address this by incorporating a human in the loop or by representing the chain of steps as embedding vectors and minimizing their distance to known retrosynthesis procedures [4].
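The idea of comparing a chain of steps with known procedures in embedding space can be sketched as follows. This is a minimal illustration, not the method of the cited work: it assumes each step already has an embedding vector, summarizes a route as the mean of its step vectors, and uses Euclidean distance. All names and numbers are hypothetical.

```python
import math


def route_embedding(step_vectors):
    """Summarize a route as the component-wise mean of its step embeddings (an assumption)."""
    n = len(step_vectors)
    return [sum(component) / n for component in zip(*step_vectors)]


def distance_to_known(route_vector, known_route_vectors):
    """Euclidean distance from a route embedding to the nearest known procedure."""
    return min(math.dist(route_vector, known) for known in known_route_vectors)


def pick_closest_route(candidate_routes, known_route_vectors):
    """Return the name of the candidate route closest to any known procedure."""
    return min(
        candidate_routes,
        key=lambda name: distance_to_known(
            route_embedding(candidate_routes[name]), known_route_vectors
        ),
    )


# Made-up 2-D embeddings of known retrosynthesis procedures.
known_routes = [[0.0, 1.0], [1.0, 0.0]]

# Made-up candidate routes, each a list of per-step embeddings.
candidates = {
    "route_a": [[0.0, 0.8], [0.2, 1.0]],
    "route_b": [[0.5, 0.5], [0.7, 0.5]],
}

best = pick_closest_route(candidates, known_routes)
best_distance = distance_to_known(route_embedding(candidates[best]), known_routes)
print(best, round(best_distance, 3))
```

In a trained model the distance would typically enter the loss function rather than being used only for selection after the fact.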