AI assistant for materials discovery

For a given material performance metric, multiple Jarvis (Joint Automated Repository for Various Integrated Simulations) components can work together to optimize or design an entirely new material. Credits: Kamal Choudhary, NIST

When Tony Stark needs to travel to space in the original Iron Man film, he asks his artificial intelligence (AI) assistant Jarvis to create a suit that can survive harsh conditions.

As AI expert Kamal Choudhary explains: “The way I saw it, what Jarvis did was he had a database of materials, scanned the database, found a suitable material, tested it, then synthesized an alloy that could survive in space conditions.”

“That’s what we want our system to do, and that’s why we called it Jarvis.”

Choudhary, a researcher at the National Institute of Standards and Technology (NIST), is the founder and developer of Jarvis (Joint Automated Repository for Various Integrated Simulations), an open dataset designed to automate materials discovery and optimization.

Writing in npj Computational Materials in December 2021, Choudhary and Brian DeCost (NIST) described the latest improvements to Jarvis that apply AI to materials discovery. By combining graph neural networks with chemical and structural knowledge about materials, their Atomistic Line Graph Neural Network (ALIGNN) outperforms previously reported models on atomistic prediction tasks, with very high accuracy and better or comparable model training speed.
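The "line graph" in ALIGNN's name refers to a standard graph construction: each bond of the original atom graph becomes a node, and two bonds are linked when they share an atom, so every link corresponds to a bond angle. The following is a minimal Python sketch of that construction only, not the ALIGNN implementation. The function name and the toy three-atom molecule are illustrative assumptions.

```python
from itertools import combinations


def line_graph(bonds):
    """Build the line graph of an undirected bond graph.

    Each bond (a pair of atom indices) becomes a node. Two bonds are
    connected when they share an atom, so each edge represents the
    angle formed at that shared atom.
    """
    nodes = [tuple(sorted(bond)) for bond in bonds]
    edges = []
    for b1, b2 in combinations(nodes, 2):
        shared = set(b1) & set(b2)
        if shared:
            # Record the two bonds and the atom at the vertex of the angle.
            edges.append((b1, b2, shared.pop()))
    return nodes, edges


# Hypothetical toy input: a water-like molecule (atom 0 bonded to atoms 1 and 2).
toy_bonds = [(0, 1), (0, 2)]
nodes, angles = line_graph(toy_bonds)
print(nodes)   # [(0, 1), (0, 2)]
print(angles)  # [((0, 1), (0, 2), 0)]
```

A model that passes messages on both graphs can see pairwise distances (the bonds) and three-body geometry (the angles), which is one way structural knowledge can be built into a network.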

“ALIGNN can predict features in seconds instead of months,” Choudhary said.

Beyond Iron Man, the inspiration for Jarvis was the Materials Genome Initiative. Started in 2011 under President Obama, the initiative is a multi-agency federal effort to discover, manufacture and deploy materials twice as fast and at a fraction of the cost of conventional methods.

NIST’s original contribution to this initiative was the creation of a database of materials and their characteristics, rigorously derived using standardized, state-of-the-art computational methods.

Many such databases have been established, but “what is special about the Jarvis database is that it contains modules for a wide variety of computational approaches,” according to David Vanderbilt, professor of physics at Rutgers University, member of the National Academy of Sciences, and a contributor to the project. “There are many different theoretical levels at which you can approach this area. Jarvis is unusual in that it extends to a higher level than other databases.”

The original data for Jarvis was drawn from density functional theory (DFT) calculations. “DFT is the standard way that most people calculate properties of materials at the atomic level,” Vanderbilt explained. “They are first-principles calculations, where there is no experimental input and the results are derived from theory according to the laws of quantum mechanics.”

This paradigm has been incredibly effective, Choudhary said, “though if you look at the periodic table, there are billions of possible combinations of elements, far more than we could ever generate data for. This is where machine learning comes in.”

If quantum mechanical computation can serve as a screening tool for physical experiments, Choudhary argued, machine learning could serve as a screening tool for costly calculations.
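The screening idea can be pictured as a two-stage funnel: a fast model scores every candidate, and only the best few are passed to the expensive calculation. The Python sketch below is an illustration under stated assumptions, not Jarvis code. The function names, the candidate labels and the integer scores are all invented placeholders standing in for an ML prediction and a DFT run.

```python
def screen(candidates, cheap_predict, expensive_compute, keep=2):
    """Rank candidates with a fast surrogate model, then run the costly
    calculation only on the top `keep` of them."""
    ranked = sorted(candidates, key=cheap_predict, reverse=True)
    shortlist = ranked[:keep]
    return {name: expensive_compute(name) for name in shortlist}


# Placeholder data: invented scores, not real materials or results.
surrogate_scores = {"A": 20, "B": 90, "C": 50, "D": 70}
expensive_calls = []  # tracks which candidates reach the costly step


def cheap_predict(name):
    # Stand-in for a fast machine learning prediction.
    return surrogate_scores[name]


def expensive_compute(name):
    # Stand-in for a slow first-principles calculation.
    expensive_calls.append(name)
    return surrogate_scores[name] + 5


results = screen(surrogate_scores, cheap_predict, expensive_compute, keep=2)
print(results)          # {'B': 95, 'D': 75}
print(expensive_calls)  # ['B', 'D']
```

Only two of the four candidates ever reach the costly step. The same pattern applies one level up, where calculations narrow the field before physical experiments.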

Examples of metal organic frameworks with high CO₂ adsorption, as predicted by the atomistic line graph neural network (ALIGNN). Credits: Kamal Choudhary, Taner Yildirim, Daniel Siderius, A. Gilad Kusne, Austin McDannald, Diana L. Ortiz-Montalvo

But first, such a system needs to be trained. Neural networks like ALIGNN require a huge amount of training data to be effective. Behind Choudhary’s state-of-the-art AI model are DFT simulations of 70,000 materials and counting. This growing database was used to train neural networks, which in turn can rapidly characterize new materials or screen for materials with specific properties.

“This is the vision of the Materials Genome Initiative,” Choudhary said.

Writing on arXiv, Choudhary and his colleagues provide an example of how the system can speed up discovery. They used ALIGNN to predict the CO₂ adsorption properties of metal organic frameworks, a class of porous materials that can remove CO₂ from the atmosphere, and to computationally rank the leading candidates for experimental synthesis.

The Jarvis dataset was produced primarily on supercomputers at NIST, which has been working on this effort for nearly five years. Recently, Choudhary gained access to the Frontera and Stampede2 supercomputers at the Texas Advanced Computing Center (TACC), which have also contributed to the dataset.

“The field of machine learning has been around since the 1980s, but the main problem was having well-curated datasets,” Choudhary said. “We’re now approaching 100,000 materials in our database and that was only possible because of Frontera and NIST. That’s what helped us bridge that gap.”

With a large number of training samples available and knowledge of chemistry and physics hard-coded into the neural networks, Choudhary was able to significantly improve the accuracy of his machine learning model. “The more domain knowledge you can use the better. I think physics and AI shouldn’t be competitors to each other; they should be friends and allies.”

ALIGNN, like the tools for DFT calculations and other machine learning methods, has been incorporated into Jarvis and made available to researchers around the world. Choudhary estimates that 8,000 chemists and biologists use this repository every year. More recently, this has enabled scientists at Argonne National Laboratory to study topological magnetic materials, and researchers at Northwestern University to study transfer learning for materials.

Choudhary is also collaborating with David Vanderbilt to develop “beyond-DFT” methods, applying them to quantum materials, and integrating those methods and datasets into Jarvis.

“DFT has some important approximations,” Vanderbilt said. “Since the electrons are assumed to be independent, you miss some special and interesting behavior in quantum materials, which gives rise to effects beyond the usual expectations of a conventional theory.”

These include, but are not limited to, unconventional superconductivity, the quantum Hall effect, and topological magnetic structures. “For these classes of materials, simple DFT does not work well enough,” he continued. “Our database takes three or four high-level beyond-DFT approaches to give the community a better understanding of how answers may differ depending on the underlying approach.”

By establishing a database of potential materials and developing tools to automate the screening, Choudhary hopes to speed up the pipeline of discovery, bringing Iron Man-like capabilities closer to reality.

“Imagine the day when a model can predict a new material, a new drug, and say, ‘Out of a million molecules, try this one first,’” Choudhary said. “That’s the golden age of physics.”

More information:
Kamal Choudhary et al, Atomistic Line Graph Neural Network for improved materials property predictions, npj Computational Materials (2021). DOI: 10.1038/s41524-021-00650-1

Provided by Texas Advanced Computing Center

Citation: AI Assistant for Materials Discovery (2022, February 14) Retrieved 30 March 2022 from

This document is subject to copyright. No part may be reproduced without written permission, except for any fair use for the purpose of personal study or research. The content is provided for information purposes only.