Machine Learning in Materials Science

The process of materials discovery is complex and iterative, and it takes considerable expertise to do well. Materials workflows that depend on human judgement pose a particular challenge to the discovery process, but they also present an opportunity to introduce digital technologies.

In the lab, many tasks require manual data collection and judgement. Even with an expert running the task, results (and the decisions based on them) can be highly variable. One expert's decisions can deviate significantly from those of their peers, and even from their own earlier decisions. This variability creates risk and can affect the business's bottom line. For example:

  • Unnecessary time and expertise spent on routine tasks that could be standardized, pulling experts away from the high-value work that requires their more advanced skills and knowledge
  • Errors in judgement that can lead to rework, lost sales, or damage to the company’s reputation
  • Processes that provide no systematic mechanism to detect or correct these problems over time

Even simple algorithms can augment, improve, or replace human judgement, providing superior consistency and greater accuracy.

In materials science, labs can now use computer intelligence to replace human judgement in many places, freeing experts to focus on solving the big research questions. Whether it is feature detection, text extraction, or searching and sorting based on standard criteria, there are ample opportunities to incorporate artificial intelligence (AI) into existing workflows.

Laboratories looking to use digital technologies to become more efficient and unlock new capabilities should consider how image processing and analysis can fit into their materials workflows. While computer vision (CV) is ubiquitous and flexible, it can be challenging to apply to scientific problems. For new and cutting-edge problems, a digital solution is likely to require programming skills, domain expertise, and time; in short, a significant investment. When the right process is selected for improvement or automation, the return on investment more than warrants the cost and effort.

Image Processing & Automation In Practice

For a customer who develops thin-film electronic materials, implementing a CV solution was transformative. Their previous approach, manual optical characterization of patterned films, was slow and involved multiple steps. Having experts identify and measure features seemed to indicate reliably whether a particular patterning process was successful, but the subjectivity and variability of the results led to hidden errors that were not discovered until later in the materials development process.

Using traditional CV to find regular features, and deep learning for trickier cases where human judgement would normally be applied, they achieved more consistent characterization, which led to better product development decisions and fewer verification tests. Beyond the savings in time and materials, they were also able to capture more business in an emerging, growing market.
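To make "traditional CV to find regular features" concrete, here is a minimal sketch of one classic approach: threshold a grayscale image, group the bright pixels into connected features, and measure each one. It is not the customer's actual pipeline. The function names, the toy image, and the threshold of 128 are all assumptions chosen for illustration.

```python
from collections import deque

def label_features(image, threshold):
    """Threshold a grayscale image and group bright pixels into
    4-connected features. Returns a list of features, each a list
    of (row, col) pixel coordinates."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    features = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            features.append(pixels)
    return features

def measure(pixels):
    """Area (pixel count) and bounding-box width and height of one feature."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return {
        "area": len(pixels),
        "width": max(xs) - min(xs) + 1,
        "height": max(ys) - min(ys) + 1,
    }

# Toy 5x7 "micrograph" (made-up values): two bright features on a dark background.
image = [
    [0,   0,   0, 0, 0,   0, 0],
    [0, 200, 210, 0, 0, 190, 0],
    [0, 205, 220, 0, 0, 195, 0],
    [0,   0,   0, 0, 0, 200, 0],
    [0,   0,   0, 0, 0,   0, 0],
]
measurements = [measure(f) for f in label_features(image, threshold=128)]
```

Because the same threshold and the same measurement rules are applied to every image, two runs on the same image always give the same numbers, which is the consistency that manual measurement lacks. In practice an established image-processing library would likely replace the hand-written labeling step.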

Complex Processes Require a Specialized Approach

That said, there are cases where a process or measurement might be too difficult to automate entirely. In these cases, machine learning can partially capture the nuances of human judgement and assist in the interpretation. By definition, machine learning models learn from data and improve with training. The first step is to train a model on the data you have so that it can identify and classify the image features an expert usually interprets. These might be types of defects, normal versus abnormal features, or regions of interest requiring additional analysis. Depending on the problem, you may be able to use a simple model, such as a random forest, or you may need something more sophisticated, like deep learning. With a suitable model in place, the next step is to teach it.
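As a rough illustration of training a model on expert-labelled features, the sketch below uses a nearest-centroid classifier. This is a deliberately simpler stand-in for the random forest or deep learning models mentioned above, chosen because it fits in a few lines of standard-library Python. The feature names (relative area, elongation), the labels, and every number are hypothetical.

```python
import math

def train_centroids(samples):
    """samples: list of (feature_vector, label) pairs labelled by an expert.
    Returns {label: centroid}, the mean feature vector for each label."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))

# Hypothetical training data: (relative area, elongation), both scaled to [0, 1].
training = [
    ((0.50, 0.10), "normal"),
    ((0.55, 0.12), "normal"),
    ((0.48, 0.08), "normal"),
    ((0.20, 0.70), "defect"),
    ((0.15, 0.80), "defect"),
    ((0.25, 0.75), "defect"),
]
centroids = train_centroids(training)

label_a = classify(centroids, (0.52, 0.11))  # compact, well-formed feature
label_b = classify(centroids, (0.18, 0.72))  # small, elongated feature
```

The shape of the workflow carries over to more capable models: experts label examples, the model is fitted to those labels, and new features are then classified automatically.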

AI in Materials Science

The key to machine learning for materials discovery is integrating AI into the existing, human-driven workflow, first as an assistant and eventually as an expert. This augmented workflow improves on the existing manual process because:

  • The model has a higher success rate, particularly when detecting and classifying common features.
  • When the model gets something wrong, the human operator will correct it, and by doing so, the model will capture the operator’s judgement.
  • The more the model is corrected, the closer it will get to being able to provide a consensus judgement call moving forward. As the model masters the complex detection tasks, the human judgement required for those tasks will decrease, speeding up the process or measurement. 
  • Eventually, the model will be able to handle all but the most infrequent of cases. And if the model provides an uncertainty measure, it can be used to alert a human operator when to intervene. 
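The loop described in the points above can be sketched as follows, assuming the model reports a probability for each label. Predictions whose top probability falls below a threshold are routed to a human operator, and the operator's corrections are stored for the next retraining run. The 0.8 threshold, the item names, and the probabilities are illustrative assumptions, not recommended values.

```python
def triage(predictions, threshold=0.8):
    """Split model predictions into auto-accepted results and items
    that need human review.

    predictions: list of (item_id, {label: probability}) pairs.
    """
    accepted, needs_review = [], []
    for item_id, probs in predictions:
        label = max(probs, key=probs.get)  # most probable label
        confidence = probs[label]
        if confidence >= threshold:
            accepted.append((item_id, label))
        else:
            needs_review.append((item_id, label, confidence))
    return accepted, needs_review

def record_correction(training_set, features, expert_label):
    """Store the expert's decision so the next retraining run can learn from it."""
    training_set.append((features, expert_label))
    return training_set

# Hypothetical model outputs for three images.
predictions = [
    ("img_001", {"normal": 0.97, "defect": 0.03}),
    ("img_002", {"normal": 0.55, "defect": 0.45}),
    ("img_003", {"normal": 0.10, "defect": 0.90}),
]
accepted, needs_review = triage(predictions)

# The operator reviews img_002 and decides it is a defect; the features are made up.
training_set = record_correction([], (0.30, 0.60), "defect")
```

Raising the threshold sends more cases to the operator and lowers the risk of an unchecked error. Lowering it does the opposite, so the value is a process decision as much as a technical one.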

At Enthought, we’ve applied this approach in our custom solutions, as well as in our products for energy exploration: SubsurfaceAI and the Thin Section Tool.

We’ve found that materials labs that harness computer vision and machine learning can see massive returns on investment through process improvements across the entire lab. For example:

  • Increased Efficiency: When critical or routine measurements are performed faster, the lab as a whole can operate more efficiently
  • Increased Operator Satisfaction: Reducing redundancies in workflows enables a more fulfilling work experience
  • Improved Processes: Experts who are often called in for their judgement can spend more time improving the process rather than running it
  • Reduced or Eliminated Risk: Business risks from process inconsistencies are reduced or eliminated entirely
  • Increased Digital Awareness: Greater awareness and education within the lab help inform future transformation and automation priorities

By transforming workflows, labs are not only improving accuracy and efficiency but also enabling new innovations. Machine learning in materials discovery enables the automation of human-driven processes, helping scientists develop a sense for how appropriate technology can improve their laboratory and their work. The outcomes are further enhanced when those scientists have some ownership over the solution.

To learn more about how we turn scientists into digital leaders who drive laboratory innovation, contact us today.
