AI Needs the ‘Applied Sciences’ Treatment

As industries rapidly advance in AI/machine learning, a key to unlocking the power of these approaches for companies is an enabling environment. Domain experts need to be able to use artificial intelligence on data relevant to their work, but they should not have to know computer or data science techniques to solve their problems. An environment which enables the domain expert to easily and intuitively label data and train models will allow AI to become truly ‘applied.’ The above image shows a series of fault planes predicted by our approach in the SubsurfaceAI Seismic application, created with ‘applied machine learning’ in mind.

The Rise of ‘Applied Machine Learning’ and Geoscience

A generally accepted definition of ‘applied sciences’ is the use of the scientific method and associated knowledge to solve practical problems. The next phase for artificial intelligence now gaining attention is ‘applied AI,’ with an analogous definition being ‘the use of AI methods and associated domain expert knowledge to solve practical problems,’ with those underlying methods transparent to the expert owning the challenge and using the tools. 

In the last few years, the field of AI/machine learning has experienced rapid advances in capabilities and applications. A search on the words ‘applied machine learning’ provides a host of engaging articles. These articles also indicate AI capabilities have been driven largely by data scientists and experts in coding/model building using general approaches. 

Many of these capabilities have yet to be easily accessible to domain experts in a way that enables them to rapidly adapt them to solve their specific problems – in other words, to be truly ‘applied.’ 

Machine learning models themselves are increasingly becoming commoditized, freely available, and easy to build proof-of-concept work around. In business terms, this means that internally developed machine learning models will soon stop differentiating a company’s offering.

Machine learning models have become more like plug-and-play building blocks that can be fit into a solution. This makes it very easy to rapidly test and prototype solutions, but it does not solve the critical problems associated with actually putting AI solutions into the hands of domain experts in a way that lets them adapt those solutions into working tools.

The problem is one of industries moving to ‘applied machine learning.’ A key question for this transition is how do you set up a solution that gives a domain expert access to the machine learning tools in an environment where they can easily and intuitively train machine learning models? 

That is a much trickier proposition than prototyping a solution, and it’s why we’re seeing recent high valuations for companies such as Scale and Labelbox, which are focused on providing a way to operationalize AI for business. 

It’s All About the User and Labeled Data 

The machine learning models themselves are important, and it is necessary to test different networks, different models, and various ways of layering models to arrive at reasonable results. However, many of these models are essentially commodities. So, although you need to fiddle with them, a domain expert can take them off the shelf and connect different ones in various ways to test different solutions relatively quickly and easily. 

Increasingly, labeled data is the key. Things get more challenging when it comes to how the domain expert will interact with the machine learning models and with the data they are interested in manipulating or analyzing. A lot of effort from many companies has gone into labeling, analyzing, and interpreting everyday types of data. In the B2C world, this is dominated by pictures of people, roads, cars, or usage patterns of consumers of various services, such as social media platforms and video streaming services. 

These efforts have resulted in a large amount of labeled data on objects that commonly appear in our world. But areas that have been left behind in those efforts include many of the sciences, where datasets are typically much smaller, fewer people work on them, and fewer still can label the data correctly.

For example, let’s say an exploration team wants geologic core (rock) data labeled such that the stratigraphy is highlighted, as well as the general makeup of each of the stratigraphic layers (e.g., sand, shale and carbonates). They can’t just let someone with no geoscience background do the labeling. The result would be a bunch of meaningless training data. That’s the situation many science-based companies are in. They have good data, maybe not on a large scale, but enough to use AI and ML to good effect. However, they lack the labeling technology and labeled data to use it effectively.

So, the user interface to the data and AI models is a really important thing to build simply and intuitively. The domain expert must be able to easily and intuitively interact with the data and rapidly build out training data on which to run AI models.

The Ultimate Prize for the Geoscientist 

The ultimate prize in the subsurface world of energy is for the geoscientist to be training the machine learning model while labeling the data. This is a revolutionized workflow – one that completely removes any role for an intermediary such as a data scientist and one that enables the domain expert to utilize a model that will interpret the way they do.

In the energy industry subsurface world, one could envision analogs to ImageNet, for example a ‘Seismic ImageNet,’ a ‘WellLogNet,’ and a ‘CoreCTScanNet,’ as open source datasets. Enough open source data is rapidly becoming available to develop high-quality models from such datasets.

Automated, iterative image labeling integrated with models makes this possible. As a result, companies with massive amounts of subsurface data exclusive to them will find their advantage in big data approaches eroding.
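To make the idea of labeling and training in the same loop concrete, here is a minimal sketch in Python. It is not how SubsurfaceAI Seismic works; it is a toy illustration under stated assumptions. A simple nearest-centroid classifier stands in for a deep learning model, the two numeric features per sample are made up, and the function names (labeling_loop, ask_expert, and so on) are hypothetical. The point is only the workflow: the model asks the expert to label the sample it is least sure about, then retrains immediately on the new label.

```python
def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def train(labeled):
    """Fit a nearest-centroid model: one centroid per label."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the sample."""
    return min(model, key=lambda label: distance(model[label], features))


def margin(model, features):
    """Gap between the two closest centroids; a small gap means the model is unsure."""
    d = sorted(distance(c, features) for c in model.values())
    return d[1] - d[0] if len(d) > 1 else float("inf")


def labeling_loop(seed, pool, ask_expert, rounds):
    """Alternate between asking the expert for one label and retraining."""
    labeled = list(seed)
    pool = list(pool)
    model = train(labeled)
    for _ in range(rounds):
        if not pool:
            break
        # Show the expert the sample the current model is least sure about.
        sample = min(pool, key=lambda f: margin(model, f))
        pool.remove(sample)
        labeled.append((sample, ask_expert(sample)))
        # Retrain right away so the next suggestion reflects the new label.
        model = train(labeled)
    return model, labeled


# Toy stand-in for the domain expert (hypothetical rule, illustration only).
def toy_expert(features):
    return "sand" if features[0] < 0.5 else "shale"


# Made-up, deterministic samples with two normalized features each.
toy_pool = [[x / 10, y / 10] for x in range(11) for y in (2, 8)]
toy_seed = [([0.1, 0.5], "sand"), ([0.9, 0.5], "shale")]

toy_model, toy_labeled = labeling_loop(toy_seed, toy_pool, toy_expert, rounds=10)
print(len(toy_labeled), "labeled samples after the loop")
```

In a real application, ask_expert would be the user interface in which the geoscientist marks up the data, and the retraining step would update a far larger model. The loop structure, with one label in and one model update out, is what keeps the expert in direct control of how the model learns to interpret.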

This prize is available, albeit at an early stage, for seismic interpretation in our recently developed custom deep learning application, SubsurfaceAI Seismic. If you would like to see how it meets the ‘applied machine learning’ test, please get in touch.

About the Author

Mason Dykstra is the Enthought vice president of Energy Solutions. As an intuitive thought leader, he helps oil and gas companies connect the dots between science, engineering, technology, and business needs. Mason leads the Enthought team of energy experts and scientists in tackling big problems that contribute to the bottom line. Connect with Mason on LinkedIn at linkedin.com/in/mason-dykstra-a304b25/ to join his online conversations.
