Science
Chinese researchers are trying to incorporate “prior knowledge” with data when training machines. Photo: Shutterstock

Chinese researchers hope to create ‘real AI scientists’ through ‘informed machine learning’

  • The scientists wrote in a recent paper that they had found ways to train machines with ‘prior knowledge’ such as the laws of physics or mathematical logic
  • Even cutting-edge models such as OpenAI’s text-to-video model Sora are currently unable to ‘accurately model the physics of many basic interactions’
Chinese researchers have developed a framework for training machine learning models that they hope could lead to the creation of “real AI scientists” capable of improving experiments and solving scientific problems.

Deep learning models have “revolutionised the field of scientific research” due to their ability to uncover relationships from large amounts of data, according to a paper published in the peer-reviewed Cell Press journal Nexus on Friday.

One recent example is Sora, a text-to-video model from the American company OpenAI, which the developers say can understand how “things exist in the real world”.
It has been widely praised for its advanced, realistic depictions of things and hailed as a massive step forward for generative AI, but the company has admitted it still struggles to simulate some aspects of the real world and cannot “accurately model the physics of many basic interactions, like glass shattering”.

Sora is trained using large amounts of visual data, allowing it to pick up patterns to generate images and videos that mimic reality. But it is not trained to understand physical laws such as gravity.

“Without a fundamental understanding of the world, a model is essentially an animation rather than a simulation,” said Chen Yuntian, study author and a professor at the Eastern Institute of Technology (EIT).

Deep learning models are generally trained using data and not prior knowledge, which can include things such as the laws of physics or mathematical logic, according to the paper.

But the scientists from Peking University and EIT wrote that when training the models, prior knowledge could be used alongside data to make them more accurate, creating “informed machine learning” models capable of incorporating this knowledge into their output.
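
As a rough illustration of the idea, and not the researchers' own code, the sketch below fits a straight line to a few noisy measurements twice: once from data alone, and once with an extra penalty that encodes a piece of prior knowledge. The spring readings, the rule that force is zero at zero extension, the weight `rule_weight` and the function name `fit_line` are all assumptions made for this example.

```python
def fit_line(xs, ys, rule_weight=0.0):
    """Fit y = w*x + b by minimising
    mean((w*x + b - y)**2) + rule_weight * b**2  (solved in closed form).

    The second term is the hypothetical "prior knowledge": a physical
    rule saying the line should pass through the origin (b = 0).
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations of the penalised loss:
    #   w*sxx + b*sx    = sxy
    #   w*sx  + b*n_eff = sy,  with n_eff = n * (1 + rule_weight)
    n_eff = n * (1.0 + rule_weight)
    det = sxx * n_eff - sx * sx
    w = (sxy * n_eff - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return w, b


# Made-up spring measurements: extension (cm) and force (N).
# The values were invented around a slope of 2 with some noise.
extension = [1.0, 2.0, 3.0, 4.0]
force = [2.3, 3.9, 6.4, 7.8]

# Data only: the intercept is free to absorb the noise.
w_data, b_data = fit_line(extension, force)

# Data plus rule: the penalty pulls the intercept towards zero.
w_informed, b_informed = fit_line(extension, force, rule_weight=10.0)

print("data only:     slope", w_data, "intercept", b_data)
print("data and rule: slope", w_informed, "intercept", b_informed)
```

In this toy setting, raising `rule_weight` shifts the balance from the measurements towards the rule, which is the trade-off between data and knowledge that the researchers describe.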

Deciding what prior knowledge – which can include things such as functional relationships, equations and logic – to incorporate into a model for it to “pre-learn” was a challenge, and incorporating multiple rules could also cause models to collapse, the team wrote.

“When faced with a high volume of knowledge and rules – which is often the case – current informed machine learning models tend to struggle or even fail,” Chen said.

To address this issue, the researchers created a framework to assess the value of rules and determine which combinations resulted in the most predictive models.

“Embedding human knowledge into AI models has the potential to improve their efficiency and ability to make inferences, but the question is how to balance the influence of data and knowledge,” Xu Hao, first author and researcher at Peking University, said in a Cell Press statement.

“Our framework can be employed to evaluate different knowledge and rules to enhance the predictive capability of deep learning models.”

The framework calculates “rule importance”, looking at how a specific rule or combination of rules affects the predictive accuracy of a model, according to the paper.
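
One generic way to score a rule by its effect on accuracy is to average how much the model's error drops when the rule is added, over every order in which the rules could be introduced (a Shapley-style attribution). The sketch below does this for three rules. It is an assumption that this resembles the paper's calculation; the rule names and the error table `TOY_ERRORS` are invented for illustration and are not results from the study.

```python
from itertools import permutations

# Hypothetical validation error of a model trained with each subset of
# rules (lower is better). These numbers are invented for the example.
TOY_ERRORS = {
    frozenset(): 1.00,
    frozenset({"governing_equation"}): 0.60,
    frozenset({"boundary_condition"}): 0.90,
    frozenset({"irrelevant_rule"}): 1.00,
    frozenset({"governing_equation", "boundary_condition"}): 0.45,
    frozenset({"governing_equation", "irrelevant_rule"}): 0.60,
    frozenset({"boundary_condition", "irrelevant_rule"}): 0.90,
    frozenset({"governing_equation", "boundary_condition",
               "irrelevant_rule"}): 0.45,
}


def rule_importance(rules, error_of):
    """Average error reduction credited to each rule over all orderings."""
    totals = {rule: 0.0 for rule in rules}
    orderings = list(permutations(rules))
    for order in orderings:
        active = frozenset()
        for rule in order:
            with_rule = active | {rule}
            # Credit the rule with the error it removes at this step.
            totals[rule] += error_of(active) - error_of(with_rule)
            active = with_rule
    return {rule: total / len(orderings) for rule, total in totals.items()}


rules = ["governing_equation", "boundary_condition", "irrelevant_rule"]
scores = rule_importance(rules, TOY_ERRORS.__getitem__)

for rule, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(rule, round(score, 3))
```

A rule that never changes the error receives a score of zero, and the scores add up to the total improvement from using all the rules together. Enumerating every ordering is only practical for a handful of rules; larger rule sets would need sampling or another approximation.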

Teaching the AI models about such rules – for example, the laws of physics – could make them “more reflective of the real world, which would make them more useful in science and engineering”, Chen from EIT said in the statement.

The researchers tested their framework by using it to optimise two models: one for solving multivariate equations and another for predicting the results of a chemistry experiment.

Chen said that in the short term this framework would be the most useful in scientific models “where consistency between the model and physics rules is crucial to avoid potentially disastrous consequences”.

The team hopes to take their framework further to allow AI to identify its own knowledge and rules directly from data without human interference.

“We want to make it a closed loop by making the model into a real AI scientist,” Chen said in the statement. The team is developing an open-source plug-in tool for AI developers that could allow them to achieve this.

However, the team has already identified at least one problem.

During the study, the team found that when more data is added to a model, general rules become more significant than specific local rules. That does not help in fields such as biology and chemistry, however, because they “often lack readily available general rules akin to governing equations”.
