Content provided by IBM and TNW.
Today’s AI systems are rapidly evolving to become humans’ new best friend. We now have AIs that can whip up award-winning whiskey, write poetry, and help doctors perform pinpoint-accurate surgeries. But one thing they can’t do – which is, on the face of it, much simpler than all those other things – is to use common sense.
Common sense is different from intelligence in that it is generally innate and natural to humans, helps them navigate everyday life, and cannot really be taught. In 1906, philosopher G.K. Chesterton wrote that common sense is "a wild thing, savage, and beyond rules."
Bots, of course, run on algorithms that are just that: rules.
So no, robots can’t use common sense yet. But thanks to current efforts in the field, we can now measure the basic psychological reasoning ability of an AI, which brings us one step closer.
So why does it matter if we teach AI common sense?
In reality, it boils down to the fact that common sense will make AI more effective at helping us solve real-world problems. Many argue that AI-based solutions designed for complex problems, like diagnosing and treating Covid-19, often fail because the system cannot easily adapt to real-world situations, where problems are unpredictable, vague, and not defined by rules.
Injecting common sense into AI could mean great things for humans: better customer service, where a bot can actually help an unhappy customer instead of sending them into an endless "Choose from the following options" loop; self-driving cars that respond better to unexpected road incidents; even military analysts deriving life-or-death information from intelligence.
So why haven’t scientists been able to crack the common sense code until now?
Called the “dark matter of AI,” common sense is both crucial to the future development of AI and, so far, elusive. Equipping computers with common sense has actually been a goal of computing since the very beginning of the field; in 1958, pioneering computer scientist John McCarthy published an article titled “Programs with common sense” which examined how logic could be used as a method of representing information in computer memory. But we haven’t made much progress since then to make it a reality.
Common sense not only includes social skills and reasoning, but also a “naive sense of physics” – this means that we know certain things about physics without having to work out physical equations, like why you shouldn’t put a bowling ball on an inclined surface. It also includes basic knowledge of abstract things like time and space, which enables us to plan, estimate and organize. “It is the knowledge that you should have,” says Michael Witbrock, an AI researcher at the University of Auckland.
All of this means that common sense is not a precise thing and therefore cannot be easily defined by rules.
We’ve established that common sense requires a computer to infer things based on complex real-world situations – something that comes easily to humans and begins to form from childhood.
Computer scientists are making slow but steady progress toward building AI agents that can infer mental states, predict future actions, and work with humans. But to see how close we really are, we first need a rigorous benchmark to gauge an AI's "common sense," or psychological reasoning ability.
Researchers from IBM, MIT, and Harvard came up with exactly that: AGENT, which stands for Action, Goal, Efficiency, coNstraint, uTility. After testing and validation, this benchmark has proved capable of assessing the basic psychological reasoning ability of an AI model, a prerequisite for AI that has a sense of social awareness and can interact with humans in real-life settings.
So what is AGENT? AGENT is a large-scale dataset of 3D animations inspired by experiments that study cognitive development in children. Animations depict a person interacting with different objects under different physical constraints. According to IBM:
"The videos include separate trials, each comprising one or more 'familiarization' videos of an agent's typical behavior in a certain physical environment, combined with 'test' videos of the same agent's behavior in a new environment, which are labeled as 'expected' or 'surprising' given the agent's behavior in the corresponding familiarization videos."
A model must then judge the degree of surprise of the agent's behavior in the "test" videos, based on the actions it has learned in the "familiarization" videos. Using the AGENT benchmark, the model's judgments are then validated against large-scale human evaluation trials, in which humans rated "surprising" test videos as more surprising than "expected" ones.
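As a rough illustration of this evaluation protocol, the sketch below scores a model by checking whether it rates each "surprising" test video as more surprising than its matched "expected" one. The data layout and ratings are hypothetical, not the actual AGENT format:

```python
# Hypothetical sketch of AGENT-style validation: for each trial, the model's
# surprise rating on the "surprising" test video should exceed its rating on
# the matched "expected" video. All names and numbers are illustrative.

def score_trials(trials):
    """Return the fraction of trials where the model rated the
    'surprising' video as more surprising than the 'expected' one."""
    correct = sum(
        1 for t in trials
        if t["rating_surprising"] > t["rating_expected"]
    )
    return correct / len(trials)

# Three trials with made-up surprise ratings in [0, 1]
trials = [
    {"rating_expected": 0.2, "rating_surprising": 0.9},
    {"rating_expected": 0.4, "rating_surprising": 0.7},
    {"rating_expected": 0.6, "rating_surprising": 0.3},  # the model fails this trial
]
print(score_trials(trials))  # -> 0.6666666666666666
```

A model aligned with human judgment would score close to 1.0 on such paired comparisons.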
IBM's trial shows that to exhibit common sense, an AI model must have built-in representations of how humans plan. That means combining a basic sense of physics with "cost-reward trade-offs": an understanding that humans take action "based on utility, trading the rewards of its goal against the costs of reaching it."
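The cost-reward trade-off can be sketched as a toy utility calculation (an illustrative model, not IBM's actual formulation): a rational agent picks the action whose reward minus cost is highest, and behavior that violates this is what an observer would treat as surprising:

```python
# Toy cost-reward trade-off: a rational agent picks the action that
# maximizes utility = reward - cost. Action names and values are made up.

def best_action(actions):
    """Pick the action with the highest utility (reward minus cost)."""
    return max(actions, key=lambda a: a["reward"] - a["cost"])

actions = [
    {"name": "climb_over_wall", "reward": 10, "cost": 7},   # utility 3
    {"name": "walk_around_wall", "reward": 10, "cost": 4},  # utility 6
]
print(best_action(actions)["name"])  # -> walk_around_wall
```

If the agent in a test video climbed over the wall when walking around it was cheaper, a utility-based observer would flag that behavior as surprising.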
Although not yet perfect, the results show that AGENT is a promising diagnostic tool for developing and assessing common sense in AI, an area IBM continues to work on. They also show that we can draw on traditional developmental psychology methods, similar to those used to study how children learn about objects and agents.
In the future, this could significantly reduce the amount of training these models need, saving companies energy, time, and money.
Robots don’t yet understand human consciousness, but with the development of benchmarking tools like AGENT, we’ll be able to gauge how close we’re getting.