How AI Works (or Doesn't)

Artificial intelligence (AI) has been a hot research area for a long time. Amazon's Alexa is an AI assistant that helps you use and control many electronic systems. But AI systems still do not seem to have "common sense". Today, machines can be given words with which to generate an entire article (like this one!). But since they have no experience of the real world (they are not living, breathing beings), they may not be able to write sentences that sound natural and sensible. For instance, here is a sentence generated by a state-of-the-art model from the words "dog, frisbee, throw, catch": "Two dogs are throwing frisbees at each other." It certainly reads like a reasonable sentence, except that it doesn't make sense! This is a fundamental challenge in developing AI. The machine's lack of experience of the human world shows up, for instance, when you ask a robot to bring you hot milk: it may not understand that you want just a glass of it and not the entire vessel!

The common sense test

Common-sense reasoning is the hardest thing to teach. Some claim that modern deep-learning models can now reach around 90% accuracy, but this is hotly contested. Take a very simple sentence such as "I'm afraid that I don't have the money to buy you ice-cream." You immediately understand that the speaker is, in a sense, apologising for not being able to buy you ice-cream. But think of how this sounds to an artificial intelligence such as a robot. The word "afraid" means to fear something; what is there to be afraid of with ice-cream?! "I don't have the money for ice-cream." is entirely clear, and any robot will be able to understand it. But the first part of the sentence is very hard for a robot to understand, since it has no experience of emotions such as regret. Understanding such sentences is part of what is called Natural Language Processing (NLP), since they occur naturally when we speak.
To evaluate different machine models, Ren and his student Lin, who are experts in natural language processing, developed CommonGen, a benchmark for testing the generative common sense of machines. The researchers presented a dataset consisting of 35,141 concepts associated with 77,449 sentences. They found that even the best-performing model achieved an accuracy of only 31.6%, while humans did much better at 63.5%. "Robots need to understand natural scenarios in our daily life before they make reasonable actions to interact with people," said Lin. "By introducing common sense and other domain-specific knowledge to machines, I believe that one day we can see AI agents generate natural responses and interact with our lives."
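To give a feel for the kind of task CommonGen poses, here is a minimal sketch (not the researchers' actual code) of the setup: a model is given a small set of concepts and must produce one sentence that uses all of them. A simple script can check coverage of the concepts, but, as the example sentences below (which are hypothetical) show, coverage alone cannot tell a sensible sentence from a nonsensical one; that judgement still needs human evaluators or a learned metric.

```python
def covers_concepts(sentence: str, concepts: set) -> bool:
    """Return True if every concept word appears verbatim in the sentence.

    A naive check: real benchmarks also need to match word forms like
    "throws" or "caught", which this sketch deliberately ignores.
    """
    words = set(sentence.lower().replace(".", "").split())
    return concepts <= words

concepts = {"dog", "frisbee", "throw", "catch"}

# Both candidates cover all four concepts, but only one makes sense.
sensible = "a man will throw a frisbee and his dog will catch it"
nonsense = "a dog and a frisbee throw and catch each other"

print(covers_concepts(sensible, concepts))  # True
print(covers_concepts(nonsense, concepts))  # True -- coverage can't spot nonsense
```

This is why the researchers report such a large gap between machines and humans: producing a grammatical sentence that mentions the right words is easy, but choosing the arrangement that reflects how the world actually works is the hard, common-sense part.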