Winograd Schema Challenge: A Turing Test Alternative Puts AI Programs To Shame

Short Bytes: Alternatives to the Turing test are being explored. A major one is the Winograd Schema Challenge, in which AI bots have to answer specially crafted questions to demonstrate their intelligence and common sense.

The development of artificial intelligence has been on a fast track over the last decade, with computing devices becoming smarter every day. We have Siri and Cortana, which can understand spoken queries and present an appropriate answer. These voice assistants do the basic job of understanding the meaning of the sentences put to them.

Another use of artificial intelligence is to create AI bots that think like humans, capable of deciphering natural language and replying accordingly. The Turing test, named after Alan Turing, is the classic method used to judge the intelligence of an AI bot. But this test has its pitfalls and loopholes. An often-quoted example is Eugene Goostman, the chatbot that was claimed to be the first to pass the Turing test after convincing more than 30 percent of the judges that it was a Ukrainian boy.

The search for alternatives to the Turing test led to the Winograd Schema Challenge (WSC), which is based on the Winograd schemas named after Terry Winograd of Stanford University. The WSC was proposed by University of Toronto computer scientist Hector Levesque to test machine intelligence with multiple-choice questions.

The questions designed for the Winograd Schema Challenge follow a very specific format. Here is a sample question:

The trophy would not fit in the brown suitcase because it was too big (small). What was too big (small)?

  • Answer 0: the trophy
  • Answer 1: the suitcase

Answering this question requires common sense to decide what “it” refers to, which is a complex task for computers. The answer is 0 if the word “big” is used in the sentence and 1 if the word “small” is used instead. A computer program may scratch its head over that, though not literally.
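The structure of the sample question above can be sketched as a small data structure. This is only an illustrative sketch in Python, not the format used by the challenge organizers: the class name `WinogradSchema`, its fields, and the `{word}` placeholder are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class WinogradSchema:
    """Hypothetical container for one schema; not the official WSC format."""
    template: str      # sentence with a {word} slot for the special word
    question: str      # question with the same {word} slot
    candidates: tuple  # (answer 0, answer 1)
    answers: dict      # special word -> index of the correct candidate

    def render(self, word):
        # Fill the special word into both the sentence and the question.
        return self.template.format(word=word) + " " + self.question.format(word=word)

    def correct_answer(self, word):
        # Swapping the special word flips which candidate is correct.
        return self.candidates[self.answers[word]]


trophy = WinogradSchema(
    template="The trophy would not fit in the brown suitcase because it was too {word}.",
    question="What was too {word}?",
    candidates=("the trophy", "the suitcase"),
    answers={"big": 0, "small": 1},
)

for word in ("big", "small"):
    print(trophy.render(word), "->", trophy.correct_answer(word))
```

The point of the sketch is that a single word swap changes the correct answer, so a program cannot rely on the surface pattern of the sentence alone.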

Now, coming to the challenge: to claim the $25,000 cash prize, a program must attain 90% accuracy. This year’s Winograd Schema Challenge was held at IJCAI 2016 in NYC on July 12.

Out of all the participants, two of them, USTC prof. Quan Liu and Open University of Cyprus researcher Nicos Isaak, took the lead, with their programs achieving the highest accuracy of 48 percent. For comparison, choosing the answers at random would have yielded an accuracy of 45 percent.
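The comparison between a program's score, the chance baseline, and the prize threshold can be sketched in a few lines of Python. This is a minimal illustration, not the organizers' scoring code: the function names are hypothetical, and treating the 90% requirement as "at least 90 percent" is an assumption.

```python
PRIZE_THRESHOLD = 0.90  # from the article; ">=" is an assumption


def accuracy(predictions, gold):
    """Fraction of questions where the predicted index matches the gold index."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def expected_chance_accuracy(num_candidates_per_question):
    """Expected accuracy of uniform random guessing: the mean of 1/k per question."""
    return sum(1 / k for k in num_candidates_per_question) / len(num_candidates_per_question)


def qualifies_for_prize(acc):
    return acc >= PRIZE_THRESHOLD


# Toy data for illustration only, not real contest results.
gold = [0, 1, 0, 0]
predictions = [0, 1, 1, 0]
print(accuracy(predictions, gold))
print(expected_chance_accuracy([2, 2, 2, 2]))
print(qualifies_for_prize(0.48))
```

On questions with exactly two options, random guessing would be expected to score 50 percent. The reported 45 percent baseline is the article's figure for the actual question set and is not reproduced by this toy data.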

The WSC contest advisor isn’t surprised that the competing AI bots managed only this much. “It’s unsurprising that machines were barely better than chance,” he says. He did praise the artificial intelligence R&D going on at companies like Google and Facebook. “It could’ve been that those guys waltzed into this room and got a hundred percent,” he laughs. “But that would’ve astounded me.”

Enhancing the intelligence of these artificial chatting brains is a strenuous task, especially giving them ‘common sense’, which even humans lack at times. In fact, they are just pieces of software stuffed with grammar rules and large chunks of everyday human conversation. Some things are way beyond the mathematical calculations and assumptions made by these AI talkers. But we can hope that future bots will have a higher intellectual level than their ancestors. A few decades ago, we didn’t have the ancestor bots either.

What do you think about the Winograd Schema Challenge? Tell us in the comments below.

Aditya Tiwari