A premier artificial intelligence program pitted against a standard high school mathematics test: what could possibly go wrong? A lot, as the abysmally low score of Google's DeepMind on a math test designed for 16-year-old high school students in the UK suggests.
Researchers from Google's DeepMind this week published a paper explaining how they had attempted to train neural networks to solve basic arithmetic, calculus, and algebra problems. Surprisingly, DeepMind managed to solve only 14 of the 40 questions, equivalent to an E grade for a British high schooler. The questions were based on "a national school mathematics curriculum (up to age 16), restricted to textual questions (thus excluding geometry questions), which gave a comprehensive range of mathematics topics that worked together as part of a learning curriculum."
So what went wrong?
DeepMind even flunked an easy question like ‘What is the sum of 1+1+1+1+1+1+1?’, leaving researchers puzzled.
The paper, "Analysing Mathematical Reasoning Abilities of Neural Models," was intended to provide a benchmark test set upon which others can build in order to develop neural networks for math learning, similar to how ImageNet was created as an image recognition benchmark test.
A Medium article explains that the researchers found that algorithms struggle to translate a question as it appears on a test, full of words and symbols and functions, into the actual operations needed to solve it.
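To make the translation step concrete, here is a minimal Python sketch that pulls the arithmetic expression out of the sample question and evaluates it. It is not taken from the paper: the function names and the regular expression are illustrative assumptions, and the sketch handles only plain arithmetic with +, -, * and /. It shows that once the words have been turned into operations, the computation itself is trivial.

```python
import ast
import operator
import re

# Illustrative helper, not from the DeepMind paper.
# Only the four basic arithmetic operators are supported.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        left = _evaluate(node.left)
        right = _evaluate(node.right)
        return _OPERATORS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def answer_question(question):
    """Find the first arithmetic expression in the text and compute it."""
    # Assumed pattern: a digit followed by digits, spaces, and operators.
    match = re.search(r"\d[\d\s+\-*/]*", question)
    if match is None:
        raise ValueError("no arithmetic expression found")
    expression = match.group(0).strip()
    tree = ast.parse(expression, mode="eval")
    return _evaluate(tree.body)


print(answer_question("What is the sum of 1+1+1+1+1+1+1?"))  # prints 7
```

A hand-written parser like this is told explicitly where the expression is and what each symbol means. The neural networks in the study had no such rules given to them and had to infer the mapping from question text to operations from examples.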
According to the researchers, even a simple math problem involves a great deal of brainpower, as people learn to automatically make sense of mathematical operations, memorize the order in which to perform them, and know how to turn word problems into equations.
Artificial intelligence, on the other hand, is programmed to pore over data, scanning for patterns and analyzing them.