Learning occurs most quickly when the difficulty of the training is adjusted to keep the learner’s accuracy at around 85%, a new US study shows.
The finding adds weight to the widely accepted idea that kids, in particular, learn best when given challenges just beyond their current ability level.
This produces a “sweet spot in which training is neither too easy nor too hard and where learning progresses most quickly”, write Robert Wilson, from the University of Arizona, and colleagues in the journal Nature Communications.
There’s much evidence to support this, Wilson says, but until now there has been no theoretical account of why it works, or of where that peak difficulty level sits, whether in training people or machines.
His team’s approach was to apply “gradient descent” algorithms, which learn by trying to reduce errors.
“As an example,” he explains, “imagine trying to learn how to classify images as either dogs or cats – you feed in an image to the model, the model processes the image with some parameters, and then it spits out an answer (cat or dog) that could be right or wrong.
“Gradient descent algorithms work by adjusting their parameters in such a way as to reduce error (on average) over time. In the simplest case it’s trial and error learning – if the model answers correctly it shouldn’t change its parameters, but if it answers incorrectly, it should.”
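To make that trial-and-error picture concrete, here is a minimal sketch of gradient descent on a toy binary task – not the authors’ models, and the blob-shaped “cat” and “dog” data, feature dimensions and learning rate are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "cat vs dog" data: two overlapping Gaussian blobs of feature vectors.
n, dim = 500, 10
X = np.vstack([rng.normal(-0.5, 1.0, (n, dim)),   # class 0 ("cat")
               rng.normal(+0.5, 1.0, (n, dim))])  # class 1 ("dog")
y = np.concatenate([np.zeros(n), np.ones(n)])

w = np.zeros(dim)   # model parameters
lr = 0.1            # learning rate

for epoch in range(20):
    # Logistic regression trained by gradient descent: the further a
    # prediction is from the correct label, the harder it pulls the
    # parameters, so mistakes are what drive the learning.
    p = 1.0 / (1.0 + np.exp(-X @ w))    # predicted probability of "dog"
    grad = X.T @ (p - y) / len(y)       # gradient of the average log-loss
    w -= lr * grad                      # adjust parameters to reduce error
    accuracy = np.mean((p > 0.5) == y)
    print(f"epoch {epoch:2d}  accuracy {accuracy:.2f}")
```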
They showed that for binary classification tasks, such as cat versus dog, learning progresses quickest when the difficulty is set so that mistakes are made 15.87% of the time – or, conversely, when the learner’s accuracy sits at around 85%.
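The oddly precise 15.87% appears to be the standard normal cumulative distribution function evaluated at -1 – a consequence, we assume, of the Gaussian-noise assumptions in the paper’s derivation, which the article does not detail. A quick check:

```python
from math import erf, sqrt

def phi(x):
    """Standard normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

print(phi(-1.0))   # 0.15865..., i.e. the 15.87% error rate quoted above
```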
They tested the algorithm in three different models covering “artificial and biologically plausible neural networks”.
The artificial example involved completely random images (“think two kinds of TV static for the two categories”). The second was drawn from psychology, a task in which people must decide whether a pattern of moving dots is drifting left or right, and the third was a more realistic task of categorising handwritten digits as odd or even, or high or low.
In all cases they found the algorithm learned fastest when the difficulty was continually adjusted to hold accuracy at 85%. Further, they report that maintaining this sweet spot produced “exponentially faster” learning than training at a fixed difficulty level.
Wilson says the “Eighty Five Percent Rule” probably won’t apply directly to students, because they aren’t learning binary classification tasks – though it may apply to perceptual skills that improve with practice, such as when radiologists learn to detect tumours in MRI images.
It does, however, confirm that educators have been more or less hitting the mark.
“[T]he Eighty Five Percent Rule accords with the informal intuition of many experimentalists that participant engagement is often maximised when tasks are neither too easy nor too hard,” the authors write.
“Indeed it is notable that staircasing procedures (that aim to titrate task difficulty so that error rate is fixed during learning) are commonly designed to produce about 80-85% accuracy.”
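As a concrete illustration of such a staircasing procedure – our own sketch, not the authors’ method, with an invented simulated learner and arbitrary step sizes – task difficulty can be nudged up after each correct answer and down after each error, with the ratio of the two step sizes pinning accuracy near the 85% target:

```python
import math
import random

random.seed(1)

TARGET = 0.85
STEP_UP = 0.1                                  # make the task harder after a correct answer
STEP_DOWN = STEP_UP * TARGET / (1 - TARGET)    # make it easier after an error

# At equilibrium, correct answers and errors push difficulty by equal amounts
# on average, which pins accuracy near STEP_DOWN / (STEP_UP + STEP_DOWN) = TARGET.

ability, difficulty = 2.0, 0.0
correct_count = 0
trials = 2000

for _ in range(trials):
    # Hypothetical learner: the chance of answering correctly falls off
    # smoothly as difficulty approaches and exceeds ability.
    p_correct = 1.0 / (1.0 + math.exp(difficulty - ability))
    correct = random.random() < p_correct
    correct_count += correct
    difficulty += STEP_UP if correct else -STEP_DOWN

print(f"observed accuracy: {correct_count / trials:.3f}")   # hovers near 0.85
```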
The most direct applications of the study – and the rule – are for speeding up various machine learning algorithms, such as “multilayered feedforward and recurrent neural networks”, including “deep learning”, although the researchers note that their simplifying assumptions may not always hold in more complex scenarios.