Learning Forever, Backprop Is Insufficient

#ai #ml

Continual Learning, or lifelong learning, is becoming more popular in Machine Learning (ML). This research paper examines the decay of plasticity and shows that ordinary backpropagation is insufficient for continual learning. The non-stationarity inherent in many problems, especially in Reinforcement Learning (RL), makes them hard to keep learning over time. Continual Backpropagation (CBP) is proposed as a solution.

Outline:
0:00 - Overview
2:00 - Paper Intro
2:53 - Problems & Environments
8:11 - Plasticity Decay Experiments
11:45 - Continual Backprop Explained
15:54 - Continual Backprop Experiments
22:00 - Extra Interesting Experiments
25:34 - Summary

Paper source: https://arxiv.org/abs/2108.06325

Abstract:
The Backprop algorithm for learning in neural networks utilizes two mechanisms: first, stochastic gradient descent and second, initialization with small random weights, where the latter is essential to the effectiveness of the former. We show that in continual learning setups, Backprop performs well initially, but over time its performance degrades. Stochastic gradient descent alone is insufficient to learn continually; the initial randomness enables only initial learning but not continual learning. To the best of our knowledge, ours is the first result showing this degradation in Backprop's ability to learn. To address this issue, we propose an algorithm that continually injects random features alongside gradient descent using a new generate-and-test process. We call this the Continual Backprop algorithm. We show that, unlike Backprop, Continual Backprop is able to continually adapt in both supervised and reinforcement learning problems. We expect that as continual learning becomes more common in future applications, a method like Continual Backprop will be essential where the advantages of random initialization are present throughout learning.
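To make the generate-and-test idea from the abstract concrete, below is a minimal sketch in Python (NumPy) of ordinary SGD on a one-hidden-layer network with a small fraction of the lowest-utility hidden units continually reinitialized with fresh random weights. The utility measure, replacement rate, and the omission of a maturity threshold are simplified assumptions for illustration, not the paper's exact definitions.

# Minimal sketch of the generate-and-test idea behind Continual Backprop.
# Assumptions: one hidden layer, a simple contribution-based utility estimate,
# and a fixed replacement rate; the paper's exact utility measure and maturity
# threshold are simplified away here.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 10, 50, 1
W1 = rng.normal(0, 0.1, (n_hidden, n_in))   # small random initialization
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_out, n_hidden))
b2 = np.zeros(n_out)

lr = 0.01              # SGD step size
replace_rate = 0.001   # fraction of hidden units replaced per step (illustrative value)
utility = np.zeros(n_hidden)  # running estimate of each unit's contribution

def step(x, y):
    global W1, b1, W2, b2, utility
    # forward pass
    h = np.tanh(W1 @ x + b1)
    y_hat = W2 @ h + b2
    err = y_hat - y

    # backward pass: ordinary backprop / stochastic gradient descent
    dW2 = np.outer(err, h)
    dh = (W2.T @ err) * (1 - h ** 2)
    dW1 = np.outer(dh, x)
    W2 -= lr * dW2
    b2 -= lr * err
    W1 -= lr * dW1
    b1 -= lr * dh

    # generate-and-test: update utilities, then reinitialize the least useful units
    utility = 0.99 * utility + 0.01 * np.abs(h) * np.abs(W2).sum(axis=0)
    n_replace = rng.poisson(replace_rate * n_hidden)
    if n_replace > 0:
        worst = np.argsort(utility)[:n_replace]
        W1[worst] = rng.normal(0, 0.1, (n_replace, n_in))  # fresh small random input weights
        b1[worst] = 0.0
        W2[:, worst] = 0.0   # zero outgoing weights so replacement does not perturb the output
        utility[worst] = 0.0

# usage: learn continually on a (slowly drifting, non-stationary) stream
for t in range(1000):
    x = rng.normal(size=n_in)
    y = np.array([np.sin(x[0]) + 0.1 * t / 1000])
    step(x, y)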

