The Total Derivative: Correcting the Misconception of Backpropagation’s Chain Rule
This article draws on concepts from this brilliant paper. For a deeper understanding of the mathematics, please refer to the paper. Here we try to present the math in a more intuitive and explicit way, with some important nuances highlighted.

1 Introduction

Discussions about backpropagation often say we use the ‘chain rule’ to derive the gradient […]