Typical Steps for Backpropagation

Let’s outline the procedure with your example:

Given function:

f(x, y, z) = (x + y) * max(y, z)

and inputs x, y, z (with the numeric values from your example).


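Step 1: Rewrite the function

Introduce an intermediate variable for each operation, so that f becomes a composition of simple pieces: a = x + y, b = max(y, z), f = a * b.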


Step 2: Draw the computation graph

x ------+
        +---> a = x + y -------+
y --+---+                      |
    |                          +---> f = a * b
    +---+                      |
        +---> b = max(y, z) ---+
z ------+

Step 3: Calculate gradients of each local function

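  • ∂f/∂a = b, ∂f/∂b = a (since f = a * b),
  • ∂a/∂x = 1, ∂a/∂y = 1 (since a = x + y),
  • ∂b/∂y = 1 if y > z, else 0; ∂b/∂z = 1 if z > y, else 0 (since b = max(y, z)).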

Step 4: Fill in values on the graph

  • a = x + y, b = max(y, z), and f = a * b, each evaluated at the given inputs.
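As a rough sketch of this forward pass, the snippet below evaluates each node of the graph and keeps the intermediate values, since the backward pass needs them. The input values 1.0, 2.0, 0.0 and the name `forward` are illustrative assumptions chosen here, not the inputs from the example above.

```python
def forward(x, y, z):
    """Evaluate f = (x + y) * max(y, z), keeping every intermediate node."""
    a = x + y          # sum node
    b = max(y, z)      # max node
    f = a * b          # product node
    return {"a": a, "b": b, "f": f}


# Illustrative inputs (assumed for this sketch only).
values = forward(1.0, 2.0, 0.0)
print(values)  # {'a': 3.0, 'b': 2.0, 'f': 6.0}
```

Storing a and b rather than only f is the point of the exercise: each local gradient in Step 3 is expressed in terms of these cached values.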

Step 5: Compute total gradients with respect to inputs

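Using the chain rule, and summing over every path from an input to f (y reaches f through both a and b):

∂f/∂x = ∂f/∂a · ∂a/∂x = b
∂f/∂y = ∂f/∂a · ∂a/∂y + ∂f/∂b · ∂b/∂y = b + a · ∂b/∂y
∂f/∂z = ∂f/∂b · ∂b/∂z = a · ∂b/∂z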
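The whole procedure can be sketched in a few lines of Python and checked against finite differences. This is a minimal illustration, not a definitive implementation: the names `backward` and `numeric_grad`, the inputs 1.0, 2.0, 0.0, and the tie-breaking convention at y == z (the gradient is routed to y) are all assumptions made for this sketch.

```python
def f(x, y, z):
    return (x + y) * max(y, z)


def backward(x, y, z):
    """Gradients of f = (x + y) * max(y, z) via the chain rule."""
    # Forward pass: cache the intermediate nodes.
    a = x + y
    b = max(y, z)

    # Local gradients of each node.
    df_da, df_db = b, a                # f = a * b
    da_dx, da_dy = 1.0, 1.0            # a = x + y
    db_dy = 1.0 if y >= z else 0.0     # b = max(y, z); tie goes to y (assumed convention)
    db_dz = 1.0 - db_dy

    # Chain rule; y has two paths to f, so its contributions are summed.
    return {
        "x": df_da * da_dx,
        "y": df_da * da_dy + df_db * db_dy,
        "z": df_db * db_dz,
    }


def numeric_grad(fn, args, i, eps=1e-6):
    """Central finite difference of fn with respect to args[i]."""
    hi, lo = list(args), list(args)
    hi[i] += eps
    lo[i] -= eps
    return (fn(*hi) - fn(*lo)) / (2 * eps)


inputs = (1.0, 2.0, 0.0)  # illustrative values, assumed for this sketch
grads = backward(*inputs)
print(grads)  # {'x': 2.0, 'y': 5.0, 'z': 0.0}

# Sanity check against finite differences.
for i, name in enumerate("xyz"):
    assert abs(grads[name] - numeric_grad(f, inputs, i)) < 1e-4
```

The finite-difference check is only reliable away from the kink at y == z, where max is not differentiable and the analytic result depends on the chosen convention.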


This matches the total gradients you observed: