Typical Steps for Backpropagation

Let’s outline the procedure with your example:

Given the function

$$f(x, y, z) = (x + y)\,\max(y, z)$$

and inputs $x$, $y$, $z$.


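Step 1: Rewrite the function

Introduce one intermediate variable per operation, so that every node is a simple function of its direct inputs:

$$a = x + y, \qquad b = \max(y, z), \qquad f = a \cdot b$$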

Step 2: Draw the computation graph

x ------+
        +---> a = x + y ------+
y --+---+                     |
    |                         * ---> f = a * b
    +---+                     |
        +---> b = max(y, z) --+
z ------+
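The graph can be evaluated node by node, which is the forward pass. The following is a minimal Python sketch; the function name `forward` and the input values are assumptions chosen for illustration, not values taken from the original example.

```python
def forward(x, y, z):
    """Evaluate the graph node by node and return every intermediate value."""
    a = x + y        # sum node
    b = max(y, z)    # max node
    f = a * b        # product node (the output)
    return a, b, f


# Hypothetical inputs, used only to illustrate the call.
a, b, f = forward(1.0, 2.0, 0.0)
print(a, b, f)  # 3.0 2.0 6.0
```

Keeping `a` and `b` around matters because the backward pass reuses them.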

Step 3: Calculate gradients of each local function

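  • $\partial a/\partial x = 1$, $\partial a/\partial y = 1$,
  • $\partial b/\partial y = \mathbb{1}[y > z]$, $\partial b/\partial z = \mathbb{1}[z > y]$ (since $\max$ passes the gradient only to its larger argument),
  • $\partial f/\partial a = b$, $\partial f/\partial b = a$.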
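The local derivatives of the three nodes in the graph (a = x + y, b = max(y, z), f = a * b) can be written out as a small Python sketch. The helper name `local_gradients` and the dictionary keys are hypothetical. Routing a tie (y == z) to y is an assumed convention, since max is not differentiable at that point.

```python
def local_gradients(x, y, z):
    """Return the derivative of each node with respect to its direct inputs."""
    a = x + y
    b = max(y, z)
    return {
        "da_dx": 1.0,                     # sum node passes gradient unchanged
        "da_dy": 1.0,
        "db_dy": 1.0 if y >= z else 0.0,  # max node: only the larger input
        "db_dz": 1.0 if z > y else 0.0,   # receives gradient (ties go to y)
        "df_da": b,                       # product node: derivative w.r.t.
        "df_db": a,                       # one factor is the other factor
    }


# Hypothetical inputs, used only to illustrate the call.
print(local_gradients(1.0, 2.0, 0.0))
```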

Step 4: Fill in values on the graph

  • $a = x + y$, $b = \max(y, z)$, $f = a \cdot b$, each evaluated at the given inputs.

Step 5: Compute total gradients with respect to inputs

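Using the chain rule, and summing over every path from $f$ back to each input ($y$ feeds both $a$ and $b$, so it collects two terms):

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial a}\frac{\partial a}{\partial x} = b$$

$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial a}\frac{\partial a}{\partial y} + \frac{\partial f}{\partial b}\frac{\partial b}{\partial y} = b + a\,\mathbb{1}[y > z]$$

$$\frac{\partial f}{\partial z} = \frac{\partial f}{\partial b}\frac{\partial b}{\partial z} = a\,\mathbb{1}[z > y]$$

Here $\mathbb{1}[\cdot]$ is 1 when the condition holds and 0 otherwise.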
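The chain-rule step for f = (x + y) * max(y, z) can be sketched in Python and compared against central finite differences. The names `backward` and `numerical_gradient`, the tie-breaking convention (a tie goes to y), and the input values are assumptions made for illustration only.

```python
def backward(x, y, z):
    """Forward pass, then chain rule from the output back to each input."""
    # Forward pass: keep the intermediate values.
    a = x + y
    b = max(y, z)
    f = a * b

    # Gradients of f with respect to the intermediate nodes.
    df_da = b
    df_db = a

    # Local gradients of the max node (assumed convention: ties go to y).
    db_dy = 1.0 if y >= z else 0.0
    db_dz = 1.0 - db_dy

    # Chain rule; y reaches f through both a and b, so its two paths are summed.
    df_dx = df_da * 1.0
    df_dy = df_da * 1.0 + df_db * db_dy
    df_dz = df_db * db_dz
    return f, (df_dx, df_dy, df_dz)


def numerical_gradient(x, y, z, eps=1e-6):
    """Central finite differences, used only as a sanity check."""
    def f(x, y, z):
        return (x + y) * max(y, z)

    return (
        (f(x + eps, y, z) - f(x - eps, y, z)) / (2 * eps),
        (f(x, y + eps, z) - f(x, y - eps, z)) / (2 * eps),
        (f(x, y, z + eps) - f(x, y, z - eps)) / (2 * eps),
    )


# Hypothetical inputs, used only to illustrate the call.
value, grads = backward(1.0, 2.0, 0.0)
print(value, grads)  # 6.0 (2.0, 5.0, 0.0)
```

The finite-difference check is reliable only away from y == z, where the max node has a kink.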


This matches the total gradients you observed on paper: