2-3 Automatic Differentiation

Neural networks rely on backpropagation to compute gradients and update the parameters of the network. Calculating these gradients by hand is complicated and error-prone.

Deep learning frameworks help us by calculating gradients automatically.

In TensorFlow, tf.GradientTape is typically used to record the forward computation; replaying this "tape" in reverse then yields the gradients.

This is automatic differentiation in TensorFlow.

1. Calculate the Derivative Using the Gradient Tape

```python
import tensorflow as tf
import numpy as np

# Calculate the derivative of f(x) = a*x**2 + b*x + c
x = tf.Variable(0.0, name="x", dtype=tf.float32)
a = tf.constant(1.0)
b = tf.constant(-2.0)
c = tf.constant(1.0)

with tf.GradientTape() as tape:
    y = a*tf.pow(x, 2) + b*x + c

dy_dx = tape.gradient(y, x)
print(dy_dx)
```
```
tf.Tensor(-2.0, shape=(), dtype=float32)
```
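As a sanity check, this matches the analytic derivative at x = 0:

$$f'(x) = 2ax + b = 2 \cdot 1 \cdot 0 + (-2) = -2$$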
```python
# Use watch to calculate derivatives with respect to constant tensors
with tf.GradientTape() as tape:
    tape.watch([a, b, c])
    y = a*tf.pow(x, 2) + b*x + c

dy_dx, dy_da, dy_db, dy_dc = tape.gradient(y, [x, a, b, c])
print(dy_da)
print(dy_dc)
```
```
tf.Tensor(0.0, shape=(), dtype=float32)
tf.Tensor(1.0, shape=(), dtype=float32)
```
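These values follow directly from the partial derivatives at x = 0:

$$\frac{\partial y}{\partial a} = x^2 = 0, \qquad \frac{\partial y}{\partial c} = 1$$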
```python
# Calculate the second-order derivative with two nested tapes.
# The inner gradient must be taken inside the outer tape's context
# so that the outer tape records it.
with tf.GradientTape() as tape2:
    with tf.GradientTape() as tape1:
        y = a*tf.pow(x, 2) + b*x + c
    dy_dx = tape1.gradient(y, x)
dy2_dx2 = tape2.gradient(dy_dx, x)

print(dy2_dx2)
```
```
tf.Tensor(2.0, shape=(), dtype=float32)
```
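Again consistent with calculus:

$$f''(x) = 2a = 2$$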
```python
# Use the tape in Autograph
@tf.function
def f(x):
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)

    # Convert the variable to tf.float32
    x = tf.cast(x, tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = a*tf.pow(x, 2) + b*x + c
    dy_dx = tape.gradient(y, x)

    return (dy_dx, y)

tf.print(f(tf.constant(0.0)))
tf.print(f(tf.constant(1.0)))
```
```
(-2, 1)
(0, 0)
```
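Note that a tape can normally be queried only once, since tape.gradient releases the tape's resources after the first call. Below is a minimal sketch (reusing x, a, b, and c defined above) of the standard persistent=True option of tf.GradientTape, which lets one recording serve several gradient calls:

```python
# A persistent tape can be queried more than once;
# a one-shot tape is released after its first gradient call.
with tf.GradientTape(persistent=True) as tape:
    y = a*tf.pow(x, 2) + b*x + c  # y = (x-1)**2 for a=1, b=-2, c=1
    z = 2.0*y                     # a second quantity that depends on x

dy_dx = tape.gradient(y, x)  # -> tf.Tensor(-2.0, ...)
dz_dx = tape.gradient(z, x)  # -> tf.Tensor(-4.0, ...)
del tape  # drop the reference so the tape's resources can be freed
```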

2. Calculate the Minimum Value Through the Gradient Tape and the Optimizer

```python
# Calculate the minimum value of f(x) = a*x**2 + b*x + c
# Use optimizer.apply_gradients
x = tf.Variable(0.0, name="x", dtype=tf.float32)
a = tf.constant(1.0)
b = tf.constant(-2.0)
c = tf.constant(1.0)

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
for _ in range(1000):
    with tf.GradientTape() as tape:
        y = a*tf.pow(x, 2) + b*x + c
    dy_dx = tape.gradient(y, x)
    optimizer.apply_gradients(grads_and_vars=[(dy_dx, x)])

tf.print("y =", y, "; x =", x)
```
```
y = 0 ; x = 0.999998569
```
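The result agrees with the closed form: with a = 1, b = -2, c = 1,

$$f(x) = x^2 - 2x + 1 = (x-1)^2,$$

which attains its minimum value 0 at x = 1.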
```python
# Calculate the minimum value of f(x) = a*x**2 + b*x + c
# Use optimizer.minimize
# optimizer.minimize is equivalent to computing the gradient with a tape
# and then calling apply_gradients
x = tf.Variable(0.0, name="x", dtype=tf.float32)

# Note that f() takes no arguments
def f():
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)
    y = a*tf.pow(x, 2) + b*x + c
    return y

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
for _ in range(1000):
    optimizer.minimize(f, [x])

tf.print("y =", f(), "; x =", x)
```
```
y = 0 ; x = 0.999998569
```
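Conceptually, each optimizer.minimize(f, [x]) step behaves like the tape-based version above. A rough sketch of the equivalence (not the library's exact internals):

```python
# Roughly what one optimizer.minimize(f, [x]) step does:
with tf.GradientTape() as tape:
    loss = f()                               # forward pass
grads = tape.gradient(loss, [x])             # backward pass
optimizer.apply_gradients(zip(grads, [x]))   # parameter update
```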
```python
# Calculate the minimum value in Autograph
# Use optimizer.apply_gradients
x = tf.Variable(0.0, name="x", dtype=tf.float32)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

@tf.function
def minimizef():
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)

    # Note: use tf.range(1000) instead of range(1000) inside Autograph
    for _ in tf.range(1000):
        with tf.GradientTape() as tape:
            y = a*tf.pow(x, 2) + b*x + c
        dy_dx = tape.gradient(y, x)
        optimizer.apply_gradients(grads_and_vars=[(dy_dx, x)])

    y = a*tf.pow(x, 2) + b*x + c
    return y

tf.print(minimizef())
tf.print(x)
```
```
0
0.999998569
```
```python
# Calculate the minimum value in Autograph
# Use optimizer.minimize
x = tf.Variable(0.0, name="x", dtype=tf.float32)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

@tf.function
def f():
    a = tf.constant(1.0)
    b = tf.constant(-2.0)
    c = tf.constant(1.0)
    y = a*tf.pow(x, 2) + b*x + c
    return y

@tf.function
def train(epoch):
    for _ in tf.range(epoch):
        optimizer.minimize(f, [x])
    return f()

tf.print(train(1000))
tf.print(x)
```
```
0
0.999998569
```
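The same tape-and-optimizer pattern scales to networks: record the forward pass, differentiate the loss with respect to model.trainable_variables, and apply the gradients. The following is a minimal illustrative sketch; the model shape, random data, and loss here are arbitrary placeholders, not part of the examples above:

```python
# Minimal sketch: one gradient-descent step on a tiny Keras model.
# The layer size and random data are placeholders for illustration.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

features = tf.random.normal((8, 3))  # 8 samples, 3 features each
labels = tf.random.normal((8, 1))

with tf.GradientTape() as tape:
    predictions = model(features)
    loss = loss_fn(labels, predictions)
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
```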

Please leave comments in the WeChat official account "Python与算法之美" (Elegance of Python and Algorithms) if you want to communicate with the author about the content. The author will try his best to reply given the limited time available.

You are also welcome to join the group chat with the other readers by replying 加群 (join group) in the WeChat official account.
