A function of several variables has a rate of change in every direction. A partial derivative measures the change along one coordinate axis while holding all other variables fixed. The gradient collects all partial derivatives into a vector that points in the direction of steepest ascent. A directional derivative measures the rate of change along any chosen direction.
Functions of several variables
A function f(x, y) maps R^2 to R. Its graph is a surface in R^3. Think of it as a terrain map: the input is position, the output is altitude.
The partial derivative df/dx treats y as a constant and differentiates with respect to x. By definition, it is the limit of [f(x+h, y) − f(x, y)] / h as h goes to zero. Numerically, evaluating that quotient at a small fixed h gives an approximation.
Scheme
; Partial derivatives of f(x,y) = x^2*y + sin(x*y)
(define (f x y) (+ (* x x y) (sin (* x y))))
; Numerical partial derivatives
(define h 0.0001)
(define (df/dx f x y) (/ (- (f (+ x h) y) (f x y)) h))
(define (df/dy f x y) (/ (- (f x (+ y h)) (f x y)) h))
(define x0 1.0)
(define y0 2.0)
(display "f(1, 2) = ") (display (f x0 y0)) (newline)
(display "df/dx at (1,2) = ") (display (df/dx f x0 y0)) (newline)
; Exact: 2xy + y*cos(xy) = 4 + 2*cos(2)
(display "df/dy at (1,2) = ") (display (df/dy f x0 y0)) (newline)
; Exact: x^2 + x*cos(xy) = 1 + cos(2)
(display "Exact df/dx = ") (display (+ 4 (* 2 (cos 2)))) (newline)
(display "Exact df/dy = ") (display (+ 1 (cos 2)))
Python
import numpy as np
def f(x, y): return x**2 * y + np.sin(x * y)
h = 1e-7
dfdx = lambda x, y: (f(x+h, y) - f(x, y)) / h
dfdy = lambda x, y: (f(x, y+h) - f(x, y)) / h
print(f"df/dx at (1,2) = {dfdx(1.0, 2.0):.6f}")
print(f"df/dy at (1,2) = {dfdy(1.0, 2.0):.6f}")
print(f"Exact df/dx = {4 + 2*np.cos(2):.6f}")
print(f"Exact df/dy = {1 + np.cos(2):.6f}")
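To make the "as h goes to zero" part of the definition concrete, the following sketch evaluates the same forward-difference quotient at several step sizes and compares each result with the exact value 4 + 2*cos(2). The helper name `forward_dfdx` and the particular list of step sizes are illustrative choices, not part of the examples above.

```python
import math

def f(x, y):
    return x**2 * y + math.sin(x * y)

def forward_dfdx(f, x, y, h):
    """Forward-difference estimate of df/dx with step h (hypothetical helper)."""
    return (f(x + h, y) - f(x, y)) / h

# Exact partial derivative 2xy + y*cos(xy) evaluated at (1, 2)
exact = 4 + 2 * math.cos(2)

# Illustrative step sizes, from coarse to fine
step_sizes = [1e-1, 1e-2, 1e-3, 1e-4]
errors = [abs(forward_dfdx(f, 1.0, 2.0, h) - exact) for h in step_sizes]

for h, err in zip(step_sizes, errors):
    print(f"h = {h:.0e}   error = {err:.2e}")
```

Over this range the error shrinks roughly in proportion to h. For much smaller h, floating-point round-off eventually limits the accuracy, so a smaller step is not always better.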
Gradient
The gradient of f is the vector of all partial derivatives: grad(f) = (df/dx, df/dy). It points in the direction of steepest ascent. Its magnitude is the maximum rate of change.
Scheme
; Gradient of f(x,y) = x^2 + y^2; grad(f) = (2x, 2y)
(define (f x y) (+ (* x x) (* y y)))
(define h 0.0001)
(define (gradient f x y)
  (list (/ (- (f (+ x h) y) (f x y)) h)
        (/ (- (f x (+ y h)) (f x y)) h)))
(define (vec-mag v) (sqrt (apply + (map (lambda (x) (* x x)) v))))
(define g (gradient f 3.0 4.0))
(display "grad(f) at (3,4) = ") (display g) (newline)
; Exact: (6, 8)
(display "|grad(f)| = ") (display (vec-mag g)) (newline)
; Magnitude = 10, which is the steepest ascent rate
; Gradient points away from the minimum at (0,0)
(define g0 (gradient f 0.0 0.0))
(display "grad(f) at (0,0) = ") (display g0)
; (0, 0) -- zero gradient at the minimum
Python
# Gradient of f(x,y) = x^2 + y^2
import math

def f(x, y):
    return x**2 + y**2

h = 1e-7

def gradient(f, x, y):
    dfdx = (f(x + h, y) - f(x, y)) / h
    dfdy = (f(x, y + h) - f(x, y)) / h
    return [dfdx, dfdy]

g = gradient(f, 3.0, 4.0)
mag = math.sqrt(sum(c**2 for c in g))
print("grad(f) at (3,4) = " + str([round(c, 4) for c in g]))
print("|grad(f)| = " + format(mag, ".4f"))
g0 = gradient(f, 0.0, 0.0)
print("grad(f) at (0,0) = " + str([round(c, 4) for c in g0]))
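The claim that the gradient points in the direction of steepest ascent can be checked by brute force. The sketch below takes a small step from (3, 4) in each of 360 test directions, one per degree, and records which direction raises f the most. The step length and the one-degree sampling are illustrative assumptions, so the best test angle can only match the gradient's angle to within the sampling resolution.

```python
import math

def f(x, y):
    return x**2 + y**2

x0, y0 = 3.0, 4.0
grad = (2 * x0, 2 * y0)                     # exact gradient (2x, 2y)
grad_angle = math.atan2(grad[1], grad[0])   # angle of the gradient vector

step = 1e-3  # small step length (illustrative choice)
best_angle, best_gain = None, -math.inf
for k in range(360):
    theta = math.radians(k)
    # Increase in f after a step of length `step` in direction theta
    gain = f(x0 + step * math.cos(theta), y0 + step * math.sin(theta)) - f(x0, y0)
    if gain > best_gain:
        best_angle, best_gain = theta, gain

print(f"gradient angle  = {math.degrees(grad_angle):.2f} degrees")
print(f"best test angle = {math.degrees(best_angle):.2f} degrees")
print(f"gain per unit step = {best_gain / step:.4f}")
```

The gain per unit step in the best direction should come out close to |grad(f)| = 10, which matches the statement that the gradient's magnitude is the maximum rate of change.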
Directional derivatives
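The directional derivative in direction u is grad(f) · u, where u is a unit vector. Because grad(f) · u = |grad(f)| cos(theta), with theta the angle between u and the gradient, the directional derivative is largest when u points along the gradient, and its maximum value is |grad(f)|.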
Scheme
; Directional derivative of f(x,y) = x^2 + y^2 at (3,4) in direction (1,1)/sqrt(2)
(define (dot u v) (apply + (map * u v)))
(define (vec-mag v) (sqrt (apply + (map (lambda (x) (* x x)) v))))
(define (normalize v)
  (let ((m (vec-mag v)))
    (map (lambda (x) (/ x m)) v)))
; grad(f) at (3,4) = (6, 8)
(define grad-f (list 6 8))
; Direction: 45 degrees (northeast)
(define direction (normalize (list 1 1)))
(define dir-deriv (dot grad-f direction))
(display "Directional derivative in (1,1) direction = ")
(display dir-deriv) (newline)
; = (6 + 8)/sqrt(2) = 14/sqrt(2) = 9.899
; Compare: max directional derivative = |grad(f)| = 10
(display "Max rate of change = |grad(f)| = ")
(display (vec-mag grad-f))
Python
# Directional derivative of f(x,y) = x^2 + y^2 at (3,4)
import math
grad_f = [6, 8]  # grad(f) at (3,4) = (2x, 2y)
# Direction: 45 degrees (northeast), normalized
direction = [1 / math.sqrt(2), 1 / math.sqrt(2)]
dir_deriv = sum(g * d for g, d in zip(grad_f, direction))
print("Directional derivative in (1,1) direction = " + format(dir_deriv, ".4f"))
print("Max rate of change = |grad(f)| = " + format(math.sqrt(sum(c**2 for c in grad_f)), ".4f"))
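The formula grad(f) · u can be cross-checked against the definition of a rate of change along u: step a small distance h from the point in direction u and divide the change in f by h. The sketch below does this for the same function and direction. The helper name `directional_derivative` and the default step h = 1e-6 are assumptions made for illustration.

```python
import math

def f(x, y):
    return x**2 + y**2

def directional_derivative(f, x, y, u, h=1e-6):
    """Forward-difference rate of change of f at (x, y) along unit vector u."""
    return (f(x + h * u[0], y + h * u[1]) - f(x, y)) / h

# Unit vector at 45 degrees
u = (1 / math.sqrt(2), 1 / math.sqrt(2))

numeric = directional_derivative(f, 3.0, 4.0, u)
via_gradient = 6 * u[0] + 8 * u[1]   # grad(f) . u with grad(f) = (6, 8)

print("finite difference along u = " + format(numeric, ".4f"))
print("grad(f) . u               = " + format(via_gradient, ".4f"))
```

The two numbers should agree to several decimal places. The small remaining gap comes from the finite step size.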
Tangent planes
The tangent plane to z = f(x, y) at (a, b) is: z = f(a,b) + df/dx(a,b)*(x−a) + df/dy(a,b)*(y−b). It is the best linear approximation to the surface near the point.
Scheme
; Tangent plane to f(x,y) = x^2 + y^2 at (1, 2)
; z = f(1,2) + 2*1*(x-1) + 2*2*(y-2)
; z = 5 + 2(x-1) + 4(y-2) = 2x + 4y - 5
(define (f x y) (+ (* x x) (* y y)))
; Tangent plane approximation
(define (tangent-plane x y)
  (+ (* 2 x) (* 4 y) -5))
; Compare near (1, 2)
(display "At (1, 2):") (newline)
(display " f = ") (display (f 1.0 2.0)) (newline) ; 5
(display " plane = ") (display (tangent-plane 1.0 2.0)) (newline) ; 5
(display "At (1.1, 2.1):") (newline)
(display " f = ") (display (f 1.1 2.1)) (newline) ; 5.62
(display " plane = ") (display (tangent-plane 1.1 2.1)) (newline) ; 5.6
; Good approximation near the point, diverges far away
(display "At (3, 4):") (newline)
(display " f = ") (display (f 3.0 4.0)) (newline) ; 25
(display " plane = ") (display (tangent-plane 3.0 4.0)) ; 17 (bad!)
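A Python counterpart to the Scheme example might look like the sketch below. Instead of hard-coding the plane 2x + 4y - 5, it builds the tangent plane at any point (a, b) from forward-difference partial derivatives, so the slopes carry a small numerical error. The name `make_tangent_plane` and the step h = 1e-6 are assumptions for illustration.

```python
def f(x, y):
    return x**2 + y**2

def make_tangent_plane(f, a, b, h=1e-6):
    """Return the linear approximation of f at (a, b), using forward differences."""
    fa = f(a, b)
    dfdx = (f(a + h, b) - fa) / h   # approximates df/dx(a, b)
    dfdy = (f(a, b + h) - fa) / h   # approximates df/dy(a, b)

    def plane(x, y):
        return fa + dfdx * (x - a) + dfdy * (y - b)

    return plane

plane = make_tangent_plane(f, 1.0, 2.0)

# Compare the surface and the plane at the point, near it, and far from it
for x, y in [(1.0, 2.0), (1.1, 2.1), (3.0, 4.0)]:
    gap = f(x, y) - plane(x, y)
    print(f"({x}, {y}): f = {f(x, y):.4f}, plane = {plane(x, y):.4f}, gap = {gap:.4f}")
```

For this particular f, the exact gap between surface and plane is (x−1)^2 + (y−2)^2, which explains why the approximation is good near (1, 2) and degrades quadratically with distance.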