Chapter 5
Vectors in Calculus
This chapter introduces vectors and their applications to calculus. We will use them to compute directional
derivatives, to differentiate compositions of functions, and to find minimum and maximum values
of a function.
Contents
5.1 Vectors
5.2 The Dot Product
5.3 Normal Equations of Planes
5.4 The Gradient Vector
5.5 The Chain Rule
5.6 Maximum and Minimum Values
5.7 Lagrange Multipliers
Example 5.1.4
Performing Vector Arithmetic
Given diagrams of two vectors u and v, how would we calculate $\tfrac{1}{2}u + v$?
What if we are instead given the components u = ⟨a, b⟩ and v = ⟨c, d⟩?
Solution
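After drawing an arbitrary u and an arbitrary v, we draw $\tfrac{1}{2}u$, which points in the same direction as u but is half as long.
We place it head to tail with v, and $\tfrac{1}{2}u + v$ completes the triangle.
In coordinates the computation is as follows.
\[
\begin{aligned}
\tfrac{1}{2}u + v &= \tfrac{1}{2}\langle a, b \rangle + \langle c, d \rangle \\
&= \left\langle \tfrac{1}{2}a, \tfrac{1}{2}b \right\rangle + \langle c, d \rangle \\
&= \left\langle \tfrac{1}{2}a + c, \tfrac{1}{2}b + d \right\rangle
\end{aligned}
\]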
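The componentwise computation can be checked numerically. The following is a minimal Python sketch, not part of the original example; the helper names `scale` and `add` and the sample components are illustrative assumptions.

```python
def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * x for x in v)

def add(u, v):
    """Add two vectors of the same dimension component by component."""
    return tuple(a + b for a, b in zip(u, v))

# Hypothetical sample vectors: u = <4, 2> and v = <1, 3>.
u = (4.0, 2.0)
v = (1.0, 3.0)

# (1/2)u + v = <(1/2)(4) + 1, (1/2)(2) + 3>
result = add(scale(0.5, u), v)
print(result)  # (3.0, 4.0)
```

The same two helpers work unchanged for vectors with more components, since they never refer to a fixed dimension.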
Question 5.1.9
How Do We Denote Vectors in Higher Dimensions?
Higher dimensional vectors represent displacements in higher dimensional spaces. We can call a
vector in n-space an n-vector. We can still denote an n-vector by its endpoints. We can also denote
it in coordinate notation, but we need more components.
Example
If A = (2, 4, 1) and B = (5, −1, 3) then $\overrightarrow{AB}$ = ⟨3, −5, 2⟩.
In three-space, we add another standard basis vector k.
Standard basis for 3-vectors
i = ⟨1, 0, 0⟩
j = ⟨0, 1, 0⟩
k = ⟨0, 0, 1⟩
Example
⟨3, −5, 2⟩ = 3i − 5j + 2k
Higher dimensions still have a standard basis, but at this point the naming conventions are less
standard. $\{e_1, e_2, e_3, \ldots, e_n\}$ is common for n-vectors.
Length of a Vector
The length of an n-vector derives from the distance formula in n-space.
\[ |\langle a_1, a_2, a_3, \ldots, a_n \rangle| = \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2} \]
We might be concerned that direction becomes an even more difficult concept to work with as the
dimension increases. However, angles are a valid way of comparing directions in any dimension (though
they may be more difficult to compute).
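The length formula translates directly into code. The sketch below is one possible implementation, with the function name `length` and the sample vectors chosen only for illustration.

```python
import math

def length(v):
    """Length of an n-vector: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))

# Hypothetical sample vectors in 2-space and 4-space.
print(length((3, 4)))        # 5.0
print(length((1, 2, 2, 4)))  # 5.0
```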
Section 5.1
Exercises
5.1.1
Q5
Which of the following are vectors?
i. The reading on a speedometer.
ii. The intersection of two lines.
iii. Five miles toward Atlanta.
iv. The length of a string.
v. The velocity of a projectile.
Q6
Which of the following are vectors?
i. The displacement of a key on a keyboard, when pressed.
ii. The speed of light.
iii. The center of the earth.
iv. The force applied by a rocket engine.
v. The mass of five hippopotamuses.
Q7
If $\overrightarrow{AB} = \overrightarrow{AC}$, what does that tell us about the points B and C? Explain.
Q8
If $\overrightarrow{AB} = \overrightarrow{BA}$, what does that tell us about the points A and B? Explain.
5.1.2
Q9
If A = (8, 7, 11) and B = (2, 3, 15) write the vector $\overrightarrow{AB}$
a
in terms of its components
b
in standard basis notation
Q10
If P = (−2, 3, 5) and Q = (−2, 0, −4) write the vector $\overrightarrow{PQ}$
a
in terms of its components
b
in standard basis notation
Q11
What is the slope of the vector −4i + 10j?
Q12
Give three different vectors of slope $\tfrac{3}{7}$.
Q13
Suppose two different vectors have equal slopes. How are they related?
Q14
Given a number m, give two different vectors with slope m.
5.1.3
Q15
Let u be a vector. How are the magnitude and direction of u and 2u related?
Q16
How is the direction and magnitude of u related to the direction and magnitude of −u?
Q17
Given diagrams of two vectors u and v, how would we draw u − v? What is its significance?
Q18
If u is a vector and 2u = u, what does that tell us about u? Explain.
Q19
If u = $\overrightarrow{AB}$, v = $\overrightarrow{AC}$, and $\tfrac{1}{2}u + \tfrac{1}{2}v = \overrightarrow{AD}$, where is D?
Q20
If u = $\overrightarrow{AB}$, v = $\overrightarrow{AC}$, and $\tfrac{1}{5}u + \tfrac{4}{5}v = \overrightarrow{AD}$, where is D?
5.1.4
Q21
Let u = 4i + 3j and v = 5i − 2j. Compute u + v.
Q22
Let w = ⟨5, −1⟩ and v = ⟨12, 10⟩. Compute w − v.
Q23
For Lindsey to get from her house to Sam’s house, she travels 5mi north and 3mi west. To
get to Russel’s house, she travels 2mi due south. What displacement would get her from Sam’s
house to Russel’s house?
Q24
One can get from Atlanta to Decatur by travelling 8km east and 2km north. To get from
Decatur to Covington, one can travel 43km east and 20km south. Describe how to get directly
from Atlanta to Covington.
Q25
Using the diagram below, describe each vector in terms of u and v using vector addition and
scalar multiplication. Use the fact that ACDB and ACBE are parallelograms.
5.1.6
Q29
Compute the length of u = ⟨−5, 12⟩.
Q30
Given a nonzero vector u, how many vectors of length 5 are parallel to u? Explain.
Q31
Find a unit vector in the direction of 3i − j.
Q32
Find a unit vector in the direction of ⟨12, −16⟩.
5.1.7
Q33
If u and v are vectors in $\mathbb{R}^2$ whose components are all positive, what is the largest possible angle
between u and v?
Q34
Explain the difference between the terms “perpendicular” and “orthogonal.”
Q35
Suppose two vectors do not have the same initial point, but when we represent them by arrows,
the arrows happen to cross. Is the angle made in the crossing equal to the angle between the
vectors (as we defined it)?
Q36
Describe all the vectors that make an angle of $\tfrac{\pi}{4}$ with v = −j.
5.1.8
Q37
If u = ⟨2, 0, 3⟩ and v = ⟨5, 6, 0⟩, compute 3u − 4v.
Q38
If a = 10i − 25k and b = 8i − 4j + 10k, compute $\tfrac{3}{5}a + \tfrac{1}{2}b$.
Q39
Compute the magnitude of v = 2i − 7j + 6k.
Q40
Compute two unit vectors parallel to v = ⟨4, −4, 2⟩.
Q41 a
How many different (nonequal) unit vectors are orthogonal to a given vector in $\mathbb{R}^2$? How
are they related to each other?
b
How many different (nonequal) unit vectors are orthogonal to a given vector in $\mathbb{R}^3$? How
are they related to each other?
Q42
Let u and v be non-parallel vectors in $\mathbb{R}^3$. How many unit vectors in $\mathbb{R}^3$ are orthogonal to both
u and v?
Synthesis and Extension
Q43
Is the vector v = 2i + 3j + 8k parallel to the plane p whose slope-intercept equation is z =
x + 2y − 7?
Q44
For a two-variable function f(x, y), $f_x(x_0, y_0)$ is the slope of the line tangent to z = f(x, y) at
$(x_0, y_0, f(x_0, y_0))$ in the x-direction. Write a vector v that is parallel to this line.
Q45
If u = $\overrightarrow{AB}$ and v = $\overrightarrow{AC}$, show that for any scalar t, $tu + (1 - t)v = \overrightarrow{AD}$ where D is a point on
the line through B and C.
Q46
If u, v and w are position vectors of the three vertices A, B and C of a triangle, then $\tfrac{1}{3}(u + v + w)$
is the position vector of K, the center of mass of the triangle. Verify this by showing that K lies
on the line between A and the midpoint of the side BC.
Q47
Suppose we become interested in studying vectors of infinite dimension (yes, this is something
mathematicians actually do).
a
Explain what trouble we might run into computing the length of the vector ⟨1, 1, 1, 1, 1, . . .⟩.
b
What would the length of the vector $\langle 1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \tfrac{1}{16}, \ldots \rangle$ be?
Section 5.2
Goals:
1 Calculate the dot product of two vectors.
2 Determine the geometric relationship between two vectors based on their dot product.
3 Calculate vector and scalar projections of one vector onto another.
The arithmetic of vectors appears to have room for expansion. While we can add and subtract
vectors, we only defined how to multiply them by scalars, not by other vectors. There are in fact
products of two vectors. The simplest and most useful is the dot product. The dot product takes two
n-vectors and outputs a single number. Despite this apparent loss of information, the dot product is
the key tool in computing the angle between vectors, the work done by a force, or the illumination in a
digital scene.
Question 5.2.1
Definition
The dot product of two vectors is a number.
For two dimensional vectors $v = \langle v_1, v_2 \rangle$ and $u = \langle u_1, u_2 \rangle$ we define
\[ v \cdot u = v_1 u_1 + v_2 u_2 \]
For three dimensional vectors $v = \langle v_1, v_2, v_3 \rangle$ and $u = \langle u_1, u_2, u_3 \rangle$ we define
\[ v \cdot u = v_1 u_1 + v_2 u_2 + v_3 u_3 \]
This pattern can be extended to any dimension.
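Because the pattern is the same in every dimension, a single short function covers all cases. The following Python sketch is an illustration only; the name `dot` and the sample vectors are assumptions, not part of the text.

```python
def dot(v, u):
    """Dot product of two n-vectors: sum of products of matching components."""
    if len(v) != len(u):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(v, u))

# Hypothetical sample vectors in two and three dimensions.
print(dot((1, 2), (3, 4)))        # 1*3 + 2*4 = 11
print(dot((1, 0, 1), (1, 1, 0)))  # 1*1 + 0*1 + 1*0 = 1
```

Note that the output is a single number, not a vector, which matches the definition.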
Example 5.2.2
a
Calculate ⟨2, 3, −1⟩ · ⟨4, 1, 5⟩
b
Calculate (−2i + 4k) · (i + 2j − k)
Solution
We’ll apply the cosine formula, compute all of the components besides θ and solve.
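\[
\begin{aligned}
\langle 1, 0, 1 \rangle \cdot \langle 1, 1, 0 \rangle &= |\langle 1, 0, 1 \rangle|\,|\langle 1, 1, 0 \rangle| \cos\theta \\
(1)(1) + (0)(1) + (1)(0) &= \sqrt{1^2 + 0^2 + 1^2}\,\sqrt{1^2 + 1^2 + 0^2}\,\cos\theta \\
1 &= \sqrt{2}\sqrt{2}\cos\theta \\
\tfrac{1}{2} &= \cos\theta \\
\cos^{-1}\left(\tfrac{1}{2}\right) &= \theta \\
\tfrac{\pi}{3} &= \theta
\end{aligned}
\]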
We can verify this by noting that these vectors are face diagonals of a unit cube. We could connect them
with a third face diagonal to make an equilateral triangle. We may recall that an equilateral triangle has
angles of $\tfrac{\pi}{3}$.
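The same steps (dot product, two lengths, inverse cosine) can be carried out numerically. This is a minimal sketch; the function names are illustrative, and the clamp on the cosine is an added safeguard against floating-point rounding rather than part of the formula.

```python
import math

def dot(v, u):
    return sum(a * b for a, b in zip(v, u))

def angle_between(v, u):
    """Angle in radians between nonzero vectors, from v . u = |v||u| cos(theta)."""
    cos_theta = dot(v, u) / (math.sqrt(dot(v, v)) * math.sqrt(dot(u, u)))
    # Rounding can push the ratio slightly outside [-1, 1]; clamp before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

theta = angle_between((1, 0, 1), (1, 1, 0))
print(theta, math.pi / 3)  # the two values agree up to rounding
```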
Figure: Two vectors in a unit cube
Application 5.2.6
In physics, we say a force works on an object if it moves the object in the direction of the force.
Given a force F and a displacement s, the formula for work is:
W = F · s
5.2.1
Q5
What do v · i and v · j measure about v?
Q6
Elaine computes u·v and gets ⟨15, 4⟩. How can you tell that Elaine got the wrong answer without
even knowing what u and v are?
5.2.2
Q7
Compute the following dot products.
a
⟨4, 5⟩ · ⟨−1, −2⟩
b
(5i + 6j) · (i − 2j)
c
⟨2, 4, −10⟩ · ⟨0, −1, −2⟩
Q8
Compute the following dot products.
a
⟨4, 5⟩ · ⟨−1, −2⟩
b
(5i + 6j) · (i − 2j)
c
(2i − 3k) · (7j − k)
5.2.3
Q9
Let u = ⟨2, 3⟩, v = ⟨4, −1⟩ and w = ⟨−5, 2⟩.
a
Compute u · u and u ·v and u · w.
b
Compute v · u. How does it compare to u ·v?
5.2.5
Q19
Compute the angle between ⟨6, 1, 4⟩ and ⟨7, 0, 2⟩.
Q20
Compute the angle between ⟨0, 3, −5⟩ and ⟨3, −4, 3⟩.
Q21
Let A be the vertex of a cube. Let B be a vertex closest to A and C be the vertex farthest from
A. Compute the angle between $\overrightarrow{AB}$ and $\overrightarrow{AC}$.
Q22
Let A be the vertex of a cube, and B and C be any two other points on the cube. Use a dot
product to explain why the angle between $\overrightarrow{AB}$ and $\overrightarrow{AC}$ cannot be larger than $\tfrac{\pi}{2}$. (Hint: put A
at (0, 0, 0).)
Synthesis and Extension
Q23
How could you use the dot product to determine whether two vectors are parallel? How does this
compare with the methods we already have?
Q24
Use dot products to find at least one vector that is orthogonal to both ⟨5, −1, 2⟩ and ⟨4, 4, 1⟩
Q25
“Think of a vector v” says Raphael, “tell me its dot product with the vector of my choice, and
I’ll tell you what your vector was.”
a
Is there any mathematical way to make such a trick work? Explain.
b
How many dot products would you need to ask for to uniquely identify an unknown vector?
What dot products would you ask for?
Figure: A plane, its normal vector n, and a vector $\overrightarrow{PQ}$ in the plane
This gives us an avenue to test whether a point Q lies on the plane or not. If $\overrightarrow{PQ}$ is orthogonal to
n, then Q lies on the plane. If $\overrightarrow{PQ}$ and n make a different angle, then Q is not on the plane.
We’d like to rewrite this relationship in terms of the coordinates of Q. If $r_0$ is the position vector of
P and r is the position vector of Q, then $\overrightarrow{PQ} = r - r_0$. The dot product gives us a simple test to see
whether this vector is orthogonal to n.
Theorem
If $r_0 = \langle x_0, y_0, z_0 \rangle$ describes a known point on a plane, and n = ⟨a, b, c⟩ is a normal vector, then
the normal equation of the plane is
\[ (r - r_0) \cdot n = 0 \]
or
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \]
Notice that since $x_0$, $y_0$ and $z_0$ are constants, we can distribute and collect them into a single term,
$d = -ax_0 - by_0 - cz_0$.
\[ ax + by + cz - ax_0 - by_0 - cz_0 = 0 \]
\[ ax + by + cz + d = 0 \]
This reasoning works in any dimension to define a set of points whose displacement from a known
point is orthogonal to some normal vector.
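The orthogonality test can be written as a short function that works in any dimension. This is a sketch under stated assumptions: the name `on_hyperplane`, the tolerance, and the sample point and normal vector are all hypothetical choices made for illustration.

```python
def dot(v, u):
    return sum(a * b for a, b in zip(v, u))

def on_hyperplane(q, p, n, tol=1e-9):
    """True if the displacement from the known point p to q is orthogonal to n."""
    displacement = tuple(qi - pi for qi, pi in zip(q, p))
    return abs(dot(displacement, n)) <= tol

# Hypothetical plane through p = (1, 2, 3) with normal vector n = <1, 1, 1>.
p = (1, 2, 3)
n = (1, 1, 1)
print(on_hyperplane((2, 2, 2), p, n))  # True: <1, 0, -1> . <1, 1, 1> = 0
print(on_hyperplane((2, 2, 3), p, n))  # False: <1, 0, 0> . <1, 1, 1> = 1
```

A tolerance is used instead of an exact comparison with zero so that the test still behaves sensibly with floating-point coordinates.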
Question 5.3.1
What is a Normal Vector to a Plane?
Example
$a(x - x_0) + b(y - y_0) = 0$ defines a line.
$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$ defines a plane.
$a_1(x_1 - c_1) + a_2(x_2 - c_2) + \cdots + a_n(x_n - c_n) = 0$ defines a hyperplane.
Example 5.3.2
Computing a Normal Vector
Find the normal equation of the plane with intercepts (4, 0, 0), (0, 3, 0) and (0, 0, 8). Compute a
normal vector.
Solution
The normal equation of a plane has the form ax + by + cz + d = 0. Each of these points must satisfy
this equation. We will plug them in and see what they tell us about the coefficients.
a(4) + b(0) + c(0) + d = 0, so 4a + d = 0 and d = −4a
a(0) + b(3) + c(0) + d = 0, so 3b + d = 0 and d = −3b
a(0) + b(0) + c(8) + d = 0, so 8c + d = 0 and d = −8c
There are infinitely many solutions to this system of equations. This makes sense, because there are
infinitely many normal vectors to a plane. Different choices of d give n’s that are scalar multiples of
each other. A convenient choice for d is −24, but any nonzero value will work. Setting d = −24 gives a = 6, b = 8 and c = 3, so the equation is
6x + 8y + 3z − 24 = 0
The normal vector is ⟨6, 8, 3⟩.
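The back-substitution in this solution can be reproduced with exact fractions. The sketch below assumes nonzero intercepts on all three axes; the function name `plane_from_intercepts` is hypothetical.

```python
from fractions import Fraction

def plane_from_intercepts(x_int, y_int, z_int, d):
    """Coefficients (a, b, c) of ax + by + cz + d = 0 from nonzero axis intercepts.

    Plugging in (x_int, 0, 0) gives a * x_int + d = 0, and similarly for b and c.
    """
    a = Fraction(-d, x_int)
    b = Fraction(-d, y_int)
    c = Fraction(-d, z_int)
    return a, b, c

a, b, c = plane_from_intercepts(4, 3, 8, d=-24)
print(a, b, c)  # 6 8 3

# Every intercept should satisfy the resulting equation.
for point in [(4, 0, 0), (0, 3, 0), (0, 0, 8)]:
    x, y, z = point
    print(point, a * x + b * y + c * z - 24 == 0)  # True for each point
```

Choosing a different nonzero d rescales a, b and c together, which is the scalar-multiple relationship between normal vectors described in the solution.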
Synthesis 5.3.3
Using the Normal Vector to Compute Distance
Theorem
Given a line, plane, or hyperplane with normal equation $L(x_1, \ldots, x_k) = 0$ and corresponding normal
vector n, the signed distance from the hyperplane to the point $Q = (q_1, \ldots, q_k)$ is
\[ \frac{L(q_1, \ldots, q_k)}{|n|}. \]
Let P be a known point on the hyperplane. The scalar projection of $\overrightarrow{PQ}$ onto n is equal to the
signed distance from the hyperplane to Q.
Figure: The scalar projection of $\overrightarrow{PQ}$ onto the normal vector of a line
\[
\begin{aligned}
\text{Distance} &= \frac{\overrightarrow{PQ} \cdot n}{|n|} && \text{(formula for scalar projection)} \\
&= \frac{L(q_1, \ldots, q_k)}{|n|} && \text{(normal equation of the plane)}
\end{aligned}
\]
This formula is especially powerful because we do not need to know a point on the hyperplane. The
equations
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \]
\[ ax + by + cz + d = 0 \]
are equivalent, and correspond to the same normal vector. We can use whichever one we happen to
have in our signed distance formula.
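As a sketch of how the signed distance formula might be used in practice, the function below takes the coefficients of a normal equation in the form $a_1 x_1 + \cdots + a_k x_k + d = 0$. The function name and the sample line are assumptions made for illustration.

```python
import math

def signed_distance(coeffs, d, q):
    """Signed distance from the hyperplane coeffs . x + d = 0 to the point q.

    coeffs doubles as the normal vector n, so the result is L(q) / |n|.
    """
    value = sum(a * x for a, x in zip(coeffs, q)) + d
    return value / math.sqrt(sum(a * a for a in coeffs))

# Hypothetical line 3x + 4y - 10 = 0, with normal vector <3, 4> of length 5.
print(signed_distance((3, 4), -10, (6, 3)))  # (18 + 12 - 10) / 5 = 4.0
print(signed_distance((3, 4), -10, (0, 0)))  # -10 / 5 = -2.0
```

The opposite signs of the two results indicate that the two points lie on opposite sides of the line.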
5.3.1
Q5
Is v = ⟨8, −3, −10⟩ parallel to the plane 6x + 6y + 3z + 11 = 0? Explain.
Q6
Is v = 9
i − 15
j + 6
k normal to the plane −6x + 10y − 4z + 23 = 0? Explain.
Q7
Name a normal vector to the following planes:
i. 3x − 8y + 10z − 4 = 0
ii. z − 2 = 4(x + 7) − 5(y + 1)
Q8
Suppose that n is a normal vector to 6x −3y + 2z −4 = 0, that happens to also be a unit vector.
Give all possible values of n.
Q9
Write a normal equation of a plane parallel to 7x − 11y + 8z + 15 = 0 that passes through the
origin.
Q10
Write a normal equation of a plane parallel to 10x − 11y + z + 20 = 0 that passes through
(2, 3, 5).
Q11
Given that the plane ax + by + cz + d = 0 passes through the origin, what can you say about a,
b, c, and d?
Q12
Given that plane ax + by + cz + d = 0 contains the x-axis, what can you say about a, b, c, and
d?
Q13
Are the planes 4x + 6y + 8z + 15 = 0 and 10x + 15y + 20z − 7 = 0 parallel? Explain how you
know.
Q14
Suppose we know the planes 12x + 18y + 6z − 15 = 0 and ax + by + 4z + d = 0 are parallel.
What can you say about the values of a, b and d?
Q15
The equations 3x −y + 4z + 10 = 0 and −6x + 2y −8z + k = 0 describe the same plane. What
is the value of k?
Q16
Consider the plane with normal equation 7x + y − 2z = 5.
a
Give two other normal equations of this plane.
b
What are the normal vectors corresponding to the original equation and your two equations
in part a?
Section 5.3
Exercises
c
How are these vectors in part b related to each other?
5.3.2
Q17
Give a normal equation of the plane with intercepts (10, 0, 0), (0, −5, 0) and (0, 0, 2).
Q18
Give a normal equation of the plane with intercepts (−18, 0, 0), (0, 9, 0) and (0, 0, −4).
Q19
Give a normal equation of the plane through (4, 3, 0), (5, 1, 1) and (−2, 5, 2).
Q20
Give a normal equation of the plane through (1, 1, 1), (8, 1, 4) and (0, 0, 4).
5.3.3
Q21
Katie is computing the distance from the point (6, 3) to the line 2x + 3y − 12 = 0. She notices
that (6, 0) is the x-intercept of the line. Since (6, 3) is 3 units away from (6, 0) she concludes
the distance from the point to the line is 3. What do you think of Katie’s reasoning?
Q22
Consider the line L with normal equation 2x + 3y − 12 = 0 and the point Q = (6, 3).
a
What is the slope of L?
b
What would be the slope of a line perpendicular to L?
c
Write an equation (in any form you’d like) of a line K that passes through Q and is perpen-
dicular to L.
d
Compute the intersection point P of L and K.
e
What is the distance from P to Q?
f
Check that your answer to part e matches the distance formula we derived. Which method do
you like better?
5.3.4
Q23
How far is (5, 2, 1) from 3x + 2y − 5z + 10 = 0?
Q24
How far is (0, 0, 1) from 3x + 12y − 4z + 20 = 0?
Q25
Are (6, 7, 1) and (5, −3, −4) on the same or different sides of 3x − 10y + 9z + 46 = 0?
Q26
The point (x, 4, 5) lies on the same side of the plane 2x + y − 2z + 10 = 0 as the origin does.
What does that tell you about the value of x?
5.3.5
Q27
We have six images of dogs and cats. We measure four things about each, and have collected
the data below. We would like to use the hyperplane $2x_1 + 5x_2 - 4x_3 + 10x_4 + k = 0$ to separate
the images of dogs from the images of cats.
Type Measurements
Cat (5, 1, 3, 6)
Dog (7, 3, 7, 2)
Dog (7, 2, 6, 4)
Dog (9, 1, 8, 5)
Cat (6, 4, 5, 5)
Cat (9, 2, 7, 6)
a
What values of k would cause the hyperplane to correctly separate the dog images from the
cat images?
b
If you intended to use the hyperplane to guess whether a future image was a dog or cat,
what k would you choose? Why?
Q28
Suppose we have a hyperplane that we would like to separate two sets of points, but it doesn’t
quite work. We measure the error of this separation by taking the sum of the geometric distances
from the hyperplane of each point that is on the wrong side of the hyperplane. Suppose we were
hoping that the line 2x + 3y − 12 = 0 would separate the points of type T from the points of
type S.
Type Coordinates
T (6, 2)
T (2, 1)
T (5, 3)
T (4, 4)
S (1, 5)
S (1, 1)
S (4, 0)
S (4, 2)
a
Create a diagram of these points (labelled or colored by type) and the line.
b
We did not specify which side of the line should be T and which should be S. Use your
diagram to decide which choice of sides will give less error.
c
Compute the error in this method of separation.
d
Suppose we were trying to find a better line of the form ax + by + c = 0. When a = 2, b = 3
and c = −12, would increasing a increase or decrease the error? Justify your answer with a
derivative.
Synthesis and Extension
Q29
Write the equation of a plane that contains all the points equidistant from A = (1, −2, 7) and
B = (7, 0, 5)
Q30
Two planes are perpendicular if their normal vectors are orthogonal.
a
Are 4x − 7y + z − 3 = 0 and 5x + y + 13z + 25 = 0 perpendicular?
b
If two planes are perpendicular, is every vector in the first plane orthogonal to every vector
in the second plane?
Q31
Write the normal equation of a plane that contains the x and z axes. Where have we seen this
plane before?
Q32
What trouble do you run into if you try to write the equation of the plane through (6, 0, 0),
(0, 8, 0) and (3, 4, 0)? Explain geometrically why this makes sense.
Figure: The tangent line to f(x, y) in the direction of u
Recall that we compute $D_x f$ by comparing the value of f at (x, y) to the value at (x + h, y), a
displacement of h in the x-direction.
\[ D_x f(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \]
To compute $D_u f$ for u = ai + bj, we compare the value of f at (x, y) to the value at (x + ta, y + tb),
a displacement of t in the u-direction.
Limit Formula
\[ D_u f(x, y) = \lim_{t \to 0} \frac{f(x + ta, y + tb) - f(x, y)}{t} \]
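To get a feel for the limit formula, we can evaluate the difference quotient for smaller and smaller t. The sketch below uses a hypothetical function f(x, y) = x² + 3y and a hypothetical unit vector u = ⟨0.6, 0.8⟩; neither comes from the text.

```python
def difference_quotient(f, x, y, a, b, t):
    """The quotient inside the limit formula for the direction u = <a, b>."""
    return (f(x + t * a, y + t * b) - f(x, y)) / t

def f(x, y):
    # Hypothetical sample function, chosen only for illustration.
    return x**2 + 3 * y

# For this f at (1, 2), the quotient simplifies to 3.6 + 0.36 t,
# so the printed values should approach 3.6 as t shrinks.
for t in (1e-1, 1e-3, 1e-5):
    print(t, difference_quotient(f, 1.0, 2.0, 0.6, 0.8, t))
```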
Questions:
1 What direction produces the greatest directional derivative? The smallest?
2 How are these directions related to the geometry (specifically the level curves) of the graph?
3 How are these directions related to the partial derivatives?
We can explore these questions with an applet in the Other Cross Sections activity.
Question 5.4.1
How Do We Compute Rates of Change in Another Direction?
Figure: A cross section of z = f(x, y) and a tangent line in the direction of u
Question 5.4.2
What Is the Gradient Vector?
The relationship between the direction of maximum increase and the partial derivatives suggests that
we could treat the partial derivatives like components of a vector.
Definition
The gradient vector of f at (x, y) is
\[ \nabla f(x, y) = \langle f_x(x, y), f_y(x, y) \rangle \]
Remarks:
1 The gradient vector is a function of (x, y). Different points have different gradients.
2 $u_{\max}$, which maximizes $D_u f$, points in the same direction as ∇f.
3 $u_0$, which is tangent to the level curves, is orthogonal to ∇f.
Remark
Students often wonder: what is the geometric intuition behind the gradient vector and its properties?
The answer is often disappointing, but important. The gradient vector does not have a geometric
motivation. We artificially created the gradient vector because it has convenient algebraic properties. If
that were the end of the story, we wouldn’t bother learning about it. However, the gradient turns out
to be so useful that we will study it intensely, despite its uncompelling origins.
Question 5.4.3
How Do We Compute a Directional Derivative?
There are several ways to derive a formula for the directional derivative. One approach is to apply
algebra and limit laws to the limit definition. A more geometric method is to exploit our previous work
with the tangent plane. The directional derivative is the slope of a tangent line. The tangent lines live
in the tangent plane. We can compute their slope by rise over run.
Let u be a unit vector from $(x_0, y_0)$ to $(x_1, y_1)$. Let the associated z values in the tangent plane be
$z_0$ and $z_1$ respectively.
\[
\begin{aligned}
D_u f(x_0, y_0) = \frac{\text{rise}}{\text{run}} &= \frac{z_1 - z_0}{|u|} \\
&= f_x(x_0, y_0)(x_1 - x_0) + f_y(x_0, y_0)(y_1 - y_0) \\
&= \nabla f(x_0, y_0) \cdot u.
\end{aligned}
\]
Functions of More Variables
We can also define directional derivatives of higher variable functions with analogous results.
$f(x_1, \ldots, x_n)$ is a differentiable function.
u is a unit vector in $\mathbb{R}^n$.
$D_u f$ denotes the directional derivative in the direction of u.
$\nabla f = \langle f_{x_1}, \ldots, f_{x_n} \rangle$ is an n-dimensional vector function on $\mathbb{R}^n$.
\[ D_u f = \nabla f \cdot u \]
Synthesis 5.4.4
Directional Derivative and the Cosine Formula
Now that we have a formula for directional derivatives, we can verify our observations from earlier.
Suppose f(x, y) is a differentiable function and we can choose any unit vector u.
a
Write $D_u f(x, y)$ in terms of the length of a vector and an angle.
b
In what direction u will f increase fastest?
c
What will be the value of $D_u f(x, y)$ in that direction?
d
In what direction u will $D_u f(x, y) = 0$?
Solution
a
Since the directional derivative is a dot product, we can apply our formula that relates the dot
product to the lengths of the vectors and the angle between them.
\[
\begin{aligned}
D_u f(x, y) &= \nabla f(x, y) \cdot u && \text{dot product formula} \\
&= |\nabla f(x, y)|\,|u| \cos\theta && \text{cosine formula} \\
&= |\nabla f(x, y)| \cos\theta && \text{u is a unit vector}
\end{aligned}
\]
b
Given a particular (x, y), $|\nabla f(x, y)| \cos\theta$ is largest when θ = 0. This means that $D_u f(x, y)$ is
maximized when u is in the direction of ∇f(x, y). The formula for a unit vector in the direction
of the gradient is
\[ u = \frac{1}{|\nabla f(x, y)|} \nabla f(x, y) \]
c
In this direction, cos θ = 1 so $D_u f(x, y) = |\nabla f(x, y)|$.
d
We can solve for θ.
\[
\begin{aligned}
D_u f(x, y) &= 0 \\
|\nabla f(x, y)| \cos\theta &= 0 && \text{by part (a)} \\
\cos\theta &= 0 && \text{as long as } \nabla f(x, y) \neq 0 \\
\theta &= \tfrac{\pi}{2}
\end{aligned}
\]
We conclude that u must be orthogonal to ∇f(x, y).
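These conclusions can be spot-checked with numbers. The sketch below assumes a hypothetical gradient value ⟨3, 4⟩ at some point and confirms that the unit vector along the gradient gives a directional derivative equal to the gradient's length, while a perpendicular unit vector gives zero.

```python
import math

def dot(v, u):
    return sum(a * b for a, b in zip(v, u))

def unit(v):
    """Unit vector in the direction of the nonzero vector v."""
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)

# Hypothetical gradient value at some point (not taken from the text).
grad = (3.0, 4.0)

u_max = unit(grad)              # direction of fastest increase
u_zero = (-u_max[1], u_max[0])  # u_max rotated by a quarter turn

print(dot(grad, u_max))   # approximately 5, the length of grad
print(dot(grad, u_zero))  # approximately 0
```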
Figure: The angle between the gradient of f and a unit vector
Main Ideas
The cosine formula for the dot product lets us relate the directional derivative to an angle.
f increases fastest in the direction of ∇f(x, y).
$D_u f(x, y) = 0$ when ∇f(x, y) and u are orthogonal.
Example 5.4.5
Let $f(x, y) = \sqrt{9 - x^2 - y^2}$ and let u = ⟨0.6, −0.8⟩.
a
What are the level curves of f?
b
What direction does ∇f(1, 2) point?
c
Without calculating, is $D_u f(1, 2)$ positive or negative?
d
Calculate ∇f(1, 2) and $D_u f(1, 2)$.
Example 5.4.5
A Directional Derivative
Solution
a
The level curves have the equations $\sqrt{9 - x^2 - y^2} = c$. These solve to $x^2 + y^2 = 9 - c^2$. As
c increases from 0 to 3 these are circles starting at radius 3 and shrinking to the origin. For c
outside this range, the level curve has no points.
b
∇f points in the direction of increase and normal to the level curves. Since higher level curves
are smaller circles, closer to the origin, ∇f (1, 2) points toward the origin.
c
$D_u f(1, 2) = \nabla f(1, 2) \cdot u$. Since u appears to make an acute angle with ∇f(1, 2), we expect this
dot product to be positive.
d
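First we need to compute ∇f(1, 2).
\[
\begin{aligned}
\nabla f(x, y) &= \langle f_x(x, y), f_y(x, y) \rangle \\
&= \left\langle \frac{1}{2\sqrt{9 - x^2 - y^2}}(-2x), \frac{1}{2\sqrt{9 - x^2 - y^2}}(-2y) \right\rangle && \text{(chain rule)} \\
\nabla f(1, 2) &= \left\langle \frac{1}{2\sqrt{9 - 1^2 - 2^2}}(-2)(1), \frac{1}{2\sqrt{9 - 1^2 - 2^2}}(-2)(2) \right\rangle \\
&= \left\langle -\tfrac{1}{2}, -1 \right\rangle
\end{aligned}
\]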
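Now we use the dot product formula to compute $D_u f(1, 2)$.
\[
\begin{aligned}
D_u f(1, 2) &= \nabla f(1, 2) \cdot u \\
&= \left\langle -\tfrac{1}{2}, -1 \right\rangle \cdot \langle 0.6, -0.8 \rangle \\
&= -0.3 + 0.8 = 0.5
\end{aligned}
\]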
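As a sanity check on this example, we can compare the dot product formula against the difference quotient from the limit definition. This Python sketch is an illustration only; the step size t is an arbitrary small value chosen for the comparison.

```python
import math

def f(x, y):
    return math.sqrt(9 - x**2 - y**2)

def grad_f(x, y):
    """Gradient of f from the chain rule: <-x / sqrt(...), -y / sqrt(...)>."""
    root = math.sqrt(9 - x**2 - y**2)
    return (-x / root, -y / root)

u = (0.6, -0.8)
g = grad_f(1, 2)  # expected to be <-1/2, -1>

# Directional derivative from the dot product formula.
d_formula = g[0] * u[0] + g[1] * u[1]

# Directional derivative approximated by the difference quotient.
t = 1e-6  # arbitrary small step
d_limit = (f(1 + t * u[0], 2 + t * u[1]) - f(1, 2)) / t

print(g, d_formula, d_limit)  # both derivative values are close to 0.5
```

The positive value agrees with the prediction in part c that u makes an acute angle with the gradient.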
Application 5.4.7
Representing an image by defining a brightness (or color) function on the pixels is simple enough,
but can a computer be taught to make sense of what it sees? Image recognition is an exciting field that
promises to automate and improve tasks from medical diagnosis to driving a vehicle.
The problem is daunting. What algorithm can possibly take a set of pixels and locate a tumor or a
pedestrian? The first step is to identify the objects in the image. The first step of object identification is
edge detection, determining where one object ends and another begins. We can do this by approximating
the partial derivatives at each pixel. We compare each pixel to nearby pixels and compute rise over run
(how these are chosen and averaged can significantly affect the accuracy of the algorithm).
The length of the gradient of a brightness function detects the edges in a picture, where the brightness
is changing quickly.
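\[ \frac{\partial B}{\partial x}(336, 785) \approx \frac{185 - 187}{1}, \qquad \frac{\partial B}{\partial y}(336, 785) \approx \frac{179 - 187}{1}, \qquad \nabla B(336, 785) \approx \langle -2, -8 \rangle \]
\[ \frac{\partial B}{\partial x}(340, 784) \approx \frac{97 - 139}{1}, \qquad \frac{\partial B}{\partial y}(340, 784) \approx \frac{72 - 139}{1}, \qquad \nabla B(340, 784) \approx \langle -42, -67 \rangle \]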
Figure: A long gradient vector indicates a swift change in brightness. Its direction suggests the shape
of the edges.
Notice that the gradient is long near the edge of the iris in Mona Lisa’s eye. It is much shorter at a
point in the white of her eye. Moreover, the gradient at the edge of the iris is approximately normal to
the edge of her iris, because gradients are normal to level curves. This information can be used by an
algorithm to detect not only the location of the edges, but also their direction.
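The first approximation above can be reproduced with a forward difference on a tiny patch of brightness values. This is a sketch under assumptions: the three brightness values are the ones quoted in the text, but which neighboring pixels they belong to (here, one step in the positive x and y directions) is a guess, and the dictionary layout is purely illustrative.

```python
import math

# Brightness samples keyed by pixel (x, y). The neighbor positions are assumed.
B = {(336, 785): 187, (337, 785): 185, (336, 786): 179}

def forward_gradient(B, x, y):
    """Approximate the gradient of B by rise over run with a run of 1 pixel."""
    dB_dx = (B[(x + 1, y)] - B[(x, y)]) / 1
    dB_dy = (B[(x, y + 1)] - B[(x, y)]) / 1
    return (dB_dx, dB_dy)

grad = forward_gradient(B, 336, 785)
edge_strength = math.hypot(*grad)  # length of the gradient vector

print(grad)           # (-2.0, -8.0)
print(edge_strength)  # sqrt(68), roughly 8.2
```

An edge detector along these lines would flag pixels whose `edge_strength` exceeds some threshold; the choice of threshold is not specified in the text.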
Application 5.4.8
Tangent Planes to a Level Surface
Use a gradient vector to find the equation of the tangent plane to the graph $x^2 + y^2 + z^2 = 14$ at
the point (2, 1, −3).
There are two solutions worth comparing here.
Solution 1
We can write z as a function of x and y and apply the tangent plane formula.
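\[
\begin{aligned}
x^2 + y^2 + z^2 &= 14 \\
z^2 &= 14 - x^2 - y^2 \\
z &= -\sqrt{14 - x^2 - y^2} && \text{($z = -3$ is on the negative branch of the function)}
\end{aligned}
\]
\[ f_x(x, y) = -\frac{1}{2\sqrt{14 - x^2 - y^2}}(-2x), \qquad f_x(2, 1) = \tfrac{2}{3} \]
\[ f_y(x, y) = -\frac{1}{2\sqrt{14 - x^2 - y^2}}(-2y), \qquad f_y(2, 1) = \tfrac{1}{3} \]
Equation: $z + 3 = \tfrac{2}{3}(x - 2) + \tfrac{1}{3}(y - 1)$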
Solution 2
Define $F(x, y, z) = x^2 + y^2 + z^2$. The graph $x^2 + y^2 + z^2 = 14$ is a level surface of F. ∇F(2, 1, −3)
is normal to the level surface, meaning it is also a normal vector for the tangent plane.
∇F(x, y, z) = ⟨2x, 2y, 2z⟩
∇F(2, 1, −3) = ⟨4, 2, −6⟩
We now have a normal vector n = ∇F(2, 1, −3). Our known point is $(x_0, y_0, z_0) = (2, 1, -3)$. The
normal equation of the plane is
4(x − 2) + 2(y − 1) − 6(z + 3) = 0.
Solution 2 requires more conceptual reasoning, but is computationally much easier. In fact, in
some cases we cannot use Solution 1 at all because we do not know how to solve for z. Once we are
comfortable with the concepts involved, the second method is generally superior for graphs of implicit
equations.
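The second method is short enough to sketch in code, and we can use it to confirm that the two solutions describe the same plane. The function names below are hypothetical; the test point (5, 1, −1) is obtained by plugging x = 5, y = 1 into the equation from Solution 1.

```python
def grad_F(x, y, z):
    """Gradient of F(x, y, z) = x^2 + y^2 + z^2."""
    return (2 * x, 2 * y, 2 * z)

def tangent_plane(point):
    """Return the normal vector n and the left side of n . (r - r0) = 0."""
    n = grad_F(*point)

    def left_side(x, y, z):
        return sum(ni * (ci - pi) for ni, ci, pi in zip(n, (x, y, z), point))

    return n, left_side

n, left_side = tangent_plane((2, 1, -3))
print(n)                     # (4, 2, -6)
print(left_side(2, 1, -3))   # 0, the known point lies on the plane
print(left_side(5, 1, -1))   # 0, a point taken from Solution 1's equation
```

Only `grad_F` would need to change to handle a different level surface, including one where solving for z is not practical.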
5.4.1
Q5
Suppose that f(3, 7) = 12 and f (7, 4) = 10.
a
What is the distance from (3, 7) to (7, 4)?
b
Approximate the rate of change of f at (3, 7) travelling toward (7, 4)
Q6
Suppose g(0, 2) = 15 and g(4, 1) = 17.
a
What is the distance from (0, 2) to (4, 1)?
b
Approximate the rate of change of g at (0, 2) travelling toward (4, 1).
c
If you wanted to express the previous rate of change as an approximation of $D_u g(0, 2)$, what
would the unit vector u be?
5.4.2
Q7
If $f(x, y) = x^2 \sin(xe^y)$, what is ∇f(x, y)?
Q8
If $g(x, y) = \sqrt{6x^2 + 5y^4}$, what is ∇g(x, y)?
Q9
If $\nabla f(x_0, y_0)$ is orthogonal to $\nabla g(x_0, y_0)$, what can we say about the level curves of f and g?
Be specific.
Q10
Harriet says “The gradient vector of f is tangent to the graph of z = f(x, y).”
“No,” says Marcus, “it is normal to the graph of z = f(x, y).” Who is correct?
Section 5.4
Exercises
5.4.3
Q11
Consider our computation of the directional derivative as a dot product.
a
Where did we use the fact that u is a unit vector?
b
If u were not a unit vector, then ∇f ·u would no longer represent rise over run. What would
it represent instead?
Q12
Suppose the linearization of f(x, y) at (−3, 9) has the equation
\[ L(x, y) = 4 + 2(x + 3) - \tfrac{1}{3}(y - 9). \]
What is the slope of L from (−3, 9) to (5, 3)?
5.4.4
Q13
Given a function f(x, y) and a point (x, y), in what direction u is f decreasing fastest? Compute
an expression for u.
Q14
If $D_u f(x, y) < 0$, what can you say about the directions of ∇f(x, y) and u?
Q15
If $f_x(3, 5) = f_y(3, 5)$ in what direction(s) from (3, 5) could f increase most quickly?
Q16
Explain why it makes sense that if $D_u f(a, b, c) = 0$, then u is tangent to the level surface of f
through (a, b, c).
Q17
If $f(x, y, z) = 3xy + z^2$, find the unit vector u that maximizes $D_u f(2, 1, -4)$. What is the value
of $D_u f(2, 1, -4)$ for this u?
Q18
Let $f(x, y) = 2x^2 y - 10x - y^2$.
a
What unit vector u maximizes the quantity $D_u f(-1, 3)$?
b
Compute $D_u f(-1, 3)$ for the u you found in part a.
5.4.5
Q19
If $u = \left\langle \tfrac{2}{3}, -\tfrac{1}{3}, -\tfrac{2}{3} \right\rangle$ and $f(x, y, z) = xe^{yz}$, compute $D_u f(3, 0, 4)$.
Q20
If $u = \left\langle \tfrac{3}{7}, \tfrac{6}{7}, -\tfrac{2}{7} \right\rangle$ and f(x, y, z) = xy + yz + zx, compute $D_u f(7, -7, 14)$.
Q21
If u is a unit vector in the direction of ⟨2, 3⟩ and $f(x, y) = x^2 + 3xy + 2$, calculate $D_u f(-1, 4)$.
Q22
Compute the directional derivative of $g(x, y) = e^{x^2 - y}$ at (3, 7) in the direction of ⟨−12, 5⟩.
5.4.6
Q23
In this diagram, we have several level sets of f(x, y).
a
Which way does ∇f (−4, 1.25) point?
b
Mark all the points (x, y) that satisfy
f(x, y) = 30
∇f(x, y) points in the positive y-direction
Q24
Some level curves of f are drawn below. Indicate the direction of the gradient of f at each
labelled point.
5.4.7
Q25
If $\nabla B(x_0, y_0) = \langle 13, -17 \rangle$, would you expect the pixels above $(x_0, y_0)$ to be brighter or dimmer
than $(x_0, y_0)$? Explain.
Q26
The brightness function on the Mona Lisa image ranges from 0 to 255. If we use adjacent points
to approximate the gradient as in the example, what is the longest gradient vector we could
theoretically produce?
5.4.8
Q27
Calculate a normal equation of a tangent line to $x^3 + 8y^3 - 12xy = 0$ at (3, 1.5).
Q28
Let P be a point on the circle $x^2 + y^2 = r^2$. Show that the position vector of P is normal to the
circle at P.
Q29
Produce an equation of the tangent plane to $z^3 - xz^2 - yx^2 = 24$ at (4, −2, 2).
Q30
Give an equation of the tangent plane to the graph $z^2 x + 2yz - x^2 y^2 = 59$ at (3, 2, 5).
Synthesis and Extension
Q31
Suppose f(x, y) is a differentiable function, and we know that for u = ⟨−0.6, 0.8⟩, $D_u f(5, -1) = 4$
and for v = ⟨0, −1⟩ we know that $D_v f(5, -1) = -2$. What is ∇f(5, −1)?
Q32
Suppose the point $P = (x_0, y_0, z_0)$ lies on the graph z = f(x, y).
a
Give the formula for the tangent plane to this graph at P.
b
z = f(x, y) is a level surface of F(x, y, z) = f(x, y) − z. Use the gradient of F to write the
equation of the tangent plane to F(x, y, z) = 0 at P.
c
Are these equations equivalent? Justify your answer with algebra.
Q33
How could you use the gradient of f to rewrite the formula for the linearization L(x, y) of f(x, y)
at $(x_0, y_0)$?
Q34
Suppose f(x, y) is a differentiable function and ∇f(a, b) is not the zero vector. How many unit
vectors u exist such that $D_u f(a, b) = 0$? How are they related geometrically?
Q35
Suppose f(x, y, z) is a differentiable function and ∇f(a, b, c) is not the zero vector. How many
unit vectors u exist such that $D_u f(a, b, c) = 0$? How are they related geometrically?
Q36
Suppose that f(x, y, z) is a differentiable function, and f(3, 5, −2) = 13. Suppose further that
the vectors ⟨3, 1, 0⟩ and ⟨0, 2, 5⟩ both lie in the tangent plane to the surface f(x, y, z) = 13 at
(3, 5, −2). If the maximum value of $D_u f(3, 5, -2)$ is 20, find all possible values of ∇f(3, 5, −2).
Q37
Consider the function h(x, y) = x
2
+ 2x + 4y
3/2
a
Compute all possible unit vectors u such that D
u
h(2, 3) = 6
b
What angle do these vectors u make with the tangent line to the level curve h(x, y) =
8 + 12
√
3 at (2, 3).
Q38
Let f(x, y) = x
4
y + 3x − y
3
.
a
Give an equation of the level curve of f through the point (−1, 2).
b
Give an equation of the tangent line to the level curve of f at (−1, 2). Write your equation
in normal form.
357
Question 5.5.1
How Can We Visualize a Composition with a Multivariable Function?
Given a function f(x, y) where x = x(t) and y = y(t), we can ask how f changes as t changes.
We can visualize this change by drawing the graph z = f(x, y) over the path given by the parametric
equations x(t) and y(t).
Figure: The composition f (x(t), y(t)), represented by the height of z = f(x, y) over the path
(x(t), y(t))
Question 5.5.2
How Do We Compute the Derivative of a Composition of Functions?
Theorem [The Chain Rule]
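Consider a differentiable function $f(x, y)$. If we define $x = x(t)$ and $y = y(t)$, both differentiable functions, we have
\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \]
or
\[ \frac{df}{dt} = \nabla f(x, y) \cdot \langle x'(t), y'(t) \rangle \]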
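As a numerical sanity check of the chain rule, the sketch below compares the formula $\nabla f \cdot \langle x'(t), y'(t) \rangle$ against a finite-difference estimate of $\frac{d}{dt} f(x(t), y(t))$. The function `f` and the path `path` are hypothetical choices made only for this illustration; they do not come from the text.

```python
import math

# Hypothetical example function and path (assumptions for illustration only).
def f(x, y):
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Partial derivatives of f, computed by hand.
    return (2 * x * y, x**2 + math.cos(y))

def path(t):
    return (math.cos(t), t**2)

def path_velocity(t):
    # Derivatives x'(t) and y'(t) of the path above.
    return (-math.sin(t), 2 * t)

def chain_rule_derivative(t):
    # df/dt = f_x * dx/dt + f_y * dy/dt
    x, y = path(t)
    fx, fy = grad_f(x, y)
    dx, dy = path_velocity(t)
    return fx * dx + fy * dy

def finite_difference_derivative(t, h=1e-6):
    # Central difference applied directly to the composition f(x(t), y(t)).
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

t0 = 1.3
print(chain_rule_derivative(t0), finite_difference_derivative(t0))
```

The two printed numbers should agree to several decimal places; the small difference that remains comes from the finite-difference approximation, not from the chain rule.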
Example 5.5.3
Using the Chain Rule
Remark
Notice we don’t need the chain rule when we have expressions for each function. We can write the
composition ourselves and take an ordinary derivative. In this example we could just differentiate
$P = 100w - (3000 + 70w - 0.1w^2)$.
Question 5.5.4
What If We Have More Variables?
The chain rule works just as well if x and y are functions of more than one variable. In this case it
computes partial derivatives.
Theorem
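If $f(x, y)$, $x(s, t)$ and $y(s, t)$ are all differentiable, then
\[ \frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} \]
or
\[ \frac{\partial f}{\partial s} = \nabla f(x, y) \cdot \left\langle \frac{\partial x}{\partial s}, \frac{\partial y}{\partial s} \right\rangle \]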
We can also modify it for functions of more than two variables.
Theorem
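Given $f(x, y, z)$, $x(t)$, $y(t)$ and $z(t)$, all differentiable, we have
\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} \]
or
\[ \frac{df}{dt} = \nabla f(x, y, z) \cdot \langle x'(t), y'(t), z'(t) \rangle \]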
Example 5.5.6
A Composition with Limited Information
Suppose $g(p, q, r) = r e^{p^2 q}$. Given that $p$, $q$, $r$ are all differentiable functions of $x$ with the values in the following table, compute $\frac{dg}{dx}$ when $x = 2$.
x        0    1    2    3
p(x)     3    1    5   10
p'(x)   −3    2    3    4
q(x)     6    2   −2    3
q'(x)   −1   −5    2    3
r(x)    10   11    7    3
r'(x)    1    0   −1   −3
Solution
The chain rule says
\[ \frac{dg}{dx} = \frac{\partial g}{\partial p}\frac{dp}{dx} + \frac{\partial g}{\partial q}\frac{dq}{dx} + \frac{\partial g}{\partial r}\frac{dr}{dx} \]
We require the partial derivatives of $g$.
\[ \frac{\partial g}{\partial p} = 2pqr e^{p^2 q} \qquad \frac{\partial g}{\partial q} = p^2 r e^{p^2 q} \qquad \frac{\partial g}{\partial r} = e^{p^2 q} \]
Now we plug in the partial derivatives, along with the derivatives of $p$, $q$ and $r$ at $x = 2$ from the table.
\[ \frac{dg}{dx} = 2pqr e^{p^2 q}(3) + p^2 r e^{p^2 q}(2) + e^{p^2 q}(-1) \]
This is correct, but not sufficiently simplified. We have left $p$'s, $q$'s and $r$'s in the expression, but the table tells us what value these have when $x = 2$. We can make these substitutions:
\begin{align*}
\frac{dg}{dx} &= 2(5)(-2)(7)e^{(5)^2(-2)}(3) + (5)^2(7)e^{(5)^2(-2)}(2) + e^{(5)^2(-2)}(-1) \\
&= -420e^{-50} + 350e^{-50} - e^{-50} \\
&= -71e^{-50}
\end{align*}
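The arithmetic in this example can be replayed in a few lines of Python. This is only a sketch: the variable names are illustrative, and the numbers are the $x = 2$ column of the table above.

```python
import math

# Values read from the x = 2 column of the table.
p, q, r = 5, -2, 7
dp, dq, dr = 3, 2, -1

def partials(p, q, r):
    # Partial derivatives of g(p, q, r) = r * exp(p^2 * q).
    e = math.exp(p**2 * q)
    return (2 * p * q * r * e, p**2 * r * e, e)

gp, gq, gr = partials(p, q, r)

# Chain rule: dg/dx = g_p * p' + g_q * q' + g_r * r'
dg_dx = gp * dp + gq * dq + gr * dr

print(dg_dx, -71 * math.exp(-50))
```

Both printed values should match, confirming the simplification to $-71e^{-50}$.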
Application 5.5.7
Recall that an implicit equation in $n$ variables is a level set of an $n$-variable function. Consider the graph $x^3 + y^2 - 4xy = 0$. How can we use this to calculate $\frac{dy}{dx}$ at the point $(3, 3)$?
Solution
First, note that $(3, 3)$ does lie on the graph. When we plug $x = 3$ and $y = 3$ into our equation, we get $27 + 9 - 36 = 0$, which is true. Now suppose that for every $x$ near 3, we can define $y(x)$ to be the $y$-coordinate on the graph $x^3 + y^2 - 4xy = 0$.
Define $F(x, y) = x^3 + y^2 - 4xy$. The points $(x, y(x))$ lie on the graph $F(x, y) = 0$. We can use this equation to obtain an expression for $\frac{dy}{dx}$. When we differentiate $F(x, y(x))$, both components change as $x$ changes, so we cannot use a partial derivative. We need the chain rule.
\begin{align*}
F(x, y(x)) &= 0 \\
\frac{d}{dx} F(x, y(x)) &= \frac{d}{dx} 0 && \text{differentiate both sides} \\
\frac{\partial F}{\partial x}\frac{dx}{dx} + \frac{\partial F}{\partial y}\frac{dy}{dx} &= 0 && \text{apply chain rule} \\
\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} &= 0 && \tfrac{dx}{dx} = 1 \\
\frac{\partial F}{\partial y}\frac{dy}{dx} &= -\frac{\partial F}{\partial x} && \text{solve for } \tfrac{dy}{dx} \\
\frac{dy}{dx} &= -\frac{\partial F / \partial x}{\partial F / \partial y}
\end{align*}
We compute the partial derivatives at $(3, 3)$, then plug them into the formula we derived.
\begin{align*}
F_x(x, y) &= 3x^2 - 4y & F_x(3, 3) &= 15 \\
F_y(x, y) &= 2y - 4x & F_y(3, 3) &= -6
\end{align*}
\[ \frac{dy}{dx} = -\frac{15}{-6} = \frac{5}{2} \]
Figure: The graph of $F(x, y) = x^3 + y^2 - 4xy = 0$, its tangent line at $(3, 3)$, and the gradient of $F$
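The slope $\frac{5}{2}$ can be cross-checked without the chain rule. Because $x^3 + y^2 - 4xy = 0$ is quadratic in $y$, the quadratic formula gives $y = 2x \pm \sqrt{4x^2 - x^3}$, and the minus branch passes through $(3, 3)$. The sketch below differentiates that branch numerically; the explicit branch and the function names are additions made for this check and are not part of the original solution.

```python
import math

def F_x(x, y):
    return 3 * x**2 - 4 * y

def F_y(x, y):
    return 2 * y - 4 * x

def slope_from_formula(x, y):
    # dy/dx = -F_x / F_y, valid where F_y is nonzero.
    return -F_x(x, y) / F_y(x, y)

def y_branch(x):
    # Minus branch of y^2 - 4xy + x^3 = 0; it satisfies y_branch(3) = 3.
    return 2 * x - math.sqrt(4 * x**2 - x**3)

def slope_from_branch(x, h=1e-6):
    # Central-difference estimate of the derivative of the explicit branch.
    return (y_branch(x + h) - y_branch(x - h)) / (2 * h)

print(y_branch(3), slope_from_formula(3, 3), slope_from_branch(3))
```

The explicit branch is only available here because the equation happens to be quadratic in $y$; the formula $-F_x / F_y$ needs no such luck.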
Application 5.5.7
Implicit Differentiation
Main Ideas
$\frac{dy}{dx}$ is the slope of the tangent line to $F(x, y) = c$.
The chain rule allows us to derive $\frac{dy}{dx} = -\frac{F_x}{F_y}$.
$-\frac{F_x}{F_y}$ is the negative reciprocal of $\frac{F_y}{F_x}$, which is the slope of $\nabla F$.
In order to solve for $\frac{dy}{dx}$ we had to assume that $y$ was a differentiable function of $x$. How do we know that's even true? There is an advanced and powerful theorem that tells us when we can write one variable in an implicit equation as a function of the others. Here is the two-variable version.
Theorem [The Implicit Function Theorem]
Suppose we have a point $(x_0, y_0)$ on the graph of $F(x, y) = c$. Suppose that
1 The partial derivatives of $F$ exist and are continuous at $(x_0, y_0)$
2 $F_y(x_0, y_0) \neq 0$
Then there is a function $y = f(x)$ that agrees with the graph of $F(x, y) = c$ in some neighborhood around $(x_0, y_0)$. Furthermore
1 $f$ is continuous
2 $f$ is differentiable
3 $f'(x_0) = -\dfrac{F_x(x_0, y_0)}{F_y(x_0, y_0)}$
In the case of our example, the partial derivatives in question are polynomials. As long as $F_y(x_0, y_0) \neq 0$, we are guaranteed that our graph has a tangent line at $(x_0, y_0)$, and its slope is $-\dfrac{F_x(x_0, y_0)}{F_y(x_0, y_0)}$.
Application 5.5.8
Suppose a firm chooses how much quantity q to produce, but their profit Π(q, α) depends on some
parameter α outside their control (maybe a tax or a measure of regulatory burden). The firm, once
it knows the value of α, will choose the q that maximizes profit. How will their profit change as α
changes?
Solution
The change in the firm's profit is $\frac{d\Pi}{d\alpha}$. Since $q$ is also a function of $\alpha$ we will need the chain rule.
\[ \frac{d\Pi}{d\alpha} = \frac{\partial \Pi}{\partial q}\frac{dq}{d\alpha} + \frac{\partial \Pi}{\partial \alpha}\frac{d\alpha}{d\alpha} \]
We can substitute $\frac{d\alpha}{d\alpha} = 1$. We can also argue that $\frac{\partial \Pi}{\partial q} = 0$. Why? Because $q$ is the choice that maximizes profit, and maximums occur at critical points. If $\frac{\partial \Pi}{\partial q} > 0$ then the firm could increase $q$ to increase profit (without changing $\alpha$, which it has no control over). Similarly, if $\frac{\partial \Pi}{\partial q} < 0$ then reducing production would increase profit.
Performing these substitutions gives:
\[ \frac{d\Pi}{d\alpha} = \frac{\partial \Pi}{\partial \alpha} \]
This suggests that in this case, the total derivative is equal to the partial derivative.
We can verify this equality graphically as well. Pick a particular $\alpha_0$ and let $q_0 = q(\alpha_0)$. Notice:
The graph of $\Pi(q_0, \alpha)$ is never above $\Pi(q(\alpha), \alpha)$ for any $\alpha$, since $q(\alpha)$ is the optimal choice of $q$.
The graphs of $\Pi(q_0, \alpha)$ and $\Pi(q(\alpha), \alpha)$ meet at $\alpha_0$, since $q_0 = q(\alpha_0)$.
If two graphs meet but one stays below the other, they are tangent. They have the same tangent line and thus the same derivative.
Figure: Two graphs of $z = \Pi(q, \alpha)$, one where $q$ changes to be the optimal choice for each $\alpha$ and one where $q$ is fixed at $q_0$, the optimal choice for $\alpha_0$
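To see this equality in numbers, the sketch below uses a hypothetical profit function $\Pi(q, \alpha) = 10q - q^2 - \alpha q$, which is an assumption made only for illustration and does not appear in the text. It compares the derivative of profit when $q$ re-optimizes against the partial derivative with $q$ frozen at its optimal value.

```python
# Hypothetical profit function, chosen only to illustrate the argument above.
def profit(q, alpha):
    return 10 * q - q**2 - alpha * q

def optimal_q(alpha):
    # Solves d(profit)/dq = 10 - 2q - alpha = 0 for q.
    return (10 - alpha) / 2

def total_derivative(alpha, h=1e-6):
    # Profit along the optimal path: q is re-chosen for each alpha.
    up = profit(optimal_q(alpha + h), alpha + h)
    down = profit(optimal_q(alpha - h), alpha - h)
    return (up - down) / (2 * h)

def partial_derivative(alpha, h=1e-6):
    # Profit with q frozen at the choice that is optimal for this alpha.
    q0 = optimal_q(alpha)
    return (profit(q0, alpha + h) - profit(q0, alpha - h)) / (2 * h)

alpha0 = 4.0
print(total_derivative(alpha0), partial_derivative(alpha0))
```

For this particular function both derivatives equal $-q(\alpha_0) = -3$, so the two printed values should agree up to rounding.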
Q6
Consider the curve defined by
\[ x(t) = t \qquad y(t) = e^t \]
a
Plot a few points on the curve by plugging in different values of $t$.
b
In general, what curve does
\[ x(t) = t \qquad y(t) = f(t) \]
seem to produce?
Q7
A particle is travelling according to the parametric equations
\[ x(t) = 2\cos t \qquad y(t) = 3\sin t \]
What is the speed (magnitude of velocity) at $t = \frac{\pi}{3}$?
Q8
Produce a tangent vector to the curve defined by
\[ x(t) = t^3 \qquad y(t) = t^2 \]
at the point $(-27, 9)$.
Q9
Is the graph of
\[ x(t) = t^2 \qquad y(t) = \sin(t) \]
the graph of a function? How can you tell without graphing it?
Q10
How are the graphs of the following two parametric equations related? Can you generalize your answer to similar pairs of parametric equations?
\[ x(t) = \cos t, \quad y(t) = \ln t \qquad \text{and} \qquad x(t) = \cos(t^3), \quad y(t) = \ln(t^3) \]
Section 5.5
Exercises
5.5.2
Q11
Let $f(x, y)$ be a function. Under what conditions is $\frac{df}{dt}$ equal to the directional derivative of $f$ in the direction of the tangent vector $\langle x'(t), y'(t) \rangle$?
Q12
Liam says “If f is a function of x and y and x and y are increasing, then f is increasing.” We
all know Liam is incorrect. How could we use the chain rule to refute him?
5.5.3
Q13
The angular speed of an object is given by $\omega = \frac{v}{r}$ where $r$ is the distance from the center of rotation and $v$ is the linear speed. Suppose an object is orbiting earth at a radius of 8400000 m and a speed of 6900 m/s. If the radius is increasing at a rate of 100 m/s and the linear speed is decreasing by 60 m/s$^2$, how quickly is the angular speed changing?
Q14
Let $x = t^2$ and $y = \sin t$. Let $f(x, y) = xy$.
a
Compute $\frac{df}{dt}$ using the multivariable chain rule.
b
Compute $\frac{df}{dt}$ by substituting and using single-variable differentiation.
c
What earlier rule of differentiation can we recover by applying the chain rule to $f(x, y) = xy$?
5.5.4
Q15
Suppose $h(x_1, x_2, x_3, x_4)$ is a four-variable function and each $x_i(s, t)$ is a function of parameters $s$ and $t$. How would the multivariable chain rule compute $\frac{\partial h}{\partial t}$?
Q16
Suppose $k(x)$ is a function and $x(r, s, t)$ is a function of parameters $r$, $s$, and $t$. How does the multivariable chain rule say we should compute $\frac{\partial k}{\partial r}$?
5.5.5
Q17
Angular momentum is given by $L = rmv$ where $r$ is the radius of rotation, $m$ is the mass of the object, and $v$ is its linear speed. At a certain time $t_0$, $r$ is 42 million meters and increasing at 80,000 meters per second, $m$ is 6000 kg and not changing, and $v$ is 3100 m/s and increasing at 20 m/s$^2$. How quickly is angular momentum increasing?
Q18
Let $f(x, y) = x^2 - y^2$. If $x(r, \theta) = r\cos\theta$ and $y(r, \theta) = r\sin\theta$, compute $\frac{\partial f}{\partial \theta}$ at $(r, \theta) = \left(4, \frac{\pi}{6}\right)$.
5.5.6
Q19
Suppose $x(t)$ and $y(t)$ are differentiable functions of $t$ such that
\[ x(2) = 3 \qquad x'(2) = 2 \qquad y(2) = -5 \qquad y'(2) = 10 \]
If $f(x, y) = y e^{x^2 y}$, show how to compute $\frac{df}{dt}$ at $t = 2$.
Q20
Suppose that $x$ and $y$ are functions of $t$ such that when $t = 2$:
\[ x = 3 \qquad y = 1 \qquad \frac{dx}{dt} = 5 \qquad \frac{dy}{dt} = 2 \]
If $g(x, y) = 3xy^2 - x^2 + 2y$, compute $\left.\frac{dg}{dt}\right|_{t=2}$.
5.5.7
Q21
Compute $\frac{dy}{dx}$ at $(4, 2)$, if $x$ and $y$ satisfy $y^3 - xy + x^2 - 4 = 0$.
Q22
Compute $\frac{dy}{dx}$ at $(3, 0)$, if $x$ and $y$ satisfy $xe^{xy} = 3$.
Q23
What is the slope of the tangent line to $x - y^2 = 9$ at $(18, -3)$?
Q24
Compute the slope of the tangent line to $x^3 = y^2$ at $(4, -8)$.
Q25
Angular momentum is given by $L = rmv$. One law of physics states that angular momentum of an object is conserved (unchanged) unless a force (besides gravity) acts to speed up or slow down the object. Use the chain rule to derive an expression for $\frac{dv}{dr}$, the amount of linear speed an object gains or loses per unit that its radius of rotation increases. What do you notice about the role of mass in your answer?
Q26
Another principle in physics is the conservation of energy. Kinetic energy is given by $E = \frac{1}{2}mv^2$, where $m$ is the mass and $v$ is the linear speed of the object. Suppose that we have a rock drifting through space. Suppose it impacts stationary rocks and the combined mass sticks together (without releasing any energy as heat, light or sound). Thus the mass of the total travelling object increases, while the total energy stays the same. Derive an expression for how speed changes per unit of increase in mass.
5.5.8
Q27
Suppose that $x$ is a function of $t$ and that when $t = 9$, we have $x = 7$ and $\frac{dx}{dt} = -3$. Define $f(x, t) = \sqrt{x + t}$.
a
Compute the partial derivative $\frac{\partial f}{\partial t}(7, 9)$.
b
Compute the total derivative $\frac{df}{dt}(7, 9)$.
c
In a few sentences, explain what these two quantities compute and why they are different from each other.
Q28
A firm with a monopoly gets to set the price of its products and decide how much to produce. There is a demand function $p$ such that if the firm produces $q$ units, it must set its price at $p(q)$ to get consumers to buy all of its production. Each unit costs $c$ to produce. The profit function of the firm is
\[ \pi(q, c) = p(q)q - cq \]
We can assume that once the firm has worked out what $c$ is, it chooses the $q$ to maximize profit. How much will the firm's actual profit change per unit of increase in $c$?
Section 5.6
Maximum and Minimum Values
Goals:
1 Find critical points of a function.
2 Test critical points to find local maximums and minimums.
3 Use the Extreme Value Theorem to find the global maximum and global minimum of a function
over a closed set.
Functions can be used to model a variety of real-world quantities: a company's profit, a disease's
infection rate, or the impact of a government program. In these cases, the most pressing question is:
what choice of independent variables will maximize or minimize the value of the function? Answering
this question was one of the headline applications of single-variable calculus. In this section we will
generalize those methods to functions of multiple variables.
Question 5.6.1
The local extremes of a function are the local minimums and maximums.
Definition
Given an $n$-variable function $f(x_1, x_2, \ldots, x_n)$ we say that a point $P$ in $n$-space is
1 a local maximum if f(P ) ≥ f(Q) for all Q in some neighborhood around P .
2 a local minimum if f(P ) ≤ f(Q) for all Q in some neighborhood around P .
Question 5.6.2
Where Do Local Extremes Lie?
At a local maximum (or minimum) $D_u f$ cannot be positive (or negative) in any direction. Thus at a local extreme, $\nabla f(P) = \vec{0}$, the zero vector. In other words, all the partial derivatives of $f$ are 0 at $P$.
In the case of a two-variable function, we can visualize this condition. If $f_x(P) \neq 0$, then we could travel in the $x$ direction to increase or decrease $f$. If $f_y(P) \neq 0$, then we could travel in the $y$ direction to increase or decrease $f$. Thus at a local maximum or local minimum, the tangent plane must be
Solution
We know the minimum value exists, so it must lie at a critical point. We compute
\[ \nabla f(x, y) = \langle 4x + 4, 2y - 6 \rangle \]
One type of critical point is where this is undefined, but no value of $(x, y)$ makes these expressions undefined. The other type of critical point occurs when these components are 0. We can solve that system of equations.
\begin{align*}
4x + 4 &= 0 & 2y - 6 &= 0 \\
x &= -1 & y &= 3
\end{align*}
The only point that satisfies this requirement is $(-1, 3)$. Since there is only one critical point, and the promised minimum lies at a critical point, $(-1, 3)$ must be that point. The minimum value is
\[ z = (2)(-1)^2 + (4)(-1) + 3^2 - (6)(3) + 13 = 2 \]
Question 5.6.4
How Do We Identify Two-Variable Local Maximums and Minimums?
Once we have found a critical point, how do we know whether it is a local minimum, a local maximum
or neither? Consider a function f(x, y) and a critical point P . There are two possibilities for ∇f(P ). In
the case that ∇f(P ) does not exist, calculus can be no further use to us. If ∇f(P ) = ⟨0, 0⟩, there are
a few different shapes the graph could take. Since we are working with two-variables, we can visualize
these shapes.
A critical point could be a local maximum. In this case f curves downward in every direction.
Figure: A local maximum at (0, 0)
Example 5.6.5
Classifying a Critical Point
Solution
a
\begin{align*}
f_x(x, y) &= -\sin(2x + y)(2) + y && \text{(chain rule)} \\
f_x(0, 0) &= -\sin((2)(0) + 0)(2) + 0 = 0 \\
f_y(x, y) &= -\sin(2x + y)(1) + x && \text{(chain rule)} \\
f_y(0, 0) &= -\sin((2)(0) + 0)(1) + 0 = 0 \\
\nabla f(0, 0) &= \langle 0, 0 \rangle
\end{align*}
b
For the second derivatives test, we need to compute $f_{xx}$, $f_{xy}$ and $f_{yy}$ at $(0, 0)$.
\begin{align*}
f_{xx}(x, y) &= -2\cos(2x + y)(2) && \text{(chain rule)} \\
f_{xx}(0, 0) &= -2\cos((2)(0) + (0))(2) = -4 \\
f_{xy}(x, y) &= -2\cos(2x + y)(1) + 1 && \text{(chain rule)} \\
f_{xy}(0, 0) &= -2\cos((2)(0) + (0))(1) + 1 = -1 \\
f_{yy}(x, y) &= -\cos(2x + y)(1) && \text{(chain rule)} \\
f_{yy}(0, 0) &= -\cos((2)(0) + (0))(1) = -1 \\
D &= f_{xx}(0, 0) f_{yy}(0, 0) - [f_{xy}(0, 0)]^2 = (-4)(-1) - (-1)^2 = 3
\end{align*}
Since $D > 0$ and $f_{xx} < 0$, $(0, 0)$ is a local maximum of $f$.
Figure: The graph $z = \cos(2x + y) + xy$ with a local maximum at $(0, 0)$
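The second derivatives test in this example can be replayed numerically. The sketch below estimates $f_{xx}$, $f_{xy}$ and $f_{yy}$ for $f(x, y) = \cos(2x + y) + xy$ with central differences and then applies the same sign rules for $D$. The helper names `second_partials` and `classify` are illustrative, and the finite-difference step is an approximation rather than the exact derivatives computed above.

```python
import math

def f(x, y):
    return math.cos(2 * x + y) + x * y

def second_partials(func, x, y, h=1e-4):
    # Central-difference estimates of f_xx, f_xy and f_yy at (x, y).
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

def classify(fxx, fxy, fyy):
    # Second derivatives test for a critical point of a two-variable function.
    D = fxx * fyy - fxy**2
    if D > 0:
        return "local maximum" if fxx < 0 else "local minimum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

fxx, fxy, fyy = second_partials(f, 0.0, 0.0)
print(fxx, fxy, fyy, classify(fxx, fxy, fyy))
```

The estimates should land close to $-4$, $-1$ and $-1$, and the classification should agree with the hand computation.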
These are never undefined, so there are no critical points of that type. The only critical points will be where both partial derivatives are 0.
\[ 0 = 2x - 2xy \qquad 0 = 4y - x^2 \]
\[ 0 = 2x(1 - y) \qquad \text{(factor } 2x - 2xy\text{)} \]
\[ x = 0 \quad \text{or} \quad y = 1 \]
We examine each case separately.
\begin{align*}
\text{If } x = 0: \quad 0 &= 4y - 0^2 & \text{If } y = 1: \quad 0 &= 4(1) - x^2 \\
0 &= y & x &= \pm 2
\end{align*}
We should be careful not to lose track of the logic. The $x = \pm 2$ solution goes with the $y = 1$ case. The $y = 0$ solution goes with the $x = 0$ case. Mixing these up will give invalid solutions. You can always plug in each pair $(x, y)$ to verify that it satisfies the system of equations.
We conclude that $(0, 0)$, $(2, 1)$ and $(-2, 1)$ are the critical points, but $(2, 1)$ is not in the domain, so we discard it.
c
No. Recall our method for maximizing single variable functions on a closed interval. The maximum can occur at the endpoint of the interval without being detected by the derivative.
The same is true here. If the maximum is on the boundary of $D$, the gradient need not be 0. In the single-variable case, we only need to test the endpoints (by evaluating $f$ there). There are infinitely many points on the boundary of $D$. Evaluating $f$ on all of them is not an option. With graphing software we can see that the maximum occurs on the boundary somewhere in the third quadrant, but how can we solve for it exactly?
Example 5.6.7
Finding a Global Maximum
Figure: The graph of z = f(x, y) over the domain D
d
To narrow down the search for a maximum on the boundary of $D$, we will use the boundary equations to write an expression for $f$ that is valid only on the boundary. We can find the critical points of this expression, and rule out any point that is not a critical point.
Suppose the maximum lies on $x = 0$. The function on $x = 0$ is $f(0, y) = 0^2 + 2y^2 - 0^2 y = 2y^2$. This function only has one variable, so we can find potential maximums by looking for its critical points.
\[ f'(y) = 4y \]
This is never undefined. It is 0 at $y = 0$. The only critical point of $f(y)$ on $x = 0$ is $(0, 0)$. However, not all of $x = 0$ is the boundary of $D$. This component of the boundary ends at $(0, 4)$ and $(0, -4)$. Like with a closed interval, the derivative of $f(y)$ cannot detect a maximum at those endpoints.
Suppose the maximum lies on $x^2 + y^2 = 16$. On this graph, we can similarly reduce $f(x, y)$ to a function of one variable, but the substitution is more complicated. We solve
\begin{align*}
x^2 + y^2 &= 16 \\
x^2 &= 16 - y^2 \\
f(y) &= (16 - y^2) + 2y^2 - (16 - y^2)y && \text{(substitute for } x^2\text{)} \\
&= y^3 + y^2 - 16y + 16 \\
f'(y) &= 3y^2 + 2y - 16 \\
0 &= 3y^2 + 2y - 16 && \text{(solve for critical points)} \\
0 &= (3y + 8)(y - 2)
\end{align*}
\begin{align*}
y &= -\tfrac{8}{3} & y &= 2 \\
x^2 + \left(-\tfrac{8}{3}\right)^2 &= 16 & x^2 + 2^2 &= 16 && \text{(substitute into } x^2 + y^2 = 16\text{)} \\
x^2 &= 16 - \tfrac{64}{9} & x^2 &= 16 - 4 \\
x &= -\sqrt{\tfrac{80}{9}} & x &= -\sqrt{12} && \text{(+ solutions are not in } D\text{)}
\end{align*}
Our critical points are $\left(-\sqrt{\tfrac{80}{9}}, -\tfrac{8}{3}\right)$ and $\left(-\sqrt{12}, 2\right)$. This component of the boundary also ends at $(0, 4)$ and $(0, -4)$, so the maximum might lie there.
We can now argue that one of the points we have found is the maximum.
If the maximum is not on the boundary, it lies at $(-2, 1)$.
If the maximum is on $x = 0$, then it lies at $(0, 0)$, $(0, 4)$ or $(0, -4)$.
If the maximum is on $x^2 + y^2 = 16$, then it lies at $\left(-\sqrt{\tfrac{80}{9}}, -\tfrac{8}{3}\right)$, $\left(-\sqrt{12}, 2\right)$, $(0, 4)$ or $(0, -4)$.
One of these must be the case. To figure out which it is, we can evaluate $f$ at each point and see which produces the largest value.
\begin{align*}
f(-2, 1) &= (-2)^2 + 2(1)^2 - (-2)^2(1) = 2 \\
f(0, 0) &= (0)^2 + 2(0)^2 - (0)^2(0) = 0 \\
f(0, 4) &= (0)^2 + 2(4)^2 - (0)^2(4) = 32 \\
f(0, -4) &= (0)^2 + 2(-4)^2 - (0)^2(-4) = 32 \\
f\left(-\sqrt{\tfrac{80}{9}}, -\tfrac{8}{3}\right) &= \left(-\sqrt{\tfrac{80}{9}}\right)^2 + 2\left(-\tfrac{8}{3}\right)^2 - \left(-\sqrt{\tfrac{80}{9}}\right)^2\left(-\tfrac{8}{3}\right) = \tfrac{1264}{27} && \text{(maximum)} \\
f\left(-\sqrt{12}, 2\right) &= (-\sqrt{12})^2 + 2(2)^2 - (-\sqrt{12})^2(2) = -4
\end{align*}
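The final comparison is easy to automate. The sketch below evaluates $f(x, y) = x^2 + 2y^2 - x^2 y$ at the six candidate points found above and picks the largest value. The list `candidates` and the helper `in_domain` are names introduced here for illustration; the small tolerance is an assumption needed because the boundary points are stored as floating-point numbers.

```python
import math

def f(x, y):
    return x**2 + 2 * y**2 - x**2 * y

# Candidate points from the interior and from both boundary components.
candidates = [
    (-2.0, 1.0),
    (0.0, 0.0),
    (0.0, 4.0),
    (0.0, -4.0),
    (-math.sqrt(80 / 9), -8 / 3),
    (-math.sqrt(12), 2.0),
]

def in_domain(x, y, tol=1e-9):
    # D = {(x, y) : x^2 + y^2 <= 16, x <= 0}, with a small rounding tolerance.
    return x <= tol and x**2 + y**2 <= 16 + tol

values = {point: f(*point) for point in candidates if in_domain(*point)}
best_point = max(values, key=values.get)

print(best_point, values[best_point], 1264 / 27)
```

The largest value should be attained at the candidate with $y = -\tfrac{8}{3}$, matching the value $\tfrac{1264}{27}$ computed by hand.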
Main Ideas
If the Extreme Value Theorem applies, then all we need to do is find the critical points and evaluate
f at each. One is guaranteed to be the maximum, and one is guaranteed to be the minimum.
$\nabla f = \vec{0}$ will detect critical points on the interior, but not on the boundary.
We can rewrite the function on a boundary component using substitution. Set the derivative equal
to 0 to find critical points.
Derivatives will not detect maximums at the endpoints of a boundary curve. These must be
included in your set of critical points.
5.6.2
Q9
Suppose ∇f(4, 2) = ⟨−5, 11⟩. Where would you travel from (4, 2) to find higher values of f?
Q10
The function f (x, y) = |x|+|y| has its global minimum at (0, 0). Is this a critical point? Explain.
Q11
If (a, b) produces the minimum value of |∇f(x, y)|, must (a, b) be a critical point? Explain.
Q12
Suppose f(x) is a function of x with critical points x = a and x = b. Suppose g(y) is a function
of y with critical points y = c and y = d. What are the critical points of h(x, y) = f(x) +g(y)?
5.6.3
Q13
Find the critical points of $f(x, y) = x^4 + 4xy + y^4$.
Q14
Find the critical points of $g(x, y) = x^2 + y^2 - 3xy - 13x + 12y$.
5.6.4
Q15
If $(x_0, y_0)$ is a critical point and $f_{xx}(x_0, y_0) = 0$, can $(x_0, y_0)$ be a local maximum of $f$? What must be the value of $f_{xy}(x_0, y_0)$ if so?
Q16
For what values of $a$ does $f(x, y) = x^2 + y^2 + axy$ have a local minimum at the origin?
Section 5.6
Exercises
5.6.5
Q17
Find the critical points of $h(x, y) = x^2 y - x^2 - 2y^2$. Classify each as a local maximum, local minimum, or saddle point.
Q18
Find all critical points of $f(x, y) = \frac{1}{3}x^3 - 4xy + 2y^2$. Classify them as local maximums, local minimums, or saddle points.
Q19
Compute the critical points of $f(x, y) = 2x^3 - 12xy + 3y^2$ and classify each as a local maximum, local minimum, or saddle point.
Q20
Let $h(x, y) = x^2 + y^3 + 3xy$. Find the critical points of $h$, and classify each as a local maximum, local minimum or saddle point.
Q21
Let $f(x, y) = x^3 - 15x^2 - 9x + 12xy - 3y^2 - 18y$. Find the critical points of $f$ and classify each one as local maximum, local minimum or saddle point.
Q22
Let $f(x, y) = x^5 + 20xy + 5y^2$. Find the critical points of $f$ and classify each one as local maximum, local minimum or saddle point.
Q23
Find the critical points of $g(x, y) = e^{x^3 + y^2 - 12x + 10y}$. Classify each one as local maximum, local minimum or saddle point.
Q24
Find the critical points of $f(x, y) = \dfrac{1}{x^4 - x^2 y + y^2 + 10}$. Classify each one as local maximum, local minimum or saddle point.
5.6.6
Q25
Draw a sketch of $D = \{(x, y) : y \geq x^2, y \leq x^3\}$. State whether $D$ is closed and whether $D$ is bounded.
Q26
Draw a sketch of $D = \{(x, y) : y \geq x, y \leq 2x, xy < 1\}$. State whether $D$ is closed and whether $D$ is bounded.
Q27
Draw a sketch of $D = \{(x, y) : x > 0, y \geq x^4\}$. State whether $D$ is closed and whether $D$ is bounded.
Q28
Draw a sketch of $D = \{(x, y) : -1 < x^2 + y^2 \leq 16\}$. State whether $D$ is closed and whether $D$ is bounded.
Q29
Let $D = \{(x, y) : y \geq x^2\}$. Can the Extreme Value Theorem guarantee that $f$ has a maximum on $D$? Explain.
Q30
Does the function $f(x, y) = \dfrac{1}{x^2 + y^2}$ have a maximum and minimum value on the domain $D = \{(x, y) : -3 \leq x \leq 3, -4 \leq y \leq 4\}$? If yes, find them. If not, explain why the Extreme Value Theorem does not apply.
5.6.7
Q31
Draw a careful diagram of $D = \{(x, y) : y \geq x^2, x^2 + y^2 \leq 20\}$. Where would you need to check to guarantee you'd find the maximum value of a continuous function $f$ on $D$?
Q32
Let $f(x, y)$ be a differentiable function and let
\[ D = \{(x, y) : y \geq x^2 - 4, x \geq 0, y \leq 5\}. \]
a
Sketch the domain $D$.
b
Does the Extreme Value Theorem guarantee that $f$ has an absolute minimum on $D$? Explain.
c
List all the places you would need to check in order to locate the minimum.
Q33
Find the maximum and minimum value of $f(x, y) = e^{x + 3y}$ in the triangle with vertices $(0, 0)$, $(6, 0)$ and $(0, 3)$.
Q34
Find the maximum and minimum value of $f(x, y) = 3x + y$ on $D$, the closed region bounded by $y = x^2$ and $y = 16$.
Q35
Find the global max and min of $f(x, y) = x^3 - 12x + y^3 - 3y$ on the rectangle $0 \leq x \leq 4$ and $-2 \leq y \leq 2$.
Q36
Consider the function $g(x, y) = \dfrac{x^4 - 2x^2 + 2}{y^2 - 2y + 2}$ on the rectangle $-2 \leq x \leq 2$ and $0 \leq y \leq 3$.
Question 5.7.2
How Do We Solve a Constrained Optimization?
Theorem
Suppose an objective function f(x, y) and a constraint function g(x, y) are differentiable. The local
extremes of f(x, y) given the constraint g(x, y) = c occur where
∇f = λ∇g
for some number λ, or else where ∇g = 0. The number λ is called a Lagrange Multiplier.
This theorem generalizes to functions of more variables.
We can justify the theorem visually by examining the relationship between $\nabla f$, $\nabla g$ and the constraint. The constraint $g(x, y) = c$ is by definition a level curve of $g$. It is normal to $\nabla g$.
Figure: Where $\nabla f$ is not parallel to $\nabla g$, we can travel along $g(x, y) = c$ and increase the value of $f$. This is because $D_u f > 0$ for some $u$ along the constraint.
By this argument, the only place a maximum or minimum of the objective function can lie on the constraint is where $D_u f$ would have to be 0, because $\nabla f$ is parallel to $\nabla g$.
Remark
When $\nabla f(P)$ is parallel to $\nabla g(P)$ (and neither of these vectors is $\vec{0}$), the level curve of $f$ through $P$ is tangent to the level curve $g(x, y) = c$. If we can draw the level curves of $f$, this gives us a visual method of identifying the potential maximums and minimums.
Example 5.7.3
Find the point(s) on the ellipse $4x^2 + y^2 = 4$ on which the function $f(x, y) = xy$ is maximized.
The EVT and constraints
Are we guaranteed that a maximum exists at all? The Extreme Value Theorem can still be applied to
constraints. Here are a few ways we can identify that a constraint is closed:
1 A curve is closed if it includes its endpoints (or none exist).
2 A surface is closed if it includes its boundary (or none exists).
3 The level set of a continuous function is always closed.
Even armed with these, we still need to check that the domain is bounded.
Solution
We’ll check the conditions of the Extreme Value Theorem
1 $4x^2 + y^2 = 4$ is a curve with no endpoints, so it is closed.
2 $4x^2 + y^2 = 4$ is an ellipse. It stays within a bounded distance from the origin.
3 $f$ is continuous.
By the Extreme Value Theorem, we know that a maximum exists. We will use Lagrange multipliers to narrow down our search to the possible maximums. We set $g(x, y) = 4x^2 + y^2$ and compute the gradient vectors of $f$ and $g$.
\[ \nabla f(x, y) = \langle y, x \rangle \qquad \nabla g(x, y) = \langle 8x, 2y \rangle \]
The theorem allows two possibilities at a maximum.
1 $\nabla g(x, y) = \langle 0, 0 \rangle$. The only $(x, y)$ that satisfies this is $(0, 0)$. But $(0, 0)$ is not on the constraint, so it is not a valid solution.
2 $\nabla f = \lambda \nabla g$. We can factor the $\lambda$ across each component of the vectors, but that gives us two equations and three variables ($x$, $y$ and $\lambda$). We need another equation, and fortunately we have one. $x$ and $y$ must satisfy $4x^2 + y^2 = 4$ as well. Here is one (but not the only) way to solve this system of equations.
Example 5.7.4
The Maximum on a Surface
Solution
First note that the EVT applies, since a sphere is closed and bounded and f is continuous. To identify
potential maximums, we appeal to Lagrange multipliers.
Set $g(x, y, z) = x^2 + y^2 + z^2$. Then $\nabla g(x, y, z) = \langle 2x, 2y, 2z \rangle$. The case $\nabla g(x, y, z) = \vec{0}$ only occurs at the origin, which is not on the sphere. The critical points must be only the points where $\nabla f = \lambda \nabla g$.
\[ \nabla f(x, y, z) = \langle 4x^3 y^4 z, 4x^4 y^3 z, x^4 y^4 \rangle \]
Equating each coordinate gives us three equations, and the constraint is a fourth. We thus have a system of four equations and four variables.
\[ 4x^3 y^4 z = \lambda 2x \qquad 4x^4 y^3 z = \lambda 2y \qquad x^4 y^4 = \lambda 2z \qquad x^2 + y^2 + z^2 = 36 \]
The most obvious way to solve this algebraically is to solve for $\lambda$, but this requires us to divide by $x$, $y$ and $z$. We would need to remember that another possible solution is that $x$, $y$ or $z$ is 0. We can avoid this by multiplying and factoring instead. Multiplying the first equation by $yz$, the second by $xz$ and the third by $xy$ gives every equation the same right side.
\begin{align*}
4x^3 y^4 z &= \lambda 2x & 4x^4 y^3 z &= \lambda 2y & x^4 y^4 &= \lambda 2z \\
4x^3 y^5 z^2 &= \lambda 2xyz & 4x^5 y^3 z^2 &= \lambda 2xyz & x^5 y^5 &= \lambda 2xyz
\end{align*}
\begin{align*}
4x^3 y^5 z^2 &= 4x^5 y^3 z^2 & x^5 y^5 &= 4x^5 y^3 z^2 \\
4x^3 y^5 z^2 - 4x^5 y^3 z^2 &= 0 & x^5 y^5 - 4x^5 y^3 z^2 &= 0 \\
4x^3 y^3 z^2 (y - x)(y + x) &= 0 & x^5 y^3 (y - 2z)(y + 2z) &= 0
\end{align*}
Either $x = 0$, or $y = 0$, or $y = \pm x$ and $y = \pm 2z$. (If $z = 0$, the second factored equation forces $x = 0$ or $y = 0$, so that case is already covered.) In the last case $x = \pm 2z$, and we substitute into the constraint.
\begin{align*}
x^2 + y^2 + z^2 &= 36 \\
(\pm 2z)^2 + (\pm 2z)^2 + z^2 &= 36 \\
9z^2 &= 36 \\
z &= \pm 2 \\
x = (\pm 2)(\pm 2) &= \pm 4 \qquad y = (\pm 2)(\pm 2) = \pm 4
\end{align*}
This gives us 8 critical points: $(\pm 4, \pm 4, \pm 2)$. In addition every point in the $x = 0$ cross-section of the sphere is a critical point, as is every point in the $y = 0$ cross-section. This is infinitely many points to evaluate, but fortunately the algebra of our objective function allows us to evaluate these points in large batches.
\begin{align*}
\text{if } x = 0: \quad f(x, y, z) &= 0^4 y^4 z = 0 \\
\text{if } y = 0: \quad f(x, y, z) &= x^4 0^4 z = 0 \\
f(\pm 4, \pm 4, 2) &= (\pm 4)^4 (\pm 4)^4 (2) = 2^{17} \\
f(\pm 4, \pm 4, -2) &= (\pm 4)^4 (\pm 4)^4 (-2) = -2^{17}
\end{align*}
Thus the maximum value is $2^{17}$. It occurs at the four points $(\pm 4, \pm 4, 2)$.
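The eight isolated critical points can be checked mechanically. The sketch below uses exact integer arithmetic to confirm that each point $(\pm 4, \pm 4, \pm 2)$ lies on the sphere, that $\nabla f$ is parallel to $\nabla g$ there, and that the largest value of $f$ among them is $2^{17}$. The cross-product test `is_parallel` is a convenience introduced here, not the method used in the solution; it also counts the zero vector as parallel to everything.

```python
from itertools import product

def f(x, y, z):
    return x**4 * y**4 * z

def grad_f(x, y, z):
    return (4 * x**3 * y**4 * z, 4 * x**4 * y**3 * z, x**4 * y**4)

def grad_g(x, y, z):
    # Gradient of the constraint function g(x, y, z) = x^2 + y^2 + z^2.
    return (2 * x, 2 * y, 2 * z)

def is_parallel(u, v):
    # Two 3-vectors are parallel exactly when their cross product is zero.
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return cross == (0, 0, 0)

# All eight sign combinations of (4, 4, 2).
points = [(sx * 4, sy * 4, sz * 2) for sx, sy, sz in product((1, -1), repeat=3)]

on_sphere = all(x**2 + y**2 + z**2 == 36 for x, y, z in points)
all_parallel = all(is_parallel(grad_f(*p), grad_g(*p)) for p in points)
largest = max(f(*p) for p in points)

print(on_sphere, all_parallel, largest == 2**17)
```

Integers are used throughout so that the equality tests are exact rather than subject to rounding.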
Remark
If we hadn’t seen how to avoid dividing by x, y and z, we could have gone ahead and done the division.
Remember that when you divide while solving an equation, you obtain an extra solution where the divisor
is 0. This would lead us to check x = 0, y = 0 and z = 0 as we did in the factoring solution.
Synthesis 5.7.5
Using the Extreme Value Theorem and Lagrange Multipliers
How can Lagrange multipliers help us find the maximum of $f(x, y) = x^2 + 2y^2 - x^2 y$ on the domain $D = \{(x, y) : x^2 + y^2 \leq 16, x \leq 0\}$?
Solution
We can continue Example 5.6.7. After finding the critical points of $f$ at $(0, 0)$ and $(-2, 1)$, we turn to the boundaries. The boundaries are level curves.
For $x^2 + y^2 = 16$, set $g(x, y) = x^2 + y^2$. We have
\[ \nabla f(x, y) = \langle 2x - 2xy, 4y - x^2 \rangle \qquad \nabla g(x, y) = \langle 2x, 2y \rangle \]
$\nabla g(x, y) = \vec{0}$ only at the origin, which isn't on the constraint. So we solve $\nabla f(x, y) = \lambda \nabla g(x, y)$ and $g(x, y) = 16$.
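\[ 2x - 2xy = \lambda 2x \qquad 4y - x^2 = \lambda 2y \qquad x^2 + y^2 = 16 \]
\begin{align*}
2x - 2xy - 2\lambda x &= 0 \\
2x(1 - y - \lambda) &= 0
\end{align*}
If $x = 0$:
\begin{align*}
0^2 + y^2 &= 16 \\
y &= \pm 4
\end{align*}
If $1 - y - \lambda = 0$:
\begin{align*}
\lambda &= 1 - y \\
4y - x^2 &= (1 - y)2y \\
2y^2 + 2y &= x^2 \\
(2y^2 + 2y) + y^2 &= 16 \\
3y^2 + 2y - 16 &= 0 \\
(3y + 8)(y - 2) &= 0
\end{align*}
\begin{align*}
\text{if } y &= -\tfrac{8}{3} & \text{if } y &= 2 \\
x^2 + \left(-\tfrac{8}{3}\right)^2 &= 16 & x^2 + 2^2 &= 16 \\
x^2 + \tfrac{64}{9} &= \tfrac{144}{9} & x^2 &= 12 \\
x^2 &= \tfrac{80}{9} & x &= \pm\sqrt{12} \\
x &= \pm\sqrt{\tfrac{80}{9}}
\end{align*}
The critical points are $(0, \pm 4)$, $\left(-\sqrt{12}, 2\right)$ and $\left(-\sqrt{\tfrac{80}{9}}, -\tfrac{8}{3}\right)$. The solutions with positive $x$ are not in $D$.
On $x = 0$, substitution is probably the easier choice, but Lagrange multipliers are still possible. $x = 0$ is a level set of the function $g(x, y) = x$.
\[ \nabla g(x, y) = \langle 1, 0 \rangle \]
$\nabla g \neq \vec{0}$, so we solve $\nabla f(x, y) = \lambda \nabla g(x, y)$.
\[ 2x - 2xy = \lambda \qquad 4y - x^2 = 0 \qquad x = 0 \]
\[ 4y = 0 \]
This is the same equation we obtained by substituting $x = 0$ into $f$ and differentiating.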
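As a check on the circle component, the sketch below plugs each critical point on $x^2 + y^2 = 16$ back into the three Lagrange equations and reports the largest residual. The multiplier values are assumptions derived here rather than stated in the solution: $\lambda = 1 - y$ for the points with $x \neq 0$, and $\lambda = 2$ for $(0, \pm 4)$, obtained from $4y = 2\lambda y$.

```python
import math

def lagrange_residuals(x, y, lam):
    # Residuals of grad f = lam * grad g and of the constraint x^2 + y^2 = 16,
    # for f(x, y) = x^2 + 2y^2 - x^2 y and g(x, y) = x^2 + y^2.
    return (
        (2 * x - 2 * x * y) - lam * 2 * x,
        (4 * y - x**2) - lam * 2 * y,
        x**2 + y**2 - 16,
    )

# (x, y, lambda) triples for the critical points that lie in D.
solutions = [
    (0.0, 4.0, 2.0),
    (0.0, -4.0, 2.0),
    (-math.sqrt(12), 2.0, -1.0),
    (-math.sqrt(80 / 9), -8 / 3, 11 / 3),
]

worst = max(abs(r) for s in solutions for r in lagrange_residuals(*s))
print(worst)
```

A residual near zero for every triple indicates that the points satisfy the system up to floating-point rounding.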
Synthesis 5.7.6
The Gradient on the Boundary
P may be a local minimum but may not be. The directional derivative along the boundary is 0, so f
could curve upward or downward along the boundary. If f curves downward we could find lower values
of f nearby and P would not be a minimum. If f curves upward, then P would be a minimum. We
could compute this curvature by taking the substituted version of f that we used to solve for P and
computing its second derivative at P .
On the other hand, if we suppose that ∇f(P ) points out of D, then f decreases as we travel into
D, and P cannot be a local minimum. It may or may not be a local maximum.
Question 5.7.7
Can Lagrange Multipliers Apply to More Than One Constraint?
If we have two constraints in three-space, g(x, y, z) = c and h(x, y, z) = d, then their intersection
is generally a curve.
Figure: The intersection of the constraints g(x, y, z) = c and h(x, y, z) = d
According to our earlier argument about directional derivatives, at a maximum P on the constraint,
∇f(P ) must be normal to the constraint. There are more ways for this to happen with two constraint
equations.
1 ∇f(P ) could be parallel to ∇g(P ).
2 ∇f(P ) could be parallel to ∇h(P ).
3 ∇f(P ) could be the vector sum of a vector parallel to ∇g(P ) and a vector parallel to ∇h(P ).
You should look at Figure 380 to convince yourself that these ∇f(P ) would all be normal to the
constraint. We can express this condition algebraically
Theorem
Suppose $f(x, y, z)$ is a differentiable function and $g(x, y, z) = c$ and $h(x, y, z) = d$ are two constraints. If $P$ is a maximum of $f(x, y, z)$ among the points that satisfy these constraints then either
\[ \nabla f(P) = \lambda \nabla g(P) + \mu \nabla h(P) \]
for some scalars $\lambda$ and $\mu$, or $\nabla g(P)$ and $\nabla h(P)$ are parallel.
This system of equations is usually difficult to solve by hand.
Remark
You can check the reasonableness of this method by noting that it gives us a system of 5 variables, x,
y, z, λ, µ, and five equations:
f_x(x, y, z) = λ g_x(x, y, z) + µ h_x(x, y, z)
f_y(x, y, z) = λ g_y(x, y, z) + µ h_y(x, y, z)
f_z(x, y, z) = λ g_z(x, y, z) + µ h_z(x, y, z)
g(x, y, z) = c
h(x, y, z) = d
We therefore generally expect this system to have a finite number of solutions, though there are plenty
of counterexamples to this expectation.
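As an illustration of the five-equation system, here is a minimal sketch on a hypothetical problem that does not come from the text: maximize f(x, y, z) = z subject to x^2 + y^2 = 1 and x + y + z = 1. Solving by hand, the z-equation gives µ = 1, the x- and y-equations give x = y = −1/(2λ), and the first constraint then forces x = y = ±1/√2. The code only verifies those hand-derived candidates; it is not a general solver.

```python
import math

# HYPOTHETICAL problem: f = z, g = x^2 + y^2 (= 1), h = x + y + z (= 1).
def grad_f(x, y, z):
    return (0.0, 0.0, 1.0)

def grad_g(x, y, z):
    return (2 * x, 2 * y, 0.0)

def grad_h(x, y, z):
    return (1.0, 1.0, 1.0)

def residuals(x, y, z, lam, mu):
    """Left side minus right side for each of the five equations."""
    gf, gg, gh = grad_f(x, y, z), grad_g(x, y, z), grad_h(x, y, z)
    eqs = [gf[i] - lam * gg[i] - mu * gh[i] for i in range(3)]
    eqs.append(x * x + y * y - 1.0)   # g(x, y, z) = c
    eqs.append(x + y + z - 1.0)       # h(x, y, z) = d
    return eqs

def candidates():
    """Hand-derived solutions (x, y, z, lam, mu) of the system."""
    s = 1 / math.sqrt(2)
    points = []
    for x in (s, -s):
        y = x
        z = 1 - x - y            # from the plane constraint
        lam = -1 / (2 * x)       # from 0 = 2*lam*x + mu with mu = 1
        points.append((x, y, z, lam, 1.0))
    return points

for p in candidates():
    print(p[:3], "f =", p[2], "max residual =", max(abs(r) for r in residuals(*p)))
```

The system has exactly two solutions here, matching the expectation of finitely many. Comparing f at the two candidates picks out the maximum, 1 + √2, and the minimum, 1 − √2.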
Section 5.7
Summary Questions
Q1
What is a constraint?
Q2
What equations do you write when you apply the method of Lagrange multipliers?
Q3
Is the set of points that satisfies a constraint closed and bounded? Explain.
Q4
How does a constraint arise when finding the maximum over a closed and bounded domain?
Section 5.7
Exercises
5.7.1
Q5
Suppose we have $230 to spend on three goods. Good 1 costs $13 per unit. Good 2 costs $22
per unit. Good 3 costs $11 per unit. Write a budget constraint that expresses what purchases
(x, y, z) of good 1, good 2, and good 3 are possible if you spend your entire budget.
Q6
Suppose the maximum value of f(x, y) occurs at (3, −4). Where is the maximum value of f(x, y)
that satisfies the constraint x^2 + y^2 = 25? Explain.
5.7.2
Q7
Suppose f(x, y, z) is a smooth function. Suppose the maximum value of f on the sphere
x^2 + y^2 + z^2 = 25 occurs at P. What can you say about ∇f(P) and the tangent plane to the sphere
at P?
Q8
Suppose the curve below is the graph of g(x, y) = k. Use methods from calculus to find and
mark the approximate location of the point that maximizes the function f(x, y) = 3y − x subject
to the constraint g(x, y) = k. Justify your reasoning in a few sentences.
Q9
Suppose that (a, b) is a local maximum of the smooth function f(x, y) which also happens to
satisfy the constraint g(a, b) = k.
a
Is (a, b) also a local maximum of f among the points on the constraint? Explain.
b
If we used Lagrange multipliers to detect (a, b), what would we expect λ to be equal to at
that point?
Q10
Show that (3, 3) is not a local maximum of f(x, y) = 2x^2 − 4xy + y^2 − 8x on the graph
x^3 + y^3 = 6xy.
5.7.3
Q11
Compute the maximum value of y − x^2 on the constraint x^2 + y^2 = 4.
Q12
Refer to your “Maximums on a Constraint” worksheet.
a
What system of equations would you set up to find the critical points of f on the constraint
p(x, y) = c?
b
Can you solve it?
c
Which was easier, using Lagrange or using substitution?
5.7.4
Q13
Find the maximum value of f(x, y, z) = xyz on the sphere x^2 + y^2 + z^2 = 36.
Q14
Find the maximum value of f(x, y, z) = xz on the sphere x^2 + y^2 + z^2 = 36.
Q15
Find the maximum value of f(x, y, z) = 3y + 2z on the ellipsoid 25x^2 + y^2 + 4z^2 = 100.
Q16
The function h(x, y, z) = x^2 + y^2 + z^2 has a minimum value on the plane 3x + 5y − 2z = 30.
Compute it.
5.7.5
Q17
Suppose f(x, y) is differentiable but has no critical points. Will the method of Lagrange multipliers
detect the maximum value of f in D = {(x, y) : x^2 + y^2 ≤ 49}? Explain.
Q18
Consider the following two questions:
Find the maximum value of f(x, y) that satisfies x^2 + y^2 ≤ 9.
Find the maximum value of f(x, y) that satisfies x^2 + y^2 = 9.
a
How are the questions different?
b
Which question takes less work to solve? Explain how you know.
c
Do solutions exist to both questions? What additional information would guarantee that
they do?
Q19
Let D = {(x, y) : x^2 + y^2 ≤ 1, x ≥ 0, y ≤ 0}. Find the maximum and minimum values of
f(x, y) = x^2 − y on D.
Q20
Consider the function f(x, y) = x^2 + 6xy + 9y^2 + 5. Find the maximum and minimum values of
f on the domain D = {(x, y) : y ≥ x, x ≥ 0, x^2 + y^2 ≤ 10}.
Q21
Let D = {(x, y) : x^2 + y^2 ≤ 20, y ≥ −x}. Find the maximum and minimum values of
f(x, y) = x^4 y on D.
Q22
Let D = {(x, y) : x^2 + y^2 ≤ 25, y ≥ x + 1, y ≥ 0}. Find the maximum and minimum values of
f(x, y) = x^3 y^2 on D.
Q23
Let D = {(x, y) : x^2 + y^2 ≤ 20, y ≥ −x}. Find the maximum and minimum values of
f(x, y) = x^4 y on D.
Q24
Let D = {(x, y) : x^2/16 + y^2/64 ≤ 1, x ≥ 0}. Find the points in D that obtain the maximum and
minimum values of f(x, y) = 2x + 3y.
5.7.6
Q25
Suppose the maximum of f (x, y) on
D = {(x, y) | g(x, y) ≤ c}
occurs at P on the boundary of D. We know that ∇f(P ) points out of D. What does this tell
us about the sign of λ?
Q26
Explain why knowing which way ∇f points is not useful for ruling out potential maximums given
a domain of the form g(x, y) = c.
5.7.7
Q27
How does the method of Lagrange multipliers suggest we solve for the maximum value of f(x, y)
on the constraints x + y = 1 and x − y = 0? Do we need to know what f is to solve this? Why
shouldn’t that bother us?
Q28
Write a system of equations that one would solve to find the maximum and minimum values of
f(x, y, z) = x on the two constraints y^2 + z^2 = 25 and x + y + z = 1.
Synthesis and Extension
Q29
Consider the plane p with normal equation 7x + 6y − 3z − 42 = 0.
a
Use Lagrange multipliers to find the point A on p that is closest to the origin O.
b
Show that the vector OA, from O to A, is a normal vector to p.
c
Show how you can use the observation in part b to solve for the closest point (A) without using
calculus.
Q30
Determine the smallest rectangle (parallel to the x and y axes) that contains the ellipse
x^2 + 3xy + 4y^2 − 4x − 13y + 4 = 0.
>