<
Advanced Calculus For Data Science
Mike Carr
Contents
Introduction
1 Review of Algebra and Calculus
1.1 Graphs of Functions
1.2 Limits and Derivatives
1.3 Applications of Derivatives
1.4 Definite Integrals
2 Advanced Integration and Applications
2.1 Area Between Curves
2.2 Volumes
2.3 Integration by Parts
2.4 Approximate Integration
2.5 Improper Integrals
2.6 Probability
2.7 Functions of Random Variables
3 Series
3.1 Taylor Polynomials
3.2 Sequences
3.3 Series
3.4 Power Series
3.5 Taylor Series
4 Multivariable Functions
4.1 Three-Dimensional Coordinate Systems
4.2 Functions of Several Variables
4.3 Limits and Continuity
4.4 Partial Derivatives
4.5 Linear Approximations
5 Vectors in Calculus
5.1 Vectors
5.2 The Dot Product
5.3 Normal Equations of Planes
5.4 The Gradient Vector
5.5 The Chain Rule
5.6 Maximum and Minimum Values
5.7 Lagrange Multipliers
6 Multivariable Integration
6.1 Double Integrals
6.2 Double Integrals over General Regions
6.3 Joint Probability Distributions
6.4 Triple Integrals
So far in calculus you have developed the tools to answer the following questions about a function
of one variable:
1 How quickly does the value of the function change as the input changes?
2 How do we estimate the value of the function near a point?
3 What are the maximum and minimum values of the function?
4 What is the area under the graph of the function? What does it mean?
These are all useful tools, but they don’t necessarily apply to the types of data that we encounter in
the world.
Data generally takes the form of a set of observations, rather than an algebraic function. How do
we perform calculus with such a set? We cannot integrate it without an antiderivative. In some cases,
the best functions to model our data are difficult to work with. We take for granted that sin x is a
Chapter 1
Review of Algebra and Calculus
This chapter reviews the most important information about functions, limits, derivatives, and integrals.
It is not meant to teach this material to a first-time learner, but can serve as a reference or reminder.
Contents
1.1 Graphs of Functions
1.2 Limits and Derivatives
1.3 Applications of Derivatives
1.4 Definite Integrals
Section 1.1
Goals:
1 Graph algebraic and trigonometric functions.
2 Solve equations using inverse functions.
3 Solve equations containing quotients.
4 Graph transformations of functions.
Definition
The graph of an equation is the set of ordered pairs (x, y) that satisfy the equation. These are the
points whose coordinates, when plugged in for x and y, make the two sides of the equation equal.
Linear Functions
Linear functions can be written in slope-intercept form:
f(x) = mx + b.
The graph y = mx + b of a linear function is a line.
m is the slope, which is the change in y over the change in x between any two points on the line.
(0, b) is the y intercept.
If we have the slope and a known point $(x_0, y_0)$ on a line, we can write its equation in point-slope
form:
$$y - y_0 = m(x - x_0)$$
If we have both the x- and y-intercepts of the line, it is convenient to write it in normal form
ax + by + c = 0
Solution
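$$\begin{aligned}
2x^2 - 3x - 5 &= 0 && \text{set numerator} = 0\\
(2x - 5)(x + 1) &= 0 && \text{factor}\\
x &= \tfrac{5}{2} \text{ or } x = -1
\end{aligned}$$
Then we must check that neither of these causes the denominator to be 0.
$$\left(\tfrac{5}{2}\right)^2 + 3\left(\tfrac{5}{2}\right) + 2 = \tfrac{63}{4} \qquad\qquad (-1)^2 + 3(-1) + 2 = 0$$
So $x = \frac{5}{2}$ is the only solution.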
If there are terms besides the quotient, move them all to the same side of the equation and use a
common denominator to combine them.
Example
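Solve $2 + \dfrac{x + 3}{x + 1} = \dfrac{4}{x}$.
Solution
$$\begin{aligned}
2 + \frac{x + 3}{x + 1} - \frac{4}{x} &= 0 && \text{move to one side}\\
\frac{2x^2 + 2x}{x^2 + x} + \frac{x^2 + 3x}{x^2 + x} - \frac{4x + 4}{x^2 + x} &= 0 && \text{common denominator}\\
\frac{3x^2 + x - 4}{x^2 + x} &= 0 && \text{combine}\\
3x^2 + x - 4 &= 0 && \text{set numerator} = 0\\
(3x + 4)(x - 1) &= 0 && \text{factor}\\
x &= -\tfrac{4}{3} \text{ or } x = 1
\end{aligned}$$
Then we must check that neither of these causes the denominator to be 0.
$$\left(-\tfrac{4}{3}\right)^2 + \left(-\tfrac{4}{3}\right) = \tfrac{4}{9} \qquad\qquad 1^2 + 1 = 2$$
Both solutions are valid: $x = -\frac{4}{3}$ or $x = 1$.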
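The check in this example can also be done by machine. The following is a minimal sketch in Python, using exact rational arithmetic from the standard library so that no rounding is involved. The helper names `lhs_minus_rhs` and `is_valid_solution` are illustrative and do not come from the text.

```python
from fractions import Fraction

def lhs_minus_rhs(x):
    """Return 2 + (x + 3)/(x + 1) - 4/x, computed exactly."""
    return 2 + (x + 3) / (x + 1) - 4 / x

def is_valid_solution(x):
    """A candidate is valid if no denominator is 0 and the equation holds."""
    if x == 0 or x + 1 == 0:
        return False  # the original equation is undefined here
    return lhs_minus_rhs(x) == 0

# Candidates found by factoring the numerator 3x^2 + x - 4.
candidates = [Fraction(-4, 3), Fraction(1)]
results = {str(x): is_valid_solution(x) for x in candidates}
print(results)
```

Both candidates pass, which agrees with the hand check. A value such as x = −1, which makes a denominator 0, is rejected before the equation is evaluated.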
Q16
Graph $y = -2\sqrt{x + 1} + 4$.
1.1.2
Q17
Solve for x: $\dfrac{x^2 + 5x - 6}{x - 1} = 0$
Q18
Solve for x: $\dfrac{e^x - 2}{x^2 + 2x - 3} = 0$
Q19
Solve for x: $\dfrac{3x^2 - 5}{2e^x - 7} = 0$
Q20
Solve for t: $\dfrac{\ln t - 4}{3 - t} = 0$
Q21
Solve for x: $\dfrac{\ln x - 4}{3 - x} = 0$
Q22
Solve for x: $\dfrac{3}{x + 2} = \dfrac{7}{x + 4}$
Q23
Solve for u: $\dfrac{5}{(u + 1)^2} = \dfrac{u}{u + 1}$
Solution
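$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} && \text{definition of derivative}\\
&= \lim_{h \to 0} \frac{(x + h)^2 + 2(x + h) - x^2 - 2x}{h} && \text{plug in } x \text{ and } x + h\\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 + 2x + 2h - x^2 - 2x}{h} && \text{distribute}\\
&= \lim_{h \to 0} \frac{2xh + h^2 + 2h}{h} && \text{cancel}\\
&= \lim_{h \to 0} (2x + h + 2) && \text{functions agree except at } h = 0 \text{, so limits are equal}\\
&= 2x + 0 + 2 && \text{limit = value for a continuous function}\\
&= 2x + 2
\end{aligned}$$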
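The limit can be illustrated numerically. The sketch below assumes the function in this example is f(x) = x² + 2x, as the computation indicates, and evaluates the difference quotient at x = 3 for shrinking h. The helper name `difference_quotient` is illustrative.

```python
def f(x):
    return x**2 + 2 * x

def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

x = 3.0
for h in (1e-1, 1e-3, 1e-5):
    # Algebraically the quotient equals 2x + h + 2, so it approaches
    # f'(3) = 2(3) + 2 = 8 as h shrinks.
    print(h, difference_quotient(f, x, h))
```

The printed values approach 8 from above, matching the simplified expression 2x + h + 2 before the limit is taken.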
Theorem
If $f'(x) > 0$ for all x in some interval [a, b], then f(x) is increasing on [a, b].
If $f'(x) < 0$ for all x in [a, b], then f(x) is decreasing on [a, b].
We can take higher order derivatives by taking derivatives of derivatives. The derivative function
of f in this context is called the first derivative. Its derivative function is the second derivative. The
second derivative’s derivative function is the third derivative and so on.
Notation
The following notations are used for higher order derivatives.
name                prime notation    Leibniz notation
first derivative    $f'(x)$           $\frac{df}{dx}$
second derivative   $f''(x)$          $\frac{d^2f}{dx^2}$
third derivative    $f'''(x)$         $\frac{d^3f}{dx^3}$
fourth derivative   $f^{(4)}(x)$      $\frac{d^4f}{dx^4}$
fifth derivative    $f^{(5)}(x)$      $\frac{d^5f}{dx^5}$
Theorem
The following rules allow us to differentiate functions made of simpler functions whose derivatives we
know.
Sum Rule: $(f(x) + g(x))' = f'(x) + g'(x)$
Constant Multiple Rule: $(cf(x))' = cf'(x)$
Product Rule: $(f(x)g(x))' = f'(x)g(x) + g'(x)f(x)$
Quotient Rule: $\left(\dfrac{f(x)}{g(x)}\right)' = \dfrac{f'(x)g(x) - g'(x)f(x)}{(g(x))^2}$ unless $g(x) = 0$
Chain Rule: $(f(g(x)))' = f'(g(x))g'(x)$
Example
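Compute $\dfrac{d}{dx}\tan(x)$.
Solution
$\tan x = \dfrac{\sin x}{\cos x}$. We apply the quotient rule.
$$\begin{aligned}
(\tan x)' &= \frac{(\sin x)' \cos x - (\cos x)' \sin x}{\cos^2 x} && \text{quotient rule}\\
&= \frac{\cos^2 x + \sin^2 x}{\cos^2 x}\\
&= \frac{1}{\cos^2 x} && \text{Pythagorean identity}\\
&= \sec^2 x
\end{aligned}$$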
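A derivative formula obtained from the rules can be sanity-checked against a numerical estimate. The sketch below compares a central difference approximation of the derivative of tan x with 1/cos² x at a few sample points. The helper `central_difference` and the step size `h = 1e-6` are assumptions made for illustration, not part of the text.

```python
import math

def central_difference(f, x, h=1e-6):
    """Estimate f'(x) from the values of f at x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.0, 0.5, 1.0):
    numeric = central_difference(math.tan, x)
    exact = 1 / math.cos(x) ** 2  # sec^2 x, from the quotient rule
    print(x, numeric, exact)
```

The two columns should agree to several decimal places. Agreement at sample points does not prove the formula, but a disagreement would signal an algebra mistake.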
1.2.3
Q9
Compute $\lim_{x \to 3} \dfrac{x - 3}{x^2 - 9}$
Q10
Compute $\lim_{x \to 1} \dfrac{x^2 - 4x + 3}{x - 1}$
Q11
Compute $\lim_{x \to 9} \dfrac{2x - 18}{\sqrt{x} - 3}$
Q12
Compute $\lim_{x \to 4} \dfrac{\frac{1}{x^2} - \frac{1}{16}}{x - 4}$
1.2.4
Q13
Explain why sin x = 2x − 1 has a solution in [0, 1].
Q14
Explain why $\sqrt[3]{x} = \log_2 x$ has a solution in [0, 8].
Q15
What does the Intermediate Value Theorem say about whether $f(x) = \dfrac{1}{x} - \dfrac{1}{2}$ has a root in
[−1, 1]?
Q16
Consider the equation $\sin x = \frac{3}{4}$. Gloria computes $\sin\frac{\pi}{3} = \frac{\sqrt{3}}{2}$ and $\sin\frac{5\pi}{6} = \frac{1}{2}$. Since $\frac{3}{4}$ is not
between $\frac{1}{2}$ and $\frac{\sqrt{3}}{2}$, she concludes that $\sin x = \frac{3}{4}$ has no roots in $\left[\frac{\pi}{3}, \frac{5\pi}{6}\right]$. What do you think
of Gloria’s reasoning?
1.2.5
Q17
Compute $\lim_{x \to \infty} \dfrac{x^2 + 2x - 9}{3x - 6}$.
Q18
Compute $\lim_{x \to \infty} \dfrac{4x^2 - 7x + 9}{2x^2 + 11}$.
Q19
Compute $\lim_{x \to \infty} \sqrt{e^{1/x}}$.
Q20
Compute $\lim_{x \to \infty} \dfrac{1}{\ln x}$.
1.2.7
Q29
Use derivative rules to differentiate each of the following functions.
a $5x^7 - 3x^2 + \dfrac{5}{x^2}$
b $\dfrac{4x^5 - 2x^2 + 3x + 4}{x}$
c $(x^2 + 2x)\sin x$
d $\dfrac{e^x}{x^2}$
e $\sqrt{x - 5}$
f $\cos(4x)$
g $\sin(e^x)$
h $(x^2 + 5x + 4)^{60}$
i $e^{x^2}\sin x$
j $\dfrac{\ln(x^2 + 2)}{x^2 + 3x}$
Q30
Use derivative rules to differentiate each of the following functions.
a $\dfrac{3}{x} + \dfrac{7}{x^3}$
b $\dfrac{5x^4 + 3x^3 - 8x^2}{x^2}$
c $\dfrac{\ln x}{x}$
d $4^x \sin(x)$
e $\tan(2x + 7)$
f $e^{3x + 2}$
g $\cos(x^3 + 2x)$
h $\dfrac{5}{(\cos x)^3}$
i $e^{x^2}\sin^3 x$
j $\ln(\sqrt{x}\,\sin x)$
Q31
Let $f(x) = \sin(3x)$. Compute $f'''(x)$.
Q32
Let $f(x) = e^{x^3}$. Compute $f''(x)$.
1.2.8
Q33
Where in its domain is the function $f(x) = x^3 - x^2$ increasing?
Question 1.3.3
Does a Function Always Have a Maximum?
No. Many functions don’t have maximums, because as x gets larger and larger the values of f(x)
increase or decrease without bound. However, if we restrict the domain, we can sometimes guarantee a
maximum.
Theorem [The Extreme Value Theorem]
If f(x) is a continuous function on a closed domain [a, b] then f has an absolute maximum and an
absolute minimum on [a, b].
Remark
When the EVT applies, we can find the absolute maximum and minimum by process of elimination. A
maximum exists, so it must occur at a critical point. We can find the critical points and evaluate f at
each of them. Whichever has the greatest value is the maximum.
Note that a and b are always critical points because the derivative does not exist there. There is no
limit from the left at a because those points are outside the domain of f. Similarly, there is no limit
from the right at b.
Example
Compute the maximum and minimum values of $f(x) = 8x^3 - x^4$ on the domain [2, 8], if they exist.
Solution
f(x) is continuous and [2, 8] is closed, so the EVT guarantees that a maximum and minimum exist.
The first derivative test says that they can only occur at critical points.
$$\begin{aligned}
f'(x) &= 24x^2 - 4x^3 && \text{compute first derivative}\\
0 &= 24x^2 - 4x^3 && \text{set equal to } 0\\
0 &= 4x^2(6 - x) && \text{factor}\\
x &= 0 \text{ or } x = 6
\end{aligned}$$
x = 0 is not in the domain, so we discard it. On the other hand x = 2 and x = 8 are also critical points
because the derivative does not exist there. To find which critical point is the maximum and which is
the minimum, we plug each into f and compare.
f(2) = (8)(8) − 16 = 48
f(6) = (8)(216) − 1296 = 432 (maximum)
f(8) = (8)(512) − 4096 = 0 (minimum)
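The process of elimination described in the remark is mechanical enough to sketch in code. The following Python snippet assumes the interior critical points have already been found by hand (here x = 0 and x = 6) and compares f at the endpoints and at each critical point that lies inside the interval. The function name `closed_interval_extrema` is illustrative.

```python
def f(x):
    return 8 * x**3 - x**4

def closed_interval_extrema(f, critical_points, a, b):
    """Compare f at the endpoints and at critical points inside (a, b)."""
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = {c: f(c) for c in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

# x = 0 solves f'(x) = 0 but lies outside [2, 8], so it is filtered out.
maximum, minimum = closed_interval_extrema(f, [0, 6], 2, 8)
print(maximum, minimum)  # prints (6, 432) (8, 0)
```

This sketch relies on the Extreme Value Theorem for correctness: it only compares candidates, so it gives a meaningful answer when f is continuous on the closed interval.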
1.3.4
Q13
Evaluate $\lim_{x \to 0^+} \dfrac{x\cos(x - \pi)}{e^x - 1}$.
Q14
Evaluate $\lim_{x \to 0^+} \dfrac{e^{-3x} + 3x - 1}{\sin(x^2)}$.
Q15
Evaluate $\lim_{x \to \infty} \dfrac{x \ln x}{x^{5/2} + 3}$.
Q16
Evaluate $\lim_{x \to -\infty} e^x x^2$.
Section 1.4
Goals:
1 Express areas under a graph and antiderivatives using integral notation.
2 Derive antiderivatives from known derivatives.
3 Compute general antiderivatives.
4 Compute definite integrals using the Fundamental Theorem of Calculus.
5 Use u-substitution to compute integrals where necessary.
By definition, integrals compute area under a graph. The Fundamental Theorem of Calculus connects
integrals to antiderivatives, meaning that integrals can also be used to compute total change, given a
rate of change function.
Question 1.4.1
What Is an Antiderivative?
Definition
F(x) is an antiderivative of a function f(x) if $F'(x) = f(x)$.
Every derivative we know also tells us an antiderivative.
Example
$\dfrac{d}{dx}\left(\dfrac{x^2}{2} + 5\right) = x$, so $F(x) = \dfrac{x^2}{2} + 5$ is an antiderivative of $f(x) = x$.
Notice that $\dfrac{x^2}{2} + 2$, $\dfrac{x^2}{2} - 6$, and $\dfrac{x^2}{2}$ are also antiderivatives of $f(x) = x$.
Functions have infinitely many antiderivatives. Adding a constant to one antiderivative produces
another, since the derivative of a constant is 0. In fact, this is the only relationship between
antiderivatives.
Theorem
If F(x) and G(x) are antiderivatives of f(x), then there is a constant c such that
F(x) = G(x) + c.
Since the antiderivatives are related this way, it is easy to express all of the antiderivatives of a
function at once.
Question 1.4.4
How Do We Compute the Area Under a Graph?
Definition
We define the definite integral of f(x) over [a, b] to be
$$\int_a^b f(x)\,dx = \lim_{|\Delta x| \to 0} \sum f(x_i^*)\,\Delta x_i$$
where the limit is taken over all divisions of [a, b], $\Delta x_i$ is the length of the ith subinterval, $x_i^*$ is a point
in the ith subinterval and $|\Delta x|$ is the largest $\Delta x_i$.
Notice there is no requirement that the subintervals be the same length. Because of this, we don’t
take a limit as n approaches ∞. For instance, using a large number of rectangles over $\left[a, \frac{a+b}{2}\right]$ and only
a single rectangle over $\left[\frac{a+b}{2}, b\right]$ will not give us a good approximation, no matter how many rectangles
we use. Instead we take a limit as the largest $\Delta x_i$ approaches 0.
In practice, we get the same limit whether the subintervals are of equal length or not. It is common
to use the same $\Delta x = \frac{b - a}{n}$ for each subinterval.
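The role of the largest subinterval can be seen numerically. The sketch below is an illustration under stated assumptions: it uses f(x) = x² on [0, 1] (exact integral 1/3), left endpoints as test points, and two hypothetical helpers, `uniform_partition` and `lopsided_partition`. The lopsided partition puts n subintervals on the left half and a single subinterval on the right half, as in the scenario described above.

```python
def riemann_sum(f, partition):
    """Left-endpoint Riemann sum of f over the subintervals of a partition."""
    total = 0.0
    for left, right in zip(partition, partition[1:]):
        total += f(left) * (right - left)  # f(x_i^*) * (Delta x_i)
    return total

def uniform_partition(a, b, n):
    """n subintervals of equal length (b - a) / n."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def lopsided_partition(a, b, n):
    """n equal subintervals on [a, (a+b)/2], then one on [(a+b)/2, b]."""
    mid = (a + b) / 2
    return [a + (mid - a) * i / n for i in range(n + 1)] + [b]

def square(x):
    return x * x

for n in (10, 100, 1000):
    print(n,
          riemann_sum(square, uniform_partition(0.0, 1.0, n)),
          riemann_sum(square, lopsided_partition(0.0, 1.0, n)))
```

The uniform sums approach 1/3 as n grows. The lopsided sums settle near 1/6 instead, because the largest subinterval keeps length 1/2 no matter how large n becomes.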
The definite integral almost solves our area problem, but wherever f(x) < 0, the product $f(x_i^*)\,\Delta x_i$
will be negative.
Theorem
If f(x) > 0 on [a, b] then $\int_a^b f(x)\,dx$ computes the area under y = f(x) over [a, b]. In general,
$\int_a^b f(x)\,dx$ computes the signed area between y = f(x) and the x-axis, where area above the axis
counts as positive, and area below the axis counts as negative.
Since integrals are limits, they inherit two laws from limits. The third can be taken from geometry,
setting the area of a region equal to the sum of the areas of two subregions.
Integral Laws
$$\int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \qquad \text{(Sum Rule)}$$
$$\int_a^b c f(x)\,dx = c\int_a^b f(x)\,dx \qquad \text{(Constant Multiple Rule)}$$
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \qquad \text{(Union Rule)}$$
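These laws can be spot-checked with numerical approximations. The sketch below is only an illustration: it approximates each integral by a midpoint Riemann sum, picks sin and exp as sample integrands on [0, 2], and writes the constant multiple as `k` so that it is not confused with the split point `c` in the Union Rule. All of these choices are assumptions made for the example.

```python
import math

def midpoint_sum(f, a, b, n=10_000):
    """Riemann sum with n equal subintervals and midpoints as test points."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

f, g = math.sin, math.exp
a, c, b = 0.0, 1.0, 2.0
k = 3.0

# Each gap is (left side) - (right side) of the corresponding law.
sum_gap = midpoint_sum(lambda x: f(x) + g(x), a, b) - (
    midpoint_sum(f, a, b) + midpoint_sum(g, a, b))
multiple_gap = midpoint_sum(lambda x: k * f(x), a, b) - k * midpoint_sum(f, a, b)
union_gap = midpoint_sum(f, a, b) - (midpoint_sum(f, a, c) + midpoint_sum(f, c, b))
print(sum_gap, multiple_gap, union_gap)
```

All three gaps should be very close to 0. They need not be exactly 0, since each side is an approximation and floating-point arithmetic introduces rounding.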
b $\dfrac{1}{4} + \dfrac{4}{16} + \dfrac{9}{64} + \dfrac{16}{256} + \dfrac{25}{1024}$
c $\sqrt{2} + \sqrt{6} + \sqrt{12} + \sqrt{20} + \sqrt{30} + \sqrt{42} + \sqrt{56}$.
1.4.4
Q15
Does $\int_{1/2}^{1} \ln x\,dx$ compute the area under $y = \ln x$ over $\left[\frac{1}{2}, 1\right]$? Explain.
Q16
Suppose $\int_a^b f(x)\,dx < 0$. What does this tell you about the graph y = f(x)? Be specific.
Q17
Draw a careful graph of $y = \sqrt{x}$. Use 5 subintervals of [1, 11] to estimate the area beneath the
graph over [1, 11]. Use the left endpoints of each subinterval as the test points $x_i^*$.
Q18
Draw a careful graph of y = 3x. Use 3 subintervals of [2, 8] to estimate the area beneath the
graph, with the test points $x_i^*$ being the left endpoints of each subinterval.
Q19
Draw the graph of y = 7. Use geometry to evaluate $\int_8^3 7\,dx$.
Q20
Draw the graph of $y = \frac{x}{3} + 1$. Use geometry to evaluate $\int_9^{-3} \left(\frac{x}{3} + 1\right) dx$.
1.4.5
Q21
Let $g(x) = \int_5^x f(t)\,dt$. What is $g'(8)$?
Q22
Let $g(x) = \int_2^x \cos t\,dt$. Is g(x) increasing or decreasing at x = 3? Explain.
Q23
Suppose f(x) is an increasing function. Is $\int_{22}^{31} f'(x)\,dx$ positive or negative?
Q24
Suppose F(x) and G(x) are both antiderivatives of f(x). Given the following incomplete table
of values, compute $\int_1^4 f(x)\,dx$.
x      1    2    3    4    5    6
F(x)   −    7    −    13   −    9
G(x)   3    −    9    −    10   5
1.4.7
Q37
Write some general rules. Suppose F(x) + c is the antiderivative of f(x).
a What is the antiderivative of f(x + a)?
b What is the antiderivative of f(ax)?
Q38
Assuming that $\int_a^b f(x)\,dx$ exists, argue that it is equal to $\int_{2a}^{2b} \frac{1}{2} f\!\left(\frac{x}{2}\right) dx$, in the following two
ways:
a By appealing to an integration rule.
b By describing the relationship between the graphs of y = f(x) and $y = f\!\left(\frac{x}{2}\right)$. A picture
might help.
1.4.8
Q39
Compute $\int e^{7x}\,dx$.
Q40
Compute $\int \sqrt{5x + 3}\,dx$.
Q41
Compute $\int \cos\left(\frac{\theta}{3}\right) d\theta$.
Q42
Compute $\int (t - 2)^6\,dt$.
Q43
Compute $\int_0^{1/4} \sin(\pi t)\,dt$.
Q44
Compute $\int_0^3 x^2 e^{x^3}\,dx$.
Q45
Compute $\int (x^5 - 2x)(5x^4 - 2)\,dx$.
Q46
Compute $\int_{\pi/4}^{3\pi/4} \cos(x)\,\dfrac{1}{\sin^2 x}\,dx$.
>