A Guide Through the Proof of the (Second) Fundamental Theorem of Calculus

Welcome back. This week, I hope to build on our discussion of the Riemann integral and prove the Fundamental Theorem of Calculus (FTC), which is one of the most powerful results in mathematics. This theorem technically comes in two parts and deals with antiderivatives. Essentially, an antiderivative of a function f(x) is another function whose derivative is equal to f(x). Since the derivative of a constant is always zero, a function that has one antiderivative has infinitely many, and on an interval any two of them differ by a constant. The Fundamental Theorem of Calculus Part I tells us how to construct an antiderivative of an integrable function f(x) on an interval of the real number line [a,b]. In particular, if we define the function F(x) as follows, then F(x) has a derivative at every point at which f(x) is continuous, and that derivative, dF/dx, is precisely equal to our original function f evaluated at x.

If F(x) = ∫_{[a, x]} f(t) dt, then dF/dx = f(x) for all points x in [a,b] for which f(x) is continuous.

The Fundamental Theorem of Calculus Part II tells us that we can compute the (signed) area under the graph of f(x) between x = a and x = b by evaluating an antiderivative of f(x) at b, and then subtracting the antiderivative evaluated at a.

∫_{[a, b]} f(x) dx = F(b) – F(a) if there exists a function F(x) such that dF/dx = f(x) for all x in [a,b].
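
Before proving the statement, it can help to see it hold numerically. The following is only a sketch under assumptions of my own choosing: the example function f(x) = x², its antiderivative F(x) = x³/3, the interval [0, 1], and the helper name midpoint_riemann_sum are all illustrative and not part of the theorem.

```python
def f(x):
    # Example integrand (an assumption for illustration).
    return x ** 2


def F(x):
    # An antiderivative of f: dF/dx = x**2.
    return x ** 3 / 3


def midpoint_riemann_sum(func, a, b, n):
    # Approximate the integral of func over [a, b] using n equal
    # subintervals, sampling func at the midpoint of each one.
    dx = (b - a) / n
    return sum(func(a + (j + 0.5) * dx) for j in range(n)) * dx


a, b = 0.0, 1.0
approx_area = midpoint_riemann_sum(f, a, b, 1000)
exact_difference = F(b) - F(a)

# The two printed values should agree closely.
print(approx_area, exact_difference)
```

The Riemann sum never looks at F, yet it lands next to F(b) – F(a). The rest of this post explains why that is not a coincidence.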

In most engineering and mathematics contexts, it is more common to compute areas under curves than to construct antiderivatives from integral formulas. For that reason, the FTC Part II tends to see more everyday use than the FTC Part I, so in this blog post we will prove the second Fundamental Theorem of Calculus and leave Part I for a later week.

To begin, we will assume that we have a “nice” real-valued function, f(x), defined on an interval [a,b]. By “nice” we mean that f(x) is bounded (i.e., it doesn’t blow up to positive or negative infinity) and integrable (i.e., its integral over [a,b] exists according to the way the Riemann integral is constructed). We will also assume that we can produce a function F(x) that is defined on [a,b] and satisfies dF/dx = f(x) for every x in [a,b].

Now we will perform a technique that is standard in Real Analysis: Let us pick an arbitrary real number that is greater than 0 and call that number “ε” (pronounced epsilon). We can eventually think of ε as a number that is so small that it is almost indistinguishable from zero. Since f(x) is Riemann integrable over [a,b], we know that the upper Riemann sums and the lower Riemann sums become arbitrarily close to each other as we take finer and finer partitions of [a,b].

Therefore, there exists a partition P = {x_j: j = 0,1,2,…, n} with x₀ = a and x_n = b such that the following is true:

U(f, P) – L(f, P) < ε.
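
To make the choice of partition concrete, here is a rough sketch that keeps refining a uniform partition until the gap between the upper and lower sums drops below ε. The function f(x) = x², the interval [0, 1], the value ε = 0.01, and the helper upper_lower_sums are assumptions for illustration. The code also relies on f being increasing on [0, 1], so that the supremum on each subinterval sits at the right endpoint and the infimum at the left endpoint; that shortcut would not work for a general f.

```python
def f(x):
    # Example integrand, increasing on [0, 1] (an assumption).
    return x ** 2


def upper_lower_sums(a, b, n):
    # Uniform partition x_0 = a < x_1 < ... < x_n = b.
    dx = (b - a) / n
    xs = [a + j * dx for j in range(n + 1)]
    # Because f is increasing here, M_j = f(x_j) and m_j = f(x_{j-1}).
    upper = sum(f(xs[j]) * dx for j in range(1, n + 1))
    lower = sum(f(xs[j - 1]) * dx for j in range(1, n + 1))
    return upper, lower


epsilon = 0.01
n = 1
while True:
    upper, lower = upper_lower_sums(0.0, 1.0, n)
    if upper - lower < epsilon:
        break
    n *= 2  # refine the partition by halving every subinterval

print(n, upper - lower)
```

For this particular f the gap works out to exactly (f(1) – f(0)) / n = 1/n, so the loop stops at the first power of two larger than 1/ε.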

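We will then leverage the Mean Value Theorem, a powerful result from differential calculus. Because F is differentiable on [a,b], it is continuous on each closed subinterval [x_{j-1}, x_j] and differentiable on its interior, so the theorem applies there. It tells us that for each j from 1 to n, there exists a real number t_j that lies in the open interval (x_{j-1}, x_j) such that: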

F(x_j) – F(x_{j-1}) = F’(t_j) (x_j – x_{j-1}) = f(t_j) (x_j – x_{j-1}).
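
The Mean Value Theorem only promises that each t_j exists, but for a simple example we can solve for it explicitly. This sketch assumes f(x) = x² and F(x) = x³/3 on a hand-picked partition of [0, 1]; the names partition and mvt_points are illustrative. Solving f(t) = t² = slope with a square root works only because this f is invertible on the non-negative reals.

```python
import math


def f(x):
    # Example integrand (an assumption for illustration).
    return x ** 2


def F(x):
    # An antiderivative of f.
    return x ** 3 / 3


partition = [0.0, 0.25, 0.5, 0.75, 1.0]
mvt_points = []
for left, right in zip(partition, partition[1:]):
    # Average slope of F over [left, right].
    slope = (F(right) - F(left)) / (right - left)
    # Solve F'(t) = f(t) = t**2 = slope for the non-negative root.
    t = math.sqrt(slope)
    mvt_points.append(t)
    print(f"({left}, {right}): t_j = {t:.6f}")
```

Each computed t_j falls strictly inside its subinterval, as the theorem guarantees.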

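We can then notice that the following chain of inequalities holds by the way the upper and lower Riemann sums are defined (please see the last blog post for more details). Here ∆x_j = x_j – x_{j-1}, and M_j and m_j are the supremum and infimum of f on the subinterval [x_{j-1}, x_j], so that m_j ≤ f(t_j) ≤ M_j.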

U(f, P) = ∑ M_j ∆x_j ≥ ∑ f(t_j) ∆x_j ≥ ∑ m_j ∆x_j = L(f, P).

Furthermore,

U(f, P) ≥ ∫_{[a, b]} f(x) dx ≥ L(f, P).

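We know that this inequality is true because ∫_{[a, b]} f(x) dx is, by definition, equal to the supremum over all partitions of L(f, P) and the infimum over all partitions of U(f, P), so no lower sum can exceed the integral and no upper sum can fall below it. Refining the partition P can only raise L(f, P) or leave it unchanged, and can only lower U(f, P) or leave it unchanged, which is why the two sums close in on the integral.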

Since the difference between U(f, P) and L(f, P) is less than ε, the absolute value of the difference between ∑ f(t_j) ∆x_j and ∫_{[a, b]} f(x) dx must also be less than ε. Indeed, since ∫_{[a, b]} f(x) dx and ∑ f(t_j) ∆x_j both lie between L(f, P) and U(f, P), they cannot differ by more than U(f, P) and L(f, P) differ.
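
Written out symbolically, the squeeze in the previous paragraph looks as follows. The abbreviations S and I are shorthand introduced here only to keep the lines short; they are not notation from the earlier posts.

```latex
% Shorthand (assumed for brevity):
%   S = \sum_{j=1}^{n} f(t_j)\,\Delta x_j, \qquad I = \int_a^b f(x)\,dx
\begin{align*}
L(f,P) \le S \le U(f,P)
  \quad &\text{and} \quad
  L(f,P) \le I \le U(f,P) \\
\implies\; S - I \le U(f,P) - L(f,P)
  \quad &\text{and} \quad
  I - S \le U(f,P) - L(f,P) \\
\implies\; |S - I| \le U(f,P) - L(f,P) &< \varepsilon .
\end{align*}
```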

Lastly,

∑ f(t_j) ∆x_j = ∑ [F(x_j) – F(x_{j-1})]

= [F(x₁) – F(a)] + [F(x₂) – F(x₁)] + … + [F(b) – F(x_{n-1})]

= F(b) – F(a).

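Since the sum is taken from j = 1 to j = n, every intermediate value F(x₁), …, F(x_{n-1}) appears once with a plus sign and once with a minus sign. We can therefore say that the sum “telescopes” and that ∑ [F(x_j) – F(x_{j-1})] must be equal to F(b) – F(a).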
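
The telescoping is easy to check on a computer as well. In this small sketch, F(x) = sin(x) and the uneven partition of [0, π/2] are arbitrary choices of mine; any function and any partition would cancel the same way.

```python
import math


def F(x):
    # Any function works for telescoping; sin is an arbitrary example.
    return math.sin(x)


# An uneven partition of [0, pi/2], chosen by hand for illustration.
partition = [0.0, 0.2, 0.5, 0.9, 1.3, math.pi / 2]

# The individual terms F(x_j) - F(x_{j-1}) for j = 1, ..., n.
differences = [F(right) - F(left) for left, right in zip(partition, partition[1:])]

telescoped_sum = sum(differences)
endpoint_difference = F(partition[-1]) - F(partition[0])

# Up to floating-point rounding, the two values coincide.
print(telescoped_sum, endpoint_difference)
```

Notice that the cancellation does not depend on how the interior points are spaced, which is why the proof works for whatever partition P the ε-argument hands us.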

Thus,

| [F(b) – F(a)] – ∫_{[a, b]} f(x) dx | < ε.

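Since ε was an arbitrary real number bigger than zero, and the quantity on the left does not depend on ε, that quantity is a non-negative number smaller than every positive number and so must be zero. We can therefore conclude the result: namely,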

∫_{[a, b]} f(x) dx = F(b) – F(a).

This result was incredible to me when I first learned calculus. The idea that areas under curves were connected to antiderivatives evaluated at endpoints did not make intuitive sense. It seemed almost too profound to be true. Therefore, I became determined to learn as much math as I could so that I could understand once and for all why the Fundamental Theorem of Calculus Part II is true. While I am sure that many textbooks explain the theorem in greater detail, I hope that this guide was informative and illuminated some of the mystery behind the result.

Oliver Khan
