Separation of Variables and the Method of Characteristics: Two of the Most Useful Ways to Solve Partial Differential Equations

Welcome back. I hope all of my readers are having an excellent start to the new semester. Over the last couple of months, we have discussed partial differential equations (PDEs) in some depth, which I hope has been interesting and at least somewhat enjoyable. Today, we will explore two of the most powerful and commonly used methods of solving PDEs: separation of variables and the method of characteristics. We will first illustrate how separation of variables works by working out an example problem. While we have mostly discussed theory in this blog series, it can be useful to illustrate concepts with a concrete example. We will then finish up by developing the theory behind the method of characteristics. In a future post, we will see an application of the method of characteristics when we discuss Burgers’ equation (∂u/∂t + u ∂u/∂x = 0).

Suppose we have a horizontal string that is π units long, along which waves travel at a speed of 2 units of distance per unit of time, and that begins vibrating up and down at time t = 0. Further suppose that one end of the string is nailed down at x = 0, the other end is nailed down at x = π, and the string is initially shaped exactly like the curve y = sin(x). At any time, t, and any horizontal position, x, we will let u(x,t) denote the height of the string above the x-axis. The string will then vibrate according to the wave equation ∂²u/∂t² = 2² ∂²u/∂x². Finally, we will also assume that when t = 0, ∂u/∂t = 0 for all values of x. This initial condition means that the string starts at rest. In terms of equations, we wish to solve the following initial-boundary value problem:

i) u = u(x,t) with 0 ≤ x ≤ π and t ≥ 0

ii) ∂²u/∂t² = 4 ∂²u/∂x²

iii) u(0,t) = u(π,t) = ∂u/∂t(x,0) = 0

iv) u(x,0) = sin(x)

To begin, we will let u(x,t) = X(x)T(t), for some twice-differentiable functions capital X and capital T. This is the fundamental step. We are supposing that the function, u, of x and t is equal to the product of a function purely in terms of x and a function purely in terms of t. Once we substitute this formula for u(x,t) into the wave equation, we obtain:

X(x) T’’(t) = 4 X’’(x) T(t).

What comes next is another core idea of separation of variables. We will move everything involving x’s to one side of the equation and move everything involving t’s to the other side to yield:

X’’(x) / X(x) = T’’(t) / [4 T(t)].

The only way that an expression purely in terms of x can be equal to an expression purely in terms of t is if both expressions are equal to the same constant. Thus, X’’(x) / X(x) = T’’(t) / [4 T(t)] = C for some fixed real number C. Theoretically, this constant can be positive, negative, or zero. However, it turns out that if C is positive or if C = 0, then the boundary conditions u(0,t) = u(π,t) = 0 force our entire solution u(x,t) to be identically equal to zero, which unfortunately will not match our initial condition u(x,0) = sin(x). Therefore, we let C be negative and proceed. One way to express C as being negative is to let C = -k² for some k > 0. Thus, we obtain the following two ordinary differential equations that we can solve in turn:

X’’(x) / X(x) = – k² (Eq. 1)

T’’(t) / [4 T(t)] = – k² (Eq. 2)

Let’s first examine (Eq. 1). When we rearrange the terms, we find that X’’(x) + k² X(x) = 0. In other words, the second derivative of X(x) is equal to the constant factor -k² times X(x) itself. It turns out that both sine and cosine functions have second derivatives that are scaled versions of themselves. Therefore, our solution to (Eq. 1) has the following form, where A and B are as of yet undetermined constants:

X(x) = A cos(kx) + B sin(kx).

Since u(0,t) = 0 and u(π,t) = 0 for all t, we must have that X(0) = X(π) = 0. Since X(0) = A, we immediately conclude that A = 0 and B is not equal to zero. If B = 0, then we would again have the zero solution that we already rejected. Since X(π) = B sin(kπ) = 0 and B is nonzero, we can conclude that sin(kπ) = 0, which is true if k is equal to any positive integer. We will use this piece of information a little bit later.
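For readers who want to see why the cases C = 0 and C > 0 were set aside earlier, here is a brief sketch that uses the same boundary conditions X(0) = X(π) = 0. The symbol m is introduced only for this aside (writing a positive constant as C = m² with m > 0) and does not appear elsewhere in the derivation.

```latex
\begin{aligned}
C = 0:&\quad X'' = 0 \;\Rightarrow\; X(x) = A + Bx, \\
&\quad X(0) = A = 0, \quad X(\pi) = B\pi = 0 \;\Rightarrow\; B = 0. \\[4pt]
C = m^2 > 0:&\quad X'' = m^2 X \;\Rightarrow\; X(x) = A\cosh(mx) + B\sinh(mx), \\
&\quad X(0) = A = 0, \quad X(\pi) = B\sinh(m\pi) = 0 \;\Rightarrow\; B = 0
\ \text{ since } \sinh(m\pi) \neq 0.
\end{aligned}
```

In both cases X is identically zero, so u = XT is the zero solution, which is why only a negative constant survives.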

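Now let us examine (Eq. 2). Rearranging gives T’’(t) + 4k² T(t) = 0, so in a similar fashion, we can find that the solution has the following form, where C and D are new undetermined constants (this C is unrelated to the separation constant used above):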

T(t) = C cos(2kt) + D sin(2kt).

Since ∂u/∂t(x,0) = 0 for all values of x in our domain of interest, we conclude that dT/dt = 0 at t = 0, meaning that D = 0. (To see why this is true, take the derivative of T(t), plug in t = 0, and set the resulting expression equal to zero. Remember that by assumption, k is nonzero).
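Written out, the computation suggested in the parenthetical looks roughly like the following. It relies only on the product form u = XT and on the fact that X is not identically zero.

```latex
\begin{aligned}
\frac{\partial u}{\partial t}(x,0) &= X(x)\,T'(0) = 0 \ \text{ for all } x
  \;\Rightarrow\; T'(0) = 0, \\
T'(t) &= -2kC\sin(2kt) + 2kD\cos(2kt), \\
T'(0) &= 2kD = 0 \;\Rightarrow\; D = 0 \ \text{ because } k \neq 0.
\end{aligned}
```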

Since k can be equal to any positive integer n, we so far have infinitely many potential solutions of the form u_n(x,t) = c_n cos(2nt) sin(nx), where c_n can be any constant. Because the wave equation is linear, the sum of all of these u_n’s is still a solution, making the most general solution possible have the following form, where n ranges from 1 to infinity:

u(x,t) = ∑ c_n cos(2nt) sin(nx).

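But u(x,0) = sin(x). Setting t = 0 in the series gives ∑ c_n sin(nx) = sin(x), and matching the coefficients of sin(nx) on both sides yields c₁ = 1 and c_n = 0 for all n that are not equal to 1. Thus, the final answer is the following: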

u(x,t) = cos(2t) sin(x)

Success!
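If you would like to double-check the answer numerically, the following is a minimal sketch in Python using only the standard library. It estimates the residual u_tt - 4 u_xx with central differences and estimates the sine-series coefficients c_n = (2/π) ∫ sin(x) sin(nx) dx over [0, π] with the midpoint rule. The function names, step size, and sample count are my own illustrative choices, not part of the derivation above.

```python
import math


def u(x, t):
    """Candidate solution from the text: u(x, t) = cos(2t) sin(x)."""
    return math.cos(2.0 * t) * math.sin(x)


def wave_residual(x, t, h=1e-3):
    """Central-difference estimate of u_tt - 4 u_xx at the point (x, t)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_tt - 4.0 * u_xx


def initial_velocity(x, h=1e-3):
    """Central-difference estimate of u_t(x, 0)."""
    return (u(x, h) - u(x, -h)) / (2.0 * h)


def sine_coefficient(n, samples=2000):
    """Midpoint-rule estimate of (2/pi) * integral over [0, pi] of sin(x) sin(nx)."""
    dx = math.pi / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        total += math.sin(x) * math.sin(n * x)
    return 2.0 / math.pi * total * dx


if __name__ == "__main__":
    # The residual should be close to zero at any interior point we try.
    for x, t in [(0.5, 0.3), (1.0, 0.7), (2.5, 1.9)]:
        print("residual at", (x, t), "=", wave_residual(x, t))
    # Boundary values and initial velocity should also be close to zero.
    print("u(0, 1) =", u(0.0, 1.0), " u(pi, 1) =", u(math.pi, 1.0))
    print("u_t(1, 0) ~", initial_velocity(1.0))
    # Only the n = 1 coefficient should be noticeably different from zero.
    for n in range(1, 5):
        print("c_%d ~ %r" % (n, sine_coefficient(n)))
```

The finite-difference residual will not be exactly zero because of truncation and rounding error, so treat this as a sanity check rather than a proof.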

To summarize separation of variables, we first assume that the solution u is equal to a product of functions, each of which depends on only one of our variables. Next, we move the expressions involving each variable to opposite sides of the equation and set both sides equal to a common constant. We determine whether that constant is positive, negative, or zero, solve the resulting ordinary differential equations, and then use the boundary and initial conditions to pin down the remaining constants.

Now let’s finish off with a discussion of the method of characteristics. The general problem we want to solve is the following first-order quasi-linear PDE, where u = u(x,y) is a function of the two variables x and y and where a, b, and f are differentiable functions of x, y, and u(x,y) itself:

a(x,y,u) ∂u/∂x + b(x,y,u) ∂u/∂y – f(x,y,u) = 0.

We will begin by letting φ(x,y,u) = u(x,y) – u. In this equation, we can view “u(x,y)” as what it is: a function of x and y. The gradient of φ is then ∇φ = < ∂u/∂x, ∂u/∂y, -1 >, so we can rewrite the PDE we wish to study as a dot product:

< a, b, f > • ∇φ = 0 (Eq. 3).

Now suppose that we have a parametrized curve in xyu-space r(s) = < x(s), y(s), u(s) >. Suppose further that r(s) is a characteristic curve of φ. In other words, φ(x,y,u) is constant whenever x = x(s), y = y(s), and u = u(s). Therefore, the total derivative of φ with respect to s is equal to zero. Using the multivariable chain rule, we obtain:

dφ/ds = dr/ds • ∇φ = 0 (Eq. 4).

In the above equation, r is differentiated component-wise, meaning that dr/ds = <dx/ds, dy/ds, du/ds>.
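Expanding the chain rule in (Eq. 4) term by term may make the comparison with the PDE easier to see. This sketch uses the gradient ∇φ = < ∂u/∂x, ∂u/∂y, -1 > computed earlier.

```latex
\frac{d\varphi}{ds}
  = \frac{\partial \varphi}{\partial x}\frac{dx}{ds}
  + \frac{\partial \varphi}{\partial y}\frac{dy}{ds}
  + \frac{\partial \varphi}{\partial u}\frac{du}{ds}
  = \frac{\partial u}{\partial x}\frac{dx}{ds}
  + \frac{\partial u}{\partial y}\frac{dy}{ds}
  - \frac{du}{ds}
  = 0.
```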

We can now notice that (Eq. 3) and (Eq. 4) have the same form, with the vector < a, b, f > in (Eq. 3) playing the role of dr/ds in (Eq. 4). Both vectors are perpendicular to ∇φ, so (Eq. 4) is automatically satisfied if we choose our curves so that dr/ds is parallel to < a, b, f >. Curves chosen this way are the characteristic curves. Mathematically, we require the following equation to hold, where k is some nonzero proportionality factor (unrelated to the k used in the string example):

<dx/ds, dy/ds, du/ds> = <k a(x,y,u), k b(x,y,u), k f(x,y,u)>.

With some algebraic manipulations, we obtain a system of ordinary differential equations that we can work with to find an implicitly-defined solution to our quasi-linear PDE.

dx / a(x,y,u) = dy / b(x,y,u) = du / f(x,y,u).

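We know that all of these ratios are equal to each other because they are each equal to k ds, which follows from dividing each component equation above by a, b, or f and multiplying through by ds.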

If we let h(x, y, u(x,y)) = c₁ and j(x, y, u(x,y)) = c₂ be two implicitly-defined solutions of the above system of equations, we form an implicitly-defined solution to the original partial differential equation:

j(x,y,u) = F(h(x,y,u)).

Since h(x,y,u) is equal to the arbitrary constant c₁ and j(x,y,u) is equal to the other arbitrary constant c₂, we can find a continuously-differentiable function, F, that maps c₁ to c₂. To determine what F is explicitly, we can utilize the initial conditions that would be given to us in a real-world problem.
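To make the recipe more concrete, here is a small sketch on an example of my own choosing, which is not taken from the discussion above: the PDE ∂u/∂x + ∂u/∂y = u (so a = 1, b = 1, f = u) with the assumed initial condition u(x,0) = cos(x). The system dx/1 = dy/1 = du/u gives h = x – y = c₁ and j = u e^(–x) = c₂, and the initial condition then suggests F(z) = e^(–z) cos(z), so that u(x,y) = e^y cos(x – y). The Python code below, using only the standard library, integrates the characteristic equations with a classical Runge-Kutta step (taking the proportionality factor equal to 1) and compares the result with that closed form. All function names are hypothetical.

```python
import math


def characteristic_rhs(state):
    """Right-hand side <a, b, f> for the illustrative PDE u_x + u_y = u."""
    x, y, u = state
    return (1.0, 1.0, u)


def rk4_step(state, ds, rhs):
    """Advance (x, y, u) by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * k for b, k in zip(base, slope))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, ds / 2.0))
    k3 = rhs(shifted(state, k2, ds / 2.0))
    k4 = rhs(shifted(state, k3, ds))
    return tuple(
        s + ds / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trace_characteristic(x0, s_end, steps=200):
    """Follow the characteristic starting on the initial curve at (x0, 0, cos(x0))."""
    state = (x0, 0.0, math.cos(x0))
    ds = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, ds, characteristic_rhs)
    return state


def exact_solution(x, y):
    """Closed form from j = F(h) with h = x - y and j = u * exp(-x)."""
    return math.exp(y) * math.cos(x - y)


if __name__ == "__main__":
    for x0 in (0.0, 0.4, 1.2):
        x, y, u_num = trace_characteristic(x0, s_end=1.5)
        print("x0 =", x0, " numeric u =", u_num, " exact u =", exact_solution(x, y))
        # h = x - y should stay equal to its starting value x0 along the curve.
        print("  x - y =", x - y)
```

Along each traced curve, x – y stays at its starting value while u e^(–x) stays constant, which is exactly the statement that h and j are constant on characteristics.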

The method of characteristics can be a bit conceptually difficult, as we are first trying to find equations for parametric curves along which the function φ is constant, and then using those equations to find an implicit solution to the quasi-linear PDE. My hope is that this method will become clearer once we apply it to study Burgers’ equation. Since an in-depth look at Burgers’ equation will require considerable preparation on my part, we will take next week to examine the multidimensional version of integration by parts and use it to derive some interesting properties of harmonic functions. Until then, take care.

Oliver Khan
