## Directional Derivative

There’s really nothing too special going on here. Instead of calculating the rate of change of the function along a single axis (as a partial derivative does), we’re now going in arbitrary directions. The only real technicality is that we need a vector representing the direction we want to travel, and so we use a unit vector in that direction. We use a unit vector because we really want to know “as I travel one unit in the direction of $\vec{u}$, how much does the function change?” If we had anything besides a unit vector, this would be an awkward measurement. It’s like saying that your car gets 28 miles to the gallon compared to saying your car gets about 39.6 miles to the $\sqrt{2}$ gallons: the same information, but much harder to read.

The definition says that if $\vec{u}=u_1\vec{i}+u_2\vec{j}$ is a unit vector, then the directional derivative $f_{\vec{u}}$ at the point $(a,b)$ is defined as

$f_{\vec{u}}(a,b)=\lim_{h\to 0}\frac{f(a+hu_1,\ b+hu_2)-f(a,b)}{h}.$

This quantity, when the limit exists, tells us how fast the function is changing as we move in the direction of $\vec{u}$. (If you’re not given a unit vector for $\vec{u}$, make it into a unit vector by dividing it by its length before trying to use it.)

Here’s where things get interesting. Notice that if we want to calculate the rate of change of $f$ in a direction given by $\vec{u}$, we could use the components of $\vec{u}$ and consider the partial derivatives. (This really assumes local linearity, and we aren’t too horribly worried about the specifics in this course. I’m presenting the details here because some of you seemed interested.)
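To see why the unit-vector requirement matters, here is a minimal Python sketch that approximates the limit definition with a small fixed $h$. The sample function `f`, the point $(1,2)$, the direction $\langle 3,4\rangle$, and the helper names `unit_vector` and `difference_quotient` are all assumptions chosen for illustration; none of them come from the course materials.

```python
import math

def unit_vector(v):
    """Scale a 2D direction (vx, vy) so that it has length 1."""
    vx, vy = v
    length = math.hypot(vx, vy)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return (vx / length, vy / length)

def difference_quotient(f, a, b, v, h=1e-6):
    """The quotient from the limit definition, for one small h, using v as given."""
    v1, v2 = v
    return (f(a + h * v1, b + h * v2) - f(a, b)) / h

def f(x, y):
    # Hypothetical sample function, used only for illustration.
    return x * y + y ** 2

raw = (3.0, 4.0)       # length 5, so NOT a unit vector
u = unit_vector(raw)   # same direction, length 1

rate_unit = difference_quotient(f, 1.0, 2.0, u)
rate_raw = difference_quotient(f, 1.0, 2.0, raw)
print("unit vector:", u)
print("rate using the unit vector:", rate_unit)
print("rate using the raw vector: ", rate_raw)
```

The two printed rates should come out close to $5.2$ and $26$. The raw vector inflates the answer by its length, $5$, which is exactly the “miles per $\sqrt{2}$ gallons” problem.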

The figure shows a plane with a point at the origin and a positive slope in both the $x$ and $y$ directions (hence positive partial derivatives). The rise along the green line (which gives the directional derivative) is the rise along the red line (the $x$-direction) and the rise along the blue line (the $y$-direction) added together. Thus, we really have

\begin{align*} f_{\vec{u}}(a,b) &= \lim_{h\to 0}\frac{f(a+hu_1,b+hu_2)-f(a,b)}{h}\cr &= \lim_{h\to 0}\frac{f(a+hu_1,b)-f(a,b)}{h}+\lim_{h\to 0}\frac{f(a,b+hu_2)-f(a,b)}{h}\cr &= \lim_{h\to 0}u_1\frac{f(a+h,b)-f(a,b)}{h}+\lim_{h\to 0}u_2\frac{f(a,b+h)-f(a,b)}{h}\cr &= u_1\lim_{h\to 0}\frac{f(a+h,b)-f(a,b)}{h}+u_2\lim_{h\to 0}\frac{f(a,b+h)-f(a,b)}{h}\cr &= u_1f_x(a,b)+u_2f_y(a,b)\cr &= f_x(a,b)u_1+f_y(a,b)u_2\cr &= \left\langle f_x,f_y\right\rangle\cdot\left\langle u_1,u_2\right\rangle\cr &= \nabla f\cdot \vec{u}. \end{align*}

The vector $\nabla f=\langle f_x,f_y\rangle$ is incredibly useful and is called the gradient of $f$. It is simply the vector whose components are the partial derivatives of the given function. More information on the gradient is in your text. What the gradient is, how it is used, and what its direction and magnitude mean are all fundamental concepts in this class.

It might seem a bit like magic that we can pull the $u_1$ and $u_2$ out from the quotients. You should realize what the quotients at either side of that process represent. Before pulling out the $u_1$ or $u_2$, we’re looking at the slope of the secant line as we travel $hu_1$ and $hu_2$ along the $x$ and $y$-axes, respectively. That amount of change is simply $u_1$ or $u_2$ times the amount of change for simply going $h$ along those axes. So, we can actually pull those numbers out from the quotients and then out in front of the limits.
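As a sanity check on that step, here is a sketch of the simplest case, a plane. The coefficients $p$, $q$, and $c$ are names introduced here for illustration only. If $f(x,y)=px+qy+c$, then $f_x=p$ and $f_y=q$, and

\begin{align*} f(a+hu_1,b)-f(a,b) &= p\,hu_1 = u_1\bigl(f(a+h,b)-f(a,b)\bigr)\cr f(a,b+hu_2)-f(a,b) &= q\,hu_2 = u_2\bigl(f(a,b+h)-f(a,b)\bigr)\cr \frac{f(a+hu_1,b+hu_2)-f(a,b)}{h} &= \frac{p\,hu_1+q\,hu_2}{h} = pu_1+qu_2 = f_xu_1+f_yu_2. \end{align*}

For a plane, these equalities hold exactly for every $h\neq 0$. For a function that is only locally linear, they hold approximately, with an error that disappears as $h\to 0$, which is why the limits still come out equal.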

Example 1: Find the directional derivative of $f(x,y)=x^2+y$ in the direction of $\vec{u}=\vec{i}+\vec{j}$.

Solution: We’re not asked to find the directional derivative at a specific point, and so we’ll do it for arbitrary points $(x,y)$. (Thus, if you want the directional derivative at a specific point, simply plug in the $x$ and $y$ for that point.) A unit vector in the direction of $\vec{u}$ is $\vec{v}=\frac{1}{\sqrt{2}}\vec{i}+\frac{1}{\sqrt{2}}\vec{j}$. Now, from the definition of directional derivative, we find that

\begin{align*} f_{\vec{v}}(x,y) &= \lim_{h\to 0}\frac{f\left(x+h\frac{1}{\sqrt{2}},y+h\frac{1}{\sqrt{2}}\right)-f(x,y)}{h}\cr &= \lim_{h\to 0}\frac{\left(x+h\frac{1}{\sqrt{2}}\right)^2+\left(y+h\frac{1}{\sqrt{2}}\right)-(x^2+y)}{h}\cr &= \lim_{h\to 0}\frac{2xh\frac{1}{\sqrt{2}}+\frac{1}{2}h^2+h\frac{1}{\sqrt{2}}}{h}\cr &= \lim_{h\to 0}\left(2x\frac{1}{\sqrt{2}}+\frac{1}{2}h+\frac{1}{\sqrt{2}}\right)\cr &= \frac{1}{\sqrt{2}}(2x+1). \end{align*}

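Of course, we could simply use the equivalence derived above, $f_{\vec{v}}(x,y)=\langle f_x,f_y\rangle\cdot\langle v_1,v_2\rangle$, where $v_1=v_2=\frac{1}{\sqrt{2}}$ are the components of the unit vector $\vec{v}$. We know that $f_x=2x$ and $f_y=1$, so $f_{\vec{v}}(x,y)=2x\cdot\frac{1}{\sqrt{2}}+1\cdot\frac{1}{\sqrt{2}}=\frac{1}{\sqrt{2}}(2x+1)$. This gives the same result much more quickly.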
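If you’d like to check Example 1 numerically, the following Python sketch compares the difference quotient from the definition with the gradient formula. The sample point $(3,-1)$ and the function names `quotient` and `gradient_formula` are assumptions made for illustration.

```python
import math

S = 1 / math.sqrt(2)  # both components of the unit vector v

def f(x, y):
    return x ** 2 + y

def quotient(x, y, h):
    """Difference quotient from the limit definition, before taking the limit."""
    return (f(x + h * S, y + h * S) - f(x, y)) / h

def gradient_formula(x, y):
    """<f_x, f_y> . <v_1, v_2> with f_x = 2x and f_y = 1."""
    return 2 * x * S + 1 * S

x0, y0 = 3.0, -1.0  # sample point, chosen arbitrarily
exact = gradient_formula(x0, y0)
for h in (1e-1, 1e-2, 1e-3):
    q = quotient(x0, y0, h)
    print(f"h = {h:g}: quotient = {q:.6f}, gap from gradient formula = {q - exact:.6f}")
```

The gap should shrink like $\frac{1}{2}h$, which is exactly the leftover term that vanishes in the last step of the limit computation.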

There are a few key points in understanding how the gradient actually gets used. These are illustrated by several examples.

Example 2: This is much like a WebWork problem. Assume you’re climbing a mountain along the steepest route, at an incline of $20^\circ$. As you climb, a trail diverges to your left at an angle of $30^\circ$ (according to the map). Assuming the mountain is locally linear, if you take the trail to the left, at what incline will you now be climbing the mountain?

Solution: The gradient represents the maximum rate of change for a function and the direction in which that rate of change occurs. Therefore, while you’re climbing the steepest route up the mountain, you’re following the gradient. Now, what you want to know is the rate of change of the height of the surface that is the mountain as you travel $30^\circ$ from the gradient. That is, you wish to know the directional derivative along the path that we’ll call $\vec{u}$ (a unit vector in the direction of $30^\circ$ counter-clockwise from the gradient). We know that

$f_{\vec{u}}=\nabla f\cdot \vec{u}=||\nabla f||\,||\vec{u}||\cos(30^\circ).$

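The key is realizing that $||\nabla f||=\tan(20^\circ)$ and $||\vec{u}||=1$. The magnitude of the gradient is the slope (rise over run) along the steepest route, and a route inclined at $20^\circ$ has slope $\tan(20^\circ)$. Thus, we know that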

$f_{\vec{u}}=\tan(20^\circ)\cos(30^\circ).$

Of course, that tells us the rate of change (rise over run), which is about $0.315$, and so we can get the angle by using arctan:

$\tan^{-1}(\tan(20^\circ)\cos(30^\circ))\approx 17.5^\circ.$
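The arithmetic in Example 2 is easy to get wrong on a calculator set to radians, so here is a short Python sketch of it. The variable names are assumptions for illustration; the only inputs are the two angles from the problem.

```python
import math

steepest_incline_deg = 20.0  # incline along the steepest route
trail_offset_deg = 30.0      # angle between the trail and the steepest route, on the map

# The magnitude of the gradient is the slope (rise over run) along the steepest route.
gradient_magnitude = math.tan(math.radians(steepest_incline_deg))

# Directional derivative: ||grad f|| * ||u|| * cos(angle), with ||u|| = 1.
trail_slope = gradient_magnitude * math.cos(math.radians(trail_offset_deg))

# Convert the slope back to an angle of incline.
trail_incline_deg = math.degrees(math.atan(trail_slope))

print(f"slope along the trail:   {trail_slope:.4f}")
print(f"incline along the trail: {trail_incline_deg:.2f} degrees")
```

Note that the incline along the trail is not simply $20^\circ\cdot\cos(30^\circ)$. The cosine scales the slope, not the angle.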

Example 3: Find an equation of the tangent plane to the surface $2x+3y^2+\sin(z)=4$ at the point $(1,-1,3\pi/2)$.

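Solution: Write $f(x,y,z)=2x+3y^2+\sin(z)$, so that the surface is the level surface $f(x,y,z)=4$. The point does lie on it, since $2(1)+3(-1)^2+\sin(3\pi/2)=2+3-1=4$. The gradient gives us a normal vector to the surface, as discussed in class. The required normal vector is

$\nabla f(1,-1,3\pi/2)=\langle 2,6y,\cos(z)\rangle|_{(1,-1,3\pi/2)}=\langle 2,-6,0\rangle.$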

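Thus, the equation of the tangent plane is

$2(x-1)-6(y+1)+0\left(z-\frac{3\pi}{2}\right)=0,$

that is, $2(x-1)-6(y+1)=0$, or more simply $x-3y=4$. Because the $z$-component of the normal vector is zero, this plane is vertical (parallel to the $z$-axis).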
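Here is a small Python sketch that double-checks Example 3: it confirms the point is on the surface, evaluates the gradient there, and builds the left-hand side of the tangent plane equation. The helper names `grad_f` and `plane` are assumptions for illustration, and the partial derivatives are entered by hand rather than computed symbolically.

```python
import math

def f(x, y, z):
    return 2 * x + 3 * y ** 2 + math.sin(z)

def grad_f(x, y, z):
    """Partial derivatives of f, worked out by hand: <2, 6y, cos(z)>."""
    return (2.0, 6.0 * y, math.cos(z))

p0 = (1.0, -1.0, 3 * math.pi / 2)
normal = grad_f(*p0)

def plane(x, y, z):
    """Left-hand side of normal . (p - p0) = 0; zero exactly when (x, y, z) is on the plane."""
    return sum(n * (c - c0) for n, c, c0 in zip(normal, (x, y, z), p0))

print("f at the point:", f(*p0))   # should be 4, up to rounding
print("normal vector: ", normal)   # the z-component is a tiny rounding residue, not exactly 0
print("plane at (4, 0, 5):", plane(4.0, 0.0, 5.0))  # a point satisfying x - 3y = 4
```

In floating point, `math.cos(3 * math.pi / 2)` is a number on the order of $10^{-16}$ rather than exactly $0$, so compare against a small tolerance instead of testing for equality.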