# Sergei Yakovenko's blog: on Math and Teaching

## Understanding and using the derivative

### A few exotic examples

1. The function $f(x)=x^2 h(x)$, where $h(x)$ is the Dirichlet function (equal to 1 when $x$ is rational and 0 when $x$ is irrational), is differentiable at only one point. Why?

2. The function $g(x)=x^2\sin (x^k)$, $k\in\mathbb Z$, is a source of many examples. Its derivative for $x\ne 0$ can be computed by the usual rules: $g'(x)=2x \sin (x^k)+x^2 \cos (x^k)\cdot kx^{k-1}=2x\sin(x^k)+kx^{k+1}\cos(x^k)$. The derivative $g'$ has a limit at $x=0$ for $k\ge0$, is bounded (but has no limit at $x=0$) for $k=-1$, and is unbounded (without any limit, infinite or not) for $k\le -2$.

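On the other hand, regardless of $k$, the function (extended by $g(0)=0$) is always differentiable at the origin: since $|g(x)|\le x^2$, the difference quotient $g(x)/x$ tends to zero, so $g'(0)=0$. Thus there exist functions differentiable at each point whose derivative is discontinuous!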
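A small numerical sketch of Example 2 may help. It takes $k=-1$ and assumes the extension $g(0)=0$; the step sizes and the two sequences of sample points are illustrative choices, not part of the original text.

```python
import math

def g(x, k=-1):
    """g(x) = x^2 sin(x^k) for x != 0, with the assumed extension g(0) = 0."""
    return 0.0 if x == 0 else x * x * math.sin(x ** k)

def g_prime(x, k=-1):
    """Derivative of g for x != 0, by the formula from the text."""
    return 2 * x * math.sin(x ** k) + k * x ** (k + 1) * math.cos(x ** k)

# Difference quotients at the origin: |g(h)/h| <= |h|, so they shrink to 0.
steps = (1e-2, 1e-4, 1e-6)
quotients = [(g(h) - g(0.0)) / h for h in steps]

# Along x_n = 1/(2 pi n) the derivative stays near -1,
# along y_n = 1/((2n+1) pi) it stays near +1: no limit at 0.
n = 1000
along_even = g_prime(1 / (2 * math.pi * n))
along_odd = g_prime(1 / ((2 * n + 1) * math.pi))

print(quotients, along_even, along_odd)
```

The quotients go to zero while the two sampled values of $g'$ stay near $-1$ and $+1$, which is the discontinuity of the derivative at the origin.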

3. There is no function $f(x)$ differentiable everywhere on $[-1,+1]$ such that $f'$ is continuous on $[-1,0)\cup(0,+1]$ and has two unequal limits $\lim_{x\to 0^-}f'(x)\ne\lim_{x\to 0^+}f'(x)$. Why?

### Derivative and monotonicity

For a function continuous at a point $a\in\mathbb R$, the sign of the value $f(a)$, if it is nonzero (i.e., $\text{sign} f(a)=\pm 1$) determines the sign of $f$ at all sufficiently near points (where $f$ is defined, of course). In a similar way, the sign of the derivative of a differentiable function, if it is nonzero, determines the monotonicity direction of $f$ at $a$: from the definition of the derivative, it follows that

$f'(a)>0\implies \exists\delta>0:\quad \forall x_-,x_+\ \text{such that } a-\delta<x_-<a<x_+<a+\delta,\quad f(x_-)<f(a)<f(x_+),$

and in a similar way, the function $f(x)-f(a)$ changes its sign from $+1$ to $-1$ if $f'(a)<0$.

Note that this result does not mean that $f(x)$ is monotonically increasing on $[a-\delta,a+\delta]$ if $f'(a)>0$! (Give an example, tuning Example 2 above…) However, this result does imply that for differentiable functions certain points (in a sense, the majority) cannot be extrema.

Theorem.

1. An interior point at which the function is differentiable can be an extremum only if the derivative vanishes at this point.
2. A left endpoint at which the function is (one-sidedly) differentiable can be a minimum (resp., maximum) only if the derivative there is $\ge 0$ (resp., $\le 0$).
3. A right endpoint at which the function is (one-sidedly) differentiable can be a minimum (resp., maximum) only if the derivative there is $\le 0$ (resp., $\ge 0$).

These rules can be easily memorized if instead of the word “function” one substitutes the words “affine function”, for which the behavior is obvious. The first (principal) assertion of the theorem means that an affine function can reach an extremum (always non-strict!) at an interior point only if it has zero slope (i.e., it is a constant).

Warning: if the function is non-differentiable, the theorem says nothing! Consider the functions $f(x)=|x|$ and $g(x)=|x|+3x$.
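The two functions in this warning can be compared through their one-sided difference quotients at the origin. The sketch below is only an illustration; the helper `one_sided_slopes` and the step `h` are hypothetical names and values introduced here.

```python
def one_sided_slopes(f, a, h=1e-6):
    """Return the (left, right) difference quotients of f at a with step h."""
    left = (f(a) - f(a - h)) / h
    right = (f(a + h) - f(a)) / h
    return left, right

def abs_fn(x):
    """f(x) = |x|: not differentiable at 0, yet 0 is a minimum."""
    return abs(x)

def tilted(x):
    """g(x) = |x| + 3x: not differentiable at 0, and 0 is not an extremum."""
    return abs(x) + 3 * x

slopes_abs = one_sided_slopes(abs_fn, 0.0)     # approximately (-1, +1)
slopes_tilted = one_sided_slopes(tilted, 0.0)  # approximately (+2, +4)

print(slopes_abs, slopes_tilted)
```

For $|x|$ the one-sided slopes have opposite signs, so the origin is a minimum; for $|x|+3x$ both slopes are positive, so the function increases through the origin. In neither case does the theorem apply, because the two-sided derivative does not exist.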

Warning: The fact that the derivative vanishes (at an interior point or at an endpoint) does not guarantee that this point is indeed an extremum. However, if the derivative at an endpoint is nonzero, this guarantees that the endpoint is indeed an extremum (of the type prescribed by the theorem).

### Differential

The derivative as a function $f':a\mapsto f'(a)$ is usually non-linear and even not affine, contrary to the declared goal of finding a linear approximation. On the other hand, the affine approximation $\ell(x)=\alpha (x-a)+\beta=\alpha x+(\beta-\alpha a),\ \alpha=f'(a),\ \beta=f(a)$, is in general non-linear and does depend on the point $a$, the center of approximation. If we want to have a function that would be at the same time linear and explicitly depend on $a$, we need a function of two variables (and not one). This function is called the differential.

Definition. The tangent space to the real line $\mathbb R$ at a point $a\in\mathbb R$ is the vector space of all pairs $\{(a,v):\ v\in\mathbb R^1\}$ with the operations

$(a,v)\pm(a,w)=(a,v\pm w),\quad \lambda (a,v)=(a,\lambda v).$

An element of this space is called a vector attached to the point $a$. It differs from the usual, “free” vector, by its “memory”: the attached vector remembers where it grows from. Vectors attached to different points should not, in general, be added to each other: such addition assumes that we can always “translate” vectors from one point to another. While this is still easy on the plane, in more complicated situations such translation may be problematic (think about translating a vector tangent to a circle at one point to another point on the circle).

Definition. Let $a\in\mathbb R$ be a point at which the function $f$ is differentiable, and let $v\in\mathbb R^1$ be a vector attached to the point $a$. The differential $df$ is a function, linear in the second argument, which realizes the linear approximation to $f$, i.e., sends the pair $(a,v)$ into the number $f'(a)\cdot v\in\mathbb R$.

For animation see the Wolfram page. Note that we treat $df$ as an indivisible symbol for the function of two arguments, though later we will show that it can be considered as the result of application of some operator $d$ to the function $f$.

How should one write differentials, if their arguments are “vectors” (even attached to points)? Introduce the “units of measurement” and compare!

Example. Let $dx$ (again, an indivisible symbol) be the function which sends the “attached vector” $(a,v),\ a\in\mathbb R,\ v\in\mathbb R^1$, into the number $v\in\mathbb R\simeq\mathbb R^1$. Since any linear map $\mathbb R^1\to\mathbb R$ is proportional to this one, any other linear map has the form $A\,dx$, where $A\in\mathbb R$ is the coefficient (slope), which in general may depend on the point $a$. The linear map approximating a differentiable function $f$ has the form $f'(a)\,dx$ at the point $a$, and $f'(x)\,dx$ at a general point $x\in\mathbb R$. We write the result of this computation symbolically as

$df=f'(x)\,dx,\qquad df(a,v)=f'(a)\,dx(a,v)=f'(a)\,v.$
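The definitions above can be mirrored in a few lines of code. This is a minimal sketch, assuming the derivative is replaced by a symmetric difference quotient; the names `derivative`, `dx`, `d` and the sample function `cube` are hypothetical and introduced only for illustration.

```python
def derivative(f, a, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

def dx(a, v):
    """The 'unit of measurement': sends the attached vector (a, v) to the number v."""
    return v

def d(f):
    """Return df, a function of the attached vector (a, v), linear in v."""
    def df(a, v):
        return derivative(f, a) * dx(a, v)
    return df

def cube(x):
    return x ** 3

d_cube = d(cube)

value = d_cube(2.0, 0.5)                      # close to 3 * 2^2 * 0.5 = 6
sum_of_parts = d_cube(2.0, 0.5) + d_cube(2.0, 0.25)
whole = d_cube(2.0, 0.75)                     # linearity in v: equals sum_of_parts

print(value, sum_of_parts, whole)
```

Writing `d(f)` as a function that returns `df` anticipates the remark that $df$ can be viewed as the result of applying an operator $d$ to $f$; the dependence on $a$ is non-linear, while the dependence on $v$ is linear.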

### Invariance of differentials under the change of variables

The derivative of a function depends on the choice of the independent variable. The velocity of the same motion in km/h and in ft/sec is completely different. The notion of the differential is assembled from two parts, both involving the notation of the variable. Yet in a miraculous (well-conceived!) way, the differential is independent of which units (even non-uniform) are used for measurement.

• Change of variables.
• Action of differentiable changes of variables on points and on tangent vectors.
• Example: velocity of the motion along the line. Traveling along the mountain road: height vs. length; height vs. geographic location.
• Action of differentiable changes of variables on functions. Non-invariance of the derivative.
• Invariance of the differential. This allows us to write $df$ rather than $df(x)$.
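One possible reading of this outline can be checked numerically. The sketch below assumes that a change of variables $x=\varphi(t)$ sends a point $t_0$ to $\varphi(t_0)$ and an attached vector $(t_0,w)$ to $(\varphi(t_0),\varphi'(t_0)\,w)$; the functions `height` and `phi` and the helper names are hypothetical choices made for illustration.

```python
import math

def derivative(f, a, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

def differential(f):
    """Return df as a function of the attached vector (a, v)."""
    return lambda a, v: derivative(f, a) * v

def height(x):
    """Hypothetical function of the old variable x."""
    return math.sin(x)

def phi(t):
    """Hypothetical non-uniform change of variables x = phi(t)."""
    return t ** 3 + t

def height_in_t(t):
    """The same function written in the new variable t."""
    return height(phi(t))

t0, w = 0.5, 2.0
# Assumed action of the change of variables on the point and on the attached vector.
x0, v = phi(t0), derivative(phi, t0) * w

# The derivatives in the two variables differ...
deriv_t = derivative(height_in_t, t0)
deriv_x = derivative(height, x0)

# ...but the differentials agree on corresponding attached vectors.
diff_t = differential(height_in_t)(t0, w)
diff_x = differential(height)(x0, v)

print(deriv_t, deriv_x, diff_t, diff_x)
```

Under these assumptions the two derivatives are visibly different numbers, while the two values of the differential coincide up to rounding, which is the chain rule in disguise.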