# Differentiability and derivative

Continuity of functions (and maps) means that they can be nicely approximated by constant functions (maps) in a sufficiently small neighborhood of each point. Yet the constant maps (easy to understand as they are) are not the only “simple” maps.

## Linear maps

Linear maps naturally live on vector spaces, sets equipped with a special structure. Recall that $\mathbb R$ is algebraically a field: real numbers can be added, subtracted and multiplied among themselves, and the ratio $\alpha/\beta$ is well defined for $\beta\ne0$.

Definition. A set $V$ is said to be a vector space (over $\mathbb R$), if the operations of addition/subtraction $V\owns u,v\mapsto u\pm v$ and multiplication by a constant $V\owns v,\ \mathbb R\owns \alpha\mapsto \alpha v$ are defined on it and obey the obvious rules of commutativity, associativity and distributivity. Some people prefer to call vector spaces linear spaces: the two terms are synonymous.

Warning. There is no “natural” multiplication $V\times V\to V$!

Examples.

1. The field $\mathbb R$ itself. If we want to stress that it is considered as a vector space, we write $\mathbb R^1$.
2. The set of tuples $\mathbb R^n=\{(x_1,\dots,x_n):\ x_i\in\mathbb R\}$ is the Euclidean $n$-space. For $n=2,3$ it can be identified with the “geometric” plane and space, using coordinates.
3. The set of all polynomials of bounded degree $\leq d$ with real coefficients.
4. The set of all polynomials $\mathbb R[x]$ without any control over the degree.
5. The set $C([0,1])$ of all continuous functions on the segment $[0,1]$.

Warning. The last two examples are special: the corresponding spaces are not finite-dimensional (we did not have time to discuss what the dimension of a linear space is in general…)

Let $V,Z$ be two (different or identical) vector spaces and let $f:V\to Z$ be a function (map) between them.
Definition. The map $f$ is linear, if it preserves the operations on vectors, i.e., $\forall v,w\in V,\ \alpha\in\mathbb R,\quad f(v+w)=f(v)+f(w),\ f(\alpha v)=\alpha f(v)$.

Sometimes we will use the notation $V\overset f\longrightarrow Z$.

Obvious properties of linearity.

• $f(0)=0$ (Note: the two zeros may lie in different spaces!)
• For any two given spaces $V,W$ the linear maps between them can be added and multiplied by constants in a natural way! If $V\overset {f,g}\longrightarrow W$, then we define $(f+g)(v)=f(v)+g(v)$ for any $v\in V$ (define $\alpha f$ yourselves). The result will be again a linear map between the same spaces.
• If $V\overset f\longrightarrow W$ and $W\overset g\longrightarrow Z$, then the composition $g\circ f:V\overset f\longrightarrow W\overset g\longrightarrow Z$ is well defined and again linear.

Examples.

1. Any linear map $\mathbb R^1\overset f\longrightarrow \mathbb R^1$ has the form $x\mapsto ax, \ a\in\mathbb R$ (do you understand why the notations $\mathbb R, \mathbb R^1$ are used?)
2. Any linear map $\mathbb R^n\overset f\longrightarrow \mathbb R^1$ has the form $(x_1,\dots,x_n)\mapsto a_1x_1+\cdots+a_nx_n$ for some numbers $a_1,\dots,a_n$. Argue that all such maps form a linear space isomorphic to $\mathbb R^n$ back again.
3. Explain how linear maps from $\mathbb R^n$ to $\mathbb R^m$ can be recorded using $m\times n$ matrices ($m$ rows, $n$ columns, if vectors are written as columns). How is the composition of linear maps related to the multiplication of matrices?
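The third example can be tried out numerically before it is proved. The following is only a minimal sketch: the storage convention (a map $\mathbb R^n\to\mathbb R^m$ is a list of $m$ rows with $n$ coefficients each), the helper names `apply` and `matmul`, and the particular matrices `F`, `G` are all assumptions made for illustration.

```python
# Assumed convention: a linear map R^n -> R^m is stored as a list of m rows,
# each row holding n coefficients; vectors are plain lists of numbers.

def apply(matrix, vector):
    """Apply the linear map recorded by `matrix` to `vector`."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def matmul(g, f):
    """Matrix product g*f, the candidate matrix of the composition g∘f."""
    columns = list(zip(*f))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in g]

F = [[1, 2, 0],
     [0, 1, -1]]        # f: R^3 -> R^2
G = [[2, 0],
     [1, 1],
     [0, 3]]            # g: R^2 -> R^3

v = [1, 2, 3]
composed = apply(G, apply(F, v))       # g(f(v)), computed step by step
via_product = apply(matmul(G, F), v)   # (G*F) v, computed with one matrix
print(composed, via_product)
```

Both results coincide for this vector; checking a few vectors is of course not a proof, but it suggests which statement about $g\circ f$ and the product of matrices should be proved.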

The first example shows that linear maps of $\mathbb R^1$ to itself are “labeled” by real numbers (“multiplicators”). Composition of linear maps corresponds to multiplication of the corresponding multiplicators (whence the name). A linear 1-dim map is invertible if and only if the multiplicator is nonzero.

Corollary. Invertible linear maps $\mathbb R^1\to\mathbb R^1$ constitute a commutative group (by composition) isomorphic to the multiplicative group $\mathbb R^*=\mathbb R\smallsetminus \{0\}$.

## Shifts

Maps of the form $V\to V, \ v\mapsto v+h$ for a fixed vector $h\in V$ (the source and target coincide!) are called shifts (a.k.a. translations). Warning: The shifts are not linear unless $h=0$! Composition of two shifts is again a shift.

Exercise.
Prove that all translations form a commutative group (by composition) isomorphic to the space $V$ itself. (Hint: this is a tautological statement).

## Affine maps

Definition.
A map $f:V\to W$ between two vector spaces is called affine, if it is a composition of a linear map and translations.

Example.
Any affine map $\mathbb R^1\to\mathbb R^1$ has the form $x\mapsto ax+b$ for some $a,b\in\mathbb R$. Sometimes it is more convenient to write the map in the form $x\mapsto a(x-c)+b$: this is possible for any point $c\in\mathbb R^1$. Note that the composition of affine maps in dimension 1 is not commutative anymore.
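The loss of commutativity is easy to see on the simplest pair, a linear map and a translation. The sketch below uses hypothetical helpers `affine` and `compose` and the particular maps $x\mapsto 2x$ and $x\mapsto x+1$, chosen only for illustration.

```python
def affine(a, b):
    """Return the affine map x -> a*x + b of the real line."""
    return lambda x: a * x + b

def compose(g, f):
    """Return the composition g∘f, i.e. x -> g(f(x))."""
    return lambda x: g(f(x))

stretch = affine(2, 0)   # linear map x -> 2x
shift = affine(1, 1)     # translation x -> x + 1

stretch_after_shift = compose(stretch, shift)   # x -> 2(x + 1) = 2x + 2
shift_after_stretch = compose(shift, stretch)   # x -> 2x + 1

for x in (0, 1, 5):
    print(x, stretch_after_shift(x), shift_after_stretch(x))
```

The two compositions have the same slope but different constant terms, so they differ at every point.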

Key computation. Assume you are given a map $f:\mathbb R^1\to\mathbb R^1$ in the sense that you can evaluate it at any point of $\mathbb R^1$. Suppose an oracle tells you that this map is affine. Fix a point $c\in\mathbb R^1$. How can you restore the explicit formula $f(x)=a(x-c)+b$ for $f$?

Obviously, $b=f(c)$. To find $\displaystyle a=\frac{f(x)-b}{x-c}$, we have to plug into it any point $x\ne c$ and the corresponding value $f(x)$. Given that $b=f(c)$, we have $\displaystyle a=\frac{f(x)-f(c)}{x-c}$ for any choice of $x\ne c$.
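The recipe can be sketched in a few lines. The function `recover_affine` and the black box `oracle` are hypothetical names introduced here; exact rational arithmetic (`Fraction`) is used so that the recovered slope does not suffer from rounding.

```python
from fractions import Fraction

def recover_affine(f, c, x):
    """Recover (a, b) with f(t) = a*(t - c) + b, assuming f is affine and x != c."""
    if x == c:
        raise ValueError("the second sample point must differ from c")
    b = f(c)
    a = (f(x) - b) / (x - c)
    return a, b

def oracle(t):
    """A hypothetical black box: we only evaluate it, never inspect it."""
    return 3 * t - 7

c = Fraction(2)
a, b = recover_affine(oracle, c, Fraction(5))
a_again, _ = recover_affine(oracle, c, Fraction(-1, 3))
print(a, b, a_again)
```

Two different choices of the auxiliary point $x$ give the same slope, as they must for an affine map.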

The expression $a_c(x)=\displaystyle \frac{f(x)-f(c)}{x-c}$ for a non-affine function $f$ is in general not constant and depends on the choice of the point $x$.

Definition. A function $f:\mathbb R^1\to\mathbb R^1$ is called differentiable at the point $c$, if the above expression for $a_c(x)$, albeit non-constant, has a limit as $x\to c$, that is, $a_c(x)=a+s_c(x)$, where $s_c(x)$ is a function which tends to zero as $x\to c$. The number $a$ is called the derivative of $f$ at the point $c$ and denoted by $f'(c)$ (and also by half a dozen other symbols: $\frac{df}{dx}(c),Df(c), D_xf(c), f_x(c)$, …).

Existence of the limit means that near the point $c$ the function $f$ admits a reasonable approximation by an affine function $\ell(x)=a(x-c)+b$: $f(x)=\ell(x)+s_c(x)(x-c)$, i.e., the “non-affine part” $s_c(x)\cdot (x-c)$ is small not just by itself, but also relative to the small difference $x-c$.
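To see the two kinds of smallness side by side, one can tabulate $s_c(x)$ and the non-affine part for a concrete function. The choice $f(x)=x^2$, $c=1$, the candidate value $a=2$ and the sample points $x=1+10^{-k}$ are assumptions of this sketch; exact fractions avoid rounding noise.

```python
from fractions import Fraction

def difference_quotient(f, c, x):
    """The slope a_c(x) = (f(x) - f(c)) / (x - c), defined for x != c."""
    return (f(x) - f(c)) / (x - c)

def square(x):
    return x * x

c = Fraction(1)
a = 2  # candidate derivative of x^2 at c = 1

rows = []
for k in range(1, 6):
    x = c + Fraction(1, 10**k)
    s = difference_quotient(square, c, x) - a            # s_c(x)
    remainder = square(x) - (a * (x - c) + square(c))    # f(x) - l(x)
    rows.append((s, remainder))
    print(k, s, remainder)
```

In this example $s_c(x)$ equals $x-c$ itself, while the non-affine part equals $(x-c)^2$, which shrinks much faster than $x-c$.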

# Differentiability and algebraic operations

See the notes and their earlier version.

The only non-obvious point is differentiability of the product: the product (unlike the composition) of affine functions is not affine anymore, but is immediately differentiable:

$[b+a(x-c)]\cdot[q+p(x-c)]=bq+(aq+bp)(x-c)+ap(x-c)^2$, but the quadratic term is vanishing relative to $x-c$, so the entire sum is differentiable.
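The expansion of the product, with constant term $bq$ and linear coefficient $aq+bp$, can be checked on numbers. The values of $a,b,p,q,c$ below are arbitrary choices made for this sketch, and `product`, `expanded` are hypothetical helper names.

```python
from fractions import Fraction

a, b = Fraction(3), Fraction(-2)     # first factor:  b + a(x - c)
p, q = Fraction(1, 2), Fraction(5)   # second factor: q + p(x - c)
c = Fraction(1)

def product(x):
    """The product of the two affine functions, computed directly."""
    return (b + a * (x - c)) * (q + p * (x - c))

def expanded(x):
    """The same product, written as a polynomial in h = x - c."""
    h = x - c
    return b * q + (a * q + b * p) * h + a * p * h**2

gaps = []
for k in range(1, 5):
    x = c + Fraction(1, 10**k)
    slope = (product(x) - product(c)) / (x - c)   # difference quotient at c
    gaps.append(slope - (a * q + b * p))          # distance to aq + bp
    print(k, product(x) == expanded(x), gaps[-1])
```

The gap between the difference quotient and $aq+bp$ is exactly $ap\,(x-c)$, so it tends to zero together with $x-c$.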

Exercise. Derive the Leibniz rule for the derivative of the product.

# Derivative and the local study of functions

Affine functions have no (strong) maxima or minima, unless restricted to finite segments. Yet absence of the extremum is a strong property which descends from the affine approximation to the original function. Details here and here.
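A small numerical illustration of this descent, under assumptions made only for this sketch: take $f(x)=x^3-x$ and $c=1$, where the derivative is $3c^2-1=2\ne0$. The affine approximation is strictly increasing, and the values of $f$ on both sides of $c$ follow the same pattern, so $c$ is neither a maximum nor a minimum.

```python
from fractions import Fraction

def f(x):
    """Sample function for this sketch: f(x) = x^3 - x."""
    return x**3 - x

c = Fraction(1)   # at this point the derivative of f equals 2, which is nonzero

checks = []
for k in range(1, 6):
    h = Fraction(1, 10**k)
    # Values slightly to the left are below f(c), slightly to the right above it.
    checks.append(f(c - h) < f(c) < f(c + h))
print(checks)
```

Every entry of `checks` is `True` for these sample points, which is what the sign of the derivative predicts for all sufficiently small $h$.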