# Sergei Yakovenko's blog: on Math and Teaching

## Real numbers as solutions to infinite systems of equalities

In the past we already extended our number system by adding “missing” elements which are assumed to satisfy certain equations, based only on knowing what these equations are. It turns out that we may extend the set of rational numbers $\mathbb Q$ to a much larger set of real numbers $\mathbb R$ by adding solutions to (infinite systems of) inequalities. As before, the properties of these new numbers can be derived only from the properties of inequalities between the rational numbers.

In a nutshell, the idea can be explained as follows. Since for any two rational numbers $r,s\in\mathbb Q$ one and only one relation out of three is possible, $r<s$, $r=s$ or $r>s$, we can uniquely define any, say, positive rational unknown number $x$ by looking at the two sets, $L=\{l\in\mathbb Q: 0\le l\le x\}$ and $R=\{r\in\mathbb Q: x\le r\}$. (You don’t have to be too smart at this moment: $x$ is the only element in the intersection $L\cap R$ 😉)

However, sometimes the analogous construction leads to problems. For instance, if $L=\{l\in\mathbb Q: l\ge 0, l^2\le 2\}$ and $R=\{r\in\mathbb Q: r\ge 0, r^2\ge 2\}$, then $L\cap R=\varnothing$, since the square root of two is not a rational number, but $L\cup R=\mathbb Q_+$, i.e., for any positive rational number we can say whether it is smaller or larger than the missing number $\sqrt 2$. This allows us to derive all properties of $\sqrt 2$, including its approximation with any number of digits.
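To see how the two sets alone pin down $\sqrt 2$, here is a minimal Python sketch. It uses only comparisons between rational numbers (via the standard `fractions` module); the helper names `in_lower_set` and `approximate_cut` are invented for this illustration.

```python
from fractions import Fraction

def in_lower_set(q: Fraction) -> bool:
    """True if the positive rational q belongs to L, i.e. q*q <= 2."""
    return q * q <= 2

def approximate_cut(steps: int) -> tuple[Fraction, Fraction]:
    """Bisect [1, 2] using rational comparisons only; return (l, r) with l in L, r in R."""
    lo, hi = Fraction(1), Fraction(2)   # 1 lies in L, 2 lies in R
    for _ in range(steps):
        mid = (lo + hi) / 2
        if in_lower_set(mid):
            lo = mid                    # mid is still to the left of the missing number
        else:
            hi = mid                    # mid is to the right of it
    return lo, hi

l, r = approximate_cut(40)
print(float(l), float(r))               # two rationals squeezing the missing number
```

Each step halves the gap between an element of $L$ and an element of $R$, so the number of correct digits grows without ever leaving $\mathbb Q$.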

Proceeding this way, we introduce (positive) real numbers by indicating their position relative to all rational numbers. This allows us to describe the real numbers completely.

The details can be found here.

## A didactic digression

Some of you complained about the insufficient number of problems discussed during the tutorials. Everybody knows that problems and questions for self-control are the most important elements of studying mathematics, especially in comparison with other disciplines. The rationale behind this is the assumption that a student who understands the subject should be able to answer these questions immediately or after some reflection. Composing such problems is an easy thing: in any mathematical argument you can stop for a second and ask yourself: “why can I do as explained?” or “under what conditions are my actions justified?”. In the lecture notes (see the link above) tens of such problems are explicitly formulated. Similar problems will await you on the exam.

However, remember one simple thing. If you already know how to solve a problem, this is not a problem but rather a job. Unless you solve these problems yourselves, there is no sense in memorizing their solutions: knowing the solution of one such problem won’t help you with solving another problem unless you really understand what’s going on. There are no “typical problems”: each one of them is of its own sort, though, of course, some problems can be solved by similar methods.

A piece of practical advice: you should not expect that all problems that appear on the exam will be discussed at length at the tutorials. There are no ready recipes to memorize. The only way is to understand honestly. Believe me, this is easier than memorizing endless formulas and algorithms by heart.

## Numbers

The basic set theory allows us to construct a set $\mathbb N=\{|,||,|||,||||,|||||,\dots\}$ with a function “next”, denoted by $\mathrm{Succ}:\mathbb N\to\mathbb N\smallsetminus\{|\}$, which is bijective. This set describes the process of counting objects and is the most basic structure. Starting from a distinguished element denoted by 1, we construct an infinite number of elements $2=\mathrm{Succ}(1),\ 3=\mathrm{Succ}(2),\ 4=\mathrm{Succ}(3)$ etc. There are two axioms guaranteeing that the set $\mathbb N$ indeed coincides with what we call the set of natural numbers:

1. $\forall x\in\mathbb N\ \mathrm{Succ}(x)\ne 1$
2. Any element $x\in\mathbb N$ is obtained by the iteration of $\mathrm{Succ}$: $x=(\mathrm{Succ}\circ\cdots\circ\mathrm{Succ})(1)$.

Using this function and its partial inverse one can define on $\mathbb N$ the order and the operations of addition (as repeated addition of 1 which is just evaluation of $\mathrm{Succ}$) and multiplication (repeated addition).
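The construction can be imitated in a few lines of Python. This is only a sketch: natural numbers are modelled by Python integers starting from 1, and the names `succ`, `add` and `mul` are chosen here for illustration.

```python
def succ(n: int) -> int:
    """The 'next' function, the only primitive operation we allow ourselves."""
    return n + 1

def add(a: int, b: int) -> int:
    """a + b obtained by applying succ to a exactly b times (b >= 1)."""
    result = a
    for _ in range(b):
        result = succ(result)
    return result

def mul(a: int, b: int) -> int:
    """a * b obtained by adding a to itself, b summands in total (b >= 1)."""
    result = a
    for _ in range(b - 1):
        result = add(result, a)
    return result

print(add(2, 3), mul(3, 4))
```

The loops themselves count with built-in integers, so this illustrates the definitions rather than constructing $\mathbb N$ from scratch.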

However, not all equations of the form $x+a=b$ or $x\cdot a=b$ are solvable. One can enlarge $\mathbb N$ by adding solutions of all such equations, obtaining the set of integer numbers $\mathbb Z$ which is a commutative group with respect to the operation of addition, and finally the set of rational numbers $\mathbb Q$ in which division is available by any nonzero number.

Division by zero is impossible: if we add “solution of the equation $0\cdot x=1$” as a new imaginary element, then we will not be able to do some arithmetic operations on it. Still, if we are ready to pay this price, then the rational numbers can be extended by a new element so that, say, the function $f(x)=1/x$ would be everywhere defined and continuous.

Details are available in the lecture notes here.

# Hello, first grade!

The main feature that distinguishes the Calculus (or Mathematical Analysis) from other branches of mathematics is the repeated use of infinite constructions and processes. Without infinity even the simplest things, like the decimal representation of the simple fraction $\frac13=0.333333\dots$, become problematic.

Yet to deal with infinity and infinite constructions, we need to make precise our language, based on the notions of sets and functions (maps, applications, – all these words are synonymous).

Look at the first section of the lecture notes here.

You are most welcome to start discussions in the comments to this (or any other) post. Don’t be afraid of asking questions that may look stupid: this never harms! Write in any language (besides Hebrew/English, I hope that Ghadeer will take care of questions in Arabic, and I promise to deal with French/Spanish/Catalan/Italian/Russian/Ukrainian questions) 😉 Subscribe for updates on this site with your usual emails, to be independent from any dependence 😉

Looking forward to a mutually beneficial interaction in the new semester!

## Monday, February 1, 2016

### Finally, exam!

Filed under: lecture,Rothschild course "Analysis for high school teachers" — Sergei Yakovenko @ 3:41

# Exam

The exam is posted online on Feb 1, 2016, and must be submitted on the last day of the exams’ period, February 26. Its goals are, besides testing your acquired skills in the Analysis, to teach you a few extra things and see your ability for logical reasoning, not your proficiency in performing long computations. If you find yourself involved in heavy computations, better double check whether you understand the formulation of the problem correctly. Remember, small details sometimes matter!

Please provide argumentation, better in the form of logical formulas, not forgetting explicit or implicit quantifiers $\forall$ and $\exists$. They really may change the meaning of what you write!

Problems are often subdivided into items. The order of these items is not accidental, try to solve them from the first till the last, and not in a random order (solution of one item may be a building block for the next one).

To get the maximal grade, it is not necessary to solve all problems, but it is imperative not to write stupid things. Please don’t try to shoot in the air.

The English version is the authoritative source, but if somebody translates it into Hebrew (for the sake of the rest of you) and sends me the translation, I will post it for your convenience, but responsibility will be largely with the translator.

If you believe you found an error or crucial omission in the formulation of a problem, please write me. If this will be indeed the case (errare humanum est), the problem will be either edited (in case of minor omissions) or cancelled (on my account).

That’s all, folks!© Good luck to everybody!

Yes, and feel free to leave your questions/talkbacks here, whether addressed to Michal/Boaz/me or to yourself, if you feel you want to ask a relevant question.

# Corrections

## Correction 1

The formulation of Problem 1 was indeed incorrect. The set $A'$ was intended to be the set of accumulation points for a set $A\subseteq [0,1]$. The formal definition is as follows.

Definition. A point $p\in [0,1]$ belongs to the set of limit points $A'$ if and only if $\forall\varepsilon>0$ the intersection $(p-\varepsilon,p+\varepsilon)\cap A$ is infinite. The point $p$ itself may or may not be in $A$.

Isolated points of $A$ are never in $A'$, but $A'$ may contain points $p\notin A$.

Apologies for the hasty formulation.

## Correction 2: Problem 3(b) cancelled!

The statement that Problem 3(b) asked you to prove is wrong, and I am impressed by how fast you discovered that. Actually, the problem was taken from the textbook by Zorich, vol. 1, where it appears on p. 169, sec. 4.2.3, as Problem 4.

The assertion about existence of the common fixed point of two commuting continuous functions $f,g\colon [0,1]\to[0,1]$ becomes true if we require these functions to be continuously differentiable on $[0,1]$ (in particular, for polynomials), but the proof of this fact is too difficult to be suggested as a problem for the exam.

Thus Problem 3(b) is cancelled.

## Tuesday, January 26, 2016

### Lecture 13, Jan 26, 2016

Questions concerned integrability of discontinuous functions, notions of improper integrals (how and when they can be defined), topological properties (equivalent definitions of compactness, connectedness etc.)

Here are some textbooks that I recommend for preparing when working on the exam. Keep them on your virtual bookshelf: they cover much more than I explained in the course, but who knows what questions related to analysis you might have.

1. V. Zorich, vol. 1: Chapters 1-6, pp.1-371.
2. V. Zorich, vol. 2: Parts of Chapter 9 (continuous maps) and the first part of Chapter 18 on Fourier series.
3. W. Rudin: Chapters 1-6, pp.1-165.

The problems for exam will be posted on February 1st (at least the English version).

# Functions of complex variable

The field $\mathbb C$ naturally extends the field $\mathbb R$, which means that all arithmetic operations on $\mathbb R$ extend naturally as operations on $\mathbb C$. In particular, any polynomial $p(x)=a_0x^n+a_1 x^{n-1}+\cdots+a_n$ can be interpreted as a map $p:\mathbb C\to\mathbb C$. Geometrically, this can be visualized as a map of the 2-plane to the 2-plane.

We discussed the maps $p(x)=x^2$ and $p(x)=1/x$.

Smooth functions are those which can be accurately approximated by the $\mathbb R$-affine maps $x\mapsto c+\lambda(x-a)$ near each point $a\in\mathbb R$. Functions that can be accurately approximated by $\mathbb C$-affine maps (the same formula, but over the complex numbers), are called holomorphic (or complex analytic). Such maps are characterized by the property that small circles are mapped into small almost-circles, that is,

1. angles are preserved, and
2. lengths are scaled

Sometimes these local conditions become global. Examples: the affine maps $x\mapsto \lambda (x-a)+b$ send lines to lines and circles to circles. The map $x\mapsto 1/x$ maps circles and lines into circles or lines (depending on whether the circles/lines pass through the origin $x=0$).

## Complex integration

Integration is carried over smooth (or piecewise smooth) paths in $\mathbb C$, using Riemann-like sums. It depends linearly on the function which we integrate, but in contrast with the real case we have much more freedom in choosing the paths.

1. If $f(x)=c$ is a constant, and $\gamma =[p_0,p_1]+[p_1,p_2]+[p_2,p_0]$ is a closed triangle, then the integral is zero as the sum $c(p_1-p_0)+c(p_2-p_1)+c(p_0-p_2)=c\cdot0=0$.
2. If $f(x)$ is non-constant, then the integral identity is valid only for “very small triangles” near a point $a\in\mathbb C$ with $c=f(a)$.
3. This implies that $\displaystyle \int_\gamma f(x)\,\mathrm dx=\int_{\gamma'} f(x)\,\mathrm dx$ as long as the paths $\gamma,\gamma'$ share the common endpoints and can be continuously deformed one into the other.

Integrals over closed loops are zero, unless there are singular points (where the function is non-holomorphic) inside.

Example: $f(x)=\frac 1x$ is non-holomorphic at $x=0$, and $\displaystyle \oint_{|x|=1}\tfrac 1x\,\mathrm dx=2\pi i$.
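The statements above are easy to test numerically. The sketch below approximates the integral over a circle by a Riemann-like sum over a polygon with `n` vertices; the function name `circle_integral` and the choice of midpoints as sample points are assumptions of this illustration, not part of the lecture.

```python
import cmath

def circle_integral(f, center: complex, radius: float, n: int = 2000) -> complex:
    """Riemann-like sum of f over the circle |x - center| = radius, counterclockwise."""
    total = 0j
    for k in range(n):
        p0 = center + radius * cmath.exp(2j * cmath.pi * k / n)
        p1 = center + radius * cmath.exp(2j * cmath.pi * (k + 1) / n)
        midpoint = center + radius * cmath.exp(2j * cmath.pi * (k + 0.5) / n)
        total += f(midpoint) * (p1 - p0)    # value of f times the small step along the path
    return total

one_over_x = circle_integral(lambda x: 1 / x, 0, 1.0)   # singular point inside the loop
constant = circle_integral(lambda x: 3 + 2j, 0, 1.0)    # constant function
square = circle_integral(lambda x: x * x, 0, 1.0)       # holomorphic everywhere

print(one_over_x, constant, square)
```

The first value should be close to $2\pi i$, the other two close to zero (up to rounding and discretization errors).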

## Cauchy integral formula

If $f$ is holomorphic inside a domain $U$ bounded by a closed curve $\gamma$ and $a\in U$, then $\displaystyle f(a)=\frac 1{2\pi i}\oint_\gamma \frac{f(x)\,\mathrm dx}{x-a}$.

In other words, the values of $f$ on the boundary uniquely determine its values inside the domain. This is in wild contrast with functions of a real variable!
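Here is a small numerical check of the Cauchy formula, assuming for simplicity that $\gamma$ is the unit circle and $f$ is the exponential function. The name `cauchy_value` and the sample point `a` are hypothetical choices made for this sketch.

```python
import cmath

def cauchy_value(f, a: complex, n: int = 4000) -> complex:
    """Approximate (1 / (2*pi*i)) times the integral of f(x) / (x - a) over |x| = 1."""
    total = 0j
    for k in range(n):
        p0 = cmath.exp(2j * cmath.pi * k / n)
        p1 = cmath.exp(2j * cmath.pi * (k + 1) / n)
        mid = cmath.exp(2j * cmath.pi * (k + 0.5) / n)
        total += f(mid) / (mid - a) * (p1 - p0)   # only boundary values of f are used
    return total / (2j * cmath.pi)

a = 0.3 + 0.2j                       # a point inside the unit disk
approx = cauchy_value(cmath.exp, a)
print(approx, cmath.exp(a))          # the two numbers should nearly coincide
```

Note that `f` is never evaluated inside the disk: the value at `a` is recovered from the boundary alone.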

## Taylor series

As a function of $a$, the expression $\frac1{x-a}$ admits a converging Taylor expansion (in fact, the same old geometric progression series) in powers of $a-a_0$ for any $a_0\ne x$. Thus if we choose $a_0\in U$ off the path $\gamma$, then the series will converge for any $x\in\gamma$ (warning! note that the “variable” and the “parameter” exchanged their roles!!!), the Cauchy integral can be expanded in the converging series of powers $(a-a_0)^n, \ n=0,1,2,\dots$, hence the function $f(x)$ gets expanded in the series $f(a)=c_0+c_1(a-a_0)+c_2(a-a_0)^2+\cdots$.

Conclusion: functions that are $\mathbb C$-differentiable can be expanded in a convergent Taylor series (hence are “polynomials of infinite degree”) and vice versa, “polynomials of infinite degree” are infinitely $\mathbb C$-differentiable. It is a miracle that so many functions around us are actually holomorphic!

# Elementary transcendental functions as solutions to simple differential equations

The way logarithmic, exponential and trigonometric functions are usually introduced is not very satisfactory and appears artificial. For instance, the mere definition of the non-integer power $x^a$, $a\notin\mathbb Z$, is problematic. For $a=1/n,\ n\in\mathbb N$, one can define the value as the root $\sqrt[n]x$, but the choice of branch/sign and the possibility of defining it for negative $x$ is speculative. For instance, the functions $x^{\frac12}$ and $x^{\frac 24}$ may turn out to be different, depending on whether the latter is defined as $\sqrt[4]{x^2}$ (makes sense for negative $x$) or as $(\sqrt[4]x)^2$ which makes sense only for positive $x$. But even if we agree that the domain of $x^a$ should be restricted to positive arguments only, still there is a big question why for two close values $a=\frac12$ and $a=\frac{499}{1000}$ the values, say, $\sqrt 2$ and $\sqrt[1000]{2^{499}}$ should also be close…

The right way to introduce these functions is by looking at the differential equations which they satisfy.

A differential equation (of the first order) is a relation, usually rational, involving the unknown function $y(x)$, its derivative $y'(x)$ and some known rational functions of the independent variable $x$. If the relation involves higher derivatives, we speak of higher order differential equations. One can also consider systems of differential equations, involving several relations between several unknown functions and their derivatives.

Example. Any relation of the form $P(x, y)=0$ implicitly defines $y$ as a function of $x$ and can be considered as a trivial equation of order zero.

Example. The equation $y'=f(x)$ with a known function $f$ is a very simple differential equation. If $f$ is integrable (say, continuous), then its solution is given by the integral with variable upper limit, $\displaystyle y(x)=\int_p^x f(t)\,\mathrm dt$ for any meaningful choice of the lower limit $p$. Any two solutions differ by a constant.

Example. The equation $y'=a(x)y$ with a known function $a(x)$. Even in the case where $a(x)=a$ is a nonzero constant, there is no, say, polynomial solution to this equation (why?), except for the trivial one $y(x)\equiv0$. This equation is linear: together with any two solutions $y_1(x),y_2(x)$ and any constant $\lambda$, the functions $\lambda y_1(x)$ and $y_1(x)\pm y_2(x)$ are also solutions.

Example. The equation $y'=y^2$ has a family of solutions $\displaystyle y(x)=-\frac1{x-c}$ for any choice of the constant $c\in\mathbb R$ (check it!). However, any such solution “explodes” at the point $x=c$, while the equation itself has no special “misbehavior” at this point (in fact, the equation does not depend on $x$ at all).

## Logarithm

The transcendental function $y(x)=\ln x$ satisfies the differential equation $y'=x^{-1}$: this is the only case of the equation $y'=x^n,\ n\in\mathbb Z$, which has no rational solution. In fact, all properties of the logarithm follow from the fact that it satisfies the above equation and the constant of integration is chosen so that $y(1)=0$. In other words, we show that the function defined as the integral $\displaystyle \ell(x)=\int_1^x \frac1t\,\mathrm dt$ possesses all the properties we want. We show that:

1. $\ell(x)$ is defined for all $x>0$, is monotone growing from $-\infty$ to $+\infty$ as $x$ varies from $0$ to $+\infty$.
2. $\ell(x)$ is infinitely differentiable, concave.
3. $\ell$ transforms the operation of multiplication (of positive numbers) into the addition: $\ell(\lambda x)=\ell(\lambda)+\ell(x)$ for any $x,\lambda>0$.
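These properties can be observed numerically before they are proved. In the sketch below $\ell$ is approximated by the midpoint rule; the function name `ell` and the number of subintervals are assumptions of this illustration.

```python
import math

def ell(x: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of 1/t from 1 to x, for x > 0."""
    h = (x - 1.0) / n          # negative when x < 1, so the sign comes out right
    return sum(h / (1.0 + (k + 0.5) * h) for k in range(n))

print(ell(2.0) + ell(3.0), ell(6.0))   # multiplication is turned into addition
print(ell(2.0), math.log(2.0))         # comparison with the library logarithm
print(ell(0.5))                        # negative for arguments below 1
```

The comparison with `math.log` is only a sanity check: the point of the construction is that nothing but the integral of $\frac1t$ is needed.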

## Exponent

The above listed properties of the logarithm ensure that there is an inverse function, denoted provisionally by $E(x)$, which is inverse to $\ell:\ \ell(E(x))=x$. This function is defined for all real $x\in\mathbb R$, takes positive values and transforms the addition to the multiplication: $E(\lambda+x)=E(\lambda)\cdot E(x)$. Denoting the value $E(1)$ by $e$, we conclude that $E(n)=e^n$ for all $n\in\mathbb Z$, and $E(x)=e^x$ for all rational values $x=\frac pq$. Thus the function $E(x)$, defined as the inverse to $\ell$, gives an interpolation of the exponent for all real arguments. A simple calculation shows that $E(x)$ satisfies the differential equation $y'=y$ with the initial condition $y(0)=1$.

## Computation

Consider the integral operator $\Phi$ which sends any (continuous) function $f:\mathbb R\to\mathbb R$ to the function $g=\Phi(f)$ defined by the formula $\displaystyle g(x)=f(0)+\int_0^x f(t)\,\mathrm dt$. Applying this operator to the function $E(x)$ and using the differential equation, we see that $E$ is a “fixed point” of the transformation $\Phi$: $\Phi(E)=E$. This suggests using the following approach to compute the function $E$: choose a function $f_0$ and build the sequence of functions $f_n=\Phi(f_{n-1})$, $n=1,2,3,4,\dots$. If there exists a limit $f_*=\lim f_{n+1}=\lim \Phi(f_n)=\Phi(f_*)$, then this limit is a fixed point for $\Phi$.

Note that the action of $\Phi$ can be very easily calculated on the monomials: $\displaystyle \Phi\biggl(\frac{x^k}{k!}\biggr)=\frac{x^{k+1}}{(k+1)!}$ for $k\ge1$, while $\Phi(1)=1+x$ (check it!). Therefore if we start with $f_0(x)=1$, we obtain the functions $f_n=1+x+\frac12 x^2+\cdots+\frac1{n!}x^n$. This sequence converges to the sum of the infinite series $\displaystyle\sum_{n=0}^\infty\frac1{n!}x^n$ which represents the solution $E(x)$ on the entire real line (check that). This series can be used for a fast approximate calculation of the number $e=E(1)=\sum_0^\infty \frac1{n!}$.
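The iteration is simple enough to run by machine. In the following sketch a polynomial is stored as the list of its coefficients, and exact rational arithmetic is used; the name `phi` and the choice of 15 iterations are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial

def phi(coeffs: list[Fraction]) -> list[Fraction]:
    """Apply Phi to the polynomial with coefficients [c0, c1, ...] of 1, x, x^2, ..."""
    integral = [c / (k + 1) for k, c in enumerate(coeffs)]   # coefficients of x^(k+1)
    return [coeffs[0]] + integral                            # f(0) plus the integral from 0 to x

f = [Fraction(1)]            # f_0(x) = 1
for _ in range(15):
    f = phi(f)               # f_n = Phi(f_{n-1})

e_approx = sum(f)            # the value f_15(1)
print(float(e_approx))       # an approximation of the number e
```

After 15 steps the coefficients are exactly $\frac1{k!}$ for $k=0,\dots,15$, and the value at $x=1$ already agrees with $e$ to about 13 decimal digits.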

## Differential equations in the complex domain

The function $E(ix)=e^{ix}$ satisfies the differential equation $y'=\mathrm iy$. The corresponding “motion on the complex plane”, $x\mapsto e^{\mathrm ix}$, is rotation along the (unit) circle with the unit (absolute) speed, hence the real and imaginary parts of $e^{\mathrm ix}$ are cosine and sine respectively. In fact, the “right” definition of them is exactly like that,

$\displaystyle \cos x=\textrm{Re}\,e^{\mathrm ix},\quad \sin x=\textrm{Im}\,e^{\mathrm ix} \iff e^{\mathrm ix}=\cos x+\mathrm i\sin x,\qquad x\in\mathbb R$.

Thus, the Euler formula “cis” in fact is the definition of sine and cosine. Of course, it can be “proved” by substituting the imaginary value into the Taylor series for the exponent, collecting the real and imaginary parts and comparing them with the Taylor series for the sine and cosine.
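The substitution of an imaginary value into the series can also be tried numerically. This is a sketch: the function name `exp_series`, the number of terms and the sample value of `x` are arbitrary choices made here.

```python
import math

def exp_series(z: complex, terms: int = 40) -> complex:
    """Partial sum of the series of z^n / n! with the given number of terms."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)      # passes from z^n / n! to z^(n+1) / (n+1)!
    return total

x = 1.2
value = exp_series(1j * x)
print(value.real, math.cos(x))   # real part against the cosine
print(value.imag, math.sin(x))   # imaginary part against the sine
print(abs(value))                # the point stays on the unit circle
```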

In fact, both sine and cosine are in turn solutions of a real differential equation: differentiating the equation $y'=\mathrm iy$, one concludes that $y''=\mathrm i^2y=-y$. It can be used to calculate the Taylor coefficients for sine and cosine.

For more details see the lecture notes.

Not completely covered in the class: solution of linear equations with constant coefficients and resonances.

# Integral and antiderivative

1. Area under the graph as a paradigm
2. Definitions (upper and lower sums, integrability).
3. Integrability of continuous functions.
4. Newton-Leibniz formula: integral and antiderivative.
5. Elementary rules of antiderivation (linearity, anti-Leibniz rule of “integration by parts”).
6. Anti-chain rule, change of variables in the integral and its geometric meaning.
7. Riemann–Stieltjes integral and change of variables in it.
8. Integrability of discontinuous functions.
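As an illustration of the upper and lower sums from item 2, here is a minimal sketch for a non-decreasing function, where the minimum and maximum on each subinterval are attained at its endpoints. The name `darboux_sums` and the test function $x^2$ on $[0,1]$ are choices made for this example.

```python
def darboux_sums(f, a: float, b: float, n: int) -> tuple[float, float]:
    """Lower and upper sums of a non-decreasing f on [a, b] over n equal subintervals."""
    h = (b - a) / n
    lower = sum(f(a + k * h) * h for k in range(n))          # minima at left endpoints
    upper = sum(f(a + (k + 1) * h) * h for k in range(n))    # maxima at right endpoints
    return lower, upper

lower, upper = darboux_sums(lambda x: x * x, 0.0, 1.0, 1000)
print(lower, upper)      # the area 1/3 is squeezed between the two sums
```

For a monotone function the gap between the two sums is exactly $(f(b)-f(a))\cdot h$, which tends to zero with the step $h$: this is why monotone functions are integrable.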

Not covered in the class: Lebesgue theorem and motivations for transition from Riemann to the Lebesgue integral.

The sketchy notes are available here.

# Higher derivatives and better approximation

We discussed a few issues:

• Lagrange interpolation formula: how to estimate the difference $f(b)-f(a)$ through the derivative $f'$?
• Consequence: vanishing of several derivatives at a point means that a function has a “root of high order” at this point (with an explanation of what that means).
• Taylor formula for polynomials: if you know all derivatives of a polynomial at some point, then you know it everywhere.
• Peano formula for $C^n$-smooth functions: approximation by the Taylor polynomial with asymptotic bound for the error.
• Lagrange formula: explicit estimate for the error.
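The explicit error estimate can be watched at work on the sine function, all of whose derivatives are bounded by 1 in absolute value. The sketch below is only an illustration; the names `sin_taylor` and `lagrange_bound` and the sample values of `x` and `n` are assumptions.

```python
import math

def sin_taylor(x: float, degree: int) -> float:
    """Taylor polynomial of sin at 0, including all terms of degree <= `degree`."""
    total = 0.0
    for k in range(degree + 1):
        if k % 2 == 1:                              # only odd powers appear
            sign = -1.0 if k % 4 == 3 else 1.0
            total += sign * x ** k / math.factorial(k)
    return total

def lagrange_bound(x: float, degree: int) -> float:
    """The bound |x|^(n+1) / (n+1)! for the error, using |sin^(n+1)| <= 1."""
    return abs(x) ** (degree + 1) / math.factorial(degree + 1)

x, n = 0.7, 5
error = abs(math.sin(x) - sin_taylor(x, n))
print(error, lagrange_bound(x, n))    # the actual error stays below the bound
```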

The notes (updated) are available here.

# Differentiability and derivative

Continuity of functions (and maps) means that they can be nicely approximated by constant functions (maps) in a sufficiently small neighborhood of each point. Yet the constant maps (easy to understand as they are) are not the only “simple” maps.

## Linear maps

Linear maps naturally live on vector spaces, sets equipped with a special structure. Recall that $\mathbb R$ is algebraically a field: real numbers can be added, subtracted and multiplied between themselves, and the ratio $\alpha/\beta$ is well defined for $\beta\ne0$.

Definition. A set $V$ is said to be a vector space (over $\mathbb R$), if the operations of addition/subtraction $V\owns u,v\mapsto u\pm v$ and multiplication by constant $V\owns v,\ \mathbb R\owns \alpha\mapsto \alpha v$ are defined on it and obey the obvious rules of commutativity, associativity and distributivity. Some people prefer to call vector spaces linear spaces: the two terms are identical.

Warning. There is no “natural” multiplication $V\times V\to V$!

Examples.

1. The field $\mathbb R$ itself. If we want to stress that it is considered as a vector space, we write $\mathbb R^1$.
2. The set of tuples $\mathbb R^n=\{(x_1,\dots,x_n):\ x_i\in\mathbb R\}$ is the Euclidean $n$-space. For $n=2,3$ it can be identified with the “geometric” plane and space, using coordinates.
3. The set of all polynomials of bounded degree $\leq d$ with real coefficients.
4. The set of all polynomials $\mathbb R[x]$ without any control over the degree.
5. The set $C([0,1])$ of all continuous functions on the segment $[0,1]$.

Warning. The last two examples are special: the corresponding spaces are not finite-dimensional (we did not have time to discuss what the dimension of a linear space is in general…)

Let $V,W$ be two (different or identical) vector spaces and $f:V\to W$ a function (map) between them.
Definition. The map $f$ is linear, if it preserves the operations on vectors, i.e., $\forall v,w\in V,\ \alpha\in\mathbb R,\quad f(v+w)=f(v)+f(w),\ f(\alpha v)=\alpha f(v)$.

Sometimes we will use the notation $V\overset f\longrightarrow W$.

Obvious properties of linearity.

• $f(0)=0$ (Note: the two zeros may lie in different spaces!)
• For any two given spaces $V,W$ the linear maps between them can be added and multiplied by constants in a natural way! If $V\overset {f,g}\longrightarrow W$, then we define $(f+g)(v)=f(v)+g(v)$ for any $v\in V$ (define $\alpha f$ yourselves). The result will be again a linear map between the same spaces.
• If $V\overset f\longrightarrow W$ and $W\overset g\longrightarrow Z$, then the composition $g\circ f:V\overset f\longrightarrow W\overset g\longrightarrow Z$ is well defined and again linear.

Examples.

1. Any linear map $\mathbb R^1\overset f\longrightarrow \mathbb R^1$ has the form $x\mapsto ax, \ a\in\mathbb R$ (do you understand why the notations $\mathbb R, \mathbb R^1$ are used?)
2. Any linear map $\mathbb R^n\overset f\longrightarrow \mathbb R^1$ has the form $(x_1,\dots,x_n)\mapsto a_1x_1+\cdots+a_nx_n$ for some numbers $a_1,\dots,a_n$. Argue that all such maps form a linear space isomorphic to $\mathbb R^n$ back again.
3. Explain how linear maps from $\mathbb R^n$ to $\mathbb R^m$ can be recorded using $n\times m$-matrices. How is the composition of linear maps related to the multiplication of matrices?
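For the last question, a numerical experiment may help before you write the argument. The sketch below assumes one particular convention (vectors are columns, each row of the matrix produces one coordinate of the output); the names `apply` and `matmul` and the sample matrices are invented for the illustration.

```python
def apply(matrix, v):
    """Apply the linear map recorded by `matrix` to the vector v (one row per output coordinate)."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def matmul(g, f):
    """Matrix product g*f, where f has as many rows as g has columns."""
    return [[sum(g[i][k] * f[k][j] for k in range(len(f)))
             for j in range(len(f[0]))] for i in range(len(g))]

f = [[1, 2], [0, 1], [3, -1]]     # a map from R^2 to R^3
g = [[2, 0, 1], [1, 1, 0]]        # a map from R^3 to R^2
v = [5, -2]

print(apply(g, apply(f, v)))      # first f, then g
print(apply(matmul(g, f), v))     # the single map recorded by the product matrix
```

Both lines print the same vector, which suggests (but of course does not prove) the general rule.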

The first example shows that linear maps of $\mathbb R^1$ to itself are “labeled” by real numbers (“multiplicators“). Composition of linear maps corresponds to multiplication of the corresponding multiplicators (whence the name). A linear 1-dim map is invertible if and only if the multiplicator is nonzero.

Corollary. Invertible linear maps $\mathbb R^1\to\mathbb R^1$ constitute a commutative group (by composition) isomorphic to the multiplicative group $\mathbb R^*=\mathbb R\smallsetminus \{0\}$.

## Shifts

Maps of the form $V\to V, \ v\mapsto v+h$ for a fixed vector $h\in V$ (the source and the target coincide!) are called shifts (a.k.a. translations). Warning: The shifts are not linear unless $h=0$! Composition of two shifts is again a shift.

Exercise.
Prove that all translations form a commutative group (by composition) isomorphic to the space $V$ itself. (Hint: this is a tautological statement).

## Affine maps

Definition.
A map $f:V\to W$ between two vector spaces is called affine, if it is a composition of a linear map and translations.

Example.
Any affine map $\mathbb R^1\to\mathbb R^1$ has the form $x\mapsto ax+b$ for some $a,b\in\mathbb R$. Sometimes it is more convenient to write the map under the form $x\mapsto a(x-c)+b$: this is possible for any point $c\in\mathbb R^1$. Note that the composition of affine maps in dimension 1 is not commutative anymore.

Key computation. Assume you are given a map $f:\mathbb R^1\to\mathbb R^1$ in the sense that you can evaluate it at any point $c\in\mathbb R^1$. Suppose an oracle tells you that this map is affine. How can you restore the explicit formula $f(x)=a(x-c)+b$ for $f$?

Obviously, $b=f(c)$. To find $\displaystyle a=\frac{f(x)-b}{x-c}$, we have to plug into it any point $x\ne c$ and the corresponding value $f(x)$. Given that $b=f(c)$, we have $\displaystyle a=\frac{f(x)-f(c)}{x-c}$ for any choice of $x\ne c$.

The expression $a_c(x)=\displaystyle \frac{f(x)-f(c)}{x-c}$ for a non-affine function $f$ is in general not-constant and depends on the choice of the point $x$.

Definition. A function $f:\mathbb R^1\to\mathbb R^1$ is called differentiable at the point $c$, if the above expression for $a_c(x)$, albeit non-constant, has a limit as $x\to c:\ a_c(x)=a+s_c(x)$, where $s_c(x)$ is a function which tends to zero. The number $a$ is called the derivative of $f$ at the point $c$ and denoted by $f'(c)$ (and also by half a dozen of other symbols: $\frac{df}{dx}(c),Df(c), D_xf(c), f_x(c)$, …).

Existence of the limit means that near the point $c$ the function $f$ admits a reasonable approximation by an affine function $\ell(x)=a(x-c)+b$: $f(x)=\ell(x)+s_c(x)(x-c)$, i.e., the “non-affine part” $s_c(x)\cdot (x-c)$ is small not just by itself, but also relative to the small difference $x-c$.
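The difference between the affine and the non-affine case is easy to see in a computation. In this sketch the name `slope` stands for the expression $a_c(x)$, and the two sample functions are arbitrary choices.

```python
def slope(f, c: float, x: float) -> float:
    """The expression a_c(x) = (f(x) - f(c)) / (x - c), defined for x != c."""
    return (f(x) - f(c)) / (x - c)

def affine(x: float) -> float:
    return 3.0 * (x - 1.0) + 2.0      # an affine map with a = 3, b = 2, c = 1

def cube(x: float) -> float:
    return x ** 3                     # not affine; its derivative at c = 1 equals 3

steps = (1.0, 0.1, 0.001)
affine_slopes = [slope(affine, 1.0, 1.0 + h) for h in steps]
cube_slopes = [slope(cube, 1.0, 1.0 + h) for h in steps]
print(affine_slopes)    # the same number for every choice of x
print(cube_slopes)      # depends on x, but approaches 3 as x tends to c
```

For the cube the computed values are $3+3h+h^2$, so the correction $s_c(x)$ is visibly a function that tends to zero.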

# Differentiability and algebraic operations

See the notes and their earlier version.

The only non-obvious moment is differentiability of the product: the product (unlike the composition) of affine functions is not affine anymore, but is immediately differentiable:

$[b+a(x-c)]\cdot[q+p(x-c)]=bq+(aq+bp)(x-c)+ap(x-c)^2$, but the quadratic term is vanishing relative to $x-c$, so the entire sum is differentiable.

Exercise. Derive the Leibniz rule for the derivative of the product.

# Derivative and the local study of functions

Affine functions have no (strong) maxima or minima, unless restricted to finite segments. Yet absence of the extremum is a strong property which descends from the affine approximation to the original function. Details here and here.
