Sergei Yakovenko's blog: on Math and Teaching

Tuesday, November 24, 2015

Lecture 5, Nov 24, 2015

The Peano curve: continuity can be counter-intuitive

The Peano curve is obtained as the limit of piecewise-linear continuous (even closed) curves \gamma_n. Denote by K=\{|x|+|y|\le 1\} the square (rotated by \frac\pi4) and by \mathbb Z^2=\{(x,y):x,y\in\mathbb Z\} the integer lattice, the set of intersection points of the grid of horizontal and vertical lines at distance 1 from each other. Then one can construct a family of piecewise-linear continuous curves \gamma_n:[0,1]\to\mathbb R^2, each of which visits all points of the intersection K\cap\frac1{2^n}\mathbb Z^2, in such a way that |\gamma_n(t)-\gamma_{n+1}(t)|<\frac1{2^n} uniformly on t\in[0,1].

This sequence of curves converges uniformly to a function (curve) \gamma_*:[0,1]\to\mathbb R^2 and this curve is closed and continuous for the same reasons that justify continuity of the Koch snowflake curve.

What are the properties of the images C_n=\gamma_n([0,1]) and of the limit curve C_*=\gamma_*([0,1])?

  • Each curve C_n for any finite n is piecewise-linear. It has zero area in the sense that for any \varepsilon > 0 the curve C_n can be covered by a finite union of (open) rectangles with the total area less than \varepsilon;
  • Each curve C_n has finite length (although it grows to infinity as n\to\infty; check it!).
  • The limit curve C_* has no length (that’s the same as saying that it has infinite length). Moreover, unlike many other curves of infinite length (say, the straight line \{y=0\}\subseteq\mathbb R^2), no part \gamma_*([a,b]),\ a<b, of C_* has finite length!
  • The limit curve C_* coincides with the square K, hence fills the area equal to 2.

All these assertions are easy except for the last one. Let’s prove it.

Each image C_n contains the lattice points K\cap \frac1{2^n}\mathbb Z^2. The union of these images is therefore dense in K: by definition, this means that any point P\in K can be approximated by a sequence of points P_n\in C_n which converge to P as n\to\infty. Being in the image \gamma_n([0,1]), each point P_n is the image of some point in [0,1]: \exists a_n\in[0,1]:\ \gamma_n(a_n)=P_n. Such a point may well be non-unique, and in any case we have absolutely no knowledge of how the points a_1,a_2,\dots are distributed over [0,1].

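However, we know that the sequence a_n\in [0,1] must have an accumulation point a_*\in [0,1], which is by definition the limit of some infinite subsequence. (This would not be the case if instead of [0,1] we were dealing with curves defined on the entire real line!) Replacing the sequence by this subsequence, we see that the points P_n still converge to the same limit P. On the other hand, |\gamma_n(a_n)-\gamma_*(a_*)|\le\|\gamma_n-\gamma_*\|+|\gamma_*(a_n)-\gamma_*(a_*)|, and both terms tend to zero: the first by the uniform convergence, the second by the continuity of \gamma_*. Hence P_n=\gamma_n(a_n)\to\gamma_*(a_*), so that \gamma_*(a_*)=P. Thus we proved that an arbitrary point in K lies in the image: P\in C_*.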

Topology: the study of properties preserved by continuous maps (functions, applications, …)

Definition. A neighborhood of a point a\in\mathbb R^n in the Euclidean space is any set of the form \{x:|x-a|<\delta\},\ \delta>0, where |\cdot| is a distance function satisfying the triangle inequality. Examples:

  • |x|=\sqrt{x_1^2+\cdots+x_n^2} (the usual Euclidean distance on the line, on the plane, …) for x=(x_1,\dots,x_n)\in\mathbb R^n;
  • |x|=\max\{|x_1|, \dots, |x_n|\} (in the above notation);
  • |x|=|x_1|+\cdots+|x_n|.
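As a quick numerical illustration, here is a minimal Python sketch of the three distance functions listed above. The function names (`norm_euclid`, `norm_max`, `norm_sum`, `triangle_holds`) are labels chosen for this sketch, not standard notation, and the random check of the triangle inequality is only a sanity test on a few samples, not a proof.

```python
import math
import random

def norm_euclid(x):
    """|x| = sqrt(x_1^2 + ... + x_n^2)."""
    return math.sqrt(sum(t * t for t in x))

def norm_max(x):
    """|x| = max(|x_1|, ..., |x_n|)."""
    return max(abs(t) for t in x)

def norm_sum(x):
    """|x| = |x_1| + ... + |x_n|."""
    return sum(abs(t) for t in x)

def triangle_holds(norm, x, y, tol=1e-12):
    """Check |x + y| <= |x| + |y| up to a small rounding tolerance."""
    s = [a + b for a, b in zip(x, y)]
    return norm(s) <= norm(x) + norm(y) + tol

# Sanity check on random points of R^3 (assumed sample size: 100 x 100 pairs).
random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
all_ok = all(
    triangle_holds(norm, x, y)
    for norm in (norm_euclid, norm_max, norm_sum)
    for x in samples[:100]
    for y in samples[100:]
)
print(all_ok)
```

For the point x=(3,-4) the three functions give 5, 4 and 7 respectively. On the plane the corresponding neighborhoods are a disk, a square with sides parallel to the axes, and a rotated square of the same shape as K above.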

Definition. A subset A\subset\mathbb R^n of the Euclidean space (OK, plane) is called open, if together with any of its points a\in A it contains some neighborhood of a.
A subset A is called closed, if the limit of every converging infinite sequence \{a_n\}\subset A again belongs to A.

Theorem. A subset A is open if and only if its complement \mathbb R^n\smallsetminus A is closed.

Theorem. The union of any family (infinite or even uncountable) of open sets is open. Finite intersection of open sets is also open (for infinite intersections this is wrong: for instance, \bigcap_{n\ge 1}(-\frac1n,\frac1n)=\{0\} is not open).
Corollary. Intersection of any family (infinite or even uncountable) of closed sets is closed. Finite union of closed sets is also closed (for infinite unions this is wrong).

One can immediately produce a lot of examples of open/closed subsets in \mathbb R^n. It turns out that any property that can be formulated using only these notions, is preserved by maps which are continuous together with their inverses. The corresponding area of math is called topology.

Friday, November 20, 2015

Lecture 4, Nov 19, 2015

Continuity of functions

Let f\colon D\to\mathbb R,\ D\subseteq\mathbb R be a function of one variable, and a\in D a point in its domain. The function is said to be continuous at a, if for any precision \varepsilon>0 the function is \varepsilon-constant (equal to its value f(a)) after restriction on a sufficiently short interval around a.

Formally, if we denote by \bold I=\bold I^1=(-1,1) the (open) interval, then the continuity means that \forall \varepsilon>0\ \exists\delta>0\ f\bigl((a+\delta\bold I)\cap D\bigr)\subseteq f(a)+\varepsilon\bold I (check that you understand the meaning of the notation u+v\bold I for a subset of \mathbb R).

A function f\colon D\to\mathbb R is said to be continuous on a subset D'\subset D, if it is continuous at all points a\in D' of this subset. Usually we consider the cases where D'=D, that is, functions continuous on their domain of definition.

1. If a is an isolated point of the domain D, then any function is automatically continuous at a for a simple reason: for all sufficiently small \delta>0 the intersection (a+\delta\bold I)\cap D consists of a single point a, so the image is a single point f(a).

2. If a\notin D but \inf_{x\in D} |x-a|=0 and there exists a limit A=\lim_{x\to a}f(x), then one can extend f on a by letting f(a)=A and obtain a function defined on D\cup\{a\} which is continuous at a.

Obvious properties of the continuity

The sum, difference and product of two functions continuous at a point a are again continuous at a. The reciprocal \frac1{f} of a function f continuous at a is continuous at a, if f(a)\ne 0.

This is an obvious consequence of the rules of operations on “approximate numbers” (קירובים). When dealing with the sum/difference, one has to work with absolute approximations, when dealing with the product/ratio, with the relative approximations, but ultimately it all boils down to the same argument: if two functions are almost constant on a sufficiently small interval around a, then the result of applying the arithmetic operations to them is almost constant as well.

Not-so-obvious property of continuity

When is continuity compatible with passing to the limit? More specifically, we consider the situation where there is an infinite number of functions f_n\colon D\to\mathbb R defined on the common domain D. Assume that for any a\in D the values (numbers!) f_n(a)\in\mathbb R form a converging sequence whose limit is denoted by f_*(a) (it depends on a!). What can one say about the function f_*\colon a\mapsto f_*(a)?

Assume that f_n(x)=x^n and D=[0,1]. All of them are continuous (why?). If a<1, then \lim_{n\to\infty} a^n=0. If a=1, then for any n\ a^n=1. Thus the limit \lim_{n\to\infty}f_n(a) exists for all a, but as a function of a\in[0,1] it is a discontinuous function. Thus without additional precautions a sequence of continuous functions can converge to a discontinuous one.

Distance between the functions.
The distance between real numbers a,b\in\mathbb R is the nonnegative number |a-b| which is zero if and only if a=b. Motivated by that, we introduce the distance \|f-g\| between two functions f,g\colon D\to \mathbb R as the expression \sup_{a\in D}|f(a)-g(a)|.
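To get a feeling for this distance, one can estimate it numerically. The sketch below (the helper name `sup_distance` and the sample size are assumptions made for illustration) takes the largest value of |f(x)-g(x)| over equally spaced sample points of a segment [a,b]. This gives only a lower estimate of the true supremum, since the largest difference may be attained between the sample points.

```python
import math

def sup_distance(f, g, a, b, samples=1000):
    """Largest |f(x) - g(x)| over samples + 1 equally spaced points of [a, b].

    This is a lower estimate of sup |f - g| on [a, b], not its exact value.
    """
    best = 0.0
    for i in range(samples + 1):
        x = a + (b - a) * i / samples
        best = max(best, abs(f(x) - g(x)))
    return best

# Distance between sin and cos on [0, pi]; the exact value is sqrt(2),
# attained at x = 3*pi/4, which happens to be one of the sample points.
d = sup_distance(math.sin, math.cos, 0.0, math.pi)
print(d, math.sqrt(2))
```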

Exercise. Prove that any three functions f,g,h defined on the common domain D satisfy the “triangle inequality” \|f-g\|\leqslant \|f-h\|+\|h-g\|.

Definition. A sequence of functions f_n\colon D\to\mathbb R is said to be uniformly converging to a function f_*\colon D\to\mathbb R, if \lim_{n\to\infty}\|f_n-f_*\|=0.
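The example f_n(x)=x^n on [0,1] discussed above converges at every point, but not uniformly. One way to see this numerically: for a fixed precision \varepsilon, the index after which x^n<\varepsilon depends on the point x and grows without bound as x approaches 1, so no single N serves all points at once. The following sketch (the name `first_index_below` is made up for this illustration) computes that index.

```python
import math

def first_index_below(x, eps):
    """Smallest n >= 1 with x**n < eps, for 0 <= x < 1 and 0 < eps < 1."""
    if x == 0.0:
        return 1
    # Solve x**n < eps for n, then correct for possible rounding errors.
    n = max(1, math.ceil(math.log(eps) / math.log(x)))
    while x ** n >= eps:
        n += 1
    return n

eps = 0.01
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, first_index_below(x, eps))
```

The printed indices increase as x gets closer to 1, which is exactly the obstruction to uniform convergence.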

Theorem. If a sequence of continuous functions converges uniformly, then the limit is also a continuous function.

Indeed, denote by g the limit function, a\in D any point, and let \varepsilon>0 be any “resolution”. We need to construct a small interval around a such that g on this interval is \varepsilon-indistinguishable from the value g(a). We split the resolution allowance into two halves. The first half we use to find N such that \|f_n-g\| < \frac\varepsilon2 for all n\ge N. The second half we spend on the continuity: since f_N is continuous, there exists a segment a+\delta\bold I on which f_N is \frac\varepsilon2-indistinguishable from f_N(a). Collecting all inequalities we see that for any point x\in a+\delta\bold I we have three inequalities: |f_N(a)-g(a)|<\frac\varepsilon2, \ |f_N(x)-g(x)|<\frac\varepsilon2,\ |f_N(x)-f_N(a)|<\frac\varepsilon2. By the triangle inequality, |g(x)-g(a)|< \frac{3\varepsilon}2. Ooops! we were heading for \varepsilon! We should rather have divided our allowance into three equal parts, \frac\varepsilon3 for each of the two uses of the distance \|f_N-g\| and \frac\varepsilon3 for the continuity of f_N, had we thought ahead of the computation! ;-) In any case, the outcome is the same.


The notion of continuity, the distance between functions etc. can be generalized from functions of one variable to other classes of functions.

For instance, functions of the form \gamma\colon [0,1]\to\mathbb R^2 can be called (parametrized) curves. Here the argument t\in[0,1] can be naturally associated with time, so that \gamma(t) is the position of the moving point at the moment t. We can draw the image \gamma([0,1]): this drawing does not reflect the timing: to indicate it, we can additionally mark the images, say, \gamma (\frac k{10}),\ k=0,1,\dots,10.

To define continuity for curves, denote by \bold I^2 the unit square \{|x|<1,\ |y|<1\}. A curve \gamma is continuous at a point a\in [0,1] if \forall\varepsilon >0 \ \exists \delta>0 such that \gamma (a+\delta\bold I^1)\subseteq \gamma(a)+\varepsilon \bold I^2. (Do you understand this formula? ;-) )

The distance between two points a=(a_1,a_2),\ b=(b_1,b_2)\in\mathbb R^2 is usually defined as \sqrt{(a_1-b_1)^2+(a_2-b_2)^2}, but this distance is not very much different from the expression |a-b|=\max_{i=1,2}|a_i-b_i| (this definition can be immediately generalized for spaces \mathbb R^n of any finite dimension n=3,4,\dots). The distance between two curves has a very similar form: \|f-g\|=\sup_{x\in [0,1]}|f(x)-g(x)|.

Remark. If the functions f,g are continuous, we can replace the supremum by maximum (which is always achieved).

Koch snowflake revisited

Now we can return to one of the examples we discussed in Lecture 1, the Koch snowflake. In contrast with that time, we now have an appropriate language to deal with it.

The process of constructing the curve actually produces a sequence of closed curves. The image of the first curve is an equilateral triangle, the second one gives the Star of David, the third one has no canonical name.

In all cases the new curve \gamma_{n+1} is obtained by taking the previous curve \gamma_n and modifying it on a subset of its domain: instead of traversing a line segment with constant speed, one takes a middle third of this segment and forces \gamma_{n+1} to detour. This requires increasing the speed, but we don’t care as long as the trajectory remains continuous. The distance between \gamma_n and \gamma_{n+1} is \frac{\sqrt3}2 times the size of the segment, \frac1{3^n}.

This observation guarantees that \|\gamma_n-\gamma_{n+1}\|< C(1/3)^n. Since the geometric series \sum_n C(1/3)^n converges, this implies that the sequence of maps \gamma_n\colon [0,1]\to\mathbb R^2 converges uniformly. The result is a continuous curve \gamma_*\colon[0,1]\to\mathbb R^2 which has "infinite length" (in fact, it has no length at all).
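The two competing effects, lengths that blow up and distances that shrink geometrically, can be tabulated exactly. The sketch below rests on assumptions that are not fixed in the text: the curves are indexed from n=0 (the triangle), the triangle has side 1, and C is treated as a hypothetical constant equal to 1. Each step replaces every segment by four segments of one third the length, so the perimeter is multiplied by \frac43, while the distance from \gamma_n to any later curve is bounded by the tail of a geometric series.

```python
from fractions import Fraction

def koch_length(n, side=Fraction(1)):
    """Perimeter of the n-th Koch polygon; n = 0 is the equilateral triangle."""
    return 3 * side * Fraction(4, 3) ** n

def tail_bound(n, C=Fraction(1)):
    """Sum of C / 3**k over all k >= n, which equals (3C/2) / 3**n.

    It bounds the distance between gamma_n and every later curve gamma_m,
    provided ||gamma_k - gamma_{k+1}|| < C / 3**k for all k (C is hypothetical).
    """
    return C * Fraction(3, 2) / Fraction(3) ** n

for n in range(6):
    print(n, koch_length(n), tail_bound(n))
```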

Sunday, November 15, 2015

One-time change of schedule

Filed under: lecture,schedule — Sergei Yakovenko @ 3:31

Because of the travel arrangements, the next lecture on “Analysis for high school teachers” is moved from Tue Nov 17, 9:15-11:15 (Sci teaching lab) to Thu Nov 19, 9:15-11:15 (Seminar room 2).
For the same reasons the next course by Dmitry Novikov is moved from Thu Nov 19, 9:15-11:15 (Seminar room 2) to Tue Nov 17, 9:15-11:15 (Sci teaching lab).

In other words, the two classes will simply swap their space-time slots for one week only.

Wednesday, November 11, 2015

Lecture 3, Nov 10, 2015

Filed under: Rothschild course "Analysis for high school teachers" — Sergei Yakovenko @ 5:18


First, what’s the problem?
Assume we want to calculate the derivative of the function f(x)=x^2, say, at the point x=2. This derivative is the number defined using the divided difference \displaystyle\frac{(2+h)^2-4}{h}=4+h when h is “very small”. What does “very small” mean? We cannot let h be exactly zero, since division by zero is forbidden. On the other hand, if h\ne 0, then the above expression is never precisely equal to 4, the expected value, so in any case the “derivative” cannot be 4, as we want. To resolve this controversy, Leibniz introduced mysterious “differentials” which disappear when added to usual numbers, but whose ratio has precise numerical meaning.

The approach by Leibniz can be worked out into a rigorous mathematical theory, called nonstandard analysis, but historically a different approach, based on the notion of limit (of sequence, function, …), prevailed.
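Returning to the divided difference of f(x)=x^2 at x=2 discussed above, the dilemma is easy to reproduce in exact rational arithmetic. In this small sketch (the helper names `divided_difference` and `square` are chosen for illustration) the quotient equals 4+h for every nonzero h, hence it never equals 4, yet it gets as close to 4 as we wish.

```python
from fractions import Fraction

def divided_difference(f, x, h):
    """(f(x + h) - f(x)) / h for a nonzero increment h."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x * x

for k in range(1, 6):
    h = Fraction(1, 10 ** k)
    q = divided_difference(square, Fraction(2), h)
    print(h, q, q == 4 + h)
```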

Limit of a sequence

Consider an (infinite) sequence \{a_n:n\in\mathbb N\}=\{a_1,a_2,a_3,\dots,a_n,\dots\} of real numbers. We say that it stabilizes (מתיצבת) at the value A\in\mathbb R, if only finitely many terms in this sequence can be different from A, and the remaining infinite “tail” consists of the repeated value A. Since among the finitely many numbers n\in\mathbb N one can always choose the maximal one (denoted by N), we say that the sequence \{a_n\} stabilizes, if

\exists N\in\mathbb N\ \forall n>N\ a_n=A.

Obviously, stabilizing sequences are not interesting, but their obvious properties can be immediately listed:

  1. Changing any finite number of terms in a stabilizing sequence keeps it stabilizing and vice versa;
  2. If a sequence \{a_n\} stabilizes at A, and another sequence \{b_n\} stabilizes at B, then the sum-sequence \{a_n+b_n\} stabilizes at A+B, the product-sequence \{a_nb_n\} at AB.
  3. The fraction-sequence \{a_n/b_n\} may be not defined, but if B\ne 0, then only finitely many terms b_n can be zeros: just change them to nonzero numbers, and then the fraction-sequence will be defined for all n and stabilize at A/B.
  4. Exercise: formulate properties of stabilizing sequences, involving inequalities.

Blurred vision

As we have introduced the real numbers, testing their equality requires checking infinitely many “digits”, which is unfeasible. Instead, we can specify a given precision \varepsilon >0 (it can be chosen rational). Then one can replace the genuine equality by \varepsilon-equality, saying that two numbers a,b\in\mathbb R are \varepsilon-equal, if |a-b|<\varepsilon. This is a bad notion that will be used only temporarily, since it is not transitive (for the same fixed value of \varepsilon). Yet as a temporary notion, it is very useful.

We say that an infinite sequence \{a_n\}\ \varepsilon-stabilizes at a value A for a given precision \varepsilon, if only finitely many terms in the sequence are not \varepsilon-equal to A. Formally, this means that

\exists N\in\mathbb N\ \forall n>N\ |a_n-A|<\varepsilon.

Spectacles can improve your vision

The choice of the precision \varepsilon>0 is left open so far. In practice it may be set at the level which is determined by the imperfection of our measuring instruments, but since we strive for a mathematical definition, we should not set any artificial threshold.

Definition. A sequence of real numbers \{a_n\} is said to converge to the limit A\in\mathbb R, if for any given precision \varepsilon>0 this sequence \varepsilon-stabilizes at A. The logical formula for the corresponding statement is obtained by adding one more quantifier to the left:

\forall\varepsilon>0\ \exists N\in\mathbb N\ \forall n>N\ |a_n-A|<\varepsilon.

If we want to claim that the sequence is converging without specifying what the limit is, one more quantifier is required:

\exists A\in\mathbb R\ \forall\varepsilon>0\ \exists N\in\mathbb N\ \forall n>N\ |a_n-A|<\varepsilon.

Of course, this formula is inaccessible to anybody not specially prepared for the job; this is why so many students have racked their brains over it.
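The quantifiers become less frightening when read as a game: somebody names a precision \varepsilon, and we must answer with an index N. Here is a minimal sketch for the sequence a_n=\frac1n with the limit A=0. The names `threshold` and `stabilizes_after` are made up for this illustration, and the second function checks only finitely many terms, so it is a sanity test and not a proof.

```python
import math
from fractions import Fraction

def threshold(eps):
    """An index N such that |1/n - 0| < eps for every n > N."""
    # If n > N >= 1/eps, then 1/n < eps.
    return math.ceil(1 / Fraction(eps))

def stabilizes_after(eps, N, horizon=1000):
    """Finite sanity check of |1/n| < eps for n = N + 1, ..., N + horizon."""
    return all(Fraction(1, n) < eps for n in range(N + 1, N + horizon + 1))

for eps in (Fraction(1, 10), Fraction(3, 100), Fraction(1, 1000)):
    N = threshold(eps)
    print(eps, N, stabilizes_after(eps, N))
```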

Obvious examples

  1. a_n=\frac 1n, or, more generally, a_n=\frac1{n^p},\ p>0. The limit is zero.
  2. a_n=c,\ c\in\mathbb R. Joke.
  3. a_n =n^p,\ p>0. Diverges.
  4. These rules (plus the obvious rules concerning the arithmetic operations) allow one to decide the convergence of any sequence a_n whose general term is a rational function of n.
  5. Exceptional cases are very rare: e.g., when a_n=\displaystyle\left(1+\frac 1n\right)^n
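The last sequence in the list is not a rational function of n, so the rules above say nothing about it. It does converge, and its limit is the number e. A floating-point experiment such as the one below (the function name `a` simply mirrors the notation a_n) can only suggest this, since rounding errors eventually spoil the computation for very large n.

```python
import math

def a(n):
    """The n-th term (1 + 1/n)**n, computed in floating point."""
    return (1 + 1 / n) ** n

for n in (1, 10, 100, 1000, 10 ** 6):
    print(n, a(n), math.e - a(n))
```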

Limits of functions

Let f\colon D\to\mathbb R be a function defined on a subset D\subseteq\mathbb R, and a\in\mathbb R\smallsetminus D a point outside the domain of f. We want to “extend” the function to this point if this makes sense.

For a given precision \varepsilon>0 we say that f is \varepsilon-constant on a set S\subseteq D, if there exists a constant A\in\mathbb R such that \forall x\in S\ |f(x)-A|<\varepsilon.

Definition. The function f\colon D\to\mathbb R is said to have a limit equal to A at a point a\notin D, if

  1. all intersections between D and small intervals I_\delta=\{|x-a|<\delta\} are non-empty and
  2. \forall\varepsilon>0\ \exists\delta>0 such that the function restricted on D\cap \{|x-a|<\delta\} is \varepsilon-constant.

In other words, the function is \varepsilon-indistinguishable from a constant on a sufficiently small open interval centered around a.

Remark. One can encounter situations when the function is defined at some point a\in D, but if we delete this point from D, then the function will have a limit A at this point. If this limit coincides with the original value f(a), then the function is well-behaved (we say that it is continuous). If the limit exists but is different from f(a), then we understand that the function was intentionally twisted, and if we change its value at just one point, then it will become continuous. If f has no limit at a when restricted on D\smallsetminus \{a\}, we say that f is discontinuous at a.

Clearly, such extension by taking limit is possible (if at all) only for points at “zero distance” from the domain of the function.

For more detail read the lecture notes from the past years.


As was mentioned, the problem of calculating limits of explicitly given (i.e., elementary) functions is usually not very difficult. The real fun begins when there is no explicit formula for the terms of the sequence (or the function). This may happen if the sequence is produced by some recurrent (inductive) rule.

The most simple case occurs where the rule is simply summation (should we say “correction”?) of a known nature:

a_{n+1}=a_n+\text{something explicitly depending on }n.

If we denote the added value by b_n, then the sequence will take the form
a_1, a_1+b_1, a_1+b_1+b_2,a_1+b_1+b_2+b_3,a_1+b_1+b_2+b_3+b_4,\dots. If we can perform such summations explicitly and write down the answer as a function of n, it would be great.

Example. Consider the case where b_n=\frac1{n(n+1)}=\frac1n-\frac1{n+1}. Then we get a “telescopic” sum which can be immediately computed. But this is rather an exception…
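The telescopic cancellation in this example can be checked in exact arithmetic: all intermediate terms cancel, and the sum of the first n corrections is 1-\frac1{n+1}=\frac n{n+1}. A small sketch (the name `partial_sum` is chosen for illustration):

```python
from fractions import Fraction

def partial_sum(n):
    """b_1 + ... + b_n for b_k = 1 / (k (k + 1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 100):
    print(n, partial_sum(n), 1 - Fraction(1, n + 1))
```

The partial sums tend to 1, so the corresponding sequence a_n converges to a_1+1.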

Another example is the geometric progression where b_n=cq^n for some constants c,q.

In general we cannot write down the sum as a function of n, which makes the task challenging.

Definition. Let b_n be a real sequence. We say that the infinite series \sum_{k=1}^\infty b_k converges, if the sequence of its finite sums a_n=\sum_{k=1}^n b_k has a (finite) limit.


  1. The geometric series \sum_{k=1}^\infty q^k converges if |q|<1 and diverges if |q|\ge 1.
  2. The harmonic series \sum_{k=1}^\infty \frac1k diverges.
  3. The inverse square series \sum_{k=1}^\infty \frac1{k^2} converges.

The last two statements follow from the comparison of the series with patently diverging or patently converging series (fill in the details!).
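One possible way to fill in the details, offered as a sketch rather than the intended solution: grouping the terms of the harmonic series into blocks of lengths 1, 2, 4, 8, ... gives the bound H_{2^m}\ge 1+\frac m2 for the partial sums, while the inequality \frac1{k^2}\le\frac1{k-1}-\frac1k for k\ge 2 gives the bound \sum_{k=1}^n\frac1{k^2}\le 2-\frac1n. The code below (with the made-up names `harmonic` and `inverse_squares`) verifies both bounds exactly for small indices.

```python
from fractions import Fraction

def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def inverse_squares(n):
    """Partial sum 1 + 1/4 + ... + 1/n**2 of the inverse square series."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

# Harmonic series: compare with the unbounded sequence 1 + m/2.
for m in range(0, 9):
    print(m, float(harmonic(2 ** m)), float(1 + Fraction(m, 2)))

# Inverse squares: compare with the telescopic bound 2 - 1/n.
for n in (1, 10, 100):
    print(n, float(inverse_squares(n)), float(2 - Fraction(1, n)))
```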

Later on we will concentrate specifically on the series of the form \sum_{k=0}^\infty c_k q^k with a given sequence of “Taylor coefficients” c_0,c_1,c_2,\dots which contain a parameter q. Considered as the function of q, these series exhibit fascinating properties which make them ubiquitous in math and applications.

Tuesday, November 3, 2015

Lecture 2, Nov 3, 2015

Real numbers

There are certain situations when the rational numbers are apparently not sufficient: for instance, the function f(x)=x^2-2 is negative at x=0, positive at x=2 but does not take the intermediate value zero: \forall x\in\mathbb Q\ f(x)\ne 0. Another situation concerns the possibility to define the notions of supremum and infimum for infinite sets: the set A=\{x\in\mathbb Q: x^2<2\} is bounded from two sides, but among its upper bounds B=\{b\in\mathbb Q:\ \forall a\in A\ a\leqslant b\} there is no minimal one.

The idea is to adjoin to \mathbb Q solutions of infinitely many inequalities.

With any rational number a\in\mathbb Q one can associate two subsets L,R\subset\mathbb Q as follows: L=\{l\in \mathbb Q: l\le a\} and R=\{r\in\mathbb Q: a\le r\}. Then the number a is the unique solution to the infinite system of inequalities of the form l\le x\le r for different choices of l\in L,\ r\in R. This system has the following two features:

  1. it is self-consistent (non-contradictory): any lower bound l is no greater than any upper bound r, i.e., L\le R, and
  2. it is maximal: together the two sets give \mathbb Q=L\cup R, and none of the sets can be enlarged without violating the first condition.

A (Dedekind) cut is any pair of subsets L,R\subseteq\mathbb Q satisfying the two conditions above.

If a rational number a\in\mathbb Q satisfies all the inequalities l\le a,\ a\le r for all l\in L,\ r\in R, then we call it a root (or a solution) of the cut. Every rational number is the solution to some cut \alpha=(L,R) as above, and this happens if and only if L\cap R=\{a\}. Yet not all cuts have rational solutions (give an example!).

We can associate cuts without rational solutions with “missing” numbers which we want to adjoin to \mathbb Q. For this purpose we have to show how cuts can be ordered (in a way compatible with the order on \mathbb Q) and how arithmetic operations can be performed on cuts.
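A cut without a rational solution can be explored by computer, since membership in L and R is decided by rational arithmetic alone. The sketch below uses the cut suggested by the function x^2-2 from the beginning of this lecture; the predicates `in_lower`, `in_upper` and the bisection routine `squeeze` are names introduced here for illustration. It produces rational lower and upper bounds as close to each other as we wish, although no rational number satisfies all the inequalities at once.

```python
from fractions import Fraction

def in_lower(q):
    """q belongs to L: q is non-positive or q**2 < 2."""
    return q <= 0 or q * q < 2

def in_upper(q):
    """q belongs to R: q is positive and q**2 > 2."""
    return q > 0 and q * q > 2

def squeeze(steps):
    """Bisect [0, 2], keeping the left end in L and the right end in R."""
    l, r = Fraction(0), Fraction(2)
    for _ in range(steps):
        mid = (l + r) / 2
        # No rational has square 2, so mid lies in exactly one of L, R.
        if in_lower(mid):
            l = mid
        else:
            r = mid
    return l, r

l, r = squeeze(20)
print(l, r, float(r - l))
```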

Order on cuts

Let \alpha=(L,R),\ \beta=(L',R') be two different cuts. We declare that \alpha\triangleleft\beta, if R\cap L'\ne\varnothing, i.e., if there is a rational number a\in\mathbb Q that is at the same time an upper bound for the cut \alpha and a lower bound for the cut \beta. If both cuts have rational solutions, this number would be squeezed between these solutions. In a similar way we define the opposite order: \alpha\triangleright\beta if and only if L\cap R'\ne\varnothing.

To see that this definition is indeed a total order, we need to check that for any two cuts \alpha,\beta one and only one of the three possibilities holds: \alpha\triangleleft\beta,\ \alpha\triangleright\beta or \alpha=\beta (meaning that L=L',R=R'). This is a routine check: if the first two possibilities are excluded, then L\cap R'=L'\cap R=\varnothing, and therefore (L\cup L', R\cup R') is a self-consistent cut. But because of the maximality condition, this means that L\cup L'=L=L' and R\cup R'=R=R', that is, \alpha=\beta.

Arithmetic operations on cuts

If \alpha=(L,R),\ \beta=(L',R') are two cuts which have rational solutions a,b, then these solutions satisfy inequalities L\le a\le R,\ L'\le b\le R' (check that you understand the meaning of this inequality between sets and numbers ;-)!) Adding these inequalities together means that c=a+b satisfies the infinite system of inequalities L+L'\le c\le R+R', where L+L' stands for the so called Minkowski sum L+L'=\{l+l':\ l\in L,\ l'\in L'\} (the same for R+R'). This allows us to define the summation on cuts.

The sum of two cuts \alpha=(L,R),\beta=(L',R') is the cut \gamma=(L+L',R+R') with the Minkowski sum in the right hand side.

To define the difference, we first define the cut -\alpha as follows, -\alpha=(-R,-L), where (of course!) -L=\{-l: l\in L\},\ -R=\{-r: r\in R\}. Note that the upper and lower bounds exchanged their roles, since multiplication by -1 changes the direction (sense) of the inequalities. Then we can safely define \alpha-\beta as \alpha + (-\beta). Again, one has to check that this definition is well-behaving and all arithmetic properties are preserved.

To define multiplication, one has to exercise additional care and start with multiplication between positive cuts \alpha,\beta\triangleright 0 (do it yourselves!) and then extend it for negative cuts and the zero cut. After introducing this definition, one has to make a lot of trivial checks:

  1. that for cuts having rational solutions, we get precisely what we expected, that is, the new operation agrees with the old one on the rational numbers,
  2. that they have the same algebraic properties (associativity, distributivity, commutativity etc) as we had for the rational numbers,
  3. that they agree with the order that we introduced earlier exactly as this was the case with the rational numbers,
  4. … … …. …. …

Of course, nobody ever wrote the formal proofs of these endless properties! (Life is short and one should not waste it for nothing). Yet every mathematician can certainly provide a formal proof for any of them, and none of the countless students who passed through this ordeal ever voiced any concern about validity of these endless nanotheorems. Neither will we.

Achievement of the stated goals

Once we have constructed the extension of the rational numbers by all cuts, denoted the result by \mathbb R and called it the set of real numbers, one has to verify that all the problems we started with were actually resolved. There are a number of theorems about the real numbers that look dull and self-evident unless we know that a heavy price had to be paid for that. Namely, we can guarantee that:

  1. Any nonempty subset A\subset\mathbb R which admits at least one upper bound, admits the minimal upper bound called \sup A=\sup_{a\in A}a (and, of course, the analogous statement holds for \inf A).
  2. If \varnothing\ne I_k=[a_k,b_k]\subseteq\mathbb R is a family of nested nonempty closed intervals, I_1\supseteq I_2\supseteq I_3\supseteq\cdots, then the intersection I_\infty=\bigcap_{k=1}^\infty I_k is also nonempty.
  3. Any function f:[a,b]\to\mathbb R continuous on the closed segment [a,b], takes any intermediate value between f(a) and f(b).

For more detailed exposition, read the lecture notes here.

Sunday, November 1, 2015

Tutorial 1, Thu Oct 29, 2015

Hello everyone,

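In today's tutorial we dealt with the question ‘how does one sum infinitely many terms?’. We saw that our intuition has to work hard when we jump from finite quantities to infinite ones, and in particular from the discrete to the continuous, and we recalled that the mathematical discourse around the notion of ‘infinity’ is in a continuing process of development. In particular, the notion of ‘convergence of series’ is more complex than it seems on the basis of high school studies (where the discussion is mostly limited to geometric series) or even on the basis of an undergraduate degree.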

We began with a reminder of the usual meaning of convergence of infinite series, based on the convergence of the sequence of partial sums of the series, and went on to a short discussion of series that do not converge in the usual sense, yet to which simple algebraic manipulations manage “nevertheless” to assign a numerical value. We discussed three examples of infinite series of integers which “converge” to a number that is fractional, negative, or both, and touched (very gently) on some of the mathematical tools developed over the years to deal with divergent series.

Thus, for instance, for the series 1-1+1-1+1... we saw the treatment of the limiting cases of the sum formula for a convergent infinite geometric progression, and the Cesàro summation method, two methods which assign the value \frac{1}{2} to this series.
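The Cesàro method replaces the partial sums s_1,s_2,s_3,\dots of a series by their running averages \sigma_n=\frac{s_1+\cdots+s_n}n. For the series 1-1+1-1+... the partial sums oscillate between 1 and 0, but the averages settle down to \frac12. A minimal sketch in exact arithmetic (the names `partial_sums` and `cesaro_mean` are chosen for this illustration):

```python
from fractions import Fraction

def partial_sums(n):
    """The partial sums s_1, ..., s_n of the series 1 - 1 + 1 - 1 + ..."""
    sums, s = [], 0
    for k in range(n):
        s += 1 if k % 2 == 0 else -1
        sums.append(s)
    return sums

def cesaro_mean(n):
    """The average (s_1 + ... + s_n) / n of the first n partial sums."""
    return Fraction(sum(partial_sums(n)), n)

for n in (1, 2, 3, 10, 11, 1000, 1001):
    print(n, cesaro_mean(n))
```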

For the series 1+2+4+8+... we mentioned the change of viewpoint regarding the terms of the series, so that one can regard it as a series of 2-adic numbers. From this viewpoint (and very much unlike the usual meaning of convergence of series, where the following argument is wrong), the general term of the series tends to zero and therefore the series converges in the 2-adic numbers, and in particular it converges there to the value -1. In addition, once we have found a meaning for the convergence of the series, we gain legitimacy for the local use of the algebraic manipulations on the series which lead to the same value.
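The 2-adic statement can be verified with integers only. The partial sum S_n=1+2+\cdots+2^{n-1} equals 2^n-1, so the difference S_n-(-1)=2^n is divisible by a higher and higher power of 2, which is precisely what “small” means for 2-adic numbers: the 2-adic size of that difference is 2^{-n}. A small sketch (the names `partial_sum` and `two_adic_valuation` are introduced here for illustration):

```python
def partial_sum(n):
    """The partial sum 1 + 2 + 4 + ... + 2**(n - 1)."""
    return sum(2 ** k for k in range(n))

def two_adic_valuation(m):
    """Largest v such that 2**v divides the nonzero integer m."""
    v = 0
    while m % 2 == 0:
        m //= 2
        v += 1
    return v

for n in (1, 2, 3, 10, 20):
    difference = partial_sum(n) - (-1)
    print(n, partial_sum(n), two_adic_valuation(difference))
```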

For the series 1+2+3+4+... we saw that the algebraic manipulations lead to a result before which our intuition falls silent: -\frac{1}{12}. An example of these manipulations can be seen in the following video:

In addition, I briefly noted that there are further summation methods which assign the above value to this series, such as the Abel summation method, the Ramanujan summation method, and also the Riemann zeta function.

To summarize: the jump from a finite number of things to an infinite number is not self-evident, and part of the challenge is to make the jump in a way that preserves the results agreed upon in the community. Different mathematicians chose different ways, and part of the beauty of mathematics lies in the different ways of looking at the same thing, and thus understanding it better.

Wednesday, October 28, 2015

Lecture 1, Tue Oct 27, 2015

Hello, first grade!

Welcome to the 2015/6 season of the Rothschild–Caesaria course of Analysis for high school teachers! You are welcome to bookmark this site and check it for all kinds of information relevant for the course, from room changes to new handouts, updated lecture notes etc. Below follows the brief synopsis of the first lecture.


We discussed all kinds of paradoxes and possible controversies that may appear if we allow infinite sets, infinite procedures etc. They are listed in Section 1 (pages 1-5) here.


The next subject was devoted to the numbers we use. The natural numbers \mathbb N=\{1,2,3,\dots\} can be axiomatically defined using the Peano axiom system, i.e., using the symbol | (usually written as 1) and the operation “next after x” (denoted in various sources as x^+ or \textrm{Succ}(x)). Applying this operation several times, one gets elements ||,|||,||||,|||||\dots which are usually denoted by 2,3,4,5,\dots. This construction emulates the process of counting, which is how the natural numbers appeared in the human culture. More about this here, pages 1-4.

From the “usual” natural numbers one can construct larger sets of “numbers”. This can be done in more than one way, e.g., the negative integer numbers can be introduced like here (sect. 1.2, pages 4-6).

Yet there is a more general construction which works surprisingly often. The idea is to “add solutions of equations which are not solvable in the usual sense”. For instance, the negative number -n can be introduced as the “solution” of the equation x+n+1=1 which has no solution x\in\mathbb N. Using the equation, we can derive rules of manipulation with such numbers. Once we check that they are not mutually contradicting (this is a boring but necessary step), the “extension” is done. For details see sect. 1.3 of the same Note.

This process, however, does not work always. Sometimes “ideal solutions” cannot be introduced without violating the existing rules. For instance, if we decide to add “solution” of the equation 0\cdot x=1 (kind of “infinity”) which has no solutions over \mathbb Z, then we get a contradiction: such “ideal number” cannot be added with the usual integers from \mathbb Z, see Sect. 1.4.

If we start with \mathbb N and extend it so that all linear equations of the form ax+b=c are solvable (except for the “impossible” case above), the result will be the set \mathbb Q of all rational numbers. It is a field: addition, subtraction and multiplication are always possible in \mathbb Q, while division is possible by nonzero numbers only.

However, if we want solvability of equations of degree higher than 1, then the rational numbers again become insufficient. The equations x^2-2=0 and x^2+1=0 are not solvable in \mathbb Q, albeit for “different reasons”. Still we can adjoin solutions of either of them (or both) to \mathbb Q, see Sect. 2. In principle, we can adjoin (this would require some hard work) solutions to all polynomial equations of the form a_0 x^n+ a_1x^{n-1}+\cdots +a_{n-1} x+a_n=0 with rational coefficients a_0,\dots,a_n\in\mathbb Q. The corresponding set is called the (field of) algebraic numbers \overline{\mathbb Q}.

Still for many reasons it is insufficient. Algebra is not all ;-)

Tuesday, December 30, 2014

Final announcement

This is to inform the noble audience of the course that the main program of the course is completed. I will stay in Pisa for one more week (till January 8, 2015) and will be happy to discuss any subject (upon request).

Meanwhile one of the subjects discussed in this course was brought to a pre-final form: the manuscript

  1. Shira Tanny, Sergei Yakovenko, On local Weyl equivalence of higher order Fuchsian equations, arXiv:1412.7830,

was posted on arXiv and submitted to the Arnold Mathematical Journal, a new venue for publications molded in the spirit of the late V. I. Arnol’d and his seminar.

Any criticism will be most appreciated. Congratulations modestly accepted.

Tanti auguri, carissimi! Buon anno, happy New Year, с наступающим Новым Годом, שנה (אזרחית) טובח, bonne année!

Lecture 12+ (Mon, Dec 22, 2014)

Riemann-Hilbert problem

The Riemann-Hilbert problem consists in “constructing a Fuchsian system with a prescribed monodromy”.

More precisely, let M_1,M_2,\dots,M_d be nondegenerate matrices such that their product is the identity matrix, and let a_0,a_1,\dots, a_d\in\mathbb C be distinct points such that the segments [a_0,a_k]\subset\mathbb C,\ k=1,\dots,d, are all disjoint except for the point a_0 itself.

The problem is to construct a linear system of equations

\displaystyle \dot X=A(t)X,\quad A(t)=\sum_{k=1}^d \frac{A_k}{t-a_k},\quad \sum_{k=1}^d A_k=0,

such that the monodromy operator along the path “\gamma_k=segment [a_0,a_k]+ small loop around a_k+segment [a_k,a_0]” is equal to M_k.

The modern strategy of solving this problem is surgery. One can easily construct a local solution, a differential system on a neighborhood U_k of the segment [a_0,a_k], which has the specified monodromy. The phase space of this system is the cylinder U_k\times\mathbb C^n, and without loss of generality one can assume that together the neighborhoods U_k cover the whole Riemann sphere \mathbb CP^1=\mathbb C\cup\{\infty\}. Patching together these local solutions, one can construct a linear system with the specified monodromy, but it will be defined not on \mathbb C P^1\times\mathbb C^n, as required, but on a more general object, a holomorphic vector bundle over \mathbb C P^1.

The description of different vector bundles is of independent interest and is well known. It turns out (Birkhoff) that each holomorphic vector bundle of dimension n over \mathbb C P^1 is completely determined by a(n unordered) tuple of integer numbers d_1,\dots,d_n\in\mathbb Z, and the bundle is trivial if and only if d_1=\cdots=d_n=0.

However, the strategy of solving the Riemann-Hilbert problem by construction of the bundle and determining its holomorphic type is complicated by two facts:

  1. Determination of the holomorphic type of a bundle is a transcendental problem;
  2. The local realization of the monodromy is by no means unique: in the non-resonant case one can realize any matrix M_k by an Euler system with the eigenvalues which can be arbitrarily shifted by integers; in the resonant case one should add to this freedom also non-Euler systems. This freedom can change the holomorphic type of the vector bundle in a very broad range.
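The freedom described in the second item can be checked numerically in the simplest situation. The sketch below assumes a diagonal Euler system \dot X=\frac{A}{t}X with A=\operatorname{diag}(\lambda_1,\dots,\lambda_n), whose solution X(t)=t^A has monodromy \exp(2\pi i A) around the origin; the function name euler_monodromy and the sample eigenvalues are chosen only for illustration.

```python
import cmath

def euler_monodromy(eigenvalues):
    """Diagonal entries of exp(2*pi*i*A) for the diagonal Euler system dX/dt = (A/t) X."""
    return [cmath.exp(2j * cmath.pi * lam) for lam in eigenvalues]

base = [0.25, 0.5 + 0.1j]
shifted = [0.25 + 3, 0.5 + 0.1j - 2]   # the same eigenvalues shifted by integers

m_base = euler_monodromy(base)
m_shifted = euler_monodromy(shifted)

# The two systems are different, yet their monodromy matrices coincide
# (up to floating-point rounding).
assert all(abs(a - b) < 1e-9 for a, b in zip(m_base, m_shifted))
```

The resonant and non-diagonalizable cases are not covered by this sketch.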

It turns out that the (ir)reducibility of the linear group generated by the matrices M_1,\dots,M_d plays the fundamental role in the solvability of the Riemann-Hilbert problem.

Theorem (Bolibruch, Kostov). If the group is irreducible, i.e., there is no invariant subspace in \mathbb C^n common for all operators M_k, then one can choose the local realizations in such a way that the resulting bundle is trivial and thus yields a solution to the Riemann-Hilbert problem.

The proof is achieved as follows: one constructs a possibly nontrivial bundle realizing the given monodromy, and then this bundle is brutally trivialized by a transformation that is only meromorphic at one of the singularities. The result will be a system with all singularities but one being Fuchsian, and the problem reduces to bringing the last point (assumed to be at infinity) to the Fuchsian form by transformations of the form X\mapsto P(t)X with P being a matrix polynomial with a constant nonzero determinant. The group of such transformations is considerably more subtle, but ultimately the freedom in construction of the initial bundle can be used to guarantee that the last point is also “Fuchsianizible”.

Conversely, if the monodromy group is reducible, then there exists an obstruction of the torsion type to trivializing the bundle. This obstruction was first discovered by A. Bolibruch, and its description can be found in the textbook by Yu. Ilyashenko and SY (sections 16G and 18).

Lecture 12 (Friday Dec. 19, 2014)

Noetherian chains

Computation of the (local intersectional) degree of a phase curve of a polynomial vector field, produced in Lecture 11, is based on the length of the ascending chain of polynomial ideals generated by consecutive derivations.

Let D:\mathbb C[x_1,\dots,x_n]\to\mathbb C[x_1,\dots,x_n] be the Lie derivation of the algebra of polynomials along a polynomial vector field v of degree d. It increases the degrees by at most d-1. Let p_0\in\mathbb C[x] be a seed polynomial of degree \delta\in\mathbb N and consider the ascending chain of ideals

I_0\subseteq I_1\subseteq I_2\subseteq\cdots\subseteq I_k\subseteq\cdots \subseteq\mathbb C[x],\qquad I_k=\left<p_0,\dots,p_k\right>,

where p_k=Dp_{k-1},\ k=1,2,\dots. By Noetherianity, this chain must eventually stabilize at some step: I_N=I_{N+1}=\cdots. In addition to this chain of ideals, one can consider the associated descending chain of algebraic varieties

\mathbb C^n\supseteq X_0\supseteq X_1\supseteq\cdots\supseteq X_k\supseteq\cdots, \qquad X_k=\{p_0(x)=\cdots=p_k(x)=0\}.

This chain also stabilizes no later than at the Nth step, but may stabilize earlier. The following properties of these chains can be verified by elementary arguments.

  1. The chain of ideals is strictly ascending: If I_N=I_{N+1}, then all subsequent ideals in the chain coincide.
  2. The chain of varieties may be nonstrictly descending: e.g., n=1,\ p_0(x)=x^m,\ D=\frac{\mathrm d}{\mathrm dx}.
  3. The length of the descending chain measures the maximal nontrivial order of contact between the trajectories of v and the hypersurface X_0=\{p_0=0\}.
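The one-dimensional example from the second item (p_0=x^m, D=\frac{\mathrm d}{\mathrm dx}) can be traced by a short computation. The sketch relies on the fact that in \mathbb C[x] with one variable the ideal \left<p_0,\dots,p_k\right> is generated by the greatest common divisor of its generators; all function names are chosen only for this illustration, and polynomials are stored as coefficient lists [c_0,c_1,\dots].

```python
from fractions import Fraction

def derivative(p):
    """Derivative of the polynomial with coefficient list p (p[i] multiplies x^i)."""
    return [i * c for i, c in enumerate(p)][1:]

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_mod(a, b):
    """Remainder of a divided by a nonzero polynomial b, coefficients in Q."""
    a = trim(list(a))
    while len(a) >= len(b):
        q = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)
    return a

def poly_gcd(a, b):
    """Monic gcd by the Euclidean algorithm; it generates the ideal <a, b>."""
    a, b = trim(list(a)), trim(list(b))
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a] if a else a

def chain_degrees(m):
    """Degrees of the generators g_k of I_k = <p_0, ..., p_k> for p_0 = x^m."""
    p = [Fraction(0)] * m + [Fraction(1)]   # p_0 = x^m
    g = p
    degrees = []
    for _ in range(m + 1):
        degrees.append(len(g) - 1)
        p = derivative(p)       # p_{k+1} = D p_k
        g = poly_gcd(g, p)      # generator of I_{k+1}
    return degrees

print(chain_degrees(3))   # [3, 2, 1, 0]
```

Here I_k=\left<x^{m-k}\right>, so the ideals grow strictly for m steps, while the varieties X_0=\cdots=X_{m-1}=\{0\} coincide and only X_m=\varnothing is smaller.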

In general, the length of a strictly ascending chain of polynomial ideals generated by a sequence of polynomials whose degrees do not exceed an explicit (growing) function of k can be bounded by an algorithmically computable function. However, even in the simplest case where \deg p_k\le \delta+k(d-1) (as above), this function turns out to be the Ackermann generalized exponential, a recursive but not primitive recursive function of n,d,\delta\in\mathbb N which grows faster than any elementary (or primitive recursive) function. It is the algebraic origin of the sequence of polynomials that allows one to establish better results.
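To get a feeling for this kind of growth, here is a sketch of the standard textbook two-argument Ackermann-Péter function. This is an assumption made for illustration: it is not claimed to be the specific generalized exponential that appears in the bound above, only a well-known function of the same recursive-but-not-primitive-recursive nature.

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(10000)   # the recursion is deep even for tiny arguments

@lru_cache(maxsize=None)
def ackermann(m, n):
    """Textbook Ackermann-Peter function, defined by double recursion."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

# Each row outgrows the previous one: addition, doubling, exponentiation.
assert ackermann(1, 5) == 7     # A(1, n) = n + 2
assert ackermann(2, 3) == 9     # A(2, n) = 2n + 3
assert ackermann(3, 3) == 61    # A(3, n) = 2^(n+3) - 3
```

Already the row m=4 produces towers of exponentials, which is why no such bound is of practical use.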

Example. Assume that A:\mathbb C[x]\to\mathbb C[x] is an endomorphism of the ring of polynomials, and instead of the iterations p_k=Dp_{k-1} of the Lie derivation, we consider the sequence p_k=Ap_{k-1}. Then analogous chains can be constructed, yet their properties will be slightly different (in a sense, better). In particular, the chain of varieties becomes strictly descending and its length can be bounded relatively easily by a simple function of n,d,\delta. If the growth rate of \deg p_k is linear (as above), the bound will be double exponential in n. However, in general the growth rate of the degrees of the iterates A^k p_0 is exponential, which leads to a bound given by a tower function (iterated exponent) of height n=\dim x.

The easiest way to estimate the length of the chain of varieties generated by consecutive derivations is based on the explicit Nullstellensatz. By this theorem, for any polynomial q\in\mathbb C[x] which vanishes on the variety X\subseteq\mathbb C^n that is the zero locus of an ideal I\subseteq\mathbb C[x], there exists a finite power \rho such that q^\rho\in I. The number \rho can be explicitly bounded from above (J. Kollar, 1988): if I is generated by polynomials of degree no greater than m, then \rho\leqslant m^n. Having this bound, for each irreducible component of the variety X_k which does not belong to the stable limit, one can predict how many steps it can survive before being eliminated. The resulting upper bound will be double exponential in n.
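A minimal worked instance of the bound quoted above, with n=1 and m=3 chosen only for illustration:

```latex
% One variable (n = 1), ideal generated by a single cubic (m = 3).
I=\left<x^3\right>\subseteq\mathbb C[x],\qquad X=\{x^3=0\}=\{0\},\qquad q(x)=x.
% q vanishes on X, but q and q^2 do not belong to I, while q^3 does:
q\notin I,\quad q^2\notin I,\quad q^3=x^3\in I
\quad\Longrightarrow\quad \rho=3\leqslant m^n=3^1 .
```

In this example the exponent m^n is attained, so the bound cannot be improved in general.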

However, a better, more realistic and simple exponential in n upper bound can be achieved by a completely different argument.

Example. Assume that n=2 and we look at an isolated contact between a (nonsingular) trajectory of a vector field v and an algebraic curve X_0=X of degree \delta at a point a\in\mathbb C^2. Consider the local analytic chart in which v is parallel to the y-axis and the point a is at the origin. If the curve X has tangency of order \mu with the vertical axis, then its projection on the x-axis is locally a ramified covering of order \mu. Consider a small bidisk neighborhood of the origin and apply a small analytic perturbation to X. The multiple tangency point will be scattered into several points of simple (quadratic) tangency, while the topological covering property will persist. Denote by \nu the number of obtained simple tangencies: at each tangency exactly two leaves of the covering “collide”. Thus the total number of leaves \mu cannot be greater than 2\nu. The problem thus reduces to estimating \nu. However, the set of points of quadratic contact is algebraic: it is the set X_1=\{p_0=0, p_1=0\} defined by equations of degrees \delta and \delta+d-1, so by the Bézout theorem the number of points does not exceed the product of these two numbers.
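Putting the two inequalities of this example together gives an explicit estimate. The numerical values \delta=3,\ d=2 below are chosen only for illustration.

```latex
% Bezout: \nu is at most the product of the degrees of p_0 and p_1.
\nu\le\delta(\delta+d-1),\qquad
\mu\le 2\nu\le 2\delta(\delta+d-1).
% Illustration: a cubic curve (\delta = 3) and a quadratic vector field (d = 2):
\nu\le 3\cdot 4=12,\qquad \mu\le 24 .
```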

To generalize this argument to the multidimensional setting, one has to modify the topological part of the argument dealing with “exactly two leaves of the covering collide”. Instead of just one set X_1, one has to consider the sets X_1,\dots, X_{n-1} (recall that n is the dimension of the ambient space), and instead of counting points, one should consider their Euler characteristics. The corresponding combinatorics can be elegantly expressed by the “integration over the Euler characteristic” discovered by O. Viro (1988), while the Euler characteristic of algebraic varieties can be bounded by virtue of J. Milnor’s result (1964).

The result, a simple exponential (in n) bound due to A. Gabrielov and A. Khovanskii, was achieved in 1998. However, for some problems in analytic number theory (algebraic independence of transcendental numbers) it is important to have a more precise estimate of the maximal tangency order for d, n fixed, but \delta variable and growing to infinity. The most recent achievements in this direction are due to G. Binyamini [4], see below.

Besides, a different (and considerably more difficult) problem arises in the singular context, when one tries to estimate the order of contact of an algebraic hypersurface with a separatrix of a polynomial vector field, an invariant analytic curve (usually non-smooth) which contains a singular point of the vector field v. Here again the most recent breakthroughs are due to Binyamini [5].

References (in addition to those mentioned earlier).

  1. A. Gabrielov, A. Khovanskii, Multiplicity of a Noetherian intersection. Geometry of differential equations, 119–130,
    Amer. Math. Soc. Transl. Ser. 2, 186, Amer. Math. Soc., Providence, RI, 1998.
  2. O. Viro, Some integral calculus based on Euler characteristic. Topology and geometry—Rohlin Seminar, 127–138,
    Lecture Notes in Math., 1346, Springer, Berlin, 1988.
  3. J. Milnor, On the Betti numbers of real varieties. Proc. Amer. Math. Soc. 15 (1964), 275–280.
  4. G. Binyamini, Multiplicity Estimates: a Morse-theoretic approach, arXiv:1406.1858 (2014).
  5. G. Binyamini, Multiplicity estimates, analytic cycles and Newton polytopes, arXiv:1407.1183 (2014).