Sergei Yakovenko's blog: on Math and Teaching

Monday, March 28, 2011

Lectures 2-3, March 29 – April 5, 2011

Filed under: lecture,Rothschild Course "Symmetry" — Sergei Yakovenko @ 5:07

Symmetries and symmetric functions

Preliminaries and motivations

Definition. A function f:\mathbb R\to \mathbb R is even, if \forall x\in\mathbb R\ f(-x)=f(x). A function is odd, if f(-x)=-f(x).

Proposition 1. Every function can be represented as a sum of even and odd functions.
Proof. Let f_+(x)=\frac12\bigl(f(x)+f(-x)\bigr), f_-(x)=\frac12\bigl(f(x)-f(-x)\bigr). Then f=f_++f_-.

Problem 1. Prove that even functions form a ring: any linear combination or product of even functions is again even. What’s wrong with odd functions?

Problem 2. Prove that any real polynomial can be written as p(x^2)+x q(x^2), where p,q are two real polynomials.
Problem 3. Is the same assertion true if we replace “polynomial” by “continuous function”? Show that any continuous function can be represented as f(x^2)+\frac 1x g(x^2) with continuous f,g.

Theorem. Any analytic or C^\infty-smooth function defined in a neighborhood of x=0, can be represented under the form f(x^2)+x\,g(x^2) in a (smaller, if necessary) open neighborhood of the origin, with the functions f,g of the same type.

The difficult part of the proof of this theorem for non-analytic functions is hidden in the following Hadamard’s lemma.

Lemma. An analytic or smooth function f:(\mathbb R,0)\to\mathbb R which vanishes at the origin, f(0)=0, is divisible by x: f(x)=x\,g(x).

For analytic functions both the Lemma and the Theorem are very easy.

Definition. A function of two variables x,y is symmetric, if f(x,y)=f(y,x) for any values of x,y.

Problem 4. Describe all symmetric polynomials of degree \le 2.

Answer. \sigma_1(x,y)=x+y,\ \sigma_2(x,y)=xy. [Forgotten: s_2(x,y)=x^2+y^2=\sigma_1^2-2\sigma_2, and in general any other polynomial of the form a_0+a_1\sigma_1+a_2\sigma_1^2+b_1\sigma_2. Later we will learn that any symmetric polynomial of any degree can always be presented as a polynomial in \sigma_1,\sigma_2.]

Problem 5. Give the definition of an antisymmetric function of two variables in such a way that every function of two variables becomes a sum of a symmetric and antisymmetric functions.

Problem 6. Formulate and prove an analog of the statement from Problem 1 for symmetric functions.

Theorem. Any antisymmetric polynomial in two variables can be represented as p(x,y)=(x-y)\,q(x,y), where q(x,y) is a symmetric polynomial in two variables.

Proof. Consider the change of variables u=x-y,\ v=x+y which can easily be inverted: x=\frac12(u+v),\ y=\frac12(u-v). Any polynomial in x,y can be written as a polynomial in u,v and vice versa. If p(x,y) is an antisymmetric polynomial, then the polynomial P(u,v)=p\bigl(u(x,y),v(x,y)\bigr) satisfies the property P(-u,v)=-P(u,v) for any u,v. This means that each polynomial P(\cdot,v) obtained by “freezing” the variable v, is odd in u and hence is divisible by u, so that P(u,v)=u\,Q(u,v).   One can easily see that the quotient Q(u,v) is again polynomial in u,v. Thus, returning to the initial variables, we conclude that p(x,y)=(x-y)\,q(x,y). Obviously, q must be symmetric (apply the definition!).

What is common between the two examples?

Consider any set X (e.g., the real line, or the plane \mathbb R^2) and a group \mathbf G=\mathbb Z_2 which consists of two transformations of X into itself. One element is necessarily the identical transformation denoted by I, the other (denoted by S) must be an involution: S\circ S=I.

Real-valued functions on X can be acted upon by the group \mathbf G: if f:X\mapsto \mathbb R is such a function, then the action of I on f is trivial (identical), and S sends f into g=S^*f by the formula g(x)=f(S(x)). We put an asterisk to distinguish between the action of S on points from the action of S on functions.

Problem 7. Prove that all functions on X form a linear space (denoted by \mathbb R(X)). This space is finite-dimensional, if X has finitely many elements. What is the dimension? what plays the role of the zero vector in this space?

Problem 8. Prove that the map S^* is linear.

In these terms, functions symmetric by S^* are defined by the condition S^*f=f, antisymmetric — by the condition S^*f=-f. Is there anything left out?

Problem 9. Find all numbers \lambda\in\mathbb R such that the condition S^*f=\lambda f defines a nontrivial function f\not\equiv0.

Theorem. Prove that any function on X is a sum of a symmetric (by S) and an antisymmetric functions.

Proof. For any function f:X\to\mathbb R consider its symmetrization by the group \mathbf G=\mathbb Z_2: \displaystyle f_+=\frac1{|\mathbf G|}\sum_{G\in \mathbf G}G^*f=\frac12(f+S^*f). Then for any element G\in\mathbf G we have G^* f_+=f_+, since the action of G^* on the sum simply permutes the terms in this sum. We call f_+ the symmetric part of f.

The difference f_-=f-f_+ is antisymmetric for \mathbf G=\mathbb Z_2: indeed, f_-=f-\frac12(f+S^*f)=\frac12(f-S^*f). Application of the linear operator S^* exchanges the terms in the difference and hence changes its sign.

Problem 10. Describe symmetric and antisymmetric polynomials for the central symmetry S:(x,y)\mapsto(-x,-y). Check that the above theorem remains valid in this case as well.

What is special about the group \mathbf G=\mathbb Z_2?

The definition of \mathbf G-symmetric function can be given in the general case when \mathbf G is a  group acting on the set X. A function is \mathbf G-symmetric, if  G^*f=f for any G\in\mathbf G.

Example. The constant functions taking the same value at all points of X, are invariant by any group acting on X. The constant functions form one-dimensional subspace in \mathbb R(X).

The notion of antisymmetric function, however, has to be modified, since the they may behave differently by action of different non-identical elements of \mathbf G. Still, one can suggest an analog of the decomposition of functions into symmetric (even) and odd parts.

Definition. Let \mathbf G be a finite group acting on X, and |\mathbf G| stands for the number of elements in the group. The symmetrization (or averaging over the group) of a function f\in\mathbb R(X) is the function f_+\in\mathbb R(X) defined by the formula

\displaystyle f_+=\frac1{|\mathbf G|}\sum_{G\in \mathbf G}G^*f

Considered as the operator f\mapsto f_+, symmetrization is a linear operator distilling from f its “symmetric part”.

Definition. A function f\in\mathbb R(X) is called \mathbf G-even, if f=f_+. A function f\in\mathbb R(X) is called \mathbf G-odd, if its symmetrization as above is zero, f_+\equiv 0.

Lemma. For any finite group action, symmetric functions are even and vice versa.

Proof. If f is symmetric, then in the above sum G^*f=f for all G, hence f_+=f. Conversely, if f is equal to its average, then for any element H\in\mathbf G the action H^*f by linearity is equal to \displaystyle \frac1{|\mathbf G|}\sum_{g\in\mathbf G}H^*G^*f. But this is the same sum as before, just with permuted terms. (Warning: this is a statement which requires thinking out! You need to use the axioms of the group to check it!).

Proposition (obvious). For any finite group action, any function admits decomposition into symmetric and odd parts. f=f_+f_-, with f_+ symmetric and f_- odd.

The operation taking f into its even and odd part f_\pm is very much like projection on complementary subspaces: (f_+)_+=f_+,\ (f_-)_+=(f_+)_-=0.

Example. Let X be the set of faces of a cube (standard dice):

Rolling dice: any real number can be written on each face

A function on X is a marking: on each face a real number can be written (not necessarily from 1 to 6). This space is 6-dimensional (isomorphic to \mathbb R^6), so any element can be written as a string of 6 numbers. However, this description obscures the symmetry of the dice.

The dice can be rotated around many axes, reflected in a mirror or in the center.

The dice can be rotated around many axes, reflected in a mirror or in the center. Respectively, we can consider the entire group or its proper subgroups. We have quite a few to choose from:

  1. The subgroup \mathbf S_0 generated by the central symmetry, isomorphic to \mathbb Z_2;
  2. Three subgroups \mathbf S_i,\ i=1,2,3,  generated by reflections in the three coordinate planes. Each of these subgroups is isomorphic to \mathbb Z_2;
  3. Three different subgroups  generated by rotations around the three coordinate axis; each such subgroup is cyclic and isomorphic to \mathbb Z_4;
  4. Four different subgroups \mathbf R_i,\ i=1,2,3 generated by rotations around the diagonals of the cube; each such subgroup is isomorphic to \mathbb Z_3.

The only functions that are symmetric by the entire cube group, are constants (the same number is written on all six faces).

  1. For the first subgroup \mathbf S^0,  a function is even, if its values on the opposite faces are the same, and odd, if they differ by sign. The spaces of even and off functions are both 3-dimensional.
  2. For each of the mirror symmetries \mathbf S_i the odd functions form a 1-dim space (values of opposite sign on two opposite faces, zeros everywhere else), while even functions constitute 5-dim susbspace.

Problem. Describe symmetric functions for each of the remaining subgroups, and find the dimension of the corresponding even and odd subspaces.

Why symmetric and antisymmetric functions may  be of any interest?

Because sometimes we have to study symmetric operations on such functions, and the symmetry simplifies this task.

Problem 11. On each face of the cube a positive number is written in such a way that their sum is equal to 613. Every day each of these numbers is (simultaneously) replaced by the average of its 4 neighbors. Compute the result which will appear after one month of such manipulations with the accuracy of 1/1000.

Looks stupid, isn’t it? How can one compute the result without knowing the initial numbers? But wait.

Consider the operator L:\mathbb R(X)\to\mathbb R(X), which transforms a function on the cube (“numbers on the faces”) into the average as described.Clearly, L is a linear operator. If we write its 6\times 6-matrix, it will consist of zeros and the quarters \frac 14 at certain places. The problem is to compute the iterate L^{30}=L\circ\cdots\circ L (thirty times) and apply it to the unknown initial “vector” f_0\in\mathbb R(X). If L were a diagonal operator, this would be an easy task: L^{n} is also diagonal with the eigenvalues being nth powers of eigenvalues of L. Thus we need to find all functions such that Lf=\lambda f for all possible values of \lambda.

This operator “respects the symmetry”, that is, for any symmetry of the cube G\in\mathbf G the identity holds, L\circ G^*=G^*\circ L. This formula means that L is interchangeable with any cube rotation. This commutativity implies many things. For instance, each eigenspace of L must be invariant by all operators G^*:

Lf=\lambda f\implies\forall G\in\mathbf G\quad g=G^*f \text{ also satisfies }Lg=\lambda g\text{ with the same }\lambda.

Yet our goal is the opposite: we have to find the eigenspaces of L, knowing its symmetry group.

Proposition. If L:\mathbb R(X)\to\mathbb R(X) commutes with all elements of a group \mathbf G, then both  \mathbf G-even and \mathbf G-odd functions form two complementary subspaces in \mathbb R(X), both invariant by L and disjoint with each other.

Proof. As before, one has to write the definition of even (resp., odd) function via averaging and apply L to both parts of the equality.

Now the strategy of constructing eigenspaces (subspaces on which L acts as a multiplication by a scalar) is rather simple: one has to consider various subgroups and see how restrictions of L on their even/odd subspaces look like.

  1. Application of L to \mathbf S_0-odd functions is zero. Indeed, for any face its four neighbors form two pairs of opposite faces. If f is  \mathbf S_0-odd, then the sum of its values over these four faces is zero. Thus we have discovered a 3-dimensional subspace on which L has eigenvalue 0.
  2. Consider the (invariant) subspace of \mathbf S_0-even functions and its intersection with \mathbf R_i-even/odd functions for a fixed rotation axis passing through two opposite vertices of the cube. Denote by a,b,c the values on the three faces adjacent to this vertex:

    Three numbers around a vertex

    the rotation permutes these faces cyclically with period 3 (and also permutes cyclically the other three invisible faces on which the same values appear, since we are inside \mathbf S_0-even subspace). A function is \mathbf R_i-even, if a=b=c, and is \mathbf R_i-odd, if a+b+c=0. In the first case all 6 numbers on the faces are equal, the corresponding doubly even functions are constants.

  3. Thus we have the three-dimensional subspace of \mathbf S_0-even functions represented as a sum of 1-dimensional \mathbf R_i-even subspace of constant functions and 2-dimensional \mathbf R_i-odd subspace. Clearly, the constant functions are preserved by L, i.e., the corresponding eigenvalue is equal to 1. To describe the action of L on the odd functions, note that its application replaces the number a on one of these faces  by \frac14(b+c+b+c)=\frac12(b+c). But since a+b+c=0, we have b+c=-a, that is, a is replaced by -\frac 12 a. In other words, L has 1-dimensional eigenspace with the eigenvalue 1 and 2-dimensional subspace with the eigenvalue -\frac12.
  4. The total list of eigenvalues of  L  repeated with their multiplicities, called its spectrum, is 1,-\frac12,-\frac12,0,0,0. Without much computations we diagonalized the matrix of L.

Solution of the Problem 11. Any initial set of numbers written on the faces of the cube, i.e., the initial function f\in\mathbb R(X), can be represented as a sum of three terms, f=f_1+f_{\frac12}+f_0, belonging to the corresponding eigenspaces. The operator L preserves f_1, kills f_0 and multiplies f_{\frac12} by -\frac12. After 30 iterations we obtain L^{30}f=f_1+2^{-30}f_{\frac12}. Since 2^{30}>1000^3=1,000,000,000, we conclude that the result is practically indistinguishable from f_1, that is, after one month the numbers on all faces will be almost the same. To find this common number, we note that the sum of all6 numbers remains the same after application by L. Thus all of the numbers will be very close to 613/6=102\frac16.

To complete the proof, one should be certain that the norm of the vector f_{\frac12} is not very large. This follows from the fact that the norm of the initial vector (the sum of the absolute values of all numbers) is no greater than 613, and the fact that the norm of the projection does not exceed the norm of the vector. We leave the details as an exercise.

Concluding remarks

Clearly, one can recycle the same symmetry arguments to solve an analogous problem for other Plato bodies, like tetrahedron and octahedron: in each case the operator L assigns to a given face the average of values written on all adjacent 3 faces. You may think that the problem with the tetrahedron is simpler because it has only 4 faces, not 8 like the octahedron. Yet the symmetry group of the octahedron is the same as that of the cube! Indeed, centers of faces of the cube are vertices of the smaller octahedron and reciprocally. Thus quite a bit of information can be recycled if working smart.

One may question the choice of the operator L as perhaps artificial. It is not! This operator is a discrete approximation of the Laplace operator which describes propagation of heat along surfaces. The eigenvalues of the Laplace operator are also the principal characteristic of sounding membranes: the collection of the corresponding numbers counted with their multiplicities characterizes main frequences and overtones of drums, loudspeakers etc. The correspondence between acoustic properties and geometry of the membranes is one of the most fascinating questions in mathematics.

For a long time it was believed that the complete spectrum (of infinitely many eigenvalues) completely characterizes the geometry. In a provocative language the question was formulated as follows, “Can one hear the shape of a drum?” by Marc Katz in 1966. However, the conjecture was disproved by John Milnor, first for 16-dimensional “drums” in a paper of record short length (half a page). Later much simpler examples of 2-dimensional membranes were constructed.

As an excellent project, one may suggest computing the spectra of the “discrete Laplacians” for all Plato bodies and compare them. Analogous problems can be posed for various graphs.


Sunday, March 27, 2011

Abel Prize to John W. Milnor

Filed under: Uncategorized — Sergei Yakovenko @ 4:49

The past Tuesday came the announcement that the Abel prize for 2011 is awarded to John Willard (Jack) Milnor, one of the most diverse and elegant mathematicians of all times, «for pioneering discoveries in topology, geometry and algebra». Some of his wonderful achievements are explained at various detail on the Prize web page, e.g., in the popular exposition by Tim Gowers. Yet for the majority of “rank and file” mathematicians Milnor is the author of incredibly clearly written expository books. The legendary “Morse Theory” with its 160 small pages serves as an introduction to singularity theory, Riemannian geometry, calculus of variations and differential topology. The book “Topology from the differential viewpoint” is a must read for any analyst or geometer… it is impossible to believe that these books were written by a mature mathematician aged less than 35 and which remain the standard reference for half a century!

Even on the level of the Guinness book of records one can mention a few surprising facts about Milnor:

  • he is one of the few mathematicians whose first recorded publication at age 18+ is in the “Annals of Mathematics”, probably the most competitive journal;
  • he is the author of one of the shortest mathematical papers solving a long-standing problem, – a “pure serendipity”, as he himself describes this half-page article proving that one can not always hear the shape of 16-dimensional drum.

Among less known achievements of John Milnor are his three papers on axiomatic approach to the game theory, written between 1951 and 1955. Although he never returned to this area again, the encyclopedic treatise of Luce and Raiffa “Games and Decisions” devotes several dozens of pages describing and discussing the works of 20-years-old maître.

Thus you might imagine the fascination with which a recent student would meet such a mountainous figure…

John Milnor, S.Y. and Misha Lyubich

Guanajuato, Mexico, February 1991, Milnor-60-fest (with Misha Lyubich)

Yet in the real world Jack is one of the most modest, soft-speaking and delicate people.

Excellent decision, dear Abel Prize committee members! Congratulations and admiration, dear Jack! Till 120 at the same pace!

Tuesday, March 22, 2011

Symmetry: new course for the high school math teachers

Filed under: lecture,Rothschild Course "Symmetry" — Sergei Yakovenko @ 2:53

What is a symmetry?

There is no “rigorous definition”. Informally, the notion of symmetry appears wherever we have:

  • a collection (set) of transformations (actions, operations, …) which can be applied to these objects and transform them again into objects of the same class;
  • a collection (set) of objects (points, figures, variables, functions, rules of the game…).

Then usually, but not always, the result of the transformation of an object is a new, different object. However, there may be exceptions: some objects may stay the same after a nontrivial transformation. Then we say that such objects are symmetric and the transformations preserving the object are called its symmetries.

Sometimes the mere understanding of the symmetry can be a key to solving the problem.
Problem. For which values of the parameters a,b\in\mathbb R the system of algebraic equations


admits only one solution?

Solution. Consider the transformation of cyclic change of variables, x\mapsto y,\ y\mapsto z,\ z\mapsto x. This transformation can be applied to each of the three equations and preserves all of them. Therefore if we have a solution (x_0,y_0,z_0), then the triple (y_0,z_0,x_0) is also a solution. If the system has only one solution, then it should stay the same, i.e., x_0=y_0,\ y_0=z_0,\ y_0=z_0. In other words, all three components should be equal to the same number c\in R. The first equation instantly implies that c=\frac13. Substituting this value to the remaining equations, we see that 3\times\frac19=\frac13=b and 3\times\frac1{27}=b=\frac19.

Problem. Can the above system have exactly two solutions for some values of a,b?

Problem. Consider the regular polygon with n vertices, inscribed in a circle. Denote by O the center of the circle and A_1,\dots,A_n the vertices (see the picture for n=5):

Prove that the vector sum \overline{OA_1}+\overline{OA_2}+\cdots+\overline{OA_n}=0.

Solution. Consider the rigid rotation of the plane around the origin O by the angle 360^\circ/n. Then the circle and the polygon will be preserved, only the labels of the vertices will change as follows: A_1\mapsto A_2,\ A_2\mapsto A_3,\ \dots,\ A_n\mapsto A_1. Thus all terms in the above  vector sum will undergo a cyclic permutation, which does not change the result. If we denote this sum by S, then the conclusion is that the rotation of S by  360^\circ/n is equal to S. Yet the only vector that does not change after a rotation (other than by 360^\circ) is the zero vector.

Problem. Prove that for a\in\mathbb N integer and p prime, the difference a^p-a is divisible by p. (Little Fermat theorem)

Solution. Consider necklaces (מחרוזת) of p beads (חרוזים) each, made in a distinct colors, and let’s count them. We can choose each bead by a distinct ways, so the total number appears to be a^p.  However, this is not the right answer, since the necklaces which differ by rotation we counted several times:


A “completely asymmetric” necklace, without any regularity in the pattern, will be counted p times, since we have exactly p rotations. On the other hand, the monochromatic (single-color) necklaces will be counted only once: there exist a of them.

Are there any intermediate symmetric necklaces? If after rotation by q “steps” we obtain the initial necklace, then this would mean that the color pattern is periodic with period q. But the whole necklace should consist of an entire number of such “periods”, i.e., p should be divisible by q. But by assumption, p is prime! This means, that except for the monochromatic necklaces, there are no “partially symmetric” ones, only completely asymmetric (i.e., such, that any nontrivial rotation produces a different coloring scheme). Now we conclude that

(total count of coloring schemes) a^p = (number of single-color schemes) a + p\times(number of non-symmetric colorings, an integer number).

This proves that a^p-a is divisible by p.

Problem. Prove the little Fermat theorem by induction in a. Hint: consider the binomial coefficient \binom {p}{k}=\frac{p!}{k!(p-k)!} and show that for p prime it is always divisible by p except when k=0 or k=p.

Blog at