Archive for the ‘Physics’ Category

Mathematics and Physics: Tension with Interaction

It is pretty common to hear about interactions between mathematics and physics. If one searches for this phrase on the web, one finds many articles that delve into the topic. Perhaps the close link between physics and mathematics is not that surprising, as the two co-evolved early on under the name of natural philosophy (see Evelyn Lamb’s Symmetry article “The Coevolution of Physics and Math“). The term natural philosophy has gone out of fashion due to the divergent paths taken by mathematics, physics and philosophy itself (see the problems with calls for a return to natural philosophy discussed by Arran Gare in his article, “Natural Philosophy and the Sciences: Challenging Science’s Tunnel Vision“). Thus it may not be surprising to have physics and mathematics emerge as separate domains that nevertheless interact in some nominal fashion.

So I find it refreshing to find an article about the tension between mathematics and physics instead, i.e. Miklos Redei, “On the Tension Between Physics and Mathematics“, J. Gen. Phil. Sci. 51 (2020) 411-425. Redei is a Professor of Philosophy of Science at LSE and has written numerous articles on the philosophy and foundations of physics. On my wishlist is his book “The Principle of the Common Cause“, co-authored with G. Hofer-Szabo & L.E. Szabo. Why are tensions more interesting? They tend to drive new ideas in both physics and mathematics. Redei classifies the tensions into two types:

  • Tension of Type I: Unavailability of ready-made mathematical concepts needed for physics;
  • Tension of Type II: Availability of mathematical concepts but dismissal of rigour in the further development of physics.

In the former, it is clear that there is a need for new mathematical ideas; the historical examples given are calculus for Newtonian mechanics and the spectral theory of self-adjoint operators for quantum mechanics. The other two examples are perhaps the more interesting ones, as they are as yet unsolved: ergodic theory for statistical mechanics and operator-valued tempered distributions for quantum field theory. The case of quantum field theory is not surprising, and there are many attempts at putting quantum field theory on a rigorous basis by axiomatization or through constructive quantum field theory. A relatively recent review of CQFT can be found in Summers’ “A Perspective on Constructive Quantum Field Theory” (arXiv: 1203.3991). Ergodic theory is lesser known to me, and for it Redei mentions the reference of Szasz, “Boltzmann’s Ergodic Hypothesis, A Conjecture for Centuries” (the book is open access).

For Type II tension, my first impression is that it is not totally unrelated to Type I, with quantum field theory coming to mind. For instance, functional integrals are often used in field theories without rigour (see here). Redei mentions the Jaffe and Quinn article “Theoretical Mathematics: Towards a Cultural Synthesis of Mathematics and Theoretical Physics”, which sparked debates. See responses here. How Type II tension can lead to progress is not as clear as for Type I, but mathematical rigour can always uncover new problems that may lead to new progress. An example that Redei mentions is von Neumann’s own dismissiveness of the Hilbert space formulation of quantum mechanics in favour of an operator-algebraic approach. Redei wrote an article on this: Why John von Neumann Did Not Like the Hilbert Space Formalism of Quantum Mechanics (and What He Liked Instead). I recall that Bob Coecke often referred to this reaction of von Neumann’s in arguing for a more intuitive categorical/pictorial approach. Another reference mentioned is Valente’s “John von Neumann’s Mathematical “Utopia” in Quantum Theory“. I will need to read these papers to understand von Neumann’s standpoint.

Lecture: Set Mappings

Mappings

Sets and their operations may be restrictive when considered on their own. More interesting things happen when one allows a particular set to ‘transform’ into another.

Let X,Y be two sets. A mapping \varphi from X to Y, denoted \varphi: X\rightarrow Y, is a subset of X\times Y such that for every x\in X there is a unique y\in Y for which (x,y)\in\varphi, i.e.

(x,y)\in\varphi\quad\textrm{and}\quad(x,y')\in\varphi\Rightarrow y=y' .

These are also called functions or maps and they are normally written as y=\varphi(x) or \varphi:x\mapsto y=\varphi(x). The element x is called the source and y=\varphi(x) is the image. A map \varphi:X\rightarrow X is called a transformation of X. Note also, the subset \varphi\subseteq X\times Y is also called the graph of the function \varphi. The set X is called the domain and Y is the target or codomain.

The subset \varphi(X) defined by

Y\supseteq \varphi(X)=\{y\in Y\vert y=\varphi(x),\ x\in X\}

is called the range. This is to be contrasted with the codomain which is Y itself.

Let U\subseteq Y. Then the inverse image of U is

\varphi^{-1}(U)=\{x\in X\vert\varphi(x)\in U\} .

Note that this definition applies even if an inverse map \varphi^{-1} does not exist; the inverse image may then simply be the empty set.
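For a finite domain, the inverse image can be computed directly from this definition. Below is a small Python sketch (the function and sets are my own illustrative choices):

```python
def preimage(f, U, domain):
    """Inverse image f^{-1}(U) = {x in domain : f(x) in U}."""
    return {x for x in domain if f(x) in U}

square = lambda x: x * x
assert preimage(square, {1, 4}, range(-3, 4)) == {-2, -1, 1, 2}
# The definition still makes sense when no inverse map exists:
assert preimage(square, {3}, range(-3, 4)) == set()
```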

Example: Consider the map \sin:\mathbb{R}\rightarrow\mathbb{R}. The inverse image of 0 is

\sin^{-1}(0)=\{0,\pm\pi,\pm 2\pi,\cdots\} ,

while for \vert x\vert >1, its inverse image is

\sin^{-1}(x)=\emptyset .

It is convenient at this stage to introduce some terminology that will later parallel the language of geometry. Given a map \varphi: X\rightarrow Y, the image im\varphi of \varphi is a subset of Y, i.e. im\varphi\subseteq Y. Note that im\varphi is non-null if X is non-null. The set

\{x\in X\vert\varphi(x)=y\}

is called the fibre of \varphi over y\in Y. A fibre may be null (empty) if \varphi is not surjective. The fibres of the map \varphi are also called levels or contours of \varphi. The set of non-null fibres of \varphi is called the coimage of \varphi.

Next, a map can be made into an n-ary function from X to Y, i.e. \varphi: X^n\rightarrow Y with

y=\varphi(x_1,x_2,\cdots,x_n)

for ((x_1,x_2,\cdots,x_n),y)\in\varphi, where \varphi is said to have n arguments. One can further generalize this by letting each argument come from a different space, i.e.

\varphi: X_1\times X_2\times\cdots\times X_n\rightarrow Y ,

where X_i\neq X_j for some i\neq j. An important map of this nature is the projection map

\begin{aligned} pr_i:&X_1\times X_2\times\cdots\times X_n\rightarrow X_i\\&(x_1,x_2,\cdots,x_n)\mapsto x_i\end{aligned}

where x_i\in X_i. The projection pr_i essentially gives the ith component of the domain.

If \varphi:X\rightarrow Y and \psi:Y\rightarrow Z are maps, then the composition map \psi\circ\varphi:X\rightarrow Z is defined by

\psi\circ\varphi(x)=\psi(\varphi(x)) .

Note that the codomain and domain of consecutive maps must match. The composition of mappings satisfies the associative law. Suppose \varphi:X\rightarrow Y, \psi: Y\rightarrow Z and \alpha: Z\rightarrow W are maps. Then

\alpha\circ(\psi\circ\varphi)=(\alpha\circ\psi)\circ\varphi .

Thus one can write unambiguously \alpha\circ\psi\circ\varphi.
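The composition and its associativity can be checked concretely. A small Python sketch with three hypothetical maps:

```python
def compose(g, f):
    """Composition g ∘ f : x ↦ g(f(x)); the codomain of f must match the domain of g."""
    return lambda x: g(f(x))

phi   = lambda x: x + 1   # plays the role of φ : X → Y
psi   = lambda y: 2 * y   # ψ : Y → Z
alpha = lambda z: z - 3   # α : Z → W

lhs = compose(alpha, compose(psi, phi))   # α ∘ (ψ ∘ φ)
rhs = compose(compose(alpha, psi), phi)   # (α ∘ ψ) ∘ φ
assert all(lhs(x) == rhs(x) for x in range(-5, 6))
```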

Surjections, Injections & Bijections

Mapping \varphi: X\rightarrow Y is surjective if its range is all of Y i.e.

\varphi(X)=Y .

We say that \varphi is a mapping of X onto Y or \varphi is a surjection.

Mapping \varphi: X\rightarrow Y is injective if every y\in Y has at most one preimage x\in X, i.e.

\varphi(x)=\varphi(x')\Leftrightarrow x=x' .

We say that \varphi is a one-to-one mapping of X into Y or \varphi is an injection.

A mapping \varphi that is both injective and surjective is called bijective. It then admits an inverse map \varphi^{-1}: Y\rightarrow X with the property

\varphi^{-1}(\varphi(x)) = x\quad;\quad\forall x\in X .

Two sets X,Y are in one-to-one correspondence if there exists a bijection \varphi: X\rightarrow Y. If Y=X, the bijective map \varphi: X\rightarrow X is called a transformation or permutation of X. The most trivial bijection is the identity map or identity transformation id_X where

id_X(x)=x\quad\forall x\in X .

This gives the diagonal graph

id_X=\{(x,x)\vert x\in X\}\subseteq X\times X .

When \varphi: X\rightarrow Y is bijective with inverse \varphi^{-1}, then

\varphi^{-1}\circ\varphi=id_X\quad,\quad\varphi\circ\varphi^{-1}=id_Y .

If \varphi,\psi are bijections (with matching codomain and domain), then so is \psi\circ\varphi, with inverse

(\psi\circ\varphi)^{-1}=\varphi^{-1}\circ\psi^{-1} .

Exercise: Verify this.

If U\subseteq X and \varphi:X\rightarrow Y, then the restriction of map \varphi to U is the map

\begin{aligned} \varphi\vert_U: &U\rightarrow Y\\ &x\mapsto \varphi\vert_U(x)=\varphi(x)\quad,\quad\forall x\in U\end{aligned} .

Applying restriction to the identity map, one obtains

i_U=id_X\vert_U: U\rightarrow X\equiv U\hookrightarrow X ,

which is called the inclusion map for U. It can now be seen that the restriction of \varphi to U is the composition of \varphi with the inclusion map:

\varphi\vert_U=\varphi\circ i_U .

Example: Let U\subseteq X. Define characteristic function of U by \chi_U:X\rightarrow \{0,1\} as follows:

\chi_U(x)=\begin{cases} 0\quad\textrm{if}\ x\not\in U\\ 1\quad\textrm{if}\ x\in U\end{cases}   .

Conversely, any function \varphi: X\rightarrow\{0,1\} is by definition the characteristic function of the subset U\subseteq X whose points map to 1:

\varphi = \chi_U\quad\textrm{where}\quad U=\varphi^{-1}(\{1\}) .

Thus, the set of all possible subsets of X, i.e. 2^X, is in one-to-one correspondence with the set of all possible maps \varphi:X\rightarrow\{0,1\}.
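For a small finite X this correspondence can be made explicit. A Python sketch (the three-element set is my own example):

```python
from itertools import product

X = ['a', 'b', 'c']

def chi(U):
    """Characteristic function of U, represented as a dict X -> {0, 1}."""
    return {x: 1 if x in U else 0 for x in X}

def subset_of(phi):
    """Recover U = phi^{-1}({1}) from a map phi : X -> {0, 1}."""
    return frozenset(x for x in X if phi[x] == 1)

# Enumerate all 2^|X| subsets and check the round trip U -> chi_U -> U.
subsets = [frozenset(x for x, bit in zip(X, bits) if bit)
           for bits in product([0, 1], repeat=len(X))]
assert len(subsets) == 2 ** len(X)
assert all(subset_of(chi(U)) == U for U in subsets)
```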

Example: Let R be an equivalence relation on a set X. A canonical map \varphi: X\rightarrow X/R from the set X to the factor space X/R is given by

\varphi(x)=[x]_R .

The map is onto since every x belongs to exactly one equivalence class, and the factor space is simply the partition of X by the equivalence relation R.

Example: Suppose \varphi: X\rightarrow Y is a bijection. Another bijection \phi: X\rightarrow Y can be introduced by composing with a transformation/permutation \alpha: X\rightarrow X or \beta: Y\rightarrow Y:

\begin {matrix} X&\overset{\varphi}{\longrightarrow}&Y\\ ^{\alpha}\uparrow&\nearrow_\phi &\ \\ X&\ &\ \end{matrix}   and   \begin{matrix} X&\overset{\varphi}{\longrightarrow}&Y\\ \ &_{\phi}\searrow&\downarrow ^{\beta}\\ \ &\ &Y\end{matrix}

 

Proposition: Let \varphi:X\rightarrow Y and \phi:W\rightarrow X be maps. Then

  • \varphi and \phi are surjective   \Rightarrow   \varphi\circ\phi is surjective;
  • \varphi and \phi are injective   \Rightarrow   \varphi\circ\phi is injective;
  • \varphi\circ\phi is surjective   \Rightarrow   \varphi is surjective;
  • \varphi\circ\phi is injective   \Rightarrow   \phi is injective.

Corollary: Let \varphi: X\rightarrow Y and \phi: W\rightarrow X be maps. Then

  • \varphi and \phi are bijective   \Rightarrow   \varphi\circ\phi is bijective;
  • \varphi\circ\phi is bijective   \Rightarrow   \varphi is surjective and \phi is injective.

 


Dynamical Systems: An Outsider’s Glance Part I

I will leave the Bengtsson & Zyczkowski book for a while to look into dynamical systems and nonlinear dynamics, a topic I covered in my talk in India.

The subject of dynamical systems is very broad (whether approached from physics or mathematics), which makes it hard for a beginner (me) to survey. It is perhaps better for me to explain why I am interested in this area. The procedure of quantization often begins with a classical (dynamical) system, and a major interest is in quantizing (a particle system on) nonlinear configuration spaces. Most of the time we treat the nonlinearities intrinsically by adopting the appropriate canonical variables to be quantized (or quantization by constraints). Some of these nonlinear systems can be classically chaotic, such as particles on hyperbolic surfaces (see the review article “Chaos on the Pseudosphere” by Balazs and Voros or “Some Geometrical Models of Chaos” by Caroline Series), and it is often pondered how such behaviour translates into the quantum regime. My original concern in this topic is mostly how the complex topologies of hyperbolic surfaces get encoded in quantum theory. We will defer such discussions to a later time.

What is a dynamical system? There are three ingredients to a dynamical system:

  • Evolution parameter (usually time) space T;
  • State space M;
  • Evolution rule F_t: M\rightarrow M,\quad t\in T.

Note that T can be \mathbb{Z}^+, \mathbb{Z} or I\subseteq\mathbb{R}; in the first two cases one speaks of discrete dynamical systems, whose evolution rule is usually a difference equation, while the last gives continuous dynamical systems, whose evolution rule is described by an ordinary differential equation. One can also discretize the state space itself to give cellular automata with their update rules (see also graph dynamical systems). Perhaps the most famous cellular automaton (CA) is Conway’s Game of Life, built on a two-dimensional (rectangular) lattice, whose update rule is given by

  • Any live cell with fewer than two live neighbours dies (underpopulation).
  • Any live cell with two or three live neighbours lives.
  • Any live cell with more than three live neighbours dies (overpopulation).
  • Any dead cell with exactly three live neighbours becomes live (regeneration).

Fascinating configurations can be constructed from such simple rules. Even simpler are the elementary one-dimensional CAs, which Wolfram classified by their update rules: rule 0 to rule 255 (256=2^{2^3} rules). It was proven by Cook that rule 110 is capable of universal computation, i.e. it can emulate a universal Turing machine. Note that in the above CAs there are only two states (live or dead). One can generalise the number of states beyond two, e.g. a CA with a three-valued state (RGB) can be used in pattern and image recognition (see e.g. https://www.sciencedirect.com/science/article/pii/S0307904X14004983).
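The Game of Life rules above fit in a few lines of Python. A minimal sketch (representing a configuration by its set of live cells on an unbounded grid):

```python
from collections import Counter

def life_step(live):
    """One update of Conway's Game of Life.

    `live` is the set of (x, y) coordinates of live cells; every other
    cell of the (unbounded) grid is dead.
    """
    # Count the live neighbours of every cell adjacent to some live cell.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, n in counts.items()
        # n == 3: survival or regeneration; n == 2: survival only.
        if n == 3 or (n == 2 and cell in live)
    }

# The 'blinker' (a row of three live cells) oscillates with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
assert life_step(life_step(blinker)) == blinker
```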

One can further abstract the notion of a dynamical system, as done by Giunti & Mazzola in “Dynamical Systems on Monoids: Toward a General Theory of Deterministic Systems and Motion” (see also here). We will not pursue this, but instead mention two other cases normally not classed as dynamical systems.

I would like to add to the above some dynamical systems encountered in (theoretical) physics that ought to be differentiated from the classes above. First, there are systems whose equations are partial differential equations, i.e. with differential operators not only with respect to time but also with respect to space. The most notable example is the Navier-Stokes equation that governs fluids. One should also mention relativistic systems, whose evolution parameter might not even be separably identified from the spatial coordinates. Relatively recently, techniques of (conventional) dynamical systems have found their way into cosmology, via the long-term behaviour of cosmological solutions and by reducing the full Einstein equations (PDEs) to simpler ones. See the book of Alan Coley or the book of Wainwright & Ellis. See also the articles of Boehmer & Chan and Bahamonde et al. (published version here). It is interesting to note that the author of the former article, Dr. Nyein Chan, was previously at Swinburne University, Sarawak Campus. He has probably returned to his home country, Myanmar (his story can be read here).

To discuss the geometry of dynamical systems, I make extensive use of the notes by Berglund (arXiv: math/0111177). To start, a dynamical system is specified by a first-order ODE, the dynamical equation:

\dot{x}^i =\cfrac{dx^i}{dt}=f^i(x)\quad (i=1,\cdots,n) ,

where the x^i are coordinates on the state space M. For any functions F_i on M, the chain rule gives

\cfrac{dF_i}{dt}=\cfrac{\partial F_i}{\partial x^j}\cfrac{d x^j}{dt}\equiv f^j\cfrac{\partial}{\partial x^j}(F_i)=f^j\partial_j (F_i)   .

Note that f^j\partial_j can now be treated as a vector field (as one does in the usual (local) coordinate-based differential geometry). Vector fields (despite their local coordinatization, whose transformation law is known) encode geometric information about the state space they live on. The easiest way to see this is the hairy ball theorem, exemplified by the statement that one can’t comb the hair of a coconut. On the other hand, one can do so on the torus (the surface of a doughnut). Technically this is due to the nonvanishing Euler characteristic of the sphere (in the case of the torus, it vanishes).

The standard example of a dynamical system comes from mechanical systems (say, one particle obeying Newton’s laws). However, the Newtonian equations for such systems are second-order ODEs. This simply implies that the mechanical state should be a pair of variables, say positions and momenta (q^i,p_i), forming the phase space \mathbb{R}^{2n}. Formulating the mechanical system as Hamiltonian mechanics, one can rewrite the Newtonian equations of motion as two sets of first-order ODEs known as Hamilton’s equations:

\dot{q}^i=\cfrac{\partial H}{\partial p_i}\quad;\qquad\dot{p}_i=-\cfrac{\partial H}{\partial q^i}\quad,

where H is the Hamiltonian of the system. It is convenient to rewrite this equation using another algebraic structure known as Poisson bracket which is defined as

\{ f,g \}=\sum_i \left(\cfrac{\partial f}{\partial q^i}\cfrac{\partial g}{\partial p_i}-\cfrac{\partial g}{\partial q^i}\cfrac{\partial f}{\partial p_i}\right) .

Then one can rewrite Hamilton’s equations as

\dot{q}^i=\{q^i,H\}\quad;\qquad\dot{p}_i=\{p_i,H\} .

The convenience is that the dynamics is contained in the algebraic form of the Poisson bracket. Thus, studying the Poisson bracket structure is equivalent to studying the dynamical structure. One can further ‘geometrize’ this algebraic structure by considering the vector fields on the phase space \xi_{q^i}=\partial / \partial q^i and \xi_{p_i}=\partial / \partial p_i and writing Hamilton’s equations as

\dot{q}^i = \omega(\xi_H,\xi_{p_i})\quad;\qquad\dot{p}_i=\omega(\xi_{q^i},\xi_H) ,

where \omega=\sum_i dq^i\wedge dp_i is a covariant antisymmetric tensor known as the symplectic form and

\xi_H= \cfrac{\partial H}{\partial p_i}\cfrac{\partial}{\partial q^i} - \cfrac{\partial H}{\partial q^i}\cfrac{\partial}{\partial p_i} .

The manifold (space) equipped with the symplectic form is known as the symplectic manifold. With the Poisson bracket replaced by the symplectic form, one can simply study the properties of the symplectic form to know about the dynamics. Finding symmetries preserving the symplectic form has become the basis of (some) quantization procedure.
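As a small numerical aside (my own sketch, not from the references above), Hamilton’s equations for the harmonic oscillator H=(p^2+q^2)/2 can be integrated with the symplectic Euler scheme, which respects the phase-space structure well enough that the energy error stays bounded over long times:

```python
def symplectic_euler(q, p, dH_dq, dH_dp, dt, steps):
    """Integrate dq/dt = dH/dp, dp/dt = -dH/dq with symplectic Euler."""
    for _ in range(steps):
        p = p - dt * dH_dq(q)  # momentum update uses the old position
        q = q + dt * dH_dp(p)  # position update uses the new momentum
    return q, p

# Harmonic oscillator H = (p**2 + q**2)/2: dH/dq = q, dH/dp = p.
q, p = symplectic_euler(1.0, 0.0, lambda q: q, lambda p: p, 0.01, 10_000)
energy = 0.5 * (q * q + p * p)
# The energy error remains of order dt over many periods.
assert abs(energy - 0.5) < 0.01
```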

The motivation to study dynamical systems is to learn about chaotic dynamical systems. The word chaos conjures images like the ones below (a favourite picture from the Bender & Orszag book, and a billiard):

BenderOrszag

Source: Bender & Orszag, “Advanced Mathematical Methods for Scientists and Engineers” (McGraw Hill, 1978) Fig.4.23 on page 191.

 

However, the iconic diagram one associates with chaotic dynamical systems is the two-winged Lorenz butterfly diagram (more on this later), which, I thought, does have structure. In such a system it is the sensitivity to initial conditions of the orbits traversed that plays the characteristic role. The orbits above are perhaps closer to a different concept, that of the ergodic hypothesis. How sensitivity to initial conditions came to be called chaos is quite interesting. A more mundane name for the whole subject is nonlinear dynamics, which was in use before the term chaos became popular.

So how does one get a useful handle on such systems with complicated behaviour? One begins by looking for simple solutions, i.e. stationary solutions. Recall F:M\rightarrow M and f_i=dF_i / dt. (Note: at times I will not write out the indices; they should be understood contextually.) A stationary solution is one that obeys f_i=0, i.e. it doesn’t change with time. Of related interest are fixed points x^\ast such that F(x^\ast)=x^\ast, also called equilibrium points. Points x^\ast for which f(x^\ast)=0 are called singular points of the vector field f, also called stationary orbits.

We can now explore solutions nearby the equilibrium point x = x^\ast + y for which

\dot{y}=f(x^\ast + y)\simeq A y + g(y)

where

A= \left. \cfrac{\partial f}{\partial x}\right\vert_{x^\ast}\quad;\qquad \lVert g(y)\rVert\leq M\lVert y\rVert^2 ,

i.e. one linearizes the equation, with the higher-order terms assumed to be bounded quadratically in \lVert y\rVert. In the linear case (g(y)=0), one has

\dot{y}=Ay\qquad\Rightarrow\qquad y(t) = e^{At} y(0) .

Thus, one can see that the eigenvalues a_j of A will play an important role in the long-term behaviour of the solutions.

To take advantage of this fact, one can use the projectors P_j onto the eigenspaces of the a_j to study the behaviour of equilibrium points. Construct projectors onto the sectors of eigenvalues

P^+=\sum_{\textrm{Re} (a_j)>0} P_j\quad;\qquad P^-=\sum_{\textrm{Re} (a_j)<0} P_j\quad;\qquad P^0=\sum_{\textrm{Re} (a_j)=0} P_j .

and define subspaces

E^+=P^+ M=\left\{y:\lim_{t\rightarrow -\infty} e^{At} y=0\right\} ;

E^-=P^- M=\left\{y:\lim_{t\rightarrow \infty} e^{At} y=0\right\} ;

E^0=P^0 M ,

which are called, respectively, the unstable subspace, stable subspace, and centre subspace of x^\ast; they are invariant subspaces of e^{At}. With respect to these spaces, one can classify the equilibrium points. The equilibrium point is a sink if E^+=\{0\}=E^0; a source if E^-=\{0\}=E^0; a hyperbolic point if E^0=\{0\}; and an elliptic point if E^+=\{0\}=E^-. Note that one has a richer variety of equilibrium points than in the one-dimensional case, simply because there are more ‘directions’ to consider in higher-dimensional cases (characterised by the eigenvalues of A). To illustrate this, we consider the two-dimensional case with two eigenvalues a_1,\ a_2 (borrowing a diagram from Berglund):

EquilibriumPoints

Case (a) refers to a node, for which a_1a_2>0 (arrows either all pointing in or all out). Case (b) is a saddle point, for which a_1a_2<0. Cases (c) and (d) occur when a_1,a_2\in\mathbb{C}, giving rotational (or oscillatory) motion in phase space. Cases (e) and (f) are more complicated versions of nodes, occurring when the eigenvalues are degenerate (please refer to Berglund for details). At this juncture, it is appropriate to mention the related concept of basins of attraction, which appears in the chaotic dynamics literature. In particular, one has the concept of a strange attractor, arising from the fact that while F is assumed continuous, the vector field may be singular at some points, giving rise to space-filling structures known as fractals (see this article).
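The classification can be made concrete for the two-dimensional linear system \dot{y}=Ay. A Python sketch (my own, computing the eigenvalues from the trace and determinant of A):

```python
import cmath

def classify_equilibrium(a, b, c, d):
    """Classify the origin of dy/dt = A y for A = [[a, b], [c, d]]
    by the real parts of the eigenvalues of A."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    e1, e2 = (tr + disc) / 2, (tr - disc) / 2
    re1, re2 = e1.real, e2.real
    if re1 < 0 and re2 < 0:
        return "sink"
    if re1 > 0 and re2 > 0:
        return "source"
    if re1 * re2 < 0:
        return "saddle (hyperbolic)"
    return "elliptic/centre or degenerate"

print(classify_equilibrium(-1, 0, 0, -2))  # eigenvalues -1, -2: sink
print(classify_equilibrium(1, 0, 0, -1))   # eigenvalues +1, -1: saddle
print(classify_equilibrium(0, 1, -1, 0))   # eigenvalues ±i: elliptic/centre
```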

To proceed beyond the linear case, one needs an extra tool, namely Lyapunov functions (often quadratic forms), which set up level curves that phase-space trajectories can approach or cross, and which help characterise the stability of equilibrium points in general. Lyapunov functions are those functions V(x) such that

  • V(x)>V(x^\ast) for x in a neighbourhood of x^\ast;
  • its derivative along orbits, \dot{V}(x)=\nabla V(x)\cdot f(x), is negative, showing that x^\ast is stable.

To illustrate this, we borrow again a diagram of Berglund to show how phase space orbits approach or cross the level curves of V(x).

StabilityLyapunov

Cases (a) and (b) are, respectively, stable and asymptotically stable equilibrium points, where trajectories cut the level curves in the direction opposite to their normals. Case (c) is that of an unstable equilibrium point, generalizing the linear case, where orbits may approach the point in one region and move away in another. Such orbits are called hyperbolic flows. This is essentially the case of interest. Note in particular that if one reverses the arrows, the two separate regions of stable and unstable spaces are exchanged, and the special status of hyperbolicity is preserved.
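A minimal illustration of these conditions (my own example, not from Berglund): for the system \dot{x}=-x^3 the linearization at the origin vanishes, yet V(x)=x^2 is a Lyapunov function since \dot{V}=2x\cdot(-x^3)=-2x^4<0 away from the origin:

```python
def lyapunov_derivative(grad_V, f, x):
    """Derivative of V along orbits: dV/dt = grad V(x) . f(x) (1-d case)."""
    return grad_V(x) * f(x)

# System x' = -x**3: the linearization at x* = 0 is trivial (A = 0),
# so the eigenvalue analysis is silent, but V(x) = x**2 still
# certifies stability of the origin.
f = lambda x: -x**3
grad_V = lambda x: 2 * x
for x in (-1.0, -0.5, 0.3, 2.0):
    assert lyapunov_derivative(grad_V, f, x) < 0  # V decreases along orbits
```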

One can now state a result known in the literature: given a hyperbolic flow \varphi_t on some K\subset M, a neighbourhood of a hyperbolic equilibrium point x^\ast, there exist local stable and unstable manifolds

W^s_{\textrm{loc}}(x^\ast):=\{x\in K:\lim_{t\rightarrow\infty}\varphi_t(x)=x^\ast\ \textrm{and}\ \varphi_t(x)\in K,\ \forall t\ge 0\} ;

W^u_{\textrm{loc}}(x^\ast):=\{x\in K:\lim_{t\rightarrow -\infty}\varphi_t(x)=x^\ast\ \textrm{and}\ \varphi_t(x)\in K,\ \forall t\le 0\} .

For further technical details, consult Araujo & Viana, “Hyperbolic Dynamical Systems” (arXiv:0804.3192 [math.DS]) and Dyatlov, “Notes on Hyperbolic Dynamics” (arXiv:1805.11660 [math.DS]).

Examples for which hyperbolic flows are known are geodesic flows on negatively curved (hyperbolic) surfaces and billiards in Euclidean domains with concave boundaries (see Dyatlov). Hyperbolicity then became a paradigm for structurally stable ergodic systems, as discussed by Smale in the 1960s (see Smale, “Differentiable Dynamical Systems“, Bull. Amer. Math. Soc. 73 (1967) 747-817). Meanwhile, unknown to the mathematicians of the time, E. Lorenz had discovered a dynamical system that was neither hyperbolic nor structurally stable (see Lorenz, “Deterministic Nonperiodic Flow“, J. Atmosph. Sci. 20 (1963) 130-141). A new paradigm is needed to account for such systems. However, we will defer this discussion to a future post.

 

Convex Geometry for Quantum Theory

Restarting this (technical) blog by reporting on what I have been reading these past few weeks. For now, I have returned to the book by Bengtsson & Zyczkowski, “Geometry of Quantum States“. I must thank Prof. Zyczkowski for giving me a complimentary copy of this book, which I have recommended to the graduate students. The first chapter is on a topic that some (especially students) might not expect, namely “Convexity, Colours and Statistics”; let me just focus on why convex geometry. Those who work on quantum foundations will probably know why, as I hope to discuss below.

First, convex geometry is simply the study of the geometry of convex sets. A convex set is a subset of Euclidean space \mathbb{E}^n such that any two of its points \mathbf{x}_1, \mathbf{x}_2 can be combined as

\mathbf{x}=\lambda_1\mathbf{x}_1+\lambda_2\mathbf{x}_2\ ,\quad \lambda_1,\lambda_2\ge 0\ ,\quad \lambda_1+\lambda_2=1

and \mathbf{x} will still belong to the set. It should be stated here that the linear addition of points necessitates that the space they belong to is a linear space (hence the use of \mathbb{E}^n). One could in fact remove the requirement of a special point (say, the origin) of vector spaces and consider the embedding space to be simply an affine space, with the linear addition operation still in place.

Before we move on to how this is related to quantum theory, let us mention some applications where convex spaces or convex geometry are used (see also here). Perhaps the first place one hears the word convex is in convex functions (see also here). Their use is very much related to the idea of minimization; a local minimum of a convex function is a global minimum. The relationship between convex functions and convex sets is that the epigraph (the region above the function’s graph) of a convex function is a convex set. Much more complex problems on convex functions arise in the subject of convex analysis, but we will not go into details. Not totally unrelated to optimization are the underlying geometrical ideas that led to the subject of convex geometry, which finds uses in tiling problems, robotic motion and image analysis (see also here). One interesting application is the idea of convex relaxation, in which hard problems are approximated by relevant convex optimization problems. See this lecture by Sam Wong of Microsoft Research at https://www.youtube.com/watch?v=_2pMKksyeD0. An excellent introduction to convexity and its various interrelations can be found in the paper of Berger, “Convexity“, Amer. Math. Monthly Vol. 97, No. 8 (1990) 650-678. Note: Berger is the same person who wrote the famous two-volume book on Geometry (see here and here). A worthy newer book by Berger is “Geometry Revealed: A Jacob’s Ladder to Modern Higher Geometry“.

Digression: Normally one discusses convex sets whose embedding space is \mathbb{R}^n, so that the convex combination involves a real parameter \lambda. A question that comes to mind is how this generalizes to complex linear spaces (as in quantum theory). I found a paper, “Convex Sets and Convex Combinations on Complex Linear Spaces” by Matsuzaki, Endou and Shidama, though its contents are quite opaque to me. In any case, they mention the use of a complex \lambda and embed the unit 1_\mathbb{C} to form 1_\mathbb{C} - \lambda in the convex combination. This implies the loss of ordering. However, they also mention the use of the inner product of elements of the embedding space to form the necessary reals if required.

Let’s move closer to quantum theory. One field that is close to convex geometry is probability theory. One begins by introducing the idea of probability measures, which are maps/functions on sets, normally identified as event spaces. For a simple introduction to probabilities, one can refer to the notes by Cozman, “A Few Notes on Sets of Probability Distributions“. A probability measure is defined as a function \mu on the event space, taking values in the interval [0,1] and obeying the additivity property

\mu\left(\bigcup_{i}\ E_i\right) = \sum_i\mu(E_i) ,

for disjoint events E_i. One can extend this to continuously labelled events, with the sum extending to an integral. Generalizing even further, one can let the measure be defined for (Minkowski) sums of convex sets (say, A,B) in such a way that they obey the Brunn-Minkowski inequality:

(\mu(A+B))^{1/n}\ge(\mu(A))^{1/n}+(\mu(B))^{1/n} .

Putting a convex sum instead, one has

(\mu(\lambda A+(1-\lambda)B))^{1/n}\ge\lambda(\mu(A))^{1/n}+(1-\lambda)(\mu(B))^{1/n} .

This is called an \mathbf{n}-convex measure. The special case of n=0 gives

\mu(\lambda A+(1-\lambda)B)\ge\mu(A)^\lambda\,\mu(B)^{1-\lambda}

or

\log(\mu(\lambda A+(1-\lambda)B))\ge\lambda\log(\mu(A))+(1-\lambda)\log(\mu(B)) .

This is sometimes called a log-concave measure.
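As a quick numerical sanity check (my own, for the special case of axis-aligned boxes with the area measure in n=2, where the Minkowski convex combination simply averages the side lengths), the convex-combination form of the Brunn-Minkowski inequality can be verified directly:

```python
import random

def vol(sides):
    """Volume (area for n = 2) of an axis-aligned box with the given side lengths."""
    v = 1.0
    for s in sides:
        v *= s
    return v

random.seed(0)
n = 2
for _ in range(1000):
    A = [random.uniform(0.1, 5.0) for _ in range(n)]
    B = [random.uniform(0.1, 5.0) for _ in range(n)]
    lam = random.random()
    # For axis-aligned boxes, lam*A + (1-lam)*B is the box whose sides
    # are the convex combinations of the side lengths.
    C = [lam * a + (1 - lam) * b for a, b in zip(A, B)]
    lhs = vol(C) ** (1 / n)
    rhs = lam * vol(A) ** (1 / n) + (1 - lam) * vol(B) ** (1 / n)
    assert lhs >= rhs - 1e-12  # Brunn-Minkowski, convex-combination form
```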

All this is pretty technical, but it points to the variety of constructions in the subject and certainly requires deeper study. Let me end by pointing to the article of Milman, “Geometrization of Probability” in the book “Geometry and Dynamics of Groups and Spaces“. Another connection worth mentioning is the relation between geometric probabilities and hitting probabilities for convex bodies – see Schneider’s “Convexity and Geometric Probabilities“.

Continuing the connection between convex geometry and quantum theory: first, we note that quantum states are typically represented by state vectors modulo complex scalars (i.e. rays) belonging to a Hilbert space. However, they can also be represented by projection or density operators \rho (density matrices if the Hilbert space is finite-dimensional). In general, density operators \rho have the following properties:

  • Hermiticity: \rho^\dagger=\rho;
  • Unit trace: \textrm{Tr}(\rho)=1;
  • Positive semi-definiteness: \langle\phi\vert\rho\vert\phi\rangle\ge 0 for all \vert\phi\rangle.

Projectors corresponding to rays/states have the further property of being

  • \rho^2=\rho (projecting twice has the same effect as projecting once).

Such states are called pure states. Are there any other states? Indeed there are: convex combinations of pure states, called mixed states. These typically correspond to states of partial knowledge (probabilistic mixtures). In this case \textrm{Tr}(\rho^2)\le 1, with equality only for pure states. Hence the relation of convex geometry to quantum theory. While mixed states are the more general form (which includes the pure ones), it is normally the case that one needs to define the pure states first (through some procedure, say, quantization). The other interesting point is that the expression of a mixed state in terms of pure states is not unique; physically, two statistical mixtures with the same statistical averages cannot be distinguished. Mielnik likened this to the use of local coordinates in Riemannian geometry and promoted convex geometry as the geometry for (even a generalised) quantum mechanics – see Mielnik, “Generalized Quantum Mechanics“, Comm. Math. Phys. 37 (1974) 221-256. His views are summarized in the relatively recent preprint “Convex Geometry: A Travel to the Limits of Our Knowledge” (arXiv:1202.2164) – the published version appeared in “Geometric Methods in Physics” (eds.) P. Kielanowski, S.T. Ali, A. Odzijewicz, M. Schlichenmaier & T. Voronov (Springer, 2013).
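The purity criterion \textrm{Tr}(\rho^2) is easy to check numerically. A Python sketch with plain 2\times 2 matrices (the example states are my own):

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def purity(rho):
    """Tr(rho^2): equal to 1 for pure states, strictly less for proper mixtures."""
    sq = matmul(rho, rho)
    return (sq[0][0] + sq[1][1]).real

pure = [[1.0, 0.0], [0.0, 0.0]]    # projector onto a single ray
mixed = [[0.5, 0.0], [0.0, 0.5]]   # maximally mixed state I/2
assert purity(pure) == 1.0
assert purity(mixed) == 0.5
```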

There is more to the story of convex geometry. As stated above, the pure states are special points (extremal points) of the convex geometry of states. The bulk of (mixed) states live elsewhere in the convex hull, which is the smallest convex set containing the pure states. If there are finitely many extremal points, the body is called a convex polytope. One can start building a hierarchy of simplexes: a p-simplex consists of points given by the convex combination

\mathbf{x}=\lambda_0\mathbf{x}_0+\lambda_1\mathbf{x}_1+\cdots +\lambda_p\mathbf{x}_p

where \lambda_i\ge 0 and \lambda_0+\lambda_1+\cdots +\lambda_p=1. A subset F of a convex set, stable under mixing and purification is called a face of the convex set i.e. if

\mathbf{x}=\lambda\mathbf{x}_1+(1-\lambda)\mathbf{x}_2\ ;\quad 0<\lambda<1 ,

then \mathbf{x} lies in F if and only if \mathbf{x}_1,\mathbf{x}_2 lie in F. A face of dimension k is a k-face. Special cases: a 0-face is an extremal point and an (n-1)-face is a facet. Given the set of all faces of a convex body, one can form a partial ordering, i.e. F_1\le F_2 if face F_1 is contained in F_2. This leads to another idea: a partial ordering is the essential ingredient of a logic, determining what statements can be made. Thus convex geometry is related to the idea of quantum logic proposed much earlier; this is discussed in Mielnik, “Geometry of Quantum States“, Comm. Math. Phys. 9 (1968) 55-80. More recent developments and discussions can be found in Coecke, “An Alternative Gospel of Structure: Order, Composition, Processes” (arXiv: 1307.4038) and Coecke & Martin, “A Partial Order on Classical and Quantum States” (downloadable here).
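The face condition can be made concrete for a 2-simplex in a few lines (the barycentric representation and the helper `in_face` are my own illustrative choices, not notation from the papers): a mixture lies in a face exactly when all of its components do.

```python
import numpy as np

# Points of a 2-simplex given by barycentric weights (l0, l1, l2),
# with li >= 0 and sum = 1.  A face is picked out by which weights
# are allowed to be nonzero.
def in_face(lam, face):
    """True if the point's support lies inside the given index set."""
    lam = np.asarray(lam)
    outside = [i for i in range(len(lam)) if i not in face]
    return (np.all(lam >= 0) and np.isclose(lam.sum(), 1.0)
            and np.allclose(lam[outside], 0.0))

edge01 = {0, 1}                    # the 1-face (edge) spanned by x0 and x1
x = np.array([0.3, 0.7, 0.0])      # lies on that edge
x1 = np.array([0.6, 0.4, 0.0])
x2 = np.array([0.2, 0.8, 0.0])
# x = 0.25*x1 + 0.75*x2: a point of the face mixes only points of the face
assert np.allclose(x, 0.25 * x1 + 0.75 * x2)
assert in_face(x, edge01) and in_face(x1, edge01) and in_face(x2, edge01)

y2 = np.array([0.0, 0.4, 0.6])     # a component with support off the edge
y = 0.5 * x1 + 0.5 * y2
assert not in_face(y, edge01)      # the mixture leaves the face
```

The 0-faces here are the vertices (the extremal points), and the partial order of faces is just inclusion of the index sets.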

Thus, with all these fundamental interconnections, one can understand why convex geometry is a convenient place to start discussing quantum states in the book of Bengtsson and Zyczkowski.

Gauge Theories: A Quick Tour

This is a post long overdue and I promised my students to upload (a version of) the talk I gave at our group meeting.

Prior to the Robert Brout Memorial Conference on Spontaneous Symmetry Breaking at IAS, NTU, I decided to introduce my students to preliminary materials on gauge theories in a talk on 9 January 2018. I thought that I should share this here.

My first exposure to gauge theories was in a Particle Physics course taught by Lindsey Dodd at Adelaide University (see here). Little did I know then that gauge theory was already present in electromagnetism. The physical fields in electromagnetism are the (propagating) electric and magnetic fields \underline{E},\ \underline{B} obeying Maxwell’s equations. It is convenient to introduce scalar and vector potentials (\phi,\underline{A}) to help solve them, such that

\underline{E}=-\underline{\nabla}\phi-\cfrac{\partial\underline{A}}{\partial t}\ ;\quad\underline{B}=\underline{\nabla}\times\underline{A}\ .

The choice of the potentials, however, is not unique, since redefining them by derivatives of a function \Lambda, i.e.

\phi'=\phi+\cfrac{\partial\Lambda}{\partial t}\ ;\quad\underline{A}'=\underline{A}-\underline{\nabla}\Lambda

will give the same \underline{E},\ \underline{B}. It is important to first think of the space of potentials as the space of solutions to the field equations (Maxwell’s equations); the ambiguity then sets up equivalences among the solutions (potentials). These equivalences form the (vast) space of symmetry transformations involving functions of space-time (see F. Wilczek, “Unification of Force and Substance”, Phil. Trans. Royal Soc. A 374 (2016) 20150257, available as https://arxiv.org/abs/1512.02094). This raises the question of why one does not deal with the physical fields directly (rather than the potentials). In philosophical circles, this is known as excess or surplus structure and may be addressed using category theory (see J.O. Weatherall, “Understanding Gauge”, Phil. Sci. 83 (2016) 1039-1049, available as https://arxiv.org/abs/1505.02229).
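The gauge invariance of \underline{E} and \underline{B} can be verified symbolically. A quick SymPy check, using the same sign convention as the transformation above:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
phi = sp.Function('phi')(t, x, y, z)
Ax, Ay, Az = [sp.Function(s)(t, x, y, z) for s in ('A_x', 'A_y', 'A_z')]
Lam = sp.Function('Lambda')(t, x, y, z)

def E_field(ph, A):
    """E = -grad(phi) - dA/dt, componentwise."""
    return [-sp.diff(ph, c) - sp.diff(a, t) for c, a in zip((x, y, z), A)]

def B_field(A):
    """B = curl(A)."""
    a1, a2, a3 = A
    return [sp.diff(a3, y) - sp.diff(a2, z),
            sp.diff(a1, z) - sp.diff(a3, x),
            sp.diff(a2, x) - sp.diff(a1, y)]

# Gauge-transformed potentials, in the sign convention used above
phi2 = phi + sp.diff(Lam, t)
A2 = [Ax - sp.diff(Lam, x), Ay - sp.diff(Lam, y), Az - sp.diff(Lam, z)]

# Both physical fields are unchanged
assert all(sp.simplify(u - v) == 0
           for u, v in zip(E_field(phi, [Ax, Ay, Az]), E_field(phi2, A2)))
assert all(sp.simplify(u - v) == 0
           for u, v in zip(B_field([Ax, Ay, Az]), B_field(A2)))
```

The cancellation relies only on the symmetry of mixed partial derivatives of \Lambda, which is the whole content of the gauge ambiguity here.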

Such surplus structure can also be found elsewhere in physics, and the occurrences are not totally unrelated. In quantum mechanics, it is well known that states are rays: wavefunctions \psi(\underline{r},t) are defined modulo phase factors (see B. Felsager, Geometry, Particles and Fields, (Odense University Press, 1981)):

\psi\rightarrow\psi'(\underline{r},t)=\exp (iq\chi(\underline{r},t)/\hbar)\, \psi(\underline{r},t)\ .

A space-dependent phase factor also implies a change in the momentum operator by derivatives of the function \chi(\underline{r},t), yet it retains the canonical commutation relations, implying a form of gauge symmetry.

Another context in which gauge freedom appears is the geometric formulation of classical mechanics. The phase space of classical mechanics is a manifold that carries a special 2-form known as the symplectic form \omega=-d\theta, where \theta is the symplectic potential, often given as \theta=pdq (see P. Libermann & C.-M. Marle, Symplectic Geometry & Analytical Mechanics, (D. Reidel, 1987)). The symplectic form encodes the equations of motion of the particle, and hence symmetries of the symplectic form give symmetries of the equations of motion. An obvious symmetry keeping the symplectic form invariant is modifying the symplectic potential by a total (exterior) differential: \theta\rightarrow\theta + df.
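In Darboux coordinates (q,p) this invariance is a one-line computation: the dq\wedge dp coefficient of \omega=-d\theta is unchanged when \theta is shifted by df, since mixed partials of f commute. A SymPy sketch (the helper name `omega_coeff` is my own):

```python
import sympy as sp

q, p = sp.symbols('q p')
f = sp.Function('f')(q, p)

def omega_coeff(theta_q, theta_p):
    """Coefficient of dq ^ dp in omega = -d(theta_q dq + theta_p dp)."""
    return -(sp.diff(theta_p, q) - sp.diff(theta_q, p))

# Canonical potential theta = p dq, and the shifted one theta + df
w1 = omega_coeff(p, 0)
w2 = omega_coeff(p + sp.diff(f, q), sp.diff(f, p))

assert sp.simplify(w1 - w2) == 0   # omega is unchanged by theta -> theta + df
assert w1 == 1                     # omega = dq ^ dp in these coordinates
```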

The usual context in which one learns about gauge theories is the Lagrangian (density) form of field theories, where wavefunctions of particles are replaced by fields over spacetime, say \phi(x,t) (see Ciaran Hughes, “A Brief Discussion on Gauge Theories” for a brief introduction, http://www.damtp.cam.ac.uk/user/ch558/pdf/Gauge_theories.pdf). In this case equations of motion are given by the Euler-Lagrange equations:

\cfrac{\partial}{\partial x^\mu} \left(\cfrac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x^\mu)}\right) - \cfrac{\partial\mathcal{L}}{\partial\phi} = 0\ .

Some important examples are:

  • \mathcal{L} = \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2}m^2\phi^2  giving the Klein-Gordon equation  \partial_\mu\partial^\mu\phi +m^2\phi=0.
  • \mathcal{L}=i\bar{\psi}\gamma^\mu\partial_\mu\psi-m\bar{\psi}\psi  giving the Dirac equation  i\gamma^\mu\partial_\mu\psi-m\psi=0.
  • \mathcal{L}=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}-j^\mu A_\mu  giving Maxwell equation  \partial_\mu F^{\mu\nu}=j^\nu.
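As a sanity check, the first of these can be derived mechanically from the Euler-Lagrange equations. Here is a SymPy sketch in 1+1 dimensions (an assumption for brevity), treating \phi and its derivatives as independent symbols when differentiating the Lagrangian:

```python
import sympy as sp

t, x = sp.symbols('t x')              # 1+1 dimensions for brevity
m = sp.symbols('m', positive=True)
phi = sp.Function('phi')(t, x)

# Treat phi and its derivatives as independent symbols inside L
f, dphi_t, dphi_x = sp.symbols('f dphi_t dphi_x')
L = sp.Rational(1, 2) * (dphi_t**2 - dphi_x**2) - sp.Rational(1, 2) * m**2 * f**2

repl = {f: phi, dphi_t: sp.diff(phi, t), dphi_x: sp.diff(phi, x)}

# Euler-Lagrange: d_mu (dL/d(d_mu phi)) - dL/dphi = 0
eom = (sp.diff(sp.diff(L, dphi_t).subs(repl), t)
       + sp.diff(sp.diff(L, dphi_x).subs(repl), x)
       - sp.diff(L, f).subs(repl))

# This is exactly the Klein-Gordon operator acting on phi
assert sp.simplify(eom - (sp.diff(phi, t, 2) - sp.diff(phi, x, 2) + m**2 * phi)) == 0
```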

How does this relate to what we have said before? Instead of looking directly to the symmetries of the equations of motion (which are derived from the Lagrangians), we can now opt to look for symmetries of the Lagrangian instead.

Consider the free fermion (matter) fields with Lagrangian

\mathcal{L}=i\bar{\psi}\gamma_\mu\partial^\mu\psi-m\bar{\psi}\psi

Noting that the fields that can be varied are \psi(x), one considers transformations by a (global) phase factor:

\psi(x)\rightarrow e^{i\alpha}\psi(x)\quad;\quad\bar{\psi}(x)\rightarrow e^{-i\alpha}\bar{\psi}(x).

Note that the phase factor is a U(1) group element. The free fermion Lagrangian above is invariant under this global phase transformation.

One can further extend this to multicomponent fields \chi(x)=(\psi_1(x), \cdots,\psi_n(x))^T with the Lagrangian

\bar{\chi}(x)((i\gamma^\mu\partial_\mu-m)\mathbb{I}_{n\times n})\chi(x).

The Lagrangian will be invariant under global SU(n) transformations

\chi(x)\rightarrow U\chi(x)\quad;\quad\bar{\chi}(x)\rightarrow\bar{\chi}(x)U^\dagger\quad;\quad U\in SU(n).

The global transformations are too restrictive. What if we allow the transformations to be local instead, i.e. U=U(x), sending \chi(x) to U(x)\chi(x)? Then the Lagrangian transforms as

\begin{aligned} \mathcal{L}&\rightarrow\bar{\chi}(x)U^\dagger(x)(i\gamma^\mu \partial_\mu-m)U(x)\chi(x)\\ &=\mathcal{L}+\bar{\chi}(x)U^\dagger(x)(i\gamma^\mu \partial_\mu(U(x)))\chi(x) \end{aligned}

Hence the Lagrangian is no longer invariant as one would hope. To correct this, one defines a new derivative D_\mu that transforms as needed:

D_\mu \chi(x)\rightarrow U(x)D_\mu\chi(x).

This derivative is known as covariant derivative and is defined as

D_\mu=\partial_\mu-igA_\mu,

where A_\mu is an element of the Lie algebra su(n) known as the gauge field and g is the (gauge) coupling constant. Note that the ambiguity of the gauge fields (potentials) is still there, with the gauge field transforming as

A_\mu(x)\rightarrow U(x)A_\mu U^\dagger(x) - \frac{i}{g}(\partial_\mu U(x))U^\dagger(x) .

Analogous to electromagnetism, one now needs a kinetic term for the gauge fields A_\mu = A_\mu ^a T^a (summed over the generators T^a of the Lie algebra su(n)). This is given via the field strength tensor

F_{\mu\nu}= F^a_{\mu\nu} T^a=\frac{i}{g}[D_\mu,D_\nu]\ .

Note that this tensor transforms as F_{\mu\nu}\rightarrow U(x)F_{\mu\nu}U^{-1}(x) and that

F^a_{\mu\nu} F^{a\mu\nu}=2\textrm{Tr}(F_{\mu\nu}F^{\mu\nu})

contains the necessary squared derivative terms for the kinetic energy and is further gauge invariant. Thus the needed total Lagrangian is

\mathcal{L}=-\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} + \bar{\chi}(x)(i\gamma^\mu D_\mu-m)\chi(x) .

Note that one can’t introduce a mass term m^2 A_\mu A^\mu for the gauge fields, as such a term would break the gauge invariance of the Lagrangian. Hence, at this stage, gauge fields are considered massless.
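The gauge invariance of \textrm{Tr}(F_{\mu\nu}F^{\mu\nu}) rests on cyclicity of the trace, \textrm{Tr}(UFU^{-1}UFU^{-1})=\textrm{Tr}(F^2), and can be spot-checked numerically with random SU(2) data (the construction of random group elements and Lie-algebra-valued components below is my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pauli matrices; T^a = sigma^a / 2 generate su(2)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def random_su2():
    """U = cos(th) I + i sin(th) n.sigma is a generic SU(2) element."""
    n = rng.normal(size=3)
    n /= np.linalg.norm(n)
    th = rng.uniform(0, np.pi)
    return np.cos(th) * np.eye(2) + 1j * np.sin(th) * sum(ni * s for ni, s in zip(n, sigma))

def random_lie_valued():
    """A random su(2)-valued 'field strength' component F = F^a T^a."""
    return sum(c * s / 2 for c, s in zip(rng.normal(size=3), sigma))

U = random_su2()
F = [random_lie_valued() for _ in range(6)]       # the independent F_{mu nu}

before = sum(np.trace(f @ f) for f in F)
after = sum(np.trace((U @ f @ U.conj().T) @ (U @ f @ U.conj().T)) for f in F)
assert np.isclose(before, after)                  # Tr(F F) is gauge invariant
```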

Essentially, with the above construction, one now has the recipe to discuss the gauge theories of the standard model of particle physics. Each particle interaction is associated with a gauge theory. The strong nuclear force, for instance, is at the fundamental level given by an SU(3) gauge theory, where the quarks, with three different colour charges, are arranged in a fundamental triplet, and the gauge fields are the 8 gluons that can transform the colour charges of the quarks. For the weak nuclear and electromagnetic interactions, the story is a bit more complicated; they are best described by a more fundamental SU(2)\times U(1) gauge theory. The complication is that the gauge fields mediating the weak interaction are massive (the W^\pm, Z^0 particles); hence one must include a further ingredient, the Higgs field, together with the idea of spontaneous symmetry breaking, which is the subject of the conference mentioned earlier. What one sees as the weak and electromagnetic interactions are really remnants of the SU(2)\times U(1) gauge theory after the symmetry breaking. We will not discuss this here and leave it for a future post (the topic can be found in many particle physics textbooks, e.g. K. Huang, Quarks, Leptons & Gauge Fields, (World Scientific, 1992)).

Some notes to be mentioned here:

  • What is discussed above is purely classical. Quantum aspects come in when the fields are given operator status which makes the theories nontrivial.
  • Even at the classical level, gauge theories have intriguing geometrical properties where one considers the internal degrees over which the gauge fields act on as a geometrical space residing over space-time (fibre bundles). See A. Marsh, “Gauge Theories and Fiber Bundles: Definitions, Pictures and Results” (arXiv:1607.03089 [math.DG]) for excellent notes and also its companion, A. Marsh, “Riemannian Geometry: Definitions, Pictures and Results”, (arXiv:1412.2393 [gr-qc]) for Riemannian geometry. We will probably discuss this in a different post.
  • As stated earlier, the gauge fields are elements of the Lie algebra su(n) and are only defined locally. At the global level (important for the geometric description), there are further intricacies with respect to the group’s global structure. For instance, there are thirteen connected Lie groups that may ‘contain’ the Lie algebra of the standard model SU(3)\times SU(2)\times U(1). See J. Hucks, “Global Structure of the Standard Model, Anomalies, and Charge Quantization“, Phys. Rev. D  43 (1991) 2709-2717. This makes model building for particle physics technically ever more interesting.

To end this post, let me put up a pic of the notes on particle physics by Lindsey Dodd that go way back to 1984.


Deformed Numbers and Calculus

When I first heard of fractional calculus, I thought that it sounded contrived. But then again, so are many abstractions and generalizations of established ideas in mathematics. There is one good motivation for the idea, which strikes at the heart of fundamental physics: the Dirac operator \hat{D}, which is understood as the square root of the Laplacian \Delta, i.e. \hat{D}^2=\Delta. If it were one-dimensional with \Delta=\partial_x^2, this would have been straightforward. In more than one dimension, it is nontrivial – what is the square root of, say, \partial_x^2 + \partial_y^2?

The reason this caught my attention is a visitor of mine, Prof. Won Sang Chung from Gyeongsang National University, who visited the institute during February 2-14, 2017. His research has been mainly on the deformation of physical and mathematical structures. His interest in noncommutative quantum mechanics was the initial reason for our contact. With respect to the above topic, Prof. Won’s many works on deformation have led him to study the deformation of units, for instance in the published work “On the q-deformed circular imaginary unit and hyperbolic imaginary unit: q-deformed rotation in two dimension and q-deformed special relativity in 1+1 dimension“. It is interesting that this work was carried out with an undergraduate who, I’m told, worked with Prof. Won a few times a week in the late evening hours. There are other works by them that touch on the arithmetic operations of numbers, which they call \alpha-deformed numbers. In a way, this shifts the idea of deformation from a complex operation, say the derivative, to arithmetic ones.

They start with the operation of addition, which in general is deformed as

x\oplus_f y=f^{-1}(f(x)+f(y))

where f is some specified invertible function and x,y are (for now) integers. Taking f(x)=x^\alpha, one then has the alpha-deformed addition

x\oplus_\alpha y= (x^\alpha + y^\alpha)^{1/\alpha}

for x,y>0. For integers of arbitrary sign, it is

x\oplus_\alpha y = \vert \vert x\vert^{\alpha-1}x +\vert y\vert^{\alpha-1}y\vert^{\frac{1}{\alpha}-1} (\vert x\vert^{\alpha-1}x + \vert y\vert^{\alpha-1}y)\ .

It suffices for our discussion to take the simpler case. It is easy to show that the additive identity is preserved:

x\oplus_\alpha 0=0\oplus_\alpha x= x

The additive inverse requires the definition of \alpha-deformed subtraction:

x\ominus_\alpha y = \begin{cases} (x^\alpha - y^\alpha)^{1/\alpha}\quad &(x\ge y)\\ -(y^\alpha - x^\alpha)^{1/\alpha}\quad &(x<y) \end{cases}\ .

Consecutive addition can be done easily. Now we can define the \alpha-deformed numbers by noting the following:

0_\alpha=0,\quad 1_\alpha=1,\quad (-1)_\alpha=-1

and

(n)_\alpha = 1\oplus_\alpha 1\oplus_\alpha \cdots \oplus_\alpha 1=n^{1/\alpha}\ .

The multiplication and division operations are taken as the normal ones (undeformed) and

(mn)_\alpha = (m)_\alpha (n)_\alpha\quad;\quad \left(\cfrac{m}{n}\right)_\alpha=\cfrac{(m)_\alpha}{(n)_\alpha}\ .

Thus, the \alpha-deformed numbers form a commutative ring and essentially there is a 1-1 correspondence between the \alpha-deformed numbers and the integers. By the division operation defined above, I guess one could easily extend these deformed integers to \alpha-deformed rational numbers.
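These rules are simple to implement. A small Python sketch (with \alpha=2 chosen arbitrarily, in which case \oplus_\alpha is the familiar Pythagorean sum):

```python
import numpy as np

ALPHA = 2.0   # deformation parameter (alpha = 1 recovers ordinary arithmetic)

def oplus(x, y, a=ALPHA):
    """alpha-deformed addition for x, y > 0."""
    return (x**a + y**a) ** (1 / a)

def deformed(n, a=ALPHA):
    """(n)_alpha = 1 (+) 1 (+) ... (+) 1 (n times, deformed) = n^(1/alpha)."""
    return n ** (1 / a)

# (n)_alpha really is the n-fold deformed sum of 1
s = 1.0
for _ in range(4):
    s = oplus(s, 1.0)
assert np.isclose(s, deformed(5))

# Multiplication is the undeformed one and is compatible: (mn)_a = (m)_a (n)_a
assert np.isclose(deformed(6), deformed(2) * deformed(3))

# With alpha = 2, deformed addition is the Pythagorean sum: 3 (+) 4 = 5
assert np.isclose(oplus(3.0, 4.0), 5.0)
```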

The topic would not be too interesting if it stopped there. The authors went on to define the \alpha-deformed derivative to build up a calculus. It is here that there is a leap of faith: that the \alpha-deformed rational numbers can be completed to form \alpha-deformed real numbers. Supposing that this can be done, one defines the \alpha-derivative by

D^\alpha_x F(x) =\lim_{y\rightarrow x}\ \cfrac{F(y)\ominus_\alpha F(x)}{y\ominus_\alpha x}\ .

One can indeed show that this definition gives a derivation (obeying the Leibniz rule) – see the paper. Hence, one finds

D^\alpha_x x\ominus_\alpha xD^\alpha_x = 1\quad\Rightarrow D^\alpha_x x= 1\oplus_\alpha xD^\alpha_x\ .

So one can easily show, for example, that D^\alpha_x x^n= n_\alpha x^{n-1}, i.e. it works like the ordinary derivative with numbers replaced by the deformed numbers. One can generalise this further to power series functions. For instance, the \alpha-exponential function is defined by

D^\alpha_x e_\alpha (x) = e_\alpha (x)\ .

One can extend the deformed numbers to \alpha-deformed complex numbers z_\alpha=x_\alpha\oplus_\alpha iy_\alpha and form the \alpha-trigonometric functions via the complex \alpha-exponential function. In fact, one can show that these functions obey the usual Euler relation.
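The derivative rule D^\alpha_x x^n = n_\alpha x^{n-1} can be checked numerically from the deformed difference quotient (the values \alpha=3, n=4 and the evaluation point below are arbitrary choices of mine):

```python
import numpy as np

ALPHA = 3.0

def ominus(x, y, a=ALPHA):
    """alpha-deformed subtraction."""
    return (x**a - y**a) ** (1/a) if x >= y else -((y**a - x**a) ** (1/a))

def alpha_derivative(F, x, a=ALPHA, h=1e-6):
    """Deformed difference quotient (F(y) ominus F(x)) / (y ominus x), y -> x."""
    y = x + h
    return ominus(F(y), F(x), a) / ominus(y, x, a)

n, x0 = 4, 1.7
num = alpha_derivative(lambda s: s**n, x0)
exact = n ** (1 / ALPHA) * x0 ** (n - 1)     # n_alpha x^(n-1)
assert np.isclose(num, exact, rtol=1e-4)
```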

Once we have built up these functions and their calculus, one can start to think about solving standard physics problems via \alpha-deformed differential equations. We will state only the case of the quantum harmonic oscillator, which was worked out in the paper. The differential equation to solve is

\left(-\cfrac{\hbar^2}{2m}(D^\alpha_x)^2\oplus_\alpha\cfrac{1}{2}m\omega^2 x^2\right)\ u= Eu\quad .

It suffices for our discussion to state the result for its spectrum, namely

E_n=2^{\frac{1}{\alpha}-1}\hbar\omega\left(n+\cfrac{1}{2}\right)^{1/\alpha}\ .

Note that the energy levels almost retain their functional form, but the energy level spacings

E_{n+1} \ominus_\alpha E_n=2^{\frac{1}{\alpha}-1}\hbar\omega

are only equidistant in the \alpha-deformed sense! So there are nontrivialities associated with the earlier deformations of the arithmetic operations. In one of his talks, Prof. Won mentioned that the case of the \alpha-deformed hydrogen atom has also been worked out, with a similar (but again deformed) functional form for the spectrum. The applications he had in mind are quantum systems whose primary potential is known but about which there is missing information. He envisaged that by fitting the spectra, a better physical understanding of previously unsolved (but spectrally known) systems can be achieved using these \alpha-deformed theories.
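The deformed equidistance can be verified numerically from the spectrum stated above (the values of \alpha and \hbar\omega below are arbitrary choices):

```python
import numpy as np

ALPHA = 2.5
hbar_omega = 1.0

def ominus(x, y, a=ALPHA):
    """alpha-deformed subtraction."""
    return (x**a - y**a) ** (1/a) if x >= y else -((y**a - x**a) ** (1/a))

def E(n, a=ALPHA):
    """Deformed oscillator spectrum: 2^(1/a - 1) hbar omega (n + 1/2)^(1/a)."""
    return 2 ** (1/a - 1) * hbar_omega * (n + 0.5) ** (1/a)

# Ordinary differences are NOT constant...
gaps = [E(n + 1) - E(n) for n in range(5)]
assert not np.allclose(gaps, gaps[0])

# ...but deformed differences are, and equal 2^(1/alpha - 1) hbar omega
dgaps = [ominus(E(n + 1), E(n)) for n in range(5)]
assert np.allclose(dgaps, 2 ** (1/ALPHA - 1) * hbar_omega)
```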

Choosing Zeidler on QFT

Some weeks back, just before the Singapore trip, it was suggested that I should run reading seminars on quantum field theory (QFT) and/or particle physics. A decision to be made was which book to choose. My first exposure to quantum field theory was through Herbert Green‘s course on “Elementary Field Theory” in Adelaide and, as usual, his treatment was unique and usually had no textbook reference similar to what he was doing. The next course that had field theory in it was particle physics, for which I referred to T.D. Lee‘s “Particle Physics and Introduction to Field Theory“, which was clear enough for me, though his space-time signature is (+1,+1,+1,+i). It was only later (after graduating) that I was introduced to Mandl & Shaw by my eldest brother, when he gave me the book as a present. Later, I used this as one of my main references during my Part III course on Quantum Field Theory in Cambridge. At the time, however, I heard praises of the book by Itzykson & Zuber, which is of course harder to read but, my guess is, contains a lot of gems (I never finished reading it).

For our group, I thought I would have something different, tailored for a more mathematically-inclined audience. Even here, there are several choices. I settled for the thickest book I could find: Zeidler, with his intended six-volume series. He did not quite finish the six volumes, as he passed away in November 2016; three volumes of “Quantum Field Theory – A Bridge Between Mathematicians and Physicists” were published. So we begin with Volume 1: Basics in Mathematics and Physics.

A Prologue was written to describe the style and scope of the book, with an outline of the six volumes. The book is filled with quotations and historical anecdotes, making it untiring to read. It opens with five different quotations from famous physicists and mathematicians, leading to the five golden rules of the book:

  1. Write with an open landscape with depths of perspectives.
  2. Teach the content with a battery of problems.
  3. A technical book which is readable beyond the first several pages.
  4. Delves into deep questions of physics interlinking with mathematics.
  5. Teach for the appreciation of the physics.

The prologue also uses the often-quoted example of how successful QFT is in calculating the anomalous magnetic moment of the electron. Let us recall that any current-carrying loop produces a magnetic moment \underline{M} =  \frac{1}{2}\underline{r} \times\underline{j} where \underline{j} is the current density of the loop. The current may be carried by a charged particle orbiting in a loop. From the formula given, the magnetic moment is then proportional to its angular momentum.

If one lets the particle be an electron carrying only the spin angular momentum \underline{S}, then its magnetic moment is

\underline{M}_e = \cfrac{g_e\mu_B\underline{S}}{\hbar}\ ,

where \mu_B=e\hbar/(2m_e) is the Bohr magneton and g_e is the gyromagnetic factor (dimensionless magnetic moment). Now according to Dirac’s theory, g_e=2 but this quantity receives corrections from QFT which can be written as

g_e=2(1+a)\ .

The correction a can be written as a (divergent) power series in the fine structure constant \alpha. Up to the fourth power of \alpha (using 891 Feynman diagrams), the correction is

a_{th}=0.001 159 652 164 \pm 0.000 000 000 108\ .

Experimentally it is

a_{expt}=0.001 159 652 188 4\pm 0.000 000 000 004 3\ .

This is indeed excellent agreement. The calculation does not stop there: the next, fifth-power correction in \alpha involves 12,672 Feynman diagrams, giving a result accurate to more than 1 part in a billion. See the 2006 PRL paper here. The puzzle is that the series giving this correction is believed to be divergent (at best asymptotic).
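For orientation, even the leading (one-loop) Schwinger term a=\alpha/(2\pi) already lands remarkably close to the measured value; the quick computation below uses an approximate value of the fine-structure constant:

```python
import math

# Leading (one-loop) Schwinger correction a = alpha / (2 pi)
alpha_fs = 1 / 137.035999          # fine-structure constant (approximate)
a_schwinger = alpha_fs / (2 * math.pi)

print(f"a (one loop) = {a_schwinger:.9f}")
# Already within ~0.15% of the measured a ~ 0.001159652; the higher
# orders in alpha close the remaining gap.
assert abs(a_schwinger - 0.0011596522) / 0.0011596522 < 0.002
```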

QFT is a practical theory which goes on to compute useful quantities for physicists:

  • cross-sections of scattering processes of particles
  • masses of stable elementary particles
  • lifetimes of unstable elementary particles

Thus one ultimate goal is to bring QFT into a rigorous mathematical setting. Perhaps it is worth mentioning here the work of Kreimer, which has connections to noncommutative differential geometry.

In the search for a mathematical framework, some guiding principles from mathematics are

  1. Going from concrete structures to abstract ones
  2. Combining abstract structures
  3. Functors between abstract structures
  4. Statistics of abstract structures

For the last one, much help was given by physics (Zeidler used the term physical mathematics, which I will not adopt) through Feynman functional integrals, which we will go into in the coming readings.