## Wednesday, 1 January 2020

### A practical understanding of ergodicity

Preamble

In this tutorial, we give a brief introduction to what ergodicity means and how to detect when a dynamical system behaves ergodically. The definition of ergodicity is not uniform in the literature; in particular, the definitions of Boltzmann and Birkhoff are not the same. So one should speak of a set of ergodicity notions, i.e., ergodic theorems, rather than a single one. We follow the basic definition that a system is ergodic for a given observable when the ensemble-averaged value of the observable is the same as its time-averaged value. Beware that this is a purely statistical definition and does not reflect Boltzmann's ergodicity from physics.
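
As a rough sketch, this working definition can be written in symbols. The notation here is an assumption made for illustration: $x_t$ is the state observed at discrete step $t$, $N$ is the number of steps, and $p(x)$ is the probability of state $x$ in the ensemble $\Omega$.

$$\langle \mathscr{A} \rangle_{ensemble} = \sum_{x \in \Omega} p(x)\, \mathscr{A}(x), \qquad \langle \mathscr{A} \rangle_{time} = \frac{1}{N} \sum_{t=1}^{N} \mathscr{A}(x_t)$$

$$\lim_{N \to \infty} \langle \mathscr{A} \rangle_{time} = \langle \mathscr{A} \rangle_{ensemble}$$

Under these assumptions, the system is ergodic for the observable $\mathscr{A}$ when the second line holds.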

[Figure: Boltzmann, father of ergodicity (Wikipedia)]

What are ensemble and time averaging?

Ensemble is a fancy word. Here it doesn't mean a group of musical instruments; it is short for statistical ensemble, a concept developed by Gibbs, an American theoretical physicist. Essentially, it is the set of all possible states of a physical system. In statistics, the corresponding notion has another name: a sample space.
For example, the sample space for the outcome of two fair coins tossed at the same time would be
$$\Omega = \{HH, HT, TH, TT\}$$
Let's say we represent the outcomes as bits, $H=1, T=0$, and we want to compute an observable $\mathscr{A}$, the sum of the outcomes. Its values over the entire ensemble would read
$$\Omega_{\mathscr{A}} = \{2,1,1,0\}$$
The ensemble average, i.e., the arithmetic mean, would be $\langle \mathscr{A} \rangle_{ensemble} = 1.0$. A time average can be computed via an experiment consisting of so-called trials. Let's say we had 6 trials
$$\Omega_{time} = \{HH, TH, HH, HT, TH, HT\}$$
and the time-averaged observable would be $\langle \mathscr{A} \rangle_{time} = 8/6 \approx 1.33$. Note that as the number of trials grows, the time-averaged value approaches the ensemble-averaged one.
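
The arithmetic of the two-coin example can be checked with a few lines of Python. This is a minimal sketch: the helper name `observable` and the variable names are choices made here for illustration, and the six trials are copied from the example above.

```python
from itertools import product


def observable(outcome):
    """Sum of the outcomes, with H counted as 1 and T as 0."""
    return sum(1 if coin == "H" else 0 for coin in outcome)


# Ensemble: every possible outcome of two coins, each equally likely.
ensemble = ["".join(pair) for pair in product("HT", repeat=2)]
ensemble_average = sum(observable(o) for o in ensemble) / len(ensemble)

# Time series: the six trials listed in the text.
trials = ["HH", "TH", "HH", "HT", "TH", "HT"]
time_average = sum(observable(o) for o in trials) / len(trials)

print(ensemble)                 # ['HH', 'HT', 'TH', 'TT']
print(ensemble_average)         # 1.0
print(round(time_average, 2))   # 1.33
```

The plain arithmetic mean over the ensemble is valid here only because the coins are fair, so all four outcomes carry equal weight.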

This may sound like a naive and useless example. However, for a very large sample space in a complicated setting, as with thermodynamic ensembles and limits, computing time-averaged values is much easier, whereas computing the ensemble average is intractable. This is the case for most problems in statistical physics.

Connection to the law of large numbers

Statistically inclined readers might notice that the above definition sounds like the law of large numbers. Indeed, the strong form of the law of large numbers is a special case of an ergodic theorem.
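
The convergence can be observed numerically for the two-coin observable from the earlier example. The following is a small simulation sketch, not a proof: the function name `simulated_time_average`, the seed, and the trial counts are arbitrary choices made here, and the pseudo-random generator stands in for physical coin tosses.

```python
import random


def simulated_time_average(n_trials, rng):
    """Average, over n_trials, of the sum of two fair coin tosses (H=1, T=0)."""
    total = 0
    for _ in range(n_trials):
        total += rng.randint(0, 1) + rng.randint(0, 1)
    return total / n_trials


rng = random.Random(42)  # fixed seed, chosen arbitrarily for reproducibility
for n in (6, 100, 10_000, 100_000):
    # The printed values should drift toward the ensemble average of 1.0.
    print(n, simulated_time_average(n, rng))
```

With only a handful of trials the time average can sit far from 1.0, as in the six-trial example; the gap typically shrinks as the number of trials grows.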

Conclusions: Why would I care about ergodicity?

Apart from its intellectual appeal, ergodicity is very important in statistical physics. Almost all computations rely on ergodic theorems. So-called N-body systems are simulated on this basis to compute physical properties. Ergodicity pops up everywhere, from economics to deep learning.

Further Reading with Notes

The literature on ergodicity is vast. Due to its mathematical nature, texts on ergodicity can easily be inaccessible to an average scientist and, well, even to an average mathematician.

• Modern Ergodic Theory, Joel L. Lebowitz and Oliver Penrose link (excellent for an introduction)
• Computational Ergodic Theory, Choe, link (advanced text)
• An introduction to Chaos and Nonequilibrium Statistical Mechanics, link (explains why Boltzmann's ergodicity is different)
• Deep Learning and complexity: link (ergodicity in spectra, different kind of ergodicity)
• Ergodicity and economics: link (ergodicity in risk decisions, different kind of ergodicity)

Postscript notes

• Sample space and ensemble are not synonymous. An ensemble is a special type of sample space whereby all unique combinations are exhausted for a given system, i.e., a defined event set.
• The origins of ergodicity go back to Boltzmann, but Gibbs's ensembles also provide a conceptual parallel. Ensembles can be thought of as distinct measurement protocols. The question of ensemble equivalence is then similar to the question of ergodicity, i.e., whether two protocols are equivalent. This is an analogy rather than a statement in a strict mathematical sense, because technically time-averaging would also curate a different set. The only distinction is that time-averaging is tied to a given ensemble.