Tuesday, 7 April 2020

Short Opinion: Extending General Relativity to N dimensions is not even wrong, leading to the inventor's paradox

Bohr and Einstein (1926)
(Wikipedia)
It is considered original in mathematics to have a solution for the generic case, the so-called arbitrary N, such as the N-dimensional case. A generic solution is prized in computer science as well: this is the inventor's paradox.

However, such generalisation to higher-dimensional objects may not be needed to describe reality. Mathematical beauty does not bring physical reality with it by default. An example is trying to generalise General Relativity [1]. I think this is novel work in mathematics, but it may not reflect our physical world as it is. This opinion is not new, and it is probably the reason why, for many decades, the community resisted attempts to lower the status of General Relativity to a special case of some higher-dimensional object [2] that cannot be tested. Einstein's theory of GR is good enough to explain our universe and is supported by observations [3].


Trying to extend any physical theory to higher dimensions may be not even wrong if it cannot be observed.


[1] A Generalization of Gravity, arXiv:1409.6757
[2] Huggett, Nick and Vistarini, Tiziana (2014), Deriving General Relativity From String Theory.
[3] On Experimental Tests of the General Theory of Relativity, American Journal of Physics 28, 340 (1960); https://doi.org/10.1119/1.1935800

PS: Links added on 13 April 2020

Saturday, 29 February 2020

Freeman Dyson's contribution to deep learning: Circular ensembles mimic trained deep neural networks

In memory of Professor Dyson, also see the paper Equivalence in Deep Neural Networks via Conjugate Matrix Ensembles

Preamble 
Dyson 2007 (Wikipedia) 
Freeman Dyson was a polymath: theoretical physicist, mathematician and visionary thinker, among other things. In this post, we briefly summarise his contribution to deep learning, i.e., deep neural networks. An obscure usage of his circular ensembles as a simulation tool, in conjunction with the concept of ergodicity, helped explain why deep learning systems learn with such high accuracy.

A simulation tool for deep learning: Circular (Random Matrix) Ensembles

Circular ensembles [1,2,3] were developed by Dyson in 1962 to describe quantum statistical mechanical systems, as a modification of basic random matrix theory. Circular ensembles can also be used to simulate deep learning architectures [4]: his three ensembles can generate a "trained deep neural network". Together with colleagues from Hamburg and Mallorca, I showed that in networks generated with Dyson's ensembles, a quantity called spectral ergodicity decreases as the networks get deeper [4]; this was recently demonstrated on real networks as well [5].

How to generate a simulated trained deep neural network in Python

Using the Bristol Python package [6], one can generate a set of weight matrices corresponding to each layer's connections. As a simple example using the Circular Unitary Ensemble (CUE), say we have 4 hidden layers of 64, 64, 128 and 256 units. This corresponds to learned weight matrices of sizes 64x64, 64x128 and 128x256. One possible set of trained network weights can be generated as follows; note that the non-square matrices can be built from the generated square ones, e.g., by multiplying with a transpose.


from bristol.ensembles import circular

ce = circular()
seed_v = 997123  # fixed seed for reproducibility
W1 = ce.gue(64, set_seed=True, seed=seed_v)   # 64x64
W2 = ce.gue(128, set_seed=True, seed=seed_v)  # 128x128
W3 = ce.gue(256, set_seed=True, seed=seed_v)  # 256x256

These are complex matrices; one can take their arguments (phases) or use them as they are if only eigenvalues are needed. An example of trained network generation can be found on Zenodo. Any one of the circular ensembles can be used.
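For readers without Bristol installed, a comparable draw from the Circular Unitary Ensemble can be sketched with SciPy's Haar-random unitary sampler. This is a minimal sketch, not the Bristol API; the matrix size and seed below are illustrative choices.

```python
import numpy as np
from scipy.stats import unitary_group

# Sample a Haar-distributed unitary matrix, i.e., a draw from the
# Circular Unitary Ensemble (CUE); size and seed are illustrative.
n = 64
U = unitary_group.rvs(n, random_state=997123)

# CUE eigenvalues lie on the unit circle: only their phases matter.
eigenvalues = np.linalg.eigvals(U)
phases = np.angle(eigenvalues)   # complex arguments in (-pi, pi]
moduli = np.abs(eigenvalues)     # all approximately 1.0
```

The phases are what enter spectral statistics, such as eigenvalue spacing distributions, so keeping only `phases` is usually sufficient.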

Conclusion

Dyson's contributions are so far-reaching that even his mathematical tools appear in modern deep learning research. He will be remembered for many generations to come as a brilliant scientist and a polymath.

References 


[1] Freeman Dyson, Journal of Mathematical Physics 3, 1199 (1962) [link]
[2] Michael Berry, New Journal of Physics 15 (2013) 013026 [link]
[3] Mehmet Süzen (2017), Summary Notebook on Circular Ensembles [link]
[4] Mehmet Süzen, Cornelius Weber, Joan J. Cerdà, Spectral Ergodicity in Deep Learning Architectures via Surrogate Random Matrices, arXiv:1704.08693 [link]
[5] Mehmet Süzen, Cornelius Weber, Joan J. Cerdà, Periodic Spectral Ergodicity: A Complexity Measure for Deep Neural Networks and Neural Architecture Search, arXiv:1911.07831 [link]
[6] Bristol Python package [link]


Wednesday, 1 January 2020

A practical understanding of ergodicity

Preamble

In this tutorial, we provide a brief introduction to what ergodicity means and how to detect when a dynamical system behaves ergodically. The very definition of ergodicity is not uniform in the literature; in particular, the definitions of Boltzmann and Birkhoff are not the same. So one should speak of a set of ergodic theorems rather than a single one. We follow the basic definition that a system is ergodic for a given observable when the ensemble-averaged value of the observable is the same as its time-averaged value. But beware that this is a purely statistical definition and does not reflect Boltzmann's ergodicity from physics.

Boltzmann: Father of ergodicity
(Wikipedia)
What are the ensemble and time averaging?

The ensemble is a fancy word. It does not mean a group of musical instruments; it is short for statistical ensemble, a concept developed by Gibbs, an American theoretical physicist. It is essentially the set of all possible states of a physical system. In statistics, this is closely related to another concept: the sample space.
For example, sample space for the outcome of two fair coins tossed at the same time would be
$$ \Omega = \{HH, HT, TH, TT\}$$
Let's say we represent them as bits $H=1, T=0$ and we want to compute an observable $\mathscr{A}$, the sum of outcomes, in the entire ensemble would read.
$$\Omega_{\mathscr{A}} = \{2,1,1,0\}$$
And the ensemble average, the arithmetic mean, would be $\langle \mathscr{A} \rangle_{ensemble} = 1.0$. A time average can be computed via an experiment, over so-called trials. Let's say we had 6 trials
$$ \Omega_{time} = \{HH, TH, HH, HT, TH, HT\}$$
and the time-averaged observable would be $\langle \mathscr{A} \rangle_{time} \approx 1.33$. Note that the larger the number of trials, the closer the time-averaged value approaches the ensemble-averaged one.
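The coin example can be checked in a few lines of Python, using the same encoding as above (H=1, T=0):

```python
from itertools import product

# Ensemble: all outcomes of two fair coins tossed together, H=1, T=0.
ensemble = [h + t for h, t in product((0, 1), repeat=2)]  # sums 0, 1, 1, 2
ensemble_average = sum(ensemble) / len(ensemble)          # 1.0

# Time average over the six trials quoted above:
# HH, TH, HH, HT, TH, HT encoded as sums of bits.
trials = [2, 1, 2, 1, 1, 1]
time_average = sum(trials) / len(trials)                  # 1.33...

print(ensemble_average, round(time_average, 2))  # 1.0 1.33
```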

This may sound like a naive and useless example, but for a very large sample space in a complicated setting, as in thermodynamic ensembles and limits, computing time-averaged values is much easier, whereas computing the ensemble average is intractable in most cases in statistical physics.

Connection to the law of large numbers 

Statistically inclined readers might notice that the above definition sounds like the law of large numbers. It is indeed the strong form of the law of large numbers, which is a special case of an ergodic theorem.
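A quick simulation illustrates this convergence: as the number of trials grows, the time average of the two-coin observable drifts towards the ensemble average of 1.0. The function and seed below are illustrative choices for the sketch.

```python
import random

def time_average(n_trials, seed=7):
    """Time average of the two-coin sum (H=1, T=0) over n_trials tosses."""
    rng = random.Random(seed)  # seeded generator for reproducibility
    total = sum(rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_trials))
    return total / n_trials

# The deviation from the ensemble average 1.0 shrinks as trials grow,
# in line with the strong law of large numbers.
for n in (10, 1_000, 100_000):
    print(n, time_average(n))
```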


Conclusions: Why would I care about ergodicity?

Apart from its intellectual appeal, ergodicity is very important in statistical physics. Almost all computations rely on ergodic theorems: so-called N-body systems are simulated on this basis to compute physical properties. Ergodicity pops up everywhere, from economics to deep learning.

Further Reading with Notes

The literature on ergodicity is vast. Due to its mathematical nature, texts on ergodicity can easily be inaccessible to the average scientist, well, even to the average mathematician.

  • Modern Ergodic Theory, Joel L. Lebowitz and Oliver Penrose link (excellent for an introduction)
  • Computational Ergodic Theory, Choe, link (advanced text)
  • An introduction to Chaos and Nonequilibrium Statistical Mechanics, link (explains why Boltzmann's ergodicity is different)
  • Deep Learning and complexity: link (ergodicity in spectra, different kind of ergodicity)
  • Ergodicity and economics: link (ergodicity in risk decisions, different kind of ergodicity)

Please cite as follows:

 @misc{suezen23lpc, 
     title = {A practical understanding of ergodicity}, 
     howpublished = {\url{http://science-memo.blogspot.com/2020/01/a-practical-understanding-of-ergodicity.html}}, 
     author = {Mehmet Süzen},
     year = {2023}
}  


Postscript notes

  • Sample space and ensemble are not synonymous. An ensemble is a special type of sample space whereby all unique combinations are exhausted for a given system, i.e., a defined event set.
  • The origins of ergodicity go back to Boltzmann, but Gibbs's ensembles also provide a conceptual parallel. An ensemble could be thought of as a distinct measurement protocol. The question of ensemble equivalence is then similar to the question of ergodicity, i.e., whether two protocols are equivalent; in a strict mathematical sense this is still an analogy, because technically time-averaging would also curate a different set. The only distinction is that time-averaging is tied to a given ensemble.
  • Law of large numbers and ergodicity: there is a deep connection between the law of large numbers and ergodicity. The ergodicity of von Neumann-Birkhoff means that visiting all possible states is considered. The advanced language used in definitions of ergodicity and the law of large numbers, invoking measure theory, makes it difficult to see that the two concepts overlap for practical purposes; both, however, appear to have had multiple different inceptions. The strong form of the law of large numbers overlaps with von Neumann-Birkhoff ergodicity.




Monday, 2 January 2017

Testing emergent gravity: Gravitational Lensing to atom interferometer

Paranal Telescopes in Chile. (ESO/H.H. Heyer)
In this post, I would like to briefly discuss emergent gravity, an idea, that gravity itself is an artefact of more fundamental description, such as entropy. Despite the fact that we have an experimental evidence of gravitational waves, thus gravity really exist and physically detectable: The recent results of Laser Interferometer Gravitational-Wave Observatory (LIGO) programme in detecting gravitation waves was a landmark experimental evidence supporting Einstein's General Relativity theory.

Emergent Gravity: Verlinde's thesis

Erik Verlinde proposed a controversial hypothesis in 2010 that gravity originates from an entropic force [here]. He showed that both Newtonian and Einstein's gravitational equations are artefacts of an entropic force, and in 2016 he proposed a similar approach to explaining galactic motions without the need for dark matter [here].

Testing Emergent Gravity: Gravitational Lensing

Margot M. Brouwer and her co-workers published a study [here] using weak gravitational lensing data from ESO telescopes. This was the first evidence for Verlinde's theory, and it attracted a lot of media attention because of its implications for our understanding of the nature of gravity.

Testing Entropic Gravity Directly: Atom Interferometer

Despite this initial test of emergent gravity, there is still a lack of direct experimental evidence for Verlinde's original hypothesis that force laws are artefacts of an entropic force. Recently another approach was proposed to test this entropic force argument [here], using matter-wave interferometry and utilising part of the Newton-Schrödinger formalism. The core idea is that there is a direct relationship between the gravitational constant G and the atomic system's quantum state. If this is experimentally feasible, perhaps with next-generation atom interferometer systems, it could be a direct test of Verlinde's entropic force argument.



(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.