Figure: Maxwell's handwritten state diagram (Wikipedia).
Modern statistics is now moving into an emerging field called data science, which amalgamates many different disciplines, from high-performance computing to control engineering. However, researchers in machine learning and statistics sometimes omit, naïvely and probably unknowingly, the fact that some of the most important ideas in data science actually originated in physics discoveries and were developed specifically by physicists. In this short exposition we review the physics origins of the areas identified by Gelman and Vehtari (2021). An additional section covers other areas that are currently the focus of active research in data science.
Bootstrapping and simulation-based inference: Gibbs's ensemble theory and Metropolis's simulations
Figure: Maxwell relations as causal diagrams.
- What Are the Most Important Statistical Ideas of the Past 50 Years?, Gelman and Vehtari (2021)
- A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation, Bradley Efron and Gail Gong (1983)
- Elementary Principles in Statistical Mechanics, Gibbs (1902)
- Equation of State Calculations by Fast Computing Machines, Metropolis et al. (1953)
- Generalized Statistical Mechanics: Connection with Thermodynamics, Curado and Tsallis (1992)
- Poincaré Sections of Hamiltonian Systems (1996)
- Statistical Mechanics of Ensemble Learning, Anders Krogh and Peter Sollich (1997)
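To make the connection concrete, here is a minimal random-walk Metropolis sampler in the spirit of Metropolis et al. (1953). The target density, the Gaussian proposal, and the parameter values (`step_size`, `n_steps`) are illustrative assumptions for this sketch, not details taken from the papers above.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis(log_prob, x0, n_steps=10_000, step_size=0.5):
    """Random-walk Metropolis sampler.

    `log_prob`, the Gaussian proposal, and `step_size` are illustrative
    choices, not prescriptions from the cited papers.
    """
    x = x0
    samples = np.empty(n_steps)
    for i in range(n_steps):
        proposal = x + step_size * rng.normal()
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < log_prob(proposal) - log_prob(x):
            x = proposal
        samples[i] = x
    return samples

# Example: sample a Boltzmann-like density exp(-E(x)) for the
# double-well energy E(x) = x**4 - 2*x**2.
draws = metropolis(lambda x: -(x**4 - 2 * x**2), x0=0.0)
print(draws.mean(), draws.std())
```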
AI as a phenomenon appears to lie in the domain of core physics. For this reason, studying physics, whether as a (post)graduate degree or through self-study modules, will give students and practitioners alike genuinely cutting-edge insight.
- Statistical models based on correlations originate from the physics of periodic solids and astrophysical N-body dynamics.
- Neural networks originate from models of magnetic materials with discrete states, later described as a cooperative phenomenon; their training dynamics closely follow free-energy minimisation (see the Hopfield-style sketch after this list).
- Causality has its roots in the ensemble theory of physical entropy.
- Almost all sampling-based techniques rest on the physics idea of sampling energy surfaces, i.e., potential energy surfaces (PES).
- Generative AI originates from the physics of fluid diffusion: the classical Liouville description of mechanics, i.e., phase-space flows, and generalised Fokker-Planck dynamics (see the Langevin sketch at the end of this section).
- Language models based on attention are effectively the coarse-grained entropy dynamics introduced by Gibbs: attention layers behave as a coarse-graining procedure, i.e., a mapping onto compressed causal graphs.
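To illustrate the magnetic-materials item above, here is a minimal Hopfield-style sketch: binary spins with symmetric Hebbian couplings give exactly an Ising-model energy, and asynchronous updates descend that energy. The pattern count, network size, and noise level are arbitrary assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hopfield-style network: binary "spins" s_i in {-1, +1} with symmetric
# couplings W, exactly the form of an Ising-model energy
#   E(s) = -1/2 * s^T W s.
# With zero diagonal, asynchronous updates can only lower E, so pattern
# recall is literally energy descent.
patterns = rng.choice([-1, 1], size=(3, 50))      # 3 stored patterns
W = patterns.T @ patterns / patterns.shape[1]      # Hebbian couplings
np.fill_diagonal(W, 0.0)

def energy(s):
    return -0.5 * s @ W @ s

s = patterns[0] * rng.choice([1, -1], size=50, p=[0.8, 0.2])  # noisy cue
for _ in range(10):                                # asynchronous sweeps
    for i in rng.permutation(50):
        s[i] = 1 if W[i] @ s >= 0 else -1          # align with local field
print(energy(s), np.mean(s == patterns[0]))        # pattern recovered?
```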
This is not a matter of building analogies to physics; these are foundational topics for AI.
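As a sketch of the diffusion connection flagged in the list above, the following runs unadjusted Langevin dynamics, whose density evolves under a Fokker-Planck equation toward exp(-E(x)) when the drift is the score -∇E; score-based generative models run a learned version of the same update. The step size, step count, and Gaussian target here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def langevin_sample(score, x0, n_steps=2_000, eps=1e-2):
    """Unadjusted Langevin dynamics:
        x <- x + eps * score(x) + sqrt(2 * eps) * noise.

    The stationary density of the associated Fokker-Planck equation is
    p(x) ∝ exp(-E(x)) when score = -∇E. `eps` and `n_steps` are
    illustrative choices.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x + eps * score(x) + np.sqrt(2 * eps) * rng.normal(size=x.shape)
    return x

# Example: drift a cloud of particles toward a standard Gaussian,
# whose score is simply score(x) = -x.
particles = langevin_sample(lambda x: -x, x0=np.full(1_000, 5.0))
print(particles.mean(), particles.std())  # ≈ 0 and ≈ 1 after mixing
```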