Sunday, 3 May 2015

Constants or integrals of motion: Invariants of a dynamical flow

In this post we briefly review a concept in dynamical systems, namely invariants of a dynamical flow, with a simple derivation using the famous Lotka-Volterra system, due to Lotka (1925) and Volterra (1927), as an example.

Concept

A dynamical flow associated with an observation vector ${\bf y}(t)$ may admit functions $I({\bf y})$ that are time independent, i.e., $dI/dt=0$. The number of invariants relative to the length of the observation vector constrains the overall dynamics.

Lotka-Volterra (LV) System

The LV dynamics describes the interaction between the prey population $v$ and the predator population $u$, a case of a predator-prey model. We will use a special case of the LV dynamics; recall that the dot notation denotes a time derivative. For the predators,
$$ \dot{u}  =  u (v-2) $$
and for the prey,
$$ \dot{v}  = v (1-u) $$
The observation vector consists of ${\bf y}=(u,v)$.

If we divide these equations, hoping to collect the $u$ and $v$ terms separately,

$$
\begin{eqnarray}
\frac{\dot{u}}{\dot{v}}            & = & \frac{u (v-2)}{v(1-u)} \\
\dot{u} v (1-u)                            & = & \dot{v} u (v-2) \\
\dot{u} v (1-u) - \dot{v} u (v-2)  & = & 0 \\
\dot{u} (1-u) - \dot{v} u/v (v -2) & = & 0\\
\dot{u} (1-u)/u - \dot{v}(v-2)/v   & = & 0
\end{eqnarray}
$$

If we integrate both sides over time $dt$,
$$
\begin{eqnarray}
\int \frac{1-u}{u} \frac{du}{dt} dt - \int \frac{v-2}{v} \frac{dv}{dt} dt & = &0 \\
\int \frac{1-u}{u}du - \int \frac{v-2}{v} dv & = &0 \\
\end{eqnarray}
$$

Solving these indefinite integrals yields an invariant of the LV dynamics,
$$ I({\bf y}) = \ln u - u + 2 \ln v - v $$
which satisfies $dI({\bf y})/dt = 0$.
We have thus exhibited one invariant of the system. Invariants are important for determining the structure of the dynamics, for example whether it is volume preserving, as in Hamiltonian dynamics.
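As a quick numerical check (a minimal Python sketch, although the examples on this blog are otherwise in R), we can integrate the LV equations with a standard fourth-order Runge-Kutta step and verify that $I({\bf y})$ barely drifts along a trajectory:

```python
import math

def rhs(u, v):
    # Predator and prey equations: du/dt = u(v-2), dv/dt = v(1-u)
    return u * (v - 2.0), v * (1.0 - u)

def rk4_step(u, v, dt):
    # Classical fourth-order Runge-Kutta step for the pair (u, v)
    k1u, k1v = rhs(u, v)
    k2u, k2v = rhs(u + 0.5 * dt * k1u, v + 0.5 * dt * k1v)
    k3u, k3v = rhs(u + 0.5 * dt * k2u, v + 0.5 * dt * k2v)
    k4u, k4v = rhs(u + dt * k3u, v + dt * k3v)
    return (u + dt * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0,
            v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)

def invariant(u, v):
    # I(y) = ln u - u + 2 ln v - v
    return math.log(u) - u + 2.0 * math.log(v) - v

u, v = 0.5, 1.0            # arbitrary initial populations (must be positive)
I0 = invariant(u, v)
for _ in range(10000):     # integrate to t = 10 with dt = 1e-3
    u, v = rk4_step(u, v, 1e-3)
print(abs(invariant(u, v) - I0))  # drift; RK4 does not conserve I exactly, but it stays tiny
```

The initial condition and step size here are arbitrary choices for illustration; any positive $(u, v)$ away from the fixed point $(1, 2)$ traces a closed orbit on a level set of $I$.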

Further Reading

Tuesday, 13 May 2014

Is ergodicity a reasonable hypothesis? Understanding Boltzmann's ergodic hypothesis

Figure: Ergodic vs. non-ergodic trajectories (Wikipedia)
Many undergraduate physics students barely study the ergodic hypothesis in detail. It is usually presented as ensemble averages being equal to time averages. While the concept of a statistical ensemble may be accessible to students, when it comes to ergodic theory and its theorems, where higher-level mathematical jargon kicks in, it may be confusing for the novice reader, and even for practicing physicists and educators, what ergodicity really means. For example, a recent pre-print titled "Is ergodicity a reasonable hypothesis?" defines ergodicity as follows:
...In the physics literature "ergodicity" is taken to mean that a system, including a macroscopic one, visits all microscopic states in a relatively short time...[link]
Visiting all microscopic states is not a pre-condition for ergodicity from the statistical physics standpoint. This form of the theory is a manifestation of the strong ergodic hypothesis, because of the Birkhoff theorem, and may not reflect the physical meaning of ergodicity. However, the originator of the ergodic hypothesis, Boltzmann, had a different thing in mind when explaining how a system approaches thermodynamic equilibrium. One of the best explanations is given in the book by J. R. Dorfman, titled An Introduction to Chaos and Nonequilibrium Statistical Mechanics [link]; in Section 1.3, Dorfman explains what Boltzmann had in mind:
...Boltzmann then made the hypothesis that a mechanical system's trajectory in phase-space will spend equal times in regions of equal phase-space measure. If this is true, then any dynamical system will spend most of its time in phase-space region where the values of the interesting macroscopic properties are extremely close to the equilibrium values...[link]
Saying this, Boltzmann did not suggest that a system should visit ALL microscopic states. His argument only suggests that states close to equilibrium are more likely to be visited.
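A toy illustration of this point (a Python sketch; the irrational rotation of the circle is my stand-in example here, not one taken from the texts quoted above): a single trajectory of the rotation visits only a countable, measure-zero set of points on the circle, yet its time average already matches the ensemble (Lebesgue) average.

```python
import math

alpha = (math.sqrt(5.0) - 1.0) / 2.0   # irrational rotation number (golden mean)
y, n, total = 0.1, 200000, 0.0
for _ in range(n):
    y = (y + alpha) % 1.0              # rotation on the unit circle
    total += y
time_avg = total / n                   # time average along one trajectory
space_avg = 0.5                        # ensemble (Lebesgue) average of x over [0, 1)
print(time_avg, space_avg)
```

The trajectory never comes close to "visiting all states", yet the two averages agree to a few decimal places, which is the sense of ergodicity relevant to physics.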

Postscript (June 2022)

The sufficiency of sparse visits: Physical states are rarely fine-grained

The ergodic theorems of Birkhoff and von Neumann appear to require that attaining ergodicity means visiting all possible states or regions. This requirement is not correct for physics. The key concepts here are coarse-graining and the sufficiency of sparse visits: most physical systems have many equally likely states.


The generated dynamics would rarely need to visit all accessible states or regions. Physical systems are rarely fine-grained and have a degree of sparseness, reducing their astronomically large number of states to a handful. In summary, visiting all physical states or regions in time averages is not strictly needed for the physics definition of ergodicity.


Only a collection of regions, or multiple states with higher probability, needs to be covered to achieve thermodynamic equilibrium: a concept of "sufficiency of sparse visits". This approach makes physical experiments possible over a finite time, consistent with thermodynamics.




Friday, 17 January 2014

Particle approximation to probability density functions: Dirac delta function representation

In the previous post, I briefly showed the idea of using the Dirac delta function for discrete data representation. In the second example there, the histogram locations for a given set of points were presented as spike trains, whereas the heights were given in a second sum. This is hard to follow and visualise, especially if you are not practiced in reading formulations with multiple indices. For pedagogical reasons, in an easier representation of an arbitrary probability density function (PDF), $p(x)$, one would simply couple each discrete point with a corresponding weight.

Hence, a set $\{x_{i}, \omega^{i}\}_{i=1}^{N}$ provides an estimate of the PDF, $\hat{p}(x)$. At this point we can invoke the Dirac delta function,

$ \hat{p}(x) = \sum_{i=1}^{N} \omega^{i} \delta(x-x_{i})$

Let's revisit the R code given there; this time let's draw uniform numbers in $[-2, 2]$ to get 100 $x_{i}$ values. These numbers simply indicate the locations on the x-axis, a spike train. For simplicity, let's use the Gaussian distribution $\mathcal{N}(0, 1)$ as the target PDF. Then, for the weights, we evaluate the target density at the spike locations. This approach is easier to understand compared to my previous double-index notation.

R Example code

The procedure explained above is trivial to implement in R.

# Generate 100 x locations
# out of the 2001 grid points in [-2.0, 2.0]
set.seed(42)
# Domain where the Dirac comb operates
Xj = seq(-2, 2, 0.002)
Xi = sample(Xj, 100)
# Now generate weights from N(0,1) at those given locations
Wi = dnorm(Xi)
# Now visualise
plot(Xi, Wi, type="h", xlim=c(-2.0, 2.0), ylim=c(0, 0.6), lwd=2, col="blue", ylab="p")

Conclusion

The notation above introduces a second abuse of notation, while in practice there must be a secondary regular grid that picks the $x_{i}$ values used by the Dirac delta, because the argument of $\hat{p}(x)$ lives in a discrete domain. A slightly better notation that reflects the above code would be

$ \hat{p}(x_j) = \sum_{i=1}^{N} \omega^{i} \delta(x_{j}-x_{i})$

The set $x_j$ is simply defined on a certain domain, for example on a regular grid. Hence I recommend not introducing the Dirac delta when explaining a particle approximation to PDFs to novice students in class. It will only confuse them even more.
Figure: Spike trains with weights $\hat{p}(x) = \sum_{i=1}^{N} \omega^{i} \delta(x-x_{i})$
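For completeness, here is the same grid-based construction as a Python sketch (mirroring the R code above: a regular grid on $[-2, 2]$, 100 sampled locations, and $\mathcal{N}(0, 1)$ weights), where the delta acts as a Kronecker delta on the grid:

```python
import math
import random

random.seed(42)
# Regular grid x_j on [-2, 2] with step 0.002 (the discrete domain)
grid = [round(-2.0 + 0.002 * j, 3) for j in range(2001)]
# Sample N = 100 distinct locations x_i from the grid
xi = random.sample(grid, 100)
# Weights: evaluate the N(0,1) density at the sampled locations
wi = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) for x in xi]

# hat{p}(x_j) = sum_i w_i * delta(x_j - x_i), with delta as a Kronecker delta
p_hat = {x: 0.0 for x in grid}
for x, w in zip(xi, wi):
    p_hat[x] += w

nonzero = sum(1 for v in p_hat.values() if v > 0)
print(nonzero, sum(p_hat.values()))  # 100 spikes; total mass equals sum of weights
```

Note that $\hat{p}$ is zero at every grid point except the 100 sampled ones, which is exactly the spike-train picture in the figure.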

Wednesday, 20 November 2013

Demystify Dirac delta function for data representation on discrete space

The Dirac delta function is an important tool in Fourier analysis. It is used routinely, especially in electrodynamics and signal processing. A function over a set of data points is often shown with a delta function representation. A novice reader relying on the integral properties of the delta function may find this notation quite confusing. Probably, the notation itself is an example of abuse of notation.

One dimensional function/distribution: Sum of delta functions

Let's define a one-dimensional function $f(x)$ as follows, with $x \in \mathbb{R}$ and $a$ being a constant:

$ f(x) = a \sum_{i=-n}^{n} \delta(x - x_{i})$

This representation is inspired by the Dirac comb and is used for spike trains. Note that the set of data points in one dimension, $\{x_{i}\}$, determines the graph of this function. Using the sifting property of the delta function, the value of the function is zero everywhere except at the data points. The constant $a$ is simply the height of the graph at each data point.

Figure: A spike train.

Numeric Example

Let's plot $f(x)$ for some specific values of the set, $\{x_{i}\} = \{-0.5, -0.2, -0.1, 0.2, 0.4\}$ and $a=0.5$. Here is the R code for plotting this spike train.


x_i = c(-0.5, -0.2, -0.1, 0.2, 0.4)
a   = rep(0.5, length(x_i))  # constant height at each spike
plot(x_i, a, type="h", xlim=c(-0.6, 0.6), ylim=c(0, 0.6), lwd=2, col="blue", ylab="p")


Representing Histograms: One dimensional example

A particularly convenient representation of histograms can be developed similarly. Consider a set of points $\{x_{i}\}_{i=1}^{n}$ from which we would like to build a histogram, say $h(x)$, and set the histogram intervals as $\{x_{j}\}_{j=1}^{m}$. The histogram $h(x)$ can then be written as

$h(x_{j}) =  \sum_{i=1}^{n} \delta(x_{j}- x_{i}^{min})$
where $x_{i}^{min}$ denotes the value from the set $\{x_{j}\}_{j=1}^{m}$ that is closest to the given $x_{i}$. The sum then determines the height at a given point, i.e., the frequency. This is a rather confusing mathematical representation; a practical implementation simply counts the frequency of $x_{i}^{min}$ directly.
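The direct counting that the practical implementation uses can be sketched in Python (the `nearest_bin_hist` helper, the bin centres, and the data points are made up for illustration):

```python
def nearest_bin_hist(data, bins):
    # h(x_j): count how many x_i are closest to bin centre x_j (i.e. x_j = x_i^min)
    counts = {b: 0 for b in bins}
    for x in data:
        nearest = min(bins, key=lambda b: abs(b - x))  # ties resolve to the first bin listed
        counts[nearest] += 1
    return counts

# Made-up bin centres and data points for illustration
bins = [-0.4, -0.2, 0.0, 0.2, 0.4]
data = [-0.5, -0.2, -0.1, 0.2, 0.4]
hist = nearest_bin_hist(data, bins)
print(hist)
```

No delta function appears anywhere: assigning each $x_i$ to its nearest bin centre and counting is the whole content of the formula above.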


Conclusion

However trivial it is, the above usage of sums of delta functions appears in mathematical physics as well, not limited to statistics.




(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.