Tuesday, 3 June 2025

Compressive algorithmic randomness:
Gibbs-randomness proposition for massively energy efficient deep learning

Figure: Dual Tomographic Compression Performance, Süzen, 2025.
Preamble

Randomness is elusive, and it is probably one of the outstanding concepts of human scientific endeavour, along with gravity. Kolmogorov complexity appears to be a remarkably novel attempt at answering "what is randomness?". The idea that the length of the smallest model (program) that can generate a random sequence determines its complexity was a turning point in the history of science. Similarly, it implies choosing the simplest model to explain a phenomenon. That is why Kolmogorov's work was also supported by the ideas of Solomonoff and Chaitin. A recent work explores this algorithmic information from a compression perspective, using Gibbs entropy.

A strange tale of a path from applied research to a fundamental proposition.

While studying model compression algorithm development, I noticed an intriguing behaviour: information, entropy, and compression interact over the compression process in a deeper way than expected.

New concepts in compression and randomness via train-compress cycles

Here, we explain the new concepts both for deep learning model compression and for the interplay between compression and algorithmic randomness.

Inverse compressed sensing (iCS): Normally, the CS procedure is applied to reconstruct an unknown signal from fewer measurements. In the deep learning train-compress case, the weights are known at a given point in the training cycle. If we create hypothetical measurements, then using CS formulations we can reconstruct a sparse projection of the weights (see the toy sketch after these definitions).

Dual Tomographic Compression (DTC): Applying iCS to the inputs and outputs at the neuronal level, layer-wise and simultaneously.

Weight rays: The reconstructed output vectors of DTC, i.e., the weights at a given sparsity level; they are not generated in isolation but within the train-compress cycle.
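As a minimal, illustrative sketch of the iCS idea (my own toy reconstruction, not the DTC implementation released with the paper): take a known weight vector, generate hypothetical random measurements of it, and recover a sparse projection by an l1-regularised reconstruction, here with plain iterative soft-thresholding (ISTA). In the DTC setting the same recovery would be applied to the input and output sides of each neuron, layer-wise, with the recovered sparse vectors playing the role of weight rays.

import numpy as np

def ista(A, y, lam=0.05, n_iter=500):
    # Iterative soft-thresholding for: min_x 0.5*||A x - y||^2 + lam*||x||_1
    L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L              # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
w = rng.normal(size=256)                           # known weights of one neuron (toy stand-in)
A = rng.normal(size=(128, w.size)) / np.sqrt(128)  # hypothetical measurement matrix
y = A @ w                                          # hypothetical CS measurements of the known weights

w_ray = ista(A, y)                                 # sparse projection of the weights, a "weight ray"
print("non-zeros:", np.count_nonzero(w_ray), "of", w.size)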

Gibbs-randomness proposition: An extension of Kolmogorov complexity to a compression process, stating that directed randomness is the same as complexity reduction, i.e., compression.
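The proposition is stated precisely in the paper; as a crude numerical illustration of the direction it points at (my own toy example, with a general-purpose compressor standing in for the uncomputable Kolmogorov complexity), one can track a compressed-size proxy while weights are progressively sparsified: the directed compression process drives the complexity proxy down.

import zlib
import numpy as np

def complexity_proxy(arr):
    # Compressed byte length: a crude, computable stand-in for Kolmogorov complexity.
    return len(zlib.compress(arr.astype(np.float32).tobytes(), 9))

rng = np.random.default_rng(0)
w = rng.normal(size=4096)  # dense random "weights": nearly incompressible

# A toy directed compression process: zero out ever more of the smallest-magnitude weights.
for sparsity in (0.0, 0.5, 0.9, 0.99):
    w_s = w.copy()
    k = int(sparsity * w.size)
    w_s[np.argsort(np.abs(w_s))[:k]] = 0.0
    print(f"sparsity={sparsity:.2f}  complexity proxy ~ {complexity_proxy(w_s)} bytes")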

Conclusion 

A new technique called DTC can be used to train deep learning models with model compression on the fly. This gives rise to massively energy-efficient deep learning, reaching a reduction of almost 98% in energy use. Moreover, the technique also demonstrates an extended version of Kolmogorov complexity, the Gibbs-randomness proposition.

Further reading 

Paper & codes are released:


Cite as 

 @misc{suzen25car, 
     title = {Compressive algorithmic randomness: Gibbs-randomness proposition for massively energy efficient deep learning}, 
     howpublished = {\url{https://science-memo.blogspot.com/2025/06/compressive-algorithmic-randomness.html}}, 
     author = {Mehmet Süzen},
     year = {2025}
}  

Tuesday, 22 April 2025

Ultimate physical limit of data storage:
Connecting Bekenstein Bound to Landauer Principle

Preamble 
Figure: Bekenstein's Information (Wikipedia)

Black holes are not too esoteric anymore after LIGO's success and the successful imaging efforts. Their entropy behaves much differently than the entropy of ordinary matter. This led to the incredible discovery of the so-called holographic principle, which states that we live on a projection of a higher-dimensional manifestation of the universe. On the other hand, Landauer quietly made a discovery about the energy expenditure of keeping information processing reversible. A similar bound, put forward by Bekenstein for black holes, makes a Landauer-Bekenstein bound a natural avenue to study for quantum gravity.

What is the Bekenstein Bound?

Basically, this limits the size of a black hole via a bound on its entropy. Essentially, it states that the entropy of a black hole $S$ is bounded by the radius of the black hole $R$ and its energy $E$, with the other constants set to 1,

$$S \le R \cdot E$$

What is Landauer Principle?

A limit stating that any process that wants to delete 1 bit of information has to dissipate an energy proportional to its temperature,

$$ E \ge T \ln 2 $$

Again, we set the constants to 1.

Physical limit of data storage: 1 BekensteinBytes

Using both the Bekenstein bound and the Landauer principle, one can compute a physical limit for data storage on a unit sphere,

$$S \sim T$$
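One way to read how the two bounds combine (a rough sketch only: the Landauer cost is taken at saturation, and $R$ together with all remaining constants is set to 1):

$$ S \;\le\; R\,E \;\approx\; R\,T \ln 2 \;\sim\; T. $$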

If we scale this with the inverse of the Planck area $\ell_{p}^{2}$, taking one bit of information to be proportional to one Planck area, and scale the result with the maximum attainable temperature of $10^{32}$ K, the final value corresponds to $\approx 10^{100}$ bits. This again corresponds to about 10 Giga-Quetta-Quetta-Quetta bytes (1 Quetta byte is $10^{30}$ bytes).
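As a back-of-the-envelope orientation only (my own rough numbers, with every order-one constant dropped, so only the exponent is meaningful):

$$ N \;\sim\; \frac{T_{\max}}{\ell_{p}^{2}} \;\approx\; \frac{10^{32}}{(1.6 \times 10^{-35})^{2}} \;\approx\; 4 \times 10^{101}, $$

which, at this level of crudeness, sits at the same rough scale as the $\approx 10^{100}$ bits quoted above.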

At this point in time, we propose to call $10^{100}$ bits 1 BekensteinBytes. However, with more fine-grained computations the number may change. Our purpose here is to give a very rough idea of the scale of the ultimate physical limit of data storage, in BekensteinBytes.

Bekenstein Information Conjecture: One cannot compress more than 1 BekensteinBytes into the smallest patch of space.

Outlook

An interesting connection between quantum gravity and computation provides a physical limitation on how much information can be stored in a given unit of space. We call this the Bekenstein Information Conjecture.

Further reading

Cite as 

 @misc{suezen25datalimit, 
     title = {Ultimate physical limit of data storage: Connecting Bekenstein Bound to Landauer Principle}, 
     howpublished = {\url{https://science-memo.blogspot.com/2025/04/ultimate-physical-limit-of-data-storage.html}}, 
     author = {Mehmet Süzen},
     year = {2025}
}  


(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.