Hi all,

See below for the title and abstract of Susanne's talk.

Best,

-joe

Title: Thermodynamics and predictive inference

Abstract: Technically, most machine learning algorithms solve some
optimization problem, where the aims of the learning method are
encoded into an objective function. While this optimization can be
hard in practice, and might require serious work, the main conceptual
problem is that of finding the objective. I investigate whether and
how physical considerations can be used to derive learning objectives,
or, equivalently, rules for information acquisition and processing. All
information processing systems are physical, and hence subject to
fundamental physical limits. Here, we will focus on thermodynamic
efficiency. There is a fundamental relationship between Shannon's
mutual information and work, which can be easily appreciated in the
context of information engines. This connection allows us to derive
lossy compression from a "least effort principle": rate-distortion
theory emerges. If we widen the scope to partial observability and the
use of different temperature reservoirs, then allowing for maximal
thermodynamic efficiency leads directly to the "information
bottleneck" (IB) method. This method has been widely used in machine
learning and was extended to time-series modeling, quantum systems,
and to the situation in which the observer actively changes the
process it is learning about. Fluctuation theorems have helped clarify
the thermodynamics of systems driven arbitrarily far from equilibrium,
with, and without, feedback. The thermodynamics of classical, strongly
coupled systems can be understood in a straightforward way by
realizing that causal intervention offers a way to identify those
parts of the entropy production that result from feedback between the
subsystems. From this, the central relations describing the
thermodynamics of strongly coupled classical systems follow in a few
lines. I plan to spend some time on generalized, partially observable,
information engines, and the derivation of the IB, and then either on
discussing the thermodynamics of strongly coupled systems, or on
explaining IB and generalizations in greater detail, depending on the
interests of the audience. I would like to encourage a discussion to
connect to quantum thermodynamics.
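
For reference, here is a standard textbook form of two of the objects
named in the abstract (not necessarily the parameterization Susanne
will use): for an information engine operating by measurement and
feedback at temperature T, the extractable work is bounded by the
acquired mutual information,

    \langle W_\mathrm{ext} \rangle \le -\Delta F + k_B T \, I[X;M],

and the information bottleneck compresses an input X into a
representation M that remains predictive of a relevant variable Y by
minimizing, over encoders p(m|x), the objective

    \mathcal{L} = I[X;M] - \beta \, I[M;Y],

where \beta sets the trade-off between compression and prediction.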

Papers:
Thermodynamics:
S. Still (2019) Thermodynamic cost and benefit of memory. arXiv:1705.00612
G. E. Crooks and S. Still (2019) Marginal and Conditional Second Laws
of Thermodynamics. EPL 125, 40005
S. Still, D. A. Sivak, A. J. Bell and G. E. Crooks (2012) The
thermodynamics of prediction. Phys. Rev. Lett. 109, 120604
S. Still (2014) Lossy is lazy. Proc. 7th Workshop on Information
Theoretic Methods in Science and Engineering (WITMSE-2014), eds. J.
Rissanen, P. Myllymäki, T. Roos, and N. P. Santhanam
E. Stopnitzky, S. Still, T. E. Ouldridge and L. Altenberg (2019)
Physical limitations of work extraction from temporal correlations.
Phys. Rev. E 99, 042115
Information Bottleneck and extensions:
S. Still (2009) Information theoretic approach to interactive
learning. EPL 85, 28005.
S. Still (2014) Information Bottleneck Approach to Predictive
Inference. Entropy 16(2):968-989
S. Still and W. Bialek (2004) How many clusters? An information
theoretic perspective. Neural Computation, 16(12):2483-2506
A. L. Grimsmo and S. Still (2016) Quantum Predictive Filtering. Phys.
Rev. A 94, 012338
S. Still and D. Precup (2012) An information-theoretic approach to
curiosity-driven reinforcement learning. Theory in Biosciences
131(3):139-148
S. Still, J. P. Crutchfield and C. J. Ellison (2010) Optimally
Predictive Causal Inference. Chaos 20, 037111
S. Still and J. P. Crutchfield (2007) Structure or Noise?
http://lanl.arxiv.org/abs/0708.0654