Hi all,
See below for the title and abstract of Susanne's talk.
Best,
-joe
Title: Thermodynamics and predictive inference
Abstract: Technically, most machine learning algorithms solve an optimization problem in which the aims of the learning method are encoded in an objective function. While this optimization can be hard in practice and might require serious work, the main conceptual problem is finding the objective in the first place. I investigate whether and how physical considerations can be used to derive learning objectives, or, equivalently, rules for information acquisition and processing. All information processing systems are physical, and hence subject to fundamental physical limits. Here, we will focus on thermodynamic efficiency. There is a fundamental relationship between Shannon's mutual information and work, which is easily appreciated in the context of information engines. This connection allows us to derive lossy compression from a "least effort principle": rate-distortion theory emerges. If we widen the scope to partial observability and to reservoirs at different temperatures, then allowing for maximal thermodynamic efficiency leads directly to the "information bottleneck" (IB) method. This method has been widely used in machine learning and has been extended to time-series modeling, to quantum systems, and to the situation in which the observer actively changes the process it is learning about. Fluctuation theorems have helped clarify the thermodynamics of systems driven arbitrarily far from equilibrium, with and without feedback. The thermodynamics of classical, strongly coupled systems can be understood in a straightforward way by realizing that causal intervention offers a way to identify those parts of the entropy production that result from feedback between the subsystems. From this, the central relations describing the thermodynamics of strongly coupled classical systems follow in a few lines. I plan to spend some time on generalized, partially observable information engines and the derivation of the IB, and then either on the thermodynamics of strongly coupled systems or on IB and its generalizations in greater detail, depending on the interests of the audience. I would like to encourage a discussion connecting to quantum thermodynamics.
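(A quick refresher, in case it is useful before the talk: the two variational principles mentioned in the abstract have standard forms in the literature; the notation below is the usual textbook one, not necessarily the one Susanne will use. Rate-distortion theory minimizes I(X; \hat{X}) + \beta \langle d(x, \hat{x}) \rangle over encodings p(\hat{x}|x), trading the compression rate against expected distortion. The information bottleneck minimizes I(X; T) - \beta I(T; Y) over p(t|x), compressing the input X while retaining information about a relevant variable Y.)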
Papers:

Thermodynamics:
S. Still (2019) Thermodynamic cost and benefit of memory. arXiv:1705.00612
G. E. Crooks and S. Still (2019) Marginal and Conditional Second Laws of Thermodynamics. EPL 125, 40005
S. Still, D. A. Sivak, A. J. Bell and G. E. Crooks (2012) The thermodynamics of prediction. Phys. Rev. Lett. 109, 120604
S. Still (2014) Lossy is lazy. Proc. 7th Workshop on Information Theoretic Methods in Science and Engineering (WITMSE-2014), eds. J. Rissanen, P. Myllymäki, T. Roos, and N. P. Santhanam
E. Stopnitzky, S. Still, T. E. Ouldridge and L. Altenberg (2019) Physical limitations of work extraction from temporal correlations. Phys. Rev. E 99, 042115

Information Bottleneck and extensions:
S. Still (2009) Information theoretic approach to interactive learning. EPL 85, 28005
S. Still (2014) Information Bottleneck Approach to Predictive Inference. Entropy 16(2):968-989
S. Still and W. Bialek (2004) How many clusters? An information theoretic perspective. Neural Computation 16(12):2483-2506
A. L. Grimsmo and S. Still (2016) Quantum Predictive Filtering. Phys. Rev. A 94, 012338
S. Still and D. Precup (2012) An information-theoretic approach to curiosity-driven reinforcement learning. Theory in Biosciences 131(3):139-148
S. Still, J. P. Crutchfield and C. J. Ellison (2010) Optimally Predictive Causal Inference. Chaos 20, 037111
S. Still and J. P. Crutchfield (2007) Structure or Noise? http://lanl.arxiv.org/abs/0708.0654