Connections between physics, mathematics, and deep learning
Abstract
Starting from Fermat’s principle of least action, which governs classical and quantum mechanics, and from
the theory of exterior differential forms, which governs the geometry of curved manifolds, we show how
to derive the equations governing neural networks in an intrinsic, coordinate-invariant way, where the loss
function plays the role of the Hamiltonian. To be covariant, these equations imply a layer metric which is
instrumental in pretraining and explains the role of conjugation when using complex numbers. The differential
formalism clarifies the relation between the gradient descent optimizer and Aristotelian and Newtonian
mechanics. The Bayesian paradigm is then analyzed as a renormalizable theory yielding a new derivation
of the Bayesian information criterion. We hope that this formal presentation of the differential geometry
of neural networks will encourage some physicists to dive into deep learning and, reciprocally, that
specialists in deep learning will better appreciate the close connection of their subject with the foundations
of classical and quantum field theory.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Letters in High Energy Physics (LHEP) is an open access journal published by Andromeda Publishing and Education Services. The articles in LHEP are distributed under the terms of the Creative Commons license CC-BY 4.0. Under the terms of this license, copyright is retained by the author, while use, distribution, and reproduction in any medium are permitted provided proper credit is given to the original authors and sources.