Information-Geometry
April 20, 2024
Fisher Flow: Optimization on the Statistical Manifold
Gradient descent in Euclidean space ignores the geometry of probability distributions. Natural gradient descent uses the Fisher information metric instead. Fisher Flow makes this continuous.
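The core idea in one step: precondition the Euclidean gradient by the inverse Fisher information matrix. A minimal sketch (toy numbers, not from the post itself):

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.1):
    """One natural gradient step: solve F @ delta = grad instead of
    inverting F explicitly, then descend along delta."""
    return theta - lr * np.linalg.solve(fisher, grad)

# Toy 2-parameter model where the Fisher metric says one direction
# is 100x more sensitive than the other.
theta = np.array([1.0, 1.0])
grad = np.array([1.0, 1.0])
fisher = np.diag([100.0, 1.0])

# Plain gradient descent would move equally in both coordinates;
# the natural gradient takes a 100x smaller step along the
# high-Fisher (high-sensitivity) direction.
theta_new = natural_gradient_step(theta, grad, fisher)
# theta_new == [0.999, 0.9]
```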
September 12, 2023
Your Optimizer Is (Approximately) Propagating Fisher Information
Adam, K-FAC, EWC, and natural gradient are all approximating the same thing at different fidelity levels. The math and the caveats.
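The diagonal case is the easiest to see: the mean of squared per-example gradients is an empirical diagonal Fisher, and Adam's second-moment buffer is an exponential moving average of exactly that quantity. A hedged sketch with synthetic gradients (the post covers the full math and caveats):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-example gradients for a toy 3-parameter model.
grads = rng.normal(size=(1000, 3))

# Empirical diagonal Fisher: mean of squared per-example gradients.
diag_fisher = (grads ** 2).mean(axis=0)

# Adam preconditions by 1/sqrt(diag_fisher) (up to momentum and
# bias correction); an exact diagonal natural gradient step would
# use 1/diag_fisher instead. Same object, different exponent.
grad = grads.mean(axis=0)
adam_like_step = grad / (np.sqrt(diag_fisher) + 1e-8)
natural_like_step = grad / (diag_fisher + 1e-8)
```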