Implicit Regularisation, Large Stepsizes and Edge of Stability for (S)GD over Diagonal Linear Networks

Abstract

See on arXiv

Publication
Neural Information Processing Systems (NeurIPS), 2023
Mathieu Even
PhD student in Statistics and Optimization