Welcome to the Modula docs!

Modula is a deep learning framework designed for graceful scaling. Neural networks written in Modula automatically transfer learning rate across scale, so a learning rate tuned on a small model carries over as the model grows. Modula can be installed by running:

$ pip install modula
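
To give a feel for what transferring learning rate across scale can mean, here is a minimal sketch in plain Python. It does not use the Modula API. The function name `output_change_rms` and the particular scaling rule (a rank-one weight update whose spectral norm is set to `lr * sqrt(d_out / d_in)`) are assumptions chosen for illustration, in the spirit of the normalization ideas discussed in the companion paper. Under that assumption, the size of the change in a linear layer's output is governed by `lr` alone, whatever the layer's width.

```python
import math
import random


def rms(vec):
    """Root-mean-square size of a vector's entries."""
    return math.sqrt(sum(x * x for x in vec) / len(vec))


def unit(vec):
    """Rescale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def output_change_rms(d_in, d_out, lr, seed=0):
    """RMS change in a linear layer's output after one scaled update.

    Illustrative assumption: the update is rank one, dW = scale * u v^T,
    with unit vectors u and v, so its spectral norm is exactly `scale`.
    """
    rng = random.Random(seed)
    u = unit([rng.gauss(0, 1) for _ in range(d_out)])
    v = unit([rng.gauss(0, 1) for _ in range(d_in)])

    # Width-aware scaling of the update (the assumed rule).
    scale = lr * math.sqrt(d_out / d_in)

    # Worst-case input with unit RMS: aligned with v, Euclidean norm sqrt(d_in).
    x = [math.sqrt(d_in) * vi for vi in v]

    # dW x = scale * u * (v . x), computed without forming the full matrix.
    v_dot_x = sum(vi * xi for vi, xi in zip(v, x))
    delta_y = [scale * ui * v_dot_x for ui in u]
    return rms(delta_y)


if __name__ == "__main__":
    for d_in, d_out in [(8, 8), (64, 256), (512, 128)]:
        print(d_in, d_out, output_change_rms(d_in, d_out, lr=0.1))
```

For each shape in the loop, the printed value should be approximately `0.1`, the learning rate itself. Without the `sqrt(d_out / d_in)` factor, the same learning rate would produce output changes whose size depends on the layer's dimensions.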

Purpose of the docs

We wrote these docs with the intention of explaining both scaling theory and the design of Modula in clear and simple terms. We hope that this will help speed up deep learning optimization research.

If something is unclear, first check the FAQ. If that does not help, consider opening a GitHub issue, making a pull request, or reaching out to us by email, so that we can improve the docs for everyone.

Companion paper

If you prefer a more academic treatment, see our arXiv paper:

@article{modula,
  author  = {Tim Large and Yang Liu and Minyoung Huh and Hyojin Bahng and Phillip Isola and Jeremy Bernstein},
  title   = {Scalable Optimization in the Modular Norm},
  journal = {arXiv:2405.14813},
  year    = 2024
}

Acknowledgements

Thanks to Gavia Gray, Uzay Girit, Jyo Pari and Laurence Aitchison for helpful feedback.