Welcome to the Modula docs!¶
Modula is a deep learning framework designed for graceful scaling. Neural networks written in Modula automatically transfer learning rate across scale. Modula can be installed by running:
$ pip install modula
Purpose of the docs¶
We wrote these docs to explain both scaling theory and the design of Modula in clear and simple terms. We hope this will help speed up research on deep learning optimization.
If something is unclear, first check the FAQ; if that doesn't resolve it, consider opening a GitHub issue, making a pull request, or reaching out to us by email. That way we can improve the docs for everyone.
Companion paper¶
If you prefer to read a more academic-style paper, then you can check out our arXiv paper:
@article{modula,
  author  = {Tim Large and Yang Liu and Minyoung Huh and Hyojin Bahng and Phillip Isola and Jeremy Bernstein},
  title   = {Scalable Optimization in the Modular Norm},
  journal = {arXiv:2405.14813},
  year    = {2024}
}
Acknowledgements¶
Thanks to Gavia Gray, Uzay Girit, Jyo Pari and Laurence Aitchison for helpful feedback.