This release introduces partially Bayesian Transformers and neuron-level control over model stochasticity.
Key Additions:
Partially Bayesian Transformers: Transformer neural networks are at the heart of modern AI systems and are increasingly used in the physical sciences. However, robust uncertainty quantification with Transformers remains challenging. Replacing all weights with probability distributions and using advanced sampling techniques works for smaller networks, but it is computationally prohibitive for Transformers. The new partially Bayesian Transformer implementation lets you selectively make specific modules (embedding, attention, etc.) probabilistic while keeping the rest deterministic, significantly reducing computational cost while still delivering reliable uncertainty quantification.
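To illustrate the idea (this is a minimal NumPyro/Flax sketch, not the package's actual API; the names `TinyTransformerBlock` and `partially_bayesian_model`, the prior choices, and the module-selection logic are illustrative assumptions): priors are placed only on the parameters of the selected modules, while all other parameters keep their deterministic values.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
import numpyro
import numpyro.distributions as dist
from flax.traverse_util import flatten_dict, unflatten_dict


class TinyTransformerBlock(nn.Module):
    """A single self-attention + MLP block, defined as ordinary deterministic Flax."""

    @nn.compact
    def __call__(self, x):
        h = nn.SelfAttention(num_heads=4, name="attention")(x)
        h = nn.LayerNorm()(x + h)
        out = nn.Dense(x.shape[-1], name="mlp")(nn.relu(h))
        return nn.LayerNorm()(h + out)


def partially_bayesian_model(X, y=None, *, net, det_params, probabilistic_modules):
    """NumPyro model: only parameters whose path contains one of the selected
    module names get a prior and are sampled; the rest stay deterministic."""
    flat = flatten_dict(det_params)
    new_params = {}
    for path, value in flat.items():
        site_name = "/".join(path)
        if any(m in site_name for m in probabilistic_modules):
            # Probabilistic weight: prior centred on its deterministic value
            new_params[path] = numpyro.sample(site_name, dist.Normal(value, 1.0))
        else:
            # Deterministic weight: reused as-is
            new_params[path] = value
    mu = net.apply({"params": unflatten_dict(new_params)}, X)
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.0))
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)


# Example usage (illustrative): make only the attention weights probabilistic
net = TinyTransformerBlock()
X = jnp.ones((8, 16, 32))  # (batch, sequence, features)
det_params = net.init(jax.random.PRNGKey(0), X)["params"]
kernel = numpyro.infer.NUTS(partially_bayesian_model)
mcmc = numpyro.infer.MCMC(kernel, num_warmup=100, num_samples=100)
mcmc.run(jax.random.PRNGKey(1), X, y=jnp.ones((8, 16, 32)), net=net,
         det_params=det_params, probabilistic_modules=("attention",))
```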
Fine-grained Stochasticity Control: Even when only some layers are probabilistic, training deep learning models can be resource-intensive. You can now specify exactly which weights in particular layers should be stochastic, giving finer control over the trade-off between computational cost and uncertainty quantification.
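A sketch of the weight-level idea (again not the library's API; the function name, mask-based selection, and prior width are assumptions): a boolean mask picks out which entries of a layer's kernel are random, and everything else stays at its deterministic value.

```python
import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist


def partially_stochastic_dense(name, x, fixed_kernel, fixed_bias, kernel_mask):
    """Dense layer in which only the masked kernel entries are treated as random.

    fixed_kernel : (in, out) deterministic values, e.g. from a point-estimate fit
    kernel_mask  : (in, out) boolean array; True entries receive a prior and are sampled
    """
    # For simplicity, a full-shape kernel is sampled and the mask selects which entries
    # are actually used; masked-out entries just follow their prior and never affect
    # the likelihood. A real implementation would sample only the selected entries.
    sampled = numpyro.sample(f"{name}_kernel", dist.Normal(fixed_kernel, 0.1))
    kernel = jnp.where(kernel_mask, sampled, fixed_kernel)
    return x @ kernel + fixed_bias


def model(X, y=None, *, fixed_kernel, fixed_bias, kernel_mask):
    mu = partially_stochastic_dense("out", X, fixed_kernel, fixed_bias, kernel_mask)
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.0))
    numpyro.sample("obs", dist.Normal(mu.squeeze(-1), sigma), obs=y)


# Example mask: make only the ten largest-magnitude weights of this layer stochastic
fixed_kernel = jax.random.normal(jax.random.PRNGKey(0), (16, 1))
kernel_mask = jnp.abs(fixed_kernel) >= jnp.sort(jnp.abs(fixed_kernel).ravel())[-10]
```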
What's Changed
- Add layer-by-layer convergence diagnostics by @ziatdinovmax in #31
- Add a classifier option to convnets by @ziatdinovmax in #32
- Add basic Transformer (deterministic and partially Bayesian) by @ziatdinovmax in #33
- Add hybrid-layers for PBNNs by @ziatdinovmax in #36
- Fix classification option in Transformer by @ziatdinovmax in #38
- Partial attention by @ziatdinovmax in #39
- Add a simple option to view flax model layer configs by @ziatdinovmax in #40
- Ensure num of features is displayed properly for attention layers by @ziatdinovmax in #42
Full Changelog: 0.0.10...0.0.12