Releases · eole-nlp/eole
0.0.3
What's Changed
- [patch] minor fixes for 0.0.2 by @francoishernandez in #109
- Rework handling of special tokens by @francoishernandez in #45
- [patch] get_transforms_cls after update_config_with_checkpoint by @francoishernandez in #110
- [patch] get_transforms_cls after update_config_with_checkpoint BIS by @francoishernandez in #111
- Updated translator.py to handle updated special token logic when computing alignments by @dameikle in #113
- clearer log by @vince62s in #112
- fix training tensor parallel by @vince62s in #115
- restore all_reduce directly but with detach.clone first - fix #115 by @vince62s in #116
- Initial support for Metal Performance Shaders (MPS) by @dameikle in #98
- Manage share_decoder_embeddings in convert_HF, misc fixes and improvements by @francoishernandez in #121
- Deduce share_decoder_embeddings from HF tie_word_embeddings flag by @francoishernandez in #123
- [docs] Upgrading docusaurus packages, should fix dependabot warnings by @francoishernandez in #124
- add estimator in decoder-only + clean code by @vince62s in #120
- fineweb10B/gpt2 recipe, and supporting changes by @francoishernandez in #32
- enable pure bf16 training by @vince62s in #133 (see the config sketch after this list)
- Update WMT17 recipe with working tokenization transforms examples by @francoishernandez in #129
- fixes #131: module 'eole.utils' has no attribute 'distributed' error when training multi-GPU by @isanvicente in #132
- add estimator in greedy inference by @vince62s in #135
- Some QOL config/saving improvements by @francoishernandez in #134
- fix #136. Updated eole/bin/model/average_models.py to work with safetensors model format. by @isanvicente in #137
- fix head dim in rope by @vince62s in #140
- fix autocast at scoring when doing AMP by @vince62s in #141
- Some minor fixes by @francoishernandez in #143
- fix lora lm head by @vince62s in #142
- fix missing pad change by @vince62s in #148
- flash_attn_func does not support padding mask, maybe we need to drop a… by @vince62s in #149
- fix maybe_retranslate when the number of newlines does not match by @vince62s in #150
- Supporting HF tokenizers by @francoishernandez in #122
- Model Validator Recipe by @francoishernandez in #146
- apply bytefallback at detok (onmt_tokenize with sentencepiece) by @vince62s in #155
- patch eos_token_id list handling by @francoishernandez in #158
- Compile and CTranslate2 support by @vince62s in #161
- Move predict config update from model loading to config validation by @francoishernandez in #163
- EuroLLM Gradio (web-based) translator, 35 languages to 35 languages, by @vince62s in #164
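A couple of the items above are config-level changes. As a rough sketch of what pure bf16 training (#133) might look like in a YAML config, reusing the compute_dtype key introduced in #106; every other key name and value here is an illustrative assumption, not the verified eole schema:

```yaml
# Hedged sketch: only compute_dtype is grounded in this changelog (#106, #133);
# the surrounding structure and values are illustrative assumptions.
training:
  compute_dtype: bf16   # pure bf16 training rather than mixed precision, per #133
  batch_size: 2048
  train_steps: 100000
```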
New Contributors
- @isanvicente made their first contribution in #132
Full Changelog: 0.0.2...0.0.3
0.0.2
What's Changed
- Refactor position encoding configuration by @vince62s in #60
- fix update vocab by @vince62s in #63
- bfloat16 support, and an attempt at homogenizing model_dtype & precision by @francoishernandez in #54
- Fix prefix and suffix transforms - avoid adding empty suffix or prefix by @sersh88 in #57 (see the transforms sketch after this list)
- fix the incorrect Docker images in the README by @aaaallleen in #68
- Remove unnecessary optim in convert_HF by @francoishernandez in #71
- Add onmt_config converter to facilitate switch by @francoishernandez in #69
- Update some FAQ sections by @francoishernandez in #74
- Added TER and BLEU for early stopping by @aaaallleen in #73
- [fix] fix normalize and clean transforms config management by @francoishernandez in #87
- [docs] Fix quickstart config and command by @francoishernandez in #90
- add head_dim setting when different from hidden // heads by @vince62s in #78
- Some MHA and RoPE refactoring, llama-3.1 rope_scaling by @francoishernandez in #91
- Fixed variable referenced before assignment when position_embeddings is None error by @dameikle in #95
- Send src_pad_mask and tgt_pad_mask to decoder in _align_forward by @dameikle in #96
- Fix distrib by @vince62s in #100
- fix added tokens by @vince62s in #101
- Support mapped tokens, e.g. <im_start> ==> ⦅im_start⦆, in inference.yaml … by @vince62s in #102
- add wmt22 recipes with TowerInstruct and Llama3.1 LLMs by @vince62s in #103
- Remove duplicate sentencepiece requirement by @francoishernandez in #104
- [patch] Adapt some warning behaviours for reduced verbosity by @francoishernandez in #105
- [patch] Update precision to compute_dtype in forgotten places by @francoishernandez in #106
- Inference server, lots of related changes by @francoishernandez in #42
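To make the prefix/suffix fix (#57) concrete, here is a hedged sketch of the kind of transforms config it touches; key names follow OpenNMT-py conventions (eole's ancestor) and are assumptions here, not the verified eole schema:

```yaml
# Hedged sketch: key names assumed from the OpenNMT-py lineage.
transforms: [prefix, suffix]
data:
  corpus_1:
    path_src: data/src-train.txt
    path_tgt: data/tgt-train.txt
    src_prefix: "__en__"   # hypothetical language-tag prefix
    tgt_prefix: ""         # empty: with #57 this no longer injects an empty token
    src_suffix: ""
    tgt_suffix: ""
```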
New Contributors
- @sersh88 made their first contribution in #57
- @aaaallleen made their first contribution in #68
- @dameikle made their first contribution in #95
Full Changelog: 0.0.1...0.0.2
0.0.1
There will be more breaking changes in the near future, so let's start proper versioning and a corresponding changelog.
What's Changed Since Initial Publication
- MLP refactoring by @vince62s in #1
- fix llama3 and parallel_residual by @vince62s in #4
- fixed mismatch between mask and batch dimensions by @l-k-11235 in #6
- simplify LayerNorm access as a constant by @vince62s in #7
- Fix the checkpoint directory cleaning by @l-k-11235 in #10
- Modify default model config behaviour by @francoishernandez in #8
- rename num_kv, remove multiquery by @vince62s in #12
- fix mmlu config by @vince62s in #13
- Fix the tokenizer saving in the HF converter by @l-k-11235 in #14
- remove unused average attn by @vince62s in #15
- MHA refac: rope without complex operations + query only as input of the forward by @vince62s in #20
- Revert "MHA refac: rope without complex operations + query only as input of the forward" by @vince62s in #22
- missing removal of average attn by @vince62s in #23
- config.models.BaseModelConfig._override_values updates everything once by @francoishernandez in #24
- [fix] Patch lora bin to dump json config by @francoishernandez in #28
- review flash/sdpa arg by @vince62s in #25
- fix missing layers names by @vince62s in #30
- Split MHA by @vince62s in #29
- Resize the key_pad_mask by @l-k-11235 in #36
- [patch] upgrade docusaurus deps, fix build script by @francoishernandez in #37
- Add gpt2 converter, hellaswag eval tool, misc fixes by @francoishernandez in #38
- Forgot hellaswag.py tool in #38 by @francoishernandez in #39
- estim lambda scheduler by @vince62s in #40
- Add support for XLM-Roberta-XL (and XXL) conversion by @vince62s in #41
- Some fixes, get rid of data_task, homogenize model_task to model_type by @francoishernandez in #43
- Some improvements to config.json readability by @francoishernandez in #44
- [docs] Github Actions workflow to facilitate docs deployment by @francoishernandez in #47
- [fix] Allow build_vocab with full train config, patch vocab validation by @francoishernandez in #49 (see the config sketch after this list)
- Enable PyPI release workflow by @francoishernandez in #50
- [fix] Fix paths in wiki_103 recipe, add pyarrow opt requirement by @francoishernandez in #51
- Estim first token instead of average by @vince62s in #46
- Add Recipe to train a cometkiwi-like encoder model (which can be used to score sentence pairs) by @vince62s in #53
- Simplify init files, remove some unused code by @francoishernandez in #52
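To illustrate #49: the fix means a complete training config, like the hedged sketch below, can also be used for vocab building directly, without maintaining a trimmed-down copy. All key names are assumptions based on the OpenNMT-py lineage, not the verified eole schema:

```yaml
# Hedged sketch: one config serving both vocab building and training (per #49).
# Key names are illustrative assumptions.
src_vocab: data/vocab.src
tgt_vocab: data/vocab.tgt
data:
  corpus_1:
    path_src: data/src-train.txt
    path_tgt: data/tgt-train.txt
training:
  train_steps: 50000
  batch_size: 2048
```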
New Contributors
- @l-k-11235 made their first contribution in #6
Full Changelog: https://github.com/eole-nlp/eole/commits/0.0.1
0.0.1rc1 - Test PyPI Publish Workflow
This is a first test release to validate the build/publish release workflow.
Full Changelog: https://github.com/eole-nlp/eole/commits/0.0.1rc1