# Changelog

This is a centralised copy of the release changelogs that GitHub generates automatically for each release.

## 0.0.3

* [patch] minor fixes for 0.0.2 by @francoishernandez in #109
* Rework handling of special tokens by @francoishernandez in #45
* [patch] get_transforms_cls after update_config_with_checkpoint by @francoishernandez in #110
* [patch] get_transforms_cls after update_config_with_checkpoint BIS by @francoishernandez in #111
* Updated translator.py to handle updated special token logic when computing alignments by @dameikle in #113
* clearer log by @vince62s in #112
* fix training tensor parallel by @vince62s in #115
* restore all_reduce directly but with detach.clone first - fix #115 by @vince62s in #116
* Initial support for Metal Performance Shaders (MPS) by @dameikle in #98
* Manage share_decoder_embeddings in convert_HF, misc fixes and improvements by @francoishernandez in #121
* Deduce share_decoder_embeddings from HF tie_word_embeddings flag by @francoishernandez in #123
* [docs] Upgrading docusaurus packages, should fix dependabot warnings by @francoishernandez in #124
* add estimator in decoder-only + clean code by @vince62s in #120
* fineweb10B/gpt2 recipe, and supporting changes by @francoishernandez in #32
* enable pure bf16 training by @vince62s in #133
* Update WMT17 recipe with working tokenization transforms examples by @francoishernandez in #129
* fixes #131, module 'eole.utils' has no attribute 'distributed' error when training multi-gpu by @isanvicente in #132
* add estimator in greedy inference by @vince62s in #135
* Some QOL config/saving improvements by @francoishernandez in #134
* fix #136. Updated eole/bin/model/average_models.py to work with safetensors model format. by @isanvicente in #137
* fix head dim in rope by @vince62s in #140
* fix autocast at scoring when doing AMP by @vince62s in #141
* Some minor fixes by @francoishernandez in #143
* fix lora lm head by @vince62s in #142
* fix missing pad change by @vince62s in #148
* flash_attn_func does not support padding mask maybe we need to drop a… by @vince62s in #149
* fix maybe_retranslate when number of newline does not match by @vince62s in #150
* Supporting HF tokenizers by @francoishernandez in #122
* Model Validator Recipe by @francoishernandez in #146
* apply bytefallback at detok (onmt_tokenize with sentencepiece) by @vince62s in #155
* patch eos_token_id list handling by @francoishernandez in #158
* Compile and Ctranslate2 support by @vince62s in #161
* Move predict config update from model loading to config validation by @francoishernandez in #163
* EuroLLM Gradio (web based) translator 35 languages to 35 languages by @vince62s in #164

## 0.0.2

* Refactor position encoding configuration by @vince62s in #60
* fix update vocab by @vince62s in #63
* bfloat16 support, and an attempt at homogenizing model_dtype & precision by @francoishernandez in #54
* Fix prefix and suffix transforms - avoid adding empty suffix or prefix by @sersh88 in #57
* fix the incorrect dockerimages in the ReadMe by @aaaallleen in #68
* Remove unnecessary optim in convert_HF by @francoishernandez in #71
* Add onmt_config converter to facilitate switch by @francoishernandez in #69
* Update some FAQ sections by @francoishernandez in #74
* Added TER and BLEU for early stopping by @aaaallleen in #73
* [fix] fix normalize and clean transforms config management by @francoishernandez in #87
* [docs] Fix quickstart config and command by @francoishernandez in #90
* add head_dim setting when diff from hidden // heads by @vince62s in #78
* Some MHA and RoPE refactoring, llama-3.1 rope_scaling by @francoishernandez in #91
* Fixed variable referenced before assignment when position_embeddings is None error by @dameikle in #95
* Send src_pad_mask and tgt_pad_mask to decoder in _align_forward by @dameikle in #96
* Fixdistrib by @vince62s in #100
* fix added tokens by @vince62s in #101
* Support mapped tokens eg: <im_start> ==> ⦅im_start⦆in inference.yaml … by @vince62s in #102
* add wmt22 recipes with TowerInstruct and Llama3.1 LLMs by @vince62s in #103
* Remove duplicate sentencepiece requirement by @francoishernandez in #104
* [patch] Adapt some warning behaviours for reduced verbosity by @francoishernandez in #105
* [patch] Update precision to compute_dtype in forgotten places by @francoishernandez in #106
* Inference server, lots of related changes by @francoishernandez in #42

Full Changelog: https://github.com/eole-nlp/eole/compare/0.0.1...0.0.2

## 0.0.1

* mlp refact by @vince62s in #1
* fix llama3 and parallel_residual by @vince62s in #4
* fixed mismatch between mask and batch dimensions by @l-k-11235 in #6
* simplify LayerNorm access as a constant by @vince62s in #7
* Fix the checkpoint directory cleaning by @l-k-11235 in #10
* Modify default model config behaviour by @francoishernandez in #8
* rename num_kv remove multiquery by @vince62s in #12
* fix mmlu config by @vince62s in #13
* Fix the tokenizer saving in the HF converter by @l-k-11235 in #14
* remove unsused average attn by @vince62s in #15
* MHA refac: rope without complex operations + query only as input of the forward by @vince62s in #20
* Revert "MHA refac: rope without complex operations + query only as input of the forward" by @vince62s in #22
* missing removal of average attn by @vince62s in #23
* config.models.BaseModelConfig._override_values updates everything once by @francoishernandez in #24
* [fix] Patch lora bin to dump json config by @francoishernandez in #28
* review flash/sdpa arg by @vince62s in #25
* fix missing layers names by @vince62s in #30
* Split MHA by @vince62s in #29
* Resize the key_pad_mask by @l-k-11235 in #36
* [patch] upgrade docusaurus deps, fix build script by @francoishernandez in #37
* Add gpt2 converter, hellaswag eval tool, misc fixes by @francoishernandez in #38
* Forgot hellaswag.py tool in #38 by @francoishernandez in #39
* estim lambda scheduler by @vince62s in #40
* Add support for XLM-Roberta-XL (and XXL) conversion by @vince62s in #41
* Some fixes, get rid of data_task, homogenize model_task to model_type by @francoishernandez in #43
* Some improvements to config.json readability by @francoishernandez in #44
* [docs] Github Actions workflow to facilitate docs deployment by @francoishernandez in #47
* [fix] Allow to build_vocab with full train config, patch vocab validation by @francoishernandez in #49
* Enable PyPI release workflow by @francoishernandez in #50
* [fix] Fix paths in wiki_103 recipe, add pyarrow opt requirement by @francoishernandez in #51
* Estim first token instead of average by @vince62s in #46
* Add Recipe to train a cometkiwi-like encoder model (which can be used to score sentence pairs) by @vince62s in #53
* Simplify init files, remove some unused code by @francoishernandez in #52

Full Changelog: https://github.com/eole-nlp/eole/commits/0.0.1rc1