Releases: google/budoux
v0.3.0
What's Changed
Faster model training
We made model training faster by applying JAX's JIT compilation, pooling file writes, etc.
- Faster training data encoding by @tushuhei in #89
- Add out_span option for better GPU utilization by @tushuhei in #90
- Apply JAX JIT compiling for faster training by @tushuhei in #95
- Check in updated Simplified Chinese model by @tushuhei in #99
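As a rough illustration of what "pooling file writes" can mean, the sketch below collects encoded rows in memory and writes them in batches instead of issuing one write per row. It is not the project's actual encoding script; the function name `write_pooled` and the `pool_size` parameter are hypothetical and chosen only for this example.

```python
import io


def write_pooled(rows, out, pool_size=1000):
    """Write rows to `out` in batches; return the number of write calls made.

    `rows` is any iterable of strings without trailing newlines.
    `pool_size` (hypothetical) is how many rows are buffered per write.
    """
    pool = []
    writes = 0
    for row in rows:
        pool.append(row + "\n")
        if len(pool) >= pool_size:
            # One write call for the whole batch instead of one per row.
            out.write("".join(pool))
            writes += 1
            pool.clear()
    if pool:
        # Flush whatever is left after the loop ends.
        out.write("".join(pool))
        writes += 1
    return writes


if __name__ == "__main__":
    buffer = io.StringIO()
    calls = write_pooled(["a", "b", "c"], buffer, pool_size=2)
    print(calls, repr(buffer.getvalue()))
```

A larger `pool_size` trades memory for fewer write calls, which is usually the point of pooling when the output has many short lines.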
Smaller models
We made models smaller by removing less important features, disabling ASCII encoding, etc.
- Remove Unicode Block features by @tushuhei in #86
- Disable ASCII encoding when building the model file by @tushuhei in #98
- Output compact model by @tushuhei in #100
Misc
- encode_data: write without break line join by @tushuhei in #91
- Update unit tests for the encoding script by @tushuhei in #92
- Add more granularity in weight outputs by @tushuhei in #93
- Remove tar module dependency by @tushuhei in #96
Full Changelog: v0.2.1...v0.3.0
v0.2.1
What's Changed
- Fix mypy issue by @tushuhei in #83
- Add missing export for HTMLProcessor in index.ts by @Harukaichii in #82
- Remove P features from JS module by @tushuhei in #85
- Nit fix for mypy issue by @tushuhei in #87
- Version up to 0.2.1 by @tushuhei in #88
New Contributors
- @Harukaichii made their first contribution in #82
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- Fix a mathematical bug by @tushuhei in #78
- Add .js extension for better module portability by @tushuhei in #79
- Remove the P features by @tushuhei in #80
- Version up to 0.2.0 by @tushuhei in #81
- `thres` won't be available in the `parse` method and the CLI options any more. Please fix your program if it's relying on the `thres` parameter.
- The parsing logic is different to older versions due to the fix for a mathematical error and removal of some features around past results. See #78 and #80 for details.
Full Changelog: v0.1.2...v0.2.0
v0.1.2
v0.1.1
What's Changed
- Add isort and pytest to dev dependencies by @eggplants in #56
- `--lang` option by @eggplants in #55
- Add JavaScript `HTMLProcessor` class by @kojiishi in #58
- Bump async from 2.6.3 to 2.6.4 in /demo by @dependabot in #59
- Faster encode data by @tushuhei in #61
- Faster preprocess by @tushuhei in #62
- Change `applyElement` to call `HTMLProcessor` by @kojiishi in #60
- Normalize weights not to overflow by @tushuhei in #63
- Install Jax for GPU acceleration by @tushuhei in #64
- Add py.typed for static analysis with mypy by @ryu22e in #65
- Update gts to 4.0.0 by @tushuhei in #69
- Fix Mypy GitHub Action by @tushuhei in #70
- Output precision and recall during training by @tushuhei in #71
- Upgrade dependencies by @tushuhei in #72
- Version up to 0.1.1 by @tushuhei in #73
New Contributors
Full Changelog: v0.1.0...v0.1.1
v0.1.0
- Simplified Chinese support added.
- The parser now starts the segmentation process from the first character of the input sentence. The old parser started from the third character, assuming that the first phrase would be longer than 3 characters.
- While this old assumption holds in many cases in Japanese, it does not apply to Chinese. We removed it with the introduction of the Simplified Chinese model.
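The effect of the starting position can be sketched with a toy segmenter. This is not the BudouX parser: `segment`, `is_break_before`, and `first_candidate` are hypothetical names, the break rule is a placeholder for the model's real decision, and the exact index the old parser used is an assumption made for illustration.

```python
def segment(sentence, is_break_before, first_candidate=1):
    """Split `sentence` wherever is_break_before(sentence, i) is true.

    Only indices i >= first_candidate are considered as break positions,
    so a larger value forces a longer first chunk.
    """
    if not sentence:
        return []
    chunks = []
    start = 0
    for i in range(max(first_candidate, 1), len(sentence)):
        if is_break_before(sentence, i):
            chunks.append(sentence[start:i])
            start = i
    chunks.append(sentence[start:])
    return chunks


def break_before_uppercase(sentence, i):
    """Toy rule standing in for the model: break before uppercase letters."""
    return sentence[i].isupper()


if __name__ == "__main__":
    text = "AbCdeFg"
    # Considering every position lets a short first chunk through.
    print(segment(text, break_before_uppercase, first_candidate=1))
    # Skipping the earliest positions merges the short first chunk away.
    print(segment(text, break_before_uppercase, first_candidate=3))
```

With the toy rule, the first call yields `['Ab', 'Cde', 'Fg']` and the second yields `['AbCde', 'Fg']`, which shows how skipping early positions hides short leading phrases.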
v0.0.4
What's Changed
- Add thres arg to Python CLI by @tushuhei in #32
- Add custom help formatter and shorthand of `--thres` by @eggplants in #33
- Update dependent Node.js packages by @tushuhei in #35
- Update build-demo.yml by @tushuhei in #36
- mypy and flake8 by @eggplants in #34
- Add description about CLI and deploy markdownlint CI by @eggplants in #37
- Update style-check.yml by @tushuhei in #38
- Specify python required version by @eggplants in #40
- Add chunk-size option to reduce memory for model training by @tamanyan in #41
- Bump follow-redirects from 1.14.7 to 1.14.8 in /demo by @dependabot in #44
- Add thres parameter to Node.js CLI by @tushuhei in #46
- Add a license header to .markdownlint.yaml by @tushuhei in #47
- Take split_dataset out from fit by @tushuhei in #42
- Dependencies version up by @tushuhei in #50
New Contributors
- @tamanyan made their first contribution in #41
- @dependabot made their first contribution in #44
Full Changelog: v0.0.3...v0.0.4
v0.0.3
Featured changes
- Node.js CLI by @junseinagao
- CI improvements by @eggplants
- BudouX Web Components by @tushuhei
What's Changed
- Fix Typos by @hiro0218 in #22
- Add test CI for NodeJS by @eggplants in #19
- Add badges (PyPI, npm) by @eggplants in #21
- Fix version data by @eggplants in #23
- Add cli test by @eggplants in #16
- Add PR trigger to CI by @eggplants in #24
- Add `npm link` to test CI by @eggplants in #25
- Export the parser threshold value by @tushuhei in #26
- Update .prettierrc.js by @tushuhei in #27
- Implement a simple node.js cli tool. by @junseinagao in #20
- Refactor tests of node.js cli by @junseinagao in #28
- Add web components by @tushuhei in #29
- Version bump by @tushuhei in #30
New Contributors
- @hiro0218 made their first contribution in #22
- @junseinagao made their first contribution in #20
Full Changelog: v0.0.2...v0.0.3
v0.0.2
What's Changed
- CLI by @eggplants in #6
- Fix Python code style by @tushuhei in #11
- Fix type hints to work with older Python versions by @tushuhei in #13
- add: unittest CI for Python by @eggplants in #14
- Fix: encoding error in windows by @eggplants in #15
- Use native unittest instead of pytest by @tushuhei in #17
- 0.0.2 release by @tushuhei in #18
New Contributors 🎉
- @eggplants made their first contribution in #6
Full Changelog: v0.0.1...v0.0.2
First release
Update style-check.yml Include `scripts` directory for style check.