Support custom tokenization #7

Open
obo opened this issue Jan 18, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@obo
Contributor

obo commented Jan 18, 2021

Anyone who would like SLTev to support custom tokenizers (e.g. via --tokenizer=...), please discuss here.
Let's add only features people need.
Pull requests are also welcome.

@obo obo added the enhancement New feature or request label Jan 18, 2021
@mohammad2928
Collaborator

It is a good idea. There are two approaches to supporting different tokenizers.
First approach: we can build several tokenizers into SLTev and identify them by number or name.
Second approach: we can allow users to supply a file that contains a tokenization function.
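
To make the two approaches concrete, here is a rough Python sketch of how they could sit behind a single `--tokenizer=...` value. This is not SLTev's actual code: the tokenizer names (`whitespace`, `word_punct`), the expected function name `tokenize`, and the helpers `load_tokenizer_from_file` and `resolve_tokenizer` are all hypothetical, chosen only for illustration.

```python
import importlib.util
import re
from typing import Callable, Dict, List

Tokenizer = Callable[[str], List[str]]

# Approach 1: built-in tokenizers selected by name.
# The names and the tokenizers themselves are illustrative assumptions.
BUILTIN_TOKENIZERS: Dict[str, Tokenizer] = {
    "whitespace": lambda text: text.split(),
    "word_punct": lambda text: re.findall(r"\w+|[^\w\s]", text),
}


def load_tokenizer_from_file(path: str, func_name: str = "tokenize") -> Tokenizer:
    """Approach 2: import a user-supplied .py file and return its tokenizer.

    Assumes the file defines a callable named `func_name` that maps a
    string to a list of tokens.
    """
    spec = importlib.util.spec_from_file_location("user_tokenizer", path)
    if spec is None or spec.loader is None:
        raise ValueError(f"cannot load a Python module from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the user's file
    func = getattr(module, func_name, None)
    if not callable(func):
        raise ValueError(f"{path!r} does not define a callable {func_name!r}")
    return func


def resolve_tokenizer(value: str) -> Tokenizer:
    """Map the option value to a tokenizer: a built-in name or a .py path."""
    if value in BUILTIN_TOKENIZERS:
        return BUILTIN_TOKENIZERS[value]
    if value.endswith(".py"):
        return load_tokenizer_from_file(value)
    raise ValueError(f"unknown tokenizer {value!r}")
```

With this layout, `resolve_tokenizer("whitespace")` would pick a built-in tokenizer, while `resolve_tokenizer("my_tok.py")` would load `tokenize` from the user's file. One trade-off to keep in mind: the second approach executes arbitrary user code, so it is more flexible but should only be used with trusted files.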
