Language detection method in CLI #799

Closed
osma opened this issue Sep 17, 2024 · 0 comments · Fixed by #801
osma commented Sep 17, 2024

Issue #631 proposed adding a language detection method to both the REST API and the CLI. The REST API part was implemented in PR #659, which closed the issue, but the CLI part remains to be done. This issue attempts to specify the CLI part.

Here is an example of how it could work:

$ echo "this is some example text in an unknown language" | annif detect-language --languages fi,sv,en
en      1.0000
fi      0.2000
sv      0.2000
-       0.0000

Some notes:

  • the new command is annif detect-language
  • the candidate languages are given as a comma-separated list in the --languages parameter
  • the output is TSV (language code and score)
  • the results are ordered by decreasing score (best guess first)
  • - means unknown language (same as null value in the REST API)
  • the number of candidate languages is not limited (unlike the REST API method, where the limit is 5 due to memory usage concerns)
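
To make the notes above concrete, here is a rough Python sketch of the input parsing and output formatting they describe. It is only an illustration under assumptions: `parse_languages` and `format_detection_results` are hypothetical helper names, not existing Annif functions, and the detection step itself is left out. The scores are taken as a plain mapping from language code (or `None` for unknown) to a float.

```python
from typing import Mapping, Optional


def parse_languages(value: str) -> list[str]:
    """Split a comma-separated --languages value into language codes.

    Hypothetical helper; the number of codes is deliberately not limited.
    """
    return [code.strip() for code in value.split(",") if code.strip()]


def format_detection_results(scores: Mapping[Optional[str], float]) -> str:
    """Render language scores as TSV lines, best guess first.

    Hypothetical helper; a key of None stands for "unknown language"
    and is printed as "-".
    """
    # Sort by decreasing score. Python's sort is stable, so languages
    # with equal scores keep the order they had in the input mapping.
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    lines = []
    for lang, score in ranked:
        code = "-" if lang is None else lang
        # Tab-separated: language code, then score with four decimals.
        lines.append(f"{code}\t{score:.4f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(parse_languages("fi,sv,en"))
    print(format_detection_results({"fi": 0.2, "sv": 0.2, "en": 1.0, None: 0.0}))
```

Mapping `None` to `-` at formatting time keeps the internal representation the same as the REST API's null value, so only the CLI output layer needs to know about the `-` convention.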
@osma osma closed this as completed in #801 Sep 17, 2024
@juhoinkinen juhoinkinen added this to the 1.2 milestone Sep 18, 2024