Language detection method in REST API & CLI #631

Closed
osma opened this issue Oct 10, 2022 · 0 comments · Fixed by #659

osma commented Oct 10, 2022

Currently, language detection is used only by the language filtering transformation, but there have been requests to make it available to users of the REST API as well, e.g. for selecting an appropriate model.

We could have a method in the REST API for language detection, something like

POST /detect-language

with the parameters text (the input text whose language to detect) and candidates (a list/array of language codes to consider, e.g. ["fi", "sv", "en"]). I'm not sure whether it's better to wrap the parameters in form data or in a JSON object. The return format would be a JSON object containing an array of results, something like this:

{
  "results": [
    {"language": "fi", "score": 0.85},
    {"language": "sv", "score": 0.3},
    {"language": "en", "score": 0.3},
    {"language": null, "score": 0.1}
  ]
}

Here, the scores are arbitrary values between 0.0 and 1.0, and the language null stands for an unknown language. The scores wouldn't necessarily add up to 1 (in the example above they sum to 1.55).
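
To illustrate how a client might use this for model selection, here is a minimal sketch that assumes the JSON-object variant of the request and the response format shown above. The helper names `build_request` and `best_language` and the `threshold` parameter are hypothetical and not part of any existing API; the sketch only handles the payloads and does not call a real endpoint.

```python
import json


def build_request(text, candidates):
    """Wrap the two proposed parameters in a JSON object (one of the two options discussed)."""
    return json.dumps({"text": text, "candidates": candidates})


def best_language(response_body, threshold=0.5):
    """Return the top-scoring language code, or None if it is unknown or too uncertain.

    `threshold` is an assumption of this sketch; the proposal does not define one.
    """
    results = json.loads(response_body)["results"]
    if not results:
        return None
    top = max(results, key=lambda r: r["score"])
    # JSON null is parsed as Python None and stands for "unknown language".
    if top["language"] is None or top["score"] < threshold:
        return None
    return top["language"]


# The example response from the proposal, used here as test data.
sample_response = """
{
  "results": [
    {"language": "fi", "score": 0.85},
    {"language": "sv", "score": 0.3},
    {"language": "en", "score": 0.3},
    {"language": null, "score": 0.1}
  ]
}
"""

print(build_request("Tämä on suomenkielinen lause.", ["fi", "sv", "en"]))
print(best_language(sample_response))
```

Because the scores need not add up to 1, the sketch compares them only against each other and against a fixed cutoff rather than treating them as probabilities.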

There could also be a similar CLI command for symmetry.
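
A rough sketch of what such a command could look like, written with the standard library's `argparse` so that it is self-contained. Everything here is hypothetical: the command name, the argument order, and especially `detect_language`, which is a toy stopword counter standing in for a real detector. It prints results in the same JSON shape as the proposed REST response, without the null entry.

```python
import argparse
import json

# Toy stopword lists, purely illustrative; a real detector would replace this.
STOPWORDS = {
    "fi": {"ja", "on", "ei", "että", "se"},
    "sv": {"och", "är", "det", "att", "en"},
    "en": {"the", "and", "is", "of", "to"},
}


def detect_language(text, candidates):
    """Score each candidate by the fraction of tokens found in its stopword list."""
    tokens = text.lower().split()
    results = []
    for lang in candidates:
        hits = sum(1 for t in tokens if t in STOPWORDS.get(lang, set()))
        score = hits / len(tokens) if tokens else 0.0
        results.append({"language": lang, "score": round(score, 2)})
    # Highest score first, mirroring the ordering in the proposed response.
    return sorted(results, key=lambda r: r["score"], reverse=True)


def main(argv=None):
    parser = argparse.ArgumentParser(prog="detect-language")
    parser.add_argument("candidates", help="comma-separated language codes, e.g. fi,sv,en")
    parser.add_argument("text", help="input text whose language to detect")
    args = parser.parse_args(argv)
    results = detect_language(args.text, args.candidates.split(","))
    print(json.dumps({"results": results}, ensure_ascii=False))
    return results


# Example invocation with explicit arguments instead of sys.argv.
main(["fi,sv,en", "the cat is on the mat"])
```

Passing the arguments as a list to `main` keeps the sketch easy to test; a real command would read them from the command line and could also accept the text from standard input.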
