
#19 Support multiple translation processing at once with persistently loaded model #28

Open · wants to merge 1 commit into main
Conversation

Yuvrajsinghspd09

Changes made:

  1. Ensured the translation model stays loaded in memory for faster processing instead of reloading it every time.

  2. Removed the translate_context function and moved its logic into the Service class for better structure.

  3. Added a thread lock (self._lock) in the Service class to safely handle multiple translation requests at the same time.

  4. Simplified config handling and removed redundant options like "hf_model_path" to focus only on necessary keys.

  5. Made logging more dynamic by supporting both string and numeric log levels for easier debugging.

Please let me know if any additional improvements are needed. Looking forward to feedback to refine this further.
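Roughly, the first and third changes combine like this (a minimal sketch; the real Service in this repo loads a ctranslate2 model and tokenizer, which are stubbed out here with a trivial stand-in):

```python
import threading

class Service:
    """Keeps the translation model loaded for the service's lifetime."""

    def __init__(self, config: dict):
        self._lock = threading.Lock()
        # Loaded once at startup instead of per request (change 1).
        self.translator = self._load_model(config)

    def _load_model(self, config: dict):
        # Stand-in for the real ctranslate2/tokenizer loading code.
        return lambda text: text.upper()

    def translate(self, text: str) -> str:
        # The lock serializes access to the shared model (change 3),
        # so concurrent requests cannot interleave on its state.
        with self._lock:
            return self.translator(text)
```

For example, `Service({}).translate("hello")` returns `"HELLO"` with the stub model; only the loading and translation internals would differ in the real class.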

@kyteinsky
Contributor

Hello,
Thanks for the work!

It would, however, be better to first discuss the features or fixes in an issue, so our views align and no effort goes to waste.

Ensured the translation model stays loaded in memory for faster processing instead of reloading it every time.

sounds good, I don't have a preference really.

Removed translate_context function and moved its logic into the Service class for better structure.

Context managers make code more readable imo, with a clear separation of logic. This needs to be reverted.
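For context, this is the kind of separation a context manager provides (a hypothetical sketch, not the actual translate_context from this repo; the "resources" dict stands in for whatever the real setup produces):

```python
from contextlib import contextmanager

@contextmanager
def translate_context(config: dict):
    # Setup lives here: loading config, model, tokenizer, ...
    resources = {"model": "loaded"}
    try:
        # The caller's translation logic runs inside the `with` block,
        # cleanly separated from setup and teardown.
        yield resources
    finally:
        # Teardown always runs, even if the translation raises.
        resources.clear()

with translate_context({}) as res:
    result = res["model"]  # "loaded"
```

The setup and teardown stay in one place, while the body of the `with` block contains only the translation logic.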

Added a thread lock (self._lock) in the Service class to handle multiple translation requests at the same time safely.

If there is a lock, it is essentially the same as what we have now: processing one translation request at a time.
Look up their docs to see how ctranslate2 can handle multiple requests at a time:
https://opennmt.net/CTranslate2/parallel.html
https://opennmt.net/CTranslate2/python/ctranslate2.Translator.html#ctranslate2.Translator.translate_batch
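One way to let multiple requests proceed together, per those docs, is to collect pending texts and hand them to a single batched call. A rough sketch of the batching side, with the translator call stubbed so the example is self-contained (the real call would be ctranslate2.Translator.translate_batch on tokenized inputs, with inter_threads/intra_threads configured for parallelism):

```python
import queue

def batch_worker(requests: "queue.Queue", translate_batch, batch_size: int = 8):
    """Drain pending (text, callback) requests into one batch and
    translate them together instead of one lock-serialized call each."""
    while True:
        first = requests.get()
        if first is None:  # shutdown sentinel
            break
        batch = [first]
        # Grab whatever else is already waiting, up to batch_size.
        while len(batch) < batch_size:
            try:
                item = requests.get_nowait()
            except queue.Empty:
                break
            if item is None:
                requests.put(None)  # keep the sentinel for the outer loop
                break
            batch.append(item)
        texts = [text for text, _ in batch]
        results = translate_batch(texts)  # one batched call instead of N serial ones
        for (_, callback), translated in zip(batch, results):
            callback(translated)
```

This would typically run in the task-fetch thread; the exact wiring into this repo's Service is an open design question for the issue discussion.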

Simplified Config Handling and removed redundant options like "hf_model_path" to focus only on necessary keys

this key is required to know the exact path at which the model was downloaded.
Similarly, the ModelConfig dict is required to know at which point the download of the model completed and the path was set.
"model_name" takes the name of the model on Hugging Face, like https://huggingface.co/Nextcloud-AI/madlad400-3b-mt-ct2-int8_float32, which is the default.
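To illustrate how these keys interact (a sketch assembled from the snippets in this PR; the placeholder path is hypothetical and the real config structure may differ):

```python
# Illustrative shape of the loader config discussed here.
config_copy = {
    "loader": {
        # Hugging Face name of the model (the default per this review):
        "model_name": "Nextcloud-AI/madlad400-3b-mt-ct2-int8_float32",
    }
}

# Once the download completes, the loader records where the model landed,
# so later code knows the exact local path ("hf_model_path").
config_copy["loader"]["hf_model_path"] = "<local download path>"

# The removed snippet then maps it onto the generic "model_path" key:
if "hf_model_path" in config_copy["loader"]:
    config_copy["loader"]["model_path"] = config_copy["loader"].pop("hf_model_path")
```

Dropping "hf_model_path" therefore loses the record of where the download actually completed.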

Made logging more dynamic by supporting both string and numeric log levels for easier debugging.

nice

also, if possible, please try to make separate commits of the changes so it is easier to change for you, and easier to review for us.


extra file

Comment on lines +41 to 48
log_level = config.get("log_level", logging.INFO)
if isinstance(log_level, str):
    log_level = getattr(logging, log_level.upper(), logging.INFO)
logger.setLevel(log_level)

ctranslate2.set_log_level(log_level)
logger.setLevel(log_level)
self.load_config(config)

self.load_config(config) should be called before we access any config keys.
Here it is fine, since log_level is not changed in the load_config function, but if at some point we do change it there, this ordering breaks the expectation that the config has already been updated.

model_name = config_copy["loader"].pop("model_name")
# Assuming "model_path" is dynamically resolved
# For example, you can set it to a default local directory
resolved_model_path = os.path.join("models", model_name.replace("/", "_"))

cool, but let's make these changes only to model_path and not model_name.
We could also rename the key to "hf_model_name" if that signals the key is exclusively for Hugging Face names.

And let's use the "persistent_storage"/models path for this purpose. The persistent_storage function can be imported from nc_py_api: https://github.com/nextcloud/llm2/blob/7e377cc4d8a5e59c4fc3740bc0707032eacdfdac/lib/main.py#L17
This storage does not get deleted on app updates.

Also, it would be good to leave model_paths untouched when they are already prefixed with "persistent_storage"/models or with / (absolute paths).
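The rule suggested here could be sketched as follows (a minimal illustration; storage_root stands in for the result of the persistent_storage helper from nc_py_api mentioned above):

```python
import os

def resolve_model_path(model_path: str, storage_root: str) -> str:
    """Relocate a model path under <storage_root>/models unless it is
    absolute or already lives under that directory."""
    models_dir = os.path.join(storage_root, "models")
    if os.path.isabs(model_path) or model_path.startswith(models_dir):
        # Already absolute or already in persistent storage: leave as-is.
        return model_path
    return os.path.join(models_dir, model_path)
```

So a bare name gets relocated into the persistent models directory, while absolute paths pass through unchanged and survive app updates.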

Comment on lines -85 to -86
if "hf_model_path" in config_copy["loader"]:
    config_copy["loader"]["model_path"] = config_copy["loader"].pop("hf_model_path")

this needs to be added back

Comment on lines +115 to +119
def close(self):
    # Cleanup resources during service shutdown.
    del self.tokenizer
    del self.translator
    logger.info("Service resources released.")

When an ex-app is disabled, the container is shut down, so this cleanup is not required.

Comment on lines +22 to +26
#Instead of "from Service import Service, TranslateRequest" used "from lib.Service import Service, TranslateRequest"
from lib.Service import Service, TranslateRequest
from lib.util import load_config_file

import concurrent.futures

please remove those comments from the code and add them to the extended commit messages.
And sort the imports; isort is a program that can do it automatically: https://pypi.org/project/isort/

Comment on lines -184 to +201
worker = threading.Thread(target=task_fetch_thread, args=(service,))
worker.start()
if not hasattr(service, "_worker"):
    service._worker = threading.Thread(target=task_fetch_thread, args=(service,))
    service._worker.start()

This is not really required, since this if block wouldn't get executed multiple times, and, as said above, when an app disable is triggered in app_api, the container/program shuts down anyway.
