#19 Support multiple translation processing at once with persistently loaded model #28
base: main
Conversation
Signed-off-by: Yuvrajsinghspd09 <[email protected]>
Hello. It would, however, be better to first discuss the features or fixes to be done in an issue, so that our views align and no effort is wasted.
sounds good, I don't have a preference really.
Context managers make the code more readable IMO, with a clear separation of logic. This needs to be reverted.
If there is a lock, it is essentially the same as what we have now: processing one translation request at a time.
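To illustrate that point, here is a minimal, self-contained sketch (the `Service` name and `_lock` attribute mirror the diff; the `time.sleep` stands in for the actual model call) showing that a single lock processes requests strictly one at a time:

```python
import threading
import time


class Service:
    def __init__(self):
        # One lock shared by all requests: only one thread may translate at a time.
        self._lock = threading.Lock()

    def translate(self, text: str) -> str:
        with self._lock:
            # Stand-in for the actual model call; while one thread holds the
            # lock, every other request waits here.
            time.sleep(1)
            return text.upper()


service = Service()
threads = [threading.Thread(target=service.translate, args=(f"request {i}",)) for i in range(3)]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
# Prints roughly 3.0s: the lock serializes the translation work.
print(f"elapsed: {time.monotonic() - start:.1f}s")
```

With three concurrent callers the elapsed time is roughly the sum of the individual calls, which is the reviewer's point: a single lock around the model call keeps processing sequential.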
this key is required to know the exact path at which the model was downloaded.
Nice. Also, if possible, please try to make separate commits for the changes, so it is easier to change for you and easier to review for us.
extra file
```python
log_level = config.get("log_level", logging.INFO)
if isinstance(log_level, str):
    log_level = getattr(logging, log_level.upper(), logging.INFO)
logger.setLevel(log_level)

ctranslate2.set_log_level(log_level)
logger.setLevel(log_level)
self.load_config(config)
```
self.load_config(config) should be called before we access any config keys.
Here it is fine, since log_level is not changed in the load_config function, but if at some point we do change it there, it breaks the expectation that the config has already been updated.
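A small sketch of the ordering being suggested, under the assumption that load_config only normalizes and stores the config (the real method in this PR does more):

```python
import logging

import ctranslate2

logger = logging.getLogger(__name__)


class Service:
    def __init__(self, config: dict):
        # Apply/normalize the config first, so every later read sees the final values.
        self.load_config(config)

        # Only read config keys after load_config has run.
        log_level = self.config.get("log_level", logging.INFO)
        if isinstance(log_level, str):
            log_level = getattr(logging, log_level.upper(), logging.INFO)
        logger.setLevel(log_level)
        ctranslate2.set_log_level(log_level)

    def load_config(self, config: dict):
        # Placeholder: assumed to copy/normalize the config; the PR's version
        # also resolves model paths and other loader settings.
        self.config = dict(config)
```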
```python
model_name = config_copy["loader"].pop("model_name")
# Assuming "model_path" is dynamically resolved
# For example, you can set it to a default local directory
resolved_model_path = os.path.join("models", model_name.replace("/", "_"))
```
Cool, but let's make these changes only to model_path, not model_name.
We can also rename the key to "hf_model_name" if that signals the key is exclusively for Hugging Face model names.
And let's use the "persistent_storage"/models path for this purpose; the persistent_storage function can be imported from nc_py_api: https://github.com/nextcloud/llm2/blob/7e377cc4d8a5e59c4fc3740bc0707032eacdfdac/lib/main.py#L17
This storage does not get deleted on app updates.
Also, it would be good not to modify model_paths that are already prefixed with "persistent_storage"/models or with / (absolute paths).
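A sketch of that path handling, assuming persistent_storage is imported from nc_py_api's ex_app module as in the linked llm2 main.py (the helper name resolve_model_path is illustrative):

```python
import os

# Assumed import location, following the linked llm2 main.py.
from nc_py_api.ex_app import persistent_storage


def resolve_model_path(model_path: str) -> str:
    """Place relative model paths under "persistent_storage"/models.

    Paths that are absolute, or already inside the persistent models
    directory, are returned unchanged so they are not rewritten twice.
    """
    models_dir = os.path.join(persistent_storage(), "models")
    if os.path.isabs(model_path) or model_path.startswith(models_dir):
        return model_path
    return os.path.join(models_dir, model_path)
```

The loader config could then go through this once, e.g. config["loader"]["model_path"] = resolve_model_path(config["loader"]["model_path"]).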
if "hf_model_path" in config_copy["loader"]: | ||
config_copy["loader"]["model_path"] = config_copy["loader"].pop("hf_model_path") |
this needs to be added back
```python
def close(self):
    # Cleanup resources during service shutdown.
    del self.tokenizer
    del self.translator
    logger.info("Service resources released.")
```
When an ex-app is disabled, the container is shut down, so this cleanup is not required.
```python
# Instead of "from Service import Service, TranslateRequest" used "from lib.Service import Service, TranslateRequest"
from lib.Service import Service, TranslateRequest
from lib.util import load_config_file

import concurrent.futures
```
Please remove those comments from the code and add them to the extended commit messages instead, and sort the imports; isort is a program that can do that automatically: https://pypi.org/project/isort/
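For illustration, this is roughly how the imports above would look after an isort pass with default settings (the comments here are annotations; isort itself only reorders and groups, separating the standard library from the local lib package with a blank line):

```python
# Standard library
import concurrent.futures

# Local package imports (grouped after the standard library by isort)
from lib.Service import Service, TranslateRequest
from lib.util import load_config_file
```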
```python
worker = threading.Thread(target=task_fetch_thread, args=(service,))
worker.start()

if not hasattr(service, "_worker"):
    service._worker = threading.Thread(target=task_fetch_thread, args=(service,))
    service._worker.start()
```
This is not really required, since this if block wouldn't get executed multiple times, and, as said above, when an app disable is triggered in app_api the container/program shuts down.
Changes made:
1. Ensured the translation model stays loaded in memory for faster processing instead of reloading it every time.
2. Removed the translate_context function and moved its logic into the Service class for better structure.
3. Added a thread lock (self._lock) in the Service class to handle multiple translation requests at the same time safely.
4. Simplified config handling and removed redundant options like "hf_model_path" to focus only on the necessary keys.
5. Made logging more dynamic by supporting both string and numeric log levels for easier debugging.

Please let me know if there are any additional improvements needed. Looking forward to any feedback to refine this further.