Quid solved #428
Overview

This PR addresses issue scribe-org#423 by implementing error handling for missing QID values in the `language_metadata.json` file. The changes focus on enhancing the robustness of the `cli_utils.py` module, particularly in scenarios where language entries lack a QID.

Changes

1. Modified the `language_to_qid` dictionary creation process in `cli_utils.py`:
   - Implemented a try-except block to catch potential KeyErrors when accessing QID values.
   - Added a warning message for languages with missing QIDs.
2. Updated the `validate_language_and_data_type` function:
   - Enhanced error handling to accommodate languages without QIDs.
   - Improved the validation process to prevent crashes due to missing QID data.
3. Refactored related code sections for consistency and maintainability.

Technical Details

- Utilized the `dict.get()` method with a default value of `None` to safely access potentially missing QID keys.
- Implemented a logging mechanism to warn about missing QIDs without halting execution.
- Adjusted the validation logic to gracefully handle languages with missing QIDs, allowing the CLI to continue functioning for valid entries.

Testing

- Conducted thorough testing by removing QIDs from various language entries in `language_metadata.json`.
- Verified that the CLI continues to function correctly for languages with valid QIDs.
- Confirmed that appropriate warnings are logged for languages with missing QIDs.
- Tested edge cases, including scenarios with multiple missing QIDs and mixed valid/invalid entries.

Impact

These changes significantly improve the resilience of the Scribe-Data CLI, ensuring it can operate effectively even when faced with incomplete language metadata. This enhancement aligns with our goal of creating a more robust and user-friendly tool.

Next Steps

- Consider implementing a more comprehensive logging system for better traceability of warnings and errors.
- Explore the possibility of adding unit tests specifically for QID error handling scenarios.
- Evaluate the need for a data validation step during the metadata file loading process to preemptively identify and report missing or malformed entries.
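For readers following along, the `dict.get()` plus warning approach listed under Technical Details could look roughly like the sketch below. It is not the code from this PR: the entry shape (a list of dicts with `language` and `qid` keys) follows the diff under review, while `build_qid_map`, the logger name, and the `Examplish` entry are hypothetical names made up for illustration.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("qid_sketch")


def build_qid_map(entries):
    """Map lowercased language names to QIDs, skipping entries without one."""
    language_to_qid = {}
    for entry in entries:
        name = entry["language"]
        # dict.get returns None instead of raising KeyError when "qid" is absent.
        qid = entry.get("qid")
        if qid is None:
            # Warn and move on so valid entries are still processed.
            logger.warning("'qid' missing for language %s", name)
            continue
        language_to_qid[name.lower()] = qid
    return language_to_qid


# Illustrative sample data (assumed shape, not the real metadata file).
sample_entries = [
    {"language": "English", "qid": "Q1860"},
    {"language": "Examplish"},  # made-up entry with no QID
]
qid_map = build_qid_map(sample_entries)
```

Using the `logging` module rather than `print` keeps the warning out of standard output, which matters if the CLI output is piped elsewhere; whether the project prefers that is a maintainer decision.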
Thank you for the pull request! The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist
…with sub-language support
src/scribe_data/cli/cli_utils.py
Outdated
for lang in language_metadata["languages"]:
    lang_lower = lang["language"].lower()
    qid = lang.get("qid")

    if qid is None:
        print(f"Warning: 'qid' missing for language {lang['language']}")
    else:
        language_map[lang_lower] = lang
        language_to_qid[lang_lower] = qid
Thank you for your work on this, @Collins-Webdev! 😊
I believe we don't need to change the previous method of retrieving languages using `language_metadata.items()`. The latest version of `language_metadata.json` no longer includes the `languages` key, so trying to access `language_metadata["languages"]` will cause an error.
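To make the point in the comment above concrete, here is a small sketch. It assumes a metadata layout in which each language is a top-level key; the two sample entries and the `wrapper_key_present` flag are illustrative and are not copied from the real `language_metadata.json`.

```python
# Assumed layout: languages are top-level keys, with no "languages" wrapper.
language_metadata = {
    "english": {"iso": "en", "qid": "Q1860"},
    "french": {"iso": "fr", "qid": "Q150"},
}

# Indexing a key that no longer exists raises KeyError.
try:
    language_metadata["languages"]
    wrapper_key_present = True
except KeyError:
    wrapper_key_present = False

# Iterating with .items() works directly on the top-level keys.
language_to_qid = {
    name.lower(): props["qid"] for name, props in language_metadata.items()
}
```

With this layout the lookup table is built without touching a `languages` key at all, which is why the earlier `.items()` loop can stay as it was.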
I also think we should check for `sub_languages` first before attempting to retrieve the language `qid`. Languages like Norwegian and Chinese, which have sub-languages, naturally don't have their own `qid`.
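One way to read the suggestion above is sketched below. This is an assumption about the intended logic, not the code that was merged: `collect_qids` is a hypothetical helper, the nested `sub_languages` shape is assumed, and the sample QIDs and the `examplish` entry are illustrative.

```python
def collect_qids(language_metadata):
    """Return (language_to_qid, missing), checking sub_languages before qid."""
    language_to_qid = {}
    missing = []
    for name, props in language_metadata.items():
        sub_languages = props.get("sub_languages")
        if sub_languages:
            # A parent such as Norwegian has no QID of its own, so only its
            # sub-languages are inspected and no warning is due for the parent.
            for sub_name, sub_props in sub_languages.items():
                sub_qid = sub_props.get("qid")
                if sub_qid is None:
                    missing.append(f"{name}/{sub_name}")
                else:
                    language_to_qid[sub_name.lower()] = sub_qid
            continue
        qid = props.get("qid")
        if qid is None:
            missing.append(name)
        else:
            language_to_qid[name.lower()] = qid
    return language_to_qid, missing


# Illustrative sample data (assumed shape and values).
sample_metadata = {
    "english": {"iso": "en", "qid": "Q1860"},
    "norwegian": {
        "sub_languages": {
            "bokmål": {"iso": "nb", "qid": "Q25167"},
            "nynorsk": {"iso": "nn", "qid": "Q25164"},
        }
    },
    "examplish": {"iso": "xx"},  # made-up entry with neither qid nor sub_languages
}
qids, missing_qids = collect_qids(sample_metadata)
```

Returning the list of missing entries instead of printing inside the loop leaves it to the caller to decide how to report them.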
- Remove assumption of 'languages' key in language_metadata
- Handle sub-languages correctly
- Improve warning messages for missing qids
5fdd991 to 8f4287d
Hello @catreedle 👋🏼,
Please review the changes and let me know if any further modifications are needed.
src/scribe_data/cli/cli_utils.py
Outdated
@@ -27,8 +27,6 @@

from scribe_data.utils import DEFAULT_JSON_EXPORT_DIR

# MARK: CLI Variables
Please don't remove marks from the code in the future @Collins-Webdev as these are meant to make the files more manageable to navigate.
50d4c30 also added back in some single quotes that had been removed, which was causing the tests to fail :)
Thanks for the changes here, @Collins-Webdev! And also appreciate the review, @catreedle! Made mine much easier 😊 Would be great to get continued support on checking PRs!