Implement autosuggestions generation in get_data function #462

Closed

Collins-Webdev
Contributor

This commit integrates the autosuggestions functionality from process_wiki.py
into the get_data function in get.py. Key changes include:

  1. Import gen_autosuggestions function from scribe_data.wikipedia.process_wiki
  2. Add new conditional block to handle 'autosuggestions' data type
  3. Implement autosuggestions generation logic for specified languages
  4. Add placeholder load_text_corpus function for future implementation

The autosuggestions block now:

  • Iterates through specified languages
  • Loads text corpus (placeholder function to be implemented)
  • Calls gen_autosuggestions with appropriate parameters
  • Sets update_local_data=True to save results
  • Uses interactive mode for verbose output
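
The per-language loop described above could be sketched roughly as follows. This is a minimal illustration, not the code from the branch: `generate_autosuggestions` and `fake_gen_autosuggestions` are hypothetical names, the generator is passed in as `gen_fn` so the example runs without `scribe_data` installed, and the `language` and `verbose` keyword names are assumptions (only `update_local_data=True` is stated in the description).

```python
from typing import Callable, Dict, List


def load_text_corpus(language: str) -> List[str]:
    """Hypothetical placeholder: returns an empty corpus until real loading exists."""
    return []


def generate_autosuggestions(
    languages: List[str],
    gen_fn: Callable[..., Dict[str, List[str]]],
    interactive: bool = False,
) -> Dict[str, Dict[str, List[str]]]:
    """Call gen_fn once per language and collect the results keyed by language."""
    results: Dict[str, Dict[str, List[str]]] = {}
    for language in languages:
        text_corpus = load_text_corpus(language)
        results[language] = gen_fn(
            text_corpus,
            language=language,        # assumed keyword name
            update_local_data=True,   # save results, as the description states
            verbose=interactive,      # assumed keyword name for verbose output
        )
    return results


def fake_gen_autosuggestions(text_corpus, language, update_local_data, verbose):
    """Stand-in for the real generator; it only echoes the arguments it received."""
    return {"language": [language], "saved": [str(update_local_data)]}
```

In the actual CLI, `gen_fn` would presumably be the `gen_autosuggestions` function imported from `scribe_data.wikipedia.process_wiki`; its exact signature should be checked against that module before wiring it in.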

This update allows CLI users to generate autosuggestions directly via
the get command, streamlining the data generation process.

Note: load_text_corpus is still a placeholder. It must be implemented to load
the actual text corpus for each language before this feature is functional.

TODO:

  • Implement load_text_corpus function
  • Ensure correct file paths and imports across the project
  • Add error handling for corpus loading and autosuggestions generation
  • Update documentation to reflect new autosuggestions functionality in CLI
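
As one possible starting point for the first and third TODO items, `load_text_corpus` could read one plain-text file per language and fail loudly when the corpus is missing or empty. The directory layout (`<corpus_dir>/<language>.txt`) and the `corpus_dir` parameter are assumptions made for this sketch; the PR does not say where corpora are stored or in what format.

```python
from pathlib import Path
from typing import List


def load_text_corpus(language: str, corpus_dir: Path) -> List[str]:
    """Load a corpus as a list of non-empty, stripped lines.

    Assumes one UTF-8 file per language named "<language>.txt" (lowercased)
    inside corpus_dir. This layout is hypothetical.
    """
    corpus_path = corpus_dir / f"{language.lower()}.txt"
    try:
        text = corpus_path.read_text(encoding="utf-8")
    except FileNotFoundError as err:
        # Re-raise with a message that names the language and the path tried.
        raise FileNotFoundError(
            f"No text corpus found for {language} at {corpus_path}"
        ) from err

    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError(f"Text corpus for {language} at {corpus_path} is empty")
    return lines
```

Raising instead of returning an empty list keeps a missing corpus from silently producing empty autosuggestions; the calling loop can catch these exceptions per language and report which ones were skipped.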

- Enhanced noun query to include definite and indefinite forms
- Updated proper noun query with definite and vocative forms
- Expanded verb query to cover past simple, present continuous, future tense, and imperative forms
- Added comments and FILTER options for both Latin and Arabic script variants
- Improved overall query structure and readability

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

  • The linting and formatting workflow within the PR checks do not indicate new errors in the files changed

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

@andrewtavis
Member

Hey @Collins-Webdev: Current version of this branch doesn't have anything to do with autosuggestions and the included queries already exist, so I'll go ahead and close this :)
