Define Processing Flow #4

Open
jhpierce opened this issue Mar 28, 2022 · 0 comments
jhpierce commented Mar 28, 2022

Repository Processing Steps

Pre-Processing

  1. Clone the given repository.
  2. Check out the starting branch, if provided. If not, create a feature branch and check it out.
  3. Write instructions for post-processing to disk, if there are any CCMs.
  4. Queue the repository for processing (Python Queue data structure).
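
As a rough sketch of steps 1 and 2 above, the git invocations could be planned as plain argument lists before anything is executed. The function name `plan_git_commands`, its parameters, and the default feature branch name are hypothetical, not something decided in this issue.

```python
from typing import List, Optional


def plan_git_commands(
    repo_url: str,
    dest: str,
    starting_branch: Optional[str] = None,
    feature_branch: str = "changeset/feature",  # hypothetical default name
) -> List[List[str]]:
    """Return the git commands for cloning and selecting a branch."""
    # Step 1: clone the given repository into dest.
    commands = [["git", "clone", repo_url, dest]]
    if starting_branch:
        # Step 2a: a starting branch was provided, so check it out.
        commands.append(["git", "-C", dest, "checkout", starting_branch])
    else:
        # Step 2b: no branch provided, so create a feature branch and check it out.
        commands.append(["git", "-C", dest, "checkout", "-b", feature_branch])
    return commands
```

Each list could then be handed to `subprocess.run(cmd, check=True)`. Keeping planning separate from execution makes the branch logic easy to test without touching the network.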

Processing

  1. Retrieve changeset spec
  2. Run Filters, Code Changes
  3. Save results: no-match, match, or failure.

Post-Processing

  1. Write/return results for no-match and failure; comment on and close any existing issues/PRs.
  2. For matches without file changes, create issues or update existing ones.
  3. For matches with file changes, create or update PRs.
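
The three post-processing rules above amount to a small decision table. Here is a minimal sketch of it. The function name and the returned action labels are made up for illustration. Only the three outcome strings come from the list above.

```python
def post_process_action(outcome: str, has_file_changes: bool) -> str:
    """Map a processing outcome to a post-processing action label."""
    if outcome in ("no-match", "failure"):
        # Rule 1: report the result, then comment on and close existing issues/PRs.
        return "report-and-close"
    if outcome == "match":
        # Rule 3 if files changed, otherwise rule 2.
        return "create-or-update-pr" if has_file_changes else "create-or-update-issue"
    raise ValueError(f"unknown outcome: {outcome!r}")
```

Raising on an unknown outcome is a design choice for the sketch: a typo in a status string fails loudly instead of silently skipping a repository.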

Implementation

Overall, the process must use multithreading, since the work is primarily IO-bound, especially pre- and post-processing. I'm picturing Python Queues passing objects between phases. Queues let us define the level of parallelism at each layer, as well as the max queue size, etc. We want this to be lightning fast locally.
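
To make the Queue idea concrete, here is one possible shape for a three-stage threaded pipeline built on the standard library's `queue.Queue` and `threading.Thread`. The names `run_stage` and `run_pipeline`, the default worker counts, and the queue size are all assumptions. Error handling is left out, so a stage function that raises would need extra care in a real implementation.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def run_stage(func, inbox, outbox, workers):
    """Start `workers` threads that apply func to inbox items and feed outbox."""
    def worker():
        while True:
            item = inbox.get()
            if item is _STOP:
                break
            outbox.put(func(item))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(repos, pre, process, post, workers=(4, 2, 4), maxsize=16):
    """Push repos through pre -> process -> post; return results (unordered)."""
    q_pre = queue.Queue(maxsize=maxsize)   # bounded: applies backpressure
    q_proc = queue.Queue(maxsize=maxsize)
    q_post = queue.Queue(maxsize=maxsize)
    results = queue.Queue()                # unbounded, drained at the end

    stages = [
        (pre, q_pre, q_proc, workers[0]),
        (process, q_proc, q_post, workers[1]),
        (post, q_post, results, workers[2]),
    ]
    started = [(run_stage(f, i, o, n), i, n) for f, i, o, n in stages]

    for repo in repos:
        q_pre.put(repo)

    # Shut down stage by stage so no item is dropped: a stage's sentinels are
    # only sent once every upstream worker has finished and been joined.
    for threads, inbox, n in started:
        for _ in range(n):
            inbox.put(_STOP)
        for t in threads:
            t.join()

    out = []
    while not results.empty():
        out.append(results.get())
    return out
```

The per-stage worker counts and `maxsize` are the knobs mentioned above: parallelism per layer and a cap on how much work can pile up between phases.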

What should the objects passed from Pre-Process to Process look like? And from Process to Post-Process? We'll probably want models representing the results of pre-processing, processing, and post-processing.
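
One possible starting point for those models, using dataclasses. Every class and field name here is a guess meant to prompt discussion, not a settled design. Only the three outcome values come from the processing steps above.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import List, Optional


class Outcome(Enum):
    NO_MATCH = "no-match"
    MATCH = "match"
    FAILURE = "failure"


@dataclass
class PreProcessResult:
    """Passed from Pre-Processing to Processing."""
    repo_url: str
    local_path: Path                           # where the clone lives on disk
    branch: str                                # starting or feature branch checked out
    instructions_path: Optional[Path] = None   # post-processing instructions, if written


@dataclass
class ProcessResult:
    """Passed from Processing to Post-Processing."""
    pre: PreProcessResult
    outcome: Outcome
    changed_files: List[str] = field(default_factory=list)
    error: Optional[str] = None                # populated when outcome is FAILURE

    @property
    def has_file_changes(self) -> bool:
        return bool(self.changed_files)


@dataclass
class PostProcessResult:
    """Final record written or returned after Post-Processing."""
    processed: ProcessResult
    action: str                                # e.g. which issue/PR action was taken
    url: Optional[str] = None                  # link to the issue or PR, if any
```

Nesting each result inside the next keeps the full history of a repository in one object as it moves through the queues, at the cost of slightly larger payloads.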
