Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HARMONY-2007: Add is_sequential: true where missing for query-cmr in services.yml and validations for same #698

Merged
merged 4 commits into from
Feb 14, 2025

Conversation

indiejames
Copy link
Contributor

@indiejames indiejames commented Feb 13, 2025

Add is_sequential: true where missing for query-cmr in services.yml and validation for same

Jira Issue ID

HARMONY-2007

Description

This fixes a configuration error where some query-cmr steps did not have sequential: true. This was causing jobs to move from RUNNING_WITH_ERRORS to COMPLETED_WITH_ERRORS prematurely. It also adds a validation for this setting and tests for the validation.

Local Test Steps

  1. Do a deployment configured to point to production CMR/EDL. Or use my sandbox deployment here https://internal-harmony-jn-frontend-13863418.us-west-2.elb.amazonaws.com
  2. Run the following query
http://localhost:3000/OPERA_L2_RTC-S1_V1/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&subset=time(%222025-01-01T00%3A00%3A00%22%3A%222025-01-02T00%3A00%3A00%22)&destinationUrl=s3%3A%2F%2Fcdd-harmony-sandbox-staging%2Fopera-browse-test&format=image%2Fpng&skipPreview=true&variable=all

This job will run very slowly due to issues with getting file sizes from S3 when updating work-items from query-cmr. Eventually it should move to RUNNING_WITH_ERRORS, and then after about an hour COMPLETED_WITH_ERRORS.
3. Use the workflow-ui to verify that there are no running work-items. Or check the database.
You should see something like this if you set a status: running filter (no work-items runniing)
image

PR Acceptance Checklist

  • Acceptance criteria met
  • Tests added/updated (if needed) and passing
  • Documentation updated (if needed)
  • Harmony in a Box tested? (if changes made to microservices or new dependencies added)

Copy link
Contributor

@chris-durbin chris-durbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find documentation for is_sequential - can you make sure to add something in the adapting services guide in the section where we have an example service chain definition.

@@ -109,6 +110,10 @@ function validateServiceConfigSteps(config: ServiceConfig<unknown>): void {
+ `Configured to use ${maxBatchInputs}, but will be limited to ${env.maxGranuleLimit}`);
}
}
if (step.image.match(/harmonyservices\/query\-cmr:.*/) && !step.is_sequential) {

throw new TypeError(`Invalid is_sequential ${step.is_sequential}. query-cmr steps must always have sequential = true.`);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we remove the is_sequential: true setting from query-cmr step and enforce it in the code? This feels like leaking unnecessary implementation details to the client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a client error. This is an error that shows up when you start Harmony if it is misconfigured. I think I want to keep it this way instead of adding separate handling for query-cmr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have said service provider rather than client. The service providers shouldn't have to know the query-cmr step is sequential when they set up the service chain in services.yml. The fact that Harmony always execute query cmr sequentially shouldn't have to be spelled out in the service configuration especially when it is something the service providers doesn't care or know. Like I said that I am ok with having it spelled out because we do have other steps that also use the is_sequential attribute, but I don't think it is necessary.

@indiejames
Copy link
Contributor Author

I couldn't find documentation for is_sequential - can you make sure to add something in the adapting services guide in the section where we have an example service chain definition.

Done

@indiejames indiejames merged commit 4c43331 into main Feb 14, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants