Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle all ray job statuses #4389

Merged
merged 2 commits into from
Nov 10, 2023
Merged

Handle all ray job statuses #4389

merged 2 commits into from
Nov 10, 2023

Conversation

EngHabu
Copy link
Contributor

@EngHabu EngHabu commented Nov 9, 2023

Describe your changes

Ray operator sets a few different possible transient statuses as well as different cluster deployment statuses. The plugin doesn't handle all of them and silently attempts to make an illegal state transition instead of erroring. This change does two things:

  • Update code to handle all current known statuses
  • Fail hard if the job moves to an unknown status

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Signed-off-by: Haytham Abuelfutuh <[email protected]>
@EngHabu EngHabu marked this pull request as ready for review November 9, 2023 06:19
@EngHabu EngHabu requested a review from pingsutw November 9, 2023 06:19
@EngHabu EngHabu enabled auto-merge (squash) November 9, 2023 06:19
Copy link

codecov bot commented Nov 9, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (dd91cba) 59.10% compared to head (f4005cf) 57.97%.
Report is 2 commits behind head on master.

❗ Current head f4005cf differs from pull request most recent head 8a3ca32. Consider uploading reports for the commit 8a3ca32 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4389      +/-   ##
==========================================
- Coverage   59.10%   57.97%   -1.14%     
==========================================
  Files         614      396     -218     
  Lines       52103    29233   -22870     
==========================================
- Hits        30797    16948   -13849     
+ Misses      18853    10571    -8282     
+ Partials     2453     1714     -739     
Flag Coverage Δ
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

see 570 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -489,12 +489,18 @@ func (plugin rayJobResourceHandler) GetTaskPhase(ctx context.Context, pluginCont
return pluginsCore.PhaseInfoFailure(flyteerr.TaskFailedWithError, reason, info), nil
case rayv1alpha1.JobStatusSucceeded:
return pluginsCore.PhaseInfoSuccess(info), nil
case rayv1alpha1.JobStatusPending, rayv1alpha1.JobStatusRunning:
case rayv1alpha1.JobStatusPending, rayv1alpha1.JobStatusRunning, rayv1alpha1.JobStatusStopped:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stopped sounds weird to be running

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not currently being used in the operator... so I took a guess it's a temporary state (Succeeded/Failed are terminal)... hence why I marked it as Running. That said, I think it's ok for now to add it to the default handler of failing as an unknown/not handled state..

Signed-off-by: Haytham Abuelfutuh <[email protected]>
@EngHabu EngHabu requested review from kumare3 and jeevb November 10, 2023 15:56
@EngHabu EngHabu merged commit 7712626 into master Nov 10, 2023
36 of 39 checks passed
@EngHabu EngHabu deleted the ray-job-status branch November 10, 2023 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants