Failure cases handling #80

Closed
pigmej opened this issue Sep 20, 2016 · 13 comments

@pigmej
Contributor

pigmej commented Sep 20, 2016

Currently we support only the success path; we may need to support the failure path too.

The exact scope of this issue is TBD.

@pigmej pigmej added the ready label Sep 26, 2016
@nkwangleiGIT

Hi pigmej, is there a plan or proposal for this requirement? We want to use AppController for workflow tasks (k8s Jobs). Thanks.

@pigmej
Contributor Author

pigmej commented Oct 10, 2016

@nkwangleiGIT @gluke77 is currently researching how we can support that. We will have a document for it around the middle of this week.

@nkwangleiGIT Could you describe your use case? It would be good to include it in @gluke77's research in that area.

@pigmej pigmej added in progress and removed ready labels Oct 10, 2016
@zen
Contributor

zen commented Oct 10, 2016

Basically, the idea is to let the user decide what to do with failures and to offer several options. For the first iteration of this feature we plan:

  • retries with timeouts (configurable)
  • rollback
  • tear down on failure, i.e. the entire graph is destroyed after a failure

In the future we plan to support more sophisticated scenarios, in which you can attach custom graphs to different failures. This would allow a more elegant tear down or the execution of an alternative graph.
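To make "retries with timeouts (configurable)" concrete, here is a minimal Python sketch of one possible reading. It is not AppController code: the function name `run_with_retries`, its parameters, and the returned status strings are all hypothetical, and the timeout is assumed to be an overall deadline checked between attempts rather than a limit that interrupts a running attempt.

```python
import time
from typing import Callable


def run_with_retries(
    action: Callable[[], bool],
    max_retries: int = 3,
    timeout_seconds: float = 5.0,
    delay_seconds: float = 0.0,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `action` until it succeeds, retries run out, or the deadline passes.

    Hypothetical helper: returns "success", "retries_exhausted", or "timeout".
    `max_retries` counts the extra attempts made after the first one.
    """
    deadline = clock() + timeout_seconds
    attempts = 0
    while attempts <= max_retries:
        # The deadline is only checked between attempts; a running
        # attempt is never interrupted in this sketch.
        if clock() > deadline:
            return "timeout"
        attempts += 1
        try:
            if action():
                return "success"
        except Exception:
            pass  # an exception is treated like a failed attempt
        if delay_seconds:
            time.sleep(delay_seconds)
    return "retries_exhausted"
```

A caller could map the returned status to the next decision, for example triggering a rollback or a tear down when the result is anything other than "success".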

Anyway, it would be great to see your use cases, @nkwangleiGIT, so that we can consider them in our design.

@zen zen changed the title Failure cases handling Basic failure handling Oct 10, 2016
@zen zen added this to the v0.2 milestone Oct 10, 2016
@zen zen changed the title Basic failure handling Failure handling research Oct 10, 2016
@zen zen changed the title Failure handling research Failure cases handling Oct 10, 2016
@nkwangleiGIT

nkwangleiGIT commented Oct 11, 2016

@pigmej @zen Thanks so much for your replies.
Here is some basic information about my use case:

  1. We're trying to create a build flow (CI) using k8s Jobs, so it's quite similar to the approach in your demo:
    https://youtu.be/BXRToNV4Rdw?t=178
    Thanks again for the great demo.
  2. Basically, it will run a flow similar to Shippable's, but with a k8s Job for each build step.
    http://docsv2.readthedocs.io/en/latest/config.html#test-and-code-coverage-visualization
  3. So we need to handle the failure of each step, including timeouts, retry counts, and removing the steps that come after the failure point. We might even use a Job as a decision gateway and make decisions based on the Job's output (low priority for now).
  4. In the long term, it would be great if we could use k8s for workflow or BPM.
    BTW, we also expect AppController to support Deployment, so that we can use the Dependency for application deployment and visualize these scenarios.
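The build-flow requirement in item 3 (stop at the failure point and drop the later steps) can be sketched in a few lines of Python. This is only an illustration of the use case, not AppController or k8s API code; `Step`, `PipelineResult`, and `run_pipeline` are hypothetical names, and each step is assumed to be a plain callable that returns True on success.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable[[], bool]  # assumed to return True on success
    retries: int = 0         # extra attempts after the first one


@dataclass
class PipelineResult:
    succeeded: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def run_pipeline(steps: List[Step]) -> PipelineResult:
    """Run steps in order; after the first failing step, skip all later ones."""
    result = PipelineResult()
    for index, step in enumerate(steps):
        # any() stops at the first successful attempt.
        ok = any(step.run() for _ in range(step.retries + 1))
        if ok:
            result.succeeded.append(step.name)
            continue
        result.failed.append(step.name)
        # Everything after the failure point is dropped, not executed.
        result.skipped.extend(s.name for s in steps[index + 1:])
        break
    return result
```

In a real build flow each `run` callable would presumably create a k8s Job and wait for its completion status; that part is deliberately left out here.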

We'd like to contribute to this project once we get familiar with this great tool. It would also be great if you could send me some more documents to get started, besides the ones in this repository. Thanks!

@gluke77
Contributor

gluke77 commented Oct 13, 2016

@nkwangleiGIT PTAL at #98. If you think the document does not cover all your use cases, please let me know.

@nkwangleiGIT

@gluke77 It's sufficient for our current requirements. When can we get the first version of the implementation?
BTW, how should we handle failure events? Or do we have to watch the related k8s API for this?
Thanks

@gluke77
Contributor

gluke77 commented Oct 17, 2016

@nkwangleiGIT As for the implementation ETA, the only answer I can give is "we want it sooner rather than later". As for externally available events, we don't have any plans yet.

@pigmej
Contributor Author

pigmej commented Oct 18, 2016

@nkwangleiGIT Or did you mean something like post-deployment events, and handling these errors? If so, that is also being worked on currently.

@nkwangleiGIT

@gluke77 Got it, I'll keep a close eye on this issue and the AppController project.
@pigmej Yes, events for adding custom actions or integrating with other services would be great and would simplify the work.
Thanks all

@pigmej
Contributor Author

pigmej commented Dec 14, 2016

General handling landed, specific cases moved to next milestone.

@pigmej pigmej modified the milestones: v0.4, v0.3 Dec 14, 2016
@pigmej pigmej modified the milestones: v0.4, v0.5 Jan 5, 2017
@gluke77
Contributor

gluke77 commented Feb 17, 2017

So far retries (w/o timeouts) #174 and 'on-error' conditional subgraphs #184 are implemented. Timeout support #199 is in progress. 'Ignore this' and 'ignore with children' will follow shortly.
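For readers unfamiliar with the terms, the sketch below shows one plausible reading of 'ignore this' and 'ignore with children' on a dependency graph. It is an assumption, not the behaviour implemented in #174, #184, or #205: the policy strings `"ignore"`, `"ignore-with-children"`, and `"abort"`, the `deploy` function, and the status values are all hypothetical. Here 'ignore this' is taken to mean that dependents proceed as if the failed resource had been created, and 'ignore with children' that the failed resource and everything depending on it are skipped.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List


def deploy(
    graph: Dict[str, List[str]],   # resource -> resources it depends on
    create: Callable[[str], bool],  # returns True if creation succeeded
    on_error: Dict[str, str],       # resource -> hypothetical policy name
) -> Dict[str, str]:
    """Create resources in dependency order, applying a per-resource failure policy."""
    status: Dict[str, str] = {}
    for name in TopologicalSorter(graph).static_order():
        parents = graph.get(name, [])
        # A skipped parent propagates to all of its dependents.
        if any(status.get(p) == "skipped" for p in parents):
            status[name] = "skipped"
            continue
        if create(name):
            status[name] = "created"
            continue
        policy = on_error.get(name, "abort")
        if policy == "ignore":
            # Assumed 'ignore this': dependents are still created.
            status[name] = "ignored"
        elif policy == "ignore-with-children":
            # Assumed 'ignore with children': dependents are skipped too.
            status[name] = "skipped"
        else:
            # Default: stop the whole deployment at the first failure.
            status[name] = "failed"
            break
    return status
```

With the default policy, resources that were not reached before the failure simply do not appear in the returned mapping.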

@pigmej pigmej modified the milestones: v0.6, v0.5 Feb 17, 2017
@gluke77
Contributor

gluke77 commented Feb 27, 2017

'Ignore this' implemented #205

@pigmej
Contributor Author

pigmej commented Mar 1, 2017

Tear down moved to: #210
Rollback moved to: #209

Other items are merged.

@pigmej pigmej closed this as completed Mar 1, 2017
@pigmej pigmej removed the in progress label Mar 1, 2017

4 participants