This repository has been archived by the owner on Dec 5, 2020. It is now read-only.

if failure with timeout for creating rancher_stack while stack is in activating state, retrying causes "notunique" error #26

Open
oddperfect opened this issue Aug 3, 2017 · 1 comment

@oddperfect

Terraform Version v0.10.0-rc1-dev

Rancher Provider v0.1.1
Rancher Server 1.6

Affected Resource(s)

rancher_stack

Terraform Configuration Files

resource "rancher_stack" "microservices" {
    name                        = "microservices"
    description                 = "microservices"
    environment_id              = "${var.environment_id}"
    scope                       = "user"
    docker_compose              = "${file("${path.module}/microservices-docker-compose.yml")}"
    start_on_create             = "true"
    finish_upgrade              = "true"
}

The key ingredients are start_on_create set to true, the label io.rancher.container.pull_image: "always" in the docker-compose file, and something that causes a timeout (a large Docker image, many images in the stack, a slow network, etc.).

Expected Behavior

Running terraform apply again after the first failure should succeed instead of failing because the stack already exists.

Actual Behavior

The second run fails with a "NotUnique" error on the stack's 'name' field.

Steps to Reproduce


  1. terraform apply
  2. Have it fail with timeout such as:
    rancher_stack.microservices: Error waiting for stack (1st257) to be created: timeout while waiting for state to become 'active' (last state: 'activating', timeout: 10m0s)
  3. terraform apply again
  4. See in logs it is creating the stack again:
    module.utilities.rancher_stack.microservices: Creating...
  5. It will fail with:
    rancher_stack.microservices: Bad response statusCode [422]. Status [422 ???]. Body: [code=NotUnique, fieldName=name, baseType=error] from [......]

Important Factoids

The issue seems to be that, even though the stack was created on the Rancher server, the state file records it as not created because it never reached the active state. I can think of a few possibilities:

  1. If Terraform times out while creating the stack and the stack is still activating, delete the stack after the failure so the state file matches what is on the Rancher server.
  2. Increase the timeout once the stack reaches the activating state. This may not solve the issue, since it can still time out.
  3. Save in the state file that the stack is created but not yet activated. This would allow the second run to recognize it as the same stack and wait until it is active (if it is not already).
  4. When running a second time and there is already a stack by that name, delete and recreate it, or modify it. I realize this may allow duplicate stacks in the same script to override one another.

I favor number 3.
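
To make the difference concrete, here is a minimal sketch of the retry behavior with and without option 3. It is a toy simulation, not the provider's code: FakeRancher, apply, wait_active, and the record_id_before_wait flag are all hypothetical names, the "state file" is a plain dict, and the wait is a single check rather than a polling loop.

```python
class NotUnique(Exception):
    """Stands in for the 422 NotUnique response on the 'name' field."""


class WaitTimeout(Exception):
    """Stands in for the 'timeout while waiting for state' error."""


class FakeRancher:
    """Hypothetical in-memory server that enforces unique stack names."""

    def __init__(self):
        self.stacks = {}  # stack id -> {"name": ..., "state": ...}
        self._next = 1

    def create_stack(self, name):
        if any(s["name"] == name for s in self.stacks.values()):
            raise NotUnique(name)
        stack_id = f"1st{self._next}"
        self._next += 1
        # A new stack starts out activating (images still being pulled).
        self.stacks[stack_id] = {"name": name, "state": "activating"}
        return stack_id

    def finish_activation(self, stack_id):
        self.stacks[stack_id]["state"] = "active"


def wait_active(server, stack_id):
    # Simplification: one check instead of polling until a deadline.
    if server.stacks[stack_id]["state"] != "active":
        raise WaitTimeout(stack_id)


def apply(server, state, name, record_id_before_wait):
    """One simulated apply; `state` is a dict standing in for the state file."""
    stack_id = state.get("id")
    if stack_id is None:
        stack_id = server.create_stack(name)
        if record_id_before_wait:
            state["id"] = stack_id  # option 3: remember the stack before waiting
    wait_active(server, stack_id)
    state["id"] = stack_id  # without option 3 the id is only recorded here
    return stack_id


def retry_after_timeout(record_id_before_wait):
    """First apply times out, the stack then becomes active, apply runs again."""
    server, state = FakeRancher(), {}
    try:
        apply(server, state, "microservices", record_id_before_wait)
    except WaitTimeout:
        pass
    for stack_id in server.stacks:
        server.finish_activation(stack_id)
    try:
        apply(server, state, "microservices", record_id_before_wait)
        return "ok"
    except NotUnique:
        return "NotUnique"
```

With record_id_before_wait set to False, the second apply tries to create the stack again and hits NotUnique, which matches the failure reported here. With it set to True, the second apply finds the recorded id and only waits for the stack to become active.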

Side note: I would love to be able to override the timeout on Rancher resources such as rancher_stack. I can open another issue if you think this is a valid request.

@mcanevet mcanevet added the bug label Aug 4, 2017
@mcanevet
Contributor

mcanevet commented Aug 4, 2017

This is definitely a behavior I have also noticed, but unfortunately I can't figure out how to fix it. @raphink any idea?
