Hitting authentication errors when running octodns sync for multiple new zones on the same run #108

Closed
rlaakkol opened this issue Sep 23, 2024 · 7 comments
@rlaakkol

We are creating new zones with DNS records in Cloudflare using octodns. Almost every time there are enough new zones in a run, we hit an octodns_cloudflare.CloudflareAuthenticationError: Authentication error at some point in the process, usually before the 40th or so zone in the set.

My gut feeling is that this is because octodns tries to create the records in the newly created zone too soon after creation, and Cloudflare throws back a 403. But I have a hard time verifying this, as it happens so sporadically.

Here's the full stacktrace for reference:

Traceback (most recent call last):
  File "env/bin/octodns-sync", line 8, in <module>
    sys.exit(main())
  File "env/lib/python3.10/site-packages/octodns/cmds/sync.py", line 62, in main
    manager.sync(
  File "env/lib/python3.10/site-packages/octodns/manager.py", line 856, in sync
    total_changes += target.apply(plan)
  File "env/lib/python3.10/site-packages/octodns/provider/base.py", line 298, in apply
    self._apply(plan)
  File "env/lib/python3.10/site-packages/octodns_cloudflare/__init__.py", line 1107, in _apply
    getattr(self, f'_apply_{class_name}')(change)
  File "env/lib/python3.10/site-packages/octodns_cloudflare/__init__.py", line 920, in _apply_Create
    self._try_request('POST', path, data=content)
  File "env/lib/python3.10/site-packages/octodns_cloudflare/__init__.py", line 131, in _try_request
    return self._request(*args, **kwargs)
  File "env/lib/python3.10/site-packages/octodns_cloudflare/__init__.py", line 156, in _request
    raise CloudflareAuthenticationError(resp.json())
octodns_cloudflare.CloudflareAuthenticationError: Authentication error

We can work around this by just rerunning the sync until all the zones are successfully processed, but this is a bit of a nuisance.
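
The rerun workaround described above could be automated along these lines. This is only a sketch: `sync_until_clean`, its parameters, and the `config.yaml` path in the usage note are hypothetical names, and it assumes `octodns-sync` exits non-zero when the sync raises.

```python
import subprocess
import time


def sync_until_clean(cmd, max_attempts=5, pause=30.0,
                     run=subprocess.run, sleep=time.sleep):
    """Rerun `cmd` until it exits 0; return the number of attempts used.

    `run` and `sleep` are injectable so the loop can be exercised
    without actually invoking octodns-sync or waiting.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        # Pause before the next attempt, but not after the last one.
        if attempt < max_attempts:
            sleep(pause)
    raise RuntimeError(f"sync still failing after {max_attempts} attempts")
```

It would be called as `sync_until_clean(["octodns-sync", "--config-file", "config.yaml", "--doit"])`. Bounding the attempts keeps a persistent failure from looping forever.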

@ross
Contributor

ross commented Sep 23, 2024

What version are you running? The line numbers in your stack trace, e.g. 920, don't align with the current release, 0.0.7, as that line isn't even in the _apply_Create function.

Some sort of creation timing issue is a good guess; the other possibility would be a rate limit of some sort. You might try throwing resp.content into

self.log.debug('_request: status=%d', resp.status_code)
and running with --debug if you can reliably recreate the problem.
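
As a rough illustration of that suggestion, a debug line that includes the response body might look like the following. `log_response` and `max_body` are hypothetical names for this sketch, not part of octodns-cloudflare; the only assumption about `resp` is that it exposes `status_code` and `content` the way a `requests` response does.

```python
import logging


def log_response(log: logging.Logger, resp, max_body: int = 2000) -> None:
    """Log the status code and a truncated response body at DEBUG level."""
    # Truncate so a large payload does not flood the log.
    body = resp.content[:max_body]
    log.debug('_request: status=%d content=%r', resp.status_code, body)
```

With the body in the log, a 403 caused by a timing problem and one caused by some other rejection should be distinguishable from the error payload Cloudflare returns.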

When I get a chance to sit down and mess with it I'll try and reproduce the issue, but it sounds like it might be a tough one to do and may even rely on latency to CF's API servers, etc.

@rlaakkol
Author

Yeah, I was running 0.0.6. I'll try to run with extra debugging once we get the next bigger batch of zones to process through octodns. Thanks for the info!

@rlaakkol
Author

rlaakkol commented Sep 25, 2024

One extra tidbit of information: we initially worked around this issue by just running octodns-sync in a loop targeted at one zone at a time, and in this way the authentication error never happened. Also, we had our request quotas increased on the Cloudflare side to make sure we weren't hitting rate limits, but that had no effect.
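
A loop of that kind could be sketched as below. The function name and the injectable `run` argument are hypothetical; the sketch relies on `octodns-sync` accepting zone names as positional arguments to limit a run to those zones.

```python
import subprocess


def sync_zones_one_at_a_time(zones, config_file, run=subprocess.run):
    """Run octodns-sync once per zone and return the zones whose run failed."""
    failed = []
    for zone in zones:
        # One process per zone, so each run creates and fills a single zone.
        cmd = ["octodns-sync", "--config-file", config_file, "--doit", zone]
        if run(cmd).returncode != 0:
            failed.append(zone)
    return failed
```

Returning the failed zones rather than stopping at the first error makes it easy to see whether any particular zone fails consistently.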

@ross
Contributor

ross commented Sep 25, 2024

> in a loop targeted at one zone at a time, and in this way the authentication error never happened.

That does make it sound more like a rate limit than a race condition, but 🤷

> Also we had our request quotas increased from Cloudflare side to make sure we weren't hitting rate limits, but that had no effect.

OK. Since it sounds like you have an account contact you might check with them and see if they have anything to say about eventually consistent creates/race conditions.

@ross
Contributor

ross commented Oct 7, 2024

So far no luck recreating this. I put together a test data source that creates 64 zones, each with a single record, and then tries to sync them. Everything works fine up to the point it hits the rate limit:

...
********************************************************************************


2024-10-07T14:06:09  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0000.dev.
2024-10-07T14:06:11  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0001.dev.
2024-10-07T14:06:12  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0002.dev.
...
2024-10-07T14:07:29  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0049.dev.
2024-10-07T14:07:30  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0050.dev.
2024-10-07T14:07:31  [140704633040320] WARNING CloudflareProvider[cloudflare] rate limit encountered, pausing for 300s and trying again, 3 remaining

Going to rework things to create more records in each zone to see if that makes a difference, but otherwise this one is probably going to sit until it times out if we can't come up with reliable steps to reproduce.
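
For anyone wanting to try a similar reproduction, one way to approximate such a data set is to generate one YAML zone file per zone. This is not the test source used above, only a sketch: it assumes octodns's YAML provider layout (one `<zone>yaml` file per zone in a directory), and `write_test_zones`, the apex A record, and the 192.0.2.1 documentation address are all illustrative choices.

```python
from pathlib import Path

# A single apex A record; 192.0.2.1 is from the documentation address range.
RECORD_YAML = """---
'':
  type: A
  value: 192.0.2.1
"""


def write_test_zones(directory, count=64, parent="dev."):
    """Write `count` single-record zone files and return the zone names."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    names = []
    for i in range(count):
        zone = f"sub-{i:04d}.{parent}"  # e.g. sub-0000.dev.
        # The zone name already ends with a dot, so the file is sub-0000.dev.yaml.
        (directory / f"{zone}yaml").write_text(RECORD_YAML)
        names.append(zone)
    return names
```

The generated zones would still need to be listed in the octodns config with a YAML source and the Cloudflare target before a sync would pick them up.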

@ross
Contributor

ross commented Oct 7, 2024

Actually looks like the rate limit (for me anyway) never recovers:

2024-10-07T14:07:29  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0049.dev.
2024-10-07T14:07:30  [140704633040320] INFO  CloudflareProvider[cloudflare] apply: making 1 changes to sub-0050.dev.
2024-10-07T14:07:31  [140704633040320] WARNING CloudflareProvider[cloudflare] rate limit encountered, pausing for 300s and trying again, 3 remaining
2024-10-07T14:12:31  [140704633040320] WARNING CloudflareProvider[cloudflare] rate limit encountered, pausing for 300s and trying again, 2 remaining
2024-10-07T14:17:31  [140704633040320] WARNING CloudflareProvider[cloudflare] rate limit encountered, pausing for 300s and trying again, 1 remaining
Traceback (most recent call last):
  File "/Users/ross/octodns/octodns-cloudflare/env/bin/octodns-sync", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/ross/octodns/octodns-cloudflare/env/lib/python3.12/site-packages/octodns/cmds/sync.py", line 62, in main
    manager.sync(
  File "/Users/ross/octodns/octodns-cloudflare/env/lib/python3.12/site-packages/octodns/manager.py", line 856, in sync
    total_changes += target.apply(plan)
                     ^^^^^^^^^^^^^^^^^^
  File "/Users/ross/octodns/octodns-cloudflare/env/lib/python3.12/site-packages/octodns/provider/base.py", line 298, in apply
    self._apply(plan)
  File "/Users/ross/octodns/octodns-cloudflare/octodns_cloudflare/__init__.py", line 1186, in _apply
    resp = self._try_request('POST', '/zones', data=data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ross/octodns/octodns-cloudflare/octodns_cloudflare/__init__.py", line 145, in _try_request
    return self._request(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ross/octodns/octodns-cloudflare/octodns_cloudflare/__init__.py", line 172, in _request
    raise CloudflareRateLimitError(resp.json())
octodns_cloudflare.CloudflareRateLimitError: You have exceeded the limit for adding zones. Please activate some zones.
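
The pause-and-retry behaviour visible in the log above follows a common bounded-retry pattern, sketched below. This is not the provider's code: `RateLimitError`, `try_request`, and the defaults are stand-ins chosen to match the log (three warnings, 300s pauses, then the error is re-raised).

```python
import time


class RateLimitError(Exception):
    """Stand-in for a provider's rate-limit exception (hypothetical name)."""


def try_request(request, tries=4, pause=300, sleep=time.sleep, log=print):
    """Call `request`; on RateLimitError pause and retry until tries run out."""
    while True:
        try:
            return request()
        except RateLimitError:
            # Out of tries: let the error propagate to the caller.
            if tries <= 1:
                raise
            tries -= 1
            log(f"rate limit encountered, pausing for {pause}s "
                f"and trying again, {tries} remaining")
            sleep(pause)
```

A pattern like this only helps when the limit clears with time. In the run above every retry hit the same error, so waiting made no difference.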


github-actions bot commented Jan 6, 2025

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 6, 2025
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 13, 2025