DNSPolicy scale test #615

@mikenairn mikenairn commented Jan 13, 2025

Adds a DNSPolicy specific scale test using kube burner.

Part of #928

Based on the existing scale test, but with a focus on DNSPolicy and shared hostnames being updated by multiple dns operator instances.

The workload creates multiple instances of the dns operator in separate namespaces (kuadrant-dns-operator-x), and multiple test namespaces (scale-test-x) that the corresponding dns operator is configured to watch. The number of dns operator instances and test namespaces created is determined by the JOB_ITERATIONS environment variable.
In each test namespace a test app and service are deployed and one or more gateways are created, as determined by the NUM_GWS environment variable. The number of listeners added to each gateway is determined by the NUM_LISTENERS environment variable.
Each listener hostname is generated from the listener number and the KUADRANT_ZONE_ROOT_DOMAIN environment variable. In each test namespace a dns provider credential is also created; its type is determined by the DNS_PROVIDER environment variable, and additional environment variables may need to be set depending on the provider type.
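
As a rough illustration of how these variables combine, the sketch below reads them from the environment and lists the namespaces a single run would create. The namespace prefixes come from the description above; the zero-based index, the default values of 1, and the `WorkloadConfig` helper itself are assumptions made for illustration and are not taken from the kube-burner templates in this PR.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadConfig:
    """Hypothetical container for the environment variables named in the description."""

    job_iterations: int
    num_gws: int
    num_listeners: int
    zone_root_domain: str
    dns_provider: str

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        return cls(
            # Defaults of 1 are assumed here; the PR may use different ones.
            job_iterations=int(env.get("JOB_ITERATIONS", "1")),
            num_gws=int(env.get("NUM_GWS", "1")),
            num_listeners=int(env.get("NUM_LISTENERS", "1")),
            zone_root_domain=env["KUADRANT_ZONE_ROOT_DOMAIN"],
            dns_provider=env["DNS_PROVIDER"],
        )

    def operator_namespaces(self):
        # One dns operator instance per iteration (zero-based index assumed).
        return [f"kuadrant-dns-operator-{i}" for i in range(self.job_iterations)]

    def test_namespaces(self):
        # One test namespace per iteration, watched by the matching operator.
        return [f"scale-test-{i}" for i in range(self.job_iterations)]

    def total_gateways(self):
        return self.job_iterations * self.num_gws

    def total_listeners(self):
        return self.total_gateways() * self.num_listeners
```

Calling `WorkloadConfig.from_env()` with the variables exported gives a quick sanity check of how many gateways and listeners a run will create before pointing it at a real DNS provider.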

Requires:

Comments/Thoughts:

  • Kubeburner does not have the concept of running workloads across multiple instances. This was one of the asks in this issue. It is probably possible to run multiple kubeburner tasks simultaneously using the same configuration in order to have multiple updates to the same record set from multiple clusters, but there would be no orchestration from kubeburner's POV. It should also make use of a single Thanos instance instead of one deployed on each cluster.
  • For these workloads to be of any use we need good metrics and alerts that are expected to fire when things are not working. It's not a test suite with assertions on the state, but rather it expects alerts to fire in order to fail the test run.
  • Separating the DNS Operator specific templates/metrics/alerts into the dns operator repo makes sense as long as we have a similar scale test in that repo. TBD if we do want that.
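
To make the second point concrete: a run of this kind passes or fails on which alerts fired, not on assertions about cluster state. The sketch below models that decision in plain Python. The `FiredAlert` type and the set of failing severities are illustrative assumptions and do not reproduce kube-burner's own alerting code.

```python
from dataclasses import dataclass

# Severities that should fail the run (an assumed policy for this sketch).
FAILING_SEVERITIES = {"error", "critical"}


@dataclass(frozen=True)
class FiredAlert:
    description: str
    severity: str


def run_exit_code(fired_alerts):
    """Return 0 when no failing alert fired during the run, 1 otherwise."""
    failing = [a for a in fired_alerts if a.severity in FAILING_SEVERITIES]
    for alert in failing:
        print(f"alert fired: [{alert.severity}] {alert.description}")
    return 1 if failing else 0
```

The consequence is that a run with no alerts configured can never fail, which is why the quality of the alert definitions matters more than the workload itself.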

Alerts
A small list of alerts that I realised would be useful, but really there are probably hundreds required.

  • Alert when a gateway has not been assigned an address in an appropriate amount of time (this can be hit quite easily when using kind if you only have a few IPs available). This isn't strictly a Kuadrant issue.
  • Alert when DNSRecords are in a failing state for a given amount of time.
  • Alert if the managers are restarting an unexpected number of times during the test run. I hit this as part of the DNSRecord scale test and wrote an alert for this here.
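
The first two alerts share one shape: a condition must hold continuously for some minimum time before it counts, similar in spirit to the `for` clause of a Prometheus alerting rule. A minimal sketch of that check over sampled data follows; the `held_for` helper and the sample format are assumptions for illustration, not part of this PR.

```python
def held_for(samples, min_duration_s):
    """Return True if the condition was continuously true for at least min_duration_s.

    samples: iterable of (timestamp_seconds, condition_is_true), sorted by timestamp.
    """
    streak_start = None
    for ts, is_true in samples:
        if not is_true:
            streak_start = None  # condition cleared, reset the streak
            continue
        if streak_start is None:
            streak_start = ts  # first sample of a new streak
        if ts - streak_start >= min_duration_s:
            return True
    return False
```

For example, samples of "DNSRecord is failing" taken every 30 seconds would only trigger once the record has been failing across enough consecutive samples to cover the chosen duration, so a brief failure during reconciliation does not fail the run.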

@awk 'BEGIN {FS = ":.*?## "} /^[a-zA-Z_-]+:.*?## / {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' $(MAKEFILE_LIST)
.PHONY: help
help: ## Display this help.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-15s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
Member Author

Optional change, it just brings the output in line with the help in other repos. You can use ##@ foo to add sections:

Before:

$ make help
commit-acceptance              Runs pre-commit linting checks
reformat                       Reformats testsuite with black
test                           Run all non mgc tests
authorino                      Run only authorino related tests
authorino-standalone           Run only test capable of running with standalone Authorino
limitador                      Run only Limitador related tests
kuadrant                       Run all tests available on Kuadrant
kuadrant-only                  Run Kuadrant-only tests
multicluster                   Run Multicluster only tests
dnstls                         Run DNS and TLS tests
disruptive                     Run disruptive tests
kuadrantctl                    Run Kuadrantctl tests
poetry                         Installs poetry with all dependencies
poetry-no-dev                  Installs poetry without development dependencies
polish-junit                   Remove skipped tests and logs from passing tests
reportportal                   Upload results to reportportal. Appropriate variables for juni2reportportal must be set
help                           Print this help
clean                          Clean all objects on cluster created by running this testsuite. Set the env variable USER to delete after someone else
test-scale-dnspolicy           Run DNSPolicy scale tests.
kube-burner                    Download kube-burner locally if necessary.

After:

$ make help

Usage:
  make <target>
  commit-acceptance  Runs pre-commit linting checks
  reformat         Reformats testsuite with black
  test             Run all non mgc tests
  authorino        Run only authorino related tests
  authorino-standalone  Run only test capable of running with standalone Authorino
  limitador        Run only Limitador related tests
  kuadrant         Run all tests available on Kuadrant
  kuadrant-only    Run Kuadrant-only tests
  multicluster     Run Multicluster only tests
  dnstls           Run DNS and TLS tests
  disruptive       Run disruptive tests
  kuadrantctl      Run Kuadrantctl tests
  poetry           Installs poetry with all dependencies
  poetry-no-dev    Installs poetry without development dependencies
  polish-junit     Remove skipped tests and logs from passing tests
  reportportal     Upload results to reportportal. Appropriate variables for juni2reportportal must be set
  help             Display this help.
  clean            Clean all objects on cluster created by running this testsuite. Set the env variable USER to delete after someone else

Scale Testing
  test-scale-dnspolicy  Run DNSPolicy scale tests.

Build Dependencies
  kube-burner      Download kube-burner locally if necessary.
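
For readers less familiar with awk, the new help recipe can be approximated in Python as below. This is a sketch of the same idea, not a byte-for-byte port: it drops the ANSI colour codes, and the `render_help` name and `width` parameter are made up for illustration.

```python
import re

# `target: [prerequisites] ## description` lines become help entries.
TARGET_RE = re.compile(r"^([a-zA-Z_0-9-]+):.*##(.*)$")
# `##@ Name` lines become section headings.
SECTION_RE = re.compile(r"^##@\s*(.*)$")


def render_help(makefile_text, width=15):
    """Build `make help` style output from `## description` and `##@ section` markers."""
    lines = ["", "Usage:", "  make <target>"]
    for line in makefile_text.splitlines():
        section = SECTION_RE.match(line)
        if section:
            # Blank line, then the section heading.
            lines += ["", section.group(1).strip()]
            continue
        target = TARGET_RE.match(line)
        if target:
            name, description = target.group(1), target.group(2).strip()
            lines.append(f"  {name:<{width}} {description}")
    return "\n".join(lines)
```

Targets without a `##` comment are skipped, and a target name longer than `width` simply pushes its description to the right, which matches the ragged alignment visible in the "After" output.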

- https://raw.githubusercontent.com/{{.DNS_OPERATOR_GITHUB_ORG}}/dns-operator/refs/heads/{{.DNS_OPERATOR_GITREF}}/test/scale/alerts.yaml
indexer:
type: local
metricsDirectory: ./metrics
Member Author

I have the alerts and metrics being pulled from the dns operator repo here, but I imagine we could have these being pulled from multiple sources, e.g. kuadrant-operator, the testsuite repo, and other components, each defining its own metrics/alerts specific to the resources it provides.

From what I can gather, the configured metrics/alerts are really needed to make the most out of kubeburner runs, since alerts firing during the run are what will tell us if things are working or not, and what we would need to improve on if we feel these types of scale tests are useful.
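
The alerts URL in the diff above is parameterised by DNS_OPERATOR_GITHUB_ORG and DNS_OPERATOR_GITREF. The sketch below shows how such a template resolves and how further sources could be listed alongside it. It only mimics plain `{{.NAME}}` substitution and is not a Go template engine; the `render` and `alert_sources` helpers, and the idea of passing extra templates, are hypothetical.

```python
import re

ALERTS_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/{{.DNS_OPERATOR_GITHUB_ORG}}"
    "/dns-operator/refs/heads/{{.DNS_OPERATOR_GITREF}}/test/scale/alerts.yaml"
)

# Matches simple placeholders such as {{.DNS_OPERATOR_GITREF}}.
PLACEHOLDER_RE = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")


def render(template, values):
    """Substitute {{.NAME}} placeholders; raise KeyError if a value is missing."""
    return PLACEHOLDER_RE.sub(lambda m: values[m.group(1)], template)


def alert_sources(values, extra_templates=()):
    """Resolve the dns operator alerts URL plus any additional (hypothetical) sources."""
    return [render(t, values) for t in (ALERTS_URL_TEMPLATE, *extra_templates)]
```

Failing loudly on a missing variable is deliberate in this sketch: a silently empty org or git ref would produce a URL that 404s only once the run has started.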

Member

I wonder if the alerts and metrics files should be maintained in this repo for easier maintenance in the context of running and maintaining tests.
The alternative could result in extra toil, particularly when working out the details of assertions for a test.

@maleck13 maleck13 left a comment

I didn't actually try this and would prefer if @trepel or another member of the QE team took a look and approved, but the changes look good to me.

trepel commented Jan 20, 2025

I tried it against an OCP cluster with Route53 and it worked as described, except for the missing github.com/kuadrant/dns-operator/config/observability?ref=main, but that's being discussed. What surprises me was that no matter how many GWs/HTTPRoutes are created there is just NUM_LISTENERS unique hostnames. That's by design if I got it right, and it seems to be working (checked by dig +short).

@maleck13

@mikenairn do we want to move this to ready and get it merged? @trepel, from your comment it seems good to merge?

@mikenairn mikenairn force-pushed the dnspolicy_scale_test branch from 0be3332 to 82a7848 Compare January 29, 2025 12:01
@mikenairn mikenairn marked this pull request as ready for review January 29, 2025 12:02
@mikenairn
Member Author

What surprises me was that no matter how many GWs/HTTPRoutes are created there is just NUM_LISTENERS unique hostnames.

This is intentional for the workload being added, since it's testing DNS at scale. Part of that is testing that multiple records all contributing to the same DNS name work with increasing numbers of owners.
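
A small sketch of why the number of unique hostnames stays at NUM_LISTENERS while the number of owners per hostname grows. It assumes one contribution per gateway listener, and the `listenerN.<domain>` hostname format is hypothetical; the real format is defined by the templates in this PR.

```python
from collections import Counter


def hostname_owners(job_iterations, num_gws, num_listeners, zone_root_domain):
    """Count how many gateway listeners contribute to each hostname."""
    owners = Counter()
    for iteration in range(job_iterations):
        for gw in range(num_gws):
            for listener in range(1, num_listeners + 1):
                # Hypothetical hostname format: it depends only on the listener
                # number and the zone root domain, so it repeats across
                # namespaces and gateways.
                owners[f"listener{listener}.{zone_root_domain}"] += 1
    return owners
```

Under these assumptions, raising JOB_ITERATIONS or NUM_GWS leaves the set of hostnames unchanged and only increases the count stored against each one, which is the shared-record contention the workload is meant to exercise.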

@mikenairn mikenairn mentioned this pull request Jan 29, 2025
Adds a DNSPolicy specific scale test using kube burner.

The workload will create multiple instances of the dns operator in
separate namespaces(kuadrant-dns-operator-x), and multiple test
namespaces (scale-test-x) that the corresponding dns operator is
configured to watch.  The number of dns operator instances and test
namespaces created is determined by the `JOB_ITERATIONS` environment
variable.
In each test namespace a test app and service is deployed and one or
more gateways are created determined by the `NUM_GWS` environment
variable.  The number of listeners added to the gateway is determined by
the `NUM_LISTENERS` environment variable.
Each listener hostname is generated using the listener number and the
`KUADRANT_ZONE_ROOT_DOMAIN` environment variable.  In each test
namespace a dns provider credential is created, the type created is
determined by the `DNS_PROVIDER` environment variable, additional
environment variables may need to be set depending on the provider type.

Signed-off-by: Michael Nairn <[email protected]>
@mikenairn mikenairn force-pushed the dnspolicy_scale_test branch from 82a7848 to 9aaa301 Compare January 29, 2025 13:04
@maleck13

@trepel are you ok to approve and merge this?
