Update performance tests #14289

Merged
27 commits merged into knative:main on Sep 21, 2023

Conversation

ReToCode
Member

@ReToCode ReToCode commented Aug 22, 2023

Context

  • We use these tests mid- and downstream. This is a back contribution of our changes.

Changes

  • Remove mako, as it is no longer maintained
  • Use vegeta.Metrics as a common wrapper to report latency metrics the same way
  • Harmonize the various ways to test performance
  • Add a new test case to simulate more realistic traffic patterns (Services with latencies and actual payloads)
  • Drop kperf as an inline golang alternative already exists
  • Drop ytt setup, as it is not necessary any longer
  • Rename benchmarks to better reflect what they are doing
  • Add SLAs in Go code for all benchmarks and add them to the Grafana dashboard to make SLA breaches visible
  • Split up the setup to install only the resources required for a specific test (reduces load)
  • Align env variables of all benchmarks
  • Use a common reporter for influx + vegeta
  • Document setup of influxDB (also for local development)
  • Update README to the new setup and explain the background better
  • Use the existing Grafana instance (https://grafana.knative.dev) instead of perf.knative.dev. More info: "Use grafana.knative.dev instead of perf.knative.dev (which is broken)" (infra#173)
  • Update grafana dashboard to visualize all stats
  • Use BUILD_ID and JOB_NAME to identify a build, see conversation

Additional notes

  • Unfortunately, GitHub shows the diffs a bit oddly. These are still the existing tests, just renamed and streamlined.
  • A full run here
  • Grafana dashboard

Follow-up PRs

@knative-prow

knative-prow bot commented Aug 22, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@knative-prow knative-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 22, 2023
@knative-prow knative-prow bot added area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features labels Aug 22, 2023
@ReToCode
Member Author

/test performance-tests-mako

@codecov

codecov bot commented Aug 22, 2023

Codecov Report

Patch coverage has no change and project coverage change: +0.04% 🎉

Comparison is base (9ffab17) 86.07% compared to head (847a398) 86.12%.
Report is 33 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14289      +/-   ##
==========================================
+ Coverage   86.07%   86.12%   +0.04%     
==========================================
  Files         196      196              
  Lines       14787    14789       +2     
==========================================
+ Hits        12728    12737       +9     
+ Misses       1750     1746       -4     
+ Partials      309      306       -3     

see 5 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ReToCode
Member Author

/test performance-tests-mako

@ReToCode
Member Author

/test performance-tests-mako

@ReToCode
Member Author

/test performance-tests-mako

@ReToCode
Member Author

/test performance-tests-mako

@ReToCode
Member Author

/test performance-tests-mako

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2023
@ReToCode
Member Author

/test performance-tests-mako

@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2023
@ReToCode
Member Author

/test performance-tests-mako

@ReToCode
Member Author

/test performance-tests-mako

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 24, 2023
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 28, 2023
@ReToCode
Member Author

/test performance-tests-mako

@ReToCode
Member Author

/test performance-tests

@ReToCode
Member Author

/test upgrade-tests

Review threads:
hack/tools.go (resolved)
test/e2e-common.sh (resolved)
test/performance/README.md (resolved)
test/performance/performance/vegeta.go (outdated, resolved)
test/performance/performance-tests.sh (outdated, resolved)
@ReToCode
Member Author

/test performance-tests

@ReToCode
Member Author

2023/09/14 06:36:22 influxdb2client E! Write error: unprocessable entity: failure writing points to database: partial write: field type conflict: input field "errors" on measurement "Knative Serving real traffic test" is type integer, already exists as type float dropped=1

So it seems like we need to drop the existing data if we want to change a field's type.
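
For context on the error above: InfluxDB rejects a point when a field arrives with a different type than the one already stored for that measurement. As a hedged alternative to dropping data, a reporter could normalize numeric fields to one type before writing. The helper below is a hypothetical, stdlib-only sketch (the name `toFloatFields` is an assumption and it is not part of the PR); it only shows the conversion step, not the InfluxDB client call.

```go
package main

import "fmt"

// toFloatFields returns a copy of in where every integer or float32 value
// is converted to float64, so a field such as "errors" is always written
// with the same type. Other values are copied unchanged.
func toFloatFields(in map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(in))
	for k, v := range in {
		switch n := v.(type) {
		case int:
			out[k] = float64(n)
		case int32:
			out[k] = float64(n)
		case int64:
			out[k] = float64(n)
		case uint64:
			out[k] = float64(n)
		case float32:
			out[k] = float64(n)
		default:
			out[k] = v
		}
	}
	return out
}

func main() {
	fields := map[string]interface{}{"errors": 3, "p95": 0.12}
	out := toFloatFields(fields)
	// Print the resulting Go type of the "errors" field.
	fmt.Printf("%T\n", out["errors"])
}
```

This only helps when the stored type is float; changing a field to a different type still requires removing or migrating the existing data, as noted in the comment.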

@dprotaso
Member

So it seems like we need to drop the existing data if we want to change a field's type.

Any luck doing that?

@ReToCode
Member Author

/test performance-tests

@ReToCode
Member Author

Any luck doing that?

Yes, works again. Test run, results.

@ReToCode
Member Author

/unhold

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 20, 2023
@dprotaso
Member

/test performance-tests

@dprotaso
Member

Can you confirm that some of the errors in the logs are not actual errors?

the test run you linked has

timed out waiting for the condition on services/perftest-00-sqihvrvj
Error from server (NotFound): services.serving.knative.dev "perftest-01-jzwpwozp" not found
Error from server (NotFound): services.serving.knative.dev "perftest-02-lzeycwsi" not found
Error from server (NotFound): services.serving.knative.dev "perftest-03-qhztwtoh" not found
...

@ReToCode
Member Author

Yes, these are not real errors. The preceding job deletes all services in a deferred function, but since there are quite a lot of services, they do not seem to be gone by the time the next job starts. Let's see if 847a398 helps.

/test performance-tests

@@ -107,6 +107,7 @@ header "Real traffic test"

run_job real-traffic-test "${REPO_ROOT_DIR}/test/performance/benchmarks/real-traffic-test/real-traffic-test.yaml"
sleep 100 # wait a bit for the cleanup to be done
kubectl delete ksvc -n "$ns" --all --wait --now
Contributor

+1

@dprotaso
Member

/lgtm
/approve

thanks @ReToCode this looks great

I dismissed a code security warning. Not sure why the nolint:gosec didn't work.

https://github.com/knative/serving/security/code-scanning/24

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2023
@knative-prow

knative-prow bot commented Sep 21, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, ReToCode

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 21, 2023
@knative-prow knative-prow bot merged commit 0d73dfe into knative:main Sep 21, 2023
63 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

4 participants