
Proposal: Porch Performance Testing Framework - Operation Flow with Prometheus #174

Open · wants to merge 1 commit into main

Conversation

@mansoor17syed (Contributor) commented Jan 23, 2025

Proposal: Porch Performance Testing Framework with Prometheus Integration

Overview
This proposal introduces a comprehensive performance testing framework for Porch, designed to measure and monitor repository and package management operations using Prometheus. The framework provides automated setup, detailed metrics collection, and extensive reporting capabilities.

Basic Test Execution

go test -v ./... \
  --repos=2 \
  --packages=5 \
  --packageRevisions=4
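The flags above suggest a straightforward flag-based configuration. A minimal sketch of how the suite might declare them (names mirror the command line; defaults and descriptions are illustrative assumptions, not the framework's actual code):

```go
package main

import (
	"flag"
	"fmt"
)

// Custom flags mirroring the command line above. In the real suite these
// would be read by the test binary; defaults here are illustrative.
var (
	repos            = flag.Int("repos", 2, "number of repositories to create")
	packages         = flag.Int("packages", 5, "packages per repository")
	packageRevisions = flag.Int("packageRevisions", 4, "revisions per package")
)

func main() {
	flag.Parse()
	total := (*repos) * (*packages) * (*packageRevisions)
	fmt.Printf("planned load: %d repos x %d packages x %d revisions = %d PackageRevisions\n",
		*repos, *packages, *packageRevisions, total)
}
```

With the defaults shown, the suite would exercise 40 PackageRevisions per run.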
  

Goals

  1. Establish automated performance testing for Porch operations
  2. Implement comprehensive metrics collection using Prometheus
  3. Provide detailed operation monitoring and reporting
  4. Enable configurable test scenarios

Implementation Details

  1. Framework Components
  • Automated Prometheus Setup
    • Automatic container creation and initialization
    • Custom prometheus.yml configuration
    • Access: http://localhost:9090
  • Metrics Server
    • Port: :2113
    • Endpoint: http://localhost:2113/metrics
  • Logging System
    • Directory: ./logs
    • Format: porch-metrics-YYYY-MM-DD-HH-MM-SS.log
    • Console and file logging

2. Monitored Operations

Repository Management

  • Create Gitea Repository
  • Create Porch Repository
  • Wait Repository Ready
  • Delete Repository

Package Management

  • Create PackageRevision
  • Update to Proposed
  • Update to Published
  • Delete PackageRevision

Metrics Collection

  • Performance Metrics
    • Operation duration tracking
    • Success/failure counting
    • Resource utilization monitoring
    • Error tracking and reporting

Prometheus Integration

scrape_configs:
  - job_name: 'porch_metrics'
    static_configs:
      - targets: ['host.docker.internal:2113']
    scrape_interval: 1s
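With this scrape config in place, per-operation latency could be queried in Prometheus along these lines (the metric names below are assumptions for illustration; the framework defines its own):

```
# Average operation latency over the last 5 minutes, per operation
rate(porch_operation_duration_seconds_sum[5m])
  / rate(porch_operation_duration_seconds_count[5m])
```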

Benefits

  • Automated performance testing
  • Comprehensive metrics collection
  • Real-time monitoring capabilities
  • Detailed reporting and analysis
  • Configurable test scenarios

Future Enhancements

  • Additional metric types
  • Enhanced visualization options
  • Automated performance regression detection
  • Extended test scenario support

[8 screenshots attached to the PR]


nephio-prow bot commented Jan 23, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mansoor17syed
Once this PR has been reviewed and has the lgtm label, please assign johnbelamaric for approval by writing /assign @johnbelamaric in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mansoor17syed (Contributor, Author)

Hi team,
I've worked on the Porch Performance Testing Framework with Prometheus integration. It includes automated setup, metrics collection, and detailed logging. Please review the changes and share your feedback.

@efiacor (Collaborator) commented Jan 24, 2025

Hi @mansoor17syed .
Thanks for setting this up. Really nice to have.
Couple of points.
Is the plan for this to be run as an e2e test? If so, we should probably align with how the others are doing it.
See here for the main e2e GitHub Actions workflow.
It sets up the kind cluster, deploys Porch with the Make target "run-in-kind", and then executes the tests.
They also have a flag to avoid these being executed as part of the unit test runs: "E2E=1 go test ...".

For interacting with gitea, we should avoid using the ip address.
https://github.com/nephio-project/porch/pull/174/files#diff-da6008a39400b7b8b7c9a7b68d4b0928faf93f44c3ca48ad1d29cb9cfb4ae40fR37

The existing e2e suite use a stubbed git deployment - https://github.com/nephio-project/porch/blob/main/test/e2e/suite.go#L556
Could we make use of this for your suite?

@liamfallon (Member)

/retest

@mansoor17syed (Contributor, Author)

Hi @efiacor

Apologies for the delayed response—I was on vacation.

The main goal was to add metrics, but I wasn't sure how the community would receive them. So, I initially included them as part of the tests and proposed the changes.

I’m happy to modify or integrate these metrics directly into the code based on your feedback and requirements. Since you have a better perspective on this, I’d appreciate your guidance.

Looking forward to your response.

@liamfallon (Member)

/retest


nephio-prow bot commented Feb 4, 2025

@mansoor17syed: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| presubmit-nephio-go-test | fe57069 | link | true | /test presubmit-nephio-go-test |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@liamfallon (Member)

Hi @mansoor17syed , the unit tests are failing on this one.

@efiacor (Collaborator) commented Feb 4, 2025

> Hi @efiacor
>
> Apologies for the delayed response—I was on vacation.
>
> The main goal was to add metrics, but I wasn't sure how the community would receive them. So, I initially included them as part of the tests and proposed the changes.
>
> I'm happy to modify or integrate these metrics directly into the code based on your feedback and requirements. Since you have a better perspective on this, I'd appreciate your guidance.
>
> Looking forward to your response.

Hi @mansoor17syed ,

No problem. So if we want to add this as a standalone suite we will need some mechanism to set up the test cluster.
The e2e suites use a local kind cluster and a stubbed Gitea. If you need to use a full Gitea deployment then we have a pkg for that.
Also, to avoid these being run during the unit tests, we will need to add a skip mechanism similar to here - https://github.com/nephio-project/porch/blob/main/test/e2e/e2e_test.go#L62

@mansoor17syed (Contributor, Author)

  1. Regarding cluster setup:

For running these performance tests, can I use the existing kind cluster setup from the e2e framework?
I notice the e2e tests use a local kind cluster - should I integrate with that same mechanism?

  2. Regarding Gitea:

The e2e suites use a stubbed gitea deployment. You mentioned there's a pkg for full gitea deployment - which approach would be more appropriate for these performance tests? Should I:
a) Use the existing stubbed gitea from e2e suite
b) Use the full gitea deployment pkg you mentioned

  3. Regarding test organization:

Would you prefer these performance tests to:
a) Run as part of the existing e2e test suite
b) Remain as a standalone suite but use the e2e infrastructure
c) Have a completely separate setup

Looking forward to your thoughts!

@efiacor (Collaborator) commented Feb 6, 2025

> 1. Regarding cluster setup:
>
> For running these performance tests, can I use the existing kind cluster setup from the e2e framework? I notice the e2e tests use a local kind cluster - should I integrate with that same mechanism?
>
> 2. Regarding Gitea:
>
> The e2e suites use a stubbed gitea deployment. You mentioned there's a pkg for full gitea deployment - which approach would be more appropriate for these performance tests? Should I: a) Use the existing stubbed gitea from e2e suite b) Use the full gitea deployment pkg you mentioned
>
> 3. Regarding test organization:
>
> Would you prefer these performance tests to: a) Run as part of the existing e2e test suite b) Remain as a standalone suite but use the e2e infrastructure c) Have a completely separate setup
>
> Looking forward to your thoughts!

I think to get us started, we can use the existing e2e cluster setup. The setup script deploys Porch, Gitea, and MetalLB. This may be more suitable for your use case if you are scraping metrics. If we need Prometheus as well, it might be better to deploy an instance to the same cluster, no?
For Gitea, we can use the pkg that gets deployed as part of the setup script - https://github.com/nephio-project/porch/blob/main/scripts/setup-dev-env.sh
I think we should probably go with a standalone suite and use the existing e2e infra.

Other opinions are also important - @liamfallon @kispaljr @Catalin-Stratulat-Ericsson @JamesMcDermott @nagygergo

@kispaljr kispaljr requested review from kispaljr and removed request for s3wong and henderiw February 14, 2025 09:21