Improving Antrea Matrix Compatibility CI #6161

Closed
1 of 4 tasks
XinShuYang opened this issue Mar 28, 2024 · 13 comments
Labels
area/test Issues or PRs related to unit and integration tests. kind/design Categorizes issue or PR as related to design. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@XinShuYang
Contributor

XinShuYang commented Mar 28, 2024

Describe what you are trying to solve
We aim to simplify the Antrea matrix compatibility CI jobs by using Kind and CAPA for testing instead of the existing CAPV setup. This initiative involves recovering outdated Jenkins CI jobs and introducing Kind and CAPA to the Antrea matrix compatibility tests.

Describe the solution you have in mind

Describe how your solution impacts user flows

  1. By transitioning to Kind-based testing and removing outdated Jenkins jobs, we aim to simplify CI maintenance efforts for the maintainers. This will lead to a more efficient and streamlined CI process.
  2. A new Kind-based periodic job will automate the testing process, eliminating the need for users to manually trigger the corresponding Kind jobs.

Additional context
Any feedback or recommendations are appreciated, as these jobs were created a long time ago, and I may have missed some details. cc @tnqn @antoninbas @luolanzone

@XinShuYang XinShuYang added area/test Issues or PRs related to unit and integration tests. kind/design Categorizes issue or PR as related to design. labels Mar 28, 2024
@rajnkamr
Contributor

Currently, the Antrea matrix test across K8s versions with CAPA (#5476) is still not merged. If we want to move to Kind-based matrix testing for K8s versions, we need to streamline the current CAPV matrix jobs as well.

@XinShuYang
Contributor Author

XinShuYang commented Mar 28, 2024

Currently, the Antrea matrix test across K8s versions with CAPA (#5476) is still not merged. If we want to move to Kind-based matrix testing for K8s versions, we need to streamline the current CAPV matrix jobs as well.

The current CAPV matrix job tests are based on Kubernetes v1.17.5 and v1.18.2, so I believe they have not been used for a long time. Before merging the CAPA job PR, I will first attempt to restore the necessary CAPV jobs for matrix tasks. Given that some tests can be covered by Kind testing, we can consider removing them and replacing others with CAPA afterward.

@XinShuYang XinShuYang changed the title from "Improving Antrea Matrix Compatibility CI with Kind" to "Improving Antrea Matrix Compatibility CI" Mar 28, 2024
@antoninbas
Contributor

Overall sounds good to me.

New Kind-based periodic job will automate the testing process, eliminating the need for users to manually trigger corresponding Kind jobs during release phases.

I am not sure I agree with this statement, or maybe I am misunderstanding. We should still validate releases by running a "full" test suite, and periodic CI jobs will not change that. I do imagine that well-monitored periodic CI jobs will reduce the number of test failures that have to be investigated and resolved "at the last minute" during the release phase. Historically, however, we haven't been good at keeping track of failures in periodic CI jobs and addressing the issues as they arise, as far as I know. Some mechanism that automatically opens a GitHub issue when a periodic CI job starts failing (e.g., with 2 consecutive failures) could help, but I am not sure how easy it is to set up.
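
For illustration only, the "2 consecutive failures" trigger could be sketched as below. This is a minimal sketch, not an existing Antrea or Jenkins mechanism: the function names, the `threshold` parameter, and the plain `"SUCCESS"` / `"FAILURE"` status strings are assumptions, and fetching the build history and actually opening the issue are left out.

```python
from typing import Sequence

# Assumed status strings for a periodic job's build history.
FAILURE = "FAILURE"
SUCCESS = "SUCCESS"


def trailing_failures(results: Sequence[str]) -> int:
    """Count consecutive failures at the end of the history (newest last)."""
    count = 0
    for status in reversed(results):
        if status != FAILURE:
            break
        count += 1
    return count


def should_open_issue(results: Sequence[str], threshold: int = 2) -> bool:
    """Return True only on the run where the failure streak first reaches the threshold."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    # Using == rather than >= means a longer streak does not open a second issue.
    return trailing_failures(results) == threshold
```

Comparing with `==` is a deliberate choice in this sketch: the check fires once when the job "starts failing", and stays quiet on later failures in the same streak.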

@antoninbas
Contributor

I will first attempt to restore the necessary CAPV jobs for matrix tasks

What's the value in trying to restore / recover these old jobs before introducing the new (Kind / CAPA based) ones?

@XinShuYang
Contributor Author

I will first attempt to restore the necessary CAPV jobs for matrix tasks

What's the value in trying to restore / recover these old jobs before introducing the new (Kind / CAPA based) ones?

There are matrix jobs validating Antrea based on CentOS and Photon OS. Do we have other running jobs that can replace them? If so, we indeed don't need to restore the CAPV matrix jobs before the CAPA upgrade.

@XinShuYang
Contributor Author

Overall sounds good to me.

New Kind-based periodic job will automate the testing process, eliminating the need for users to manually trigger corresponding Kind jobs during release phases.

I am not sure I agree with this statement, or maybe I am misunderstanding. We should still validate releases by running a "full" test suite, and periodic CI jobs will not change that. However, I imagine that well-monitored periodic CI jobs will reduce the amount of test failures that have to be investigated and resolved "at the last minute" during the release phase. However, historically, we haven't been good at keeping track of failures in periodic CI jobs and addressing the issues as they arise, as far as I know. Some mechanism that automatically opens a Github issue when a periodic CI job starts failing (e.g., with 2 consecutive failures) could help, but I am not sure how easy it is to set up.

Yes, what I mean is that periodic jobs can report failures in a timely manner. Regarding job failure notifications, the Jenkins email notification is currently disabled due to the migration. I will attempt to enable it first.

@rajnkamr
Contributor

I will first attempt to restore the necessary CAPV jobs for matrix tasks

What's the value in trying to restore / recover these old jobs before introducing the new (Kind / CAPA based) ones?

There are matrix jobs validating Antrea based on CentOS and Photon OS. Do we have other running jobs that can replace them? If so, we indeed don't need to restore the CAPV matrix jobs before the CAPA upgrade.

The compatibility matrix test of Antrea against K8s versions on Ubuntu is taken care of (CAPA PR #5476, non-Kind). For the CentOS and Photon Antrea matrix compatibility tests, Kind can be leveraged. The periodic job approach seems to save resources, which should usually be available for regular jobs.
Email notifications for periodic job failures must be enabled to track periodic jobs. Maybe we can add checking the periodic job results to the release checklist.

@antoninbas
Contributor

@XinShuYang you have these 2 list items:

  • Recovering CentOS and Photon test cases for both antrea-upgrade-matrix-compatibility-test and antrea-matrix-compatibility-test jobs based on CAPV.
  • Enabling CAPA jobs for different OS such as CentOS or Photon.

I guess I do not understand how they relate to each other. If you want to enable CAPA jobs for these OS's, why do you need the first list item?

@rajnkamr
Contributor

rajnkamr commented Apr 2, 2024

@XinShuYang
I have the same question as @antoninbas: do we still want to use CAPV? We wanted to run most of the CAPV matrix jobs on CAPA.
It might be better to split the 3rd and 4th items into different issues:

"3. Introducing Kind to Ubuntu test cases in Antrea matrix compatibility jobs. This job will automatically trigger periodic runs to ensure the ongoing compatibility of Antrea across different Kubernetes versions and configurations.
4. Enabling CAPA jobs for different OS such as CentOS or Photon."

@XinShuYang
Contributor Author

@XinShuYang you have these 2 list items:

* Recovering CentOS and Photon test cases for both antrea-upgrade-matrix-compatibility-test and antrea-matrix-compatibility-test jobs based on CAPV.

* Enabling CAPA jobs for different OS such as CentOS or Photon.

I guess I do not understand how they relate to each other. If you want to enable CAPA jobs for these OS's, why do you need the first list item?

@antoninbas The final goal is to use CAPA instead of CAPV in the matrix test. However, since the PR for CAPA #5476 has not been merged yet (and I think it may take longer for review and merge), I prefer to first restore the existing CAPV jobs.

@XinShuYang
Contributor Author

@XinShuYang I have the same question as @antoninbas: do we still want to use CAPV? We wanted to run most of the CAPV matrix jobs on CAPA. It might be better to split the 3rd and 4th items into different issues:

"3. Introducing Kind to Ubuntu test cases in Antrea matrix compatibility jobs. This job will automatically trigger periodic runs to ensure the ongoing compatibility of Antrea across different Kubernetes versions and configurations.
4. Enabling CAPA jobs for different OS such as CentOS or Photon."

@rajnkamr Yes, I created two issues to track the progress, thanks for the suggestion.

Contributor

github-actions bot commented Jul 2, 2024

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 2, 2024
@github-actions github-actions bot closed this as not planned Sep 30, 2024
@XinShuYang
Contributor Author

Move this plan to stage 2 of #6698.

3 participants