Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retire pm_test from PC suspend 30 cycles and separate IoT suspend cycles (New) #601

Merged
merged 11 commits into from
Jan 5, 2024

Conversation

seankingyang
Copy link
Contributor

@seankingyang seankingyang commented Jul 4, 2023

Description

  1. Retire pm_test.py from PC for suspend 30 cycles
  2. Separate IOT's suspend cycles from one job to multiple jobs
  3. Make the suspend-cycles-stress-test can support both PC and IoT

The detail story will be here: https://warthogs.atlassian.net/browse/CQT-2475

Resolved issues

Which suspend number the DUT is suspend can be easily identified when DUT is hang.

Documentation

The following environment configs will be used in these test cases:
STRESS_S3_ITERATIONS : suspend times in each reboot cycles
STRESS_SUSPEND_REBOOT_ITERATIONS : total reboot cycles

Note: Total suspend number = STRESS_S3_ITERATIONS x STRESS_SUSPEND_REBOOT_ITERATIONS

STRESS_S3_SLEEP_DELAY: sleep dealy
STRESS_S3_WAIT_DELAY: device check delay
STRESS_SUSPEND_SLEEP_THRESHOLD : suspend time threshold
STRESS_SUSPEND_RESUME_THRESHOLD: resume time threshold

Tests

DUT image: Desktop , Checkbox: deb version

DUT image: FDE , Checkbox: Slave- snap version, Master- deb version

DUT image: Desktop , Checkbox: snap version

@seankingyang seankingyang requested a review from pieqq July 4, 2023 06:16
@seankingyang seankingyang marked this pull request as draft July 4, 2023 06:22
@seankingyang seankingyang force-pushed the Modify-suspend-cycles branch 2 times, most recently from 7977722 to 6d7bc92 Compare July 7, 2023 03:12
@seankingyang seankingyang marked this pull request as ready for review July 7, 2023 03:32
Copy link
Contributor

@kissiel kissiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an awesome piece of work!

There are a few small problems I pointed out below.

In general the numerous ifs here signify lack of decomposition or suboptimal handling responsibility given to parts of the solution.

Let's look at the FWTS ifs.
The way I understand it is that you want to have a different behavior (commands run) when the platform is x86_64 or i386.
In the current state of the branch there is a jinja2 branch which interpolates the command string differently depending on the value provided by the resource.
So this means that the chain of concrete jobs may end up being one of those two:
suspend_(fwts_test) -> reboot -> do_checks
suspend_(using_rtcwake) -> reboot -> do_checks

I understand that you don't want to have two separate groups of all three steps depending on the FWTS (the strategy-level responsibility). But if that's just a detail on how to do a suspend, then the suspend job shouldn't change, and inside of it the decision should be made. In other words, imagine this:

command: suspend.py $"{STRESS_S3_DURATION}"

or

command: suspend.sh $"{STRESS_S3_DURATION}"

And within those the machine could be checked and appropriate method could be chosen. So the flow would be a linear:
suspend->reboot->do_checks

Also, introducing the generic "FTWS is supported if machine is either x86_64 or i386" conjecture is misleading. The check checks for machine type, it has nothing to do with FWTS. As a matter of fact FWTS today is supported for arm64, armhf, riscv64 and others as well.

providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 30, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (120bb96) 34.83% compared to head (8b67b95) 35.73%.
Report is 75 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #601      +/-   ##
==========================================
+ Coverage   34.83%   35.73%   +0.89%     
==========================================
  Files         302      302              
  Lines       34165    34245      +80     
  Branches     5909     5916       +7     
==========================================
+ Hits        11901    12236     +335     
+ Misses      21698    21441     -257     
- Partials      566      568       +2     
Flag Coverage Δ
provider-base 5.50% <ø> (+2.36%) ⬆️
provider-certification-client 57.14% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@seankingyang seankingyang changed the title Retire pm_test from PC suspend 30 cycles and separate IoT suspend cycles Retire pm_test from PC suspend 30 cycles and separate IoT suspend cycles (New) Nov 2, 2023
@pieqq
Copy link
Collaborator

pieqq commented Nov 3, 2023

/canonical/self-hosted-runners/run-workflows d75bbee

Copy link
Collaborator

@pieqq pieqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

The newly created jobs are not simple to understand at first glance. It's quite a complicated setup to be able to get everything you want (a mix of suspend cycles and reboot cycles). I cannot think of a better way to do, though, so I think we should go ahead with this, especially if we can use it on both desktop and IoT projects!

I tested your changes on a laptop, using Checkbox remote (with your changes sideloaded in /var/tmp/checkbox-providers/).

On the controller (my computer) I created the following launcher:

[launcher]
launcher_version = 1

[environment]
STRESS_S3_ITERATIONS = 2
STRESS_SUSPEND_REBOOT_ITERATIONS = 3

[agent]
normal_user = u

I then launched Checkbox remote using the following command:

checkbox.checkbox-cli control <DUT IP> my-launcher

I selected com.canonical.certification::suspend-cycles-stress-test test plan.

Both $STRESS_S3_ITERATIONS and $STRESS_SUSPEND_REBOOT_ITERATIONS were correctly taken into account, and no problems were reported! ✔️

The only complaint I have is that you used the summary field to provide a long description of what each new test is doing, resulting in rather long lines in the output:

----------------------------[ Running job 19 / 30 ]-----------------------------
[ This is part of automated stress suspend cycles test that will force the system to suspend/resume. ]
ID: com.canonical.certification::stress-tests/suspend_cycles_1_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
(...)

----------------------------[ Running job 20 / 30 ]-----------------------------
[ This is part of automated stress suspend cycles (suspend_cycles_2_reboot1) test that will force the system to suspend/resume. ]
ID: com.canonical.certification::stress-tests/suspend_cycles_2_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
(...)

----------------------------[ Running job 21 / 30 ]-----------------------------
[ This is part of automated stress suspend cycles test that will force the system to reboot (suspend_cycles_reboot1). ]
ID: com.canonical.certification::stress-tests/suspend_cycles_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
Connection lost!
connection closed by peer
Reconnecting ...
Reconnected (took: 114s)
[ This is part of automated stress suspend cycles test that will force the system to reboot (suspend_cycles_reboot1). ]
ID: com.canonical.certification::stress-tests/suspend_cycles_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
Outcome: job passed
----------------------------[ Running job 26 / 34 ]-----------------------------
[ This is part of automated stress suspend cycles (suspend_cycles_1_reboot2) test that will force the system to suspend/resume. ]
ID: com.canonical.certification::stress-tests/suspend_cycles_1_reboot2
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
(...)

Could you modify the summary for the new jobs to display something more concise? For instance:

This is part of automated stress suspend cycles (suspend_cycles_2_reboot1) test that will force the system to suspend/resume.

could be reworded, for instance,

Suspend and resume device (suspend cycle 2, reboot cycle 1)

Apart from that, I think this can land.

@diohe0311
Copy link
Contributor

/canonical/self-hosted-runners/run-workflows 3766ce1

Copy link
Contributor

@kissiel kissiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a huge piece of work and it requires a lot of coffee to understand.
My recommendations:
The generation part is very complicated, and I won't argue its usefulness, but It would be awesome if all of those mechanics would be put in a separate pxu file with a big explanation at the beginning of the file (bonus points for diagrams/flowcharts).

There are also unnecessary variables being introduced that make the code unfortunately, unnecessarily more complex.

providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
providers/base/units/stress/s3s4.pxu Outdated Show resolved Hide resolved
@diohe0311
Copy link
Contributor

/canonical/self-hosted-runners/run-workflows 6bb969d

Copy link
Collaborator

@pieqq pieqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whooo, this is great stuff! Very nice documentation added!

I made a few suggestions and typos, and once landed I think this is good to be merged.

Thanks for your hard work, @seankingyang !

providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@pieqq pieqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a tiny comment regarding the use of \ to split lines (it's not necessary). Then we can land :)

providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.md Outdated Show resolved Hide resolved
providers/base/units/stress/suspend_cycles_reboot.pxu Outdated Show resolved Hide resolved
Copy link
Collaborator

@pieqq pieqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This had slipped my attention. Awesome, thanks!

@pieqq
Copy link
Collaborator

pieqq commented Jan 5, 2024

/canonical/self-hosted-runners/run-workflows 8b67b95

@pieqq pieqq dismissed kissiel’s stale review January 5, 2024 02:56

The documentation changes requested by Maciej have been made.

@pieqq pieqq merged commit cac5044 into canonical:main Jan 5, 2024
14 checks passed
LiaoU3 pushed a commit to LiaoU3/checkbox that referenced this pull request Jan 9, 2024
…les (New) (canonical#601)

* Retire pm_test from PC suspend 30 cycles and saperate IoT suspend cycles

* Remove the useless flag preserve-local

* Make suspend (fwts and rtcwake) flow more linear

* Correct the summary of resources jobs

* Modify the summary.

* Remove unnecessary variables

* Seperate the suspend_cycles_reboot test case to a new file

* Add the detail description in md file.

* Fix the typo, and add the short description at the beginning of suspend_cycles_reboot.pxu

* Break lines of text at 80 characters as possible as I can

* Fix some tiny problems
LiaoU3 pushed a commit to LiaoU3/checkbox that referenced this pull request Mar 20, 2024
…les (New) (canonical#601)

* Retire pm_test from PC suspend 30 cycles and saperate IoT suspend cycles

* Remove the useless flag preserve-local

* Make suspend (fwts and rtcwake) flow more linear

* Correct the summary of resources jobs

* Modify the summary.

* Remove unnecessary variables

* Seperate the suspend_cycles_reboot test case to a new file

* Add the detail description in md file.

* Fix the typo, and add the short description at the beginning of suspend_cycles_reboot.pxu

* Break lines of text at 80 characters as possible as I can

* Fix some tiny problems
binli pushed a commit to binli/checkbox that referenced this pull request Mar 22, 2024
…les (New) (canonical#601)

* Retire pm_test from PC suspend 30 cycles and saperate IoT suspend cycles

* Remove the useless flag preserve-local

* Make suspend (fwts and rtcwake) flow more linear

* Correct the summary of resources jobs

* Modify the summary.

* Remove unnecessary variables

* Seperate the suspend_cycles_reboot test case to a new file

* Add the detail description in md file.

* Fix the typo, and add the short description at the beginning of suspend_cycles_reboot.pxu

* Break lines of text at 80 characters as possible as I can

* Fix some tiny problems
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants