-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retire pm_test from PC suspend 30 cycles and separate IoT suspend cycles (New) #601
Conversation
66a5285
to
bc2aa40
Compare
7977722
to
6d7bc92
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an awesome piece of work!
There are a few small problems I pointed out below.
In general the numerous if
s here signify lack of decomposition or suboptimal handling responsibility given to parts of the solution.
Let's look at the FWTS ifs.
The way I understand it is that you want to have a different behavior (commands run) when the platform is x86_64 or i386.
In the current state of the branch there is a jinja2 branch which interpolates the command
string differently depending on the value provided by the resource.
So this means that the chain of concrete jobs may end up being one of those two:
suspend_(fwts_test) -> reboot -> do_checks
suspend_(using_rtcwake) -> reboot -> do_checks
I understand that you don't want to have two separate groups of all three steps depending on the FWTS (the strategy-level responsibility). But if that's just a detail on how to do a suspend, then the suspend job shouldn't change, and inside of it the decision should be made. In other words, imagine this:
command: suspend.py $"{STRESS_S3_DURATION}"
or
command: suspend.sh $"{STRESS_S3_DURATION}"
And within those the machine could be checked and appropriate method could be chosen. So the flow would be a linear:
suspend->reboot->do_checks
Also, introducing the generic "FTWS is supported if machine is either x86_64 or i386" conjecture is misleading. The check checks for machine type, it has nothing to do with FWTS. As a matter of fact FWTS today is supported for arm64, armhf, riscv64 and others as well.
50aa17e
to
72e2e88
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #601 +/- ##
==========================================
+ Coverage 34.83% 35.73% +0.89%
==========================================
Files 302 302
Lines 34165 34245 +80
Branches 5909 5916 +7
==========================================
+ Hits 11901 12236 +335
+ Misses 21698 21441 -257
- Partials 566 568 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
72e2e88
to
d75bbee
Compare
/canonical/self-hosted-runners/run-workflows d75bbee |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
The newly created jobs are not simple to understand at first glance. It's quite a complicated setup to be able to get everything you want (a mix of suspend cycles and reboot cycles). I cannot think of a better way to do, though, so I think we should go ahead with this, especially if we can use it on both desktop and IoT projects!
I tested your changes on a laptop, using Checkbox remote (with your changes sideloaded in /var/tmp/checkbox-providers/
).
On the controller (my computer) I created the following launcher:
[launcher]
launcher_version = 1
[environment]
STRESS_S3_ITERATIONS = 2
STRESS_SUSPEND_REBOOT_ITERATIONS = 3
[agent]
normal_user = u
I then launched Checkbox remote using the following command:
checkbox.checkbox-cli control <DUT IP> my-launcher
I selected com.canonical.certification::suspend-cycles-stress-test
test plan.
Both $STRESS_S3_ITERATIONS
and $STRESS_SUSPEND_REBOOT_ITERATIONS
were correctly taken into account, and no problems were reported! ✔️
The only complaint I have is that you used the summary
field to provide a long description of what each new test is doing, resulting in rather long lines in the output:
----------------------------[ Running job 19 / 30 ]-----------------------------
[ This is part of automated stress suspend cycles test that will force the system to suspend/resume. ]
ID: com.canonical.certification::stress-tests/suspend_cycles_1_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
(...)
----------------------------[ Running job 20 / 30 ]-----------------------------
[ This is part of automated stress suspend cycles (suspend_cycles_2_reboot1) test that will force the system to suspend/resume. ]
ID: com.canonical.certification::stress-tests/suspend_cycles_2_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
(...)
----------------------------[ Running job 21 / 30 ]-----------------------------
[ This is part of automated stress suspend cycles test that will force the system to reboot (suspend_cycles_reboot1). ]
ID: com.canonical.certification::stress-tests/suspend_cycles_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
Connection lost!
connection closed by peer
Reconnecting ...
Reconnected (took: 114s)
[ This is part of automated stress suspend cycles test that will force the system to reboot (suspend_cycles_reboot1). ]
ID: com.canonical.certification::stress-tests/suspend_cycles_reboot1
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
Outcome: job passed
----------------------------[ Running job 26 / 34 ]-----------------------------
[ This is part of automated stress suspend cycles (suspend_cycles_1_reboot2) test that will force the system to suspend/resume. ]
ID: com.canonical.certification::stress-tests/suspend_cycles_1_reboot2
Category: Suspend (S3) Stress Test
--------------------------------------------------------------------------------
(...)
Could you modify the summary
for the new jobs to display something more concise? For instance:
This is part of automated stress suspend cycles (suspend_cycles_2_reboot1) test that will force the system to suspend/resume.
could be reworded, for instance,
Suspend and resume device (suspend cycle 2, reboot cycle 1)
Apart from that, I think this can land.
/canonical/self-hosted-runners/run-workflows 3766ce1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a huge piece of work and it requires a lot of coffee to understand.
My recommendations:
The generation part is very complicated, and I won't argue its usefulness, but It would be awesome if all of those mechanics would be put in a separate pxu file with a big explanation at the beginning of the file (bonus points for diagrams/flowcharts).
There are also unnecessary variables being introduced that make the code unfortunately, unnecessarily more complex.
/canonical/self-hosted-runners/run-workflows 6bb969d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whooo, this is great stuff! Very nice documentation added!
I made a few suggestions and typos, and once landed I think this is good to be merged.
Thanks for your hard work, @seankingyang !
…nd_cycles_reboot.pxu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a tiny comment regarding the use of \
to split lines (it's not necessary). Then we can land :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This had slipped my attention. Awesome, thanks!
/canonical/self-hosted-runners/run-workflows 8b67b95 |
The documentation changes requested by Maciej have been made.
…les (New) (canonical#601) * Retire pm_test from PC suspend 30 cycles and saperate IoT suspend cycles * Remove the useless flag preserve-local * Make suspend (fwts and rtcwake) flow more linear * Correct the summary of resources jobs * Modify the summary. * Remove unnecessary variables * Seperate the suspend_cycles_reboot test case to a new file * Add the detail description in md file. * Fix the typo, and add the short description at the beginning of suspend_cycles_reboot.pxu * Break lines of text at 80 characters as possible as I can * Fix some tiny problems
…les (New) (canonical#601) * Retire pm_test from PC suspend 30 cycles and saperate IoT suspend cycles * Remove the useless flag preserve-local * Make suspend (fwts and rtcwake) flow more linear * Correct the summary of resources jobs * Modify the summary. * Remove unnecessary variables * Seperate the suspend_cycles_reboot test case to a new file * Add the detail description in md file. * Fix the typo, and add the short description at the beginning of suspend_cycles_reboot.pxu * Break lines of text at 80 characters as possible as I can * Fix some tiny problems
…les (New) (canonical#601) * Retire pm_test from PC suspend 30 cycles and saperate IoT suspend cycles * Remove the useless flag preserve-local * Make suspend (fwts and rtcwake) flow more linear * Correct the summary of resources jobs * Modify the summary. * Remove unnecessary variables * Seperate the suspend_cycles_reboot test case to a new file * Add the detail description in md file. * Fix the typo, and add the short description at the beginning of suspend_cycles_reboot.pxu * Break lines of text at 80 characters as possible as I can * Fix some tiny problems
Description
suspend-cycles-stress-test
can support both PC and IoTThe detail story will be here: https://warthogs.atlassian.net/browse/CQT-2475
Resolved issues
Which suspend number the DUT is suspend can be easily identified when DUT is hang.
Documentation
The following environment configs will be used in these test cases:
STRESS_S3_ITERATIONS
: suspend times in each reboot cyclesSTRESS_SUSPEND_REBOOT_ITERATIONS
: total reboot cyclesSTRESS_S3_SLEEP_DELAY
: sleep dealySTRESS_S3_WAIT_DELAY
: device check delaySTRESS_SUSPEND_SLEEP_THRESHOLD
: suspend time thresholdSTRESS_SUSPEND_RESUME_THRESHOLD
: resume time thresholdTests
DUT image: Desktop , Checkbox: deb version
DUT image: FDE , Checkbox: Slave- snap version, Master- deb version
DUT image: Desktop , Checkbox: snap version