Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce refresh/revert tests for kernel/snapd/gadget snaps (New) #724

Closed
wants to merge 18 commits into from

Conversation

pieqq
Copy link
Collaborator

@pieqq pieqq commented Sep 14, 2023

Description

Rework @patliuu 's kernel snap refresh/revert tests to

  • be more generic (so now the tests support gadget and snapd snaps in addition)
  • allow to refresh to a specific revision instead of a channel
  • add jobs to not only refresh to/revert from stable, but also from base revision (base revision = the revision of a snap that is present on the image at install time). This is to allow simulating a scenario where a freshly installed device gets updated to the latest snap available.
  • create a separate test plan to check all of the snaps for refresh/revert (the jobs will be skipped if the revisions are the same)

Please see individual commit messages for more info.

Resolved issues

CHECKBOX-717

Documentation

  • Documented the snap_update_test.py script (docstrings, type hints)
  • Added unit tests for it
  • Put additional info in commit messages for later reference

Tests

@codecov
Copy link

codecov bot commented Sep 14, 2023

Codecov Report

Merging #724 (1c5d392) into main (c3c3f45) will increase coverage by 0.41%.
Report is 3 commits behind head on main.
The diff coverage is 93.54%.

@@            Coverage Diff             @@
##             main     #724      +/-   ##
==========================================
+ Coverage   34.83%   35.25%   +0.41%     
==========================================
  Files         302      302              
  Lines       34165    34173       +8     
  Branches     5909     5907       -2     
==========================================
+ Hits        11903    12048     +145     
+ Misses      21697    21559     -138     
- Partials      565      566       +1     
Flag Coverage Δ
provider-base 4.16% <93.54%> (+1.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
providers/base/bin/snap_update_test.py 93.54% <93.54%> (ø)

@pieqq pieqq marked this pull request as draft September 21, 2023 13:38
@pieqq pieqq force-pushed the CHECKBOX-717/refresh-revert-snaps branch from 01aa458 to 1f823db Compare September 25, 2023 02:19
@pieqq pieqq marked this pull request as ready for review September 25, 2023 02:19
@pieqq
Copy link
Collaborator Author

pieqq commented Sep 25, 2023

Rebased on main, and moved test data in its own directory.

@pieqq pieqq force-pushed the CHECKBOX-717/refresh-revert-snaps branch from ac875b1 to 47f4cc2 Compare September 25, 2023 07:47
Copy link
Collaborator

@Hook25 Hook25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ty for this contribution, I have a few comments here and there, given that the main is very sequential with no operation that can fail I can accept it to remain untested as is.

:return: a dict containing the snap names and their base revisions
:rtype: dict
"""
base_snaps = glob("/var/lib/snapd/seed/snaps/*.snap")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my system I don't have /var/lib/snapd/seed/snaps but I have /var/lib/snapd/snaps. Is that normal?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because you're using Arch™ 😆

The /var/lib/snapd/seed/snaps/ directory contains the base revision of the snaps installed by a given ISO image.

If you installed snapd afterwards, I think it's normal not to have the /var/lib/snapd/seed/snaps/ dir.

FYI, this is what it looks like on my desktop (Ubuntu 22.04):

ls /var/lib/snapd/seed/snaps/
bare_5.snap  core20_1587.snap  firefox_1635.snap  gnome-3-38-2004_112.snap  gtk-common-themes_1535.snap  snapd_16292.snap  snapd-desktop-integration_14.snap  snap-store_582.snap

Comment on lines 64 to 67
snap_basename = os.path.basename(snap_path)
snap_name = os.path.splitext(snap_basename)[0]
snap, rev = snap_name.rsplit("_", maxsplit=1)
base_rev_info[snap] = rev
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both here and above, strange to use os.path instead of pathlib, it makes this code quite a bit harder to read. If you use Path.glob above, everything becomes a Path

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I somehow thought pathlib was "recent" and therefore we couldn't use it with xenial, but it was introduced in Python 3.4... 👴 I'll fix this.

for snap_path in base_snaps:
snap_basename = os.path.basename(snap_path)
snap_name = os.path.splitext(snap_basename)[0]
snap, rev = snap_name.rsplit("_", maxsplit=1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is _ always the separator to the version?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This was confirmed by snapd team.

snap_basename = os.path.basename(snap_path)
snap_name = os.path.splitext(snap_basename)[0]
snap, rev = snap_name.rsplit("_", maxsplit=1)
base_rev_info[snap] = rev
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relies on glob ordering, for example I have two revision of the core snap:

core_16202.snap core_16091.snap

Which will be in base_rev_info? I personally didn't know (it is 16091) and this seems like something that should be taken into consideration and explicitly resolved.

Related, the glob documentation states:

results are returned in arbitrary order

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In /var/lib/snapd/seed/snaps/, there can only be one revision for a given snap, because it's the one that comes pre-installed when you install the image. Therefore, we will never run into this problem.

And I think if we had to deal with it, we could simply assume the base revision is the smallest int between the two.



def print_resource_info():
snaps = guess_snaps().values()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to point out, if the keys are not used anywhere, guess_snaps can be reduced to a nice and easy to read oneliner

Comment on lines 231 to 232
with open(self.path, "r") as file:
data = json.load(file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of the with open(self.path, ...) as file: ... parts of this file can be substituted with a save_[...] and load_[...] function. This would make it very easy to do error handling uniformly

Comment on lines +242 to 243
echo "Waiting 90s for any snap operation to finish before rebooting..."
sleep 90
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this necessary? isn't the call to refresh synchronous?

I hope it does not break snaps if I reboot after the refresh call and the script itself waits for the operation to be done when verifying. This looks like a sleep for X secs wasting some time every time the test is run

depends: ubuntucore/kernel-verify-after-refresh-{kernel_name}
imports: from com.canonical.certification import snap_revision_info
requires:
(snap_revision_info.name == "{name}") and snap_revision_info.base_rev != snap_revision_info.original_installed_rev
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: does this do what we expect it does? I'm always scared of !=

Comment on lines +297 to +299
echo "Waiting 90s for any snap operation to finish before rebooting..."
sleep 90
reboot
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, also, do we need to reboot for sure?

template-resource: snap_revision_info
template-unit: job
id: ubuntucore/reboot-after-snap-revert-{type}-{name}-from-base-rev
_summary: Reboot after {name} snap revert to base revision
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal: Reboot after the {name} snap was reverted to base revision

@patliuu
Copy link
Contributor

patliuu commented Oct 13, 2023

Hi @pieqq ,

Thanks for revamping this.
Recently, we QA has done a few milestone tests with Checkbox v2.10.1 which has the kernel refresh/revert jobs landed. We ran into a situation where the pre-GM milestone kernel was not in the latest/stable channel due to various kinds of reasons. such as some SKUs would like to share the same kernel in the end, but for the kernel development, they place the kernel in different branches for alpha or beta ... milestones.

It would cause some trouble for QA when the job wants to refresh the kernel to latest/stable because the revision will be different in the situation above. I'm wondering, is it possible for Cert team to add these kernel refresh/revert jobs at the time when setting up them in the Cert lab (Post-GM) for SUV process? As these jobs are mainly designed for SUV testing.

@pieqq pieqq marked this pull request as draft October 18, 2023 11:51
@pieqq pieqq force-pushed the CHECKBOX-717/refresh-revert-snaps branch from 9b0ed7c to 8ab03e8 Compare October 20, 2023 06:42
@pieqq pieqq marked this pull request as ready for review October 20, 2023 06:43
@pieqq
Copy link
Collaborator Author

pieqq commented Oct 20, 2023

I took @Hook25 comments into account and spent way too long refactoring the unit tests and adding some to have a better test coverage.

I also tested my changes against a RPi4 running UC22. See this submission and its description for more information.

@pieqq
Copy link
Collaborator Author

pieqq commented Oct 24, 2023

Hi @pieqq ,

Thanks for revamping this. Recently, we QA has done a few milestone tests with Checkbox v2.10.1 which has the kernel refresh/revert jobs landed. We ran into a situation where the pre-GM milestone kernel was not in the latest/stable channel due to various kinds of reasons. such as some SKUs would like to share the same kernel in the end, but for the kernel development, they place the kernel in different branches for alpha or beta ... milestones.

It would cause some trouble for QA when the job wants to refresh the kernel to latest/stable because the revision will be different in the situation above. I'm wondering, is it possible for Cert team to add these kernel refresh/revert jobs at the time when setting up them in the Cert lab (Post-GM) for SUV process? As these jobs are mainly designed for SUV testing.

So you would like to remove the new tests from the ubuntucore-automated test plan?

  • ubuntucore/snap-refresh-kernel-.*-stable-rev
  • ubuntucore/reboot-after-snap-refresh-kernel-.*-stable-rev
  • ubuntucore/snap-verify-after-refresh-kernel-.*-stable-rev
  • ubuntucore/snap-revert-kernel-.*-stable-rev
  • ubuntucore/reboot-after-snap-revert-kernel-.*-stable-rev
  • ubuntucore/snap-verify-after-revert-kernel-.*-stable-rev

pieqq added a commit that referenced this pull request Oct 24, 2023
In a comment of the PR[1], Patrick explained the tests are meant for
Certification more than pre-GM testing, and therefore should be removed
from the ubuntucore-automated test plan:

> Recently, we QA has done a few milestone tests with Checkbox v2.10.1
> which has the kernel refresh/revert jobs landed. We ran into a
> situation where the pre-GM milestone kernel was not in the
> latest/stable channel due to various kinds of reasons. such as some
> SKUs would like to share the same kernel in the end, but for the
> kernel development, they place the kernel in different branches for
> alpha or beta ... milestones.
>
> It would cause some trouble for QA when the job wants to refresh the
> kernel to latest/stable because the revision will be different in the
> situation above.

[1]
#724 (comment)
@patliuu
Copy link
Contributor

patliuu commented Oct 24, 2023

Hi @pieqq ,
Thanks for revamping this. Recently, we QA has done a few milestone tests with Checkbox v2.10.1 which has the kernel refresh/revert jobs landed. We ran into a situation where the pre-GM milestone kernel was not in the latest/stable channel due to various kinds of reasons. such as some SKUs would like to share the same kernel in the end, but for the kernel development, they place the kernel in different branches for alpha or beta ... milestones.
It would cause some trouble for QA when the job wants to refresh the kernel to latest/stable because the revision will be different in the situation above. I'm wondering, is it possible for Cert team to add these kernel refresh/revert jobs at the time when setting up them in the Cert lab (Post-GM) for SUV process? As these jobs are mainly designed for SUV testing.

So you would like to remove the new tests from the ubuntucore-automated test plan?

  • ubuntucore/snap-refresh-kernel-.*-stable-rev
  • ubuntucore/reboot-after-snap-refresh-kernel-.*-stable-rev
  • ubuntucore/snap-verify-after-refresh-kernel-.*-stable-rev
  • ubuntucore/snap-revert-kernel-.*-stable-rev
  • ubuntucore/reboot-after-snap-revert-kernel-.*-stable-rev
  • ubuntucore/snap-verify-after-revert-kernel-.*-stable-rev

Hi @pieqq ,

Yes, if Cert team could handle when should these jobs be run, it'll be great to remove them from the ubuntucore-automated plan. It could make our QA testing easier when running with some non-standard projects.

This commit builds upon Patrick Liu's kernel_snap_test.py script (see
commit 3b73301) and does the following
in order to make it more generic, so it can be used with not only the
kernel snap, but also snapd and gadget snaps:

- rename the script `snap_update_test.py` and replace references to
"kernel" with something more generic
- extract some utility functions from the SnapRefreshRevert class (these
functions can be executed without being part of that class)
- use revision numbers instead of channels to refresh the snap to; this
is because we want to take another use case into account: refreshing
to/reverting from the base revision, which is the revision of the snap
that is present on the image at install time
- add the `guess_snaps()` function to guess the names of the kernel,
gadget and snapd snaps
- add the `get_snap_base_rev()` function to retrieve the base revision
of a given snap from the system

Fix CHECKBOX-717
- Turn existing kernel-only jobs into generic ones, using the
snap_revision_info resource job as a template for the naming
- Add jobs to refresh to/revert from base revision
- Separate the reboot section into its own job: because the
noreturn/autorestart flags have a big impact on Checkbox, it's better to
isolate these jobs to the smallest possible action (in this case, a
reboot). This also helps with keeping stdout/stderr to avoid issue

[1] #694
Keep the kernel tests, but use the new jobs.
…gadget)

These 3 test plans are aimed at being used during the
snap-update-verification process, where typically:

- a new snap (kernel, gadget, snapd...) is released and put into the
beta channel
- this triggers a Checkbox run using a defined test plan on a set of
defined devices

Depending on the snap being updated, operator can select the right test
plan for the job.

I initially wanted to put everything into the same test plan and use
`requires:` section of the jobs to skip the jobs that were not required
(for instance, if testing a new gadget snap, the kernel snap will not
have a new revision available, so related jobs should be skipped).
However, I could not find a proper way to deal with this using the
current requirement mechanism in Checkbox jobs.
…skipped

This requirement trick allows to specify the stable_rev/base_rev and
original_installed_rev to check for the snap being tested.

This allows to efficiently skip the job if said snap has the same source
and destination revision.

Thanks to this, these jobs are not marked as failed anymore, but instead
skipped.
Thanks to the previous commit, only one test plan can be used instead of
3 separate test plans. This way, if the test plan is called by the
snap-update-verification process because there is a new gadget snap in
beta channel, the jobs for the two other snaps (kernel and snapd) will
be skipped instead of failing.
Namely:

- Add a logger and use it instead of calling logging.xxx() directly
- Decouple the logic that waits for a snap change from the verification
logic (create a new `wait_for_snap_change()` method and modify
`verify()`)
- When error happens, put the message as a parameter of SystemExit()
instead of logging it first
- Pass arguments from command line to the `main()` function
If there is an error for current snap change, no need to wait for
timeout. We log the information from this change and exit with an error.

Also add unit tests to test this.
assert_called() was introduced in 3.6... 😭
@pieqq pieqq force-pushed the CHECKBOX-717/refresh-revert-snaps branch from f291298 to 1c5d392 Compare October 30, 2023 05:08
@pieqq
Copy link
Collaborator Author

pieqq commented Oct 30, 2023

Following the changes in the repository requiring all commits to be signed, I've created new signed commits based on my previous work, rebased and pushed again.

@pieqq
Copy link
Collaborator Author

pieqq commented Oct 30, 2023

Because our company's runners policy prevents / in branch names, I uploaded a new branch and created a new PR using the same content: #805

Please help review. Marking current one as Closed.

@pieqq pieqq closed this Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants