
Review v35 #3342

Open · wants to merge 213 commits into base: review-v35

Conversation

lucasmoura
Contributor

This changeset represents the diff to be uploaded as version 35 of the ubuntu-pro-client package.

lucasmoura and others added 30 commits September 12, 2024 13:10
When collecting apparmor logs, we pull the /var/log/syslog file
from the system under test. However, if the file doesn't exist,
we shouldn't block the whole test, but instead alert about it.
For the Noble security-status test, we need one package
with an upgrade from security and another from updates.
The previous packages now both have candidates in the security
pocket. We are now updating the package to have that distinction
again.
There are many configs in the 'Unattended-Upgrade::' prefix which can
be disabled while still keeping the main functionality enabled.
For example:
 - Unattended-Upgrade::OnlyOnACPower "0"
     Allows upgrades even on battery power
 - Unattended-Upgrade::Skip-Updates-On-Metered-Connections "0"
     Allows upgrades even on metered connections
and many more.

You may `grep Unattended-Upgrade:: /usr/bin/unattended-upgrade` for
a list of keys that may make sense to disable while keeping
unattended-upgrades enabled.
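
For illustration, reading a couple of those keys with python-apt (a minimal sketch, not part of this changeset) shows how a key can be "0" while unattended-upgrades itself stays enabled:

    import apt_pkg

    apt_pkg.init_config()  # loads /etc/apt/apt.conf.d/* into apt_pkg.config

    for key in (
        "Unattended-Upgrade::OnlyOnACPower",
        "Unattended-Upgrade::Skip-Updates-On-Metered-Connections",
    ):
        # find() returns the configured string, or the default when the key is unset
        value = apt_pkg.config.find(key, "1")
        print(key, "->", "disabled" if value == "0" else "enabled")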
We are now grouping as many FIPS VM tests as we can under some
scenarios. This will facilitate fixing tests, as we will now only
need to apply a fix for a single scenario.

We have also removed tests that are already being covered by other
scenarios
We are now grouping as many FIPS cloud tests as we can under
some scenarios. This will facilitate fixing tests, as we will
now only need to apply a fix for a single scenario.
We are now grouping as many Pro FIPS cloud tests as we can under
some scenarios. This will facilitate fixing tests, as we will
now only need to apply a fix for a single scenario.
We are making the process of applying a jq filter to
the output simpler by using the Python jq library instead.
The keys in UNATTENDED_UPGRADES_CONFIG_KEYS are considered to be
mandatory in order to report unattended-upgrades as enabled.

When parsing the config files, get_apt_config_values will set
missing keys to None rather than not populating the dict.
Thus unattended_upgrades_cfg.get(key, fallback) for a missing key
returns None rather than the fallback. Finally `int(None)` fails.
Prior to this change, missing configs would be reported as
UNATTENDED_UPGRADES_CFG_LIST_VALUE_EMPTY, but since their semantics
are disabled-by-default we should instead report them as
UNATTENDED_UPGRADES_CFG_VALUE_TURNED_OFF.

In order to preserve the correct behaviour for
"Unattended-Upgrade::Allowed-Origins" which is of type list, when
its config is missing force it to be an empty list rather than None.
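
A minimal sketch of the failure mode described here, using a hand-built dict in place of the real get_apt_config_values() output:

    # Sketch only: get_apt_config_values() sets missing keys to None instead of
    # leaving them out of the dict.
    unattended_upgrades_cfg = {"Unattended-Upgrade::OnlyOnACPower": None}

    fallback = "1"
    value = unattended_upgrades_cfg.get("Unattended-Upgrade::OnlyOnACPower", fallback)
    # The key exists, so value is None (not the fallback) and int(None) raises
    # TypeError. The change reports such None values as
    # UNATTENDED_UPGRADES_CFG_VALUE_TURNED_OFF instead of calling int() on them,
    # and forces "Unattended-Upgrade::Allowed-Origins" to [] when it is missing.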
The anbox-cloud product no longer asks to install the amc snap so we
should not check for it. The amc command has been merged into the
anbox-cloud-appliance snap.
We are consolidating our proxy tests by
removing whole scenarios that only test setting the proxy
by writing directly to the user-config.json file, and by
removing some redundant scenarios from the proxy tests.
Create a separate feature file for the cli disable command
and move all of the related disable tests to it
Create a dedicated feature file for the ROS service
and move related tests to it
Create a separate feature file for the cli refresh command
and move all of the related refresh tests to it
Create a separate feature file for the cli detach command
and move all of the related detach tests to it
Create a separate feature file for the cli auto-attach command
and move all of the related auto-attach tests to it
Create a dedicated feature file for the CC-EAL service
and move related tests to it
Move all related version tests to the already existing
version feature file
Move all help-related tests to the appropriate feature file
Move all related timer tests to the already existing
timer feature file
Create a separate feature file for the cli status
command and move all of the related status tests to it
Move all related livepatch tests to the already existing
livepatch feature file
Create a separate feature file for the cli enable
command and move all of the related enable tests to it
Move all related tests that check for postinst install logic to the appropriate file
Create a dedicated feature file for the cis service
and move related tests to it
Create a dedicated feature file for the usg service
and move related tests to it
Create a separate feature file for the api reboot required endpoint
and move all of the related tests to it
Move all related fips tests to the appropriate file
@bryceharrington
Member

bryceharrington commented Oct 17, 2024

I'm still in the process of this review, but here's some early feedback to start with. I still have the security vulnerability feature left to go through, but I've gotten through everything other than that.

By and large I'm spotting nothing release-critical that has to be fixed in v35, so many of these should be fine to tackle in future releases, but I do have some questions. There are a handful of copyedit-type things, as per usual.


I notice you've added a snazzy per-PR Checklist. One of the items is "Does this PR require review from someone outside the core ubuntu-pro-client team?"

Is there a way to programmatically query for a list of these PRs that had 'Yes' checked? I'm not familiar enough with github-actions to figure out how to craft such a filter, but guessing you guys might? I'm curious to take a peek at what these were.

commit ce27228:

Author:     Grant

config: lxd_guest_attach setting

The text says:

+                " the guest will not be allowed to attach using the host's Pro"
+                " lx access. If set to 'available', the guest will be allowed"

Is that 'lx' a stray or should it be 'LXD'?

Also, in uaclient/cli/config.py, there is this code:

+    elif set_key == "lxd_guest_attach":
+        if not _is_attached(cfg).is_attached:
+            raise exceptions.UnattachedError()
+        state_files.lxd_pro_config_file.write(
+            state_files.LXDProConfig(
+                guest_attach=LXDGuestAttachEnum.from_value(set_value.lower())
+            )
+        )

Do I understand correctly that whatever value is given for "lxd_guest_attach", it performs a check for attachment? Have you considered a situation where the parameter is already configured to be off (or unset, so off by default), and the caller here is specifying it to be off?

In such a case, maybe it would be best not to perform the attachment check? I wonder if raising an exception here might not be what's wanted.
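
For illustration only, a hypothetical variant along those lines, reusing the names from the quoted block (LXDGuestAttachEnum.OFF is an assumption on my part):

    elif set_key == "lxd_guest_attach":
        requested = LXDGuestAttachEnum.from_value(set_value.lower())
        # Only require attachment when the option is actually being enabled;
        # setting it to "off" (its effective default) would be allowed unattached.
        if requested != LXDGuestAttachEnum.OFF and not _is_attached(cfg).is_attached:
            raise exceptions.UnattachedError()
        state_files.lxd_pro_config_file.write(
            state_files.LXDProConfig(guest_attach=requested)
        )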

commit 5d62735:

Author:     Grant Orndorff <[email protected]>

lxd: attempt to attach on launch

When a lxd container is launched, pro-client will attempt to auto-attach
via the lxd apis that talk to the host's pro-client. It will only try
once and will not continue to poll.

This opens a socket to the LXD instance

It logs a warning if it encounters an LXD error. Guessing it would be better to log an error.

is_pro_license_present() returns true if "guest_attach" is set to "on", false otherwise. For users wanting to understand why it isn't working, would it be helpful for this to log an info or debug message when "guest_attach" is not "on"?
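
Something as small as this would do it (purely illustrative; the real function presumably reads the setting from the LXD config rather than taking it as a parameter):

    import logging

    LOG = logging.getLogger(__name__)

    def is_pro_license_present(guest_attach_setting: str) -> bool:
        # Illustrative sketch: record why no license is reported, so users can
        # see why the attach-on-launch attempt did nothing.
        if guest_attach_setting != "on":
            LOG.debug("lxd guest_attach is %r, not attempting auto-attach",
                      guest_attach_setting)
            return False
        return True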

commit 7b5802d:

Author:     Lucas

system: add logic to collect cpu type information

We are adding the logic to collect the CPU type for attached
users in the Pro client. We first identify the machine architecture
and then we collect the CPU type appropriate for the given
architecture.

This adds classes for different categories of machine architectures (i.e. manufacturer). I'm curious about the motivation for organizing the CPU data in this fashion, since, for example, amd64 and i386 have significantly different ramifications for packaging in the distro.

I would recommend, if going this route, that you include tests for each CPU variation. Currently I see CPU_INFO only for a 64-bit Intel system, and similarly for armhf/arm64.

For ARM systems I'm curious why it is preferred to look the value up from /sys/firmware/devicetree instead of /proc/cpuinfo as is done in the other cases. Perhaps it's worth including a note somewhere, or a comment, to explain the discrepancy?
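
For context, the two lookups being contrasted boil down to roughly this (a sketch, not the PR's implementation):

    def arm_model_from_devicetree() -> str:
        # On ARM the device tree exposes the board/SoC model as a single
        # NUL-terminated string that can be used directly.
        with open("/sys/firmware/devicetree/base/model", "rb") as f:
            return f.read().rstrip(b"\x00").decode()

    def model_name_from_cpuinfo() -> str:
        # On x86 the model name has to be parsed out of /proc/cpuinfo instead.
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("model name"):
                    return line.split(":", 1)[1].strip()
        return ""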

I don't know how comprehensive you want to make this, but just in case be aware there are other CPU architectures that just aren't included in Ubuntu (at least right now) but might be present in the wild. E.g. Debian supports a somewhat different set:

https://wiki.debian.org/SupportedArchitectures

For example, long ago in the past we used to support a CPU type 'powerpc', which by your categorization would belong in this category as well. I would be surprised if this still appears anywhere relevant for Ubuntu Pro, but depending on how thorough you aim to make this collector, it could check for 'ppc' as well as 'power' prefixes.

commit a240b7e:

Author:     Renan

help: deduplicate 'pro help' command entries

We use a class variable to make sure each command can register its own
help entry to the main help page. When get_parser is called more than
once, the subparsers are instance bound but the help content gets
duplicated. We now make sure to only add entries that are not present
yet.

Signed-off-by: Renan Rodrigo <[email protected]>
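
The dedup guard described above amounts to something like this (a simplified sketch, not the exact ubuntu-pro-client code):

    class ProCommand:
        # Simplified sketch: a class-level registry of help entries, shared
        # across instances, so repeated get_parser() calls don't duplicate
        # the main help page.
        help_entries = []

        @classmethod
        def register_help_entry(cls, entry) -> None:
            # Only add entries that are not present yet.
            if entry not in cls.help_entries:
                cls.help_entries.append(entry)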

I notice the api command is inconsistently capitalized in the help.feature text:

+      Other commands:
+
+        api              Calls the Client API endpoints.
+        auto-attach      automatically attach on supported platforms
+        config           manage Ubuntu Pro configuration on this machine

It also has a period where other lines do not.

Also inconsistencies here:

+      Flags:
+
+        -h, --help       Displays help on pro and command line options
+        --debug          show all debug log messages to console
+        --version        show version of pro

The 'pro' in --version should be capitalized like the other uses, and maybe for --help too. There's a leading capital as well. I think the description for --help might be worded better.

Also think about the first word pluralization ("call" vs "calls", "display" vs "displays", etc.)

commit 6754a6d:

Author:     Dheyay

Override applicability status for landscape

Fixes: #3331

Wondering if this deserves a test case somewhere?

commit 0ed6473:

Author:     Grant

refactor: avoid pickling when we only need a copy of a dict

On xenial, this would try to pickly the UserConfigData object in the
deepcopy call, which fails for enums. The new approach uses the
DataObject's methods for converting to and from dicts, which is good
enough for this use case.

s/pickly/pickle/
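
The dict round-trip the commit describes would look roughly like this (the method names on DataObject are my assumption, not verified against the code):

    # Rather than copy.deepcopy(user_config), which on xenial ends up pickling
    # the enum-bearing UserConfigData and failing, round-trip through plain dicts:
    copied = UserConfigData.from_dict(user_config.to_dict())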

I notice there are some other instances of deepcopy() calls in the codebase, although at first glance they mostly either use const dicts (DEFAULT_STATUS, etc.), which should be fine, or are test code. But I wonder about a few others, such as:

./uaclient/contract.py:    general_overrides = copy.deepcopy(entitlement.get("overrides", []))
./uaclient/apt.py:            key: copy.deepcopy(cfg.get(key)) for key in cfg.keys()
./uaclient/api/u/pro/security/fix/_common/__init__.py:    usn_pkg_status = copy.deepcopy(pkg_status)
./lib/patch_status_json.py:    new_status = copy.deepcopy(status)
./uaclient/entitlements/base.py:        return copy.deepcopy(
                                            self.machine_token_file.entitlements().get(self.name, {})
                                        )

I'm assuming if those were problems they'd have cropped up by now, but wanted to raise them for consideration in case you would like to fix those this same way.

But for this commit:
LGTM, +1

commit 8ccfbe1:

Author:     Grant

refactor: reusable http unix request function

This is a nice refactoring.

I notice it includes converting utf-8 if needed; would that be worth including in one of the test cases?

I wonder also if there's any chance of invalid JSON being returned from a response? If there's any potential of that it might be worth including such a test case, and perhaps including a check that if the output from json.loads() is neither a dict nor a list, then something's probably gone haywire. I mention this because this change is generalizing the functionality to work not only with the snap API but to be available for other stuff more generally, so even if you had been certain the snap API is immune to problems, perhaps other future services may have irregularities worth checking for.
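
The sanity check being suggested is tiny; something like this (illustrative only, not the PR's code):

    import json

    def parse_unix_socket_response(body: bytes):
        # Illustrative only: decode and parse, then reject payloads that are
        # valid JSON but neither a dict nor a list (something has probably
        # gone haywire).
        data = json.loads(body.decode("utf-8", errors="replace"))
        if not isinstance(data, (dict, list)):
            raise ValueError(
                "unexpected JSON payload type: {}".format(type(data).__name__)
            )
        return data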


README

"If you need any of those services for your machine, pro is the right
tool for you. pro is already installed on every Ubuntu system. Try it
out by running pro help!"

This phrasing feels weird to me: the first part of the 1st sentence
reads dispassionate and passive-voice, while the second part and 2nd sentence
read friendly, so it seems inconsistent. Personally I prefer the more
personable feel, so would suggest rephrasing this to something like:

"If these sound well suited for your machinery, pro is the right tool
for you.

pro is already installed on every Ubuntu system. Try it
out by running pro help!"

I like 'machinery' but equally good synonyms might be 'images',
'infrastructure', 'installations', etc.

Another part that could use some wordsmithing:

"Ubuntu Pro Client is a member of the Ubuntu family. It’s an open source
project that warmly welcomes community projects, contributions,
suggestions, fixes and constructive feedback."

@lucasmoura
Contributor Author

lucasmoura commented Oct 17, 2024

Thanks so much for the review @bryceharrington. Let me try to address all of those points. But first, I must say that I don't think we can query the PRs for the ones where we tagged external reviewers. However, I agree that this would be great for the review process, and I will discuss with the team a way to fetch that for the next release.

Now, let's address each commit:

commit ce27228:

Author:     Grant

config: lxd_guest_attach setting
+                " the guest will not be allowed to attach using the host's Pro"
+                " lx access. If set to 'available', the guest will be allowed"

That is indeed a typo (lx) and I will fix it for the release

+    elif set_key == "lxd_guest_attach":
+        if not _is_attached(cfg).is_attached:
+            raise exceptions.UnattachedError()
+        state_files.lxd_pro_config_file.write(
+            state_files.LXDProConfig(
+                guest_attach=LXDGuestAttachEnum.from_value(set_value.lower())
+            )
+        )

You are right that only attached users can change that directive. The reason for this is that we only want to configure the LXD integration if the user is attached; otherwise it doesn't apply at all. If that configuration is on and we detach the user, we will also turn it off automatically. Therefore, we believe there is no major impact if we don't allow non-attached users to even try to turn it off, as you mentioned.

commit 5d62735:

Author:     Grant Orndorff <[email protected]>

lxd: attempt to attach on launch

I agree with both log statement suggestions. I will make that change.

commit 7b5802d:

Author:     Lucas

system: add logic to collect cpu type information

I don't know if we should be that concerned about architectures that are not supported by Ubuntu, since the Pro client has no guarantees for other distros. However, do you have any suggestions on other architectures we could test here? I can definitely improve our tests based on your suggestion.

Now, regarding the ARM question, I am assuming the devicetree is being used because it is simpler and more straightforward, since we can just extract the value and use it directly. But I am tagging @cpaelzer here to provide more input on this decision.

commit a240b7e:

Author:     Renan

help: deduplicate 'pro help' command entries

We use a class variable to make sure each command can register its own
help entry to the main help page. When get_parser is called more than
once, the subparsers are instance bound but the help content gets
duplicated. We now make sure to only add entries that are not present
yet.

Good catch on all of the issues. These had already been identified when we created our UX refactor spec, and they will all be addressed in 35.1, which will be a UX-focused release.

commit 6754a6d:

Author:     Dheyay

Override applicability status for landscape

Fixes: #3331

We do have an integration test for it already, it's just not in that commit. You can see it here:
7f9d026

commit 0ed6473:

Author:     Grant

refactor: avoid pickling when we only need a copy of a dict

On xenial, this would try to pickly the UserConfigData object in the
deepcopy call, which fails for enums. The new approach uses the
DataObject's methods for converting to and from dicts, which is good
enough for this use case.

I don't think those deepcopy calls should pose an issue, especially because, as you noted, they were introduced in the Pro client a long time ago. I will double-check, but I cannot see that being an issue. Thanks for pointing it out, though.

commit 8ccfbe1:

Author:     Grant

refactor: reusable http unix request function

We do have a test that addresses the utf-8 scenario, where we are converting a non-utf8 body. However, the JSON point is valid; I will address that in the code.

README

Yeah, we have not looked at our README for a while, especially since we are now hosting our public documentation. I agree with your suggestions, and I also think we probably need to rewrite a lot of the content found in the README. However, I don't know if this release is the right place for that.

@bryceharrington
Member

Thanks for the quick reply. Yes, like I mentioned, this all felt like post-35 worries, but I'm glad to see some of the easier/obvious fixes are going to be included.

Thanks for the answers to all my questions; just one point to follow up on. Regarding the inclusion of other architectures, no, I agree it's not crucial; obviously if/when new architectures become relevant you've got a framework for adding them to the models.

My question was more about the purpose of the CPU groupings, i.e. i386+amd64, armhf+arm64, and ppc*. From a packaging POV the 32-bit and 64-bit variants are completely distinct even if the instruction sets themselves are mostly compatible. I don't think your approach is wrong; I'm more curious how you're going to use this distinction in practice. I like what you've done, as it's an intriguing abstraction, but I wonder if it's going to be a YAGNI. I didn't spot anything using it on first glance, but I'm guessing either I missed something or perhaps there are some future plans?

@bryceharrington
Member

I've finished reviewing the security vulnerability work. I did not spot any system-level functionality, as it is mainly involved in downloading data objects, parsing/processing them, and attaching them to appropriate internals. As a result, I mostly skimmed through without deep analysis; an impressive bit of coding work, so props.

I've also reviewed the last few days' subsequent commits up through 49020f4, double-checked the debian/changelog entry for this release, and done a scan through the US102 spec to understand the feature being introduced this cycle. Let me know when the remaining changes are ready for review, but overall this LGTM so far.

orndorffgrant and others added 5 commits October 17, 2024 20:30
esm-apps and esm-infra now have apt snapshots available; however,
without the apt auth configured properly, downloads from the snapshot
servers will fail. This ensures new enablements will include the
snapshot urls, and it migrates existing enablements to include the
snapshot urls on upgrade.
Add a spinning wheel when users are running the pro vulnerability list
command. This should better inform users that some processing is
happening in the background
We are now caching the result of getting the published
date of our JSON vulnerability data. The goal is to not
do multiple head requests to get that information
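
A minimal sketch of that caching, assuming a single HEAD request whose result can be reused for the rest of the process (the URL handling and function name are placeholders, not the real implementation):

    import functools
    import urllib.request

    @functools.lru_cache(maxsize=1)
    def get_vulnerability_data_published_date(url: str) -> str:
        # Placeholder sketch: issue one HEAD request per URL per process and
        # reuse the result instead of repeating it for every lookup.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request) as response:
            return response.headers.get("Last-Modified", "")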
Do not show outdated vulnerability data message if
the user is providing a static vulnerability data through
the --data-file parameter
@lucasmoura
Contributor Author

@bryceharrington regarding the CPU type collection: there shouldn't be any distinction between 32-bit and 64-bit, as we believe the proc files we use behave the same for both architectures. For example, we believe that /proc/cpuinfo for Intel or AMD should have the same format for a 32- or 64-bit architecture. However, please let us know if that is not the case, because then we will have a problem.

And we are indeed already using the CPU collection abstraction: we send that information to the Contract server when the user attaches the machine, and during our metering job, which runs daily for attached users.

Finally, we are really close to landing all of the remaining PRs. I will ping you once that is done. But thanks for already looking into the release as is; your feedback was great, and I already have a PR up addressing some of your comments. Once the team approves it, I will also tag you on it.

renanrodrigo and others added 16 commits October 23, 2024 10:12
Get rid of valid_names, called_name and related functionality, so
entitlements are called only by the name defined for them.

Signed-off-by: Renan Rodrigo <[email protected]>
The entitlement classes list now returns a different value for xenial
and bionic, which includes cis, and for the other releases, which
includes usg

Signed-off-by: Renan Rodrigo <[email protected]>
This way it is easier to test, as the return value is not determined at
import time.

Signed-off-by: Renan Rodrigo <[email protected]>
As Focal is the only series to ever have both of the services in a valid
state at different times, there is a particular path to enable USG
instead of CIS when users call `pro enable cis`. A specific message is
shown on the CLI to tell users that it changed.

Signed-off-by: Renan Rodrigo <[email protected]>
The old CIS entitlement will not be recognized on Focal and Jammy after
version 35. This converts the .list file and changes cis/ to usg/ both
in the source and authentication files when running on these series.

Signed-off-by: Renan Rodrigo <[email protected]>
We are now sorting the entitlements by name which requires
a test change
We identified a typo on the config endpoint, but when reading the
message content again, we decided to remove the whole word with the
typo
We are now logging when the config option
is turned off to help better debug situations
where a user might not know why the LXD integration
is not working.
If our JSON response is not a dict or a list,
we will now log it to better debug potential issues
with this unexpected response.
We have recently started building Minimal Pro images (as well as the
usual "Base" images). This commit adds the new licenses to the
appropriate suites in GCP_LICENSES.
On AWS, we no longer try to run auto-attach if we run the
ua-auto-attach systemd job. Because of that, the setup
we are using for the retry auto-attach test no longer works
for AWS. We are now using a different setup for AWS
Member

@bryceharrington left a comment


I went through the remaining changes last night. Thanks for including fixes to most of my earlier feedback, and I hope the others are useful for future work. Nothing in the latest set of changes raises concerns from a system perspective, and LGTM.

Running pro vulnerability is now an alias for running pro vulnerability
list
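
With argparse, that kind of alias typically comes down to treating a bare subcommand as its default action, roughly as follows (a sketch, not the actual CLI wiring):

    import argparse

    parser = argparse.ArgumentParser(prog="pro")
    subparsers = parser.add_subparsers(dest="command")
    vuln = subparsers.add_parser("vulnerability")
    vuln_sub = vuln.add_subparsers(dest="vuln_command")
    vuln_sub.add_parser("list")

    args = parser.parse_args(["vulnerability"])
    if args.command == "vulnerability" and not args.vuln_command:
        # bare "pro vulnerability" behaves like "pro vulnerability list"
        args.vuln_command = "list"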
For the realtime-kernel check, we were also checking for the
presence of the usg service in the output. We are updating
the test to no longer check for that, as it has no relation
to the feature being tested.