docs: describe all data collection #3216
@@ -0,0 +1,86 @@
What data does Canonical collect from Ubuntu Pro machines?
I'd probably suggest changing this header to "what data does Canonical collect through the Ubuntu Pro Client"
The reason being that the contents of the page are relevant to people who haven't attached yet (and want the info to know if they want to attach), or who have detached and aren't using Pro anymore (can we consider their machine an Ubuntu Pro machine?)
Two of the sections are data not collected directly through the Ubuntu Pro Client, but the collection happens because you used the pro-client to attach to Pro. So I'm not sure "collect through the Ubuntu Pro Client" is quite accurate.
A detached machine shouldn't be considered an Ubuntu Pro machine - they are in the same bucket as never attached machines for the purpose of this doc.
None of this data is collected for unattached machines, and I don't think the current title would prevent someone wondering about data collection from looking at it. We could rename it to "What data does Canonical collect from Ubuntu machines that are attached to an Ubuntu Pro subscription", but that felt unnecessarily verbose.
IDK though, I don't have a strong opinion on the title here, just wanted to list my thoughts before changing it. What do you think?
I think that depends on how long we keep data for. Is machine data purged when the machine is detached?
Data is not purged on detach - so a detached machine would have had data collected while it was Ubuntu Pro, and after it is detached, no more data will be collected, but data will exist on the backend for some time.
If data isn't purged on detach, then I think there is a difference between detached and never-attached machines since users who care about this stuff want to know what we collect, why, and how long we expect to keep that data for.
I think tweaking the title to say "What data is collected from active Ubuntu Pro machines?" would be enough to satisfy the distinction, especially if we can also provide info on how long it takes before collected info is purged (although I wouldn't consider that a blocker).
I think that makes sense!
left some suggested changes! thanks grant!
Looks good! Some suggestions
These data elements are collected to ensure machines that are attached to a
particular Ubuntu Pro contract are compliant with the terms of that particular
contract.
Could any of this info/data be considered as personally identifiable?
Do we know roughly how long is data kept for?
I don't think any of this counts as personally identifiable on its own, but it is connected to an Ubuntu Pro account on the backend via a machine id.
Do we know roughly how long is data kept for?
I don't know the answer to this one. Tagging @pandrey2003 and @alnvdl-work
I won't consider this as a blocker to getting this merged. We can always add a section later at the bottom here about data retention.
Co-authored-by: Sally <[email protected]>
**********************************************************

Some system data is sent to Canonical servers for the purpose of delivering
Ubuntu Pro services in compliance with the terms of the Ubuntu Pro subscriptio
s/subscriptio/subscription
whoops 🙃
This document categorises data collection by method of collection.

APT package downloads
I could be wrong here, but it seems that the "APT packages downloads" and "Livepatch downloads" sections are not directly tied to data collection per se. They seem more like "data used per service" than a collection of some sort.
I think there is still value in those sections, but I would not put them under "data collection". We could create something like "service data needs" for them.
I think that's a good idea :)
Yeah, that's a good point: this is data that is sent to Canonical servers for the purposes of using the services. I'll rework the structure of this a bit to make that clearer.
@s-makin I've restructured the document and headers around this. All the content is the same, but I'm not sure about the new header "Data sent in order to provide service" - do you have any better ideas?
LGTM, thanks for your patience!
Fixes: #2894