-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: Improve ORAS diagnose experience #1483
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Feynman Zhou <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1483 +/- ##
==========================================
+ Coverage 85.54% 86.05% +0.51%
==========================================
Files 113 118 +5
Lines 3936 4224 +288
==========================================
+ Hits 3367 3635 +268
- Misses 342 352 +10
- Partials 227 237 +10 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, if a user decides to get verbose output, he or she will be prepared to receive a lot of information in the output (including very technical information - for example, curl gives details about the encryption algorithm). I am still not convinced that we need to complicate the interface to suppress some of the output.
> Request headers: | ||
"User-Agent": "oras/1.2.0+Homebrew" | ||
"Accept": "application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, application/vnd.oci.artifact.manifest.v1+json" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to insert separators like -----
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to add an empty line as separator between each request and response for readability since this is a very common design. Do you have better ideas on improving the debug logs' readability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an empty line between incoming and outgoing message
- Example: `DEBUG: [2023-10-01T12:00:00Z] Starting metadata retrieval for repository oras-demo` | ||
|
||
### 2. **Capture API-Specific Details** | ||
- **API Requests:** Log detailed information about API requests made to the registry server, including the HTTP method, endpoint, headers, and body (excluding sensitive information). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, we should not exclude sensitive information for debug output for two reasons:
- it is nearly impossible to redact all sensitive information
- it may actually be useful to see the sensitive information in the logs
As some kind of compromise maybe attempt no redaction at the highest level of debug if we were to have levels. Ansible used this policy and I think it was very useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be insecure and non-compliant to log sensitive information in ORAS debug logs?
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
- Poor readability of debug logs. No separator lines between request and response information. Users need to add separator lines manually for readability. See the relevant issue [#1382](https://github.com/oras-project/oras/issues/1382). | ||
- Critical information is missing in debug logs. For example, the [error code](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#error-codes) and metadata of the processed resource object (e.g. image manifest) are not displayed. | ||
- The detailed operation information is missing in verbose output. For example, how many and where are objects processed. Less or no verbose output of ORAS commands in some use cases. | ||
- Timestamp of each request and response is missing in debug logs, which is hard to trace historical operation and trace the sequence of events accurately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on the OS and environment.
You may observe the differences by running the following commands:
oras manifest fetch -d --descriptor ghcr.io/oras-project/oras:v1.2.0
oras manifest fetch -d --descriptor ghcr.io/oras-project/oras:v1.2.0 2>&1 | cat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oras manifest fetch -d --descriptor ghcr.io/oras-project/oras:v1.2.0
gives
DEBU[0000] Request #0
while
oras manifest fetch -d --descriptor ghcr.io/oras-project/oras:v1.2.0 2>&1 | cat
reports
time=2024-09-14T15:17:52+08:00 level=debug msg=Request #0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on this example, it looks like timestamp is not visible to users in debug logs by default. Users have to output the timestamp information with an additional command cat
.
How about making timestamp information to each request and response by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on logrus. @Wwwsylvia Could you provide options in logrus?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested a bit. We can enable TextFormatter.FullTimestamp
to achieve the goal:
logger.SetFormatter(&logrus.TextFormatter{
DisableQuote: true,
+ FullTimestamp: true,
})
As a result, logs in TTY will show as:
DEBU[2024-09-23T18:56:49+08:00] Request #0
And logs in non-TTY will show as:
time=2024-09-23T18:59:56+08:00 level=debug msg=Request #0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thus, it is technically feasible to make full timestamp information to each request and response by default.
- **State Logs:** Log important state information, function names, important variable state, file name and line number before and after key operations or decisions. | ||
- Example: `DEBUG: Current retry attempt: 1, Max retries: 3` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not apply to implementation inside oras-go
. In other words, no details in oras.Copy()
will be output. The reason is simple that logging is tightly coupled with the specific logging implementation while oras-go
should be generic.
Other than logurs
, there are other logging implementations like log
, log/slog
, zap
, apex/log
, etc. Unfortunately, they don't share the same interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that all logs will be hooks in real implementations.
- Example: `DEBUG: Current registry URL: https://myregistry.io, Timeout setting: | ||
|
||
### 6. **Avoid Logging Sensitive Information** | ||
- **Privacy:** Abstain from logging sensitive information such as passwords, personal data, or security tokens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define "sensitive information".
Signed-off-by: Feynman Zhou <[email protected]>
Signed-off-by: Feynman Zhou <[email protected]>
@@ -52,7 +52,7 @@ Logs focus on providing technical details for in-depth diagnosing and troublesho | |||
## Proposals for ORAS CLI | |||
|
|||
- Deprecate the global flag `--verbose` and only remain `--debug` to avoid ambiguity. Based on the concept above, it is reasonable to continue using `--debug` to enable the output of `DEBUG` level logs in ORAS as it is in ORAS. Meanwhile, This change will make the diagnose experience much more straightforward and less breaking since only ORAS `pull/push/attach/discover` commands have verbose output. | |||
- The existing output of `--verbose` in several ORAS commands `pull/push/attach/discover` can still be reserved. Introducing an additional flag to these commands. | |||
- Introduce a new flag `--detail` to replace the existing global flag `--verbose` of commands like `pull`, `push`, `attach`, and `discover` for detailed output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to discuss, how about --long
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think --detail
is a good option, since it will cause similar confusion as --verbose
. Questions like is it about detailed logs or about detailed other outputs?
One option is output the --verbose
content by default, then we do not need any flags for verbose output that are not logs. This also solves the problem that some commands have --verbose
flags, some don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will --debug
be a global flag, or just available for some commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of --detail
, I think we can have --quiet
for pull
, push
, and attach
.
discover
is different since the --verbose
takes effect on the metadata output instead of the status output. In that case, we can just remove --verbose
and introduce new values for --format
such as --format tree-full
. It is also a good time to revisit the outputs of oras discover
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @shizhMSFT . Current verbose output in pull
, push
, attach
is progress information but in discover
it is metadata output. They are different cases.
|
||
### Output | ||
|
||
There are three types of output in ORAS CLI: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think log is also one type of output. An "output” of a command refers to the information that is displayed or returned after the command is executed, it can be logs, formatted data, error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logs are logs, and logs may not be displayed or returned. They may be collected in systemd
, or sent to fluentd
.
> Request headers: | ||
"User-Agent": "oras/1.2.0+Homebrew" | ||
"Accept": "application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, application/vnd.oci.artifact.manifest.v1+json" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an empty line between incoming and outgoing message
@@ -52,7 +52,7 @@ Logs focus on providing technical details for in-depth diagnosing and troublesho | |||
## Proposals for ORAS CLI | |||
|
|||
- Deprecate the global flag `--verbose` and only remain `--debug` to avoid ambiguity. Based on the concept above, it is reasonable to continue using `--debug` to enable the output of `DEBUG` level logs in ORAS as it is in ORAS. Meanwhile, This change will make the diagnose experience much more straightforward and less breaking since only ORAS `pull/push/attach/discover` commands have verbose output. | |||
- The existing output of `--verbose` in several ORAS commands `pull/push/attach/discover` can still be reserved. Introducing an additional flag to these commands. | |||
- Introduce a new flag `--detail` to replace the existing global flag `--verbose` of commands like `pull`, `push`, `attach`, and `discover` for detailed output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think --detail
is a good option, since it will cause similar confusion as --verbose
. Questions like is it about detailed logs or about detailed other outputs?
One option is output the --verbose
content by default, then we do not need any flags for verbose output that are not logs. This also solves the problem that some commands have --verbose
flags, some don't.
|
||
### Guiding Principles | ||
|
||
Here are the guiding principles to write debug logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest we can just select a popular logging library or tool and follow the formats that are defined by the library or tool, because I don't think it is worth spending time to develop/customize our own logging formats, unless it is easy to customize using the library or tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -52,7 +52,7 @@ Logs focus on providing technical details for in-depth diagnosing and troublesho | |||
## Proposals for ORAS CLI | |||
|
|||
- Deprecate the global flag `--verbose` and only remain `--debug` to avoid ambiguity. Based on the concept above, it is reasonable to continue using `--debug` to enable the output of `DEBUG` level logs in ORAS as it is in ORAS. Meanwhile, This change will make the diagnose experience much more straightforward and less breaking since only ORAS `pull/push/attach/discover` commands have verbose output. | |||
- The existing output of `--verbose` in several ORAS commands `pull/push/attach/discover` can still be reserved. Introducing an additional flag to these commands. | |||
- Introduce a new flag `--detail` to replace the existing global flag `--verbose` of commands like `pull`, `push`, `attach`, and `discover` for detailed output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will --debug
be a global flag, or just available for some commands?
|
||
### Output | ||
|
||
There are three types of output in ORAS CLI: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logs are logs, and logs may not be displayed or returned. They may be collected in systemd
, or sent to fluentd
.
|
||
### Logs | ||
|
||
Logs focus on providing technical details for in-depth diagnosing and troubleshooting issues. It is intended for developers or technical users who need to understand the inner workings of the tool. Debug logs are detailed and technical, often including HTTP request and response from interactions between client and server, as well as code-specific information. In general, there are different levels of log including `DEBUG`, `INFO`, `WARNING`, `ERROR`, but only `DEBUG` level log is used in ORAS, which is controlled by the flag `--debug`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some log implementation has the TRACE
level, and the FATAL
level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the description as follows:
In general, there are different levels of logs. Logrus has been used by ORAS, which has seven logging levels:
Trace
,Debug
,Info
,Warning
,Error
,Fatal
andPanic
, but onlyDEBUG
level log is used in ORAS, which is controlled by the flag--debug
.
- Introduce a new flag `--detail` to replace the existing global flag `--verbose` of commands like `pull`, `push`, `attach`, and `discover` for detailed output. | ||
- Add separator lines between each request and response for readability. | ||
- Add timestamp of each request and response to the beginning of each request and response. | ||
- Add the response body including [error code](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#error-codes) and the metadata of processed OCI object (e.g. image manifest) to the debug logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it mean? Is it applicable to OCI image layout?
- Add separator lines between each request and response for readability. | ||
- Add timestamp of each request and response to the beginning of each request and response. | ||
- Add the response body including [error code](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#error-codes) and the metadata of processed OCI object (e.g. image manifest) to the debug logs | ||
- Define guiding principles and conventions for writing clear and conductive debug logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose / motivation of this?
|
||
### Guiding Principles | ||
|
||
Here are the guiding principles to write debug logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- **Catch and Log Exceptions:** Always catch exceptions and log them with relevant context and stack traces. | ||
- Example: `ERROR: Exception occurred in fetchManifest: Network timeout while accessing /v2/oras-demo/manifests/latest` | ||
|
||
- **Error Codes:** Include specific error codes to facilitate quick identification and resolution. | ||
- Example: `ERROR: [ErrorCode: 504] Network timeout while accessing /v2/oras-demo/manifests/latest` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, oras
only use debug logs.
ORAS Version: 1.2.0+Homebrew | ||
Go version: go1.22.3 | ||
OS/Arch: linux/amd64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing git information
Signed-off-by: Feynman Zhou <[email protected]>
|
||
### 1. **Timestamp Each Log Entry** | ||
- **Precise Timing:** Ensure each log entry has a precise timestamp to trace the sequence of events accurately. | ||
- Example: `DEBUG: [2023-10-01T12:00:00Z] Starting metadata retrieval for repository oras-demo` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it discussed at all to put the timestamp before the log level
- Example: `DEBUG: [2023-10-01T12:00:00Z] Starting metadata retrieval for repository oras-demo` | |
- Example: `[2023-10-01T12:00:00Z] DEBUG: Starting metadata retrieval for repository oras-demo` |
What this PR does / why we need it:
This proposal document aims to:
--verbose
and--debug
options.Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #1091