provider: enable k3d cluster log export helper #485

Draft · wants to merge 1 commit into base: main

Conversation

harshanarayana (Contributor)

What type of PR is this?

/kind feature

What this PR does / why we need it:

Enable the k3d provider's ability to export cluster logs. The changes required to enable this were added to k3d in k3d-io/k3d#1471.

With those changes merged, we can now add support for exporting the logs.

Which issue(s) this PR fixes:

NA

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added the ability to export cluster logs for the `k3d`-based cluster provider.

Additional documentation e.g., Usage docs, etc.:


@k8s-ci-robot (Contributor)

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jan 17, 2025
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshanarayana

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 17, 2025
@harshanarayana harshanarayana force-pushed the feature/enable/log-export-for-k3d-provider branch from 71cdeda to edabcee Compare January 17, 2025 10:47
@harshanarayana (Contributor, Author)

/assign @cpanato

PTAL when you can. @vladimirvivien @cpanato

@harshanarayana (Contributor, Author) left a comment:

/hold until I finish validating this and add an Example for using this

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 17, 2025
@cpanato (Member) left a comment:

thanks

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 17, 2025
}
return nil
} else {
log.Warning("ExportLogs not implemented for k3d. Please use regular kubectl like commands to extract the logs from the cluster")
A contributor commented:

Do we know what that kubectl command would be to extract logs? Could we just use it as a fallback instead of just reporting a warning?

@harshanarayana (Contributor, Author) replied:

@vladimirvivien Actually, that "kubectl like" wording is probably a mistake. Just kubectl won't be enough; we will also have to gather a few more logs. And the problem is that we have to find a way to group them in a proper format, which can be tricky (and duplicates the work of replicating that logic).

If we want to add an export mechanism, we might be better off writing some mechanism that can do this for any provider that doesn't have a log export support.

vcluster doesn't support it, and the same is true for kwok.

For kwok there is an ad hoc kubectl logs equivalent of export, but it only covers kwok's own components.

How do we go about doing this?

@harshanarayana (Contributor, Author) added:

We also have to worry about fetching the journalctl logs, dmesg, and any other applicable logs. It might not be a bad idea to implement something generic that can be reused by any future provider that lacks log export capabilities.

@harshanarayana (Contributor, Author) proposed the following collector:

package utils

import (
	"bytes"
	"context"
	"fmt"
	"os"
	"path/filepath"

	v1 "k8s.io/api/core/v1"
	log "k8s.io/klog/v2"
	"sigs.k8s.io/e2e-framework/klient/k8s/resources"
	"sigs.k8s.io/e2e-framework/pkg/types"
)

// LogCollector gathers per-container pod logs from a cluster into a
// local directory tree rooted at baseDir.
type LogCollector struct {
	resourceFetcher *resources.Resources
	baseDir         string
}

func NewLogCollector(provider types.E2EClusterProvider, clusterName, destination string) (*LogCollector, error) {
	baseDir := filepath.Join(destination, fmt.Sprintf("debug-logs-%s", clusterName))
	if err := os.MkdirAll(baseDir, os.FileMode(0755)); err != nil {
		log.ErrorS(err, "failed to create base dir required to collect the logs", "dir", destination)
		return nil, err
	}

	resourceFetcher, err := resources.New(provider.KubernetesRestConfig())
	if err != nil {
		log.ErrorS(err, "failed to create resource fetcher")
		return nil, err
	}

	return &LogCollector{
		resourceFetcher: resourceFetcher,
		baseDir:         baseDir,
	}, nil
}

func (lc *LogCollector) CollectLogs(ctx context.Context) error {
	var namespaces v1.NamespaceList
	if err := lc.resourceFetcher.List(ctx, &namespaces); err != nil {
		log.ErrorS(err, "failed to list namespaces in the cluster")
		return err
	}

	for _, ns := range namespaces.Items {
		if err := lc.collectNamespaceLogs(ctx, ns.Name); err != nil {
			return err
		}
	}
	return nil
}

func (lc *LogCollector) collectNamespaceLogs(ctx context.Context, namespace string) error {
	log.V(3).InfoS("Collecting POD information for namespace", "namespace", namespace)
	var pods v1.PodList
	if err := lc.resourceFetcher.WithNamespace(namespace).List(ctx, &pods); err != nil {
		log.ErrorS(err, "failed to list pods in the namespace", "namespace", namespace)
		return err
	}

	for _, pod := range pods.Items {
		if err := lc.collectPodLogs(ctx, namespace, pod); err != nil {
			return err
		}
	}
	return nil
}

func (lc *LogCollector) collectPodLogs(ctx context.Context, namespace string, pod v1.Pod) error {
	// Mirror the kubelet's /var/log/pods layout: <namespace>_<pod>_<uid>.
	// Static pods are keyed by their config hash annotation instead of the UID.
	uid := string(pod.GetUID())
	if hash, ok := pod.GetAnnotations()["kubernetes.io/config.hash"]; ok {
		uid = hash
	}
	podBaseDir := filepath.Join(lc.baseDir, fmt.Sprintf("%s_%s_%s", namespace, pod.Name, uid))
	if err := os.MkdirAll(podBaseDir, os.FileMode(0755)); err != nil {
		return err
	}

	containers := append(pod.Spec.Containers, pod.Spec.InitContainers...)
	containerStatus := append(pod.Status.ContainerStatuses, pod.Status.InitContainerStatuses...)

	for _, container := range containers {
		if err := lc.collectContainerLogs(ctx, namespace, pod.Name, container, containerStatus, podBaseDir); err != nil {
			return err
		}
	}
	return nil
}

func (lc *LogCollector) collectContainerLogs(ctx context.Context, namespace, podName string, container v1.Container, containerStatus []v1.ContainerStatus, podBaseDir string) error {
	containerBaseDir := filepath.Join(podBaseDir, container.Name)
	if err := os.MkdirAll(containerBaseDir, os.FileMode(0755)); err != nil {
		return err
	}
	log.V(3).InfoS("Collecting logs for pod", "namespace", namespace, "pod", podName, "container", container.Name)

	var podLog bytes.Buffer
	if err := lc.resourceFetcher.GetPodLog(ctx, namespace, podName, container.Name, &podLog); err != nil {
		return err
	}

	restartCount := 0
	for _, cs := range containerStatus {
		if cs.Name == container.Name {
			restartCount = int(cs.RestartCount)
			break
		}
	}

	if err := os.WriteFile(filepath.Join(containerBaseDir, fmt.Sprintf("%d.log", restartCount)), podLog.Bytes(), os.FileMode(0644)); err != nil {
		return err
	}
	return nil
}

@cpanato @vladimirvivien Would something like this help? We can also use the NodeLogQuery feature to fetch most of the node logs that we might want to collect.
