Profiling: upgrade guide for self-hosted profiling agents (#3854)

Signed-off-by: inge4pres <[email protected]>
Co-authored-by: Mike Birnstiehl <[email protected]>
elastic · May 6, 2024 · f0fae70
1 parent 7697da2

Showing 3 changed files with 62 additions and 24 deletions.
diff --git a/docs/en/observability/profiling-get-started.asciidoc b/docs/en/observability/profiling-get-started.asciidoc
@@ -164,8 +164,6 @@ Consider the following when configuring your Universal Profiling Agent:
 
 * Before {stack} version 8.9 the Universal Profiling Agent versioning scheme was **not aligned with the {stack} version scheme**.
 
-* The OS packages downloaded from `releases.prodfiler.com` have a version in their file name.
-
 * You can find a list of container image versions in the
 https://container-library.elastic.co/r/observability/profiling-agent[Elastic container library repository].
 

diff --git a/docs/en/observability/profiling-self-managed-ops.asciidoc b/docs/en/observability/profiling-self-managed-ops.asciidoc
@@ -11,7 +11,7 @@ This page outlines operating the backend when running Universal Profiling on a s
 * <<profiling-self-managed-ops-configuration, Configuring your collector and symbolizer>>
 * <<profiling-self-managed-ops-monitoring, Monitoring your collector and symbolizer>>
 * <<profiling-scaling-backend-resources, Scaling your resources>>
-* <<profiling-upgrade-backend-bin, Upgrading backend binaries>>
+* <<profiling-self-managed-upgrade, Upgrading backend binaries>>
 
 [discrete]
 [[profiling-self-managed-ops-sizing-guidance]]
@@ -38,8 +38,9 @@ NOTE: This table is derived from benchmarks performed on Universal Profiling wit
 The profiled machines had a near-constant load of 75% CPU utilization.
 The deployment used 3 Elasticsearch nodes with 64 GB memory each, 8 vCPU, and 1.5 TB NVMe disk drives.
 
-Because resource demand is nearly proportional to the amount of data the Universal Profiling Agents generate, you can calculate the necessary resources for use cases beyond those in the table by comparing your actual number of cores profiled with the number of cores in the table.
-When calculating, factor in the following:
+Resource demand is nearly proportional to the amount of data the Universal Profiling Agents generate.
+Therefore, you can calculate the necessary resources for use cases beyond those in the table by comparing your actual number of cores profiled with the number of cores in the table.
+When calculating, consider the following:
 
 * The average load of the machines being profiled: The average load directly impacts the amount of CPU samples collected. For example, on a system that is mostly idle, not all CPUs will be scheduling tasks during the sampling intervals.
 * The rate of change of the executables being profiled—for example, how often you deploy new versions of your software: The rate of change impacts the amount of debug metadata stored in Elasticsearch as a result of symbolization; the more different executables the Universal Profiling Agent collects, the more debug data will be stored in Elasticsearch. Note that two different builds of the same application still result in two different executables, as the Universal Profiling Agent will treat each ELF file independently.
@@ -185,44 +186,79 @@ When scaling out the number of replicas, depending on the load balancer that you
 This ensures that the load is evenly distributed across all replicas.
 
 [discrete]
-[[profiling-upgrade-backend-bin]]
-== Upgrade the backend binaries
+[[profiling-self-managed-upgrade]]
+== Upgrade a self-hosted stack
 
-Upgrade the backend binaries whenever you upgrade the rest of the Elastic stack.
-While we try to keep backward compatibility between two consecutive minor version, we may introduce changes to the data format that require the applications to be at the same version of Elasticsearch and Kibana.
+Upgrading a self-hosted stack involves upgrading the backend applications and the agent.
+We recommend upgrading the backend first, followed by the agent. This way, if you encounter problems with the backend, you can roll back to the previous version without needing to downgrade the agent.
+
+NOTE: We recommend having the same version of the agent and the backend deployed.
+
+We strive to maintain backward compatibility between minor versions.
+Occasionally, changes to the data format may require having the same version of the agent and backend deployed.
+When a breaking change in the protocol is introduced, the profiling agents that are not up to date will stop sending data.
+The agent logs will report an error message indicating that the backend is not compatible with the agent (or vice versa).
 
 The upgrade process steps vary depending on the installation method used.
+You may have a combination of installation methods. For example, you might deploy the backend on ECE and the agents on Kubernetes.
+In that case, refer to the backend or agent instructions in the section for each method.
+
+IMPORTANT: Depending on your infrastructure setup, upgrading the backend may also update the endpoint exposed by the collector.
+In this case, amend the agent configuration to connect to the new endpoint upon upgrade.
 
 [discrete]
-[[profiling-backend-upgrade-ece]]
+[[profiling-self-managed-upgrade-ece]]
 === ECE
 
-When using ECE, the upgrade process is managed by the platform itself.
-You don't need to perform any action to upgrade the backend binaries.
+When using ECE, the upgrade process of the backend is part of the installation of a new ECE release.
+You don't need to perform any action to upgrade the backend applications, as they will be upgraded automatically.
+
+For the agent deployment, you can upgrade the Fleet integration installed on the Elastic Agent if that's how you're deploying the agent.
 
 [discrete]
-[[profiling-backend-upgrade-k8s]]
-=== Kubernetes
+[[profiling-self-managed-upgrade-k8s]]
+=== ECK or generic Kubernetes
 
-Perform a helm upgrade using the `helm upgrade` command.
+Perform a Helm upgrade of the backend charts using the `helm upgrade` command.
 You may reuse existing values or provide the full values YAML file on each upgrade.
 
+For the agent deployment, upgrading through the Helm chart is also the simplest option.
+
+IMPORTANT: In version 8.15, the agent Helm chart was renamed from `pf-host-agent` to `profiling-agent`.
+
+When **upgrading to 8.15 from 8.14 or lower**, follow these additional instructions:
+
+. Fetch the currently applied Helm values:
+
+        helm -n universal-profiling get values pf-host-agent -oyaml > profiling-agent-values.yaml
+
+. Update the repo to find the new chart:
+
+        helm repo update
+
+. Uninstall the old chart:
+
+        helm -n universal-profiling uninstall pf-host-agent
+
+. Install the new chart by following the instructions displayed in the Universal Profiling "Add Data" page or with the following command:
+
+        helm install -n universal-profiling universal-profiling-agent elastic/profiling-agent -f profiling-agent-values.yaml
+
 [discrete]
-[[profiling-backend-upgrade-os]]
+[[profiling-self-managed-upgrade-os]]
 === OS packages
 
 Upgrade the package version using the OS package manager.
-Not all package managers will call into `systemd` to restart the service, so you may need to restart the service manually or through any other automation in place.
+You will find the names of the new packages, and links to them, on the "Add Data" page.
+
+Not all package managers will call into `systemd` to restart the service,
+so you may need to restart the service manually or through any other automation in place.
 
 [discrete]
-[[profiling-backend-upgrade-binaries]]
+[[profiling-self-managed-upgrade-binaries]]
 === Binaries
 
 Download the corresponding binary version and replace the existing one, using the command seen in the <<profiling-self-managed-running-linux-binary, Binary>> section of the setup guide.
 Replace the old binary and restart the services.
 
-[discrete]
-[[profiling-backend-upgrade-containers]]
-=== Containers
-
-Pull the new container image, and replace the existing image with the new image.
+You will find the links to the new binaries on the "Add Data" page, under the "Binary" tab.
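
The four Helm steps added for the `pf-host-agent` to `profiling-agent` rename could be scripted roughly as follows. This is a minimal sketch and not part of the commit. It assumes the uninstall step targets the same `universal-profiling` namespace as the other commands. The `DRY_RUN` switch, the `run` helper, and the variable names are hypothetical and exist only for this illustration. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
set -eu

# Values taken from the commands in the upgrade steps; override via the environment.
NAMESPACE="${NAMESPACE:-universal-profiling}"
OLD_RELEASE="${OLD_RELEASE:-pf-host-agent}"
NEW_RELEASE="${NEW_RELEASE:-universal-profiling-agent}"
VALUES_FILE="${VALUES_FILE:-profiling-agent-values.yaml}"
# Hypothetical safety switch: 1 prints each command, 0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

migrate_agent_chart() {
  # 1. Save the Helm values currently applied to the old release.
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "helm -n $NAMESPACE get values $OLD_RELEASE -oyaml > $VALUES_FILE"
  else
    helm -n "$NAMESPACE" get values "$OLD_RELEASE" -oyaml > "$VALUES_FILE"
  fi
  # 2. Refresh the repo index so the renamed chart can be found.
  run helm repo update
  # 3. Remove the release installed from the old chart.
  run helm -n "$NAMESPACE" uninstall "$OLD_RELEASE"
  # 4. Install the renamed chart, reusing the saved values.
  run helm install -n "$NAMESPACE" "$NEW_RELEASE" elastic/profiling-agent -f "$VALUES_FILE"
}

migrate_agent_chart
```

Reviewing the printed commands first and then rerunning with `DRY_RUN=0` keeps the uninstall step from running before the saved values file has been checked.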
diff --git a/docs/en/observability/profiling-upgrade.asciidoc b/docs/en/observability/profiling-upgrade.asciidoc
@@ -5,6 +5,10 @@
 <titleabbrev>Upgrade</titleabbrev>
 ++++
 
+This section is specific to upgrading Universal Profiling on {ecloud} or {ece}.
+
+For self-hosted installations, refer to <<profiling-self-managed-upgrade,Upgrade Universal Profiling in self-hosted installations>>.
+
 [discrete]
 [[profiling-upgrade-process]]
 == Upgrade process
@@ -13,7 +17,7 @@ To upgrade from all versions earlier than 8.10 (GA), you need to:
 
 . Perform a stack upgrade in {ecloud}
 . Stop profiling data ingestion
-. Delete existing profiling data 
+. Delete existing profiling data
 . Setup Universal Profiling from scratch
 . Start profiling data ingestion