In this tutorial, you will learn how to deploy Velero to your Kubernetes cluster, create backups, and recover from a backup if something goes wrong. You can back up your entire cluster, or optionally choose a namespace or label selector to back up.
Backups can be run one-off or scheduled. It’s a good idea to have scheduled backups, so that you always have a recent backup to fall back to. You can also create backup hooks, if you want to execute actions before or after a backup is made.
Why choose Velero?
Velero gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes. You can run Velero with a cloud provider or on-premises.
Advantages of using Velero:
- Take backups of your cluster and restore in case of loss.
- Migrate cluster resources to other clusters.
- Replicate your production cluster to development and testing clusters.
After finishing this tutorial, you will be able to:
- Back up and restore your applications, or your entire DOKS cluster.
- Create scheduled backups for your applications.
- Configure DO Spaces storage for Velero to use.
- Introduction
- Prerequisites
- Step 1 - How Velero Works
- Step 2 - Installing Velero
- Step 3 - Namespace Backup and Restore Example
- Step 4 - Backup and Restore Whole Cluster Example
- Step 5 - Scheduled Backup and Restore
- Step 6 - Deleting Backups
- Conclusion
To complete this tutorial, you need the following:
- A DO Spaces bucket and access keys. Save the access and secret keys in a safe place for later use.
- A DigitalOcean API token for Velero to use.
- A Git client, to clone the Starter Kit repository.
- Helm, for managing Velero releases and upgrades.
- Doctl, for DigitalOcean API interaction.
- Kubectl, for Kubernetes interaction.
- Velero client, to manage Velero backups.
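Before moving on, it can help to confirm that the client tools from the list above are actually installed. The following is a minimal sketch, not part of the Starter Kit: `missing_tools` is a hypothetical helper name, and it only checks that each command is on your PATH, not which version is installed.

```shell
# Sketch: report which of the given commands are not found on PATH.
# missing_tools is a hypothetical helper, introduced for illustration.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status when at least one tool was not found.
  return "$missing"
}

# Usage: missing_tools git helm doctl kubectl velero
```

The function prints one line per missing tool and returns a non-zero status if anything is missing, so it can gate a setup script.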
Velero consists of two parts:
- A server that runs on your cluster
- A command-line client that runs locally
Each Velero operation (on-demand backup, scheduled backup, restore) is a custom resource, defined with a Kubernetes Custom Resource Definition (CRD) and stored in etcd. Velero also includes controllers that process the custom resources to perform backups, restores, and all related operations.
Whenever you execute a backup command, the Velero CLI makes a call to the Kubernetes API server to create a Backup object. The Backup Controller then validates the Backup object (for example, whether it is a cluster backup or a namespace backup) and makes a call to the API server to query the data to be backed up. Once the data is collected, it starts the backup process. Finally, the Backup Controller makes a call to DigitalOcean Spaces to store the backup file. The backup file is stored as a tarball (.tar.gz).
Similarly, whenever you execute a restore command, the Velero CLI makes a call to the Kubernetes API server to create a Restore object that references the backup. Based on the restore command executed, the Velero Restore Controller makes a call to DigitalOcean Spaces and initiates the restore from that particular backup.
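To make the "backup is a custom resource" idea concrete, the sketch below prints a Backup manifest of the kind the CLI asks the API server to create. It is an illustration under assumptions: `render_backup_manifest` is a hypothetical helper, and the TTL and storage location values are simply the defaults that appear in the command output later in this tutorial.

```shell
# Sketch: print a Velero Backup custom resource for one namespace.
# render_backup_manifest is a hypothetical helper for illustration.
render_backup_manifest() {
  local name="$1" namespace="$2"
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: ${name}
  namespace: velero
spec:
  includedNamespaces:
    - ${namespace}
  storageLocation: default
  ttl: 720h0m0s
EOF
}

# Print the manifest for review. On a cluster where Velero is installed,
# it could be piped to "kubectl apply -f -" instead.
render_backup_manifest ambassador-backup ambassador
```

Seeing the object spelled out shows why the controllers can work asynchronously: the CLI only records the desired backup, and the Backup Controller picks it up from the API server.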
Below is a diagram that shows the Backup/Restore workflow:
Velero is ideal for the disaster recovery use case, as well as for snapshotting your application state prior to performing system operations on your cluster, like upgrades. For more details on this topic, please visit the How Velero Works official page.
In the next step, you will learn how to install Velero, using Helm.
In this step, you will deploy Velero and all the required components, so that it will be able to perform backups for your Kubernetes cluster resources (PVs as well). The backup data will be stored in the DO Spaces bucket created earlier in the Prerequisites section.
Steps to follow:
1. First, clone the Starter Kit Git repository and change directory to your local copy:
   git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
   cd Kubernetes-Starter-Kit-Developers
2. Next, add the Helm repository and list the available charts:
   helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
   helm search repo vmware-tanzu
   The output looks similar to the following:
   NAME                  CHART VERSION   APP VERSION   DESCRIPTION
   vmware-tanzu/velero   2.23.6          1.6.3         A Helm chart for velero
   Note: The chart of interest is vmware-tanzu/velero, which will install Velero on the cluster. Please visit the velero-chart page for more details about this chart.
3. Then, open and inspect the 06-setup-velero/assets/manifests/velero-values.yaml file provided in the Starter Kit repository, using an editor of your choice (preferably with YAML lint support). You can use VS Code, for example:
   code 06-setup-velero/assets/manifests/velero-values.yaml
4. Next, replace the <> placeholders accordingly for your DO Spaces Velero bucket (name, region and secrets). Make sure that you provide your DigitalOcean API token as well (DIGITALOCEAN_TOKEN key).
5. Finally, install Velero using Helm:
   HELM_CHART_VERSION="2.23.6"
   helm install velero vmware-tanzu/velero --version "${HELM_CHART_VERSION}" \
     --namespace velero \
     --create-namespace \
     -f 06-setup-velero/assets/manifests/velero-values-v${HELM_CHART_VERSION}.yaml
   Note: A specific version for the Helm chart is used. In this case 2.23.6 is picked, which maps to the 1.6.3 version of the application (see the output from step 2). It’s good practice in general to lock on a specific version. This helps to have predictable results, and allows versioning control via Git.
Now, please check your Velero deployment:
helm ls -n velero
The output looks similar to the following (the STATUS column should display deployed):
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
velero velero 1 2021-08-25 13:16:24.383446 +0300 EEST deployed velero-2.23.6 1.6.3
Next, verify that Velero is up and running:
kubectl get deployment velero -n velero
The output looks similar to the following (all pods must be in the Ready state):
NAME READY UP-TO-DATE AVAILABLE AGE
velero 1/1 1 1 67s
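Instead of re-running the command until the READY column shows 1/1, you can let kubectl wait for the rollout. This is a small sketch: `wait_for_velero` is a hypothetical wrapper name and the 120 second default is an arbitrary choice, while `kubectl rollout status` with `--timeout` is standard kubectl.

```shell
# Sketch: block until the velero Deployment finishes rolling out,
# or give up after the timeout. wait_for_velero is a hypothetical name.
wait_for_velero() {
  local timeout="${1:-120s}"
  kubectl rollout status deployment/velero \
    --namespace velero \
    --timeout="$timeout"
}

# Usage: wait_for_velero 60s
```

The command exits with a non-zero status if the rollout does not complete in time, which makes it convenient in install scripts.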
Notes:
- If you’re interested in looking further, you can view Velero’s server-side components:
  kubectl -n velero get all
- Explore the Velero CLI help pages, to see what commands and sub-commands are available. You can get help for each, by using the --help flag.
  List all the available commands for Velero:
  velero --help
  List backup command options for Velero:
  velero backup --help
Velero uses a number of CRDs (Custom Resource Definitions) to represent its own resources like backups, backup schedules, etc. You'll discover each in the next steps of the tutorial, along with some basic examples.
In this step, you will learn how to back up an entire namespace of your DOKS cluster, and restore it afterwards, making sure that all the resources are re-created. The namespace in question is ambassador.
Next, you will perform the following tasks:
- Creating the ambassador namespace backup, using Velero CLI.
- Deleting the ambassador namespace.
- Restoring the ambassador namespace, using Velero CLI.
- Checking the ambassador namespace restore status, using Velero CLI.
First, initiate the backup:
velero backup create ambassador-backup --include-namespaces ambassador
Next, check that the backup was created:
velero backup get
The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
ambassador-backup Completed 0 0 2021-08-25 19:33:03 +0300 EEST 29d default <none>
Then, after a few moments, you can inspect it:
velero backup describe ambassador-backup --details
The output looks similar to:
Name: ambassador-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: ambassador
Excluded: <none>
...
Hints:
- Look for the Phase line. It should say Completed.
- Check that no Errors are reported as well.
- A new Kubernetes Backup object is created:
  ~ kubectl get backup/ambassador-backup -n velero -o yaml
  apiVersion: velero.io/v1
  kind: Backup
  metadata:
    annotations:
      velero.io/source-cluster-k8s-gitversion: v1.21.2
      velero.io/source-cluster-k8s-major-version: "1"
      velero.io/source-cluster-k8s-minor-version: "21"
  ...
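Because the backup runs asynchronously, a script usually has to wait for the Phase to settle before going on. The sketch below polls the Backup object for that. `wait_for_backup` is a hypothetical helper, and the retry count and sleep interval are arbitrary defaults chosen for this example.

```shell
# Sketch: poll a Backup object until it reaches a final phase.
# wait_for_backup is a hypothetical helper; the defaults (30 tries,
# 10 seconds apart) are assumptions, not Velero recommendations.
wait_for_backup() {
  local name="$1" retries="${2:-30}" interval="${3:-10}" phase=""
  while [ "$retries" -gt 0 ]; do
    # Read only the phase field from the Backup custom resource.
    phase="$(kubectl get backup "$name" --namespace velero \
      --output jsonpath='{.status.phase}')"
    case "$phase" in
      Completed)
        echo "backup $name completed"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $name ended in phase $phase"
        return 1 ;;
    esac
    retries=$((retries - 1))
    sleep "$interval"
  done
  echo "timed out waiting for backup $name (last phase: ${phase:-unknown})"
  return 1
}

# Usage: wait_for_backup ambassador-backup
```

Treating PartiallyFailed as a failure is a deliberately strict choice. You may prefer to inspect the backup logs and decide case by case.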
Finally, take a look at the DO Spaces bucket. There's a new folder named backups, which contains the assets created for your ambassador-backup.
First, simulate a disaster, by intentionally deleting the ambassador namespace:
kubectl delete namespace ambassador
Next, check that the namespace was deleted (the namespaces listing should not print ambassador):
kubectl get namespaces
Finally, verify that the echo and quote backend services endpoints are DOWN (please refer to Creating the Ambassador Edge Stack Backend Services or Creating the Nginx Backend Services, regarding the backend applications used in the Starter Kit tutorial). You can use curl to test (or you can use your web browser):
curl -Li http://quote.starter-kit.online/quote/
curl -Li http://quote.starter-kit.online/echo/
First, restore the ambassador-backup:
velero restore create --from-backup ambassador-backup
Important note:
When you delete the ambassador namespace, the load balancer resource associated with the ambassador service is deleted as well. So, when you restore the ambassador service, the LB is recreated by DigitalOcean. The issue is that you get a NEW IP address for your LB, so you need to adjust the A records to get traffic into your domains hosted on the cluster.
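To find out which address the recreated load balancer received, you can read it from the Service status. This is a sketch under assumptions: `lb_ip` and `ip_changed` are hypothetical helper names, and the defaults assume the Service is called ambassador in the ambassador namespace, as in the output shown in this step.

```shell
# Sketch: read the external IP assigned to the restored Service.
# lb_ip and ip_changed are hypothetical helpers for illustration.
lb_ip() {
  local service="${1:-ambassador}" namespace="${2:-ambassador}"
  kubectl get service "$service" --namespace "$namespace" \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Succeeds only when an IP is assigned and it differs from the old one,
# meaning the DNS A records still need to be updated.
ip_changed() {
  local old_ip="$1" new_ip
  new_ip="$(lb_ip)"
  [ -n "$new_ip" ] && [ "$new_ip" != "$old_ip" ]
}

# Usage: ip_changed 159.89.215.200 && echo "update your A records"
```

The IP may be empty for a short while after the restore, until DigitalOcean finishes provisioning the load balancer, which is why the check also requires a non-empty value.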
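First, check the Phase line from the restore describe command output. It should say Completed (also, please take note of the Warnings section, which tells you if something went wrong). Because no restore name was given, Velero names the restore after the backup and appends a timestamp, so list the restores first and replace the <> placeholders accordingly:
velero restore get
velero restore describe ambassador-backup-<timestamp>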
Next, verify that all the resources were restored for the ambassador namespace (look for the ambassador pods, services and deployments):
kubectl get all --namespace ambassador
The output looks similar to:
NAME READY STATUS RESTARTS AGE
pod/ambassador-5bdc64f9f6-9qnz6 1/1 Running 0 18h
pod/ambassador-5bdc64f9f6-twgxb 1/1 Running 0 18h
pod/ambassador-agent-bcdd8ccc8-8pcwg 1/1 Running 0 18h
pod/ambassador-redis-64b7c668b9-jzxb5 1/1 Running 0 18h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ambassador LoadBalancer 10.245.74.214 159.89.215.200 80:32091/TCP,443:31423/TCP 18h
service/ambassador-admin ClusterIP 10.245.204.189 <none> 8877/TCP,8005/TCP 18h
service/ambassador-redis ClusterIP 10.245.180.25 <none> 6379/TCP 18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ambassador 2/2 2 2 18h
deployment.apps/ambassador-agent 1/1 1 1 18h
deployment.apps/ambassador-redis 1/1 1 1 18h
NAME DESIRED CURRENT READY AGE
replicaset.apps/ambassador-5bdc64f9f6 2 2 2 18h
replicaset.apps/ambassador-agent-bcdd8ccc8 1 1 1 18h
replicaset.apps/ambassador-redis-64b7c668b9 1 1 1 18h
Finally, after reconfiguring your LoadBalancer and DigitalOcean domain settings, please verify that the echo and quote backend services endpoints are UP (please refer to Creating the Ambassador Edge Stack Backend Services or Creating the Nginx Backend Services, regarding the backend applications used in the Starter Kit tutorial). For example, you can use curl to test each endpoint:
curl -Li https://quote.starter-kit.online/
curl -Li https://echo.starter-kit.online/
In the next step, you will simulate a disaster by intentionally deleting your DOKS cluster (the Starter Kit DOKS cluster).
In this step, you will simulate a disaster recovery scenario. The whole DOKS cluster will be deleted, and then restored from a previous backup.
Next, you will perform the following tasks:
- Creating the DOKS cluster backup, using Velero CLI.
- Deleting the DOKS cluster, using doctl.
- Restoring the DOKS cluster, using Velero CLI.
- Checking the DOKS cluster state.
First, create a backup for the whole DOKS cluster:
velero backup create all-cluster-backup
Next, check that the backup was created and it's not reporting any errors. The following command lists all the available backups:
velero backup get
The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
all-cluster-backup Completed 0 0 2021-08-25 19:43:03 +0300 EEST 29d default <none>
Finally, inspect the backup state and logs (check that no errors are reported):
velero backup describe all-cluster-backup
velero backup logs all-cluster-backup
First, delete the whole DOKS cluster (make sure to replace the <> placeholders accordingly):
doctl kubernetes cluster delete <DOKS_CLUSTER_NAME>
Next, re-create the cluster, as described in Section 1 - Set up DigitalOcean Kubernetes. Please make sure the new DOKS cluster node count is equal to or greater than the original one. This is important!
Then, install the Velero CLI and Server, as described in the Prerequisites section, and Step 2 - Installing Velero respectively. Please make sure to use the same Helm Chart version. This is important!
Finally, restore everything, by using the below command:
velero restore create --from-backup all-cluster-backup
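First, check the Phase line from the restore describe command output. It should say Completed (also, please take note of the Warnings section, which tells you if something went wrong). Because no restore name was given, Velero names the restore after the backup and appends a timestamp, so list the restores first and replace the <> placeholders accordingly:
velero restore get
velero restore describe all-cluster-backup-<timestamp>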
Next, an important aspect to keep in mind is that whenever you destroy a DOKS cluster, the associated Load Balancer is destroyed as well. It means that each time the DOKS cluster is re-created, the Load Balancer is re-created as well, with a different IP address. You have to make sure that the DNS records are updated as well, to reflect the change.
Now, verify all cluster Kubernetes resources (you should have everything in place):
kubectl get all --all-namespaces
Finally, the backend applications should respond to HTTP requests as well (please refer to Creating the Ambassador Edge Stack Backend Services or Creating the Nginx Backend Services, regarding the backend applications used in the Starter Kit tutorial):
curl -Li http://quote.starter-kit.online/quote/
curl -Li http://quote.starter-kit.online/echo/
In the next step, you will learn how to perform scheduled backup and restore for your DOKS cluster applications.
Taking backups automatically based on a schedule is a really useful feature to have. It allows you to rewind time, and restore the system to a previous working state if something goes wrong.
Creating a scheduled backup is a very straightforward process. An example is provided below for a 1 minute interval (the kube-system namespace was picked).
First, create the schedule:
velero schedule create kube-system-minute-backup --schedule="@every 1m" --include-namespaces kube-system
Hint:
Linux cronjob format is supported also:
--schedule="*/1 * * * *"
Next, verify that the schedule was created:
velero schedule get
The output looks similar to:
NAME STATUS CREATED SCHEDULE BACKUP TTL LAST BACKUP SELECTOR
kube-system-minute-backup Enabled 2021-08-26 12:37:44 +0300 EEST @every 1m 720h0m0s 32s ago <none>
Then, inspect all the backups, after one minute or so:
velero backup get
The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
kube-system-minute-backup-20210826093916 Completed 0 0 2021-08-26 12:39:20 +0300 EEST 29d default <none>
kube-system-minute-backup-20210826093744 Completed 0 0 2021-08-26 12:37:44 +0300 EEST 29d default <none>
First, check the Phase line from one of the backups (please replace the <> placeholders accordingly). It should say Completed:
velero backup describe kube-system-minute-backup-<timestamp>
Finally, take note of possible Errors and Warnings from the above command output as well. They tell you if something went wrong.
To restore one of the minute backups, please follow the same steps as you learned in the previous steps of this tutorial. This is a good way to practice and test the experience accumulated so far.
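When restoring from a schedule, you usually want its most recent backup. The names in the listing above end in a YYYYMMDDhhmmss timestamp, so sorting them as plain text orders them by creation time. The sketch below relies on that naming pattern as observed in the sample output, and `latest_backup_for` is a hypothetical helper name.

```shell
# Sketch: pick the newest backup of a schedule from backup names on stdin.
# Relies on names of the form <schedule>-<14 digit timestamp>.
# latest_backup_for is a hypothetical helper for illustration.
latest_backup_for() {
  local schedule="$1"
  grep -E "^${schedule}-[0-9]{14}$" | sort | tail -n 1
}

# Example with the two backup names from the listing above.
# Prints kube-system-minute-backup-20210826093916
printf '%s\n' \
  kube-system-minute-backup-20210826093916 \
  kube-system-minute-backup-20210826093744 |
  latest_backup_for kube-system-minute-backup
```

On a live cluster, the names could come from the first column of the velero backup get output, and the result can be passed to velero restore create --from-backup.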
In the next step, you will learn how to manually or automatically delete specific backups you created over time.
When you decide that some older backups are not needed anymore, you can free up some resources both on the Kubernetes cluster and in the Velero DO Spaces bucket.
In this step, you will learn how to use one of the following methods to delete Velero backups:
- Manually (or by hand), using Velero CLI.
- Automatically, by setting the backup TTL (Time To Live), via Velero CLI.
First, pick a one minute backup for example, and issue the following command (please replace the <> placeholders accordingly):
velero backup delete kube-system-minute-backup-<timestamp>
Now, check that it's gone from the velero backup get command output. It should be deleted from the DO Spaces bucket as well.
Next, you will learn how to delete multiple backups at once, by using a selector. The velero backup delete subcommand provides a flag called --selector. It allows you to delete multiple backups at once based on Kubernetes Labels. The same rules apply as for Kubernetes Label Selectors.
First, list the available backups:
velero backup get
The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
ambassador-backup Completed 0 0 2021-08-25 19:33:03 +0300 EEST 23d default <none>
backend-minute-backup-20210826094116 Completed 0 0 2021-08-26 12:41:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826094016 Completed 0 0 2021-08-26 12:40:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093916 Completed 0 0 2021-08-26 12:39:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093816 Completed 0 0 2021-08-26 12:38:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093716 Completed 0 0 2021-08-26 12:37:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093616 Completed 0 0 2021-08-26 12:36:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093509 Completed 0 0 2021-08-26 12:35:09 +0300 EEST 24d default <none>
Next, say that you want to delete all the backend-minute-backup-* assets. Pick a backup from the list, and inspect the Labels:
velero describe backup backend-minute-backup-20210826094116
The output looks similar to (notice the velero.io/schedule-name label value):
Name: backend-minute-backup-20210826094116
Namespace: velero
Labels: velero.io/schedule-name=backend-minute-backup
velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: backend
Excluded: <none>
...
Next, you can delete all the backups that match the backend-minute-backup value of the velero.io/schedule-name label:
velero backup delete --selector velero.io/schedule-name=backend-minute-backup
Finally, check that all the backend-minute-backup-* assets disappeared from the velero backup get command output, as well as from the DO Spaces bucket.
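Since a selector can match more backups than you expect, it is worth listing the matches before deleting them. The following sketch does both in sequence. The helper names `schedule_selector` and `delete_schedule_backups` are hypothetical, and the approach assumes that velero backup get accepts the same --selector flag as velero backup delete.

```shell
# Sketch: build the label selector for a schedule, list what it
# matches, and only then delete. Both helper names are hypothetical.
schedule_selector() {
  printf 'velero.io/schedule-name=%s\n' "$1"
}

delete_schedule_backups() {
  local selector
  selector="$(schedule_selector "$1")"
  # Show the matching backups first, so nothing is deleted by surprise.
  velero backup get --selector "$selector" || return 1
  velero backup delete --selector "$selector"
}

# Usage: delete_schedule_backups backend-minute-backup
```

The delete command still asks for confirmation, so the listing printed just before it gives you a last chance to review what will be removed.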
When you create a backup, you can specify a TTL (Time To Live), by using the --ttl flag. If Velero sees that an existing backup resource is expired, it removes:
- The Backup resource
- The backup file from cloud object storage
- All PersistentVolume snapshots
- All associated Restores
The --ttl flag allows the user to specify the backup retention period, with the value given in hours, minutes and seconds, in the form --ttl 24h0m0s. If not specified, a default TTL value of 30 days is applied.
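Retention periods are often decided in days, while the flag takes hours, minutes and seconds. The small sketch below does the conversion. `ttl_from_days` is a hypothetical helper name, and it handles whole days only.

```shell
# Sketch: convert a retention period in whole days to the
# hours/minutes/seconds form used with --ttl.
# ttl_from_days is a hypothetical helper for illustration.
ttl_from_days() {
  local days="$1"
  printf '%dh0m0s\n' "$((days * 24))"
}

ttl_from_days 30   # prints 720h0m0s, the 30 day default
ttl_from_days 7    # prints 168h0m0s
```

The result can be passed straight to the flag, for example --ttl "$(ttl_from_days 7)".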
Next, you will create a short-lived backup for the ambassador namespace, with a TTL value set to 3 minutes.
First, create the ambassador backup, using a TTL value of 3 minutes:
velero backup create ambassador-backup-3min-ttl --ttl 0h3m0s --include-namespaces ambassador
Next, inspect the ambassador backup:
velero backup describe ambassador-backup-3min-ttl
The output looks similar to (notice the Namespaces -> Included section, which should display ambassador, and the TTL field, which is set to 3m0s):
Name: ambassador-backup-3min-ttl
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: ambassador
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 3m0s
...
A new folder should be created in the DO Spaces Velero bucket as well, named ambassador-backup-3min-ttl.
Finally, once the TTL has expired, the backup and associated resources should be automatically deleted. Expired backups are removed by Velero's periodic garbage collection, so the deletion may lag behind the three minute mark. You can verify that the backup object was destroyed, using: velero backup describe ambassador-backup-3min-ttl. It should fail with an error, stating that the backup doesn't exist anymore. The corresponding ambassador-backup-3min-ttl folder from the DO Spaces Velero bucket should be gone as well.
Going further, you can explore all the available velero backup delete options, via:
velero backup delete --help
In this tutorial, you learned how to perform one-time, as well as scheduled backups, and how to restore everything back. Having scheduled backups in place is very important, as it allows you to revert to a previous snapshot in time if something goes wrong along the way. You walked through a disaster recovery scenario as well.
You can learn more about Velero, by following the topics below:
- Backup Command Reference
- Restore Command Reference
- Backup Hooks
- Cluster Migration
- Velero Troubleshooting
Next, you will learn how to set up Alerts and Notifications using AlertManager, to give you real time notifications (e.g. Slack), if something bad happens in your DOKS cluster.