Releases: libopenstorage/stork
23.4.0
New Features
- You can now apply an exclude label in the MigrationSchedule spec to exclude specific resources from migration. #1339
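As a sketch of how the new exclude label might be used, consider the MigrationSchedule below. The `excludeSelectors` field name, the cluster pair, and the namespace are all assumptions for illustration; check the Stork documentation for your version for the exact spelling.

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: MigrationSchedule
metadata:
  name: nightly-migration          # hypothetical name
  namespace: kube-system
spec:
  template:
    spec:
      clusterPair: remote-cluster  # hypothetical ClusterPair name
      namespaces:
        - app-ns
      # Assumed field name: resources carrying these labels
      # are excluded from the migration.
      excludeSelectors:
        skip-migration: "true"
  schedulePolicyName: default-migration-policy
```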
Improvements
- Stork service is updated to no longer accept the old TLS versions 1.0 and 1.1. #1348
- Stork now creates the `default-migration-policy` schedule policy, which is set to an interval of 30 minutes instead of 1 minute. #1346
- Stork now skips migrating the OCP-specific (`system:`) ClusterRole and ClusterRoleBinding resources on OpenShift. #1347
- Stork now uses a default QPS of 1000 and a Burst of 2000 for its Kubernetes client. #1356, #1378
- Updated the moby package to fix vulnerability CVE-2023-28840. #1381
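The 30-minute `default-migration-policy` mentioned above corresponds to a SchedulePolicy of roughly this shape; this is a sketch based on the standard Stork SchedulePolicy schema, not the exact object Stork creates:

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: SchedulePolicy
metadata:
  name: default-migration-policy
policy:
  interval:
    # Interval between migrations triggered by this policy
    intervalMinutes: 30
```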
Bug Fixes
- Issue: The Stork monitoring controller repeatedly issued a high number of ListPods API calls, resulting in considerable memory consumption.
User Impact: When there is a large quantity of pods in the cluster, the Stork monitoring system triggers additional ListPods calls, leading to substantial memory utilization.
Resolution: Stork now uses a cache to retrieve the pod list in the health monitoring process, reducing its overall memory usage. #1321, #1340, #1390
- Issue: Certain CRD plurals do not follow standard pluralizing rules, which caused API calls to fail to resolve the plurals during migration or backup.
User Impact: These CRDs were neither migrated nor backed up properly, which affects disaster recovery, backup, and restore of applications that depend on the CRs.
Resolution: Stork now uses API calls with the correct CRD plurals, fetched from the cluster. #1361
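The pluralization fix above matters for CRDs whose plural cannot be derived by naive rules. A sketch of such a CRD is shown below (the group and kind are made up); the authoritative plural lives in `spec.names.plural`, which Stork now reads from the cluster instead of guessing:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # Name must be <plural>.<group>
  name: prometheuses.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: Prometheus
    singular: prometheus
    # Naive pluralizing (kind + "s") would yield "prometheuss";
    # the CRD itself records the correct plural.
    plural: prometheuses
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
```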
23.3.1
Improvements
- Issue: Backing up a significant number of Kubernetes resources has resulted in failures due to limits on gRPC requests, Kubernetes custom resource sizes, or etcd payload sizes.
User Impact: If the number of Kubernetes resources in a backup is large, then the backup process may fail to complete due to errors related to size limits.
Resolution: To prevent errors related to gRPC size limits, etcd payload limits, and custom resource definition size, the resource information was removed from both the ApplicationBackup and ApplicationRestore CRs.
23.3.0
Improvements
- Stork now supports cluster pairing for Oracle Kubernetes Engine (OKE) clusters. #1331
- Stork now supports bidirectional cluster pairing for Asynchronous DR migration.
- Stork will update StorageClass on PV objects after the PVC migration. #1320
- Fixed CVE-2020-26160 vulnerability by updating the JWT package. #1343
Bug Fixes
- Issue: Failback to the primary cluster failed when an app used Portworx CSI volumes, as the `volumeHandle` pointed to the old volume ID.
User Impact: The app did not come up on the primary cluster after failback.
Resolution: Stork now recreates the PVs and specifies the correct volume name in the `volumeHandle` field of the spec. As a result, the app uses correctly bound PVCs and comes up without any issue. #1355
- Issue: Could not find resources for the Watson Knowledge Catalog.
User Impact: Migration failed for the Watson Knowledge Catalog.
Resolution: Stork now uses the proper plurals for the CRDs, allowing a successful migration process for the Watson Knowledge Catalog. #1326
- Issue: Service and service account updates did not reflect on the destination cluster.
User Impact: Migration failed to keep updated resources on the destination cluster.
Resolution: During migration, Stork now syncs service updates, merges secrets associated with the service account, and updates the `automountServiceAccountToken` parameter for the service account on the destination cluster. #1326
23.2.1
Bug Fixes
- Issue: px-backup needs a way to know the custom admin namespace configured in the Stork deployment.
User Impact: px-backup users were unable to use the custom admin namespace configured in the Stork deployment.
Resolution: Added a configmap `stork-controller-config` in the `kube-system` namespace with the details of the admin namespace. #1310
- Issue: Rule command executor pods were always created and started in the `kube-system` namespace.
User Impact: Users had concerns about running the Stork rule pods in the `kube-system` namespace.
Resolution: The rule command executor pods now run in the namespace where Stork is deployed. #1338
23.2.0
Notes:
- Starting with 23.2.0, the naming scheme for Stork releases has changed. Release numbers are now based on the year and month of the release.
- Customers upgrading to Stork 23.2.0 will need to update `storkctl` on their cluster. This is required to correctly set up migration with Auto Suspend.
Improvements
- Stork now migrates all the CRDs under a group if a CR exists for any kind in that group, even when the CRDs are for a different kind. #1269
- Stork will update the owner reference for PVC objects on the destination cluster. #1269
- Added support for gke-gcloud-auth-plugin required for authenticating with GKE. #1312
Bug Fixes
- Issue: Users were unable to migrate Confluent Kafka resources during failback.
User Impact: Confluent Kafka application failback was unable to bring up the application.
Resolution: Stork now removes any finalizers on the CRs when deleting the resource during migration so that a new version of the resource can be recreated. #1295
- Issue: Migration was suspended after failback if autosuspend was enabled.
User Impact: After failback, the existing migration schedule was not resumed, which caused the secondary cluster not to sync.
Resolution: With the fix, the primary cluster's migration schedule correctly detects whether migration can be resumed to the secondary cluster. #1282
- Issue: The Confluent Kafka operator was not able to recognize service subresources during Async DR migration.
User Impact: Application pods for Confluent Kafka were not able to start.
Resolution: Stork no longer migrates Service resources if the owner reference field is set. #1269
- Issue: Stork was throwing the error `Error migrating volumes: Operation cannot be fulfilled on migrations.stork.libopenstorage.org: the object has been modified; please apply your changes to the latest version and try again`.
User Impact: This error caused unnecessary confusion during migration.
Resolution: Stork no longer raises this event, as it retries the failed operation. #1272, #1293
- Issue: Users were not allowed to take backups based on namespace labels.
User Impact: Users had to manually select a static list of namespaces for backup schedules. Dynamic selection of the namespace list based on a namespace label was not possible.
Resolution: With namespace label support, users can specify a namespace label so that the list of namespaces with that label is selected dynamically for backups. #1258, #1315
- Issue: The Rancher project association for Kubernetes resources in the Rancher environment was not backed up and restored.
User Impact: Since the project configurations were not restored, some applications failed to come up.
Resolution: Project settings are now backed up and applied during the restore. Users can also change to a different project with project mapping during restore. #1294, #1318
- Issue: Users were not able to specify the option to use the default storage class configured on the restore cluster in the storage class mapping.
User Impact: Users were not able to use the default storage class for restore.
Resolution: Users can now specify `use-default-storage-class` as the destination storage class in the storage class mapping if they want to use the default configured storage class from the restore cluster. #1288
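As a hedged sketch of how the `use-default-storage-class` mapping might be expressed, consider the ApplicationRestore below. The resource names and the surrounding field names are assumptions based on typical Stork restore specs, not verbatim from this release:

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: ApplicationRestore
metadata:
  name: restore-app            # hypothetical name
  namespace: app-ns
spec:
  backupName: app-backup       # hypothetical backup name
  backupLocation: s3-location  # hypothetical BackupLocation name
  storageClassMapping:
    # Source storage class -> destination storage class. The special
    # value below selects the restore cluster's default storage class.
    px-sc: use-default-storage-class
```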
Stork 2.12.4
Improvements
- Updated the container image to fix the libksba vulnerability CVE-2022-47629
Stork 2.12.3
Improvements
- The following vulnerabilities are fixed by updating the base image and the libxml package: CVE-2022-40303, CVE-2022-40304, CVE-2022-43680, CVE-2022-3821, CVE-2022-35737, CVE-2021-46848
Stork 2.12.2
New Features
- Stork will try to schedule application pods that are using sharedv4 service volumes to run on nodes where a volume replica does not exist. This means that the nodes will have an NFS mountpoint, and when a failover happens, application pods will not be restarted. You can use the StorageClass parameter `stork.libopenstorage.org/preferRemoteNodeOnly: "true"` to strictly enforce this behavior. #1222
- Operator now sets the scheduler of the `px-csi-ext` pods as `stork` when running Operator 1.10.1 or newer. If Stork detects that a `px-csi-ext` pod is running on an offline Portworx node, it deletes the `px-csi-ext` pod. When Stork gets a scheduling request for such a pod, it can place the pod on a node where Portworx is operational. #1213
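The `preferRemoteNodeOnly` parameter above would sit alongside the sharedv4 settings in a StorageClass. Below is a minimal sketch; the `sharedv4` parameter names and values are assumptions drawn from typical Portworx StorageClasses:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-sharedv4-svc        # hypothetical name
provisioner: pxd.portworx.com
parameters:
  repl: "2"
  # sharedv4 service volume settings (assumed typical values)
  sharedv4: "true"
  sharedv4_svc_type: "ClusterIP"
  # Strictly schedule app pods only on nodes without a volume replica
  stork.libopenstorage.org/preferRemoteNodeOnly: "true"
```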
Improvements
- Introduced a dynamic shared informer cache for StorageClass and ApplicationRegistration CRs to improve migration times in clusters which hit high API server rate limits. #1227
- Added support for migrating the MongoDB Enterprise Operator's MongoDBOpsManager and MongoDB CRs. #1245
Bug Fixes
- Issue: Resource transformation for ResourceQuota was failing with the error `server could not find the requested resource`.
User Impact: Resource transformation of ResourceQuota was failing during migrations.
Resolution: API calls with the right resource kind solved the issue, and ResourceQuota can now be transformed. #1209
- Issue: Stork hit a nil panic when `SkipDeletedNamespaces` was not set in a `Migration` or `MigrationSchedule` object and a migration was requested for a deleted namespace.
User Impact: The Stork pod restarted and migrations did not succeed.
Resolution: The nil panic is now handled in Stork. #1241
- Issue: Not all storage-provisioner annotations were removed during PVC restore, which caused PVCs to remain in an unbound state in the generic restore case.
User Impact: Restoring backups taken on a GKE cluster to an AKS cluster was failing.
Resolution: Stork now removes both the `volume.kubernetes.io/storage-provisioner` and `volume.beta.kubernetes.io/storage-provisioner` annotations during PVC restore before applying. #1225
- Issue: Snapshots triggered as part of a schedule were getting indefinitely queued and retried if they got an error from the storage driver.
User Impact: Retries of multiple snapshot requests put additional load on the storage driver.
Resolution: Limited the number of snapshots in an error state that are triggered as part of a schedule. Stork deletes older snapshot requests that are in an error state. #1231
- Issue: When Stork creates snapshots as part of a schedule, it names each snapshot by appending a timestamp to the name of the schedule. If the length of the snapshot schedule name plus the suffix was greater than 63 characters, the snapshot operation would fail.
User Impact: Stork failed to trigger a snapshot for a schedule that had a long name.
Resolution: Stork now truncates the names of snapshots that are created from a snapshot schedule. #1231
Stork 2.12.1
Bug Fixes
- Issue: Restoring a Portworx volume that uses the in-tree provisioner (`kubernetes.io/portworx-volume`) to a Portworx volume that uses the CSI provisioner (`pxd.portworx.com`) was not adding the `csi` section to the PV spec.
User Impact: A Portworx volume with the CSI provisioner would not be identified as a CSI volume.
Resolution: Stork now adds the `csi` section to the PersistentVolume while restoring. #1195
Stork 2.11.5
Improvements
- The following vulnerabilities are addressed in the Stork 2.11.x release train: CVE-2022-27774, CVE-2022-1292