-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust the Status of NonAdminBackup #13
Conversation
This one has a lot of duplication with #12 lets merge that one, then linters one, then this one |
70434b7
to
963db7d
Compare
@@ -32,14 +32,12 @@ type NonAdminBackupSpec struct { | |||
|
|||
// BackupSpec defines the specification for a Velero backup. | |||
BackupSpec *velerov1api.BackupSpec `json:"backupSpec,omitempty"` | |||
|
|||
// BackupStatus captures the current status of a Velero backup. | |||
BackupStatus *velerov1api.BackupStatus `json:"backupStatus,omitempty"` | |||
} | |||
|
|||
// NonAdminBackupStatus defines the observed state of NonAdminBackup | |||
type NonAdminBackupStatus struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-Admin backup status can have more details than velero backup status, eg: validation status or reconciliation status of the non-admin backup CR itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified. We now have (two examples from this PR):
spec: {}
status:
failureReason: NonAdminBackup CR does not contain valid VeleroBackupSpec
phase: FailedValidation
spec:
backupSpec:
volumeSnapshotLocations:
- velero-sample-1
defaultVolumesToFsBackup: false
csiSnapshotTimeout: 10m0s
ttl: 720h0m0s
itemOperationTimeout: 4h0m0s
metadata: {}
storageLocation: velero-sample-1
hooks: {}
snapshotMoveData: false
status:
backupStatus:
expiration: '2024-04-05T16:46:15Z'
failureReason: >-
unable to get credentials: unable to get key for secret: Secret
"cloud-credentials" not found
formatVersion: 1.1.0
phase: Failed
startTimestamp: '2024-03-06T16:46:15Z'
version: 1
/hold need to move update Velero Backup Status within it's own section. |
/unhold |
api/v1alpha1/nonadminbackup_types.go
Outdated
// BackupStatus captures the current status of a Velero backup. | ||
|
||
// +optional | ||
Phase velerov1api.BackupPhase `json:"phase,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This phase should be Non-Admin Backup CR Reconciliation phase, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or better option would be to use conditions just like we do for DPA CR status ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this seems duplicated information. What is your goal here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, however for this implementation I've used same as https://github.com/vmware-tanzu/velero/blob/4d548612d45d5eb0c61b2473286127dfa6e3c39d/pkg/apis/velero/v1/backup_types.go#L290-L348
It's because I think at the moment it's enough for us within velerov1api.BackupPhase
. If we decide to have more states we can always create our own Non Admin Backup Phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mateusoliveira43 the goal was to create Phase in the Non Admin Backup separate from BackupStatus Phase, I've used the velerov1api.BackupPhase
, but we can change it as condidions or create our own NonAdminPhase type for the Non Admin Backup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shubham-pampattiwar I 100% agree with that, but isn't current list of conditions from velerov1api.BackupPhase
enough? The idea was to use one of those as an overall status, but we can have own:
BackupPhaseNew BackupPhase = "New"
BackupPhaseFailedValidation BackupPhase = "FailedValidation"
BackupPhaseInProgress BackupPhase = "InProgress"
BackupPhaseWaitingForPluginOperations BackupPhase = "WaitingForPluginOperations"
BackupPhaseWaitingForPluginOperationsPartiallyFailed BackupPhase = "WaitingForPluginOperationsPartiallyFailed"
BackupPhaseFinalizing BackupPhase = "Finalizing"
BackupPhaseFinalizingPartiallyFailed BackupPhase = "FinalizingPartiallyFailed"
BackupPhaseCompleted BackupPhase = "Completed"
BackupPhasePartiallyFailed BackupPhase = "PartiallyFailed"
BackupPhaseFailed BackupPhase = "Failed"
BackupPhaseDeleting BackupPhase = "Deleting"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I will change :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mpryc Let elaborate more and give you an example why we need independent conditions
for non-admin backup CR:
The Velero BackupStatus provides us information on the reconciliation status of Velero Backup CR but how would we know bout the reconciliation of the non-admin backup CR ? This is when the independent conditions
on non-admin backup CR come into picture. For example if the non-admin backup controller failed to created the Velero backup and got an error, how would the user know that the request failed, in such scenario the non-admin backup controller should add a error
condition on non-admin backup CR and specify the error on the non-admin CR status conditions so that the user gains transparency on what is actually happening with non-admin backup CR. This is in general how controllers work, spec
is what the user wants and status
provides information on how the CR is doing, whether its performing the intended tasks and whether the actions intended via spec are done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added Conditions as well.
3d67dff
to
068dbc6
Compare
api/v1alpha1/nonadminbackup_types.go
Outdated
} | ||
|
||
// NonAdminBackupStatus defines the observed state of NonAdminBackup | ||
type NonAdminBackupStatus struct { | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` | ||
|
||
// +optional | ||
Phase NonAdminBackupPhase `json:"phase,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Phase
and FailureReason
might be redundant here, becasue conditions consists of similar fields like Conditions.Type
, Conditions.Reason
and Conditions.Message
Reference: https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Condition
Rebased work. Now it's following design from 2nd scenario which is described in: |
cedc613
to
8073768
Compare
Is this PR blocker by #23 ? |
api/v1alpha1/nonadminbackup_types.go
Outdated
|
||
// NonAdminCondition are used for more detailed information supporing NonAdminBackupPhase state. | ||
// +kubebuilder:validation:Enum=BackupAccepted;BackupQueued | ||
type NonAdminCondition string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not like declaring things on a file that are not used on that file. Would move to file that uses it
kubebuilder annotation here does not do anything, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what is the comment here about, could you give me more information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check CRD yaml, it does not mention NonAdminCondition, right? then // +kubebuilder:validation:Enum=BackupAccepted;BackupQueued
does not do anything
This type is declared here, but never used here, only by other files. If it is used by more then one file, would move to constants file, otherwise, just to the file that uses this type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I moved that from spec to status and now it's not used by kubebuilder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is still never used in the file it was declared, right ❓
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used in this file, few lines below as a NonAdminCondition:
const (
NonAdminConditionAccepted NonAdminCondition = "Accepted"
NonAdminConditionQueued NonAdminCondition = "Queued"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the NonAdminCondition
type is not used in this file (it is just created and its allowed values declared), it should be moved to another file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is pretty much standard to keep the type with const that is expecting that type, see even in velero:
Where would you move it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is used by functions and nab controller, so common folder
we can add it to constants directly, or create a types file in their first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you go
// No change, no need to update | ||
logger.V(1).Info("NonAdminBackup status and spec is already up to date") | ||
return nil | ||
logger.V(1).Info("NonAdminBackup Phase is already up to date") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Should not be an Info log, can make it debug
currentCondition := apimeta.FindStatusCondition(nab.Status.Conditions, string(condition)) | ||
if currentCondition != nil && currentCondition.Status == conditionStatus { | ||
// Condition is already set to the desired status, no need to update | ||
logger.V(1).Info(fmt.Sprintf("NonAdminBackup Condition is already set to: %s", condition)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Unsure if this should be Info, might get logs flooded in reconcile cycles
}, | ||
) | ||
|
||
logger.V(1).Info(fmt.Sprintf("NonAdminBackup Condition to: %s", condition)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Again should not be info level I think
} | ||
|
||
// Check if BackupSpec needs to be updated | ||
if !reflect.DeepEqual(nab.Spec.BackupSpec, &veleroBackup.Spec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about the spec update from velero backup spec to nab backup spec, seems opposite to what we want to do in our NAC feature.
NAC spec should be cascaded to Velero backup spec, not the other way round.
@@ -69,44 +72,98 @@ const ( | |||
func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) { | |||
r.Log = log.FromContext(ctx) | |||
logger := r.Log.WithValues("NonAdminBackup", req.NamespacedName) | |||
logger.V(1).Info(">>> Reconcile NonAdminBackup - loop start") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: remove this log
|
||
// Resource version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can remove these comments ?
// Bail out when the Non Admin Backup reconcile was triggered, when the NAB got deleted | ||
// Reconcile loop was triggered when Velero Backup object got updated and NAB isn't there | ||
if err != nil && apierrors.IsNotFound(err) { | ||
logger.V(1).Info("Deleted NonAdminBackup CR", nameField, req.Name, constant.NameSpaceString, req.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure about this Deleted
log. This is implying that CR triggred the reconcile but in between reconcile trigger and CR Get call, the CR was deleted. Maybe we just say the CR was not found and not Deleted
if err != nil { | ||
logger.Error(err, "Error while performing NonAdminBackup reconcile") | ||
errMsg := "NonAdminBackup CR does not contain valid BackupSpec" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message string must from err
object itself
|
||
// Phase: BackingOff | ||
// BackupAccepted: False | ||
// BackupQueued: False # already set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are you setting BackupQueued: False
as false ? Maybe I am missing this somehow :(
@@ -48,7 +50,8 @@ type NonAdminBackupReconciler struct { | |||
} | |||
|
|||
const ( | |||
nameField = "Name" | |||
nameField = "Name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using this nameField
everywhere in logs ?
logger.Error(errUpdate, "Unable to set NonAdminBackup Phase: BackingOff", nameField, req.Name, constant.NameSpaceString, req.Namespace) | ||
return ctrl.Result{}, errUpdate | ||
} else if updatedStatus { | ||
// We do not requeue as the State is BackingOff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ vvimp
// We do not requeue as the State is BackingOff | ||
return ctrl.Result{}, nil | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add comment that you are setting BackupAccepted: False
if err != nil { | ||
logger.Error(err, "Error while performing NonAdminBackup reconcile") | ||
errMsg := "NonAdminBackup CR does not contain valid BackupSpec" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This main tasks of this whole code block seems like:
- Set
Phase: BackingOff
- Set Condition
BackupAccepted: False
- Set Condition
BackupQueued: False
(seems to be missing )
So why not combine these tasks into one function likeSetPhaseAndCondition
? This will make the main reconciler more readable and more debug-gable ? WDYT ?
@@ -48,7 +50,8 @@ type NonAdminBackupReconciler struct { | |||
} | |||
|
|||
const ( | |||
nameField = "Name" | |||
nameField = "Name" | |||
requeueTimeSeconds = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't 10 sec a lot ? maybe 5 sec ? WDYT ?
|
||
veleroBackupName := function.GenerateVeleroBackupName(nab.Namespace, nab.Name) | ||
|
||
if veleroBackupName == constant.EmptyString { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From line 154-line 214 seems like a task that we want the controller to perform only if the NAB.Status has New
phase and Backup Accepted
condition. Should'nt we be adding conditionals on these two things and then perform the task ? Also, IMO making a function of this whole Create Velero Backup task would be better suited.
logger.Info("Backup successfully created", nameField, veleroBackupName) | ||
|
||
logger.Info("NonAdminBackup Reconcile loop end") | ||
logger.V(1).Info(">>> Reconcile NonAdminBackup - loop end") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Remove this log or maybe add a debug log
@@ -143,10 +211,28 @@ func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Reque | |||
logger.Error(err, "Failed to create backup", nameField, veleroBackupName) | |||
return ctrl.Result{}, err | |||
} | |||
logger.Info("VeleroBackup successfully created", nameField, veleroBackupName) | |||
|
|||
// Phase: Created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment as this one: https://github.com/migtools/oadp-non-admin/pull/13/files#r1576897507
IMHO we adopting a ReconcileBatch pattern as OADP Operator will be pretty helpful here. YMMV though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not change OADP checks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api folder files are created by kubebuilder, I think makes more sense to this be in internal/common/type/type.go
// NonAdminBackupSpec defines the desired state of NonAdminBackup | ||
type NonAdminBackupSpec struct { | ||
// https://github.com/vmware-tanzu/velero/blob/main/pkg/apis/velero/v1/backup_types.go | ||
|
||
// BackupSpec defines the specification for a Velero backup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I saw in the code a check if this was not nil
adding // +kubebuilder:validation:Required
would be valid here ❓
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acking this one. As discussed a follow up PR will be posted to resolve some of the things. @mpryc Thank you for working on this one !
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mpryc, shubham-pampattiwar The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Follow up PR ref issue: #52 |
/lgtm |
Moves Status outside of Spec and adjusts this to reflect
Velero Backup status as well additional Status when the
Spec within NonAdminBackup is not defined.
To test:
Scenario 1:
Scenario 2: