Clone populator: Add clone source watches #3639
Conversation
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/cc @akalenyu @arnongilboa
/test pull-containerized-data-importer-e2e-ceph
@alromeros I'm not saying we shouldn't do this, but we could do a cheap fix like this when the snapshot source does not exist:
If you go with the watches, make sure to change the line above.
You probably have a second watch in that dynamic snap one: containerized-data-importer/pkg/controller/clone/planner.go, lines 179 to 183 in 249b004.
Yeah, testing watches is not something we do. I think you would use https://book.kubebuilder.io/reference/envtest.html but clearly it's not integrated at all in CDI today.
My watch only watches source clone snapshots (when they are referenced in the volumeCloneSource spec). This one is for smart clone and seems to be watching snapshots created by the controller. Don't know if we should remove that one. @mhenriks' proposal is also interesting; we can fix this with a single line by requeuing here:
Interesting, yeah, you could do that in containerized-data-importer/pkg/controller/clone/snap-clone.go, lines 62 to 64 in 249b004.
Should we be concerned about the hit we take with that new watch?
Force-pushed from 249b004 to 9627d81.
Removed both the func test and the requeue when the source PVC is not ready.
You're right, my bad, I think this looks good. Maybe you have to digest volumesnapshots not existing in the cluster, but I am not sure. Feel free to undraft when you're ready and we'll see.
/retest-required
Force-pushed from 813832d to 2f08de5.
@alromeros since the only thing we are interested in is the
You'd be spamming requeues for no reason, especially for provisioners where snap taking is long (#2531).
Isn't watching for any update also potentially expensive? Why not add a ReadyToUse == true predicate, so we won't reconcile when not needed?
It's not free, but the alternative is occupying an entire thread on potentially useless requeues.
I think it's a good idea; if that is the only thing we care about, we should be doing that.
By that logic, might we as well drop most of our watches and rely on requeues when necessary? I prefer to keep requeue logic within a watch and handle all potential issues there, but I'm not an expert in the potential performance implications. It's just that having watches for sources is what we do in the DV controller, and it makes sense to keep it here.
I meant adding the mentioned predicate in the watch, like we do with Pending DVs.
Force-pushed from ab7504c to e80ef52.
/test pull-cdi-goveralls
Force-pushed from e80ef52 to 94a07b4.
…s updated/created Signed-off-by: Alvaro Romero <[email protected]>
Force-pushed from 94a07b4 to ce1e83c.
@mhenriks @akalenyu @arnongilboa adding back the requeue for source PVCs, since we need to trigger a requeue if the PVC is in use by other pods. We also do this with the non-populator flow. Hope we can still keep the watches and decrease the number of requeues when the PVC is not being used.
/test pull-cdi-unit-test
/retest-required
/test pull-containerized-data-importer-e2e-ceph-wffc
/test pull-containerized-data-importer-fossa
	return kind + "/" + namespace + "/" + name
}

if err := mgr.GetClient().List(context.TODO(), &snapshotv1.VolumeSnapshotList{}); err != nil {
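The kind/namespace/name key built a few lines above can be round-tripped as shown below. sourceKey mirrors the expression from the diff; parseSourceKey is a hypothetical inverse added purely for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// sourceKey mirrors the kind + "/" + namespace + "/" + name key from the diff above.
func sourceKey(kind, namespace, name string) string {
	return kind + "/" + namespace + "/" + name
}

// parseSourceKey is a hypothetical inverse, not part of the PR.
func parseSourceKey(key string) (kind, namespace, name string, ok bool) {
	parts := strings.SplitN(key, "/", 3)
	if len(parts) != 3 {
		return "", "", "", false
	}
	return parts[0], parts[1], parts[2], true
}

func main() {
	key := sourceKey("VolumeSnapshot", "default", "my-snap")
	fmt.Println(key) // VolumeSnapshot/default/my-snap
	kind, ns, name, _ := parseSourceKey(key)
	fmt.Println(kind, ns, name)
}
```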
So I don't know if this is a real-world issue, but if there is no snapshot support in the cluster, you'd back out and not set up a PVC watch.
I considered it, but I don't know how much I should worry about this case... I don't mind handling it if necessary, but the code might become more complex.
Not so complex:
containerized-data-importer/pkg/controller/dataimportcron-controller.go, lines 1199 to 1207 in f4ddcc1:

if err := mgr.GetClient().List(context.TODO(), &snapshotv1.VolumeSnapshotList{}); err != nil {
	if meta.IsNoMatchError(err) {
		// Back out if there's no point to attempt watch
		return nil
	}
	if !cc.IsErrCacheNotStarted(err) {
		return err
	}
}
By complex I mean that it requires duplicating the watches and mappers instead of having a list of supported sources, but yeah, I'll change it if we prefer to handle this error.
predicate.Funcs{
	CreateFunc: func(e event.CreateEvent) bool { return true },
	DeleteFunc: func(e event.DeleteEvent) bool { return false },
	UpdateFunc: func(e event.UpdateEvent) bool { return isSourceReady(obj) },
I think this will requeue for any readyToUse snapshot. Instead, what you're after is the single case where readyToUse switches from false to true, so something like
containerized-data-importer/pkg/controller/datasource-controller.go, lines 307 to 309 in ce1e83c:

UpdateFunc: func(e event.TypedUpdateEvent[*snapshotv1.VolumeSnapshot]) bool {
	return !reflect.DeepEqual(e.ObjectOld.Status, e.ObjectNew.Status) ||
		!reflect.DeepEqual(e.ObjectOld.Labels, e.ObjectNew.Labels)
I assume that we still want to requeue on any volume snapshot update even after it becomes ready? This would only allow for one single update requeue.
The controller requeues the request if there's an error/some other reason. You only have to catch the transition
I think watchOwned could also use a predicate; I believe what happens today is that any snap goes through its
What this PR does / why we need it:
We lack proper clone source watches in the clone populator, which can cause unwanted behavior when we depend on source updates to trigger a reconcile.
This PR adds two watches for each clone source type.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes: https://issues.redhat.com/browse/CNV-56518
Release note: