Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated backport of #895: Add support for resource syncer to use a shared informer #913: Use informer to retry failed resources due to missing #915

Conversation

tpantelis
Copy link
Contributor

Backport of #895 #913 on release-0.17.

#895: Add support for resource syncer to use a shared informer
#913: Use informer to retry failed resources due to missing

For details on the backport process, see the backport requests page.

@submariner-bot
Copy link
Contributor

🤖 Created branch: z_pr915/tpantelis/automated-backport-of-#895-#913-upstream-release-0.17

tpantelis added 3 commits May 16, 2024 08:51
...reduce the informer cache memory usage.

Fixes submariner-io#859

Signed-off-by: Tom Pantelis <[email protected]>
This was observed in a unit test elsewhere that configures a resync
period. On resync, the K8s DeltaFIFO retrieves every object from the
cache store and re-queues them. It also invokes the transform function
which results in a data race when the resource syncer tries to nil the
ManagedFields. This is because the object instance is the same as that
stored in the cache which was previously accessed by another thread. So
mutating it without the protection of a lock is unsafe. This is really
an issue with the DeltaFIFO - it is documented that the "TransformFunc
sees the object before any other actor, and it is now safe to mutate the
object in place instead of making a copy" however this is not the case
on a re-sync. The DeltaFIFO should either elide the TransformFunc on
re-sync or make a copy of the object retrieved from the cache store.

As a workaround, the resource syncer should only set ManagedFields to
nil if it is non-nil, which will be the case a a re-sync. Added a unit
test to cover this case.

Also the object passed to the TransformFunc could be a
DeletedFinalStateUnknown so we need to handle that as well.

Signed-off-by: Tom Pantelis <[email protected]>
@tpantelis tpantelis force-pushed the automated-backport-of-#895-#913-upstream-release-0.17 branch 2 times, most recently from ec0b7ac to d971880 Compare May 16, 2024 13:20
tpantelis added 3 commits May 16, 2024 10:30
Some users may want to configure a shared informer rather than use
the internal dedicated informer for efficiency if there's multiple
consumers of a resource type.

Signed-off-by: Tom Pantelis <[email protected]>
...rather than exponential requeuing which is problematic at scale.
The user can specify a SharedInformer that is used to requeue resources
that had previously failed when a missing namespace is later created.

Signed-off-by: Tom Pantelis <[email protected]>
@tpantelis tpantelis force-pushed the automated-backport-of-#895-#913-upstream-release-0.17 branch from d971880 to 022fa4b Compare May 16, 2024 14:30
@aswinsuryan aswinsuryan merged commit ba3d8ef into submariner-io:release-0.17 May 16, 2024
16 checks passed
@submariner-bot
Copy link
Contributor

🤖 Closed branches: [z_pr915/tpantelis/automated-backport-of-#895-#913-upstream-release-0.17]

@tpantelis tpantelis deleted the automated-backport-of-#895-#913-upstream-release-0.17 branch July 17, 2024 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants