Reconcile is wrongly skipped after Cleanup if Endpoints remain the same #4343

@sayap

Description

Bug Description

The Cleanup logic in https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/v2.9.2/pkg/targetgroupbinding/resource_manager.go#L140-L143 doesn't seem to update the checkpoint. As a result, if we accidentally delete a Service port and trigger a cleanup, all targets are deregistered. Then, after we add the Service port back, the calculated checkpoint remains the same (assuming the Endpoints are unchanged from before the accidental delete), so the reconcile is wrongly skipped and no targets are registered.
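A minimal sketch of the failure mode, assuming a checkpoint computed as a digest of the desired endpoints. All names here (`checkpoint`, `reconcile`, the stored value) are hypothetical stand-ins for the controller's actual code, only meant to show why a stale checkpoint causes the skip:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checkpoint computes a digest over the desired endpoints, standing in for
// the controller's checkpoint annotation value.
func checkpoint(endpoints []string) string {
	h := sha256.New()
	for _, ep := range endpoints {
		h.Write([]byte(ep))
	}
	return hex.EncodeToString(h.Sum(nil))
}

// reconcile skips work when the stored checkpoint matches the newly computed
// one, assuming the registered targets already match the Endpoints.
func reconcile(stored string, endpoints []string) (newStored string, skipped bool) {
	cp := checkpoint(endpoints)
	if cp == stored {
		// Wrongly assumes targets are still registered.
		return stored, true
	}
	// ... register targets ...
	return cp, false
}

func main() {
	endpoints := []string{"10.0.0.1:8080", "10.0.0.2:8080"}

	// Initial reconcile registers targets and stores a checkpoint.
	stored, skipped := reconcile("", endpoints)
	fmt.Println("initial skipped:", skipped)

	// Cleanup deregisters all targets but leaves `stored` untouched —
	// this is the missing step: the checkpoint should be cleared here.

	// Endpoints are unchanged, so the next reconcile is skipped even
	// though no targets are registered anymore.
	_, skipped = reconcile(stored, endpoints)
	fmt.Println("after cleanup skipped:", skipped)
}
```

Clearing (or invalidating) the stored checkpoint during Cleanup would force the next reconcile to run and re-register the targets.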

Steps to Reproduce

  1. Delete a Service port that is used by the controller.
  2. Add back the Service port.

Expected Behavior

All targets are registered.

Actual Behavior

No target is registered.

Regression
Was the functionality working correctly in a previous version?

Not tested, but this should work fine with v2.9.0, before the checkpoint code was added.

Current Workarounds

We can either delete the checkpoint annotations to force a reconcile, or start a new service pod (or terminate an existing one) to change the Endpoints.

Environment

  • AWS Load Balancer controller version: v2.9.2
  • Kubernetes version: 1.30
  • Using EKS (yes/no), if so version?: eks.42
  • Using Service or Ingress: Service
