Changes to attach probes at pod start #3206
Conversation
lgtm, this is the same PR as #3188
Add new lines to satisfy format check; update unit tests for DialContext.
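For context on the DialContext unit tests mentioned above, below is a minimal sketch of how a gRPC client connection to the node agent can be made test-friendly. This is an assumption-laden illustration: the package name, the DialNPAgent function, and the target handling are not the code in this PR.

```go
// Minimal sketch; npclient and DialNPAgent are hypothetical names.
package npclient

import (
	"context"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// DialNPAgent dials the network policy agent endpoint. Extra DialOptions can
// be injected so unit tests can pass grpc.WithContextDialer and point the
// connection at an in-memory listener (e.g. google.golang.org/grpc/test/bufconn)
// instead of the real local socket.
func DialNPAgent(ctx context.Context, target string, extra ...grpc.DialOption) (*grpc.ClientConn, error) {
	opts := append([]grpc.DialOption{
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	}, extra...)
	return grpc.DialContext(ctx, target, opts...)
}
```

In a test, the connection can then be pointed at a bufconn listener `lis` by passing `grpc.WithContextDialer(func(ctx context.Context, _ string) (net.Conn, error) { return lis.DialContext(ctx) })`, which is the kind of wiring DialContext unit tests typically exercise.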
@haouc I still have this issue even after updating to the new version; I can reproduce it:
+1, upgrading to 1.19.3 didn't resolve connectivity issues during pod startup
@janavenkat @m00lecule Can you let us know the flow in which you are seeing the issue, or repro steps that would help us reproduce it? Can you also share the logs of the issue to this email: "[email protected]"? Thanks
Reproduce steps:
@Pavani-Panakanti sample logs: #3203
Note: This PR is updated from #3188
What type of PR is this?
bug
Which issue does this PR fix?:
aws/aws-network-policy-agent#345
What does this PR do / Why do we need it?:
Testing done on this change:
Set up a generic IPv4 cluster with 3 nodes
EnforceNpToPod Testing - NP enabled
Created a pod and verified that the EnforceNpToPod calls are made and probes are attached to the pod as expected in default allow. Verified pod connectivity in the above case (a call-flow sketch follows this testing list)
Created multiple pods (10) at once and verified locking is happening in NP as expected and probes are being attached correctly
Did a large scale-up and verified calls are being made as expected
EnforceNpToPod Testing - NP disabled
Created pods and verified that NP is returning success immediately
DeletePodNp Testing - NP enabled
Deleted a pod and verified caches are being cleaned up and bpf programs are being deleted as expected
Deleted multiple pods at once and verified cleanup of caches and programs is working as expected
Verified locking on the NP side when multiple calls from the CNI to NP happen at the same time
Did a huge scale-down and verified the functionality
DeletePodNp Testing - NP disabled
Deleted pods and verified that NP is returning success immediately
Tested the below upgrade/downgrade functionality:
If a pod is created before the upgrade and deleted after the upgrade, we will call DeletePodNp. If the pod has policies applied, there will be probes attached and we will clean up the caches and programs. If the pod does not have any policies applied, we will not find any entries related to this pod in NP's internal caches and will just return success
If a pod is created after the upgrade and then deleted after a downgrade, no call will be made from the CNI to NP after the downgrade. If the pod has policies applied, probes will be cleaned up as part of the reconcile code flow (this is how it worked before these changes). If no policies are applied, pod info will be left in the internal caches and the BPF program will not be deleted; the probes will be deleted anyway as part of pod delete, so this should not be a concern. Internal caches will be reset on NP agent restart
Tested NP enable -> disable and disable -> enable functionality as well
Tested all of the above with an IPv6 cluster
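To make the tested flow concrete, here is a minimal sketch of what the CNI-side calls to the node agent at pod add/delete could look like. The npBackendClient interface, function names, and timeout are illustrative assumptions; the actual generated gRPC types and signatures used by the PR may differ.

```go
// Hypothetical sketch of the pod add/delete call flow described above.
package cni

import (
	"context"
	"fmt"
	"time"
)

// npBackendClient is a stand-in for the generated gRPC client exposed by the
// network policy agent; only the two calls discussed above are modeled.
type npBackendClient interface {
	EnforceNpToPod(ctx context.Context, podName, podNamespace string) error
	DeletePodNp(ctx context.Context, podName, podNamespace string) error
}

// onPodAdd runs in the CNI ADD path: attach probes (default allow until
// policies reconcile) before the pod starts serving traffic, so there is no
// window where the pod runs without probes.
func onPodAdd(ctx context.Context, client npBackendClient, podName, podNamespace string) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second) // illustrative timeout
	defer cancel()
	if err := client.EnforceNpToPod(ctx, podName, podNamespace); err != nil {
		return fmt.Errorf("EnforceNpToPod failed for %s/%s: %w", podNamespace, podName, err)
	}
	return nil
}

// onPodDelete runs in the CNI DEL path: ask the agent to clean up its caches
// and the pod's BPF programs; with NP disabled or no cache entry, the agent
// simply returns success.
func onPodDelete(ctx context.Context, client npBackendClient, podName, podNamespace string) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	return client.DeletePodNp(ctx, podName, podNamespace)
}
```

Serialization of concurrent calls (the locking verified above when many pods are created or deleted at once) happens on the NP agent side, not in this client-side sketch.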
Will this PR introduce any new dependencies?:
No
Will this break upgrades or downgrades? Has updating a running cluster been tested?:
Tested upgrades and downgrades. It will not break any upgrades or downgrades. The CNI and NP agent versions have to be compatible after this change
Does this change require updates to the CNI daemonset config files to work?:
No
Does this PR introduce any user-facing change?:
No
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.