Some of the dynamic pods created in an RKE2 cluster throw java.net.UnknownHostException #46130
Replies: 2 comments
-
Hey Reddy, this isn't really the right place for a support question like this. Is there something on the website that is misleading or needs clarification related to this? If you need help using k8s, Slack would probably be a better place to seek help. There are a lot of people with experience running different applications on Kubernetes who may be able to help direct you towards an answer.
-
As mentioned in #46130 (comment), GitHub is not the right place for support requests. If you're looking for help, check Server Fault. You can also post your question on the Kubernetes Slack or the Discuss Kubernetes forum.
-
I am trying to deploy 200 dynamic Spring Boot pods (created from Job objects) from another Spring Boot pod (created from a Deployment object). Each dynamic pod has CPU and memory limits of 500m and 1.3 GB respectively.
The pods are created in sets of 10, with a 2 min delay between sets. When the total number of pods reaches around 100, some of the dynamic pods (around 3 to 5) fail to resolve the service DNS name of another pod and throw java.net.UnknownHostException.
I can reproduce this every time I create the 200 pods: the UnknownHostException occurs in 3 to 5 pods once the total number of pods reaches around 100.
Sometimes I also see an i/o timeout error in the CoreDNS pods. I am not sure whether this is related to the UnknownHostException seen in the dynamic Spring Boot pods.
Can anyone please explain the reason for this behavior, and how to address it?
NOTE: We never hit the UnknownHostException when we created 5 pods at a time with the same 2 min delay.
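One way to narrow this down is to check whether the lookup failure is transient (a later attempt succeeds) or persistent. The following is only a minimal diagnostic sketch, not a fix: the class name `DnsRetryProbe`, the `Resolver` interface, the attempt count, and the delays are all hypothetical and not part of the application described above. It retries the lookup with a doubling delay and logs every attempt. It also assumes the JVM's negative DNS cache should be disabled for the probe, since a cached failure would otherwise be returned without querying DNS again.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

public class DnsRetryProbe {

    /** Lookup abstraction (hypothetical) so the retry logic can be exercised without real DNS. */
    @FunctionalInterface
    public interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /**
     * Tries to resolve the host up to maxAttempts times, doubling the delay after each failure.
     * Returns the 1-based attempt that succeeded, or -1 if every attempt failed.
     */
    public static int resolveWithRetry(Resolver resolver, String host,
                                       int maxAttempts, long initialDelayMs)
            throws InterruptedException {
        long delayMs = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                InetAddress address = resolver.resolve(host);
                System.out.printf("attempt %d: %s -> %s%n",
                        attempt, host, address.getHostAddress());
                return attempt;
            } catch (UnknownHostException e) {
                System.out.printf("attempt %d: %s failed (%s)%n",
                        attempt, host, e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs);
                    delayMs *= 2; // back off before the next lookup
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        // Disable the JVM's cache of failed lookups so each retry really queries DNS.
        // This has to be set before the first lookup in the process.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
        // Pass the service DNS name as the first argument; "localhost" is only a placeholder.
        String host = args.length > 0 ? args[0] : "localhost";
        int attempt = resolveWithRetry(InetAddress::getByName, host, 5, 500);
        System.out.println(attempt > 0
                ? "resolved on attempt " + attempt
                : "never resolved");
    }
}
```

If the affected pods resolve the name on a second or third attempt, that would point to DNS queries being dropped or timing out under load (consistent with the i/o timeouts seen in CoreDNS) rather than to a missing Service record.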
Cluster information:
Kubernetes version: 1.23
Cloud being used: Bare-metal VMs created with VMware
Installation method: Air Gapped RKE2 Installation
Host OS: RHEL 8.6
CNI and version: calico - v3.22.1
CRI and version: crictl - v1.23.0
3 - Control Nodes (54GB RAM and 24 core CPU)
15 - Compute Nodes (54GB RAM and 24 core CPU)