Replies: 1 comment
-
I checked, and this happens because the Pod has been running for a long time (17 hours). Newly created Pods do not have this issue.
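To illustrate the reasoning: Kubernetes resolves a container's image when the container starts, so a Pod created before the new image was pushed keeps running the older one until it is replaced. Below is a minimal sketch of that check, not anything ARC itself provides. The function `stale_pods`, the Pod names, and the Pod start times are hypothetical; only the 10:30 AM EST push time comes from the question.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset, used only so the example times match the post.
EST = timezone(timedelta(hours=-5))


def stale_pods(pod_start_times, image_pushed_at):
    """Return the names of Pods that started before the image was pushed.

    Hypothetical helper: such Pods cannot be running the newly pushed image,
    because the image is pulled when the container starts.
    """
    return sorted(
        name
        for name, started_at in pod_start_times.items()
        if started_at < image_pushed_at
    )


# Illustrative values: the push time is from the question,
# the Pod names and start times are made up for this sketch.
image_pushed_at = datetime(2024, 2, 14, 10, 30, tzinfo=EST)
pod_start_times = {
    "runner-old": datetime(2024, 2, 13, 19, 0, tzinfo=EST),  # about 17 hours before noon
    "runner-new": datetime(2024, 2, 14, 11, 0, tzinfo=EST),  # created after the push
}

print(stale_pods(pod_start_times, image_pushed_at))
```

In a real cluster, the start times would come from each Pod's status rather than a hand-written dictionary.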
-
Hello! We recently upgraded our ARC cluster to runner scale sets. It works great, except that we expect the runners to always pull the latest tag of the image: https://github.com/pytorch/benchmark/blob/main/docker/infra/values.yaml#L226
We rebuild the image nightly, and we want all runners to run the latest image tag. This worked well in the legacy mode of ARC.
However, after the upgrade, we found that the runner image is often outdated and is not updated to the latest tag even after we have pushed the new image. For example:
We pushed the dev20240214 image at 10:30 AM EST: https://github.com/pytorch/benchmark/actions/runs/7903121973, https://github.com/pytorch/benchmark/pkgs/container/torchbench/178971035?tag=dev20240214
However, a workflow that started at 12:00 PM EST still used the old dev20240213 image: https://github.com/pytorch/benchmark/actions/runs/7904726803/job/21575716541
Is there a way to make the K8s controller update the runners' image more aggressively?