Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] KEP 5105: kubectl top pvc #5106

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tsmetana
Copy link
Member

  • One-line PR description: Add a new kubectl top persistentvolumeclaim subcommand that would allow to display volume usage of PVCs
  • Other comments:

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/cli Categorizes an issue or PR as relevant to SIG CLI. labels Jan 30, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tsmetana
Once this PR has been reviewed and has the lgtm label, please assign eddiezane for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jan 30, 2025
@tsmetana
Copy link
Member Author

/cc @ardaguclu
/cc @gmeghnag

@k8s-ci-robot
Copy link
Contributor

@tsmetana: GitHub didn't allow me to request PR reviews from the following users: gmeghnag.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @ardaguclu
/cc @gmeghnag

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@googs1025
Copy link
Member

/cc

@tsmetana tsmetana force-pushed the kep-5105-kubectl-top-pvc branch from 5c3233a to 15f2fec Compare January 30, 2025 11:50
Comment on lines 293 to 295
There are the `kubelet_volume_stats_used_bytes` and
`kubelet_volume_stats_capacity_bytes` Prometheus metrics which can be used to
compute the volume used space percentage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, kubectl top pod and kubectl top node are obtained by the metrics server definition API, not directly from prometheus. If we want to support kubectl top pvc, what we want is to get it directly from prometheus? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, kubectl top pod and kubectl top node are obtained by the metrics server definition API, not directly from prometheus. If we want to support kubectl top pvc, what we want is to get it directly from prometheus? 🤔

By using Prometheus query we can get the "most interesting" value (usage percentage) with just one call -- it is simple. Though I understand the Prometheus dependency is a disadvantage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @googs1025 mentioned, the implementation needs to be same with top node and top pod. Direct Prometheus query would be very non-standard (and does not work for some cases like exec credentials plugins, etc.).

Good part is following the same patterns of these 2 commands would make the implementation easier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsmetana We have discussed similar issues in other issues, you can also refer to: 🤔
FYI: kubernetes/kubectl#1696
FYI: kubernetes-sigs/metrics-server#1623

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I removed the Prometheus references. I believe the same constraints apply for the metrics-server, so the change is not that big in the end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After going through the documentation, I wonder what would be the benefit of using the metrics API. The metrics-server README states:

Metrics Server is meant only for autoscaling purposes. For example, don't use it to forward metrics to monitoring solutions, or as a source of monitoring solution metrics. In such cases please collect metrics from Kubelet /metrics/resource endpoint directly.

And what we're trying to achieve sounds more like the "for example": just monitoring. Reading the metrics directly from kubelet would remove the necessity of running the metrics-server altogether. Yes, it's lighter dependency than Prometheus, but still an additional dependency that doesn't look necessary. In the light of the above quote I even wonder if the existing kubectl top implementation is doing the right thing by depending on the metrics API for simple resource usage querying and whether adding APIs for volume usage metrics would not be too heavyweight solution -- autoscalers have no use for them at all.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar KEP had been proposed in the past at #497, but it was closed without being implemented. Eventually, someone created a kubectl plugin (yashbhutwala/kubectl-df-pv) instead. Implementing it as a kubectl plugin might be one option.

If this feature depends on the Metrics API (where the metrics-server retrieves metrics from the kubelet and exposes them via the Metrics API), I think we need to discuss it with the SIG-Instrumentation (owner of the Metrics API and metrics-server subprojects) and get their agreement. See kubernetes-sigs/metrics-server#1318.

Here are some points I’m personally curious about:

  • What are the required permissions to execute kubectl top pvc? Read access to Pods and PVCs?
  • Not all PVs are mounted as file systems on Nodes. In the case of PVs using a FUSE CSI driver, what would be the expected result of kubectl top pvc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Not all PVs are mounted as file systems on Nodes. In the case of PVs using a FUSE CSI driver, what would be the expected result of kubectl top pvc?

This is a good point. 🤔 There may be multiple types of PVs and support for third-party to do their own, which may make this implementation more complicated? In addition: kubelet_volume_stats_used_bytes and
kubelet_volume_stats_capacity_bytes are both compatible and feasible for the current mainstream PVs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* What are the required permissions to execute `kubectl top pvc`? Read access to Pods and PVCs?

I guess that would depend on the implementation. The user would have to be able to get to the kubelet stats somehow. Either directly or potentially through the metrics-server. But I would really just keep the implementation confined to present statistics that are already present in the cluster.

  • Not all PVs are mounted as file systems on Nodes. In the case of PVs using a FUSE CSI driver, what would be the expected result of kubectl top pvc?

This is a good point. 🤔 There may be multiple types of PVs and support for third-party to do their own, which may make this implementation more complicated? In addition: kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes are both compatible and feasible for the current mainstream PVs?

The volume stats are being collected by cAdvisor AFAIK, so whatever it supports would work. That's also where potential new volume types support should come from. Which is another reason I'd like to really stick to using the existing metrics. I assume (I have some data from OpenShift cluster so I'm reasonably certain about it) they'd cover vast majority of use cases.

@tsmetana tsmetana force-pushed the kep-5105-kubectl-top-pvc branch from 15f2fec to e863b37 Compare February 3, 2025 14:10
@tsmetana tsmetana changed the title KEP 5105: kubectl top pvc [WIP] KEP 5105: kubectl top pvc Feb 6, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Status: Needs Triage
Development

Successfully merging this pull request may close these issues.

5 participants