Migrate Admission Controller Validation to CEL #7690

Open
wants to merge 4 commits into
base: master

omerap12
Member

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR migrates the admission controller validation to use CEL (Common Expression Language) for improved flexibility and consistency in validation logic at the API server level.

Which issue(s) this PR fixes:

Fixes #7665

Special notes for your reviewer:

Does this PR introduce a user-facing change?

The validation process is now handled using CEL at the API server level. Expect different validation messages as a result.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/vertical-pod-autoscaler labels Jan 14, 2025
@k8s-ci-robot k8s-ci-robot requested a review from kgolab January 14, 2025 13:19
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 14, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 14, 2025
@omerap12
Member Author

I've tested it locally and it works, but I still need to add integration tests. Can you check if it looks good to you?
@adrianmoisey
@voelzmo

@adrianmoisey
Member

Whoa! That is neat!
I didn't know that the validation could happen in the CRD itself.

Two questions:

  1. The CRD is generated, so I assume these CEL rules need to be configured in the types somewhere so that they are output to the generated CRD.
  2. Moving these to CEL means the tests move from unit tests to e2e tests. What does that look like?

@omerap12
Member Author

> Whoa! That is neat!
>
> I didn't know that the validation could happen in the CRD itself.
>
> Two questions:
>
>   1. The CRD is generated, so I assume these CEL rules need to be configured in the types somewhere so that they are output to the generated CRD.
>   2. Moving these to CEL means the tests move from unit tests to e2e tests. What does that look like?

Yup!

Regarding point 1, I need to explore how to configure it, but kubebuilder annotations do support this.
As for point 2, I haven’t implemented the tests yet, but that is the intended approach (this is one of the downsides of this method).

@adrianmoisey
Member

> Regarding point 1, I need to explore how to configure it, but kubebuilder annotations do support this.

Yup, I see https://book.kubebuilder.io/reference/markers/crd-validation has these listed

@omerap12
Member Author

> > Regarding point 1, I need to explore how to configure it, but kubebuilder annotations do support this.
>
> Yup, I see https://book.kubebuilder.io/reference/markers/crd-validation has these listed

It is now generated by kubebuilder.

@omerap12
Member Author

Tests are failing because of the change I made in the admission controller, so we should migrate those tests into e2e/integration tests.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 15, 2025
@@ -324,7 +325,11 @@ spec:
Name of the container or DefaultContainerResourcePolicy, in which
case the policy is used by the containers that don't have their own
policy specified.
pattern: ^[a-zA-Z0-9-_]+$
Contributor

I don't think you can introduce new CRD validations without increasing the apiVersion.

Additionally, containerName: '*' is explicitly supported as a catch-all solution, see

    if containerPolicy.ContainerName == vpa_types.DefaultContainerResourcePolicy {
        defaultPolicy = &policy.ContainerPolicies[i]
    }
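
To illustrate the concern above, here is a minimal Go sketch that runs the pattern from the CRD diff against a few container names. The constant `defaultContainerResourcePolicy` and the helper `nameAllowed` are hypothetical names used only for this example; the value `"*"` is taken from the comment above. Go's `regexp` package is used here as a stand-in, so the API server's own pattern matching may differ in edge cases.

```go
package main

import (
	"fmt"
	"regexp"
)

// containerNamePattern is the pattern shown in the CRD diff under review.
var containerNamePattern = regexp.MustCompile(`^[a-zA-Z0-9-_]+$`)

// defaultContainerResourcePolicy is a hypothetical local copy of the
// catch-all container name ("*") mentioned in the review comment.
const defaultContainerResourcePolicy = "*"

// nameAllowed reports whether the pattern on its own accepts the name.
func nameAllowed(name string) bool {
	return containerNamePattern.MatchString(name)
}

func main() {
	names := []string{"nginx", "side-car_1", defaultContainerResourcePolicy}
	for _, name := range names {
		fmt.Printf("%q allowed by pattern: %v\n", name, nameAllowed(name))
	}
}
```

Under these assumptions the pattern accepts ordinary names but rejects the catch-all `*`, which is why the pattern would need to be adjusted to keep the existing behavior.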

Member Author

  1. Yes, you are right. It's just a WIP at the moment.
  2. Thanks, I will adjust :)

Member Author

Added in: 9ea4821

@@ -112,25 +112,8 @@ func parseVPA(raw []byte) (*vpa_types.VerticalPodAutoscaler, error) {

// ValidateVPA checks the correctness of VPA Spec and returns an error if there is a problem.
func ValidateVPA(vpa *vpa_types.VerticalPodAutoscaler, isCreate bool) error {
if vpa.Spec.UpdatePolicy != nil {
mode := vpa.Spec.UpdatePolicy.UpdateMode
if mode == nil {
Contributor

I don't see any checks added regarding updatePolicy.updateMode – is this intentional and those checks are implicitly done somewhere else now? Or do we need to add them as CEL validations as well?
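
For context, the kind of imperative check being asked about could look roughly like the Go sketch below. This is not the code removed by the PR: the types `UpdateMode` and `PodUpdatePolicy`, the function `validateUpdatePolicy`, the error texts, and the set of accepted modes are all assumptions made for illustration. Any CEL or schema-level replacement would need to cover both conditions.

```go
package main

import (
	"errors"
	"fmt"
)

// UpdateMode is a hypothetical stand-in for the VPA update mode type.
type UpdateMode string

// knownUpdateModes is an assumed set of accepted values.
var knownUpdateModes = map[UpdateMode]bool{
	"Off":      true,
	"Initial":  true,
	"Recreate": true,
	"Auto":     true,
}

// PodUpdatePolicy is a hypothetical, trimmed-down update policy.
type PodUpdatePolicy struct {
	UpdateMode *UpdateMode
}

// validateUpdatePolicy requires a mode whenever a policy is present,
// and requires that mode to be one of the known values.
func validateUpdatePolicy(p *PodUpdatePolicy) error {
	if p == nil {
		return nil // no policy, nothing to validate
	}
	if p.UpdateMode == nil {
		return errors.New("updateMode is required when updatePolicy is set")
	}
	if !knownUpdateModes[*p.UpdateMode] {
		return fmt.Errorf("unexpected updateMode value %q", *p.UpdateMode)
	}
	return nil
}

func main() {
	bogus := UpdateMode("Sometimes")
	fmt.Println(validateUpdatePolicy(&PodUpdatePolicy{UpdateMode: &bogus}))
}
```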

Member Author

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 15, 2025
Contributor

@voelzmo voelzmo left a comment

Sorry, didn't mean to approve 🙈

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: omerap12
Once this PR has been reviewed and has the lgtm label, please assign jbartosik for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 15, 2025
@omerap12
Member Author

I'm opening this draft PR for review to trigger the e2e tests (just to ensure everything is working), and then I plan to update the apiVersion.
Pinging @voelzmo, @raywainman, and @adrianmoisey for reviews.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 23, 2025
@omerap12 omerap12 marked this pull request as ready for review January 23, 2025 07:29
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 23, 2025
ContainerPolicies []ContainerResourcePolicy `json:"containerPolicies,omitempty" patchStrategy:"merge" patchMergeKey:"containerName" protobuf:"bytes,1,rep,name=containerPolicies"`
}

// ContainerResourcePolicy controls how autoscaler computes the recommended
// resources for a specific container.
// +kubebuilder:validation:XValidation:rule="!has(self.mode) || !has(self.controlledValues) || self.mode != 'Off' || self.controlledValues != 'RequestsAndLimits'",message="ControlledValues shouldn't be specified if container scaling mode is off"
Contributor

@maxcao13 maxcao13 Feb 18, 2025

I have a question about this. If I wanted to turn my VPA "On" and "Off" in my workload, whether for testing purposes or for any other reason, would this change force me to remove containerResourcePolicies first so that the update passes CEL validation?

Member Author

This rule is saying that it's invalid to have both mode set to "Off" and controlledValues set to "RequestsAndLimits" at the same time. You don't necessarily need to remove the entire containerResourcePolicies before turning the VPA off. You only need to ensure that when you set mode to "Off", you're not also specifying controlledValues as "RequestsAndLimits" for the same container.

So if you're just toggling the mode between "On" and "Off" and you're not using controlledValues: RequestsAndLimits, you can do so freely without modifying other parts of your configuration.
If you have controlledValues: RequestsAndLimits set and you want to turn the VPA off, you would need to either remove the controlledValues field or set it to a different value before setting mode: Off.

I need to add tests for this behavior of course.
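
The behavior described above can be sketched in plain Go as a mirror of the CEL expression's boolean logic. This is only an illustration, not how the API server evaluates the rule: `satisfiesRule` and `strPtr` are hypothetical helpers, a nil pointer stands in for an unset field (where CEL's `has()` would return false), and `"RequestsOnly"` and `"Auto"` are used purely as example values.

```go
package main

import "fmt"

// satisfiesRule mirrors the expression
//   !has(self.mode) || !has(self.controlledValues) ||
//   self.mode != 'Off' || self.controlledValues != 'RequestsAndLimits'
// with nil meaning "field not set".
func satisfiesRule(mode, controlledValues *string) bool {
	return mode == nil ||
		controlledValues == nil ||
		*mode != "Off" ||
		*controlledValues != "RequestsAndLimits"
}

// strPtr returns a pointer to s, for building test inputs.
func strPtr(s string) *string { return &s }

func main() {
	cases := []struct {
		label            string
		mode             *string
		controlledValues *string
	}{
		{"mode Off, RequestsAndLimits", strPtr("Off"), strPtr("RequestsAndLimits")},
		{"mode Off, RequestsOnly", strPtr("Off"), strPtr("RequestsOnly")},
		{"mode Off, controlledValues unset", strPtr("Off"), nil},
		{"mode Auto, RequestsAndLimits", strPtr("Auto"), strPtr("RequestsAndLimits")},
	}
	for _, c := range cases {
		fmt.Printf("%-35s valid=%v\n", c.label, satisfiesRule(c.mode, c.controlledValues))
	}
}
```

Only the first combination fails the rule as written, so toggling the mode is blocked only when controlledValues is explicitly RequestsAndLimits.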

Contributor

Got it, thanks for the explanation 💯

@@ -183,18 +186,23 @@ type PodResourcePolicy struct {
// +optional
// +patchMergeKey=containerName
// +patchStrategy=merge
// +kubebuilder:validation:MaxItems=100
Contributor

Is max 100 items documented somewhere or just an arbitrary number?

Member Author

It's still a work in progress - I just wanted to show what we can do with CEL validation, so I didn't document anything yet. I added this because of the runtime cost of CEL validation (https://kubernetes.io/docs/reference/using-api/cel/#runtime-cost-budget). I don’t think anyone will use more than 100 resource policies (right?)

Contributor

Let's not introduce new validations in this PR, but rather convert the existing ones.
If we want to introduce new validations, we can discuss this in a new issue/PR.

Member Author

Understood. I'll fix it.

Labels
area/vertical-pod-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

VPA: Migrate admission webhook validations to CEL where possible
5 participants