Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1710: selinux: Update the KEP for 1.33 and graduate to Beta #5096

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

jsafrane
Copy link
Member

@jsafrane jsafrane commented Jan 28, 2025

  • One-line PR description: Update the KEP with current development (1.32/33) and a plan to graduate to Beta in 1.33
  • Other comments: See individual small commits.
    • Specify SELinux controller tests.
    • Add notes about other Linux distros (tested on Debian).
    • Add a note that the controller / KCM cannot implement SELinux label defaulting.

- A new job + testgrid with `SELinuxChangePolicy` enabled + `SELinuxMount`
  disabled is available.
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 28, 2025
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 28, 2025
keps/sig-storage/1710-selinux-relabeling/README.md Outdated Show resolved Hide resolved
keps/sig-storage/1710-selinux-relabeling/README.md Outdated Show resolved Hide resolved
* Telemetry numbers from OpenShift show that <5% of clusters would need to change any of their Pods.
* GA:
* This phase signalizes that the feature is ready for real testing. Only non-breaking parts (`SELinuxChangePolicy`) are enabled by default.
* GA of Phase 2 (`SELinuxChangePolicy` + `SELinuxMountReadWriteOncePod` are GA and locked to default):
* All known issues fixed. Otherwise, we will GA Phase 1 only.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify which phase we will be in when this is implemented? Will - SELinuxChangePolicy will have an material effect beyond allowing users to opt-out of mount-option for applying selinux labels?

What happens if SELinuxMount is enabled (because we are moving this to beta), will SELinuxChangePolicy will then have an material impact?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand. The feature gates need to be enabled in order, as described elsewhere in the KEP. Each feature gate enables a different piece of code and counts that the previous ones are enabled.

In addition, in this phase is SELinuxMount is beta, but disabled by default. "Beta" here really tells it's ready for testing,. just as we did with CSI migration and other potentially breaking changes.

I added some wording around to make it more explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, I noticed that we are already excluding pods from using mount option as selinux policy if they have SELinuxChangePolicy enabled and have opted-out of it.

So this question is kinda moot.

@kannon92
Copy link
Contributor

kannon92 commented Feb 1, 2025

PRR shadow:

I was reading the KEP and I think some updates for phase 1 may be needed.

https://github.com/kubernetes/enhancements/blob/fead564cd04409dc8f549ee655b3b5174c16fd29/keps/sig-storage/1710-selinux-relabeling/README.md#graduation-criteria

There is no mention of Phase 1 graduation. From the KEP issue, it sounds like you want to promote it to stable in 1.34.

@kannon92
Copy link
Contributor

kannon92 commented Feb 1, 2025

PRR shadow: looks good.
One item needs to be filled out.

This was tested manually before releasing SELinuxMountReadWriteOncePod enabled by default. It will be tested before SELinuxMount gets enabled by default.

@kannon92
Copy link
Contributor

kannon92 commented Feb 1, 2025

On a side note, what events are emitted in this case? The KEP mentions events but I didnt see what event would be emitted.

@kannon92
Copy link
Contributor

kannon92 commented Feb 1, 2025

And I think a prod-readiness should be updated as you are changing stage for one of the feature gates

https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/sig-storage/1710.yaml

@jsafrane
Copy link
Member Author

jsafrane commented Feb 3, 2025

From the KEP issue, it sounds like you want to promote it to stable in 1.34

I updated the issue, 1.35 is the earliest and optimistic GA of SELinuxMountReadWriteOncePod + SELinuxChangePolicy. SELinuxMount at least 1 release later.

There is no mention of Phase 1 graduation.

Good catch. I added "e2e tests implemented + green".

And I think a prod-readiness should be updated as you are changing stage for one of the feature gates

I will do it in a subsequent PR, once the enhancement text is polished and merged.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jsafrane
Once this PR has been reviewed and has the lgtm label, please assign johnbelamaric for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 4, 2025
@jsafrane
Copy link
Member Author

jsafrane commented Feb 4, 2025

Ok, I added PRR request. We want all three feature gates beta in 1.33.

@gnufied
Copy link
Member

gnufied commented Feb 4, 2025

from sig-storage and implementation POV. LGTM

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 4, 2025
@jsafrane
Copy link
Member Author

jsafrane commented Feb 6, 2025

I tested upgrade + downgrade + upgrade, it seems working to me.

One interesting observation is that when SELinuxChangePolicy FG got from true to false, the API server did not clear SELinuxChangePolicy field in existing Deployments (this is expected), but then when new Pods were created from these Deployments, their SELinuxChangePolicy got cleared to nil on creation (again, expected, but not obvious).

After flipping SELinuxChangePolicy back to true, the Deployments kept their SELinuxChangePolicy and newly created Pods inherited it.

Details:

  1. Run today's master (491a23f) with SELinuxMount && SELinuxChangePolicy off on Fedora 41, cri-o, SELinux enforcing, enabled SELinuxWarningController, single worker node (all the pods below run on the same node!)
    SELinuxWarningController won't run, because it's behind SELinuxChangePolicy feature gate, but it's on KCM command line and we can just flip the feature gates in subsequent steps

  2. Run 3 Deployments sharing the same RWO PVC in this sequence. Each has 1 replica.

    • mydeployment-priv with a Privileged Pod, no explicit SELinux label
    • mydeployment-c1 with seLinuxOpts{level:"c1,c0"} and default SELinuxChangePolicy (nil)
    • mydeployment-c2 with seLinuxOpts{level:"c2,c0"} and explicitly Recursive SELinuxChangePolicy. This is cleared to nil by the API server (SELinuxChangePolicy is off)
  3. Set FG SELinuxChangePolicy: "true", SELinuxMount: "false" in KAS, KCM, scheduler and kubelet, restart them.

    • conflicting events appeared - every pod conflict with every other.
  4. Set SELinuxChangePolicy of c2 and priv to Recursive

    • c1 still conflicts with c2 and priv.
    • c2 and priv do not conflict.
  5. Set SELinuxChangePolicy: "true", SELinuxMount: "true" in KAS, KCM, scheduler and kubelet, restart them.

    • Depending on the order in which pods started, some Pods can get ContainerCreating (because c1 wants mount -o context=...c1,c0 and the other wants no -o context).
    • c1 still conflicts with c2 and priv.
    • c2 and priv do not conflict.
  6. Set SELinuxChangePolicy: "true", SELinuxMount: "false" in KAS, KCM, scheduler and kubelet, restart them.

    • All pods are back Running.
    • Same conflict events as before
  7. Set SELinuxChangePolicy: "false", SELinuxMount: "false" in KAS, KCM, scheduler and kubelet, restart them.

    • All pods Running.
    • No conflicts (no SELinuxWarningController with SELinuxChangePolicy=false
  8. Set SELinuxChangePolicy: "true", SELinuxMount: "true" in KAS, KCM, scheduler and kubelet, restart them.

    • (skipping SELinuxChangePolicy: "true", SELinuxMount: "false" to save time)
    • Depending on the order in which pods started, some Pods can get ContainerCreating.
    • c1 still conflicts with c2 and priv.
    • c2 and priv do not conflict (= Deployments retained their SELinuxChangePolicy value from step 6).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants