CRI API development policies #49455

Closed
2 changes: 2 additions & 0 deletions content/en/docs/reference/node/_index.md
@@ -6,6 +6,8 @@ no_list: true

This section contains the following reference topics about nodes:

* Policies on [Kubernetes feature development and Container runtimes](content/en/docs/reference/node/cri-api-development-policies.md)
Member

The link will need correction.

Suggested change
* Policies on [Kubernetes feature development and Container runtimes](content/en/docs/reference/node/cri-api-development-policies.md)
* Policies on [Kubernetes feature development and Container runtimes](/docs/reference/node/cri-api-development-policies)


* the kubelet's [checkpoint API](/docs/reference/node/kubelet-checkpoint-api/)
* a list of [Articles on dockershim Removal and on Using CRI-compatible Runtimes](/docs/reference/node/topics-on-dockershim-and-cri-compatible-runtimes/)

107 changes: 107 additions & 0 deletions content/en/docs/reference/node/cri-api-development-policies.md
Contributor

Does this document belong within https://k8s.dev/docs/?

If not, https://kubernetes.io/docs/reference/using-api/deprecation-policy/ should change (maybe with a new URL, and a redirect from its old home) to link to this new document.

@@ -0,0 +1,107 @@
---
content_type: "reference"
title: Kubernetes feature development and Container runtimes
weight: 10
Contributor

(nit) should be a larger number, i.e. lower priority

---

The mechanics of developing a feature that requires new CRI APIs are covered in
the CRI API [feature-development](https://github.com/kubernetes/cri-api?tab=readme-ov-file#feature-development) documentation.
Comment on lines +7 to +8
Contributor

In end user docs we avoid hyperlinking to pages on GitHub, especially anything in-project.

This article declares policies for developing new Kubernetes features
that require CRI API changes. The goal of these policies is to ensure a great user
experience for people trying a new feature early, adopting it when it is
enabled by default, and relying on it as GA functionality.

## Supported container runtimes
Contributor

We cannot risk implying that runtimes not on this list are unsupported. I think we must change this text.

I think this is a list of runtimes that this project uses for its own release testing. It's not related to conformance, for example.


Features and the CRI API are supposed to be portable and generic, not limited to
a specific container runtime. However, at this moment we require every feature to
work on two container runtimes: [Containerd](https://containerd.io/) and
[CRI-O](https://cri-o.io/). These are the two runtimes that are tested as part of
the Kubernetes development and release process. When this document refers to two
container runtimes, it means both Containerd and CRI-O. If any other
container runtimes begin working actively with the Kubernetes community, this
document will need to be updated.
Comment on lines +16 to +23
Contributor

This sounds like details aimed at contributors, not end users.

Also, avoid using “we”.


## Same maturity level (for beta and GA)

The implementation of an API needed for a Kubernetes feature in a container runtime
MUST be at least at the same maturity level as in k8s at the time of the Kubernetes release.
Contributor

use bold for emphasis; avoid using un-bolded ALL CAPS to achieve the same thing.

Also, avoid the abbreviation “k8s” (especially all lowercase) within reference pages, unless the API or whatever specifically uses the abbreviation.

This is similar to the [deprecation policy](/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior)
when one feature is replacing another.

For Containerd and CRI-O, this means that for a GA feature there should be a
release of both Containerd and CRI-O that implements the APIs needed for the
feature. Those APIs MUST NOT be marked as experimental
features ([Containerd experimental features](https://containerd.io/releases/#experimental-features)).
For beta, neither container runtime has a notion of a beta feature or release, so
realistically the same maturity criteria apply as for GA.

## Same maturity level (for alpha)

There should be at least one implementation of an API needed for the Kubernetes
feature merged into the container runtime default branch (main) or marked as
experimental. An e2e test demonstrating that the feature works should be
merged alongside the code. Note that tests may run against the default branch of a
container runtime while the feature has not yet shipped in a release.

The actual container runtime release may be delayed to a later stage, but
Kubernetes highly encourages fast availability of a container runtime
release that can be tested by early adopters.

## Minimal number of implementations

Both Containerd and CRI-O MUST have a GA release that implements the
API needed for a Kubernetes feature before this Kubernetes feature can be
promoted to GA.
Comment on lines +53 to +55
Contributor

Avoid making this sound like a commitment. Imagine if the containerd project closed and everyone moved to a rival thing (Docker maybe?)

We would still release Kubernetes and wouldn't require, e.g., a bump to major version 2 to bypass this. It's not a commitment.

Member Author

This is a commitment to an extent. If it closes, we will rewrite the document. The goal is to not let end users down with a feature that is implemented but has no reasonable way to be used on a supported runtime.

Contributor

My opinion here: a commitment we rewrite when the context changes doesn't sound like an actual commitment.

I recommend getting SIG Architecture to make a note somewhere about what commitments we are definitely making, and which are a strong promise to make best efforts.


## Safe Kubernetes defaults

The feature cannot be enabled by default in Kubernetes as a beta feature before
the required APIs are implemented in both container runtimes (Containerd and
CRI-O) and there is a GA release of each container runtime. The feature
can be marked as beta, but disabled by default, if only one container
runtime implementation of a required API is released as GA. Note that, as for
any Kubernetes feature, at least one release with the beta feature enabled by
default is required before it progresses to GA.
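
The rule above can be read as a small decision function. The following is a minimal sketch in Python, not part of the proposed page: `RuntimeSupport`, `beta_default`, and the returned strings are hypothetical names chosen for illustration. The case where no runtime has a GA release is not spelled out in this section, so the last branch is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RuntimeSupport:
    """Hypothetical record of one runtime's support for a required CRI API."""
    name: str
    implemented: bool    # the required API is implemented in the runtime
    in_ga_release: bool  # a GA release of the runtime ships that implementation


def beta_default(runtimes: Sequence[RuntimeSupport]) -> str:
    """Return the beta state the policy text allows for a feature."""
    ready = [r for r in runtimes if r.implemented and r.in_ga_release]
    if runtimes and len(ready) == len(runtimes):
        # Every listed runtime has a GA release with the API.
        return "beta, enabled by default"
    if ready:
        # Only some runtimes are ready: beta is allowed, but off by default.
        return "beta, disabled by default"
    # Assumption: the text does not describe this case explicitly.
    return "not eligible for beta"
```

Passing one entry each for Containerd and CRI-O mirrors the "both container runtimes" wording; the function itself does not hard-code the runtime names.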

## Guarantee portability

The feature design (KEP PR) MUST be lgtm-ed by container runtime maintainers of
CRI-O and Containerd.

The feature can only be merged as alpha in Kubernetes if the maintainers of
both container runtimes agree on the feature design in general and on the API
shape. Gaining this agreement will often involve authoring a pull request
that demonstrates an API implementation in both container runtime repositories,
or an alternative way for container runtime maintainers to confirm the viability of
the suggested CRI APIs.

## Guaranteed implementation

A CRI API can only be merged if there is a PR in both the Kubernetes repository and
at least one container runtime repository that uses this API and demonstrates the
feature working end to end. See the CRI API
[feature-development](https://github.com/kubernetes/cri-api?tab=readme-ov-file#feature-development)
documentation.

## Feature discoverability

Kubernetes features that depend on the environment or special container runtime
capabilities must have their own explicit API configuration (like Pod API or Node
API) and must not depend on cluster or node configuration that is not
clearly exposed via these APIs. For example, it is OK to have Windows-specific
features that are configured via Pod API. But it is not OK to design a feature
that works on one container runtime and is incompatible with the other
container runtime. There are three exceptions to this rule:

- There will be different behavior during the feature adoption period while
older runtime versions do not support the API yet. In those cases, attempting
to use the feature must fail as fast as possible.
- LTS and older versions of container runtimes may not implement
an API and may still be widely used by Kubernetes end users.
- If any of a container runtime's underlying systems cannot support the feature
in principle (e.g. [Kata Containers](https://katacontainers.io/) with CRI-O
may have limitations), while CRI-O still supports the feature without these
systems configured, this must be designed as part of normal operation. In
this case, Pod or Node APIs must handle these cases gracefully, and the
limitations must be documented clearly.
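
To illustrate the "fail as fast as possible" requirement in the first exception, here is a minimal Python sketch, not part of the proposed page. The function, the exception, and the API names are all hypothetical; how a real runtime advertises its capabilities is outside this sketch.

```python
class UnsupportedFeatureError(Exception):
    """Raised before any work starts when the runtime lacks a required API."""


def check_runtime_support(requested, supported):
    """Fail fast: reject the request up front if any required API is missing.

    Both arguments are iterables of hypothetical API names.
    """
    missing = sorted(set(requested) - set(supported))
    if missing:
        raise UnsupportedFeatureError(
            "container runtime does not support: " + ", ".join(missing)
        )
```

Running the check before any container is created means a user on an older runtime sees one clear error instead of a partially applied feature.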