feat: v1 migration and adaptation #528

Merged
merged 50 commits into from
Oct 24, 2024
Changes from 47 commits
Commits
ac88b99
chore: update devcontainer go version
tallaxes Oct 10, 2024
6c8c6b6
chore: refresh toolcain
tallaxes Oct 10, 2024
399191e
chore: additional processing on verify
tallaxes Oct 10, 2024
ca59c0b
chore: bump dependencies
tallaxes Oct 10, 2024
5f9e04c
chore: refresh Helm charts
tallaxes Oct 10, 2024
92a3e29
chore: update golangci config
tallaxes Oct 10, 2024
ae007bd
chore: remove feature gate for drift
tallaxes Oct 10, 2024
4018810
chore: update pre-commit tooling
tallaxes Oct 10, 2024
6031c41
chore: update the shape of main
tallaxes Oct 10, 2024
3b5d8d9
chore: update the alt operator
tallaxes Oct 10, 2024
a66728a
chore: update the API (move kubelet config to AKSNodeClass)
tallaxes Oct 10, 2024
9431774
chore: migrate cloud provider to v1 API
tallaxes Oct 10, 2024
53a24a7
chore: migrate operator to v1 API
tallaxes Oct 10, 2024
0f73504
chore: migrate controllers to v1 API
tallaxes Oct 10, 2024
785c4e3
chore: add nodeclass status controller
tallaxes Oct 10, 2024
6a2660f
chore: migrate providers to v1 API
tallaxes Oct 10, 2024
95f4055
chore: migrate test pkg to v1 API
tallaxes Oct 10, 2024
03bbfc6
chore: update utils
tallaxes Oct 10, 2024
b3d3b97
chore: update and migrate E2E tests to v1 API
tallaxes Oct 10, 2024
2f45040
feat: refresh and relink CRDs
tallaxes Oct 11, 2024
1da798c
fix: move code generation into subfolders to fix golangci-lint
tallaxes Oct 11, 2024
e390c26
fix: enable most of govet in golangci
tallaxes Oct 11, 2024
6a47c36
fix(linting): exclude alt operator logger
tallaxes Oct 11, 2024
3e5aecc
fix: add nodeclass termination controller
tallaxes Oct 11, 2024
9e6f8e2
fix(lint): restore linting on verify
tallaxes Oct 11, 2024
2f15c9f
feat: add nodeclass hash controller
tallaxes Oct 12, 2024
8ce0482
fix: register additional nodeclass and status controllers
tallaxes Oct 12, 2024
d4fac7f
fix(e2e): better selection of karpenter pod for logs
tallaxes Oct 12, 2024
1f4da31
fix(e2e): fix utilization suite
tallaxes Oct 12, 2024
1ac7035
chore(e2e): add events to dump-logs (and simplify)
tallaxes Oct 12, 2024
6d9edf0
chore: rename v1 to corev1
tallaxes Oct 12, 2024
a06dcc9
fix: remove extra $
tallaxes Oct 12, 2024
d7b6df9
fix(e2e): add cilium label and taint
tallaxes Oct 15, 2024
326817d
fix(e2e): fix labels and disruption for deamonset test
tallaxes Oct 15, 2024
270f1d9
Merge branch 'main' into tallaxes/v1-migration-merge
tallaxes Oct 16, 2024
9429725
feat: update kubelet configuration
tallaxes Oct 16, 2024
efbe216
fix: conflicting nodeclaim.garbagecollcation controller name
tallaxes Oct 16, 2024
c27a1ce
chore: restore webhooks in alt operator
tallaxes Oct 18, 2024
16d2984
Clean up commented out webhook code
matthchr Oct 18, 2024
aefa932
Merge branch 'main' into tallaxes/v1-migration
tallaxes Oct 22, 2024
d0b074a
fix(test): fix test for credential provider URL in custom data
tallaxes Oct 22, 2024
ea188ff
Make webhooks work in AKS CCP context (#537)
matthchr Oct 23, 2024
5273e90
Merge branch 'main' into tallaxes/v1-migration
tallaxes Oct 23, 2024
20ccbfd
chore: remove failSwapOn from kubelet settings in AKSNodeClass
tallaxes Oct 24, 2024
27fb940
fix: populate nodeClaim.Status.ImageID
tallaxes Oct 24, 2024
07052f3
fix: record NodeClass hash and add drift on static fields
tallaxes Oct 24, 2024
74fdd8a
chore: rename variabled
tallaxes Oct 24, 2024
82e34f9
fix: remove outdated comment
tallaxes Oct 24, 2024
3569700
fix: typo
tallaxes Oct 24, 2024
c91ea43
chore: update CRDs
tallaxes Oct 24, 2024
2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -5,7 +5,7 @@
  "build": {
  "dockerfile": "Dockerfile",
  "args": {
- "VARIANT": "1.22-bullseye"
+ "VARIANT": "1.23-bullseye"
  }
  },
  "runArgs": [ "--cap-add=SYS_PTRACE", "--security-opt", "seccomp=unconfined" ],
23 changes: 10 additions & 13 deletions .github/actions/e2e/dump-logs/action.yaml
@@ -31,26 +31,23 @@ runs:
  client-id: ${{ inputs.client-id }}
  tenant-id: ${{ inputs.tenant-id }}
  subscription-id: ${{ inputs.subscription-id }}
- - name: az set sub
+ - name: update cluster context
  shell: bash
- run: az account set --subscription ${{ inputs.subscription-id }}
+ run: |
+ az aks get-credentials --name ${{ inputs.cluster_name }} --resource-group ${{ inputs.resource_group }}
  - name: controller-logs
  shell: bash
  run: |
- echo "step: controller-logs"
- AZURE_CLUSTER_NAME=${{ inputs.cluster_name }} AZURE_RESOURCE_GROUP=${{ inputs.resource_group }} make az-creds
- POD_NAME=$(kubectl get pods -n karpenter --no-headers -o custom-columns=":metadata.name" | tail -n 1)
- echo "logs from pod ${POD_NAME}"
- kubectl logs "${POD_NAME}" -n karpenter -c controller
+ kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter --all-containers --ignore-errors
  - name: describe-karpenter-pods
  shell: bash
  run: |
- echo "step: describe-karpenter-pods"
- AZURE_CLUSTER_NAME=${{ inputs.cluster_name }} AZURE_RESOURCE_GROUP=${{ inputs.resource_group }} make az-creds
- kubectl describe pods -n karpenter
+ kubectl describe pods -n kube-system -l app.kubernetes.io/name=karpenter
  - name: describe-nodes
  shell: bash
  run: |
- echo "step: describe-nodes"
- AZURE_CLUSTER_NAME=${{ inputs.cluster_name }} AZURE_RESOURCE_GROUP=${{ inputs.resource_group }} make az-creds
- kubectl describe nodes
+ kubectl describe nodes
+ - name: get-karpenter-events
+ shell: bash
+ run: |
+ kubectl get events -A --field-selector source=karpenter
11 changes: 8 additions & 3 deletions .golangci.yaml
@@ -10,7 +10,7 @@ linters:
  - bidichk
  - errorlint
  - errcheck
- - exportloopref
+ - copyloopvar
  - gosec
  - revive
  - stylecheck
@@ -33,8 +33,9 @@ linters-settings:
  gocyclo:
  min-complexity: 11
  govet:
- enable:
- - shadow
+ enable-all: true
+ disable:
+ - fieldalignment
  revive:
  rules:
  - name: dot-imports
@@ -79,3 +80,7 @@ issues:
  - hack
  - charts
  - designs
+ - pkg/alt/knative # copy
+ - pkg/alt/karpenter-core/pkg/webhooks # copy
+ exclude-files:
+ - pkg/alt/karpenter-core/pkg/operator/logger.go # copy
9 changes: 5 additions & 4 deletions .pre-commit-config.yaml
@@ -1,22 +1,23 @@
repos:
- repo: https://github.com/gitleaks/gitleaks
rev: v8.18.1
rev: v8.20.1
hooks:
- id: gitleaks
- repo: https://github.com/golangci/golangci-lint
rev: v1.55.2
rev: v1.61.0
hooks:
- id: golangci-lint
- repo: https://github.com/jumanjihouse/pre-commit-hooks
rev: 3.0.0
hooks:
- id: shellcheck
- repo: https://github.com/crate-ci/typos
rev: v1.17.2
rev: v1.26.0
hooks:
- id: typos
args: [--write-changes, --force-exclude, --exclude, go.mod]
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
rev: v5.0.0
hooks:
- id: end-of-file-fixer
- id: trailing-whitespace
5 changes: 4 additions & 1 deletion Makefile
@@ -7,7 +7,7 @@ GOFLAGS ?= $(LDFLAGS)
  WITH_GOFLAGS = GOFLAGS="$(GOFLAGS)"

  # # CR for local builds of Karpenter
- KARPENTER_NAMESPACE ?= karpenter
+ KARPENTER_NAMESPACE ?= kube-system

  # Common Directories
  # TODO: revisit testing tools (temporarily excluded here, for make verify)
@@ -80,9 +80,12 @@ verify: toolchain tidy download ## Verify code. Includes dependencies, linting,
cp $(KARPENTER_CORE_DIR)/pkg/apis/crds/* pkg/apis/crds
yq -i '(.spec.versions[0].additionalPrinterColumns[] | select (.name=="Zone")) .jsonPath=".metadata.labels.karpenter\.azure\.com/zone"' \
pkg/apis/crds/karpenter.sh_nodeclaims.yaml
hack/validation/kubelet.sh
hack/validation/labels.sh
hack/validation/requirements.sh
hack/validation/common.sh
cp pkg/apis/crds/* charts/karpenter-crd/templates
hack/mutation/conversion_webhooks_injection.sh
hack/github/dependabot.sh
$(foreach dir,$(MOD_DIRS),cd $(dir) && golangci-lint run $(newline))
@git diff --quiet ||\

This file was deleted.

250 changes: 250 additions & 0 deletions charts/karpenter-crd/templates/karpenter.azure.com_aksnodeclasses.yaml
@@ -0,0 +1,250 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
name: aksnodeclasses.karpenter.azure.com
spec:
group: karpenter.azure.com
names:
categories:
- karpenter
kind: AKSNodeClass
listKind: AKSNodeClassList
plural: aksnodeclasses
shortNames:
- aksnc
- aksncs
singular: aksnodeclass
scope: Cluster
versions:
- name: v1alpha2
schema:
openAPIV3Schema:
description: AKSNodeClass is the Schema for the AKSNodeClass API
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: |-
AKSNodeClassSpec is the top level specification for the AKS Karpenter Provider.
This will contain configuration necessary to launch instances in AKS.
properties:
imageFamily:
default: Ubuntu2204
description: ImageFamily is the image family that instances use.
enum:
- Ubuntu2204
- AzureLinux
type: string
kubelet:
description: |-
Kubelet defines args to be used when configuring kubelet on provisioned nodes.
They are a subset of the upstream types, recognizing not all options may be supported.
Wherever possible, the types and names should reflect the upstream kubelet types.
properties:
allowedUnsafeSysctls:
description: |-
A comma separated whitelist of unsafe sysctls or sysctl patterns (ending in `*`).
Unsafe sysctl groups are `kernel.shm*`, `kernel.msg*`, `kernel.sem`, `fs.mqueue.*`,
and `net.*`. For example: "`kernel.msg*,net.ipv4.route.min_pmtu`"
Default: []
items:
type: string
type: array
containerLogMaxFiles:
default: 5
description: |-
containerLogMaxFiles specifies the maximum number of container log files that can be present for a container.
Default: 5
format: int32
minimum: 2
type: integer
containerLogMaxSize:
default: 50Mi
description: |-
containerLogMaxSize is a quantity defining the maximum size of the container log
file before it is rotated. For example: "5Mi" or "256Ki".
Default: "10Mi"
AKS CustomKubeletConfig has containerLogMaxSizeMB (with units), defaults to 50
pattern: ^\d+(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)$
type: string
cpuCFSQuota:
default: true
description: |-
CPUCFSQuota enables CPU CFS quota enforcement for containers that specify CPU limits.
Note: AKS CustomKubeletConfig uses cpuCfsQuota (camelCase)
type: boolean
cpuCFSQuotaPeriod:
default: 100ms
description: |-
cpuCfsQuotaPeriod sets the CPU CFS quota period value, `cpu.cfs_period_us`.
The value must be between 1 ms and 1 second, inclusive.
Default: "100ms"
type: string
cpuManagerPolicy:
default: none
description: cpuManagerPolicy is the name of the policy to use.
enum:
- none
- static
type: string
imageGCHighThresholdPercent:
description: |-
ImageGCHighThresholdPercent is the percent of disk usage after which image
garbage collection is always run. The percent is calculated by dividing this
field value by 100, so this field must be between 0 and 100, inclusive.
When specified, the value must be greater than ImageGCLowThresholdPercent.
Note: AKS AKS CustomKubeletConfig does not have "Percent" in the field name
format: int32
maximum: 100
minimum: 0
type: integer
imageGCLowThresholdPercent:
description: |-
ImageGCLowThresholdPercent is the percent of disk usage before which image
garbage collection is never run. Lowest disk usage to garbage collect to.
The percent is calculated by dividing this field value by 100,
so the field value must be between 0 and 100, inclusive.
When specified, the value must be less than imageGCHighThresholdPercent
Note: AKS CustomKubeletConfig does not have "Percent" in the field name
format: int32
maximum: 100
minimum: 0
type: integer
podPidsLimit:
description: |-
podPidsLimit is the maximum number of PIDs in any pod.
AKS CustomKubeletConfig uses PodMaxPids, int32 (!)
Default: -1
format: int64
type: integer
topologyManagerPolicy:
default: none
description: |-
topologyManagerPolicy is the name of the topology manager policy to use.
Valid values include:

- `restricted`: kubelet only allows pods with optimal NUMA node alignment for requested resources;
- `best-effort`: kubelet will favor pods with NUMA alignment of CPU and device resources;
- `none`: kubelet has no knowledge of NUMA alignment of a pod's CPU and device resources.
- `single-numa-node`: kubelet only allows pods with a single NUMA alignment
of CPU and device resources.
enum:
- restricted
- best-effort
- none
- single-numa-node
type: string
type: object
x-kubernetes-validations:
- message: imageGCHighThresholdPercent must be greater than imageGCLowThresholdPercent
rule: 'has(self.imageGCHighThresholdPercent) && has(self.imageGCLowThresholdPercent)
? self.imageGCHighThresholdPercent > self.imageGCLowThresholdPercent :
true'
maxPods:
description: MaxPods is an override for the maximum number of pods
that can run on a worker node instance.
format: int32
minimum: 0
type: integer
osDiskSizeGB:
default: 128
description: osDiskSizeGB is the size of the OS disk in GB.
format: int32
minimum: 100
type: integer
tags:
additionalProperties:
type: string
description: Tags to be applied on Azure resources like instances.
type: object
vnetSubnetID:
description: |-
VNETSubnetID is the subnet used by nics provisioned with this nodeclass.
If not specified, we will use the default --vnet-subnet-id specified in karpenter's options config
pattern: (?i)^\/subscriptions\/[^\/]+\/resourceGroups\/[a-zA-Z0-9_\-().]{0,89}[a-zA-Z0-9_\-()]\/providers\/Microsoft\.Network\/virtualNetworks\/[^\/]+\/subnets\/[^\/]+$
type: string
type: object
status:
description: AKSNodeClassStatus contains the resolved state of the AKSNodeClass
properties:
conditions:
description: Conditions contains signals for health and readiness
items:
description: Condition aliases the upstream type and adds additional
helper methods
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
type: string
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
maxLength: 32768
type: string
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
format: int64
minimum: 0
type: integer
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
type: string
status:
description: status of the condition, one of True, False, Unknown.
enum:
- "True"
- "False"
- Unknown
type: string
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
required:
- lastTransitionTime
- message
- reason
- status
- type
type: object
type: array
type: object
type: object
served: true
storage: true
subresources:
status: {}
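
For reviewers who want to exercise the new schema, the manifest below is a minimal sketch of an object this CRD should accept. It is not taken from the PR: the object name and the tag are hypothetical, and the values were picked only to satisfy the constraints visible in the schema above (the enums, `osDiskSizeGB` minimum of 100, the `containerLogMaxSize` pattern, and the rule that `imageGCHighThresholdPercent` must be greater than `imageGCLowThresholdPercent`).

```yaml
# Illustrative only: the name and tag values are hypothetical, not from the PR.
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: example-nodeclass        # cluster-scoped, so no namespace
spec:
  imageFamily: AzureLinux        # enum: Ubuntu2204 (default) or AzureLinux
  osDiskSizeGB: 128              # minimum 100, default 128
  maxPods: 110                   # minimum 0
  kubelet:
    cpuManagerPolicy: static             # enum: none | static
    topologyManagerPolicy: best-effort   # enum: restricted | best-effort | none | single-numa-node
    containerLogMaxSize: 50Mi            # must match ^\d+(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)$
    containerLogMaxFiles: 5              # minimum 2
    imageGCHighThresholdPercent: 85      # must be greater than the low threshold
    imageGCLowThresholdPercent: 80
  tags:
    team: example                # hypothetical tag
```

Swapping the two image GC thresholds should trip the `x-kubernetes-validations` rule and be rejected at admission, which makes it a quick way to check that the validation is wired up.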

This file was deleted.
