Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

google_container_cluster: node_pool_defaults.node_config_defaults.insecure_kubelet_readonly_port_enabled not sent on create even when explicitly set by user #19520

Open
MaulinS-Pangea opened this issue Sep 18, 2024 · 9 comments

Comments

@MaulinS-Pangea
Copy link

MaulinS-Pangea commented Sep 18, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to a user, that user is claiming responsibility for the issue.
  • Customers working with a Google Technical Account Manager or Customer Engineer can ask them to reach out internally to expedite investigation and resolution of this issue.

Terraform Version & Provider Version(s)

Terraform v1.8.5
on darwin_arm64

  • provider registry.terraform.io/hashicorp/google v6.3.0

Affected Resource(s)

Reference: https://cloud.google.com/kubernetes-engine/docs/how-to/disable-kubelet-readonly-port#check-port-standard

As per GCP docs, to disable the kubelet read-only port at cluster level, the hierarchy is nodePoolDefaults.nodeConfigDefaults.nodeKubeletConfig. The terraform equivalent of this would be node_pool_defaults.node_config_defaults.insecure_kubelet_readonly_port_enabled in a google_container_cluster resource. The apply completes succesfully, but then if I run

gcloud container clusters describe cluster \
    --location=region \
    --flatten=nodePoolDefaults.nodeConfigDefaults 

I get

loggingConfig:
  variantConfig:
    variant: DEFAULT

The expected output should contain insecureKubeletReadonlyPortEnabled: false if the apply was successful and the port was disabled.

I believe the setting is applicable for new clusters only. When I do this at google_container_node_pool level, it works. Either way, it would be nice to have a bit more clear documentation.

gcloud container node-pools describe node-pool-name \
    --cluster=cluster \
    --location=region \
    --flatten=config \
    --format="value(kubeletConfig)"
cpuCfsQuota=False;cpuManagerPolicy=none;insecureKubeletReadonlyPortEnabled=False

Terraform Configuration

resource "google_container_cluster" "cluster" {
  ...
  node_pool_defaults {
    node_config_defaults {
      insecure_kubelet_readonly_port_enabled = "FALSE"
    }
  }
}

Debug Output

No response

Expected Behavior

No response

Actual Behavior

No response

Steps to reproduce

  1. terraform apply

Important Factoids

No response

References

No response

b/369904303

@rd-michel
Copy link

rd-michel commented Sep 20, 2024

hi @MaulinS-Pangea,
your terraform configuration only secured the "default pool" kubelet port. in order to disable the insecure kubelet port for all node pools you have to add the following snippet to every node config in every node pool

    kubelet_config {
      cpu_manager_policy                     = "none"
      insecure_kubelet_readonly_port_enabled = "FALSE"
    }

cpu_manager_policy is a required field ... ive set it to default

@ggtisc ggtisc self-assigned this Sep 23, 2024
@ggtisc ggtisc added service/container forward/review In review; remove label to forward labels Sep 24, 2024
@ggtisc
Copy link
Collaborator

ggtisc commented Sep 24, 2024

Hi @MaulinS-Pangea
As @rd-michel mentions you should try to use the suggested configuration. If after this you continue to have issues share us you complete google_container_cluster code to see what is happening.

You may check the terraform registry documentation here.

@megamih
Copy link

megamih commented Sep 25, 2024

Hi @ggtisc @rd-michel
I have the same issue with

  • Terraform v1.7.0 on darwin_arm64
  • provider registry.terraform.io/hashicorp/google v5.44.1
    I beleive that the way @MaulinS-Pangea (and I :) read the documentation is correct - documentaiton mentions default node pool settings for the entire cluster, and not settings for the default node pool:
[node_pool_defaults](https://registry.terraform.io/providers/hashicorp/google/5.44.1/docs/resources/container_cluster#node_pool_defaults) - (Optional) Default NodePool settings for the entire cluster. These settings are overridden if specified on the specific NodePool object.

I.e. I expect that the following Terraform Configuration

resource "google_container_cluster" "cluster" {
  ...
  node_pool_defaults {
    node_config_defaults {
      insecure_kubelet_readonly_port_enabled = "FALSE"
    }
  }
}

will set nodePoolDefaults.nodeConfigDefaults.nodeKubeletConfig.insecureKubeletReadonlyPortEnabled: false.
And unoftunately this is not the case here :(
Just in case, the assumed behaviour is true for gcfs_config, for example, when the following Terraform Configuration

resource "google_container_cluster" "cluster" {
  ...
  node_pool_defaults {
    node_config_defaults {
      gcfs_config {
        enabled = true
      }
    }
  }
}

would disable enable image streaming at cluster level and set nodePoolDefaults.nodeConfigDefaults.gcfsConfig.enabled: true

@megamih
Copy link

megamih commented Sep 26, 2024

OK, looks like I narrowed down the issue :) to the wrong handling of default values for insecure_kubelet_readonly_port_enabled.

Existing cluster state:

gcloud container clusters describe cluster \
    --project=project \
    --location=location \
    --flatten=nodePoolDefaults

---
nodeConfigDefaults:
  gcfsConfig:
    enabled: true

then I try to disable insecure kubelet readonly port with the following Terraform Configuration

resource "google_container_cluster" "cluster" {
  ...
  node_pool_defaults {
    node_config_defaults {
      insecure_kubelet_readonly_port_enabled = "FALSE"
    }
  }
}

and terraform shows no changes

terraform apply

No changes. Your infrastructure matches the configuration.

But if I explicitly enable insecure kubelet readonly port first

resource "google_container_cluster" "cluster" {
  ...
  node_pool_defaults {
    node_config_defaults {
      insecure_kubelet_readonly_port_enabled = "TRUE"
    }
  }
}

terraform detects the change

  ~ resource "google_container_cluster" "cluster" {
...
      ~ node_pool_defaults {
          ~ node_config_defaults {
              ~ insecure_kubelet_readonly_port_enabled = "FALSE" -> "TRUE"
                # (1 unchanged attribute hidden)

                # (1 unchanged block hidden)
            }
        }

and adds nodePoolDefaults.nodeConfigDefaults.nodeKubeletConfig block after apply

gcloud container clusters describe cluster \
    --project=project \
    --location=location \
    --flatten=nodePoolDefaults

---
gcfsConfig:
  enabled: true
nodeKubeletConfig:
  insecureKubeletReadonlyPortEnabled: true

And now I am able to actually disable insecure kubelet readonly port for new (not existing) node pools with

resource "google_container_cluster" "cluster" {
  ...
  node_pool_defaults {
    node_config_defaults {
      insecure_kubelet_readonly_port_enabled = "FALSE"
    }
  }
}

terraform detects the change

  ~ resource "google_container_cluster" "cluster" {
...
      ~ node_pool_defaults {
          ~ node_config_defaults {
              ~ insecure_kubelet_readonly_port_enabled = "TRUE" -> "FALSE"
                # (1 unchanged attribute hidden)

                # (1 unchanged block hidden)
            }
        }

and updates nodePoolDefaults.nodeConfigDefaults.nodeKubeletConfig block after apply

gcloud container clusters describe cluster \
    --project=project \
    --location=location \
    --flatten=nodePoolDefaults

---
gcfsConfig:
  enabled: true
nodeKubeletConfig:
  insecureKubeletReadonlyPortEnabled: false

@megamih
Copy link

megamih commented Sep 26, 2024

And just in case, I confirmed my understanding of node_pool_defaults.node_config_defaults block in google_container_cluster resource - it defines default settings for the new node pool for the entire cluster (can be overridden if explicitly specified at google_container_node_pool level), and not settings for the default node pool.

# I disabled readOnlyPort at cluster level via terraform with node_pool_defaults.node_config_defaults.insecure_kubelet_readonly_port_enabled: false
gcloud container clusters describe cluster \
    --project=project \
    --location=location \
    --flatten=nodePoolDefaults
---
nodeConfigDefaults:
  gcfsConfig:
    enabled: true
  nodeKubeletConfig:
    insecureKubeletReadonlyPortEnabled: false

# this did not affect existing node pool
gcloud container node-pools describe existing-pool \
    --project=project \
    --cluster=cluster \
    --location=location \
    --flatten=config \
    --format="value(kubeletConfig)"

nothing

# kubelet-config was not changed on the existing nodes too
gcloud compute instances describe existing-pool-gke-vm --zone zone  --project project | grep readOnlyPort
      readOnlyPort: 10255

# I forced existing nodes to restart via node pool upgrade
gcloud container clusters upgrade cluster \
  --node-pool=existing-pool \
  --project=project \
  --location=location

All nodes in node pool [existing-pool] of cluster [cluster] will be upgraded from version [1.29.6-gke.1326000] to version [1.29.8-gke.1031000].

# readOnlyPort config is still missing in the node pool config after upgrade
gcloud container node-pools describe existing-pool \
    --project=project \
    --cluster=cluster \
    --location=location \
    --flatten=config \
    --format="value(kubeletConfig)"

nothing

# kubelet-config was not changed on the nodes too
gcloud compute instances describe existing-pool-gke-vm --zone zone  --project project | grep readOnlyPort
      readOnlyPort: 10255

# I added the new test pool without explicitly disabling readOnlyPort
gcloud container node-pools create kubelet-test \
    --project=project \
    --cluster=cluster \
    --location=location \
    --num-nodes=1 \
    --service-account '[email protected]'

# and newly added node pool inherited cluster node_pool_defaults.node_config_defaults settings as expected
gcloud container node-pools describe kubelet-test \
    --project=project \
    --cluster=cluster \
    --location=location \
    --flatten=config \
    --format="value(kubeletConfig)"

insecureKubeletReadonlyPortEnabled=False

# readOnlyPort is disabled on the new node
gcloud compute instances describe kubelet-test-pool-gke-vm --project=project --zone=zone | grep readOnlyPort
      readOnlyPort: 0

# I enabled readOnlyPort via terraform with node_pool_defaults.node_config_defaults.insecure_kubelet_readonly_port_enabled: true
gcloud container clusters describe cluster \
    --project=project \
    --location=location \
    --flatten=nodePoolDefaults
---
nodeConfigDefaults:
  gcfsConfig:
    enabled: true
  nodeKubeletConfig:
    insecureKubeletReadonlyPortEnabled: true

# this did not affect existing node pool so I deleted it
gcloud container node-pools delete kubelet-test \
    --project=project \
    --cluster=cluster \
    --location=location

# and added it again
gcloud container node-pools create kubelet-test \
    --project=project \
    --cluster=cluster \
    --location=location \
    --num-nodes=1 \
    --service-account '[email protected]'

# and newly added node pool inherited cluster node_pool_defaults.node_config_defaults settings as expected again
gcloud container node-pools describe kubelet-test \
    --project=project \
    --cluster=cluster \
    --location=location \
    --flatten=config \
    --format="value(kubeletConfig)"
insecureKubeletReadonlyPortEnabled=True

# readOnlyPort is enabled on the new node
gcloud compute instances describe new-kubelet-test-pool-gke-vm --project=project --zone=location | grep readOnlyPort
      readOnlyPort: 10255

@ggtisc
Copy link
Collaborator

ggtisc commented Sep 26, 2024

Documentation inconsistency

It is not clear at all and causes confusion between users which is the correct way to set insecure_kubelet_readonly_port_enabled = "FALSE"

@ggtisc ggtisc added documentation and removed bug forward/review In review; remove label to forward labels Sep 26, 2024
@ggtisc ggtisc removed their assignment Sep 26, 2024
@wyardley
Copy link

wyardley commented Sep 28, 2024

OK, looks like I narrowed down the issue :) to the wrong handling of default values for insecure_kubelet_readonly_port_enabled.

Curious whether:
GoogleCloudPlatform/magic-modules#11688 (which came out in 6.4.0)
will make a difference, though it may not (since it only should affect the behavior at resource creation).

It's difficult to do perfectly because the API doesn't send a value if the value is suppressed. Either way, you may want to see if updating to 6.4.0 changes or fixes the behavior for you.

  • it defines default settings for the new node pool for the entire cluster (can be overridden if explicitly specified at google_container_node_pool level

Yes, in my understanding, it affects the default behavior of newly created nodepools if they don't have the setting set or not. There's also the nested node_config.node_kubelet_config in the google_container_cluster resource, which ideally shouldn't be used, but which affects the default nodepool that's created if remove_default_node_pool is not set. This parameter is also valid (for autopilot clusters anyway) in node_pool_auto_config.node_kubelet_config.

As @ggtisc says, the docs could probably be a little clearer in various spots. But it is also possible that there's an actual corner case somewhere with the behavior of node_pool_defaults.node_config_defaults.

Reading the top level Google docs on the various places this value can be set is probably a good way to double-check. For the most part, the attribute naming in the provider tracks with the Google APIs.

The good thing is that the default will be changing soon for newly created clusters, and at that point, hopefully people won't need to set this setting.

@wyardley
Copy link

wyardley commented Sep 28, 2024

@rd-michel:

cpu_manager_policy is a required field ... ive set it to default

Side note: it's not now as of 6.4.0 (#19464)
Note that in earlier versions of the provider, it's possible to set it to "" (IIRC) to get the default behavior, which doesn't totally line up with the docs.

@BBBmau BBBmau added this to the Goals milestone Oct 21, 2024
@BBBmau BBBmau added the size/m label Oct 21, 2024
@melinath
Copy link
Collaborator

Here's a base test config resource "google_container_cluster" "with_insecure_kubelet_readonly_port_enabled_node_pool_update" { name = "tf-test-awelkralskddrlkjawer" location = "us-central1-f" initial_node_count = 1

node_pool_defaults {
node_config_defaults {
insecure_kubelet_readonly_port_enabled = "FALSE"
}
}
deletion_protection = false
}

  • Create with insecure_kubelet_readonly_port_enabled = "FALSE" (or with the field unset): sends a POST request without a value for the field:
    "nodePoolDefaults": {
     "nodeConfigDefaults": {
      "loggingConfig": {
       "variantConfig": {}
      },
      "nodeKubeletConfig": {}
     }
    },
    
  • Update to remove insecure_kubelet_readonly_port_enabled sees no diff (because it's optional + computed, so it preserves the last value returned from the API)
  • Update to insecure_kubelet_readonly_port_enabled = "TRUE" sends a PUT request with:
    {
     "update": {
      "desiredNodeKubeletConfig": {
       "insecureKubeletReadonlyPortEnabled": true
      }
     }
    }
    

Previously we enabled "force sending" this field to ensure that it was sent even if false - but that caused #19428 and had to be rolled back, which has in turn caused this (slightly less bad) bug.

A large part of the reason we implemented this field as a string (that gets converted to a boolean internally) is that it should allow us to actually distinguish between "unset" and "set to FALSE" - which means we should be able to reliably tell the difference and only force send the field in cases where it has been explicitly set to FALSE by the user.

@melinath melinath changed the title google_container_cluster: node_pool_defaults.node_config_defaults.insecure_kubelet_readonly_port_enabled not working google_container_cluster: node_pool_defaults.node_config_defaults.insecure_kubelet_readonly_port_enabled not sent on create even when explicitly set by user Oct 22, 2024
@melinath melinath removed their assignment Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants