Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not pass null value variables #1367

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

masterleros
Copy link

@masterleros masterleros commented Sep 23, 2020

Hello Terragrunt team,
Thanks for this wonderfull tool!
Currently I'm facing a blocker issue using null values. A null value variables are "unset" variables, which means that terraform does not recognize it. Said that, when a null value is assigned to a variable it should not be exported to terraform (same behavior as terraform)

Here is a thread talking about the issue: #892

One example is when a dependency dynamically offers an output and it becomes optional to the children inputs:

inputs ={
  optional_input = try(dependency.foo.outputs.optional_output, null)
}

In the below example, when in terraform dependency's code the output was set to "null", the output actually will not be exported and will be not available, hence we will be defaulting to a null value by the try function (because has failed to find it)
That means that the input will not be set and terraform will use the default input value defined on its own "variables.tf" module's file, example:

variable optional_input {
   type = number
   default = -1
}

In this case, "-1" will be used because there was no input/dependency information about this value.

Update:
I've added a new Terragrunt option for 'retro-compatibility': --terragrunt-pass-null-vars which will enable current behavior passing the "null" string as value for null variables

@yorinasub17
Copy link
Contributor

yorinasub17 commented Sep 23, 2020

Thanks for the contribution!

I disagree with this:

Said that, when a null value is assigned to a variable it should not be exported to terraform (same behavior as terraform)

As you may want to explicitly set variables to null (e.g., you may have two required variables that are mutually exclusive, and you need to set one to null).

I think #1267 is the proper fix for this, as the intention of setting input to null in terragrunt is to pass the null value to the variable, which may have special meaning in certain modules (as opposed to omitting it from the input). Said another way, we want terragrunt to mimic the behavior of tfvars as much as possible as that is the mental model of what we are doing (preprocessing the tfvars) with the inptus. Adding this PR in deviates from that mental model, and it would make it hard to switch to that model when we address the remaining security concern in that PR.

I think a better fix for what you want would be to introduce a helper function that filters out map keys with null values (similar to compact for list). That way, this behavior is strictly opt in and is explicit. Do you think you can rework this change with that?

Note that you should be able to already do this today using for expressions (although ugly: which is why it would help to have a function):

inputs = {
  for key, val in {
    # ... your inputs ...
  } :
  key => val if val != null
}

With a helper, that would look like:

# NOTE: mapcompact does not exist!
inputs = mapcompact({
  # ... your inputs ...
})

@masterleros
Copy link
Author

Thank you for the reply,

The workaround you mention would be helping me already, thanks for that.

Additionally, I agree that the tfvars file would be able to handle 'null' value variables (unfortunately unsupported on TF_VAR_*) but it has the trade-off of provably write 'sensitive' information to disk (which may a blocker too for my requirements).

The PR approach is to follow same Terraform behavior with variables, these are not included in any definition when null,. For example, in Terragrunt you cannot consume a Terraform output 'null' value, variable does not exist, you need to threat it as if not defined. Same rules apply on my implementation.

Look at the below terraform example:

variable optional { default = null }
output optional { value = var.optional }

The output will exist depending if null or not:

terraform apply -input=false -var optional="a value"

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

optional = a value

If no value is provided, no output exist:

terraform apply -input=false

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

If make sense, I can rework --terragrunt-pass-null-vars to --terragrunt-ignore-null-vars which would default to current approach but would allow to ignore null value variables.

@yorinasub17
Copy link
Contributor

it has the trade-off of provably write 'sensitive' information to disk (which may a blocker too for my requirements).

Yes this is the reason the PR is blocked, and it won't be merged until we figure out a way to avoid this.

Note that if we were focused on solving the null problem, there is an alternative approach to #1267 that only does the tfvars file approach for those inputs that are explicitly set to null. The point is that there are different ways to solve the problem of passing null through, and we want to do that eventually because...

The PR approach is to follow same Terraform behavior with variables, these are not included in any definition when null,. For example, in Terragrunt you cannot consume a Terraform output 'null' value, variable does not exist, you need to threat it as if not defined. Same rules apply on my implementation.

Yes I understand this approach. I wanted to highlight that there are use cases where it makes sense to not have a default but handle null at the terraform module level, and that this feature is a one way street to close that approach off (without manipulating feature flags).

Here is an example: suppose you had two ways of getting the AMI: one is to get the AMI by ID, and the other is to get it by filters. You might define it as such in your module:

variable "ami_id" {
  type = string
}
variable "ami_version_tag" {
  type = string
}

data "aws_ami" "example" {
  count = var.ami_version_tag != null && ami_id == null ? 1 : 0
  executable_users = ["self"]
  most_recent      = true
  name_regex       = "^myami-\\d{3}"
  owners           = ["self"]

  filter {
    name   = "name"
    values = ["myami-version-${var.ami_version_tag}-*"]
  }
  
  # ...other filters intentionally omitted...
}

resource "aws_instance" "web" {
  ami           = var.ami_id != null && var.ami_version_tag == null ? var.ami_id : data.aws_ami.example[0].id
  # ... other vars omitted for brevity ...
}

We want both to be required so that the user gets a reasonable error if they don't set either of them, because an AMI is required.

Yes you can easily workaround this by setting default to null to make it compatible with your feature here. But then the error message will be confusing, and there is no way to go back to the other way when we fix the null passing without using the special CLI flag every time you apply. Which brings me to...

If make sense, I can rework --terragrunt-pass-null-vars to --terragrunt-ignore-null-vars which would default to current approach but would allow to ignore null value variables.

We're generally weary of terragrunt feature flags on the CLI as it goes against the spirit of IaC. It would be better if the setting was captured in the config so that it is explicit, and you don't have to remember to pass in the flag all the time (or worse, when you have mixed requirements where some modules need the flag and others do not, and dealing with the flag during an apply-all). The point is that anything that always need to be passed in via the CLI for the module to function should be offered in the config.

This is why I suggested using a function that explicitly filters the nulls out. This would be captured in the config, and you have control over when it should be done and not, and it is locally scoped. The function is, IMO, also much more intuitive than a feature flag because it works on primitive data structures and thus does not worry about things like if the var is already defined as TF_VAR_ (e.g., if --terragrunt-ignore-null-vars is set, does that also ignore TF_VAR_varname=null? This is a pretty reasonable interpretation if it is a feature flag named as such). This is also much more maintainable: we don't need to float all the CLI args and make sure it is used in the right places, nor do we need to add extra parsing or usage logic for a special attribute (which is the alternative: e.g. terragrunt_pass_null_vars = false).

However, most of this (function over attribute feature flag) is opinionated so if you have a reason why the function approach does not work for you, happy to consider the attribute approach. But I do feel strongly that the CLI flag is not the right approach here.

@masterleros
Copy link
Author

masterleros commented Sep 29, 2020

I know that it may be not what Terragrunt group would like to implement, but as per Terraform null value definition it means:

null: a value that represents absence or omission. If you set an argument of a resource or module to null, Terraform behaves as
though you had completely omitted it — it will use the argument's default value if it has one, or raise an error if the argument is
mandatory. null is most useful in conditional expressions, so you can dynamically omit an argument if a condition isn't met.

The important part: it will use the argument's default value if it has one, or raise an error if the argument is mandatory

If the problem is readability of the error, potentially Terragrunt may print out a warning message when a variable is set to null.

Thanks

@masterleros
Copy link
Author

Yes you can easily workaround this by setting default to null to make it compatible with your feature here. But then the error message will be confusing, and there is no way to go back to the other way when we fix the null passing without using the special CLI flag every time you apply. Which brings me to...

This is provably incomplete, because if you provide both inputs it will still bring a confusing error (or even an unexpected situation)

if that is the case, this should be validated, for example, Forseti Security implements a validation step: in This example:

resource "null_resource" "org_id_and_folder_id_are_both_empty" {
  count = length(var.composite_root_resources) == 0 && var.org_id == "" && var.folder_id == "" ? 1 : 0

  provisioner "local-exec" {
    command     = "echo 'composite_root_resources=${var.composite_root_resources} org_id=${var.org_id} folder_id=${var.org_id}' >&2; false"
    interpreter = ["bash", "-c"]
  }
}

Or I have created my own validator for these situations where they are mutually exclusive:

data external invalid_project_parent {
    count   = (var.org_id == null ? 0 : 1) + (var.folder_id == null ? 0 : 1) < 1 ? 1 : 0
    query   = { ERROR = " either 'org_id' or 'folder_id' input are required" }
    ### DO NOT CHANGE ###
    program = [ "sh", "-c", "cat <&0 | sed 's/{\\|}\\|\\\"//g' >&2; false" ]
    ### DO NOT CHANGE ###
}

@yorinasub17
Copy link
Contributor

The problem with literally using the terraform definition is that it doesn't work like that when you set the variable to null at the top level, or even the module level.

For example, suppose you had a module as such:

variable "foo" {
  default = "hello world"
}

output "bar" {
  value = var.foo
}

And you invoked it with:

module "foo" {
  source = "/path/to/foo/module"
  foo = null
}

output "bar" {
  value = module.foo.bar
}

By your interpretation of the terraform defintion, it should output "hello world" for bar. However, it actually outputs "null".

Terragrunt is just feeding variables to modules, so as long as null is a valid value for the module inputs, we should support the ability to pass null.

@yorinasub17
Copy link
Contributor

To explain a different way, the mental model of terragrunt input passing is to mimic tfvars files, and it is unexpected to be omitting values automatically, if tfvars doesn't do that. E.g., in the same module foo, invoking with a tfvars file of:

foo = null

will output null instead of hello world. This deviation will be confusing.

So ignoring my personal opinion thatnull should be a value, we still want to default to try to pass null through to terraform and rely on terraform to handle it, to avoid unexpectations arising from the deviation from this mental model.


With that said, I can see why this is useful for certain use cases and hence I suggested an approach with a helper function that omits null values from the map. This feels like the best of both worlds:

  • We still default to preserve the behavior compatibility with tfvars: whatever we pass in gets passed through to terraform (assuming we solve the null passing issue).
  • If terraform changes the meaning of null to match their description even for modules, it will still work if we are passing null through (no need to change any processing in terragrunt). This is desirable because we automatically inherit how terraform works: again, the importance of following the mental model of how tfvars works.
  • If this is not desired, there is an "escape hatch" to clean up null values from the inputs map, in an easy to understand, explicit way in terragrunt.hcl.
  • Since this is just a helper function, it is straightforward to maintain from an implementation perspective.

@yorinasub17 yorinasub17 self-assigned this Oct 1, 2020
@JasP19
Copy link

JasP19 commented Jan 4, 2022

I know this response is coming quite late after the issue was opened but I believe Terraform 1.1 has introduced some functionality which makes this problem easier to solve.

When calling a module and supplying null for one of the input parameters, Terraform's default behaviour was always to set the actual value of the variable to null. There is a thread where many people have contested this behaviour: hashicorp/terraform#24142

Terraform 1.1 has released a nullable flag which allows module creators to choose whether null is a valid value for an input variable, or whether the variable should take on the default value when set to null: hashicorp/terraform#29832

As such, I propose that Terragrunt handles null values by passing null through to the module it is calling (I understand there are issues associated with actually passing the null value through and that there are workarounds available).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants