Add caching option to get_env_var() #914

Open
kaarolch opened this issue Jun 27, 2024 · 2 comments
Labels
type: feature A value-adding code addition that introduce new functionality. vrl: stdlib Changes to the standard library

Comments

@kaarolch

kaarolch commented Jun 27, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

During metric processing, we started using VRL's get_env_var() in Vector to read environment variables and use them as feature flags. Unfortunately, we observed a significant spike in Vector's CPU utilization with the following configuration:

type: "remap"
inputs:
  - metrics_ingestion
source: |
  cardinality_limit, err = get_env_var("CARDINALITY_LIMIT")
  if err == null {
    .tags.cardinality_limit = cardinality_limit
  } else {
    .tags.cardinality_limit = "false"
  }
  metrics_processor, err = get_env_var("METRICS_PROCESSOR")
  if err == null {
    .tags.metrics_processor = metrics_processor
  } else {
    .tags.metrics_processor = "false"
  }
  tags_filter, err = get_env_var("TAGS_FILTER")
  if err == null {
    .tags.tags_filter = tags_filter
  } else {
    .tags.tags_filter = "false"
  }

This was applied to every metric. We have since restricted the transform to the metrics that really need it (around 30% of all processed metrics), which saved about 2-3 Kubernetes vCPUs (a drop from ~10 vCPUs to 7-8 vCPUs). The transform change was deployed around 13:15.

The problem became noticeably worse once we added more than two environment variable checks.

Attempted Solutions

As I mentioned, we are currently trying to minimize the use of environment variable checks. Now, we are also experimenting with data enrichment. During container startup, we will render an enrichment table and add the data as extra tags during metric remapping.
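
A minimal sketch of that enrichment-table approach, assuming a CSV file is rendered at container startup. The table name `feature_flags`, the file path, the column names `name` and `value`, and the transform name are all hypothetical and not taken from our actual setup:

```yaml
enrichment_tables:
  feature_flags:                            # hypothetical table name
    type: file
    file:
      path: /etc/vector/feature_flags.csv   # hypothetical path, rendered at startup
      encoding:
        type: csv
    schema:
      name: string
      value: string

transforms:
  add_feature_flags:                        # hypothetical transform name
    type: remap
    inputs:
      - metrics_ingestion
    source: |
      # Look up one flag by name; use an empty object when no row matches.
      row = get_enrichment_table_record("feature_flags", {"name": "TAGS_FILTER"}) ?? {}
      # Metric tags must be strings, so coerce and fall back to "false".
      .tags.tags_filter = string(row.value) ?? "false"
```

The CSV would contain a header row (`name,value`) followed by one row per flag, for example `TAGS_FILTER,true`. The lookup still runs once per event, so whether it is cheaper than get_env_var() would need to be measured.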

Proposal

Is there any chance to support environment variable caching with a TTL? When using remap to check for very static environment variables like ENV and VERSION, which do not change during runtime, we could set a TTL to avoid querying the system each time we set the same variable. Reference.

References

No response

Version

0.37.0

@kaarolch kaarolch added the type: feature A value-adding code addition that introduce new functionality. label Jun 27, 2024
@jszwedko
Member

Interesting, thanks for sharing @kaarolch ! I would not have expected that to be an expensive operation. We should investigate why it is and add caching if that is the best way to mitigate it.

As a workaround you can template environment variables into the config as shown here: https://vector.dev/docs/reference/configuration/#environment-variables. These are fetched only at configuration load time. It'd look something like:

  tags_filter = "${TAGS_FILTER:-}"
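
Applied to the three flags from the original report, a sketch of this workaround might look like the following. The transform name `add_feature_flags` is hypothetical, and the `false` defaults are an assumption chosen to mirror the fallback values in the reported config:

```yaml
transforms:
  add_feature_flags:              # hypothetical transform name
    type: remap
    inputs:
      - metrics_ingestion
    source: |
      # Vector substitutes ${VAR:-default} while loading the config,
      # so at runtime VRL only sees plain string literals here.
      .tags.cardinality_limit = "${CARDINALITY_LIMIT:-false}"
      .tags.metrics_processor = "${METRICS_PROCESSOR:-false}"
      .tags.tags_filter = "${TAGS_FILTER:-false}"
```

Because the values are fixed at configuration load time, a changed environment variable only takes effect after Vector reloads or restarts. Note also that `:-` applies the default when the variable is unset or empty, which differs slightly from the get_env_var() version, where an empty value would be passed through as-is.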

@jszwedko jszwedko transferred this issue from vectordotdev/vector Jun 27, 2024
@jszwedko jszwedko added the vrl: stdlib Changes to the standard library label Jun 27, 2024
@kaarolch
Author

kaarolch commented Jun 27, 2024

Thanks, @jszwedko.

I remember that in one of the previous versions we had an issue with:

tags_filter = "${TAGS_FILTER:-}"

but I will check on the current one.
