Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PFR - option to define & exclude dedicated resources/properties from CFN Drift Detection - e.g. due to AWS::EC2::SecurityGroupIngress references #1533

Open
rgoltz opened this issue Feb 19, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@rgoltz
Copy link

rgoltz commented Feb 19, 2023

Name of the resource

AWS::EC2::SecurityGroupIngress

Resource name

No response

Description

Summary & Starting point:

This idea is more a generic one - there are different circumstances, where CloudFormation Drift Detection reporting a DRIFT, but it's not - reasons could be:

After brainstorming with different people in CloudFormation area (e.g. @kanitkah @ziarrdan @LariWo), we thought of "exclusion list", which should be part of the CloudFormation template. By doing so, we could add this "exclusion-list for resources inside the stack, which should not be checked" to our version controlled CFN/CDK projects/files.

Since the benefit for case (2) is clear - Removing the resouce from list, after the github issue with the false-postive is solved. I'd like to add an example for case (1) here based on #1198:

  • Our Stack-A having a SecurityGroup (e.g. of an EC2 with a legacy database, later called as Logical ID DatabaseSecGroupIngressStaticInSync). All resources and stacks of this Team-A are not dynamically. Once deployed Team-A taking care of Stack-A.

  • There is a Team-B. This team utilize more cloud-native patterns, hence they create there ECS container-cluster only on request and drop this infrastructure (+ stacks, etc) after the workload is done and not required anymore.

  • There is a "contract" between Team-B and Team-A: In order to allow the Team-B's ECS components to access the database, Team-A defined a dedictated SecurityGroup for them (later named with Logical ID ThisSecGroupIsAlwaysDrifted) and inform on the Physical ID (like sg-666xyz), where Team-B can "inject" Ingress rules to access the database by defining a AWS::EC2::SecurityGroupIngress in TeamB's CFN/CDK. Team-A helps this pattern as well, since Team-B doesn't need to request new rules everytime once ECS is active and later removed again.

  • As we can see, Team-A encapsulate the access from Team-B using a dedicated SecurityGroup. The main reason is caused by issue AWS::EC2::SecurityGroupIngress causing false positive drift within referenced AWS::EC2::SecurityGroup #1198. Once Team-A checking there CloudFormation Stack (here called Stack-A) for Drifts, their SecurityGroup is always in state DRIFTED (due to #1198).

# This is "Stack-A" managed by "Team-A"
AWSTemplateFormatVersion: 2010-09-09
Parameters:
  VpcId:
    Type: String
    Description: 'Enter the VPC ID where the Security Groups will be created'
Resources:
  # also other stuff, like EC2, ENI, Volumes etc here ...
  DatabaseSecGroupIngressStaticInSync:  # Physical ID: sg-123abc
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: SecGroup for all static/normal Ingress Rules to database of Team-A
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 1521
          ToPort: 1521
          CidrIp: 192.168.0.1/32
        # and many more ...
  ThisSecGroupIsAlwaysDrifted:          # Physical ID: sg-666xyz     
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: SecGroup which Team-B will refer via AWS::EC2::SecurityGroupIngress from their template
      VpcId: !Ref VpcId
      SecurityGroupIngress:
      - IpProtocol: "-1"
        CidrIp: 127.0.0.1/32
        Description: dummy rule to create an empty SecGroup for Team-B

Ideas for this PFR:

  • We should be able to add within the the CloudFormation template (Stack) following definitions:
    • (I.) Exclude resource AWS::EC2::SecurityGroup with Logical ID ThisSecGroupIsAlwaysDrifted from Drift Detection.
    • (II.) Exclude only dedicated properties for all resources of this type within this stack, e.g. due to a known, temporarily false-postive issues in Drift Detection.
  • Conversely,
    • Other resource of the same type, like Logical ID DatabaseSecGroupIngressStaticInSync should still be part of Drift Detection.
    • Further resource & properties of other of types (AWS::RDS::DBInstance, EC2::Instance) are also checked by Drift Detection.
  • Any pattern from this PFR must be compatible with CloudFormation managed and generated by CDK.

Benefits from this PFR:

  • You are able to maintain know Drift Detection issues within your IaC. By having this new functionality the number of stack in state drift will be significantly reduced. Furthermore everybody can "read" within the CloudFormation template (along with comments with github links or internal notes, etc) why this drift happend (Bug, Limitation, Own use-case which require changes outside of CloudFormation via Console, etc.) and it's permanently excluded from Drift Detection.
  • Please note this should be a generic pattern all over CloudFormation - My example with Team-A/Team-B using AWS::EC2::SecurityGroupIngress is just one (real-life) use-case. There are many more ;)

Other Details

Happy to discuss further ideas or difficulties for such pattern - here via github or via CloudFormation Discord https://discord.gg/9zpd7TTRwq :)

@rgoltz rgoltz added the enhancement New feature or request label Feb 19, 2023
@ZalmnS
Copy link

ZalmnS commented Mar 6, 2023

The use case for this PFR has multiple use cases. Especially if you're interested in making an EC2 Security Group in one Stack, and then you want to apply the Security Group Ingress or Egress rules on the EC2 Security Group in another Stack.

The EC2 Security Group would shows drifted since the limitation highlighted here

Drift is the most consistent way of ensuring that the CloudFormation Stack Resources are in a Nominal state.

You can see the necessity for this feature with issues also relating to the AWS::RDS::DBInstance Resource if you mark the "AutoMinorVersionUpgrade" Property Boolean as True. As soon as a Minor Version is released, the CloudFormation Template would need to be updated since the Engine Version difference causes a drift.

While the AWS::RDS::DBInstance issue can be solved via a Transform, it would be more ideal to highlight either a specific property within a Resource or the Resource itself to be exempted from Drift Detection. It would make Drift Detection for RDS Resources significantly more informative and viable to leverage.

Ideally, I would love to see a development of a common CloudFormation Attribute similar to the DependsOn Attribute where you specify which properties would and would not be exempted from Drift detection, or just declare "All" to remove the Resource from being considered in Drift Detection.

With the amount of problems that Drift Detection runs into regarding edge cases like the ones highlighted in my comment, it negatively impacts confidence within the Drift Detection feature. If a new way of defining what is and isn't detected within a Resource when running Drift Detection is developed, it would effectively be able to breathe new life into the usability of Drift Detection with this requested feature giving organizations the ability to start defining what drift they would like to actually detect and ignore.

@PatMyron
Copy link
Contributor

PatMyron commented Aug 31, 2023

Case 2 feels like a crutch instead of better centralized drift solutions

When I worked on CloudFormation, drift detection was more reactive than proactive. Customers complained about drift bugs, and then CloudFormation would investigate case-by-case

Drift detection should monitor property types with high drift rates. Many customer-reported drift bugs had 100% drift rates; obviously, every customer using a property type did not manually drift out-of-band. Drift detection for those property types should be automatically disabled and CloudFormation can asynchronously work through that list to fix them without customer involvement

@rgoltz
Copy link
Author

rgoltz commented Jan 13, 2024

Thanks Pat for your valuable input. I agree.

I'm adding an other example, where -individually- based on your personal use-case a exclude within CFN Drift Detection would be great:
It's regarding Drift Detection for resource AWS::SSM::Parameter Property Value

  • In case you just create entries in Parameter Store via CFN or CDK you sometimes using a freely chosen initial Value of the Parameter (like Value "ChangeMe") in your CFN Template.
    • Sometimes you doesn't like to store the exact Value in your code repository / version control, like git.
    • It could be the case that your Lambda-Code store a ongoing changing Value inside the Parameter.
    • It could be the case that other users like to change the Value according to their needs via AWS-Console.

@rgoltz
Copy link
Author

rgoltz commented Feb 14, 2025

Here is another use-case: AWS::Lambda::EventSourceMapping - Lambda itself will change from ENABLED state to DISABLED state (in case of SQS EventSourceMapping). This results in a drift, see: #2260

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants