-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Resolve #1169] Add detect-stack-drift command #1170
[Resolve #1169] Add detect-stack-drift command #1170
Conversation
Add a `detect-stack-drift` command and tests. The `detect-stack-drift` command calls the `detect_stack_drift` Boto3 call, takes the Detector Id returned, then waits for `describe_stack_drift_detection_status(StackDriftDetectionId=detector_id)` to complete, and then finally returns `describe_stack_resource_drifts` as a JSON document. The output is formatted such that it can be passed directly into a tool like `jq`. If `--debug` is passed, sceptre also will provide feedback on the detection progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty cool feature I'm looking forward to having it. With that said, it looks like this feature only really covers the "happy path" of everything going as expected. It needs to handle cases like:
- When the stack doesn't exist
- When drift detection times out
- When drift detection fails for some reason
Also, you should use Python-3 conventions rather than the old-school py2 ways of doing things, such as:
- f-strings instead off old-style %s string formatting
- type annotations instead of indicating types in docstrings
I'd also like to see some sample output here on this PR. I'm not really sure what sort of info this would show. |
integration-tests/steps/stacks.py
Outdated
if stack.stack_status == "ROLLBACK_IN_PROGRESS" and not rollback_printed: | ||
client = boto3.client("cloudformation") | ||
response = client.describe_stack_events(StackName=stack_name) | ||
print(response) | ||
rollback_printed = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the thinking behind this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jfalkenstein Those are just troubleshooting lines for me to figure out why the build keeps failing in the CI pipeline where I can't reproduce it on my laptop.
Co-authored-by: Jon Falkenstein <[email protected]>
I just tested this out and the output looks good! The output is a lot better, both in yaml and in json. The one remaining thing I'm noticing is this: For some stupid reason, cloudformation returns json within json in their responses. It makes the command output super wide, hard to read, and ends up with the awkwardness of json within json or (even worse) json within yaml. This is always present in ExpectedProperties and ActualProperties, but it can also be seen in PropertyDifferences as well, when there's a value there. I think it would be a significant improvement if we recursively deserialized embedded json. Something like this would work work: def deserialize_json_properties(value):
if isinstance(value, str):
is_json = (
(value.startswith('{') and value.endswith('}'))
or
(value.startswith('[') and value.endswith(']'))
)
if is_json:
return json.loads(value)
return value
if isinstance(value, dict):
return {
key: deserialize_json_properties(val)
for key, val in value.items()
}
if isinstance(value, list):
return [
deserialize_json_properties(item)
for item in value
]
return value Basically, if you passed each drift response on the |
Running integration tests now |
@alexharv074 , your integration tests failed again. I looked at the output and found this: "'ResourceStatus': 'CREATE_FAILED', 'ResourceStatusReason': 'Resource handler returned message: "User: arn:aws:sts::582448526747:assumed-role/sceptre-integration-test-ServiceRole-DUSW2V6ES2ZU/botocore-session-1641565891 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:582448526747:log-group:sceptre-integration-tests-82194e1a6fc611ecb27a0242c0a82003-drift-group-A-LogGroup-APnLzvaiAFOX:log-stream: because no identity-based policy allows the logs:CreateLogGroup action" It looks like your plan to cause drift by creating a log group requires permissions the integration test runner doesn't currently have. Looks like you'll either need to use a different resource type, or you'll need to make a PR to update this file to give the integration tests the proper permissions: https://github.com/Sceptre/sceptre-aws/blob/master/config/prod/sceptre-integration-test-service-access.yaml |
@jfalkenstein ah .... so that's what it is. Ok, back to the drawing board! I'll see if I can rewrite those tests to not require that permission. |
@alexharv074 would love to see this feature in our pipelines before the actual deployment. I just wonder if it'd make sense to make timeout configurable in case CloudFormation service takes more than 300 secs to perform the detection (not sure how long it takes today though)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, man. Thanks for your hard work!
@zaro0508 , are we good to squash-merge this into master?
Yeah we discussed this above. My thinking is that the timeout should not be reachable and if in some scenario it turns out that it is, then that should be a bug that gets fixed. In my tests, it normally takes just a few seconds, so something should be really wrong if 300 seconds pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor grammatical suggestions otherwise this looks good to me. Thanks for brining this feature to sceptre @alexharv074 and @jfalkenstein
Running integration tests now... |
Add a
detect-stack-drift
command and tests.The
detect-stack-drift
command calls thedetect_stack_drift
Boto3 call, takes the Detector Id returned, then waits for
describe_stack_drift_detection_status(StackDriftDetectionId=detector_id)
to complete, and then finally returns
describe_stack_resource_drifts
as a JSON document.If
--debug
is passed, sceptre also will provide feedback onthe detection progress.
PR Checklist
[Resolve #issue-number]
.make test
) are passing.pre-commit run --all-files
).and description in grammatically correct, complete sentences.
Approver/Reviewer Checklist
Other Information
Guide to writing a good commit