-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Completion of a restore triggers secrets refresh #1134
Completion of a restore triggers secrets refresh #1134
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1134 +/- ##
==========================================
+ Coverage 56.84% 57.09% +0.24%
==========================================
Files 100 101 +1
Lines 10352 10401 +49
==========================================
+ Hits 5885 5938 +53
- Misses 3949 3950 +1
+ Partials 518 513 -5
|
0db7e27
to
480ca8d
Compare
The failing integration tests here look like a flake to me. I'm re-running again locally but it appears that a different one is failing each time. (It passes locally...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todo: We need a changelog entry.
Aside from the e2e test fix, everything works as expected 👍
b0d5853
to
12f6ae0
Compare
Totally stumped on why I haven't been able to get this to run locally either, so I'm stuck using println statements to get output out of GHA. Not the best developer experience in the world. I'll keep waiting for the test to finish. |
Looking at the dumped artifacts, we don't se the annotation being placed on the superuser secrets:
The error seems legit. |
I've tried to replicate the test failure locally in various ways. I wasn't able to get two DCs working in different clusters, but I was able to get two working in two different namespaces. When I run the restores I see the following results in the logs:
The superuser secrets have the refresh tokens in both namespaces. So I think the problem here is something to do with using the wrong client in my function |
cf0755f
to
8cfde04
Compare
Co-authored-by: Alexander Dejanovski <[email protected]>
8cfde04
to
d4c5188
Compare
I've found a small problem in this. When we add the refresh annotation we add the current time. The way a reconciliation works in all other cases, is that we'd -
Normally, all of the requeues etc. are delegated back up to the top level reconciler function via the returned result type, but because we are always adding the current time, taking this approach here would lead to a constantly changing desired object, and reconciliation will never progress. I've modified the logic so we have an inner retry loop in the RefreshSecrets function, but it breaks with our usual design which I'm not particularly overjoyed about. |
I'm a bit blocked on this @adejanovski, now that the correct images are being used in the e2e tests I've gone through and resolved a number of other issues that came up (especially around the way requeues are handled etc.) however I'm still seeing issues with the tests not passing. In particular, the following log line comes up pretty regularly, do you have any ideas off the top of your head what would cause it?
I thought that refreshing the operator's view of the secret on every retry (this bit) would be sufficient to overcome that issue, but it looks like it is still occurring. Any ideas on what I might be missing? |
ec3db10
to
ab5b96e
Compare
…me of restoreJob.
ab5b96e
to
2ba393e
Compare
…'t endlessly loop.
|
version = "dev" | ||
commit = "n/a" | ||
date = "n/a" | ||
versionMessage = "#######################" + | ||
fmt.Sprintf("#### version %s commit %s date %s ####", version, commit, date) + | ||
"#######################" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this shouldn't be part of this PR I guess.
What this PR does:
Adds an annotation to the user secrets when a DC is restored. Cass operator should then pick this up and cause a refresh of user credentials on the Cassandra side.
Which issue(s) this PR fixes:
Fixes #1080
Checklist