-
Notifications
You must be signed in to change notification settings - Fork 846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix TestDeleteInactiveChannels flakiness #6801
base: main
Are you sure you want to change the base?
Conversation
@@ -437,6 +437,10 @@ func (s *interleavedWeightedRoundRobinSchedulerSuite) TestDeleteInactiveChannels | |||
|
|||
s.ts.Advance(30 * time.Minute) | |||
|
|||
// Sleeping for a small duration for the current event loop to finish. | |||
// We read ts.Now() before the loop begins and reuse it in the loop. | |||
time.Sleep(100 * time.Millisecond) //nolint:forbidigo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How difficult would it be to wire the fake clock all the way to the dispatch loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the time source inserted to the dispatch loop. But it may read the old value and enter the loop. If this happens, LastUpdatedTime will be the old time for all tasks processed in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the goal here to make sure the dispatch loop can see the advanced time from the eventTimeSource? If so, use a s.Eventually to wait for Now() to be advanced before submitting any new tasks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is a little different:
Consider that a thread has executed this line:
now := s.ts.Now() |
LoopDispatch
below that.
At this exact time, we update time source. ts.Now() is updated. We might add new tasks now. But the above thread has read the old time value already. These new task's LastActiveTime will be updated to the old time value that is already read.
Because of this, these channels have an old LastUpdatedTime and it will be cleaned up when we advance time in test. We have to wait for the execution that read the old time has finished.
One another way to fix this is to run ts.Now() on line 375 for each task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Thanks for the explanation.
yeah that's pretty hard to get rid of. Maybe add more context to the comment, essentially saying we have no way to know if a dispatch loop finished or not.
What changed?
Fix TestDeleteInactiveChannels flakiness.
We read the current time before the despatch loop and resue it. Added a small wait in the test for the dispatch loop to finish after advancing the time.
This will make sure that a dispatch loop has finished before next set of tasks are processed. Otherwise dispatch loop might set old lastActive time for channels.
Why?
How did you test it?
Potential risks
Documentation
Is hotfix candidate?
No