Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TestDeleteInactiveChannels flakiness #6801

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

prathyushpv
Copy link
Contributor

What changed?

Fix TestDeleteInactiveChannels flakiness.
We read the current time before the despatch loop and resue it. Added a small wait in the test for the dispatch loop to finish after advancing the time.
This will make sure that a dispatch loop has finished before next set of tasks are processed. Otherwise dispatch loop might set old lastActive time for channels.

Why?

How did you test it?

Potential risks

Documentation

Is hotfix candidate?

No

@prathyushpv prathyushpv requested a review from a team as a code owner November 12, 2024 21:38
@@ -437,6 +437,10 @@ func (s *interleavedWeightedRoundRobinSchedulerSuite) TestDeleteInactiveChannels

s.ts.Advance(30 * time.Minute)

// Sleeping for a small duration for the current event loop to finish.
// We read ts.Now() before the loop begins and reuse it in the loop.
time.Sleep(100 * time.Millisecond) //nolint:forbidigo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How difficult would it be to wire the fake clock all the way to the dispatch loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have the time source inserted to the dispatch loop. But it may read the old value and enter the loop. If this happens, LastUpdatedTime will be the old time for all tasks processed in the loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the goal here to make sure the dispatch loop can see the advanced time from the eventTimeSource? If so, use a s.Eventually to wait for Now() to be advanced before submitting any new tasks?

Copy link
Contributor Author

@prathyushpv prathyushpv Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is a little different:

Consider that a thread has executed this line:

and entered LoopDispatch below that.

At this exact time, we update time source. ts.Now() is updated. We might add new tasks now. But the above thread has read the old time value already. These new task's LastActiveTime will be updated to the old time value that is already read.

Because of this, these channels have an old LastUpdatedTime and it will be cleaned up when we advance time in test. We have to wait for the execution that read the old time has finished.

One another way to fix this is to run ts.Now() on line 375 for each task.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Thanks for the explanation.
yeah that's pretty hard to get rid of. Maybe add more context to the comment, essentially saying we have no way to know if a dispatch loop finished or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants