Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-37544] Replace Timer with single thread scheduled executor for BlobServer #26339

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

beliefer
Copy link
Contributor

What is the purpose of the change

This PR aims to replace Timer with single thread scheduled executor for BlobServer.
The javadoc recommends ScheduledThreadPoolExecutor instead of Timer.
屏幕快照 2024-01-12 下午12 47 57

This change based on the following two points.
System time sensitivity

Timer scheduling is based on the absolute time of the operating system and is sensitive to the operating system's time. Once the operating system's time changes, Timer scheduling is no longer precise.
The scheduled Thread Pool Executor scheduling is based on relative time and is not affected by changes in operating system time.

Are anomalies captured

Timer does not capture exceptions thrown by Timer Tasks, and in addition, Timer is single threaded. Once a scheduling task encounters an exception, the entire thread will terminate and other tasks that need to be scheduled will no longer be executed.
The scheduled Thread Pool Executor implements scheduling functions based on a thread pool. After a task throws an exception, other tasks can still execute normally.

Brief change log

Replace Timer with single thread scheduled executor for BlobServer

Verifying this change

This change is already covered by existing tests, such as (BlobCacheCleanupTest, BlobServerCleanupTest and so on).

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 24, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@beliefer
Copy link
Contributor Author

@flinkbot run azure

@beliefer
Copy link
Contributor Author

ping @NicoK cc @1996fanrui @davidradl @GOODBOY008

@davidradl
Copy link
Contributor

@beliefer have you hit any issues with the current timer implementation? If not, I am reticent to change this area unless we are actually seeing issues here.

@beliefer
Copy link
Contributor Author

beliefer commented Mar 26, 2025

Yes. I experienced the issue when using Timer.
You could refer apache/spark#44718

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants