Prevent to block entire event loop when creating multiple sessions #1221

Open
wants to merge 1 commit into
base: main

bgb10 commented Feb 6, 2025

How I discovered it

While monitoring the TikTokApi session creation process, I noticed that the actual sleep duration varied based on the number of concurrent sessions, even with the same sleep_after value.

For instance, with sleep_after=3:

  • Creating 2 sessions took ~6 seconds
  • Creating 3 sessions took ~9 seconds
  • Creating 5 sessions took ~15 seconds

This linear scaling of total sleep time with the number of sessions indicated that the sessions weren't being created concurrently as intended. Further investigation revealed that this was causing timeout issues when the cumulative sleep time exceeded the browser's connection timeout limit. This issue particularly affects users who want to scrape data in parallel.

Causes

The root cause is the use of blocking time.sleep() within the async __create_session() function. This creates several issues:

  1. When running multiple sessions simultaneously within a single event loop, the entire event loop becomes blocked
  2. Sessions are created sequentially instead of concurrently, with each session waiting for the previous one's sleep to complete
  3. Total blocking time becomes sleep_after * number_of_sessions

For example, when using asyncio.gather() to create multiple sessions:

await asyncio.gather(
    *(
        self.__create_session(
            proxy=random_choice(proxies),
            ms_token=random_choice(ms_tokens),
            sleep_after=3,
        )
        for _ in range(5)
    )
)

Even though asyncio.gather() is designed for concurrent execution, the blocking time.sleep() prevents true concurrency, causing each session to wait for the previous one's sleep to complete.
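The behaviour can be reproduced without TikTokApi or a browser. The following is a minimal sketch, not the library's code: blocking_session and create_sessions_blocking are hypothetical stand-ins for __create_session() and its caller, and the delay is shortened so the script finishes quickly.

```python
import asyncio
import time


async def blocking_session(sleep_after: float) -> None:
    # Hypothetical stand-in for __create_session(). time.sleep() never yields
    # to the event loop, so no other coroutine can run until it returns.
    time.sleep(sleep_after)


async def create_sessions_blocking(num_sessions: int, sleep_after: float) -> float:
    """Run the stand-in coroutines under gather() and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(
        *(blocking_session(sleep_after) for _ in range(num_sessions))
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(create_sessions_blocking(num_sessions=5, sleep_after=0.2))
    # The sleeps run back to back, so this should be close to 5 * 0.2 seconds.
    print(f"5 sessions with blocking sleep: {elapsed:.2f}s")
```

The elapsed time grows with num_sessions, matching the 6 s / 9 s / 15 s pattern reported above for sleep_after=3.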
The fix is to replace time.sleep() with await asyncio.sleep(), which yields control to the event loop during the wait so the sessions can actually be created concurrently. (Calling time.sleep() inside an async function is also flagged by many linters!)
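To illustrate the effect of the change, here is a minimal sketch under the same assumptions as the scenario described above: non_blocking_session and create_sessions_concurrently are hypothetical names, not part of TikTokApi, and the delay is shortened for a quick run.

```python
import asyncio
import time


async def non_blocking_session(sleep_after: float) -> None:
    # Hypothetical stand-in for the patched __create_session().
    # asyncio.sleep() suspends only this coroutine and lets the others run.
    await asyncio.sleep(sleep_after)


async def create_sessions_concurrently(num_sessions: int, sleep_after: float) -> float:
    """Run the stand-in coroutines under gather() and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(
        *(non_blocking_session(sleep_after) for _ in range(num_sessions))
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(create_sessions_concurrently(num_sessions=5, sleep_after=0.2))
    # The sleeps overlap, so this should stay near 0.2 seconds regardless of count.
    print(f"5 sessions with asyncio.sleep: {elapsed:.2f}s")
```

With the sleeps overlapping, the total wait should stay close to a single sleep_after instead of sleep_after * number_of_sessions.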

Tested and working on my end :)
