Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature request] Cumulative metric #23

Open
crusaderky opened this issue Mar 7, 2024 · 0 comments
Open

[feature request] Cumulative metric #23

crusaderky opened this issue Mar 7, 2024 · 0 comments

Comments

@crusaderky
Copy link

crusaderky commented Mar 7, 2024

distributed currently runs (simplified):

knock_knock =  KnockKnock(polling_interval_micros=1000)
knock_knock.start()
cumulative_gil_contention = 0
prev_ts = now = time()

def system_monitor_periodic_callback():
    global cumulative_gil_contention, prev_ts, now

    prev_ts, now = now, time()
    # snip: fairly long list of pure-python code
    cumulative_gil_contention += (now - prev_ts) * knock_knock.contention_metric 
    knock_knock.reset_contention_metric() 

This can be problematic, as a user's function can acquire the GIL either between the sampling of time() and the sampling of contention_metric, or between sampling contention_metric and calling reset_contention_metric().

This is a screenshot of a torture test (code here) where:

  • phase 1: task runs a C extension that completely blocks the GIL for 1 second; wait 10~100ms for the next task; repeat
  • phase 2: task runs a C extension that completely blocks the GIL for 4 seconds; wait 10~100ms for the next task; repeat
  • phase 3: task runs a C extension that completely blocks the GIL for 15 seconds; wait 10~100ms for the next task; repeat
  • phase 4: task runs a hot pure-python for loop constantly

Screenshot from 2024-03-07 12-07-07

Those points at 0 in the middle of phases 1, 2, and 3 are incorrect.

Proposed design

GILKnocker should offer an API to read its own cumulative metric. This metric should hold the GIL between the moment it samples the timestamp and the moment it samples the GIL use. reset_contention_metric should not clear it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant