-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix timing of terminating an unresponsive debugger socket #44811
Conversation
This pull request was exported from Phabricator. Differential Revision: D58220230 |
af5c591
to
1bcbed3
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
5 similar comments
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
facebook#44811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Added] - New messages for debugger disconnections [General][Changed] - Temporary disconnections with the debugger no longer terminates the debugger immediately The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated. This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive. Differential Revision: D58220230
1bcbed3
to
6e262d3
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
4 similar comments
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
6e262d3
to
1ede1f2
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Added] - New messages for debugger disconnections [General][Changed] - Temporary disconnections with the debugger no longer terminates the debugger immediately The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2 So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Differential Revision: D58220230
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Added] - New messages for debugger disconnections [General][Changed] - Temporary disconnections with the debugger no longer terminates the debugger immediately The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2 So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Differential Revision: D58220230
1ede1f2
to
87300b6
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
3 similar comments
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2+ So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Differential Revision: D58220230
This pull request was exported from Phabricator. Differential Revision: D58220230 |
4 similar comments
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2+ So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Differential Revision: D58220230
c9a84c6
to
9a15a57
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
1 similar comment
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2+ So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Differential Revision: D58220230
This pull request was exported from Phabricator. Differential Revision: D58220230 |
9a15a57
to
74c4598
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
2 similar comments
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
74c4598
to
41bc13f
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2+ So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Differential Revision: D58220230
This pull request was exported from Phabricator. Differential Revision: D58220230 |
2 similar comments
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request was exported from Phabricator. Differential Revision: D58220230 |
…4811) Summary: Pull Request resolved: facebook#44811 Changelog: [General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent The debugger is currently disconnected if a ping-pong message is missed. This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188 There are a few ways to fix this as discused with motiz88 and robhogan: 1. Ensure the websocket has a chance to respond, e.g. in via web worker 1. Lengthen the time allowed for the pong resopnse I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React. Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond. # V2+ So upon further investigation, CDT websocket is actually responding to the pings in due time: {F1679132204} (CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing) Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping: https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307 If CDT doesn't respond in time, websocket would be terminated. But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else. The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket. We can confirm this by passing a "sent" callback via `socket.ping(cb)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254 This gives us the timing between calling `socket.ping()` and when the ping is actually sent. Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond: {F1679223326} With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases. However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds: {F1679163335} This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware. Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs: https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157 This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message. Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`: https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349 …which is guaranteed to fire the callback asynchronously: nodejs/help#1504 (comment) Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam. The fix is simple: 1. Start the terminate-socket-timer until the `ping` is actually sent 1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset 1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback: {F1679288626} # V1 ~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~ ~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~ Reviewed By: huntie Differential Revision: D58220230
41bc13f
to
d5689dd
Compare
This pull request was exported from Phabricator. Differential Revision: D58220230 |
This pull request has been merged in 20462ca. |
This pull request was successfully merged by @EdmondChuiHW in 20462ca. When will my fix make it into a release? | How to file a pick request? |
Summary:
Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent
The debugger is currently disconnected if a ping-pong message is missed.
This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188
There are a few ways to fix this as discused with motiz88 and robhogan:
I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.
Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy at that exact moment, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.
V2
So upon further investigation, CDT websocket is actually responding to the pings in due time:
{F1679132204}
(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)
Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:
react-native/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js
Lines 306 to 307 in 2483c63
If CDT doesn't respond in time, websocket would be terminated.
But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.
The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.
We can confirm this by passing a "sent" callback via
socket.ping(cb)
:https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254
This gives us the timing between calling
socket.ping()
and when the ping is actually sent.Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:
{F1679223326}
With the pong response grace period at 5 seconds, there's plenty of time for CDT to
pong
back. That's why it has been working in most cases.However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:
{F1679163335}
This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.
Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:
react-native/packages/dev-middleware/src/inspector-proxy/Device.js
Lines 155 to 157 in 2483c63
This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.
Meanhile, the
ws
lib uses the underlying NodeJSSocket.write
method forping(…)
andsend(…)
:https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349
…which is guaranteed to fire the callback asynchronously:
nodejs/help#1504 (comment)
Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.
The fix is simple:
ping
is actually sentpong
s) as a terminate-socket-timer resetpong
comes in betweensend
andsent
, which can happen sometimes due to the async nature of the callback:{F1679288626}
V1
In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.Differential Revision: D58220230