Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/pq v1.10.3 possibly fixes connection timeout issue? #5

Open
jacobfederer opened this issue Jan 11, 2022 · 1 comment
Open

lib/pq v1.10.3 possibly fixes connection timeout issue? #5

jacobfederer opened this issue Jan 11, 2022 · 1 comment

Comments

@jacobfederer
Copy link

Hello there,

I saw that the lib/pq project just released version v1.10.3. I seems that this release introduced the context.WithTimeout functionality which should close a connection after a certain duration.

If so pq-timeouts could help people to migrate their code back to raw lib/pq.

Let me know what are your thoughts on this.

@jacobfederer jacobfederer changed the title lib/pq v1.10.3 possibly fixes timeout issue? lib/pq v1.10.3 possibly fixes connection timeout issue? Jan 11, 2022
@jybp
Copy link

jybp commented Mar 9, 2022

@jacobfederer I don't think recent changes to lib/pq addresses the safeguards put in place by pq-timeouts.
From my understanding, pq-timeouts ensures Read/Write calls to the underlying tcp connection are preceded by SetReadDeadline/SetWriteDeadline to prevent them from hanging.
lib/pq doesn't seem to use any of those functions still. Only SetDeadline is used in dial and reset right after.
For example it just wraps the conn in a *bufio.Reader and invokes io.ReadFull when reading.

About the cancellation, I think the v1.10.3 release notes you linked refer to this PR.
lib/pq uses watchCancel functions (1 & 2) to start this cancellation sequence when the context gets cancelled. It uses a context with a hardcoded 10 seconds timeout.
I think there's an edge case when you have no connect_timeout set, which is that DialContext is used but no deadlines are set afterwards. See here.
As the DialContext doc says:

Once successfully connected, any expiration of the context will not affect the connection.

Which means that the subsequent Write done in sendStartupPacket call could very well block. But it would not when using pq-timeouts with a write_timeout.

There's an even weirder edge case that doesn't necessarily apply here: If for some reason you use DialOpen with a Dialer that does not implement DialerContext, you may very well block on the dial itself since the context with the 10 seconds timeout is not used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants