Gzip middleware leads to warp "Client closed connection prematurely" #997
I did a little more digging in the code. It seems the function in question is `compressE` (in wai-extra's Network.Wai.Middleware.Gzip):

```haskell
compressE
    :: Response
    -> (Response -> IO ResponseReceived)
    -> IO ResponseReceived
compressE res sendResponse =
    wb $ \body -> sendResponse $
        responseStream s (fixHeaders hs) $ \sendChunk flush -> do
            (blazeRecv, _) <- B.newBuilderRecv B.defaultStrategy
            deflate <- Z.initDeflate 1 (Z.WindowBits 31)
            let sendBuilder builder = do
                    popper <- blazeRecv builder
                    fix $ \loop -> do
                        bs <- popper
                        unless (S.null bs) $ do
                            sendBS bs
                            loop
                sendBS bs = Z.feedDeflate deflate bs >>= deflatePopper
                flushBuilder = do
                    sendBuilder Blaze.flush
                    deflatePopper $ Z.flushDeflate deflate
                    flush
                deflatePopper popper = fix $ \loop -> do
                    result <- popper
                    case result of
                        Z.PRDone -> return ()
                        Z.PRNext bs' -> do
                            sendChunk $ byteString bs'
                            loop
                        Z.PRError e -> throwIO e
            body sendBuilder flushBuilder
            -- Note: this final flush and deflate-finish run *after* the
            -- application's body has already streamed out.
            sendBuilder Blaze.flush
            deflatePopper $ Z.finishDeflate deflate
  where
    (s, hs, wb) = responseToStream res
```

So it seems enabling compression changes the response into a streaming response. That could explain how the ALB gets a complete 200 before this code has finished its trailing work; if the ALB decides to close the connection at that point, I think we have our answer. I don't really know how to fix it, though.
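To make that rewrapping concrete, here is a minimal sketch of it in isolation. The middleware name `alwaysStream` is hypothetical (not from wai-extra), and the real `compressE` additionally runs the headers through `fixHeaders` and feeds every chunk through zlib; but the shape is the same: any response, even a fixed-length `responseLBS`, comes back out as a `responseStream`:

```haskell
import Network.Wai (Middleware, responseStream, responseToStream)

-- Hypothetical sketch: rewrap any response as a streaming response,
-- mirroring the responseToStream/responseStream pattern in compressE,
-- but with no actual compression.
alwaysStream :: Middleware
alwaysStream app req sendResponse = app req $ \res -> do
    let (st, hs, withBody) = responseToStream res
    withBody $ \body ->
        sendResponse $ responseStream st hs $ \sendChunk flush ->
            -- compressE additionally deflates each chunk here, and only
            -- finishes the deflate stream after the body completes.
            body sendChunk flush
```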
I wonder if using …
Update:
They happen much less frequently now, though, and seem limited to specific kinds of requests (large responses, and only from frontend/JS clients).
We have an app using WAI, Warp, and hs-opentelemetry-instrumentation-wai and -yesod.
Ever since making this change:
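(The diff itself isn't reproduced here. As a rough stand-in, assuming the stock gzip middleware from wai-extra, the change amounted to something like this sketch; the actual settings in our diff are not shown in this issue:)

```haskell
import Network.Wai (Application)
import Network.Wai.Middleware.Gzip (def, gzip)

-- Sketch only: the settings are assumed, not taken from the real diff.
-- (Newer wai-extra versions also export `def` as `defaultGzipSettings`.)
enableCompression :: Application -> Application
enableCompression = gzip def
```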
Our open-telemetry tracing reports a high number of 200s with the error "Client closed connection prematurely".
As far as we can tell, the client of such a request receives a 200 (green in this trace), but the server continues running past that point (orange) and then reports the error.
It's as if the response is returned, but then the request handling in the service doesn't end. It would make sense that the LB, satisfied with the 200 it has already forwarded along, closes the connection, and I guess that would generate this error... but why? And why only once we started on-the-fly gzipping?
Since clients are getting 200s, there are no actual negative effects here, but the errors are confusing to see, and they inflate our error-rate metrics, causing false-positive alerts.