Better logging for rt.fastly.com (Client.Timeout exceeded while awaiting headers) #114

Open
mrnetops opened this issue Oct 25, 2022 · 2 comments

Comments

@mrnetops
Contributor

mrnetops commented Oct 25, 2022

Because of how fastly-exporter will wait for new stats to be published for services, we tend to get a ton of logging like this for services that are simply not handling requests, and so not generating stats.

level=error component=rt.fastly.com service_id=xxx during="execute request" err="Get \"https://rt.fastly.com/v1/channel/xxx/ts/1666656765\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"

This can make it hard to suss out whether there are in fact errors with, or connecting to, rt.fastly.com, versus simply having a number of idle services. This problem is going to scale with the number of services in the account in question (assuming more services overall is going to increase the incidence and volume of idle services).

Possibly these errors should be reclassed as info, as they are byproducts of the intended use case of connecting and listening for stat updates, and/or we should have better logging for when there are real issues (connection refused, non-2xx responses, etc.).
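To make the reclassification idea concrete, here is a minimal Go sketch of how timeouts could be separated from other request errors. This is not fastly-exporter's actual code: `levelFor` and the two sample errors are hypothetical names introduced only for illustration. It relies on the standard library behaviour that `*url.Error` reports `Timeout()` based on the error it wraps.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/url"
)

// levelFor is a hypothetical helper (not part of fastly-exporter).
// It returns the log level to use for a failed poll request:
// timeouts are treated as expected noise from idle services,
// everything else stays at error. A nil error yields "".
func levelFor(err error) string {
	if err == nil {
		return ""
	}
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return "info"
	}
	return "error"
}

// Sample errors for illustration only; they mimic the shape of what
// http.Client returns, but are constructed by hand here.
var (
	sampleTimeout error = &url.Error{
		Op:  "Get",
		URL: "https://rt.fastly.com/v1/channel/xxx/ts/1666656765",
		Err: context.DeadlineExceeded,
	}
	sampleRefused error = &url.Error{
		Op:  "Get",
		URL: "https://rt.fastly.com/v1/channel/xxx/ts/1666656765",
		Err: errors.New("connect: connection refused"),
	}
)

func main() {
	fmt.Println(levelFor(sampleTimeout)) // info
	fmt.Println(levelFor(sampleRefused)) // error
}
```

With a split like this, connection-refused and similar failures would still surface at error level, while idle-service timeouts would drop to info.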

Short term, I have attempted to minimize the spurious errors with -rt-timeout 120s to increase the likelihood of a service request -> stat response.

Interestingly, that seems to have tentatively addressed all of the errors, which makes me wonder if there is an interaction with a maximum time to stat response from rt.fastly.com, even if stats are zero. So possibly, raise that default to > the maximum stat response time from rt.fastly.com (if that is in fact what is happening)?

@leklund
Member

leklund commented May 18, 2023

@mrnetops I was trying to reproduce this issue and I'm unable to get request timeouts for new services or services without any data. The real-time stats API should return immediately if it doesn't have any data for a given service ID. It can wait up to 30 seconds for new data for a service that previously had some data, but that should still return well under the default 45-second timeout. Are you still able to reproduce this issue?
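For reference, the relationship between the server-side wait and the client timeout can be reproduced locally. The sketch below is an assumption-laden stand-in, not the real rt.fastly.com behaviour: it uses a local `httptest` server that holds the response for a configurable time, and the helper names `pollOnce` and `timedOut` are hypothetical. Durations are in milliseconds so the demo runs quickly.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// pollOnce starts a local test server that holds each request for
// holdMs milliseconds before answering, then issues one GET with a
// client whose overall timeout is timeoutMs milliseconds.
func pollOnce(holdMs, timeoutMs int) error {
	hold := time.Duration(holdMs) * time.Millisecond
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(hold):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done():
			// Client gave up; stop waiting.
		}
	}))
	defer srv.Close()

	client := &http.Client{Timeout: time.Duration(timeoutMs) * time.Millisecond}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// timedOut reports whether the poll failed with a timeout error.
func timedOut(holdMs, timeoutMs int) bool {
	var ne net.Error
	err := pollOnce(holdMs, timeoutMs)
	return errors.As(err, &ne) && ne.Timeout()
}

func main() {
	// Server holds longer than the client waits: the request times out.
	fmt.Println("hold 300ms, timeout 50ms, timed out:", timedOut(300, 50))
	// Client timeout exceeds the server hold: the request succeeds.
	fmt.Println("hold 30ms, timeout 450ms, timed out:", timedOut(30, 450))
}
```

The second call mirrors the 30-second wait against the 45-second default at a smaller scale: as long as the client timeout exceeds the longest time the server holds a request, no timeout error should appear.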

@mrnetops
Contributor Author

I don't think I have seen it come up recently, but I'll keep an eye out.
