-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add metric for indexing delay #804
Conversation
// Pop returns the opts of the next repo to index. If the queue is empty ok is | ||
// false. | ||
func (q *Queue) Pop() (opts IndexOptions, ok bool) { | ||
type QueueItem struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an ideal name since we already have queueItem
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. queueItem has the private info, and QueueItem the info for users of the API
@@ -89,6 +89,30 @@ var ( | |||
"name", // name of the repository that was indexed | |||
}) | |||
|
|||
metricIndexingDelay = promauto.NewHistogramVec(prometheus.HistogramOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this produces more interpretable results on dot com, I'll open another PR to remove the old metric and adjust the buckets here.
@@ -434,6 +461,7 @@ func (s *Server) processQueue() { | |||
sglog.Uint32("id", args.RepoID), | |||
sglog.Strings("branches", branches), | |||
sglog.Duration("duration", elapsed), | |||
sglog.Duration("index delay", indexDelay), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I don't know if we document this anywhere, but probably best to use _
for space and .
for namespacing in attribute names. I see inconsistencies everywhere in our codebases, but atleast these can end up as OTEL attributes so those guidelines prefer that. Also makes it easier to process later when there aren't spaces.
// Pop returns the opts of the next repo to index. If the queue is empty ok is | ||
// false. | ||
func (q *Queue) Pop() (opts IndexOptions, ok bool) { | ||
type QueueItem struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. queueItem has the private info, and QueueItem the info for users of the API
Our dashboards currently report indexing delay using the
index_queue_age_seconds
metric, which tracks the duration indexing jobs waitin the queue before being "popped" for indexing. For dot com, we regularly see
the median for this metric at 2 days. However, this metric may be misleading
because it includes "noop" indexing jobs where the index is already up to date.
These jobs can stay in the queue for a long time since they are not high
priority.
This PR adds a new metric
index_indexing_delay_seconds
, which reports theduration from when an indexing job enters the queue, to when it completes. It
uses 'state' as a label, so by setting
state='success'
we can (roughly) seethe delay from when Zoekt learns about a new version, to when that version is
available to search.