Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metric for indexing delay #804

Merged
merged 2 commits into from
Aug 8, 2024
Merged

Add metric for indexing delay #804

merged 2 commits into from
Aug 8, 2024

Conversation

jtibshirani
Copy link
Member

@jtibshirani jtibshirani commented Aug 8, 2024

Our dashboards currently report indexing delay using the
index_queue_age_seconds metric, which tracks the duration indexing jobs wait
in the queue before being "popped" for indexing. For dot com, we regularly see
the median for this metric at 2 days. However, this metric may be misleading
because it includes "noop" indexing jobs where the index is already up to date.
These jobs can stay in the queue for a long time since they are not high
priority.

This PR adds a new metric index_indexing_delay_seconds, which reports the
duration from when an indexing job enters the queue, to when it completes. It
uses 'state' as a label, so by setting state='success' we can (roughly) see
the delay from when Zoekt learns about a new version, to when that version is
available to search.

@cla-bot cla-bot bot added the cla-signed label Aug 8, 2024
// Pop returns the opts of the next repo to index. If the queue is empty ok is
// false.
func (q *Queue) Pop() (opts IndexOptions, ok bool) {
type QueueItem struct {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an ideal name since we already have queueItem 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. queueItem has the private info, and QueueItem the info for users of the API

@@ -89,6 +89,30 @@ var (
"name", // name of the repository that was indexed
})

metricIndexingDelay = promauto.NewHistogramVec(prometheus.HistogramOpts{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this produces more interpretable results on dot com, I'll open another PR to remove the old metric and adjust the buckets here.

@jtibshirani jtibshirani requested review from keegancsmith, stefanhengl and a team and removed request for keegancsmith and stefanhengl August 8, 2024 12:19
@@ -434,6 +461,7 @@ func (s *Server) processQueue() {
sglog.Uint32("id", args.RepoID),
sglog.Strings("branches", branches),
sglog.Duration("duration", elapsed),
sglog.Duration("index delay", indexDelay),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: I don't know if we document this anywhere, but probably best to use _ for space and . for namespacing in attribute names. I see inconsistencies everywhere in our codebases, but atleast these can end up as OTEL attributes so those guidelines prefer that. Also makes it easier to process later when there aren't spaces.

// Pop returns the opts of the next repo to index. If the queue is empty ok is
// false.
func (q *Queue) Pop() (opts IndexOptions, ok bool) {
type QueueItem struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. queueItem has the private info, and QueueItem the info for users of the API

@jtibshirani jtibshirani merged commit 20c496e into main Aug 8, 2024
9 checks passed
@jtibshirani jtibshirani deleted the jtibs/index-delay branch August 8, 2024 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants