
Commit

Merge pull request #2910 from alphagov/ltr-removal
Remove Learning to Rank code
sihugh authored May 14, 2024
2 parents ae3f073 + 8e142a9 commit ee4e460
Showing 60 changed files with 40 additions and 7,468 deletions.
46 changes: 0 additions & 46 deletions .github/workflows/deploy-ltr.yml

This file was deleted.

2 changes: 0 additions & 2 deletions Gemfile
@@ -2,8 +2,6 @@ source "https://rubygems.org"

gem "activesupport"
gem "aws-sdk-s3"
gem "aws-sdk-sagemaker"
gem "aws-sdk-sagemakerruntime"
gem "bootsnap", require: false
gem "elasticsearch", "~> 6" # We need a 6.x release to interface with Elasticsearch 6
gem "gds-api-adapters"
8 changes: 0 additions & 8 deletions Gemfile.lock
@@ -45,12 +45,6 @@ GEM
aws-sdk-core (~> 3, >= 3.194.0)
aws-sdk-kms (~> 1)
aws-sigv4 (~> 1.8)
aws-sdk-sagemaker (1.240.0)
aws-sdk-core (~> 3, >= 3.193.0)
aws-sigv4 (~> 1.1)
aws-sdk-sagemakerruntime (1.62.0)
aws-sdk-core (~> 3, >= 3.193.0)
aws-sigv4 (~> 1.1)
aws-sigv4 (1.8.0)
aws-eventstream (~> 1, >= 1.0.2)
base64 (0.2.0)
@@ -668,8 +662,6 @@ PLATFORMS
DEPENDENCIES
activesupport
aws-sdk-s3
aws-sdk-sagemaker
aws-sdk-sagemakerruntime
bootsnap
bunny-mock
climate_control
33 changes: 33 additions & 0 deletions docs/arch/adr-012-learn-to-rank.md
@@ -0,0 +1,33 @@
# Decision record: Decommissioning Learning to Rank

**Date:** 2024-05-14

The search team have decided to retire [Learning to Rank][] (LTR).

## Rationale

[Site search][] now uses Google's Vertex AI search instead of our ElasticSearch + Learning to Rank service. The other finders still use ElasticSearch and LTR. Site Search receives more requests than all the other finders combined.

Running our own relevance tuning service on top of ElasticSearch is not something we are equipped to do at this time, particularly when it's in support of a vastly reduced demand.

It's expensive to do well, both in terms of money spent on infrastructure and the time that the appropriate people would need to devote to it. Unfortunately, we just don't have that available.

### Limited upside to retaining it

Learning to Rank was configured primarily for Site Search and the general features of documents on GOV.UK. Other finders are often set up for small sets of specific document types. These documents have many features for which Learning to Rank has not been trained.

The model is poorly suited to differentiating between different Employment Tribunal decisions, for example.

### Limited impact to removing it

Our implementation of Learning to Rank always had a limited "blast radius" in that it could only affect the rankings of a single page of results at a time. The biggest impact it could have on a result would be to promote the 20th result to 1st (or demote the 1st to 20th).

This also means that there is limited downside to removing the reranking feature. All the results for each query still appear on the same page as before, but potentially in a different order.
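
As a minimal sketch of why the impact is bounded, the snippet below reranks only the slice of results that makes up one page and leaves everything else in place. It is illustrative only and is not the code removed in this change: the `rerank_page` method, the `PAGE_SIZE` constant, the `:model_score` key and the page size of 20 are all assumptions made for the example.

```ruby
# Hypothetical page size, matching the "20th result" example above.
PAGE_SIZE = 20

# Reorders only the results on one page, using the score returned by the
# block. Results before and after that page keep their original positions.
def rerank_page(results, start: 0, count: PAGE_SIZE, &score)
  page = results[start, count] || []

  # Sort by descending score; the original index breaks ties so the
  # ordering is deterministic (Ruby's sort_by is not guaranteed stable).
  reranked = page.each_with_index
                 .sort_by { |result, index| [-score.call(result), index] }
                 .map(&:first)

  before = results[0...start] || []
  after = results[(start + count)..] || []
  before + reranked + after
end

# Usage: 25 results where the (hypothetical) model prefers later documents.
results = (1..25).map { |i| { id: i, model_score: i } }
reranked = rerank_page(results) { |r| r[:model_score] }

reranked.first[:id]          # => 20 (the 20th result is promoted to 1st)
reranked[20..].map { |r| r[:id] } # => [21, 22, 23, 24, 25] (untouched)
```

Without the reranking step, the same 20 documents appear on the page in their original Elasticsearch order.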

### Unaffected use cases

Learning to Rank only affected queries that included keywords and were ordered by relevance. Other queries, such as those that power organisation, taxon and topical event pages, are unaffected.


[Site search]: https://www.gov.uk/search/all
[Learning to Rank]: https://github.com/alphagov/search-api/blob/1524da75f055f144392facb460bd95ef62b67bbb/docs/arch/adr-010-learn-to-rank.md
45 changes: 0 additions & 45 deletions lib/healthcheck/reranker_healthcheck.rb

This file was deleted.

4 changes: 0 additions & 4 deletions lib/learn_to_rank/data_pipeline.rb

This file was deleted.

44 changes: 0 additions & 44 deletions lib/learn_to_rank/data_pipeline/bigquery.rb

This file was deleted.

86 changes: 0 additions & 86 deletions lib/learn_to_rank/data_pipeline/embed_features.rb

This file was deleted.

46 changes: 0 additions & 46 deletions lib/learn_to_rank/data_pipeline/judgements_to_svm.rb

This file was deleted.

21 changes: 0 additions & 21 deletions lib/learn_to_rank/data_pipeline/load_search_queries.rb

This file was deleted.
