Searchkick 5 #1288

Closed
28 of 41 tasks
ankane opened this issue Jul 15, 2019 · 6 comments

Comments

@ankane
Owner

ankane commented Jul 15, 2019

Please create a new issue to discuss any ideas.

edge branch

New

  • Add support for Elasticsearch 8 (works w/o any changes)
  • Support mode: :async and mode: :queue with bulk reindexing

Breaking

  • Remove elasticsearch dependency from gemspec
  • Drop support for Elasticsearch 6
  • Drop support for NoBrainer and Cequel
  • Drop support for deprecated aws_signers_v4 (prefer aws_sigv4 instead)
  • Lazy load queries
  • Raise ArgumentError instead of RuntimeError for unknown operators
  • Add .* to non-anchored regular expressions
  • Raise error when search called on relations
  • Fix or remove wordnet option (could remove first and re-add later)
  • Use like: [{_index: ..., _id: ...}] for similar records
  • Replace japanese with japanese2
  • No longer map id to _id with order option (since sorting on _id is deprecated in Elasticsearch)
  • Raise an error for unpermitted parameters (like Active Record)
  • Raise ArgumentError (instead of warning) for invalid regular expression modifiers
  • Have relation reindex remove records when should_index? is false - Bulk Remove Indexes with should_index? #1424
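
For the "Add .* to non-anchored regular expressions" item: Elasticsearch regexp queries must match the entire term, so a Ruby regexp without anchors would otherwise behave differently than it does in Ruby. The sketch below shows one way such a conversion could work. The helper name `anchored_pattern` is hypothetical, it only handles the simple `\A` and `\z` anchors, and it is not necessarily how Searchkick implements the change.

```ruby
# Hypothetical helper (not part of Searchkick's public API): turn a Ruby
# Regexp into a pattern string for an engine that implicitly anchors
# patterns at both ends of the term.
def anchored_pattern(regexp)
  source = regexp.source

  # A leading \A is redundant for an implicitly anchored engine, so drop it.
  # Without it, allow any prefix by prepending ".*".
  source = source.start_with?("\\A") ? source[2..-1] : ".*#{source}"

  # Same idea for a trailing \z: drop it, or allow any suffix with ".*".
  source = source.end_with?("\\z") ? source[0...-2] : "#{source}.*"

  source
end

puts anchored_pattern(/apple/)      # .*apple.*
puts anchored_pattern(/\Aapple/)    # apple.*
puts anchored_pattern(/apple\z/)    # .*apple
puts anchored_pattern(/\Aapple\z/)  # apple
```

Other anchors (`^`, `$`, `\Z`) and regexp modifiers are deliberately left out of this sketch.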

Other

Maybe

  • Better notifications - log data on store and update and add filtering for sensitive data (similar to Active Record filter_attributes)
  • Have relation.reindex load all records - use relation.in_batches.reindex to load in batches
  • reindex_now/reindex_later methods
  • Add with_score method to Searchkick::Results
  • Take model pagination into account
  • Only create analyzers and filters that are used
  • Make misspellings below the default (pros: fewer bad matches, better performance for correctly spelled queries; cons: worse performance for misspelled queries due to the additional query)
  • Deprecate (or remove) scope_results in place of load
  • Remove hashie dependency - less_deps branch
  • Choose better name for HashWrapper class - Result, Hit, Document, Record (could still subclass HashWrapper for backward compatibility)
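
The trade-off in the "misspellings below" item can be illustrated with a toy two-pass search: run the query without misspellings first, and only retry with misspellings when too few results come back. Everything here (`WORDS`, `toy_search`, `search_below`, and the crude letter-sorting "fuzzy" match) is a hypothetical stand-in, not Searchkick or Elasticsearch code.

```ruby
# Toy in-memory "index"; a stand-in for a real search backend.
WORDS = %w[apple apples banana]

# Hypothetical search: substring match, plus a crude transposition-tolerant
# match (same letters in any order) when misspellings are allowed.
def toy_search(term, misspellings:)
  WORDS.select do |word|
    next true if word.include?(term)
    misspellings && word.chars.sort == term.chars.sort
  end
end

# Run the strict query first; fall back to the misspelling-tolerant query
# only when the strict query returns fewer than `below` results.
# Returns the results and the number of queries issued.
def search_below(term, below:)
  queries = 1
  results = toy_search(term, misspellings: false)
  if results.size < below
    queries += 1
    results = toy_search(term, misspellings: true)
  end
  [results, queries]
end

p search_below("apple", below: 1) # [["apple", "apples"], 1]
p search_below("aplpe", below: 1) # [["apple"], 2]
```

A correctly spelled query costs one query and never picks up fuzzy matches, while a misspelled one pays for a second query, which is the pro and con listed above.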

Waiting for 5.1+

  • Active Record-like query building - Product.search("apples").where(in_stock: true).limit(10).offset(50) - see New Query API #1395 and relation branch (still need to merge certain options intuitively, show friendly error message when trying to call AR scope on Searchkick relation)
  • Use rank_features type for conversions (added in ES 7.0) - conversions_v2 branch https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-rank-feature-query.html (test w/ spaces and special characters)
  • For partial reindexing, move to reindex(update: :search_method) instead of reindex(:search_method) (warn on deprecated style)
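
As a rough illustration of what Active Record-like query building with lazy loading could look like, here is a minimal sketch of a chainable relation that only executes when results are first needed. The `LazyRelation` class and its executor block are hypothetical and stand in for a real search call; this is not the implementation in the relation branch.

```ruby
# Hypothetical stand-in for a search relation; not Searchkick's actual class.
class LazyRelation
  include Enumerable

  attr_reader :term, :options

  # The executor block stands in for the call that would hit the search server.
  def initialize(term, options = {}, &executor)
    @term = term
    @options = options
    @executor = executor
  end

  # Each builder method returns a new relation and leaves the receiver untouched.
  def where(conditions)
    spawn(where: (options[:where] || {}).merge(conditions))
  end

  def limit(value)
    spawn(limit: value)
  end

  def offset(value)
    spawn(offset: value)
  end

  # Enumerating is what triggers execution.
  def each(&block)
    results.each(&block)
  end

  private

  # Run the query once and memoize the results.
  def results
    @results ||= @executor.call(term, options)
  end

  def spawn(extra)
    self.class.new(term, options.merge(extra), &@executor)
  end
end

calls = []
relation = LazyRelation.new("apples") do |term, opts|
  calls << [term, opts]
  ["result"]
end

query = relation.where(in_stock: true).limit(10).offset(50)
puts calls.size # 0, nothing has executed yet
query.to_a
query.to_a
puts calls.size # 1, executed once and memoized
```

Returning a new object from each builder method keeps intermediate relations reusable, which is the same design choice Active Record makes for its relations.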

On hold

  • Use search_as_you_type for instant search - search_as_you_type branch (on hold due to misspellings)

Before 5

@jpaas

jpaas commented Feb 25, 2020

What do you think about mirroring ES versions? i.e. When you support ES 8 bump version to 8.x.

@ankane
Owner Author

ankane commented Mar 18, 2020

Hey @jpaas, thanks for the suggestion. I don't see a strong reason to mirror Elasticsearch versions. The plan is for the latest version of Searchkick to always support the 2 most recent Elasticsearch versions.

@SteveC

SteveC commented May 31, 2020

Some ideas from recent use of searchkick (thanks!!!)

  1. multi-word synonyms would be super useful
  2. bulk reindex nested object caching. For example, when doing MyObject.reindex, if it belongs_to OtherObject and my search_data pulls in the nested data, then reindex() will reindex 1,000 records at a time but try to load the nested OtherObject with 1,000 SELECT statements. This is suboptimal if there aren't that many OtherObjects. I've gotten around this with my own little hash cache in the model, so that MyObject's search_data() goes to my hash of objects instead of through Active Record for the OtherObject when doing any indexing. It helps in my case that OtherObject rarely changes, and maybe Rails' built-in caching would be better... but anyway, some sort of built-in caching would be awesome for reindexing millions of rows, unless I somehow missed it and it's already there.
  3. convention-based reindex column. searchkick could have a convention of 'needs_reindex' column on objects for when the standard methods fail. For example, the ActiveRecord Import gem allows me to import hundreds of thousands of records very quickly but it bypasses searchkick. While, yes, I can and do track this (well, kinda) and do my own partial reindex, if I could just set a column flag which gets reindexed in some cron (or whatever) job then that would be more awesome, perhaps?
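
The hash cache workaround from item 2 might look roughly like the sketch below. It uses plain Ruby structs instead of Active Record models, and `find_other_object`, `OtherObjectCache`, and `LOOKUPS` are hypothetical names that simulate a per-record database lookup and count how often it runs.

```ruby
# Plain Ruby stand-ins for the models in the comment above (no Active Record).
OtherObject = Struct.new(:id, :name)
OTHER_OBJECTS = {
  1 => OtherObject.new(1, "Fruit"),
  2 => OtherObject.new(2, "Dairy")
}

# Counts simulated SELECT statements.
LOOKUPS = Hash.new(0)

# Hypothetical lookup that stands in for one SELECT per call.
def find_other_object(id)
  LOOKUPS[:count] += 1
  OTHER_OBJECTS.fetch(id)
end

# Small hash cache: each OtherObject is looked up at most once.
class OtherObjectCache
  def initialize
    @cache = {}
  end

  def fetch(id)
    @cache.fetch(id) { @cache[id] = find_other_object(id) }
  end
end

MyObject = Struct.new(:name, :other_object_id) do
  # Builds the document to index, reading the nested object from the cache.
  def search_data(cache)
    { name: name, category: cache.fetch(other_object_id).name }
  end
end

cache = OtherObjectCache.new
records = Array.new(1000) { |i| MyObject.new("item #{i}", (i % 2) + 1) }
documents = records.map { |record| record.search_data(cache) }

puts documents.size   # 1000
puts LOOKUPS[:count]  # 2
```

This only pays off when there are few distinct OtherObjects and they rarely change, as the comment notes; a stale cache would index outdated nested data.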

@ankane
Owner Author

ankane commented Jun 17, 2020

Hey @SteveC, thanks for the ideas!

  1. Would love to support multi-word synonyms, but not sure there's an easy way to do this with the current analyzers. If someone is able to figure it out, happy to review a PR.
  2. The search_import scope should help with N+1 queries for bulk reindexing if you haven't tried that.
  3. It's an interesting idea. I'm not sure I want to add more reindexing patterns right now, but someone could create a gem that wraps Searchkick to do this.

Edit: Think I found a way to do multi-word synonyms (synonyms_v2 branch), so will try to make this part of Searchkick 5.

@ankane
Owner Author

ankane commented Jun 18, 2020

No point in waiting for Searchkick 5. Reloadable, multi-word, search time synonyms are now available in Searchkick 4.4 🎉

https://github.com/ankane/searchkick#synonyms

@ankane
Owner Author

ankane commented Feb 22, 2022

Searchkick 5 is out 🎉

ankane closed this as completed Feb 22, 2022