Feature: Search-time expansion_search #500

Open
2 of 3 tasks
rschu1ze opened this issue Oct 9, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

rschu1ze commented Oct 9, 2024

Describe what you are looking for

The search methods in include/usearch/index_dense.hpp have this signature:

search_result_t search(b1x8_t const* vector, std::size_t wanted, std::size_t thread = any_thread(), bool exact = false) const {
    return search_(vector, wanted, dummy_predicate_t {}, thread, exact, casts_.from_b1x8);
}
[...]

search_ does this:

[...]
index_search_config_t search_config;
[...]
search_config.expansion = config_.expansion_search;
[...]

auto typed_result = typed_->search([...], search_config, [...]);

In other words, when a search runs, the HNSW parameter expansion_search (aka efSearch) is taken from the index configuration, which is set at index construction time. This feels a bit unnatural: the only two HNSW parameters that should be fixed at index construction time are connectivity (M) and expansion_add (aka efConstruction).

Would it be possible to add search overloads or new parameters that allow ef_search to be set at search time?
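To make the request concrete, here is a rough sketch of the kind of overload I have in mind. Everything in it (toy_config_t, toy_result_t, toy_index_t, the default values 16/128/64) is a made-up stand-in rather than the usearch API, and the "search" only reports which expansion it would have used:

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-ins, not the usearch types. They model only where the expansion
// value comes from, not the HNSW traversal.
struct toy_config_t {
    std::size_t connectivity = 16;     // M: fixed when the index is built
    std::size_t expansion_add = 128;   // efConstruction: fixed when the index is built
    std::size_t expansion_search = 64; // efSearch: here only a default
};

struct toy_result_t {
    std::size_t expansion_used = 0;
};

class toy_index_t {
    toy_config_t config_;

  public:
    explicit toy_index_t(toy_config_t config = toy_config_t()) : config_(config) {}

    // Current behaviour: the expansion comes from the index configuration.
    toy_result_t search(float const* vector, std::size_t wanted) const {
        return search(vector, wanted, config_.expansion_search);
    }

    // Hypothetical overload: the caller supplies efSearch for this call only.
    toy_result_t search(float const* vector, std::size_t wanted, std::size_t expansion) const {
        (void)vector; // a real index would traverse the graph here
        (void)wanted;
        toy_result_t result;
        result.expansion_used = expansion;
        return result;
    }
};

// Exercises both paths; the index itself is never modified.
inline void demo_search_time_expansion() {
    toy_index_t const index;
    float const query[3] = {0.f, 1.f, 2.f};
    assert(index.search(query, 10).expansion_used == 64);       // configured default
    assert(index.search(query, 10, 256).expansion_used == 256); // per-call override
    assert(index.search(query, 10).expansion_used == 64);       // default unchanged
}
```

Note that in the real header the third parameter of search is already a std::size_t (thread), so an actual patch could not simply add another std::size_t in that position and would need a different shape.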

Can you contribute to the implementation?

  • I can contribute

Is your feature request specific to a certain interface?

C++ implementation

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct
@rschu1ze rschu1ze added the enhancement New feature or request label Oct 9, 2024

rschu1ze commented Oct 14, 2024

Just found that index_dense_gt::expansion_search and index_dense_gt::change_expansion_search do the job.

@rschu1ze rschu1ze reopened this Oct 14, 2024

rschu1ze commented Oct 14, 2024

Sorry to reopen. change_expansion_search changes the config_ member of class index_dense_gt. Method index_dense_gt::search_ then picks up whatever expansion_search value is in the index config and builds its index_search_config_t object from it:

index_search_config_t search_config;
search_config.thread = lock.thread_id;
search_config.expansion = config_.expansion_search;
search_config.exact = exact;

The problem is that when multiple callers run search concurrently, each with a different ef_search, they need to synchronize (i.e. lock) around the change-then-search sequence, because config_ is per-index. I made a slightly ugly workaround where ef_search is passed to the search without persisting it in the index here (used by this ClickHouse PR). If you like the approach, feel free to pick my commit or change it as you like.
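For illustration, the difference between the two approaches looks roughly like this. It is a toy model, not the code from my commit: shared_config_index_t, run_with_lock and run_per_call are hypothetical names, and the "search" just returns the expansion it ran with.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Toy model (hypothetical, not usearch): one mutable efSearch shared by all callers.
class shared_config_index_t {
    std::size_t expansion_search_ = 64;

  public:
    void change_expansion_search(std::size_t n) { expansion_search_ = n; }
    // Returns the expansion this search ran with, read from shared state.
    std::size_t search() const { return expansion_search_; }
    // Workaround shape: the expansion travels with the call, index state is untouched.
    std::size_t search(std::size_t expansion) const { return expansion; }
};

// Variant A: change-then-search must be atomic, so callers serialize on a mutex.
inline std::vector<std::size_t> run_with_lock(std::vector<std::size_t> const& wanted_ef) {
    shared_config_index_t index;
    std::mutex config_mutex;
    std::vector<std::size_t> seen(wanted_ef.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i != wanted_ef.size(); ++i)
        workers.emplace_back([&, i] {
            std::lock_guard<std::mutex> guard(config_mutex); // held across both steps
            index.change_expansion_search(wanted_ef[i]);
            seen[i] = index.search();
        });
    for (auto& worker : workers)
        worker.join();
    return seen;
}

// Variant B: nothing shared is written, so the searches need no lock.
inline std::vector<std::size_t> run_per_call(std::vector<std::size_t> const& wanted_ef) {
    shared_config_index_t const index;
    std::vector<std::size_t> seen(wanted_ef.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i != wanted_ef.size(); ++i)
        workers.emplace_back([&, i] { seen[i] = index.search(wanted_ef[i]); });
    for (auto& worker : workers)
        worker.join();
    return seen;
}
```

In variant A, dropping the lock_guard would let one thread search with another thread's value (and would be a data race); with the lock, searches with different ef_search values cannot overlap. Variant B avoids both problems.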
