Bug: "No available threads to lock" #550

Open
3 tasks done
danpalmer opened this issue Jan 10, 2025 · 8 comments
Labels
bug Something isn't working

Comments

danpalmer commented Jan 10, 2025

Describe the bug

The following assert regularly fails:

usearch_assert_m(available_threads_.size(), "No available threads to lock");

Steps to reproduce

This is flaky, but it appears to happen when USearch is called many times in quick succession (e.g. under high traffic, or with a number of queue jobs running).

My USearch implementation is contained within a Swift Actor, so all access should be occurring on the same thread, making the use inherently single threaded. Checking the state of all threads at the point that the assert fails, this does appear to be the case – nothing else is using USearch.

Expected behavior

I don't expect asserts to fail in production code.

I did read #488, but it's unclear if/how this applies to the Swift bindings, and/or to single threaded access.

USearch version

2.16.9

Operating System

macOS 15.0

Hardware architecture

Arm

Which interface are you using?

Other bindings

Contact Details

See profile.

Are you open to being tagged as a contributor?

  • I am open to being mentioned in the project .git history as a contributor

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct
@danpalmer danpalmer added the bug Something isn't working label Jan 10, 2025
@ashvardanian
Contributor

Hi @danpalmer! Can you please provide a wider code snippet showing how you initialize the Index and how you spawn the threads?

@danpalmer
Author

Sure, this is the entire interaction with USearch. Apologies for the poor code, this hasn't been reviewed by anyone yet.

import Config
import Foundation
import PathKit
import USearch

actor USearchEmbeddingStore: EmbeddingStore {
    let index: USearchIndex
    let path: Path
    var isReady: Bool = false

    init(model: LLModel, path: Path) {
        index = USearchIndex.make(
            metric: .l2sq,
            dimensions: UInt32(model.embeddingSize),
            connectivity: 24,
            quantization: .F16
        )
        self.path = path
    }

    func load() throws {
        if isReady {
            return
        }
        if path.exists {
            index.load(path: path.string)
        } else {
            index.save(path: path.string)
        }
        isReady = true
    }

    func save() throws {
        if !isReady {
            throw EmbeddingStoreError.notReady
        }
        index.save(path: path.string)
    }

    func upsert(id: USearchKey, embedding: [Double]) throws {
        if !isReady {
            throw EmbeddingStoreError.notReady
        }
        index.reserve(1)
        index.addDouble(key: id, vector: embedding)
    }

    func remove(id: USearchKey) throws {
        if !isReady {
            throw EmbeddingStoreError.notReady
        }
        index.remove(key: id)
    }

    func query(_ embedding: [Double], nearest count: Int) async throws -> [UInt64] {
        if !isReady {
            throw EmbeddingStoreError.notReady
        }
        let (keys, _) = index.search(vector: embedding, count: count)
        return keys
    }
}

The Swift bindings don't provide any threading control, and being an actor this all operates on one thread. As the USearch functions are all synchronous I'd assume no work is happening concurrently, so I'm not sure why threading is an issue at all so far with this solution.

As I scale things it would be nice to have some concurrency in the bindings, but I'm not there yet.

@danpalmer
Author

Hey, is there anything I can provide to help debug this? Is there anything I should look into? Anything that might work as a possible workaround?

I've tried digging through, but the Swift interface differs somewhat from the Obj-C layer, which then differs again from the C++ layer, and I don't think I'm going to make much progress debugging threading issues two languages and two concurrency models away from what I'm writing. Would appreciate a starting place!

@ashvardanian
Contributor

It's a very bad idea to reserve space one-by-one. The solution would be to generalize the reserve function to also accept a thread count and pass it down to ObjC and C++.
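A rough sketch of what that generalization might look like, written as a self-contained C++ toy rather than the actual USearch, Objective-C, or Swift code. `ToyLimits`, `ToyCore`, and `ToyBinding` are hypothetical names, and the fallback to a single slot is an assumption; the only thing taken from this thread is the idea that the binding's `reserve` forwards a thread count alongside the member count.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the limits the core index is configured with.
struct ToyLimits {
    std::size_t members = 0;
    std::size_t threads = 1;
};

// Hypothetical stand-in for the C++ core: it remembers what was reserved.
class ToyCore {
  public:
    bool reserve(ToyLimits limits) {
        if (limits.threads == 0)
            return false; // a pool with zero slots could never serve a call
        limits_ = limits;
        return true;
    }
    std::size_t capacity() const { return limits_.members; }
    std::size_t thread_slots() const { return limits_.threads; }

  private:
    ToyLimits limits_;
};

// Hypothetical binding layer: today's call plus an optional thread count.
class ToyBinding {
  public:
    // threads == 0 means "caller did not say"; this toy falls back to one slot.
    bool reserve(std::size_t members, std::size_t threads = 0) {
        if (threads == 0)
            threads = 1;
        assert(threads > 0);
        ToyLimits limits;
        limits.members = members;
        limits.threads = threads;
        return core_.reserve(limits);
    }
    const ToyCore& core() const { return core_; }

  private:
    ToyCore core_;
};
```

With this shape, a caller that wants serialized access could ask for `reserve(1000, 1)`, while a caller that plans concurrent searches could pass a larger count explicitly.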

@danpalmer
Author

I must admit I really don't understand the reservation. Very open to changing this, I just can't find any documentation on it other than bugs that suggest that you always need at least 1 reserved. What exactly are the semantics of it? How does space reservation play into threading?

What is the threading model, and how does that play into Swift Actors, where all access would be guaranteed to be on a single thread?

@danpalmer
Author

Hey @ashvardanian, any update here? I have some time to try fixes here, and if there's a need to expose threading in the Objective-C or Swift API I may be able to help with that.

@danpalmer
Author

@ashvardanian just checking back in on this. I've now set up USearch as defensively as I possibly can in my project:

  • All access is mediated through one global singleton actor that runs on the main thread
  • That singleton holds the USearch index in an NIOLockedValueBox, ensuring that all access (read and write) is locked.

I'd expect this to have significant performance penalties, but I wanted to get it working before making it fast. Unfortunately it still crashes regularly (a hard, uncatchable crash) with "No available threads to lock". This suggests that there's a thread pool internal to USearch that can't be controlled from the outside. Given that I'm doing strictly single-threaded, serial access, it also suggests that thread resources are still in use after USearch returns to the call site, meaning that there's no way to coordinate safe access.

I feel like I must be missing something! It doesn't sound like people have this sort of issue most of the time. What am I missing?

I'm also still not sure how threads and reserved space interact here. Your comments suggest that they do interact, but I'm not sure how. I'm now reserving space (reserve 1000, if it reaches 10 free, reserve 1000 again), so there should always be plenty of space. I mostly can't do batch inserts because my project doesn't have a batch oriented UX.
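The chunked reservation policy described above (reserve 1000, top up when only 10 slots remain) can be written down as a small helper. This is a sketch under assumptions: `target_capacity` is a hypothetical name, and it assumes `reserve` is given a total capacity rather than an increment, which is not confirmed anywhere in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Returns the total capacity to request before the next insert.
// If enough free slots remain, the current capacity is returned unchanged.
// Assumption: the index's reserve() expects a total, not an increment.
std::size_t target_capacity(std::size_t size, std::size_t capacity,
                            std::size_t low_water = 10,
                            std::size_t chunk = 1000) {
    assert(chunk > 0);
    // Guard against underflow if size ever exceeds capacity.
    std::size_t free_slots = capacity > size ? capacity - size : 0;
    if (free_slots > low_water)
        return capacity;
    return capacity + chunk;
}
```

Calling the real `reserve` only when the returned value differs from the current capacity keeps reservations infrequent without requiring batch inserts.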

skaplan commented Mar 11, 2025

I'm having the same issue. What seems to help is always calling .reserve() on the index before inserting into or searching it. So, if you load an existing index from memory, call reserve even if it should already have capacity. This seems to initialize the thread pool, though I haven't tested it deeply.

In the reserve code, creating the threads:

        available_threads_.clear();
        if (!available_threads_.reserve(limits.threads()))
            return false;
        for (std::size_t i = 0; i < limits.threads(); i++)

This should be documented better, and it should automatically create threads for existing indices.
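To make the suspected mechanism concrete, here is a self-contained toy model, not the USearch implementation. `ToyIndex` and all of its members are hypothetical. Two things are assumptions made for illustration: that loading leaves the pool of thread slots empty, and that the loop in the snippet above pushes each slot id into the pool (its body is not shown in this thread).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ToyIndex {
  public:
    // Reserving rebuilds the pool of free thread slots.
    // Assumption: the unseen loop body adds one slot id per thread.
    bool reserve(std::size_t members, std::size_t threads) {
        capacity_ = members;
        thread_limit_ = threads;
        available_threads_.clear();
        available_threads_.reserve(threads);
        for (std::size_t i = 0; i < threads; i++)
            available_threads_.push_back(i);
        return true;
    }

    // Assumption: loading restores the data but leaves the pool empty.
    void load(std::size_t stored_members) {
        size_ = stored_members;
        capacity_ = stored_members;
        thread_limit_ = 0;
        available_threads_.clear();
    }

    // Where the real library asserts, this toy reports failure instead.
    bool try_lock_thread(std::size_t& slot) {
        if (available_threads_.empty())
            return false; // the "No available threads to lock" situation
        slot = available_threads_.back();
        available_threads_.pop_back();
        return true;
    }

    // Returns a previously locked slot to the pool.
    void unlock_thread(std::size_t slot) {
        assert(slot < thread_limit_);
        available_threads_.push_back(slot);
    }

  private:
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
    std::size_t thread_limit_ = 0;
    std::vector<std::size_t> available_threads_;
};
```

In this model, `load` followed directly by `try_lock_thread` fails even with a single caller, and a `reserve` call in between makes it succeed, which matches the workaround described above.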


3 participants