Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DivideByZeroException in bark_load_model when calling ggml_time_us() #189

Closed
LSXAxeller opened this issue Sep 13, 2024 · 5 comments
Closed

Comments

@LSXAxeller
Copy link

When calling bark_load_model from the C# BarkSharp wrapper, a System.DivideByZeroException is thrown at the line int64_t t_load_start_us = ggml_time_us();. This seems to indicate that the ggml_time_us() function is attempting to divide by zero, likely because the timer_freq variable has not been properly initialized.

Inserting a call to ggml_time_init() at the beginning of bark_load_model to initialize the timer leads to a StackOverflowException during the allocation of memory with ggml_allocr_new_from_buffer. This suggests a potential issue with memory management or initialization within the GGML library.

Reproduction Steps:

  1. Build the Bark library: Ensure that the Bark library (including GGML) is built successfully.
  2. Create a C# project: Create a C# project that uses the BarkSharp wrapper to interact with the Bark library.
  3. Load the Bark model: Attempt to load the Bark model using the BarkContext class constructor, providing the path to the model file.

Example Code (Relevant C# snippets):

// BarkContext.cs
public BarkContext(string modelPath, BarkContextParams parameters, uint seed)
{
    var modelPathPtr = Marshal.StringToHGlobalAnsi(modelPath);
    _context = LoadModel(modelPathPtr , parameters, seed); // This call eventually leads to the exception
    // ...
}

// NativeMethods.cs
[DllImport(NativeLibraryManager.LibraryName, EntryPoint = "bark_load_model", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Unicode)]
public static extern IntPtr LoadModel(IntPtr modelPath, BarkContextParams parameters, uint seed);

Relevant C++ Code (bark.cpp):

struct bark_context* bark_load_model(const char* model_path, struct bark_context_params params, uint32_t seed) {
    int64_t t_load_start_us = ggml_time_us(); // Exception occurs here

    struct bark_context* bctx = new bark_context();

    bctx->text_model = bark_model();
    std::string model_path_str(model_path);
    if (!bark_load_model_from_file(model_path_str, bctx, params.verbosity)) {
        fprintf(stderr, "%s: failed to load model weights from '%s'\n", __func__, model_path);
        return nullptr;
    }

    bctx->rng = std::mt19937(seed);
    bctx->params = params;
    bctx->stats.t_load_us = ggml_time_us() - t_load_start_us;

    return bctx;
}

Relevant C++ Code (ggml.c):

int64_t ggml_time_us(void) {
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return ((t.QuadPart - timer_start) * 1000000) / timer_freq; // Potential division by zero
}

void ggml_time_init(void) {
    LARGE_INTEGER frequency;
    QueryPerformanceFrequency(&frequency);
    timer_freq = frequency.QuadPart;
    QueryPerformanceCounter(&timer_start);
}

Expected Behavior:

The bark_load_model function should load the model successfully without throwing any exceptions. The ggml_time_us() function should return a valid timestamp.

Actual Behavior:

A System.DivideByZeroException is thrown when calling ggml_time_us(). Attempting to fix this by calling ggml_time_init() at the beginning of bark_load_model results in a StackOverflowException during memory allocation.

Possible Causes:

  • Missing timer initialization: The timer_freq variable in ggml_time_us() might not be initialized properly before it is used, leading to the division by zero error.
  • Circular dependency or recursive call: The StackOverflowException when calling ggml_time_init() might indicate a circular dependency or a recursive function call that exceeds the stack limit.
  • Memory allocation issues: There might be problems with memory allocation or initialization within GGML, leading to the stack overflow during the creation of the allocator.
@PABannier
Copy link
Owner

Hello @LSXAxeller ! Thanks for reaching out.

I was not aware of the BarkSharp error. I cannot really help with this as I'm not familiar with C#.
Have you tried opening an issue upstream on the ggml repo?

I will try to sync with the latest ggml version as soon as possible, but this requires a bit of work on the bark.cpp repo side.

@vinovo
Copy link
Contributor

vinovo commented Oct 28, 2024

I had exactly the same error when using the dll in a python binding.

The fix to DivisionByZero is the same - by calling ggml_time_init() as least for once before calling ggml_time_us().

The stack overflow problem is not due to too many calls to ggml_time_init, but instead the calling stack of bark_load_model or generate_audio can easily exceed the 1MB stack size limit on Windows.

My fix to the problem is to manually increase the stack size limit before making calls to dll APIs

in python + ctypes, it's something like

threading.stack_size(16*1024*1024)  # set stack size to 16MB

# then spawn a new thread and make calls to the dll APIs

@LSXAxeller
Copy link
Author

Thanks for your assistance, @vinovo The library is now operating and the StackOverflowException has been addressed. The only left issue is that C#'s default stack size of 1MB is fixed and cannot be changed on the Main Thread I am currently using background threads workaround for now. I am using llama.cpp and stable-diffusion.cpp both in the main thread without requiring a stack size change, though I suppose this could be connected to the out-of-date ggml.

@PABannier My apologies, I missed the ggml_time_init(); call in the main C++ example when I was porting it to C# and after encountering StackOverflowException I thought it was wrong workaround. It would be great to align the ggml dependency with the upstream version this may fix the stack related issues. Also, since bark_load_model serves as the library's entry point, could we consider moving the ggml_time_init(); call into it? This would prevent users from needing to interact directly with the low-level ggml code and make the library more user-friendly. It would streamline the process by initializing ggml timing within bark_load_model instead of requiring it in higher-level functions like Main();.

And since this issue got addressed, I am going to close it.

@PABannier
Copy link
Owner

@LSXAxeller Thanks for the update and closing the issue!

I agree we can move the ggml_time_init() call into the bark_load_model function. I let you open a PR for that.

Regarding syncing with ggml upstream, I'm in the process of doing so. I'm almost done adapting the Encodec.cpp codebase with the latest ggml version, which bark.cpp relies on. I should be done in a few days.

@LSXAxeller
Copy link
Author

Sorry for my late response, I opened PR #203 to call ggml_time_init() before int64_t t_load_start_us = ggml_time_us(); in bark_load_model(const char* model_path, struct bark_context_params params, uint32_t seed) function

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants