DivideByZeroException in bark_load_model when calling ggml_time_us() #189

LSXAxeller · 2024-09-13T09:32:18Z

When calling bark_load_model from the C# BarkSharp wrapper, a System.DivideByZeroException is thrown at the line int64_t t_load_start_us = ggml_time_us();. This seems to indicate that the ggml_time_us() function is attempting to divide by zero, likely because the timer_freq variable has not been properly initialized.

Inserting a call to ggml_time_init() at the beginning of bark_load_model to initialize the timer leads to a StackOverflowException during the allocation of memory with ggml_allocr_new_from_buffer. This suggests a potential issue with memory management or initialization within the GGML library.

Reproduction Steps:

Build the Bark library: Ensure that the Bark library (including GGML) is built successfully.
Create a C# project: Create a C# project that uses the BarkSharp wrapper to interact with the Bark library.
Load the Bark model: Attempt to load the Bark model using the BarkContext class constructor, providing the path to the model file.

Example Code (Relevant C# snippets):

// BarkContext.cs
public BarkContext(string modelPath, BarkContextParams parameters, uint seed)
{
    var modelPathPtr = Marshal.StringToHGlobalAnsi(modelPath);
    _context = LoadModel(modelPathPtr , parameters, seed); // This call eventually leads to the exception
    // ...
}

// NativeMethods.cs
[DllImport(NativeLibraryManager.LibraryName, EntryPoint = "bark_load_model", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Unicode)]
public static extern IntPtr LoadModel(IntPtr modelPath, BarkContextParams parameters, uint seed);

Relevant C++ Code (bark.cpp):

struct bark_context* bark_load_model(const char* model_path, struct bark_context_params params, uint32_t seed) {
    int64_t t_load_start_us = ggml_time_us(); // Exception occurs here

    struct bark_context* bctx = new bark_context();

    bctx->text_model = bark_model();
    std::string model_path_str(model_path);
    if (!bark_load_model_from_file(model_path_str, bctx, params.verbosity)) {
        fprintf(stderr, "%s: failed to load model weights from '%s'\n", __func__, model_path);
        return nullptr;
    }

    bctx->rng = std::mt19937(seed);
    bctx->params = params;
    bctx->stats.t_load_us = ggml_time_us() - t_load_start_us;

    return bctx;
}

Relevant C++ Code (ggml.c):

int64_t ggml_time_us(void) {
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return ((t.QuadPart - timer_start) * 1000000) / timer_freq; // Potential division by zero
}

void ggml_time_init(void) {
    LARGE_INTEGER frequency;
    QueryPerformanceFrequency(&frequency);
    timer_freq = frequency.QuadPart;
    QueryPerformanceCounter(&timer_start);
}

Expected Behavior:

The bark_load_model function should load the model successfully without throwing any exceptions. The ggml_time_us() function should return a valid timestamp.

Actual Behavior:

A System.DivideByZeroException is thrown when calling ggml_time_us(). Attempting to fix this by calling ggml_time_init() at the beginning of bark_load_model results in a StackOverflowException during memory allocation.

Possible Causes:

Missing timer initialization: The timer_freq variable in ggml_time_us() might not be initialized properly before it is used, leading to the division by zero error.
Circular dependency or recursive call: The StackOverflowException when calling ggml_time_init() might indicate a circular dependency or a recursive function call that exceeds the stack limit.
Memory allocation issues: There might be problems with memory allocation or initialization within GGML, leading to the stack overflow during the creation of the allocator.

The text was updated successfully, but these errors were encountered:

PABannier · 2024-10-11T12:12:21Z

Hello @LSXAxeller ! Thanks for reaching out.

I was not aware of the BarkSharp error. I cannot really help with this as I'm not familiar with C#.
Have you tried opening an issue upstream on the ggml repo?

I will try to sync with the latest ggml version as soon as possible, but this requires a bit of work on the bark.cpp repo side.

vinovo · 2024-10-28T19:53:19Z

I had exactly the same error when using the dll in a python binding.

The fix to DivisionByZero is the same - by calling ggml_time_init() as least for once before calling ggml_time_us().

The stack overflow problem is not due to too many calls to ggml_time_init, but instead the calling stack of bark_load_model or generate_audio can easily exceed the 1MB stack size limit on Windows.

My fix to the problem is to manually increase the stack size limit before making calls to dll APIs

in python + ctypes, it's something like

threading.stack_size(16*1024*1024)  # set stack size to 16MB

# then spawn a new thread and make calls to the dll APIs

LSXAxeller · 2024-10-30T12:34:51Z

Thanks for your assistance, @vinovo The library is now operating and the StackOverflowException has been addressed. The only left issue is that C#'s default stack size of 1MB is fixed and cannot be changed on the Main Thread I am currently using background threads workaround for now. I am using llama.cpp and stable-diffusion.cpp both in the main thread without requiring a stack size change, though I suppose this could be connected to the out-of-date ggml.

@PABannier My apologies, I missed the ggml_time_init(); call in the main C++ example when I was porting it to C# and after encountering StackOverflowException I thought it was wrong workaround. It would be great to align the ggml dependency with the upstream version this may fix the stack related issues. Also, since bark_load_model serves as the library's entry point, could we consider moving the ggml_time_init(); call into it? This would prevent users from needing to interact directly with the low-level ggml code and make the library more user-friendly. It would streamline the process by initializing ggml timing within bark_load_model instead of requiring it in higher-level functions like Main();.

And since this issue got addressed, I am going to close it.

PABannier · 2024-10-30T13:40:04Z

@LSXAxeller Thanks for the update and closing the issue!

I agree we can move the ggml_time_init() call into the bark_load_model function. I let you open a PR for that.

Regarding syncing with ggml upstream, I'm in the process of doing so. I'm almost done adapting the Encodec.cpp codebase with the latest ggml version, which bark.cpp relies on. I should be done in a few days.

LSXAxeller · 2024-11-07T02:21:22Z

Sorry for my late response, I opened PR #203 to call ggml_time_init() before int64_t t_load_start_us = ggml_time_us(); in bark_load_model(const char* model_path, struct bark_context_params params, uint32_t seed) function

LSXAxeller closed this as completed Oct 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DivideByZeroException in bark_load_model when calling ggml_time_us() #189

DivideByZeroException in bark_load_model when calling ggml_time_us() #189

LSXAxeller commented Sep 13, 2024

PABannier commented Oct 11, 2024

vinovo commented Oct 28, 2024 •

edited

Loading

LSXAxeller commented Oct 30, 2024

PABannier commented Oct 30, 2024

LSXAxeller commented Nov 7, 2024

DivideByZeroException in bark_load_model when calling ggml_time_us() #189

DivideByZeroException in bark_load_model when calling ggml_time_us() #189

Comments

LSXAxeller commented Sep 13, 2024

PABannier commented Oct 11, 2024

vinovo commented Oct 28, 2024 • edited Loading

LSXAxeller commented Oct 30, 2024

PABannier commented Oct 30, 2024

LSXAxeller commented Nov 7, 2024

vinovo commented Oct 28, 2024 •

edited

Loading