
RPC is broken due to change of interface in llama.cpp main repository (rpc : early register backend devices #11262) #1914

Open
j-lag opened this issue Jan 30, 2025 · 2 comments

Comments


j-lag commented Jan 30, 2025

RPC does not work in llama-cpp-python since this commit to llama.cpp:

Revision: 667d72846c06b2cf4f7c8a4265e210991a49706b
Author: Radoslav Gerganov <[email protected]>
Date: 17/01/2025 09:57:09
Message:
rpc : early register backend devices (#11262)
Early register RPC devices and do not propagate RPC specifics in the
llama model structures.

This commit changes the way rpc_servers are passed to llama.cpp (see llama.cpp\common\arg.cpp): it is no longer a model parameter (params.rpc_servers = value;); instead, add_rpc_devices(value); must be called to register the RPC devices up front.

To reproduce:

from llama_cpp import Llama

llm = Llama(
    model_path=MODEL,
    rpc_servers=RPC_NODES,
    n_ctx=4096,
    flash_attn=False,
    n_gpu_layers=99,
    verbose=True,
)

--> should initialize with RPC offloading, but no RPC offloading takes place.
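For context, RPC_NODES is a comma-separated list of host:port endpoints. A hypothetical helper (not part of llama-cpp-python; names are my own) that validates such a list before handing it to the backend could look like:

```python
def parse_rpc_servers(servers: str) -> list[tuple[str, int]]:
    """Split a comma-separated "host:port,host:port" string into
    (host, port) tuples, validating each entry."""
    endpoints = []
    for entry in servers.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas
        host, sep, port = entry.rpartition(":")
        if not sep or not port.isdigit():
            raise ValueError(f"invalid RPC endpoint: {entry!r}")
        endpoints.append((host, int(port)))
    return endpoints

print(parse_rpc_servers("192.168.1.10:50052, 192.168.1.11:50052"))
# [('192.168.1.10', 50052), ('192.168.1.11', 50052)]
```

This mirrors the comma-splitting that both the old params.rpc_servers path and the new add_rpc_devices path perform on the string.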


Hofi commented Feb 13, 2025

Here is a quick fix I created with ChatGPT to get RPC working:

diff --git a/llama_cpp/llama.py b/llama_cpp/llama.py
index 7e9a6af..5274e98 100644
--- a/llama_cpp/llama.py
+++ b/llama_cpp/llama.py
@@ -51,6 +51,45 @@ import llama_cpp._internals as internals
 from ._logger import set_verbose
 from ._utils import suppress_stdout_stderr
 
+_rpc_devices_registered = False
+from llama_cpp.llama_cpp import _lib  # now _lib holds the shared library handle
+
+def add_rpc_devices(servers: str) -> None:
+    """
+    Register RPC devices with the llama backend using the provided comma‐separated string.
+    """
+    global _rpc_devices_registered
+    if _rpc_devices_registered:
+        return  # Already registered, so do nothing
+    lib = _lib  # use the imported shared library handle
+    # Bind ggml_backend_reg_by_name
+    lib.ggml_backend_reg_by_name.argtypes = [ctypes.c_char_p]
+    lib.ggml_backend_reg_by_name.restype = ctypes.c_void_p
+    rpc_reg = lib.ggml_backend_reg_by_name(b"RPC")
+    if not rpc_reg:
+        raise ValueError("failed to find RPC backend")
+    # Bind ggml_backend_reg_get_proc_address
+    lib.ggml_backend_reg_get_proc_address.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
+    lib.ggml_backend_reg_get_proc_address.restype = ctypes.c_void_p
+    rpc_add_fn_ptr = lib.ggml_backend_reg_get_proc_address(rpc_reg, b"ggml_backend_rpc_add_device")
+    if not rpc_add_fn_ptr:
+        raise ValueError("failed to find RPC device add function")
+    # Create a callable from the function pointer: returns a void pointer given a char*
+    PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_void_p, ctypes.c_char_p)
+    rpc_add_fn = PROTOTYPE(rpc_add_fn_ptr)
+    # Bind ggml_backend_device_register
+    lib.ggml_backend_device_register.argtypes = [ctypes.c_void_p]
+    lib.ggml_backend_device_register.restype = None
+    # For each server in the comma-separated list, register the device.
+    for server in servers.split(','):
+        server = server.strip().encode("utf-8")
+        dev = rpc_add_fn(server)
+        if dev:
+            lib.ggml_backend_device_register(dev)
+        else:
+            raise ValueError(f"failed to register RPC device for server: {server.decode('utf-8')}")
+    _rpc_devices_registered = True
+
 
 class Llama:
     """High-level Python wrapper for a llama.cpp model."""
@@ -227,7 +266,7 @@ class Llama:
         self.model_params.split_mode = split_mode
         self.model_params.main_gpu = main_gpu
         if rpc_servers is not None:
-            self.model_params.rpc_servers = rpc_servers.encode("utf-8")
+            add_rpc_devices(rpc_servers)
             self._rpc_servers = rpc_servers
         else:
             self._rpc_servers = None

@LeaveNhA

What are we waiting for to fix it? Is there any road-blocker?
