|
| 1 | +--- |
| 2 | +title: "Overcoming Go's memory constraints with Rust FFI" |
| 3 | +tags: ["golang", "rust", "performance"] |
| 4 | +description: "An experiment with Golang and Rust FFI" |
| 5 | +authors: [yash] |
| 6 | +hide_table_of_contents: false |
| 7 | +date: 2025-02-25T10:00 |
| 8 | +--- |
| 9 | + |
| 10 | +For the past few years at [Flanksource](https://flanksource.com/), I've helped build [Mission Control](https://flanksource.com/docs) - a Kubernetes-native internal developer platform that improves developer productivity and operational resilience. |
| 11 | + |
| 12 | +One Tuesday afternoon, one of our pods started crashing with an OOM (out-of-memory) error. |
| 13 | + |
| 14 | +> When a container exceeds its memory limit in Kubernetes, it is killed and restarted with an `OOMKilled` status. A memory leak can trigger a crash loop this way. |
| 15 | +
|
| 16 | +This issue occurred frequently enough to raise concerns, particularly since it only affected one customer's environment. |
| 17 | + |
| 18 | +Finding the cause proved challenging. The application logs provided no clear indicators of the crash trigger. Memory usage graphs showed normal patterns before crashes, suggesting sudden spikes that occurred too quickly to be captured. This pattern ruled out straightforward memory leakage bugs. |
| 19 | + |
| 20 | +These circumstances required deeper investigation. We leveraged Go's built-in profiling functionality to generate memory profiles, hoping to uncover clues about the issue. |
| 21 | + |
| 22 | +# Memory Profiling Investigation |
| 23 | + |
| 24 | +After running multiple profiles for several hours, the investigation did not yield conclusive results. The only certainty was that the crash occurred instantly, rather than resulting from a gradual memory leak. |
| 25 | + |
| 26 | +A trace with significant memory usage emerged during the investigation. |
| 27 | + |
| 28 | +<Screenshot img="/img/blog/rust-ffi/go-diff-first-profile.png" shadow={false} alt="Memory profile of the application"/> |
| 29 | + |
| 30 | +The trace pointed to the diff function. |
| 31 | + |
| 32 | +> Change mapping is a core feature of Mission Control. It scrapes all resources in the infrastructure (Kubernetes, AWS, etc) and records changes by generating diffs for the changelog. This provides users with a timeline of all infrastructure changes in their environment. |
| 33 | +
|
| 34 | +<Screenshot img="/img/blog/rust-ffi/change-mapping.png" shadow={false} alt="Catalog changes and diff in UI"/> |
| 35 | + |
| 36 | +Investigation revealed that unusually large entities (Kubernetes CRDs exceeding 1MB) took far more time and memory to diff. Processing several of these entities in one batch pushed the pod past its memory limit. |
| 37 | + |
| 38 | +Initial experiments with Go's [GC settings](https://tip.golang.org/doc/gc-guide#Memory_limit) (`GOGC` and `GOMEMLIMIT`) did not yield a good solution. Keeping the heap small enough for this edge case would have meant accepting a significant performance penalty everywhere else, which was not viable. |
| 39 | + |
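As a rough illustration of the knobs involved, here is a minimal sketch that sets the programmatic equivalents of `GOGC` and `GOMEMLIMIT` through `runtime/debug`. The helper name `applyGCSettings` and the values (50 and 512 MiB) are assumptions for illustration, not the settings we tried. Note that the memory limit is a soft limit: the runtime collects more aggressively as the heap approaches it, but it cannot stop live memory from growing past it.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// applyGCSettings sets the runtime equivalents of GOGC and GOMEMLIMIT and
// returns the previous values so the caller can restore them later.
// (Hypothetical helper, named for this sketch only.)
func applyGCSettings(gcPercent int, memLimitBytes int64) (prevPercent int, prevLimit int64) {
	prevPercent = debug.SetGCPercent(gcPercent)     // same knob as GOGC
	prevLimit = debug.SetMemoryLimit(memLimitBytes) // same knob as GOMEMLIMIT (Go 1.19+)
	return prevPercent, prevLimit
}

func main() {
	// Assumed values: collect more eagerly and set a 512 MiB soft limit.
	prevPercent, prevLimit := applyGCSettings(50, 512<<20)
	defer applyGCSettings(prevPercent, prevLimit)

	fmt.Printf("previous GC percent: %d, previous limit: %d bytes\n", prevPercent, prevLimit)
}
```

Lower values trade CPU time for a smaller heap, which is the performance cost described above.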
| 40 | +Several approaches to mitigate this issue were considered: |
| 41 | + |
| 42 | +- Creating a buffer to process diffs in limited batches |
| 43 | +- Handling larger resources separately |
| 44 | +- Calling the garbage collector via [`runtime.GC`](https://pkg.go.dev/runtime#GC) periodically |
| 45 | +- Skipping certain types of resources |
| 46 | + |
| 47 | +None of these options provided an optimal solution. |
| 48 | + |
| 49 | +# Experimenting with FFI |
| 50 | + |
| 51 | +Go's garbage-collected memory management had become the bottleneck. A language without a garbage collector, like Rust, which frees memory deterministically as soon as its owner goes out of scope, presented a potential solution. |
| 52 | + |
| 53 | +Research revealed [FFI (Foreign Function Interface)](https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#using-extern-functions-to-call-external-code) as a method to integrate Rust with Go. |
| 54 | + |
| 55 | +A proof of concept demonstrated the feasibility of Go-Rust integration through a basic "Hello World" implementation. |
| 56 | + |
| 57 | +```go title="main.go" |
| 58 | +package main |
| 59 | + |
| 60 | +/* |
| 61 | +#cgo LDFLAGS: ./lib/libhello.a -ldl |
| 62 | +#include "./lib/hello.h" |
| 63 | +#include <stdlib.h> |
| 64 | +*/ |
| 65 | +import "C" |
| 66 | +import "unsafe" |
| 67 | + |
| 68 | +func main() { |
| 69 | + str := C.CString("Hello World!") |
| 70 | + defer C.free(unsafe.Pointer(str)) |
| 71 | + |
| 72 | + C.printString(str) |
| 73 | +} |
| 74 | +``` |
| 75 | + |
| 76 | +And the Rust code, followed by the C header that cgo includes: |
| 77 | + |
| 78 | +```rust title="src/lib.rs" |
| 79 | +use std::ffi::{c_char, CStr}; |
| 80 | + |
| 81 | +#[no_mangle] |
| 82 | +pub extern "C" fn printString(message: *const c_char) { |
| 83 | +    // SAFETY: the caller must pass a valid, NUL-terminated C string. |
| 84 | +    let message_cstr = unsafe { CStr::from_ptr(message) }; |
| 85 | +    println!("({})", message_cstr.to_string_lossy()); // avoids panicking on invalid UTF-8 |
| 86 | +} |
| 87 | +``` |
| 88 | + |
| 89 | +```c title="lib/hello.h" |
| 90 | +void printString(const char *message); |
| 91 | +``` |
| 92 | +
|
| 93 | +With `crate-type = ["staticlib"]` set in `Cargo.toml`, `cargo build` produces a `libhello.a` file (an archive library for static linking). While dynamic linking with `.so` (shared object) files is possible, static linking simplifies deployment by producing a single self-contained binary. |
| 94 | +
|
| 95 | +After confirming Go and Rust could be integrated, the next step was finding a suitable diff library. [Armin Ronacher's](https://mitsuhiko.at) library [similar](https://github.com/mitsuhiko/similar) provided the required functionality. |
| 96 | +
|
| 97 | +The integration of the similar library into Go took minimal effort and compiled successfully, allowing Go binaries to call Rust functions. |
| 98 | +
|
| 99 | +However, the key success metric would be the memory usage benchmarks. If the combined Go and Rust implementation didn't provide significant memory improvements, the integration would not be worthwhile. |
| 100 | +
|
| 101 | +# Moment of truth |
| 102 | +
|
| 103 | +After benchmarking both implementations with Go's standard benchmarking tooling (`go test -bench`), the results were even better than expected. |
| 104 | +
|
| 105 | +
|
| 106 | +| | Max Allocated | ns/op | allocs/op | |
| 107 | +|----------|---------------|-------|-----------| |
| 108 | +| Golang | 4.1 GB | 64740 | 182 | |
| 109 | +| Rust FFI | 349 MB | 32619 | 2 | |
| 110 | +
|
| 111 | +
|
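For readers unfamiliar with Go's benchmarking support, here is a minimal sketch of how `ns/op` and `allocs/op` figures like those in the table are typically collected. `naiveLineDiff` is a toy stand-in written for this example, not the diff implementation that was measured, and the sketch does not reproduce the "Max Allocated" column. In a real project the benchmark function would live in a `_test.go` file and run via `go test -bench . -benchmem`; `testing.Benchmark` is used here only so the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// naiveLineDiff is a toy stand-in for a diff function: it returns the
// lines of b that do not appear anywhere in a.
func naiveLineDiff(a, b string) []string {
	seen := make(map[string]struct{})
	for _, line := range strings.Split(a, "\n") {
		seen[line] = struct{}{}
	}
	var added []string
	for _, line := range strings.Split(b, "\n") {
		if _, ok := seen[line]; !ok {
			added = append(added, line)
		}
	}
	return added
}

// benchmarkDiff has the usual benchmark shape: set up inputs, reset the
// timer, then run the function under test b.N times.
func benchmarkDiff(b *testing.B) {
	before := strings.Repeat("key: value\n", 1000)
	after := before + "extra: line\n"
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		naiveLineDiff(before, after)
	}
}

func main() {
	res := testing.Benchmark(benchmarkDiff)
	fmt.Printf("%d ns/op, %d B/op, %d allocs/op\n",
		res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

One caveat for FFI comparisons: Go's allocation counters only see memory allocated by the Go runtime, so allocations made inside the Rust library are not counted in `allocs/op`.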
| 112 | +## Benchmarking Results and Production Implementation |
| 113 | +
|
| 114 | +### Performance Improvements |
| 115 | +
|
| 116 | +The benchmarking results demonstrated significant improvements in memory efficiency when using Rust. The implementation showed: |
| 117 | +
|
| 118 | +- 92% reduction in memory allocation (from 4.1GB to 349MB) |
| 119 | +- Roughly 50% lower time per operation (from 64740 to 32619 ns/op) |
| 120 | +- Dramatic reduction in allocations per operation (from 182 to 2) |
| 121 | +
|
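The relative changes can be re-derived from the benchmark table with a few lines of arithmetic. This sketch assumes 4.1 GB is read as 4100 MB (decimal units); `percentReduction` is a helper named for this example.

```go
package main

import "fmt"

// percentReduction returns how much smaller `after` is than `before`,
// expressed as a percentage of `before`.
func percentReduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Figures taken from the benchmark table; 4.1 GB is assumed to be 4100 MB.
	fmt.Printf("max allocated: %.1f%%\n", percentReduction(4100, 349))
	fmt.Printf("ns/op:         %.1f%%\n", percentReduction(64740, 32619))
	fmt.Printf("allocs/op:     %.1f%%\n", percentReduction(182, 2))
}
```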
| 122 | +### From Experiment to Production |
| 123 | +
|
| 124 | +What started as an experimental project quickly gained traction within the team. After sharing the initial results with colleagues, there was immediate interest in exploring this approach for our production codebase. |
| 125 | +
|
| 126 | +With support from our technical leadership, particularly Moshe Immerman, we conducted a time-boxed proof of concept using our main codebase. The implementation process involved: |
| 127 | +
|
| 128 | +1. Creating a working prototype within one day |
| 129 | +2. Running comprehensive benchmarks against our existing test suite |
| 130 | +3. Deploying to the environment experiencing memory-related crashes |
| 131 | +4. Validating diff generation accuracy and monitoring memory usage |
| 132 | +
|
| 133 | +The results exceeded expectations: the memory-related crashes stopped completely, diff generation stayed correct, and overall memory consumption dropped. |
| 134 | +
|
| 135 | +### Production Implementation |
| 136 | +
|
| 137 | +The transition from proof of concept to production was straightforward due to our container-based deployment strategy. The primary changes involved: |
| 138 | +
|
| 139 | +1. Creating a Rust builder image |
| 140 | +2. Copying the static library (`.a` archive) before building the Go binary |
| 141 | +3. Integrating the build process into our existing containerized workflow |
| 142 | +
|
| 143 | +This implementation demonstrates how combining different programming languages, when done thoughtfully, can solve real-world production issues effectively. |
| 144 | +
|
| 145 | +```Dockerfile title="Dockerfile" |
| 146 | +FROM rust AS rust-builder |
| 147 | +... |
| 148 | +RUN cargo build --release |
| 149 | +
|
| 150 | +FROM golang AS builder |
| 151 | +COPY --from=rust-builder /path/release/target /external/diffgen/target |
| 152 | +RUN go mod download |
| 153 | +RUN make build |
| 154 | +``` |
| 155 | + |
| 156 | +## Parting thoughts |
| 157 | + |
| 158 | +What began as an experimental project was shipped to customers as a viable solution within days. Although we were initially hesitant about combining multiple languages and the challenges that come with it, clear boundaries and comprehensive tests gave us confidence in the implementation. The experience reinforces the value of selecting the right tool for each requirement, even when that means using more than one programming language in a codebase. |
| 159 | + |
| 160 | +[Sample repo with diff gen code and benchmarks](https://github.com/yashmehrotra/go-rust-diffgen) |
| 161 | + |
| 162 | +**Further reading**: |
| 163 | +- [Using `extern` Functions to Call External Code](https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#using-extern-functions-to-call-external-code) |
| 164 | +- [Medi-Remi's sample repo: rust-plus-golang](https://github.com/mediremi/rust-plus-golang) |
| 165 | +- [rustgo: calling Rust from Go with near-zero overhead](https://words.filippo.io/rustgo/) by Filippo Valsorda |
| 166 | +- [Hooking Go from Rust - Hitchhiker’s Guide to the Go-laxy](https://metalbear.co/blog/hooking-go-from-rust-hitchhikers-guide-to-the-go-laxy/) by MetalBear |
| 167 | + |
| 168 | + |
| 169 | + |
| 170 | +*Originally posted on [yashmehrotra.com](https://yashmehrotra.com/posts/overcoming-gos-memory-constraints-with-rust-ffi/?ref=flanksource.com)* |