Memory leak as a result of no cleanup for ThreadLocal in CodedOutputStream?? #7083
Looking at this class, one tricky constraint in any fix is that the thread that created the object and its ThreadLocal variable must also be the thread that cleans up the ThreadLocal. That is, suppose you expose a new method on the class to allow proper cleanup:
public void cleanup() {
  THREAD_LOCAL_CODED_OUTPUT_STREAM.remove();
}
In summary, thread1 has to do the cleanup for the CodedOutputStream that was created and used within thread1; other threads cannot clean up a ThreadLocal entry that belongs to another thread.
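The per-thread behavior described above can be demonstrated with a small standalone sketch (hypothetical names, not the OTel classes): calling remove() from a different thread leaves the owning thread's entry untouched.

```java
// Minimal sketch: ThreadLocal state is per-thread, so remove() only clears
// the entry of the thread that calls it.
public class ThreadLocalCleanupDemo {
  private static final ThreadLocal<String> HOLDER = new ThreadLocal<>();

  public static String demo() {
    HOLDER.set("owned-by-main");
    try {
      // A different thread calling remove() does NOT affect this thread's entry.
      Thread other = new Thread(HOLDER::remove);
      other.start();
      other.join();
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
    String afterForeignRemove = HOLDER.get(); // still "owned-by-main"
    HOLDER.remove(); // only the owning thread can clear its own entry
    String afterOwnRemove = String.valueOf(HOLDER.get()); // "null"
    return afterForeignRemove + "/" + afterOwnRemove;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // owned-by-main/null
  }
}
```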
Looking at the code, both CodedOutputStream and ProtoSerializer use ThreadLocals without cleanup. The area of interest is therefore Marshaler.writeBinaryTo(). Once this is called, you have (technically) two memory leaks: the method creates a ProtoSerializer (which sets the ThreadLocal in that class, with no cleanup), and the ProtoSerializer in turn creates a CodedOutputStream (which sets the ThreadLocal in that class, also with no cleanup). Marshaler.writeBinaryTo() is the only place a ProtoSerializer is created, and ProtoSerializer is the only place a CodedOutputStream is created. The simplest fix is to make use of the fact that ProtoSerializer implements AutoCloseable:
public void cleanup() {
  THREAD_LOCAL_CODED_OUTPUT_STREAM.remove();
}

@Override
public void close() throws IOException {
  try {
    output.flush();
    output.cleanup(); // NEW: fixes memory leak 1 by removing the ThreadLocal from the thread when "done"
    idCache.clear(); // see my next github comment about removing the ThreadLocal in the ProtoSerializer class
  } catch (IOException e) {
    // If close is called automatically as part of try-with-resources, it's possible that
    // output.flush() will throw the same exception. Re-throwing the same exception in a finally
    // block triggers an IllegalArgumentException indicating illegal self suppression. To avoid
    // this, we wrap the exception so a different instance is thrown.
    throw new IOException(e);
  }
}
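To illustrate the shape of this fix without depending on the real OTel classes, here is a self-contained stand-in (all names here are hypothetical): a serializer whose close() removes the per-thread entry, so try-with-resources guarantees cleanup on the thread that used it.

```java
// Sketch of the try-with-resources wiring proposed above, using stand-in
// names. close() drops the per-thread state, mirroring the cleanup() idea.
public class SerializerCloseDemo {
  static final ThreadLocal<StringBuilder> THREAD_LOCAL_BUFFER =
      ThreadLocal.withInitial(StringBuilder::new);

  static class Serializer implements AutoCloseable {
    void write(String s) {
      THREAD_LOCAL_BUFFER.get().append(s);
    }

    @Override
    public void close() {
      THREAD_LOCAL_BUFFER.remove(); // cleanup runs on the thread that used the serializer
    }
  }

  public static boolean threadLocalClearedAfterUse() {
    try (Serializer s = new Serializer()) {
      s.write("payload");
    }
    // withInitial() supplies a fresh, empty builder on the next get(), so an
    // empty builder here means the old entry was removed by close().
    return THREAD_LOCAL_BUFFER.get().length() == 0;
  }

  public static void main(String[] args) {
    System.out.println(threadLocalClearedAfterUse()); // true
  }
}
```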
Here's a way to replace the THREAD_LOCAL_ID_CACHE in ProtoSerializer: use a "global" ConcurrentHashMap to store id conversions.
import java.util.concurrent.ConcurrentHashMap;

private static final int CACHE_MAX_SIZE = 10_000; // Adjust as needed
private static final ConcurrentHashMap<String, byte[]> GLOBAL_ID_CACHE = new ConcurrentHashMap<>();

private static byte[] getCachedOrCompute(String id, int length) {
  if (GLOBAL_ID_CACHE.size() > CACHE_MAX_SIZE) {
    GLOBAL_ID_CACHE.clear();
  }
  return GLOBAL_ID_CACHE.computeIfAbsent(id, key -> OtelEncodingUtils.bytesFromBase16(key, length));
}
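The computeIfAbsent pattern above can be exercised with a runnable sketch (hypothetical names; a toUpperCase stand-in replaces OtelEncodingUtils.bytesFromBase16, and a tiny CACHE_MAX_SIZE replaces 10_000): repeated lookups for the same key hit the cache and only compute once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the global-cache idea: computeIfAbsent computes at most once per
// key, and a crude clear() caps the cache size, as in the suggestion above.
public class GlobalCacheDemo {
  private static final int CACHE_MAX_SIZE = 3; // tiny limit, just for the demo
  private static final ConcurrentHashMap<String, String> CACHE = new ConcurrentHashMap<>();
  static final AtomicInteger computeCalls = new AtomicInteger();

  static String getCachedOrCompute(String id) {
    if (CACHE.size() > CACHE_MAX_SIZE) {
      CACHE.clear(); // crude eviction: wipe everything once the cap is exceeded
    }
    return CACHE.computeIfAbsent(id, key -> {
      computeCalls.incrementAndGet();
      return key.toUpperCase(); // stand-in for the real id conversion
    });
  }

  public static void main(String[] args) {
    getCachedOrCompute("abc");
    getCachedOrCompute("abc"); // second call is a cache hit; no recompute
    System.out.println(computeCalls.get()); // 1
  }
}
```

A trade-off worth noting: unlike the ThreadLocal it replaces, this map is shared across threads, so one cache serves all of them, at the cost of the occasional full clear() when the cap is exceeded.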
Discussed in #7082
Originally posted by asclark109 February 7, 2025
I am using the io.opentelemetry:opentelemetry-exporter-common:1.38.0 jar in my java web application project running on Tomcat 10. I am getting memory leaks at application shutdown (one is a io.netty.util.internal.InternalThreadLocalMap that is tracked in Netty). The other appears below.
I have looked at your class in release 1.38.0 and on main: CodedOutputStream.java
I notice that a ThreadLocal is created and updated but never cleaned up (i.e. there is no call to THREAD_LOCAL_CODED_OUTPUT_STREAM.remove()).
opentelemetry-java/exporters/common/src/main/java/io/opentelemetry/exporter/internal/marshal/CodedOutputStream.java
Lines 85 to 104 in 30d16eb
If someone can offer help to work around this (or patch a fix), it would be appreciated. Thanks.
The only discussion page I could find on ThreadLocals is #6584.