Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak as a result of no cleanup for ThreadLocal in CodedOutputStream?? #7083

Open
asclark109 opened this issue Feb 7, 2025 · 3 comments

Comments

@asclark109
Copy link

asclark109 commented Feb 7, 2025

Discussed in #7082

Originally posted by asclark109 February 7, 2025
I am using the io.opentelemetry:opentelemetry-exporter-common:1.38.0 jar in my java web application project running on Tomcat 10. I am getting memory leaks at application shutdown (one is a io.netty.util.internal.InternalThreadLocalMap that is tracked in Netty). The other appears below.

07-Feb-2025 12:36:08.735 SEVERE [main] org.apache.catalina.loader.WebappClassLoaderBase.checkThreadLocalMapForLeaks The web application [agtest-trunk] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@658c5a19]) and a value of type [io.opentelemetry.exporter.internal.marshal.CodedOutputStream.OutputStreamEncoder] (value [io.opentelemetry.exporter.internal.marshal.CodedOutputStream$OutputStreamEncoder@421e361]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.

I have looked at your class in release 1.38.0 and on main: CodedOutputStream.java

I notice that a ThreadLocal is created and updated but never cleaned up (i.e. there is no call to do THREAD_LOCAL_CODED_OUTPUT_STREAM.remove()).

private static final ThreadLocal<OutputStreamEncoder> THREAD_LOCAL_CODED_OUTPUT_STREAM =
new ThreadLocal<>();
/**
* Create a new {@code CodedOutputStream} wrapping the given {@code OutputStream}.
*
* <p>NOTE: The provided {@link OutputStream} <strong>MUST NOT</strong> retain access or modify
* the provided byte arrays. Doing so may result in corrupted data, which would be difficult to
* debug.
*/
static CodedOutputStream newInstance(final OutputStream output) {
OutputStreamEncoder cos = THREAD_LOCAL_CODED_OUTPUT_STREAM.get();
if (cos == null) {
cos = new OutputStreamEncoder(output);
THREAD_LOCAL_CODED_OUTPUT_STREAM.set(cos);
} else {
cos.reset(output);
}
return cos;
}

If someone can offer help to get around this (or patch a fix), it would be appreciated. thanks.

Only discussion page I could find on ThreadLocals #6584

@asclark109
Copy link
Author

asclark109 commented Feb 7, 2025

Looking at this class, one of the tricky things we need to make sure in order to fix this class is that the thread that created the object and its threadlocal variable needs to also be the thread that cleans up the threadlocal variable.

That is, suppose you create an expose a new method on the class to allow proper cleanup

public void cleanup()
{
  THREAD_LOCAL_CODED_OUTPUT_STREAM.remove()
}

If a thread1 creates and uses an eos EncodedOutputStream, it creates a threadlocal variable on thread1. If a thread2 comes along and invokes eos.cleanup(), this will NOT actually cleanup the threadlocal variable on thread1 because that variable only exists on thread1--not thread2. That's important to point out I think.

In summary, thread1 has to do the cleanup for the EncodedOutputStream that was created and used within thread1; other threads cannot cleanup the threadlocal variable created on other threads.

@asclark109
Copy link
Author

asclark109 commented Feb 10, 2025

Looking at the code, it appears both CodedOutputStream and ProtoSerializer use threadlocals without cleanup. Therefore the area of interest is:

Marshaler.writeBinaryTo(). Once this is called, you have (technically 2) memory leaks because this method creates a ProtoSerializer (which sets the threadLocal in that class, which doesn't have any cleanup), and this also creates a CodedOutputStream (which sets the threadlocal in that class, which doesn't have any cleanup).

Marshaler.writeBinaryTo() is the only place where a ProtoSerializer is created, and the ProtoSerializer is the only place the CodedOutputStream is created.

The simplest fix is to make use of the fact ProtoSerializer implements AutoCloseable.

  1. add the method CodedOutputStream.cleanup(), which does:
public void cleanup()
{
  THREAD_LOCAL_CODED_OUTPUT_STREAM.remove()
}
  1. in ProtoSerializer.close(), invoke output.cleanup(). This will fix memory leak in CodedOutputStream. If you also want to remove the leak in ProtoSerializer (which I saw via inspection but didn't detect yet), you also invoke idCache.remove(). This will effectively remove your caching mechanism until you change the implementation to not use threadlocals (i.e. a different kind of caching) or to use threadLocals very carefully.
  @Override
  public void close() throws IOException {
    try {
      output.flush();
      output.cleanup(); // NEW: fixes memory leak 1 by removing threadlocal from thread when "done"
      idCache.clear(); // see my next github comment about removing the threadlocal in ProtoSerializer class
    } catch (IOException e) {
      // If close is called automatically as part of try-with-resources, it's possible that
      // output.flush() will throw the same exception. Re-throwing the same exception in a finally
      // block triggers an IllegalArgumentException indicating illegal self suppression. To avoid
      // this, we wrap the exception so a different instance is thrown.
      throw new IOException(e);
    }
  }

@asclark109
Copy link
Author

asclark109 commented Feb 10, 2025

Here's a way to replace the THREAD_LOCAL_ID_CACHE in ProtoSerializer. Use a "global" ConcurrentHashMap to store id conversions. Pros:

  1. no threadlocals (avoid class loader memory leaks) therefore does not pollute threads
  2. works well in multi-threaded environments
  3. has an eviction policy to prevent unbouunded growth
import java.util.concurrent.ConcurrentHashMap;

private static final int CACHE_MAX_SIZE = 10_000;  // Adjust as needed
private static final ConcurrentHashMap<String, byte[]> GLOBAL_ID_CACHE = new ConcurrentHashMap<>();

private static byte[] getCachedOrCompute(String id, int length) {
    if (GLOBAL_ID_CACHE.size() > CACHE_MAX_SIZE) {
        GLOBAL_ID_CACHE.clear();
    }

    return GLOBAL_ID_CACHE.computeIfAbsent(id, key -> 
        OtelEncodingUtils.bytesFromBase16(key, length)
    );
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant