Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security concerns of having Ehcache as the default #28546

Open
ben-manes opened this issue Jan 26, 2025 · 11 comments
Open

Security concerns of having Ehcache as the default #28546

ben-manes opened this issue Jan 26, 2025 · 11 comments

Comments

@ben-manes
Copy link

ben-manes commented Jan 26, 2025

From Using a cache,

Ehcache is the default cache with monoliths in JHipster.

Ehcache 3 includes a trivial hash flooding denial of service attack, which was reported against 3.0-M4 in 2015. Unfortunately the authors simply resolved the issue without fixing, and then continued to ignore community concerns since then. When testing the latest release, 3.10.8, it still includes this problem. I do not think this should be the default cache, not because there are bugs, but because it has been 10 years of hiding a severe flaw instead of making a trivial fix.

The performance of their eviction policy degrades as the cache size increases due to a poor sampling implementation in their custom hash table. This implies that the more critical, performance sensitive caches are the most likely to be vulnerable and easiest to exploit due to their central usage. On my M3 Max laptop, Ehcache takes nearly 20 minutes in a database simulation, whereas Guava and Caffeine each take only 15 seconds or less. The runtime will be significantly worse on typical, virtualized cloud compute rather than a high performance laptop.

If we apply a simple patch of using the mix function in SplittableRandom, Stafford's variant 13, then the execution time drops to 40 seconds. Further improvements could be made by the developers, but users can mitigate this problem if they are careful with their hash codes.

HashDoS demonstration

$ git clone https://github.com/ben-manes/caffeine.git
$ wget https://github.com/moka-rs/cache-trace/raw/ef0a9de8cf0202a1f2fee186a0af497774b0f0a9/arc/DS1.lis.zst
$ ./caffeine/gradlew --project-dir caffeine simulator:run -q \
  -Dcaffeine.simulator.files.format="arc" \
  -Dcaffeine.simulator.maximum-size=4_000_000 \
  -Dcaffeine.simulator.policies.0="product.Ehcache3" \
  -Dcaffeine.simulator.files.paths.0="$PWD/DS1.lis.zst"
╔══════════════════╤══════════╤═══════════╤════════════╤════════════╤════════════╤════════════╤═══════════╗
║ Policy           │ Hit Rate │ Miss Rate │ Hits       │ Misses     │ Requests   │ Evictions  │ Time      ║
╠══════════════════╪══════════╪═══════════╪════════════╪════════════╪════════════╪════════════╪═══════════╣
║ product.Ehcache3 │ 28.01 %  │ 71.99 %   │ 12,240,263 │ 31,464,716 │ 43,704,979 │ 27,464,716 │ 18.24 min ║
╚══════════════════╧══════════╧═══════════╧════════════╧════════════╧════════════╧════════════╧═══════════╝

When running, we can observe this as the problem by using jstack

$ jps | grep Simulator
29929 Simulator
$ jstack 29929 | grep \"Ehcache3Policy\" -A10
"Ehcache3Policy" #32 [43267] daemon prio=5 os_prio=31 cpu=188377.19ms elapsed=193.78s tid=0x0000000126018400 nid=43267 runnable  [0x0000000172bae000]
   java.lang.Thread.State: RUNNABLE
	at org.ehcache.impl.internal.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3410)
	at org.ehcache.impl.internal.concurrent.ConcurrentHashMap.getEvictionCandidate(ConcurrentHashMap.java:6493)
	at org.ehcache.impl.internal.store.heap.SimpleBackend.getEvictionCandidate(SimpleBackend.java:53)
	at org.ehcache.impl.internal.store.heap.OnHeapStore.evict(OnHeapStore.java:1579)
	at org.ehcache.impl.internal.store.heap.OnHeapStore.enforceCapacity(OnHeapStore.java:1558)
	at org.ehcache.impl.internal.store.heap.OnHeapStore.put(OnHeapStore.java:365)
	at org.ehcache.core.Ehcache.doPut(Ehcache.java:93)
	at org.ehcache.core.EhcacheBase.put(EhcacheBase.java:187)
	at com.github.benmanes.caffeine.cache.simulator.policy.product.Ehcache3Policy.record(Ehcache3Policy.java:80)

We can observe that other caches do not have this problem, such as Guava's LRU.

$ ./caffeine/gradlew --project-dir caffeine simulator:run -q \
  -Dcaffeine.simulator.files.format="arc" \
  -Dcaffeine.simulator.maximum-size=4_000_000 \
  -Dcaffeine.simulator.policies.0="product.Guava" \
  -Dcaffeine.simulator.files.paths.0="$PWD/DS1.lis.zst"
╔═══════════════╤══════════╤═══════════╤═══════════╤════════════╤════════════╤════════════╤═════════╗
║ Policy        │ Hit Rate │ Miss Rate │ Hits      │ Misses     │ Requests   │ Evictions  │ Time    ║
╠═══════════════╪══════════╪═══════════╪═══════════╪════════════╪════════════╪════════════╪═════════╣
║ product.Guava │ 20.24 %  │ 79.76 %   │ 8,847,984 │ 34,856,995 │ 43,704,979 │ 30,856,995 │ 14.44 s ║
╚═══════════════╧══════════╧═══════════╧═══════════╧════════════╧════════════╧════════════╧═════════╝

or Caffeine's W-TinyLFU,

$ ./caffeine/gradlew --project-dir caffeine simulator:run -q \
  -Dcaffeine.simulator.files.format="arc" \
  -Dcaffeine.simulator.maximum-size=4_000_000 \
  -Dcaffeine.simulator.policies.0="product.Caffeine" \
  -Dcaffeine.simulator.files.paths.0="$PWD/DS1.lis.zst"
╔══════════════════╤══════════╤═══════════╤════════════╤════════════╤════════════╤════════════╤═════════╗
║ Policy           │ Hit Rate │ Miss Rate │ Hits       │ Misses     │ Requests   │ Evictions  │ Time    ║
╠══════════════════╪══════════╪═══════════╪════════════╪════════════╪════════════╪════════════╪═════════╣
║ product.Caffeine │ 45.83 %  │ 54.17 %   │ 20,028,995 │ 23,675,984 │ 43,704,979 │ 19,675,984 │ 6.937 s ║
╚══════════════════╧══════════╧═══════════╧════════════╧════════════╧════════════╧════════════╧═════════╝

HashDoS mitigation

$ wget https://github.com/user-attachments/files/18551302/ehcache.patch
$ pushd caffeine ; git apply ../ehcache.patch ; popd
$ ./caffeine/gradlew --project-dir caffeine simulator:run -q \
  -Dcaffeine.simulator.files.format="arc" \
  -Dcaffeine.simulator.maximum-size=4_000_000 \
  -Dcaffeine.simulator.policies.1="product.Ehcache3" \
  -Dcaffeine.simulator.files.paths.0="$PWD/DS1.lis.zst"
╔══════════════════╤══════════╤═══════════╤═══════════╤════════════╤════════════╤════════════╤═════════╗
║ Policy           │ Hit Rate │ Miss Rate │ Hits      │ Misses     │ Requests   │ Evictions  │ Time    ║
╠══════════════════╪══════════╪═══════════╪═══════════╪════════════╪════════════╪════════════╪═════════╣
║ product.Ehcache3 │ 21.97 %  │ 78.03 %   │ 9,600,600 │ 34,104,379 │ 43,704,979 │ 30,104,379 │ 39.94 s ║
╚══════════════════╧══════════╧═══════════╧═══════════╧════════════╧════════════╧════════════╧═════════╝

You might notice that Ehcache's sampled LRU's hit rate decreased from 28% to more closely match Guava's LRU. This is because a database workload has a strong MRU / LFU bias, so the worse distribution randomly selects a better victim. This would be reverse in a recency-biased workload, such as in data streams.

Addendum While I disclosed this issue in 2015, I do not interact with the Ehcache developers. They publicly called me a fraud and claimed that I faked and manipulated my benchmarks. These were baseless accusations with no technical foundation, but was intended to simply slander my project before it had much adoption. I find it no surprise that their project has withered instead, perhaps due to their lack of professionalism. While my project certainly had bugs and flaws that were found and addressed, and I have voiced concerns over alternative's claims, in every instance I have used data and an open, fair analysis to explain my beliefs. As JHipster is one of the few projects that still prefers Ehcache, I think it is worth you being aware of this problem, perhaps make progress in having it resolved, or reconsider the defaults for your users.
@atomfrede
Copy link
Member

Thanks @ben-manes for the detailed report. I think ehcache was one of the early decisions which have never been reconsidered that much.
I agree with using another one as personally I have used guava and caffeine in other projects.

@atomfrede
Copy link
Member

@jhipster/developers What do you think? I remember some older discussions about making caffeine the default choice, but can't find it anymore.

@mraible
Copy link
Contributor

mraible commented Jan 30, 2025 via email

@atomfrede
Copy link
Member

@mraible I can't find any dedicated "primary" recommendation (https://docs.spring.io/spring-boot/reference/io/caching.html or https://docs.spring.io/spring-framework/reference/integration/cache/store-configuration.html) in the docs. Or do we use ehcache by default as it integrates easily(?) with hibernates 2nd level cache?
It looks like we support both (ehcache and caffeine) for hibernate 2nd level cache via jcache abstraction. So maybe we can make caffeine the default (with the next release).

@ben-manes
Copy link
Author

I am certainly fine if you leave your defaults but are able to wrangle the Ehcache team to fix this problem and release an upgrade. I consider this to be a serious issue and am very disappointed that they do not care. Since it has not been exploited it may be my overreaction as I would have resolved it asap in their place (Caffeine's 2.2.7 release).

@atomfrede
Copy link
Member

I think caffeine is more versatile and easier too use, so it would be a "win-win"

@jdubois
Copy link
Member

jdubois commented Jan 30, 2025

I'm the culprit for ehcache, but that's only because at that time (more than 10 years ago!!) there wasn't much choice! The strong decision for me is to have a 2nd level cache, then for the implementation that's totally up for discussion!
I believe that Caffeine seems indeed to be technically better, and also I've seen @ben-manes several times in our discussions, and the relationship/support for the 3rd party libraries we use is also super important (I'd rather work with friendly people, who contribute and help, as it's how the project has always been successful).
I'm not sure of the impact of migrating the default cache.... There's a "hidden cost" in tools like JHipster Online, as we have this in the UI. So it would also depend on that work.

@mraible
Copy link
Contributor

mraible commented Jan 30, 2025

@henri-tremblay Since you're a contributor to Ehcache, do you have any thoughts on this?

@henri-tremblay
Copy link
Contributor

I am not working on Ehcache anymore. And have lost a bit track of the developments on it. Spring Boot detects cache in the following order: https://docs.spring.io/spring-boot/docs/2.1.7.RELEASE/reference/html/boot-features-caching.html#boot-features-caching-provider. Ehcache is second after the generic one and caffeine is at the bottom. I have not opinion on that, but it's what they do.

If I remember well, Ehcache, at least by default, will indeed by unhappy if the cache is full and you keep adding to it, thus causing evictions to occur. It's traversing the cache through a little sample of entries to find the best one to evict. It's not a crazy way to do it and I think it can be tweaked with some configuration. I remember it as a trade-off. The cache works super well but should never be full. Something like that. But I will let current Ehcache chip in to give a full picture. @chrisdennis maybe Hopefully everyone will stay calm.

@Sanne
Copy link

Sanne commented Feb 7, 2025

For what it's worth, in Quarkus we only offer two options OOB:

  • Infinispan for distributed / cluster needs. Its eviction strategies are based on Caffeine.
  • A custom "light" Quarkus cache for simple local needs. Also based on Caffeine.

In both cases, big thanks to @ben-manes as he helped us make a really good integration.

I believe as in any OSS project it's not too hard for end users to plug an alternative implementation but I'd not recommend that at all, it's very tricky business and people tend to measure the performance under the wrong conditions, or looking at metrics which are not as critical as the cache's dynamic behaviour under stress.

Definitely a security issue here.

@mshima mshima added the v9 label Feb 7, 2025
@mshima
Copy link
Member

mshima commented Feb 7, 2025

I've opened a poll: #28630.

@mshima mshima mentioned this issue Feb 7, 2025
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants