I need a really fast, persistent cache for my web crawler. It doesn't need to be as fast as ConcurrentSkipListSet in Java, but it definitely cannot be MySQL with a hash-index-based table, which I tried. After 1M+ records it takes about 80% of processor time.
Does anyone know of, or has anyone heard of, something useful for this case?
Thanks for any hint.

The ConcurrentSkipListSet can of course stay in the game as level 1; what I'm looking for is something for level 2. – tomasb Aug 20 '11 at 15:36
How about Cassandra? Many properties would fit my scenario. Is it fast? – tomasb Aug 04 '13 at 00:55
2 Answers
Try EhCache. It's primarily an in-memory cache with options for overflow and persistence to a disk backing store. It has been around for years, is still actively developed, and is very mature.
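To make the "in-memory with persistence to disk" idea concrete, here is a minimal JDK-only sketch. It is not EhCache's API and does not implement overflow: the whole set still lives in memory, and the disk file only makes it survive restarts. The class name `SeenUrlStore`, the one-URL-per-line log format, and the assumption that URLs contain no line breaks are all hypothetical choices for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical "seen URLs" store: an in-memory set backed by an append-only log file. */
class SeenUrlStore implements AutoCloseable {
    // Level 1: fast concurrent lookups in memory.
    private final ConcurrentSkipListSet<String> seen = new ConcurrentSkipListSet<>();
    // Persistence: every newly seen URL is appended as one line.
    private final BufferedWriter log;

    SeenUrlStore(Path file) throws IOException {
        if (Files.exists(file)) {
            // Replay the log so entries survive a restart.
            seen.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
        log = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Returns true if the URL was not seen before; only new URLs are written to the log. */
    boolean add(String url) throws IOException {
        if (!seen.add(url)) {
            return false;
        }
        synchronized (log) {
            log.write(url);
            log.newLine();
        }
        return true;
    }

    boolean contains(String url) {
        return seen.contains(url);
    }

    /** Flushes buffered lines and closes the log file. */
    @Override
    public void close() throws IOException {
        synchronized (log) {
            log.close();
        }
    }
}
```

Writes are buffered, so URLs added shortly before a crash may be lost unless the writer is flushed. A real library handles that trade-off, plus eviction to disk, for you.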

I started looking at EHCache recently, I guess that BigMemory is not persistent. – Marsellus Wallace Aug 20 '11 at 18:23
I guess so too, but it spares the GC by using off-heap storage, which is fine: the JVM heap can stay small, so GC runs faster. – tomasb Aug 20 '11 at 21:10
I am working on cache2k, and researching recent cache eviction policies to make it the fastest Java cache around; see the cache2k benchmarks.
Persistence is being added right now and will be available for preview and testing in two weeks. I expect it to be very stable in five weeks. The cache2k implementation is, of course, not as mature as EHCache; however, everything released is used within our own applications and proves itself in production environments.
Update: The "two weeks" was very optimistic, since the whole locking concept finally needed a rewrite and careful inspection... You can track the persistence support currently emerging on GitHub.

Is this now a persistent cache? I have read the docs but can't find any word about it in there. – Paul Steven Feb 24 '20 at 05:11