
I need a really fast, persistent cache for my web crawler. It doesn't need to be as fast as ConcurrentSkipListSet in Java, but it definitely cannot be MySQL with a hash-index-based table, which I tried. After 1M+ records it takes about 80% of processor time.

Does anyone know of, or has anyone heard of, something useful for this case?
Thanks for any hint.

tomasb
  • The ConcurrentSkipListSet can of course stay in the game as level 1; what I'm looking for is something for level 2 – tomasb Aug 20 '11 at 15:36
  • How about Cassandra? Many properties would fit my scenario. Is it fast? – tomasb Aug 04 '13 at 00:55
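The level 1 / level 2 split mentioned in the comment above could be sketched roughly as follows. This is only an illustration, not a recommendation of a specific store: the class name `TwoLevelSeenSet` and the method `markSeen` are hypothetical, and "level 2" here is just an append-only file that is replayed into memory at startup, so it assumes the whole key set still fits in RAM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical two-level "seen URL" set: in-memory set first, append-only file second. */
class TwoLevelSeenSet {
    // Level 1: fast concurrent in-memory set.
    private final ConcurrentSkipListSet<String> level1 = new ConcurrentSkipListSet<>();
    // Level 2 (assumption): a plain text file with one URL per line.
    private final Path level2File;

    TwoLevelSeenSet(Path level2File) throws IOException {
        this.level2File = level2File;
        // Replay previously persisted URLs so a restart does not lose state.
        if (Files.exists(level2File)) {
            level1.addAll(Files.readAllLines(level2File, StandardCharsets.UTF_8));
        }
    }

    /** Returns true if the URL was new; new URLs are also appended to the file. */
    boolean markSeen(String url) {
        if (!level1.add(url)) {
            return false; // already known, no disk access needed
        }
        synchronized (this) { // serialize appends so lines do not interleave
            try {
                Files.write(level2File, Collections.singletonList(url), StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return true;
    }

    boolean contains(String url) {
        return level1.contains(url);
    }
}
```

Lookups never touch the disk in this sketch; only first-time insertions pay for a file append. A real level 2 that outgrows memory would need an indexed on-disk structure instead of a replayed log.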

2 Answers


Try EhCache. It's primarily an in-memory cache with options for overflow and persistence to a disk backing store. It has been around for years, is still actively developed, and is very mature.
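To make the "in-memory with overflow to disk" idea concrete, here is a minimal sketch using only the Java standard library. It is not EhCache's API or implementation; the class `OverflowCache`, its methods, and the one-file-per-key layout are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache that writes evicted entries to a directory instead of dropping them. */
class OverflowCache {
    private final Path dir;                   // disk backing store (must already exist)
    private final Map<String, String> memory; // access-ordered LRU map

    OverflowCache(Path dir, final int maxInMemory) {
        this.dir = dir;
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() <= maxInMemory) {
                    return false;
                }
                // Overflow: persist the least recently used entry before evicting it.
                spill(eldest.getKey(), eldest.getValue());
                return true;
            }
        };
    }

    synchronized void put(String key, String value) {
        memory.put(key, value);
    }

    /** Looks in memory first, then on disk; a disk hit is promoted back into memory. */
    synchronized String get(String key) {
        String value = memory.get(key);
        if (value != null) {
            return value;
        }
        Path file = fileFor(key);
        if (!Files.exists(file)) {
            return null;
        }
        try {
            value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        memory.put(key, value);
        return value;
    }

    synchronized int inMemorySize() {
        return memory.size();
    }

    private void spill(String key, String value) {
        try {
            Files.write(fileFor(key), value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hex-encode the key so it is a safe, collision-free file name (long keys may exceed OS limits).
    private Path fileFor(String key) {
        StringBuilder name = new StringBuilder();
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            name.append(String.format("%02x", b));
        }
        return dir.resolve(name.toString());
    }
}
```

Hot entries are served from memory, and only evicted ones cost a file read. A mature library adds what this sketch leaves out, such as expiry, size accounting, and an efficient on-disk index.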

skaffman

I am working on cache2k and researching recent cache eviction policies to make it the fastest Java cache around; see the cache2k benchmarks.

Persistence is being added right now and will be available for preview and testing in two weeks. I expect it to be very stable in five weeks. The cache2k implementation is, of course, not as mature as EHCache; however, everything released is used within our own applications and has proven itself in production environments.

Update: The "two weeks" estimate was very optimistic, since the whole locking concept ended up needing a rewrite and careful inspection... You can track the persistence support currently emerging on GitHub.

cruftex