9

Is there any file system based B+ Tree implementation in c#(open source). I have found some projects, but those are not file(disk) based implementation. I am specifically looking for file system based B+ Trees.

RameshVel
  • 64,778
  • 30
  • 169
  • 213

2 Answers2

10

Update:

I've added some benchmarks of managed B-Tree implementations for your enjoyment if you looking into this sort of thing.

BplusDotNet "... is known to be somewhat buggy on deletes"

I found just the opposite to be true, RaptorDB 1.6 was corrupting state and BplusDotNet 1.0.2082.16942 seemed to work well enough.

Original:

For completeness I'm going to add my own implementation here.

csharptest.net
  • 62,602
  • 11
  • 71
  • 89
  • I can confirm that BPlusTree DOES corrupt data when reading via a custom ISerializer serializer. Entities has inbuilt SHA256 based HMACs for self-checks and would failed 1:50K times (IO streams would be closed out prematurely). Also performance drops, so reading the last 10% entities is MUCH slower than reading the first 10%. Problems went away when perform same serialization of T => byte[] outside BPlusTree and BPlusTree uses PrimitiveSerializer.Bytes to work with just byte[] instead of a custom entity. Project likely dead since NuGet has seen no release in 1.5 years – DeepSpace101 Jan 22 '14 at 22:50
  • @DeepSpace101 can you share your implementation of the ISerializer? I suspect your issue is possibly there. The BPlusTree is currently in use in several commercial offerings and has proven reliable as you indicate you experienced by using the byte[] and built-in serializer. As to the project being dead, I can certainly understand the impression as I've not made any significant improvements to it in a long time. I'm happy to help you and others with it, just shoot me an email roger @ my username. – csharptest.net Jan 23 '14 at 16:42
  • Will email you but the root cause was that the serializer had a convention that end-of-stream = end of object. That convention/assumption was broken at the `T ReadFrom(Stream stream)` interface, resulting in more bytes (beyond the 'end') to be read, killing the crypto-checksums. – DeepSpace101 Jan 24 '14 at 05:53
  • @DeepSpace101 I suspected as much, I've been bitten by the same. At least it's documented here for the next person ;) – csharptest.net Jan 24 '14 at 16:39
  • Also, it would happen just 1:50K times (vs every time) because the `ReadFrom` would happen only if the disk was hit for the read. Since we were writing => reading (back to back), reads happened off a cache. But if the disk was hit even once (1:50K times at app level), we saw that error. Our fix was to handle serialization outside the BPlusTree library and just use the byte[] interface into BPlusTree. – DeepSpace101 Jan 24 '14 at 19:08
2

http://bplusdotnet.sourceforge.net/ but this one is known to be somewhat buggy on deletes.

Another one that appears to work well:

http://www.codeproject.com/KB/database/RaptorDB.aspx

RaptorDB allows you to store key/values either indexed using a b+ tree or a hash index. You can pick when you create the files.

dave
  • 115
  • 1
  • 6