Is there any file system based B+ Tree implementation in c#(open source). I have found some projects, but those are not file(disk) based implementation. I am specifically looking for file system based B+ Trees.
Asked
Active
Viewed 9,294 times
2 Answers
10
Update:
I've added some benchmarks of managed B-Tree implementations for your enjoyment if you looking into this sort of thing.
BplusDotNet "... is known to be somewhat buggy on deletes"
I found just the opposite to be true, RaptorDB 1.6 was corrupting state and BplusDotNet 1.0.2082.16942 seemed to work well enough.
Original:
For completeness I'm going to add my own implementation here.
- Introduction - http://csharptest.net/?page_id=563
- Benchmarks - http://csharptest.net/?p=586
- Online Help - http://help.csharptest.net/
- Source Code - http://code.google.com/p/csharptest-net/
- Downloads - http://code.google.com/p/csharptest-net/downloads
- NuGet Package - http://nuget.org/List/Packages/CSharpTest.Net.BPlusTree

csharptest.net
- 62,602
- 11
- 71
- 89
-
I can confirm that BPlusTree DOES corrupt data when reading via a custom ISerializer
serializer. Entities has inbuilt SHA256 based HMACs for self-checks and would failed 1:50K times (IO streams would be closed out prematurely). Also performance drops, so reading the last 10% entities is MUCH slower than reading the first 10%. Problems went away when perform same serialization of T => byte[] outside BPlusTree and BPlusTree uses PrimitiveSerializer.Bytes to work with just byte[] instead of a custom entity. Project likely dead since NuGet has seen no release in 1.5 years – DeepSpace101 Jan 22 '14 at 22:50 -
@DeepSpace101 can you share your implementation of the ISerializer? I suspect your issue is possibly there. The BPlusTree is currently in use in several commercial offerings and has proven reliable as you indicate you experienced by using the byte[] and built-in serializer. As to the project being dead, I can certainly understand the impression as I've not made any significant improvements to it in a long time. I'm happy to help you and others with it, just shoot me an email roger @ my username. – csharptest.net Jan 23 '14 at 16:42
-
Will email you but the root cause was that the serializer had a convention that end-of-stream = end of object. That convention/assumption was broken at the `T ReadFrom(Stream stream)` interface, resulting in more bytes (beyond the 'end') to be read, killing the crypto-checksums. – DeepSpace101 Jan 24 '14 at 05:53
-
@DeepSpace101 I suspected as much, I've been bitten by the same. At least it's documented here for the next person ;) – csharptest.net Jan 24 '14 at 16:39
-
Also, it would happen just 1:50K times (vs every time) because the `ReadFrom` would happen only if the disk was hit for the read. Since we were writing => reading (back to back), reads happened off a cache. But if the disk was hit even once (1:50K times at app level), we saw that error. Our fix was to handle serialization outside the BPlusTree library and just use the byte[] interface into BPlusTree. – DeepSpace101 Jan 24 '14 at 19:08
2
http://bplusdotnet.sourceforge.net/ but this one is known to be somewhat buggy on deletes.
Another one that appears to work well:
http://www.codeproject.com/KB/database/RaptorDB.aspx
RaptorDB allows you to store key/values either indexed using a b+ tree or a hash index. You can pick when you create the files.

dave
- 115
- 1
- 6