In (VB).NET (4.0 framework), I need a Dictionary<string, string> that can efficiently store its contents, or better, its internal representation, to a file and load it again later.

Ruby has such a thing in PStore; see the Ruby docs.

The intention is to prepare a lookup dictionary once, e.g. for text translations from a database, and store it in a disk file that can then be loaded quickly many times for output generation. So loading should not read the file line by line and fill the Dictionary with key-value pairs, but instead load the file contents in one go directly into the dictionary's internal state representation.

I think this should be possible, but I would also like to see your explanations if you think otherwise.

gpinkas
  • If you are asking how to serialize the dictionary to a file so that it can be read later, use the `BinaryWriter` and `BinaryReader` classes as shown here: http://stackoverflow.com/a/4022629/284240 – Tim Schmelter Mar 14 '13 at 09:48
  • @TimSchmelter: This method also adds each key/value pair one-by-one. I'd prefer to read the whole internal class representation if possible. – gpinkas Mar 14 '13 at 09:53
  • I would use SQLite and write a wrapper `SQLiteDictionary` class which implements `IDictionary` and takes the file name in the constructor. – Vasea Mar 14 '13 at 10:07
  • Sounds like you are working really hard to save a few milliseconds. Are you sure you need to do this? To be able to thunk a blob from memory to a file and back directly into your storage data structure, you are going to have to code it in unmanaged C++ or do a lot of interop into custom .NET data collection objects. VB.NET also lacks `unsafe` support to deal with pointers, so it will not be fun. – tcarvin Mar 14 '13 at 14:11
  • @tcarvin: I'm evaluating possibilities, not optimising prematurely. :) So you are saying that in managed code it's not possible to save the internal "memory" state of an object? – gpinkas Mar 14 '13 at 16:53
  • You can pin it in a couple of ways so it won't be moved and then access the backing memory, but you need more than the backing memory of a single object: you need that of the larger container data structure. I'd think that would be implemented by the data structure allocating its storage and that of any children in a contiguous block, and you are not going to get that behavior in .NET unless you do it yourself from scratch. – tcarvin Mar 14 '13 at 19:43
  • @tcarvin: Maybe you'd care to point out such a concept in an answer? This sounds like a possible solution. – gpinkas Mar 18 '13 at 14:24

2 Answers

As I commented, I still think any benefit you get is going to be outweighed by the effort to implement it. I recommend looking into something like Berkeley DB or HamsterDB, both of which I googled up in a couple of minutes.

If you really want to hand-implement something, then instead of writing your own memory manager with the Marshal class to ensure that your data structures are stored in a contiguous block that can be marshaled quickly to and from a backing file, I would look at using a persistent memory-mapped file. You can then use CreateViewAccessor to get a MemoryMappedViewAccessor instance that can be used to read and write primitives and structures. Combine this with any college textbook or open source implementation of a B-Tree or something similar and you could get something working.
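As a rough illustration of the memory-mapped route, here is a minimal C# sketch of writing and reading back a value type through a MemoryMappedViewAccessor on a persisted file. The `Entry` struct, the `MappedStoreSketch` class, the 4096-byte capacity, and the layout (record count at offset 0, first record at offset 8) are all made up for the example. It only shows the accessor round trip and is nowhere near a B-Tree or a full string dictionary.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical fixed-size record; a real index would define its own layout.
public struct Entry
{
    public int KeyHash;
    public long ValueOffset;
}

public static class MappedStoreSketch
{
    // Writes one record to a file-backed mapping, reopens the file, and reads it back.
    public static Entry RoundTrip(string path, Entry entry)
    {
        const long capacity = 4096; // assumed size, large enough for header + one record
        int count = 1;

        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Create, null, capacity))
        using (var accessor = mmf.CreateViewAccessor())
        {
            accessor.Write(0, count);      // header: number of records
            accessor.Write(8, ref entry);  // first record, copied as raw bytes
        }

        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var accessor = mmf.CreateViewAccessor())
        {
            if (accessor.ReadInt32(0) != 1)
                throw new InvalidDataException("Unexpected record count.");

            Entry loaded;
            accessor.Read(8, out loaded);  // no per-field parsing, just a struct copy
            return loaded;
        }
    }
}
```

The accessor only works with value types that contain no references, so strings would have to be stored as raw bytes at offsets that the records point to.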

tcarvin

If you really want to keep it simple for this particular task, I would do it this way:

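// XmlSerializer only handles public types, so the helper class is public.
public class Pair {
    public string Key { get; set; }
    public string Value { get; set; }
}

var dict = new Dictionary<string, string>() {
    {"asd", "zxc"},
    {"111", "zzs"}
}; // Populate your dictionary somehow
// Materialize the pairs into a List<Pair>, which can be serialized directly
var list = dict.Select( p => new Pair() { Key = p.Key, Value = p.Value } ).ToList();
// Then XML-serialize this list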

To read it:

// Deserialize the list from XML
dict = list.ToDictionary( p => p.Key, p => p.Value );

A side advantage is that you can modify the stored list with just a text editor.
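The serialization steps left as comments above could be filled in roughly as follows. This is a sketch using XmlSerializer; the `XmlDictionaryStore` class and its `Save`/`Load` method names are invented for the example. Note that loading still inserts every pair into a new Dictionary one by one, so this does not restore the internal state in one go.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Must be public with a parameterless constructor for XmlSerializer.
public class Pair
{
    public string Key { get; set; }
    public string Value { get; set; }
}

// Hypothetical helper; the names are illustrative only.
public static class XmlDictionaryStore
{
    public static void Save(Dictionary<string, string> dict, string path)
    {
        // Flatten the dictionary into a serializable list of pairs.
        var list = dict.Select(p => new Pair { Key = p.Key, Value = p.Value }).ToList();
        var serializer = new XmlSerializer(typeof(List<Pair>));
        using (var stream = File.Create(path))
        {
            serializer.Serialize(stream, list);
        }
    }

    public static Dictionary<string, string> Load(string path)
    {
        var serializer = new XmlSerializer(typeof(List<Pair>));
        using (var stream = File.OpenRead(path))
        {
            var list = (List<Pair>)serializer.Deserialize(stream);
            // Rebuilds the hash table entry by entry.
            return list.ToDictionary(p => p.Key, p => p.Value);
        }
    }
}
```

Usage would be along the lines of `XmlDictionaryStore.Save(dict, "translations.xml")` once, then `XmlDictionaryStore.Load("translations.xml")` on each run; the file name is just a placeholder.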

Andrey
  • Well thanks, but introducing a whole database system for this task is a bit too much for my use case. – gpinkas Mar 14 '13 at 10:53
  • @gpinkas I didn't read your question properly, see my update. – Andrey Mar 14 '13 at 11:07
  • That looks better. I still hope for a faster solution though; my intention is to improve loading speed. Maybe this is already as far as it will go using only the built-in .NET data structures. – gpinkas Mar 14 '13 at 12:07
  • @gpinkas improve your speed by finding bottlenecks, not by premature optimisation. – Andrey Mar 14 '13 at 16:25
  • Of course, but I want to evaluate possible optimisations, also in terms of development time. If I had to implement this completely from scratch with an uncertain outcome, it's hardly viable. But if someone has already built something similar, I would look further into this optimisation. – gpinkas Mar 14 '13 at 16:37
  • @gpinkas it is a trivial task to implement it manually in case it takes too long. Just store it in 'key=value' format and you are done. It can be implemented in 10 minutes; we have already spent more time discussing it. – Andrey Mar 14 '13 at 16:54
  • No, it's not, because to access items by `my_dict[key]`, you still have to feed the values into a dictionary. I think my question is clear in this aspect. – gpinkas Mar 14 '13 at 16:58