6

I have the following problem.

I am developing a stochastic simulator which samples configurations of the system at random and stores the statistics of how many times each configuration has been visited at certain time instances. Roughly the code works like this

f[_Integer][{_Integer ..}] := 0
...
someplace later in the code, e.g.,
index = getIndex[];  (* hypothetical helper: pick the current time instance *)
c = getRandomConfiguration[];  (* hypothetical helper: a tuple of integers, say a pair {n1, n2} *)
f[index][c] = f[index][c] + 1;
which records that configuration c has occurred once more in the simulation at time instance index.

Once the code has finished, there is a list of definitions for f that looks something like this (typed by hand just to emphasize the most important parts):

?f
f[1][{1, 2}] = 112
f[1][{3, 4}] = 114
f[2][{1, 6}] = 216
f[2][{2, 7}] = 227
...
f[index][someconfiguration] = some value
...
f[_Integer][{_Integer ..}] := 0

Please note that the pattern-free definitions that come first can be rather sparse. Also, one cannot know in advance which values and configurations will be picked.

The problem is to efficiently extract the definitions for a desired index, for example by issuing something like

result = ExtractConfigurationsAndOccurences[f, 2] 

which should give a list with the structure

result = {list1, list2}

where

list1 = {{1, 6}, {2, 7}} (* the list of configurations that occurred during the simulation*)
list2 = {216, 227} (* how many times each of them occurred *)

The problem is that ExtractConfigurationsAndOccurences should be very fast. The only solution I could come up with was to use SubValues[f] (which gives the full list) and filter it with a Cases statement. I realize that this procedure should be avoided at all costs, since there will be exponentially many configurations (definitions) to test, which slows the code down considerably.
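For concreteness, a sketch of this naive procedure might look like the following (illustrative only; it scans the full SubValues list, which is exactly the linear pass I would like to avoid):

```mathematica
(* Naive extraction: scan the full list of SubValues and keep only the
   definitions for the given index. The Verbatim[HoldPattern] wrapper
   matches the held left-hand sides as they are stored; the catch-all
   definition f[_Integer][{_Integer ..}] does not match conf : {__Integer}
   and is therefore skipped automatically. *)
ExtractConfigurationsAndOccurences[f_Symbol, index_Integer] :=
  If[# === {}, {{}, {}}, Transpose[#]] &[
    Cases[SubValues[f],
      (Verbatim[HoldPattern][f[index][conf : {__Integer}]] :> value_) :>
        {conf, value}]];
```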

Is there a natural way in Mathematica to do this in a fast way?

I was hoping that Mathematica would see f[2] as a single head with many down values, but DownValues[f[2]] gives nothing, and SubValues[f[2]] results in an error.

zorank
    It is interesting to note there is no doc page for `SubValues`, nor for `NValues`, `FormatValues` and `DefaultValues`. There is one for `UpValues`, `DownValues` and `OwnValues`. Not sure whether we should conclude from this that we aren't supposed to use `SubValues`. – Sjoerd C. de Vries Aug 31 '11 at 12:34
    @Sjoerd The only reason to not use `SubValues` would be if one could not exclude the possibility that their support could be dropped in the future. But I would bet anything that this would never happen. – Leonid Shifrin Aug 31 '11 at 14:07
  • @Sjoerd `SubValues` rule! (And may contain `Rules`!) – telefunkenvf14 Sep 02 '11 at 17:15

4 Answers

7

This is a complete rewrite of my previous answer. It turns out that in my previous attempts I overlooked a much simpler method, based on a combination of packed arrays and sparse arrays, that is much faster and more memory-efficient than all previous methods (at least in the range of sample sizes where I tested it), while only minimally changing the original SubValues-based approach. Since the question asked for the most efficient method, I have removed the other ones from the answer, given that they are quite a bit more complex and take a lot of space; those who would like to see them can inspect past revisions of this answer.

The original SubValues-based approach

We start by introducing a function to generate the test samples of configurations for us. Here it is:

Clear[generateConfigurations];
generateConfigurations[maxIndex_Integer, maxConfX_Integer, maxConfY_Integer, 
  nconfs_Integer] :=
Transpose[{
  RandomInteger[{1, maxIndex}, nconfs],
  Transpose[{
     RandomInteger[{1, maxConfX}, nconfs],
     RandomInteger[{1, maxConfY}, nconfs]
  }]}]; 

We can generate a small sample to illustrate:

In[3]:= sample  = generateConfigurations[2,2,2,10]
Out[3]= {{2,{2,1}},{2,{1,1}},{1,{2,1}},{1,{1,2}},{1,{1,2}},
          {1,{2,1}},{2,{1,2}},{2,{2,2}},{1,{2,2}},{1,{2,1}}}

We have here only 2 indices, and configurations where both the "x" and "y" numbers vary from 1 to 2 only; there are 10 such configurations.

The following function will help us imitate the accumulation of frequencies for configurations, as we increment SubValues-based counters for repeatedly occurring ones:

Clear[testAccumulate];
testAccumulate[ff_Symbol, data_] :=
  Module[{},
   ClearAll[ff];
   ff[_][_] = 0;
   Do[
     doSomeStuff;
     ff[#1][#2]++ & @@ elem;
     doSomeMoreStuff;
   , {elem, data}]];

The doSomeStuff and doSomeMoreStuff symbols are here to represent some code that might precede or follow the counting code. The data parameter is supposed to be a list of the form produced by generateConfigurations. For example:

In[6]:= 
testAccumulate[ff,sample];
SubValues[ff]

Out[7]= {HoldPattern[ff[1][{1,2}]]:>2,HoldPattern[ff[1][{2,1}]]:>3,
   HoldPattern[ff[1][{2,2}]]:>1,HoldPattern[ff[2][{1,1}]]:>1,
   HoldPattern[ff[2][{1,2}]]:>1,HoldPattern[ff[2][{2,1}]]:>1,
   HoldPattern[ff[2][{2,2}]]:>1,HoldPattern[ff[_][_]]:>0}

The following function will extract the resulting data (indices, configurations and their frequencies) from the list of SubValues:

Clear[getResultingData];
getResultingData[f_Symbol] :=
   Transpose[{#[[All, 1, 1, 0, 1]], #[[All, 1, 1, 1]], #[[All, 2]]}] &@
        Most@SubValues[f, Sort -> False];

For example:

In[10]:= result = getResultingData[ff]
Out[10]= {{2,{2,1},1},{2,{1,1},1},{1,{2,1},3},{1,{1,2},2},{2,{1,2},1},
{2,{2,2},1},{1,{2,2},1}}

To finish with the data-processing cycle, here is a straightforward function to extract data for a fixed index, based on Select:

Clear[getResultsForFixedIndex];
getResultsForFixedIndex[data_, index_] := 
  If[# === {}, {}, Transpose[#]] &[
    Select[data, First@# == index &][[All, {2, 3}]]];

For our test example,

In[13]:= getResultsForFixedIndex[result,1]
Out[13]= {{{2,1},{1,2},{2,2}},{3,2,1}}

This is presumably close to what @zorank tried, in code.

A faster solution based on packed arrays and sparse arrays

As @zorank noted, this becomes slow for larger samples with more indices and configurations. We will now generate a large sample to illustrate that (note: this requires about 4-5 GB of RAM, so you may want to reduce the number of configurations if this exceeds the available RAM):

In[14]:= 
largeSample = generateConfigurations[20,500,500,5000000];
testAccumulate[ff,largeSample];//Timing

Out[15]= {31.89,Null}

We will now extract the full data from the SubValues of ff:

In[16]:= (largeres = getResultingData[ff]); // Timing
Out[16]= {10.844, Null}

This takes some time, but one has to do it only once. When we start extracting data for a fixed index, however, we see that it is quite slow:

In[24]:= getResultsForFixedIndex[largeres,10]//Short//Timing
Out[24]= {2.687,{{{196,26},{53,36},{360,43},{104,144},<<157674>>,{31,305},{240,291},
 {256,38},{352,469}},{<<1>>}}}

The main idea we will use here to speed this up is to pack the individual lists inside largeres: those for indices, combinations, and frequencies. While the full list cannot be packed, these parts individually can:

In[18]:= Timing[
   subIndicesPacked = Developer`ToPackedArray[largeres[[All,1]]];
   subCombsPacked =  Developer`ToPackedArray[largeres[[All,2]]];
   subFreqsPacked =  Developer`ToPackedArray[largeres[[All,3]]];
]
Out[18]= {1.672,Null}

This also takes some time, but it is a one-time operation again.

The following functions will then be used to extract the results for a fixed index much more efficiently:

Clear[extractPositionFromSparseArray];
(* Relies on the undocumented internal structure of SparseArray:
   part [[4, 2, 2]] of the argument list holds the positions of the
   non-zero elements. *)
extractPositionFromSparseArray[HoldPattern[SparseArray[u___]]] := {u}[[4, 2, 2]]

Clear[getCombinationsAndFrequenciesForIndex];
getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_, 
    packedFreqs_, index_Integer] :=
With[{positions = 
         extractPositionFromSparseArray[
               SparseArray[1 - Unitize[packedIndices - index]]]},
  {Extract[packedCombs, positions],Extract[packedFreqs, positions]}];

Now, we have:

In[25]:=  
getCombinationsAndFrequenciesForIndex[subIndicesPacked,subCombsPacked,subFreqsPacked,10]
  //Short//Timing

Out[25]= {0.094,{{{196,26},{53,36},{360,43},{104,144},<<157674>>,{31,305},{240,291},
{256,38},{352,469}},{<<1>>}}}

We get a 30-times speed-up with respect to the naive Select-based approach.

Some notes on complexity

Note that the second solution is faster because it uses optimized data structures, but its complexity is the same as that of the Select-based one: linear in the total number of unique combinations for all indices. Therefore, in theory, the previously discussed solutions based on a nested hash table etc. may be asymptotically better. The problem is that in practice we will probably hit the memory limitations long before that. For the 10-million-configuration sample, the above code was still 2-3 times faster than the fastest solution I posted before.

EDIT

The following modification:

Clear[getCombinationsAndFrequenciesForIndex];
getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_, 
    packedFreqs_, index_Integer] :=
 With[{positions =  
          extractPositionFromSparseArray[
             SparseArray[Unitize[packedIndices - index], Automatic, 1]]},
    {Extract[packedCombs, positions], Extract[packedFreqs, positions]}];

makes the code twice as fast again. Moreover, for more sparsely distributed indices (say, calling the sample-generation function with parameters like generateConfigurations[2000, 500, 500, 5000000]), the speed-up with respect to the Select-based function is about 100 times.
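As Sjoerd C. de Vries suggests in the comments below, one can also bypass the SparseArray machinery entirely with Pick, which is very fast on packed arrays. A sketch of that variant (per his comment):

```mathematica
(* Pick-based variant, per Sjoerd C. de Vries's comment: Pick selects the
   elements of its first argument at the positions where the second
   argument matches the third. All three counter lists are packed,
   so this is very fast. *)
Clear[getCombinationsAndFrequenciesForIndexPick];
getCombinationsAndFrequenciesForIndexPick[packedIndices_, packedCombs_,
    packedFreqs_, index_Integer] :=
  {Pick[packedCombs, packedIndices, index],
   Pick[packedFreqs, packedIndices, index]};
```

He reports this to be simpler and almost twice as fast as the Unitize-based version.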

Leonid Shifrin
  • I won't pretend I fully understand what the newest code is doing, but I have a question. What's the purpose of `getValues` inside the `Module`? I see that it is a modification of an external variable, but I don't see what it's supposed to be doing. – rcollyer Aug 31 '11 at 14:39
    @rcollyer The newest code is actually not hard at all, I just wasn't able to explain it well. All you do is to take the idiom of building a linked list, and insert it inside a pure function, to which `f[index]` will evaluate for all subsequent calls. I use that heads are computed before anything else, so when the first time we hit a given index in expression like `f[5][{{1,2},3}]`, the memoized code is executed, and `f[5]` is memoized as a pure function. We then evaluate `Function[...][{{1,2},3}]`, and the same for the subsequent calls to `f[5]`. The next question is how to make the pure ... – Leonid Shifrin Aug 31 '11 at 14:57
    @rcollyer ... function modify the linked list. The answer is to embed the linked-list building code in it. The next question is how to localize the `storage` variable, so that it is not exposed to the user *and* a new variable is created for every new index. The answer is to wrap memoization code in `Module`. The final question is how to get the value of our local variable `storage` at the end -since it is not exposed. The answer is to define a global accessor function (getter method) at the memoization time - this is why we need `getValues`. It gives access to, but can not modify, `storage`. – Leonid Shifrin Aug 31 '11 at 15:00
  • Thanks. Need to digest it a bit while looking closely at the code to finish putting it together, though. – rcollyer Aug 31 '11 at 15:21
    @rcollyer I just discovered though that the code in question is not up to the specs, since it does not allow to access the frequencies of configurations and increment them, when a new configuration is produced. All we can do within this approach is to collect all configurations and then use `Tally` on the result, but this will be quite memory-inefficient, since instead of storing the configuration just once plus the counter, we will have to store it the number of times equal to its total frequency. – Leonid Shifrin Aug 31 '11 at 15:36
  • @Leonid: Many thanks. This was way above my head, so I have to study it. Just a small question: how would one take care of `f[index][c] = f[index][c] + 1;` commands? In particular, I worry about the initialization part: somehow these $nm (temporary) symbols have to be defined in such a way that initially all of them are zero, but the problem is that I do not know a priori which configurations will get sampled during runtime. – zorank Aug 31 '11 at 18:07
  • @Leonid: also the problem is with the `sample =Table[{RandomInteger[{1,5}],RandomInteger[{1,100},2],...` command as the sample table would have to constructed incrementally (in a While loop) which would lead to quadratic complexity. Is there a way to build such a table where adding an element (triple) would be constant time per addition? – zorank Aug 31 '11 at 18:12
  • @Leonid: now I know what is bothering me exactly. In the code there would have to be a line `$22[{_Integer..}] := 0` and likewise for each temporary symbol, so that values are accumulated by calls `$22[c] = $22[c]+1` and everything starts from zero. This might be a problem since one would have to extract pattern-free down values for `$22`, which would require testing an exponentially long list (in principle). – zorank Aug 31 '11 at 18:30
  • @zorank To answer the first question: adding an element *is constant time* in the nested hash approach. I just illustrated it with a `Table`, but you don't need it - whenever you have a configuration, just use `f[someindex][someConf] = someValue`, and it will be added (in fact, I did mention this in the text). – Leonid Shifrin Aug 31 '11 at 18:53
  • @zorank I just answered your second question with an explicit example - please see the edit #3. You should not worry about complexity in `DownValues` extraction - it should be ok. – Leonid Shifrin Aug 31 '11 at 19:18
  • @zorank I can not spend any more time on it today, but I will do a rewrite of the answer tomorrow. I have also a better code for my last solution. My conclusions so far: the first solution (based on nested hash table) can be about 4 times faster than the original one (based on `SubValues` and `Select`). The second solution takes up more memory than the first, but can be about `4` times faster still, and therefore about `16` times faster than the original. I still have to confirm the timings on M8, since was using M7 so far. – Leonid Shifrin Aug 31 '11 at 19:36
  • @zorank Please note that in my latest edit, there was a bug in the accumulator generator function: I forgot to change `f` to `accumName` in one place, inside `With`. Fixed now. – Leonid Shifrin Aug 31 '11 at 20:28
  • @Leonid: Many, many thanks. "The price we pay here is that the configurations and their results (frequencies?) are only available to us wholesale...": This is exactly how the code is supposed to work. There is no point in investigating an individual configuration, only the full set makes physical sense. I did the homework, i.e. studied and understood your code, and I think there is a way to incrementally increase accumulators and actually use the fastest form. Would like to test it before posting it... – zorank Sep 01 '11 at 13:46
  • @zorank I do have this code in the revised code, am working on putting the entire answer together. If you wait for another hour or so, you will see a revised answer. We could merge the solutions then and find the best to keep. – Leonid Shifrin Sep 01 '11 at 14:28
  • @zorank I found a faster way derived from your original `SubValues` - based method, but exploiting packed and sparse arrays. It gives about 30x speedup with respect to the straightforward code. I removed the older solutions since they are slower, and rather complex. You can see them in revisions if you still need them. – Leonid Shifrin Sep 01 '11 at 17:58
  • getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_, packedFreqs_, index_Integer] := {Pick[packedCombs, packedIndices, index], Pick[packedFreqs, packedIndices, index]} is much simpler and almost twice as fast. – Sjoerd C. de Vries Sep 01 '11 at 20:12
  • @Sjoerd Wow, that's impressive. Thanks for mentioning this! This did not cross my mind, but in the retrospect that makes sense, since we may expect `Pick` to be very fast on packed arrays. Can not test right now since I am again on the slow machine with M7, will benchmark tomorrow. Please feel free to edit my post and add that, or I will do that tomorrow. In any case, I did not yet explore some corners of the parameter space for this problem. For example, I expect that for a huge number of indices and few configurations for each, some of my past suggestions can outperform the latest one. – Leonid Shifrin Sep 01 '11 at 20:33
  • Bypassing the whole counting stuff I generated a few test lists thus: ``n = 100000000; subIndicesPacked = Developer`ToPackedArray[RandomInteger[{0, 1000}, n]]; subCombsPacked = Developer`ToPackedArray[RandomInteger[{0, 100}, {n, 2}]]; subFreqsPacked = Developer`ToPackedArray[RandomInteger[{0, 100}, n]];`` Depending on the sparseness of the index (the 1000 in the first array) I get performance gains ranging from 2 - 5. I'm done for today. See you tomorrow. – Sjoerd C. de Vries Sep 01 '11 at 21:00
  • @Leonid: Many thanks again! If I understand correctly, have to understand the newest code yet, the main idea is to extract subvalues, once the simulation is done, and pack them into efficient data structures. This enforces somehow that the simulation should be used in this particular way. For example, one could try to add incrementally in chunks few more "runs" or "trajectories" to get more data to sample. One time procedure of packing into data structures would have to be repeated possibly many times. Somehow the latest old solution with h[] header seems more attractive. I am not sure... – zorank Sep 01 '11 at 22:02
  • @zorank Indeed, this *looks* less flexible. However, you can measure the time it takes to *accumulate* the results, and see that for my solution with `h`, it is several times more. This is because the solution with `h` induces time overhead *during* the accumulation of results. We could modify the latest suggestion so that it periodically appends these efficient data structures in smaller chunks. The main point is that it is so much faster, especially with the latest additions by @Sjoerd. Anyways, I have a reworked version of that solution with `h`. If you would like, I can make an ...contd – Leonid Shifrin Sep 01 '11 at 22:18
  • @zorank ... edit tomorrow and include it (I am done for today). – Leonid Shifrin Sep 01 '11 at 22:19
  • @Sjoerd Sorry if I confused you - by latest one I meant not your suggestion with `Pick` in particular, but the whole idea used in my latest answer - while by past solutions I meant those based on nested hash table and hash table filled with closures + linked lists. Since the latest solution will always have a complexity proportional to the total number of unique configurations for all indices (albeit with a very small constant), while past ones have complexity proportional to the number of unique configurations *for a given index*, they may win for lots of indices and a few per each index. – Leonid Shifrin Sep 01 '11 at 22:26
  • @Leonid We have been commenting a bit out of phase. I was referring to my own answer where I mentioned a factor of two speed-up. In further testing with lists, as described in my comment above, I got a 2-5 times speed-up. – Sjoerd C. de Vries Sep 02 '11 at 06:55
  • @Sjoerd I see now. I should study your sparse array solution closer, it looks pretty interesting. I never tried to fill sparse array with sparse arrays. This seems somewhat similar in spirit to nested hash table, but of course details are different. Generally, this problem looks like a very good playground to test various interesting techniques. – Leonid Shifrin Sep 02 '11 at 08:35
  • @leonid correction: I'm adding to the confusion, I'm afraid. In the above comment the word 'answer' should have been 'comment'. So I was referring to the `Pick` alternative to your `Select` alternative built using the SparseArray/Unitize trick. I was not referring to my SparseArray answer. Sorry about that. – Sjoerd C. de Vries Sep 02 '11 at 09:05
5

I'd probably use SparseArrays here (see update below), but if you insist on using functions and *Values to store and retrieve values, an approach would be to have the first part (f[2] etc.) replaced by a symbol you create on the fly, like:

Table[Symbol["f" <> IntegerString[i, 10, 3]], {i, 11}]
(* ==> {f001, f002, f003, f004, f005, f006, f007, f008, f009, f010, f011} *)

Symbol["f" <> IntegerString[56, 10, 3]]
(* ==> f056 *)

Symbol["f" <> IntegerString[56, 10, 3]][{3, 4}] = 12;
Symbol["f" <> IntegerString[56, 10, 3]][{23, 18}] = 12;

Symbol["f" <> IntegerString[56, 10, 3]] // Evaluate // DownValues
(* ==> {HoldPattern[f056[{3, 4}]] :> 12, HoldPattern[f056[{23, 18}]] :> 12} *)

f056 // DownValues
(* ==> {HoldPattern[f056[{3, 4}]] :> 12, HoldPattern[f056[{23, 18}]] :> 12} *)

Personally I prefer Leonid's solution, as it's much more elegant, but YMMV.

Update

At the OP's request, about using SparseArrays:
Large SparseArrays take up a fraction of the size of standard nested lists. We can make f a large (100,000 entries) sparse array of sparse arrays:

f = SparseArray[{_} -> 0, 100000];
f // ByteCount
(* ==> 672 *)

(* initialize f with sparse arrays, takes a few seconds with f this large *)
Do[  f[[i]] = SparseArray[{_} -> 0, {100, 110}], {i,100000}] // Timing//First
(* ==> 18.923 *)

(* this takes about 2.5% of the memory that a normal array would take: *)
f // ByteCount
(* ==>  108000040 *)

ConstantArray[0, {100000, 100, 100}] // ByteCount
(* ==> 4000000176 *)

(* counting phase *)
f[[1]][[1, 2]]++;
f[[1]][[1, 2]]++;
f[[1]][[42, 64]]++;
f[[2]][[100, 11]]++;

(* reporting phase *)
f[[1]] // ArrayRules
f[[2]] // ArrayRules
f // ArrayRules

(* 
 ==>{{1, 2} -> 2, {42, 64} -> 1, {_, _} -> 0}
 ==>{{100, 11} -> 1, {_, _} -> 0}
 ==>{{1, 1, 2} -> 2, {1, 42, 64} -> 1, {2, 100, 11} ->  1, {_, _, _} -> 0}
*)

As you can see, ArrayRules makes a nice list of configurations and counts. This can be done for each f[i] separately or for the whole bunch together (last line).

Sjoerd C. de Vries
  • How would you use sparse arrays? This is an interesting possibility. I was thinking about that for a moment, but my feeling was that this would blow up memory since there would be many zeros to store, so I did not investigate this further. Also, I am not sure whether sparse arrays are fixed-size (non-extendable) objects... – zorank Aug 31 '11 at 17:54
  • Hmmm... this was smart! Actually, `Symbol["f" <> IntegerString[56, 10, 3]][{3, 4}] = 12;` would have to be something like `Symbol["f" <> IntegerString[56, 10, 3]][{3, 4}] = Symbol["f" <> IntegerString[56, 10, 3]][{3, 4}] + 1;`. I guess it would work but would it be fast? Also, how would I take care of the initialization. All values should be zero at the beginning, but I do not know which values will pop up during simulation. – zorank Aug 31 '11 at 17:56
  • I agree that initialization should be dealt with. `Do[Symbol["f" <> IntegerString[i, 10, 3]][{_Integer ..}] := 0, {i, 11}]` does that for the above example set. I guess you have to do that for the whole possible range of f values you think you'll possibly get. For incrementing you can use the ++ operator. Incrementing Symbol[...] instead of directly using the name itself is about 5 times slower. – Sjoerd C. de Vries Sep 01 '11 at 11:35
  • @Sjoerd The problem with using `SparseArray`s for counters is that this leads to a quadratic complexity *during* the result-accumulation stage, in the length of the list of unique cofigurations. The reason is the same as for `Append` and `Prepend`-based list accumulation: while `SparseArray` will change the *existing* counter value in constant time, it will need a linear (in the length of non-zero counter list) time to first increment a previously zero counter. I was not patient enough to wait until result accumulation finishes, in my experiments. Wanted to comment on that earlier. – Leonid Shifrin Sep 01 '11 at 18:39
  • @Leonid I agree. Whether or not that's a problem depends on how sparse these SparseArrays actually are. The fuller they become the slower incrementing from 0 will get; but then, a fully filled SparseArray doesn't make sense at all and you could just as well have used normal arrays which don't have this problem. – Sjoerd C. de Vries Sep 01 '11 at 19:04
  • @Sjoerd The problem is that, at least in my tests, this became quite slow even for reasonably sparse arrays. A 2-D (or N-D) sparse array keeps internally a flat list of non-zero elements (plus their positions encoded in certain way), so even when it is very sparse, it is the length of the total number of currently non-zero elements in the sparse array that matters (and enters that quadratic complexity), not a relative degree of its sparseness. Sorry if that was obvious. – Leonid Shifrin Sep 01 '11 at 19:13
  • @Leonid I know about the lists of elements. Didn't do any timings. Apparently, it's bad. Good to realize that. Don't understand what you mean with the relative degree of sparseness though. If it is low overall then the individual arrays must generally be low as well, isn't it? – Sjoerd C. de Vries Sep 01 '11 at 19:55
  • @Sjoerd Sorry, I must have been more clear. I mean that, if you have an 100x100 array with 100 non-zero elements, its "sparseness" is 1/100 (the fraction of the non-zero elements in all elements). The same "sparseness" will have an 1000x1000 array with 10000 non-zero elements, but for our purposes it will be much, much worse, since the copmplexity of frequency accumulation depends on the absolute number of non-zero elements. – Leonid Shifrin Sep 01 '11 at 20:26
  • @Leonid OK, I see. Did you see my latest comment below your own answer? – Sjoerd C. de Vries Sep 01 '11 at 20:30
1

In some scenarios (depending on the performance needed to generate the values), the following easy solution using an auxiliary list (f[i, 0]) may be useful:

f[_Integer][{_Integer ..}] := 0;
f[_Integer, 0] := Sequence @@ {};

Table[
  r = RandomInteger[1000, 2];
  f[h = RandomInteger[100000]][r] = RandomInteger[10];
  f[h, 0] = Union[f[h, 0], {r}];
  , {i, 10^6}];

ExtractConfigurationsAndOccurences[f_, i_] := {f[i, 0], f[i][#] & /@ f[i, 0]};

Timing@ExtractConfigurationsAndOccurences[f, 10]

Out[252]= {4.05231*10^-15, {{{172, 244}, {206, 115}, {277, 861}, {299,
 862}, {316, 194}, {361, 164}, {362, 830}, {451, 306}, {614, 
769}, {882, 159}}, {5, 2, 1, 5, 4, 10, 4, 4, 1, 8}}}
Dr. belisarius
  • @belisarius: the problem is that in the code that generates down values one needs to use a While loop, which means that the rs are not generated all at once, but incrementally. This implies that the command `f[h, 0] = Union[f[h, 0], {r}];` would have to be converted to something incremental like `AppendTo[f[h,0], r]`, which would lead to quadratic complexity. Though if Mathematica supported linked-list structures, this solution would be very elegant. – zorank Aug 31 '11 at 17:51
    @zorank see this http://stackoverflow.com/questions/5095505/mathematica-linked-lists-and-performance and this http://www.verbeia.com/mathematica/tips/HTMLLinks/Tricks_Misc_10.html (search for "linked lists" in the last one – Dr. belisarius Sep 01 '11 at 02:13
  • @belisarius +1 for referring to a page on my web site containing something I'd completely forgotten about (then again, I didn't write those tips, I just host them). – Verbeia Sep 02 '11 at 22:40
  • @Verbeia It is the new "click for my money" schema being deployed here :) – Dr. belisarius Sep 02 '11 at 22:47
0

Many thanks to everyone for the help provided. I have been thinking a lot about everybody's input, and I believe that in the simulation setup the following is the optimal solution:

SetAttributes[linkedList, HoldAllComplete];

temporarySymbols = linkedList[];

SetAttributes[bookmarkSymbol, Listable];

bookmarkSymbol[symbol_]:= 
   With[{old = temporarySymbols}, temporarySymbols= linkedList[old,symbol]];

registerConfiguration[index_]:=registerConfiguration[index]=
  Module[
   {
    cs = linkedList[],
    bookmarkConfiguration,
    accumulator
    },
    (* remember the symbols we generate so we can remove them later *)
   bookmarkSymbol[{cs,bookmarkConfiguration,accumulator}];
   getCs[index] := List @@ Flatten[cs, Infinity, linkedList];
   getCsAndFreqs[index] := {getCs[index],accumulator /@ getCs[index]};
   accumulator[_]=0;
   bookmarkConfiguration[c_]:=bookmarkConfiguration[c]=
     With[{oldCs=cs}, cs = linkedList[oldCs, c]];
   Function[c,
    bookmarkConfiguration[c];
    accumulator[c]++;
    ]
   ]

pattern = Verbatim[RuleDelayed][Verbatim[HoldPattern][HoldPattern[registerConfiguration [_Integer]]],_];

clearSimulationData :=
 Block[{symbols},
  DownValues[registerConfiguration]=DeleteCases[DownValues[registerConfiguration],pattern];
  symbols = List @@ Flatten[temporarySymbols, Infinity, linkedList];
  (*Print["symbols to purge: ", symbols];*)
  ClearAll /@ symbols;
  temporarySymbols = linkedList[];
  ]

It is based on Leonid's solution from one of the previous posts, combined with belisarius's suggestion to include extra indexing for configurations that have been processed. The previous approaches are adapted so that configurations can be naturally registered and extracted using more or less the same code. This kills two birds with one stone, since bookkeeping and retrieval are strongly interrelated.

This approach will work better in the situation when one wants to add simulation data incrementally (all curves are normally noisy, so one has to add runs incrementally to obtain good plots). The sparse array approach will work better when data are generated in one go and then analyzed, but I do not remember personally being in a situation where I had to do that.

Also, I was rather naive in thinking that data extraction and generation could be treated separately. In this particular case it seems one should keep both perspectives in mind. I profoundly apologise for bluntly dismissing any previous suggestions in this direction (there were a few implicit ones).

There are some open/minor problems that I do not know how to handle, e.g. when clearing the symbols I cannot clear heads like accumulator$164; I can only clear the subvalues associated with them. I have no clue why. Also, if With[{oldCs=cs}, cs = linkedList[oldCs, c]]; is changed into something like cs = linkedList[cs, c];, configurations are not stored; I have no clue why the second option does not work either. But these minor problems are well-defined satellite issues that can be addressed in the future. By and large, the problem seems solved thanks to the generous help from all involved.

Many thanks again for all the help.

Regards Zoran

p.s. There are some timings, but to understand what is going on I will append the code that is used for benchmarking. In brief, the idea is to generate lists of configurations and just Map through them, invoking registerConfiguration. This essentially simulates the data-generation process. Here is the code used for testing:

fillSimulationData[sampleArg_] :=MapIndexed[registerConfiguration[#2[[1]]][#1]&, sampleArg,{2}];

sampleForIndex[index_]:=
  Block[{nsamples,min,max},
   min = Max[1,Floor[(9/10)maxSamplesPerIndex]];
   max =  maxSamplesPerIndex;
   nsamples = RandomInteger[{min, max}];
   RandomInteger[{1,10},{nsamples,ntypes}]
   ];

generateSample := 
  Table[sampleForIndex[index],{index, 1, nindexes}];

measureGetCsTime :=((First @ Timing[getCs[#]])& /@ Range[1, nindexes]) // Max

measureGetCsAndFreqsTime:=((First @ Timing[getCsAndFreqs[#]])& /@ Range[1, nindexes]) // Max

reportSampleLength[sampleArg_] := StringForm["Total number of confs = ``, smallest accumulator length ``, largest accumulator length = ``", Sequence@@ {Total[#],Min[#],Max[#]}& [Length /@ sampleArg]]

The first example is relatively modest:

clearSimulationData;

nindexes=100;maxSamplesPerIndex = 1000; ntypes = 2;

largeSample1 = generateSample;

reportSampleLength[largeSample1];

Total number of confs = 94891, smallest accumulator length 900, largest accumulator length = 1000;

First @ Timing @ fillSimulationData[largeSample1] 

gives 1.375 secs, which I think is fast.

With[{times = Table[measureGetCsTime, {50}]}, 
 ListPlot[times, Joined -> True, PlotRange -> {0, Max[times]}]]

gives times around 0.016 secs, and

With[{times = Table[measureGetCsAndFreqsTime, {50}]}, 
 ListPlot[times, Joined -> True, PlotRange -> {0, Max[times]}]]

gives the same times. Now the real killer:

nindexes = 10; maxSamplesPerIndex = 100000; ntypes = 10;
largeSample3 = generateSample;
largeSample3 // Short
{{{2,2,1,5,1,3,7,9,8,2},<<92061>>,{3,8,6,4,9,9,7,8,7,2}},<<8>>,{{4,10,1,5,9,8,8,10,8,6},<<95498>>,{3,8,<<8>>}}}

reported as

Total number of confs = 933590, smallest accumulator length 90760, largest accumulator length = 96876

gives generation times of ca. 1.969 - 2.016 secs, which is unbelievably fast. I mean, this is like going through a gigantic list of ca. one million elements and applying a function to each element.

The extraction times for configs and {configs, freqs} are roughly 0.015 and 0.03 secs respectively.

To me this is a mind blowing speed I would never expect from Mathematica!

zorank
  • @Leonid: I do not know how the technical side of the site works so I posted this as "the final answer". I did not want to seal off your final input in any way. Everything above is still waiting for your comments, if you wish. – zorank Sep 02 '11 at 11:44
  • @everybody: Why is Flatten so damn fast? I would never expect that! – zorank Sep 02 '11 at 11:45
  • @everybody: Why are nested head constructs such as linkedList above so fast? – zorank Sep 02 '11 at 12:35
  • I will soon post my final rather comprehensive analysis where I also will include the modified version of my previous solution. It may not make much sense given that you already settled up on something similar, but I spent way too much time on this to just bury it now ) – Leonid Shifrin Sep 02 '11 at 12:44
  • Your version is perhaps better than what I was going to post for that part. Note however that your use of `accumulator` variable degenerates it in many ways to the very first solution I posted, the one based on nested hash-table. The problem of the latter was slow DownValues extraction for large accumulator length, but your lengths (around 1000) are too small to observe that. The reason why we need `With` is that we give `linkedList` a `HoldAllComplete` attribute, therefore there is no other way to put a value inside (ok, you could also have used `Append[linkedList[old],value]`). – Leonid Shifrin Sep 02 '11 at 13:16
  • @Leonid: Interesting, do I understand correctly that `Append[linkedList[old],value]` would not have quadratic complexity, i.e. the Append command would not copy (recreate) the `old` part? This syntax is much better since it is clearer (I think). – zorank Sep 02 '11 at 15:43
  • @Leonid: "The problem of the latter was slow DownValues extraction for large accumulator length". But if my understanding is correct one cannot avoid this stage. I do not think there is any other way that is equally fast for accumulating counts, or perhaps I am wrong I do not know? Given that I am right, one is forced to look into these downvalues at some point otherwise they would be generated and never used. Also, in the last example there are ca 100000 configs in a typical downvalue list. – zorank Sep 02 '11 at 15:47
  • @Leonid: Also, the use of DownValues is avoided in a direct way by using the Map command, since careful bookkeeping is done for the configurations that have been processed (the list is without repetition). – zorank Sep 02 '11 at 16:16
  • `Append` does not have a quadratic complexity for a linked list, since we always append to an expression (`linkedList[old]`) containing a single element. All elements are internally pointers, of course. Since expressions are immutable, Mathematica copies a pointer to `old`, without traversing its parts. This is where we save. To answer the second question, my solution with linked list was doing exactly that - avoiding the use of `DownValues` to extract elements. As I mentioned, I do have a solution where you can still register individual configurations, for the price of some ... – Leonid Shifrin Sep 02 '11 at 16:26
  • ... memory overhead. Perhaps I will publish it soon. In fact, I am just developing an implementation of a hash-table with a fast keyset/valueset extraction, based on a mix of DownValues and Dispatch. If I have success with that, it could be used in your solution in place of `accumulator`, to speed up the key-value extraction for the price of *some* key - value insertion slowdown. – Leonid Shifrin Sep 02 '11 at 16:28
  • Alas, no success for a mixed hash-table attempt - despite all efforts, I get the same keyset-valueset extraction timing as for simple `DownValues`. I will publish my linked list - based solution at some point, it has somewhat better characteristics in terms of key-value set extraction times but is more memory - hungry. I cannot spend more time on it at the moment, however. – Leonid Shifrin Sep 02 '11 at 19:40
  • Thanks again for this tremendous effort on your side. Just one last thing, I think that this extraction issue is very much related to what one wants to do with it. In that sense I really liked your ideas about the nested head approach combined with encapsulation. I think it is rather generic and might find a way into any simulation code since it fits nicely into the setup where configs are added incrementally. Also, configs are extracted all at once to compute some statistics (the key-value part might not be that important in that sense, though it might be useful in other contexts). – zorank Sep 02 '11 at 20:28
  • ...and, also, in these simulations memory use is a key issue. This is what limits what one can do. In that sense one should opt for efficient storage of frequencies + speed if possible. My feeling is that it is easy to hurt one at the expense of the other. One should find some clever balance. For example, being sparse and the hash table probably does not work well memory-wise. – zorank Sep 02 '11 at 20:36