2

In this and this posts I describe the framework that I want to develop.

I want to implement a memoization strategy to speeedup some functions execution.

An important feature of this framework should be to "remember" the computed values of past runs: let's suppose we write a program where we execute a word count function f with the big text t as input. After we have computed r=f(t) (where r is the result), we follow the memoization logic, so we store (t,r) somewhere, let's say an unordered_map object um. After this, the program terminates.

At the next execution of the same program, the expensive execution of fisn't necessary, since um contains already (t,r), so the value r is returned.

The problem in all of this is how to "remember" um state in different executions.

IMPORTANT: Obviously this is an example to make you understand a possible application of this framework, but any memoization application must work as well.

Write a c++ object to file is the only solution for this problem/scenario, or there is an other one?

Community
  • 1
  • 1
justHelloWorld
  • 6,478
  • 8
  • 58
  • 138
  • 1
    Save the information using the file system and read it in during later runs. – lcs Apr 19 '16 at 13:52
  • Can you please explain better your solution? Consider the IMPORTANT section and don't focus too much on the provided example (which is reported only to make understand this kind of memoization). – justHelloWorld Apr 19 '16 at 13:57
  • Well you are going to have to store the data somewhere. You could use a DB, a file, some sort of cloud storage or something else but the data is going to have to be stored somewhere persistent. – NathanOliver Apr 19 '16 at 14:08

2 Answers2

1

The data must be stored somewhere outside the process in order to keep it after the process terminates. The file system is an obvious place to store the data, but it can also be stored by another process, such as a database management system, or an external cache (ex. https://memcached.org/).

Most backend options store the data as a character stream. You will need to understand how to (de)serialize the data.

eerorika
  • 232,697
  • 12
  • 197
  • 326
1

Side-note: Can you keep the program running? Then you dont need to store the memos on disk.

Regardless of where you store the memos:

If this is really for word-count then forget memoization: You would need to read the whole text to find out if it is identical to a previous text. And then you could just count the words on the fly anyway.

If instead of word-count, that was just an example for some much more complicated computation: Instead of storing text -> computed-value memos you can store hash -> computed-value memos. You choose a hash depending on your situation: If you need to be somewhat sure that it is the same text, then use e.g. sha-1. If you need a really cheap 'hash' (e.g. when the stored computed-value serves only as an initial guess which will be improved upon anyway) then you could just use the number of bytes in the text-string as 'hash'.

Bernd Elkemann
  • 23,242
  • 4
  • 37
  • 66