164

I would like to know what the actual difference between caching and memoization is.
As I see it, both involve avoiding repeated function calls to get data by storing it.

What's the core difference between the two?

John Smith
  • 7,243
  • 6
  • 49
  • 61
John
  • 1,643
  • 2
  • 10
  • 4
  • 1
    I wonder if you could say "memoization is to caching" as "array is to sparse array". In other words, you only store things "on demand" rather than enumerating every possible input combination. – Sridhar Sarnobat Sep 20 '17 at 01:01
  • It's one of those jargon things that's a bit annoying, as they're very much the same thing. Generally though, memoization term is used when discussing language support and algorithm optimization. – Blaze Sep 08 '22 at 17:38

5 Answers5

164

Memoization is a specific form of caching that involves caching the return value of a function based on its parameters.

Caching is a more general term; for example, HTTP caching is caching but not memoization.

Wikipedia says:

Although related to caching, memoization refers to a specific case of this optimization, distinguishing it from forms of caching such as buffering or page replacement.

smci
  • 32,567
  • 20
  • 113
  • 146
SLaks
  • 868,454
  • 176
  • 1,908
  • 1,964
  • 3
    but you can always surrond the part where cache is used with a function and christened it 'memoization'. although the difference is you are in control of the caching policy in your function, whereas memoization is higher order and happens outside the function I guess. – nicolas May 01 '14 at 08:41
  • 2
    Why is HTTP Caching not memorisation? that also is based on the parameter (the URL of the resource requested). – Нет войне Mar 29 '18 at 19:05
  • 1
    @topomorto: Because of features like `If-Match` and expirations. Memoization only makes sense for pure function, which HTTP rarely is. – SLaks Mar 29 '18 at 20:27
  • 1
    @nicolas, not quite, i think. I think that in *memoization* the term "function" is used in pure/mathematical sense. Downloading a web page from a given address cannot be considered a function, because it may happen that the page changes. – Alexey Jul 07 '18 at 15:32
  • @Alexey doesn't the same remark applies to caching ? all those strategies relies on the same function call giving the same result, aka no *upstream* side effect. – nicolas Jul 13 '18 at 07:24
  • @nicolas, it is quite common to cache changing data AFAK, see [Cache invalidation](https://en.wikipedia.org/wiki/Cache_invalidation) Wikipedia article for example. I do not understand what you mean by an "upstream side effect". – Alexey Jul 14 '18 at 12:57
  • @Alexey indeed, that's my point. by "upstream side effect" I mean side effect which affect your own inputs, and thus might change your result VS, say, printing to a debug console : unpure, yet wont change your result. – nicolas Jul 15 '18 at 16:23
  • @nicolas, sorry, i do not understand what you are talking about (side effects affect *inputs*?). What "function" are you talking about? – Alexey Jul 15 '18 at 17:07
  • @Alexey That's my point : the remark is parametric in whichever notion of "function" you might want to think of. Just use the *same* notion for memoization and for caching when comparing both. (for "inputs", consider the fact that pure does not mean deterministic. whatever changes your result, deterministic or not, can be quite naturally seen as an input, as it would be if the function was pure....) – nicolas Jul 16 '18 at 17:59
  • @nicolas, i am talking about ordinary functions. There is no function that can associate a web page to a web address, because web pages change. – Alexey Jul 16 '18 at 18:44
  • @Alexey I agree, and it is a useful distinction to make between function in the mathematical sense, arrows in the category Sets. Half of CS is about recovering a function-like behavior by restricting what's written or by creating advanced type wizardry. but that seems an orthogonal consideration to me – nicolas Jul 18 '18 at 16:48
60

As I have seen them used, "memoization" is "caching the result of a deterministic function" that can be reproduced at any time given the same function and inputs.

"Caching" includes basically any output-buffering strategy, whether or not the source value is reproducible at a given time. In fact, caching is also used to refer to input buffering strategies, such as the write-cache on a disk or memory. So it is a much more general term.

harpo
  • 41,820
  • 13
  • 96
  • 131
  • Are you sure that the function has to be deterministic? – Gherman Oct 25 '16 at 06:59
  • 6
    @German, yes, memoization depends on determinism. The classic example is a recursive algorithm, such as the Fibonacci sequence or factorial. Instead of re-computing all the way down to the base case, a memoized function would short-circuit by reusing earlier results for values it's already computed. This obviously depends on the same input always yielding the same output, which is the definition of determinism. Caching, on the other hand, is frequently used for non-deterministic (e.g. random or timestamped) processes, with the understanding that results may not match a "refreshed" value. – harpo Oct 26 '16 at 16:31
6

I think term caching is usually used when you store results of IO operations, or basically any data that is coming to you from the outside (files, network, db queries). Term memoization usually applies to storing results of your own computations, for example in the context of dynamic programming.

MK.
  • 33,605
  • 18
  • 74
  • 111
1

Memoization is a special form of caching the result of a deterministic function. This means that caching the result outside the function is not memoization because the function would have to mutate the cache when computing a new result (not already in the cache) so it would not be a (pure) function anymore. Memoization generally implies passing the cache as an additional argument (in an helper function). Memoization will optimize functions that need to compute values several times for a single access. Caching will optimize functions that are called several times with the same parameters. In other words, Memoization will optimize the first access whether caching will only optimize recurrent accesses.

1

I would like to add to the other great answers that memoization is also known as tabling. I think it is also important to know that term for those who learn what memoization and caching are.

V. S.
  • 1,086
  • 14
  • 14