I would like to be able to do, for example:

pmap = persistent_map(function, iterable(args))
foo = map(bar, pmap)
baz = map(bar, pmap) # right-hand side intentionally identical to that of the previous line.

The interpreter would know to reset pmap before using it to construct baz.
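
To make the problem concrete: in Python 3, `map` returns a one-shot iterator, so a second pass over the same object yields nothing. A minimal sketch (the names `squares`, `first_pass`, and `second_pass` are illustrative only, not part of my application):

```python
# map() returns a one-shot iterator in Python 3.
squares = map(lambda x: x * x, range(3))

first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing left to yield

print(first_pass)   # [0, 1, 4]
print(second_pass)  # []
```

What I am after is an object for which the second pass would start over instead of coming back empty.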

I know that I could store the result of pmap as a tuple, list, or other structure (which is what I have done in my present application), but I don't want to have to do that because of the possibilities of 1) huge storage requirements (also consider copies) and 2) different results on reevaluation, as when iterable is produced from a dynamic file.

If the equivalent of persistent_map exists, what is the corresponding built-in feature or standard library feature? Alternatively, is there a third-party (hopefully reliable and readily available) persistent_map equivalent? If the only extant option is third-party, how might you create persistent_map using only built-in and, possibly, standard library features?

In response to @MartijnPieters' comment, "that's what generators are for": are you saying that one solution would be something like the following?

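from functools import partial

def persistent_map(function, iterable):
    # Return a callable; each call builds a fresh map over the same iterable.
    return partial(map, function, iterable)

# Intended usage (bar and pmap as in the example above):
#   foo = map(bar, pmap())
#   baz = map(bar, pmap())

# A toy example:
pmap = persistent_map(hex, range(3))
foo = map(len , pmap())
baz = map(hash, pmap())
print(*zip(pmap(), foo, baz))
# Output (string hash values vary between runs):
# ('0x0', 3, 982571147) ('0x1', 3, 982571146) ('0x2', 3, 982571145)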

In response to @kmiya ("Bit too late...") and also, now that I think I understand better what @MartijnPieters meant by "that's what generators are for," I would do it this way:

class PersistentMap:
    def __init__(self, func, generator):
        self.func = func
        self.generator = generator
    def __iter__(self):
        return map(
            self.func,
            # THIS IS HOW "The interpreter [knows] to reset:"
            self.generator()
        )

# AN EXAMPLE GENERATOR.
# NOTE THAT STORING "ALL" THE GENERATED VALUES IS IMPOSSIBLE.
def tictoc():
    k = 0
    while True:
        yield k
        k = 1 - k

pmap = PersistentMap(hex, tictoc)
bar = map(len, pmap)
baz = map(hash, pmap)
limiter = range(3)
print(*zip(pmap, bar, baz, limiter))
# ('0x0', 3, 9079185981436209546, 0) ('0x1', 3, -8706547739535565546, 1) ('0x0', 3, 9079185981436209546, 2)
print(*zip(pmap, bar, baz, limiter))
# ('0x0', 3, 9079185981436209546, 0) ('0x1', 3, -8706547739535565546, 1) ('0x0', 3, 9079185981436209546, 2)
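
One caveat about the listing above: `bar` and `baz` are ordinary `map` iterators and are not themselves reset between the two `print` calls. The two outputs match because `zip` pulls a fourth item from each of them before `limiter` runs out, which happens to realign the period-2 sequence. A sketch that bounds each pass explicitly with `itertools.islice` instead, so that nothing depends on that alignment (the names `first`, `second`, and `lengths` are illustrative):

```python
from itertools import islice

class PersistentMap:
    def __init__(self, func, generator):
        self.func = func
        self.generator = generator  # a callable that returns a fresh iterator

    def __iter__(self):
        # Calling the generator function restarts the underlying sequence.
        return map(self.func, self.generator())

def tictoc():
    # Infinite source: 0, 1, 0, 1, ...
    k = 0
    while True:
        yield k
        k = 1 - k

pmap = PersistentMap(hex, tictoc)

# islice() calls iter(pmap), so each pass starts from the beginning.
first = list(islice(pmap, 3))
second = list(islice(pmap, 3))
lengths = list(islice(map(len, pmap), 3))

print(first)    # ['0x0', '0x1', '0x0']
print(second)   # ['0x0', '0x1', '0x0']
print(lengths)  # [3, 3, 3]
```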
Ana Nimbus
  • What do these statements mean? 1: *The interpreter would know to reset* `pmap` *before using it to construct* `baz` 2: *I know that I could store the result of* `pmap`. Are you looking for a *Singleton*? – CristiFati Dec 07 '17 at 15:29
  • I read it as them wanting a generator that resets between uses. – Patrick Haugh Dec 07 '17 at 15:31
  • I think the interpretation of @PatrickHaugh is what I mean: I want "a generator that resets between uses." – Ana Nimbus Dec 07 '17 at 15:44
  • @AnaNimbus Why not replace `map(bar, pmap)` with `map(bar, map(function, args))`? – Patrick Haugh Dec 07 '17 at 15:46
  • @PatrickHaugh: Indeed, I misread the question then. The OP can't have what they want. – Martijn Pieters Dec 07 '17 at 15:49
  • The only *possible* implementation of a persistent map is to store the results in a list. `pmap = list(map(function, iterable(args)))`. You **have** to store either the function call results, or the contents of the iterable, in a permanent structure. If memory is an issue, then consider streaming out the results or the input iterable to a file first, and feed your functions from there. – Martijn Pieters Dec 07 '17 at 15:51
  • Case in point: `itertools.tee()` could produce separate iterators for `foo` and `baz`, but if you then iterate over all of `foo`, you'll end up with all results in memory until `baz` is iterated over. – Martijn Pieters Dec 07 '17 at 15:52
  • Re: "the only _possible_ implementation", @MartijnPieters, I don't want to store the result of `persistent_map`, only some indication of the underlying process, which should be repeated at some arbitrary future time. – Ana Nimbus Dec 07 '17 at 16:40
  • @AnaNimbus: that's what generators are for. Recreate the generator when you need it again. Your requirements are too vague to give a concrete solution however. – Martijn Pieters Dec 07 '17 at 16:43
  • @MartijnPieters: I have edited the body of the question re: "what generators are for." – Ana Nimbus Dec 07 '17 at 23:57
  • You don't have a generator. You can create a generator with a generator function ([using `yield`](https://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do)) or with a [generator expression](https://docs.python.org/3/tutorial/classes.html#generator-expressions). With a generator function, you can 'recreate' the generator every time you call the function. You still have to code it such that it produces deterministic results. – Martijn Pieters Dec 08 '17 at 07:47
  • @MartijnPieters, re: "or with a generator expression." If I understand correctly (not sure if I do), how is using a generator expression, e.g., `foo = map(len, (s for s in pmap()))` better (e.g., more useful, safer, etc.) than `foo = map(bar, pmap())`? See the "one solution" in the body of the question for context. It's not obvious to me why using a canonical generator should be a preferred feature of a solution to my original question. – Ana Nimbus Dec 08 '17 at 14:45
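
The "recreate the generator" suggestion in the comments above can be sketched as follows. This is only an illustration under assumed names (`hex_values` and `data` are hypothetical); a list stands in for the dynamic source, and nothing but the source itself is stored between passes:

```python
def hex_values(source):
    """Generator function: every call starts a fresh pass over source."""
    for item in source:
        yield hex(item)

data = [0, 1, 2]
first = list(hex_values(data))

data.append(3)  # the underlying source changes between passes
second = list(hex_values(data))

print(first)   # ['0x0', '0x1', '0x2']
print(second)  # ['0x0', '0x1', '0x2', '0x3']
```

Each call re-evaluates the source, so a change in the source shows up in the next pass, which is the reevaluation behavior described in the question.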

1 Answer

Bit too late, but I think this is what the OP had in mind:

class PersistentMap:
    def __init__(self, func, iterable):
        self.func = func
        self.iterable = iterable

    def __iter__(self):
        return map(self.func, self.iterable)

foo = PersistentMap(hex, range(3))
bar = map(len, foo)
baz = map(hash, foo)
print(*zip(foo, bar, baz))
# ('0x0', 3, -1640763679766387179) ('0x1', 3, -4170177621824264673) ('0x2', 3, 4695884286851555319)
kmiya
  • Welcome! See my updated question (added another solution at the end). I basically merged your code with @MartijnPieters "that's what generators are for" and my own requirements (no _extra_ storage, possibly infinite source, and "would know to reset"). +1. – Ana Nimbus Nov 12 '20 at 01:30
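
A note on the class in this answer: it resets only when the stored iterable can itself be iterated more than once, as `range(3)` can. A sketch of the difference, reusing the answer's class with a one-shot generator object for comparison (the variable names are illustrative):

```python
class PersistentMap:
    # Same shape as the answer's class: stores the iterable itself.
    def __init__(self, func, iterable):
        self.func = func
        self.iterable = iterable

    def __iter__(self):
        return map(self.func, self.iterable)

# A range can be iterated repeatedly, so every pass starts over.
from_range = PersistentMap(hex, range(3))
range_first = list(from_range)
range_second = list(from_range)

# A generator object is one-shot, so the second pass is empty.
from_gen = PersistentMap(hex, (n for n in range(3)))
gen_first = list(from_gen)
gen_second = list(from_gen)

print(range_first, range_second)  # both ['0x0', '0x1', '0x2']
print(gen_first, gen_second)      # ['0x0', '0x1', '0x2'] []
```

Passing a callable that builds a fresh iterator, as in the question's final version, avoids that limitation.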