
Let's take a list as an example:

a = [255, 255, 1, 255, 255, 255, 1, 2, 255, 255, 2, 255, 255, 3, 255, 3, 255, 255, 255]

Here 255 is a special value: it's a placeholder.

I've made a generator which replaces some of the placeholders inside the list. It works as expected.

But I need to leave the leading placeholders (255, 255) and the trailing placeholders (255, 255, 255) unprocessed and yield them intact.

So, I tried to modify the generator to handle this:

Python 2.7

from __future__ import print_function
from itertools import tee, izip, ifilterfalse

def replace(iterable,placeholder=255):
    it = enumerate(iterable) #the position is needed for the logic for the middle of the list
    it = ifilterfalse(lambda x: x[1]==placeholder, it) #create an iterator that deletes all the placeholders
    for i,(left,right) in enumerate(window(it,2)): #Slide through the filtered list with the window of 2 elements
        if i==0: #Leaving the beginning placeholders intact
            for j in range(left[0]):
                yield placeholder

        #SOME LOGIC FOR THE MIDDLE OF THE LIST (it works well)

    #Need to leave the trailing placeholders intact.

The interim values converted to list just to ease the comprehension of the code:

>>>iterable
[255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3,255,255]

>>>it = enumerate(iterable)
[(0, 255), (1, 1), (2, 255), (3, 255), (4, 1), (5, 255), (6, 255), (7, 255), (8, 2), (9, 2), (10, 255), (11, 255), (12, 255), (13, 2), (14, 2), (15, 3), (16, 255), (17, 255), (18, 255), (19, 3), (20, 255), (21, 255)]

>>>it = ifilterfalse(lambda x: x[1]==placeholder, it)
[(1, 1), (4, 1), (8, 2), (9, 2), (13, 2), (14, 2), (15, 3), (19, 3)]

>>>list(enumerate(window(it,2)))
[(0, ((1, 1), (4, 1))), (1, ((4, 1), (8, 2))), (2, ((8, 2), (9, 2))), (3, ((9, 2), (13, 2))), (4, ((13, 2), (14, 2))), (5, ((14, 2), (15, 3))), (6, ((15, 3), (19, 3)))]

So, as you can see, list(enumerate(window(it,2))) contains the index of the first non-placeholder value, (0, ((1, 1), (4, 1))), but it carries no information about how many trailing placeholders the initial iterator had. It ends with (6, ((15, 3), (19, 3))), which holds only the index of the last non-placeholder value (19), and that doesn't tell how many placeholders follow it.

I managed to process the leading placeholders by relying on it = enumerate(iterable): the index it assigns survives in the first value yielded by ifilterfalse, so that index equals the number of leading placeholders.

But I spent quite a lot of time trying to figure out how to do the same thing with the trailing placeholders. The problem is that ifilterfalse just swallows the last placeholder values of enumerate(iterable), and I see no way to access them (it was possible for the leading placeholders since the first value generated by ifilterfalse contained the index assigned by enumerate(iterable)).

Question

What is the best way to correct this code for it to process the trailing placeholders?

The goal is not to get working code by any means (I have already done that using a different technique), so I want to solve the task by tinkering a bit with this code, not by rewriting it completely.

It's more of an exercise than a real task.

Additional information

window is the code from here.

My code does nearly the same as the code in this answer by @nye17. But that code modifies the initial list in place, and I want to create a generator which yields the same values as the resulting list in that code.

Furthermore, I want my generator to accept any iterable as a parameter, not only lists (for example, it may accept an iterator which reads the values from a file one by one). With only lists as a parameter, the task becomes simpler, since we can scan the list from the end.
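To make the "any iterable" requirement concrete, here is a minimal sketch of such a source. The function name read_values and the one-integer-per-line format are assumptions made for illustration; they do not come from the question.

```python
import io

def read_values(stream):
    """Lazily yield one integer per non-empty line of a text stream.

    Hypothetical helper: the one-value-per-line format is an assumption.
    """
    for line in stream:
        line = line.strip()
        if line:
            yield int(line)

# io.StringIO stands in for a real file object here.
stream = io.StringIO(u"255\n1\n255\n255\n1\n255\n")
values = read_values(stream)

first = next(values)                   # only the first value has been produced so far
has_len = hasattr(values, "__len__")   # a generator has no len(), so no scanning from the end
rest = list(values)                    # the remaining values, in order
```

A generator like this can only be walked forward once, which is why the trailing placeholders cannot be found by looking at the end of the input.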

This is not a real task I have to solve in life. It's just for training.

Full code http://codepad.org/9UJ9comY

ovgolovin
  • Where did the list come from? Can you fix that code? `255` seems like a rather un-Pythonic sort of placeholder. It's also not really clear what the logic is for placeholder replacement. – Karl Knechtel Oct 13 '11 at 16:29
  • @Knechtel Good point. But I have no idea. I found this task here http://stackoverflow.com/q/7745367/862380. While trying to write the code, I ran into some problems. For me the task is purely for learning and for fun. – ovgolovin Oct 13 '11 at 16:45
  • @Knechtel In the first version of the question, the placeholder was `0`, but then I noticed some inconsistency with the other question that I mentioned above, so I decided to replace all `0` with `255` for the question to be compatible with the other question. – ovgolovin Oct 13 '11 at 16:47
  • @Knechtel About the logic for placeholder replacement. As I understood, if the placeholders are between 2 non-placeholder values which are the same, the placeholders are changed to that value. And if the values are different, the placeholders stay intact. What I'm trying to figure out in this question is how to keep the leading and trailing placeholders intact (not changed). – ovgolovin Oct 13 '11 at 16:50

3 Answers

2
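def replace(it, process, placeholder):
    it = iter(it)  # accept any iterable, not only iterators
    # Yield the leading placeholders intact.
    while True:
        item = it.next()
        if item == placeholder:
            yield item
        else:
            yield process(item)
            break  # first real value reached, go on to the main loop
    pcount = 0  # placeholders seen since the last real value
    try:
        while True:
            item = it.next()
            if item == placeholder:
                pcount += 1
            else:
                # The counted run is followed by a real value, so it is not trailing.
                for i in range(pcount):
                    yield process(placeholder)
                pcount = 0
                yield process(item)
    except StopIteration:
        # Input exhausted: the counted run is the trailing one, yield it intact.
        for i in range(pcount):
            yield placeholder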

Use it like this:

>>> a = [0, 0, 1, 0, 0, 0, 1, 2, 0, 0, 2, 0, 0, 3, 0, 3, 0, 0, 0]
>>> [x for x in replace(a, lambda n: n+20, 0)]
[0, 0, 21, 20, 20, 20, 21, 22, 20, 20, 22, 20, 20, 23, 20, 23, 0, 0, 0]
Steven Rumbalski
  • Thanks! So, your code relies on having the whole iterable in memory (`len(iterable)`; `range(...,-1)`; etc.). The problem is that I want to implement lazy evaluation, so that the values of the iterable can even be read from a file one by one. – ovgolovin Oct 13 '11 at 14:33
  • I only need access to the index variable of `enumerate(iterable)`. By comparing it with the index from `ifilterfalse` it's possible to determine how many placeholders have to be yielded. But I don't know how to get THE INDEX :) – ovgolovin Oct 13 '11 at 14:35
  • Ah. So you want to pass an iterator, not an iterable. – Steven Rumbalski Oct 13 '11 at 14:35
  • Sorry for `iterable`. I thought it's the same as `iterator`. – ovgolovin Oct 13 '11 at 14:37
  • `iterable` seems to include `iterators`. Looking in the [docs](http://docs.python.org/library/itertools.html) we can see `iterable` as an argument of all the functions. E.g. `def chain(*iterables):` accepts `lists`, or any `iterators`, or any other values that can be iterated over. – ovgolovin Oct 13 '11 at 14:46
  • You're correct. My distinction should have been between iterables that are sequences and those that are iterators. The line `it = iter(iterable)` makes sure that this function works with any iterable, including sequences. – Steven Rumbalski Oct 13 '11 at 15:05
  • About the second code snippet. It seems to be skipping the leading placeholder values. The last placeholders seem also to be swallowed. – ovgolovin Oct 13 '11 at 15:19
  • thought that was what you wanted. sorry. – Steven Rumbalski Oct 13 '11 at 15:22
  • No, the leading and trailing placeholders have to stay intact. The other placeholders in the middle are already covered by the working code. I just want the trailing placeholders to be yielded intact (the leading ones I have already managed to process). – ovgolovin Oct 13 '11 at 15:24
  • When we reach `item != placeholder` in the first `while`, the value is already yielded with `yield item`. But it has to be covered with a special code. We have only to yield the initial placeholders. – ovgolovin Oct 13 '11 at 15:33
  • Thanks! Not exactly what I hoped to get as an answer (I'm just interested in solving the issue of extracting the necessary value from the iterator). Thanks for all the attempts! My (+1)! – ovgolovin Oct 13 '11 at 15:55
0
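def replace(it, placeholder):
    it = iter(it)  # accept any iterable, not only iterators
    # Yield the leading placeholders intact.
    while True:
        curr = it.next()
        if curr == placeholder:
            yield curr
        else:
            break

    yield curr  # the first non-placeholder value

    try:
        cache = []  # placeholders seen since the last non-placeholder value
        while True:
            curr = it.next()

            if curr == placeholder:
                cache.append(curr)
            else:
                # TRANSFORM stands for the logic for the middle of the list
                for cached in cache:
                    yield TRANSFORM(cached)
                yield curr
                cache = []

    except StopIteration:
        # Trailing placeholders are yielded intact.
        for cached in cache:
            yield cached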
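As an illustration only, the caching idea above can be combined with the replacement rule described in the question's comments (fill a run of placeholders when the values on both sides are equal, otherwise leave it). This is a sketch under that reading of the rule; the name replace_cached is hypothetical, and a counter is used in place of the cache list.

```python
def replace_cached(iterable, placeholder=255):
    """One-pass sketch: leading and trailing placeholder runs stay intact,
    a run in the middle is filled only when its two neighbours are equal."""
    pending = 0          # placeholders seen since the last real value
    seen_value = False   # becomes True after the first real value
    prev = None          # last real value yielded
    for curr in iterable:
        if curr == placeholder:
            pending += 1
            continue
        if seen_value and prev == curr:
            fill = curr          # same value on both sides: fill the run
        else:
            fill = placeholder   # leading run or different neighbours: keep it
        for _ in range(pending):
            yield fill
        yield curr
        pending = 0
        seen_value = True
        prev = curr
    # Input exhausted: whatever is still pending is the trailing run.
    for _ in range(pending):
        yield placeholder


a = [255, 1, 255, 255, 1, 255, 255, 255, 2, 2, 255, 255, 255,
     2, 2, 3, 255, 255, 255, 3, 255, 255]
result = list(replace_cached(a))
```

Because nothing is decided about a run until the next real value (or the end of the input) arrives, the sketch works with any iterable and never needs the length of the input.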
Bwmat
0

The simplest solution I came up with is to pass it = enumerate(iterable) through one more generator which just saves the last value returned by it.

So, I added the following code after it = enumerate(iterable) (inside the replace function):

def save_last(iterable):
    for i in iterable:
        yield i
    replace.last_index = i[0] #Save the last value
it = save_last(it)

After iterable is exhausted, the last statement of the generator saves the index of the last yielded value (which is i[0], as enumerate stores the index at position 0 of the tuple) as an attribute of replace (a function is an object, so it can carry attributes).

it is then wrapped in the newly created generator save_last.

At the very end of the function I added the code which uses the index saved in replace.last_index.

if right[0]<replace.last_index:
    for i in range(replace.last_index-right[0]):
        yield placeholder
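The mechanism can be seen in isolation in the following sketch. It is a variation, not the code of this answer: the last index is kept in a dict passed in by the caller (the name state is hypothetical) instead of a function attribute, so that two generators running at the same time would not overwrite each other's value.

```python
def save_last(pairs, state):
    """Pass (index, value) pairs through unchanged and remember
    the index of the most recent pair in state['last_index']."""
    for pair in pairs:
        state['last_index'] = pair[0]
        yield pair


state = {}
data = [255, 1, 255, 2, 255, 255]

# Filtering happens downstream, so save_last still sees the trailing placeholders.
kept = [p for p in save_last(enumerate(data), state) if p[1] != 255]

# Index of the very last element minus index of the last real value.
trailing = state['last_index'] - kept[-1][0]
```

Here kept ends with the pair for the last real value, and trailing is the number of placeholders that followed it in data.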

The full code:

from __future__ import print_function
from itertools import tee, izip, ifilterfalse


def window(iterable,n):
    els = tee(iterable,n)
    for i,el in enumerate(els):
        for _ in range(i):
            next(el, None)
    return izip(*els)


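def replace(iterable,placeholder=255):
    it = enumerate(iterable)

    def save_last(iterable):
        for i in iterable:
            yield i
        replace.last_index = i[0] #Save the last value (assumes the input is not empty)
    it = save_last(it)

    it = ifilterfalse(lambda x: x[1]==placeholder, it)
    for i,(left,right) in enumerate(window(it,2)):
        if i==0: #Leading placeholders stay intact
            for j in range(left[0]):
                yield placeholder
        yield left[1]
        if right[0]>left[0]+1: #There are placeholders between left and right
            if left[1]==right[1]: #Same value on both sides: fill the gap with it
                for _ in range(right[0]-left[0]-1):
                    yield left[1]
            else: #Different values: the placeholders stay
                for _ in range(right[0]-left[0]-1):
                    yield placeholder
    yield right[1] #Assumes at least two non-placeholder values, otherwise right is undefined
    if right[0]<replace.last_index: #Trailing placeholders stay intact
        for i in range(replace.last_index-right[0]):
            yield placeholder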


a = [255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3,255,255]
print('\nInput: {}'.format(a))
output = list(replace(a))
print('Program output: {}'.format(output))
print('Goal output   : {}'.format([255,1,1,1,1,255,255,255,2,2,2,2,2,2,2,3,3,3,3,3,255,255]))

Which works as expected:

Input: [255, 1, 255, 255, 1, 255, 255, 255, 2, 2, 255, 255, 255, 2, 2, 3, 255, 255, 255, 3, 255, 255]
Program output: [255, 1, 1, 1, 1, 255, 255, 255, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 255, 255]
Goal output   : [255, 1, 1, 1, 1, 255, 255, 255, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 255, 255]

The only thing that I don't like is the combination of ifilterfalse, which is very efficient because it is written in C, with save_last, which is written in Python.
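One possible way to avoid the extra Python-level generator, offered as a sketch and not as something proposed in this thread: put a sentinel in front of and behind the input with itertools.chain before enumerating. Neither sentinel equals the placeholder, so both pass the filter, and the trailing one carries the index just past the last element. The names _START, _END and replace_with_sentinels are hypothetical, and the filter predicate is still a Python lambda.

```python
from itertools import chain, tee
try:  # Python 2.7, as used in this thread
    from itertools import ifilterfalse as filterfalse, izip
except ImportError:  # Python 3
    from itertools import filterfalse
    izip = zip

_START = object()  # hypothetical sentinels: equal to nothing but themselves
_END = object()


def window(iterable, n):
    els = tee(iterable, n)
    for i, el in enumerate(els):
        for _ in range(i):
            next(el, None)
    return izip(*els)


def replace_with_sentinels(iterable, placeholder=255):
    # _START gets index -1, real items 0..n-1, _END gets index n.
    it = enumerate(chain([_START], iterable, [_END]), -1)
    it = filterfalse(lambda x: x[1] == placeholder, it)
    for left, right in window(it, 2):
        if left[1] is not _START:
            yield left[1]
        # A gap is filled only between two equal real values; next to a
        # sentinel the comparison is False, so leading and trailing runs stay.
        fill = left[1] if left[1] == right[1] else placeholder
        for _ in range(right[0] - left[0] - 1):
            yield fill


a = [255, 1, 255, 255, 1, 255, 255, 255, 2, 2, 255, 255, 255,
     2, 2, 3, 255, 255, 255, 3, 255, 255]
result = list(replace_with_sentinels(a))
```

With both sentinels in place the leading and trailing runs become ordinary gaps, so the special case for i==0 and the code after the loop are no longer needed. The sketch also handles empty input and input consisting only of placeholders.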

ovgolovin