-1

Assume you get a string. Example:

s = "FishCatRainComp_CatWater_JamDog"

And you define a sieve. A sieve is a list of words you would like to catch (each word only once, even if it occurs multiple times in s), for example:

sieve = ["Dog","Cat"]

Passing a string through a sieve should produce a string, in our case that would be:

out = "CatDog"

What would be the most elegant way to achieve the result?

greyxray

2 Answers

4

Here is the most elegant way that comes to mind:

''.join([word for word in sieve if word in s])
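As a quick check with the sample values from the question: because the comprehension iterates over sieve, the result follows the order of sieve rather than the order in which the words appear in s.

```python
s = "FishCatRainComp_CatWater_JamDog"
sieve = ["Dog", "Cat"]

# Iterating over sieve means the result follows the order of sieve, not of s
out = ''.join([word for word in sieve if word in s])
print(out)  # DogCat
```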

Given that the order of the words in the input string should be reflected in the output string:

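def SieveString(s, sieve):
    # Pair each sieve word found in s with the index of its first occurrence
    found = [(word, s.index(word)) for word in sieve if word in s]
    # Sort by position so the output follows the word order in s
    found.sort(key=lambda pair: pair[1])
    return ''.join(word for word, index in found)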
barak manos
  • 1
    That will produce `"DogCat"` for the sample op posted above. Not `"CatDog"` which is the correct answer. – Omada Nov 10 '16 at 17:16
  • @Omada: See my comment to the question, which originally provided a coding example giving the same output, and asked for a more elegant manner of doing it. – barak manos Nov 10 '16 at 17:18
  • @Omada nice catch. Perhaps barak should iterate through sieve backwards with sieve[::-1] – Mark Hannel Nov 10 '16 at 17:18
  • @MarkHannel No. That would reverse the whole string. – Christian Dean Nov 10 '16 at 17:18
  • 1
    @barakmanos I think you are misunderstanding how a sieve is supposed to work. The output should have the words in the same order as they appeared in `s` (at least as far as my understanding is) – Omada Nov 10 '16 at 17:19
  • @Omada: I think you are missing my point. The question **originally** gave a coding example which resulted in **this** output (`"DogCat"`), asking for a more elegant manner. I notified OP about that (since he/she also mentioned the output `"CatDog"`), and answered it as is. The OP has later removed the original coding example. – barak manos Nov 10 '16 at 17:21
  • 1
    @Omada: In any case, I've added a solution referring to this ("new") restriction. – barak manos Nov 10 '16 at 17:28
1

If you want to maintain the order of the strings as they appear in s, you could do something like this:

import re
found = re.findall('({})'.format('|'.join(re.escape(w) for w in sieve)), s)

You would then have to remove the repeated strings:

def remove_repeated(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

print(''.join(remove_repeated(found)))

This solution might be longer, but it has better asymptotic behaviour, since it scans `s` in a single pass instead of searching it once per sieve word.
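For reference, the two steps above can be folded into one function. This is only a sketch: the name `sieve_regex` is made up for illustration, the longest-first ordering of the alternatives is my own addition, and the de-duplication relies on `dict` preserving insertion order (guaranteed from Python 3.7 on).

```python
import re

def sieve_regex(s, sieve):
    """Return the sieve words found in s, in order of first appearance."""
    if not sieve:
        return ""  # an empty pattern would only produce empty matches
    # Try longer words first so a short word does not shadow a longer one
    # that starts with it (alternation takes the first alternative that matches).
    words = sorted(sieve, key=len, reverse=True)
    pattern = "|".join(re.escape(w) for w in words)
    found = re.findall(pattern, s)
    # dict.fromkeys keeps only the first occurrence of each word, in order
    return "".join(dict.fromkeys(found))

print(sieve_regex("FishCatRainComp_CatWater_JamDog", ["Dog", "Cat"]))  # CatDog
```

One caveat: `re.findall` returns non-overlapping matches, so if two sieve words overlap inside `s` (say `"ab"` and `"bc"` in `"abc"`), only the first is reported, whereas the `word in s` approaches would report both.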

Alternatively, you could sort the matching words by the index of their first occurrence in `s`:

>>> sorted([word for word in sieve if word in s], key=lambda word: s.index(word))
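Note that the expression above gives a list, so it still has to be joined to get the string the question asks for. A minimal sketch, where the function name `sieve_by_index` is just a placeholder and the sieve is assumed to possibly contain duplicates or words that are missing from `s`:

```python
def sieve_by_index(s, sieve):
    # Keep only the words that occur in s, dropping duplicate sieve entries
    present = [w for w in dict.fromkeys(sieve) if w in s]
    # str.find returns the index of the first occurrence of each word
    return "".join(sorted(present, key=s.find))

s = "FishCatRainComp_CatWater_JamDog"
print(sieve_by_index(s, ["Dog", "Cat", "Bird"]))  # CatDog
```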
Francisco