-1

How to implement the following comparison using Python 2

Input is composed of two groups:

  1. Different lists collected from Experiments.
  2. All accepted sequences of some of these elements.

How to filter all those lists from input group 1. for which any of the accepted sequences from input group 2 is a proper subsequence?

For example:

Group two (defined by user):

x = [3, 1, 6]
y = [2, 1, 6]
z = [3, 4, 6]

Group one (from Experiments):

a = [1, 2, 3, 5, 6, 7]

b = [2, 1, 4, 3, 1, 8, 6]

c = [6, 3, 5, 7, 8, 4, 2, 6]

d = [1, 2, 1, 3, 4]

We accept b and c because x is a subsequence of b and z is a subsequence of c. And likewise we reject a and d because none of the x, y or z is a subsequence of either.

mysterious(a) should return [2,6] which is not acceptable as we didn't visit node 1 after 2

mysterious(b) should return [2,1,6] which is acceptable and so on

Another example(detailed):

To accept a certain set or list I need some elements that present some services.

ServiceA served by [ 3 , 2 ] ServiceB served by [ 1 , 4 ] ServiceC served by [ 6 ]

Total nodes available to end user [ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 ]

We ask him to choose a set or list from the total nodes. We will only accept any combination when the nodes which serve services appear in correct order or sequence.

So user could choose any set with unlimited number of nodes as long as: 1. the nodes are member from the total nodes. 2. the order of services will be correct.

example [1,4,5,"Node serve A", 7,1,2, "Node serve B", "Node serve C"]

or so the general form to accept a list or set is: [some elements, element serve service A, other elements, element service B, more elements, element service c, etc...]

and you can replace the node service element with any element from the correspondent set above * If that not clear please let me know and will explain in more examples.

Example three:

Lets think in a factory with 10 machines. The product need three different processes to be manufactured.

Every machine can do some of these processes or all it differs. Machine 1 can do process alpha, gama but not beta. Machine 2 can do only process alpha

Every raw material arrives need to find route through machines with condition that by end it should have the three processes done.

The processes must be in order so first do Alpha, beta, then at end Gama. we do route every time to avoid overloading machines.

so now I need a function to accept or reject certain route suggestions to enforce that every raw material go through the processes in correct order.

I can't of course make all possible combinations and then compare as this will consume time and can run for infinity.

Thanks

engbarakat
  • 39
  • 10
  • 3
    What have you tried so far, and which parts are you having problems with? – cdarke Apr 06 '17 at 12:50
  • I didn't try anything yet as I don't know what function could help me check such requirement. – engbarakat Apr 06 '17 at 12:52
  • 1
    Can you explain more deeply why one group is accepted and the other isn't? – Adirio Apr 06 '17 at 12:58
  • I _think_ you can do this with set operations, but you need to explain clearly why b & c are accepted but `a` and `d` are rejected. If you have difficulties explaining this it may help if you show more examples. – PM 2Ring Apr 06 '17 at 13:13
  • Possible duplicate of [Pythonic way to check if a list is sorted or not](http://stackoverflow.com/questions/3755136/pythonic-way-to-check-if-a-list-is-sorted-or-not) – user3467349 Apr 06 '17 at 14:08
  • I don't want to check the normal order stuff. I know sort(). – engbarakat Apr 06 '17 at 16:08
  • @PM2Ring I tried to add an explanation – engbarakat Apr 06 '17 at 16:09
  • @Adirio Please look at my update. thanks – engbarakat Apr 06 '17 at 20:51
  • 1
    Sorry, it's still not clear how your decision procedure works. At first, I thought `b` is accepted because it contains the items of `y` as a (non-contiguous) subsequence, and it also contains `x`; similarly `c` is accepted because it contains `z`. But `a` also contains `z`, so why is `a` rejected? – PM 2Ring Apr 07 '17 at 08:38
  • @PM2Ring there are points: group two sets could be combined by cartesian product of services list mentioned in example2. the other point is I want at least one element from service list to appear but in correct order. [some elements, element serve service A, other elements, element service B, more elements, element service c, etc...] – engbarakat Apr 07 '17 at 10:05
  • @PM2Ring `a` contains `z`? There's no `4` in `a` – Adirio Apr 07 '17 at 10:15
  • I thought I got the first example and then I read the second one and I just understood nothing. I will try to answer the first example. – Adirio Apr 07 '17 at 10:15
  • 1
    @engbarakat can you check if the corrections I edited are indeed correct, or do you need the *mysterious* function also to return `2, 6` (i.e. the longest matching prefix) for that, or is it enough just to filter the longest matching one. – Antti Haapala -- Слава Україні Apr 07 '17 at 10:54
  • @Adirio The data has been edited. See revision 5 and earlier http://stackoverflow.com/posts/43255697/revisions – PM 2Ring Apr 07 '17 at 11:37
  • Can you relate example 3 and 1. And by this I mean: What are `x`, `y` and `z`? Machines, product or processes? Same with `a`, `b`, `c`, `d` and numbers. – Adirio Apr 07 '17 at 12:33
  • @AnttiHaapala thanks. is Subsequences implies that order of appearance of elements is right? so [2,1,6] is right [1,2,6] is wrong. – engbarakat Apr 07 '17 at 15:13

2 Answers2

1

I wrote the following code for longest_subsequence

a = [1, 2, 3, 5, 6, 7]
b = [2, 1, 4, 3, 1, 8, 6]
c = [6, 3, 5, 7, 8, 4, 2, 6]
d = [1, 2, 1, 3, 4]

x = [3, 1, 6]
y = [2, 1, 6]
z = [3, 4, 6]

group1 = [a, b, c, d]
group2 = [x, y, z]

def longest_subsequence(experiment):
    longest = [] 
    for accepted_sequence in group2:
        it = iter(experiment)
        subsequence = []
        for element in accepted_sequence:
            if element in it:
                subsequence.append(element)

            else:
                break

        if subsequence == list(accepted_sequence):
            return subsequence

        longest = max(longest, subsequence, key=len)

    return longest

for experiment in group1:
    print(longest_subsequence(experiment))

It works by using element in iterator, which while scanning for next matching element, discards other elements in between.

The code finds the first longest subsequence in the group2. However since x precedes y and both of them are subsequences of b, x is printed for b:

[3]
[3, 1, 6]
[3, 4, 6]
[2, 1]
  • Thank you first. my first note is longest will be printed if it is not following our defined pattern. (You print how long it was correct) – engbarakat Apr 07 '17 at 15:30
0

Your first example could be handled by the following code:

group1 = {
    'x': [3, 1, 6],
    'y': [2, 1, 6],
    'z': [3, 4, 6],
}
group2 = {
    'a': [1, 2, 3, 5, 6, 7],
    'b': [2, 1, 4, 3, 1, 8, 6],
    'c': [6, 3, 5, 7, 8, 4, 2, 6],
    'd': [1, 2, 1, 3, 4],
}

def isSubsequence(sub, seq):
    start = 0
    for i in sub:
        try:
            start = seq[start:].index(i)
        except ValueError:
            return False
    return True

for seqName, seq in group2.items(): # Change items for iteritems in Python2
    for subName, sub in group1.items():  # Same than above
        if isSubsequence(sub, seq):
            print("{} is subsequence of {}".format(subName, seqName))
            #break # Uncomment this line if you only want to find the first

This outputs:

z is subsequence of c
y is subsequence of b
x is subsequence of b

If the break is uncommented the last row would not appear.

As you can see it doesn't take order of the groups into account as they are stored in dicts, we could use other containers to keep the order if it matters.

isSubsequence() uses str.index() to avoid one of the loops in order to be faster and it also returns as soon as it knows a list is not a subsequence. If only a boolean context is needed, this is, if you only need if any group1 item is a subsequence of a group2 item and not which of them uncummenting the break is suggested.

Adirio
  • 5,040
  • 1
  • 14
  • 26