0

I have an array:

A = [0,2,5,6]
B = [5,6,8,9]
C = [6,7,8,9]

I want to write two functions, below are the details:


Problem 1 When I pass in any of the arrays defined above, I get the combinations of numbers on a sequential manner (by sequential meaning, n+1). So the desired outputs are:

ResultA = [[0],[2],[5],[6],[5,6]]
ResultB = [[5],[6],[8],[9],[5,6],[8,9]]
ResultC = [[6],[7],[8],[9],[6,7],[7,8],[8,9],[6,7,8],[7,8,9],[6,7,8,9]]

Below is what I've tried:

sorted_ids = sorted(number_collection)
combinations = [sorted_ids[j: j + i] for i in range(1, len(sorted_ids)) for j in range(len(sorted_ids) - i + 1)]

the issue is it works for the array C but doesn't work that well with the others.


Problem 2 The result of the problem 1 is the input of for this problem. The thing is I want the combinations that exist on unique elements for the numbers. (I am not sure I can explain it in words properly), below is the desired output:

FinalResultA = [[0],[2],[5,6]]
FinalResultB = [[5,6],[8,9]]
FinalResultC = [[6,7,8,9]]

Is there any approaches (performance oriented) that I can use that'll help?

iam.Carrot
  • 4,976
  • 2
  • 24
  • 71
  • Will the input always be sorted? Is it ok to sort it? – juanpa.arrivillaga Jan 16 '18 at 09:54
  • It's completely okay to sort it. – iam.Carrot Jan 16 '18 at 09:54
  • 1
    How come `ResultA` doesn't include `[0, 2]`, `[0, 2, 5]`, `[0, 2, 5, 6]`, `[2, 5, 6]`..? – thefourtheye Jan 16 '18 at 09:55
  • @thefourtheye The outputs posted are the desired outputs not the ones that come from my approach. The issue with approach is existence of [0,2] etc... – iam.Carrot Jan 16 '18 at 09:57
  • @iam.Carrot You would have to explain the logic a bit better then. Why are `ResultA`, ... treated differently? – Ma0 Jan 16 '18 at 10:06
  • @evkounis I have two different modules, the first module uses the problem 1 and the output is used. The seconds module uses the problem 1 output and passes it through problem 2 and creates the desired output at FinalResult. Is there a better way to approach this problem? – iam.Carrot Jan 16 '18 at 10:09
  • What are you trying to achieve? Perhaps, you can take a look at [itertools](https://docs.python.org/3.6/library/itertools.html) module of python. – Kevad Jan 16 '18 at 10:10
  • 1
    So in the end you are looking for a partition of consecutive numbers? (This implies all list members are integers and unique, right?) Is the intermediate step `Problem 1` a requirement, i.e. will its output used for anything else? – Mr. T Jan 16 '18 at 10:12

2 Answers2

1

Here is a fairly efficient approach, although it does require O(N) auxiliary space, although if the number of runs is small then it shouldn't be significant:

from itertools import groupby

def ngrams(seq):
    stop = len(seq)+1
    for n in range(2, stop):
        for i in range(stop - n):
            yield seq[i:i+n]

def get_combos(seq):
    runs = []
    for _, g in groupby(enumerate(seq), lambda x:x[1]-x[0]):
        run = [a for _, a in g]
        for x in run:
            yield [x]
        if len(run) > 1:
            runs.append(run)
    for run in reversed(runs):
        yield from ngrams(run)

Note, this uses this classic approach for grouping consecutive integers. It iterates over the groups of consecutive integers, "runs", and yields any lone integer as a single-element list. If the run is longer than a 1, I add it to a list of runs. Finally, you iterate over the list of runs in reverse, yielding the "n-grams", from order 2 to order len(run).

In action:

>>> A = [0,2,5,6]
>>> B = [5,6,8,9]
>>> C = [6,7,8,9]
>>> list(get_combos(A))
[[0], [2], [5], [6], [5, 6]]
>>> list(get_combos(B))
[[5], [6], [8], [9], [8, 9], [5, 6]]
>>> list(get_combos(C))
[[6], [7], [8], [9], [6, 7], [7, 8], [8, 9], [6, 7, 8], [7, 8, 9], [6, 7, 8, 9]]

Note

The get_combos assumes the input is sorted.

Edit

However, for:
>>> D = [6,7,9,12,13,14,20,21,30]

This will produce:

>>> list(get_combos(D))
[[6], [7], [9], [12], [13], [14], [20], [21], [30], [20, 21], [12, 13], [13, 14], [12, 13, 14], [6, 7]]

I.E., the 3-sequence starts before the 2 sequence of subsequent runs has been yielded. If you want all n-len sequences to be yielded before n+1 len sequences, use the following approach:

from itertools import groupby

def ngrams(seq, max_len):
    curr = seq
    for n in range(1, max_len + 1):
        nxt = []
        for run in curr:
            run_len = len(run)
            if run_len > n:
                nxt.append(run)
            for i in range(run_len + 1 - n):
                yield run[i:i+n]
        curr = nxt

def _sub_index(t):
    return t[1] - t[0]

def get_consecutive_runs(seq):
    grouped = groupby(enumerate(seq), _sub_index)
    for _, g in grouped:
        yield [a for _, a in g]


def get_combos(seq):
    runs = list(get_consecutive_runs(seq))
    max_len = max(map(len, runs))
    yield from ngrams(runs, max_len)

With the following results:

>>> list(get_combos(D))
[[6], [7], [9], [12], [13], [14], [20], [21], [30], [6, 7], [12, 13], [13, 14], [20, 21], [12, 13, 14]]
juanpa.arrivillaga
  • 88,713
  • 10
  • 131
  • 172
0

Here is your both solution in one function without any external library :

A = [0,2,5,6]
B = [5,6,8,9]
C= [6,7,8,9]

def finding_sequence(list_1):
    sub_list = []
    for j, i in enumerate(list_1):

        try:
            if list_1[j] - list_1[j - 1] == 1:
                sub_list.append((list_1[j - 1], list_1[j]))
            else:
                sub_list.append('_pos')


        except IndexError:

            pass

    sub_final_result = []
    check_result=[]
    if '_pos' not in sub_list[1:]:
        for i in sub_list[1:]:
            for k in i:
                if k not in sub_final_result:
                    sub_final_result.append(k)
                    check_result.append(k)

    else:
        for i in sub_list:
            if i != '_pos':
                sub_final_result.append(i)
                for i1 in i:
                    check_result.append(i1)


    for i1 in list_1:
        if i1 not in check_result:
            sub_final_result.append([i1])

    return sub_final_result

Test cases:

print(finding_sequence(A))

output:

[(5, 6), [0], [2]]

second

print(finding_sequence(B))

output:

[(5, 6), (8, 9)]

P.S : one request : If my answer helps you , don't accept it , just use it.