3

I am unsure how to combine the items from two lists of integers such that the order of the items is preserved and the resultant list, if concatenated into one integer, is as small as possible.

Potentially similar to this question, although the answer given doesn't address my size constraint: Interleave different length lists, elimating duplicates and preserve order in Python

For instance, given:

a = [3,4,5,7,9,2]
b = [3,5,7,4,2,8]

The shortest possible combination of these two lists would be:

c = [3,4,5,7,4,9,2,8]

With a concatenated integer value of 34574928

There are instances in which the ordering of the numbers would not affect list length, but would affect the size of the concatenated integer. In the example given, the 4 and 9 could be swapped while still maintaining order of the items, but the final number would be larger than necessary.

For further clarification:

The final list must contain every instance of each digit from the two original lists. To better represent the combination of the two in the above example:

a = [3,4,5,7,  9,2  ]
b = [3,  5,7,4,  2,8]
c = [3,4,5,7,4,9,2,8]

Of course, it will not always work out so cleanly. In this case, four of the digits from the two lists (3, 5, 7, and 2) could be merged complete. Four of the digits (4, 4, 9, and 8) could not be combined without creating larger list. For example:

a =     [3,    4,5,7,  9,2]
b =     [3,5,7,4,  2,8    ]
bad_c = [3,5,7,4,9,2,8,9,2]

In this case, I combined only the 3 and one of the 4s. When the items from those two example results are concatenated, we get:

c =     34574928
bad_c = 357492892

They both satisfy the ordering requirement, but because there is a different result which satisfies the ordering requirement but is smaller than bad_c when concatenated into an integer, bad_c is an incorrect result.

Community
  • 1
  • 1
The_Unobsequious
  • 277
  • 1
  • 2
  • 10
  • It is not clear what your rules for combining the lists are. What are you trying to achieve? – jonrsharpe Apr 18 '14 at 18:37
  • I am looking for the smallest possible integer which contains all the digits from two other arbitrary integers. The digits must remain in order, but do not necessarily need to be consecutive. i.e., given 13 and 12, the smallest number which satisfies the rule would be 123. – The_Unobsequious Apr 18 '14 at 18:40
  • So, you're going [a1,b1,a2,b2...]. And you need to determine the order of interweaving such that the final integer is as small as possible. When you interweave, does the first position determine the weave of the rest of the list?? Or can any position have the value from either list first? – Luigi Apr 18 '14 at 18:41
  • So, given the lists `[1,2,3,2]` and `[4,2,6,1]`: Can I do this?: `[1,4,2,3,6,1,2]`? Or am I required to do `[1,4,2,3,6,2,1]` because I choose the first list at the first position? – Luigi Apr 18 '14 at 18:42
  • The most important thing to keep in mind is the size of the resulting number. In this case, I think (but am not certain) that `[1,4,2,3,1,6,2]` would satisfy the rule. This is because the integer 1423162 contains all the digits from the two lists, while preserving the order of the digits from the two lists, and minimizing its own size (as compared to other answers which may preserve the order of the lists, but result in a larger final integer value. i.e. 42621231). I hope that makes sense... – The_Unobsequious Apr 18 '14 at 18:48
  • @MartijnPieters That is the one with the smallest value as an integer. I'm making sure he's not constraining the order of interweaving to the first choice as it is important to the smallest-integer part of the question. – Luigi Apr 18 '14 at 18:49
  • Okay, so I can do something like this: `[a1, b1, b2, b3, b4, a2, a3, a4]` if it gives me the lowest integer? – Luigi Apr 18 '14 at 18:50
  • @The_Unobsequious no, still not clear. Why does `c` have both 4s but not both 7s? – jonrsharpe Apr 18 '14 at 18:50
  • In my example above, the actual smallest integer from lists `[1,2,3,2]` and `[4,2,6,1]`, maintaining order *within a given list*, is made like this: `[1,2,3,2,4,2,6,1]`. The list is in order `[a1,a2,a3,a4,b1,b2,b3,b4]` and the resulting integer 12324261 is smaller than any other combination that maintains this type of order. @The_Unobsequious Am I allowed to do this or are there additional constraints? – Luigi Apr 18 '14 at 18:54
  • @jonrsharpe The combined list can share values between the two lists. In fact, it's necessary wherever possible. I saw that the two lists shared the 5,7 near the beginning, while one was preceded by a 4, and the other followed by a 4. The only way to merge the lists while combining the two 4s would require two instances of the 5,7 (in order to preserve the contents of both lists). It was more efficient to merge the 5,7 and keep the two instances of 4. – The_Unobsequious Apr 18 '14 at 19:02
  • @Louis Looks like the answer I gave to those two lists in a previous comment was incorrect. Your answer may be correct, but I cannot verify. It certainly maintains the order of the two lists. – The_Unobsequious Apr 18 '14 at 19:06
  • Why isn't `[3, 4, 5, 7, 4, 2, 8, 9, 2]` a solution to merging `a` and `b`? – Martijn Pieters Apr 18 '14 at 19:29
  • Given `a = [3,4,5,7,9,2]` and `b = [3,5,7,4,2,8]` ,`[3,4,5,7,4,2,8,9,2]` preserves the order of the two lists. However, it is not a solution because there is a list which, if all its items are concatenated into one integer, is smaller. `[3,4,5,7,4,9,2,8]` concatenates to 34574928, which preserves the order of all the items from both lists, and is smaller than the concatenation of your list (345742891). – The_Unobsequious Apr 18 '14 at 19:47
  • Perhaps you are looking for the smallest common [superinteger](http://projecteuler.net/problem=467) of the two lists? –  Apr 18 '14 at 20:23
  • Haha, yes, I am. I have gotten quite far with what I already know, and am trying to work out the final logic needed to combine the numbers. I didn't want to reference the problem in case the direction I'm going in is entirely wrong. I like to learn from my own mistakes when possible, and want to try to implement this idea successfully before trying something else. – The_Unobsequious Apr 18 '14 at 20:54
  • Gotcha `;)` I had a short go at this one as well and found generating the sequences is fairly doable. The real problem is merging them into the actual smallest "superinterger" without taking forever. *As far as I can tell* bruteforce methods don't stand much of a chance in the recent Project Euler problems and you need to apply some pretty genius math to solve them. –  Apr 18 '14 at 21:34
  • Yeah, I'm not terribly hopeful, but figured I'd try this idea. Agreed that generating the sequences themselves was easy...but brute forcing this concept with numbers greater than 10000 digits would be a Bad Idea. – The_Unobsequious Apr 18 '14 at 21:46

2 Answers2

1

Here's a fairly long but correct (as far as I can tell from the question discussion) implementation using recursion.

Important points:

  • I iterate through both lists using .pop(index). This lets me use recursion, as both lists get smaller and smaller as the function recurs, leading to a point where one list is of len(0).
  • Numbers can be chosen from either list, and there is no limit to the numbers that can be chosen in a row from a single list.
  • No consecutive duplicates are allowed.
  • When comparing two non-equal digits, the smaller digit ALWAYS goes in the larger place. 23xxx is always lower than 32xxx.

Basically, if I have something like [1,2,3] and then [6,0,4], all of the digits in the first list will be before the first digit in the second list because 1236xx will ALWAYS be less than 6xxxxx, 1236xx less than 16xxxx, and 1236xx less than 126xxx, regardless of the values chosen for x.

z = [None]
#set to None so that z[-1] doesn't throw an out-of-range error

def small_list(a,b): #recursive function

    ### BASE CASE ###

    if len(a) == 0: #if one list empty, can only add rest of other list
        for i in b:
            if i != z[-1]: #account for duplicates
                z.append(i)
            # else: #don't add duplicates

        return z.pop(0) #end recursion, remove extraneous None

    elif len(b) == 0: #if one list empty, can only add rest of other list
        for j in a:
            if j != z[-1]:#account for duplicates
                z.append(j)
            # else: #don't add duplicates

        return z.pop(0) #end recursion, remove extraneous None

    #Otherwise, we need to check whichever is smaller.  
    #The smaller number should ALWAYS go in the larger place (tens,hundreds,etc.) to make a smaller number.

    ### RECURSIVE CASE ###

    if a[0] < b[0]:
        if a[0] != z[-1]:
            z.append(a.pop(0))
        else:
            a.pop(0)
    elif a[0] > b[0]:
        if b[0] != z[-1]:
            z.append(b.pop(0))
        else:
            b.pop(0)
    elif a[0] == b[0]:
        a.pop(0)

    small_list(a,b) # recur

Examples:

z = [None]

l1 = [1,2,3,2]
l2 = [2,1,1,1]

small_list(l1,l2)
print z

This first example prints [1, 2, 1, 3, 2] which is now correct.

z = [None]

l1 = [1,2,3]
l2 = [4,5,6]

small_list(l1,l2)
print z

This second example prints [1, 2, 3, 4, 5, 6] which is also now correct.

Here's the flow of how it computes the example you give above:

# The format is: [final list]  len(a)  [list a]  len(b)  [list b]

[] len(a) = 6 [3, 4, 5, 7, 9, 2] len(b) = 6 [3, 5, 7, 4, 2, 8]
# 3 repeated, so remove it.
[] len(a) = 5 [4, 5, 7, 9, 2] len(b) = 6 [3, 5, 7, 4, 2, 8]
# add lower of first two indices to final (4 v 3), and remove from corresponding list
[3] len(a) = 5 [4, 5, 7, 9, 2] len(b) = 5 [5, 7, 4, 2, 8]
# add lower of first two indices to final (4 v 5), and remove from corresponding list
[3, 4] len(a) = 4 [5, 7, 9, 2] len(b) = 5 [5, 7, 4, 2, 8]
# 5 repeated, so remove it.
[3, 4] len(a) = 3 [7, 9, 2] len(b) = 5 [5, 7, 4, 2, 8]
# add lower of first two indices to final (7 v 5), and remove from corresponding list
[3, 4, 5] len(a) = 3 [7, 9, 2] len(b) = 4 [7, 4, 2, 8]
# 7 repeated, so remove it.
[3, 4, 5] len(a) = 2 [9, 2] len(b) = 4 [7, 4, 2, 8]
# add lower of first two indices to final (9 v 7), and remove from corresponding list
[3, 4, 5, 7] len(a) = 2 [9, 2] len(b) = 3 [4, 2, 8]
# add lower of first two indices to final (9 v 4), and remove from corresponding list
[3, 4, 5, 7, 4] len(a) = 2 [9, 2] len(b) = 2 [2, 8]
# add lower of first two indices to final (9 v 2), and remove from corresponding list
[3, 4, 5, 7, 4, 2] len(a) = 2 [9, 2] len(b) = 1 [8]
# add lower of first two indices to final (9 v 8), and remove from corresponding list
[3, 4, 5, 7, 4, 2, 8] len(a) = 2 [9, 2] len(b) = 0 []
# list b is empty, add first element of list a (if non-duplicate)
[3, 4, 5, 7, 4, 2, 8, 9] len(a) = 1 [2] len(b) = 0 []
# list b is empty, add first element of list a (if non-duplicate)

#Finally:
[3, 4, 5, 7, 4, 2, 8, 9, 2]
Luigi
  • 4,129
  • 6
  • 37
  • 57
  • If the recursion is confusing I can rewrite using a `while` loop. – Luigi Apr 18 '14 at 20:16
  • Hmm...Something isn't quite right. When I give it the two lists from my initial question (`a = [3,4,5,7,9,2]`, and `b = [3,5,7,4,2,8]`), this function provides `[3,4,5,7,4,2,8,9]`, which can't be correct because the 9 in list `a` is followed by a 2, whereas the combined list ends in 9, which doesn't satisfy the rule. I think I may be explaining it poorly (believe me, I'm trying to make sense!), because your first example also doesn't satisfy the rule. `l2` has three 1s, which means the resultant list would also need at least three 1s. The result must contain all instances of each digit. – The_Unobsequious Apr 18 '14 at 20:33
  • I hope that isn't just further confusing...where possible, digits can be shared between the two original lists. I will try to format a better example in my original question... – The_Unobsequious Apr 18 '14 at 20:37
  • @The_Unobsequious I think I've fixed it. It was a logic error with my loops. I think I understood your question but I will re-check it. – Luigi Apr 18 '14 at 22:12
  • @The_Unobsequious .....shoot. I read the update you posted to your problem and that is not what I thought you were asking (didn't realize **length** was a concern). This is now more of an alignment problem (and more difficult as a result) but is definitely still doable. My current answer gives a good approximation of your ideal case but it won't give the best case always. – Luigi Apr 18 '14 at 22:31
  • It's not quite **length**, but rather the **size** of the integer you get by concatenating the list of numbers. A shorter list will always be a shorter size, but two lists of the same length can have different sizes. `[3,4,3]` and `[4,3,3]` respectively concatenate to `343` and `433`. If these two lists were both potential solutions, 343 would win out because `343 < 433`...thanks for sticking with me here. You're giving me some interesting ideas to play around with! – The_Unobsequious Apr 18 '14 at 22:39
  • That's true, but since a shorter integer is **always** smaller than a longer integer, the primary concern is a shortest-superstring/alignment problem, which can then be optimized by size if there are multiple solutions of the same length. This is an interesting problem. You might want to use a branch-and-bound algorithm because the runtime of alignment is usually exponential in length. – Luigi Apr 18 '14 at 22:46
1

Here's a simple algorithm I think will work, the implementatation is fairly simple so I'm not posting it.

1. For the lists, find the first common element, gather the elements before say the two lists are: a & b

2. a. If no elements are in common, then merge them by comparing first elements then incrementing the index of the smaller list, and comparing. You got the idea!

2. b. Then without loss of generality if we say, a[3] matches b[4], then collect a[0]-a[2] together with b[0]-b[3], then merge the 7 items using the case where they don't have any common element and append the value of a[3] at the end.

3. Similarly do this till the end of the list. After the last appended item, merge the remaining items.

4. For this we may write a function that takes start and end index from both lists, of the sublists to be merged.

I hope this solution works, it appears correct but I haven't tried though. In case I missed something please do point it out.