
Given a list of numbers L = {a1, a2, a3, a4, ..., aN}.

The problem is to divide L into two parts, and then to keep dividing each part recursively until every part is atomic. The main idea is like this post, but with the recursion added.

(added: June 9) For example, if we have L = {10, 20, 1, 2}, (edited: June 10) a solution might be: first divide it into {10, 1, 2} and {20}, then divide the former into {1, 2} and {10}, and finally divide {1, 2} into {1} and {2}. Now all the members of L are atomic and cannot be divided any further.

After being divided, it should look like a binary tree of some sort.

Let's say it looks like this:

  (a1 a2 a3 a4)
       /\
      /  \
     /    \
    /      \
 (a1 a2) (a3 a4)
   /\      /\
  /  \    /  \
(a1)(a2)(a3)(a4)

At each node the similarity function is

abs( sum(left_child) - sum(right_child) ) / sum(itself)

I want to find an "optimal" way to divide the list (i.e., build the tree), minimizing the summation of this function over all nodes. Note that this value at the top level might have more impact than at the lower levels, so weights should be provided:

weight(level) * abs( sum(left_child) - sum(right_child) ) / sum(itself)

where level is the level of this node in the binary tree.
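To make the objective concrete, here is a short sketch that evaluates the weighted cost of a given tree, representing a tree as nested pairs (a node is either a plain number or a (left, right) tuple). The weight function 2**-level is only an assumed example, since weight() is left open above:

```python
def tree_sum(node):
    """Sum of all leaves under a node (a number, or a (left, right) pair)."""
    if not isinstance(node, tuple):
        return node
    return tree_sum(node[0]) + tree_sum(node[1])

def tree_cost(node, level=0, weight=lambda lvl: 2.0 ** -lvl):
    """Weighted sum of the similarity function over all internal nodes.
    The 2**-level weight is an assumption for illustration only."""
    if not isinstance(node, tuple):
        return 0.0                      # atomic element: nothing to split
    left, right = node
    ls, rs = tree_sum(left), tree_sum(right)
    here = weight(level) * abs(ls - rs) / (ls + rs)
    return here + tree_cost(left, level + 1, weight) \
                + tree_cost(right, level + 1, weight)
```

For the example tree from {10, 20, 1, 2} (split {10, 1, 2} vs {20}, then {10} vs {1, 2}, then {1} vs {2}), written as ((10, (1, 2)), 20), this gives 7/33 + 0.5 * 7/13 + 0.25 * 1/3, roughly 0.565.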

I think it is possible to solve this with dynamic programming in O(2^N) time. However, that seems too slow to me. Does anyone have a better solution?
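For tiny lists, the exponential search can be written down directly: at every node, try each way of splitting the items into two non-empty parts and recurse on both sides. This is only a sketch to pin the problem down; the 2**-level weight is an assumption, and the running time explodes long before N reaches 20:

```python
def best_tree(items, level=0, weight=lambda lvl: 2.0 ** -lvl):
    """Exhaustive search over all binary partition trees of `items`.
    Returns (cost, tree), where a tree is a number or a (left, right)
    pair. Exponential time: only usable for very small lists."""
    if len(items) == 1:
        return 0.0, items[0]
    total = sum(items)
    n = len(items)
    best = None
    # Fix items[0] in the left part to avoid trying each split twice;
    # the mask chooses which of the remaining items join it.
    for mask in range((1 << (n - 1)) - 1):
        left = [items[0]] + [items[i] for i in range(1, n) if mask >> (i - 1) & 1]
        right = [items[i] for i in range(1, n) if not mask >> (i - 1) & 1]
        lcost, ltree = best_tree(left, level + 1, weight)
        rcost, rtree = best_tree(right, level + 1, weight)
        cost = weight(level) * abs(sum(left) - sum(right)) / total + lcost + rcost
        if best is None or cost < best[0]:
            best = (cost, (ltree, rtree))
    return best
```

Under these assumed weights, best_tree([10, 20, 1, 2]) recovers the split from the example: {10, 1, 2} against {20} at the root.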

Both optimization and approximation are welcome as well.

Thank you in advance.

Phizaz
  • check my answer, i updated it to make it fit for your instance of problem. – Ashkan Kazemi Jun 09 '15 at 06:58
  • @AshkanKzme I did check it, but I found it doesn't get to my point, because I want to **minimize the summation of the similarity function** (mentioned above). The solution you proposed optimizes each node but **not** the entire tree. By the way, thanks a lot for the fast reply – Phizaz Jun 09 '15 at 07:12
  • @AshkanKzme did you delete your answer? Or did I do something wrong again? I was shocked when I refreshed and found nothing. – Phizaz Jun 09 '15 at 07:56
  • I deleted the answer since it didn't provide a correct solution to your problem. And I rather think that you cannot get a better running time than O(2^N) with the weight constraint you have; it makes the problem extremely difficult to solve! – Ashkan Kazemi Jun 09 '15 at 08:00
  • is it absolutely necessary to do it recursively? I believe there are many ways to get the result you want in O(n) time, without using recursion – Dleep Jun 09 '15 at 17:47
  • @EmilianoSorbello my point is not that recursion must be used to solve this problem; I only stated it for the sake of explanation. You can propose any kind of algorithm that solves this problem. – Phizaz Jun 10 '15 at 00:50

1 Answer


An O(n) time complexity but really inaccurate approach would be:

def distance(a, b):
    return abs(a - b)

def balancedSlice(myList):
    total = sum(myList)
    a = 0                     # a, b are the sums of each slice
    b = total
    dist = distance(a, b)     # current distance between slices
    for i in range(len(myList)):
        a += myList[i]
        b -= myList[i]
        if dist <= distance(a, b):
            # moving myList[i] left made the split worse, so cut before i:
            # "from 0 to before i" and "from i to end"
            return myList[:i], myList[i:]
        dist = distance(a, b)
    return myList, []         # fallback: everything ends up in the left slice

Another greedy algorithm, a bit more accurate but still not perfect, in O(n log n):

def balancedSlice(myList):
    list1 = list()
    list2 = list()
    sum1 = 0                  # running sums keep each step O(1),
    sum2 = 0                  # so the sort dominates at O(n log n)
    myList = sorted(myList)   # copies, so the original list is untouched
    while myList:
        item = myList.pop()   # take the largest remaining item
        if sum1 <= sum2:
            list1.append(item)
            sum1 += item
        else:
            list2.append(item)
            sum2 += item
    return list1, list2

However, as stated here, this is an NP-hard problem (the partition problem), so if your list is big enough and you can tolerate imperfect results, you should stick to something like this greedy algorithm.
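If you want the full tree rather than a single split, the same greedy split can be applied recursively. The sketch below inlines the split so it stands alone; it balances each node locally and makes no promise about minimizing the weighted total over the whole tree:

```python
def greedy_tree(items):
    """Recursively apply the greedy balanced split to build a whole tree.
    A node is a plain element (leaf) or a (left, right) pair. This is a
    heuristic sketch: each split is only locally balanced."""
    if len(items) == 1:
        return items[0]
    left, right = [], []
    lsum, rsum = 0, 0
    for item in sorted(items, reverse=True):  # place largest items first
        if lsum <= rsum:
            left.append(item)
            lsum += item
        else:
            right.append(item)
            rsum += item
    return greedy_tree(left), greedy_tree(right)
```

On {10, 20, 1, 2} this pits {20} against {10, 2, 1} at the root, which balances the top-level sums well but is not necessarily optimal for the weighted objective in the question.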

Dleep
  • I get you now. The thing is, the order of elements in the list is not important; I can shuffle them and should get the same result. One more thing: you also have to guarantee that the "overall" distance will be minimized according to the function I gave. Sorry for my poor explanation. – Phizaz Jun 10 '15 at 02:17
  • so basically you have a bunch of items and want to pack them into 2 bags (lists, sets, etc.) so that they are as balanced as possible, right? – Dleep Jun 10 '15 at 02:20
  • not just 2 bags, because each bag might have its own sub-bags; see the binary tree above :D – Phizaz Jun 10 '15 at 04:02