Finding the second smallest int in list of list in Python 3 with recursion

Question

I want to find the second smallest int in list of lists where empty list may exist. I got stuck at the flatten step.

my idea

a unsorted list e.g [1, [3, 2], [[]], [4]],[[]], [12, 12], [[12], []], [[]]]
stucked!!!!!! Flatten the list with pure recursion. I tried to do this in the first part of recursive step. Above example become [1, 3, 2, 4], [12,12,12]
find the second smallest int (completed)

Here is the code

def find(abc):
#base case
if len(abc) == 2:
    if isinstance(abc[0], list) and isinstance(abc[1], list):
        re = find(abc[0] + abc[1])
    elif isinstance(abc[1], list):
        re = find(abc[:1] + abc[1])
    elif isinstance(abc[0], list):
        first = find(abc[0] + abc[1:])
    else:
        if abc[0] > abc[1]:
            re = (abc[0], abc[1])
        else:
            re = (abc[1], abc[0])
# recursive step
else:
    #i think this part has problem 
    # flatten the list
    if isinstance(abc[0], list):
        re = find(abc[0] + abc[1:])
    # if this element is interger
    else:
        current = abc[0]
        second, first = find(abc[1:])
        if (second < current):
            re = (second, first)
        elif (current > first) and (second >= current):
            re = (current, first)
        else:
            re = (first, current)            
return re
e.g   find([[[]], [12, 12], [[12], []], [[]]]) -> (12, 12)
      find([1, [2, 3], [[]], [4]]) -> (2, 1)

Take into account the desired behaviour if there are repeated values. For instance: `data = [1, 1, 2, 3]` should result in `1` or `2`? — MariusSiuram, Feb 25 '16 at 08:49

stephan · Accepted Answer · 2016-02-25T07:06:19.280

Just Fixing the Issue: You have to handle the case of an empty list in your recursion. A minimal (and admittedly somewhat hacky) change to your code would look something like this:

import sys 

def find(abc):
    #base case
    if len(abc) == 2:
        if isinstance(abc[0], list) and isinstance(abc[1], list):
            re = find(abc[0] + abc[1:])
        elif isinstance(abc[1], list):
            re = find(abc[:1] + abc[1])
        elif isinstance(abc[0], list):
            re = find(abc[0] + abc[1:])
            # ^^^ fixed typo (ifs could be simplified by dropping first if)
        else:
            if abc[0] > abc[1]:
                re = (abc[0], abc[1])
            else:
                re = (abc[1], abc[0])
    # recursive step
    else:
        # CHANGE HERE
        if len(abc) == 0:   # @HACK: handle empty list
            return (sys.maxsize, sys.maxsize)
        # CHANGE ENDS
        if isinstance(abc[0], list):
            re = find(abc[0] + abc[1:])
        # if this element is integer
        else:
            current = abc[0]
            second, first = find(abc[1:])
            if (second < current):
                re = (second, first)
            elif (current > first) and (second >= current):
                re = (current, first)
            else:
                re = (first, current)            
    return re # @TODO: possibly filter out maxsize in final result

This is far from perfect (e.g. yielding maxsize if there are not enough values and maybe having additional bugs).

Refactoring Your Code: Therefore, I would refactor your code in two ways. Firstly, I would separate out flattening and searching (i.e. flatten first and then search the flattened list):

def flatten(li):
    for el in li:
        try:    # if strings can be in list, would have to check here
            for sub in flatten(el):
                yield sub
        except TypeError:
            yield el

def find(abc):
    abc = list(flatten(abc))

    def _smallest2(abc):
        # basically your implementation for finding the smallest two values
        if len(abc) <= 2:
            return tuple(sorted(abc, reverse=True))
        current = abc[0]
        second, first = _smallest2(abc[1:])
        if (second < current):
            re = (second, first)
        elif (current > first) and (second >= current):
            re = (current, first)
        else:
            re = (first, current)       
        return re

    return _smallest2(abc)

And secondly, I would use heapq.nsmallest instead of your implementation for searching:

import heapq

def flatten(li):
    for el in li:
        try:    # if strings can be in list, would have to check here
            for sub in flatten(el):
                yield sub
        except TypeError:
            yield el

def find(li):
    return tuple(heapq.nsmallest(2, flatten(li))[::-1])

If you are ok with a slightly different return value, feel free to drop the tuple and [::-1].

Alternative Implementation: While I'd prefer the refactored code above for various reasons (e.g. robustness, expressiveness), here is an alternative implementation which arguably is more in line with your initial question. The main idea behind this implementation is to only check whether the first element is a list? If yes, flatten; if no, recursively go down the tail of the list:

def find(abc):
    try:                                                          # first element is list
        return find(abc[0] + abc[1:])                             #  -> flatten
    except:                                                       # first element is value
        if len(abc) <= 1: return abc                              # -> either end recursion
        return sorted(abc[:1] + find(abc[1:]), reverse=True)[-2:] # -> or recurse over tail

Note that the return type is slightly different (list instead of tuple). And you could replace sorted with heapq.nsmallest (which arguably is more efficient for small n).

i just realized i forgot to handle the empty list when i wanted to deal with empty list . — InWind, Feb 24 '16 at 15:52
@InWind: glad it was helpful. Looking back at your question a day later, I thought you might be also interested in a simple recursive implementation, (which I have added to the end of my answer now). — stephan, Feb 25 '16 at 07:16

MariusSiuram · Answer 2 · 2016-02-25T07:55:54.967

If your use case is huge and you want to avoid recursion, you can perform some iteration magic and avoid the recursion:

def nested_walker(data):
    working_iterators = [iter(data)]
    while working_iterators:
        current_iterator = working_iterators.pop()
        for elem in current_iterator:
            if isinstance(elem, list):
                working_iterators.append(iter(elem))
            else:
                yield elem

And then, you could do things like:

data = [[1, [3, 2], [[]], [4]],[[]], [12, 12], [[12], []], [[]]]
sorted_list = sorted(nested_walker(data))
print sorted_list[1]

There are more intelligent ways to retrieve the second smallest integer. To avoid sorting a potential huge size, you can use nested_walker being as it is a generator function.

As @stephan pointed out, there is a way to use heapq which avoids the sort:

data = [[1, [3, 2], [[]], [4]],[[]], [12, 12], [[12], []], [[]]]
two_mins = heapq.nsmallest(2, nested_walker(data))
print two_mins[1]  # the [0] is the minimum

it is worth checking the documentation about heapq, as it can be a little bit tricky in terms of performance.

+1 I like this solution. To avoid sorting a huge list, you could simply use ``def find(abc): return heapq.nsmallest(2, nested_walker(abc))``. — stephan, Feb 25 '16 at 07:46
I was sure that `heapq` had some way to be used, but was in a rush. I will update the answer now, thanks! — MariusSiuram, Feb 25 '16 at 07:51

Kenly · Answer 3 · 2016-02-24T07:24:29.880

Iterate through the list then use isinstance to check if you need recursion:

def MIN(lst):
   mn = None
   for item in lst:
       if isinstance(item, list) and item:
           tmp = MIN(item)
       else:
           tmp = item

       if not mn:
          mn = tmp
       elif mn > tmp:
           mn = tmp
   return mn

def find(lst):
  mins = []
  if not all(isinstance(item, list) for item in lst):
     mins.append(MIN(lst))
  for item in lst:
     if isinstance(item, list):
        mins.append(MIN(item))
  return filter(lambda x: x==0 or x, mins)

print find([[[]], [12, 12], [[12], []], [[]]])
print find([1, [2, 3], [[]], [4]])

score 0 · Answer 4 · edited May 23 '17 at 12:15

Here's a solution in Python 3 which flattens the lists, sorts it and finally returns the second smallest item of the result:

import collections

def flatten(l):
    for el in l:
        if isinstance(el, collections.Iterable) and not isinstance(el, (str, bytes)):
            for sub in flatten(el):
                yield sub
        else:
            yield el

a = flatten([[[]], [12, 12], [[12], []], [[]]])
b = flatten([1, [2, 3], [[]], [4]])

print(sorted(a)[1]) # 12
print(sorted(b)[1]) # 2

The flatten function was bluntly stolen from this answer. For Python 2.7 replace (str, bytes) with basestring in flatten.

Finding the second smallest int in list of list in Python 3 with recursion

4 Answers4