I'm building a function that takes a list containing three types of elements: integers, floats, and strings. The function converts the list to a dictionary with one key per type, and each element of the original list is placed under the matching key (e.g. all string elements are assigned to the "string" key). That part works, but I can't sort the lists stored as the dictionary's values. Here's what I have:

def list_dict_sort(input_list): 
    mixed_list =['asdf', 33, 'qwerty', 123.45, 890, 3.0, 'hi', 0, 'yes', 98765., '', 25]
    sorted_dict = {}                 
    sorted_dict['integer'] = []       
    sorted_dict['float'] = []
    sorted_dict['string'] = []
    for x in mixed_list:           
        if "int" in str(type(x)):  
            sorted_dict['integer'].append(x) 
        elif "float" in str(type(x)):
            sorted_dict['float'].append(x)
        elif "str" in str(type(x)):
            sorted_dict['string'].append(x)
    sorted_dict.sort
    return(sorted_dict)

list_dict_sort(mixed_list)

this returns:

{'float': [123.45, 3.0, 98765.0],
 'integer': [33, 890, 0, 25],
 'string': ['asdf', 'qwerty', 'hi', 'yes', '']}

So structurally the function gets me what I want, except that the value-lists are not sorted. The exact output that I'd like is:

{'float': [3.0, 123.45, 98765.0],
 'integer': [0, 25, 33, 890],
 'string': ['asdf', 'hi', 'qwerty',  'yes', '']}

Ideally I want to avoid doing an import here (e.g. operator), I'm just looking for a simple/basic way of sorting the value-lists. I tried using sort and sorted() but couldn't figure out how to build them in to what I already have. Is there a clean way of doing this or is there a more efficient way?

martineau
David
  • You could use `bisect.insort` when inserting each element. However, you will have to do `import bisect`. See [here](https://stackoverflow.com/a/18001030/1586200) for more details. – Autonomous Feb 06 '18 at 21:50
  • is there a way to add sorted or sort() into my current script? – David Feb 06 '18 at 22:41
  • See if this works (not tested). Replace `sorted_dict['integer'].append(x) ` by `bisect.insort(sorted_dict['integer'], x)`. Same for floats and strings. Then remove the line `sorted_dict.sort`. – Autonomous Feb 06 '18 at 23:58
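
As a minimal sketch of the `bisect.insort` suggestion from the comments above (the list name `values` is only illustrative), each insertion keeps the list in sorted order, so no separate sorting step is needed afterwards. It does require `import bisect`, which the question hoped to avoid:

```python
import bisect

values = []
for x in [33, 890, 0, 25]:
    # insort finds the right position and inserts x there,
    # so `values` stays sorted after every insertion.
    bisect.insort(values, x)

print(values)  # [0, 25, 33, 890]
```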

3 Answers

2

You could just go over the values and sort each list in place, after the lists have been filled:

for v in sorted_dict.values():
    v.sort()
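
For context, here is one way the loop might be placed inside the question's function. This is a sketch under assumptions, not the asker's final code: it iterates over the `input_list` parameter rather than a list defined inside the function, and it swaps the string-based type checks for `isinstance`.

```python
def list_dict_sort(input_list):
    sorted_dict = {'integer': [], 'float': [], 'string': []}

    # First pass: put each element into the list for its type.
    for x in input_list:
        if isinstance(x, int):
            sorted_dict['integer'].append(x)
        elif isinstance(x, float):
            sorted_dict['float'].append(x)
        elif isinstance(x, str):
            sorted_dict['string'].append(x)

    # Second pass: sort every list in place once it has been filled.
    for values in sorted_dict.values():
        values.sort()

    return sorted_dict


mixed_list = ['asdf', 33, 'qwerty', 123.45, 890, 3.0, 'hi', 0, 'yes', 98765., '', 25]
result = list_dict_sort(mixed_list)
print(result)
```

With the default string ordering, the empty string ends up first in the `'string'` list rather than last.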
Mureinik
  • I tried this and it returns {'float': [], 'integer': [], 'string': []} – David Feb 06 '18 at 22:53
  • @David this definitely does work. Please edit your question to show how you integrated this snippet into your code and I'll try to see what went wrong. – Mureinik Feb 07 '18 at 06:19
1

Note that you could also use the type of the mixed elements as the dictionary key, so it can be calculated directly from the elements as they are inserted, and so that when retrieving later, you don't need to know a special string (e.g. "wait, did I use 'integer' or 'int' for the key?")...

In [4]: from collections import defaultdict

In [5]: d = defaultdict(list)

In [6]: mixed_list = ['asdf', 33, 'qwerty', 123.45, 890, 3.0, 'hi', 0, 'yes', 98765., '', 25]

In [7]: for value in mixed_list:
   ...:     d[type(value)].append(value)
   ...:     

In [8]: d
Out[8]: 
defaultdict(list,
            {str: ['asdf', 'qwerty', 'hi', 'yes', ''],
             int: [33, 890, 0, 25],
             float: [123.45, 3.0, 98765.0]})

In [9]: for k, v in d.items():
   ...:     v.sort()
   ...:     

In [10]: d
Out[10]: 
defaultdict(list,
            {str: ['', 'asdf', 'hi', 'qwerty', 'yes'],
             int: [0, 25, 33, 890],
             float: [3.0, 123.45, 98765.0]})

In the last result, note that default string sorting is going to put '' at the front. You'd need to write your own string comparator that would evaluate any string as less than the empty string if you need it to be sorted to the final position.
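
One possible way to do that without a full comparator is a key function, sketched below. The tuple key is an illustrative choice, not something from the original answer: it sorts on "is this the empty string?" first, so `''` lands last, and on the string itself second.

```python
strings = ['asdf', 'qwerty', 'hi', 'yes', '']

# False sorts before True, so non-empty strings come first;
# ties are broken by the normal string ordering.
strings.sort(key=lambda s: (s == '', s))

print(strings)  # ['asdf', 'hi', 'qwerty', 'yes', '']
```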

ely
  • In my opinion, using `type(value)` as a key is not a good idea. Consider `numpy.float64(5.1321)`, the type will come out as `numpy.float64`. Use `isinstance` always and maintain an explicit mapping. – jpp Feb 06 '18 at 23:47
  • @jp_data_analysis But what if you want a key for `np.float64` that is a different key than `float`? And if you don't want these to be two different keys, then the burden is on you to first do type conversions on the elements on the mixed list. I agree it could be application-specific as to which way is better. But there's nothing inherently better about resolving the type value with `isinstance` ... especially given that instance and subclass checking are also overrideable with metaclasses. So if someone is relying on keys that came from isinstance, the actual element might fail duck typing. – ely Feb 06 '18 at 23:54
  • My point is that you should be explicit. I'm not making assumption on what user wants. There are many answers, e.g. [here](https://stackoverflow.com/a/1549854/9209546), which explain why it is a good assumption to incorporate inheritance in type checking. `isinstance` isn't perfect, nothing is, but it is considered better than `type()`. Is your point that a subclass of `float` or `int` should not be considered `float` or `int`? – jpp Feb 07 '18 at 00:00
  • @jp_data_analysis My view is that using `isinstance` as the means to check if a type is `int` is *less explicit*, because a custom object could have overridden instance checking in a way that breaks things. For example, if you plan to process the `int` elements later on, but one of them is a custom class that doesn't support all `int` operations (but was customized to return `True` for `isinstance(..., int)`), then it all becomes a huge black box problem. But if the entries are stored according to `type` only, then it is explicit, because a user must convert types explicitly ahead of time. – ely Feb 07 '18 at 13:13
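
To make the difference discussed in these comments concrete, here is a small sketch. The `Celsius` class is hypothetical and exists only for illustration: an exact `type` check ignores inheritance, while `isinstance` accepts subclasses, including `bool` as a subclass of `int`.

```python
class Celsius(float):
    """Hypothetical float subclass, used only to illustrate the checks."""


value = Celsius(21.5)

exact_match = type(value) is float          # exact type check ignores inheritance
subclass_match = isinstance(value, float)   # isinstance accepts subclasses
bool_is_int = isinstance(True, int)         # bool is a subclass of int
bool_exact_int = type(True) is int          # but its exact type is bool

print(exact_match, subclass_match, bool_is_int, bool_exact_int)
# False True True False
```

Which behavior is preferable depends on whether subclasses should share a key with their base type, which is the point the two commenters disagree on.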
1

Here is a minimalist solution via collections.defaultdict and sortedcontainers.SortedList. With SortedList, your list is always kept in sorted order. Note that sortedcontainers is a third-party package, so it has to be installed separately.

Note I have also replaced your type-checking with isinstance and added a dictionary mapping types to keys. The purpose of this is to separate logic from configuration / variables.

from collections import defaultdict
from sortedcontainers import SortedList

mixed_list = ['asdf', 33, 'qwerty', 123.45, 890, 3.0, 'hi', 0, 'yes', 98765., '', 25]

def list_dict_sort(input_list): 

    sorted_dict = defaultdict(SortedList)

    mapper = {int: 'integer',
              float: 'float',
              str: 'string'}

    for x in input_list:
        for k, v in mapper.items():
            if isinstance(x, k):
                sorted_dict[v].add(x)
                break

    return sorted_dict

list_dict_sort(mixed_list)

# defaultdict(sortedcontainers.sortedlist.SortedList,
#             {'float': SortedList([3.0, 123.45, 98765.0], load=1000),
#              'integer': SortedList([0, 25, 33, 890], load=1000),
#              'string': SortedList(['', 'asdf', 'hi', 'qwerty', 'yes'], load=1000)})
jpp