34

What is the best way to split a dictionary in half?

d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}

I'm looking to do this:

d1 = {'key1': 1, 'key2': 2, 'key3': 3}
d2 = {'key4': 4, 'key5': 5}

It does not matter which keys/values go into each dictionary. I am simply looking for the simplest way to divide a dictionary into two.

martineau
user1728853

9 Answers

39

This works in Python 2, although I didn't test edge cases:

>>> d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
>>> d1 = dict(d.items()[len(d)/2:])
>>> d2 = dict(d.items()[:len(d)/2])
>>> print d1
{'key1': 1, 'key5': 5, 'key4': 4}
>>> print d2
{'key3': 3, 'key2': 2}

In Python 3:

d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
d1 = dict(list(d.items())[len(d)//2:])
d2 = dict(list(d.items())[:len(d)//2])

Also note that the order of items is not guaranteed in Python versions before 3.7; from 3.7 onward, dictionaries preserve insertion order.
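Since slicing needs a list anyway, the item list can be built once and sliced twice, so both halves are guaranteed to come from the same ordering. A minimal sketch, where the name `split_in_half` is hypothetical and not part of the answer above:

```python
def split_in_half(d):
    """Return two dicts that together hold every item of d.

    Hypothetical helper: the item list is built once, so both
    slices are taken from the same ordering.
    """
    items = list(d.items())
    mid = len(items) // 2
    return dict(items[:mid]), dict(items[mid:])


d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
d1, d2 = split_in_half(d)
print(d1, d2)
```

With an odd number of items, the second dictionary receives the extra one.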

mrgloom
jone
  • 3
    Edge cases aren't a problem; out-of-range slices just return empty lists and calling dict on an empty list returns an empty dictionary. HOWEVER: http://docs.python.org/library/stdtypes.html#dict.items seems to indicate that the Python specification does not guarantee that calls to items() will return the pairs in the same order every time! Perhaps, to be theoretically correct, we should be storing the result of a call to items() and then slicing that stored result? – Mark Amery Oct 20 '12 at 12:36
  • @MarkAmery I believe that they are guaranteed to be stable, that is they will return in the same order so long as nothing changes the dictionary, although that order is arbitrary, so that should make this all right. – Gareth Latty Oct 20 '12 at 12:37
  • @Lattyware What you have just said is listed in the docs as being true of CPython. By omission, it is implied that other Python implementations needn't, in theory, guarantee this. Admittedly, it's hard to imagine a sane implementation in which this wasn't the case, so it's really a theoretical problem only... – Mark Amery Oct 20 '12 at 12:38
  • @MarkAmery [The docs for dict views](http://docs.python.org/py3k/library/stdtypes.html#dict-views) say that they should have that behaviour in all implementations. – Gareth Latty Oct 20 '12 at 12:41
  • @Lattyware Oops, I thought you were right for a second there, but hang on - the documentation you linked to is for the Python Standard Library, which is written in C and is a component of CPython, not an inherent and necessary part of any implementation of the Python language. Jython, for instance, does not use the Python Standard Library: http://www.jython.org/docs/library/indexprogress.html – Mark Amery Oct 20 '12 at 12:52
  • @MarkAmery This is true, but other implementations are expected to mimic the operation of the standard library. The library is a part of Python and its correct behaviour is as documented. – Gareth Latty Oct 20 '12 at 13:06
  • 5
    Stable or not - calling `items()` twice is hardly a good idea. – georg Oct 20 '12 at 15:10
  • 3
    I know this is ages old, but as a heads up for people reading this: items() will likely change the order in current versions of Python. Use `collections.OrderedDict` if you need to. – zvyn Jun 30 '16 at 09:25
  • 2
    python3> TypeError: 'dict_items' object is not subscriptable – Srinath Ganesh Nov 14 '18 at 05:32
  • There are two problems with this on Python 3. The first, `TypeError: 'dict_items' object is not subscriptable`, is solved by using `list(d.items())`. The second, `slice indices must be integers or None or have an __index__ method`, is caused by float division. I fixed both with `d1 = dict(list(d.items())[len(d)//2:])` `d2 = dict(list(d.items())[:len(d)//2])` – alexbhandari Mar 07 '19 at 02:02
  • `'dict_items' object is not subscriptable` on `Python 3.6.5` – mrgloom Aug 02 '19 at 10:06
  • Could try something like this instead for defining `d2`, in order to ensure that the keys are unique and complete across the two new dictionaries: `d2 = d.copy()` `for k in d1:` `del d2[k]` – Mabyn Jul 24 '20 at 12:36
9

Here's a way to do it using an iterator over the items in the dictionary and itertools.islice:

import itertools

def splitDict(d):
    n = len(d) // 2          # length of smaller half
    i = iter(d.items())      # alternatively, i = d.iteritems() works in Python 2

    d1 = dict(itertools.islice(i, n))   # grab first n items
    d2 = dict(i)                        # grab the rest

    return d1, d2
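The same iterator-plus-islice idea extends to more than two pieces. A rough sketch, assuming fixed-size chunks are wanted; `chunk_dict` and its `size` parameter are hypothetical names, not part of the answer above:

```python
import itertools


def chunk_dict(d, size):
    """Yield successive dicts holding at most `size` items of d (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(d.items())
    while True:
        # Each islice call resumes where the previous one stopped.
        chunk = dict(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk


d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
parts = list(chunk_dict(d, 2))
print(parts)
```

Because a single iterator is shared across all the islice calls, every item lands in exactly one chunk and no intermediate list of items is built.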
Blckknght
7
# Python 2.7; in Python 3, use d.items() instead of d.viewitems()
d1 = {key: value for i, (key, value) in enumerate(d.viewitems()) if i % 2 == 0}
d2 = {key: value for i, (key, value) in enumerate(d.viewitems()) if i % 2 == 1}
Johan Råde
4

If you use Python 3.3 or later and want the split dictionaries to be the same across different Python invocations, do not rely on the order of .items(): in versions where dictionary order depends on the hash values of the keys, that order can change between invocations. See Hash randomization.
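One way to get a split that does not depend on iteration order is to sort the keys first. This is only a sketch, assuming the keys are mutually comparable (for example, all strings); `split_sorted` is a hypothetical name:

```python
def split_sorted(d):
    """Split d in half by sorted key order (hypothetical helper).

    Assumes the keys can be compared with each other, e.g. all strings.
    """
    keys = sorted(d)
    mid = len(keys) // 2
    d1 = {k: d[k] for k in keys[:mid]}
    d2 = {k: d[k] for k in keys[mid:]}
    return d1, d2


d = {'key3': 3, 'key1': 1, 'key5': 5, 'key2': 2, 'key4': 4}
d1, d2 = split_sorted(d)
print(d1, d2)
```

The same keys end up in the same half on every run, regardless of how the dictionary was built or how its keys hash.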


The Hungry Dictator
David Bielen
4

The answer by jone did not work for me. I had to convert the result of the .items() call to a list before I could index it. (I am running Python 3.6 in this example.)

d = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
split_idx = 3
items = list(d.items())  # build the list once so both slices use the same order
d1 = dict(items[:split_idx])
d2 = dict(items[split_idx:])

"""
output:
d1
{'one': 1, 'three': 3, 'two': 2}
d2
{'five': 5, 'four': 4}
"""

Note that before Python 3.7 dicts are not guaranteed to keep insertion order, so which keys end up in which half may vary.

Community
user3548798
3

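Here is a function that splits a dictionary into any number of parts. Note that when the dictionary has more items than parts, it removes the keys from the input dictionary as it goes.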

def linch_dict_divider(raw_dict, num):
    list_result = []
    len_raw_dict = len(raw_dict)
    if len_raw_dict > num:
        base_num = int(len_raw_dict / num)
        addr_num = int(len_raw_dict % num)
        for i in range(num):
            this_dict = dict()
            keys = list()
            if addr_num > 0:
                keys = list(raw_dict.keys())[:base_num + 1]
                addr_num -= 1
            else:
                keys = list(raw_dict.keys())[:base_num]
            for key in keys:
                this_dict[key] = raw_dict[key]
                del raw_dict[key]
            list_result.append(this_dict)

    else:
        for d in raw_dict:
            this_dict = dict()
            this_dict[d] = raw_dict[d]
            list_result.append(this_dict)

    return list_result

myDict = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
print(myDict)
myList = linch_dict_divider(myDict, 2)
print(myList)
Ângelo Polotto
Joe Cheng
2

Here's a function that I use in Python 3.8 that can split a dict into a list containing the desired number of parts. If you specify more parts than elements, you'll get some empty dicts in the resulting list.

def split_dict(input_dict: dict, num_parts: int) -> list:
    list_len: int = len(input_dict)
    return [dict(list(input_dict.items())[i * list_len // num_parts:(i + 1) * list_len // num_parts])
        for i in range(num_parts)]

Output:

>>> d = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
>>> split_dict(d, 2)
[{'a': 1, 'b': 2}, {'c': 3, 'd': 4, 'e': 5}]
>>> split_dict(d, 3)
[{'a': 1}, {'b': 2, 'c': 3}, {'d': 4, 'e': 5}]
>>> split_dict(d, 7)
[{}, {'a': 1}, {'b': 2}, {}, {'c': 3}, {'d': 4}, {'e': 5}]
Gethin LW
1

We can do this efficiently with itertools.zip_longest() (note this is itertools.izip_longest() in 2.x):

from itertools import zip_longest
d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}
items1, items2 = zip(*zip_longest(*[iter(d.items())]*2))
d1 = dict(item for item in items1 if item is not None)
d2 = dict(item for item in items2 if item is not None)

Which gives us:

>>> d1
{'key3': 3, 'key1': 1, 'key4': 4}
>>> d2
{'key2': 2, 'key5': 5}
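The same alternating split can also be sketched with extended slices, which avoids the `None` padding that `zip_longest` adds for an odd number of items. This assumes that copying the items into a list is acceptable:

```python
d = {'key1': 1, 'key2': 2, 'key3': 3, 'key4': 4, 'key5': 5}

items = list(d.items())   # materialize once so both slices share one ordering
d1 = dict(items[::2])     # items at positions 0, 2, 4, ...
d2 = dict(items[1::2])    # items at positions 1, 3, 5, ...
print(d1, d2)
```

As with the `zip_longest` version, the first dictionary gets the extra item when the count is odd.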
Gareth Latty
0

If you use NumPy, you could do this:

import collections
import numpy

def divide_dict(dictionary, chunk_size):
    '''
    Divide one dictionary into several dictionaries.

    Return a list in which each item is a dictionary.
    Note: chunk_size is the number of chunks, not the size of each chunk.
    '''
    # Cumulative item counts at which each chunk ends.
    count_ar = numpy.linspace(0, len(dictionary), chunk_size + 1, dtype=int)
    group_lst = []
    temp_dict = collections.defaultdict(lambda: None)
    i = 1
    for key, value in dictionary.items():
        temp_dict[key] = value
        if i in count_ar:
            # Reached a chunk boundary: store this chunk and start a new one.
            group_lst.append(temp_dict)
            temp_dict = collections.defaultdict(lambda: None)
        i += 1
    return group_lst
woodword