For instance, if I have a dict of dicts or a dict of arrays but only want to "deep" copy down to a depth of two levels, is there an easy way to do this?

I looked around for a library or an example I could use, but I couldn't find anything. I am fairly new to Python; otherwise I would write the subroutine myself. Any ideas? Code snippets would be appreciated, as they would be quicker for me to understand than an explanation alone.

Thanks.

ADDITIONAL INFO:

Some have asked why I would want to do this. I need a copy of some of the items from a dict (not a ref, as I am going to modify some of the values and I do not want the original modified), but the dict is HUGE (many dicts of dicts), so I do not want to blow up my memory footprint.

MY CODE SO FAR

Okay, I give up. This was more difficult than I expected and I don't have time to figure it out. Here is my latest attempt, with some debug/test code.

# Deep copy any iterable item to a max depth; by default, anything deeper
# is removed. To keep the items past max depth as references to the
# original, pass the argument else_ref=1. Ex:
#   dict_copy = copy_to_depth( dict_orig, 2, else_ref=1 )
def copy_to_depth( orig, depth, **kwargs):
  copy = type(orig)()
  for key in orig:
    # Cannot find a reliable and consistent way to determine if the item 
    # is iterable.
    #print orig[key].__class__
    #if hasattr(orig[key], '__iter__'):
    #if hasattr(orig[key], '__contains__'):
    #if iterable( orig[key] ):
    #try:
    if hasattr(orig[key], '__contains__'):
      if depth > 0:
        copy[key] = copy_to_depth(orig[key], depth - 1, **kwargs)
      else:
        if 'else_ref' in kwargs:
          copy[key] = orig[key]
        else:
          copy[key] = 'PAST_MAX_DEPTH_ITERABLE_REMOVED'
    #except:
    else:
      copy[key] = orig[key]
  return copy

def iterable(a):
   try:
       (x for x in a)
       return True
   except TypeError:
       return False

people = {'rebecca': 34, 'dave': 'NA', 'john': 18, 'arr': [9,8,{'a':1,'b':[1,2]}], 'lvl1':
  {'arr': [9,8,{'a':1,'b':[1,2]}], 'dave': 'NA', 'john': 18, 'rebecca': 34, 'lvl2':
    {'arr': [9,8,{'a':1,'b':[1,2]}], 'dave': 'NA', 'john': 18, 'rebecca': 34, 'lvl3':
      {'rebecca': 34, 'dave': 'NA', 'john': 18, 'arr': [9,8,{'a':1,'b':[1,2]}]}}}}
print people


ppl_cpy = copy_to_depth(people, 1)

ppl_cpy['arr'][1] = 'nine'                  # does not mod orig
ppl_cpy['john'] = 0                  # does not mod orig
ppl_cpy['lvl1']['john'] = 1          # does not mod orig b/c copy_to_depth
ppl_cpy['arr'][3]['a'] = 'aie'       # does not mod orig
#ppl_cpy['lvl1']['lvl2']['john'] = 2 # Rest cause an error
#ppl_cpy['lvl1']['lvl2']['lvl3']['john'] = 3
print people
print ppl_cpy

ppl_cpy = copy_to_depth(people, 1, else_ref=1)
ppl_cpy['john'] = 0                 # does not mod orig
ppl_cpy['lvl1']['john'] = 1         # does not mod orig b/c copy_to_depth was 1
ppl_cpy['lvl1']['lvl2']['john'] = 2 # Rest Do not cause error but modifies orig
ppl_cpy['lvl1']['lvl2']['lvl3']['john'] = 3
print people
print ppl_cpy

I cannot find a reliable and consistent way to determine if the item is iterable. I have been reading through this post and trying to figure it out, but none of the solutions seemed to work for my test case.
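One reason the `hasattr(..., '__contains__')` check in the attempt above misfires is that strings also define `__contains__`, so a value such as 'NA' is treated as something to recurse into. Below is a minimal sketch of an alternative check. It assumes Python 3 and assumes that only dict-, list-, tuple- and set-like values should count as nested. `is_container` is a hypothetical helper name, not something from the post.

```python
from collections.abc import Mapping, Sequence, Set


def is_container(obj):
    """Return True for dict-, list-, tuple- and set-like objects, but not for text."""
    # Strings and byte strings are sequences too, so rule them out first.
    if isinstance(obj, (str, bytes, bytearray)):
        return False
    return isinstance(obj, (Mapping, Sequence, Set))


print(is_container({'a': 1}))   # True
print(is_container([9, 8]))     # True
print(is_container('NA'))       # False
print(is_container(34))         # False
```

Checking against the abstract base classes rather than a single dunder method keeps strings out, which is usually what a recursive copy wants. Whether other iterables (generators, custom classes) should count is a design decision this sketch leaves open.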

I will just deep copy the entire dict and try to optimize the solution later (or not).

Thanks...

stephenmm
  • What problem are you hoping to solve by doing this? – Karl Knechtel Mar 20 '12 at 20:58
  • did you try the obvious `copy.deepcopy(x)` and `pickle`? – zenpoy Mar 20 '12 at 21:08
  • @zenpoy I did look at copy.deepcopy(x) but it did not seem to be able to limit its copy to a particular depth. I did not think to use pickle and I am not sure how that would work, but your suggestion made me think that maybe you could get pprint to do the copy, as it does let you specify a depth??? I have to think about how that would work though. – stephenmm Mar 20 '12 at 21:25

2 Answers

This sounds kind of like 'plz give me teh codz'...

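In any case, you'll need a custom method unless you really want to hack up the functionality of iterables with subclasses. A rough sketch, handling only dicts and lists:

def copy_to_depth(original, depth):
    """Copy nested dicts/lists `depth` levels down; deeper levels stay shared references."""
    def copy_item(item):
        if isinstance(item, (dict, list)) and depth > 0:
            return copy_to_depth(item, depth - 1)
        return item  # leaf, or past max depth: keep the reference
    if isinstance(original, dict):
        return {key: copy_item(value) for key, value in original.items()}
    return [copy_item(item) for item in original]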
Silas Ray

Actually, the previous example would just copy any dictionary as it is, because once we run out of depth we simply copy the remaining part directly. The right version would be:

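def copy_to_depth(original, depth):
    """Copy nested dicts/lists `depth` levels down; deeper containers are dropped."""
    def nested(item):
        return isinstance(item, (dict, list))
    def copy_item(item):
        return copy_to_depth(item, depth - 1) if nested(item) else item
    def keep(item):
        return depth > 0 or not nested(item)  # skip containers once depth runs out
    if isinstance(original, dict):
        return {key: copy_item(value) for key, value in original.items() if keep(value)}
    return [copy_item(item) for item in original if keep(item)]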
    return copy

There's a subtle difference. (Unfortunately, I can't comment on the answer itself.)

Michael Gendin
  • This will just skip adding the item altogether if it is iterable but you've reached the max depth. Copying any nested dictionary beyond max depth is exactly what we want not to do. – Silas Ray Mar 20 '12 at 21:40
  • but skipping the item if we reached the max depth is what we want to do, right? Your code is basically equivalent to copy=original.copy(). – Michael Gendin Mar 20 '12 at 21:47
  • No, we want to copy by ref, essentially, below max depth while copying by val above. – Silas Ray Mar 20 '12 at 21:49
  • well, it is so, you're right. I guess @stephenmm should explain, which one he needs – Michael Gendin Mar 20 '12 at 21:54
  • I don't care what happens below max depth for my particular issue. That being said, having the refs below "depth" doesn't hurt (memory footprint wise) and allows the structure to look the same, but it would also let the person using the copy modify the original, which could be dangerous! ("I thought I had a copy, not a ref. Why is my orig changing???") So, I think my preference would be to not have the things below max depth, so that if the user tries to access them they get an obvious error and are more likely to figure out how to fix it. – stephenmm Mar 20 '12 at 22:16
  • Yes, and thanks for catching that difference. I think it was good to have the discussion, as I was not explicit in the original question and it led me to think about the problem a bit more. – stephenmm Mar 20 '12 at 23:45
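
The two behaviours debated in these comments can be put side by side in one function. This is only a sketch under some assumptions: it targets Python 3, treats only dicts and lists as nested containers, and uses a hypothetical `keep_refs` flag (not a name from the thread) to choose between sharing and dropping whatever lies past the max depth.

```python
def copy_to_depth(original, depth, keep_refs=False):
    """Copy nested dicts/lists `depth` levels below the top-level container.

    Containers found deeper than that are either shared with the original
    (keep_refs=True) or left out of the copy entirely (keep_refs=False).
    """
    def nested(item):
        return isinstance(item, (dict, list))

    def convert(item):
        if not nested(item):
            return item                  # leaf value: nothing to copy
        if depth > 0:
            return copy_to_depth(item, depth - 1, keep_refs)
        return item                      # past max depth: only reached when keep_refs is True

    def wanted(item):
        return keep_refs or depth > 0 or not nested(item)

    if isinstance(original, dict):
        return {k: convert(v) for k, v in original.items() if wanted(v)}
    return [convert(v) for v in original if wanted(v)]


orig = {'john': 18, 'lvl1': {'john': 18, 'lvl2': {'john': 18}}}

shared = copy_to_depth(orig, 1, keep_refs=True)
shared['lvl1']['john'] = 1           # lvl1 was copied, so orig is untouched
shared['lvl1']['lvl2']['john'] = 2   # lvl2 is shared, so orig changes too

pruned = copy_to_depth(orig, 1)
print('lvl2' in pruned['lvl1'])      # False: lvl2 was dropped from the copy
```

With the default `keep_refs=False`, reading `pruned['lvl1']['lvl2']` raises a `KeyError`, which matches the "obvious error" preference stated above. With `keep_refs=True` the structure looks complete, but writes below the max depth leak into the original.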