
I have a series of functions that pass a dict down a call chain, and on each request the dict should start out as {}. However, I'm finding that on successive requests to different views the dict retains data from earlier calls, which causes problems later on.

Here is the relevant code:

def picasa_sync_friend(user_profile, friend_id, force_update=False):
    logger.warn('picasa_sync_friend')
    data, error = picasa_query_by_profile('albums', user_profile, replace=(friend_id))
    ....

def picasa_sync_albums(user_profile, friend_id="default"):
    logger.warn('picasa_sync_albums')
    data, error = picasa_query_by_profile('albums', user_profile, replace=(friend_id))
    ...

def picasa_sync_pictures(user_profile, album_id, force_update=False, full_update=False):
    friend_id = picasa_get_album(user_profile, album_id).friend.foreign_reference
    subject = 'photos' if full_update else 'thumbs'
    logger.warn('picasa_sync_pictures')
    data, error = picasa_query_by_profile(subject, user_profile, replace=(friend_id, album_id))
    ...

def picasa_query_by_profile(subject, user_profile, args={}, replace=(), format='xml'):
    logger.warn('picasa_query_by_profile: %s' % args)
    access_token = picasa_get_token(user_profile).access_token
    response = picasa_query(subject, access_token, args=args, replace=replace)
    ...

def picasa_query(subject, access_token='', args={}, replace=()):
    logger.warn('picasa_query: %s' % args)
    url, request_args, method = picasa_query_params(subject, access_token=access_token, args=args, replace=replace)
    ...

def picasa_query_params(subject, access_token='', args={}, replace=()):
    method = 'GET'
    base_url = 'https://accounts.google.com/o/oauth2/'
    logger.warn('picasa_query_params (before): %s' % args)
    if subject == 'albums':
        args['access_token'] = access_token
        subject = ''
        base_url = 'https://picasaweb.google.com/data/feed/api/user/%s' % replace
        args['kind'] = 'album'
        args['v'] = '2.0'
        args['fields'] = 'entry(title,gphoto:id,gphoto:numphotos,published)'
    elif subject == 'photos':
        args['access_token'] = access_token
        base_url = 'https://picasaweb.google.com/data/feed/api/user/%s/albumid/%s' % replace
        args['v'] = '2.0'
        args['kind'] = 'photo'
        args['fields'] = 'entry(gphoto:id,content(@src),gphoto:width,gphoto:height)'
        subject = ''
    elif subject == 'thumbs':
        args['access_token'] = access_token
        base_url = 'https://picasaweb.google.com/data/feed/api/user/%s/albumid/%s' % replace
        args['v'] = '2.0'
        args['kind'] = 'photo'
        args['fields'] = 'entry(gphoto:id,content(@src),gphoto:width,gphoto:height)'
        args['max-results'] = ALBUM_THUMBNAIL_LIMIT
        subject = ''
    logger.warn('picasa_query_params (after ): %s' % args)
    url = '%s%s' % (base_url, subject)
    return url, args, method

The problem shows up as follows: I have a view that calls picasa_sync_friend and then picasa_sync_albums, and returns all album data to the client.

Then, for each album, the client makes a separate request that calls picasa_sync_pictures with that album_id.

Logger output for the initial friend/albums request looks like the following:

WARNING 2013-01-26 07:27:23,611 picasa_sync_friend
WARNING 2013-01-26 07:27:23,617 picasa_query_by_profile: 
    {}
WARNING 2013-01-26 07:27:23,633 picasa_query:
    {}
WARNING 2013-01-26 07:27:23,633 picasa_query_params (albums:before):
    {}
WARNING 2013-01-26 07:27:23,633 picasa_query_params (:after ):
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}

WARNING 2013-01-26 07:27:24,388 picasa_sync_albums
WARNING 2013-01-26 07:27:24,388 picasa_query_by_profile: 
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}
WARNING 2013-01-26 07:27:24,389 picasa_query: 
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}
WARNING 2013-01-26 07:27:24,389 picasa_query_params (before): 
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}
WARNING 2013-01-26 07:27:24,389 picasa_query_params (after ):
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}

Notice that in picasa_sync_albums -> picasa_query_by_profile, the initial args dict is already populated, even though picasa_sync_albums does not pass anything for the args keyword argument.

The above logger output concludes the friend/albums request, and the next thing to come in is the pictures list for an individual album that goes straight to picasa_sync_pictures:

WARNING 2013-01-26 07:27:25,981 picasa_sync_pictures
WARNING 2013-01-26 07:27:25,998 picasa_query_by_profile:
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}
WARNING 2013-01-26 07:27:26,011 picasa_query:
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}
WARNING 2013-01-26 07:27:26,020 picasa_query_params (before):
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'v': '2.0'}
WARNING 2013-01-26 07:27:26,022 picasa_query_params (after ):
    {'access_token': u'xxx', 'fields': 'entry(gphoto:id,content(@src),gphoto:width,gphoto:height)', 'kind': 'photo', 'max-results': 4, 'v': '2.0'}

Notice that in picasa_query_by_profile, args already contains 'kind': 'album' despite this supposedly being a new request.

If I then refresh the page, calling friend/albums again, I get the following log output:

WARNING 2013-01-26 07:45:32,589 picasa_sync_friend
WARNING 2013-01-26 07:45:32,593 picasa_query_by_profile:
    {'access_token': u'xxx', 'fields': 'entry(gphoto:id,content(@src),gphoto:width,gphoto:height)', 'kind': 'photo', 'max-results': 4, 'v': '2.0'}
WARNING 2013-01-26 07:45:32,597 picasa_query:
    {'access_token': u'xxx', 'fields': 'entry(gphoto:id,content(@src),gphoto:width,gphoto:height)', 'kind': 'photo', 'max-results': 4, 'v': '2.0'}
WARNING 2013-01-26 07:45:32,598 picasa_query_params (before): 
    {'access_token': u'xxx', 'fields': 'entry(gphoto:id,content(@src),gphoto:width,gphoto:height)', 'kind': 'photo', 'max-results': 4, 'v': '2.0'}
WARNING 2013-01-26 07:45:32,600 picasa_query_params (after ):
    {'access_token': u'xxx', 'fields': 'entry(title,gphoto:id,gphoto:numphotos,published)', 'kind': 'album', 'max-results': 4, 'v': '2.0'}

This is where it starts to really affect the application, as it's applying things like max-results to album listings, which is undesirable.

I can work around all of this by explicitly removing dict items that shouldn't apply to a given subject. However, that makes the code hard to maintain, and the plan is to extend it to a number of different subjects, so it needs to be flexible and hard to derail.

I'm fairly sure I'm missing some fundamental part of Python/Django here, but I'm at a loss to explain the above behavior. Thanks for any advice.

DanH

2 Answers


You shouldn't use mutable values as default arguments to Python functions. Consider the following:

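def f(arg={}):
    # print(arg) behaves the same on Python 2 and Python 3 here
    print(arg)
    if 'count' in arg:
        arg['count'] += 1
    else:
        arg['count'] = 1

# Three calls, none of which passes an argument
f()
f()
f()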

This, surprisingly, prints out

{}
{'count': 1}
{'count': 2}

What happens is that the {} in f(arg={}) is evaluated once, when the def statement is executed, and that same dictionary is used every time f() is called without an argument. As a result, any changes to arg persist across calls.
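
As a small sketch of the same point: the default dict is stored on the function object itself, so you can watch it change between calls. The function is named f to match the example above, and the snippet is written to run on Python 3.

```python
def f(arg={}):
    # Mutates whichever dict it was given; with no argument, that is the shared default.
    arg['count'] = arg.get('count', 0) + 1
    return arg

# The default dict is created once, when the def statement runs,
# and is stored on the function object.
default_before = f.__defaults__[0]

first = f()
second = f()

# Both calls returned the very same dict object: the stored default.
same_object = first is second is default_before
print(same_object)     # True
print(f.__defaults__)  # ({'count': 2},)
```

Because first, second, and the stored default are one object, there is no "fresh" {} for later calls to start from.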

One way to fix the above code is:

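def f(arg=None):
    # Test for None explicitly: "if not arg" would also replace an empty
    # dict passed in by the caller, hiding the updates from that caller.
    if arg is None:
        arg = {}
    print(arg)
    if 'count' in arg:
        arg['count'] += 1
    else:
        arg['count'] = 1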

The following functions in your code are affected by this problem:

def picasa_query_by_profile(subject, user_profile, args={}, replace=(), format='xml'):
def picasa_query(subject, access_token='', args={}, replace=()):
def picasa_query_params(subject, access_token='', args={}, replace=()):
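
A stripped-down sketch of why the data leaks across those three functions: the outermost function's default dict is handed down the chain and mutated at the bottom. The names outer, middle, and inner are hypothetical stand-ins for picasa_query_by_profile, picasa_query, and picasa_query_params, and the logic is reduced to the bare minimum.

```python
def inner(subject, args={}):
    # Stand-in for picasa_query_params: writes into whatever dict it receives.
    args['kind'] = subject
    if subject == 'thumbs':
        args['max-results'] = 4
    return args

def middle(subject, args={}):
    # Stand-in for picasa_query: forwards its dict unchanged.
    return inner(subject, args=args)

def outer(subject, args={}):
    # Stand-in for picasa_query_by_profile: when called without args,
    # this function's single default dict is what travels down the chain.
    seen_on_entry = dict(args)  # snapshot for inspection only
    middle(subject, args=args)
    return seen_on_entry

first = outer('album')
second = outer('thumbs')
third = outer('album')

print(first)   # {}
print(second)  # {'kind': 'album'}
print(third)   # {'kind': 'thumbs', 'max-results': 4}
```

The third call starts with a stale max-results entry, which mirrors the album listing in your logs picking up max-results from an earlier thumbs request.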
NPE

Don't use mutables as default values. Change all cases of code like this:

def picasa_query(subject, access_token='', args={}, replace=()):
    ...

To this:

def picasa_query(subject, access_token='', args=None, replace=()):
    if args is None:
        args = {}
    ...

The default values of a function are evaluated only once, when the function object is created (that is, when the def statement runs), and the same object is used as the default from then on. This is one of the most common Python gotchas.
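
If callers may also pass their own dict, a further option (a sketch, not something the code above requires) is to copy the incoming dict before adding keys, so that neither a default nor a caller's dict is ever modified. Here build_params is a hypothetical, simplified stand-in for picasa_query_params, the 'alt' key is made up for illustration, and the limit of 4 is assumed from the log output in the question.

```python
def build_params(subject, access_token='', args=None):
    # Copy so that neither the caller's dict nor any shared object is mutated.
    params = dict(args) if args is not None else {}
    params['access_token'] = access_token
    params['v'] = '2.0'
    if subject == 'albums':
        params['kind'] = 'album'
    elif subject == 'thumbs':
        params['kind'] = 'photo'
        params['max-results'] = 4  # assumed value, as seen in the logs
    return params

caller_args = {'alt': 'json'}  # hypothetical extra argument from a caller
thumbs = build_params('thumbs', 'xxx', caller_args)
albums = build_params('albums', 'xxx')

print(caller_args)              # {'alt': 'json'}
print('max-results' in albums)  # False
```

With this shape, each subject builds its own result dict, so adding a new subject cannot contaminate the parameters of another.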

Lauritz V. Thaulow