I mirrored a batch of videos from EuroPython 2014 to archive.org using the master branch of ia-wrapper. As discussed in #64, metadata from a previous upload shows up in a subsequent upload.

I went through and hand-edited the descriptions in the archive.org interface (only a few of the videos were affected), but I'd like this not to happen the next time I mirror a conference. I have a workaround (explicitly setting headers when calling upload), but I'd really like to know how the headers dict is still populated from previous calls.

When I run this, item.py L579 is not passing headers in kwargs when it calls upload_file (I even stepped through it using PyCharm's debugger).

What the heck is going on?

If you want to try this out, the code below demonstrates it.

pip install -e git+https://github.com/jjjake/ia-wrapper.git@9b7b951cfb0e9266f329c9fa5a2c468a92db75f7#egg=internetarchive-master

#! /usr/bin/env python
# -*- coding: utf-8 -*-
import datetime
import internetarchive as ia
import os
from tempfile import NamedTemporaryFile


ACCESS_KEY = os.environ.get('IAS3_ACCESS_KEY') 
SECRET_KEY = os.environ.get('IAS3_SECRET_KEY')

now = datetime.datetime.utcnow().strftime('%Y_%m_%d_%H%M%S')

item = ia.Item('test_upload_iawrapper_first_%s' % now)
item2 = ia.Item('test_upload_iawrapper_second_%s' % now)

def upload(item, metadata):
    with NamedTemporaryFile() as fh:
        fh.write('testing archive_uploader')
        # flush so the bytes are on disk before upload reads the file by name
        fh.flush()
        item.upload(fh.name,
            metadata=metadata,
            access_key=ACCESS_KEY,
            secret_key=SECRET_KEY,
            # adding headers={} is a workaround
        )

upload(item,
       metadata={
           'collection': 'test_collection',
           'description': 'not an empty description',
        })

upload(item2,
       metadata={
           'collection': 'test_collection',
           # you can also comment out description and get the same result
           'description': '',
        })

print 'visit https://archive.org/details/{}'.format(item.identifier)
print 'visit https://archive.org/details/{}'.format(item2.identifier)
skm

1 Answer

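You've tripped over the "mutable defaults" gotcha in Python (see the question "Least Astonishment" and the Mutable Default Argument). A default value such as headers={} is evaluated once, when the function is defined, so every call that omits headers gets the same dict object. Anything one call writes into it is still there on the next call, which is why metadata from the first upload leaks into the second.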

Change this:

def upload_file(self, body, headers={}, ...):

to this:

def upload_file(self, body, headers=None, ...):
    if headers is None:
        headers = {}
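
To see the effect in isolation, here is a minimal sketch that does not touch ia-wrapper at all. The function names upload_buggy and upload_fixed and the 'meta-' header prefix are made up for illustration; they only mimic the pattern of copying non-empty metadata into a headers dict.

```python
def upload_buggy(metadata, headers={}):
    # The {} default is created once, at definition time, and is
    # shared by every call that does not pass headers explicitly.
    for key, value in metadata.items():
        if value:
            headers['meta-' + key] = value  # hypothetical header name
    return headers


def upload_fixed(metadata, headers=None):
    # A fresh dict is created on each call when none is supplied.
    if headers is None:
        headers = {}
    for key, value in metadata.items():
        if value:
            headers['meta-' + key] = value  # hypothetical header name
    return headers


# Copy the results so later calls cannot change what we compare.
buggy_first = dict(upload_buggy({'description': 'not an empty description'}))
buggy_second = dict(upload_buggy({'description': ''}))

fixed_first = dict(upload_fixed({'description': 'not an empty description'}))
fixed_second = dict(upload_fixed({'description': ''}))

# The second buggy call inherits the first call's description.
print(buggy_second == buggy_first)
# The second fixed call starts from an empty dict.
print(fixed_second == {})
```

Passing headers={} at the call site, as in the question's workaround, avoids the problem for the same reason: the shared default is never used.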
Ned Batchelder