I mirrored a batch of videos from EuroPython 2014 to archive.org using the master branch of ia-wrapper. As discussed in #64, metadata from a previous upload shows up in the subsequent upload.
I went through and hand-edited the descriptions in the archive.org interface (it was just a few of the videos), but I'd like this not to happen the next time I mirror a conference. I have a workaround (explicitly passing headers when calling upload), but I'd really, really like to know how the headers dict is still populated from previous calls.
When I run this, item.py L579 does not pass headers in kwargs when it calls upload_file (I even stepped through it with PyCharm's debugger).
What the heck is going on?
If you want to try this out, the code below demonstrates it.
pip install -e git+https://github.com/jjjake/ia-wrapper.git@9b7b951cfb0e9266f329c9fa5a2c468a92db75f7#egg=internetarchive-master
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import datetime
import internetarchive as ia
import os
from tempfile import NamedTemporaryFile
ACCESS_KEY = os.environ.get('IAS3_ACCESS_KEY')
SECRET_KEY = os.environ.get('IAS3_SECRET_KEY')
now = datetime.datetime.utcnow().strftime('%Y_%m_%d_%H%M%S')
item = ia.Item('test_upload_iawrapper_first_%s' % now)
item2 = ia.Item('test_upload_iawrapper_second_%s' % now)
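def upload(item, metadata):
    # Upload a small throwaway file so each test item has some content.
    with NamedTemporaryFile() as fh:
        fh.write('testing archive_uploader')
        # Flush so the bytes are on disk before the file is read by name.
        fh.flush()
        item.upload(fh.name,
                    metadata=metadata,
                    access_key=ACCESS_KEY,
                    secret_key=SECRET_KEY,
                    # adding headers={} is a workaround
                    )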
upload(item,
       metadata={
           'collection': 'test_collection',
           'description': 'not an empty description',
       })
upload(item2,
       metadata={
           'collection': 'test_collection',
           # you can also comment out description and get the same result
           'description': '',
       })
print 'visit https://archive.org/details/{}'.format(item.identifier)
print 'visit https://archive.org/details/{}'.format(item2.identifier)
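
My best guess at the cause (an assumption on my part, not something I have confirmed in the ia-wrapper source) is Python's mutable default argument behavior. A default such as headers={} is created once, when the function is defined, and is then shared by every call that does not pass headers. That would also explain why passing headers={} explicitly works around it. The sketch below uses made-up function names and is not ia-wrapper's code. It assumes that empty metadata values are skipped and that metadata is sent as x-archive-meta-* headers.

```python
def upload_file_buggy(metadata, headers={}):
    """Hypothetical stand-in for an upload function; not ia-wrapper's code."""
    # The default dict is created once, at definition time, so every call
    # that omits `headers` mutates the same shared object.
    for key, value in metadata.items():
        if value:  # assumption: empty values are not sent
            headers['x-archive-meta-' + key] = value
    return dict(headers)  # snapshot of what would be sent


def upload_file_fixed(metadata, headers=None):
    """Same sketch with the usual fix: build a fresh dict on each call."""
    headers = {} if headers is None else headers
    for key, value in metadata.items():
        if value:
            headers['x-archive-meta-' + key] = value
    return dict(headers)


# Two uploads in a row, mirroring the script above.
first_buggy = upload_file_buggy({'description': 'not an empty description'})
second_buggy = upload_file_buggy({'description': ''})

first_fixed = upload_file_fixed({'description': 'not an empty description'})
second_fixed = upload_file_fixed({'description': ''})

# Passing headers explicitly sidesteps the shared default.
workaround = upload_file_buggy({'description': ''}, headers={})
```

Here second_buggy still carries the description from the first call, while second_fixed and workaround come back empty. If something like this is what is happening, switching the default to None and creating the dict inside the function should fix it.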