I am writing a management command (run via manage.py importfiles) to import a given directory structure from the real file system into my self-written file storage in Django.
def _handle_directory(self, directory_path, directory):
    for root, subFolders, files in os.walk(directory_path):
        for filename in files:
            path = os.path.join(root, filename)
            with open(path, 'r') as f:
                file_wrapper = FileWrapper(f)
                self.cnt_files += 1
                new_file = File(directory=directory, filename=filename,
                                file=file_wrapper, uploader=self.uploader)
                new_file.save()
The full model can be found on GitHub, and the full command is currently available on gist.github.com.
If you do not want to check the model: the attribute file of my File class is a FileField.
Copying the files seems to work, thanks to pajton. Nevertheless, I now get a new exception. I think there is a problem with the SQLite encoding, but I do not know how to fix it. The value of sys.getfilesystemencoding() is mbcs.
Traceback (most recent call last):
File ".\manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line 399, in execute_from_command_line
utility.execute()
File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line 392, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "C:\Python27\lib\site-packages\django\core\management\base.py", line 242, in run_from_argv
self.execute(*args, **options.__dict__)
File "C:\Python27\lib\site-packages\django\core\management\base.py", line 285, in execute
output = self.handle(*args, **options)
File "D:\Development\github\Palco\engine\filestorage\management\commands\importfiles.py", line 63, in handle
self._handle_directory(args[0], root)
File "D:\Development\github\Palco\engine\filestorage\management\commands\importfiles.py", line 75, in _handle_directory
new_file.save()
File "D:\Development\github\Palco\engine\filestorage\models.py", line 155, in save
super(File, self).save(*args, **kwargs)
File "C:\Python27\lib\site-packages\django\db\models\base.py", line 545, in save
force_update=force_update, update_fields=update_fields)
File "C:\Python27\lib\site-packages\django\db\models\base.py", line 573, in save_base
updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
File "C:\Python27\lib\site-packages\django\db\models\base.py", line 635, in _save_table
forced_update)
File "C:\Python27\lib\site-packages\django\db\models\base.py", line 679, in _do_update
return filtered._update(values) > 0
File "C:\Python27\lib\site-packages\django\db\models\query.py", line 507, in _update
return query.get_compiler(self.db).execute_sql(None)
File "C:\Python27\lib\site-packages\django\db\models\sql\compiler.py", line 976, in execute_sql
cursor = super(SQLUpdateCompiler, self).execute_sql(result_type)
File "C:\Python27\lib\site-packages\django\db\models\sql\compiler.py", line 782, in execute_sql
cursor.execute(sql, params)
File "C:\Python27\lib\site-packages\django\db\backends\util.py", line 69, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "C:\Python27\lib\site-packages\django\db\backends\util.py", line 53, in execute
return self.cursor.execute(sql, params)
File "C:\Python27\lib\site-packages\django\db\utils.py", line 99, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "C:\Python27\lib\site-packages\django\db\backends\util.py", line 53, in execute
return self.cursor.execute(sql, params)
File "C:\Python27\lib\site-packages\django\db\backends\sqlite3\base.py", line 450, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
I changed filename in several ways, but it always fails. I also tried values like 'foo' or u'foo', as well as different combinations of .encode(), .decode() and unidecode. :(
I am pretty sure the problem is with the filename. I printed the current value of filename, and the exception occurs whenever the filename contains non-ASCII characters.
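For reference, this is a minimal sketch of what "making sure filename is a Unicode string" could look like before the value is handed to the model. The helper name to_text is hypothetical, and falling back to UTF-8 when no file system encoding is reported is an assumption, not something taken from my actual command.

```python
import sys


def to_text(name, encoding=None):
    """Return name as a text (Unicode) string.

    Byte strings are decoded with the given encoding, or with the file
    system encoding if none is given. Text strings are returned unchanged.
    """
    if isinstance(name, bytes):
        # Assumption: fall back to UTF-8 if the platform reports no encoding.
        encoding = encoding or sys.getfilesystemencoding() or "utf-8"
        return name.decode(encoding, "replace")
    return name
```

With a helper like this, both the bare filename and the full path would be converted the same way before being assigned to a model field.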
Update 1: I followed pajton's advice and logged the SQL queries. This is the result (the first line is the output of print filename). D:\temp\prak-gdv-abgabe is the argument I pass to the command.
Eigene L÷sung.pdf
(0.000) QUERY = u'BEGIN' - PARAMS = (); args=None
(0.000) QUERY = u'INSERT INTO "filestorage_file" ("directory_id", "filename", "file", "size", "content_type", "uploader_id", "datetime", "sha512") VALUES (%s, %s, %s, %s, %s, %s, %s, %s)' - PARAMS = (164, u'Eigene L\xf6sung.pdf', u'filestorage/5/5b/5bf32077-5531-4de0-95a7-d2ea3e10a17d.pdf', None, None, 8, u'2014-02-26 23:21:17.735000', None); args=[164, 'Eigene L\xc3\xb6sung.pdf', u'filestorage/5/5b/5bf32077-5531-4de0-95a7-d2ea3e10a17d.pdf', None, None, 8, u'2014-02-26 23:21:17.735000', None]
(0.000) QUERY = u'BEGIN' - PARAMS = (); args=None
(0.000) QUERY = u'UPDATE "filestorage_file" SET "directory_id" = %s, "filename" = %s, "file" = %s, "size" = NULL, "content_type" = %s, "uploader_id" = %s, "datetime" = %s, "sha512" = NULL WHERE "filestorage_file"."id" = %s ' - PARAMS = (164, u'D:\\Temp\\prak-gdv-abgabe\\Protokoll\\Eigene L\ufffdsung.pdf', u'filestorage/5/5b/5bf32077-5531-4de0-95a7-d2ea3e10a17d.pdf', u'application/pdf', 8, u'2014-02-26 23:21:17.735000', 156); args=(164, 'D:\\Temp\\prak-gdv-abgabe\\Protokoll\\Eigene L\xf6sung.pdf', u'filestorage/5/5b/5bf32077-5531-4de0-95a7-d2ea3e10a17d.pdf', 'application/pdf', 8, u'2014-02-26 23:21:17.735000', 156)
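The byte sequences in this log can be reproduced in isolation. The sketch below only illustrates the encodings involved. It assumes that mbcs on this machine corresponds to cp1252 and that the console uses cp850; neither is confirmed for my setup.

```python
# The filename as a Unicode string, as it appears in PARAMS of the INSERT.
name = u"Eigene L\xf6sung.pdf"

# UTF-8 encoding: the umlaut becomes the two bytes \xc3\xb6 (INSERT args).
utf8_bytes = name.encode("utf-8")

# cp1252 encoding (assumed to match mbcs here): the single byte \xf6 (UPDATE args).
cp1252_bytes = name.encode("cp1252")

# Decoding the cp1252 bytes as UTF-8 cannot succeed for \xf6. With the
# "replace" error handler it turns into U+FFFD, as seen in the UPDATE PARAMS.
misdecoded = cp1252_bytes.decode("utf-8", "replace")

# Interpreting the cp1252 bytes as cp850 (assumed console code page)
# turns \xf6 into a division sign, as in the printed filename.
console_view = cp1252_bytes.decode("cp850")

print(repr(utf8_bytes))
print(repr(cp1252_bytes))
print(repr(misdecoded))
print(repr(console_view))
```

If these assumptions hold, the UPDATE receives a byte string in the file system encoding rather than a Unicode string, which would match the error message.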
Update 2: (2014-02-27 11:10 UTC)
I verified with PRAGMA encoding; that my SQLite database uses UTF-8.
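The same check can be done from Python with the standard sqlite3 module. This is only a sketch against an in-memory database with a hypothetical table named files, not my real filestorage_file table; it shows that a Unicode filename round-trips unchanged when it is passed as a Unicode string.

```python
import sqlite3

# In-memory database, so nothing on disk is touched.
conn = sqlite3.connect(":memory:")

# Ask SQLite which text encoding the database uses.
encoding = conn.execute("PRAGMA encoding;").fetchone()[0]

# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE files (filename TEXT)")
conn.execute("INSERT INTO files (filename) VALUES (?)",
             (u"Eigene L\xf6sung.pdf",))

# Read the value back; it comes back as a Unicode string.
stored = conn.execute("SELECT filename FROM files").fetchone()[0]
conn.close()

print(encoding)
print(repr(stored))
```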
I checked the records of my database.
Id | filename | sha512 | size
1 | D:\Temp\prak-gdv-abgabe\Liesmich.html | ffeb8c3d5 | 5927
2 | D:\Temp\prak-gdv-abgabe\Liesmich.md | d206d241f | 407
3 | D:\Temp\prak-gdv-abgabe\Liesmich.txt | d206d241f | 407
4 | D:\Temp\prak-gdv-abgabe\Linux\GDV_Praktikum.bin | 5fc5749ee | 166925
5 | Eigene Lösung.pdf | |
It is very interesting that the failing entry (id 5) has the expected filename but neither the sha512 nor the size value set, while the other entries have the expected sha512 and size values but not the expected filename (they contain the full path instead). It seems that the custom save() method of my File class is part of the problem, but I don't understand why these strange things happen.