111

I have an existing file on disk (say /folder/file.txt) and a FileField model field in Django.

When I do

instance.field = File(open('/folder/file.txt', 'rb'))
instance.save()

it re-saves the file as file_1.txt (the next time it's _2, etc.).

I understand why, but I don't want this behavior - I know the file I want the field to be associated with is really there waiting for me, and I just want Django to point to it.

How?

Guard
  • 1
    Not sure you can get what you want without modifying Django or subclassing `FileField`. Whenever a `FileField` is saved, a new copy of the file is created. It would be fairly straightforward to add an option to avoid this. – Michael Mior Nov 30 '11 at 21:24
  • well yes, looks like I have to subclass and add a param. I don't want to create extra tables for this simple task – Guard Nov 30 '11 at 21:42
  • 1
    Put the file in a different location, create your field with this path, save it and then you have the file in the upload_to destination. – benjaoming Nov 30 '11 at 23:51

7 Answers

156

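Just set instance.field.name to the path of your file, relative to MEDIA_ROOT. Assigning the name does not touch the file on disk, so nothing is copied or renamed.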

e.g.

from django.db import models

class Document(models.Model):
    # get_document_path is the upload_to callable defined elsewhere
    file = models.FileField(upload_to=get_document_path)
    description = models.CharField(max_length=100)


doc = Document()
doc.file.name = 'path/to/file'  # must be relative to MEDIA_ROOT
doc.file
<FieldFile: path/to/file>
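
Since the name must be relative to MEDIA_ROOT, an absolute path such as /folder/file.txt has to be stripped of the root first. A minimal sketch of that step, assuming a hypothetical helper called relative_to_media_root (not part of Django) that receives the root explicitly instead of reading settings.MEDIA_ROOT:

```python
from pathlib import Path


def relative_to_media_root(abs_path, media_root):
    """Return abs_path as a forward-slash path relative to media_root.

    Hypothetical helper, not part of Django. Raises ValueError when the
    file does not live under media_root.
    """
    root = Path(media_root).resolve()
    target = Path(abs_path).resolve()
    # relative_to() raises ValueError if target is outside root
    return target.relative_to(root).as_posix()


example_name = relative_to_media_root('/srv/media/docs/report.txt', '/srv/media')
```

In a project you would then assign the result, e.g. doc.file.name = relative_to_media_root(path, settings.MEDIA_ROOT), before calling doc.save().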
juliomalegria
bara
25

If you want to do this permanently, you need to create your own storage class:

import os
from django.conf import settings
from django.core.files.storage import FileSystemStorage

class MyFileStorage(FileSystemStorage):

    # This method is actually defined in Storage
    def get_available_name(self, name, max_length=None):
        # Note: this deletes the existing file so the new one can reuse its name
        if self.exists(name):
            os.remove(os.path.join(settings.MEDIA_ROOT, name))
        return name  # simply returns the name passed

Now in your model, you use your modified MyFileStorage

from django.db import models

from mystuff.customs import MyFileStorage

mfs = MyFileStorage()

class SomeModel(models.Model):
    my_file = models.FileField(storage=mfs)
Alexander Shpindler
Burhan Khalid
  • 1
    oh, looks promising, because the FileField's code is kinda non-intuitive – Guard Dec 01 '11 at 07:05
  • but... is it possible to change storage on a per-request basis, like: instance.field.storage = mfs; instance.field.save(name, file); but not doing it in a different branch of my code – Guard Dec 01 '11 at 07:07
  • 2
    No, since the storage engine is tied to the model. You can avoid all this by simply storing your file path in either a `FilePathField` or simply as plain text. – Burhan Khalid Dec 01 '11 at 07:09
  • You can't just return a name. You need to remove existing file first. – Alexander Shpindler Jun 11 '18 at 16:27
  • 2
    This solution is only seemingly correct as this solution actually removes the file already present and creates a new one with the same name. In the end, it does NOT "point to it" as the author wrote. Imagine a situation where the user wants to point to a large file but actually ends up removing it and uploading it from scratch. – Static.Mike May 03 '21 at 16:01
24

try this (doc):

instance.field.name = <PATH RELATIVE TO MEDIA_ROOT> 
instance.save()
Kjuly
uNmAnNeR
5

Writing your own storage class is the right approach. However, get_available_name is not the right method to override.

get_available_name is called when Django sees a file with the same name and needs a new, available file name. It is not the method that triggers the rename; that is _save. The comments in _save are pretty good, and you can see that it opens the file for writing with the flag os.O_EXCL, which raises an OSError if a file with the same name already exists. Django catches this error and then calls get_available_name to get a new name.

So I think the correct way is to override _save and call os.open() without the os.O_EXCL flag. The modification is quite simple, but the method is a little long, so I won't paste it here. Tell me if you need more help :)
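
The collision itself can be reproduced with the standard library alone. The following is only a sketch of the OS-level behavior, not Django's _save; the function name create_exclusive and the temporary file are made up for illustration:

```python
import os
import tempfile


def create_exclusive(path, data):
    """Write data to path, failing if the file already exists."""
    # O_CREAT combined with O_EXCL makes os.open() raise FileExistsError
    # (a subclass of OSError) when the path is already taken.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o666)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file.txt")
    create_exclusive(target, b"first")
    try:
        create_exclusive(target, b"second")
        collided = False
    except FileExistsError:
        # This is the point at which a storage would ask for a new name.
        collided = True
    with open(target, "rb") as f:
        kept = f.read()  # the first file is left untouched
```

Dropping os.O_EXCL from the flags removes the error, which is why the override has to happen where the file is opened.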

x1a0
  • it's 50 lines of code that you have to copy, which is pretty bad. Overriding get_available_name seems more isolated, shorter, and much safer for, say, upgrading to newer versions of Django in the future – Michael Gendin Mar 29 '12 at 20:50
  • 2
    The problem of *only* overriding `get_available_name` is when you upload a file with same name, the server will get into an endless loop. Since `_save` checks the file name and decides to get a new one however `get_available_name` still returns the duplicate one. So you need to override both. – x1a0 Mar 31 '12 at 11:15
  • 1
    Oops, we're having this discussion in two questions, but only now I noticed that they are slightly different) So I'm right in that question, and you are in this) – Michael Gendin Mar 31 '12 at 12:58
2

The answers above work fine if you are using the app's filesystem to store your files. But if you are using boto3 and uploading to something like AWS S3, and you want to point your model's FileField at a file that already exists in an S3 bucket, then this is what you need.

We have a simple model class with a FileField:

class Image(models.Model):
    
    img = models.FileField()
    owner = models.ForeignKey(get_user_model(), on_delete=models.CASCADE, related_name='images')

    date_added = models.DateTimeField(editable=False)
    date_modified = models.DateTimeField(editable=True)

The upload code then stores the S3 key on the model:

import os

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client(
    's3',
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY")
)

s3_key = S3_DIR + '/' + filename
bucket_name = os.getenv("AWS_STORAGE_BUCKET_NAME")

try:
    s3.upload_file(local_file_path, bucket_name, s3_key)
    # we want to store it to our db model called **Image** after s3 upload is complete so,
    image_data = Image()
    image_data.img.name = s3_key # this does it !!
    image_data.owner = get_user_model().objects.get(id=owner_id)
    image_data.save()
except ClientError as e:
    print(f"failed uploading to s3 {e}")

Setting the S3 key as the name of the FileField does the trick. As far as I have tested, everything related works as expected, e.g. previewing the image file in the Django admin. Fetching the images from the DB also prepends the root S3 bucket prefix (or the CloudFront CDN prefix) to the S3 keys of the files. Of course, this assumes you already have a working boto and S3 setup in your Django settings.py.

because_im_batman
1

You should define your own storage, inherit it from FileSystemStorage, and override OS_OPEN_FLAGS class attribute and get_available_name() method:

Django Version: 3.1

Project/core/files/storages/backends/local.py

import os

from django.core.files.storage import FileSystemStorage


class OverwriteStorage(FileSystemStorage):
    """
    FileSystemStorage subclass that allows overwriting already existing
    files.
    
    Be careful using this class, as user-uploaded files will overwrite
    already existing files.
    """

    # The combination of flags that doesn't make os.open() raise OSError
    # if the file already exists before it's opened.
    OS_OPEN_FLAGS = os.O_WRONLY | os.O_TRUNC | os.O_CREAT | getattr(os, 'O_BINARY', 0)

    def get_available_name(self, name, max_length=None):
        """
        This method will be called before starting the save process.
        """
        return name

In your model, use your custom OverwriteStorage

myapp/models.py

from django.db import models

from core.files.storages.backends.local import OverwriteStorage


class MyModel(models.Model):
   my_file = models.FileField(storage=OverwriteStorage())
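
To see what that flag combination does at the OS level, here is a small standard-library sketch. It is not the storage class itself; write_overwriting and the temporary file are hypothetical names used only for the demonstration:

```python
import os
import tempfile

# Same combination as OS_OPEN_FLAGS above: no O_EXCL, plus O_TRUNC.
OVERWRITE_FLAGS = os.O_WRONLY | os.O_TRUNC | os.O_CREAT | getattr(os, "O_BINARY", 0)


def write_overwriting(path, data):
    """Write data to path, replacing any existing content."""
    fd = os.open(path, OVERWRITE_FLAGS, 0o666)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file.txt")
    write_overwriting(target, b"a long first version")
    # The second call succeeds even though the file exists, and O_TRUNC
    # discards the old bytes before the new ones are written.
    write_overwriting(target, b"short")
    with open(target, "rb") as f:
        final_content = f.read()
```

Note that this still rewrites the file on every save; it prevents the rename, but it does not leave an existing file untouched.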
Sultan
1

I had exactly the same problem! Then I realized that my models were causing it. For example, I had my models like this:

class Tile(models.Model):
  image = models.ImageField()

Then I wanted to have more than one tile referencing the same file on disk! The way I found to solve that was to change my model structure to this:

class Tile(models.Model):
  image = models.ForeignKey('TileImage', on_delete=models.CASCADE)

class TileImage(models.Model):
  image = models.ImageField()

After that I realized it makes more sense: if I want the same file to be referenced more than once in my DB, I have to create another table for it!

I guess you can solve your problem like that too, assuming you can change the models!

EDIT

Also I guess you can use a different storage, like this for instance: SymlinkOrCopyStorage

http://code.welldev.org/django-storages/src/11bef0c2a410/storages/backends/symlinkorcopy.py

Arthur Neves
  • makes sense in your case, not in mine. I don't want it to be referenced multiple times. I create an object referencing a file, then I realize there are errors in other attrs, and I reopen the creation form. On its resubmission I don't want to lose the file which is already saved on the disk – Guard Nov 30 '11 at 21:05
  • so I guess you can use my approach! You will have a table FormFile which holds only the file, and then in your Form table you'll have an FK to that file, so you can change/create new forms for the same file! (btw I am changing the order of the FK in my main example) – Arthur Neves Nov 30 '11 at 21:16
  • If you want to post your domain (models) in your post, I can have a better idea too! – Arthur Neves Nov 30 '11 at 21:17
  • the domain actually doesn't matter - I have a model with a photo associated with it, and I have a custom editing screen. Once uploaded I want the photo to remain on the server, but I don't actually like spawning a separate model, table and FK lookup just because there looks to be a framework limitation – Guard Nov 30 '11 at 21:22
  • The limitation here, I guess, is that when you save a FileField in Django, it always passes through Django storages, so it won't make sense to just force a file path. Also, how should Django know that the file already exists at the path? Another approach is to use a FilePathField instead, so you can just set the path in your DB and do the lookup the way you think is best! – Arthur Neves Nov 30 '11 at 21:28
  • I guess I found a django-storage that could help you achieve what you want, check the EDIT in my post! – Arthur Neves Nov 30 '11 at 21:30
  • thanks for your effort, but it looks like it's really complicating things. I should probably just subclass the ImageField (the one I actually use) and give it an option to force it not to re-save the file – Guard Nov 30 '11 at 21:42
  • The URL pointing to `SymLinkOrCopy` is broken. Use [this](https://github.com/e-loue/django-storages/blob/master/storages/backends/symlinkorcopy.py) instead. Also, note that there are two `django-storages` repositories on github: 1. [jschneier/django-storages](https://github.com/jschneier/django-storages), 2. [e-loue/django-storages](https://github.com/e-loue/django-storages) – Shadi Nov 09 '17 at 05:54