I host a small file server where users from all around the world can upload documents.

Because of encoding problems (see my other question), I am wondering whether I should prevent users from uploading (and later downloading) files whose names cannot be represented in the CP1252 charset.

Or, put the other way round: does it make sense to allow users to upload documents with Arabic or Chinese characters in their filenames?

PS: Users download the same file some time later, and it should then have the same filename it was uploaded with.

Niko

1 Answer

You should store the files on disk under a randomly generated name, or under a name based on a hash of the file contents (which also helps deduplicate storage). Save the original file name as metadata in a database, together with all other metadata about the file (who uploaded it and so on). Then serve the file through a PHP script that sets the original file name from the database in an HTTP header. This way you don't need to worry about:

  • file name sanitisation or clashing file names
  • file system encoding issues
  • storage duplication (if using a hash)
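
The storage side of this could look roughly like the sketch below. It is only an illustration under assumptions: the `files` table, its `hash` and `original_name` columns, and the function names `storageNameFor` and `storeUpload` are all hypothetical, and the database connection is assumed to use a UTF-8 charset so that Arabic or Chinese names survive unchanged.

```php
<?php
declare(strict_types=1);

/**
 * Derive an ASCII-only on-disk name from the file contents.
 * Identical contents give the same name, which is what enables deduplication.
 */
function storageNameFor(string $sourcePath): string
{
    $hash = hash_file('sha256', $sourcePath);
    if ($hash === false) {
        throw new RuntimeException("Cannot hash $sourcePath");
    }
    return $hash; // 64 lowercase hex characters
}

/**
 * Store a file under its hash and remember the original name as metadata.
 * Assumes a hypothetical table: files(hash TEXT, original_name TEXT).
 */
function storeUpload(PDO $db, string $sourcePath, string $originalName, string $storageDir): string
{
    $hash   = storageNameFor($sourcePath);
    $target = rtrim($storageDir, '/') . '/' . $hash;

    // Only write the bytes once; a second upload of the same contents reuses them.
    // For a real HTTP upload, move_uploaded_file() would replace copy() here.
    if (!file_exists($target) && !copy($sourcePath, $target)) {
        throw new RuntimeException("Cannot store $sourcePath");
    }

    // The original name never touches the file system, only the database.
    $stmt = $db->prepare('INSERT INTO files (hash, original_name) VALUES (:hash, :name)');
    $stmt->execute([':hash' => $hash, ':name' => $originalName]);

    return $hash;
}
```

With an upload field named, say, `doc`, the call would be along the lines of `storeUpload($db, $_FILES['doc']['tmp_name'], basename($_FILES['doc']['name']), '/path/to/storage')`; the field name and storage path are placeholders.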
deceze
  • Thanks for these hints. One thing must work, though: accessing the file directly by link. So if a user gets a URL to the file, does the PHP script connect to the DB every time? Doesn't this take too long when they have, e.g., 10 files to open? Do you have a ready-made PHP script for that? – Niko Apr 15 '14 at 14:03
  • Database access should in no way be a limiting factor here if done decently. See http://php.net/readfile. Also see http://stackoverflow.com/a/20563773/476 if you want pretty links. Also ideally see https://tn123.org/mod_xsendfile/. – deceze Apr 15 '14 at 15:30
  • Hmm, do you have a reference script for that? – Niko Apr 17 '14 at 09:10
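
For reference, the download side described in the answer and comments might be sketched as follows. This is not a script from the answer's author, just an illustration under assumptions: the `files` table, its columns, and the function names `contentDispositionFor` and `serveFile` are hypothetical. The original name is sent twice in the `Content-Disposition` header: once as an ASCII-only fallback and once in the `filename*` form with percent-encoded UTF-8, which is how current browsers receive non-ASCII names.

```php
<?php
declare(strict_types=1);

/**
 * Build a Content-Disposition value that carries the original (UTF-8) name.
 */
function contentDispositionFor(string $originalName): string
{
    // Fallback for old clients: every byte outside printable ASCII,
    // plus double quotes and backslashes, becomes an underscore.
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/', '_', $originalName) ?? 'download';

    // filename* parameter: charset, empty language tag, percent-encoded UTF-8.
    $encoded = rawurlencode($originalName);

    return "attachment; filename=\"$fallback\"; filename*=UTF-8''$encoded";
}

/**
 * Look up the original name for a stored hash and stream the file.
 * Assumes the hypothetical table files(hash, original_name).
 */
function serveFile(PDO $db, string $hash, string $storageDir): void
{
    // Accept only well-formed hashes so the value can never escape $storageDir.
    if (preg_match('/^[0-9a-f]{64}$/', $hash) !== 1) {
        http_response_code(404);
        return;
    }

    $stmt = $db->prepare('SELECT original_name FROM files WHERE hash = :hash');
    $stmt->execute([':hash' => $hash]);
    $name = $stmt->fetchColumn();
    $path = rtrim($storageDir, '/') . '/' . $hash;

    if ($name === false || !is_file($path)) {
        http_response_code(404);
        return;
    }

    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($path));
    header('Content-Disposition: ' . contentDispositionFor((string) $name));
    readfile($path);
}
```

Each download costs one indexed lookup by hash, which should be negligible even when a user opens ten files in a row. If the web server supports it, the `readfile()` call could be replaced by an `X-Sendfile` header so the server streams the file itself, as the mod_xsendfile link above suggests.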