Validate the image client-side by checking the file's extension. If the extension is valid, send it to the servlet for backend validation.
This may fail for non-Windows users, whose filetypes are not necessarily determined by an extension on the filename.
It can be useful to add a JS warning to say "this file doesn't end in .png/.gif/.jpeg/.jpg - are you sure it is an image?", but it's generally not a good idea to disallow an upload based on extension.
Validate the mime type of the image and make sure it is either image/jpeg, image/gif, or image/png.
Again, there are some problems here. On Windows, the MIME type is retrieved from registry associations, which are variable and not always correct. For example, IE commonly sends JPEGs as image/pjpeg, and Citrix users may find they get uploaded as image/x-citrix-pjpeg.
Since the media type is typically unused by upload scripts, there's little point reading/checking it. For the types here, I'd say your best bet would be to ignore the filename and MIME type; use only the magic number sniffing to determine format.
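As a rough illustration of the sniffing approach, here is a minimal sketch that compares the first bytes of the upload against the well-known PNG, JPEG and GIF signatures. The class and method names (ImageSniffer, sniff) are made up for this example, and it assumes you have already read at least the first eight bytes of the upload into an array.

```java
final class ImageSniffer {
    // Standard file signatures for the three formats discussed here.
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] GIF87 = {'G', 'I', 'F', '8', '7', 'a'};
    private static final byte[] GIF89 = {'G', 'I', 'F', '8', '9', 'a'};

    /** Returns "png", "jpeg" or "gif", or null if the header matches none of them. */
    static String sniff(byte[] header) {
        if (startsWith(header, PNG)) return "png";
        if (startsWith(header, JPEG)) return "jpeg";
        if (startsWith(header, GIF87) || startsWith(header, GIF89)) return "gif";
        return null;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

The returned format name, not the submitted filename or Content-Type, would then decide which extension and media type you use when storing and serving the file.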
What else should I do in terms of security?
1) Be careful what name you use to store the file - taking the user's submitted filename verbatim is dangerous due to directory traversal, special filenames and extensions (.htaccess, .jsp etc), and unreliable just because file naming rules can be complicated cross-platform.
If you want to use the supplied name on the local filesystem at all it should be basenamed, slugified (replacing all but a whitelist of simple characters), length-limited, and the extension replaced/added from the detected filetype.
Better is to store the file with a completely generated name (eg 17264.dat for the file related to item with primary key 17264 in the database); if you need to serve it up to browsers with a pretty filename you can use rewrites on the front-end web server, or a file-serving servlet, to make it visible as /images/17264/some_name.png.
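To make the naming advice in (1) concrete, here is a small sketch. The names UploadNames, storageName and displayName, the 64-character limit and the "image" fallback are all choices made for this example, not anything prescribed above. The on-disk name comes purely from the primary key, while the pretty name is basenamed, stripped of its extension, slugified against a whitelist, length-limited, and given the extension of the detected type.

```java
import java.util.Locale;

final class UploadNames {
    /** Name used on the local filesystem: derived only from the database key. */
    static String storageName(long primaryKey) {
        return primaryKey + ".dat";
    }

    /** Cosmetic name for URLs; never used to open files on disk. */
    static String displayName(String submitted, String detectedExt) {
        String base = submitted == null ? "" : submitted;
        // Basename: drop everything up to the last / or \ separator.
        int cut = Math.max(base.lastIndexOf('/'), base.lastIndexOf('\\'));
        base = base.substring(cut + 1);
        // Discard the submitted extension; the detected type supplies the real one.
        int dot = base.lastIndexOf('.');
        if (dot >= 0) base = base.substring(0, dot);
        // Slugify: keep only a whitelist of simple characters.
        base = base.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9_-]+", "_");
        base = base.replaceAll("^_+|_+$", "");
        // Length limit (64 is an arbitrary choice for this sketch).
        if (base.length() > 64) base = base.substring(0, 64);
        if (base.isEmpty()) base = "image";
        return base + "." + detectedExt;
    }
}
```

With this, a submitted name like ../../etc/.htaccess ends up as image.png, and the file itself is stored as 17264.dat regardless of what the user sent.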
2) Just because it has image magic numbers doesn't mean it's necessarily an image, or that even if it is a valid image, it doesn't have some other content in a different form at the same time (a 'chameleon' file).
For example, HTML-like content in a binary file can fool the dodgy MIME-sniffing in older versions of IE into treating it as HTML. Similarly, Flash could be tricked into loading a <crossdomain> policy set out of XML inside an image, and Java could load applets that were also GIFs.
One way of making this much harder is to load the image using a server-side graphics library, and then re-save it, causing a round of recompression which will generally garble any parsable content in the file. The drawback is that with lossy formats like JPEG, recompressing results in a loss of visual quality.
The ultimate solution is usually to give up and serve the image from a completely different hostname to the main site. Then if the attacker manages to get some XSS content into the file, it doesn't matter as there's nothing on the site it's living in to compromise, only other static images.
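The load-and-re-save idea from (2) could be sketched with the standard javax.imageio API as below. ImageReencoder and reencodeAsPng are names invented for this example, and writing everything out as PNG is an assumption of the sketch: it sidesteps a second round of lossy JPEG compression at the cost of larger files, which may or may not suit you.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImageReencoder {
    /** Decodes the upload and writes the pixels back out as a fresh PNG. */
    static byte[] reencodeAsPng(byte[] upload) throws IOException {
        // ImageIO.read returns null when no registered reader can decode the data.
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(upload));
        if (img == null) {
            throw new IOException("not a decodable image");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Only the decoded pixels are written; original metadata and any
        // bytes trailing the image data are not carried over.
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.toByteArray();
    }
}
```

This only makes chameleon files harder to build, it does not rule them out, so serving from a separate hostname is still worth doing.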
3) If you do load the image server-side, for (2) or other reasons, ensure that the image size - both file size and width/height size - is reasonable before attempting to load it. Otherwise you can be hit by decompression bombs filling up your memory and causing denial of service.
Also, if you do this, make sure to keep your image library/language (eg Java Graphics2D) up to date. There have been image-handling vulnerabilities in these languages before.
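For the size check in (3), one way to do it in Java is to ask an ImageReader for the width and height, which for the common formats reads only the header rather than decoding the pixel data. This is a sketch: ImageSizeGuard, isReasonable and the two limits are hypothetical and should be tuned to what your site actually needs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

final class ImageSizeGuard {
    // Illustrative limits only.
    static final int MAX_BYTES = 5 * 1024 * 1024;
    static final int MAX_SIDE = 4096;

    /** True if both the file size and the declared pixel dimensions are within limits. */
    static boolean isReasonable(byte[] upload) throws IOException {
        if (upload.length > MAX_BYTES) return false;
        try (ImageInputStream in = ImageIO.createImageInputStream(new ByteArrayInputStream(upload))) {
            if (in == null) return false;
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) return false; // no reader recognises the data
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true
                reader.setInput(in, true, true);
                int width = reader.getWidth(0);
                int height = reader.getHeight(0);
                return width > 0 && height > 0 && width <= MAX_SIDE && height <= MAX_SIDE;
            } finally {
                reader.dispose();
            }
        }
    }
}
```

Only uploads that pass this check would then be handed to a full decode such as ImageIO.read.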