31

I am getting a Base64 string from the backend and then decoding it in JavaScript to display it in the browser.

The string can represent any kind of file: .pdf, .img, .docx, .zip, etc.

My Base64 string does not include the MIME type, for example the 'data:application/pdf;base64' part, so I need to determine the MIME type from the Base64 data itself.

Is there any way to solve this problem with JavaScript or jQuery?

Soumen Mukherjee
Hikmet Tüysüz
  • A base64 encoded string can contain anything, and you would need to know its MIME type in advance to decode it properly. As such, unless you go through and try and decode the string to all known valid file types (which is not really a workable solution) there's no way to do what you need. Going forward you **need** to keep the MIME type prefix on the encoded string. – Rory McCrossan Sep 17 '19 at 14:54
  • I searched everywhere about it, but some people can get information from that string: `function guessImageMime(data){ if(data.charAt(0)=='/'){ return "image/jpeg"; }else if(data.charAt(0)=='R'){ return "image/gif"; }else if(data.charAt(0)=='i'){ return "image/png"; } }` Thanks for your answer. – Hikmet Tüysüz Sep 17 '19 at 14:56
  • It would be helpful if you posted a link to where you got that code from, and the logic in `guessImageMime()`. The clue is probably in the name though - 'guess' - so it's probably checking all expected file types as I mentioned before. – Rory McCrossan Sep 17 '19 at 14:58
  • Do you know this site? https://base64.guru/converter/decode/file It decodes whatever string you give it and reports the MIME type. How is that possible? If it is not possible in JavaScript, what about other languages? – Hikmet Tüysüz Sep 17 '19 at 15:14

3 Answers

53

You can use magic numbers (file signatures) to detect a MIME type; lists of known file signatures are available online. However, file signatures are not 100% reliable and you can easily encounter false positives. Of course, for some tasks such a solution is more than enough.

So if you have a Base64 string and want to identify its MIME type using file signatures, you don't need to decode the Base64. A much faster way is to store the file signatures as Base64 and just check whether the input starts with one of them. A simple example:

var signatures = {
  JVBERi0: "application/pdf",
  R0lGODdh: "image/gif",
  R0lGODlh: "image/gif",
  iVBORw0KGgo: "image/png",
  "/9j/": "image/jpeg"
};

function detectMimeType(b64) {
  for (var s in signatures) {
    if (b64.indexOf(s) === 0) {
      return signatures[s];
    }
  }
}

// Some tests
console.log(detectMimeType('R0lGODdhAQABAPAAAP8AAAAAACwAAAAAAQABAAACAkQBADs='));
console.log(detectMimeType('iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR42mP4z8AAAAMBAQD3A0FDAAAAAElFTkSuQmCC'));
console.log(detectMimeType('JVBERi0xLjUKJYCBgoMKMSAwIG9iago8PC9GaWx0ZXIvRmxhdGVEZWNvZGUvRmlyc3QgMTQxL04gMjAvTGVuZ3'));
console.log(detectMimeType('/9j/4AAQSkZJRgABAQAAZABkAAD/2wCEABQQEBkSGScXFycyJh8mMi4mJiYmLj41NTU1NT5EQUFBQUFBRERERERERERE'));
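
The Base64 keys in the table above can be derived from the raw magic bytes rather than copied by hand. The sketch below is only an illustration and not part of the original answer: `base64Prefix` is a hypothetical helper, and it assumes `btoa` is available (browsers and recent Node.js versions). Since each Base64 character carries 6 bits, it keeps only the characters that are fully determined by the signature bytes.

```javascript
// Hypothetical helper: turn raw magic bytes into the Base64 prefix that
// every file starting with those bytes must share.
function base64Prefix(bytes) {
  // Build a binary string (one char per byte) so btoa can encode it.
  var binary = String.fromCharCode.apply(null, bytes);
  var encoded = btoa(binary);
  // Each Base64 character holds 6 bits; keep only the characters whose
  // bits all come from the signature itself.
  var stableChars = Math.floor((bytes.length * 8) / 6);
  return encoded.slice(0, stableChars);
}

console.log(base64Prefix([0x25, 0x50, 0x44, 0x46, 0x2d]));                   // "%PDF-"  -> JVBERi
console.log(base64Prefix([0x47, 0x49, 0x46, 0x38, 0x39, 0x61]));             // "GIF89a" -> R0lGODlh
console.log(base64Prefix([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])); // PNG      -> iVBORw0KGg
console.log(base64Prefix([0xff, 0xd8, 0xff]));                               // JPEG     -> /9j/
```

For `%PDF-` and the PNG signature this yields a prefix one character shorter than the keys used above (`JVBERi` vs. `JVBERi0`, `iVBORw0KGg` vs. `iVBORw0KGgo`), because that last character also depends on the top two bits of the byte that follows the signature.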
Victor
  • What about image/jpeg? – Biswas Khayargoli Jul 27 '21 at 07:38
  • @BiswasKhayargoli Just check if the first character is: '/' (Forward slash) – Deepak Sep 29 '21 at 06:15
  • @Deepak You will run into a lot of false positives just by checking only the first character. For example, a forward slash will match `.mp3` and `.tar.xz` files, as well as various UTF byte order marks. To improve accuracy, you have to check at least for `/9j/` (see my updated answer) or use more accurate signatures such as `/9j/2w`, `/9j/4A`, `/9j/7g`, `/9j/4Q`. – Victor Oct 02 '21 at 11:10
  • @Victor Yeah agreed. Might be helpful if there are a bunch of possible options to check for which doesn't involve any other `/` :P – Deepak Oct 02 '21 at 16:46
  • To get the MIME type from Base64 in PHP, see https://stackoverflow.com/a/75397668/14344959 – Harsh Patel Mar 29 '23 at 08:44
10

There are certain file types whose Base64 string indicates what type they are. For images, the first character changes.

'/' means jpeg.

'i' means png.

'R' means gif.

'U' means webp.

'J' means PDF.

However, these aren't reliable, as other files can sometimes start with these characters. I tested the decoder on the website you mentioned, and it doesn't work for all filetypes. For some files, it just returns a general .bin. As far as detection goes, it might try decoding the string and testing to see if a certain file type fits. You could try to create your own solution that works in the same way, but it'd make way more sense to detect the file type based on the extension since you'll have access to it.
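
The "decode and test" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how the mentioned website actually works: `decodeHead`, `hasBytes`, and `sniffMimeType` are hypothetical names, the input is assumed to be valid Base64, `atob` is assumed to be available, and returning `application/octet-stream` for unknown data is an assumed default. Decoding the first 12 bytes makes it possible to check a signature at an offset, which WebP needs: its files start with `RIFF`, as do other RIFF-based formats such as WAV, and only bytes 8 to 11 (`WEBP`) tell them apart.

```javascript
// Decode only the first `byteCount` bytes of a Base64 string.
function decodeHead(b64, byteCount) {
  var charCount = Math.ceil(byteCount / 3) * 4; // 4 Base64 chars encode 3 bytes
  var binary = atob(b64.slice(0, charCount));
  var bytes = [];
  for (var i = 0; i < binary.length; i++) {
    bytes.push(binary.charCodeAt(i));
  }
  return bytes;
}

// True if `signature` appears in `bytes` starting at `offset`.
function hasBytes(bytes, signature, offset) {
  for (var i = 0; i < signature.length; i++) {
    if (bytes[offset + i] !== signature[i]) return false;
  }
  return true;
}

function sniffMimeType(b64) {
  var head = decodeHead(b64, 12);
  if (hasBytes(head, [0x25, 0x50, 0x44, 0x46, 0x2d], 0)) return "application/pdf"; // %PDF-
  if (hasBytes(head, [0x47, 0x49, 0x46, 0x38], 0)) return "image/gif";             // GIF8
  if (hasBytes(head, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a], 0)) return "image/png";
  if (hasBytes(head, [0xff, 0xd8, 0xff], 0)) return "image/jpeg";
  // RIFF container: only the form type at offset 8 identifies WebP.
  if (hasBytes(head, [0x52, 0x49, 0x46, 0x46], 0) &&
      hasBytes(head, [0x57, 0x45, 0x42, 0x50], 8)) return "image/webp";
  // "PK\x03\x04": ZIP, which also covers ZIP-based formats such as .docx.
  if (hasBytes(head, [0x50, 0x4b, 0x03, 0x04], 0)) return "application/zip";
  return "application/octet-stream"; // assumed fallback for unknown data
}

console.log(sniffMimeType("/9j/4AAQSkZJRgABAQAAZABkAAD/2wCEABQQEBkSGScXFycyJh8mMi4m"));
```

Because a .docx file is a ZIP container, a signature check like this reports it as `application/zip`; telling the two apart would require inspecting the archive contents.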

David Avsajanishvili
Patrick
0

Just an updated and minified version of the answer above: https://stackoverflow.com/a/58158656/8552163

const signatures = {
    JVBERi0: 'application/pdf',
    R0lGODdh: 'image/gif',
    R0lGODlh: 'image/gif',
    iVBORw0KGgo: 'image/png',
    '/9j/': 'image/jpeg',
};

const getMimeType = (base64)=>{
    for(const sign in signatures)if(base64.startsWith(sign))return signatures[sign];
};
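
For the use case in the question, the detected type can be used to rebuild the missing `data:...;base64,` prefix. A minimal sketch, assuming the same signature table as above: `toDataUrl` and its `fallback` parameter are hypothetical, and `application/octet-stream` is only an assumed default for unrecognized input. It uses `image/jpeg`, the registered MIME type for JPEG.

```javascript
const signatures = {
    JVBERi0: 'application/pdf',
    R0lGODdh: 'image/gif',
    R0lGODlh: 'image/gif',
    iVBORw0KGgo: 'image/png',
    '/9j/': 'image/jpeg',
};

// Return the MIME type of the first matching signature, or undefined.
const getMimeType = (base64) => {
    for (const sign in signatures) {
        if (base64.startsWith(sign)) return signatures[sign];
    }
};

// Hypothetical helper: prepend the data URL header for the detected type,
// using `fallback` when no signature matches.
const toDataUrl = (base64, fallback = 'application/octet-stream') =>
    `data:${getMimeType(base64) || fallback};base64,${base64}`;

console.log(toDataUrl('/9j/4AAQ')); // data:image/jpeg;base64,/9j/4AAQ
```

The resulting string can then be assigned to something like an `img` element's `src` or a link's `href`.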
Vadim