
Before marking this as a duplicate: I found no correct answer on SO for my specific problem.

I know how to use base64_decode(). My problem is that I need to check whether a string is base64 so I can decode it, and do nothing if it's not. The trouble is that some English strings like "hey guys" are accepted as base64 but decode to pure garbage.

echo base64_decode("hey guys");

will return (in latin1)

 ì »+

That is not what I want. I have a script that loops through strings in a database, checks each one for base64, and decodes it. The problem is that the first pass takes "aGV5IGd1eXM=" and converts it to "hey guys", and the next pass decodes that again into "ì »+".

How can I check if it's real base64? Is there any way?

Carlos2W
  • Does this not answer your question? http://stackoverflow.com/a/8571649/3179423 – dev0 Jan 05 '16 at 13:22
  • Possible duplicate of [How to check whether the string is base64 encoded or not](http://stackoverflow.com/questions/8571501/how-to-check-whether-the-string-is-base64-encoded-or-not) – Vlad274 Jan 05 '16 at 13:24
  • No, because some simple strings like `"test"` will pass the check in that answer – Carlos2W Jan 05 '16 at 13:27
  • That's because "test" is a valid base64 encoded string representation. There is no way to distinguish between the two. – dev0 Jan 05 '16 at 13:35
  • You cannot assume that every valid base64 string decodes to human-readable output; that is by design of base64. In your case, you can store a flag recording what you have converted and what you have not, to avoid double decoding. – bansi Jan 05 '16 at 13:40
  • I think I'll just check whether the base64-decoded string is latin1 proof; that way, junk strings like "µë-" (which comes from "test") will not pass. – Carlos2W Jan 05 '16 at 13:53
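
The flag idea from the comments could look roughly like this in PHP. This is only a sketch: the row shape (`value`, `is_base64`) and the function name `decode_once` are hypothetical, and in a real schema the flag would be a column updated together with the value.

```php
<?php
// Hypothetical row shape: ['value' => string, 'is_base64' => bool].
// The flag, not the content of the string, decides whether to decode.
function decode_once(array $row): array
{
    if ($row['is_base64']) {
        // Strict mode returns false on characters outside the base64 alphabet.
        $decoded = base64_decode($row['value'], true);
        if ($decoded !== false) {
            $row['value'] = $decoded;
            $row['is_base64'] = false; // mark as done so the next pass skips it
        }
    }
    return $row;
}
```

Running the function a second time on an already decoded row leaves it untouched, because the flag is cleared after the first successful decode.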

2 Answers

3

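Base64 maps 8-bit data onto a 6-bit alphabet, so each group of 3 input bytes becomes 4 output characters. That leaves you with just the following options:

  • Look for characters outside the base64 alphabet (anything other than A-Z, a-z, 0-9, +, /) and check that the = padding, if any, appears only at the end
  • Look at the number of characters (with padding, it must be divisible by four).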

This way, you can detect that the data is definitely not base64 encoded. But you cannot confirm that the data really is base64, since a normal string can pass all the requirements of the encoding.

On the other hand, if you know the structure of the data, it is possible to check that the decoding of base64 text fits the structure.
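
The checks described here can be sketched in PHP as follows. The function name `could_be_base64` is hypothetical, and the sketch assumes padded, standard-alphabet base64, where the encoded length is a multiple of four. It can only rule strings out; a `true` result means "possibly base64", never "certainly base64".

```php
<?php
// Hypothetical helper: returns false when $s cannot be canonical, padded base64.
function could_be_base64(string $s): bool
{
    // With padding, the encoded length is a non-zero multiple of 4.
    if ($s === '' || strlen($s) % 4 !== 0) {
        return false;
    }
    // Only alphabet characters, followed by at most two '=' at the very end.
    if (!preg_match('#^[A-Za-z0-9+/]+={0,2}$#D', $s)) {
        return false;
    }
    // Strict decode, then re-encode: a canonical encoding round-trips exactly.
    $decoded = base64_decode($s, true);
    return $decoded !== false && base64_encode($decoded) === $s;
}
```

With this sketch, "hey guys" is rejected because of the space, but "test" is still accepted, which is exactly the limitation described above: it is a well-formed encoding of three arbitrary bytes.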

cbesiktas
  • Since there's no way to be sure (as you said, a normal string could pass the test), I'll at least follow your recommendations. Thank you – Carlos2W Feb 25 '16 at 16:10
1

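Try this:

function base64_check($name) {
    // Strict mode returns false if $name contains characters outside the base64 alphabet.
    $decoded = base64_decode($name, true);
    if ($decoded === false) {
        return false;
    }
    // Accept only payloads that strict detection identifies as ASCII or UTF-8.
    $encoding_type = mb_detect_encoding($decoded, ['ASCII', 'UTF-8'], true);
    return $encoding_type === 'ASCII' || $encoding_type === 'UTF-8';
}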