Is there a good way to check if a string is encoded in base64 using Python?
11 Answers
I was looking for a solution to the same problem, then a very simple one just struck me in the head. All you need to do is decode, then re-encode. If the re-encoded string is equal to the encoded string, then it is base64 encoded.
Here is the code:
import base64

def isBase64(s):
    try:
        return base64.b64encode(base64.b64decode(s)) == s
    except Exception:
        return False
That's it!
Edit: Here's a version of the function that works with both the string and bytes objects in Python 3:
import base64

def isBase64(sb):
    try:
        if isinstance(sb, str):
            # If there's any unicode here, an exception will be thrown and the function will return False
            sb_bytes = bytes(sb, 'ascii')
        elif isinstance(sb, bytes):
            sb_bytes = sb
        else:
            raise ValueError("Argument must be string or bytes")
        return base64.b64encode(base64.b64decode(sb_bytes)) == sb_bytes
    except Exception:
        return False
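To see what the round trip actually catches, here is a minimal sketch that runs the same decode-then-re-encode check on a few inputs. The helper name `roundtrips_as_base64` and the sample strings are illustrative assumptions, not part of the answer above.

```python
import base64


def roundtrips_as_base64(text: str) -> bool:
    """Hypothetical helper: True if decoding and re-encoding reproduces text."""
    try:
        raw = text.encode("ascii")  # non-ASCII input raises UnicodeEncodeError
        return base64.b64encode(base64.b64decode(raw)) == raw
    except ValueError:  # covers binascii.Error and UnicodeEncodeError
        return False


for sample in ["dGVzdA==", "test", "dGVzdA", "dGVz dA==", "čččč"]:
    print(repr(sample), roundtrips_as_base64(sample))
```

Note that `"test"` passes because it consists of four valid base64 characters, while `"dGVz dA=="` fails only at the comparison step: the default decoder silently drops the space, so the re-encoded value no longer matches the input.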

-
Nice and simple, I like it! – trukvl Sep 01 '17 at 14:39
-
If you like lambda: `isBase64 = lambda x: x.decode('base64').encode('base64').replace('\n','') == x` Note that this code will sometimes throw an incorrect padding exception. – id01 Oct 06 '17 at 16:05
-
Side note: you can always just `return base64.b64encode(base64.b64decode(s)) == s` instead of using an if statement and returning a constant bool result :) – d0nut Nov 15 '17 at 19:49
-
True, but then you'll have to handle the binascii exceptions yourself outside the function as well. – id01 Nov 16 '17 at 00:17
-
`isBase64('test')` returns True – ahmed Feb 14 '18 at 18:17
-
@ahmed that's because "test" is a valid base64 string. Base64 includes a-z, A-Z, 0-9, +, /, and = for padding. – id01 Feb 14 '18 at 21:48
-
Ah, d0nut, I think I get what you mean. Editing. – id01 Aug 03 '18 at 06:35
-
On Python 3, since a `str` and `bytes` comparison doesn't implicitly convert them to the same type, I had to do `return base64.b64encode(base64.b64decode(s)).decode() == s` for this to work. My `s` was a unicode `str`, while the value returned from `base64.b64encode(base64.b64decode(s))` was `bytes`. See this: https://stackoverflow.com/q/30580386/1781024 – Vikas Prasad Nov 16 '18 at 05:59
-
Vikas Prasad, thanks! I just added a version of the function for Python 3 that works on both `str` and `bytes`. – id01 Nov 16 '18 at 23:53
import base64
import binascii

# Python 2 code: base64.decodestring was removed in Python 3.9
try:
    base64.decodestring("foo")
except binascii.Error:
    print "no correct base64"
-
Thank you, but I was wondering if there is a function to test this instead of using a try. – lizzie Sep 07 '12 at 09:59
-
I don't find any in [the documentation](http://docs.python.org/library/base64.html?highlight=base64#base64). – Sep 07 '12 at 10:05
-
"Easier to ask for forgiveness than permission", although I'd probably favour catching the actual exception that's likely to be raised (which I think will be binascii.Error) – LexyStardust Sep 07 '12 at 12:23
-
This is incorrect, `base64.decodestring('čččč')` returns an empty string and no exception, but I don't think the string čččč is valid base64 – Roman Plášil Jan 21 '14 at 08:48
-
`base64.decodestring("dfdsfsdf ds fk")` doesn't raise TypeError either, yet the string doesn't seem to be a base64 string – erny Feb 22 '17 at 11:00
-
As of 2018, Python 3.7, I use `base64.b64decode(item)` instead, and it works. – Polv Jul 16 '18 at 01:01
-
base64.decodestring("dfdsfsdf ds fk") is actually a base64 string. The base64 method ignores whitespace and what you are left with is a valid base64 string. – plaisthos Aug 07 '19 at 12:43
-
`base64.b64decode(s, validate=True)` will decode `s` if it is valid, and otherwise throw an exception. `base64.decodestring` is very permissive and will strip any non-base64 characters, which is potentially problematic. – Julian Nov 13 '19 at 23:04
This isn't possible. The best you could do would be to verify that a string might be valid Base 64, although many strings consisting of only ASCII text can be decoded as if they were Base 64.
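A short sketch of why detection can only ever be a guess: ordinary words that use only base64 characters and whose length is a multiple of four decode without complaint, even in strict mode. The sample words are arbitrary choices for illustration.

```python
import base64

# Plain ASCII words that also happen to be well-formed base64
for word in ["test", "deadbeef", "Python12"]:
    decoded = base64.b64decode(word, validate=True)
    print(word, "->", decoded)
```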

-
@coler-j yes, it is technically correct. It also probably should have been a comment but in 2012 SO was different. Maybe. – Wooble Dec 05 '18 at 22:13
The solution I used is based on one of the prior answers, but uses more up to date calls.
In my code, the my_image_string is either the image data itself in raw form or it's a base64 string. If the decode fails, then I assume it's raw data.
Note the `validate=True` keyword argument to `b64decode`. It is required for the decoder to raise an exception on invalid input. Without it, illegal characters are silently discarded and there will be no complaint about them.
import base64, binascii

try:
    image_data = base64.b64decode(my_image_string, validate=True)
except binascii.Error:
    image_data = my_image_string
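The effect of `validate=True` can be seen with a small sketch. The input string here is an assumed example: valid base64 for `b"test"` with a stray space inserted.

```python
import base64
import binascii

noisy = "dGVz dA=="  # "dGVzdA==" (b"test") with a stray space inserted

# Default mode discards characters outside the base64 alphabet
print(base64.b64decode(noisy))

# Strict mode rejects the same input
try:
    base64.b64decode(noisy, validate=True)
except binascii.Error as exc:
    print("rejected:", exc)
```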

Using Python RegEx
import re

txt = "VGhpcyBpcyBlbmNvZGVkIHRleHQ="
x = re.search("^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$", txt)
if x:
    print("Encoded")
else:
    print("Non encoded")
Before trying to decode, I like to do a formatting check first, as it's the lightest-weight check and does not return false positives, thus following fail-fast coding principles.
Here is a utility function for this task:
RE_BASE64 = r"^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$"

def likeBase64(s: str) -> bool:
    # True only if s is not None and matches the base64 format
    return s is not None and re.search(RE_BASE64, s) is not None
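One way the format check might be combined with the actual decode is sketched below. The helper name `decode_if_base64` and its return convention (bytes on success, `None` otherwise) are assumptions made for illustration; the pattern is the one from the answer, used here with `fullmatch` so that a trailing newline is not accepted.

```python
import base64
import binascii
import re
from typing import Optional

# Same pattern as in the answer, compiled once
BASE64_PATTERN = re.compile(
    r"^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$"
)


def decode_if_base64(s: Optional[str]) -> Optional[bytes]:
    """Hypothetical helper: decoded bytes, or None if s is not base64-shaped."""
    if s is None or not BASE64_PATTERN.fullmatch(s):
        return None  # fail fast on the cheap format check
    try:
        return base64.b64decode(s, validate=True)
    except binascii.Error:
        return None


print(decode_if_base64("VGhpcyBpcyBlbmNvZGVkIHRleHQ="))
print(decode_if_base64("not base64!"))
```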

If the length of the encoded string is a multiple of 4, it can be decoded:
base64.encodestring("whatever you say").strip().__len__() % 4 == 0
So you just need to check whether the string satisfies a condition like the one above; then decoding won't throw any exception (I guess):
if len(the_base64string.strip()) % 4 == 0:
    # then you can just decode it anyway
    base64.decodestring(the_base64string)
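A quick sketch suggests the length test is neither sufficient nor necessary. The sample inputs are assumptions chosen for illustration, and `base64.b64decode` and `base64.encodebytes` stand in for the Python 2 functions used above.

```python
import base64
import binascii

# Length is a multiple of 4, but '!' is not a base64 character
bad = "ab!d"
print(len(bad) % 4 == 0)
try:
    base64.b64decode(bad)
except binascii.Error as exc:
    print("rejected:", exc)

# Line-wrapped output of encodebytes: valid, yet the length test fails
wrapped = base64.encodebytes(b"x" * 60).decode("ascii")
print(len(wrapped.strip()) % 4 == 0)
print(base64.b64decode(wrapped) == b"x" * 60)
```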

-
This does not work for strings with \n in them that are still valid base64 – plaisthos Aug 07 '19 at 12:47
@geoffspear is correct in that this is not 100% possible but you can get pretty close by checking the string header to see if it matches that of a base64 encoded string (re: How to check whether a string is base64 encoded or not).
import re

# Check if a string is base64 encoded.
def isBase64Encoded(s):
    pattern = re.compile("^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$")
    if not s:
        return False
    # bool() turns the match object (or None) into True/False
    return bool(pattern.match(s))
Also note that in my case I wanted to return False if the string is empty, to avoid decoding, as there's no use in decoding nothing.

I know I'm almost 8 years late, but you can use a regular expression to verify whether a given input is Base64.
import re

encoding_type = 'Encoding type: '
base64_encoding = 'Base64'

def is_base64():
    element = input("Enter encoded element: ")
    expression = "^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$"
    matches = re.match(expression, element)
    if matches:
        print(f"{encoding_type + base64_encoding}")
    else:
        print("Unknown encoding type.")

is_base64()

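import base64
import binascii

def is_base64(s):
    # Strip whitespace from each line and join the lines before comparing
    s = ''.join(line.strip() for line in s.split("\n"))
    try:
        enc = base64.b64encode(base64.b64decode(s)).decode('ascii')
        return enc == s
    except (binascii.Error, ValueError):
        return False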
In my case, my input, `s`, had newlines which I had to strip before the comparison.
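As a variation on the same idea, whitespace can be removed in one step and the rest validated strictly. This is only a sketch: the helper name `is_wrapped_base64` is hypothetical, and it checks decodability rather than doing the re-encode comparison used above.

```python
import base64


def is_wrapped_base64(s: str) -> bool:
    """Hypothetical helper: accept base64 that may contain line breaks or spaces."""
    compact = "".join(s.split())  # drop all whitespace, including \r\n
    try:
        base64.b64decode(compact, validate=True)
        return True
    except ValueError:  # binascii.Error, or non-ASCII characters in a str
        return False


print(is_wrapped_base64("dGVz\ndA==\n"))
print(is_wrapped_base64("dGVz dA="))
```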

# Python 2 only: str.decode('base64') is not available in Python 3
x = 'possibly base64 encoded string'
result = x
try:
    decoded = x.decode('base64', 'strict')
    if x == decoded.encode('base64').strip():
        result = decoded
except Exception:
    pass
This code puts the decoded string in the `result` variable if `x` really is encoded, and just `x` if not. Simply trying to decode doesn't always work.

-
Instead of `x == decoded.encode('base64').strip()` it should be `x == decoded.encode('base64').replace('\n', '')`, because in some cases encode adds several '\n'. – Andy Jul 22 '15 at 14:58