TL;DR: No, there isn't a generic way to handle escaping ' and \ in Google Drive queries (and possibly other Google APIs). Each API provider (Microsoft, Amazon, Twitter, etc.) would have its own filename/string-escaping rules, so creating a handler for each would be tedious. However, it should have been part of the API client they provide.
My question is what is the general solution to this problem of special characters in the filename
This is separate from the issue of sanitising strings for actual filenames because local filesystems don't follow the same rules as GDrive.
are there other characters that similarly needed to be escaped?
As far as I can tell, GDrive only needs the apostrophe (') and backslash (\) escaped, as you pointed out. As for the actual request, there's:
Note: These examples use the unencoded q parameter, where name = 'hello' is encoded as name+%3d+%27hello%27. Client libraries handle this encoding automatically.
That part is probably being handled by google-api-python-client.
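If you want to see what that encoding looks like, the standard library can reproduce it. This is only a sketch of what a client library would do for you; the variable names are illustrative, and Python emits uppercase hex digits (%3D), which is equivalent to the lowercase %3d in the note above.

```python
from urllib.parse import quote_plus, urlencode

# The unencoded q value from the note's example.
query = "name = 'hello'"

# quote_plus turns spaces into '+' and percent-encodes '=' and the apostrophes.
encoded = quote_plus(query)
print(encoded)
# name+%3D+%27hello%27

# urlencode applies the same encoding to a whole parameter dict.
print(urlencode({"q": query}))
# q=name+%3D+%27hello%27
```

Note that this URL encoding is a separate layer from the apostrophe/backslash escaping inside the query string itself.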
As for the two specific replacements you need:
file_name = r"tim's file\has slashes"
print(file_name)
# tim's file\has slashes
print(file_name.replace('\\', '\\\\').replace("'", "\\'"))
# tim\'s file\\has slashes
# or, better
print(file_name.replace('\\', '\\\\').replace("'", r"\'"))
# tim\'s file\\has slashes
# using raw strings also for the backslash replacement
print(file_name.replace('\\', r'\\').replace("'", r"\'"))
# tim\'s file\\has slashes
Note that there's no point using a raw string for the backslash in the find part of the first replacement, because a backslash right before the closing quote needs to be escaped anyway: r'\' is not a valid Python string (SyntaxError: EOL while scanning string literal; Python 3.10+ words it as "unterminated string literal"). However, r'\\' means two backslashes, because in a raw string the first backslash doesn't escape the second. I.e. '\\' vs r'\\' == 1 backslash vs 2 backslashes. And if you want 3, or any odd number, of backslashes at the end of a string, a raw string literal alone can't express it.
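A quick way to convince yourself of the backslash counts is to check the lengths directly. The variable names here are just for illustration.

```python
one = '\\'    # regular string: the first backslash escapes the second
two = r'\\'   # raw string: both backslashes are kept
print(len(one), len(two))
# 1 2

# An odd count at the end of a literal needs a regular string,
# or a raw string joined with a regular one.
three = '\\\\\\'
also_three = r'\\' + '\\'
print(len(three), three == also_three)
# 3 True
```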
Btw, replacement order is important because if you did it in reverse, then the backslash added for the apostrophe would then get escaped further:
print(file_name.replace("'", r"\'").replace('\\', r'\\')) # WRONG!
# tim\\'s file\\has slashes
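One way to sidestep the ordering issue altogether is to do both replacements in a single pass, so that characters added by one rule are never looked at again. A minimal sketch using str.translate; the names DRIVE_ESCAPES and escape_drive_value are made up for illustration and aren't part of any Google library.

```python
# Map each special character to its escaped form.
DRIVE_ESCAPES = str.maketrans({"\\": r"\\", "'": r"\'"})

def escape_drive_value(value: str) -> str:
    """Escape backslashes and apostrophes in one pass over the input."""
    return value.translate(DRIVE_ESCAPES)

file_name = r"tim's file\has slashes"
print(escape_drive_value(file_name))
# tim\'s file\\has slashes
```

Because translate looks at each input character exactly once, the result is the same as the correctly ordered replace chain.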
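And do use f-strings for the query; they're much more readable. Just make sure it's the escaped name that goes into the query:
escaped_name = file_name.replace('\\', r'\\').replace("'", r"\'")
f"name='{escaped_name}'"
# "name='tim\\'s file\\\\has slashes'"
print(f"name='{escaped_name}'")
# name='tim\'s file\\has slashes'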
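Putting the pieces together, a small helper can keep the escaping and the query formatting in one place. This is a sketch only: name_query is a hypothetical name, and the commented-out call assumes service is a Drive v3 client you have already built with google-api-python-client.

```python
def name_query(file_name: str) -> str:
    """Build a Drive q string matching files with exactly this name."""
    # Backslashes first, then apostrophes, so added backslashes aren't doubled.
    escaped = file_name.replace('\\', r'\\').replace("'", r"\'")
    return f"name='{escaped}'"

print(name_query(r"tim's file\has slashes"))
# name='tim\'s file\\has slashes'
print(name_query("plain.txt"))
# name='plain.txt'

# Hypothetical usage with an already-built Drive v3 client:
# results = service.files().list(q=name_query("tim's file")).execute()
```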