
I discovered that when searching the Google Drive API for a filename containing an apostrophe ('), I needed to escape the apostrophe with a backslash (\), e.g.:

# file_name is "tim's file"
file_name = file_name.replace("'", "\\'")
# file_name is "tim\'s file"

response = service.files().list(q = "name='" + file_name + "'").execute() #works

The docs mention that the backslash also needs special treatment.

My question is: what is the general solution to this problem of special characters in the filename? Are there other characters that similarly need to be escaped?

Ken
`'` and \ are the only characters that need escaping. Your code is correct, but as mentioned below, use an f-string instead: `q=f"name='{file_name}'"` – Sylver11 Aug 28 '23 at 10:31

1 Answer


TL;DR: No, there isn't a generic way to handle escaping ' and \ in Google Drive queries (and possibly other Google APIs). Each API provider (Microsoft, Amazon, Twitter, etc.) has its own filename/string-escaping rules, so writing an escaper for each would be tedious. Ideally, this would be handled by the API client each provider ships.

My question is what is the general solution to this problem of special characters in the filename

This is separate from the issue of sanitising strings for actual filenames because local filesystems don't follow the same rules as GDrive.

are there other characters that similarly needed to be escaped?

As far as I can tell, GDrive only needs the apostrophe (') and backslash (\) escaped, as you pointed out. As for the actual request, the docs say:

Note: These examples use the unencoded q parameter, where name = 'hello' is encoded as name+%3d+%27hello%27. Client libraries handle this encoding automatically.

That part is probably being handled by google-api-python-client.
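To see what that encoding looks like, here is a small sketch using only the standard library's urllib.parse. It illustrates the URL encoding of the q parameter and is not necessarily how google-api-python-client does it internally. Note that urlencode emits uppercase hex digits (%3D), which is equivalent to the lowercase %3d shown in the docs.

```python
from urllib.parse import urlencode, parse_qs

# The unencoded query, exactly as written in the docs' example
query = "name = 'hello'"

# urlencode turns spaces into '+' and percent-encodes '=' and "'"
encoded = urlencode({"q": query})
print(encoded)
# q=name+%3D+%27hello%27

# Decoding gives back the original query string
decoded = parse_qs(encoded)["q"][0]
print(decoded == query)
# True
```

This step is independent of the ' and \ escaping: the escaping is part of the query language, while the URL encoding is only about transport.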

As for the two specific replacements you need:

file_name = r"tim's file\has slashes"
print(file_name)
# tim's file\has slashes

print(file_name.replace('\\', '\\\\').replace("'", "\\'"))
# tim\'s file\\has slashes

# or, better
print(file_name.replace('\\', '\\\\').replace("'", r"\'"))
# tim\'s file\\has slashes

# using raw strings also for the backslash replacement
print(file_name.replace('\\', r'\\').replace("'", r"\'"))
# tim\'s file\\has slashes

Note that there's no point using a raw string for the find part of the first replacement, because a backslash right before the closing quote has to be escaped anyway: r'\' is not a valid Python string (SyntaxError: EOL while scanning string literal). However, r'\\' means two backslashes, because in a raw string the first backslash doesn't escape the second. I.e. '\\' vs r'\\' == 1 backslash vs 2 backslashes. And if you want 3, or any odd number of backslashes, at the end of a string, a raw string literal can't express that; use a regular string with each backslash doubled.
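A quick way to convince yourself of the regular vs raw string difference is to check the lengths. This is just an illustrative sketch; the variable names are made up for the example.

```python
one = '\\'        # regular literal: a single backslash
two = r'\\'       # raw literal: two backslashes
three = '\\' * 3  # an odd count, built from a regular literal

print(len(one), len(two), len(three))
# 1 2 3

# The raw form of two backslashes equals the regular form with four
print(r'\\' == '\\\\')
# True
```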

Btw, replacement order is important because if you did it in reverse, then the backslash added for the apostrophe would then get escaped further:

print(file_name.replace("'", r"\'").replace('\\', r'\\'))  # WRONG!
# tim\\'s file\\has slashes
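If you'd rather not depend on the order of two replace calls, one alternative is to escape both characters in a single pass with re.sub, so a backslash added for an apostrophe is never seen again. This is a sketch of a different technique from the chained replace above; escape_drive_value is a hypothetical helper name, not part of any Google library.

```python
import re

def escape_drive_value(value: str) -> str:
    """Prefix every backslash and apostrophe with a backslash, in one pass.

    Hypothetical helper; assumes ' and \\ are the only characters
    that need escaping in a Drive query string value.
    """
    return re.sub(r"([\\'])", r"\\\1", value)

file_name = r"tim's file\has slashes"

print(escape_drive_value(file_name))
# tim\'s file\\has slashes

# Same result as the correctly ordered chained replace
chained = file_name.replace('\\', r'\\').replace("'", r"\'")
print(escape_drive_value(file_name) == chained)
# True
```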

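And do use f-strings for the query; it's much more readable:

# escape first, backslashes before apostrophes
file_name = file_name.replace('\\', r'\\').replace("'", r"\'")

f"name='{file_name}'"
# "name='tim\\'s file\\\\has slashes'"

print(f"name='{file_name}'")
# name='tim\'s file\\has slashes'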
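Putting the pieces together, the escaping and the f-string can live in one place so callers never build the query by hand. This is only a sketch: build_name_query and find_by_name are hypothetical names, and service is assumed to be a Drive client created elsewhere, used with the same files().list(q=...).execute() call as in the question.

```python
def build_name_query(file_name: str) -> str:
    """Return a Drive query matching an exact file name (hypothetical helper)."""
    # Backslashes first, then apostrophes, so added backslashes aren't doubled
    escaped = file_name.replace("\\", "\\\\").replace("'", "\\'")
    return f"name='{escaped}'"

def find_by_name(service, file_name: str):
    """Run the name query; `service` is assumed to be a Drive API client."""
    return service.files().list(q=build_name_query(file_name)).execute()

print(build_name_query("tim's file"))
# name='tim\'s file'
```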
aneroid