
I want to convert (escape) an untrusted string into a filename. The string can contain anything from `*`, `+` and `/` to `../../etc/shadow`, none of which may appear unescaped in the filename. I also want the filename to be as human-readable (i.e. as close to the original string) as possible, so please avoid escaping alphanumeric characters like `Abc123`, since they are already valid. I'm on Ubuntu 20.04, in case it matters.

I am essentially looking for something like `urllib.parse.quote`, but for file paths. Any ideas?

nalzok
  • I think you're really close. Why not use `urllib.parse.quote` with the `safe=''` parameter, so that `/` is also encoded? AFAIK that should be safe enough for filenames while staying reasonably readable: non-malicious strings stay readable, and malicious ones (with slashes, `$` and the like) are going to be hard to read anyway. – Lakshya Raj Aug 04 '22 at 16:59
  • You can use the same kind of encoding that's used to create "slugs" in URLs. – Barmar Aug 04 '22 at 17:08
  • What do you mean by "escape"? `*` is a perfectly valid character in a filename. If the untrusted string has an asterisk, what do you want to do exactly? – John Gordon Aug 04 '22 at 17:10
  • I agree with @JohnGordon, but you should be able to reach your goal with the slugify package (https://pypi.org/project/python-slugify/). – carlo_barth Aug 04 '22 at 17:13
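
For reference, the `safe=''` suggestion from the comments could look roughly like the sketch below. The helper name `to_filename` is hypothetical, and so are two choices the comments do not cover: `quote` never escapes `.`, so the names `.` and `..` are escaped by hand here, and an empty string is rejected. Length limits (typically 255 bytes per name on Linux filesystems) are not handled.

```python
from urllib.parse import quote, unquote


def to_filename(untrusted: str) -> str:
    """Percent-encode a string so it can be used as a single filename.

    Hypothetical helper: a sketch of the safe='' idea, not a vetted sanitizer.
    """
    if untrusted == "":
        # An empty name is not a valid filename; rejecting it is an assumption.
        raise ValueError("cannot build a filename from an empty string")

    # safe='' makes quote() encode '/' as well; letters, digits and
    # '_', '.', '-', '~' are left untouched, so plain names stay readable.
    name = quote(untrusted, safe="")

    # quote() leaves dots alone, so "." and ".." would still name the
    # current/parent directory. Escape them by hand.
    if name in (".", ".."):
        name = name.replace(".", "%2E")

    return name


if __name__ == "__main__":
    for s in ["Abc123", "*+/", "../../etc/shadow", ".."]:
        encoded = to_filename(s)
        # unquote() reverses the encoding, including the hand-escaped dots.
        print(repr(s), "->", repr(encoded), "->", repr(unquote(encoded)))
```

Because the encoding is reversible with `urllib.parse.unquote`, distinct inputs map to distinct filenames, which a slug-style conversion does not guarantee.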

0 Answers