3

How do I put symbols such as \, /, :, *, ?, ", <, >, | into a list?

If I do this:

illegalchar = ['\', '/' ,':' ,'*', '?', '"', '<', '>', '|']

the commas separating the items are treated as part of a string, including the closing ].

PS: This is to check whether a filename contains illegal characters (ones that can't be used in a file name), so if there are any alternative methods, please do tell me, thanks!

Sudheesh Singanamalla
jeffng50
  • 4
    Try `['\\', '/', ...` – Klaus D. Feb 13 '18 at 04:29
  • You do not need a list for this. A string is more compact: `illegalchar = ['\\/:*?"<>|']`. But you still have to escape the backslash, – DYZ Feb 13 '18 at 04:30
  • @KlausD. Is \\ the same as \? i've heard it somewhere but i've forgotten, i want to double confirm, thanks anyway! – jeffng50 Feb 13 '18 at 04:31
  • Sounds like a job for a good old regex via the [`re`](https://docs.python.org/3.6/library/re.html) module? – pstatix Feb 13 '18 at 04:32
  • @pstatix sorry, i haven't learned about regex, still quite new to python – jeffng50 Feb 13 '18 at 04:33
  • @DyZ so does it still work when using `illegalchar not in x`? – jeffng50 Feb 13 '18 at 04:35
  • The backslash is an escape character. Together with the following character(s) it forms an escape sequence. In your case it escaped the quotation mark, so the quote no longer terminated the string. A double backslash, on the other hand, is the escape sequence for a single backslash. – Klaus D. Feb 13 '18 at 04:35
  • @JeffNg regular expressions are a fundamental concept in computer languages. Read through the module I have given you! – pstatix Feb 13 '18 at 04:36
  • thanks guys! you all have been helpful – jeffng50 Feb 13 '18 at 04:39
  • No it does not. You must check each character individually in a loop. – DYZ Feb 13 '18 at 04:40
  • You can remove the brackets, make it a string, and do `x not in '\\/:*?"<>|'`, which is faster and more readable than a python loop – the_constant Feb 13 '18 at 04:43
  • @NoticeMeSenpai You still have to loop through each `x` in the file name. The real no-loop solution is to do set intersection: `set(filename) & set('\\/:*?"<>|') == set()` – DYZ Feb 13 '18 at 04:52
  • @DyZ "You still have to loop through each x in the file name": Incorrect, you only have to loop to the first x in the file name that is an illegal character. "The real no-loop solution is to do set intersection": Also incorrect. It still loops as explained in this question: https://stackoverflow.com/questions/20100003/whats-the-algorithm-of-set-intersection-in-python – the_constant Feb 13 '18 at 05:21
  • @NoticeMeSenpai By "no-loop solutions" we usually mean no-Python-loop solutions. If a loop is implemented in C (as I hope is the case with the & operator), its performance is not an issue. It's a `for` loop that should be avoided. – DYZ Feb 13 '18 at 05:30

4 Answers

2

Use raw strings (denoted by putting an r in front of the string literal). Then turn the raw string into a list:

illegals = [i for i in r'\/:*?"<>|']

# OR, using @abccd's suggestion, just use list()

illegals = list(r'\/:*?"<>|')

illegals
# ['\\', '/', ':', '*', '?', '"', '<', '>', '|']

Note that '\\' in the output above is still a single backslash character: the interpreter shows the repr, where the first backslash is the escape character, while print() displays just one \.

You can read more in the Python documentation on lexical analysis.
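
As a quick check of the point above, here is a minimal sketch showing that the escaped and raw spellings produce the same characters. The variable names are only for illustration and do not come from the original answer.

```python
# Two spellings that both contain a single backslash character.
escaped = '\\'        # escape sequence inside a normal string literal
raw_chars = r'\/'     # raw literal: a backslash followed by a slash

print(len(escaped))             # 1
print(escaped == raw_chars[0])  # True
print(repr(escaped))            # repr shows the escaped, doubled form
print(escaped)                  # print shows one backslash character

# Building the list from a raw string matches the hand-written list.
illegals = list(r'\/:*?"<>|')
print(illegals == ['\\', '/', ':', '*', '?', '"', '<', '>', '|'])  # True
```

One caveat: a raw string literal cannot end in an odd number of backslashes, so r'\' on its own is still a syntax error.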

This answers the question, but in practice a string can be iterated like a list of characters, so both of the following return the same elements:

[i for i in list(r'\/:*?"<>|')]
[c for c in  r'\/:*?"<>|']

As for how to identify if a filename has any of these characters, you can do this:

valid_file = 'valid_script.py'
invalid_file = 'invalid?script.py'

validate = lambda f: not any(c for c in r'\/:*?"<>|' if c in f)

validate(valid_file)
# True

validate(invalid_file)
# False

This is just one of the many ways. You might even opt for a regex approach:

import re

# Note: inside a regex character class only the backslash needs escaping
validate = lambda f: not re.search(r'[\\/:*?"<>|]', f)

validate(valid_file)
# True

validate(invalid_file)
# False
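
If you would rather avoid both the lambda and the regex, a set-based version is another option, along the lines of the set intersection mentioned in the comments. This is only a sketch: the names ILLEGAL_CHARS, is_valid_filename and find_illegal_chars are made up for this example, and it checks only the characters listed in the question, nothing else about what a filesystem accepts.

```python
# Hypothetical helper names; the character set is the one from the question.
ILLEGAL_CHARS = frozenset(r'\/:*?"<>|')


def is_valid_filename(filename):
    """Return True if filename contains none of the illegal characters."""
    return ILLEGAL_CHARS.isdisjoint(filename)


def find_illegal_chars(filename):
    """Return the illegal characters found in filename, sorted."""
    return sorted(ILLEGAL_CHARS.intersection(filename))


print(is_valid_filename('valid_script.py'))    # True
print(is_valid_filename('invalid?script.py'))  # False
print(find_illegal_chars('a<b>c?.txt'))        # ['<', '>', '?']
```

The second helper is handy when you want to tell the user which characters to remove rather than just rejecting the name.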
r.ook
  • What's the point of `[i for i in r'\/:*?"<>|']`? You can do `list(r'\/:*?"<>|')` to the same effect, but even this is not needed. – DYZ Feb 13 '18 at 04:41
  • @DyZ I realized after the fact `list()` will do just fine. While I know you can just interpret the string as a list of individual char, OP specially asks how to put them in a list. – r.ook Feb 13 '18 at 04:43
1

You need to escape special characters like \ in the string before inserting them into the list, like this:

In [2]: something = ["\\", "/"]
In [3]: something
Out[3]: ['\\', '/']

The interpreter shows the repr, with the backslash escaped. Printing gives you the single backslash:

In [12]: something = ["\\", "/"]
In [13]: something
Out[13]: ['\\', '/']
In [14]: print(', '.join(something))
\, /
DogEatDog
1

If the idea is just to check for illegal characters, a list is overkill. Python strings are iterable and support membership tests with the in operator, so I would go with this simpler approach:

In [5]: illegalchar = r'\/:*?"<>|'

In [6]: if "/" in illegalchar:
            print("nay")
   ...:
nay

Downside: without escaping, the type of quote that surrounds the string (' in this case) has to be left out of it.
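
For illustration, here is a small sketch of how a single literal can hold both quote characters by escaping the one that surrounds it. The apostrophe is added here only to show the escape; it is not one of the characters from the question, and the variable names are made up.

```python
# Escape the surrounding quote so one literal can hold both quote characters.
with_single = '\\/:*?"<>|\''
# The same characters, written with double quotes around the literal instead.
with_double = "\\/:*?\"<>|'"

print(with_single == with_double)  # True
print("'" in with_single)          # True
print('"' in with_single)          # True
print(len(with_single))            # 10
```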

NoobEditor
  • 1
    One type of quotes doesn't have to be missed. You can escape the char of the surround quotes by adding a backslash. It'd look like this: `\'` – the_constant Feb 13 '18 at 05:25
1

Just add the escape character \ before the backslash \.

Change

illegalchar = ['\', '/' ,':' ,'*', '?', '"', '<', '>', '|']

to

illegalchar = ['\\', '/' ,':' ,'*', '?', '"', '<', '>', '|']
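
To use the corrected list for the filename check, each character has to be tested separately, as noted in the comments under the question. The sketch below assumes a helper name, has_illegal_char, that is not part of the original answer.

```python
illegalchar = ['\\', '/', ':', '*', '?', '"', '<', '>', '|']


def has_illegal_char(filename):
    # Test each character on its own; any() stops at the first match.
    return any(ch in filename for ch in illegalchar)


print(has_illegal_char('report.txt'))     # False
print(has_illegal_char('report:v2.txt'))  # True

# Testing the whole list against a string does not work:
try:
    illegalchar in 'report.txt'
except TypeError as exc:
    print('TypeError:', exc)
```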
Vikram Hosakote