In Windows, using Python 2.7, the contents of a file are read and certain lines from that file are (after being prepended with a string "D:\abcddev\") put into a list called FilePathList. These lines are paths to files, for example:

D:\abcddev\toeblog/folderX/fileA.h
D:\abcddev\toeblog/folderY/fileB.h

You will notice that the paths contain a mixture of forward and backward slashes. Unfortunately there is nothing I can do about that: that is how they are created, and I only have access to them afterwards.

I want to check if a certain path is found in the list. The path contains all backward slashes.

So, continuing the present example, I want to check if the following is in the list above:

D:\abcddev\toeblog\folderY\fileB.h

As you can see, this string contains all backward slashes.

So my problem is how to check for equality regardless of whether the slash is a forward or backward slash.

My idea was to convert all the members of the FilePathList to backward slash separated paths and put these into a new list NormalizedFilePathList, and then to search that list for the path I wish to find.

So this is my code:

import os

# Declare list
NormalizedFilePathList = []

# Add backward slash separated lines to NormalizedFilePathList
for file in FilePathList:
    NormalizedFilePathList.append (os.path.normpath(file)) 

# Display the contents of NormalizedFilePathList
for file in NormalizedFilePathList:
    print file

# Create the string to be searched for
test_file = 'D:\abcddev\toeblog\folderY\fileB.h'

# Search for the string in NormalizedFilePathList
if test_file in NormalizedFilePathList:
    print "Found test_file"
else:
    print "Did not find test_file"

Here is the output of the above:

D:\abcddev\toeblog\folderX\fileA.h
D:\abcddev\toeblog\folderY\fileB.h
Did not find test_file

Why does this not work? There is obviously a match for 'D:\abcddev\toeblog\folderY\fileB.h'.

I tried a few things in my perplexity to clarify matters, as follows:

  1. Printed out the strings in NormalizedFilePathList using repr() to see whether any hidden characters were preventing a match. There were none.

  2. Created a new list, populated by hand, and searched that instead.

ManualList = ['D:\abcddev\toeblog\folderX\fileA.h','D:\abcddev\toeblog\folderY\fileB.h']

for file in ManualList:
    print file

# Search for the string in ManualList
if test_file in ManualList:
    print "Found test_file"
else:
    print "Did not find test_file"

Here was the output:

D:\abcddev    oeblog\folderX\fileA.h
D:\abcddev    oeblog\folderY\fileB.h
Found test_file

As you can see, there is a tab character in the middle of each line. That is because the string contains '\t'.

If I print out the test_file, for the same reason, I also see:

D:\abcddev    oeblog\folderY\fileB.h

This explains why the search works when I create a string manually.

So the question is how to escape the \t character in the test_file string?

Note that whatever code I write must also work in Linux.

didjek
  • What is `FilePathList`? – goodvibration May 28 '20 at 07:41
  • Could you post the output of `print NormalizedFilePathList` of your first case? – Charalamm May 28 '20 at 07:41
  • First of all, upgrade your Python version - it's dead (and if it's a school computer or something, tell those people to upgrade it) :) – Torxed May 28 '20 at 07:43
  • FilePathList is a list containing strings. Each string is a full path to a file (with a mix of backward, forward slashes). It is mentioned in the example above: – didjek May 28 '20 at 07:44
  • Torxed. The version of Python used is out of my hands. I have to work with what I have got. – didjek May 28 '20 at 07:44
  • Phineas: the output is printed already: it is ```D:\abcddev\joeblog\folderX\fileA.h D:\abcddev\joeblog\folderY\fileB.h``` – didjek May 28 '20 at 07:46
  • Low chance that it will work, but try replace('/','\\') instead of os.path.... . Please let us know what happened, in the comments – Charalamm May 28 '20 at 07:49
  • I would also try next replacing all \ with / or \\ – Charalamm May 28 '20 at 07:51
  • I tried replacing '/' with '\' and also '\' with '\\' in the list, but it did not work. However, I now believe that the issue is the inclusion of a '\t' in my search string, which introduces a tab character into it, with the result that it is not found in my list. In my post above, the path is one I invented for this posting. However, the actual path contains a \t sequence. I will adjust my post accordingly. – didjek May 28 '20 at 08:45

2 Answers


How about removing the slashes and then comparing?

def strip_slashes(path):
  return path.replace('/','').replace('\\','')

paths = ['D:\\p1\\p2/folderY/fileB.h','D:\\p1\\p2/folderX/fileA.h']
stripped_paths = [strip_slashes(p) for p in paths]
path_to_find_1 = 'D:\\p1\\p2\\folderY\\fileB.h'
stripped_path_to_find_1 = strip_slashes(path_to_find_1)
path_to_find_2 = 'D:\\p1\\p452\\folderY\\fileB.h'
stripped_path_to_find_2 = strip_slashes(path_to_find_2)


print('----------------')

print(stripped_path_to_find_1 in stripped_paths)
print(stripped_path_to_find_2 in stripped_paths)
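
One thing to watch with this approach: removing the separators entirely can make two different paths compare equal. The sketch below illustrates that, and shows a variation that replaces backslashes with forward slashes instead of deleting them. The helper name unify_slashes and the sample paths are made up for illustration.

```python
def strip_slashes(path):
    # Same idea as above: drop every separator.
    return path.replace('/', '').replace('\\', '')


def unify_slashes(path):
    # Hypothetical alternative: turn every backslash into a forward slash.
    return path.replace('\\', '/')


a = 'D:\\p1\\p2/file.h'
b = 'D:\\p1p2\\file.h'      # a different directory from a
c = 'D:/p1/p2\\file.h'      # same file as a, different slash mix

collide_when_stripped = strip_slashes(a) == strip_slashes(b)
collide_when_unified = unify_slashes(a) == unify_slashes(b)
same_file_matches = unify_slashes(a) == unify_slashes(c)

print(collide_when_stripped)
print(collide_when_unified)
print(same_file_matches)
```

Whether such collisions can actually occur depends on the paths in your list, so treat this as a caution rather than a required change.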
balderman

You are running into issues because a backslash in an ordinary string literal starts an escape sequence. E.g. \t is a tab, as you discovered, but Python will treat \a and \f as escape sequences as well. It turns out they stand for ASCII bell and form feed, respectively. Who knew? One solution is to use raw strings, written with an r before the opening quote mark, in which a backslash is kept as plain text instead of starting an escape sequence. Otherwise you need to write \\ to get a single backslash.
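
As a small illustration of the difference, here is a sketch that builds the path from the question three ways and compares the results. The variable names are only for this example.

```python
# Ordinary literal: \a, \t and both \f sequences are turned into control characters.
plain = 'D:\abcddev\toeblog\folderY\fileB.h'

# Raw literal: every backslash is kept exactly as written.
raw = r'D:\abcddev\toeblog\folderY\fileB.h'

# Ordinary literal with doubled backslashes: same text as the raw literal.
escaped = 'D:\\abcddev\\toeblog\\folderY\\fileB.h'

print(repr(plain))     # repr() makes the control characters visible
print(repr(raw))
print(len(plain))      # shorter, because four two-character sequences collapsed
print(len(raw))
print(raw == escaped)
```

Calling repr() on the search string itself, rather than only on the list entries, is a quick way to spot this kind of mismatch.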

Also, os.path.normpath only changes forward slashes to backslashes on Windows and will not do what you need on Linux, so you will need a replace as well. In general, if you have to pick all forward slashes or all backslashes, go with all forward, because Windows accepts forward slashes as path separators, while Linux treats a backslash as an ordinary filename character.

import os

# Declare list
ManualList = [r'D:\abcddev\toeblog/folderX/fileA.h',r'D:\abcddev\toeblog/folderY/fileB.h']
NormalizedFilePathList = []

# Add standardized slash separated lines to NormalizedFilePathList
for file in ManualList:
    NormalizedFilePathList.append (os.path.normpath(file.replace('\\', '/')))

# Display the contents of NormalizedFilePathList
for file in NormalizedFilePathList:
    print file

# Create the string to be searched for. 
# Use forward slashes in the string below to preserve compatibility for Linux. 
# normpath will convert them to backslashes on Windows.
test_file = os.path.normpath('D:/abcddev/toeblog/folderY/fileB.h')

# Search for the string in NormalizedFilePathList
if test_file in NormalizedFilePathList:
    print "Found test_file"
else:
    print "Did not find test_file"
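
To see the platform difference without switching machines, you can call the Windows and POSIX flavours of os.path directly: the standard library exposes them as ntpath and posixpath. This is only a sketch. The helper name to_portable is made up for illustration, and it uses posixpath.normpath rather than os.path.normpath so that the result looks the same on every OS.

```python
import ntpath      # what os.path is on Windows
import posixpath   # what os.path is on Linux

mixed = r'D:\abcddev\toeblog/folderY/fileB.h'

# Windows flavour: forward slashes are converted to backslashes.
windows_result = ntpath.normpath(mixed)

# POSIX flavour: a backslash is an ordinary character, so nothing is converted.
posix_result = posixpath.normpath(mixed)


def to_portable(path):
    """Hypothetical helper: replace backslashes first, then normalise."""
    return posixpath.normpath(path.replace('\\', '/'))


print(repr(windows_result))
print(repr(posix_result))
print(to_portable(mixed))
```

Applying the same helper to both the list entries and the search string gives forward-slash paths on either platform, which is enough for an equality check.
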
jdaz