1

I have been having a path error: No file or directory found for hours. After hours of debugging, I realised that python2 added an invisible '\r' at the end of each line.

The input: (trainval.txt)

Images/K0KKI1.jpg Labels/K0KKI1.xml
Images/2KVW51.jpg Labels/2KVW51.xml
Images/MMCPZY.jpg Labels/MMCPZY.xml
Images/LCW6RB.jpg Labels/LCW6RB.xml

The code I used to debug the error

with open('trainval.txt', "r") as lf:
 for line in lf.readlines():
  print ((line),repr(line))
  img_file, anno = line.strip("\n").split(" ")
  print(repr(img_file), repr(anno))

Python2 output:

("'Images/K0KKI1.jpg'", "'Labels/K0KKI1.xml\\r'")
('Images/2KVW51.jpg Labels/2KVW51.xml\r\n', "'Images/2KVW51.jpg Labels/2KVW51.xml\\r\\n'")
("'Images/2KVW51.jpg'", "'Labels/2KVW51.xml\\r'")
('Images/MMCPZY.jpg Labels/MMCPZY.xml\r\n', "'Images/MMCPZY.jpg Labels/MMCPZY.xml\\r\\n'")
("'Images/MMCPZY.jpg'", "'Labels/MMCPZY.xml\\r'")
('Images/LCW6RB.jpg Labels/LCW6RB.xml\r\n', "'Images/LCW6RB.jpg Labels/LCW6RB.xml\\r\\n'")
("'Images/LCW6RB.jpg'", "'Labels/LCW6RB.xml\\r'")

Python3 output:

Images/K0KKI1.jpg Labels/K0KKI1.xml
 'Images/K0KKI1.jpg Labels/K0KKI1.xml\n'
'Images/K0KKI1.jpg' 'Labels/K0KKI1.xml'
Images/2KVW51.jpg Labels/2KVW51.xml
 'Images/2KVW51.jpg Labels/2KVW51.xml\n'
'Images/2KVW51.jpg' 'Labels/2KVW51.xml'
Images/MMCPZY.jpg Labels/MMCPZY.xml
 'Images/MMCPZY.jpg Labels/MMCPZY.xml\n'
'Images/MMCPZY.jpg' 'Labels/MMCPZY.xml'
Images/LCW6RB.jpg Labels/LCW6RB.xml
 'Images/LCW6RB.jpg Labels/LCW6RB.xml\n'
'Images/LCW6RB.jpg' 'Labels/LCW6RB.xml'

As annoying as it was, it was that small '\r' who caused the path error. I could not see it in my console until I write the script above. My question is: Why is this '\r' even there? I did not create it. Something somewhere added it there. It would be helpful if someone could tell me what is the use of this small 'r' , why did it appear in python2 and not in python3 and how to avoid getting bugs due to it.

TSR
  • 17,242
  • 27
  • 93
  • 197

1 Answers1

2

there's probably a subtle difference of processing between Windows text file in python 2 & 3 versions.

The issue here is that your file has a Windows text format, and contains one or several carriage return chars before the linefeed. A quick & generic fix would be to change:

img_file, anno = line.strip("\n").split(" ")

by just:

img_file, anno = line.split()

Without arguments str.split is very smart:

  • it splits according to any kind of whitespace (linefeed, space, carriage return, tab)
  • it removes empty fields (no need for strip after all)

So use that cross-platform/python version agnostic form unless you need really specific split operation, and your problems will be history.

As an aside, don't do for line in lf.readlines(): but just for line in lf:, it will read & yield the lines one by one, handy when the file is big so you don't consume too much memory.

Jean-François Fabre
  • 137,073
  • 23
  • 153
  • 219
  • Not so subtle, actually; Python 2 required `open(filename, 'Ur')` for the behavior wich is now the standard default mode for text files in Python 3. – tripleee Aug 02 '18 at 09:01
  • interesting. I'm more used to windows, where it is different too since \r are removed when using text mode. – Jean-François Fabre Aug 02 '18 at 09:26