I have a text file like below:

/john
/peter
/Sam
/Jennefer

Using the following script:

keyword_file = open(text_file)
j = keyword_file.readlines()

for i in range(len(j)):
    if j[i] == "/peter":
       print "yes"

Although /peter is in the text file, "yes" is never printed. However, when I delete the "/"s, "yes" is printed. What is the problem?

James Mertz
user3832061
  • What is `j[i]`, have you looked? – Henrik Andersson Aug 06 '14 at 22:09
  • Do a `print repr(j[i])` and you might understand why your string comparison test failed. – Santa Aug 06 '14 at 22:10
  • You really shouldn't read files that way. Don't use `readlines`, just iterate over the file object with `for line in keyword_file:`. Have a read: http://stupidpythonideas.blogspot.co.uk/2013/06/readlines-considered-silly.html - or at the very least, iterate over `j` directly: `for line in j:` – Ben Aug 06 '14 at 22:14

2 Answers

The problem here is that you are looking for an exact match on the whole line. This includes any whitespace characters at the end of the line, such as the newline character that readlines() keeps.
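
To see this for yourself, here is a minimal sketch along the lines of the `repr` suggestion in the comments. It uses io.StringIO as a stand-in for the real text file (an assumption, made only so the example is self-contained) and is written for Python 3:

```python
import io

# Stand-in for the real text file (assumption: same four lines as in the question).
keyword_file = io.StringIO("/john\n/peter\n/Sam\n/Jennefer\n")
lines = keyword_file.readlines()

# repr() makes the otherwise invisible trailing newline visible.
for line in lines:
    print(repr(line))

# The bare keyword is not in the list, but the keyword plus newline is.
has_exact_match = "/peter" in lines
has_newline_match = "/peter\n" in lines
print(has_exact_match, has_newline_match)
```

Each entry returned by readlines() still carries its line ending, which is why comparing against the bare "/peter" fails.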

If you instead read the whole text, split it by line, and iterate over the result, your code would work:

result = keyword_file.read()
for line in result.split('\n'):
    if line == "/peter":
        print "yes"

As an alternative you could use

for line in keyword_file:
    if line.startswith("/peter"): # or "/peter" in line
        print "yes"
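
One caveat with the prefix and substring tests, sketched below: they also match longer entries that merely begin with the keyword. The "/peterson" line is a hypothetical extra entry that is not in the original file; it is there only to show the difference.

```python
# Lines as they would come from iterating over the file ("/peterson" is hypothetical).
lines = ["/john\n", "/peter\n", "/peterson\n", "/Sam\n"]

# Prefix and substring tests both accept "/peterson" as well.
prefix_hits = [line for line in lines if line.startswith("/peter")]
substring_hits = [line for line in lines if "/peter" in line]

# Removing the trailing newline and comparing exactly accepts only "/peter".
exact_hits = [line for line in lines if line.rstrip("\n") == "/peter"]

print(prefix_hits, substring_hits, exact_hits)
```

Whether the looser match is acceptable depends on what the keyword file is expected to contain.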

If you want to avoid storing the whole file in memory and still have a clean if statement, you can use strip() to remove leading and trailing whitespace, including the newline.

with open(file_name) as file_obj:
    for line in file_obj:
        if line.strip() == '/peter':
            print "yes"
eandersson
  • Feel free to comment on the downvote. Using `for line in keyword_file` would still cause the same problem with the trailing `\n` character. – eandersson Aug 06 '14 at 22:17
  • Anyway, I was replying to the problem, not displaying the best practice for reading files in Python. There are plenty of articles on that already, and in reality reading the whole string into memory is only really a problem when reading large chunks of data. http://programmers.stackexchange.com/questions/80084/is-premature-optimization-really-the-root-of-all-evil – eandersson Aug 06 '14 at 22:26

First off, you're not just looking for /peter; you're looking for /peter\n.

Second, there's a lot here that you can do to improve your script:

  1. Use with instead of forcing yourself to open and close your file:

    with open(text_file) as fp: <your code here>

  2. Instead of reading the entire file, read it line by line:

    for line in fp: <your business logic here>

  3. Compare your string using ==, not is. (I originally suggested is here; see the SO answer linked in the comments for why I was wrong.)

    if line == '/peter\n': <condition if peter is found>

Here's the combined script that matches what you're trying to do:

with open(text_file) as fp:
    for line in fp:
        if line == '/peter\n':
            print("yes")  # please use print(<what you want to print here>) instead of print <what you want here> for compatibility with 3.0 and readability.
James Mertz
  • Completely disagree with point 3 - you should definitely not be using `is` - see http://stackoverflow.com/questions/2988017/string-comparison-in-python-is-vs Agree with the rest though! – Ben Aug 06 '14 at 22:23
  • @eandersson I just checked it and it works correctly. – James Mertz Aug 06 '14 at 22:29
  • @KronoS: It shouldn't work with ```if line == '/peter':```, but with ```/peter\n```, which has been fixed in an edit. – eandersson Aug 06 '14 at 22:30
  • @eandersson Oh I see what you were saying. I meant to put that in there, but missed it. Thanks though. – James Mertz Aug 06 '14 at 22:31
  • What happens if the text file uses ```\r\n``` instead of ```\n```? http://programmers.stackexchange.com/questions/29075/difference-between-n-and-r-n – eandersson Aug 07 '14 at 12:53
  • @eandersson then use the following comparison: `if line == '/peter\n' or line == '/peter\r\n':` or use what you suggested: `if line.strip() == '/peter':`. Doesn't really matter. – James Mertz Aug 07 '14 at 14:22
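
The line-ending question raised in the last two comments can be sketched as follows. The helper name is_keyword is hypothetical and not part of either answer; it only illustrates removing trailing `\r` and `\n` characters before comparing, so one test covers both conventions.

```python
def is_keyword(line, keyword="/peter"):
    """Return True if line equals keyword once trailing CR/LF characters are removed.

    Hypothetical helper for illustration only.
    """
    return line.rstrip("\r\n") == keyword


# Unix ending, Windows ending, no ending, a longer name, and a leading space.
samples = ["/peter\n", "/peter\r\n", "/peter", "/peterson\n", " /peter\n"]
results = [is_keyword(s) for s in samples]
print(results)
```

Unlike strip(), rstrip("\r\n") leaves leading spaces in place, so the last sample is not treated as a match; use strip() instead if surrounding spaces should be ignored. Note also that Python 3 text mode with the default newline setting already translates `\r\n` to `\n` when reading, so the difference mostly shows up in Python 2 or when that translation is turned off.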