
I am using this code to find a string in Python:

buildSucceeded = "Build succeeded."
datafile = r'C:\PowerBuild\logs\Release\BuildAllPart2.log'

with open(datafile, 'r') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)

I am quite sure the string is in the file, but the code prints nothing.

If I just print the file line by line, the output has a lot of 'NUL' characters between each "valid" character.

EDIT 1: The problem was the encoding of the file on Windows. I changed the encoding following this post and it worked: "Why doesn't Python recognize my utf-8 encoded source file?"
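
NUL characters between the visible ones are typical of a UTF-16 file being read as a single-byte encoding. The sketch below shows that interleaving and one way to check a file for a byte order mark before choosing an encoding. The helper name `sniff_bom` and the temporary sample files are assumptions made for illustration; they are not part of the original script.

```python
import codecs
import os
import tempfile


def sniff_bom(path):
    """Guess an encoding from the file's byte order mark, or return None."""
    # Hypothetical helper; UTF-32 is not handled in this sketch.
    with open(path, "rb") as f:
        head = f.read(3)
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None


# In UTF-16 (little endian), every ASCII character is followed by a NUL byte.
interleaved = "Build".encode("utf-16-le")

# Throwaway files stand in for the real build log.
with tempfile.TemporaryDirectory() as tmp:
    utf16_log = os.path.join(tmp, "utf16.log")
    with open(utf16_log, "w", encoding="utf-16") as f:
        f.write("Build succeeded.\n")

    plain_log = os.path.join(tmp, "plain.log")
    with open(plain_log, "w", encoding="ascii") as f:
        f.write("Build succeeded.\n")

    guesses = (sniff_bom(utf16_log), sniff_bom(plain_log))

print(interleaved, guesses)
```

A file without a byte order mark gives `None` here, in which case the encoding has to be determined some other way.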

Anyway the file looks like this:

Line 1.
Line 2.
...
Build succeeded.
    0 Warning(s)
    0 Error(s)
...

I am currently testing with the Sublime editor for Windows, which shows a 'NUL' character between each "real" character. That seems very odd.

Running the script from the command line gives this output:

C:\Dev>python readFile.py
Traceback (most recent call last):
  File "readFile.py", line 7, in <module>
    print(line)
  File "C:\Program Files\Python35\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined>

Thanks for your help anyway...

Community
  • 1. "Quite sure" isn't enough I'm afraid. 2. Try to use `strip` in `if buildSucceeded in line.strip()` in order to remove trailing `'\n'`. – DeepSpace Mar 24 '17 at 18:19
  • Try `for line in f:` instead of splitting the entire file. Then you can strip out the nul chars before you print. – Ryan Mar 24 '17 at 18:19
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation. [Minimal, complete, verifiable example](http://stackoverflow.com/help/mcve) applies here. We cannot effectively help you until you post your MCVE code and accurately describe the problem. Read the file line by line, print each line as you read it, and see what you *actually* have. If this fails, *then* chop the data file down to a few lines, reproduce the problem, and post the output here. – Prune Mar 24 '17 at 18:21

2 Answers

If your file is not that big, you can do a simple find. Otherwise, I would check the file to confirm that it contains the string, check the location for spelling mistakes, and try to narrow down the problem.

f = open(datafile, 'r')
lines = f.read()
answer = lines.find(buildSucceeded)

Also note that if it does not find the string, answer would be -1.
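
A minimal sketch of the same `find` approach, wrapped in a hypothetical helper `find_in_file` that closes the file when done. The `encoding` parameter and its `"utf-16"` default are assumptions based on how the question was eventually resolved; adjust them to match the actual file. The demo runs on a temporary file rather than the real log.

```python
import os
import tempfile


def find_in_file(path, needle, encoding="utf-16"):
    """Return the index of needle in the file's text, or -1 if it is absent."""
    # Hypothetical helper; the encoding default is an assumption, not a given.
    with open(path, "r", encoding=encoding) as f:
        return f.read().find(needle)


# Throwaway file that stands in for the real build log.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.log")
    with open(sample, "w", encoding="utf-16") as f:
        f.write("Line 1.\nBuild succeeded.\n    0 Warning(s)\n")

    found_at = find_in_file(sample, "Build succeeded.")
    missing = find_in_file(sample, "Build FAILED.")

print(found_at, missing)
```

Reading the whole file at once is only reasonable for small logs, as noted above.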

Chance

As explained, the problem was related to encoding. The website linked below has a very good explanation of how to convert a file from one encoding to another.

I used the last example (with Python 3, which is my case) and it worked as expected:

buildSucceeded = "Build succeeded."
datafile = 'C:\\PowerBuild\\logs\\Release\\BuildAllPart2.log'

# Open both input and output streams.
#input = open(datafile, "rt", encoding="utf-16")
input = open(datafile, "r", encoding="utf-16")
output = open("output.txt", "w", encoding="utf-8")

# Stream chunks of unicode data.
with input, output:
    while True:
        # Read a chunk of data.
        chunk = input.read(4096)
        if not chunk:
            break
        # Remove vertical tabs.
        chunk = chunk.replace("\u000B", "")
        # Write the chunk of data.
        output.write(chunk)

# Read the converted copy back with the encoding it was written in.
with open('output.txt', 'r', encoding='utf-8') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)

Source: http://blog.etianen.com/blog/2013/10/05/python-unicode-streams/
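
If the converted copy is not needed for anything else, the intermediate `output.txt` can probably be skipped: opening the log with the right `encoding` lets the search run on the decoded text directly. This is a minimal sketch, assuming the log is UTF-16 with a byte order mark, as the conversion above suggests. `matching_lines` is a hypothetical helper name, and the demo uses a temporary file rather than the real log.

```python
import os
import tempfile


def matching_lines(path, needle, encoding="utf-16"):
    """Yield each line of the file that contains needle, without its line ending."""
    # Hypothetical helper; the encoding default is an assumption about the log.
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            if needle in line:
                yield line.rstrip("\n")


# Throwaway file with the same shape as the log shown in the question.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "BuildAllPart2.log")
    with open(sample, "w", encoding="utf-16") as f:
        f.write("Line 1.\nLine 2.\nBuild succeeded.\n    0 Warning(s)\n    0 Error(s)\n")

    hits = list(matching_lines(sample, "Build succeeded."))

print(hits)
```

The file is still read line by line, so this should work for large logs as well.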