VS2015 + Python 3.4 File IO

Question

I am trying to teach myself Python by writing a quick script, using Visual Studio 2015 and Python 3.4, to parse a Recuva output log file. A sample line from the log looks like this:

Starting Point Promo 2013 - Compilation-HD.mp4 15020953 M:\Videos For Future Use\

Note that the input file has multiple spaces between the file name, file size, and output path.

import os
import re

fileName = os.path.join( "C:\\", "Users", "temp", "Downloads", "deleted_files.txt" )

pattern = r"""
    (.*)
    \s{2,}
    \d+
    \s{2,}
    (.*)
"""

regex = re.compile( pattern, re.X )

try:
    with open( fileName ) as inputFile:
        # Loop over lines and extract variables of interest
        for line in inputFile:
            print( line )
            match_obj = regex.match( line )

            if match_obj: # pattern matched
                name = match_obj.group( 1 ) # group 1 is the first object
                directory = match_obj.group( 2 )
                print( name + "=>" + directory )
            else:
                print( "Line not matched: " + line )

        inputFile.close()
except (OSError, IOError) as err:
    raise IOError( "%s: %s" % (fileName, err.strerror))

Each time I run the code, I get an error on print( line ), and the debugger in VS2015 only shows newlines. Am I missing something in how I open and read the file? Please be kind; this is my first Python script!

The error message I get back is: ÿþ-'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined>

EDIT (Updated Trial): I receive the same error message if I run the script from the command line:

  File "RecuvaRenamer.py", line 21, in <module>
    print( line )
  File "C:\Python34\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined>

What error? There's nothing obviously wrong with the code up to that point. — jonrsharpe, Nov 03 '15 at 15:31
Apologies, forgot to add the error: ÿþ-'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined> — Sraivyn, Nov 03 '15 at 15:40
Maybe this question can help you: http://stackoverflow.com/questions/14630288/unicodeencodeerror-charmap-codec-cant-encode-character-maps-to-undefined — Kristian Damian, Nov 03 '15 at 15:54

score 1 · Answer 1 · answered Nov 03 '15 at 16:03

OK, never mind. There was nothing wrong with the code itself: the input file was not encoded in UTF-8. After I converted the input to UTF-8, the file "almost" parses correctly.
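For anyone hitting the same thing: the ÿþ at the start of the error message is what the bytes FF FE look like when read with a Windows code page, and FF FE is the UTF-16 little-endian byte order mark. So an alternative to converting the file may be to tell open() which encoding to use. The following is only a sketch under that assumption (that the log is UTF-16 with a BOM); parse_log and LINE_RE are names made up for this example, and the sample line is shortened from the one in the question.

```python
import os
import re
import tempfile

# Field layout from the question: name, size, directory, separated by 2+ spaces.
LINE_RE = re.compile(r"(.*)\s{2,}\d+\s{2,}(.*)")


def parse_log(path, encoding="utf-16"):
    """Return (name, directory) pairs from a log decoded with `encoding`.

    Assumption: the file is UTF-16 with a BOM. The "utf-16" codec reads
    the BOM to pick the byte order and does not return it as text.
    """
    pairs = []
    # An explicit encoding avoids falling back to the platform default.
    with open(path, encoding=encoding) as input_file:
        for line in input_file:
            match = LINE_RE.match(line.rstrip("\r\n"))
            if match:
                # The greedy first group can keep trailing spaces; strip them.
                pairs.append((match.group(1).strip(), match.group(2).strip()))
    return pairs


# Demo: write a small UTF-16 file (BOM included), then read it back.
sample = "Compilation-HD.mp4    15020953    M:\\Videos For Future Use\\\n"
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-16", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

result = parse_log(demo_path)
os.remove(demo_path)

# ascii() escapes non-ASCII characters, so printing is safe on a cp437 console.
print(ascii(result))
```

Printing through ascii() is a separate precaution: the traceback in the question comes from the console encoding (cp437), which cannot represent every character, so escaping before printing sidesteps the UnicodeEncodeError even when a file name contains unusual characters.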

Sraivyn
