import os
import sys
import fileinput

print ("Text to search for:")
textToSearch = input( "> " ) 

print ("Text to replace it with:")
textToReplace = input( "> " )

print ("File to perform Search-Replace on:")
fileToSearch  = input( "> " )
#fileToSearch = 'D:\dummy1.txt'

tempFile = open( fileToSearch, 'r+' , encoding="utf8")

for line in fileinput.input( fileToSearch ):
    if textToSearch in line :
        print('Match Found')
    else:
        print('Match Not Found!!')
    tempFile.write( line.replace( textToSearch, textToReplace ) )
tempFile.close()

input( '\n\n Press Enter to exit...' )
Cindy Meister
  • What exactly is the problem you are having? – han solo Mar 22 '19 at 13:10
  • UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 643: character maps to <undefined>. I'm getting this message as an error. – Rishabh Sharma Mar 22 '19 at 13:11
  • Any particular reason for using `fileinput` and not just `open`? – han solo Mar 22 '19 at 13:13
  • Yeah, I'm designing it for generic use. It's complex, though: I have to read through Excel and manipulate the data for a docx file. – Rishabh Sharma Mar 22 '19 at 13:14
  • Possible duplicate of [UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>](https://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character) – srattigan Mar 22 '19 at 13:19

2 Answers

It looks like you're opening either a binary file (if so, save it as plain text) or a text file whose actual encoding doesn't match the one it is being read with (UTF-8).
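For the plain-text case, a minimal sketch might look like the following. It assumes the file really is UTF-8 text and small enough to read into memory; the function name `replace_in_text_file` is made up for illustration. The point is that the same explicit encoding is used for both reading and writing, instead of relying on the platform default.

```python
from pathlib import Path


def replace_in_text_file(path, old, new, encoding="utf-8"):
    """Replace every occurrence of `old` with `new`; return the match count.

    Hypothetical helper: assumes the whole file fits in memory and that
    `encoding` is the encoding the file was actually saved with.
    """
    file_path = Path(path)
    # Decode with an explicit encoding rather than the platform default.
    text = file_path.read_text(encoding=encoding)
    count = text.count(old)
    if count:
        # Only rewrite the file when there is something to change.
        file_path.write_text(text.replace(old, new), encoding=encoding)
    return count
```

Calling `replace_in_text_file("dummy1.txt", "foo", "bar")` would return the number of replacements made, so a result of `0` plays the role of the "Match Not Found" message in the question.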

If you need to work with a docx document, you need a specialized library for opening and reading it, for example python-docx.

user2622016

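You're getting the 0x8f error because the file contains a byte that cannot be decoded with the encoding Python is using to read it. The 'charmap' codec in the message is typically a single-byte code page such as cp1252 (what Notepad calls ANSI), in which 0x8f has no character assigned. Check how the text file is saved in Notepad: it might be ANSI rather than UTF-8, and the encoding used to read it has to match.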
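To see where the message comes from, here is a small sketch that reproduces it with an in-memory byte string and then probes a list of candidate encodings. The helper `first_working_encoding` and the candidate list are assumptions for illustration, not part of any library.

```python
def first_working_encoding(data, candidates=("utf-8", "cp1252")):
    """Return the first candidate encoding that decodes `data`, else None.

    Hypothetical helper. UTF-8 is tried first because it is strict; a
    successful cp1252 decode does not prove the file was saved as cp1252.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# "Ï" (U+00CF) is stored in UTF-8 as the two bytes C3 8F.
raw = "Ï".encode("utf-8")

try:
    # Reading UTF-8 bytes as cp1252 fails: 0x8f is unassigned in cp1252.
    raw.decode("cp1252")
    error_message = ""
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
print(first_working_encoding(raw))
```

In practice the bytes would come from `open(path, "rb").read()`, and the encoding that works is the one to pass to `open(..., encoding=...)`.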

Also, I would do a couple things differently.

First, use `re.search` instead of just `in`. It gives you more control over what counts as a match, and if you later want finer granularity, such as matching whole words only, it's easy to update.
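As a rough sketch of that idea, the function below (the name `find_and_replace` and its `whole_word` flag are invented for this example) treats the search text literally via `re.escape` and optionally wraps it in `\b` word boundaries.

```python
import re


def find_and_replace(line, search, replacement, whole_word=False):
    """Return (new_line, matched) for a literal search string.

    Hypothetical helper: `whole_word=True` adds \\b anchors, which only
    behaves as expected when `search` starts and ends with word characters.
    """
    pattern = re.escape(search)  # treat the user's input as literal text
    if whole_word:
        pattern = rf"\b{pattern}\b"
    if re.search(pattern, line) is None:
        return line, False
    # A function replacement avoids backslash escapes being interpreted.
    return re.sub(pattern, lambda _match: replacement, line), True


print(find_and_replace("cat concatenate", "cat", "dog"))
print(find_and_replace("cat concatenate", "cat", "dog", whole_word=True))
```

With `whole_word=False` this behaves like the `in` check plus `str.replace` from the question; switching the flag on is the only change needed to skip matches inside longer words.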

Second, use a real Excel library like openpyxl, and do the same for docx with python-docx (it is imported as `docx`). These documents look like plain text in their editors, but on disk they are zipped archives of XML, not text files. Trying to work through them with `fileinput` without treating them as such is going to get messy. You can choose which library to use based on the filename extension, so you keep the reusability but use the right tool for each job.
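One way this dispatch could be sketched is shown below. All function names are hypothetical. It assumes the third-party packages python-docx and openpyxl are installed when the corresponding handler is actually called, and that plain-text files are UTF-8. The docx handler only replaces text that sits inside a single run, so a phrase split across differently formatted runs would be missed.

```python
from pathlib import Path


def replace_in_docx(path, old, new):
    import docx  # third-party: python-docx, imported lazily

    document = docx.Document(str(path))
    for paragraph in document.paragraphs:
        for run in paragraph.runs:
            if old in run.text:
                run.text = run.text.replace(old, new)
    document.save(str(path))


def replace_in_xlsx(path, old, new):
    import openpyxl  # third-party, imported lazily

    workbook = openpyxl.load_workbook(str(path))
    for sheet in workbook.worksheets:
        for row in sheet.iter_rows():
            for cell in row:
                # Only string cells can contain the search text.
                if isinstance(cell.value, str) and old in cell.value:
                    cell.value = cell.value.replace(old, new)
    workbook.save(str(path))


def replace_in_text(path, old, new):
    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")  # assumed encoding
    file_path.write_text(text.replace(old, new), encoding="utf-8")


# Map file extensions to handlers; anything else is treated as text.
HANDLERS = {".docx": replace_in_docx, ".xlsx": replace_in_xlsx}


def pick_handler(path):
    return HANDLERS.get(Path(path).suffix.lower(), replace_in_text)


def search_replace(path, old, new):
    pick_handler(path)(path, old, new)
```

The lazy imports keep the text path usable even when neither library is installed, and supporting another format only means adding an entry to `HANDLERS`.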

Subsum44