I'm trying to save some cipher text to a new text file whose name is chosen by the user. However, when I run the code, it displays this message:

Please enter the name you wish the file to be called: cipher

Traceback (most recent call last):
  File "C:/Users/User/Documents/file figure.py", line 19, in <module>
    f.write(cipher_text_write)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\x8e' in position 1: character maps to <undefined>

I managed to figure out that it is the actual message that I want to save that is causing the problem. Any help will be appreciated!

Here's my code:

cipher_text = " «²²µ ³¿ ´§³« ¯¹ µ´«²² "

filename = input("Please enter the name you wish the file to be called: ")
cipher_text_write = str(cipher_text)
cipher_filename = filename + ".txt"
f = open(cipher_filename,"w+")
f.write(cipher_text_write)
f.close()
Nayuki
DonSmi
  • possible duplicate of [UnicodeEncodeError: 'charmap' codec can't encode characters](http://stackoverflow.com/questions/27092833/unicodeencodeerror-charmap-codec-cant-encode-characters) – Jake Sep 15 '15 at 19:25
  • While the error is the same, the root cause and resolution are different. I don't think it's a duplicate. – nitzanms Sep 15 '15 at 19:33

1 Answer

The important thing to understand here is that the result of an encryption is simply a stream of bits. These bits do not necessarily correspond to a legal character string.

Your cipher text contains characters, such as '\x8e', that the file's default encoding (cp1252 on your system, as the traceback shows) cannot represent. There are several ways to solve this, but the easiest are to open the file in binary mode and write bytes, or to encode the bytes in something like base64 first, for example:

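>>> import os
>>> import base64
>>> s = os.urandom(10000)        # random bytes standing in for cipher output
>>> encs = base64.b64encode(s)   # ASCII-only bytes, safe to store as text
>>> s2 = base64.b64decode(encs)  # decoding restores the original bytes
>>> s == s2
True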
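The binary-mode option could look roughly like the sketch below. It assumes every character of the cipher text has a code point below 256, so that `latin-1` maps each character to exactly one byte; the sample string and the temporary file name are made up for illustration.

```python
import os
import tempfile

# Hypothetical cipher text written with escapes; the assumption is that
# every character has a code point below 256.
cipher_text = "\x8e\xab\xb2\xb2\xb5 \xb3\xbf"

# latin-1 maps code points 0-255 one-to-one onto single bytes.
cipher_bytes = cipher_text.encode("latin-1")

# Illustrative location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "cipher.bin")

# "wb" writes raw bytes, so no text codec such as cp1252 is involved.
with open(path, "wb") as f:
    f.write(cipher_bytes)

# "rb" returns the same bytes, which decode back to the original string.
with open(path, "rb") as f:
    restored = f.read().decode("latin-1")
```

If the cipher can produce code points of 256 or above, `latin-1` will raise the same kind of error, and the cipher output is better kept as `bytes` from the start.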

When you want to read the cipher text back, open the file the same way you wrote it: in binary mode if you wrote raw bytes, or as text followed by a base64 decode if you wrote the base64 representation.
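As a rough illustration of the base64 route, the following sketch writes the cipher bytes to an ordinary text file and reads them back. The helper names `save_cipher` and `load_cipher`, the file name, and the random stand-in data are assumptions for the example, not part of the original code.

```python
import base64
import os
import tempfile


def save_cipher(path, cipher_bytes):
    """Write cipher bytes to a text file as base64, which is ASCII only."""
    encoded = base64.b64encode(cipher_bytes).decode("ascii")
    with open(path, "w", encoding="ascii") as f:
        f.write(encoded)


def load_cipher(path):
    """Read the base64 text back and return the original cipher bytes."""
    with open(path, "r", encoding="ascii") as f:
        return base64.b64decode(f.read())


# Random bytes stand in for real cipher output.
cipher_bytes = os.urandom(32)
path = os.path.join(tempfile.mkdtemp(), "cipher.txt")

save_cipher(path, cipher_bytes)
round_tripped = load_cipher(path)
```

The file stays readable in any text editor, at the cost of being about a third larger than the raw bytes.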

nitzanms
  • By doing this will I be able to read the file back in to decrypt it using a key and offset factor? – DonSmi Sep 15 '15 at 19:51
  • it's unrelated to your ability to decrypt it. the important part is, that you can reconstruct the `cipher_text` exactly from the data stored in the file (which is what this answer is about) – umläute Sep 15 '15 at 19:55
  • I've revised my answer to be more comprehensive. Let me know if you still have any questions. – nitzanms Sep 16 '15 at 12:24
  • @nitzanms Thank you so much, after 4 hours of trying different methods I finally got it to work. I've ended up encoding it with utf-8. Thanks a lot! – DonSmi Sep 18 '15 at 19:58
  • This is NOT a good idea. UTF-8 is a character encoding, and not every bit pattern is valid in it. This question is for C#, but Jon Skeet's answer is relevant here: http://stackoverflow.com/questions/7996955/encoding-to-use-to-convert-bytes-array-to-string-and-vice-versa – nitzanms Sep 19 '15 at 16:30