I have a list of words in a text file. I did not produce the file, so I do not know its encoding.
The list: http://s000.tinyupload.com/?file_id=31195244104486221180
Notepad++ tells me that it's ANSI.
When running this script (reader1.py) :
    if __name__ == '__main__':
        words = open("test_list.txt").read().splitlines()
        for word in words:
            print word
            with open("test_list-rewrite.txt", "a") as myfile:
                myfile.write(word + '\n')
The word piirilä is displayed as piirilõ in the console, but in the new file it is stored as piirilä. What I wonder is: if I compute the SHA-256 hash of the variable word, will it be computed on piirilä or on piirilõ?
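A minimal sketch of what may be happening here, assuming the file is encoded as cp1252 (what Notepad++ calls ANSI) and the console uses code page cp850. Both encodings are guesses, and the byte string `raw` is a hypothetical stand-in for the line read from the file. A hash function only ever sees bytes, so how the console renders those bytes should not affect the digest:

```python
from __future__ import print_function
import hashlib

# Hypothetical stand-in for one line of test_list.txt, assuming cp1252:
# the letter a-umlaut is the single byte 0xE4 in that encoding.
raw = b'piiril\xe4'

# The same bytes rendered through two different code pages.
as_cp1252 = raw.decode('cp1252')  # how a cp1252-aware editor would read them
as_cp850 = raw.decode('cp850')    # how a cp850 console would read them

print(repr(as_cp1252))
print(repr(as_cp850))

# The digest depends only on the bytes, not on how they are displayed.
digest = hashlib.sha256(raw).hexdigest()
print(digest)
```

Under these assumptions the two decoded strings differ in their last character, while the digest is computed once, from the raw bytes.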
I also tried word = word.decode('cp-1252'), but it raises an exception.
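The traceback is not shown, so the cause is unclear. One possibility, which is only a guess, is that the hyphenated name 'cp-1252' is not a codec alias that Python recognises, whereas 'cp1252' is. A sketch of checking a codec name before decoding follows; `decode_if_known` is a hypothetical helper, not part of reader1.py:

```python
from __future__ import print_function
import codecs


def decode_if_known(raw, encoding_name):
    """Decode raw bytes, or return None when the codec name is not registered."""
    try:
        codecs.lookup(encoding_name)  # raises LookupError for unknown names
    except LookupError:
        return None
    return raw.decode(encoding_name)


# Assumed cp1252 bytes for the word in question.
raw = b'piiril\xe4'

# Compare the plain name with the hyphenated one used above.
for name in ('cp1252', 'cp-1252'):
    print(name, repr(decode_if_known(raw, name)))
```

If the second line prints None, the codec name rather than the data would be the problem.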
Thanks
PS: Windows 8.1 64-bit, Python 2.7 64-bit
Edit
After some more fiddling I found something weird. I made this script:
#!/usr/bin/env python
# --*-- encoding: utf-8 --*--
import hashlib
word1 = 'piirilä'
word2 = 'piirilõ'
word3 = 'Whatitis'
print word1
print hashlib.sha256(word1).hexdigest()
print word2
print hashlib.sha256(word2).hexdigest()
print word3
print hashlib.sha256(word3).hexdigest()
which outputs this :
piirilä
278394edd22799ae29bc881dc66e45e45a9a18972c45a35208b6a3d71e209a10
piiril├Á
7e158cf465d3afadd865684f979f46a5282ef93127c150b55273801086fa3c09
Whatitis
d338e8077b6c9d3d2f09e4e2d4a2a5f52152b72e9b6bb5c456a67f63d853e75f
I also added hashlib.sha256(word).hexdigest() to reader1.py, which then outputs this:
billycorgan
d94a3821ad2b6d26aedf4db13b551d9e0eefeaf92d0615946cdc0215ec974692
brescos64
8840d0e40a83d711ce0b44ed66a5d1e4df06fbf6c5c168e98af4775c6e19f52b
matvois
ef5e930806489e8fcc8e0746ce5f8cb4c6715a56d2fd73d42b1c711b5e71474f
kbeans
c207d8366f3dbae64357088dee8eeeb35a047b2e021342c82aa0bd8c15753d74
Whatitis
d338e8077b6c9d3d2f09e4e2d4a2a5f52152b72e9b6bb5c456a67f63d853e75f
cphu
1427ebcff066a5386d0649842fb60b014bebfc5a1589896a62488865e8f06c50
de'mystifierait
83665461f98de4c270e6a4d69a445ea2f9079693824c0544a9add4caee5c7dd2
wendelboe
1423bf5d682dafdc72937d92811b5ff9d856681e94204d565cb0f29b809f5e13
ketanshah
f9977718f33f9068f20c52321ef02be3611e7c7a0aebb59421e74f864c259f53
piirilõ
a238ede50bc349279c62399b275cfa3271f63bc5e7499cc40aaa4ff84198666d
gasoline
4325ed4bef2a2a10c97cbb8235f822602efc0f04a900f0eb537f8e9fee9728aa
BabyBlues
8168fce33124ecec74e647f119de5b3cda795dcc69c4237d8cf27b10aca07b94
So I get three different hashes for what looks like the same word. Which one is the one I want?
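One way to see where three digests could come from, as a sketch under assumptions: the .py file is saved as UTF-8 (per its encoding line) while the word list is cp1252, so the same visible word ends up as different byte strings. The dictionary name `candidates` and its labels are made up for illustration:

```python
from __future__ import print_function
import hashlib

# Three byte strings that could all be shown as "the same word",
# depending on the assumed encoding of the file they came from.
candidates = {
    'a-umlaut, cp1252 (assumed word list encoding)': u'piiril\xe4'.encode('cp1252'),
    'a-umlaut, utf-8 (assumed script encoding)': u'piiril\xe4'.encode('utf-8'),
    'o-tilde, utf-8 (assumed script encoding)': u'piiril\xf5'.encode('utf-8'),
}

# Hash each byte string separately and keep the digests for comparison.
digests = {}
for label, data in sorted(candidates.items()):
    digests[label] = hashlib.sha256(data).hexdigest()
    print(label, repr(data), digests[label])
```

If the printed digests match the three in the output above, that would suggest the hashes differ only because the underlying bytes differ, and the one to keep depends on which encoding the other side of the comparison uses.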