I am trying to read this simple file line by line in Python:
q(A) p(B)
q(z) ∼p(x)
Then I strip the newline from each line and add the line to a list.
lst = []
f = open("input.txt", 'r')
t1 = f.readline().rstrip('\n')
t2 = f.readline().rstrip('\n')
lst.append(t1)
lst.append(t2)
print lst
The problem is that when I print the list, I get the following output:
['q(A) p(B)', 'q(z) \xe2\x88\xbcp(x)']
My file contains a tilde-like character (∼), and I think this causes the behavior. The bytes \xe2\x88\xbc are the UTF-8 encoding of U+223C (TILDE OPERATOR), which is not the ASCII tilde ~. The weird thing is that if I print t1 and t2 individually, they appear normally, but when I print lst, the character shows up as escaped bytes.
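A likely explanation is that printing a list shows the repr() of each element, and repr() escapes non-ASCII bytes, while printing a string directly sends the raw bytes to the terminal, which decodes them. The sketch below illustrates this in Python 3 (an assumption, since the code above is Python 2); the bytes literal stands in for the second line of input.txt, and the variable names are made up for illustration.

```python
# Python 3 sketch; `raw` is a stand-in for the second line of the file.
raw = b'q(z) \xe2\x88\xbcp(x)'

# Decoding as UTF-8 turns the three bytes E2 88 BC into one character.
text = raw.decode('utf-8')
tilde_like = text[5]

print(hex(ord(tilde_like)))   # code point of the tilde-like character
print(tilde_like == '~')      # comparison against the ASCII tilde

# A list's string form is built from repr() of its elements,
# so the undecoded bytes appear escaped inside the list.
print(str([raw]))
```

The code point printed here corresponds to the 8764 that the solution below checks for.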
EDIT: Answer
I managed to get exactly what I expected. In case anyone encounters the same problem, here is the solution:
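import codecs
# Decode the file as UTF-8 so each character is a single unicode code point
f = codecs.open("input2.txt", 'r', encoding='utf8')
lst = []
t1 = f.readline().rstrip('\n')
t2 = f.readline().rstrip('\n')
f.close()
res1 = ""
res2 = ""
for i in xrange(0, len(t1)):
    # 8764 is U+223C (TILDE OPERATOR); replace it with the ASCII tilde
    if ord(t1[i]) == 8764:
        res1 += "~"
    else:
        # Convert the remaining (ASCII) characters back to byte strings
        res1 += chr(ord(t1[i]))
for i in xrange(0, len(t2)):
    if ord(t2[i]) == 8764:
        res2 += "~"
    else:
        res2 += chr(ord(t2[i]))
lst.append(res1)
lst.append(res2)
print lst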
The output is now:
['q(A) p(B)', 'q(z) ~p(x)']
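For anyone on Python 3, the same idea can probably be written more compactly with str.replace instead of a character-by-character loop. This is only a sketch under assumptions: the helper name read_normalized_lines and the temporary demo file are hypothetical and not part of my original code, and it assumes the file is UTF-8 encoded.

```python
import io
import os
import tempfile

TILDE_OPERATOR = '\u223c'  # the tilde-like character stored in the file

def read_normalized_lines(path):
    """Read a UTF-8 text file and replace U+223C with the ASCII tilde.

    Hypothetical helper, named here only for illustration.
    """
    with io.open(path, 'r', encoding='utf-8') as f:
        return [line.rstrip('\n').replace(TILDE_OPERATOR, '~') for line in f]

# Demo: write the two sample lines to a temporary file, then read them back.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with io.open(fd, 'w', encoding='utf-8') as f:
    f.write('q(A) p(B)\nq(z) \u223cp(x)\n')

lst = read_normalized_lines(demo_path)
os.remove(demo_path)
print(lst)
```

Unlike the loop above, this version reads every line in the file rather than only the first two, and it leaves any other non-ASCII characters untouched instead of passing them through chr().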