Let me start by saying that I do not speak Chinese, nor would there be any reason for my default output to be in Chinese. That said, this is both the strangest and most hilarious bug I've ever encountered.
To start with, my code is supposed to count how many times each substring of length four appears, with overlaps, in a DNA sequence. The relevant code looks like this:
    # file containing data
    f = open(infile, 'r')
    # open an additional file to write output to
    g = open("k-Mer output.txt", 'w')
    # empty list
    l = []
    # add lines of file to list
    for line in f:
        l.append(line.strip())
    d = {}
    # add every unique substring of four to my dict
    for i in four_mer_maker():
        d[i] = 0
    # l[1] is the sequence to be examined, assume it is all 1 line
    # check four letters, then shift over one and check those 4
    for i in range(len(l[1]) - 3):
        d[l[1][i:i+4]] += 1
    # now just write the ordered values to an output file
    for i in sorted(d.items()):
        g.write(str(i[1]) + ' ')
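For anyone who wants to reproduce the counting step without my input file, here is a minimal self-contained sketch of the same idea. The body of `four_mer_maker` shown here is a hypothetical stand-in (my real one is not included above), and `count_k_mers` and the sample sequence are made up for illustration:

```python
from itertools import product


def four_mer_maker(alphabet="ACGT", k=4):
    """Hypothetical stand-in: yield every possible k-mer over the alphabet."""
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


def count_k_mers(sequence, k=4):
    """Count overlapping k-mers by sliding a window one position at a time."""
    counts = {k_mer: 0 for k_mer in four_mer_maker(k=k)}
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return counts


# Made-up sample sequence, only for illustration.
counts = count_k_mers("ACGTACGTA")

# Same ordering as the loop over sorted(d.items()): counts sorted by k-mer.
output_line = " ".join(str(n) for _, n in sorted(counts.items()))
```

Like my original loop, this raises a `KeyError` if the sequence contains anything other than A, C, G, or T.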
The output file is complete gibberish and looks like this:
‴‱‴″‰‱‱‵‱″‱′′‱′‰‱‱″‱′‱
Even stranger, I tried playing with the output a bit. Changing just the write call to

    g.write(str(i[1]) + 'hello')

makes my output look like this:
栴汥潬栱汥潬栴汥潬栳汥潬栰汥潬栱汥潬栱
Google Translate says it's Chinese. What the heck is happening?
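In case it helps narrow things down: the characters above are consistent with the plain ASCII bytes I meant to write being read back two at a time as UTF-16 little-endian. This is only a sketch under that assumption; I have not confirmed that the program displaying the file is actually doing this, and the strings below are just the first few values I expected to see.

```python
# The first few counts as I intended them to be written: digit, space, ...
intended = "4 1 4 3 0 1 1 5 "

# Reinterpret the same bytes as UTF-16 little-endian: each digit+space
# pair (e.g. 0x34 0x20) becomes one code point (U+2034).
misread = intended.encode("ascii").decode("utf-16-le")
print(misread)

# Same experiment with 'hello' appended instead of a space.
intended_hello = "4hello1hello4hello"
misread_hello = intended_hello.encode("ascii").decode("utf-16-le")
print(misread_hello)
```

Both printed strings match the start of the gibberish I see in the file, which makes me suspect the file contents are fine and the problem is in how the file is being decoded for display.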