I have a very large text file, where most of the lines are composed of ASCII characters, but a small fraction of lines have non-ASCII characters. What is the fastest way to create a new text file containing only the ASCII lines? Right now I am checking each character in each line to see if it's ASCII, and writing each line to the new file if all the characters are ASCII, but this method is rather slow. Also, I am using Python, but would be open to using other languages in the future.
Edit: updated with code
#!/usr/bin/python
import string
def isAscii(s):
for c in s:
if ord(c) > 127 or ord(c) < 0:
return False
return True
f = open('data.tsv')
g = open('data-ASCII-only.tsv', 'w')
linenumber = 1
for line in f:
if isAscii(line):
g.write(line)
linenumber += 1
f.close()
g.close()