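import fileinput, sys, codecs, re, unicodedata, os

def remove_control_characters(s):
    # Drop every character whose Unicode category starts with "C" (control, format, etc.)
    return "".join(ch for ch in s if unicodedata.category(ch)[0]!="C")

file_in = 'file_with_ctrl_characters.XML'
file_out = 'out_file.xml'
with open(file_out, 'a') as out:
    for line in fileinput.input([file_in]):
        # The trailing newline is itself a control character, so it is stripped and re-added
        out.write(remove_control_characters(line)+'\n')
# The with block closes the output file, so no explicit close() is needed
os.remove(file_in)
os.rename(file_out, file_in)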
In short, this code works in a Jupyter notebook. It reads a file, removes the control characters, and writes everything else to a new XML file.
Then it deletes the old file and renames the new file to the old file's name. I'm left with the same file name, but without the control characters.
I want to call this from the command prompt, passing it one argument: the path of the file I need the control characters removed from.
How do I go from Jupyter notebook code to a script I can call from the command prompt, giving it the file to remove the characters from?
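A minimal sketch of what such a script might look like, assuming it is saved as `remove_ctrl.py` (a hypothetical name) and that the file is UTF-8 encoded (also an assumption, since the original code relies on the platform default). It reads the path from `sys.argv`, and the `clean_file` helper is introduced here only for illustration. Instead of a fixed `out_file.xml`, it writes to a temporary file in the same directory and swaps it in with `os.replace`.

```python
import os
import sys
import tempfile
import unicodedata


def remove_control_characters(s):
    # Drop every character whose Unicode category starts with "C"
    return "".join(ch for ch in s if unicodedata.category(ch)[0] != "C")


def clean_file(path):
    # Hypothetical helper: rewrite the file at `path` without control characters.
    # Assumes UTF-8; change the encoding if the file uses something else.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as out, \
            open(path, encoding="utf-8") as src:
        for line in src:
            # The newline is a control character too, so add it back
            out.write(remove_control_characters(line) + "\n")
    # Replace the original file with the cleaned copy
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # sys.argv[0] is the script name; sys.argv[1] is the first argument
    if len(sys.argv) == 2:
        clean_file(sys.argv[1])
    else:
        print("Usage: python remove_ctrl.py <path-to-file>")
```

From the command prompt this would be called as `python remove_ctrl.py file_with_ctrl_characters.XML`. As in the notebook version, tabs are also removed, because they fall in the same Unicode category as the other control characters.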