I'm exporting feeds from AlienVault OTX using staxii and trying to send them to MISP. When sending some feeds, the following error occurs:
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 3397: Body ('’') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
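As far as I can tell, the same error can be reproduced without MISP, which suggests it comes from a `str` body being encoded as Latin-1 at send time rather than from the file on disk. A minimal sketch (the sample string is made up for illustration):

```python
# Hypothetical sample text containing U+2019 (right single quotation mark),
# the character named in the error message.
body = "It\u2019s a test"

try:
    body.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError as exc:
    latin1_ok = False
    # exc.start / exc.end give the position of the offending character(s).
    bad_char = body[exc.start:exc.end]

# Encoding as UTF-8, as the message suggests, works for any text.
utf8_body = body.encode("utf-8")
```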
for filename in os.listdir(dest_directory):
    filenameWithDir = dest_directory+filename
    try:
        file_index += 1
        print("****************")
        print(dest_directory + filename)
        print(file_index)
        print("****************")
        misp_config.upload_stix(filenameWithDir, '1')
    except UnicodeEncodeError:
        with open(filenameWithDir, 'r') as file:
            filedata = file.read()
        filedata = filedata.replace('вЂ', ' ').replace('’', ' ').replace('“', ' ').replace('”', ' ')\
            .replace('–', ' ').replace('—', ' ').replace('™', ' ').replace('​', ' ').replace(' ', ' ')\
            .replace(' ', ' ').replace('…', ' ').replace('гЂЂ', ' ').replace('лЇёл¶Ѓ м •мѓЃнљЊл‹ґ м „л§ќ л°Џ 대비', ' ')\
            .replace(',', ' ').replace('•', ' ').replace('‑', ' ')
        with open(filenameWithDir, 'w') as file:
            file.write(filedata)
        file_index += 1
        print("****************")
        print(dest_directory + filename)
        print(file_index)
        print("****************")
        misp_config.upload_stix(filenameWithDir, '1')
I tried to replace characters that are not readable, but there are too many of them. Is it possible to delete characters by the position indicated in the error?
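To illustrate what I mean by deleting by position, here is a minimal sketch. It assumes the `start`/`end` attributes of `UnicodeEncodeError` point at the offending characters; `strip_unencodable` is a hypothetical helper name of my own, not part of MISP or any library, and the sample string is made up.

```python
def strip_unencodable(text, encoding="latin-1"):
    """Remove every character that cannot be encoded in `encoding`.

    Hypothetical helper: it retries the encode and cuts out the span
    reported by UnicodeEncodeError (exc.start / exc.end) each time.
    """
    while True:
        try:
            text.encode(encoding)
            return text
        except UnicodeEncodeError as exc:
            # Drop the run of characters the codec could not encode.
            text = text[:exc.start] + text[exc.end:]


# Made-up sample with an en dash, a curly apostrophe, a euro sign, an ellipsis.
sample = "price \u2013 it\u2019s 5\u20ac \u2026"
cleaned = strip_unencodable(sample)

# The built-in "ignore" error handler appears to give the same result in one call.
cleaned_builtin = sample.encode("latin-1", errors="ignore").decode("latin-1")
```

In my loop, something like this would presumably replace the long chain of `.replace()` calls before the file is written back and `upload_stix` is retried. I have not verified whether the upload could instead send the body as UTF-8 bytes, as the error message suggests.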