My Django application works with both .txt and .doc file types. It opens a file, compares it with other files in the database, and prints out a report.
The problem is that when the file type is .txt, I get a 'utf-8' codec can't decode byte error (here I'm using encoding='utf-8'). When I switch encoding='utf-8' to encoding='ISO-8859-1', the error changes to 'latin-1' codec can't decode byte.
I want to find an encoding that works with every type of file. This is a small part of my function in views.py:
@login_required(login_url='sign_in')
def result(request):
    last_uploaded = OriginalDocument.objects.latest('id')
    original = open(str(last_uploaded.document), 'r', encoding='utf-8')
    original_words = original.read().lower().split()
    words_count = len(original_words)
    open_original = open(str(last_uploaded.document), "r")
    read_original = open_original.read()
    report_fives = open("static/report_documents/" + str(last_uploaded.student_name) +
                        "-" + str(last_uploaded.document_title) + "-5.txt", 'w')
    # Path to the documents the original document is compared against
    path = 'static/other_documents/doc*.txt'
    files = glob.glob(path)
    (rows, found_count, fives_count, rounded_percentage_five,
     percentage_for_chart_five, fives_for_report,
     founded_docs_for_report) = search_by_five(last_uploaded, 5, original_words, report_fives, files)
    context = {
        ...
    }
    return render(request, 'result.html', context)
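
To make the goal concrete, here is a minimal sketch of one possible approach for the .txt case: read the file as raw bytes and try a list of candidate encodings in order. The helper name read_text_with_fallback and the candidate list are hypothetical and not part of the application above. This only applies to plain-text files; a .doc file is a binary format, so it would need a dedicated parser rather than a different text encoding.

```python
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate encoding that decodes cleanly.

    Hypothetical helper: the name and the default candidate list are assumptions.
    """
    raw = Path(path).read_bytes()  # read bytes so decoding is under our control
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate failed, try the next one
    # Only reached if the caller passes a list in which no candidate succeeds:
    # decode as UTF-8 and substitute U+FFFD for undecodable bytes.
    return raw.decode("utf-8", errors="replace"), "utf-8 (with replacement)"
```

In the view, the two open(...) calls on the uploaded document could then be replaced by something like text, used_encoding = read_text_with_fallback(str(last_uploaded.document)). A fallback of this kind guarantees that decoding succeeds, not that the guessed encoding is the right one, so the result may still contain wrong characters.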