I have a Python script that reads the names of PDF files and writes them to an HTML file with links to the PDFs. Everything works unless a name contains special characters.
I have read many other answers on SE, to no avail.
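import os

f = open("jobs/index.html", "w")
# html divs go here
for root, dirs, files in os.walk('jobs/'):
    files.sort()
    for name in files:
        if name != "index.html" and name != ".htaccess":
            # splitext drops only the extension; rstrip(".pdf") would also
            # eat trailing '.', 'p', 'd' and 'f' characters from the name
            label = os.path.splitext(name)[0]
            f.write("<a href='" + name + "'>" + label + "</a>\n<br><br>\n")
            print label
f.close()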
Returns:
Caba�n-Sanchez, Jane.pdf
Smith, John.pdf
This of course breaks both the text and the link to that PDF.
How can I encode the file or the 'name' variable so that special characters are written correctly?
i.e., Cabán-Sanchez, Jane.pdf
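For reference, here is a minimal sketch of the kind of result I am after. It is not a confirmed fix, and it assumes Python 3, where the names returned by os.walk are already text (under Python 2 they would be byte strings unless a unicode path is passed in). The helper names link_line and write_index and the SKIP set are made up for illustration. The idea is to write the file explicitly as UTF-8, declare that charset in the HTML, and percent-encode the name inside the href.

```python
import io
import os
from html import escape
from urllib.parse import quote

# Files that should not appear in the listing (taken from the script above).
SKIP = {"index.html", ".htaccess"}


def link_line(name):
    """Return one HTML link line for a file name (hypothetical helper)."""
    label = os.path.splitext(name)[0]
    # quote() percent-encodes the UTF-8 bytes of the name for use in a URL;
    # escape() protects characters such as '&' and '<' in the visible label.
    return '<a href="{}">{}</a>\n<br><br>\n'.format(quote(name), escape(label))


def write_index(directory):
    """Write directory/index.html as UTF-8 with a link per file (hypothetical helper)."""
    index_path = os.path.join(directory, "index.html")
    with io.open(index_path, "w", encoding="utf-8") as f:
        # Tell the browser which encoding the page uses.
        f.write('<meta charset="utf-8">\n')
        for root, dirs, files in os.walk(directory):
            for name in sorted(files):
                if name not in SKIP:
                    f.write(link_line(name))
```

Calling write_index("jobs") would then replace the loop in my script. As in the original, the href is only the bare file name, so this assumes all PDFs sit directly in the jobs/ folder.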