As a follow-up to a question someone helped me with here yesterday ("Lost in XML and Python"), I am trying to compare two strings.
- String one is read from an XML file
- String two is read from a CSV file
The problem is that the two are stored differently:
CSV FILE HAS : "‚"
XML FILE HAS : "‚"
(but without the surrounding quotes).
Printing the strings at the time of comparison shows me why they do not match.
These are the strings it is trying to match:
FROM XML : ‚
FROM CSV : \x82
This will probably happen for a lot more characters than this particular one. My question is: how do I resolve this?
- Read XML file differently?
- Read CSV file differently?
- Convert stored string before comparison?
After the comparison, the matching strings need to be stored and printed back in the same format as the string in the XML.
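For illustration, here is a minimal sketch of the third option (converting before comparison). It assumes Python 3 and that the CSV file is encoded as Windows-1252 (cp1252), where the byte \x82 is the "‚" character; neither assumption is confirmed above. The names raw_from_csv and text_from_xml are made up for the example. The last step shows one possible way to write the character as a numeric XML reference, in case that is the format the XML uses.

```python
# Assumption: the CSV bytes are Windows-1252 (cp1252) encoded.
raw_from_csv = b"\x82"       # the raw byte as it comes out of the CSV file
text_from_xml = "\u201a"     # the same character as a Unicode string from an XML parser

# Decode the bytes to text first, then compare text with text.
decoded = raw_from_csv.decode("cp1252")
matches = decoded == text_from_xml

# One possible way to print the character as a numeric XML character reference.
as_xml_ref = decoded.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(matches, as_xml_ref)
```

If the CSV turns out to use a different encoding, only the codec name passed to decode() would need to change.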
Here is how I am opening and reading my CSV file:
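import csv

csvinput = 'csvsmall.csv'  # path to the CSV file
# Open in binary mode and sniff the dialect from the first 1024 bytes
csvfile = open(csvinput, "rb")
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)  # rewind so the reader starts at the first row
reader = csv.reader(csvfile, dialect)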
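As a sketch of the "read the CSV file differently" option: in Python 3 the file can be opened in text mode with an explicit encoding, so every field arrives already decoded. The cp1252 encoding, the ";" delimiter (taken from the split(';') used further down) and the sample rows are all assumptions made for this example, not facts about the real file.

```python
import csv
import os
import tempfile

# Hypothetical sample standing in for the real CSV, written as cp1252 bytes.
sample_bytes = "label;filename\r\n\u201a;a.xml\r\n".encode("cp1252")

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(sample_bytes)
    csv_path = tmp.name

# Text mode with an explicit encoding; newline="" is what the csv docs recommend.
with open(csv_path, newline="", encoding="cp1252") as csvfile:
    rows = list(csv.reader(csvfile, delimiter=";"))

os.remove(csv_path)  # clean up the temporary sample file

print(rows)
```

Each field in rows is then an ordinary Unicode string and can be compared directly with the text coming from the XML parser.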
============================UPDATE============================================
OK, so going by the replies, I think it would be easiest to find a way to convert the escaped strings in the CSV file to the version in the XML file.
That would mean converting:
"‚", which looks like it is being read as \x82, to "‚"
Does anyone have any tips on how to do this for all the values of the CSV that are stored in a dictionary? The dictionary is built like this:
filenameToLabel = {}
for l,f in (x.strip().split(';') for x in (csvfile.readlines())[1:]):
    filenameToLabel[f] = l
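One way this conversion could look, as a rough sketch: decode each raw line before splitting it, so the dictionary only ever holds Unicode strings. It assumes Python 3, cp1252-encoded bytes and a header row; raw_lines and the file names in it are invented sample data standing in for csvfile.readlines().

```python
# Hypothetical raw lines as they would come from a file opened in binary mode.
# Assumption: the bytes are cp1252-encoded and the first line is a header.
raw_lines = [
    b"label;filename\r\n",
    b"\x82;first.xml\r\n",
    b"plain;second.xml\r\n",
]

filename_to_label = {}
for raw in raw_lines[1:]:  # skip the header row
    # Decode first, then strip the line ending and split on the delimiter.
    label, filename = raw.decode("cp1252").strip().split(";")
    filename_to_label[filename] = label

print(filename_to_label)
```

Decoding once at read time avoids having to convert the stored values again before every comparison.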