I have a method in which a .txt file is parsed with Scanner, reassembled with DocumentBuilder, and transformed into an .xml file with TransformerFactory.
Everything works fine, with one small inconvenience: the file created this way has what I assume to be a BOM at the beginning of its name. I'm encoding in UTF-8.
It's saved as %EF%BB%BFexample.xml instead of example.xml.
How can I avoid that?
EDIT: As you can see in the comments, it was pointed out to me that the first line, fileTitle, which Scanner reads from userText, probably contains the UTF-8 BOM. That turned out to be true (again, see the comments).
// Uses java.io.*, java.util.Scanner, javax.xml.parsers.*, javax.xml.transform.*,
// javax.xml.transform.dom.DOMSource, javax.xml.transform.stream.StreamResult and org.w3c.dom.*
private void writeXML() {
    try {
        File userText = new File(passedPath);
        Scanner scn = new Scanner(new FileInputStream(userText), "UTF-8");
        String separate = ";";

        // The first line holds the title; this is where the BOM sneaks in (see EDIT)
        String fileTitle = scn.nextLine();
        int indSepTitle = fileTitle.indexOf(separate);
        fileTitle = fileTitle.substring(0, indSepTitle);
        String fileOutputName = fileTitle + ".xml";
        File mOutFile = new File(getFilesDir(), fileOutputName);

        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

        //root element
        Document doc = docBuilder.newDocument();
        Element rootElement = doc.createElement("Collection");
        doc.appendChild(rootElement);

        //List element
        Element listElement = doc.createElement("List");
        rootElement.appendChild(listElement);

        //set Attributes to listElement
        Attr attr = doc.createAttribute("name");
        attr.setValue(fileTitle);
        listElement.setAttributeNode(attr);

        while (scn.hasNextLine()) {
            String line = scn.nextLine();
            String[] parts = line.split(separate);

            //vocabulary element
            Element ringElement = doc.createElement("element_ring");
            listElement.appendChild(ringElement);

            //add 1st Element
            Element n1Element = doc.createElement("element1");
            n1Element.appendChild(doc.createTextNode(parts[0]));
            ringElement.appendChild(n1Element);

            //add 2nd Element
            Element n2Element = doc.createElement("element2");
            n2Element.appendChild(doc.createTextNode(parts[1]));
            ringElement.appendChild(n2Element);
            ...
            //add other Elements accordingly
            ...
        }

        //write the content into xml file
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(mOutFile);
        transformer.transform(source, result);
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (TransformerConfigurationException e) {
        e.printStackTrace();
    } catch (TransformerException e) {
        e.printStackTrace();
    }
}
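
Since the BOM turned out to come from the first line read by Scanner, one possible workaround is to drop a leading U+FEFF character (what the bytes EF BB BF decode to) before building the file name. The following is only a minimal sketch, assuming the BOM can only appear as the very first character of the input. The class name BomStripper, the helper stripBom, and the sample input are made up for illustration and are not part of the method above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class BomStripper {
    // U+FEFF is the character that the UTF-8 bytes EF BB BF decode to.
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading BOM, or unchanged if there is none. */
    public static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // Hypothetical sample input: a BOM followed by a "title;rest" line.
        byte[] bytes = "\uFEFFexample;rest\n".getBytes(StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream(bytes);

        try (Scanner scn = new Scanner(in, "UTF-8")) {
            // Strip the BOM right after reading the first line,
            // before the title is used for the file name.
            String fileTitle = stripBom(scn.nextLine());
            fileTitle = fileTitle.substring(0, fileTitle.indexOf(';'));
            System.out.println(fileTitle + ".xml"); // prints "example.xml"
        }
    }
}
```

In writeXML() this would presumably amount to wrapping the first scn.nextLine() call in stripBom(...), so that both the file name and the name attribute are built from the cleaned title.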