I'm using the Jaxb2Marshaller (org.springframework.oxm.jaxb.Jaxb2Marshaller) in my Spring Batch application to marshal XML with annotated classes. The marshaller is configured as follows:
    @Bean
    public Jaxb2Marshaller productMarshaller() {
        Map<String, Object> props = new HashMap<String, Object>();
        props.put("com.sun.xml.bind.marshaller.CharacterEscapeHandler", new XmlCharacterEscapeHandler());
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(new Class[] {Product.class, TechSpecs.class});
        marshaller.setMarshallerProperties(props);
        return marshaller;
    }
The marshaller is used inside a StaxEventItemWriter that is implemented as follows:
    @Bean(name = "writer")
    @StepScope
    public StaxEventItemWriter<Product> writer(
            @Value("#{jobParameters['path']}") String path,
            @Value("#{stepExecutionContext['currentFile']}") String fileName
    ) {
        Map<String, String> rootElementAttributes = new HashMap<String, String>();
        rootElementAttributes.put("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
        FileSystemResource file = new FileSystemResource(path + fileName);
        return new StaxEventItemWriterBuilder<Product>()
                .name("writer")
                .version("1.0")
                .encoding("UTF-8")
                .standalone(false)
                .rootTagName("Products")
                .rootElementAttributes(rootElementAttributes)
                .headerCallback(headerCallback(null, null))
                .footerCallback(footerCallback())
                .marshaller(productMarshaller())
                .resource(file)
                .build();
    }
Now the problem is that when I run the code, I get an IndexOutOfBoundsException. I found out that the exception is thrown because my Product object has a String attribute that may contain a &. A literal & is not allowed in XML character data and has to be escaped.
Why is the Jaxb2Marshaller not auto-escaping the & character? As far as I understand, the marshaller should take care of escaping characters.
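To illustrate the behavior I expect, here is a minimal sketch that uses only the JDK's StAX API, not Spring Batch or JAXB. The class name StaxEscapeSketch and the element name "name" are made up for this illustration. It writes a text node containing & through an XMLEventWriter, and the JDK's default StAX implementation should emit the ampersand as &amp; without any extra configuration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

public class StaxEscapeSketch {

    // Writes <name>text</name> through a StAX event writer and returns the XML.
    // "name" is a hypothetical element name used only for this sketch.
    static String writeName(String text) {
        try {
            StringWriter out = new StringWriter();
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            XMLEventFactory events = XMLEventFactory.newInstance();
            writer.add(events.createStartElement("", "", "name"));
            // The raw, unescaped text is handed to the writer as-is.
            writer.add(events.createCharacters(text));
            writer.add(events.createEndElement("", "", "name"));
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The ampersand in the output should appear as &amp;
        System.out.println(writeName("Product 1 & Addition"));
    }
}
```

This only shows what a plain StAX writer does with character data; it does not reproduce the exception from my batch job.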
I tried to escape the character myself in the item processor with StringEscapeUtils, e.g. product.setFullName(StringEscapeUtils.escapeXml10(dbExport.getFullName()));, but this didn't help. Besides, this changes the String from & to &amp;, which itself contains a &.
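The concern about pre-escaping can be sketched with the JDK alone. In the snippet below, preEscape is a hypothetical stand-in for StringEscapeUtils.escapeXml10 that covers only &, < and >, and DoubleEscapeSketch is a made-up class name. If the writer escapes as well, the already escaped text is escaped a second time and ends up as &amp;amp; in the file.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class DoubleEscapeSketch {

    // Hypothetical stand-in for StringEscapeUtils.escapeXml10 (only &, < and >).
    static String preEscape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Writes <name>text</name> with a plain JDK StAX stream writer.
    static String writeName(String text) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("name");
            writer.writeCharacters(text); // the writer escapes '&' on its own
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Escaped once by the writer.
        System.out.println(writeName("Product 1 & Addition"));
        // Escaped twice: once by preEscape, once by the writer.
        System.out.println(writeName(preEscape("Product 1 & Addition")));
    }
}
```

So even if manual escaping avoided the exception, a reader of the file would get the literal text &amp; instead of &.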
I also tried to use my own implementation of a CharacterEscapeHandler, but marshaller.setMarshallerProperties() does not have any visible effect on the marshaller. Do I have to set the properties for the marshaller differently?
    public class XmlCharacterEscapeHandler implements CharacterEscapeHandler {
        @Override
        public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException {
            StringWriter buffer = new StringWriter();
            for (int i = start; i < start + length; i++) {
                buffer.write(ch[i]);
            }
            String escapedString = StringEscapeUtils.escapeXml10(buffer.toString());
            out.write(escapedString);
        }
    }
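To check the escaping logic of the handler above in isolation, the same routine can be sketched as a plain static method, so it runs without JAXB or commons-text on the classpath. EscapeRoutineSketch and escapeToString are hypothetical names, and the set of escaped characters is my assumption of what is needed, not what the JAXB reference implementation does internally.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class EscapeRoutineSketch {

    // Same parameter list as CharacterEscapeHandler.escape, written as a static
    // method so that the sketch has no dependency on JAXB.
    static void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '&': out.write("&amp;"); break;
                case '<': out.write("&lt;"); break;
                case '>': out.write("&gt;"); break;
                case '"':
                    // Quotes only need escaping inside attribute values.
                    if (isAttVal) { out.write("&quot;"); } else { out.write('"'); }
                    break;
                default: out.write(ch[i]);
            }
        }
    }

    // Convenience wrapper for trying the routine on a whole String.
    static String escapeToString(String text, boolean isAttVal) {
        try {
            StringWriter out = new StringWriter();
            escape(text.toCharArray(), 0, text.length(), isAttVal, out);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escapeToString("Product 1 & Addition", false));
    }
}
```

This confirms that the escaping itself is straightforward; it does not answer whether the marshaller ever calls the handler when it is used inside the StaxEventItemWriter.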
EDIT
Unfortunately, I have not been able to resolve the issue so far. Therefore, I switched from Jaxb2Marshaller to XStreamMarshaller. Here I get a similar issue. As far as I can tell, the underlying XStream should use a PrettyPrintWriter that automatically converts & to &amp;, as described here: https://stackoverflow.com/a/48141964/4191735 This is not happening. For me there is always a problem with &. Why does the escaping not work? Escaping the String itself and force-converting it to UTF-8 does not help either.
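In hindsight, the UTF-8 conversion could not have made a difference, which the following JDK-only sketch shows. The class and method names are made up for this illustration. The & is a plain ASCII character, so encoding the String to UTF-8 and decoding it again returns exactly the same text.

```java
import java.nio.charset.StandardCharsets;

public class AmpersandEncodingSketch {

    // Encodes the text to UTF-8 bytes and decodes it back to a String.
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "Product 1 & Addition";
        byte[] ampersand = "&".getBytes(StandardCharsets.UTF_8);
        // '&' is a single byte (0x26) in UTF-8, identical to its ASCII encoding.
        System.out.println(ampersand.length + " byte(s), value 0x" + Integer.toHexString(ampersand[0]));
        // The round trip leaves the text, including the '&', unchanged.
        System.out.println(name.equals(roundTripUtf8(name)));
    }
}
```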
Minimal Complete Example
Main:
package com.mwe;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.ComponentScan;
@SpringBootApplication
@ComponentScan("com.mwe")
public class Main {
public static void main(String [] args) {
System.exit(SpringApplication.exit(SpringApplication.run(Main.class, args)));
}
}
BatchConfig:
package com.mwe;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.xml.StaxEventItemWriter;
import org.springframework.batch.item.xml.builder.StaxEventItemWriterBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.oxm.xstream.XStreamMarshaller;
@Configuration
@EnableBatchProcessing
public class BatchConfig {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Bean
@StepScope
public FlatFileItemReader<Product> reader() {
FlatFileItemReader<Product> reader = new FlatFileItemReader<Product>();
reader.setResource(new FileSystemResource("test.csv"));
DefaultLineMapper<Product> lineMapper = new DefaultLineMapper<>();
lineMapper.setFieldSetMapper(new CustomFieldMapper());
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setDelimiter("|");
tokenizer.setNames(new String[] {"ID", "NAME"});
lineMapper.setLineTokenizer(tokenizer);
reader.setLineMapper(lineMapper);
reader.setLinesToSkip(1);
return reader;
}
@Bean
public ItemProcessor<Product, Xml> processor() {
return new Processor();
}
@Bean
@StepScope
public StaxEventItemWriter<Xml> writer () {
return new StaxEventItemWriterBuilder<Xml>()
.name("writer")
.version("1.0")
.encoding("UTF-8")
.standalone(false)
.rootTagName("products")
.marshaller(getMarshaller())
.resource(new FileSystemResource("test.xml"))
.build();
}
@Bean
public Job job() {
return this.jobBuilderFactory.get("job")
.start(step1())
.build();
}
@Bean
public Step step1() {
return (stepBuilderFactory.get("step1")
.<Product, Xml>chunk(2)
.reader(reader())
.processor(processor())
.writer(writer())
.build());
}
@Bean
public XStreamMarshaller getMarshaller() {
XStreamMarshaller marshaller = new XStreamMarshaller();
marshaller.setEncoding("UTF-8");
return marshaller;
}
}
CustomFieldMapper:
package com.mwe;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
public class CustomFieldMapper implements FieldSetMapper<Product> {
public Product mapFieldSet(FieldSet fs) {
Product product = new Product();
product.setId(fs.readString("ID"));
product.setName(fs.readString("NAME"));
return product;
}
}
ItemProcessor:
package com.mwe;
import org.springframework.batch.item.ItemProcessor;
public class Processor implements ItemProcessor<Product, Xml> {
@Override
public Xml process(final Product product) {
Xml xml = new Xml();
xml.setId(Integer.parseInt(product.getId()));
xml.setName(product.getName());
return xml;
}
}
Product:
package com.mwe;
public class Product {
private String id;
private String name;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
}
Xml:
package com.mwe;
public class Xml {
private int id;
private String name;
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
}
Application Properties:
# Spring config
spring.main.allow-bean-definition-overriding=true
spring.main.banner-mode=off
spring.batch.initialize-schema=never
# Logging data source
spring.datasource.logging.driver-class-name=org.mariadb.jdbc.Driver
spring.datasource.logging.maximum-pool-size=10
spring.datasource.logging.hikari.minimum-idle=1
spring.datasource.logging.hikari.data-source-properties.useUnicode=true
spring.datasource.logging.hikari.data-source-properties.characterEncoding=UTF-8
spring.datasource.logging.hibernate.dialect=org.hibernate.dialect.MariaDBDialect
spring.datasource.logging.hibernate.ddl-auto=none
spring.datasource.url=jdbc:mariadb://localhost:3306/logging?UseUnicode=true&characterEncoding=utf8
spring.datasource.username=root
spring.datasource.password=root
Pom:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.4.2</version>
<relativePath /> <!-- lookup parent from repository -->
</parent>
<groupId>com.mwe</groupId>
<artifactId>mwe</artifactId>
<version>1</version>
<name>mwe</name>
<description>Minimal working example</description>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.mariadb.jdbc</groupId>
<artifactId>mariadb-java-client</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-oxm</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.9</version>
</dependency>
<dependency>
<groupId>javax.activation</groupId>
<artifactId>activation</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
<version>1.4.15</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
test.csv:
"ID"|"NAME"
1|"Product 1"
2|"Product 1 & Addition"