
My use case is to convert a Java POJO into a String so that it can be published to an AWS Kinesis Firehose stream.

I wrote this convertToString() method, but I'm unable to find the correct way to escape the delimiter.

public <T> List<String> convertToString(List<T> objectList, Class<T> tClass) {

        List<String> stringList = new ArrayList<>();
        char delimiter = ',';
        char escape = '\\';

        CsvMapper mapper = new CsvMapper();
        CsvSchema schema = mapper.schemaFor(tClass);

        for (T object : objectList ) {
            try{
                stringList.add(mapper.writer(schema.withColumnSeparator(delimiter).withEscapeChar(escape))
                        .writeValueAsString(object));
            } catch (JsonProcessingException e) {
                System.out.println("Exception : " + e);
            }
        }

        return stringList;
}

Input : SuperHero flash = new SuperHero(1, "Flash", "Barry , Allen", "DC");

Expected Output : 1,Flash,"Barry \, Allen",DC

Output I'm getting : 1,Flash,"Barry , Allen",DC

Can someone point out what I'm doing wrong?

dushyantashu

1 Answer

Your output is correct. In CSV, a field surrounded by double quotes is taken literally, so a comma inside it is not treated as a delimiter. Flash is written without quotes because it contains nothing special, while "Barry , Allen" is written with quotes because it contains the delimiter. The delimiter is therefore already escaped by the quoting and does not need a backslash as well.
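To illustrate the quoting rule described above, here is a minimal sketch of a quote-aware line splitter in plain Java. It is not Jackson's parser; the class and method names (QuotedCsvDemo, splitCsvLine) are made up for this example, and it only handles a single line with double-quoted fields and doubled quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Jackson.
class QuotedCsvDemo {

    // Splits one CSV line into fields. A delimiter inside a double-quoted
    // field is kept as data, and "" inside quotes means a literal quote.
    static List<String> splitCsvLine(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote -> literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // delimiter here is plain data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == delimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = splitCsvLine("1,Flash,\"Barry , Allen\",DC", ',');
        System.out.println(fields.size());   // number of fields found
        System.out.println(fields.get(2));   // the quoted field, comma included
    }
}
```

Run against the output from the question, this yields four fields, and the third one is Barry , Allen with the comma intact.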

EDIT/UPDATE

After reading the documentation provided on GitHub, the following excerpt shows that the generator only uses double quotes, and the escape character is only used for parsing.

escapeChar (int) [default: -1 meaning "none"]: character, if any, used to escape values. Most commonly defined as backslash ('\'). Only used by parser; generator only uses quoting, including doubling up of quotes to indicate quote char itself.

Zaid Qureshi
  • I'm interested in escaping the delimiter inside the field value, so that the value is counted as a single field. Is there a way to achieve this? – dushyantashu Jan 05 '17 at 17:32
  • It is already counted as a single field. If you look at your output, only "Barry , Allen" has double quotes, which means it is a single field. You can save that output to a .csv file and open it in Excel: you will find that Barry , Allen is one field, shown without the double quotes. Surrounding a field with double quotes escapes the characters inside it; the only characters that still need escaping are double quotes themselves. – Zaid Qureshi Jan 05 '17 at 18:27
  • For CSV it works, but AWS Kinesis needs the delimiter to be escaped for publishing to Redshift. Check this: http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html#r_COPY_command_examples-copy-data-with-the-escape-option, so I was looking for a workaround. – dushyantashu Jan 05 '17 at 19:00
  • After reading the Jackson dataformat CSV documentation and looking at your code again, I want to recommend two things: first, set up the schema before your loop, since there is no need to do it on every iteration; second, note that the output is comma-delimited by default. – Zaid Qureshi Jan 09 '17 at 00:12
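
For the Redshift case raised in the comments, one possible workaround is to build the row by hand with backslash escaping instead of relying on the CSV generator, since the generator only quotes. The sketch below is plain Java and is not a Jackson feature; the names BackslashEscapedRow, escapeField and toEscapedRow are made up for this example. The choice of characters to escape (the delimiter, the escape character itself, and newlines) is an assumption that should be checked against the Redshift COPY ESCAPE documentation. It also produces unquoted fields, which differs from the quoted form shown in the question's expected output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Jackson or the AWS SDK.
class BackslashEscapedRow {

    // Prefixes the delimiter, the escape character and newlines with the
    // escape character. Which characters need escaping is an assumption.
    static String escapeField(String value, char delimiter, char escape) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == delimiter || c == escape || c == '\n') {
                sb.append(escape);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Escapes every field and joins them with the delimiter (no quoting).
    static String toEscapedRow(List<String> fields, char delimiter, char escape) {
        return fields.stream()
                .map(f -> escapeField(f, delimiter, escape))
                .collect(Collectors.joining(String.valueOf(delimiter)));
    }

    public static void main(String[] args) {
        List<String> flash = Arrays.asList("1", "Flash", "Barry , Allen", "DC");
        // prints: 1,Flash,Barry \, Allen,DC
        System.out.println(toEscapedRow(flash, ',', '\\'));
    }
}
```

The field values would still have to be extracted from the POJO first (for example via getters), which this sketch leaves out.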