I have a sample CSV message:

header1,header2,header3
value1,value2,{"name":"John","age":30,"car":null}

How can I convert it to JSON in which the embedded JSON value is kept as a string, as in:

{
  "header1": "value1",
  "header2": "value2",
  "header3": "{\"name\":\"John\",\"age\":30,\"car\":null}"
}

I am using Jackson schema builder with default column separator:

CsvSchema schema = CsvSchema.builder().disableQuoteChar().setUseHeader(true).build();
CsvMapper mapper = CsvMapper.builder().enable(CsvParser.Feature.IGNORE_TRAILING_UNMAPPABLE, CsvParser.Feature.WRAP_AS_ARRAY).build();
Michał Ziober
  • Your CSV is broken. You need to change it a little bit. Wrap internal `JSON` with an escaping character. For example you can set apostrophe (') or pipe(|). Read `CSV` file, replace `{` with `|{` and `}` with `}|` and `CsvMapper` should handle it. For a general solution take a look at this question: [directly convert CSV file to JSON file using the Jackson library](https://stackoverflow.com/q/19766266/51591) – Michał Ziober Jan 10 '23 at 22:57
  • @MichałZiober, Even if the CSV gets updated, how could I escape csv and json at the same time without changing column separator from ','? – Vaibhav Tiwari Jan 12 '23 at 08:37
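
On the question of escaping CSV and JSON at the same time while keeping ',' as the separator: the usual CSV convention (RFC 4180) is to wrap the field in double quotes and double every double quote inside it. The sketch below assumes the producer of the file can be changed; `CsvFieldQuoting` and `quoteField` are illustrative names, not part of Jackson. A file written this way should then be readable with the default quote character instead of `disableQuoteChar()`.

```java
class CsvFieldQuoting {

    // Quote one field the RFC 4180 way: wrap it in double quotes and
    // double every double quote that appears inside the value.
    static String quoteField(String value) {
        boolean needsQuoting = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuoting) {
            return value; // plain values can stay unquoted
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"John\",\"age\":30,\"car\":null}";
        String row = String.join(",",
                quoteField("value1"), quoteField("value2"), quoteField(json));
        // prints: value1,value2,"{""name"":""John"",""age"":30,""car"":null}"
        System.out.println(row);
    }
}
```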

3 Answers

0

You can use a JavaScript CSV library such as json2csv:

# Global so it can be called from anywhere
npm install -g json2csv

# or as a dependency of a project
npm install json2csv --save
0

You can use org.json.CDL as follows:

        // Read the whole CSV file, convert it with org.json's CDL, and write the JSON out.
        // try-with-resources closes the reader and also covers a missing input file.
        try (BufferedReader br = new BufferedReader(new FileReader("file.csv"))) {
            String csvAsString = br.lines().collect(Collectors.joining("\n"));
            String json = CDL.toJSONArray(csvAsString).toString();
            Files.write(Path.of("src/main/resources/output.json"), json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            e.printStackTrace();
        }
0

The presented CSV content is broken: values that contain the column separator should be wrapped in a quote character. If we cannot change the application that generates the file, we need to amend the content before deserialising it. This example is simple, so we can just replace { with |{ and } with }| and set | as the quote character. For JSON payloads with nested objects, however, we need to replace only the first { and the last } brackets. The code could look like this:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

import java.io.File;
import java.nio.file.Files;
import java.util.stream.Collectors;

public class CsvApp {

    public static void main(String[] args) throws Exception {
        // Load the raw CSV text.
        File csvFile = new File("./resource/test.csv").getAbsoluteFile();
        String csv = Files.readAllLines(csvFile.toPath()).stream().collect(Collectors.joining(System.lineSeparator()));
        // Wrap the JSON value in '|' so the commas inside it are not treated as separators.
        csv = csv.replace("{", "|{").replace("}", "}|");

        CsvMapper csvMapper = CsvMapper.builder().build();

        // Use '|' as the quote character and take column names from the first line.
        CsvSchema csvSchema = CsvSchema.builder().setQuoteChar('|').setUseHeader(true).build();
        Object csvContent = csvMapper.readerFor(JsonNode.class).with(csvSchema).readValue(csv);

        // Print the parsed row as indented JSON.
        JsonMapper mapper = JsonMapper.builder().enable(SerializationFeature.INDENT_OUTPUT).build();
        mapper.writeValue(System.out, csvContent);
    }
}

The code above prints:

{
  "header1" : "value1",
  "header2" : "value2",
  "header3" : "{\"name\":\"John\",\"age\":30,\"car\":null}"
}
Michał Ziober
  • Do we need to build a regex for nested JSON to identify first and last '{' '}' and that will again fail if we will have multiple nested JSONs? – Vaibhav Tiwari Jan 13 '23 at 14:35
  • Do not use `Regex` here. Just [StringUtils.lastIndexOf](https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#lastIndexOf-java.lang.CharSequence-java.lang.CharSequence-) and [RegExUtils.replaceFirst](https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/RegExUtils.html#replaceFirst-java.lang.String-java.util.regex.Pattern-java.lang.String-) – Michał Ziober Jan 13 '23 at 14:37
  • But what if we have multiple json as values? – Vaibhav Tiwari Jan 13 '23 at 14:40
  • @VaibhavTiwari, then you need to process it line by line. I assume that each line contains the whole `JSON` and there is no new lines inside. In case not, you need to handle this. Generally this is not a valid `CSV` file and handling all corner cases is painful. – Michał Ziober Jan 13 '23 at 16:27
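
The line-by-line approach discussed in the comments above can be sketched without regular expressions using `String.indexOf` and `String.lastIndexOf`. This is only a minimal sketch under the same assumptions stated there: each line holds at most one JSON value, the JSON contains no line breaks, and the chosen quote character ('|' here) does not occur in the data. `JsonColumnQuoter`, `quoteJsonSpan`, and `quoteAllLines` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class JsonColumnQuoter {

    // Wrap the span from the first '{' to the last '}' of one line in the quote character,
    // so nested objects inside the JSON value are left untouched.
    static String quoteJsonSpan(String line, char quote) {
        int start = line.indexOf('{');
        int end = line.lastIndexOf('}');
        if (start < 0 || end < start) {
            return line; // no JSON object on this line, e.g. the header row
        }
        return line.substring(0, start) + quote
                + line.substring(start, end + 1) + quote
                + line.substring(end + 1);
    }

    // Apply the quoting to every line of the CSV text.
    static String quoteAllLines(String csv, char quote) {
        List<String> out = new ArrayList<>();
        for (String line : csv.split("\\R")) {
            out.add(quoteJsonSpan(line, quote));
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String csv = "header1,header2,header3\n"
                + "value1,value2,{\"name\":\"John\",\"car\":{\"make\":null}}";
        System.out.println(quoteAllLines(csv, '|'));
    }
}
```

The result can then be parsed with a schema that sets '|' as the quote character. A line carrying several separate JSON values would be wrapped as one span by this sketch, so that case still needs its own handling.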