Is there a way to automatically detect FIELD / COLUMN order when loading CSV file with headers in first line?

Question

I have CSV files that always contain the same fields, or a subset of the same fields. However, they come from different providers, and each provider orders the fields differently. Also, some providers include a field even when it is empty, while others omit it. Sample:

Provider 1.txt:

id,name,address,url
3,ruth,ruth address,ruth.com
B,sonia,sonia's street, thisissonia.eu

Provider 2.txt:

id,url,name
3,shutupmag.com,maggie
B,khaleesi.org,Mother of Dragons

The first line is ALWAYS the headers, and the headers are always the same (with possibility of one or more being omitted).

I'm using a Camel Route to do the processing, using Bindy and an annotated class to do the marshalling/unmarshalling. I'm pretty happy with my solution, but right now I have to manually edit the order of the fields in my Bean when I want to process a different provider. I have something like this in my Bean:

@DataField(pos = 4, defaultValue = "") //for provider Stark
//@DataField(pos = 2, defaultValue = "") //for provider Lannister
//@DataField(pos = 3, defaultValue = "") //for provider Targaryen
public String url = "";

It feels bad, and I think there probably is a way to infer the fields from the first line. At least it makes sense. I could do this in a processor, but I like Bindy, it's awesome and I would like to keep using it. I guess I could use a different folder for each provider and have a different bean for each folder, but that's not what I want either. A provider can change the order or fields without notice. But they are always required to send the fields in the first line.

So, final question: can I auto-detect the CSV field names in Camel with Bindy (or another component; I'm open to suggestions, I just prefer to keep using Bindy)?

Hello there, feel free to answer; it wasn't my intention to direct the question at anyone in particular :) — Ric Jafe, Sep 01 '14 at 09:41

Peter Keller · Answer 1 · 2014-10-10T18:41:15.293

Alternatively, you may use the Camel CSV component. You don't get fully initialized value beans, but a list of maps holding key-value pairs:

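import java.util.List;
import java.util.Map;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.camel.dataformat.csv.CsvDataFormat;

// Inside RouteBuilder.configure(); setUseMaps requires Camel 2.13 or later.
final CsvDataFormat format = new CsvDataFormat();
format.setUseMaps(true); // each row becomes a Map keyed by the header names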
format.setDelimiter(",");

from("direct:start")
    .unmarshal(format)
    .process(new Processor() {
        @Override
        public void process(final Exchange exchange) throws Exception {
            final List<Map<String, String>> body = exchange.getIn().getBody(List.class);
            for (final Map<String, String> row : body) {
                LOG.info("new row: {}", row);
            }
        }
    });

With:

template.sendBody("direct:start", "id,name,address,url\n3,ruth,ruth address,ruth.com\nB,sonia,sonia's street, thisissonia.eu");
template.sendBody("direct:start", "id,url,name\n3,shutupmag.com,maggie\nB,khaleesi.org,Mother of Dragons");

You get the following output:

new row: {id=3, address=ruth address, url=ruth.com, name=ruth}
new row: {id=B, address=sonia's street, url=thisissonia.eu, name=sonia}
new row: {id=3, url=shutupmag.com, name=maggie}
new row: {id=B, url=khaleesi.org, name=Mother of Dragons}
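If you still want value beans at the end, each header-keyed map can be copied into one by name, so the column order stops mattering. The following is only a minimal sketch in plain Java: the `RowMapper` class and the `Contact` bean are hypothetical names made up for illustration (they are not part of Camel or Bindy), and it assumes missing columns should fall back to an empty string, as the `defaultValue = ""` in the question suggests.

```java
import java.util.Map;

public class RowMapper {

    /** Hypothetical value bean; the field names mirror the CSV headers from the question. */
    public static class Contact {
        public String id = "";
        public String name = "";
        public String address = "";
        public String url = "";
    }

    /** Copies one header-keyed row into a bean; columns the provider omitted become "". */
    public static Contact toContact(Map<String, String> row) {
        Contact contact = new Contact();
        contact.id = valueOrEmpty(row, "id");
        contact.name = valueOrEmpty(row, "name");
        contact.address = valueOrEmpty(row, "address");
        contact.url = valueOrEmpty(row, "url");
        return contact;
    }

    /** Looks a column up by header name and trims stray whitespace around the value. */
    private static String valueOrEmpty(Map<String, String> row, String header) {
        String value = row.get(header);
        return value == null ? "" : value.trim();
    }
}
```

Inside the processor above, this would be called once per row, for example `RowMapper.toContact(row)` within the `for` loop.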

Thank you, this is a nice thing to know! I'm not using Camel 2.13, so I don't have the format.setUseMaps() option, but I get the values as a List of Lists, so no problem. I'll give it a try today and see how I can figure the fields out and populate my beans (I have an idea, I'll share it here if it works). (By the way, I'm using Camel 2.10.7 for maximum compatibility with ServiceMix 4.5.3.) — Ric Jafe, Sep 02 '14 at 10:42
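For the List-of-Lists case mentioned in this comment, one possible approach is to treat the first inner list as the header row and key every following row by it. This is a rough sketch in plain Java, not the commenter's actual solution: `HeaderRows` and `toMaps` are hypothetical names, and it assumes that the first row is always the header and that rows shorter than the header should be padded with empty strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HeaderRows {

    /**
     * Turns rows of raw values into maps keyed by the names found in the first row.
     * Assumption: rows.get(0) is the header line, as the question guarantees.
     */
    public static List<Map<String, String>> toMaps(List<List<String>> rows) {
        List<Map<String, String>> result = new ArrayList<Map<String, String>>();
        if (rows == null || rows.isEmpty()) {
            return result;
        }
        List<String> header = rows.get(0);
        for (List<String> row : rows.subList(1, rows.size())) {
            // LinkedHashMap keeps the provider's column order, which helps when logging.
            Map<String, String> byName = new LinkedHashMap<String, String>();
            for (int i = 0; i < header.size(); i++) {
                // Pad short rows with "" instead of failing on a missing trailing column.
                String value = i < row.size() ? row.get(i) : "";
                byName.put(header.get(i).trim(), value.trim());
            }
            result.add(byName);
        }
        return result;
    }
}
```

The resulting maps have the same shape as the ones produced with `setUseMaps(true)`, so the rest of the processing does not need to know which Camel version parsed the file.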
