6

Scenario:

I have created transformation to load data into table from csv file and I have following columns in csv file:

  1. Customer_Id
  2. Company_Id
  3. Employee_Name

But user may give input file with column ordering (random order) as

  1. Employee_Name
  2. Company_Id
  3. Customer_Id

so, if I try to load file which has random column ordering, will kettle load correct column values as per column names ... ?

René Vogt
  • 43,056
  • 14
  • 77
  • 99
yuvi
  • 564
  • 5
  • 12
  • You could possibly do it with a `UDJC` or JavaScript if the column headers were always in the file, but the best solution is probably to push back on the users and get them to give you a properly formatted file. – Brian.D.Myers Jan 21 '16 at 17:58
  • Hi @Brian.D.Myers can u give me some sample example with ktr file with a UDJC or JavaScript to proceed with my question...highly appericiated – yuvi Jan 22 '16 at 03:51

4 Answers4

3

Using ETL Metadata Injection you can use a transformation like this, to either normalize the data, or to store it to your database:

Metadata transformation

Then you just need to send the correct data to that transformation. You can read the header line from the CSV, and use Row Normaliser to convert to the format used by ETL Metadata Injection.

I have included a quick example here: csv_inject on Dropbox, if you make something like this and run it from something that runs it per csv file it should work.

bolav
  • 6,938
  • 2
  • 18
  • 42
2

Ooh, thats some nasty javascript!

The way to do this is with metadata injection. Look at the samples, but basically you need a template which reads the file, and writes it back out. you then use another parent transformation to figure out the headings, configure that template and then execute it.

There are samples in the PDI samples folder, and also take a look at the "figuring out file format" example in matt casters blueprints project on github.

Codek
  • 5,114
  • 3
  • 24
  • 38
  • Huh. I tried it with the `ETL metadata injection` step but, never having used it before, I didn't get it working. Guess I should try again. In the past I've pre-processed things like this with a Python script. – Brian.D.Myers Jan 22 '16 at 17:15
  • This is not a very constructive answer. First of all you're saying that a solution is bad, without saying what is bad about it. And then you are saying that another solution exists without telling about how to solve it. I have learned nothing from your answer except that another answer might exist. – bolav Jan 23 '16 at 12:01
  • @codek ...can u able to give example and explain me...advance thanks – yuvi Jan 25 '16 at 03:52
0

You could try something like this as your JavaScript:

//Script here

var seen;
trans_Status = CONTINUE_TRANSFORMATION;
var col_names = ['Customer_Id','Company_Id','Employee_Name'];
var col_pos;
if (!seen) {
    // First line
    trans_Status = SKIP_TRANSFORMATION;
    seen = 1;
    col_pos = [-1,-1,-1];
    for (var i = 0; i < col_names.length; i++) {
        for (var j = 0; j < row.length; j++) {
            if (row[j] == col_names[i]) {
                col_pos[i] = j;
                break;
            }
        }
        if (col_pos[i] === -1) {
            writeToLog("e", "Cannot find " + col_names[i]);
            trans_Status = ERROR_TRANSFORMATION;
            break;
        }
    }
}

var Customer_Id = row[col_pos[0]];
var Company_Id = row[col_pos[1]];
var Employee_Name = row[col_pos[2]];

Here is the .ktr I tried: csv_reorder.ktr

(edit, here are the test csv files) 1.csv:

Customer_Id,Company_Id,Employee_Name
cust1,comp1,emp1

2.csv:

Employee_Name,Company_Id,Customer_Id
emp2,comp2,cust2
bolav
  • 6,938
  • 2
  • 18
  • 42
  • can you give me the sample based on my example ..i am not getting your ktr..using java code ..is there any other way to proced instead of java code – yuvi Jan 22 '16 at 09:58
  • This is based on your example. What do you mean "not getting your ktr"? – bolav Jan 22 '16 at 11:40
  • i have tried your csv_reorder.ktr..but i am not getting expected output – yuvi Jan 22 '16 at 12:25
  • What output do you get, and what is your input? – bolav Jan 22 '16 at 12:43
  • It is "Modified Java Script Value" - step – simar Jan 22 '16 at 13:26
  • Found interesting thing. If variable seen is bool and assigned initial to false then part of code responsible to process first row, executes for each row in stream – simar Jan 22 '16 at 13:48
  • var seen; seen = 1; this will work, var seen = false, seen = true doesn't. I meant this. – simar Jan 25 '16 at 08:38
0

Assuming rejecting the input file is not an option you basically have 4 solutions.

  1. reorder the fields in an external editor (don't use excel if it contains dates)
  2. Use code within your transformation to detect the column headers and reorder the file.
  3. Use metadata injection as proposed by bolav
  4. Create a job. This need to:

a. load the file into a temporary database. b. use an sql statement to retrieve the fields (use a SELECT with an ORDER By clause) c. output the file in the correct order

Stevetech
  • 115
  • 1
  • 1
  • 5