How does an indirect load in Informatica work internally? Does it collate all the data and then process it, or does it process one file at a time? If I have duplicates spanning multiple files, will the duplicate-removal logic in my mapping remove them, or would I have to merge the files using a Union transformation first and then pass the data through the duplicate-removal logic?
Viewed 882 times
3 Answers
1
As far as I know, Informatica processes the data as if it were a single file. So yes, it should remove the duplicates across files.
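A minimal shell sketch (not Informatica itself, just an analogy) of what this answer claims an indirect load effectively does: the listed files are read as one continuous stream, so duplicates spanning files meet in the same pipeline and can be removed in a single pass. File names and contents here are made up for illustration.

```shell
printf 'a\nb\n' > f1.txt        # f1.txt holds rows a, b
printf 'b\nc\n' > f2.txt        # f2.txt repeats row b
cat f1.txt f2.txt | sort -u     # one stream, deduplicated across both files
# a
# b
# c
```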

Samik
1
Informatica reads the stream as if it were a single file. It's like doing a cat on a filename with a wildcard: e.g. if there are two files, f1.txt containing testlineA and f2.txt containing testlineB, and you run the command cat f*.txt, you should get:
testlineA
testlineB
Just like if it were coming from one file.
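The cat analogy above can be reproduced directly; the file names and contents are the ones given in the answer:

```shell
printf 'testlineA\n' > f1.txt   # first source file
printf 'testlineB\n' > f2.txt   # second source file
cat f*.txt                      # one continuous stream
# testlineA
# testlineB
```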

Maciejg
Correct. Please note that the filenames of the individual files are available if you enable a special port. Quite useful if you add the file name to the target DB for added traceability – Lars G Olsen Mar 27 '17 at 20:57
-1
As long as your pipeline has an active transformation (e.g. a Sorter) before you actually filter out the duplicates, all records will have arrived at the active transformation before moving on to the filter, and the matter will be moot.
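A hedged shell analogue of the Sorter-then-filter pattern this answer describes: sort is blocking, emitting nothing until every row has arrived, so a downstream step that only compares adjacent rows still catches duplicates regardless of which file they came from. The file name and rows below are invented for illustration.

```shell
# sort = blocking "active" step; uniq = duplicate filter on adjacent rows
printf 'x\ny\nx\n' > rows.txt   # the duplicate of x arrives last
sort rows.txt | uniq            # all rows sorted first, then deduplicated
# x
# y
```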

Daniel Machet
- 615