
A = load 'data' as (x, y);

B = load 'data' as (x, z);

C = cogroup A by x, B by x;

D = foreach C generate flatten(A), flatten(B);

E = group D by A::x;

What exactly do the statements above do, and where would FLATTEN be used in a real-world scenario?

GOPIREDDY G
  • Well explained in the following answer, http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig – Arun A K Feb 12 '15 at 08:20
  • That is fine for FLATTEN, but I would also like a worked example for the statements above. – GOPIREDDY G Feb 12 '15 at 09:20
  • What do you mean by sample example? The above itself is an example. If you meant detailed description, check out the pig docs @ https://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Flatten+Operator – Arun A K Feb 12 '15 at 09:34
  • Arun, I am asking for an example with populated data, such as (x,y) --> (1,2) (1,3) (2,3) and (x,z) --> (1,4) (1,2) (3,2), and so on. – GOPIREDDY G Feb 12 '15 at 10:08

1 Answer

A = load 'input1' USING PigStorage(',') as (x, y);
-- (x,y) --> (1,2) (1,3) (2,3)
B = load 'input2' USING PigStorage(',') as (x, z);
-- (x,z) --> (1,4) (1,2) (3,2)
C = cogroup A by x, B by x;

result:

(1,{(1,2),(1,3)},{(1,4),(1,2)})
(2,{(2,3)},{})
(3,{},{(3,2)})
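
To make the COGROUP step concrete, here is a minimal sketch in plain Python that models what the statement does with the sample data. It is only an illustration of the semantics, not Pig itself: `cogroup` is a hypothetical helper written for this example, and the keys are sorted only so the printed output is easy to compare (Pig does not promise any particular row order).

```python
# Plain-Python model of COGROUP (an illustration, not Pig itself).
# cogroup() is a hypothetical helper written for this sketch.
def cogroup(a, b, key=lambda t: t[0]):
    """Return one (key, bag_from_a, bag_from_b) row per distinct key."""
    # Sorting is only for readable output; Pig does not guarantee order.
    keys = sorted({key(t) for t in a} | {key(t) for t in b})
    return [
        (k, [t for t in a if key(t) == k], [t for t in b if key(t) == k])
        for k in keys
    ]

A = [(1, 2), (1, 3), (2, 3)]  # (x, y)
B = [(1, 4), (1, 2), (3, 2)]  # (x, z)

C = cogroup(A, B)
for row in C:
    print(row)
```

Each key that appears in either input gets exactly one row, and a relation with no matching tuples contributes an empty bag.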


D = foreach C generate group, flatten(A), flatten(B);

When both bags are flattened, the cross product of their tuples is returned for each group. Groups 2 and 3 produce no rows because one of their bags is empty.

result:
(1,1,2,1,4)
(1,1,2,1,2)
(1,1,3,1,4)
(1,1,3,1,2)  
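
The cross-product behaviour can be sketched in plain Python as well. This is a model of the semantics under the sample data, not Pig code: `flatten_two_bags` is a hypothetical name, and the cogrouped rows are written out by hand so the block runs on its own.

```python
from itertools import product

# Plain-Python model of flattening two bags in one FOREACH
# (an illustration, not Pig itself).
# flatten_two_bags() is a hypothetical helper written for this sketch.
def flatten_two_bags(rows):
    out = []
    for group, bag_a, bag_b in rows:
        # product() yields nothing when either bag is empty,
        # so a group with an empty bag contributes no output rows.
        for ta, tb in product(bag_a, bag_b):
            out.append((group, *ta, *tb))
    return out

# Cogrouped rows for the sample data, written out by hand.
C = [
    (1, [(1, 2), (1, 3)], [(1, 4), (1, 2)]),
    (2, [(2, 3)], []),
    (3, [], [(3, 2)]),
]

D = flatten_two_bags(C)
for row in D:
    print(row)
```

Group 1 has two tuples in each bag, so it yields 2 x 2 = 4 rows. Groups 2 and 3 each have one empty bag and therefore yield none.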

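E = group D by A::x;

Here you are grouping D by the x column that came from relation A. Every tuple in D has A::x = 1, so the result is a single group whose bag holds all four tuples:

(1,{(1,1,2,1,4),(1,1,2,1,2),(1,1,3,1,4),(1,1,3,1,2)})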
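
A plain-Python sketch of this grouping step, again as a model of the semantics rather than Pig itself. It assumes D has the five fields (group, A::x, A::y, B::x, B::z), so A::x is the second field of each tuple; the variable names are chosen only for this example.

```python
from collections import defaultdict

# Flattened rows for the sample data: (group, A::x, A::y, B::x, B::z).
D = [
    (1, 1, 2, 1, 4),
    (1, 1, 2, 1, 2),
    (1, 1, 3, 1, 4),
    (1, 1, 3, 1, 2),
]

# Model of GROUP D BY A::x: collect every tuple under its A::x value.
# A::x is assumed to be the field at index 1.
bags = defaultdict(list)
for t in D:
    bags[t[1]].append(t)

# One (key, bag) row per distinct key; sorted only for readable output.
E = sorted(bags.items())
for row in E:
    print(row)
```

Because every tuple carries the same A::x value, the output is one row whose bag contains all four tuples.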

Sravan K Reddy