
I have the following input data table:

Product Price  Country
A       5       Italy
B       4       USA
C       12      France
A       5       Italy
B       7       Russia

I'm summarizing the prices by two keys (product and country). The code is as follows:

t3 = LOAD '/home/Desktop/3_table.data' USING PigStorage('\t') AS (product:chararray, price:int, country:chararray);

group_pr = GROUP t3 BY (product, country);

price_1 = FOREACH group_pr GENERATE CONCAT(group.product, group.country), SUM(t3.price);

STORE price_1 INTO 'sum_by_product_country' USING PigStorage('\t');

The output is:

AItaly  10
BUSA    4
BRussia 7
CFrance 12

The problem is that I need the full table, containing the input data and the output together, so the expected output should look something like this:

A       5       Italy           AItaly      10
B       4       USA             BUSA        4
C       12      France          CFrance     12
B       7       Russia          BRussia     7

Can someone help me get this output?

Adrian Wragg
Ale
  • There is the JOIN operator in Pig. See http://pig.apache.org/docs/r0.12.0/basic.html. The key to join on would be the product. – Frederic Oct 28 '13 at 11:18
  • Thanks a lot! So do I have to do Full Outer Join by product? – Ale Oct 28 '13 at 11:27
  • The problem is that Pig does not see the schema of price_1 in the output, so I don't know exactly how to define the join. jnd = JOIN t3 BY product FULL, price_1 BY product does not work, because product is not defined in price_1. – Ale Oct 28 '13 at 11:50
  • 'price_1 = FOREACH group_pr GENERATE CONCAT(group.product, group.country), SUM(t3.price), group.product AS product;' Then do the join on product. By the way, this is all pretty basic stuff. – Frederic Oct 28 '13 at 12:47
  • Thanks! Well, I'm very new to Pig, so I ask pretty basic questions. – Ale Oct 28 '13 at 13:09
  • But I think a join is not the right way, because it will duplicate rows; what I need is the table as shown in the expected output (without repetitions). – Ale Oct 29 '13 at 08:46
  • Please see this: http://stackoverflow.com/questions/38549/difference-between-inner-and-outer-join – Frederic Oct 29 '13 at 09:19
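
One possible way to get the combined rows without a join, offered as an untested sketch rather than a confirmed answer: keep the grouped bag and flatten it back out next to the aggregate. It assumes the same t3 relation and input path as in the question; the alias names uniq, key, total and combined, and the output directory 'input_with_sums', are made up for illustration. The nested DISTINCT collapses identical input rows within a group, so the repeated (A, 5, Italy) row should appear only once.

```pig
-- Same load as in the question.
t3 = LOAD '/home/Desktop/3_table.data' USING PigStorage('\t')
     AS (product:chararray, price:int, country:chararray);

-- Each group carries a bag named t3 holding its original rows.
group_pr = GROUP t3 BY (product, country);

combined = FOREACH group_pr {
    -- Drop exact duplicate rows inside the group (hypothetical alias "uniq").
    uniq = DISTINCT t3;
    -- Emit the original columns, then the concatenated key and the group sum.
    -- SUM still runs over the full bag, so duplicates count toward the total.
    GENERATE FLATTEN(uniq),
             CONCAT(group.product, group.country) AS key,
             SUM(t3.price) AS total;
};

STORE combined INTO 'input_with_sums' USING PigStorage('\t');
```

If the rows within one (product, country) group differ in price, each distinct row would be emitted with the same group total, and the order of the output rows is not guaranteed.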

0 Answers