I am trying to calculate the total shares, but I am getting an error from the code below. What am I missing?

grunt> history
1   stockprice = load 'PG/TutorialA/input/stockprice.csv' using PigStorage(',') AS(Stock:chararray,price:int);

2   investor = load 'PG/TutorialA/input/investor.csv' using PigStorage(',') AS(id:int,first:chararray,last:chararray,stock:chararray,price:int);

3   investor_stockprice = join investor by stock, stockprice by Stock;

4   group_by_lastname = group investor_stockprice by last;
grunt> sum_of_shares = FOREACH group_by_lastname GENERATE investor_stockprice, SUM(investor_stockprice.price) as Sum;
1275634 [main] ERROR org.apache.pig.tools.grunt.Grunt  - ERROR 1128: Cannot find field price in investor::id:int,investor::first:chararray,investor::last:chararray,investor::stock:chararray,investor::price:int,stockprice::Stock:chararray,stockprice::price:int
22/10/16 04:23:01 ERROR grunt.Grunt: ERROR 1128: Cannot find field price in investor::id:int,investor::first:chararray,investor::last:chararray,investor::stock:chararray,investor::price:int,stockprice::Stock:chararray,stockprice::price:int
Details at logfile: /mnt/var/log/pig/pig_1665892905780.log

Investor

Stockprices

Wing

2 Answers

After a join, all fields are prefixed with the name of the relation they came from.

If you run describe investor_stockprice, you'd see that. As the error shows, investor_stockprice contains both stockprice::price and investor::price, which makes .price ambiguous.

Related question - Reference columns in a FOREACH after a JOIN?

If you want the last name and the sum, I don't think you want the entire investor_stockprice bag in your generate output.
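
As a rough sketch of both points, the FOREACH could name the field with its relation prefix and emit the group key instead of the whole bag. This assumes investor::price is the column you mean to total (swap in stockprice::price if not); total_price is just an illustrative alias.

```pig
-- Show the joined schema, including the investor:: and stockprice:: prefixes
DESCRIBE investor_stockprice;

-- After GROUP ... BY last, the key is exposed as 'group'.
-- Assumption: investor::price is the column to total.
sum_of_shares = FOREACH group_by_lastname GENERATE
    group AS last,
    SUM(investor_stockprice.investor::price) AS total_price;
```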

OneCricketeer

I found my error: I had two columns with the same name.
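
For anyone hitting the same thing, one way to avoid the clash is to give the columns distinct names in the LOAD schemas. The sketch below assumes the fifth investor column can be renamed; shares and total_shares are hypothetical names chosen for illustration, not taken from the input files.

```pig
stockprice = LOAD 'PG/TutorialA/input/stockprice.csv' USING PigStorage(',')
    AS (Stock:chararray, price:int);

-- 'shares' is a hypothetical name for the fifth column, so it no longer
-- collides with stockprice's 'price' after the join
investor = LOAD 'PG/TutorialA/input/investor.csv' USING PigStorage(',')
    AS (id:int, first:chararray, last:chararray, stock:chararray, shares:int);

investor_stockprice = JOIN investor BY stock, stockprice BY Stock;
group_by_lastname = GROUP investor_stockprice BY last;

-- 'shares' is now unique in the joined schema, so no relation prefix should be needed
sum_of_shares = FOREACH group_by_lastname GENERATE
    group AS last,
    SUM(investor_stockprice.shares) AS total_shares;
```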

Wing