
I have written a UDF whose input schema is a bag of tuples. In the UDF I process each tuple, append an extra field to it, and add it to the output bag. This works well. As the next step, I am trying to define the output schema for the output bag: it should be the input bag's schema with one extra field appended inside the tuple. How can I do this?

Here is my input bag schema:

xx: {(uniqueRS::PreprocUDF::id: long,uniqueRS::PreprocUDF::dominion: chararray,uniqueRS::PreprocUDF::affectedItemGRN: chararray,uniqueDomAndUser: {(PreprocUDF::dominion: chararray)},uniqueRS::PreprocUDF::count: long)}

I need the output schema to look like this:

outputBag: {(uniqueRS::PreprocUDF::id: long,uniqueRS::PreprocUDF::dominion: chararray,uniqueRS::PreprocUDF::affectedItemGRN: chararray,uniqueDomAndUser: {(PreprocUDF::dominion: chararray)},uniqueRS::PreprocUDF::count: long,grpName: chararray)}

I tried this as my output schema, but it didn't work:

    public Schema outputSchema(Schema input) {
        Schema.FieldSchema grpName = new Schema.FieldSchema("grpName", DataType.CHARARRAY);
        input.add(grpName);
        return input;
    }

I also tried `mergePrefixSchema()`, still with no luck. Please help me out.

I also tried it this way:

    public Schema outputSchema(Schema input) {
        Schema.FieldSchema inputTupleFS = input.getField(0);
        Schema.FieldSchema grpName = new Schema.FieldSchema("grpName", DataType.CHARARRAY);

        ArrayList<Schema.FieldSchema> tupleList = new ArrayList<>();
        tupleList.add(inputTupleFS);
        tupleList.add(grpName);

        Schema bagSchema = new Schema(tupleList);
        Schema.FieldSchema bagFS = new Schema.FieldSchema("testBag", bagSchema, DataType.BAG);

        Schema outputBag = new Schema(bagFS);
        return outputBag;
    }

thanks.

sudheer

1 Answer

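Thanks to this thread on the pig-user mailing list, I found the answer: http://mail-archives.apache.org/mod_mbox/pig-user/201208.mbox/%3C79A5BC65BFC37343844D4BB8A05DD3EE0183A649BA@opera-ex5.ny.os.local%3E

The new field has to be added to the schema of the tuple inside the bag, not to the top-level schema. So the tuple's fields are pulled out, extended with `grpName`, and then wrapped back into a tuple field and a bag field: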

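    import java.util.ArrayList;
    import org.apache.pig.data.DataType;
    import org.apache.pig.impl.logicalLayer.FrontendException;
    import org.apache.pig.impl.logicalLayer.schema.Schema;

    public Schema outputSchema(Schema input) {
        try {
            // input -> bag field -> tuple field -> the tuple's own fields.
            // Copy the list so that adding a field cannot touch the input schema.
            Schema tupleSchema = new Schema(new ArrayList<Schema.FieldSchema>(
                    input.getField(0).schema.getField(0).schema.getFields()));

            // Append the new field inside the tuple.
            Schema.FieldSchema grpName = new Schema.FieldSchema("grpName", DataType.CHARARRAY);
            tupleSchema.add(grpName);

            // Wrap the extended tuple schema back into a tuple field, then a bag field.
            Schema.FieldSchema tupleFs = new Schema.FieldSchema("with_grpName", tupleSchema, DataType.TUPLE);
            Schema bagSchema = new Schema(tupleFs);
            Schema.FieldSchema bagFS = new Schema.FieldSchema("testBag", bagSchema, DataType.BAG);

            return new Schema(bagFS);
        } catch (FrontendException e) {
            // getField() and the nested FieldSchema constructor declare FrontendException.
            throw new RuntimeException("Could not build the output schema", e);
        }
    }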
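
To see why the first attempt in the question puts `grpName` in the wrong place, here is a minimal sketch that models the bag/tuple nesting with plain Java classes. Everything in it (`SchemaNestingSketch`, `Field`, `appendAtTopLevel`, `appendInsideTuple`, and the shortened two-field sample schema) is a hypothetical stand-in for illustration and is not Pig's `Schema` API. It only shows where the extra field ends up in each case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified model of a nested schema; NOT the org.apache.pig Schema class. */
public class SchemaNestingSketch {

    /** A named field; bags and tuples carry child fields, scalars carry none. */
    public static final class Field {
        final String name;
        final String type;          // "long", "chararray", "tuple" or "bag"
        final List<Field> children;

        Field(String name, String type, List<Field> children) {
            this.name = name;
            this.type = type;
            this.children = new ArrayList<>(children); // defensive copy
        }

        /** Renders the field roughly the way DESCRIBE prints a schema. */
        String describe() {
            String inner = describeAll(children);
            if (type.equals("bag")) return name + ": {" + inner + "}";
            if (type.equals("tuple")) return "(" + inner + ")";
            return name + ": " + type;
        }
    }

    public static Field scalar(String name, String type) {
        return new Field(name, type, Collections.<Field>emptyList());
    }

    public static String describeAll(List<Field> fields) {
        StringBuilder sb = new StringBuilder();
        for (Field f : fields) {
            if (sb.length() > 0) sb.append(",");
            sb.append(f.describe());
        }
        return sb.toString();
    }

    /** Mirrors input.add(grpName): the new field becomes a sibling of the bag. */
    public static List<Field> appendAtTopLevel(List<Field> input, Field extra) {
        List<Field> out = new ArrayList<>(input);
        out.add(extra);
        return out;
    }

    /** Mirrors the answer: unwrap bag -> tuple, extend the tuple, re-wrap. */
    public static List<Field> appendInsideTuple(List<Field> input, Field extra) {
        Field bag = input.get(0);
        Field tuple = bag.children.get(0);
        List<Field> tupleFields = new ArrayList<>(tuple.children);
        tupleFields.add(extra);
        Field newTuple = new Field(tuple.name, "tuple", tupleFields);
        Field newBag = new Field("testBag", "bag", Collections.singletonList(newTuple));
        return Collections.singletonList(newBag);
    }

    /** A shortened version of the input schema: xx: {(id: long,dominion: chararray)} */
    public static List<Field> sampleInput() {
        List<Field> tupleFields = new ArrayList<>();
        tupleFields.add(scalar("id", "long"));
        tupleFields.add(scalar("dominion", "chararray"));
        Field tuple = new Field("t", "tuple", tupleFields);
        Field bag = new Field("xx", "bag", Collections.singletonList(tuple));
        return Collections.singletonList(bag);
    }

    public static void main(String[] args) {
        Field grpName = scalar("grpName", "chararray");
        // grpName lands next to the bag:
        System.out.println(describeAll(appendAtTopLevel(sampleInput(), grpName)));
        // grpName lands inside the tuple:
        System.out.println(describeAll(appendInsideTuple(sampleInput(), grpName)));
    }
}
```

In this model the top-level append yields `xx: {(id: long,dominion: chararray)},grpName: chararray`, while the unwrap-and-rewrap version yields `testBag: {(id: long,dominion: chararray,grpName: chararray)}`, which has the shape asked for in the question.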
sudheer