4

In Pig, I have a statement which basically appends the date to my generated values.

Data = FOREACH Input GENERATE (CurrentTime()), FLATTEN(group), COUNT(guid) as Cnt;

The output gives me the date 2013-05-25T09:01:38.914-04:00 in ISO8601.

How can I format this as just "YYYY-MM-DD"?

OneCricketeer
JohnMeek

2 Answers

14

You have several options:

Convert it with Pig functions, e.g.:

A = load ...
B = foreach A {
  currTime = CurrentTime();
  year = (chararray)GetYear(currTime);
  month = (chararray)GetMonth(currTime);
  day = (chararray)GetDay(currTime);
  -- GetMonth and GetDay return ints, so the result is not zero-padded (e.g. 2013-5-25)
  generate CONCAT(CONCAT(CONCAT(year, '-'), CONCAT(month, '-')), day) as myDate;
}

OR pass the date to the script as a parameter:

pig -f script.pig -param CURR_DATE=`date +%Y-%m-%d`

OR declare it in the script:

%declare CURR_DATE `date +%Y-%m-%d`;

Then refer to the variable as '$CURR_DATE' in the script.
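Applied to the statement from the question, this might look like the sketch below. It assumes CURR_DATE was supplied with -param or %declare as shown above; the alias dt is illustrative and not part of the original statement.

```pig
-- $CURR_DATE is substituted textually before the script runs,
-- so the surrounding quotes turn it into a chararray constant such as '2013-05-25'.
Data = FOREACH Input GENERATE '$CURR_DATE' AS dt, FLATTEN(group), COUNT(guid) AS Cnt;
```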

You may also create a modified CurrentTime UDF in which you convert the DateTime object to the appropriate format with the Joda-Time library.

The easiest would be to declare the date at the beginning of the script.

Lorand Bendig
  • Hey Lorand, thanks for the reply. I had in fact tried passing the date as a parameter, but the final output for today shows 1983 instead of 2013-05-25. Any idea why? The same parameter works fine when used in the name of the stored file. For example, STORE Output INTO 'Outputs$CURR_DATE' works fine and shows as Output2013-05-25. – JohnMeek May 25 '13 at 17:32
  • It's because if you do the subtraction 2013-5-25, you get 1983. That's why I used quotes (`'$CURR_DATE'`), so that it is handled as a chararray instead of an int. – Lorand Bendig May 25 '13 at 18:42
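To illustrate the difference described in the comment above, here is roughly what the two forms look like after parameter substitution with CURR_DATE=2013-05-25. The relation A is the one loaded in the answer's example; the aliases unquoted, quoted and d are hypothetical.

```pig
-- Without quotes the substituted text is parsed as arithmetic on ints:
-- 2013 - 5 - 25 = 1983, which matches the value reported in the comment.
unquoted = FOREACH A GENERATE 2013-05-25 AS d;

-- With quotes the same text is a chararray constant and is kept as-is.
quoted = FOREACH A GENERATE '2013-05-25' AS d;
```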
12

If you are using Pig 0.12 or later, you can use ToString(CurrentTime(),'yyyy-MM-dd')

You can pass any datetime value instead of CurrentTime().

Refer to http://pig.apache.org/docs/r0.12.0/func.html#to-string for date time formats.
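As a sketch, the statement from the question could be rewritten with ToString as follows. The alias dt is illustrative; the rest is taken from the original statement.

```pig
-- The pattern is case-sensitive: MM is month-of-year, while mm would be minute-of-hour.
Data = FOREACH Input GENERATE ToString(CurrentTime(), 'yyyy-MM-dd') AS dt, FLATTEN(group), COUNT(guid) AS Cnt;
```

Unlike the GetYear/GetMonth/GetDay concatenation, this pattern zero-pads the month and day.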