I am a beginner with Pig.

Following the wiki, I wrote a UDF that converts the words in a file to uppercase.

--cat UPPER.java

package com.bigdata.myUdf;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

public class UPPER extends EvalFunc<String> {

    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }

}

-- cat /home/hduser/lab/mydata/myscript.pig

REGISTER /home/hduser/software/myUdfs/UPPER.jar
std_det = LOAD '/pigdata/udf1.txt' USING PigStorage(',') as (name:chararray);
B = FOREACH std_det GENERATE com.bigdata.myUdf.UPPER(name);
dump B;

But when I run it, I get an error.

 java -cp com.bigdata.myUdf.UPPER.jar org.apache.pig.Main -x local /home/hduser/lab/mydata/myscript.pig

ERROR

Error: Could not find or load main class org.apache.pig.Main

cat .bashrc

export PIG_INSTALL=/home/hduser/software/pig
export PATH="${PATH}:${PIG_INSTALL}/bin"
export PIG_CLASSPATH=$HADOOP_CONF_DIR:${PIG_INSTALL}:.
export CLASSPATH=.:${PIG_CLASSPATH}

The pig script is at: /home/hduser/lab/mydata/myscript.pig

The JAR file is at: /home/hduser/software/myUdfs/UPPER.jar

Please help me understand what I am doing wrong. Thanks in advance.

Update: After following the instructions from Sivasakthi, the program ran but did not give any output.

 pig -x local myScript.pig

 15/01/05 04:47:57 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
15/01/05 04:47:57 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2015-01-05 04:47:57,920 [main] INFO  org.apache.pig.Main - Apache Pig version 0.14.0 (r1640057) compiled Nov 16 2014, 18:02:05
2015-01-05 04:47:57,921 [main] INFO  org.apache.pig.Main - Logging error messages to:       /home/hduser/lab/piglog/pig_1420462077918.log
2015-01-05 04:47:57,959 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - user.name is deprecated. Instead, use mapreduce.job.user.name
2015-01-05 04:47:58,314 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-05 04:47:58,315 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-01-05 04:47:58,318 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at:   file:///
2015-01-05 04:47:58,463 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation -   fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-05 04:47:59,070 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-05 04:47:59,227 [main] INFO  org.apache.pig.Main - Pig script completed in 2 seconds and 505 milliseconds (2505 ms)
May
  • java -cp is not required, I guess. You are not running Pig embedded in Java; you only need to run the script that calls the UDF. Try pig -x local file.pig – Arun A K Jan 04 '15 at 19:07

2 Answers

Can you follow the steps below?

1. Download three jar files from the links below (pig-0.11.1.jar, hadoop-common-0.21.0.jar, and piggybank.jar)

http://www.java2s.com/Code/Jar/p/Downloadpig0111jar.htm
http://www.java2s.com/Code/Jar/h/Downloadhadoopcommon0210jar.htm
http://www.java2s.com/Code/Jar/p/Downloadpiggybankjar.htm

2. Add all three jar files to your classpath

export CLASSPATH=/tmp/pig-0.11.1.jar:/tmp/hadoop-common-0.21.0.jar:/tmp/piggybank.jar

3. Create a directory named "com/bigdata/myUdf/" under your current directory

>>mkdir -p com/bigdata/myUdf/

4. Compile UPPER.java. Make sure that JAVA_HOME is set properly and that all three jar files above are on the classpath; otherwise the compilation will fail

>>javac UPPER.java

5. Move the compiled UPPER.class file into the "com/bigdata/myUdf/" directory

>>mv UPPER.class com/bigdata/myUdf/

6. Create a jar file named UPPER.jar

>>jar -cvf UPPER.jar com/

7. Now register UPPER.jar in your Pig script and run the command below

>>pig -x local myscript.pig

Once you run the above command, you will get the expected output.

Example
input

hello
world

myscript.pig

REGISTER UPPER.jar; 
std_det = LOAD 'input' USING PigStorage(',') as (name:chararray);
B = FOREACH std_det GENERATE com.bigdata.myUdf.UPPER(name);
dump B;

output:

(HELLO)
(WORLD)

Sample commands:

$ ls
UPPER.java      input   myscript.pig

$ mkdir -p com/bigdata/myUdf/
$ javac UPPER.java
$ mv UPPER.class com/bigdata/myUdf/
$ jar -cvf UPPER.jar com/
$ pig -x local myscript.pig 
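
Steps 3 to 6 matter because the path of the .class file inside the jar has to mirror the package declared in UPPER.java. The small Java sketch below only illustrates that mapping; the class name JarEntryPath and the method entryFor are made up for this illustration and are not part of Pig.

```java
// Illustrative sketch: maps a fully qualified class name to the entry path
// that the class loader looks for inside a jar.
final class JarEntryPath {

    // Replaces the package dots with slashes and appends the .class suffix.
    static String entryFor(String qualifiedClassName) {
        return qualifiedClassName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // The UDF is referenced as com.bigdata.myUdf.UPPER in the Pig script.
        System.out.println(entryFor("com.bigdata.myUdf.UPPER"));
    }
}
```

You can compare the printed path with the contents of your jar, which jar -tf UPPER.jar lists.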
Sivasakthi Jayaraman
  • Hi Sivasakthi, thanks for your time. Now the program does not complain about missing jars, but it does not give any valid output. I have posted the output above. – May Jan 05 '15 at 12:51
  • Sorry, I had forgotten the DUMP in the Pig script. It's working fine now. Thanks for all the help. – May Jan 05 '15 at 13:41
The error indicates that the Apache Pig jar is not on the classpath. -cp com.bigdata.myUdf.UPPER.jar does not include the required jars; it only includes 'UPPER.jar'. You can have a look at how to properly include all required jars in the classpath here

P.S. I think you should use the pig command from the command line instead of executing it your way. But I haven't used it myself, so it's just a guess.
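
To check what a given -cp value actually exposes, a small diagnostic along the following lines may help. It is only a sketch: the class name ClasspathCheck and the method isOnClasspath are invented for this illustration, and it simply asks the JVM whether a class such as org.apache.pig.Main can be located.

```java
// Diagnostic sketch: reports whether a class can be found on the current classpath.
final class ClasspathCheck {

    // Returns true if the named class is visible to this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message from the question.
        String name = args.length > 0 ? args[0] : "org.apache.pig.Main";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found") + " on classpath");
        // Shows the classpath the JVM was actually started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Run it with the same -cp value you pass to java (plus the directory containing ClasspathCheck.class). If it reports that the class is not found, that classpath is missing the jar that contains it.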

jonasnas