I'm trying to run a simple job using Oozie.
It will consist of a single Pig action.

I have a file, FirstScript.pig, containing:

dual = LOAD 'default.dual' USING org.apache.hcatalog.pig.HCatLoader();
store dual into 'dummy_file.txt' using PigStorage();

and a workflow.xml containing:

<workflow-app name="FirstWorkFlow" xmlns="uri:oozie:workflow:0.2">
    <start to="FirstJob"/> 
    <action name="FirstJob">
        <pig>
            <job-tracker>hadoop:50300</job-tracker>
            <name-node>hdfs://hadoop:8020</name-node>
            <script>/FirstScript.pig</script>
        </pig>
        <ok to="okjob"/>
        <error to="errorjob"/>
    </action>
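    <!-- Oozie has no <ok>/<error> nodes at this level: failures end in a kill node, success in an end node -->
    <kill name="errorjob">
        <message>job error, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <!-- the end node carries no message and must come last -->
    <end name="okjob"/>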
</workflow-app>

I have created this directory structure:

FirstScript
|- lib
|---FirstScript.pig
|- workflow.xml

What now? How do I deploy it and run it with Oozie?
Can anyone more experienced help?

Regards
Pawel

psmith

2 Answers


I do it like this:

hadoop fs -put workflow.xml some_dir/ 
oozie job --oozie http://your_host:11000/oozie -config cluster_conf.xml -run

and my cluster_conf.xml looks like this (please check your ports first; they depend on your Hadoop distribution):

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
    <property>
        <name>nameNode</name>
        <value>hdfs://my_nn:8020</value>
    </property>
    <property>
        <name>jobTracker</name>
        <value>my_jt:8050</value>
    </property>
    <property>
        <name>oozie.wf.application.path</name>
        <value>/user/my_user/some_dir/workflow.xml</value>
    </property>
</configuration>
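
The two commands above can be wrapped into one step that also checks on the job afterwards. This is only a sketch, not part of the original answer: the function name `deploy_and_run` and its arguments are made up for illustration, it uploads the whole application directory (so the Pig script reaches HDFS along with workflow.xml), and it assumes `oozie job -run` prints the new job ID as `job: <id>`.

```shell
# Hypothetical helper: upload an Oozie application directory, submit it,
# then print the status of the job that was just started.
#   $1 local application directory (e.g. FirstScript)
#   $2 target HDFS directory
#   $3 Oozie server URL (e.g. http://your_host:11000/oozie)
#   $4 local config file passed to -config
deploy_and_run() {
    app_dir=$1
    hdfs_dir=$2
    oozie_url=$3
    conf=$4

    # Copy the application directory (workflow.xml, scripts, lib/) to HDFS.
    hadoop fs -put "$app_dir" "$hdfs_dir" || return 1

    # Submit and start the workflow; assumed output format: "job: <id>".
    out=$(oozie job -oozie "$oozie_url" -config "$conf" -run) || return 1
    job_id=${out#job: }

    # Show the current status of the submitted job.
    oozie job -oozie "$oozie_url" -info "$job_id"
}
```

A call could look like `deploy_and_run FirstScript /user/my_user http://your_host:11000/oozie cluster_conf.xml`; the config file itself stays on the local machine.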
Viacheslav Rodionov
  • Hi, thanks for your answer. Two more questions. First, should this cluster_conf.xml file also be uploaded to HDFS? Second, because I didn't do the installation myself, I'm not sure about the address of my Oozie installation. Where can I find it? I know that the configuration required for cluster_conf.xml can be found in core-site.xml and mapred-site.xml, but there is nothing there about Oozie... – psmith Feb 05 '14 at 07:25
  • Hi, I'm glad it helped. No, you don't need to copy this conf file to HDFS. You can find the Oozie config in **oozie-site.xml** ;) In my installation it's in **/etc/oozie/conf/oozie-site.xml**; just look for the **oozie.base.url** property. – Viacheslav Rodionov Feb 05 '14 at 12:32
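
As a rough sketch of the lookup described in the comment above, the URL can be pulled out of oozie-site.xml from the shell. The function name `oozie_base_url` is hypothetical, and the snippet assumes the `<name>` and `<value>` elements sit on consecutive lines, which is the usual layout but not guaranteed. The value may also contain property placeholders that still need resolving by hand.

```shell
# Hypothetical helper: print the <value> that follows
# <name>oozie.base.url</name> in the given oozie-site.xml.
# Assumes <name> and <value> are on consecutive lines.
oozie_base_url() {
    grep -A1 '<name>oozie.base.url</name>' "$1" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Path taken from the comment above; it may differ on other installations.
conf=/etc/oozie/conf/oozie-site.xml
if [ -f "$conf" ]; then
    # The oozie CLI falls back to OOZIE_URL when -oozie is not given.
    OOZIE_URL=$(oozie_base_url "$conf")
    export OOZIE_URL
fi
```

With `OOZIE_URL` exported, the `-oozie` option can be left off later `oozie job` commands.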

The -config option should point to job.properties instead of an XML file, since job.properties contains the path to workflow.xml:

oozie job --oozie http://your_host:11000/oozie -config /job.properties -run
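
For illustration, a minimal job.properties could be created as below. This is a sketch under assumptions: the host names, ports, and `some_dir` are the placeholders used earlier in this thread, not values from this answer, and `${nameNode}` and `${user.name}` are left unexpanded on purpose so that Oozie resolves them at submission time.

```shell
# Write a minimal job.properties in the current directory.
# The quoted 'EOF' keeps the shell from expanding ${...}.
# Host names and ports are placeholders and must be replaced with real values.
cat > job.properties <<'EOF'
nameNode=hdfs://my_nn:8020
jobTracker=my_jt:8050
oozie.wf.application.path=${nameNode}/user/${user.name}/some_dir
EOF

# Submit with (not executed here):
#   oozie job -oozie http://your_host:11000/oozie -config job.properties -run
```

The `nameNode` and `jobTracker` entries only take effect if workflow.xml refers to them as `${nameNode}` and `${jobTracker}`; a workflow that hardcodes those addresses ignores them.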
DimaSan
Piyush Ugale