I have an Uber jar that performs some Cascading ETL tasks. The jar is executed like this:
hadoop jar munge-data.jar
I'd like to pass arguments to the jar when the job is launched, e.g.
hadoop jar munge-data.jar -Denv=prod
Different credentials, hostnames, etc. will be read from properties files depending on the environment.
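For example, I do something along these lines to pick a properties file (a sketch; the config-<env>.properties naming and the MungeData class name are just illustrative):

import java.io.InputStream;
import java.util.Properties;

// Sketch: choose a bundled properties file based on the env property.
String env = System.getProperty("env", "dev"); // falls back to dev
Properties props = new Properties();
try (InputStream in = MungeData.class.getResourceAsStream("/config-" + env + ".properties")) {
    props.load(in); // throws NullPointerException if the file is missing
}
String dbHost = props.getProperty("db.host");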
This would work if the job were executed with java -jar munge-data.jar -Denv=prod, since the env property could then be accessed with System.getProperty("env").
However, this doesn't work when the jar is executed with hadoop jar ...
I saw a similar thread where the answerer states that properties can be accessed using what looks like the org.apache.hadoop.conf.Configuration class. It wasn't clear to me from the answer how the conf object gets created. I tried the following, and it returned null:
import org.apache.hadoop.conf.Configuration;

Configuration configuration = new Configuration();
System.out.println(configuration.get("env")); // prints null; -Denv isn't picked up
Presumably, the configuration properties need to be set somewhere before they can be read.
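If I understand the other thread correctly, the -D arguments are parsed into the Configuration by Hadoop's GenericOptionsParser when the driver is run through ToolRunner. Here's a sketch of what I think is meant (MungeData stands in for my driver class; I haven't verified this end to end):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MungeData extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // By the time run() is called, ToolRunner/GenericOptionsParser has
        // copied any -D arguments into the Configuration.
        Configuration conf = getConf();
        System.out.println(conf.get("env")); // "prod" for -Denv=prod
        // ... launch the Cascading flows here ...
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options (-D, -conf, -fs, ...) and
        // passes the remaining arguments through to run().
        System.exit(ToolRunner.run(new Configuration(), new MungeData(), args));
    }
}

This would presumably be launched the same way, hadoop jar munge-data.jar -Denv=prod, with the generic options placed before any application-specific arguments.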
Can you tell me how I can pass properties, e.g. hadoop jar [...] -DsomeProperty=someValue, into my ETL job?