Questions tagged [azkaban]

Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs.

Oftentimes there is a need to run a set of jobs and processes in a particular order within a workflow. Azkaban resolves the ordering through job dependencies and provides an easy-to-use web user interface to maintain and track your workflows. Here are a few features:

  • Compatible with any version of Hadoop
  • Easy-to-use web UI
  • Simple web and HTTP workflow uploads
  • Project workspaces
  • Scheduling of workflows
  • Modular and pluggable
  • Authentication and Authorisation
  • Tracking of user actions
  • Email alerts on failure and success
  • SLA alerting and auto killing
  • Retrying of failed jobs

http://azkaban.github.io
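
As a rough illustration of what "resolving the ordering through job dependencies" means, here is a minimal Python sketch that orders jobs with a topological sort. It is not Azkaban's implementation; the job names and the `execution_order` helper are hypothetical and chosen only for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs: each name maps to the set of jobs it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(deps):
    """Return one run order in which every job comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A scheduler built on this idea would start a job only once all of the jobs listed before it in its dependency set have finished.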

64 questions
6
votes
2 answers

azkaban keeps changing executor id

I'm using Azkaban 3.0 and I have it on a server with two executors. I have a simple echo job that I'm running, and I'm specifying the executor by setting setExecutor=id# in the flow parameters, but whenever I run this job the execution keeps…
tkyass
  • 2,968
  • 8
  • 38
  • 57
6
votes
3 answers

Azkaban: pass parameters to underlying job code

Is it possible to pass options from an Azkaban workflow to the underlying job code? I have something like this. It kind of works for hard-coded/pre-known dates, but I would like to have the option to specify the date when I execute the flow: from…
sharath
  • 81
  • 1
  • 6
5
votes
2 answers

Relationship between HDFS, HBase, Pig, Hive and Azkaban?

I am somewhat new to Apache Hadoop. I have seen two questions about Hadoop, HBase, Pig, Hive and HDFS. Both of them describe comparisons between the above technologies. But I have seen that a Hadoop environment typically contains all these…
Supun Wijerathne
  • 11,964
  • 10
  • 61
  • 87
4
votes
2 answers

Workflow orchestration for Google Dataflow

We are using Google Dataflow for batch data processing and are looking for workflow orchestration tools, something similar to what Azkaban does for Hadoop. The key things that we are looking for are: configuring workflows, scheduling…
3
votes
2 answers

How to use Hive jobs with Azkaban?

I would like to use Azkaban for periodic Hive jobs. I have looked through the Azkaban documentation, and it seems that by default it doesn't support Hive jobs. Do you know how I can use these two together? I think I'll have to run Hive jobs as a…
wlk
  • 5,695
  • 6
  • 54
  • 72
3
votes
0 answers

How to run selected Azkaban jobs in parallel via a script?

Since there are too many jobs on Azkaban, I have to test new jobs one by one manually. Assuming I upload some new jobs, is it possible to write a Python (or any other language) script to fetch the dependencies between these jobs and then run them…
SuperDelta
  • 253
  • 2
  • 13
3
votes
1 answer

Why is my Sqoop task in Azkaban stuck after columns are selected?

I use shell commands in Azkaban and put the Sqoop commands in a shell script. Today a Sqoop task, sqoop_task1, got stuck for no reason. The same thing happened a few days ago on another Sqoop task, let's call it sqoop_task2. sqoop_task1 and sqoop_task2 are both import…
Will
  • 41
  • 4
3
votes
2 answers

How to get job-name from job file in azkaban 3.0

We need the job name from Azkaban when trying to schedule a job. Is there any built-in property for that? We are getting the flow name from ${azkaban.job.flowid}. Eg: My job file is: type=command command=python xyz.py ${azkaban.job.attempt}…
Ashish Mohan
  • 664
  • 6
  • 13
3
votes
0 answers

Why is IntelliJ not importing Gradle AWS SDK libraries?

I am working on a Gradle project for the very first time (the open-source Azkaban framework). I tried to add the AWS SDK for Java as a dependency in my project, but when I add it, it shows an empty jar, or sometimes no jar at all. Added dependency in build.gradle…
devsda
  • 4,112
  • 9
  • 50
  • 87
2
votes
2 answers

LDAP Authentication for Azkaban

We are trying to set up Azkaban with LDAP authentication in our production environment. Any leads on how to do this? The documentation says it can be done by adding a plugin jar file that extends the UserManager class. I am a newbie to Azkaban, looking for…
Harsha
  • 31
  • 4
2
votes
3 answers

Azkaban: Conditional executions in the workflow

I have a requirement to inject a conditional execution in my workflow. For example: if a particular condition is met, then a particular workflow should be executed. If not, a different workflow should be executed. From my understanding, there is no…
Kranthi
  • 109
  • 2
  • 7
1
vote
1 answer

Kafka Admin Client giving Timeout Error for ListTopic

Hi, I am trying to run this code. It works fine on one EC2 Azkaban instance but gives the error below on another instance. private val adminprops = new Properties() adminprops.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,"Kafka…
1
vote
1 answer

java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration in azkaban hive job

CDH 6.3.2, azkaban 3.90.0. I want to run a Hive job in Azkaban, but I get an error: 23-06-2021 11:21:05 CST hive-jdbc ERROR - Job run failed! java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration at…
ighack
  • 31
  • 4
1
vote
0 answers

python virtual env specific to every execution of an azkaban flow

I have multiple Azkaban flows in a project. Most of the commands are Python scripts, and all of them run on a schedule. Job nodes look like this: nodes: - name: Start type: command config: command: printenv - name:…
1
vote
2 answers

centos8 install azkaban Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'

Environment: CentOS8, mysql Ver 8.0.17, java version "1.8.0_261", azkaban v3.90.0. When I try to install the Azkaban Executor Server: cd /home/azkaban/azkaban/azkaban-exec-server/build/distributions tar -xzvf…
Venus
  • 1,184
  • 2
  • 13
  • 32