
We have a lot of long-running, memory/CPU-intensive jobs that we run with Celery on Kubernetes on Google Cloud Platform. However, we have big problems with scaling, retrying, monitoring, alerting, and guaranteed delivery. We want to move from Celery to a more advanced framework.

There is a comparison at https://github.com/argoproj/argo/issues/849, but it's not enough.

Airflow:

  • better community support: ~400 vs. ~12 tagged questions on Stack Overflow, and ~13k vs. ~3.5k GitHub stars
  • defining flows in Python feels better than using plain YAML (see the sketch after this list)
  • supported on GCP as a managed product: Cloud Composer
  • better dashboard
  • some nice operators, like the email operator
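
To make the "defining flows in Python" point concrete, here is a minimal sketch of an Airflow DAG, assuming Airflow 1.x (current at the time of writing); the DAG id, commands, and retry settings are hypothetical:

```python
# A minimal sketch of an Airflow DAG definition, assuming Airflow 1.x;
# dag_id and commands are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "retries": 3,                          # built-in retry handling
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="hourly_heavy_job",             # hypothetical name
    default_args=default_args,
    start_date=datetime(2019, 7, 1),
    schedule_interval="@hourly",           # matches the hourly use case
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    process = BashOperator(task_id="process", bash_command="echo process")

    extract >> process                     # dependency declared in plain Python
```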

Argoproj:

  • native support for Kubernetes (which I suppose is somehow better)
  • supports CI/CD and events, which could be useful in the future
  • (probably) better support for passing results from one job to another (Airflow's XCom mechanism; see the sketch after this list)
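
As a rough illustration of the result-passing point, here is a minimal sketch of Airflow's XCom mechanism, again assuming Airflow 1.x; the task names and the payload are hypothetical:

```python
# A minimal sketch of passing a result between tasks via Airflow's XCom,
# assuming Airflow 1.x; task names and payload are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def produce(**context):
    # The return value of a PythonOperator callable is pushed to XCom.
    return {"rows_processed": 1000}

def consume(**context):
    # Pull the upstream task's return value back out of XCom.
    result = context["ti"].xcom_pull(task_ids="produce")
    print("upstream returned:", result)

with DAG(
    dag_id="xcom_demo",
    start_date=datetime(2019, 7, 1),
    schedule_interval=None,
) as dag:
    t1 = PythonOperator(task_id="produce", python_callable=produce,
                        provide_context=True)
    t2 = PythonOperator(task_id="consume", python_callable=consume,
                        provide_context=True)
    t1 >> t2
```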

Our DAGs are not that complicated. Which of these frameworks should we choose?

– sacherus
  • Why do you want to move from Celery to some more advanced framework? Just because there are some other overhyped technologies out there? Define "more advanced". – DejanLekic Jul 15 '19 at 13:23
  • There are a lot of reasons: 1. We have idle workers that consume a lot of resources, even though we only run them every hour. 2. It's nice to have a dashboard that shows which task in a DAG failed. 3. There are some bugs in Celery, e.g. retry + SIGTERM. These look like valid reasons to me. – sacherus Jul 15 '19 at 18:52
  • Community support may be a bit of a misdirection here. Argo may have less visibility/adoption/community, but it is also a much simpler system overall than Airflow and does not have tooling for different Operators, Sensors, etc. It's sort of on you, the user, to decide what goes into each task's container. It also allows for better support of "not Python", which may be a use case you need to consider. If you need triggers, calendars, sensors, etc., there is argo-events, but at the end of the day the scope is much narrower than Airflow's. I view this as a positive. – lbrindze Feb 04 '20 at 03:12

1 Answer


Idiomatic Airflow isn't really designed to execute long-running jobs by itself. Rather, Airflow is meant to serve as the facilitator for kicking off compute jobs within another service (this is done with Operators) while monitoring the status of the given compute job (this is done with Sensors).

Given your example, any compute task needed within Airflow would be initiated with the appropriate Operator for the service being used (Airflow has GCP hooks that simplify this), and the appropriate Sensor would determine when that task has completed so it no longer blocks the downstream tasks that depend on it.
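
For example, the operator/sensor split might look roughly like this. This is a sketch only, using Airflow 1.x contrib GCP modules; the bucket, cluster, and file paths are hypothetical:

```python
# A sketch of the operator + sensor pattern: an Operator kicks off compute in
# another service and a Sensor gates downstream tasks until it finishes.
# Assumes Airflow 1.x contrib modules; names and paths are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.dataproc_operator import DataProcPySparkOperator
from airflow.contrib.sensors.gcs_sensor import GoogleCloudStorageObjectSensor

with DAG(
    dag_id="offload_compute",
    start_date=datetime(2019, 7, 1),
    schedule_interval="@hourly",
) as dag:
    # Operator: submit the heavy job to Dataproc instead of running it locally.
    submit = DataProcPySparkOperator(
        task_id="submit_job",
        main="gs://my-bucket/jobs/heavy_job.py",    # hypothetical script
        cluster_name="my-cluster",                  # hypothetical cluster
    )

    # Sensor: wait for the job's output marker before unblocking downstream tasks.
    wait_for_output = GoogleCloudStorageObjectSensor(
        task_id="wait_for_output",
        bucket="my-bucket",
        object="outputs/heavy_job/_SUCCESS",        # hypothetical marker file
    )

    submit >> wait_for_output
```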

While I'm not intimately familiar with the details of Argoproj, it appears to be less of a "scheduling system" like Airflow and more of a system for orchestrating and actually executing much of the compute itself.

– Jacob Turpin