
I am trying to upgrade our version of airflow to 1.10.0. When I do, I get an error that complains it cannot connect to mysql:

worker_1     | sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/var/run/mysqld/mysqld.sock\' (2 "No such file or directory")') (Background on this error at: http://sqlalche.me/e/e3q8)

When I try to remove mysql from our systems altogether, I get the following instead:

scheduler_1  | [2018-10-25 17:22:19,399] {{celery_executor.py:113}} ERROR - No module named 'MySQLdb'

Mysql appears in no environment variable we have set, nor does it appear in airflow.cfg. It appears as if this version of airflow requires mysql for some other reason. Is this true?

Update: This is similar to the issue raised here, but I'm more interested in why airflow is calling mysql at all.

I should point out also that we do explicitly set the sqlalchemy connection to a postgres database.

AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgres://airflow:airflow@postgres/airflow

The error is happening when airflow is trying to write the result of a task run (marking something as failure).

Update

This is the dockerfile I use which defines the airflow image. Note no mention of mysql:

# SOURCE: https://github.com/puckel/docker-airflow

FROM python:3.6-jessie

# Never prompts the user for choices on installation/configuration of packages
ENV DEBIAN_FRONTEND noninteractive
ENV TERM linux

# Airflow
ARG AIRFLOW_VERSION=1.10.0
ARG AIRFLOW_HOME=/usr/local/airflow

# Define en_US.
ENV LANGUAGE en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8
ENV LC_CTYPE en_US.UTF-8
ENV LC_MESSAGES en_US.UTF-8
ENV PYTHONPATH ${AIRFLOW_HOME}
ENV AIRFLOW_GPL_UNIDECODE yes

COPY ./requirements.txt .

RUN set -ex \
    && buildDeps=' \
        python3-dev \
        libkrb5-dev \
        libsasl2-dev \
        libssl-dev \
        libffi-dev \
        build-essential \
        libblas-dev \
        liblapack-dev \
        libpq-dev \
        git \
    ' \
    && apt-get update -yqq \
    && apt-get upgrade -yqq \
    && apt-get install -yqq --no-install-recommends \
        $buildDeps \
        python3-pip \
        python3-requests \
        apt-utils \
        curl \
        rsync \
        netcat \
        locales \
        vim \
    && sed -i 's/^# en_US.UTF-8 UTF-8$/en_US.UTF-8 UTF-8/g' /etc/locale.gen \
    && locale-gen \
    && update-locale LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
    && useradd -ms /bin/bash -d ${AIRFLOW_HOME} airflow \
    && pip install -U pip setuptools wheel \
    && pip install Cython \
    && pip install pytz \
    && pip install pyOpenSSL \
    && pip install ndg-httpsclient \
    && pip install pyasn1 \
    && pip install apache-airflow[crypto,celery,postgres,hive,jdbc]==$AIRFLOW_VERSION \
    && pip install 'celery[redis]>=4.1.1,<4.2.0' \
    && pip install -r requirements.txt \
    && apt-get purge --auto-remove -yqq $buildDeps \
    && apt-get autoremove -yqq --purge \
    && apt-get clean \
    && rm -rf \
        /var/lib/apt/lists/* \
        /tmp/* \
        /var/tmp/* \
        /usr/share/man \
        /usr/share/doc \
        /usr/share/doc-base

COPY script/entrypoint.sh /entrypoint.sh
COPY celery_healthcheck.sh ${AIRFLOW_HOME}
COPY config/airflow.cfg ${AIRFLOW_HOME}/airflow.cfg
COPY dags ${AIRFLOW_HOME}/dags
COPY operators ${AIRFLOW_HOME}/operators
COPY models ${AIRFLOW_HOME}/models
COPY constants.py ${AIRFLOW_HOME}/constants.py
COPY envconsul ${AIRFLOW_HOME}/envconsul
COPY *.hcl ${AIRFLOW_HOME}/

RUN chown -R airflow: ${AIRFLOW_HOME}

EXPOSE 8080 5555 8793

USER airflow
WORKDIR ${AIRFLOW_HOME}
melchoir55
  • Did you use a MySQL or PostgreSQL DB with the previous version? – SergiyKolesnikov Oct 25 '18 at 19:28
  • I've been using Postgres, but never mysql. Mysql was installed for some reason, but it wasn't being used. – melchoir55 Oct 25 '18 at 19:44
  • Can you make sure that you are not using it in the DAGs you have created? – kaxil Oct 26 '18 at 08:47
  • I did a `grep` through the project for any mention of mysql. It doesn't appear in any project file. – melchoir55 Oct 26 '18 at 16:44
  • The databases come as [extra packages](https://airflow.apache.org/installation.html#extra-packages) since not everyone will use the same one. Perhaps try a fresh install with `pip install airflow[postgres,...]`? – Daniel Huang Oct 26 '18 at 16:58
  • I added the dockerfile to show what I'm installing. I'm still working on trying to follow the stack traces, it's just tough to attach a debugger because everything is inside docker compose. – melchoir55 Oct 26 '18 at 17:55

3 Answers

1

Airflow needs some database to work.

By setting AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgres://airflow:airflow@postgres/airflow you tell it to use the corresponding PostgreSQL database as the metadata database, and it will try to use it.

The weird thing is that the error messages complain about a MySQL database. My guess is that you used MySQL with the previous version and initialized the Airflow metadata database with airflow initdb using MySQL. Then you removed MySQL and Airflow started complaining.

I would make sure that the PostgreSQL DB is reachable under the connection specified in AIRFLOW__CORE__SQL_ALCHEMY_CONN and run airflow initdb again. Airflow should then start using the PostgreSQL DB for its metadata.

If that does not work and you can live with losing all the metadata, a full reset may help:

airflow resetdb
airflow initdb

Also note that Airflow recommends using psycopg2 for Postgres.
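To check the "is the DB reachable" step from inside a container, something like the following rough sketch may help. It uses only the standard library, so it tests whether the host and port in the connection string accept TCP connections, not whether the credentials or the database name are valid. The helper names `db_endpoint` and `is_reachable` are made up for this example, and the default port of 5432 is an assumption for a stock Postgres setup.

```python
import os
import socket
from urllib.parse import urlsplit


def db_endpoint(conn_url, default_port=5432):
    """Return (scheme, host, port) parsed from a SQLAlchemy-style URL.

    default_port is an assumption (stock Postgres); it is only used
    when the URL does not name a port explicitly.
    """
    parts = urlsplit(conn_url)
    return parts.scheme, parts.hostname, parts.port or default_port


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Falls back to the URL from the question if the variable is unset.
    url = os.environ.get(
        "AIRFLOW__CORE__SQL_ALCHEMY_CONN",
        "postgres://airflow:airflow@postgres/airflow",
    )
    scheme, host, port = db_endpoint(url)
    print("dialect:", scheme)
    print("reachable:", is_reachable(host, port))
```

If the printed dialect is not a Postgres one, the variable is probably not reaching the process that raises the error, which is worth ruling out before resetting the database.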

SergiyKolesnikov
  • I tried blowing the database away entirely and starting from a fresh db with `airflow initdb`. It didn't help. I'm going to try to figure out what is happening up the stacktrace which causes the code path to start looking for mysql. – melchoir55 Oct 25 '18 at 20:23
0

Figured it out. Turns out this other env var (AIRFLOW__CELERY__RESULT_BACKEND) was set with a typo. I had it set to AIRFLOW__CELERY__CELERY_RESULT_BACKEND. I'm not clear why that worked in 1.9 and suddenly started throwing this error when updating, but when I fixed the var it now works.
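For anyone hitting the same thing, here is a small sketch of how one might scan the environment for this kind of mismatch. It relies on the `AIRFLOW__{SECTION}__{KEY}` naming convention for environment overrides. The rename table contains only the one pair discussed in this thread, and the names `RENAMED_IN_1_10`, `split_airflow_var`, and `find_legacy_vars` are invented for the example. My assumption (not confirmed anywhere in this thread) is that when the result backend setting is not picked up, Airflow falls back to a built-in default that points at MySQL, which would explain the error.

```python
import os

# Hypothetical table for this sketch: old name -> new name.
# Only the pair mentioned in this thread is listed.
RENAMED_IN_1_10 = {
    "AIRFLOW__CELERY__CELERY_RESULT_BACKEND": "AIRFLOW__CELERY__RESULT_BACKEND",
}


def split_airflow_var(name):
    """Split AIRFLOW__SECTION__KEY into (section, key), lowercased.

    Returns None if the name does not follow that pattern.
    """
    prefix = "AIRFLOW__"
    if not name.startswith(prefix):
        return None
    section, sep, key = name[len(prefix):].partition("__")
    if not sep or not section or not key:
        return None
    return section.lower(), key.lower()


def find_legacy_vars(environ):
    """Return {old: new} for old names that are set while the new one is not."""
    return {
        old: new
        for old, new in RENAMED_IN_1_10.items()
        if old in environ and new not in environ
    }


if __name__ == "__main__":
    for old, new in find_legacy_vars(os.environ).items():
        section, key = split_airflow_var(new)
        print(f"{old} is set; 1.10 expects {new} ([{section}] {key})")
```

Running this in the worker and scheduler containers should show quickly whether the old variable name is the only one set.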

melchoir55
  • `AIRFLOW__CELERY__RESULT_BACKEND` is for 1.10+; that setting used to be called `AIRFLOW__CELERY__CELERY_RESULT_BACKEND` but got renamed in 1.10.0 https://github.com/apache/airflow/blob/1.10.0/UPDATING.md#celery-config (the old one would work but issue a warning) – Ash Berlin-Taylor Jun 10 '20 at 16:22
0

It looks like you are using some default connection configuration.
Even if you set variables like sql_alchemy_conn, Airflow will still have values that were set in the Admin -> Connections menu.
Here is how mine looked after a fresh install:

[Image: Airflow will have mysql database type as default]

After a correct airflow initdb, setting the correct values in the airflow_db connection using the UI fixed all the "mysql" errors I had.

SherylHohman
  • Thank you for your contribution. Text in images can be difficult to read, especially on mobile devices. As per SO guidelines, all text, code, error messages, and data must be presented in textual form (and formatted as code or quote formatting, if appropriate). While the info in your image might be argued a gray area on this requirement, For readability sake, I recommend copying the information from your image into text, in order to be of maximum benefit to all visitors. You can format using Markdown pipes and dashes so it aligns into a table-like format if you wish. – SherylHohman Apr 28 '20 at 23:05