I am running Airflow v2.3.2 / Python 3.10 from the Docker image below:
apache/airflow:2.3.2-python3.10
The Docker image pins paramiko==2.7.2 to address authentication issues that had been seen in testing.
When calling the SFTP hook, I am using the following:
sftp = SFTPHook("connection|sftp")
sftp.look_for_keys = False
sftp.get_conn()
I have also tried it without the sftp.look_for_keys = False line.
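For reference, the paramiko DEBUG output shown further down was captured by raising the log level on paramiko's transport logger before calling get_conn(); a minimal stdlib-only sketch (the logger name "paramiko.transport" is the one paramiko 2.x emits its handshake/auth messages on):

```python
import logging

# Send all log records to stderr and show DEBUG and above.
logging.basicConfig(level=logging.DEBUG)

# paramiko logs kex negotiation and auth attempts on this logger.
logging.getLogger("paramiko.transport").setLevel(logging.DEBUG)

# ...then call sftp.get_conn() as above to see the transport negotiation.
```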
In the Airflow UI, under Connections, I have configured the Extra field as follows:
{
  "private_key": "privatekeyinfo",
  "no_host_key_check": true
}
Here "privatekeyinfo" is the key in string format, "-----BEGIN OPENSSH PRIVATE KEY----- ...", with '\n' line breaks written in.
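One pitfall worth ruling out: if the pasted key ends up containing literal backslash-n character pairs instead of real newlines, it will not parse as a valid OpenSSH key. A stdlib-only sanity check (the key text here is a placeholder, not a real key):

```python
# Placeholder key text with literal backslash-n sequences, as it might be
# pasted into the Airflow Extra field.
key_str = "-----BEGIN OPENSSH PRIVATE KEY-----\\nBASE64DATA\\n-----END OPENSSH PRIVATE KEY-----"

# A usable key body must contain real newlines, not the two characters '\' 'n'.
normalized = key_str.replace("\\n", "\n")

print("\\n" in normalized)    # -> False (no literal backslash-n left)
print(normalized.count("\n"))  # -> 2 (real line breaks around the key body)
```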
When I test the connection in the UI, it reports "Connection successfully tested". However, when the script that calls the hook runs, I receive the following:
[TIMESTAMP] {transport.py:1819} INFO - Connected (version 2.0, client dropbear)
[TIMESTAMP] {transport.py:1819} INFO - Authentication (password) failed.
I have also attempted to pass "host_key" in the Extra field, but I get the same authentication error.
To be explicit, I have tried the following combinations:
- sftp.look_for_keys = False and "no_host_key_check": true
- sftp.look_for_keys = False and "host_key": "host_key_value"
- #sftp.look_for_keys = False (commented out) and "no_host_key_check": true
- #sftp.look_for_keys = False (commented out) and "host_key": "host_key_value"
Testing the connection in the Airflow UI succeeds both with "no_host_key_check": true in Extra and with "host_key": "host_key_value" in Extra.
Referenced SO questions -
- Airflow SFTPHook - No hostkey for host found
- Paramiko AuthenticationException issue
- Verify host key with pysftp
- "Failed to load HostKeys" warning while connecting to SFTP server with pysftp
- How to use Airflow to SSH into a server with RSA public/private keys?
- "No hostkey for host ***** found" when connecting to SFTP server with pysftp using private key
Additional Logging from Paramiko -
[TIMESTAMP] {transport.py:1819} DEBUG - starting thread (client mode): 0x9e33d000
[TIMESTAMP] {transport.py:1819} DEBUG - Local version/idstring: SSH-2.0-paramiko_2.7.2
[TIMESTAMP] {transport.py:1819} DEBUG - Remote version/idstring: SSH-2.0-dropbear [SERVER]
[TIMESTAMP] {transport.py:1819} INFO - Connected (version 2.0, client dropbear)
[TIMESTAMP] {transport.py:1819} DEBUG - kex algos:['diffie-hellman-group1-sha1', 'diffie-hellman-group14-sha256', 'diffie-hellman-group14-sha1'] server key:['ssh-dss', 'ssh-rsa'] client encrypt:['blowfish-cbc', 'aes128-ctr', 'aes128-cbc', '3des-cbc'] server encrypt:['blowfish-cbc', 'aes128-ctr', 'aes128-cbc', '3des-cbc'] client mac:['hmac-sha1', 'hmac-md5-96', 'hmac-sha1-96', 'hmac-md5'] server mac:['hmac-sha1', 'hmac-md5-96', 'hmac-sha1-96', 'hmac-md5'] client compress:['none'] server compress:['none'] client lang:[''] server lang:[''] kex follows?False
[TIMESTAMP] {transport.py:1819} DEBUG - Kex agreed: diffie-hellman-group14-sha256
[TIMESTAMP] {transport.py:1819} DEBUG - HostKey agreed: ssh-rsa
[TIMESTAMP] {transport.py:1819} DEBUG - Cipher agreed: aes128-ctr
[TIMESTAMP] {transport.py:1819} DEBUG - MAC agreed: hmac-sha1
[TIMESTAMP] {transport.py:1819} DEBUG - Compression agreed: none
[TIMESTAMP] {transport.py:1819} DEBUG - kex engine KexGroup14SHA256 specified hash_algo <built-in function openssl_sha256>
[TIMESTAMP] {transport.py:1819} DEBUG - Switch to new keys ...
[TIMESTAMP] {transport.py:1819} DEBUG - Attempting password auth...
[TIMESTAMP] {transport.py:1819} DEBUG - userauth is OK
[TIMESTAMP] {transport.py:1819} INFO - Authentication (password) failed.
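Note that nowhere in the DEBUG output above does paramiko attempt publickey auth; it goes straight to a password attempt. A trivial check over the captured log lines (log text copied from the run above):

```python
# Relevant lines copied from the paramiko DEBUG output above.
log_lines = [
    "DEBUG - Attempting password auth...",
    "INFO - Authentication (password) failed.",
]

# If any line mentioned "publickey", the private key was at least tried.
tried_publickey = any("publickey" in line for line in log_lines)
tried_password = any("password" in line for line in log_lines)
print(tried_publickey, tried_password)  # -> False True
```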
Additionally, the SFTP server already has the public key and can be connected to using the private key (verified both with CyberDuck and with a locally running instance of Airflow). Even on the hosted version of Airflow, when I open the SFTP connection under the Admin > Connections drop-down and select Test, it returns "Connection successfully tested". The issue occurs only within the DAG: it appears to be trying to authenticate with a password instead of the private key provided for that connection.
Link to Airflow GH discussion - https://github.com/apache/airflow/discussions/31318