
We are exploring the sshj library to download a file from an SFTP path into ADLS, using the example as a reference.

  • We have already configured the ADLS Gen2 storage in Databricks to be accessed as an abfss URL.

  • We are using Scala within Databricks.

  1. How should we pass the abfss path as a FileSystemFile object in the get step?

    sftp.get("test_file", new FileSystemFile("abfss://<container_name>@a<storage_account>.dfs.core.windows.net/<path>"));
    
  2. Should the destination be a directory path only, or a file path that includes the file name?

Martin Prikryl
rainingdistros

1 Answer


Use streams. First, obtain an InputStream for the source SFTP file:

RemoteFile f = sftp.open(sftpPath);
InputStream is = f.new RemoteFileInputStream(0);

(How to read from the remote file into a Stream?)


Then obtain an OutputStream for the destination file on ADLS:

OutputStream os = adlsStoreClient.createFile(adlsPath, IfExists.OVERWRITE);

(How to upload and download a file from my locale to azure adls using java sdk?)


Finally, copy from the first stream to the second:

is.transferTo(os);

(Easy way to write contents of a Java InputStream to an OutputStream)
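
The copy step can be wrapped so that both streams are always closed, even if the transfer fails. The following is a minimal sketch, not a definitive implementation: the class name `StreamCopySketch` and the helper `copyAndClose` are hypothetical, and in-memory streams stand in for the SFTP `InputStream` and the ADLS `OutputStream` so that the example runs without either service. `InputStream.transferTo` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class StreamCopySketch {

    // Copies all bytes from `in` to `out`, closes both streams
    // (also when the copy throws), and returns the number of bytes copied.
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        try (InputStream source = in; OutputStream target = out) {
            return source.transferTo(target); // available since Java 9
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins: in real use, `in` would be the RemoteFileInputStream
        // and `out` the stream returned by the ADLS client.
        byte[] payload = "hello from sftp".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        long copied = copyAndClose(new ByteArrayInputStream(payload), sink);
        System.out.println(copied + " bytes copied");
    }
}
```

With the real streams, the call would presumably look like `copyAndClose(is, os)`. Closing the stream may not release the underlying sshj `RemoteFile` handle, so it is probably worth closing `f` as well once the copy has finished.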

Martin Prikryl