
I am trying to read a Delta table in S3 using the delta-rs `deltalake` library. To access S3, I need to pass an aws_access_key_id, an aws_secret_access_key, and an SSL certificate. With the code below (which does not use a certificate):

from deltalake import DeltaTable
from pyarrow import fs

storage_options = {"AWS_ACCESS_KEY_ID": "aws_my_key",
                   "AWS_SECRET_ACCESS_KEY": "my_secret",
                   "AWS_ENDPOINT_URL": "https://domain.company.com"}


url = 's3://bucket/sales_data/'

raw_fs, normalized_path = fs.FileSystem.from_uri(url)
filesystem = fs.SubTreeFileSystem(normalized_path, raw_fs)

dt = DeltaTable(url, storage_options=storage_options)
ds = dt.to_pyarrow_dataset(filesystem=filesystem)

I get the following error:

deltalake.PyDeltaTableError: Failed to load checkpoint: Failed to read checkpoint content: Generic S3 error: Error performing get request sales_data/_delta_log/_last_checkpoint: response error "request error", after 0 retries: error sending request for url (https://domain.company.com/sales_data/_delta_log/_last_checkpoint): error trying to connect: The certificate was not trusted.

I have checked the delta-rs package and its documentation, but could not find a way to pass a certificate to it. From the error message, I guess that it checks some default path for certificates, but the message does not say which path.

Is there a way to pass a certificate to it during initialization?

Hongbo Miao
Scarface
    I'm not sure but it may be possible via `boto`. Here's an issue with a similar error: https://github.com/delta-io/delta-rs/issues/855#issuecomment-1263278985. – Denny Lee Nov 08 '22 at 23:49

1 Answer


Passing AWS credentials through `storage_options` should work in delta-rs v0.6.0 and newer (see the delta-rs documentation):

from deltalake import DeltaTable

storage_options = {"AWS_ACCESS_KEY_ID": "...", "AWS_SECRET_ACCESS_KEY": "..."}
dt = DeltaTable("s3://<bucket>/<path>", storage_options=storage_options)
dt.to_pyarrow_table().to_pydict()
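
The snippet above covers the credentials but not the certificate. As a rough sketch only: I am not aware of a documented `storage_options` key for a CA file, so the code below points the `SSL_CERT_FILE` environment variable at a PEM bundle before the table is opened. This is an assumption: whether your `deltalake` build honors that variable depends on the TLS backend it was compiled with and on your platform, so treat it as something to try rather than a confirmed fix. The helper names (`build_storage_options`, `trust_ca_bundle`, `open_table`) and the bundle path are hypothetical.

```python
import os


def build_storage_options(access_key, secret_key, endpoint_url):
    """Collect the S3 settings used in the question into one dict."""
    return {
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        "AWS_ENDPOINT_URL": endpoint_url,
    }


def trust_ca_bundle(ca_bundle_path):
    """Point SSL_CERT_FILE at a PEM bundle containing the company CA.

    Assumption: the TLS backend used by your deltalake build consults
    SSL_CERT_FILE when it loads trusted root certificates.
    """
    if not os.path.isfile(ca_bundle_path):
        raise FileNotFoundError(ca_bundle_path)
    os.environ["SSL_CERT_FILE"] = ca_bundle_path


def open_table(url, storage_options, ca_bundle_path=None):
    """Open a Delta table, optionally registering a CA bundle first."""
    if ca_bundle_path is not None:
        trust_ca_bundle(ca_bundle_path)
    # Imported here so the environment variable is set before any request is made.
    from deltalake import DeltaTable

    return DeltaTable(url, storage_options=storage_options)
```

Usage would look like `open_table("s3://bucket/sales_data/", build_storage_options("aws_my_key", "my_secret", "https://domain.company.com"), ca_bundle_path="/path/to/company-ca.pem")`. If the variable has no effect, adding the CA to the operating system's trust store is the other thing to try.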
Jim Hibbard