
I'm trying to load my PubLayNet dataset from an S3 bucket into Databricks using Hugging Face datasets like this:

from datasets import load_dataset
dataset_id = "/dbfs/mnt/ocr/dataset/publaynet"
dataset = load_dataset(dataset_id, data_files={"train": "/dbfs/mnt/ocr/dataset/publaynet/train.json", "validation": "/dbfs/mnt/ocr/dataset/publaynet/val.json"}, split="train", cache_dir="./cache")

My S3 bucket is laid out as in the screenshot below:

[Screenshot: S3 bucket layout]

I'm getting this error in Databricks:

[Screenshot: error shown in Databricks]
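
As a sanity check before calling load_dataset, here is a minimal sketch that confirms the two JSON paths are actually visible as local files through the /dbfs mount. The helper name check_data_files is hypothetical (it is not part of the datasets library), and this only tests file visibility; it does not say anything about whether the JSON contents are in a shape that load_dataset accepts.

```python
import os


def check_data_files(data_files):
    """Return a dict of split -> path for every path that is not an existing file."""
    missing = {}
    for split, path in data_files.items():
        if not os.path.isfile(path):
            missing[split] = path
    return missing


# Same paths as in the load_dataset call above.
data_files = {
    "train": "/dbfs/mnt/ocr/dataset/publaynet/train.json",
    "validation": "/dbfs/mnt/ocr/dataset/publaynet/val.json",
}

missing = check_data_files(data_files)
if missing:
    print("Not visible from this node:", missing)
else:
    print("All data files found")
```

If this reports missing files, the problem is likely with the mount or the path rather than with the datasets call itself.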

Christoph Rackwitz
hima sai

0 Answers