Context: For a data pipeline, we need to ingest Excel spreadsheets (arriving via email) directly into Foundry. To avoid manual handling errors, we'd like to build a small Slate app that simply uploads an Excel sheet and automatically appends it to an existing dataset (with a given schema, headers, etc.). Unfortunately, there is very little documentation on the "File Import" widget or on the API that gets called when dragging and dropping a file into a folder.

Idea: Is there a way to upload a file with Slate? Could this file then be added to a dataset, as with the prompt that opens when dropping it into a folder?

Patrick

1 Answer


You actually don't have to build a Slate app to do this! Datasets that are made up of underlying .csv files support new additions of files directly.

Note: All of the following screenshots are from the dataset preview page.

For example, the following dataset I created from 4 .csv files:

4 files with no schema

And I can click on the Import button in the top right to add more files (with the same schema or not, depending on whether you want to strictly adhere to your applied schema).

add new files

If you have already applied a schema, you can also simply Import new files on top of the dataset, but the schemas of the new files must exactly match those already present; otherwise the dataset will fail when it is read.

import with schema
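
Since a mismatched file will break reads of the whole dataset, it can help to check a file's header before importing it. Here is a minimal sketch of such a check in plain Python; it does not call any Foundry API, and `EXPECTED_COLUMNS` and the helper names are hypothetical placeholders, not anything Foundry provides. It assumes "exactly match" means the same column names in the same order.

```python
import csv
import io

# Hypothetical schema; replace with the column names of your existing dataset.
EXPECTED_COLUMNS = ["id", "name", "amount"]


def read_header(csv_file):
    """Return the first row of an open CSV file object as a list of names."""
    reader = csv.reader(csv_file)
    # An empty file has no header row, so fall back to an empty list.
    return next(reader, [])


def header_matches(csv_file, expected=EXPECTED_COLUMNS):
    """Return True only if the header has the same names in the same order."""
    header = [name.strip() for name in read_header(csv_file)]
    return header == list(expected)


if __name__ == "__main__":
    # In-memory stand-ins for files a user would otherwise import.
    good = io.StringIO("id,name,amount\n1,widget,9.5\n")
    bad = io.StringIO("id,amount,name\n1,9.5,widget\n")
    print(header_matches(good))
    print(header_matches(bad))
```

This only compares column names; it does not check column types, which the applied schema would still enforce when the dataset is read.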

vanhooser
  • In our use case, we want to abstract and simplify this import for our business users as much as possible and remove all the configuration prompted by the drag and drop. We also want to avoid manually importing data as you described, mostly to ensure quality. Is there another way? Or would Slate be the wrong tool? – Patrick Oct 27 '20 at 14:45
  • So you could allow the user to import and upload a new .csv file into a dataset using a Slate app, but you would have to interact with the Foundry Core components to initiate a new transaction, add the file, close the transaction, and perform schema validation on the new file. This is largely taken care of by the above process while also avoiding any scale limits you would hit in Slate by having to iterate over the file in-browser; schema validation done in Foundry would use Spark and wouldn't have the same limitation. – vanhooser Oct 27 '20 at 16:59
  • Okay, in that case it's maybe more beneficial to just allow an upload of the Excel sheet via forms into a landing zone and then trigger a validation and import pipeline, correct? Just out of sheer curiosity, is there more detailed documentation of the "File Import" widget? – Patrick Oct 27 '20 at 17:40
  • @Patrick How did you implement your use case? We are having similar questions and needs. – L99 Jul 06 '22 at 18:00
  • @L99 We ended up exporting Excel to CSV and then importing the files manually; we taught and documented the process step by step for the business users. Unfortunately, that was the most efficient option at the time. – Patrick Jul 08 '22 at 15:22
  • Thanks @Patrick. If the discussion is still relevant, I recently found another way: https://stackoverflow.com/questions/72888310/spreadsheet-uploading-appropriate-for-business-end-users-in-foundry/72904039#72904039 – L99 Jul 08 '22 at 18:29