I need to write a REST service that offers batch validation for PDF documents.
The basic validation workflow is as follows:
Files are uploaded via the REST API one at a time. Documents are typically between 10 and 150 megabytes in size. Once a batch is complete, validation starts: each document is taken from storage in turn and validated, a report is generated, and the originally uploaded document is then deleted.
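The workflow above can be sketched roughly as follows. This is only a minimal illustration in plain Java, not the actual service: `DocumentStore`, `InMemoryStore`, `BatchValidator`, and `looksLikePdf` are hypothetical names, the in-memory map stands in for whatever storage ends up being used, and the "validation" merely checks for the `%PDF-` file header. For brevity each document is loaded as a byte array, which a real implementation handling 150 MB files would probably replace with streaming.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical storage abstraction; a database- or file-backed
// implementation could sit behind the same interface.
interface DocumentStore {
    void put(String id, byte[] content);
    byte[] take(String id);
    void delete(String id);
    List<String> ids();
}

// In-memory stand-in, for illustration only.
class InMemoryStore implements DocumentStore {
    private final Map<String, byte[]> docs = new LinkedHashMap<>();
    public void put(String id, byte[] content) { docs.put(id, content); }
    public byte[] take(String id) { return docs.get(id); }
    public void delete(String id) { docs.remove(id); }
    public List<String> ids() { return new ArrayList<>(docs.keySet()); }
}

class BatchValidator {
    // Placeholder check: only looks for the "%PDF-" header.
    // Real PDF validation would do far more than this.
    static boolean looksLikePdf(byte[] content) {
        byte[] magic = {'%', 'P', 'D', 'F', '-'};
        if (content == null || content.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (content[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Handles documents one after another: load, validate, record a
    // one-line report, then delete the uploaded original.
    static List<String> validateBatch(DocumentStore store) {
        List<String> reports = new ArrayList<>();
        for (String id : store.ids()) {
            byte[] content = store.take(id);
            boolean valid = looksLikePdf(content);
            reports.add(id + ": " + (valid ? "valid" : "invalid"));
            store.delete(id);
        }
        return reports;
    }
}
```

Because the loop handles one document at a time, at most one document's content is referenced during processing, regardless of batch size.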
The development platform is Java EE (Jersey and EJB). Since the EJB specification does not allow beans to access the file system directly, I've considered using a database via JPA to store the files temporarily until they are processed.
Is this a sound choice or would you prefer a different solution that I haven't thought of?
Is using a database for this scenario a bad idea performance-wise?
We're expecting batches of up to 4000 documents. I'm especially worried about performance bottlenecks and resource consumption (RAM, disk space).
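For a rough sense of scale, the temporary storage a full batch would need can be estimated with simple arithmetic. The sketch below assumes every document in a 4000-document batch sits at either the 10 MB or the 150 MB end of the range, uses decimal units (1 GB = 1000 MB), and ignores any database or file system overhead. `BatchFootprint` and `totalGigabytes` are hypothetical names used only for this illustration.

```java
class BatchFootprint {
    // Total size in gigabytes (decimal, 1 GB = 1000 MB) of a batch in
    // which every document has the same size in megabytes.
    static double totalGigabytes(int documents, int megabytesPerDocument) {
        return (long) documents * megabytesPerDocument / 1000.0;
    }

    public static void main(String[] args) {
        int documents = 4000;
        // prints "Lower bound: 40.0 GB"
        System.out.println("Lower bound: " + totalGigabytes(documents, 10) + " GB");
        // prints "Upper bound: 600.0 GB"
        System.out.println("Upper bound: " + totalGigabytes(documents, 150) + " GB");
    }
}
```

Under these assumptions a single batch would occupy somewhere between about 40 GB and 600 GB of temporary storage until its documents are processed and deleted.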