Source: CSV files located on a shared drive (on-prem server). Access to this drive and folder is controlled by a security group.
Expectation: load the CSV data into Google BigQuery tables.
- Is it possible to mount the network drive on the Dataproc cluster and let the Spark application read from the mount?
- Alternatively, if I add the GCP service account as a member of the security group and SSH into the server hosting the network drive, it will still prompt for a password, which would break the automated data pipeline.
What is the best approach to load this data into BigQuery tables?