I have just started working hands-on with Sqoop. Say I have 300 tables in a database and I want to perform an incremental load on each of them. I understand that incremental imports can be done in either append mode or lastmodified mode.
But do I have to create 300 jobs if the only things that vary between them are the table name, the CDC column, and the last value (or last-updated value)?
Has anyone tried using a single job and passing these values as parameters, read from a text file in a loop, so that the same job runs for all the tables in parallel?
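To make the idea concrete, here is a rough sketch of what I have in mind, not something I have run against a real cluster. It assumes a text file with one line per table in the form `table,check_column,mode,last_value`. The JDBC URL, the `/data/raw/<table>` target directory layout, and the function names are placeholders I made up for illustration; the Sqoop options themselves (`--incremental`, `--check-column`, `--last-value`) are the standard incremental import arguments.

```python
import csv
import io
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical connection string; replace with the real JDBC URL.
JDBC_URL = "jdbc:mysql://dbhost/sales"


def build_import_cmd(table, check_column, mode, last_value):
    """Build the argument list for one incremental Sqoop import."""
    return [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--table", table,
        "--target-dir", f"/data/raw/{table}",  # hypothetical HDFS layout
        "--incremental", mode,                 # "append" or "lastmodified"
        "--check-column", check_column,
        "--last-value", last_value,
    ]


def load_table_specs(text):
    """Parse lines of the form: table,check_column,mode,last_value."""
    rows = csv.reader(io.StringIO(text))
    return [tuple(field.strip() for field in row) for row in rows if row]


def run_all(specs, max_parallel=4):
    """Run one Sqoop import per table, a few at a time.

    Returns a list of (table, exit_code) pairs.
    """
    def run_one(spec):
        result = subprocess.run(build_import_cmd(*spec))
        return spec[0], result.returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, specs))
```

The sketch does not write the new last value back to the text file after each run, so that part would still have to be handled somewhere. As far as I understand, lastmodified mode also needs `--merge-key` or `--append` when the target directory already exists.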
What is the industry standard, and what are the recommendations?
Also, is there a way to truncate and reload the Hadoop tables that are very small, instead of performing CDC and merging the tables later?
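For the small tables, I imagine something like a plain full import that wipes the target directory first. The sketch below only builds the command. The JDBC URL and the `/data/raw/<table>` path are placeholders I made up, and `build_full_reload_cmd` is a hypothetical helper name; `--delete-target-dir` and `--num-mappers` are existing Sqoop import options.

```python
def build_full_reload_cmd(table, jdbc_url="jdbc:mysql://dbhost/sales"):
    """Build the argument list for a full (non-incremental) re-import.

    The default jdbc_url and the target directory are hypothetical.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", f"/data/raw/{table}",
        "--delete-target-dir",   # remove the existing directory before importing
        "--num-mappers", "1",    # a single mapper is usually enough for a tiny table
    ]
```

For tables loaded into Hive, I believe `--hive-import` together with `--hive-overwrite` plays the same role, but I have not tried it.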