I want to load data from an on-prem Data Lake into Azure Data Lake Storage Gen2.
For this, I set up an on-prem Windows server and installed a self-hosted Integration Runtime on it. I then connected to the on-prem Data Lake (Hive) from Azure Data Factory.
In Azure Data Factory I created a pipeline with a copy activity, set my on-prem Data Lake (Hive) as the source, and supplied a SQL query to pull the data. Eventually I need to add more copy activities like this one, one for each of several tables.
So far I have tried only a single copy activity in my pipeline.
Here is my problem: the pipeline takes a very long time to load data into the Data Lake.
The Windows server that hosts my Integration Runtime has a bandwidth of 10 Gbps, but the load is still very slow.
I tried pulling just 20,000 records, and it took around 20 minutes to load them. The throughput I was getting was around 15 kbps, which is very low.
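As a rough sanity check on those figures, here is a small Python sketch of the arithmetic. It uses only the numbers quoted above; the variable names are my own, and since I am not sure whether the reported "15 kbps" means kilobytes or kilobits per second, both readings are worked out as assumptions.

```python
# Figures quoted in the question.
records = 20_000
duration_s = 20 * 60            # about 20 minutes
reported_rate = 15              # "15 kbps" as reported; the unit is ambiguous
link_bits_per_s = 10 * 10**9    # 10 Gbps server bandwidth

# Row rate does not depend on how the unit is read.
records_per_second = records / duration_s

# Assumption A: the rate is 15 kilobytes per second.
total_kb_if_kilobytes = reported_rate * duration_s
# Assumption B: the rate is 15 kilobits per second (8 bits per byte).
total_kb_if_kilobits = reported_rate / 8 * duration_s


def avg_bytes_per_record(total_kb):
    """Average record size implied by a total transfer size in KB."""
    return total_kb * 1000 / records


# Share of the 10 Gbps link used under the more generous reading (A).
link_fraction_used = reported_rate * 1000 * 8 / link_bits_per_s

print(f"records per second: {records_per_second:.2f}")
print(f"total transferred (KB/s reading): {total_kb_if_kilobytes} KB")
print(f"total transferred (kbit/s reading): {total_kb_if_kilobits} KB")
print(f"fraction of link used: {link_fraction_used:.6%}")
```

Under either reading the run moved at most about 18 MB at roughly 17 records per second, using a tiny fraction of the 10 Gbps link, so the network bandwidth of the server does not look like the limiting factor.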
How can I improve the performance of the copy activity so that it runs faster?