I have to download a 250mb~ ZIP File password encrypted through FTP. After downloaded I have to unzip it with a common password, the Zip file would contain an 1.5GB MS Access DB that I have to read and make some joins with some tables in my DB Oracle and transform and load that Data into that Oracle DB.
I'm looking for the best way to do this process. I'm a c# developer, so my first thought was to use c#, download the file via FtpClient or FtpWebRequest, then use a zip library like DotNetZip and open the MS Access dataBase via ODBC and load the records into Oracle with ODP.NEt, I think that's is my "easy way", cuz I know how to do it.
But since this a big file and I know this could take a long time, I'm concerned about time and efficiency and how to reduce the time of this process.
So I'm thinking that processing all the request directly into oracle(download the FTP from there, unzip it there, and process the information directly in there would reduce times like passing record by record from c# to oracle) should reduce the time of this process, but I'm not sure if this is the correct way of doing this.
So I started to look into librarys from oracle that could do what I'm trying to archieve and I found the PLSQL-utils and seems like they can do everything that I need except reading the MS Access DataBase and I started looking about that and found the Heterogeneous Services but I have never used them so I'm little lost about that.
Also I heard once that I could use Java directly from Oracle, and I know java can connect to MS Access via JDBC. So I searched about that and found something about Calling Java Methods in Oracle Database
That's what I have so far, but I don't know which method should I use, I mean, RDBMS as far as I know, are meant for processing data but not for programming things like downloading files or something like that, that's why we have OOP's languages.
As an additional information, this process is going to execute once or twice for month so I have to schedule it, if it is in oracle, can easily be done with an schedule job, or in c# with a Scheduled Task or Windows Service (those are the tools that I know)
Some restrictions that I have
- My client don't have a MS SQL Server and neither can buy a license for it (So I cannot use DTSX for this process)
- In the Oracle production server maybe I won't have enough permissions to do all the things, but I can comply for those if they are the best for the process
- If a backend server (Java, c# hosted on IIS or WebLogic or JBoss or anykind) is going to be required, this Server and the Oracle Server would be differents
- Oracle database hosted on Unix Server
Being said all of this, how can I efficiently do all this process, should I use .net and load record by record in my Oracle DataBase? Should I do everything in oracle?Or none of this? Is there a better way to do this?