0

I'm posting it here because I couldn't' find any such scenario on the web so far. I have a webpage which contains a set of reports both in XLS and PDF formats. I should be downloading the excel files from the page and load into my database. I wish I could use the URL for XLS file directly but the problem is the naming convention may keep changing every time (Sales_Quarter1.xlsx can be Sales_Q1.xlsx the next year). The only thing that would be constant in the following example is "Sales for Calendar Year". I should be looking up for the file that corresponds to this text and download it before loading it into database table.

I would like to know from experts if this would be possible?

<li>
   <sub>Sales for Calendar Year 2015--All Countries&#160;</sub> 
   <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.xlsx"> 
   <sub>[XLS]</sub></a><sub>&#160;, <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.pdf"><sub>[PDF]</sub></a><sub>​</sub></sub>
</li>

PS: I am using SQL Server 2014.

Thanks!

Julaayi
  • 403
  • 2
  • 8
  • 23
  • Yes it would be possible. However, you didn't provide any actual details about...well...anything so there really isn't much anybody can do to offer any help. – Sean Lange Oct 28 '15 at 13:48
  • Hello Sean, Thanks for replying. Please let me know what else you would like to see so that I can provide all those details. This is my first post here so was not sure. – Julaayi Oct 28 '15 at 14:15
  • Here is a great place to start. http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/ – Sean Lange Oct 28 '15 at 14:19
  • Thanks Sean. The reason I didn't post any DDL/DML is that I am only looking for how to download the file from web using SSIS based on the condition I've mentioned. Once I get the way to download the file, I can drive by myself. – Julaayi Oct 28 '15 at 15:56

1 Answers1

0

Have a look at Integration Services. Create a package for both pulling the web page using a script task, along with a variable name that will represent your downloaded, local filenames for the html file and excel files (you will also have to parse the link out of the html file). Then utilize an Excel Source next in your package.

The variable name for the excel file used in the script task will need to be set to ReadWrite as well.

You can also schedule the resulting package execution via SQL Agent job, if you plan to run this on a reoccurring basis, placing logic into the script or the execution paths,

Community
  • 1
  • 1
Andrew Loree
  • 263
  • 1
  • 8
  • Thanks Andrew for the script task resource. I tried implementing the same and getting the following error. Error: Failed to lock variable "User::RemoteUri" for read access with error 0xC0010001 "The variable cannot be found. This occurs when an attempt is made to retrieve a variable from the Variables collection on a container during execution of the package, and the variable is not there. The variable name may have changed or the variable is not being created.". – Julaayi Oct 29 '15 at 13:06
  • You need to set the variable usage in the script task to [readwrite](http://stackoverflow.com/questions/13450289/how-to-access-ssis-package-variables-inside-script-component) – Andrew Loree Oct 30 '15 at 16:24
  • I did set the variable usage to readwrite mode in the Script Task and when I execute it, it executes forever. – Julaayi Nov 03 '15 at 16:49