0

Broadly speaking, can someone tell me if I'm headed in the right direction?

I now know how to write SQL Queries pretty well.

I would like to start aggregating multiple queries onto one "form"/template (not sure if that's the correct terminology).

I have access to lots of clean data in the form of Excel Files.

I should be able to load the excel files into Visual Studio and then write reports that refer to those excel files as databases, am I right?

I haven't been able to find a great SSDT tutorial yet, but I'll keep looking. Thanks for any feedback.

AdMac
  • 117
  • 1
  • 12
  • Adam, do you mind giving a bit more description of what you're trying to build? Is this a web application for multiple users, or a windows application that is only for your own use, or something else? When you say 'and then write reports', what kind of reports are you generating...are these excel files that you are generating? There is a lot of guidance that I can give you, but it will depend on the answers to these details. – Michael Erpenbeck Oct 12 '17 at 02:43
  • Thanks Michael - every month I receive financial statements for a variety of businesses; they are all in the exact same format. I assign each line item on the financial statement an "account code" i.e. Salaries, utilities, income etc. I would like to turn these excel files into a database that I can continuously update/add to each month. I would like to create a report that aggregates these categories by the account code rather than displaying line items line by line. The rows would be categories, columns would be the various business names; basically a business intel report. – AdMac Oct 12 '17 at 12:55
  • Do you mean Reporting Services or reports drawn by you custom code? – qxg Oct 13 '17 at 06:59

1 Answers1

0

First off, I apologize that I'm writing a bit of a novel here. My understanding of your question is that you're looking for architectural guidance on the best way to go, and that's not a quick answer.

If, however, I've misinterpreted your intent and you are actually just looking for how to code up an excel file as a database, there is already a lot of articles online that you can google.

Regarding your architectural question...it is really going to come down to choosing the best trade-offs for what you're building. I will give you some pointers that I have learned and hopefully it is helpful to you and others in the community.

I would be very hard pressed to advise that you use an excel file as a database. While it might seem like the most straight forward solution, the trade-offs here are very costly in debugging file locking issues and dealing with excel specific errors, it becomes a death by a thousand cuts. It is certainly possible, but this is a trap that I personally fell into early in my career.

Here's is a link to some descriptions of the problems that you'd have with an excel file database and here is a 2nd link.

To paraphrase your question, it sounds like you're developing a personal ETL application for improving your productivity and your company's metrics. Spreadsheets come into your e-mail inbox and transformed versions of the spreadsheets go out of your e-mail outbox. You are wanting to look at the departments' data from a historical and comparative perspective. I have done this many times in the past as well and it is a very reasonable goal.

The best way that I have found to do this is to use a SQL Server database. You can start this out in phases of minimal viable product to do this in small easy chunks.

Phase I:

  1. Download and install SQL Server 2016 Express free. Make sure to install localdb when you install SQL Server 2016. See the localdb instructions for more information.

  2. Create the localdb instance on the command line. localDb

  3. Connect to the new localdb instance in SQL Server connect to instance

  4. Create a new Database that you'll use for importing the data. Give it a name like "ReportData" Create DB

  5. Import the excel files received from the variety of businesses into the new database. This stackoverflow answer gives a great description of how to do it. Here is an alternate example.

  6. If you get any error messages about drivers you may need to download the correct drivers.

  7. Develop your SQL queries that you want to use. For simplicity, I'm just showing a basic select statement here, but you can build some sophisticated SQL queries for aggregating the data in this step. SQL Queries

  8. Export the data from the excel file into a CSV file or an excel file. You do this by right clicking in the "Results" area and selecting "Save Results As..." enter image description here

  9. Manually copy and paste the resulting values into the excel templates that you would like. Manual step for now

Note step 9 will be automated soon, but it is better for now to understand your domain objects and be thinking about the business logic that you're building in a quick iterative manner.

Phase II: Create a new Console application in Visual Studio that will transform the data from the database into an Excel file output. The most powerful way to do this is to use EPPlus. Here is a detailed explanation on how to do this. Note, when you run the source code from the detailed explanation link, you need to change the output path first, for example to c:\temp. Note also that there are plenty of other Excel spreadsheet helper packages out there, so feel free to look around at other packages as well. EPPlus is simply one that I have been successful with in my projects. This Console application will be querying your SQL Server database using the queries that you built in step 7 above.

Phase III: In time, you many find that co-workers and managers within your company want to start accessing your data directly through a web page...

At a high level, the steps you would take are:

  1. Backup the database and restore it onto a server.
  2. Implement a simple MVC application
  3. Perhaps even build web pages to allow users to import excel so that they don't need to e-mail them to you any longer.

An additional note, there are Enterprise level ETL and reporting tools out there as well, such as SSIS/SSRS, etc that you could look into if you're looking for a more sophisticated tool set, but I didn't get that impression with your question.

I hope that this answer helps and isn't too long winded. Please let me know if any of the steps are unclear, I know it's a lot of information in one post.