0

I need some suggestions from the community to a requirement I have. Below is the requirement and I need some approach suggestions.

Users from the client need to retrieve data from my source database (Let say SQL database in my production server). The users access the data by a intermediary service layer (WCF Rest service). On another server (Info Server) I have a SQL Database (Info DB) which will hold all queries that can be requested. Since in some cases my data is huge, I give the option to user to schedule the data retrieval and look at the data later. The schedule information per user would also be stored in the Info DB. I also allow user to retrieve data real time in case he wants.

In both cases I want to Query data from Source (Production DB), store them in file format (May be CSV or excel) and then when user wants the data I would send the data over to the client.

Since the queries are stored in InfoDB. I let the admin define schedule run time for the every Query. This is to enable Admin to adjust long running queries run at night time when calls to server is low. In case the user demands a query to be run at real time, I would allow that.

As a solution architecture I have thought of this :

  1. I will have a WCF rest service which would be installed on Info Server. This service will act as calling point for the Users. When user calls a query real time, the service will get the results, save to a file format and transfer it over. If the user schedules the query, the service will add an entry for the user/ for the query in the info database.
  2. I will have a Windows Service on the Info Server. This Windows Service will periodically check the Info DB for Scheduled Queries entries and if the queries fall within the scheduled time , it will start running the query, get the data and save that to a file location and add the file location entry to the Schedule entry. This will enable me to track which schedules are finished and where the data is available (File path).

Now here are my issues with this:

  1. My data can be huge, will a WCF rest service be good enough to transfer large files over the wire ? Can I transfer files over wire or I need to transfer data as JSON ? What is the best approach.
  2. If I use a windows service, is this a good approach or is there a better alternative ? The reason I am asking is because as per my understanding Windows Service will have to run always , because I need to figure out the entries which are scheduled. This means at specific interval the Windows Service would check the info database and see if the schedule entry should be run or not. In ideal scenario the windows service will run through out the day and check the database periodically without much action because preferably all schedules would be at night time.
  3. I have used an intermediary service approach because if I need to move to cloud tomorrow, I can easily move this solution. Am I right in my assumption ?
  4. If I move to cloud tomorrow, would I be bale to encrypt the data transfer (may be data encryption or file encryption). I have no idea on data encryption/decryption.

Need your suggestion(s) to this.

Rob Conklin
  • 8,806
  • 1
  • 19
  • 23
Shankar
  • 841
  • 1
  • 13
  • 39
  • You are long enough around that you must know how a good title should look like. This is not such a case. It must capture the content. – Artjom B. May 02 '15 at 09:35
  • Artjom B. : I thought that this title is best as I only need suggestions from the community. I already have an approach in mind which I have described and just wanted to check if there is anything better that can be done. – Shankar May 02 '15 at 18:03
  • StackOverflow is here to build a library of useful questions and answer. Most users find questions and answers through search engines. Now think about how this question will be presented with such an ambiguous title. Sure, it conveys something about *your* intention, but nothing about the content. – Artjom B. May 02 '15 at 18:07
  • Artjom B. : Ok. Got it. I do the same thing (Search by Engines) , But I cannot think of a better title. Simply because there are many parts to this question. DO you have anything on mind. Please suggest. I will change the title – Shankar May 03 '15 at 06:18

1 Answers1

1

My data can be huge, will a WCF rest service be good enough to transfer large files over the wire ? Can I transfer files over wire or I need to transfer data as JSON ? What is the best approach.

When you say huge, how huge? Are we talking gigabytes, megabytes, or kilobytes. I regularly have 100mb rest responses (you will probably have to tweak some things, to increase your MaxMessageLength, but this should be enough to get you going. I would take their advice and use a streaming API though, especially if you are talking several megs of content.

If I use a windows service, is this a good approach or is there a better alternative ? The reason I am asking is because as per my understanding Windows Service will have to run always , because I need to figure out the entries which are scheduled. This means at specific interval the Windows Service would check the info database and see if the schedule entry should be run or not. In ideal scenario the windows service will run through out the day and check the database periodically without much action because preferably all schedules would be at night time.

Beware writing your own scheduler. You might be better off dropping things onto a queue for processing, then just firing up workers at the appropriate time. That way you can just invoke the worker directly for your realtime call. Plus you can run it whenever the database is idle, not on a scheduled basis. It's tricky "knowing" when a service will be idle. Especially in a world of round-the-clock users.

I have used an intermediary service approach because if I need to move to cloud tomorrow, I can easily move this solution. Am I right in my assumption ?

Yes, wrapping an endpoint in a rest service (WCF) will make moving to the cloud much easier.

If I move to cloud tomorrow, would I be bale to encrypt the data transfer (may be data encryption or file encryption). I have no idea on data encryption/decryption.

HTTPS is your friend here. Read this. Don't invent your own here, or use proprietary encryption. HTTPS is old, straightforward and good.

Community
  • 1
  • 1
Rob Conklin
  • 8,806
  • 1
  • 19
  • 23
  • Hi Rob, Thanks for taking time to answer my points. My data may go over 100 MB but I am thinking to split them over many files (each max 100 MB) in case the data is more. When you say 100MB works for you, how much time it usually takes for you ? For the 2nd point when you say "You might be better off dropping things onto a queue for processing, then just firing up workers at the appropriate time.", can you please explain it a bit please. Do you mean that I run the windows service in specific interval and check the queue ? – Shankar May 03 '15 at 06:30
  • Since you've read the question enough to answer it, maybe you can edit it too, to add a better title. There are badges for answering and editing questions within 12 hours. – Artjom B. May 03 '15 at 07:51
  • When I'm just serving up pre-generated content, it can take just a few seconds over a fast connection. When I'm generating content, it can take up to 2 minutes. I get nervous when I start keeping http connections open that long, network appliances (routers load balancers proxies and such) start doing things like chopping my connections. 5 minutes seems to be a common maximum. – Rob Conklin May 03 '15 at 12:54
  • And yes, you can run the service in those intervals. Scheduling execution is a lot easier than scheduling data. You have to deal with things like missed windows, make-up time, etc. And what happens when the queries don't all get done in the available window? If you have them sitting on a queue, you can make those decisions more easily, plus you could run it whenever the DBA see's idle time. – Rob Conklin May 03 '15 at 12:56