0

I'm trying to store buses schedules into database and I'm wondering which database model is suitable for my case.

I have bus operators, each operator has several routes, each route has several turns, each turn has stops, etc. Turns are generated from something called "turn master" where the scheduling is defined (frequency, stops, etc.) within next N days.

I expect to deliver a very fast searching for bus when user tries to search a bus from city to city on given date.

I'm using MySQL, the number of stops reach around 100.000 records and the searching speed is fast but I'm not sure if it's still fast when data gets really big (thousand operators, each operator has hundreses turns, each turn has around 10 stops, turns are generated for around next 30 days).

Basically, performing a search is to look into stops (city/town/place, time) and check if it matches user search criteria.

So, my question is: Is relational database best in this case? Or using some kind of NoSQL will be better when the data get really bigs?

Thanks in advance,

khoi nguyen
  • 399
  • 2
  • 12

1 Answers1

2

NoSQL databases are designed to work with unstructured data or data which is structured in various or unpredictable ways. Your data is structured in a very well understood and predictable way.

What makes you think that relational database isn't the right answer for your application? Having a lot of rows doesn't mean your relational queries are going to be slow. The performance of your application will depend on having proper indexing, but even more importantly, it will depend on your application logic. What heuristic are you using for solving the travelling salesman problem? How you do your routing could potentially have a bigger impact on system performance than your data storage choices.

Joel Brown
  • 14,123
  • 4
  • 52
  • 64
  • Hi,thanks for your answer. We don't have to work on problem of travelling salesman. We just list all turns which have stops at user's searching destinations and show them all. Basically, I have to self-join on the same table of stops to check if stops of the same turn cover user's search destinations. If we have n row in stop table, we have to somehow loop over n! times to check (kind of cross join). Do you have other idea to improve this? Thanks so much, – khoi nguyen Oct 01 '13 at 07:57
  • I'm also thinking about how the data should be splited up so that I can use multiple threads to do the seach on each part and then combine them together to get the final result but still don't have any idea when using MySQL. I believe that when the stops get to millions / billions, the searching performance will be slow when using sigle query even we have proper indices, etc. – khoi nguyen Oct 01 '13 at 07:57
  • @khoinguyen - Regarding multiple threads: let the database engine do the query optimization. It is almost certainly going to do a better job that you will trying to do it manually. If you just can't get the performance you need out of the database engine, then try partitioning your data for performance. Re algorithm: You should have a routing heuristic. See: http://stackoverflow.com/questions/430142/what-algorithms-compute-directions-from-point-a-to-point-b-on-a-map – Joel Brown Oct 01 '13 at 10:36