22

I am working on an offline C# application that can find bus routes. I can extract the timetable/bus/route data. I am searching for the most simple solution that will work with basic data.

What algorithm can be used to find a route from bus stop "A" to bus stop "B"? Is there a open-source solution ready for C#/Java? Is the google GTFS format for database good for a simple solution? http://code.google.com/transit/spec/transit_feed_specification.html

Thanks for any help. I am stuck with this. I don't know where to start - how to store the data and how to find routes. I know about Dijkstra/A* but I have used them only on graphs that were not time dependent...

Marvin Pinto
  • 30,138
  • 7
  • 37
  • 54
Daniel Novak
  • 2,746
  • 3
  • 28
  • 37
  • 2
    [OSRM](http://project-osrm.org/) is an open source routing engine for shortest paths based in C++ . You might find it useful. – rafa.pereira Oct 23 '15 at 13:13

7 Answers7

13

The problem you are working on is not a trivial task. So much so, that is has a name: the mixed integer nonlinear programming problem (MINLP). In the words of one author (Deb 1998):

"When formulated mathematically, the time scheduling problem becomes a mixed integer nonlinear programming problem (MINLP) having a large number of resource- and service-related constraints. Although attempts have been made in the past to find an optimal schedule of a simplified model using classical optimization techniques (Bookbinder & DCsilets, 1992; Kikuchi & Parameswaran, 1993), it is observed that this is an extremely difficult task even for a small transit network. The difficulty arises mainly because of the large number of variables and constraints, discrete nature of variables, and nonlinearities involved in the objective function and the constraints."

In Deb's paper he proposes a genetic algorithm.

Your other option would be to use simulation. Just to throw something out there you can try right away-- choose thousands of random routes that start at your origin, and fish out the ones that work reasonably well at getting to the destination.

Picture the algorithm like this: You are trying to find the quickest route from stop A to stop B, starting at a certain time. You hire 1,000 people and arm them with a quarter to flip. You tell them to flip the coin every time they have a chance to get on or off a bus. Heads, get off (or get on, if already off). Tails, stay on (or keep waiting, if off). They each have an index card to write down the choices they make as they go. You go to point B and wait for the first guy to show up and take his card.

Pete
  • 16,534
  • 9
  • 40
  • 54
  • 2
    This a the very popular "vehicle routing problem", which is NP-Complete. Finding the optimal solution is probable, although unlikely. A could work, with varying levels of success, the only factor is "how correct" the solution should be. – gpampara Sep 02 '10 at 17:05
  • 1
    I don't see why finding the route from A to B with a given start time should be slower than the O(n) achieved by Dijkstra. Things only get complicated if you want to route many people taking bus capacities into account. – CodesInChaos Nov 08 '10 at 14:10
7

There is an extensive list of publications (30+) on public transport routing algorithms that has been compiled over time by contributors to the open-source (Java) OpenTripPlanner project here:

https://docs.opentripplanner.org/en/latest/Bibliography/

OpenTripPlanner is multi-modal routing engine that also includes bike and walk - from the above link:

This is a list of articles, dissertations, and books that have inspired and informed both the existing OTP routing engine and some ongoing experiments. Currently, OpenTripPlanner uses a single time-dependent (as opposed to time-expanded) graph that contains both street and transit networks. Walk-only and bicycle-only trips are generally planned using the A* algorithm with a Euclidean heuristic or contraction hierarchies. Walk+Transit or Bike+Transit trips are planned using a variant of the MOA* algorithm with epsilon-dominance for path pruning and the Tung-Chew heuristic (a graph providing a lower bound on aggregate weight) for queue ordering.

The routing bibliography above includes references for the following categories of algorithms and related work:

  • Path Search Speedup Techniques
  • Multi-objective Pareto Shortest Paths
  • Resource-constrained Routing
  • Contraction and Transfer Patterns
  • Timetable-based routing
  • ALT and Metric Embeddings
  • Calibration and Implementation Details
  • Post-Dijkstra Public Transit Routing

If you find something new that's not on the list, please feel free to add it to the wiki!

As far as other open-source public transport routing libraries - there is also the RRRR project by Bliksem Labs:

https://github.com/bliksemlabs/rrrr

From the above link:

RRRR (usually pronounced R4) is a C-language implementation of the RAPTOR public transit routing algorithm. It is the core routing component of the Bliksem journey planner and passenger information system. The goal of this project is to generate sets of Pareto-optimal itineraries over large geographic areas (e.g. BeNeLux or all of Europe), improving on the resource consumption and complexity of existing more flexible alternatives. The system should eventually support real-time vehicle/trip updates reflected in trip plans and be capable of running directly on a mobile device with no Internet connection.

Both OpenTripPlanner and RRRR use GTFS data.

Community
  • 1
  • 1
Sean Barbeau
  • 11,496
  • 8
  • 58
  • 111
6

read this:

Multi-Modal Route Planning. Master's thesis, Universität Karlsruhe (TH), Fakultät für Informatik, 2009. online available at http://i11www.ira.uka.de/extra/publications/p-mmrp-09.pdf

the section on railway routing also applies for bus routing.

The gist of it: the naive approach of expanding space and time into a single graph does not work for large networks. There are smarter solutions.

Jakob Erdmann
  • 190
  • 1
  • 7
4

Just wanted to share my final approach on this problem. This was part of a university project, so it might not be fully suitable for real world use. It had to be reasonably fast to run on a Windows Mobile device.

I ended up using 4 tables (SQLite). One table keeps the list of buses, the second one keeps a list of stations. Another table keeps the combinations - what bus does stop at a specific station and where does it go from this station and how long (minutes) does it take. All combinations have to be stored. And the last table is a simple time-table. For every station it lists every bus that stops there and the time (I stored the time as integer value - 14:34 is 1434, to make it faster for comparing).

I used a bi-directional breadth first search algorithm. The neighbor stations (directly accessible) are retrieved for the start station and destination station. There is a path from station A to station X if these two "graphs" overlap on a station. For example, from station A you can get to stations B,C,D,E (by using specific buses). And from the destination station X you can get directly to N,C,Z. These two graphs overlap on station C. So a path A -> C -> X exists. If no paths are found in this first iteration, then the algorithm continues and spreads out the graphs (BFS) again.

Time is not taken into account in the first step - this makes it fast enough. You only get a list of possible paths with a list of buses that you have to use to take these paths. The times are evaluated in the last step, you go through the list of possible paths and check if the bus travels within the specific time (increasing the time every stop).

On a small city, with 250 stations and 100+ buses/railways, these approach works up to 3 changes (where you have to change the buses on the journey). It takes only seconds to calculate. But I had to load the whole database into memory (Dictionary) to speed up the queries, which were taking too long.

I don't think this would work for a large network though. But it works for a public transport of a single small-medium size city.

Daniel Novak
  • 2,746
  • 3
  • 28
  • 37
1

Conceptually, you take the same basic algorithm for evaluating distance between A and B, but instead of distance, you should be evaluating time. Dijkstra can do both, if you give it the proper inputs.

You're used to seeing a map as a measure of distance. However, the same map can be a measure of time as well; all you need is to add data about average speed, and the time it takes to cover a particular distance of a particular road will shake itself out. You can even visualize the map in terms of time; routes that take longer will be longer. Dijkstra doesn't care which it's evaluating, really; it just cares about finding the continuous route with the lowest number, and whether that number represents length or time is immaterial.

To incorporate speed, naive algorithms simply use the daytime speed limit and assume you never have to stop while going from A to B; more advanced algorithms can incorporate information about time of day and traffic patterns (which will impact the average speed you travel on that road at that time), and whether a road is a freeway or surface street (and thus make educated guesses about time spent stopped at an intersection). What you use depends on what you have available, but a basic 4- or 5-layer time of day dimension should be adequate for all but the absolute most time-critical applications. For each direction of each road in your map, you need the average speed during morning rush, daytime, evening rush and night, possibly with lunchtime numbers as well. Once you have that, it's a relatively basic change to a Dijkstra algorithm to pass in a time of day and have it evaluate routes based on time.

KeithS
  • 70,210
  • 21
  • 112
  • 164
  • The problem with Dijkstra's algo for this application is that the routes times are variable in the following way: if you have a route from A to B to C, you have to wait at B for your transfer. That waiting time will depend on the rest of the schedule. Then, the route from B to C will in turn depend on which transfer you take, because not all transfers will go directly from B to C. – Pete Sep 02 '10 at 16:06
  • 1
    That's basically the problem I am facing, the path cost (transport time in my case) changes with time. You can take a path from A to B and it takes 10 minutes. Now from B to C, the paths will depend on the current-time + travel-time. At this point, I am only trying to plan the programming in forward, but it seems too complicated. I tried to google everything, but I didn't find an algorithm that would work with path costs that change according to a timetable. Thank you for your help. – Daniel Novak Sep 02 '10 at 16:32
  • Edit: I found something maybe valuable about Dijsktra + timetable here: http://blog.eldslott.org/tag/dijkstra/ – Daniel Novak Sep 02 '10 at 16:47
  • @Daniel_sk will ur program (1) prepare a plan for me that I can print out and take with me. In this case, I want robustness to failure. (Even if bus 1 is 3 minutes late to B, I still want to follow the original plan.) Or will ur program (2) be an interactive guide (e.g. on my mobile), telling me in real-time which bus to get on, which stop to take, etc - where the plan is updated based on current conditions and robustness is less necessary. – emory Sep 03 '10 at 01:16
  • I will try to make it work on .NET CF framework, so it's going to be an offline mobile application (maybe porting to Java/Android later out of fun). Robustness is less needed, it just needs to work and display the results in a reasonable time. – Daniel Novak Sep 03 '10 at 10:21
  • @Daniel_sk If BUS101 has time table A 8:30, B 8:59, C 9:30; and BUS102 has time table D 8:30, B 9:00, and C 9:29, then the optimal route from A to C would be take 101 from A-B, and 102 from B-C. But if 101 is late or 102 is early, then I am screwed. I would rather just take 101 from A-C and not worry about the xfer. – emory Sep 05 '10 at 08:05
  • @Pete The delay from transfers can be modeled as being edges. @Daniel_sk instead of using constant weight for the edges, have you tried calling a function which does that (time x distance x whatever) cost calculation? Or you could weight your edges as having a factor of those values, then A* would minimize that factor. – Trinidad Nov 08 '10 at 14:20
0

If you are interested in time information, then why not label the distance values on the graph edges using the time information instead of their physical distance from each other. That way you will search for the quickest route, instead of the shortest route. You can then use Dijkstra/A* in order to compute the your result.

I'm a little unclear on what you mean by time dependent. If you mean that that you need to answer queries of the form 'goes from x to y before 10am' then you can calculate what bus routes arrive before 10am, seems like a simple filter on the data. Then apply Dijkstra/A* to the data.

Richard Warburton
  • 1,462
  • 1
  • 9
  • 13
0

Try this as your data model.

Bus 1 goes to stops A, B and C. Bus 2 goes to stops B, D and E.

I would store a unique node based on both bus and stop, with distances to nodes being based on time. We would have node A1, B1, C1, B2, D2 and E2. In the special case of transfers apply the wait for the next bus as the distance between nodes. For example, if Bus 1 arrives at stop B 30 minutes before Bus 2, travel time from B1 to B2 is 30 minutes.

You should than be able to apply Dijkstra's Algorithm.

Philip Tinney
  • 1,986
  • 17
  • 19