diff options
Diffstat (limited to 'gtfs-book/ch-19-calculating-fares.md')
-rw-r--r-- | gtfs-book/ch-19-calculating-fares.md | 296 |
1 files changed, 296 insertions, 0 deletions
diff --git a/gtfs-book/ch-19-calculating-fares.md b/gtfs-book/ch-19-calculating-fares.md new file mode 100644 index 0000000..df24dd4 --- /dev/null +++ b/gtfs-book/ch-19-calculating-fares.md @@ -0,0 +1,296 @@ +## 19. Calculating Fares** + +In order to calculate a fare for a trip in GTFS, you must use data from +`fare_attributes.txt` and `fare_rules.txt`. It is relatively +straightforward to calculate the cost of a single trip (that is, +boarding a vehicle, traveling for a number of stops, then disembarking), +but it becomes much more complicated when you need to take into account +multiple trip segments (that is, one or more transfers). + +***Note:** As it stands, many feed providers do not include fare +information. This is because many systems have a unique set of rules +that cannot be modelled with the current structure of fares in GTFS. +Additionally, it is not possible to determine different classes of +pricing (such as having a separate price for adults and children). For +the purposes of this chapter, these limitations are ignored.* + +For more discussion on how fares work in GTFS, refer to *Fare +Definitions (`fare_attributes.txt` & `fare_rules.txt`)*. + +This chapter first shows you how to calculate fares for a single trip, +then how to account for transfers and for multiple trips. + +### Calculating a Single Trip Fare + +Much of the logic used when calculating fares requires knowledge of the +zones used in a trip. + +***Note:** A zone is a physical area within a transit system that +contains a series of stops. They are used to group trip pricing into key +areas in a system. Some transit systems do not work like this (for +instance, they may measure the physical distance travelled rather than +specific stops), which is one reason why the GTFS fares model does not +work in all cases.* + +A zone is defined in GTFS by the `zone_id` column in `stops.txt`. A +single stop can only belong to one zone. + +Fares are matched to a trip using a combination of any of the following: + +* The route of the trip +* The zone of the stop the passenger boards from +* The zone of the stop the passenger disembarks +* The zone(s) of any stops on the trip that are passed while the + passenger is on board. + +Consider the following simplified data set that may appear in +`stops.txt` and `stop_times.txt`. Assume for this example that the +trip `T1` belongs to a route with an ID of `R1`. + +``` +stop_id,zone_id +S1,Z1 +S2,Z1 +S3,Z2 +S4,Z3 + +trip_id,stop_id,stop_sequence +T1,S1,1 +T1,S2,2 +T1,S3,3 +T1,S4,4 +``` + +If a passenger travels from stop S1 to stop S4, then their starting zone +is Z1, their finishing zone is Z3, and the zones they pass through are +Z1, Z2 and Z3. + +***Note:** When calculating fares, the start and finish zones are also +included in the zones passed through, so in this example you Z3 is also +considered as a zone that the trip passes through.* + +Using this data, you can now calculate matching fares. To do so, you +need to find all fares that match either of the following: + +* Fares that have no associated rules. +* Fares that have rules that match the specified trip. If a fare + defines multiple zones that must be passed through (using + `contains_id`), then all zones must be matched. + +If multiple fares qualify for a trip, then the cheapest fare is the one +to use. + +### Finding Fares With No Rules + +This is the simplest use-case for fares. You can find all matching fares +with the following SQL. + +```sql +SELECT * FROM fare_attributes WHERE fare_id NOT IN ( + SELECT DISTINCT fare_id FROM fare_rules +); +``` + +If a feed only has `fare_attributes.txt` records with no rules, then +the difference between the fares is in the transfer rules. This section +only covers calculating fares for a single trip with no transfers, so +for now you can just select the cheapest fare using the following SQL. + +```sql +SELECT * FROM fare_attributes WHERE fare_id NOT IN ( + SELECT DISTINCT fare_id FROM fare_rules + ) + ORDER BY price + 0 LIMIT 1; +``` + +***Note:** You still need to check for fares with one or more rules in +order to find the cheapest price. Also, `0` is added in this query in +order to cast a string to a number. When you roll your own importer you +should instead import this as a numerical value.* + +### Finding Fares With Matched Rules + +Next you must check against specific rules for a fare. In order to do +this, you need the starting zone, finishing zone, and all zones passed +through (including the start and finish zones). + +Referring back to the previous example, if a trip starts at `Z1`, +passes through `Z2` and finishes at `Z3`, you can find fare +candidates (that is, trips that *may* match), using the following SQL +query. + +```sql +SELECT * FROM fare_attributes WHERE fare_id IN ( + SELECT fare_id FROM fare_rules + WHERE (LENGTH(route_id) = 0 OR route_id = 'R1') + AND (LENGTH(origin_id) = 0 OR origin_id = 'Z1') + AND (LENGTH(destination_id) = 0 OR destination_id = 'Z3') + AND (LENGTH(contains_id) = 0 OR contains_id IN ('Z1', 'Z2', 'Z3') + ) + ); +``` + +This returns a list of fares that may qualify for the given trip. As +some fares have multiple rules, all must be checked. The algorithm to +represent this is as follows. + +``` +fares = [ result from above query ] + +qualifyingFares = [ ] + +for (fare in fares) { + if (qualifies(fare)) + qualifyingFares.add(fare) +} + +allFares = qualifyFares + faresWithNoRules + +passengerFare = cheapest(allFares) +``` + +As shown on the final two lines, once you have the list of qualifying +fares, you can combine these with fares that have no rules (from the +previous section) and then determine the cheapest fare. + +First though, you must determine if a fare with rules qualifies for the +given trip. If a fare specifies zones that must be passed through, then +all rules must be matched. + +***Note:** If a particular rule specifies a different route, start, or +finish than the one you are checking, you do not need to ensure the +`contains_id` matches, since this rule no longer applies. You still need +to check the other rules for this fare.* + +The algorithm needs to build up a list of zone IDs from the fare rules +in order to check against the trip. Once this has been done, you need to +check that every zone ID collected from the rules is contained in the +trip's list of zones. + +``` +qualifies(fare, routeId, originId, destinationId, containsIds) { + + fareContains = [ ] + + for (rule in fare.rules) { + if (rule.contains.length == 0) + continue + + if (rule.route.length > 0 AND rule.route != routeId) + continue + + if (rule.origin.length > 0 AND rule.origin != originId) + continue + + if (rule.desination.length > 0 AND rule.destination != destinationId) + continue + + fareContains.add(rule.containsId); + } + + if (fareContains.size == 0) + return YES + + if (containIds HAS EVERY ELEMENT IN fareContains) + return YES + else + return NO +} +``` + +This algorithm achieves the following: + +* Only rules that have a value for `contains_id` are relevant. Rules + that do not have this value fall through and should be considered as + qualified. +* If the route is specified but not equal to the one being checked, it + is safe to ignore the rule's `contains_id`. If the route is empty + or equal, the loop iteration can continue. +* Check for the `origin_id` and `destination_id` in the same + manner as `route_id`. +* If the route, origin and destination all qualify then store the + `contains_id` so it can be checked after the loop. + +The algorithm returns *yes* if the fare qualifies, meaning you can save +it as a qualifying fare. You can then return the cheapest qualifying +fare to the user. + +### Calculating Trips With Transfers + +Once you introduce transfers, fare calculation becomes more complicated. +A "trip with a transfer" is considered to be a trip where the passenger +boards a vehicle, disembarks, and then gets on another vehicle. For +example: + +* Travel on trip T1 from Stop S1 to Stop S2 +* Walk from Stop S2 to Stop S3 +* Travel on trip T2 from Stop S3 to Stop S4. + +In order to calculate the total fare for a trip with transfers, the +following algorithm is used: + +1. Retrieve list of qualifying fares for each trip individually +2. Create a list of every fare combination possible +3. Loop over all combinations and find the total cost +4. Return the lowest cost from Step 3. + +Step 1 was covered in *Calculating a Single Trip Fare*, but +you must skip the final step of finding the cheapest fare. This is +because the cheapest fare may change depending on subsequent transfers. +Instead, this step is performed once the cheapest *combination* is +determined. + +To demonstrate Step 2, consider the following example: + +* The trip on T1 from S1 to S2 yields the following qualifying fares: + F1, F2. +* The subsequent trip on T2 from S3 to S4 yields the following + qualifying fares: F3, F4. + +Generating every combination of these fares yields the following +possibilities: + +* F1 + F3 +* F1 + F4 +* F2 + F3 +* F2 + F4. + +Step 3 can now be performed, which involves finding the total cost for +each combinations. As you need to take into account the possibility of +timed transfers (according to the data stored in +`fare_attributes.txt`), you also need to know about the times of these +trips. + +The following algorithm can be used to calculate the total cost using +transfer rules. In this example, you would call this function once for +each fare combination. + +``` +function totalCost(fares) { + total = 0 + + for (fare in fares) { + freeTransfer = NO + + if (previousFare ALLOWS TRANSFERS) { + if (HAS ENOUGH TRANSFERS REMAINING) { + if (TRANSFER NOT EXPIRED) { + freeTransfer = YES + } + } + } + + if (!freeTransfer) + total = total + fare.price; + + previousFare = fare; + } + + return total; +} +``` + +Once all combinations have called the `totalCost` algorithm, you will +have a price for each trip. You can then return the lowest price as the +final price for the trip. + |