19. Calculating Fares**

In order to calculate a fare for a trip in GTFS, you must use data from fare_attributes.txt and fare_rules.txt. It is relatively straightforward to calculate the cost of a single trip (that is, boarding a vehicle, traveling for a number of stops, then disembarking), but it becomes much more complicated when you need to take into account multiple trip segments (that is, one or more transfers).

Note: As it stands, many feed providers do not include fare information. This is because many systems have a unique set of rules that cannot be modelled with the current structure of fares in GTFS. Additionally, it is not possible to determine different classes of pricing (such as having a separate price for adults and children). For the purposes of this chapter, these limitations are ignored.

For more discussion on how fares work in GTFS, refer to Fare Definitions (fare_attributes.txt & fare_rules.txt).

This chapter first shows you how to calculate fares for a single trip, then how to account for transfers and for multiple trips.

Calculating a Single Trip Fare

Much of the logic used when calculating fares requires knowledge of the zones used in a trip.

Note: A zone is a physical area within a transit system that contains a series of stops. They are used to group trip pricing into key areas in a system. Some transit systems do not work like this (for instance, they may measure the physical distance travelled rather than specific stops), which is one reason why the GTFS fares model does not work in all cases.

A zone is defined in GTFS by the zone_id column in stops.txt. A single stop can only belong to one zone.

Fares are matched to a trip using a combination of any of the following:

  • The route of the trip
  • The zone of the stop the passenger boards from
  • The zone of the stop the passenger disembarks
  • The zone(s) of any stops on the trip that are passed while the passenger is on board.

Consider the following simplified data set that may appear in stops.txt and stop_times.txt. Assume for this example that the trip T1 belongs to a route with an ID of R1.

stop_id,zone_id
S1,Z1
S2,Z1
S3,Z2
S4,Z3

trip_id,stop_id,stop_sequence
T1,S1,1
T1,S2,2
T1,S3,3
T1,S4,4

If a passenger travels from stop S1 to stop S4, then their starting zone is Z1, their finishing zone is Z3, and the zones they pass through are Z1, Z2 and Z3.

Note: When calculating fares, the start and finish zones are also included in the zones passed through, so in this example you Z3 is also considered as a zone that the trip passes through.

Using this data, you can now calculate matching fares. To do so, you need to find all fares that match either of the following:

  • Fares that have no associated rules.
  • Fares that have rules that match the specified trip. If a fare defines multiple zones that must be passed through (using contains_id), then all zones must be matched.

If multiple fares qualify for a trip, then the cheapest fare is the one to use.

Finding Fares With No Rules

This is the simplest use-case for fares. You can find all matching fares with the following SQL.

SELECT * FROM fare_attributes WHERE fare_id NOT IN (
  SELECT DISTINCT fare_id FROM fare_rules
);

If a feed only has fare_attributes.txt records with no rules, then the difference between the fares is in the transfer rules. This section only covers calculating fares for a single trip with no transfers, so for now you can just select the cheapest fare using the following SQL.

SELECT * FROM fare_attributes WHERE fare_id NOT IN (
    SELECT DISTINCT fare_id FROM fare_rules
  )
  ORDER BY price + 0 LIMIT 1;

Note: You still need to check for fares with one or more rules in order to find the cheapest price. Also, 0 is added in this query in order to cast a string to a number. When you roll your own importer you should instead import this as a numerical value.

Finding Fares With Matched Rules

Next you must check against specific rules for a fare. In order to do this, you need the starting zone, finishing zone, and all zones passed through (including the start and finish zones).

Referring back to the previous example, if a trip starts at Z1, passes through Z2 and finishes at Z3, you can find fare candidates (that is, trips that may match), using the following SQL query.

SELECT * FROM fare_attributes WHERE fare_id IN (
    SELECT fare_id FROM fare_rules
      WHERE (LENGTH(route_id) = 0 OR route_id = 'R1')
      AND (LENGTH(origin_id) = 0 OR origin_id = 'Z1')
      AND (LENGTH(destination_id) = 0 OR destination_id = 'Z3')
      AND (LENGTH(contains_id) = 0 OR contains_id IN ('Z1', 'Z2', 'Z3')
    )
  );

This returns a list of fares that may qualify for the given trip. As some fares have multiple rules, all must be checked. The algorithm to represent this is as follows.

fares = [ result from above query ]

qualifyingFares = [ ]

for (fare in fares) {
  if (qualifies(fare))
    qualifyingFares.add(fare)
}

allFares = qualifyFares + faresWithNoRules

passengerFare = cheapest(allFares)

As shown on the final two lines, once you have the list of qualifying fares, you can combine these with fares that have no rules (from the previous section) and then determine the cheapest fare.

First though, you must determine if a fare with rules qualifies for the given trip. If a fare specifies zones that must be passed through, then all rules must be matched.

Note: If a particular rule specifies a different route, start, or finish than the one you are checking, you do not need to ensure the contains_id matches, since this rule no longer applies. You still need to check the other rules for this fare.

The algorithm needs to build up a list of zone IDs from the fare rules in order to check against the trip. Once this has been done, you need to check that every zone ID collected from the rules is contained in the trip's list of zones.

qualifies(fare, routeId, originId, destinationId, containsIds) {

  fareContains = [ ]
  
  for (rule in fare.rules) {
    if (rule.contains.length == 0)
      continue
      
    if (rule.route.length > 0 AND rule.route != routeId)
      continue
      
    if (rule.origin.length > 0 AND rule.origin != originId)
      continue
      
    if (rule.desination.length > 0 AND rule.destination != destinationId)
      continue
      
    fareContains.add(rule.containsId);
 }
 
 if (fareContains.size == 0)
   return YES
   
 if (containIds HAS EVERY ELEMENT IN fareContains)
   return YES
 else 
   return NO
}

This algorithm achieves the following:

  • Only rules that have a value for contains_id are relevant. Rules that do not have this value fall through and should be considered as qualified.
  • If the route is specified but not equal to the one being checked, it is safe to ignore the rule's contains_id. If the route is empty or equal, the loop iteration can continue.
  • Check for the origin_id and destination_id in the same manner as route_id.
  • If the route, origin and destination all qualify then store the contains_id so it can be checked after the loop.

The algorithm returns yes if the fare qualifies, meaning you can save it as a qualifying fare. You can then return the cheapest qualifying fare to the user.

Calculating Trips With Transfers

Once you introduce transfers, fare calculation becomes more complicated. A "trip with a transfer" is considered to be a trip where the passenger boards a vehicle, disembarks, and then gets on another vehicle. For example:

  • Travel on trip T1 from Stop S1 to Stop S2
  • Walk from Stop S2 to Stop S3
  • Travel on trip T2 from Stop S3 to Stop S4.

In order to calculate the total fare for a trip with transfers, the following algorithm is used:

  1. Retrieve list of qualifying fares for each trip individually
  2. Create a list of every fare combination possible
  3. Loop over all combinations and find the total cost
  4. Return the lowest cost from Step 3.

Step 1 was covered in Calculating a Single Trip Fare, but you must skip the final step of finding the cheapest fare. This is because the cheapest fare may change depending on subsequent transfers. Instead, this step is performed once the cheapest combination is determined.

To demonstrate Step 2, consider the following example:

  • The trip on T1 from S1 to S2 yields the following qualifying fares: F1, F2.
  • The subsequent trip on T2 from S3 to S4 yields the following qualifying fares: F3, F4.

Generating every combination of these fares yields the following possibilities:

  • F1 + F3
  • F1 + F4
  • F2 + F3
  • F2 + F4.

Step 3 can now be performed, which involves finding the total cost for each combinations. As you need to take into account the possibility of timed transfers (according to the data stored in fare_attributes.txt), you also need to know about the times of these trips.

The following algorithm can be used to calculate the total cost using transfer rules. In this example, you would call this function once for each fare combination.

function totalCost(fares) {
  total = 0
 
  for (fare in fares) {
    freeTransfer = NO
 
    if (previousFare ALLOWS TRANSFERS) {
      if (HAS ENOUGH TRANSFERS REMAINING) {
        if (TRANSFER NOT EXPIRED) {
          freeTransfer = YES
        }
      }
    }
 
    if (!freeTransfer)
      total = total + fare.price;
     
    previousFare = fare;
  }
 
  return total;
}

Once all combinations have called the totalCost algorithm, you will have a price for each trip. You can then return the lowest price as the final price for the trip.