aboutsummaryrefslogtreecommitdiff
path: root/gtfs-book/ch-19-calculating-fares.md
blob: df24dd47d6a465557aa6114a77ce5d25a7e4320b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
## 19. Calculating Fares**

In order to calculate a fare for a trip in GTFS, you must use data from
`fare_attributes.txt` and `fare_rules.txt`. It is relatively
straightforward to calculate the cost of a single trip (that is,
boarding a vehicle, traveling for a number of stops, then disembarking),
but it becomes much more complicated when you need to take into account
multiple trip segments (that is, one or more transfers).

***Note:** As it stands, many feed providers do not include fare
information. This is because many systems have a unique set of rules
that cannot be modelled with the current structure of fares in GTFS.
Additionally, it is not possible to determine different classes of
pricing (such as having a separate price for adults and children). For
the purposes of this chapter, these limitations are ignored.*

For more discussion on how fares work in GTFS, refer to *Fare
Definitions (`fare_attributes.txt` & `fare_rules.txt`)*.

This chapter first shows you how to calculate fares for a single trip,
then how to account for transfers and for multiple trips.

### Calculating a Single Trip Fare

Much of the logic used when calculating fares requires knowledge of the
zones used in a trip.

***Note:** A zone is a physical area within a transit system that
contains a series of stops. They are used to group trip pricing into key
areas in a system. Some transit systems do not work like this (for
instance, they may measure the physical distance travelled rather than
specific stops), which is one reason why the GTFS fares model does not
work in all cases.*

A zone is defined in GTFS by the `zone_id` column in `stops.txt`. A
single stop can only belong to one zone.

Fares are matched to a trip using a combination of any of the following:

* The route of the trip
* The zone of the stop the passenger boards from
* The zone of the stop the passenger disembarks
* The zone(s) of any stops on the trip that are passed while the
  passenger is on board.

Consider the following simplified data set that may appear in
`stops.txt` and `stop_times.txt`. Assume for this example that the
trip `T1` belongs to a route with an ID of `R1`.

```
stop_id,zone_id
S1,Z1
S2,Z1
S3,Z2
S4,Z3

trip_id,stop_id,stop_sequence
T1,S1,1
T1,S2,2
T1,S3,3
T1,S4,4
```

If a passenger travels from stop S1 to stop S4, then their starting zone
is Z1, their finishing zone is Z3, and the zones they pass through are
Z1, Z2 and Z3.

***Note:** When calculating fares, the start and finish zones are also
included in the zones passed through, so in this example you Z3 is also
considered as a zone that the trip passes through.*

Using this data, you can now calculate matching fares. To do so, you
need to find all fares that match either of the following:

* Fares that have no associated rules.
* Fares that have rules that match the specified trip. If a fare
  defines multiple zones that must be passed through (using
  `contains_id`), then all zones must be matched.

If multiple fares qualify for a trip, then the cheapest fare is the one
to use.

### Finding Fares With No Rules

This is the simplest use-case for fares. You can find all matching fares
with the following SQL.

```sql
SELECT * FROM fare_attributes WHERE fare_id NOT IN (
  SELECT DISTINCT fare_id FROM fare_rules
);
```

If a feed only has `fare_attributes.txt` records with no rules, then
the difference between the fares is in the transfer rules. This section
only covers calculating fares for a single trip with no transfers, so
for now you can just select the cheapest fare using the following SQL.

```sql
SELECT * FROM fare_attributes WHERE fare_id NOT IN (
    SELECT DISTINCT fare_id FROM fare_rules
  )
  ORDER BY price + 0 LIMIT 1;
```

***Note:** You still need to check for fares with one or more rules in
order to find the cheapest price. Also, `0` is added in this query in
order to cast a string to a number. When you roll your own importer you
should instead import this as a numerical value.*

### Finding Fares With Matched Rules

Next you must check against specific rules for a fare. In order to do
this, you need the starting zone, finishing zone, and all zones passed
through (including the start and finish zones).

Referring back to the previous example, if a trip starts at `Z1`,
passes through `Z2` and finishes at `Z3`, you can find fare
candidates (that is, trips that *may* match), using the following SQL
query.

```sql
SELECT * FROM fare_attributes WHERE fare_id IN (
    SELECT fare_id FROM fare_rules
      WHERE (LENGTH(route_id) = 0 OR route_id = 'R1')
      AND (LENGTH(origin_id) = 0 OR origin_id = 'Z1')
      AND (LENGTH(destination_id) = 0 OR destination_id = 'Z3')
      AND (LENGTH(contains_id) = 0 OR contains_id IN ('Z1', 'Z2', 'Z3')
    )
  );
```

This returns a list of fares that may qualify for the given trip. As
some fares have multiple rules, all must be checked. The algorithm to
represent this is as follows.

```
fares = [ result from above query ]

qualifyingFares = [ ]

for (fare in fares) {
  if (qualifies(fare))
    qualifyingFares.add(fare)
}

allFares = qualifyFares + faresWithNoRules

passengerFare = cheapest(allFares)
```

As shown on the final two lines, once you have the list of qualifying
fares, you can combine these with fares that have no rules (from the
previous section) and then determine the cheapest fare.

First though, you must determine if a fare with rules qualifies for the
given trip. If a fare specifies zones that must be passed through, then
all rules must be matched.

***Note:** If a particular rule specifies a different route, start, or
finish than the one you are checking, you do not need to ensure the
`contains_id` matches, since this rule no longer applies. You still need
to check the other rules for this fare.*

The algorithm needs to build up a list of zone IDs from the fare rules
in order to check against the trip. Once this has been done, you need to
check that every zone ID collected from the rules is contained in the
trip's list of zones.

```
qualifies(fare, routeId, originId, destinationId, containsIds) {

  fareContains = [ ]
  
  for (rule in fare.rules) {
    if (rule.contains.length == 0)
      continue
      
    if (rule.route.length > 0 AND rule.route != routeId)
      continue
      
    if (rule.origin.length > 0 AND rule.origin != originId)
      continue
      
    if (rule.desination.length > 0 AND rule.destination != destinationId)
      continue
      
    fareContains.add(rule.containsId);
 }
 
 if (fareContains.size == 0)
   return YES
   
 if (containIds HAS EVERY ELEMENT IN fareContains)
   return YES
 else 
   return NO
}
```

This algorithm achieves the following:

* Only rules that have a value for `contains_id` are relevant. Rules
  that do not have this value fall through and should be considered as
  qualified.
* If the route is specified but not equal to the one being checked, it
  is safe to ignore the rule's `contains_id`. If the route is empty
  or equal, the loop iteration can continue.
* Check for the `origin_id` and `destination_id` in the same
  manner as `route_id`.
* If the route, origin and destination all qualify then store the
  `contains_id` so it can be checked after the loop.

The algorithm returns *yes* if the fare qualifies, meaning you can save
it as a qualifying fare. You can then return the cheapest qualifying
fare to the user.

### Calculating Trips With Transfers

Once you introduce transfers, fare calculation becomes more complicated.
A "trip with a transfer" is considered to be a trip where the passenger
boards a vehicle, disembarks, and then gets on another vehicle. For
example:

* Travel on trip T1 from Stop S1 to Stop S2
* Walk from Stop S2 to Stop S3
* Travel on trip T2 from Stop S3 to Stop S4.

In order to calculate the total fare for a trip with transfers, the
following algorithm is used:

1. Retrieve list of qualifying fares for each trip individually
2. Create a list of every fare combination possible
3. Loop over all combinations and find the total cost
4. Return the lowest cost from Step 3.

Step 1 was covered in *Calculating a Single Trip Fare*, but
you must skip the final step of finding the cheapest fare. This is
because the cheapest fare may change depending on subsequent transfers.
Instead, this step is performed once the cheapest *combination* is
determined.

To demonstrate Step 2, consider the following example:

* The trip on T1 from S1 to S2 yields the following qualifying fares:
  F1, F2.
* The subsequent trip on T2 from S3 to S4 yields the following
  qualifying fares: F3, F4.

Generating every combination of these fares yields the following
possibilities:

* F1 + F3
* F1 + F4
* F2 + F3
* F2 + F4.

Step 3 can now be performed, which involves finding the total cost for
each combinations. As you need to take into account the possibility of
timed transfers (according to the data stored in
`fare_attributes.txt`), you also need to know about the times of these
trips.

The following algorithm can be used to calculate the total cost using
transfer rules. In this example, you would call this function once for
each fare combination.

```
function totalCost(fares) {
  total = 0
 
  for (fare in fares) {
    freeTransfer = NO
 
    if (previousFare ALLOWS TRANSFERS) {
      if (HAS ENOUGH TRANSFERS REMAINING) {
        if (TRANSFER NOT EXPIRED) {
          freeTransfer = YES
        }
      }
    }
 
    if (!freeTransfer)
      total = total + fare.price;
     
    previousFare = fare;
  }
 
  return total;
}
```

Once all combinations have called the `totalCost` algorithm, you will
have a price for each trip. You can then return the lowest price as the
final price for the trip.