Diffstat (limited to '')
-rw-r--r-- .gitignore 19
-rw-r--r-- .idea/.gitignore 8
-rw-r--r-- .idea/GTFS-books.iml 9
-rw-r--r-- .idea/gtfs-book/ch-00-definitive-guide-to-gtfs.md 55
-rw-r--r-- .idea/gtfs-book/ch-01-introduction.md 45
-rw-r--r-- .idea/gtfs-book/ch-02-agencies.md 47
-rw-r--r-- .idea/gtfs-book/ch-03-stops.md 117
-rw-r--r-- .idea/gtfs-book/ch-04-routes.md 82
-rw-r--r-- .idea/gtfs-book/ch-05-trips.md 222
-rw-r--r-- .idea/gtfs-book/ch-06-stop-times.md 160
-rw-r--r-- .idea/gtfs-book/ch-07-calendar.md 138
-rw-r--r-- .idea/gtfs-book/ch-08-fares.md 115
-rw-r--r-- .idea/gtfs-book/ch-09-shapes.md 76
-rw-r--r-- .idea/gtfs-book/ch-10-frequencies.md 98
-rw-r--r-- .idea/gtfs-book/ch-11-stop-transfers.md 65
-rw-r--r-- .idea/gtfs-book/ch-12-feed-information.md 36
-rw-r--r-- .idea/gtfs-book/ch-13-importing-to-sql.md 125
-rw-r--r-- .idea/gtfs-book/ch-14-switching-to-integer-ids.md 111
-rw-r--r-- .idea/gtfs-book/ch-15-optimizing-shapes.md 158
-rw-r--r-- .idea/gtfs-book/ch-16-deleting-unused-data.md 106
-rw-r--r-- .idea/gtfs-book/ch-17-searching-for-trips.md 346
-rw-r--r-- .idea/gtfs-book/ch-18-working-with-trip-blocks.md 109
-rw-r--r-- .idea/gtfs-book/ch-19-calculating-fares.md 296
-rw-r--r-- .idea/gtfs-book/ch-20-trip-patterns.md 141
-rw-r--r-- .idea/gtfs-book/ch-21-conclusion.md 23
-rw-r--r-- .idea/misc.xml 6
-rw-r--r-- README.md 12
27 files changed, 2706 insertions, 19 deletions
diff --git a/.gitignore b/.gitignore
index 5d947ca..e43b0f9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,18 +1 @@
-# Build and Release Folders
-bin-debug/
-bin-release/
-[Oo]bj/
-[Bb]in/
-
-# Other files and folders
-.settings/
-
-# Executables
-*.swf
-*.air
-*.ipa
-*.apk
-
-# Project files, i.e. `.project`, `.actionScriptProperties` and `.flexProperties`
-# should NOT be excluded as they contain compiler settings and other important
-# information for Eclipse / Flash Builder.
+.DS_Store
diff --git a/.idea/.gitignore b/.idea/.gitignore
new file mode 100644
index 0000000..73f69e0
--- /dev/null
+++ b/.idea/.gitignore
@@ -0,0 +1,8 @@
+# Default ignored files
+/shelf/
+/workspace.xml
+# Datasource local storage ignored files
+/dataSources/
+/dataSources.local.xml
+# Editor-based HTTP Client requests
+/httpRequests/
diff --git a/.idea/GTFS-books.iml b/.idea/GTFS-books.iml
new file mode 100644
index 0000000..d6ebd48
--- /dev/null
+++ b/.idea/GTFS-books.iml
@@ -0,0 +1,9 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<module type="JAVA_MODULE" version="4">
+ <component name="NewModuleRootManager" inherit-compiler-output="true">
+ <exclude-output />
+ <content url="file://$MODULE_DIR$" />
+ <orderEntry type="inheritedJdk" />
+ <orderEntry type="sourceFolder" forTests="false" />
+ </component>
+</module>
\ No newline at end of file
diff --git a/.idea/gtfs-book/ch-00-definitive-guide-to-gtfs.md b/.idea/gtfs-book/ch-00-definitive-guide-to-gtfs.md
new file mode 100644
index 0000000..2c32c67
--- /dev/null
+++ b/.idea/gtfs-book/ch-00-definitive-guide-to-gtfs.md
@@ -0,0 +1,55 @@
+# The Definitive Guide to GTFS
+
+*Consuming open public transportation data with the General Transit Feed
+Specification*
+
+Originally written by Quentin Zervaas.
+
+## About This Book
+
+This book is a comprehensive guide to GTFS -- the General Transit Feed
+Specification. It consists of two main sections.
+
+The first section describes what GTFS is and provides details about the
+specification itself. It also provides discussion points and things to
+consider for each of the files in the specification.
+
+The second section covers a number of topics that relate to actually
+using GTFS feeds, such as how to calculate fares, how to search for
+trips, how to optimize feed data and more.
+
+This book is written for developers who use transit data in web sites,
+mobile applications and more. It aims to be as language-agnostic as
+possible, but uses SQL to demonstrate concepts of extracting data
+from a GTFS feed.
+
+### About The Author
+
+Quentin Zervaas is a software developer from Adelaide, Australia.
+
+Quentin was the founder of TransitFeeds (now <OpenMobilityData.org>), a web
+site that provides a comprehensive listing of public transportation data
+available around the world. This site is referenced various times
+throughout this book.
+
+### Credits
+
+First Edition. Published in February 2014.
+
+**Technical Reviewer**
+
+Rupert Hanson
+
+**Copy Editor**
+
+Miranda Little
+
+**Disclaimer**
+
+The information in this book is distributed on an "as is" basis, without
+warranty. Although every precaution has been taken in the preparation of
+this work, the author shall not be liable to any person or entity with
+respect to any loss or damage caused or alleged to be caused directly or
+indirectly by the information contained in this book.
+
diff --git a/.idea/gtfs-book/ch-01-introduction.md b/.idea/gtfs-book/ch-01-introduction.md
new file mode 100644
index 0000000..3b3765c
--- /dev/null
+++ b/.idea/gtfs-book/ch-01-introduction.md
@@ -0,0 +1,45 @@
+## 1. Introduction to GTFS
+
+GTFS (General Transit Feed Specification) is a data standard developed
+by Google to describe a public transportation system. Its original
+purpose was to enable public transit agencies to upload their schedules
+to Google Transit so that users of Google Maps could easily figure out
+which bus, train, ferry or other service to catch.
+
+> **A GTFS feed is a ZIP file that contains a series of CSV files that
+list routes, stops and trips in a public transportation system.**
+
+This book examines GTFS in detail, including which data from a public
+transportation system can be represented and how to extract that data.
+It also explores some more advanced techniques for optimizing and
+querying data.
+
+The official GTFS specification is referenced a number of times in
+this book. It is strongly recommended that you become familiar with it.
+You can view it at the following URL:
+
+<https://developers.google.com/transit/gtfs/reference>
+
+### Structure of a GTFS Feed
+
+A GTFS feed is a series of CSV files, which means that it is trivial to
+include additional files in a feed. Additionally, files required as part
+of the specification can also include additional columns. For this
+reason, feeds from different agencies generally include different levels
+of detail.
+
+***Note:** The files in a GTFS feed are CSV files, but use a file
+extension of `.txt`.*
+
+A GTFS feed can be described as follows:
+
+> **A GTFS feed has one or more routes. Each route (`routes.txt`) has one or
+more trips (`trips.txt`). Each trip visits a series of stops (`stops.txt`)
+at specified times (`stop_times.txt`). Trips and stop times only contain
+time of day information; the calendar is used to determine on which days
+a given trip runs (`calendar.txt` and `calendar_dates.txt`).**
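As a rough illustration of the structure described above, the following Python sketch reads two files out of a feed and groups trips by route. It is a minimal sketch rather than a reference implementation: the helper name `read_gtfs_table` is hypothetical, and the feed is a tiny in-memory ZIP built from values that appear in this book's samples so that the example runs without downloading anything.

```python
import csv
import io
import zipfile


def read_gtfs_table(feed_zip, filename):
    """Return the rows of one CSV file in a GTFS ZIP as a list of dicts."""
    with feed_zip.open(filename) as raw:
        # utf-8-sig drops the byte-order mark that some feeds include.
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


# Build a tiny in-memory feed (assumption: a real feed would be a
# downloaded ZIP file containing many more files and columns).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr(
        "routes.txt",
        "route_id,route_short_name,route_long_name,route_type\n"
        "1,1,Vermont,3\n",
    )
    z.writestr(
        "trips.txt",
        "route_id,service_id,trip_id\n"
        "1,W.378,4282257\n"
        "1,W.378,4282256\n",
    )

with zipfile.ZipFile(buffer) as feed:
    routes = read_gtfs_table(feed, "routes.txt")
    trips = read_gtfs_table(feed, "trips.txt")

# Each route has one or more trips: group trip IDs under their route ID.
trips_by_route = {}
for trip in trips:
    trips_by_route.setdefault(trip["route_id"], []).append(trip["trip_id"])
```

Because every value is read as a string, IDs such as `W.378` need no special handling; numeric columns have to be converted explicitly where needed.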
+
+The following chapters cover the main files that are included in all
+GTFS feeds. For each file, the main columns are covered, as well as
+optional columns that can be included. This book also covers some of the
+unofficial columns that some agencies choose to include.
+
diff --git a/.idea/gtfs-book/ch-02-agencies.md b/.idea/gtfs-book/ch-02-agencies.md
new file mode 100644
index 0000000..cb837ae
--- /dev/null
+++ b/.idea/gtfs-book/ch-02-agencies.md
@@ -0,0 +1,47 @@
+## 2. Agencies (agency.txt)
+
+*This file is ***required*** to be included in GTFS feeds.*
+
+The `agency.txt` file is used to represent the agencies that provide
+data for this feed. The file must be present even when a feed covers a
+single agency. If routes from multiple agencies are included, then
+records in `routes.txt` make reference to agencies in this file.
+
+| Field | Required? | Description |
+| :----------------------------------------------------- | :--------: | :-------- |
+| `agency_id` | Optional | An ID that uniquely identifies a single transit agency in the feed. If a feed only contains routes for a single agency then this value is optional. |
+| `agency_name` | Required | The full name of the transit agency. |
+| `agency_url` | Required | The URL of the transit agency. Must be a complete URL only, beginning with `http://` or `https://`. |
+| `agency_timezone` | Required | Time zone of agency. All times in `stop_times.txt` use this time zone, unless overridden by its corresponding stop. All agencies in a single feed must use the same time zone. Example: **America/New_York** (See <http://en.wikipedia.org/wiki/List_of_tz_database_time_zones> for more examples) |
+| `agency_lang` | Optional | Contains a two-letter ISO-639-1 code (such as `en` or `EN` for English) for the language used in this feed. |
+| `agency_phone` | Optional | A single voice telephone number for the agency that users can dial if required. |
+| `agency_fare_url` | Optional | A URL that describes fare information for the agency. Must be a complete URL only, beginning with `http://` or `https://`. |
+
+### Sample Data
+
+The following extract is taken from the GTFS feed of TriMet (Portland,
+USA), located at <https://openmobilitydata.org/p/trimet>.
+
+| `agency_name` | `agency_url` | `agency_timezone` | `agency_lang` | `agency_phone` |
+| :------------ | :------------------------------------------- | :-------------------- | :------------ | :--------------- |
+| `TriMet` | `https://trimet.org` | `America/Los_Angeles` | `en` | `(503) 238-7433` |
+
+In this example, the `agency_id` column is included, but as there is
+only a single entry the value can be empty. This means the `agency_id`
+column in `routes.txt` also is not required.
+
+### Discussion
+
+The data in this file is typically used to provide additional
+information to users of your app or web site in case schedules derived
+from the rest of this feed are not sufficient (or in the case of
+`agency_fare_url`, an easy way to provide a reference point to users
+if the fare information in the feed is not being used).
+
+If you refer to the following screenshot, taken from Google Maps, you
+can see the information from `agency.txt` represented in the
+lower-left corner as an example of how it can be used.
+
+![GTFS agency](images/agency-google-maps.png)
+
+
diff --git a/.idea/gtfs-book/ch-03-stops.md b/.idea/gtfs-book/ch-03-stops.md
new file mode 100644
index 0000000..cd8dedb
--- /dev/null
+++ b/.idea/gtfs-book/ch-03-stops.md
@@ -0,0 +1,117 @@
+## 3. Stops & Stations (stops.txt)
+
+*This file is ***required*** to be included in GTFS feeds.*
+
+The individual locations where vehicles pick up or drop off passengers
+are represented by `stops.txt`. Records in this file are referenced in
+`stop_times.txt`. A record in this file can be either a stop or a
+station. A station has one or more child stops, as indicated using the
+`parent_station` value. Entries that are marked as stations must not
+appear in `stop_times.txt`.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `stop_id` | Required | An ID to uniquely identify a stop or station. |
+| `stop_code` | Optional | A number or short string used to identify a stop to passengers. This is typically displayed at the physical stop or on printed schedules. |
+| `stop_name` | Required | The name of the stop as passengers know it. |
+| `stop_desc` | Optional | A description of the stop. If provided, this should provide additional information to the `stop_name` value. |
+| `stop_lat` | Required | The latitude of the stop (a number in the range of `-90` to `90`). |
+| `stop_lon` | Required | The longitude of the stop (a number in the range of `-180` to `180`). |
+| `zone_id` | Optional | This is an identifier used to calculate fares. A single zone ID may appear in multiple stops, but is ignored if the stop is marked as a station. |
+| `stop_url` | Optional | A URL that provides information about this stop. It should be specific to this stop and not simply link to the agency's web site. |
+| `location_type` | Optional | Indicates if a record is a stop or station. `0` or blank means a stop, `1` means a station. |
+| `parent_station` | Optional | If a record is marked as a stop and has a parent station, this contains the ID of the parent (the parent must have a `location_type` of `1`). |
+| `stop_timezone` | Optional | If a stop is located in a different time zone to the one specified in `agency.txt`, then it can be overridden here. |
+| `wheelchair_boarding` | Optional | A value of `1` indicates it is possible for passengers in wheelchairs to board or alight. A value of `2` means the stop is not wheelchair accessible, while `0` or an empty value means no information is available. If the stop has a parent station, then 0 or an empty value means to inherit from its parent. |
+
+### Sample Data
+
+The following extract is taken from the TriMet GTFS feed
+(<https://openmobilitydata.org/p/trimet>).
+
+| `stop_id` | `stop_code` | `stop_name` | `stop_lat` | `stop_lon` | `stop_url` |
+| :-------- | :----------- | :------------------ | :---------- | :------------ | :---------------------------------------------------|
+| `2` | `2` | `A Ave & Chandler` | `45.420595` | `-122.675676` | `http://trimet.org/arrivals/tracker?locationID=2` |
+| `3` | `3` | `A Ave & Second St` | `45.419386` | `-122.665341` | `http://trimet.org/arrivals/tracker?locationID=3` |
+| `4` | `4` | `A Ave & 10th St` | `45.420703` | `-122.675152` | `http://trimet.org/arrivals/tracker?locationID=4` |
+| `6` | `6` | `A Ave & 8th St` | `45.420217` | `-122.67307` | `http://trimet.org/arrivals/tracker?locationID=6` |
+
+The following diagram shows how these points look if you plot them onto
+a map.
+
+![Stops](images/stops-sample.png)
+
+In this extract, TriMet use the same value for stop IDs and stop codes.
+This is useful, because it means the stop IDs are stable (that is, they
+do not change between feed versions). This means that if you want to
+save a particular stop (for instance, if a user wants to save a
+"favorite stop") you can trust that saving the ID will get the job done.
+
+**Note:** This is not always the case though, which means you may have
+to save additional information if you want to save a stop. For instance,
+you may need to save the coordinates or the list of routes a stop serves
+so you can find it again if the stop ID has changed in a future version
+of the feed.
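One way to approach the note above is to store a stop's coordinates alongside its ID and fall back to a nearest-stop search when the ID no longer exists. The Python sketch below is only an illustration: the function names, the 25 metre threshold and the renumbered IDs `A2` and `A4` are assumptions, not part of GTFS or of any particular feed.

```python
import math


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in metres (haversine formula)."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def find_saved_stop(saved, stops, max_distance_m=25.0):
    """Find a saved stop by ID, else the closest stop within max_distance_m."""
    for stop in stops:
        if stop["stop_id"] == saved["stop_id"]:
            return stop
    best, best_distance = None, max_distance_m
    for stop in stops:
        d = distance_m(
            saved["stop_lat"], saved["stop_lon"],
            float(stop["stop_lat"]), float(stop["stop_lon"]),
        )
        if d <= best_distance:
            best, best_distance = stop, d
    return best


# The user saved TriMet stop 2 from an earlier feed version.
saved_stop = {"stop_id": "2", "stop_lat": 45.420595, "stop_lon": -122.675676}

# Hypothetical later feed in which the stop IDs were renumbered.
new_stops = [
    {"stop_id": "A2", "stop_lat": "45.420595", "stop_lon": "-122.675676"},
    {"stop_id": "A4", "stop_lat": "45.420703", "stop_lon": "-122.675152"},
]

match = find_saved_stop(saved_stop, new_stops)
```

A coordinate match alone can pick the wrong stop where several stops sit close together, which is why saving the list of routes a stop serves is a useful extra check.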
+
+### Stops & Stations
+
+Specifying an entry in this file as a *station* is typically used when
+there are many stops located within a single physical entity, such as a
+train station or bus depot. While many feeds do not offer this
+information, some large train stations may have up to 20 or 30
+platforms.
+
+Knowing the platform for a specific trip is extremely useful, but if a
+passenger wants to select a starting point for their trip, showing them
+a list of platforms may be confusing.
+
+Passenger:
+
+**"I want to travel from *Central Station* to *Airport Station*."**
+
+Web site / App:
+
+**"Board at *Central Station platform 5*, disembark at *Airport Station platform 1*."**
+
+In this example, the passenger selects the parent station, but they are
+presented with the specific stop so they know exactly where within the
+station they need to embark or disembark.
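To support this kind of interaction, an app can group child stops under their parent station when loading `stops.txt`. The following Python sketch shows one possible approach; the function name and all stop IDs in the sample rows are hypothetical.

```python
from collections import defaultdict


def platforms_by_station(stops):
    """Map each parent station ID to the IDs of its child stops."""
    children = defaultdict(list)
    for stop in stops:
        parent = stop.get("parent_station", "")
        # A blank location_type means the record is a stop, as does 0.
        if stop.get("location_type", "") in ("", "0") and parent:
            children[parent].append(stop["stop_id"])
    return dict(children)


# Hypothetical rows in the shape of stops.txt.
stops = [
    {"stop_id": "CENTRAL", "location_type": "1", "parent_station": ""},
    {"stop_id": "CENTRAL_P5", "location_type": "0", "parent_station": "CENTRAL"},
    {"stop_id": "CENTRAL_P6", "location_type": "", "parent_station": "CENTRAL"},
    {"stop_id": "AIRPORT", "location_type": "1", "parent_station": ""},
    {"stop_id": "AIRPORT_P1", "location_type": "0", "parent_station": "AIRPORT"},
    {"stop_id": "KERB_12", "location_type": "0", "parent_station": ""},
]

station_platforms = platforms_by_station(stops)
```

The passenger chooses from the station IDs (the dictionary keys), while trip searches run against the child stop IDs, since only those appear in `stop_times.txt`.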
+
+### Wheelchair Accessibility
+
+If you are showing wheelchair accessibility information, it is important
+to differentiate between "no access" and "no information", as knowing a
+stop is not accessible is as important as knowing it is.
+
+If a stop is marked as being wheelchair accessible, you must check that
+trips that visit the stop are also accessible (using the
+`wheelchair_accessible` field in `trips.txt`). If the value in
+`trips.txt` is blank, `0` or `1` then it is safe to assume the
+trip can be accessed. If the stop is accessible and the trip is not,
+then passengers in wheelchairs cannot use the trip.
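The rules in this section can be combined into a single check. The Python sketch below is one possible reading of them, not an official algorithm: the function name and the three result strings are assumptions, and inheritance from a parent station is deliberately left out.

```python
def wheelchair_status(stop_boarding, trip_accessible):
    """Combine wheelchair_boarding (stop) and wheelchair_accessible (trip).

    Returns "accessible", "not accessible" or "no information".
    """
    stop_value = (stop_boarding or "").strip() or "0"
    trip_value = (trip_accessible or "").strip() or "0"

    # An explicit "no" on either side rules the boarding out.
    if stop_value == "2" or trip_value == "2":
        return "not accessible"
    # An accessible stop with a trip value of blank, 0 or 1 is treated as
    # accessible, following the assumption described in this chapter.
    if stop_value == "1":
        return "accessible"
    # Nothing is known about the stop itself.
    return "no information"


status_known = wheelchair_status("1", "")
status_blocked = wheelchair_status("1", "2")
status_unknown = wheelchair_status("", "1")
```

Keeping "no information" separate from "not accessible" lets the user interface present the two cases differently.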
+
+### Stop Features
+
+One of the proposed changes to GTFS is the addition of a file called
+`stop_features.txt`. This is used to define characteristics about
+stops. The great thing about this file is that it allows you to indicate
+to users when a stop has a ticket machine, bike storage, lighting, or an
+electronic display with real-time information.
+
+TriMet is one of the few agencies including this file. The following is
+a sample of this file.
+
+| `stop_id` | `feature_type` |
+| :-------- | :------------- |
+| `61` | `4110` |
+| `61` | `2310` |
+| `61` | `5200` |
+
+This data indicates that stop `61` (NE Alberta & 24th) has a *Printed
+Schedule Display* (`4110`), a *Bike Rack* (`2310`) and a *Street
+Light* (`5200`).
+
+For more information about this proposal and a list of values and their
+meanings, refer to
+<https://sites.google.com/site/gtfschanges/proposals/stop-amenity>.
+
diff --git a/.idea/gtfs-book/ch-04-routes.md b/.idea/gtfs-book/ch-04-routes.md
new file mode 100644
index 0000000..2e9716e
--- /dev/null
+++ b/.idea/gtfs-book/ch-04-routes.md
@@ -0,0 +1,82 @@
+## 4. Routes (routes.txt)
+
+*This file is ***required*** to be included in GTFS feeds.*
+
+A route is a group of trips that are displayed to riders as a single
+service.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `route_id` | Required | An ID that uniquely identifies the route. |
+| `agency_id` | Optional | The ID of the agency a route belongs to, as it appears in `agency.txt`. Only required if there are multiple agencies in the feed. |
+| `route_short_name` | Required | A nickname or code to represent this service. If this is left empty then the `route_long_name` must be included. |
+| `route_long_name` | Required | The route full name. If this is left empty then the `route_short_name` must be included. |
+| `route_desc` | Optional | A description of the route, such as where and when the route operates. |
+| `route_type` | Required | The type of transportation used on a route (such as bus, train or ferry). See below for more information. |
+| `route_url` | Optional | A URL of a web page that describes this particular route. |
+| `route_color` | Optional | If applicable, a route can have a color assigned to it. This is useful for systems that use colors to identify routes. This value is a six-character hexadecimal number (for example, `FF0000` is red). |
+| `route_text_color` | Optional | For routes that specify the `route_color`, a corresponding text color should also be specified. |
+
+### Sample Data
+
+The following extract is taken from the TriMet GTFS feed
+(<https://openmobilitydata.org/p/trimet>).
+
+| `route_id` | `route_short_name` | `route_long_name` | `route_type` |
+| :--------- | :----------------- | :--------------------------- | :----------- |
+| `1` | `1` | `Vermont` | `3` |
+| `4` | `4` | `Division / Fessenden` | `3` |
+| `6` | `6` | `Martin Luther King Jr Blvd` | `3` |
+
+This sample shows three different bus routes for the greater Portland
+area. The `route_type` value of `3` indicates they are buses. See
+the next section for more information about route types in GTFS.
+
+There is no agency ID value in this feed, as TriMet is the only agency
+represented in the feed.
+
+The other thing to note about this data is that TriMet use the same
+value for both `route_id` and `route_short_name`. This is very
+useful, because it means if you have a user that wants to save
+information about a particular route you can trust the `route_id`
+value. Unfortunately, this is not the case in all GTFS feeds. Sometimes,
+the `route_id` value may change with every version of a feed (or at
+least, semi-frequently). Additionally, some feeds may also have multiple
+routes with the same `route_short_name`. This can present challenges
+when trying to save user data.
+
+### Route Types
+
+To indicate a route's mode of transport, the `route_type` column is
+used.
+
+| Value | Description |
+| :---- | :---------------- |
+| `0` | Tram / Light Rail |
+| `1` | Subway / Metro |
+| `2` | Rail |
+| `3` | Bus |
+| `4` | Ferry |
+| `5` | Cable Car |
+| `6` | Gondola |
+| `7` | Funicular |
+
+Agencies may interpret the meaning of these route types differently. For
+instance, some agencies specify their subway service as rail (value of
+`2` instead of `1`), while some specify their trains as light rail
+(`0` instead of `2`).
+
+These differences between agencies occur mainly because of the vague
+descriptions for each of these route types. If you use Google Transit to
+find directions, you may notice route types referenced that are
+different to those listed above. This is because Google Transit also
+supports additional route types. You can read more about these
+additional route types at
+<https://support.google.com/transitpartners/answer/3520902?hl=en>.
+
+Very few GTFS feeds made available to third-party developers actually
+make use of these values, but it is useful to know in case you come
+across one that does. For instance, Sydney Buses include their school
+buses with a route type of `712`, while other buses in the feed have
+route type `700`.
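When displaying a route's mode of transport, a simple lookup covers the standard values while handling anything unexpected. The Python sketch below uses the table from this chapter plus the two extended values mentioned above; the function name is hypothetical, and the extended table is intentionally incomplete.

```python
ROUTE_TYPES = {
    0: "Tram / Light Rail",
    1: "Subway / Metro",
    2: "Rail",
    3: "Bus",
    4: "Ferry",
    5: "Cable Car",
    6: "Gondola",
    7: "Funicular",
}

# Only the extended values mentioned in this chapter; a real application
# would load the full list of additional route types.
EXTENDED_ROUTE_TYPES = {
    700: "Bus",
    712: "School Bus",
}


def describe_route_type(value):
    """Return a label for a route_type value read from routes.txt."""
    try:
        code = int(value)
    except (TypeError, ValueError):
        return "Unknown"
    if code in ROUTE_TYPES:
        return ROUTE_TYPES[code]
    return EXTENDED_ROUTE_TYPES.get(code, f"Unknown ({code})")


label_standard = describe_route_type("3")
label_extended = describe_route_type("712")
label_other = describe_route_type("9999")
```

Returning a fallback label instead of raising an error keeps an app working when a feed uses a value that the app does not know about.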
+
diff --git a/.idea/gtfs-book/ch-05-trips.md b/.idea/gtfs-book/ch-05-trips.md
new file mode 100644
index 0000000..ee2b814
--- /dev/null
+++ b/.idea/gtfs-book/ch-05-trips.md
@@ -0,0 +1,222 @@
+## 5. Trips (trips.txt)
+
+*This file is ***required*** to be included in GTFS feeds.*
+
+The `trips.txt` file contains trips for each route. The specific stop
+times are specified in `stop_times.txt`, and the days each trip runs
+on are specified in `calendar.txt` and `calendar_dates.txt`.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `route_id` | Required | The ID of the route a trip belongs to as it appears in `routes.txt`. |
+| `service_id` | Required | The ID of the service as it appears in `calendar.txt` or `calendar_dates.txt`, which identifies the dates on which a trip runs. |
+| `trip_id` | Required | A unique identifier for a trip in this file. This value is referenced in `stop_times.txt` when specifying individual stop times. |
+| `trip_headsign` | Optional | The text that appears to passengers as the destination or description of the trip. Mid-trip changes to the headsign can be specified in `stop_times.txt`. |
+| `trip_short_name` | Optional | A short name or code to identify the particular trip, different to the route's short name. This may identify a particular train number or a route variation. |
+| `direction_id` | Optional | Indicates the direction of travel for a trip, such as to differentiate between an inbound and an outbound trip. |
+| `block_id` | Optional | A block is a series of trips conducted by the same vehicle. This ID is used to group 2 or more trips together. |
+| `shape_id` | Optional | This value references a value from `shapes.txt` to define a shape for a trip. |
+| `wheelchair_accessible` | Optional | `0` or blank indicates unknown, while `1` indicates the vehicle can accommodate at least one wheelchair passenger. A value of `2` indicates no wheelchairs can be accommodated. |
+
+### Sample Data
+
+Consider the following extract, taken from the `trips.txt` file of the
+TriMet GTFS feed (<https://openmobilitydata.org/p/trimet>).
+
+| `route_id` | `service_id` | `trip_id` | `direction_id` | `block_id` | `shape_id` |
+| ---------- | ------------ | ----------- | -------------- | ---------- | ---------- |
+| 1 | W.378 | 4282257 | 0 | 103 | 185327 |
+| 1 | W.378 | 4282256 | 0 | 101 | 185327 |
+| 1 | W.378 | 4282255 | 0 | 102 | 185327 |
+| 1 | W.378 | 4282254 | 0 | 103 | 185327 |
+
+This data describes four individual trips for the "Vermont" bus route
+(this was determined by looking up the `route_id` value in
+`routes.txt`). While the values for `trip_id`, `block_id` and
+`shape_id` are all integers in this particular instance, this is not a
+requirement. Just like `service_id`, there may be non-numeric
+characters.
+
+As each of these trips is for the same route and runs in the same
+direction (based on `direction_id`) they can all be represented by the
+same shape. Note however that this may not always be the case as some
+agencies may start or finish trips for a single route at different
+locations depending on the time of the day. If this were the case, then
+the trip's shape would differ slightly (and therefore have a different
+shape to represent it in `shapes.txt`).
+
+Although this example does not include `trip_headsign`, many feeds do
+include this value. This is useful for indicating to a passenger where
+the trip is headed. When the trip headsign is not provided in the feed,
+you can determine the destination by using the final stop in a trip.
+
+**Tip:** If you are determining the destination based on the final stop,
+you can either present the stop name to the user, or you can
+reverse-geocode the stop's coordinates to determine its locality.
+
+### Blocks
+
+In the preceding example, each trip has a value for `block_id`. The
+first and the last trips here both have a `block_id` value of `103`.
+This indicates that the same physical vehicle completes both of these
+trips. As each of these trips goes in the same direction, it is likely
+that they start at the same location.
+
+This means there is probably another trip in the feed for the same block
+that exists between the trips listed here. It would likely travel from
+the finishing point of the first trip (`4282257`) to the starting
+point of the other trip (`4282254`). If you dig a little deeper in the
+feed you will find the trip shown in the following table.
+
+| `route_id` | `service_id` | `trip_id` | `direction_id` | `block_id` | `shape_id` |
+| :--------- | :----------- | :-------- | :------------- | :--------- | :--------- |
+| 1 | W.378 | 4282270 | 1 | 103 | 185330 |
+
+This is a trip traveling in the opposite direction for the same block.
+It has a different shape ID because it is traveling in the opposite
+direction; a shape's points must advance in the same direction a trip's
+stop times do.
+
+***Note:** You should perform some validation when grouping trips
+together using block IDs. For instance, if trips share a block_id value
+then they should also have the same service_id value. You should also
+check that the times do not overlap; otherwise the same vehicle would be
+unable to service both trips.*
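The two checks in this note could be sketched in Python as follows. This is an illustration under stated assumptions: the function name is hypothetical, each trip is assumed to already carry `start` and `end` values (seconds after midnight, taken from its first and last stop times), and the times in the sample rows are invented for the example.

```python
from collections import defaultdict


def validate_blocks(trips):
    """Return (block_id, message) pairs for blocks that look inconsistent."""
    blocks = defaultdict(list)
    for trip in trips:
        if trip["block_id"]:
            blocks[trip["block_id"]].append(trip)

    problems = []
    for block_id, members in blocks.items():
        # All trips in a block should share one service_id.
        if len({t["service_id"] for t in members}) > 1:
            problems.append((block_id, "mixed service_id values"))
        # Compare consecutive trips in start-time order for overlaps.
        ordered = sorted(members, key=lambda t: t["start"])
        for earlier, later in zip(ordered, ordered[1:]):
            if later["start"] < earlier["end"]:
                problems.append(
                    (block_id, f"{earlier['trip_id']} overlaps {later['trip_id']}")
                )
    return problems


# Trip and block IDs come from the TriMet sample; the times are invented.
trips = [
    {"trip_id": "4282257", "service_id": "W.378", "block_id": "103", "start": 21600, "end": 24000},
    {"trip_id": "4282270", "service_id": "W.378", "block_id": "103", "start": 24300, "end": 26700},
    {"trip_id": "4282254", "service_id": "W.378", "block_id": "103", "start": 27000, "end": 29400},
    # A deliberately broken, hypothetical block.
    {"trip_id": "X1", "service_id": "A", "block_id": "999", "start": 1000, "end": 2000},
    {"trip_id": "X2", "service_id": "B", "block_id": "999", "start": 1500, "end": 2500},
]

problems = validate_blocks(trips)
```

Trips that fail these checks are probably best treated as independent trips rather than as a continuous vehicle journey.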
+
+If you dig even further in this feed, there are actually seven different
+trips all using block `103` for the **W.378** service period. This
+roughly represents a full day's work for a single vehicle.
+
+For more discussion on blocks and how to utilize them effectively, refer
+to *Working With Trip Blocks*.
+
+### Wheelchair Accessibility
+
+Similar to `stops.txt`, you can specify the wheelchair accessibility
+of a specific trip using the `wheelchair_accessible` field. While many
+feeds do not provide this information (often because vehicles in a fleet
+can be changed at the last minute, so agencies do not want to guarantee
+this information), passengers who use wheelchairs will benefit greatly
+if you can provide it.
+
+As mentioned in the section on `stops.txt`, it is just as important to
+tell a user that a specific vehicle cannot accommodate wheelchairs as it
+is to tell them that it can. Additionally, if the stops in a feed also have wheelchair
+information, then both the stop and trip must be wheelchair accessible
+for a passenger to be able to access a trip at the given stop.
+
+### Trip Direction
+
+One of the optional fields in `trips.txt` is `direction_id`, which
+is used to indicate the general direction a vehicle is traveling. At
+present the only possible values are `0` and `1`. The specification
+offers "outbound" as an example for `0` and "inbound" as an example for
+`1`, but there are no firm guidelines as to what each value means. The
+intention is that a trip with one value should be traveling in the
+opposite direction to a trip with the other value on the same route.
+
+Many GTFS feeds do not provide this information. In fact, there are a
+handful of feeds that include two entries in `routes.txt` for each
+route (one for each direction).
+
+One of the drawbacks of `direction_id` is that there are many routes
+for which "inbound" or "outbound" do not actually mean anything. Many
+cities have loop services that start and finish each trip at the same
+location. Some cities have one or more loops that travel in both
+directions (generally indicated by "clockwise loop" and
+"counter-clockwise loop", or words to that effect). In these instances,
+the `direction_id` can be used to determine which direction the route
+is traveling.
+
+### Trip Short Name
+
+The `trip_short_name` field that appears in `trips.txt` is used to
+provide a vehicle-specific code to a particular trip on a route. Based
+on GTFS feeds that are currently in publication, it appears there are
+two primary use-cases for this field:
+
+* Specifying a particular train number for all trains on a route
+* Specifying a route "sub-code" for a route variant
+
+### Specifying a Train Number
+
+For certain commuter rail systems, such as SEPTA in Philadelphia or MBTA
+in Boston, each train has a specific number associated with it. This
+number is particularly meaningful to passengers as trains on the same
+route may have different stopping patterns or even different features
+(for instance, only a certain train may have air-conditioning or
+wheelchair access).
+
+Consider the following extract from SEPTA's rail feed
+(<https://openmobilitydata.org/p/septa>).
+
+| `route_id` | `service_id` | `trip_id` | `trip_headsign` | `block_id` | `trip_short_name` |
+| ---------- | ------------ | ------------ | ------------------------ | ---------- | ----------------- |
+| AIR | S5 | AIR_1404_V25 | Center City Philadelphia | 1404 | 1404 |
+| AIR | S1 | AIR_402_V5 | Center City Philadelphia | 402 | 402 |
+| AIR | S1 | AIR_404_V5 | Center City Philadelphia | 404 | 404 |
+| AIR | S5 | AIR_406_V25 | Center City Philadelphia | 406 | 406 |
+
+In this data, there are four different trains all heading to the same
+destination. The `trip_short_name` is a value that can safely be
+presented to users as it has meaning to them. In this case, you could
+present the first trip to passengers as:
+
+**"Train 1404 on the Airport line heading to Center City
+Philadelphia."**
+
+In this particular feed, SEPTA use the same value for
+`trip_short_name` and for `block_id`, because the train number
+belongs to a specific train. This means after it completes the trip to
+Center City Philadelphia it continues on. In this particular feed, the
+following trip also exists:
+
+**"Train 1404 on the Warminster line heading to Glenside."**
+
+You can therefore think of the `trip_short_name` value as a
+"user-facing" version of `block_id`.
+
+### Specifying a Route Sub-Code
+
+The other use-case for `trip_short_name` is for specifying a route
+sub-code. For instance, consider an agency that has a route with short
+name `100` that travels from stop `S1` to stop `S2`. At night the
+agency only has a limited number of routes running, so they extend this
+route to also visit stop `S3` (so it travels from `S1` to `S2`
+then to `S3`). As it is a minor variation of the main path, the agency
+calls this trip `100A`.
+
+The agency could either create a completely separate entry in
+`routes.txt` (so they would have `100` and `100A`), or they can
+override the handful of trips in the evening by setting the
+`trip_short_name` to `100A`. The following table shows how this
+example might be represented.
+
+| `route_id` | `trip_id` | `service_id` | `trip_short_name` |
+| ---------- | --------- | ------------ | ----------------- |
+| 100 | T1 | C1 | |
+| 100 | T2 | C1 | |
+| 100 | T3 | C1 | |
+| 100 | T4 | C1 | 100A |
+
+In this example the `trip_short_name` does not need to be set for the
+first three trips as they use the `route_short_name` value from
+`routes.txt`.
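For the sub-code use-case, the fallback described above amounts to a one-line rule. The Python sketch below assumes rows have been loaded as dictionaries; the function name `display_code` is hypothetical.

```python
def display_code(trip, route):
    """Prefer the trip's own short name, else the route's short name."""
    return trip.get("trip_short_name") or route["route_short_name"]


route = {"route_id": "100", "route_short_name": "100"}
trips = [
    {"trip_id": "T1", "trip_short_name": ""},
    {"trip_id": "T4", "trip_short_name": "100A"},
]

codes = [display_code(trip, route) for trip in trips]
```

This rule suits feeds that use `trip_short_name` as a route sub-code. Where the field holds a train number, it is better shown alongside the route name than in place of it.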
+
+### Specifying Bicycle Permissions
+
+A common field that appears in many GTFS feeds is
+`trip_bikes_allowed`, which is used to indicate whether or not
+passengers are allowed to take bicycles on board. This is useful for
+automated trip planning when bicycle options can be included in the
+results.
+
+The way this field works is similar to the wheelchair information; `0`
+or empty means no information provided; `1` means no bikes allowed;
+while `2` means at least one bike can be accommodated.
+
+**Note:** Unfortunately, this value is backwards when you compare it to
+wheelchair accessibility fields. For more discussion on this matter,
+refer to the topic on the Google Group for GTFS Changes
+(<https://groups.google.com/d/topic/gtfs-changes/rEiSeKNc4cs/discussion>).
+
diff --git a/.idea/gtfs-book/ch-06-stop-times.md b/.idea/gtfs-book/ch-06-stop-times.md
new file mode 100644
index 0000000..9eb0f93
--- /dev/null
+++ b/.idea/gtfs-book/ch-06-stop-times.md
@@ -0,0 +1,160 @@
+## 6. Stop Times (stop_times.txt)
+
+*This file is ***required*** to be included in GTFS feeds.*
+
+The `stop_times.txt` file specifies individual stop arrivals and
+departures for each trip. This file is typically the largest in a GTFS
+feed as it contains many records that correspond to each entry in
+`trips.txt`.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `trip_id` | Required | References a trip from `trips.txt`. This ID is referenced for every stop in a trip. |
+| `arrival_time` | Required | The arrival time in `HH:MM:SS` format. Can be left blank, except for at least the first and last stop time in a trip. This value is typically the same as `departure_time`. |
+| `departure_time` | Required | The departure time in `HH:MM:SS` format. Can be left blank, except for at least the first and last stop time in a trip. This value is typically the same as `arrival_time`. |
+| `stop_id` | Required | References a single stop from `stops.txt`. |
+| `stop_sequence` | Required | A unique number for a given trip to indicate the stopping order. Typically these values appear in order and increment by 1 for each stop time, but this is not always the case. |
+| `stop_headsign` | Optional | This is text that appears to passengers at this stop to identify the trip's destination. It should only be used to override the `trip_headsign` value from `trips.txt`. |
+| `pickup_type` | Optional | Indicates if passengers can be picked up at this stop. Sometimes a stop is drop-off only. |
+| `drop_off_type` | Optional | Indicates if passengers can be dropped off at this stop. Sometimes a stop is pick-up only. |
+| `shape_dist_traveled` | Optional | If a trip has an associated shape, this value indicates how far along that shape the vehicle has traveled when at this stop. Values in this file and `shapes.txt` must use the same unit. |
+
+### Sample Data
+
+Consider the following extract, taken from `stop_times.txt` in the TriMet
+GTFS feed (<https://openmobilitydata.org/p/trimet>). This represents the
+first ten stops of a trip for bus route `1` ("Vermont") in Portland, as
+covered in Sample Data for `trips.txt`.
+
+| `trip_id` | `arrival_time` | `departure_time` | `stop_id` | `stop_sequence` | `shape_dist_traveled` |
+| :-------- | :------------- | :--------------- | :-------- | :-------------- | :-------------------- |
+| 4282247 | 06:47:00 | 06:47:00 | 13170 | 1 | 0.0 |
+| 4282247 | 06:48:18 | 06:48:18 | 7631 | 2 | 867.5 |
+| 4282247 | 06:50:13 | 06:50:13 | 7625 | 3 | 2154.9 |
+| 4282247 | 06:52:07 | 06:52:07 | 7612 | 4 | 3425.5 |
+| 4282247 | 06:53:42 | 06:53:42 | 7616 | 5 | 4491.1 |
+| 4282247 | 06:55:16 | 06:55:16 | 10491 | 6 | 5536.2 |
+| 4282247 | 06:57:06 | 06:57:06 | 7588 | 7 | 6767.1 |
+| 4282247 | 06:58:00 | 06:58:00 | 7591 | 8 | 7364.4 |
+| 4282247 | 06:58:32 | 06:58:32 | 175 | 9 | 8618.7 |
+| 4282247 | 06:58:50 | 06:58:50 | 198 | 10 | 9283.8 |
+
+If you were to plot the full trip on a map (including using its shape
+file, as specified in `shapes.txt` and referenced in `trips.txt`),
+it would look like the following diagram. The first stop is selected.
+
+![Stop Times](images/stop-times.png)
+
+### Arrival Time vs. Departure Time
+
+The first thing to notice about this data is that the values in
+`arrival_time` and `departure_time` are the same. In practice, this
+is true for most stop times. The values typically differ only when a
+vehicle is required to wait for a period of time before departing. For
+instance:
+
+**"The train departs the domestic airport at 6:30am, arrives at the
+international terminal at 6:35am. It waits for 10 minutes for passengers
+who have just landed to board, then departs for the city at 6:45am."**
+
+While this is not typically something you need to worry about, be aware
+that some feeds differ and the holdover time could be large. A person
+rushing to a train wants to know the time it departs, while a husband
+waiting at a stop to meet his wife wants to know what time it arrives.
+
+**Note:** In a situation where the difference between the arrival time
+and departure is small, you may be better off always displaying the
+earlier time to the user. The driver may view a one or two minute
+holdover as an opportunity to keep on time, whereas a ten or fifteen
+minute holdover is unlikely to be ignored as doing so would
+significantly alter the schedule.
+
+### Scheduling Past Midnight
+
+One of the most important concepts to understand about
+`stop_times.txt` is that times later than midnight can be specified.
+For example, if a trip starts at 11:45 PM and takes an hour to complete,
+its finishing time is 12:45 AM the next day.
+
+The `departure_time` value for the first stop is `23:45:00`,
+while the `arrival_time` value for the final stop is `24:45:00`.
+If you were to specify the final arrival time as `00:45:00`, it
+would be referencing 12:45 AM prior to the trip's starting time.
+
+While you could use the `stop_sequence` to determine which day the
+trip fell on, it would be impossible to do a quick search purely based
+on the given final stop.
+
+A trip may start and finish after midnight without being considered as
+part of the next day's service. Many transit systems shut down
+overnight, but may have a few services that run after midnight until
+about 1 AM or 2 AM. It is logical to group them all together with the
+same service as earlier trips, which this functionality allows you to
+do.
+
+However, this has implications when searching for trips. For instance,
+if you want to find all trips between 12:00 AM and 1:00 AM on 30 January
+2014, then you need to search:
+
+* Between `00:00:00` and `01:00:00` for trips with service on `20140130`
+* Between `24:00:00` and `25:00:00` for trips with service on `20140129`
+
+In *Searching for Trips* you can see how to apply this to your
+trip searches.
+
+### Time Points
+
+Most GTFS feeds provide arrival/departure times for every single stop.
+In reality, most agencies do not have known times (or at least, they do
+not publish times) for many of their stops.
+
+Typically, for routes that describe trains, subways or ferries that make
+relatively few stops in a trip, all stops have a specified time.
+However, often for bus routes that may make many stops in a single trip,
+generally only the main stops have times shown on printed schedules.
+
+Typically, the bus drivers are required to meet these time points; if
+they are ahead of schedule they might wait at a time point until the
+scheduled time; if they are running late they might try to catch up in
+order to adhere to the specified time point.
+
+This means that the intermediate points are likely estimates that have
+been interpolated based on the amount of time between time points. If
+there are multiple stops in-between the time points then the distance
+between stops may also be used to calculate the estimate.
+
+In actual fact, GTFS feeds do not have to specify times for all stops.
+The data in the following table is perfectly valid for a trip.
+
+| `trip_id` | `arrival_time` | `stop_id` | `stop_sequence` | `shape_dist_traveled` |
+| :-------- | :------------- | :-------- | :-------------- | :-------------------- |
+| T1 | 10:00:00 | S1 | 1 | 0 |
+| T1 | | S2 | 2 | 1500 |
+| T1 | | S3 | 3 | 3000 |
+| T1 | 10:12:00 | S4 | 4 | 6000 |
+
+Based on this data, without taking into account the distance traveled,
+you may estimate that the vehicle arrives at the second stop at 10:04 AM
+and at the third stop at 10:08 AM.
+
+If you consider the distance traveled, you might conclude that it arrives
+at the second stop at 10:03 AM and at the third stop at 10:06 AM.
+
+Some agencies include an additional column in `stop_times.txt` called
+`timepoint`. This is used when they specify the times for all stops
+but also want to indicate if only certain stops are guaranteed times.
+
+The following table shows how this would look using the previous data as
+its basis.
+
+| `trip_id` | `arrival_time` | `stop_id` | `stop_sequence` | `shape_dist_traveled` | `timepoint` |
+| :--------- | :------------- | :-------- | :-------------- | :-------------------- | :---------- |
+| T1 | 10:00:00 | S1 | 1 | 0 | 1 |
+| T1 | 10:03:00 | S2 | 2 | 1500 | 0 |
+| T1 | 10:06:00 | S3 | 3 | 3000 | 0 |
+| T1 | 10:12:00 | S4 | 4 | 6000 | 1 |
+
+This can be especially useful if you want to highlight these time points
+so as to represent the printed schedules accurately, or even if you are
+a transit agency just using the data for internal reporting.
+
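As a rough illustration of the *Time Points* section in `ch-06-stop-times.md` above, the following Python sketch fills in blank arrival times between two time points, either by spacing the stops evenly or by using `shape_dist_traveled`. The function names (`parse_gtfs_time`, `format_gtfs_time`, `interpolate_arrivals`) and the dictionary layout are assumptions made for this example; they are not part of GTFS or of any library.

```python
def parse_gtfs_time(value):
    """Convert an HH:MM:SS string to seconds from the start of the service day.

    The hour may be 24 or more for stop times that fall after midnight.
    """
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_gtfs_time(total_seconds):
    """Convert seconds back to HH:MM:SS, rounding to the nearest second."""
    hours, remainder = divmod(int(round(total_seconds)), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def interpolate_arrivals(rows, use_distance=False):
    """Return one arrival time per row, estimating any blank values.

    Assumes rows are sorted by stop_sequence and that the first and last
    rows have an arrival_time.
    """
    known = [i for i, row in enumerate(rows) if row["arrival_time"]]
    times = [row["arrival_time"] for row in rows]
    for left, right in zip(known, known[1:]):
        t0 = parse_gtfs_time(rows[left]["arrival_time"])
        t1 = parse_gtfs_time(rows[right]["arrival_time"])
        for i in range(left + 1, right):
            # Default: spread the stops evenly between the two time points.
            fraction = (i - left) / (right - left)
            if use_distance:
                d0 = rows[left]["shape_dist_traveled"]
                d1 = rows[right]["shape_dist_traveled"]
                if d1 != d0:  # avoid dividing by zero on degenerate data
                    fraction = (rows[i]["shape_dist_traveled"] - d0) / (d1 - d0)
            times[i] = format_gtfs_time(t0 + fraction * (t1 - t0))
    return times


# The four-stop trip T1 from the Time Points section.
trip = [
    {"stop_id": "S1", "arrival_time": "10:00:00", "shape_dist_traveled": 0},
    {"stop_id": "S2", "arrival_time": "", "shape_dist_traveled": 1500},
    {"stop_id": "S3", "arrival_time": "", "shape_dist_traveled": 3000},
    {"stop_id": "S4", "arrival_time": "10:12:00", "shape_dist_traveled": 6000},
]
print(interpolate_arrivals(trip))
print(interpolate_arrivals(trip, use_distance=True))
```

With the sample trip, even spacing gives 10:04 and 10:08 for the two middle stops, while distance-based spacing gives 10:03 and 10:06, matching the estimates in the chapter. Because `parse_gtfs_time` does not cap the hour at 23, the same helper also handles values such as `24:45:00`.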
diff --git a/.idea/gtfs-book/ch-07-calendar.md b/.idea/gtfs-book/ch-07-calendar.md
new file mode 100644
index 0000000..aebf131
--- /dev/null
+++ b/.idea/gtfs-book/ch-07-calendar.md
@@ -0,0 +1,138 @@
+## 7. Trip Schedules (calendar.txt & calendar_dates.txt)
+
+*Each of these files is ***optional*** in a GTFS feed, but at least one
+of them is ***required***.*
+
+The `calendar.txt` file is used to indicate the range of dates on
+which trips are running. It works by including a start date and a finish
+date (typically a range of 3-6 months), then a marker for each day of
+the week on which it operates. If there are single-day scheduling
+changes that occur during this period, then the `calendar_dates.txt`
+file can be used to override the schedule for each of these days.
+
+The following table shows the specification for `calendar.txt`.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `service_id` | Required | A unique ID for a single service. This value is referenced by trips in `trips.txt`. |
+| `start_date` | Required | This indicates the start date for a given service, in `YYYYMMDD` format. |
+| `end_date` | Required | This indicates the end date for a given service, in `YYYYMMDD` format. |
+| `monday` | Required | Contains `1` if trips run on Mondays between the start and end dates, `0` or empty if not. |
+| `tuesday` | Required | Contains `1` if trips run on Tuesdays between the start and end dates, `0` or empty if not. |
+| `wednesday` | Required | Contains `1` if trips run on Wednesdays between the start and end dates, `0` or empty if not. |
+| `thursday` | Required | Contains `1` if trips run on Thursdays between the start and end dates, `0` or empty if not. |
+| `friday` | Required | Contains `1` if trips run on Fridays between the start and end dates, `0` or empty if not. |
+| `saturday` | Required | Contains `1` if trips run on Saturdays between the start and end dates, `0` or empty if not. |
+| `sunday` | Required | Contains `1` if trips run on Sundays between the start and end dates, `0` or empty if not. |
+
+As mentioned above, the `calendar_dates.txt` file is used to define
+exceptions to entries in `calendar.txt`. For instance, if a 3-month
+service is specified in `calendar.txt` and a holiday lies on a Monday
+during this period, then you can use `calendar_dates.txt` to override this
+single date.
+
+If the weekend schedule were used for a holiday, then you would add a
+record to remove the regular schedule for the holiday date, and another
+record to add the weekend schedule for the holiday date.
+
+Some feeds choose only to include `calendar_dates.txt` and not
+`calendar.txt`, in which case there is an "add service" record for
+every service and every date in this file.
+
+The following table shows the specification for `calendar_dates.txt`.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `service_id` | Required | The service ID that an exception is being defined for. This is referenced in both `calendar.txt` and in `trips.txt`. Unlike `calendar.txt`, it is possible for a `service_id` value to appear multiple times in this file. |
+| `date` | Required | The date for which the exception is occurring, in `YYYYMMDD` format. |
+| `exception_type` | Required | This indicates whether the exception is denoting an added service (`1`) or a removed service (`2`). |
+
+### Sample Data
+
+The following is an extract from the `calendar.txt` file in Adelaide
+Metro's GTFS feed (<https://openmobilitydata.org/p/adelaide-metro>). This
+extract includes schedules from the start of 2014 until the end of March
+2014.
+
+| `service_id` | `monday` | `tuesday` | `wednesday` | `thursday` | `friday` | `saturday` | `sunday` | `start_date` | `end_date` |
+| :----------- | :------- | :-------- | :---------- | :--------- | :------- | :--------- | :------- | :----------- | :--------- |
+| 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 20140102 | 20140331 |
+| 11 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 20140102 | 20140331 |
+| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 20140102 | 20140331 |
+
+Any trip in the corresponding `trips.txt` file with a `service_id`
+value of `1` runs from Monday to Friday. Trips with a `service_id`
+of `11` run only on Saturday, while those with `12` run only on
+Sunday.
+
+Now consider an extract from `calendar_dates.txt` from the same feed,
+as shown in the following table.
+
+| `service_id` | `date` | `exception_type` |
+| ------------ | -------- | ---------------- |
+| 1 | 20140127 | 2 |
+| 1 | 20140310 | 2 |
+| 12 | 20140127 | 1 |
+| 12 | 20140310 | 1 |
+
+The first two rows mean that on January 27 and March 10 trips with
+`service_id` of `1` are not running. The final two rows mean that on
+those same dates trips with `service_id` of `12` are running. This
+has the following meaning:
+
+**"On 27 January and 10 March, use the Sunday timetable instead of the
+Monday-Friday timetable."**
+
+In Adelaide, these two dates are holidays (Australia Day and Labour
+Day). It is Adelaide Metro's policy to run their Sunday timetable on
+public holidays, which is reflected by the above records in their
+`calendar_dates.txt` file.
+
+### Structuring Services
+
+The case described above is the ideal case for specifying services in a
+GTFS feed (dates primarily specified in `calendar.txt` with a handful
+of exceptions in `calendar_dates.txt`).
+
+Be aware that there are two other major ways that services are specified
+in feeds.
+
+1. Using only `calendar_dates.txt` and expressly including every
+ single date within the service range. Each of these is included as
+ "service added" (an `exception_type` value of `1`). The
+ following table shows how this might look.
+
+| `service_id` | `date` | `exception_type` |
+| :----------- | :------- | :--------------- |
+| 1 | 20140102 | 1 |
+| 1 | 20140103 | 1 |
+| 11 | 20140104 | 1 |
+| 12 | 20140105 | 1 |
+
+2. Not using `calendar_dates.txt`, but creating many records in
+ `calendar.txt` instead to span various dates. The following table
+ shows how you can represent Monday-Friday from the sample data in
+ this fashion.
+
+| `service_id` | `monday` | `tuesday` | `wednesday` | `thursday` | `friday` | `saturday` | `sunday` | `start_date` | `end_date` |
+| :----------- | :------- | :-------- | :---------- | :--------- | :------- | :--------- | :------- | :----------- | :--------- |
+| 1a | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 20140102 | 20140126 |
+| holiday1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 20140127 | 20140127 |
+| 1b | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 20140128 | 20140309 |
+| holiday2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 20140310 | 20140310 |
+| 1c | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 20140311 | 20140331 |
+
+In this example, each holiday has its own row in `calendar.txt` that
+runs for a single day only.
+
+Refer to *Finding Service IDs* to see how to determine
+services that are running for a given day.
+
+### Service Name
+
+There are a number of feeds that specify a column in `calendar.txt`
+called `service_name`. This is used to give a descriptive name to each
+service. For example, the Sedona Roadrunner in Arizona
+(<https://openmobilitydata.org/p/sedona-roadrunner>) has services called
+"Weekday Service", "Weekend Service" and "New Year's Eve Service".
+
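The way `calendar.txt` and `calendar_dates.txt` combine in `ch-07-calendar.md` above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rows are dictionaries of strings (as `csv.DictReader` would produce), and the names `active_service_ids` and `calendar_row` are invented for this example.

```python
from datetime import date

WEEKDAY_COLUMNS = ["monday", "tuesday", "wednesday", "thursday",
                   "friday", "saturday", "sunday"]


def active_service_ids(calendar_rows, calendar_date_rows, on_date):
    """Return the set of service_id values running on the given date."""
    stamp = on_date.strftime("%Y%m%d")
    day_column = WEEKDAY_COLUMNS[on_date.weekday()]  # Monday is 0

    active = set()
    for row in calendar_rows:
        # YYYYMMDD strings are fixed width, so they compare correctly as text.
        in_range = row["start_date"] <= stamp <= row["end_date"]
        if in_range and row.get(day_column) == "1":
            active.add(row["service_id"])

    # Apply the single-day exceptions on top of the regular schedule.
    for row in calendar_date_rows:
        if row["date"] != stamp:
            continue
        if row["exception_type"] == "1":
            active.add(row["service_id"])
        elif row["exception_type"] == "2":
            active.discard(row["service_id"])
    return active


def calendar_row(service_id, days, start_date, end_date):
    """Build a calendar.txt row; days is a 7-character string, Monday first."""
    row = {"service_id": service_id,
           "start_date": start_date, "end_date": end_date}
    row.update(zip(WEEKDAY_COLUMNS, days))
    return row


# The Adelaide Metro extract from the Sample Data section.
calendar = [
    calendar_row("1", "1111100", "20140102", "20140331"),
    calendar_row("11", "0000010", "20140102", "20140331"),
    calendar_row("12", "0000001", "20140102", "20140331"),
]
exceptions = [
    {"service_id": "1", "date": "20140127", "exception_type": "2"},
    {"service_id": "1", "date": "20140310", "exception_type": "2"},
    {"service_id": "12", "date": "20140127", "exception_type": "1"},
    {"service_id": "12", "date": "20140310", "exception_type": "1"},
]
print(active_service_ids(calendar, exceptions, date(2014, 1, 27)))
```

For 27 January 2014 (a Monday) the regular weekday service `1` is removed and the Sunday service `12` is added, which is the behaviour the chapter describes for the Australia Day holiday. The same function works for feeds that only provide `calendar_dates.txt`: pass an empty list for `calendar_rows`.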
diff --git a/.idea/gtfs-book/ch-08-fares.md b/.idea/gtfs-book/ch-08-fares.md
new file mode 100644
index 0000000..88bcbea
--- /dev/null
+++ b/.idea/gtfs-book/ch-08-fares.md
@@ -0,0 +1,115 @@
+## 8. Fare Definitions (fare_attributes.txt & fare_rules.txt)
+
+*These files are ***optional*** in a GTFS feed, but any rules specified
+must reference a fare attributes record.*
+
+These two files define the types of fares that exist in a system,
+including their price and transfer information. The attributes of a
+particular fare exist in `fare_attributes.txt`, which has the
+following columns.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `fare_id` | Required | A value that uniquely identifies a fare listed in this file. It must only appear once in this file. |
+| `price` | Required | This field specifies the cost of the fare. For instance, if a trip costs $2 USD, then this value should be either `2.00` or `2`, and the `currency_type` should be `USD`. |
+| `currency_type` | Required | This is the currency code that `price` is specified in, such as USD for United States Dollar. |
+| `payment_method` | Required | This indicates when the fare is to be paid. `0` means it can be paid on board, while `1` means it must be paid before boarding. |
+| `transfers` | Required | The number of transfers that may occur. This must either be empty (unlimited transfers) or the number of transfers permitted. |
+| `transfer_duration` | Optional | This is the number of seconds a transfer is valid for. |
+
+The following table shows the specification for `fare_rules.txt`,
+which defines the rules used to apply a fare to a particular trip.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `fare_id` | Required | This is the ID of the fare that a rule applies to as it appears in `fare_attributes.txt`. |
+| `route_id` | Optional | The ID of a route as it appears in `routes.txt` for which this rule applies. If there are several routes with the same fare attributes, there may be a row in `fare_rules.txt` for each route. |
+| `origin_id` | Optional | This value corresponds to a `zone_id` value from `stops.txt`. If specified, this means a trip must begin in the given zone in order to qualify for this fare. |
+| `destination_id` | Optional | This value corresponds to a `zone_id` value from `stops.txt`. If specified, this means a trip must end in the given zone in order to qualify for this fare. |
+| `contains_id` | Optional | This value corresponds to a `zone_id` value from `stops.txt`. If specified, this means a trip must pass through every `contains_id` zone for the given fare (in other words, several rules may need to be checked). |
+
+Note that aside from `fare_id`, all fields are optional in this file.
+This means some very complex rules can be made (especially when
+transfers come into the calculation). The following URL has discussion
+about different rules and some complex fare examples:
+
+<https://code.google.com/p/googletransitdatafeed/wiki/FareExamples>
+
+Refer to *Calculating Fares* for discussion about the
+algorithm for calculating fares for trips both with and without
+transfers.
+
+### Sample Data
+
+The following data is taken from the TriMet GTFS feed
+(<https://openmobilitydata.org/p/trimet>). Firstly, the data from the
+`fare_attributes.txt` file.
+
+| `fare_id` | `price` | `currency_type` | `payment_method` | `transfers` | `transfer_duration` |
+| :-------- | :------ | :--------- | :--------------- | :---------- | :-------------------|
+| B | 2.5 | USD | 0 | | 7200 |
+| R | 2.5 | USD | 1 | | 7200 |
+| BR | 2.5 | USD | 0 | | 7200 |
+| RB | 2.5 | USD | 1 | | 7200 |
+| SC | 1 | USD | 1 | 0 | |
+| AT | 4 | USD | 1 | 0 | |
+| VT | 0 | USD | 1 | 0 | |
+
+The data from the `fare_rules.txt` file is shown in the following
+table.
+
+| `fare_id` | `route_id` | `origin_id` | `destination_id` | `contains_id` |
+| :-------- | :--------- | :---------- | :--------------- | :------------ |
+| B | | B | | B |
+| R | | R | | R |
+| BR | | B | | B |
+| BR | | B | | R |
+| RB | | R | | B |
+| RB | | R | | R |
+| SC | 193 | | | |
+| SC | 194 | | | |
+| AT | 208 | | | |
+| VT | 250 | | | |
+
+In this sample data, TriMet have named some of their fares the same as
+the zones specified in `stops.txt`. In this particular feed, bus stops
+have a `zone_id` of `B`, while rail stops have `R`.
+
+The fares in this file are as follows:
+
+* Fare `B`. If you start at a bus stop (`zone_id` value of `B`),
+ you can buy your ticket on board (`payment_method` of `0`). You
+ may transfer an unlimited number of times (empty `transfers` value)
+ for 2 hours (`transfer_duration` of `7200`). The cost is $2.50 USD.
+* Fare `R`. If you start at a rail stop, you must pre-purchase your
+ ticket. You may transfer an unlimited number of times to other rail
+ services for up to 2 hours. The cost is $2.50 USD.
+
+The `BR` fare describes a trip that begins on a bus then transfers to
+a rail service (while `RB` is the opposite). This fare is not
+matched if the passenger does not travel on both, as all `contains_id`
+values must be matched in order to apply a fare.
+
+The other fares (`SC`, `AT` and `VT`) all apply to their
+respective `route_id` values, regardless of start and finish stops.
+Tickets must be pre-purchased, and transfers are not allowed. The `VT`
+fare (which corresponds to TriMet's Vintage Trolley) is free to ride
+since it has a price of `0`.
+
+### Assigning Fares to Agencies
+
+One of the extensions available to `fare_attributes.txt` is to include
+an `agency_id` column. This is to limit a specific fare to only routes
+from the specified agency, in the case where a feed has multiple
+agencies.
+
+This is useful because there may be two agencies in a feed that define
+fares with no specific rules (in other words, the fare applies to all
+trips). If the price differs, then GTFS dictates that the cheapest fare
+is always applied. Using `agency_id` means these fares can be
+differentiated accordingly.
+
+For more information about this extension, refer to the Google Transit
+GTFS Extensions page at
+<https://support.google.com/transitpartners/answer/2450962>.
+
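To make the `transfers` and `transfer_duration` columns from `ch-08-fares.md` above more concrete, here is a small Python sketch that checks whether one more transfer is permitted on a fare. It is deliberately simplified: the function name `transfer_permitted` is hypothetical, rows are assumed to be dictionaries of strings, and the clock is assumed to start when the fare is first used, which a real feed may define differently.

```python
def transfer_permitted(fare, transfers_used, seconds_elapsed):
    """Return True if a further transfer is allowed on this fare.

    fare: a row from fare_attributes.txt as a dict of strings.
    transfers_used: how many transfers the passenger has already made.
    seconds_elapsed: time since the fare was first used (an assumption).
    """
    limit = fare.get("transfers", "")
    # An empty value means unlimited transfers; otherwise enforce the count.
    if limit != "" and transfers_used >= int(limit):
        return False

    duration = fare.get("transfer_duration", "")
    # An empty value is treated here as "no time limit".
    if duration != "" and seconds_elapsed > int(duration):
        return False
    return True


# Two of the TriMet fares from the Sample Data section.
fare_b = {"fare_id": "B", "transfers": "", "transfer_duration": "7200"}
fare_sc = {"fare_id": "SC", "transfers": "0", "transfer_duration": ""}

print(transfer_permitted(fare_b, 3, 3600))   # within the 2 hour window
print(transfer_permitted(fare_b, 0, 9000))   # window has expired
print(transfer_permitted(fare_sc, 0, 60))    # no transfers on this fare
```

This only covers the attributes of a single fare. Choosing which fare applies to a trip also depends on `fare_rules.txt`, which the *Calculating Fares* chapter discusses.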
diff --git a/.idea/gtfs-book/ch-09-shapes.md b/.idea/gtfs-book/ch-09-shapes.md
new file mode 100644
index 0000000..bc05636
--- /dev/null
+++ b/.idea/gtfs-book/ch-09-shapes.md
@@ -0,0 +1,76 @@
+## 9. Trip Shapes (shapes.txt)
+
+*This file is ***optional*** in a GTFS feed.*
+
+Each trip in `trips.txt` can have a shape associated with it. The
+`shapes.txt` file defines the points that make up an individual shape in
+order to plot a trip on a map. Two or more records in `shapes.txt`
+with the same `shape_id` value define a shape.
+
+The amount of data stored in this file can be quite large. In
+*Optimizing Shapes* there are some strategies to efficiently
+reduce the amount of shape data.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `shape_id` | Required | An ID to uniquely identify a shape. Every point for a shape contains the same value. |
+| `shape_pt_lat` | Required | The latitude for a given point in the range of `-90` to `90`. |
+| `shape_pt_lon` | Required | The longitude for a given point in the range of `-180` to `180`. |
+| `shape_pt_sequence` | Required | A non-negative number that defines the ordering for points in a shape. A value must not be repeated within a single shape. |
+| `shape_dist_traveled` | Optional | This value represents how far along a shape a particular point exists. This is a distance in a unit such as feet or kilometers. This unit must be the same as that used in `stop_times.txt`. |
+
+### Sample Data
+
+The following table shows a portion of a shape from the TriMet GTFS
+feed. It is a portion of the shape that corresponds to the sample data
+in the `stop_times.txt` section.
+
+| `shape_id` | `shape_pt_lat` | `shape_pt_lon` | `shape_pt_sequence` | `shape_dist_traveled` |
+| :--------- | :------------- | :------------- | :------------------ | :-------------------- |
+| 185328 | 45.52291 | -122.677372 | 1 | 0.0 |
+| 185328 | 45.522921 | -122.67737 | 2 | 3.7 |
+| 185328 | 45.522991 | -122.677432 | 3 | 34.0 |
+| 185328 | 45.522992 | -122.677246 | 4 | 81.5 |
+| 185328 | 45.523002 | -122.676567 | 5 | 255.7 |
+| 185328 | 45.523004 | -122.676486 | 6 | 276.4 |
+| 185328 | 45.523007 | -122.676386 | 7 | 302.0 |
+| 185328 | 45.523024 | -122.675386 | 8 | 558.4 |
+| 185328 | 45.522962 | -122.67538 | 9 | 581.0 |
+
+In this sample data, the `shape_dist_traveled` is listed in feet.
+There is no way to specify in a GTFS feed which units are used for this
+column -- it could be feet, miles, meters or kilometers. In actual fact,
+it does not really matter, just as long as the units are the same as in
+`stop_times.txt`.
+
+If you need to present a distance to your users (such as how far you
+need to travel on a bus), you can calculate it instead by adding up the
+distance between each point and formatting it based on the user's
+locale settings.
+
+### Point Sequences
+
+In most GTFS feeds the `shape_pt_sequence` value starts at 1 and
+increments by 1 for every subsequent point. Additionally, points are
+typically listed in order of their sequence.
+
+You should not rely on these two statements though, as this is not a
+requirement of GTFS. Many transit agencies have automated systems that
+export their GTFS from a separate system, which can sometimes result in
+an unpredictable output format.
+
+For instance, a shape whose points are listed with the sequences `1`, `2`,
+`9`, `18`, `7`, `3` is perfectly valid.
+
+### Distance Traveled
+
+The `shape_dist_traveled` column is used so you can programmatically
+determine how much of a shape to draw when showing a map to users of
+your web site or app. If you use techniques in *Optimizing Shapes*
+to reduce the file size of shape data, then it becomes difficult to
+use this value.
+
+Alternatively, you can calculate portions of shapes by determining which
+point in a shape travels closest to the start and finish points of a
+trip.
+
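The suggestion in `ch-09-shapes.md` above to calculate a distance "by adding up the distance between each point" can be sketched as follows. This is a minimal example, assuming the haversine formula and a mean Earth radius of 6,371 km are accurate enough for display purposes; the names `haversine_meters` and `shape_length_meters` are made up for this illustration. Points are sorted by `shape_pt_sequence` first, since the chapter notes that file order cannot be relied upon.

```python
import math

EARTH_RADIUS_METERS = 6371000  # mean Earth radius; an approximation


def haversine_meters(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating point overshoot.
    return 2 * EARTH_RADIUS_METERS * math.asin(math.sqrt(min(1.0, a)))


def shape_length_meters(points):
    """Sum the distances between consecutive points of a single shape."""
    ordered = sorted(points, key=lambda p: p["shape_pt_sequence"])
    total = 0.0
    for previous, current in zip(ordered, ordered[1:]):
        total += haversine_meters(
            previous["shape_pt_lat"], previous["shape_pt_lon"],
            current["shape_pt_lat"], current["shape_pt_lon"],
        )
    return total


# The first four points of the TriMet shape from the Sample Data section.
sample_points = [
    {"shape_pt_lat": 45.52291, "shape_pt_lon": -122.677372, "shape_pt_sequence": 1},
    {"shape_pt_lat": 45.522921, "shape_pt_lon": -122.67737, "shape_pt_sequence": 2},
    {"shape_pt_lat": 45.522991, "shape_pt_lon": -122.677432, "shape_pt_sequence": 3},
    {"shape_pt_lat": 45.522992, "shape_pt_lon": -122.677246, "shape_pt_sequence": 4},
]
print(round(shape_length_meters(sample_points), 1), "meters")
```

The result is always in meters regardless of the unit the feed uses for `shape_dist_traveled`, so it can be converted and formatted according to the user's locale.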
diff --git a/.idea/gtfs-book/ch-10-frequencies.md b/.idea/gtfs-book/ch-10-frequencies.md
new file mode 100644
index 0000000..45021ab
--- /dev/null
+++ b/.idea/gtfs-book/ch-10-frequencies.md
@@ -0,0 +1,98 @@
+## 10. Repeating Trips (frequencies.txt)
+
+*This file is ***optional*** in a GTFS feed.*
+
+In some cases a route may repeat a particular stopping pattern every few
+minutes (picture a subway line that runs every 5 minutes). Rather than
+including entries in `trips.txt` and `stop_times.txt` for every
+single occurrence, you can include the trip once then define rules for
+it to repeat for a period of time.
+
+Having a trip repeat only works in the case where the timing between
+stops remains consistent for all stops. Using `frequencies.txt`, you
+use the relative times between stops alongside a calculated starting
+time for the trip in order to determine the specific stop times.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `trip_id` | Required | The ID of the trip as it appears in `trips.txt` that is being repeated. A single trip can appear multiple times for different time ranges. |
+| `start_time` | Required | The time at which a given trip starts repeating, in `HH:MM:SS` format. |
+| `end_time` | Required | The time at which a given trip stops repeating, in `HH:MM:SS` format. |
+| `headway_secs` | Required | The time in seconds between departures from a given stop during the time range. |
+| `exact_times` | Optional | Whether or not repeating trips should be exactly scheduled. See below for discussion. |
+
+### Sample Data
+
+The following sample data is taken from Société de transport de Montréal
+(STM) in Montreal
+(<https://openmobilitydata.org/p/societe-de-transport-de-montreal>).
+
+| `trip_id` | `start_time` | `end_time` | `headway_secs` |
+| :----------------------- | :----------- | :--------- | :------------- |
+| 13S_13S_F1_1_2_0.26528 | 05:30:00 | 07:25:30 | 630 |
+| 13S_13S_F1_1_6_0.34167 | 07:25:30 | 08:40:10 | 560 |
+| 13S_13S_F1_1_10_0.42500 | 08:40:10 | 12:19:00 | 505 |
+| 13S_13S_F1_1_7_0.58750 | 12:19:00 | 15:00:00 | 460 |
+| 13S_13S_F1_1_11_0.66875 | 15:00:00 | 18:23:00 | 420 |
+| 13S_13S_F1_1_5_0.78889 | 18:23:00 | 21:36:35 | 505 |
+
+Each of the trips listed here has corresponding entries in
+`trips.txt` and `stop_times.txt` (more on that shortly). This data
+can be interpreted as follows.
+
+* The first trip runs every 10m 30s from 5:30am until 7:25am.
+* The second trip runs every 9m 20s from 7:25am until 8:40am, and so on.
+
+The following table shows some of the stop times for the first trip
+(`departure_time` is omitted here for brevity, since it is identical
+to `arrival_time`).
+
+| `trip_id` | `stop_id` | `arrival_time` | `stop_sequence` |
+| :---------------------- | :-------- | :------------- | :-------------- |
+| 13S_13S_F1_1_2_0.26528 | 18 | 06:22:00 | 1 |
+| 13S_13S_F1_1_2_0.26528 | 19 | 06:22:59 | 2 |
+| 13S_13S_F1_1_2_0.26528 | 20 | 06:24:00 | 3 |
+| 13S_13S_F1_1_2_0.26528 | 21 | 06:26:00 | 4 |
+
+As this trip runs to the specified frequency, the specific times do not
+matter. Instead, the differences are used. For the above stop times,
+there is a 59 second gap between the first and second time, a 61 second
+gap between the second and third, and a 120 second gap between the third
+and fourth.
+
+The stop times for the first frequency record (10.5 minutes apart) can
+be calculated as follows.
+
+* `05:30:00`, `05:30:59`, `05:32:00`, `05:34:00`
+* `05:40:30`, `05:41:29`, `05:42:30`, `05:44:30`
+* `05:51:00`, `05:51:59`, `05:53:00`, `05:55:00`
+* ...
+* `07:25:30`, `07:26:29`, `07:27:30`, `07:29:30`
+
+### Specifying Exact Times
+
+In the file definition at the beginning of this chapter there is an
+optional field called `exact_times`. It may not be immediately clear
+what this field means, so to explain it better, consider the frequency
+definitions in the following table.
+
+| `trip_id` | `start_time` | `end_time` | `headway_secs` | `exact_times` |
+| :-------- | :----------- | :--------- | :------------- | :------------ |
+| T1 | 09:00:00 | 10:00:00 | 300 | 0 |
+| T2 | 09:00:00 | 10:00:00 | 300 | 1 |
+
+These two frequencies are the same, with only the `exact_times` value
+different. The first (`T1`) should be presented in a manner such as:
+
+**"Between 9 AM and 10 AM this trip departs every 5 minutes."**
+
+The second trip (`T2`) should be presented as follows:
+
+**"This trip departs at 9 AM, 9:05 AM, 9:10 AM, ..."**
+
+While ultimately the meaning is the same, this difference is used in
+order to allow agencies to represent their schedules more accurately.
+Often, a schedule only needs to convey to passengers that they will not
+have to wait more than five minutes, without explicitly listing every
+departure time.
+
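The calculation in the Sample Data section of `ch-10-frequencies.md` above (take the gaps between a trip's stop times, then repeat them every `headway_secs`) can be sketched in Python as below. The names `expand_frequency`, `to_seconds` and `to_gtfs_time` are hypothetical. Whether a departure that falls exactly on `end_time` belongs to this record or to the next one is treated as an assumption: the chapter's listing includes it, so the sketch includes it by default and exposes an `include_end` flag to turn that off.

```python
def to_seconds(value):
    """Convert HH:MM:SS (hours may exceed 23) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_gtfs_time(total_seconds):
    """Convert seconds back to an HH:MM:SS string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def expand_frequency(trip_stop_times, start_time, end_time, headway_secs,
                     include_end=True):
    """Return a list of runs, each a list of stop times for one departure.

    trip_stop_times: the trip's times from stop_times.txt, in stop order.
    Only the gaps between them are used, not the times themselves.
    """
    base = [to_seconds(value) for value in trip_stop_times]
    offsets = [value - base[0] for value in base]

    start, end = to_seconds(start_time), to_seconds(end_time)
    runs = []
    departure = start
    while departure < end or (include_end and departure == end):
        runs.append([to_gtfs_time(departure + offset) for offset in offsets])
        departure += headway_secs
    return runs


# The first STM frequency record and its stop times from the chapter.
stop_times = ["06:22:00", "06:22:59", "06:24:00", "06:26:00"]
runs = expand_frequency(stop_times, "05:30:00", "07:25:30", 630)
for run in runs[:3]:
    print(", ".join(run))
```

The first three runs printed match the first three lines listed in the chapter, and the final run starts at `07:25:30`. Passing `include_end=False` drops that final run, which may be preferable when the next frequency record starts at the same time.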
diff --git a/.idea/gtfs-book/ch-11-stop-transfers.md b/.idea/gtfs-book/ch-11-stop-transfers.md
new file mode 100644
index 0000000..6a43674
--- /dev/null
+++ b/.idea/gtfs-book/ch-11-stop-transfers.md
@@ -0,0 +1,65 @@
+## 11. Stop Transfers (transfers.txt)
+
+*This file is ***optional*** in a GTFS feed.*
+
+To define how passengers can transfer between routes at specific stops,
+feed providers can include `transfers.txt`. This does not mean
+passengers cannot transfer elsewhere, but it does indicate if a transfer
+is not possible between certain stops, or a minimum time required if
+transfer is possible.
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `from_stop_id` | Required | The ID of the stop as it appears in `stops.txt` where the connection begins. If this references a station, then this rule applies to all stops within the station. |
+| `to_stop_id` | Required | The ID of the stop as it appears in `stops.txt` where the connection between trips ends. If this references a station, then this rule applies to all stops within the station. |
+| `transfer_type` | Required | `0` or blank means the recommended transfer point, `1` means the secondary vehicle will wait for the first, `2` means a minimum amount of time is required, `3` means transfer is not possible. |
+| `min_transfer_time` | Optional | If the `transfer_type` value is `2` then this value must be specified. It indicates the number of seconds required to transfer between the given stops. |
+
+It is also possible that records in this file are specified for
+ticketing reasons. For instance, some train stations are set up so that
+passengers can transfer between routes without needing to validate their
+ticket again or buy a transfer. Other stations that are shared between
+those same routes might not have this open transfer area, thereby
+requiring you to exit one route fully before buying another ticket to
+access the second.
+
+### Sample Data
+
+The following table shows some sample transfer rules from the GTFS feed
+of TriMet in Portland (<https://openmobilitydata.org/p/trimet>).
+
+| `from_stop_id` | `to_stop_id` | `transfer_type` | `min_transfer_time` |
+| :------------- | :----------- | :-------------- | :------------------ |
+| 7807 | 5020 | 0 | |
+| 7807 | 7634 | 0 | |
+| 7807 | 7640 | 0 | |
+
+These rules indicate that if you are transferring from a route that
+visits stop `7807` to any route that visits the other stops (`5020`,
+`7634` or `7640`), then this is the ideal place to do it.
+
+In other words, if there are other locations along the first route where
+you could transfer to the second route, then those stops should not be
+used. These rules say this is the best place to transfer.
+
+Consider the transfer rule in the following table, taken from the New
+York City Subway GTFS feed (<https://openmobilitydata.org/p/mta/79>).
+
+| `from_stop_id` | `to_stop_id` | `transfer_type` | `min_transfer_time` |
+| :------------- | :----------- | :-------------- | :------------------ |
+| 121 | 121 | 2 | 180 |
+
+In this data, the MTA specifies how long it takes to transfer to
+different platforms within the same station. The stop with ID 121 refers
+to the 86th St station (as specified in `stops.txt`). It has a
+`location_type` of `1` and two stops within it (`121N` and
+`121S`). The above transfer rule says that if you need to transfer
+from `121N` to `121S` (or vice-versa) then a minimum time of 3
+minutes (180 seconds) must be allocated.
+
+If you were to calculate the time taken to transfer using the
+coordinates of each of these platforms, it would only take a few seconds
+as they are physically close to each other. In reality though, you must
+exit one platform then walk around and enter the other platform (often
+having to use stairs).
+
diff --git a/.idea/gtfs-book/ch-12-feed-information.md b/.idea/gtfs-book/ch-12-feed-information.md
new file mode 100644
index 0000000..ffad2ab
--- /dev/null
+++ b/.idea/gtfs-book/ch-12-feed-information.md
@@ -0,0 +1,36 @@
+## 12. Feed Information (feed_info.txt)
+
+*This file is ***optional*** in a GTFS feed.*
+
+Feed providers can include additional information about a feed using
+`feed_info.txt`. It should only ever have a single row (other than the
+CSV header row).
+
+| Field | Required? | Description |
+| :----- | :-------- | :---------- |
+| `feed_publisher_name` | Required | The name of the organization that publishes the feed. This may or may not be the same as any agency in `agency.txt`. |
+| `feed_publisher_url` | Required | The URL of the feed publisher's web site. |
+| `feed_lang` | Required | This specifies the language used in the feed. If an agency also has a language specified, then the agency's value should override this value. |
+| `feed_start_date` | Optional | This value is a date in `YYYYMMDD` format that asserts the data in this feed is valid from this date. If specified, it typically matches up with the earliest date in `calendar.txt` or `calendar_dates.txt`, but if it is earlier, this is explicitly saying there are no services running between this date and the earliest service date. |
+| `feed_end_date` | Optional | This value is a date in `YYYYMMDD` format that asserts the data in this feed is valid until this date. If specified, it typically matches up with the latest date in `calendar.txt` or `calendar_dates.txt`, but if it is later, this is explicitly saying there are no services running between the latest service date and this date. |
+| `feed_version` | Optional | A string that indicates the version of this feed. This can be useful to let feed publishers know whether the latest version of their feed has been incorporated. |
+
+### Sample Data
+
+The following sample data is taken from the GTFS feed of TriMet in
+Portland (<https://openmobilitydata.org/p/trimet>).
+
+| `feed_publisher_name` | `feed_publisher_url` | `feed_lang` | `feed_start_date` | `feed_end_date` | `feed_version` |
+| :-------------------- | :-------------------------------------- | :---------- | :---------------- | :-------------- | :---------------- |
+| TriMet | [http://trimet.org](http://trimet.org/) | en | | | 20140121-20140421 |
+
+In this example, TriMet do not include the start or end dates, meaning
+you should derive the dates this feed is active for by the dates in
+`calendar_dates.txt` (this particular feed does not have a
+`calendar.txt` file).
+
+TriMet use date stamps to indicate the feed version. This feed was
+published on 21 January 2014 and includes data up until 21 April 2014,
+so it appears they use the first/last dates as a way to specify their
+version. Each agency has its own method.
+
diff --git a/.idea/gtfs-book/ch-13-importing-to-sql.md b/.idea/gtfs-book/ch-13-importing-to-sql.md
new file mode 100644
index 0000000..e96517c
--- /dev/null
+++ b/.idea/gtfs-book/ch-13-importing-to-sql.md
@@ -0,0 +1,125 @@
+## 13. Importing a GTFS Feed to SQL
+
+One of the great things about GTFS is that it is already in a format
+conducive to being used in an SQL database. The presence of various IDs
+in each of the different files makes it easy to join the tables in order
+to extract the data you require.
+
+To try this yourself, download `GtfsToSql`
+(<https://github.com/OpenMobilityData/GtfsToSql>). This is a Java
+command-line application that imports a GTFS feed to an SQLite database.
+This application also supports PostgreSQL, but the examples used here
+are for SQLite.
+
+The pre-compiled `GtfsToSql` Java archive can be downloaded from its
+GitHub repository at
+<https://github.com/OpenMobilityData/GtfsToSql/tree/master/dist>.
+
+To use `GtfsToSql`, all you need is an extracted GTFS feed. The
+following instructions demonstrate how to import the TriMet feed
+that has been referenced throughout this book.
+
+Firstly, download and extract the feed. The following commands use curl
+to download the file, then unzip to extract the file to a sub-directory
+called `trimet`.
+
+```
+$ curl http://developer.trimet.org/schedule/gtfs.zip > gtfs.zip
+
+$ unzip gtfs.zip -d trimet/
+```
+
+To create an SQLite database from this feed, the following command can
+be used.
+
+```
+$ java -jar GtfsToSql.jar -s jdbc:sqlite:./db.sqlite -g ./trimet
+```
+
+This may take a minute or two to complete (you will see progress as it
+imports the feed and then creates indexes), and at the end you will have
+a GTFS database in a file called `db.sqlite`. You can then query this
+database with the command-line `sqlite3` tool, as shown in the
+following example.
+
+```
+$ sqlite3 db.sqlite
+sqlite> SELECT * FROM agency;
+|TriMet|http://trimet.org|America/Los_Angeles|en|503-238-7433
+
+sqlite> SELECT * FROM routes WHERE route_type = 0;
+90|||MAX Red Line||0|http://trimet.org/schedules/r090.htm||
+100|||MAX Blue Line||0|http://trimet.org/schedules/r100.htm||
+190|||MAX Yellow Line||0|http://trimet.org/schedules/r190.htm||
+193||Portland Streetcar|NS Line||0|http://trimet.org/schedules/r193.htm||
+194||Portland Streetcar|CL Line||0|http://trimet.org/schedules/r194.htm||
+200|||MAX Green Line||0|http://trimet.org/schedules/r200.htm||
+250|||Vintage Trolley||0|http://trimet.org/schedules/r250.htm||
+```
+
+The first query above finds all agencies stored in the database, while
+the second finds all routes marked as Light Rail (`route_type` of
+`0`).
+
+**Note: In the following chapters there are more SQL examples. All of
+these examples are geared towards running on an SQLite database that has
+been created in this manner.**
+
+All tables in this database match up with the corresponding GTFS
+filename (so for `agency.txt`, the table name is `agency`, while for
+`stop_times.txt` the table name is `stop_times`). The columns in SQL
+have the same name as the value in the corresponding GTFS file.
+
+***Note:** All data imported using this tool is stored as text in the
+database. This means you may need to be careful when querying integer
+data. For example, ordering stop times by `stop_sequence` may not produce
+expected results (for instance, `29` as a string sorts before `3`).
+Although it is a performance hit, you can change this behavior by
+casting the value to an integer, such as `ORDER BY stop_sequence + 0`.
+`GtfsToSql` works this way because it is intended as a lightweight
+tool for quickly querying GTFS data. I recommend rolling your own
+importer to treat data exactly as you need it, especially in conjunction
+with some of the optimization techniques recommended later in this book.*
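+
+The ordering problem is easy to reproduce. The following Python sketch
+uses the standard `sqlite3` module with a small in-memory table (the
+rows are made up for illustration) whose columns are all text, as they
+would be after an import of this kind.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")
+# Every column is TEXT, mirroring how the imported data is stored.
+conn.execute("CREATE TABLE stop_times (trip_id TEXT, stop_sequence TEXT)")
+conn.executemany(
+    "INSERT INTO stop_times VALUES (?, ?)",
+    [("T1", "3"), ("T1", "29"), ("T1", "1")],
+)
+
+# Plain ORDER BY compares the values as strings.
+as_text = [row[0] for row in conn.execute(
+    "SELECT stop_sequence FROM stop_times ORDER BY stop_sequence")]
+
+# Adding 0 forces a numeric comparison.
+as_number = [row[0] for row in conn.execute(
+    "SELECT stop_sequence FROM stop_times ORDER BY stop_sequence + 0")]
+
+print(as_text)    # ['1', '29', '3']
+print(as_number)  # ['1', '3', '29']
+```
+
+Storing the column as an integer in the first place avoids both the
+surprise and the per-row conversion.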
+
+### File Encodings
+
+The GTFS specification does not indicate whether files should be encoded
+using UTF-8, ISO-8859-1 or otherwise. Since a GTFS feed is not
+necessarily in English, you must be willing to handle an extended
+character set.
+
+The GtfsToSql tool introduced above automatically detects the encoding
+of each file using the juniversalchardet Java library
+(<https://code.google.com/p/juniversalchardet/>).
+
+I recommend you take some time looking at the source code of GtfsToSql
+to further understand this so you are aware of handling encodings
+correctly if you write your own parser.
+
+### Optimizing GTFS Feeds
+
+If you are creating a database that is to be distributed onto a mobile
+device such as an iPhone or Android phone, then disk space and
+computational power are at a premium. Even if you are setting up a
+database to be queried on a server only, making the database
+perform as quickly as possible is still important.
+
+In the following chapters are techniques for optimizing GTFS feeds.
+There are many techniques that can be applied to improve the performance
+of GTFS, such as:
+
+* Using integer identifiers rather than string identifiers (for route
+ IDs, trip IDs, stop IDs, etc.) and creating appropriate indexes
+* Removing redundant shape points and encoding shapes
+* Deleting unused data
+* Reusing repeating trip patterns.
+
+Changing the data to use integer IDs makes the greatest improvement to
+performance, but the other techniques also help significantly.
+
+Depending on your needs, there are other optimizations that can be made
+to reduce file size and speed up querying of the data, but the ease of
+implementing them may depend on your database storage system and the
+programming language used to query the data. The above list is a good
+starting point.
+
diff --git a/.idea/gtfs-book/ch-14-switching-to-integer-ids.md b/.idea/gtfs-book/ch-14-switching-to-integer-ids.md
new file mode 100644
index 0000000..4d3914a
--- /dev/null
+++ b/.idea/gtfs-book/ch-14-switching-to-integer-ids.md
@@ -0,0 +1,111 @@
+## 14. Switching to Integer IDs
+
+There are a number of instances in a GTFS feed where IDs are used, such
+as to identify routes, trips, stops and shapes. There are no specific
+guidelines in GTFS as to the type of data or length an ID can be. As
+such, IDs in some GTFS feeds may be anywhere up to 30 or 40 characters
+long.
+
+Using long strings as IDs is extremely inefficient as they make the size
+of a database much larger than it needs to be, as well as making
+querying the data much slower.
+
+To demonstrate, consider a GTFS feed where trip IDs are 30 characters
+long. If there are 10,000 trips, each with an average of 30 stops in
+`stop_times.txt`, then the IDs alone take up about 9.3 MB of storage
+(roughly 9 MB in `stop_times.txt` plus 0.3 MB in `trips.txt`, at one
+byte per character). Realistically speaking, you need to index the
+`trip_id` field in order to look up a trip's stop times quickly, which
+uses even more space.
+
+The following SQL statements show how you might represent GTFS without
+optimizing the identifiers. For brevity, not all fields from the GTFS
+feed are included here.
+
+```sql
+CREATE TABLE trips (
+ trip_id TEXT,
+ route_id TEXT,
+ service_id TEXT
+);
+
+CREATE INDEX trips_trip_id ON trips (trip_id);
+
+CREATE INDEX trips_route_id ON trips (route_id);
+
+CREATE INDEX trips_service_id ON trips (service_id);
+
+CREATE TABLE stop_times (
+ trip_id TEXT,
+ stop_id TEXT,
+ stop_sequence INTEGER
+);
+
+CREATE INDEX stop_times_trip_id ON stop_times (trip_id);
+
+CREATE INDEX stop_times_stop_id ON stop_times (stop_id);
+```
+
+If you were to add an integer column to `trips` called, say,
+`trip_index`, then you can reference that value from `stop_times`
+instead of `trip_id`. The following SQL statements show this.
+
+```sql
+CREATE TABLE trips (
+ trip_id TEXT,
+ trip_index INTEGER,
+ route_id TEXT,
+ service_id TEXT
+);
+
+CREATE INDEX trips_trip_id ON trips (trip_id);
+CREATE INDEX trips_trip_index ON trips (trip_index);
+CREATE INDEX trips_route_id ON trips (route_id);
+CREATE INDEX trips_service_id ON trips (service_id);
+
+CREATE TABLE stop_times (
+ trip_index INTEGER,
+ stop_id TEXT,
+ stop_sequence INTEGER
+);
+
+CREATE INDEX stop_times_trip_index ON stop_times (trip_index);
+
+CREATE INDEX stop_times_stop_id ON stop_times (stop_id);
+```
+
+This results in a significant space saving (when you consider how large
+`stop_times` can be), as well as being far quicker to look up stop
+times based on a trip ID. Note that the original `trip_id` value is
+retained so it can be referenced if required.
+
+Without adding `trip_index`, you would use the following query to find
+stop times given a trip ID.
+
+```sql
+SELECT * FROM stop_times
+ WHERE trip_id = 'SOME_LONG_TRIP_ID'
+ ORDER BY stop_sequence;
+```
+
+With the addition of `trip_index`, you need to first find the record
+in `trips`. This can be achieved using the following query. This is a
+small sacrifice compared to performing string comparison on all stop
+times.
+
+```sql
+SELECT * FROM stop_times
+ WHERE trip_index = (
+ SELECT trip_index FROM trips WHERE trip_id = 'SOME_LONG_TRIP_ID'
+ )
+ ORDER BY stop_sequence;
+```
+
+You can make the same change for the other IDs in the feed, such as
+`route_id` and `stop_id`. For these columns you still keep (and
+index) the original values in `routes` and `stops` respectively,
+since you may still need to look up records based on these values.
+
+***Note:** Even though this book recommends optimizing feeds in this
+manner, the remainder of examples in this book only use their original
+IDs, in order to simplify the examples and to ensure compatibility with
+the `GtfsToSql` tool introduced previously.*
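+
+If you write your own importer, assigning the integer values is
+straightforward. The following Python sketch is one possible approach,
+not the method used by any particular tool: it keeps a dictionary from
+each original `trip_id` to the next unused integer and writes only that
+integer to `stop_times`. The sample rows are made up for illustration.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")
+conn.executescript("""
+    CREATE TABLE trips (trip_id TEXT, trip_index INTEGER,
+                        route_id TEXT, service_id TEXT);
+    CREATE TABLE stop_times (trip_index INTEGER, stop_id TEXT,
+                             stop_sequence INTEGER);
+""")
+
+# Rows as they might be read from trips.txt and stop_times.txt.
+raw_trips = [("A-VERY-LONG-TRIP-ID-0001", "R1", "S1"),
+             ("A-VERY-LONG-TRIP-ID-0002", "R1", "S1")]
+raw_stop_times = [("A-VERY-LONG-TRIP-ID-0001", "100", 1),
+                  ("A-VERY-LONG-TRIP-ID-0001", "101", 2),
+                  ("A-VERY-LONG-TRIP-ID-0002", "100", 1)]
+
+# Give each trip the next integer, starting from 1.
+trip_index = {}
+for trip_id, route_id, service_id in raw_trips:
+    trip_index[trip_id] = len(trip_index) + 1
+    conn.execute("INSERT INTO trips VALUES (?, ?, ?, ?)",
+                 (trip_id, trip_index[trip_id], route_id, service_id))
+
+# Store only the integer in stop_times.
+for trip_id, stop_id, stop_sequence in raw_stop_times:
+    conn.execute("INSERT INTO stop_times VALUES (?, ?, ?)",
+                 (trip_index[trip_id], stop_id, stop_sequence))
+
+rows = conn.execute(
+    """SELECT stop_id FROM stop_times
+       WHERE trip_index = (
+           SELECT trip_index FROM trips WHERE trip_id = ?
+       )
+       ORDER BY stop_sequence""",
+    ("A-VERY-LONG-TRIP-ID-0001",)).fetchall()
+print(rows)
+```
+
+The same dictionary-based mapping can be repeated for `route_id` and
+`stop_id`.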
+
diff --git a/.idea/gtfs-book/ch-15-optimizing-shapes.md b/.idea/gtfs-book/ch-15-optimizing-shapes.md
new file mode 100644
index 0000000..fbb1125
--- /dev/null
+++ b/.idea/gtfs-book/ch-15-optimizing-shapes.md
@@ -0,0 +1,158 @@
+## 15. Optimizing Shapes
+
+Shape data in a GTFS feed (that is, the records from `shapes.txt`)
+represents a large amount of data. There are a number of ways to reduce
+this data, which can help to:
+
+* Speed up data retrieval
+* Reduce the amount of data to transmit to app / web site users
+* Speed up rendering of the shape onto a map (such as a native mobile
+ map or a JavaScript map).
+
+Two ways to reduce shape data are as follows:
+
+* **Reducing the number of points in a shape.** The shapes included in
+ GTFS are often very precise and include a number of redundant
+ points. Many of these can be removed without a noticeable loss of
+ shape quality using the *Douglas-Peucker Algorithm*.
+* **Encoding all points in a shape into a single value.** The *Encoded
+ Polyline Algorithm* used in the Google Maps JavaScript API can also
+ be used with GTFS shapes. This reduces the amount of storage
+ required and also makes looking up all points in a shape far
+ quicker.
+
+### Reducing Points in a Shape
+
+Many of the shapes you find in GTFS feeds are extremely detailed. They
+often follow the exact curvature of the road and may consist of hundreds
+or thousands of points for a trip that might have only 30 or 40 stops.
+
+While this level of detail is useful, the sheer amount of data required
+to be rendered on a map can be a massive performance hit from the
+perspective of retrieving the data as well as rendering on a map.
+Realistically, shapes do not need this much detail in order to convey
+their message to your users.
+
+Consider the following shape from Portland that has been rendered using
+Google Maps. The total shape consists of 1913 points.
+
+![Original Shape](images/shape-original.jpg)
+
+Compare this now to the same shape that has had redundant points
+removed. The total number of points in this shape is 175, which
+represents about a 90% reduction.
+
+![Reduced Shape](images/shape-reduced.jpg)
+
+If you look closely, you can see some minor loss of detail, but for the
+most part, the shapes are almost identical.
+
+This reduction in points can be achieved using the Douglas-Peucker
+Algorithm. It does so by discarding points that do not deviate
+significantly from the line between their surrounding points.
+
+The Douglas-Peucker Algorithm works as follows:
+
+* Begin with the first and last points in the path (A and B). These
+ are always kept.
+* Find the point between the first and last that is furthest away from
+ the line joining the first and last points (the orthogonal distance --
+ see the figure below).
+* If this distance is greater than the allowed distance (the tolerance
+ level), the point is kept (call it X). Otherwise, every point between
+ A and B is discarded.
+* If X was kept, repeat this algorithm twice: once using A as the first
+ point and X as the last point, then again using X as the first point
+ and B as the last point.
+
+This algorithm is recursive, and continues until all points have been
+checked.
+
+***Note:** The tolerance level determines how aggressively points are
+removed. A higher tolerance value is more aggressive and discards more
+data, while a lower tolerance discards less data.*
+
+The following diagram shows what orthogonal distance means.
+
+![Orthogonal Distance](images/orthogonal-distance.jpg)
+
+The following resources provide more information about the
+Douglas-Peucker Algorithm and describe how to implement it in your own
+systems:
+
+* <http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm>
+* <http://www.loughrigg.org/rdp/>
+* <http://stackoverflow.com/questions/2573997/reduce-number-of-points-in-line>.
+
+You can often discard about 80-90% of all shape data before seeing a
+significant loss of line detail.
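+
+The steps described earlier can be sketched in Python as follows. This is
+a minimal illustration rather than a production implementation: it
+treats coordinates as points on a flat plane (so for latitude and
+longitude the tolerance is expressed in degrees), and the function names
+are chosen for this example only.
+
+```python
+import math
+
+
+def perpendicular_distance(point, start, end):
+    """Distance from point to the line through start and end (flat plane)."""
+    (x, y), (x1, y1), (x2, y2) = point, start, end
+    dx, dy = x2 - x1, y2 - y1
+    if dx == 0 and dy == 0:
+        # start and end coincide, so fall back to point-to-point distance.
+        return math.hypot(x - x1, y - y1)
+    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)
+
+
+def simplify(points, tolerance):
+    """Return a reduced copy of points using the Douglas-Peucker algorithm."""
+    if len(points) < 3:
+        return list(points)
+    first, last = points[0], points[-1]
+    # Find the intermediate point furthest from the line joining the ends.
+    index, furthest = 0, 0.0
+    for i in range(1, len(points) - 1):
+        distance = perpendicular_distance(points[i], first, last)
+        if distance > furthest:
+            index, furthest = i, distance
+    if furthest <= tolerance:
+        # Nothing deviates enough: keep only the two end points.
+        return [first, last]
+    # Keep the furthest point and simplify each side of it.
+    left = simplify(points[:index + 1], tolerance)
+    right = simplify(points[index:], tolerance)
+    return left[:-1] + right  # drop the duplicated middle point
+
+
+path = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
+print(simplify(path, 0.5))  # [(0, 0), (2, 0), (3, 2), (4, 0)]
+print(simplify(path, 3))    # [(0, 0), (4, 0)]
+```
+
+Because the function calls itself, a very long and unevenly split shape
+could reach Python's recursion limit. An iterative version using an
+explicit stack avoids this.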
+
+### Encoding Shape Points
+
+A single entry in `shapes.txt` corresponds to a single point in a
+single shape. Each entry includes a shape ID, a latitude and longitude.
+
+***Note:** The `shape_dist_traveled` field is also included, but you do not
+strictly need to use this field (nor the corresponding field in
+`stop_times.txt`). The technique described in this section will not work
+if you intend to use `shape_dist_traveled`.*
+
+This means if you want to look up a shape by its ID, you may need to
+retrieve several hundreds of rows from a database. Using the Encoded
+Polyline Algorithm you can change your GTFS database so each shape is
+represented by a single row in a database. This means the shape can be
+found much more quickly and much less data needs to be processed to
+determine the shape.
+
+Consider the following data, taken from TriMet's `shapes.txt` file.
+This data represents the first five points of a shape.
+
+| `shape_id` | `shape_pt_lat` | `shape_pt_lon` | `shape_pt_sequence` | `shape_dist_traveled` |
+| :--------- | :------------- | :------------- | :------------------- | :--------------------- |
+| `185328` | `45.52291` | `-122.677372` | `1` | `0.0` |
+| `185328` | `45.522921` | `-122.67737` | `2` | `3.7` |
+| `185328` | `45.522991` | `-122.677432` | `3` | `34.0` |
+| `185328` | `45.522992` | `-122.677246` | `4` | `81.5` |
+| `185328` | `45.523002` | `-122.676567` | `5` | `255.7` |
+
+If you apply the Encoded Polyline Algorithm to this data, the
+coordinates can be represented using the following string.
+
+```
+eeztGrlwkVAAML?e@AgC
+```
+
+To learn how to arrive at this value, you can read up on the Encoded
+Polyline Algorithm at
+<https://developers.google.com/maps/documentation/utilities/polylinealgorithm>.
+
+Instead of having every single shape point in a single table, you can
+create a table that has one record per shape. The following SQL
+statement is a way you could achieve this.
+
+```sql
+CREATE TABLE shapes (
+ shape_id TEXT,
+ encoded_shape TEXT
+);
+```
+
+The following table shows how this data could be represented in a
+database.
+
+| `shape_id` | `encoded_shape` |
+| :--------- | :--------------------- |
+| `185328` | `eeztGrlwkVAAML?e@AgC` |
+
+Storing the shape in this manner means you can retrieve an entire shape
+by looking up only one database row and running it through your decoder.
+
+To further demonstrate how both the encoding and decoding works, try out
+the polyline utility at
+<https://developers.google.com/maps/documentation/utilities/polylineutility>.
+
+You can find implementations for encoding and decoding points for
+various languages at the following locations:
+
+* <http://facstaff.unca.edu/mcmcclur/GoogleMaps/EncodePolyline/>
+* <https://github.com/emcconville/google-map-polyline-encoding-tool>
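+
+For reference, the encoding side is short enough to sketch in Python.
+This version follows the steps in Google's description of the
+algorithm and rounds each coordinate to five decimal places. It is an
+illustrative sketch with made-up function names. Encoders that truncate
+instead of rounding can produce slightly different strings for the same
+input.
+
+```python
+def encode_value(value):
+    """Encode one signed integer as a run of printable ASCII characters."""
+    # Shift left one bit; invert negatives so the sign ends up in bit 0.
+    value = ~(value << 1) if value < 0 else value << 1
+    chunks = []
+    while value >= 0x20:
+        # Emit five bits at a time, flagging that more chunks follow.
+        chunks.append(chr((0x20 | (value & 0x1F)) + 63))
+        value >>= 5
+    chunks.append(chr(value + 63))
+    return "".join(chunks)
+
+
+def encode_polyline(points):
+    """Encode (latitude, longitude) pairs, storing each point as a delta."""
+    encoded = []
+    previous_lat = previous_lon = 0
+    for lat, lon in points:
+        lat_e5, lon_e5 = round(lat * 1e5), round(lon * 1e5)
+        encoded.append(encode_value(lat_e5 - previous_lat))
+        encoded.append(encode_value(lon_e5 - previous_lon))
+        previous_lat, previous_lon = lat_e5, lon_e5
+    return "".join(encoded)
+
+
+# Example coordinates from Google's description of the algorithm.
+encoded = encode_polyline([(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)])
+print(encoded)  # _p~iF~ps|U_ulLnnqC_mqNvxq`@
+```
+
+Each point after the first is stored as the difference from the previous
+point, which is why closely spaced shape points encode to only a few
+characters each.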
+
diff --git a/.idea/gtfs-book/ch-16-deleting-unused-data.md b/.idea/gtfs-book/ch-16-deleting-unused-data.md
new file mode 100644
index 0000000..f2a2fc0
--- /dev/null
+++ b/.idea/gtfs-book/ch-16-deleting-unused-data.md
@@ -0,0 +1,106 @@
+## 16. Deleting Unused Data
+
+Once you have imported a GTFS feed into a database, it is possible for
+there to be a lot of redundant data. This can ultimately slow down any
+querying of that data as well as bloating the size of the database. If
+you are making an app where the GTFS database is queried on the device
+then disk space and computational time are at a premium, so you must do
+what you can to reduce resource usage.
+
+The first thing to check for is expired services. You can do this by
+searching `calendar.txt` for entries that expire before today's date.
+Be aware though, you also need to ensure there are no
+`calendar_dates.txt` entries overriding these services (a service
+could have an `end_date` of, say, `20140110`, but also have a
+`calendar_dates.txt` entry for `20140131`).
+
+Firstly, find service IDs in `calendar_dates.txt` that are still
+active using the following query.
+
+```sql
+SELECT service_id FROM calendar_dates WHERE date >= '20140110';
+```
+
+***Note:** In order to improve performance of searching for dates, you
+should import the date field in `calendar_dates.txt` as an integer, as
+well as `start_date` and `end_date` in `calendar.txt`.*
+
+Any services matched in this query should not be removed. You can then
+find service IDs in `calendar.txt` with the following SQL query.
+
+```sql
+SELECT * FROM calendar WHERE end_date < '20140110'
+ AND service_id NOT IN (
+ SELECT service_id FROM calendar_dates WHERE date >= '20140110'
+ );
+```
+
+Before deleting these services, corresponding trips and stop times must
+be removed since you need to know the service ID in order to delete a
+trip. Likewise, stop times must be deleted before trips since you need
+to know the trip IDs to be removed.
+
+```sql
+DELETE FROM stop_times WHERE trip_id IN (
+ SELECT trip_id FROM trips WHERE service_id IN (
+ SELECT service_id FROM calendar WHERE end_date < '20140110'
+ AND service_id NOT IN (
+ SELECT service_id FROM calendar_dates WHERE date >= '20140110'
+ )
+ )
+ );
+```
+
+Now there may be a series of trips with no stop times. Rather than
+repeating the above sub-queries, a more thorough way of removing trips
+is to remove trips with no stop times.
+
+```sql
+DELETE FROM trips WHERE trip_id NOT IN (
+ SELECT DISTINCT trip_id FROM stop_times
+);
+```
+
+With all `service_id` references removed, you can remove the expired
+rows from `calendar.txt` using the following SQL query.
+
+```sql
+DELETE FROM calendar WHERE end_date < '20140110'
+ AND service_id NOT IN (
+ SELECT DISTINCT service_id FROM calendar_dates WHERE date >= '20140110'
+ );
+```
+
+The expired rows in `calendar_dates.txt` can also be removed, which
+can be achieved using the following query.
+
+```sql
+DELETE FROM calendar_dates WHERE date < '20140110';
+```
+
+There may now be some stops that are not used by any trips. These can be
+removed using the following query.
+
+```sql
+DELETE FROM stops WHERE stop_id NOT IN (
+ SELECT DISTINCT stop_id FROM stop_times
+);
+```
+
+Additionally, you can remove unused shapes and routes using the
+following queries.
+
+```sql
+DELETE FROM shapes WHERE shape_id NOT IN (
+ SELECT DISTINCT shape_id FROM trips
+);
+
+DELETE FROM routes WHERE route_id NOT IN (
+ SELECT DISTINCT route_id FROM trips
+);
+```
+
+There are other potential rows that can be removed (such as records in
+`transfers.txt` that reference non-existent stops), but hopefully you
+get the idea from the previous queries.
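+
+As an illustration of the `transfers.txt` case, the following Python
+sketch runs the clean-up against a small in-memory SQLite database. The
+table layout and rows are made up for this example. The `DELETE`
+statement removes any transfer rule where either end refers to a stop
+that no longer exists.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")
+conn.executescript("""
+    CREATE TABLE stops (stop_id TEXT);
+    CREATE TABLE transfers (from_stop_id TEXT, to_stop_id TEXT,
+                            transfer_type TEXT);
+    INSERT INTO stops VALUES ('7807'), ('5020');
+    INSERT INTO transfers VALUES ('7807', '5020', '0'),
+                                 ('7807', '7634', '0');
+""")
+
+# Remove transfer rules where either end is missing from stops.
+conn.execute("""
+    DELETE FROM transfers
+    WHERE from_stop_id NOT IN (SELECT stop_id FROM stops)
+       OR to_stop_id NOT IN (SELECT stop_id FROM stops)
+""")
+
+remaining = conn.execute(
+    "SELECT from_stop_id, to_stop_id FROM transfers").fetchall()
+print(remaining)  # [('7807', '5020')]
+```
+
+Run this after the unused stops have been deleted, so that the rules
+referring to them are caught in the same pass.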
+
diff --git a/.idea/gtfs-book/ch-17-searching-for-trips.md b/.idea/gtfs-book/ch-17-searching-for-trips.md
new file mode 100644
index 0000000..dd0a901
--- /dev/null
+++ b/.idea/gtfs-book/ch-17-searching-for-trips.md
@@ -0,0 +1,346 @@
+## 17. Searching for Trips
+
+This section shows you how to search for trips in a GTFS feed based on
+specified times and stops.
+
+The three scenarios covered are:
+
+* Finding all trips departing from a given stop after a certain time
+* Finding all trips arriving at a given stop before a certain time
+* Finding all trips between two stops after a given time
+
+The first and second scenarios are the simplest, because they only rely
+on a single end of each trip. The third scenario is more complex because
+you have to ensure that each trip returned visits both the start and
+finish stops.
+
+When searching for trips, the first thing you need to know is which
+services are running for the given search time.
+
+### Finding Service IDs
+
+The first step in searching for trips is to determine which services are
+running. To begin with, you need to find service IDs for a given date.
+You then need to handle exceptions accordingly. That is, you need to add
+service IDs and remove service IDs based on the rules in
+`calendar_dates.txt`.
+
+***Note:** In *Scheduling Past Midnight*, you were shown how
+GTFS works with times past midnight. The key takeaway from this is that
+you have to search for trips for two sets of service IDs. This is
+covered as this chapter progresses.*
+
+In Australia, 27 January 2014 (Monday) was the holiday for Australia
+Day. This is used as an example to demonstrate how to retrieve service
+IDs.
+
+Firstly, you need the main set of service IDs for the day. The following
+SQL query achieves this.
+
+```sql
+SELECT service_id FROM calendar
+ WHERE start_date <= '20140127' AND end_date >= '20140127'
+ AND monday = 1;
+
+-- Result: 1, 2, 6, 9, 18, 871, 7501
+```
+
+Next you need to find service IDs that are to be excluded, as achieved
+by the following SQL query.
+
+```sql
+SELECT service_id FROM calendar_dates
+ WHERE date = '20140127' AND exception_type = 2;
+
+-- Result: 1, 2, 6, 9, 18, 871, 7501
+```
+
+Finally, you need to find service IDs that are to be added. This query
+is identical to the previous query, except for the different
+`exception_type` value.
+
+```sql
+SELECT service_id FROM calendar_dates
+ WHERE date = '20140127' AND exception_type = 1;
+
+-- Result: 12, 874, 4304, 7003
+```
+
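+You can combine these three queries all into a single query in SQLite
+using `EXCEPT` and `UNION`, as shown in the following SQL query. SQLite
+groups compound operators from left to right, so the added services are
+merged in before the excluded services are removed.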
+
+```sql
+SELECT service_id FROM calendar
+ WHERE start_date <= '20140127' AND end_date >= '20140127'
+ AND monday = 1
+
+UNION
+
+SELECT service_id FROM calendar_dates
+ WHERE date = '20140127' AND exception_type = 1
+
+EXCEPT
+
+SELECT service_id FROM calendar_dates
+ WHERE date = '20140127' AND exception_type = 2;
+
+-- Result: 12, 874, 4304, 7003
+```
+
+Now when you search for trips, only trips that have a matching
+`service_id` value are included.
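+
+In application code you usually need this lookup for an arbitrary date,
+which means choosing the weekday column at run time. The following
+Python sketch shows one way this could be done with the standard
+`sqlite3` module. The function name and the sample rows are made up for
+illustration, and all columns are text, as in a `GtfsToSql` import.
+
+```python
+import sqlite3
+from datetime import date
+
+WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday",
+            "friday", "saturday", "sunday")
+
+
+def active_service_ids(conn, service_date):
+    """Return the service IDs running on service_date (a datetime.date)."""
+    day_column = WEEKDAYS[service_date.weekday()]  # Monday is 0
+    ymd = service_date.strftime("%Y%m%d")
+    # day_column comes from the fixed tuple above, never from user input.
+    query = f"""
+        SELECT service_id FROM calendar
+          WHERE start_date <= :ymd AND end_date >= :ymd
+            AND {day_column} = 1
+        UNION
+        SELECT service_id FROM calendar_dates
+          WHERE date = :ymd AND exception_type = 1
+        EXCEPT
+        SELECT service_id FROM calendar_dates
+          WHERE date = :ymd AND exception_type = 2
+    """
+    return sorted(row[0] for row in conn.execute(query, {"ymd": ymd}))
+
+
+# Sample data: service 1 runs on weekdays, service 12 on Sundays.
+conn = sqlite3.connect(":memory:")
+conn.executescript("""
+    CREATE TABLE calendar (service_id TEXT, monday TEXT, tuesday TEXT,
+        wednesday TEXT, thursday TEXT, friday TEXT, saturday TEXT,
+        sunday TEXT, start_date TEXT, end_date TEXT);
+    CREATE TABLE calendar_dates (service_id TEXT, date TEXT,
+        exception_type TEXT);
+    INSERT INTO calendar VALUES
+        ('1', '1', '1', '1', '1', '1', '0', '0', '20140101', '20140331'),
+        ('12', '0', '0', '0', '0', '0', '0', '1', '20140101', '20140331');
+    INSERT INTO calendar_dates VALUES
+        ('1', '20140127', '2'), ('12', '20140127', '1');
+""")
+
+holiday = active_service_ids(conn, date(2014, 1, 27))
+tuesday = active_service_ids(conn, date(2014, 1, 28))
+print(holiday, tuesday)  # ['12'] ['1']
+```
+
+On the holiday the weekday service is replaced by the Sunday service,
+while the following day falls back to the normal weekday service.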
+
+### Finding Trips Departing a Given Stop
+
+In order to determine the list of services above, a base timestamp on
+which to search is needed. For the purposes of this example, assume that
+timestamp is 27 January 2014 at 1 PM (`13:00:00` when using GTFS).
+
+This example searches for all services departing from Adelaide Railway
+Station, which has stop ID `6665` in the Adelaide Metro GTFS feed. To
+find all matching stop times, the following query can be performed.
+
+```sql
+SELECT * FROM stop_times
+ WHERE stop_id = '6665'
+ AND departure_time >= '13:00:00'
+ AND pickup_type = 0
+ ORDER BY departure_time;
+```
+
+This returns a series of stop times that match the given criteria. The
+only problem is it does not yet take into account valid service IDs.
+
+***Note:** This query may also return stop times that are the final stop
+on a trip, which is not useful for somebody trying to find departures.
+You may want to modify your database importer to override the final stop
+time of each trip so its `pickup_type` has a value of `1` (no pick-up) and
+its first stop time so it has a `drop_off_type` of `1` (no drop-off).*
+
+To make sure only the correct trips are returned, join `stop_times`
+with `trips` using `trip_id`, and then include the list of service
+IDs. For the purposes of this example the service IDs, stop ID and
+departure time are being hard-coded. You can either embed a sub-query,
+or include the service IDs via code.
+
+```sql
+SELECT t.*, st.* FROM stop_times st, trips t
+ WHERE st.stop_id = '6665'
+ AND st.trip_id = t.trip_id
+ AND t.service_id IN ('12', '874', '4304', '7003')
+ AND st.departure_time >= '13:00:00'
+ AND st.pickup_type = 0
+ ORDER BY st.departure_time;
+```
+
+This gives you the final list of stop times matching the desired
+criteria. You can then decide specifically which data you need to
+retrieve; you now have the `trip_id`, meaning you can find all stop
+times for a given trip if required.
+
+If you need to restrict the results to only those that occur after the
+starting stop, you can retrieve stop times with only a `stop_sequence`
+larger than that of the stop time returned in the above query.
+
+### Finding Trips Arriving at a Given Stop
+
+In order to find the trips arriving at a given stop before a specified
+time, it is just a matter of making slight modifications to the above
+query. Firstly, check the `arrival_time` instead of
+`departure_time`. Also, check the `drop_off_type` value instead of
+`pickup_type`.
+
+```sql
+SELECT t.*, st.* FROM stop_times st, trips t
+ WHERE st.stop_id = '6665'
+ AND st.trip_id = t.trip_id
+ AND t.service_id IN ('12', '874', '4304', '7003')
+ AND st.arrival_time <= '13:00:00'
+ AND st.drop_off_type = 0
+ ORDER BY st.arrival_time DESC;
+```
+
+For this particular data set, there are trips from four different routes
+returned. If you want to restrict this to a particular route, you can
+filter on the `t.route_id` value.
+
+```sql
+SELECT t.*, st.* FROM stop_times st, trips t
+ WHERE st.stop_id = '6665'
+ AND st.trip_id = t.trip_id
+ AND t.service_id IN ('12', '874', '4304', '7003')
+ AND t.route_id = 'BEL'
+ AND st.arrival_time <= '13:00:00'
+ AND st.drop_off_type = 0
+ ORDER BY st.arrival_time DESC;
+```
+
+### Performance of Searching Text-Based Times
+
+These examples search using text-based arrival/departure times (such as
+`13:00:00`). This works because GTFS times use the `HH:MM:SS` format,
+so zero-padded values sort correctly as text. The specification also
+allows `H:MM:SS` for times earlier than 10 AM; such values must be
+zero-padded on import, otherwise text comparisons give incorrect results.
+
+Doing this kind of comparison (especially if you are scanning millions
+of rows) is quite slow and expensive. It is more efficient to convert
+all times stored in the database to integers that represent the number
+of seconds since midnight.
+
+***Note:** The GTFS specification states that arrival and departure times
+are "noon minus 12 hours" in order to account for daylight savings time.
+This is effectively midnight, except for the days that daylight savings
+starts or finishes.*
+
+In order to achieve this, you can convert the text-based time to an
+integer with `H * 3600 + M * 60 + S`. For example, `13:35:21` can
+be converted using the following steps.
+
+```
+ (13 * 3600) + (35 * 60) + (21)
+= 46800 + 2100 + 21
+= 48921
+```
+
+You can then convert back to hours, minutes and seconds in order to
+generate timestamps in your application as shown in the following
+algorithm.
+
+```
+H = floor( 48921 / 3600 )
+ = floor( 13.59 )
+ = 13
+
+M = floor( 48921 / 60 ) % 60
+ = floor( 815.35 ) % 60
+ = 815 % 60
+ = 35
+
+S = 48921 % 60
+ = 21
+```
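+
+The two conversions above can be sketched in Python as follows. The
+function names `time_to_seconds` and `seconds_to_time` are illustrative
+only and are not part of GTFS or any library. The hour value is
+deliberately allowed to exceed 23, since GTFS times can run past
+midnight.
+
+```python
+def time_to_seconds(value: str) -> int:
+    """Convert a GTFS time such as '13:35:21' to seconds since midnight."""
+    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
+    return hours * 3600 + minutes * 60 + seconds
+
+
+def seconds_to_time(total: int) -> str:
+    """Convert seconds since midnight back to HH:MM:SS (hours may exceed 23)."""
+    hours = total // 3600
+    minutes = (total // 60) % 60
+    seconds = total % 60
+    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
+```
+
+Running `time_to_seconds` once at import time and storing the result in
+an integer column means each search compares integers rather than
+strings.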
+
+### Finding Trips Between Two Stops
+
+Now that you know how to look up a trip from or to a given stop, the
+previous query can be expanded so both the start and finish stop are
+specified. The following example finds trips that depart after 1 PM.
+Only trips departing from *Adelaide Railway Station* (stop ID `6665`)
+are returned. Additionally, only trips that then visit
+*Blackwood Railway Station* (stop IDs `6670` and `101484`) are
+included.
+
+In order to achieve this, the following changes must be made to the
+previous examples.
+
+* **Join against stop_times twice.** Once for the departure stop time
+ and once for the arrival stop time.
+* **Allow for multiple stop IDs at one end.** The destination in this
+ example has two platforms, so you need to check both of them.
+* **Ensure the departure time is earlier than the arrival time.**
+ Otherwise trips heading in the opposite direction may also be
+ returned.
+
+The following query demonstrates how this is achieved.
+
+```sql
+SELECT t.*, st1.*, st2.*
+ FROM trips t, stop_times st1, stop_times st2
+ WHERE st1.trip_id = t.trip_id
+ AND st2.trip_id = t.trip_id
+ AND st1.stop_id = '6665'
+ AND st2.stop_id IN ('6670', '101484')
+ AND t.service_id IN ('12', '874', '4304', '7003')
+ AND st1.departure_time >= '13:00:00'
+ AND st1.pickup_type = 0
+ AND st2.drop_off_type = 0
+ AND st1.departure_time < st2.arrival_time
+ ORDER BY st1.departure_time;
+```
+
+In this example, the table alias `st1` is used for the departure stop
+time. Once again, the stop ID must match, as well as the departure time
+and pick-up type.
+
+For the arrival stop time the alias `st2` is used. This table also
+joins the `trips` table using `trip_id`. Since the destination has
+multiple stop IDs, the SQL `IN` construct is used. The arrival time is
+not important in this example, so only the departure is checked.
+
+The final thing to check is that the departure occurs before the
+arrival. If you do not perform this step, then trips traveling in the
+opposite direction may also be returned.
+
+***Note:** Technically, there may be multiple results returned for the
+same trip. For some transit agencies, a single trip may visit a stop
+more than once. If this is the case, you should also check the trip
+duration (arrival time minus departure time) and use the shortest trip
+when the same trip is returned multiple times.*
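+
+A minimal sketch of this de-duplication step is shown below. It assumes
+each result row has already been converted to a dictionary holding
+`trip_id`, `departure_secs` and `arrival_secs` (integer seconds); these
+key names and the `shortest_per_trip` function are hypothetical.
+
+```python
+def shortest_per_trip(rows):
+    """Keep, for each trip_id, the row with the smallest travel time."""
+    best = {}
+    for row in rows:
+        duration = row["arrival_secs"] - row["departure_secs"]
+        current = best.get(row["trip_id"])
+        if current is None or duration < (
+            current["arrival_secs"] - current["departure_secs"]
+        ):
+            best[row["trip_id"]] = row
+    # Present the surviving rows in departure order, like the SQL query.
+    return sorted(best.values(), key=lambda r: r["departure_secs"])
+```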
+
+### Accounting for Midnight
+
+As discussed previously, GTFS allows trips to depart or arrive after
+midnight for a given service day without having to specify them as part
+of the next service day. Consequently, while the queries above are
+correct, they do not necessarily paint the full picture.
+
+In reality, when performing a trip search, you need to take into account
+trips that have wrapped times (for instance, where 12:30 AM is specified
+as `24:30:00`). If you want to find trips that depart after 12:30 AM
+on a given day, you need to check for trips departing after
+`00:30:00` on that day, as well as for trips departing after
+`24:30:00` on the previous service day.
+
+This means that for each trip search you are left with two sets of
+trips, which you must then merge and present as appropriate.
+
+***Note:** In reality, agencies generally do not have trips that overlap
+from multiple service days, so technically you often only need one query
+(for example, a train service might end at 12:30 AM then restart on the
+next service day at 4:30 AM). If your app / web site only uses a single
+feed where you can tune your queries manually based on how the agency
+operates, then you can get away with only querying a single service day.
+On the other hand, if you are building a scalable system that works with
+data from many agencies, then you need to check both days.*
+
+To demonstrate how this works in practice, the following example
+searches for all trips that depart after 12:30:00 AM on 14 March 2014.
+The examples earlier in this chapter showed how to find the service IDs
+for a given date. To account for midnight, service IDs for both March 14
+(the "main" service date) and March 13 (the overlapping date) need to be
+determined.
+
+Assume that March 14 has a service ID of `C1` and March 13 has a
+service ID of `C2`. First you need to find the departures for March
+14, as shown in the following query.
+
+```sql
+SELECT t.*, st.* FROM stop_times st, trips t
+ WHERE st.stop_id = 'S1'
+ AND st.trip_id = t.trip_id
+ AND t.service_id IN ('C1')
+ AND st.departure_time >= '00:30:00'
+ AND st.pickup_type = 0
+ ORDER BY st.departure_time;
+```
+
+The resultant list needs to be combined with the trips that depart after
+midnight from the March 13 service. To check this, it is just a matter
+of swapping in the right service IDs, then adding 24 hours to the search
+time.
+
+```sql
+SELECT t.*, st.* FROM stop_times st, trips t
+ WHERE st.stop_id = 'S1'
+ AND st.trip_id = t.trip_id
+ AND t.service_id IN ('C2')
+ AND st.departure_time >= '24:30:00'
+ AND st.pickup_type = 0
+ ORDER BY st.departure_time;
+```
+
+In order to get your final list of trips you must combine the results
+from both of these queries. If you are generating complete timestamps in
+order to present the options to your users, just remember to account for
+the results from the second query being 24 hours later.
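+
+The merge step might look like the following Python sketch. It assumes
+the departure times from each query are available as lists of GTFS time
+strings, and it treats the start of the service day as plain midnight,
+ignoring the "noon minus 12 hours" rule on daylight savings changeover
+days. The function names are hypothetical.
+
+```python
+from datetime import date, datetime, timedelta
+
+
+def departure_timestamp(service_date: date, gtfs_time: str) -> datetime:
+    """Add a GTFS time (hours may exceed 23) to the service date."""
+    hours, minutes, seconds = (int(part) for part in gtfs_time.split(":"))
+    midnight = datetime.combine(service_date, datetime.min.time())
+    return midnight + timedelta(hours=hours, minutes=minutes, seconds=seconds)
+
+
+def merge_departures(today, today_times, previous_day_times):
+    """Combine both result sets into one list of sorted timestamps."""
+    previous_day = today - timedelta(days=1)
+    merged = [departure_timestamp(today, t) for t in today_times]
+    merged += [departure_timestamp(previous_day, t) for t in previous_day_times]
+    return sorted(merged)
+```
+
+Because the previous day's times are expressed as `24:30:00` or later,
+adding them to the previous service date lands them on the correct
+calendar day automatically.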
+
diff --git a/.idea/gtfs-book/ch-18-working-with-trip-blocks.md b/.idea/gtfs-book/ch-18-working-with-trip-blocks.md
new file mode 100644
index 0000000..d59b2c4
--- /dev/null
+++ b/.idea/gtfs-book/ch-18-working-with-trip-blocks.md
@@ -0,0 +1,109 @@
+## 18. Working With Trip Blocks
+
+One of the more complex aspects of GTFS is how to properly use the
+`block_id` field in `trips.txt`. The concept is simple, but
+incorporating this information in a manner that is simple to understand
+for a passenger can be more difficult.
+
+A single vehicle (such as a bus or train) completes multiple trips in a
+single service day. For instance, once a bus completes its trip (Trip X)
+from Location A to Location B, it then begins another trip (Trip Y) from
+Location B to Location C. It then completes a final trip (Trip Z) from
+Location C back to Location A.
+
+If a passenger boards in the middle of Trip X and is allowed to stay on
+the bus until the end of Trip Y, then this should be represented in
+GTFS by giving Trips X, Y and Z the same `block_id` value.
+
+The following diagram shows a vehicle that performs two trips,
+completing opposite directions of the same route. Often a single vehicle
+will do many more than just two trips in a day.
+
+![Block](images/block-01.png)
+
+***Note:** When trips in a block are represented in stop_times.txt, the
+final stop of a trip and the first stop of the subsequent trip must both
+be included, even though they are typically at the same stop (and often
+the same arrival & departure time). The final stop of the first trip is
+drop-off only, while the first stop of the next trip is pick-up only.*
+
+In the following diagram, the vehicle completes three trips, each for
+separate routes. As in the previous diagram, the vehicle will likely
+complete more trips over a single day.
+
+![Block](images/block-02.png)
+
+These two diagrams show the difficulties that can arise when trying to
+calculate trips across multiple blocks. The first diagram is repeated
+below, this time with stops marked so trips between two locations can be
+found.
+
+![Block](images/block-03.png)
+
+Consider the scenario where you want to retrieve all trip times from
+Stop S2 to Stop S1. Since the vehicle gets to Stop S3 then turns around
+and heads back to S1, the trip search returns two options:
+
+1. Board at S2 on the upper trip, travel via S3, then get to S1.
+2. Board at S2 on the lower trip, travel directly to S1.
+
+The second option is a subset of the first option, which means it is a
+far shorter trip (plus it avoids the passenger getting annoyed from
+passing their starting point again). To determine which option to use,
+you can use the trip duration.
+
+Now consider the other type of block formation, where a vehicle
+completes subsequent trips from different routes.
+
+![Block](images/block-04.png)
+
+If a passenger wants to travel from Stop S2 to S4, you will not get the
+situation where the vehicle travels past the same location twice.
+However, it is also possible for a passenger to travel from Stop S2 to
+Stop S6 without ever having to change vehicles.
+
+Referring back to the previous chapter about performing trip searches,
+it is relatively straightforward to account for blocks in these
+searches. Using the query to find trips between two stops as a
+reference, the following changes need to be made:
+
+* Instead of joining both occurrences of `stop_times` to the
+ `trips` table, you now need two occurrences of `trips`.
+* Each `stop_times` table occurrence is joined to a separate
+ `trips` table occurrence.
+* The two occurrences of `trips` are joined on the `block_id`
+ value (if it is available).
+
+The following query shows how to find trips that start at stop `S2`
+and finish at stop `S4`, departing after 1 PM on the day with service
+ID `C1`.
+
+```sql
+SELECT t1.*, t2.*, st1.*, st2.*
+ FROM trips t1, trips t2, stop_times st1, stop_times st2
+ WHERE st1.trip_id = t1.trip_id
+ AND st2.trip_id = t2.trip_id
+ AND st1.stop_id = 'S2'
+ AND st2.stop_id = 'S4'
+ AND t1.service_id = 'C1'
+ AND (
+ t1.trip_id = t2.trip_id
+ OR (
+ LENGTH(t1.block_id) > 0 AND t1.block_id = t2.block_id
+ )
+ )
+ AND st1.departure_time >= '13:00:00'
+ AND st1.pickup_type = 0
+ AND st2.drop_off_type = 0
+ AND st1.departure_time < st2.arrival_time
+ ORDER BY st1.departure_time;
+```
+
+Since `block_id` may be unpopulated, the query matches trips on either
+the `trip_id` or the `block_id`.
+
+***Note:** An alternative is to guarantee every single trip has a
+`block_id` value when importing -- even if some blocks only consist of
+a single trip. If you can guarantee this condition, then you can
+simplify this query by using `t1.block_id = t2.block_id`.*
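+
+One way to guarantee this at import time is sketched below. It assumes
+trips are held as dictionaries keyed by column name, and it invents a
+`_trip_` prefix for the generated block IDs; both are assumptions, and
+the prefix must not collide with any real `block_id` in the feed.
+
+```python
+def fill_block_ids(trips):
+    """Return copies of the trip rows, each with a non-empty block_id."""
+    filled = []
+    for trip in trips:
+        row = dict(trip)
+        if not row.get("block_id"):
+            # A single-trip block: derive a unique ID from the trip itself.
+            row["block_id"] = "_trip_" + row["trip_id"]
+        filled.append(row)
+    return filled
+```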
+
diff --git a/.idea/gtfs-book/ch-19-calculating-fares.md b/.idea/gtfs-book/ch-19-calculating-fares.md
new file mode 100644
index 0000000..df24dd4
--- /dev/null
+++ b/.idea/gtfs-book/ch-19-calculating-fares.md
@@ -0,0 +1,296 @@
+## 19. Calculating Fares
+
+In order to calculate a fare for a trip in GTFS, you must use data from
+`fare_attributes.txt` and `fare_rules.txt`. It is relatively
+straightforward to calculate the cost of a single trip (that is,
+boarding a vehicle, traveling for a number of stops, then disembarking),
+but it becomes much more complicated when you need to take into account
+multiple trip segments (that is, one or more transfers).
+
+***Note:** As it stands, many feed providers do not include fare
+information. This is because many systems have a unique set of rules
+that cannot be modelled with the current structure of fares in GTFS.
+Additionally, it is not possible to determine different classes of
+pricing (such as having a separate price for adults and children). For
+the purposes of this chapter, these limitations are ignored.*
+
+For more discussion on how fares work in GTFS, refer to *Fare
+Definitions (`fare_attributes.txt` & `fare_rules.txt`)*.
+
+This chapter first shows you how to calculate fares for a single trip,
+then how to account for transfers and for multiple trips.
+
+### Calculating a Single Trip Fare
+
+Much of the logic used when calculating fares requires knowledge of the
+zones used in a trip.
+
+***Note:** A zone is a physical area within a transit system that
+contains a series of stops. They are used to group trip pricing into key
+areas in a system. Some transit systems do not work like this (for
+instance, they may measure the physical distance travelled rather than
+specific stops), which is one reason why the GTFS fares model does not
+work in all cases.*
+
+A zone is defined in GTFS by the `zone_id` column in `stops.txt`. A
+single stop can only belong to one zone.
+
+Fares are matched to a trip using a combination of any of the following:
+
+* The route of the trip
+* The zone of the stop the passenger boards from
+* The zone of the stop the passenger disembarks
+* The zone(s) of any stops on the trip that are passed while the
+ passenger is on board.
+
+Consider the following simplified data set that may appear in
+`stops.txt` and `stop_times.txt`. Assume for this example that the
+trip `T1` belongs to a route with an ID of `R1`.
+
+```
+stop_id,zone_id
+S1,Z1
+S2,Z1
+S3,Z2
+S4,Z3
+
+trip_id,stop_id,stop_sequence
+T1,S1,1
+T1,S2,2
+T1,S3,3
+T1,S4,4
+```
+
+If a passenger travels from stop S1 to stop S4, then their starting zone
+is Z1, their finishing zone is Z3, and the zones they pass through are
+Z1, Z2 and Z3.
+
+***Note:** When calculating fares, the start and finish zones are also
+included in the zones passed through, so in this example Z1 and Z3 are
+also considered zones that the trip passes through.*
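+
+As a rough illustration, the zones for a journey can be derived from
+the sample data with a function like the one below. The name
+`trip_zones` and its arguments are hypothetical, and the sketch assumes
+each stop appears only once in the trip.
+
+```python
+def trip_zones(stop_zones, trip_stops, board_stop, alight_stop):
+    """Return (origin zone, destination zone, zones passed through).
+
+    stop_zones maps stop_id -> zone_id; trip_stops lists the trip's
+    stop IDs in stop_sequence order.
+    """
+    start = trip_stops.index(board_stop)
+    end = trip_stops.index(alight_stop)
+    if start >= end:
+        raise ValueError("boarding stop must come before alighting stop")
+    zones = []
+    for stop_id in trip_stops[start:end + 1]:
+        zone = stop_zones[stop_id]
+        if zone not in zones:
+            zones.append(zone)
+    return stop_zones[board_stop], stop_zones[alight_stop], zones
+```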
+
+Using this data, you can now calculate matching fares. To do so, you
+need to find all fares that match either of the following:
+
+* Fares that have no associated rules.
+* Fares that have rules that match the specified trip. If a fare
+ defines multiple zones that must be passed through (using
+ `contains_id`), then all zones must be matched.
+
+If multiple fares qualify for a trip, then the cheapest fare is the one
+to use.
+
+### Finding Fares With No Rules
+
+This is the simplest use-case for fares. You can find all matching fares
+with the following SQL.
+
+```sql
+SELECT * FROM fare_attributes WHERE fare_id NOT IN (
+ SELECT DISTINCT fare_id FROM fare_rules
+);
+```
+
+If a feed only has `fare_attributes.txt` records with no rules, then
+the difference between the fares is in the transfer rules. This section
+only covers calculating fares for a single trip with no transfers, so
+for now you can just select the cheapest fare using the following SQL.
+
+```sql
+SELECT * FROM fare_attributes WHERE fare_id NOT IN (
+ SELECT DISTINCT fare_id FROM fare_rules
+ )
+ ORDER BY price + 0 LIMIT 1;
+```
+
+***Note:** You still need to check for fares with one or more rules in
+order to find the cheapest price. Also, `0` is added in this query in
+order to cast a string to a number. When you roll your own importer you
+should instead import this as a numerical value.*
+
+### Finding Fares With Matched Rules
+
+Next you must check against specific rules for a fare. In order to do
+this, you need the starting zone, finishing zone, and all zones passed
+through (including the start and finish zones).
+
+Referring back to the previous example, if a trip starts at `Z1`,
+passes through `Z2` and finishes at `Z3`, you can find fare
+candidates (that is, fares that *may* match), using the following SQL
+query.
+
+```sql
+SELECT * FROM fare_attributes WHERE fare_id IN (
+ SELECT fare_id FROM fare_rules
+ WHERE (LENGTH(route_id) = 0 OR route_id = 'R1')
+ AND (LENGTH(origin_id) = 0 OR origin_id = 'Z1')
+ AND (LENGTH(destination_id) = 0 OR destination_id = 'Z3')
+ AND (LENGTH(contains_id) = 0 OR contains_id IN ('Z1', 'Z2', 'Z3'))
+ );
+```
+
+This returns a list of fares that may qualify for the given trip. As
+some fares have multiple rules, all must be checked. The algorithm to
+represent this is as follows.
+
+```
+fares = [ result from above query ]
+
+qualifyingFares = [ ]
+
+for (fare in fares) {
+ if (qualifies(fare, routeId, originId, destinationId, containsIds))
+ qualifyingFares.add(fare)
+}
+
+allFares = qualifyingFares + faresWithNoRules
+
+passengerFare = cheapest(allFares)
+```
+
+As shown on the final two lines, once you have the list of qualifying
+fares, you can combine these with fares that have no rules (from the
+previous section) and then determine the cheapest fare.
+
+First though, you must determine if a fare with rules qualifies for the
+given trip. If a fare specifies zones that must be passed through, then
+all rules must be matched.
+
+***Note:** If a particular rule specifies a different route, start, or
+finish than the one you are checking, you do not need to ensure the
+`contains_id` matches, since this rule no longer applies. You still need
+to check the other rules for this fare.*
+
+The algorithm needs to build up a list of zone IDs from the fare rules
+in order to check against the trip. Once this has been done, you need to
+check that every zone ID collected from the rules is contained in the
+trip's list of zones.
+
+```
+qualifies(fare, routeId, originId, destinationId, containsIds) {
+
+ fareContains = [ ]
+
+ for (rule in fare.rules) {
+ if (rule.containsId.length == 0)
+ continue
+
+ if (rule.route.length > 0 AND rule.route != routeId)
+ continue
+
+ if (rule.origin.length > 0 AND rule.origin != originId)
+ continue
+
+ if (rule.destination.length > 0 AND rule.destination != destinationId)
+ continue
+
+ fareContains.add(rule.containsId)
+ }
+
+ if (fareContains.size == 0)
+ return YES
+
+ if (containsIds HAS EVERY ELEMENT IN fareContains)
+ return YES
+ else
+ return NO
+}
+```
+
+This algorithm achieves the following:
+
+* Only rules that have a value for `contains_id` are relevant. Rules
+ that do not have this value fall through and should be considered as
+ qualified.
+* If the route is specified but not equal to the one being checked, it
+ is safe to ignore the rule's `contains_id`. If the route is empty
+ or equal, the remaining checks for the rule are performed.
+* Check for the `origin_id` and `destination_id` in the same
+ manner as `route_id`.
+* If the route, origin and destination all qualify then store the
+ `contains_id` so it can be checked after the loop.
+
+The algorithm returns *yes* if the fare qualifies, meaning you can save
+it as a qualifying fare. You can then return the cheapest qualifying
+fare to the user.
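+
+A runnable Python version of the `qualifies` pseudocode might look like
+the following. It assumes each rule is a dictionary keyed by the
+`fare_rules.txt` column names, with empty strings or missing keys for
+blank values; that representation is an assumption rather than anything
+GTFS prescribes.
+
+```python
+def qualifies(rules, route_id, origin_id, destination_id, contains_ids):
+    """Return True if every applicable contains_id is in the trip's zones."""
+    required = set()
+    for rule in rules:
+        if not rule.get("contains_id"):
+            continue  # only rules with a contains_id are relevant here
+        if rule.get("route_id") and rule["route_id"] != route_id:
+            continue  # rule is for a different route
+        if rule.get("origin_id") and rule["origin_id"] != origin_id:
+            continue  # rule is for a different starting zone
+        if rule.get("destination_id") and rule["destination_id"] != destination_id:
+            continue  # rule is for a different finishing zone
+        required.add(rule["contains_id"])
+    # An empty set is a subset of anything, so "no zones required" passes.
+    return required <= set(contains_ids)
+```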
+
+### Calculating Trips With Transfers
+
+Once you introduce transfers, fare calculation becomes more complicated.
+A "trip with a transfer" is considered to be a trip where the passenger
+boards a vehicle, disembarks, and then gets on another vehicle. For
+example:
+
+* Travel on trip T1 from Stop S1 to Stop S2
+* Walk from Stop S2 to Stop S3
+* Travel on trip T2 from Stop S3 to Stop S4.
+
+In order to calculate the total fare for a trip with transfers, the
+following algorithm is used:
+
+1. Retrieve list of qualifying fares for each trip individually
+2. Create a list of every fare combination possible
+3. Loop over all combinations and find the total cost
+4. Return the lowest cost from Step 3.
+
+Step 1 was covered in *Calculating a Single Trip Fare*, but
+you must skip the final step of finding the cheapest fare. This is
+because the cheapest fare may change depending on subsequent transfers.
+Instead, this step is performed once the cheapest *combination* is
+determined.
+
+To demonstrate Step 2, consider the following example:
+
+* The trip on T1 from S1 to S2 yields the following qualifying fares:
+ F1, F2.
+* The subsequent trip on T2 from S3 to S4 yields the following
+ qualifying fares: F3, F4.
+
+Generating every combination of these fares yields the following
+possibilities:
+
+* F1 + F3
+* F1 + F4
+* F2 + F3
+* F2 + F4.
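+
+In Python, the combinations listed above can be generated with
+`itertools.product` from the standard library. The wrapper name
+`fare_combinations` is illustrative only.
+
+```python
+from itertools import product
+
+
+def fare_combinations(fares_per_leg):
+    """Return every way of picking one qualifying fare for each leg."""
+    return [list(combo) for combo in product(*fares_per_leg)]
+```
+
+Calling `fare_combinations([["F1", "F2"], ["F3", "F4"]])` yields the
+four combinations in the same order as the list above.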
+
+Step 3 can now be performed, which involves finding the total cost for
+each combination. As you need to take into account the possibility of
+timed transfers (according to the data stored in
+`fare_attributes.txt`), you also need to know about the times of these
+trips.
+
+The following algorithm can be used to calculate the total cost using
+transfer rules. In this example, you would call this function once for
+each fare combination.
+
+```
+function totalCost(fares) {
+ total = 0
+ previousFare = NONE
+
+ for (fare in fares) {
+ freeTransfer = NO
+
+ if (previousFare != NONE AND previousFare ALLOWS TRANSFERS) {
+ if (HAS ENOUGH TRANSFERS REMAINING) {
+ if (TRANSFER NOT EXPIRED) {
+ freeTransfer = YES
+ }
+ }
+ }
+
+ if (!freeTransfer)
+ total = total + fare.price;
+
+ previousFare = fare;
+ }
+
+ return total;
+}
+```
+
+Once `totalCost` has been called for every combination, you will
+have a price for each combination. You can then return the lowest price
+as the final price for the trip.
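+
+The following Python sketch is one possible interpretation of the
+`totalCost` pseudocode, not a definitive implementation. It assumes each
+leg is a pair of a fare dictionary and a boarding time in seconds, that
+the fare dictionary uses the `fare_attributes.txt` column names
+(`price`, `transfers`, `transfer_duration`), and that `None` stands for
+an empty value (unlimited transfers, or no expiry). It also assumes
+transfers are counted against the most recently *paid* fare.
+
+```python
+def total_cost(legs):
+    """Sum the fares for a journey, skipping legs covered by a transfer."""
+    total = 0.0
+    paid = None   # the fare most recently paid for
+    paid_at = 0   # boarding time (seconds) when that fare was paid
+    used = 0      # transfers already made on that fare
+    for fare, board_time in legs:
+        free = False
+        if paid is not None:
+            limit = paid["transfers"]            # None means unlimited
+            window = paid["transfer_duration"]   # None means no expiry
+            has_transfers = limit is None or used < limit
+            not_expired = window is None or board_time - paid_at <= window
+            free = has_transfers and not_expired
+        if free:
+            used += 1
+        else:
+            total += fare["price"]
+            paid, paid_at, used = fare, board_time, 0
+    return total
+```
+
+With a function like this, the cheapest option is simply the minimum of
+`total_cost` over all fare combinations.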
+
diff --git a/.idea/gtfs-book/ch-20-trip-patterns.md b/.idea/gtfs-book/ch-20-trip-patterns.md
new file mode 100644
index 0000000..4b0cf73
--- /dev/null
+++ b/.idea/gtfs-book/ch-20-trip-patterns.md
@@ -0,0 +1,141 @@
+## 20. Trip Patterns
+
+In a GTFS feed, a route typically has multiple trips that start and
+finish at the same stops. If you are looking to reduce the size of the
+data stored, then converting data from `stop_times.txt` into a series
+of reusable patterns is an excellent way to do so.
+
+For two trips to share a common pattern, the following must hold true:
+
+* The stops visited and the order in which they are visited must be the same
+* The time differences between each stop must be the same.
+
+The following table shows some fictional trips to demonstrate this.
+
+| **Stop** | **Trip 1** | **Trip 2** | **Trip 3** |
+| :------- | :--------- | :--------- | :--------- |
+| S1 | 10:00:00 | 10:10:00 | 10:20:00 |
+| S2 | 10:02:00 | 10:13:00 | 10:22:00 |
+| S3 | 10:05:00 | 10:15:00 | 10:25:00 |
+| S4 | 10:06:00 | 10:18:00 | 10:26:00 |
+| S5 | 10:10:00 | 10:21:00 | 10:30:00 |
+
+In a GTFS feed, this would correspond to 15 records in
+`stop_times.txt`. If you look more closely though, you can see the
+trips are very similar. The following table shows the differences
+between each stop time, instead of the actual time.
+
+| **Stop** | **Trip 1** | **Trip 2** | **Trip 3** |
+| :------- | :-------------- | :-------------- | :-------------- |
+| S1 | 00:00:00 | 00:00:00 | 00:00:00 |
+| S2 | 00:02:00 (+2m) | 00:03:00 (+3m) | 00:02:00 (+2m) |
+| S3 | 00:05:00 (+5m) | 00:05:00 (+5m) | 00:05:00 (+5m) |
+| S4 | 00:06:00 (+6m) | 00:08:00 (+8m) | 00:06:00 (+6m) |
+| S5 | 00:10:00 (+10m) | 00:11:00 (+11m) | 00:10:00 (+10m) |
+
+You can see from this table that the first and third trip, although they
+start at different times, have the same offsets between stops (as well
+as stopping at identical stops).
+
+Instead of using a table to store stop times, you can store patterns. By
+storing the ID of the pattern with each trip, you can reduce the list of
+stop times in this example from 15 to 10. As only time offsets are
+stored for each pattern, the trip starting time also needs to be saved
+with each trip.
+
+You could use SQL such as the following to model this.
+
+```sql
+CREATE TABLE trips (
+ trip_id TEXT,
+ pattern_id INTEGER,
+ start_time TEXT,
+ start_time_secs INTEGER
+);
+
+CREATE TABLE patterns (
+ pattern_id INTEGER,
+ stop_id TEXT,
+ time_offset INTEGER,
+ stop_sequence INTEGER
+);
+```
+
+The data you would store for trips in this example is shown in the
+following table.
+
+| `trip_id` | `pattern_id` | `start_time` | `start_time_secs` |
+| :-------- | :----------- | :----------- | :---------------- |
+| T1 | 1 | 10:00:00 | 36000 |
+| T2 | 2 | 10:10:00 | 36600 |
+| T3 | 1 | 10:20:00 | 37200 |
+
+***Note:** The above table includes `start_time_secs`, which is an integer
+value representing the number of seconds since the day started. Using
+the hours, minutes and seconds in `start_time`, this value is `H * 3600 + M * 60 + S`.*
+
+In the `patterns` table, you would store data as in the following
+table.
+
+| `pattern_id` | `stop_id` | `time_offset` | `stop_sequence` |
+| :----------- | :-------- | :------------ | :-------------- |
+| 1 | S1 | 0 | 1 |
+| 1 | S2 | 120 | 2 |
+| 1 | S3 | 300 | 3 |
+| 1 | S4 | 360 | 4 |
+| 1 | S5 | 600 | 5 |
+| 2 | S1 | 0 | 1 |
+| 2 | S2 | 180 | 2 |
+| 2 | S3 | 300 | 3 |
+| 2 | S4 | 480 | 4 |
+| 2 | S5 | 660 | 5 |
+
+As you can see, this represents an easy way to significantly reduce the
+amount of data stored. You could have tens or hundreds of trips each
+sharing the same pattern. When you scale this to the entire feed, this
+could reduce, say, 3 million records to about 200,000.
+
+***Note:** This is a somewhat simplified example, as there is other data
+available in `stop_times.txt` (such as separate arrival/departure times,
+drop-off type and pick-up type). You should take all of this data into
+account when determining how to allocate patterns.*
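+
+A simplified way to allocate patterns is sketched below. It assumes the
+stop times have already been loaded into a dictionary mapping each
+`trip_id` to its ordered list of `(stop_id, seconds)` pairs, and it
+ignores the extra columns mentioned in the note above. The function
+name `build_patterns` is hypothetical.
+
+```python
+def build_patterns(trips):
+    """Group trips that share the same stops and time offsets.
+
+    Returns (trip_rows, patterns), where trip_rows holds
+    (trip_id, pattern_id, start_time_secs) tuples and patterns maps
+    pattern_id -> list of (stop_id, time_offset) pairs.
+    """
+    pattern_ids = {}  # pattern key -> pattern_id
+    trip_rows = []
+    for trip_id, stops in trips.items():
+        start = stops[0][1]
+        # Two trips share a pattern only if this whole tuple is identical.
+        key = tuple((stop_id, secs - start) for stop_id, secs in stops)
+        pattern_id = pattern_ids.setdefault(key, len(pattern_ids) + 1)
+        trip_rows.append((trip_id, pattern_id, start))
+    patterns = {pid: list(key) for key, pid in pattern_ids.items()}
+    return trip_rows, patterns
+```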
+
+### Updating Trip Searches
+
+Changing your model to reuse patterns instead of storing every stop time
+means your data lookup routines must also be changed.
+
+For example, to find all stop times for a given trip, you must now find
+the pattern using the following SQL query.
+
+```sql
+SELECT * FROM patterns
+ WHERE pattern_id = (SELECT pattern_id FROM trips WHERE trip_id = 'YOUR_TRIP_ID')
+ ORDER BY stop_sequence;
+```
+
+If you want to determine the arrival/departure time, you must add the
+offset stored for the pattern record to the starting time stored with
+the trip. This involves joining the tables and adding `time_offset` to
+`start_time_secs`, as shown in the following query.
+
+```sql
+SELECT t.start_time_secs + p.time_offset, p.stop_id
+ FROM patterns p, trips t
+ WHERE p.pattern_id = t.pattern_id
+ AND t.trip_id = 'YOUR_TRIP_ID'
+ ORDER BY p.stop_sequence;
+```
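+
+To see the join in action, the following sketch loads a small part of
+the sample data into an in-memory SQLite database using Python's
+standard `sqlite3` module. The helper name `stop_times_for_trip` is
+hypothetical, and the trip ID is passed as a bound parameter rather
+than embedded in the SQL text.
+
+```python
+import sqlite3
+
+
+def stop_times_for_trip(conn, trip_id):
+    """Return (stop_id, seconds since midnight) pairs in stop order."""
+    return conn.execute(
+        "SELECT p.stop_id, t.start_time_secs + p.time_offset"
+        " FROM patterns p JOIN trips t ON p.pattern_id = t.pattern_id"
+        " WHERE t.trip_id = ?"
+        " ORDER BY p.stop_sequence",
+        (trip_id,),
+    ).fetchall()
+
+
+conn = sqlite3.connect(":memory:")
+conn.executescript("""
+CREATE TABLE trips (
+  trip_id TEXT, pattern_id INTEGER, start_time TEXT, start_time_secs INTEGER
+);
+CREATE TABLE patterns (
+  pattern_id INTEGER, stop_id TEXT, time_offset INTEGER, stop_sequence INTEGER
+);
+INSERT INTO trips VALUES ('T3', 1, '10:20:00', 37200);
+INSERT INTO patterns VALUES (1, 'S1', 0, 1);
+INSERT INTO patterns VALUES (1, 'S2', 120, 2);
+INSERT INTO patterns VALUES (1, 'S3', 300, 3);
+""")
+```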
+
+### Other Data Reduction Methods
+
+There are other ways you can reduce the amount of data, such as only
+using patterns to store the stops (and not timing offsets), and then
+storing the timings with each trip record. A technique such as this
+further reduces the size of the database, but the trade-off is that
+querying the data becomes slightly more complex.
+
+As the methods described in this chapter show, there are a number of
+ways to be creative with GTFS data, and each involves trade-offs
+between speed, size, and ease of querying
+data.
diff --git a/.idea/gtfs-book/ch-21-conclusion.md b/.idea/gtfs-book/ch-21-conclusion.md
new file mode 100644
index 0000000..0bdc2e0
--- /dev/null
+++ b/.idea/gtfs-book/ch-21-conclusion.md
@@ -0,0 +1,23 @@
+## Conclusion
+
+Thanks for reading *The Definitive Guide to GTFS*. While there are many
+techniques that may take some time to comprehend in this book, the
+content should bring you up to speed with GTFS quickly.
+
+At first glance, GTFS appears to be very simple, but there are a number
+of "gotchas" which are not immediately apparent until you have spent
+significant time working with a range of feeds.
+
+The key takeaways from this book are:
+
+* How GTFS feeds are structured
+* How to import a GTFS feed to SQL and perform queries
+* How to search for trips, handle blocks and calculate fares
+* How to optimize an SQL database to minimize resource usage on mobile
+ devices.
+
+If you have enjoyed this book, please share it or feel free to contribute improvements or changes as necessary.
+
+*Quentin Zervaas*
+February, 2014
+
diff --git a/.idea/misc.xml b/.idea/misc.xml
new file mode 100644
index 0000000..639900d
--- /dev/null
+++ b/.idea/misc.xml
@@ -0,0 +1,6 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+ <component name="ProjectRootManager">
+ <output url="file://$PROJECT_DIR$/out" />
+ </component>
+</project> \ No newline at end of file
diff --git a/README.md b/README.md
index bc2a129..c51f238 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,11 @@
-# GTFS-books \ No newline at end of file
+# GTFS-books
+
+This repository contains the source for the following two books:
+
+* The Definitive Guide to GTFS (2014)
+* The Definitive Guide to GTFS-realtime (2015)
+
+These were originally published commercially, but have now been donated to OpenMobilityData.
+
+Some of the content may be outdated. You are encouraged to contribute any fixes or changes in order to keep it as up-to-date as possible.
+