diff options
Diffstat (limited to '')
-rw-r--r-- | .idea/gtfs-realtime-book/ch-05-protocol-buffers.md | 379 |
1 files changed, 379 insertions, 0 deletions
diff --git a/.idea/gtfs-realtime-book/ch-05-protocol-buffers.md b/.idea/gtfs-realtime-book/ch-05-protocol-buffers.md new file mode 100644 index 0000000..2059a63 --- /dev/null +++ b/.idea/gtfs-realtime-book/ch-05-protocol-buffers.md @@ -0,0 +1,379 @@ +## 5. Protocol Buffers + +The previous chapters have included extracts from GTFS-realtime feeds in +a human-readable format. This data is actually represented using a data +format called *Protocol Buffers*. + +Developed by Google and initially released in 2008, Protocol Buffers are +a way of serializing structured data into a format which is intended to +be smaller and faster than XML. + +Note: Remember, if you are writing a transit-related mobile app, +GTFS-realtime feeds are not intended to be consumed directly by mobile +devices due to the large amount of data transferred. Rather, you will +need an intermediate server to read the feed from the provider then +serve only relevant data to the mobile devices running your app. + +Even though it looks similar to JSON data, the human-readable version of +a protocol buffer is not intended to be manually parsed. Instead, data +is extracted from a protocol buffer using native language (such as Java, +C++ or Python). + +Note: Although the Protocol Buffers application can generate code in +Java, C++ or Python, all code examples in this book will be in Java. + +For example, assume you have written a Java program that reads and +parses a GTFS-realtime service alerts feed (shown later in this chapter, +and in the next chapter). + +In order to consume a GTFS-realtime feed provided by a transit agency +such as TriMet or MBTA, your workflow would look similar to the +following diagram: + +![GTFS-realtime Consumption](images/GTFS-realtime-consumption.png) + +When a transit agency or data provider want to publish a GTFS-realtime +feed, their process would be similar, except instead of reading the feed +every 15 seconds, they would write a new protocol buffer data file every +15 seconds using data received from their vehicles. + +***Note:** *Chapter 11. Publishing GTFS-realtime Feeds* will +show you how to create a GTFS-realtime feed using Protocol Buffers. In +order to do so, you will need to install Protocol Buffers as +demonstrated in this chapter.* + +### Installing Protocol Buffers + +In order to generate code to read or write GTFS-realtime feeds in your +native language, you first need to install Protocol Buffers. Once +installed, it is capable of generating code for Java, C++ or Python. + +This section shows you how to download and build the `protoc` +command-line tool on a UNIX or Linux-based system. These instructions +were derived from installing Protocol Buffers on Mac OS X 10.10. + +First, download and extract the Protocol Buffers source code. At the +time of writing, the current version is 2.6.1. + +``` +$ curl -L \ + https://github.com/google/protobuf/releases/download/v2.6.1/protobuf-2.6.1.tar.gz \ + -o protobuf-2.6.1.tar.gz + +$ tar -zxf protobuf-2.6.1.tar.gz + +$ cd protobuf-2.6.1 +``` + +***Note:** Visit <https://developers.google.com/protocol-buffers/> and +click the "Download" link to find the latest version. The following +instructions should still work for subsequent versions.* + +Next, compile the source files using `make`. First run the +`configure` script to build the `Makefile`, then run `make`. + +``` +$ ./configure && make +``` + +***Note:** Separating the commands by && means that make will only run if +`./configure` exits successfully.* + +Once compilation is complete, you can verify the build by running **make +check**. You can then install it globally on your system using **make +install**. If you do not want to install it globally, you can run +`protoc` directly from the **./src** directory instead. + +``` +$ make check + +$ make install +``` + +Next verify that it has been successfully built and installed by running +the `protoc` command. The output should be "Missing input file." + +``` +$ protoc + +Missing input file. +``` + +The next section will show you how to generate Java files using +`protoc` and the `gtfs-realtime.proto` file. + +### Introduction to gtfs-realtime.proto + +In order to generate source code files that can read a protocol buffer, +you need a `.proto` input file. Typically you won't need to create or +modify `.proto` files yourself, it is useful to have a basic +understanding of how they work. + +A `.proto` file contains a series of instructions that defines the +structure of the data. In the case of GTFS-realtime, there is a file +called `gtfs-realtime.proto` which contains the structure for each of +the messages available (service alerts, vehicle positions and trip +updates). + +The following is an extract from `gtfs-realtime.proto` for the +`VehiclePosition` message. + +**Note: **The `TripDescriptor` and `VehicleDescriptor` types +referenced in this extract also have declarations, which are not +included here. + +``` +message VehiclePosition { + optional TripDescriptor trip = 1; + optional Position position = 2; + optional uint32 current_stop_sequence = 3; + + enum VehicleStopStatus { + INCOMING_AT = 0; + STOPPED_AT = 1; + IN_TRANSIT_TO = 2; + } + + optional VehicleStopStatus current_status = 4 [default = IN_TRANSIT_TO]; + optional uint64 timestamp = 5; + + enum CongestionLevel { + UNKNOWN_CONGESTION_LEVEL = 0; + RUNNING_SMOOTHLY = 1; + STOP_AND_GO = 2; + CONGESTION = 3; + SEVERE_CONGESTION = 4; + } + + optional CongestionLevel congestion_level = 6; + optional string stop_id = 7; + optional VehicleDescriptor vehicle = 8; + extensions 1000 to 1999; +} +``` + +Ignoring the numerical values assigned to each field (they aren't +likely to be relevant because you never directly refer to them), you can +see how the structure is the same as the specification covered earlier +in this book for vehicle positions. + +Each field in a protocol buffer has a unique value assigned to it. This +value is used internally when encoding or decoding each field in +GTFS-realtime feed. If there is additional data to be represented in a +feed, the values between 1000 and 1999 are reserved for extensions. +*Chapter 10. GTFS-realtime Extensions* shows how extensions in +GTFS-realtime work. + +### Compiling gtfs-realtime.proto + +The next step towards consuming a GTFS-realtime feed is to compile the +`gtfs-realtime.proto` file into Java code using `protoc`. In order +to do this, you must have already created a Java project ahead of time. +`protoc` will generate the files and incorporate them directly into +your project's source tree. + +These instructions assume you have created your Java project in +**/path/to/gtfsrt** and that you will download the +`gtfs-realtime.proto` file to **/path/to/protobuf**. + +First, download the `gtfs-realtime.proto` file. + +``` +$ cd /path/to/protobuf + +$ curl \ + https://developers.google.com/transit/gtfs-realtime/gtfs-realtime.proto \ + -o gtfs-realtime.proto +``` + +***Note:** If this URL is no longer current when you read this, you can +find the updated location at +<https://developers.google.com/transit/gtfs-realtime/>.* + +In order to use `protoc`, you must specify the **--proto_path** +argument as the directory in which `gtfs-realtime.proto` resides. +Additionally, you must specify the full path to the +`gtfs-realtime.proto` file. + +Typically, in a Java project your source files will reside in a +directory called `src` within the project directory. This directory +must be specified in the **--java_out** argument. + +The full command to run is as follows: + +``` +$ protoc \ + --proto_path=/path/to/protobuf \ + --java_out=/path/to/gtfsrt/src \ + /path/to/protobuf/gtfs-realtime.proto +``` + +If this command runs successfully there will be no output to screen, but +there will be newly-created files in your source tree. There should now +be a **./com/google** directory in your source tree, and a package +called `com.google.transit.realtime`. + +### Adding the Protocol Buffers Library + +Before you can use this new package, you must add the Protocol Buffers +library to your Java project. You can either compile these files into a +Java archive (using the instructions in +**./protobuf-2.6.**1**/java/README.txt**), or you can add the Java +source files directly to your project as follows: + +``` +$ cd protobuf-2.6.1 + +$ cp -R ./java/src/main/java/com/google/protobuf \ + /path/to/gtfsrt/src/com/google/ +``` + +If you now try to build your project, an error will occur due to a +missing package called `DescriptorProtos`. You can add this to your +project using the following command: + +``` +$ cd protobuf-2.6.1 + +$ protoc --java_out=/path/to/gtfsrt/src \ + --proto_path=./src \ + ./src/google/protobuf/descriptor.proto +``` + +Your project should now build successfully, meaning you can use the +`com.google.transit.realtime` package to read data from a +GTFS-realtime feed. + +### Reading Data From a GTFS-realtime Feed + +To read the data from a GTFS-realtime feed, you need to build a +`FeedMessage` object. The simplest way to do this is by opening an +`InputStream` for the URL of the GTFS-realtime feed. + +The following code builds a `FeedMessage` for the vehicle positions +feed of the MBTA in Boston. + +***Note:** To simplify the code listings in this book, package imports +are not included. All classes used are either standard Java classes, or +classes generated by Protocol Buffers.* + +```java +public class YourClass { + public void loadFeed() throws IOException { + + URL url = new URL("http://developer.mbta.com/lib/gtrtfs/Vehicles.pb"); + + InputStream is = url.openStream(); + + FeedMessage fm = FeedMessage.parseFrom(is); + + is.close(); + + // ... + } +} +``` + +A `FeedMessage` object contains zero or more entities, each of which +is either a service alert, a vehicle position, or a trip update. You can +retrieve a list of entities using **getEntityList()**, then loop over +them as follows: + +```java +public class YourClass { + public void loadFeed() throws IOException { + + // ... + + for (FeedEntity entity : fm.getEntityList()) { + // Process the entity here + } + + // ... + + } +} +``` + +Since many of the fields in GTFS-realtime are optional, you need to +check for the presence of the field you want to use before trying to +retrieve it. This is achieved using `hasFieldName()`. You can +then retrieve it using `getFieldName()`. + +In the case of `FeedEntity`, you need to check which of the +GTFS-realtime messages are available. For instance, to check if the +entity contains a service alert, you would call `entity.hasAlert()`. +If the call to `hasAlert()` returns `true`, you can retrieve it +using `entity.getAlert()`. + +The following code shows how to access the various entities. + +```java +public class YourClass { + public void loadFeed() throws IOException { + // ... + + for (FeedEntity entity : fm.getEntityList()) { + + if (entity.hasAlert()) { + Alert alert = entity.getAlert(); + + // Process the alert here + } + + if (entity.hasVehicle()) { + VehiclePosition vp = entity.getVehicle(); + + // Process the vehicle position here + } + + if (entity.hasTripUpdate()) { + TripUpdate tu = entity.getTripUpdate(); + + // Process the trip update here + } + } + + // ... + + } +} +``` + +The next three chapters will show you how to process the `Alert`, `VehiclePosition` and `TripUpdate` objects. + +### Outputting Human-Readable GTFS-realtime Feeds + +Earlier chapters included human-readable extracts from GTFS-realtime +feeds. Although plain-text GTFS-realtime feeds are not designed to be +parsed directly, they can be useful for quickly determining the kinds of +data available within a feed. + +All objects in a GTFS-realtime feed can be output using the +`TextFormat` class in the `protobuf` package. Passing the object to +**printToString()** will generate a human-readable version of the +GTFS-realtime element. + +For instance, you can output an entire feed as follows: + +```java +FeedMessage fm = ...; + +String output = TextFormat.printToString(fm); + +System.out.println(output); +``` + +Or you can output individual entities: + +```java +for (FeedEntity entity : fm.getEntityList()) { + String output = TextFormat.printToString(entity); + + System.out.println(output); +} +``` + +Alternatively, you can call the `toString()` method on these objects +to generate the same output. + |