The difference between GTFS network datasets for bus and rail

11-30-2020 08:46 AM
NasimRezvanpour
Emerging Contributor

Hi, Y'all,

I am working on the Los Angeles public transit system, and I am trying to create a network dataset from LA GTFS data to use in an OD Cost Matrix analysis. Other cities have only one set of GTFS data for their whole public transit system, including bus, rapid transit, light rail, etc. LA, however, has two sets of GTFS data, one for bus and one for rail. I created a separate network dataset for each of them, using the same street layer in both, and the results are similar (only slightly different). When creating the OD matrix, I expected the rail network dataset to create only a few lines, since LA has only 6 rail lines compared to many bus routes. But it didn't. In both results, nearly 40,000 lines were created (based on my origins and destinations). Can anybody clarify why the results are similar, and what the difference is between rail and bus GTFS data when creating a network dataset?

7 Replies
MelindaMorang
Esri Regular Contributor

The OD Cost Matrix tool calculates the travel time or distance between each origin and each destination using the specified network dataset.  The 40,000 lines you see just represent these origin-destination pairs.  You can look at the Lines attribute table to see the time or distance that was calculated.
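As a rough illustration of why the number of lines does not depend on how many transit routes exist, here is a minimal Python sketch. The point names are made up, and it assumes no cutoff and no limit on the number of destinations to find, so every origin is paired with every destination.

```python
from itertools import product

# Hypothetical, tiny inputs; a real analysis would have many more points.
origins = ["O1", "O2", "O3"]
destinations = ["D1", "D2"]

# With no cutoff and no limit on destinations to find, every origin
# is paired with every destination, whatever the network looks like.
pairs = list(product(origins, destinations))
line_count = len(pairs)

print(line_count)  # 3 origins x 2 destinations = 6
```

The count comes from the input points alone. The network only changes the time or distance value attached to each pair.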

 

When you construct a network dataset using streets and GTFS data, it will almost certainly be technically possible for the traveler to get from each origin to each destination using only the streets, without using any transit at all.  If the fastest route between Origin A and Destination B is just to walk on the streets, you will still get a resulting line connecting A to B, and the travel time will reflect the walk time.

 

The minor differences you see in your bus analysis and your rail analysis probably reflect certain cases where the travel time is different because the traveler is using some transit service.
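To make this concrete, here is a toy Python sketch, not how the ArcGIS solver is implemented. It runs a plain Dijkstra search over a made-up graph in which the bus and rail networks share the same walking edges and each adds one transit edge. All node names and minute values are invented, and real transit networks are schedule-based, which this sketch ignores.

```python
import heapq

def shortest_minutes(edges, start, goal):
    """Plain Dijkstra over an undirected graph of (a, b, minutes) tuples."""
    graph = {}
    for a, b, minutes in edges:
        graph.setdefault(a, []).append((b, minutes))
        graph.setdefault(b, []).append((a, minutes))
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, minutes in graph.get(node, []):
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None  # goal not reachable

# Walking edges shared by both toy networks (minutes are made up).
streets = [("A", "S1", 5), ("S1", "B", 12), ("A", "S2", 4),
           ("S2", "C", 30), ("B", "C", 25), ("A", "D", 3)]
bus_net = streets + [("S1", "B", 6)]    # one made-up bus link
rail_net = streets + [("S2", "C", 8)]   # one made-up rail link

for dest in ("B", "C", "D"):
    print(dest,
          shortest_minutes(bus_net, "A", dest),
          shortest_minutes(rail_net, "A", dest))
```

In this toy data, destination D gets the same walk-only time in both networks, while B is faster only with the bus link and C only with the rail link. Every pair still produces a result in both cases.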

 

Note that it is possible to combine both bus and rail into the same network dataset even if they are in different GTFS datasets.  You just add both of them as input when creating the network.

NasimRezvanpour
Emerging Contributor

Thanks, Melinda, for your time. I appreciate your help. 

I used the "Add GTFS to a Network Dataset" tool to create the network dataset. Are you saying that even in this case, the OD matrix may use the walk time to calculate the travel time from A to B? If so, is there any way to know when it has calculated the TransitTravelTime based on transit and when based on walking?

My second question is: at what point can I add both GTFS datasets? Do you recommend merging the contents of both GTFS folders beforehand, or adding both GTFS folders as input in the first step of the tool, "Generate Transit Lines and Stops"?

MelindaMorang
Esri Regular Contributor

Yes, the OD Cost Matrix will use any non-restricted network components, including streets and transit lines.  Otherwise, how would the traveler walk from their initial location to the bus/rail stop in the first place, and how would they make transfers from one line to another if they have to walk around the block or something?  In some cases, walking the whole way may simply be faster or more practical than taking transit.

 

The OD Cost Matrix tool does not store the "traversal result", meaning after it has done the calculation of travel time and distance, it doesn't remember anything about what parts of the network were used.  So, in general, for any OD Cost Matrix problem, it is not possible to determine which streets, or transit lines, were used.  If you want this information, you can use the Closest Facility solver or the Route solver.

 

For Route or Closest Facility, you can use the Copy Traversed Source Features tool to determine which network dataset components were used and post-process those results however you want.  For the old Add GTFS to a Network dataset toolbox (which is no longer supported, by the way), you can use the special Copy Traversed Source Features (with Transit) tool, which wraps the core version of the tool with some special transit information.

 

Regarding when you can add multiple GTFS datasets: In the old Add GTFS to a Network Dataset tool, just dump both of them in as input when you run the 1) Generate Transit Lines and Stops tool.   I encourage you to spend some more time with that User's Guide if you're having trouble.  Because the tool is deprecated, I cannot provide extensive help with it.

 

As for whether you should or shouldn't combine your multiple datasets - It really depends on what you're trying to do.  If you are trying to model the way real passengers travel, then you should probably include both.  On the other hand, if you're trying to model the differences in bus service vs rail service, you should probably keep them separate.

 

If you switch to using the newer tools in ArcGIS Pro, there is a way to exclude transit modes in the analysis (see the section on supported parameters here).  So, you could build the network to include bus and rail, but for a particular analysis, you could "turn off" rail service using the "Exclude modes" parameter.
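For context on what distinguishes the modes in the data itself: in GTFS, each route in routes.txt carries a route_type code (0 for tram or light rail, 1 for subway or metro, 2 for rail, 3 for bus). The Python sketch below groups and filters routes by that code. The route rows are made up, and this only illustrates the idea of filtering by mode; it is not a description of how the "Exclude modes" parameter works internally.

```python
import csv
import io

# Made-up rows in the shape of a GTFS routes.txt file.
ROUTES_TXT = """route_id,route_short_name,route_type
801,A Line,0
802,B Line,1
2,Bus 2,3
4,Bus 4,3
"""

# A few basic route_type codes from the GTFS reference.
MODE_NAMES = {"0": "tram / light rail", "1": "subway / metro",
              "2": "rail", "3": "bus"}

def routes_by_mode(routes_text):
    """Group route_id values by a readable mode name."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(routes_text)):
        mode = MODE_NAMES.get(row["route_type"], "other")
        grouped.setdefault(mode, []).append(row["route_id"])
    return grouped

def exclude_modes(routes_text, excluded_types):
    """Return route_id values whose route_type is not excluded."""
    return [row["route_id"]
            for row in csv.DictReader(io.StringIO(routes_text))
            if row["route_type"] not in excluded_types]

print(routes_by_mode(ROUTES_TXT))
print(exclude_modes(ROUTES_TXT, {"0", "1"}))  # keep only non-rail routes
```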

NasimRezvanpour
Emerging Contributor

Awesome! It was such an informative help!

Just a quick question, though! I am trying to calculate the travel time between specific origins and destinations using public transit routes (there are many origins and destinations). I am trying to determine how many of my specific origins have a transit travel time of less than 30 minutes to specific destinations. The OD matrix seemed like the best tool for this. What do you suggest: continue with the OD matrix, or use Closest Facility or Route?

 

MelindaMorang
Esri Regular Contributor

If I understand your message correctly, you are trying to count the number of destinations reachable from each origin within a 30-minute travel time.  Is that correct?  (If you are trying to calculate the travel time from a specific origin to a specific destination - like you have specifically matching pairs - then this is a different problem).  You can definitely do this with the OD Cost Matrix tool by using a Cutoff.  You can also do this with Closest Facility using a Cutoff and just setting the number of facilities to find to something really big so it finds more than the closest one.  Use Closest Facility if you need to examine the traversal result; otherwise, use OD Cost Matrix because it runs faster.
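If the OD results have already been exported to a table, the counting step could look something like this Python sketch. The rows and names are hypothetical, and it applies the 30-minute threshold after the fact; when a cutoff is set in the solver, pairs beyond it are simply not returned.

```python
# Hypothetical OD rows: (origin, destination, travel time in minutes).
od_rows = [
    ("O1", "D1", 12.5),
    ("O1", "D2", 41.0),
    ("O1", "D3", 29.9),
    ("O2", "D1", 33.0),
    ("O2", "D3", 30.0),
]

CUTOFF_MINUTES = 30

def count_reachable(rows, cutoff):
    """Count, per origin, the destinations with travel time <= cutoff."""
    counts = {}
    for origin, _destination, minutes in rows:
        counts.setdefault(origin, 0)  # keep origins that reach nothing
        if minutes <= cutoff:
            counts[origin] += 1
    return counts

reachable = count_reachable(od_rows, CUTOFF_MINUTES)
print(reachable)
```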

NasimRezvanpour
Emerging Contributor

Yes. Because I have a large number of origins and destinations (for LA, for example, there are 1,027 origins and 470 destinations), I don't plan to calculate the reachable destinations from "each" origin. As you put it, I am trying to calculate the travel time from a specific origin to a specific destination, that is, specifically matching pairs. In this case, I think OD is the best way to calculate the transit time.

Thanks a million for your help. 

MelindaMorang
Esri Regular Contributor

Actually, that's not what I recommend.  If you want to calculate the travel time between specific pairs of origins and destinations, you need to use the Route solver.  OD Cost Matrix only lets you calculate travel time between all origins and all destinations, with some constraints (max travel time, max number of destinations to find).  It does not let you specify that you want to only calculate the travel time between specifically-designated pairs.

 

To calculate travel time between predetermined pairs, use the Route solver, and use the RouteName field in the input stops to designate a unique name for each origin-destination pair.  Let's say you want to calculate a route from A to B, A to C, and D to F.  You need to load A and B with a route name of AB (or something like that), A (again) and C with a route name of AC, and D and F with a route name of DF.  When you solve the route analysis, it will calculate three separate routes: one from A to B, one from A to C, and one from D to F.
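Preparing the stops for this could be sketched in Python as follows. Only the RouteName field comes from the description above; the function name, the tuple layout, and the sequence column are assumptions for illustration, and loading the rows into the Route analysis is left out.

```python
def build_route_stops(pairs):
    """Return one row per stop: (stop name, RouteName, sequence).

    Each origin-destination pair becomes two stops that share a
    RouteName, so each pair is solved as its own route.
    """
    rows = []
    for origin, destination in pairs:
        route_name = f"{origin}{destination}"
        rows.append((origin, route_name, 1))       # origin is visited first
        rows.append((destination, route_name, 2))  # destination second
    return rows

pairs = [("A", "B"), ("A", "C"), ("D", "F")]
stops = build_route_stops(pairs)
for row in stops:
    print(row)
```

Note that A appears twice, once per route it belongs to. With real IDs, a separator in the route name (for example an underscore) would avoid accidental collisions between concatenated names.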
