Count trips at stops. Unexpected results

01-16-2019 06:21 AM
Torbjørn_EidsheimBøe
New Contributor II

Hi. When I run Count Trips at Stops, the results seem to give an unexpectedly low number of trips if a generic weekday is used, and an unexpectedly high number of trips if a specific weekday is used. This of course also gives a big difference in the number of trips per hour. It's hard to know whether these results are correct or due to some error. I have tried using the Python package gtfstk to check for errors, and feed.check_routes() gives an error for many routes, with the comment 'Invalid route type; maybe has extra space characters'. I don't really know if this could be an issue affecting the result. My data has both calendar.txt and calendar_dates.txt. I have not seen any complaints about non-overlapping calendar dates. Any suggestions about the cause (maybe it is to be expected?) or about where to look further for errors are much appreciated.
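One way to look into the route_type message, sketched here with plain pandas rather than gtfstk: read routes.txt as strings and flag values that carry stray whitespace or fall outside the basic route types listed in the GTFS reference (0-7, 11, 12). The helper name `inspect_route_types` and the sample rows are hypothetical; the thread does not show which values triggered the message, and some feeds use extended route types (for example 700), which a strict validator may also flag.

```python
import io
import pandas as pd

# Basic route types from the GTFS reference; extended types (e.g. 700) are not included.
BASIC_ROUTE_TYPES = {"0", "1", "2", "3", "4", "5", "6", "7", "11", "12"}

def inspect_route_types(routes):
    """Hypothetical helper: report whitespace and non-basic values in route_type."""
    raw = routes["route_type"].astype(str)
    return pd.DataFrame({
        "route_id": routes["route_id"],
        "route_type": raw,
        # True when the raw value differs from its stripped form
        "has_whitespace": raw != raw.str.strip(),
        # True when the stripped value is one of the basic GTFS types
        "is_basic_type": raw.str.strip().isin(BASIC_ROUTE_TYPES),
    })

# Made-up sample standing in for routes.txt; dtype=str keeps stray spaces visible.
sample = io.StringIO("route_id,route_type\nr1,3\nr2, 3\nr3,700\n")
routes = pd.read_csv(sample, dtype=str)
report = inspect_route_types(routes)
print(report)
```

For a real feed, replace `sample` with the path to routes.txt. A route_type problem of this kind would not by itself explain a difference between generic-weekday and specific-date counts, so it is probably a separate question.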

5 Replies
MelindaMorang
Esri Regular Contributor

If your calendar_dates.txt file explicitly turns on service for many dates (exception type 1), then I would expect Count Trips at Stops to have much higher counts for an explicit date instead of a generic weekday.
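To make the difference concrete, here is a minimal pandas sketch of the two ways of selecting service. It assumes that a generic weekday looks only at the weekday columns of calendar.txt, while a specific date applies the calendar.txt date range and then the calendar_dates.txt exceptions; the function names and the toy data are hypothetical, and this is not the tool's actual implementation.

```python
import datetime as dt
import pandas as pd

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def services_on_weekday(calendar, weekday):
    """service_ids flagged for a weekday in calendar.txt (assumed generic-weekday logic)."""
    return set(calendar.loc[calendar[weekday] == 1, "service_id"])

def services_on_date(calendar, calendar_dates, date):
    """service_ids active on one date: calendar.txt range and weekday, then exceptions."""
    ymd = int(date.strftime("%Y%m%d"))
    weekday = WEEKDAYS[date.weekday()]
    in_range = (calendar["start_date"] <= ymd) & (calendar["end_date"] >= ymd)
    active = set(calendar.loc[in_range & (calendar[weekday] == 1), "service_id"])
    on_date = calendar_dates[calendar_dates["date"] == ymd]
    # exception_type 1 adds service on that date, 2 removes it
    active |= set(on_date.loc[on_date["exception_type"] == 1, "service_id"])
    active -= set(on_date.loc[on_date["exception_type"] == 2, "service_id"])
    return active

# Toy feed: one weekday service in calendar.txt, two services that exist
# only as added dates in calendar_dates.txt.
calendar = pd.DataFrame({
    "service_id": ["A"],
    **{day: [1] for day in WEEKDAYS[:5]},
    "saturday": [0], "sunday": [0],
    "start_date": [20190101], "end_date": [20191231],
})
calendar_dates = pd.DataFrame({
    "service_id": ["B", "C"],
    "date": [20190114, 20190114],
    "exception_type": [1, 1],
})

generic = services_on_weekday(calendar, "monday")
specific = services_on_date(calendar, calendar_dates, dt.date(2019, 1, 14))
print(sorted(generic), sorted(specific))
```

In this toy feed, services B and C are invisible to the generic Monday and only show up for the specific date, which is the pattern described above.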

Can you post a link to the GTFS dataset so I can take a look?

Torbjørn_EidsheimBøe
New Contributor II

Thanks for replying. Indeed, it seems that many entries have exception type 1.

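>>> import pandas as pd
>>> # assumed: df holds the feed's calendar_dates.txt
>>> df = pd.read_csv('calendar_dates.txt')
>>> # rows that add service on a date (exception_type 1)
>>> df['exception_type'][df['exception_type']==1].count()
6185
>>> # rows that remove service on a date (exception_type 2)
>>> df['exception_type'][df['exception_type']==2].count()
89
>>> # all rows
>>> df['exception_type'].count()
6274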

A total of 76 service_ids in calendar.txt and 319 unique service_ids in calendar_dates.txt.
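A small sketch of how the two sets of service_ids could be compared, to see how many exist only in calendar_dates.txt and are therefore never picked up through calendar.txt alone. The function name `compare_service_ids` and the toy data are hypothetical; for a real feed the two frames would come from `pd.read_csv` on calendar.txt and calendar_dates.txt.

```python
import pandas as pd

def compare_service_ids(calendar, calendar_dates):
    """Hypothetical helper: split service_ids by which file(s) define them."""
    cal_ids = set(calendar["service_id"])
    cd_ids = set(calendar_dates["service_id"])
    return {
        "calendar_only": cal_ids - cd_ids,
        "calendar_dates_only": cd_ids - cal_ids,
        "both": cal_ids & cd_ids,
    }

# Toy data standing in for the two GTFS files.
calendar = pd.DataFrame({"service_id": ["A", "B"]})
calendar_dates = pd.DataFrame({
    "service_id": ["B", "C", "C", "D"],
    "date": [20190101, 20190114, 20190115, 20190114],
    "exception_type": [2, 1, 1, 1],
})

overlap = compare_service_ids(calendar, calendar_dates)
for name, ids in overlap.items():
    print(name, sorted(ids))
```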

Link to the dataset (links to zipfile).

MelindaMorang
Esri Regular Contributor

Okay, so is there still a problem?  It just sounds like the tool is finding all the service that has been "turned on" for the specific date and including that service in the counts.

Torbjørn_EidsheimBøe
New Contributor II

I guess the problem is understanding the data. It seems that specific dates give the most accurate result in this case, because of how calendar.txt and calendar_dates.txt are being used (?).

MelindaMorang
Esri Regular Contributor

Yes, I think so.  The use of calendar.txt and calendar_dates.txt is kind of flexible.  Some agencies use calendar_dates.txt only for exceptions, such as holidays or planned maintenance, but others use it to explicitly turn on service every day.  Some, I suppose, do a mixture of these two.  The GTFS specification allows any of these situations, and you have to understand how your data is constructed so that you can choose the correct analysis settings.  This is understandably a common area of confusion for many users.
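As a rough way to see which of these patterns a feed follows, the sketch below counts added dates (exception_type 1) and removed dates (exception_type 2) per service_id and notes whether the service also appears in calendar.txt. The function name `describe_services` and the toy data are hypothetical; this is one possible summary, not a check defined by the GTFS specification.

```python
import pandas as pd

def describe_services(calendar, calendar_dates):
    """Hypothetical helper: per service_id, count added/removed dates and calendar.txt presence."""
    counts = (
        calendar_dates.groupby(["service_id", "exception_type"])
        .size()
        .unstack(fill_value=0)
        .reindex(columns=[1, 2], fill_value=0)
        .rename(columns={1: "added_dates", 2: "removed_dates"})
    )
    all_ids = sorted(set(calendar["service_id"]) | set(counts.index))
    out = counts.reindex(all_ids, fill_value=0)
    out["in_calendar"] = out.index.isin(calendar["service_id"])
    return out

# Toy data: service A is a regular calendar.txt service with one removed date,
# service B is defined purely by dates added in calendar_dates.txt.
calendar = pd.DataFrame({"service_id": ["A"]})
calendar_dates = pd.DataFrame({
    "service_id": ["A", "B", "B", "B"],
    "date": [20190101, 20190114, 20190115, 20190116],
    "exception_type": [2, 1, 1, 1],
})

summary = describe_services(calendar, calendar_dates)
print(summary)
```

Services with many added dates and no calendar.txt entry suggest the "explicitly turn on service" style, in which case a specific date is likely the more meaningful analysis setting.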