Hi. When I run Count Trips at Stops, the results seem to show an unexpectedly low number of trips if a generic weekday is used, and an unexpectedly high number of trips if a specific weekday is used. This of course also gives a big difference in the number of trips per hour. It's hard to know whether these results are correct or due to some error. I have tried using the Python package gtfstk to check for errors, and feed.check_routes() reports an error for many routes, with the comment 'Invalid route type; maybe has extra space characters'. I don't really know whether this could be affecting the result. My data has both calendar.txt and calendar_dates.txt. I have not seen any complaints about non-overlapping calendar dates. Any suggestions about the cause (maybe it is to be expected?) or about where to look further for errors are much appreciated.
If your calendar_dates.txt file explicitly turns on service for many dates (exception_type 1), then I would expect Count Trips at Stops to give much higher counts for an explicit date than for a generic weekday.
Can you post a link to the GTFS dataset so I can take a look?
Thanks for replying. Indeed, it seems that many entries have exception type 1.
>>> df['exception_type'][df['exception_type']==1].count()
6185
>>> df['exception_type'][df['exception_type']==2].count()
89
>>> df['exception_type'].count()
6274
There are 76 service_ids in calendar.txt and 319 unique service_ids in calendar_dates.txt.
Link to the dataset (links to zipfile).
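The gap between 76 and 319 service_ids suggests that many services may be defined only in calendar_dates.txt. A minimal sketch of how one might list them, using only the standard library; the helper names and the tiny inline sample data are made up for illustration and are not from the dataset above:

```python
import csv
import io

def service_ids(text):
    """Return the set of service_id values found in a GTFS CSV text."""
    return {row["service_id"] for row in csv.DictReader(io.StringIO(text))}

def only_in_calendar_dates(calendar_text, calendar_dates_text):
    """service_ids that appear in calendar_dates.txt but not in calendar.txt."""
    return service_ids(calendar_dates_text) - service_ids(calendar_text)

# Hypothetical sample data standing in for the real files.
SAMPLE_CALENDAR = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
A,1,1,1,1,1,0,0,20240101,20241231
"""
SAMPLE_CALENDAR_DATES = """service_id,date,exception_type
A,20240101,2
C,20240603,1
D,20240603,1
"""

missing = only_in_calendar_dates(SAMPLE_CALENDAR, SAMPLE_CALENDAR_DATES)
print(sorted(missing))  # ['C', 'D']
```

For real files, read the text with open(path, encoding="utf-8-sig") so that a byte-order mark does not end up in the first column name. Services listed this way can only be picked up by an analysis that reads calendar_dates.txt.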
Okay, so is there still a problem? It just sounds like the tool is finding all the service that has been "turned on" for the specific date and including that service in the counts.
I guess the problem is understanding the data. It seems that specific dates give the most accurate result in this case, because of how calendar.txt and calendar_dates.txt are being used (?).
Yes, I think so. The use of calendar.txt and calendar_dates.txt is kind of flexible. Some agencies use calendar_dates.txt only for exceptions, such as holidays or planned maintenance, but others use it to explicitly turn on service every day. Some, I suppose, do a mixture of these two. The GTFS specification allows any of these situations, and you have to understand how your data is constructed so that you can choose the correct analysis settings. This is understandably a common area of confusion for many users.
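To make the difference concrete, here is a rough sketch of the two ways of selecting service. It assumes that a generic-weekday analysis looks only at the weekday columns of calendar.txt, while a specific-date analysis also applies the calendar.txt date range and the calendar_dates.txt exceptions. The function names and sample rows are made up for illustration; this is not the tool's actual implementation.

```python
import csv
import io
from datetime import date

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def read_rows(text):
    """Parse GTFS CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

def services_on_generic_weekday(calendar, weekday):
    """Assumed generic-weekday logic: only the weekday flag in calendar.txt."""
    return {r["service_id"] for r in calendar if r[weekday] == "1"}

def services_on_date(calendar, calendar_dates, day):
    """Specific-date logic: calendar.txt within its date range, then exceptions."""
    ymd = day.strftime("%Y%m%d")  # GTFS dates are YYYYMMDD, so strings compare correctly
    weekday = WEEKDAYS[day.weekday()]
    active = {r["service_id"] for r in calendar
              if r[weekday] == "1" and r["start_date"] <= ymd <= r["end_date"]}
    for r in calendar_dates:
        if r["date"] != ymd:
            continue
        if r["exception_type"] == "1":    # service added on this date
            active.add(r["service_id"])
        elif r["exception_type"] == "2":  # service removed on this date
            active.discard(r["service_id"])
    return active

# Hypothetical sample data: C and D exist only in calendar_dates.txt.
calendar = read_rows("""service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
A,1,1,1,1,1,0,0,20240101,20241231
B,0,0,0,0,0,1,1,20240101,20241231
""")
calendar_dates = read_rows("""service_id,date,exception_type
C,20240603,1
D,20240603,1
B,20240608,2
""")

generic = services_on_generic_weekday(calendar, "monday")
specific = services_on_date(calendar, calendar_dates, date(2024, 6, 3))  # a Monday
print(sorted(generic))   # ['A']
print(sorted(specific))  # ['A', 'C', 'D']
```

In this made-up feed, the specific Monday picks up the two services that are switched on only through calendar_dates.txt, which is the same pattern as the higher specific-date counts discussed in this thread.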