Spatial filter by stop geometries rather than by trip geometries #44

luukvdmeer · 2021-12-16T14:17:02Z

Relates to #43. If I understand correctly, spatial filters with filter_by_sf() currently use the geometry of trips. As the examples clearly show, this often results in a GTFS that has stops (far) outside the area that was used as a filter. When doing local analysis (with possible origins and destinations only within that area), you don't care about these stops outside the area of interest, even if they are part of trips that intersect it. For example, when filtering the Dutch GTFS file for local analysis in Amsterdam, I do want the international train to Paris to be in there for travelling between Amsterdam central station and Amsterdam airport, but I am not interested in keeping the full trip all the way to Paris. Keeping all these other stops makes the GTFS file unnecessarily large and also "harms" clear visualizations of the transport network inside the area of interest.

Therefore I think it would be useful to also allow spatial filters that use the stop geometries. This would keep only the stops that are inside the area of interest. All the trips using these stops would still be included, but only for the part inside the area of interest. This is different from using a "within" predicate in the current implementation, since that removes all trips that intersect the area of interest but are not fully contained in it.
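
A minimal sketch of the stop-based filter proposed above, written in plain Python rather than against the gtfstools R API. The table layout (lists of dicts), the bounding box standing in for the area of interest, the approximate coordinates, and the function name `filter_by_stop_geometry` are all assumptions made for illustration.

```python
def filter_by_stop_geometry(stops, stop_times, bbox):
    """Keep stops inside bbox and only the stop_times rows that use them.

    bbox is (min_lon, min_lat, max_lon, max_lat). A real implementation
    would test against an arbitrary polygon instead of a box.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    kept_stops = [
        s for s in stops
        if min_lon <= s["stop_lon"] <= max_lon
        and min_lat <= s["stop_lat"] <= max_lat
    ]
    kept_ids = {s["stop_id"] for s in kept_stops}
    # Trips are not dropped; they only lose the rows for stops outside the box.
    kept_stop_times = [st for st in stop_times if st["stop_id"] in kept_ids]
    return kept_stops, kept_stop_times


# Hypothetical data: one international trip with two stops near Amsterdam.
stops = [
    {"stop_id": "ams_central", "stop_lon": 4.900, "stop_lat": 52.379},
    {"stop_id": "schiphol", "stop_lon": 4.761, "stop_lat": 52.309},
    {"stop_id": "paris_nord", "stop_lon": 2.355, "stop_lat": 48.881},
]
stop_times = [
    {"trip_id": "intl_1", "stop_id": "ams_central", "stop_sequence": 1},
    {"trip_id": "intl_1", "stop_id": "schiphol", "stop_sequence": 2},
    {"trip_id": "intl_1", "stop_id": "paris_nord", "stop_sequence": 5},
]
amsterdam_bbox = (4.7, 52.2, 5.1, 52.5)

kept_stops, kept_stop_times = filter_by_stop_geometry(
    stops, stop_times, amsterdam_bbox
)
```

In this sketch the trip `intl_1` survives, but only with its two stops inside the box. With a "within" predicate on the trip geometry, the whole trip would be dropped instead.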

There may be situations in which a trip "exits" the area of interest, passes some other stops, and then "enters" the area of interest again, but as mentioned in #43 this is not a problem since stop sequence values in the stop_times table don't have to be consecutive. The only issue to overcome would be how to handle the trip shapes if a shapes table is present.

Happy to contribute if needed and when you think the idea makes sense!

dhersz · 2021-12-16T15:31:26Z

I'm under the impression that this problem will be almost 100% solved if we adjust filter_by_stop_id() as you suggested in #43, and start using convert_stops_to_sf() instead of using get_trip_geometry(file = "stop_times").

The question of how to handle shapes is a tricky one. Let's say, for example, that we want to filter using the "within" predicate, but no shapes are inside the specified geometry. Our resulting GTFS could still include many shapes (that would extend beyond the geometry), because some stops may be within the geometry, and the shapes are kept due to their correspondence with the trips, which in turn relate to those stops.
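
To make that retention chain concrete, here is a rough Python sketch (not gtfstools code). The table layout and the function name `shapes_kept_via_stops` are assumptions. Stops select trips through stop_times, and trips select whole shapes through shape_id, so a shape survives intact even when none of its points fall within the geometry.

```python
def shapes_kept_via_stops(kept_stop_ids, stop_times, trips, shapes):
    """Return every shape point whose shape belongs to a trip serving a kept stop."""
    trip_ids = {
        st["trip_id"] for st in stop_times if st["stop_id"] in kept_stop_ids
    }
    shape_ids = {t["shape_id"] for t in trips if t["trip_id"] in trip_ids}
    # No spatial test is applied to the shape points themselves.
    return [p for p in shapes if p["shape_id"] in shape_ids]


# Hypothetical data: only stop "a" lies within the filter geometry.
stop_times = [
    {"trip_id": "t1", "stop_id": "a"},
    {"trip_id": "t1", "stop_id": "b"},
    {"trip_id": "t2", "stop_id": "c"},
]
trips = [
    {"trip_id": "t1", "shape_id": "s1"},
    {"trip_id": "t2", "shape_id": "s2"},
]
shapes = [
    {"shape_id": "s1", "shape_pt_sequence": 1},
    {"shape_id": "s1", "shape_pt_sequence": 2},
    {"shape_id": "s1", "shape_pt_sequence": 3},
    {"shape_id": "s2", "shape_pt_sequence": 1},
]

kept_shape_points = shapes_kept_via_stops({"a"}, stop_times, trips, shapes)
```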

The question is: should we "trim" the shapes, to keep only the parts that are in fact within the geometry? My initial answer here (and I'd like to know your opinion too) is no. I'll use an example figure to justify my vote:

In this image, the input geometry is in black. The stops within the geometry are in red, and the shape that corresponds to one of the stops' trips is in blue (its points are in blue as well). If we trimmed the shape to the geometry, our result would look somewhat like this:

But that's not the actual trip shape. Any software that consumes this "trimmed GTFS" (say a routing software) would then consider the wrong path between the two existing points (e.g. AFAIK R5 uses the shapes to calculate the leg distance that we see in detailed_itineraries()). The "correctly trimmed" shape would have to look something like this:

But then we would have to come up with some heuristics to correctly trim the shapes, which could in turn lead to more problems and what not. Still, I think trimming shapes may actually be useful in some scenarios, so I think we could create a trim_shapes() (or similar) function to fulfill this need.

luukvdmeer · 2021-12-16T16:07:29Z

Yes, I agree, if a shape goes outside of the area but enters it again later on, that part outside of the area should remain in the shape, such that the shape between two stops is always correct. Extending your example, would that lead to something like this, i.e. trimming the shape to the part between the first and last remaining stop along the corresponding trip?

With the note indeed that a shape does not necessarily intersect with stop locations. This makes the "trimming" task quite a bit harder, because we first have to snap the remaining stop locations to the shape linestrings, but happy to take a look into it.
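
A rough Python sketch of the trimming idea discussed here, under loud assumptions: the function names are hypothetical, stops are snapped to the nearest shape vertex (a simplification; a robust version would project onto the line segments), and squared planar distance on raw coordinates is treated as good enough for illustration. This is not {gtfs2gps}'s or gtfstools' method.

```python
def nearest_vertex_index(point, shape_points, start=0):
    """Index of the shape vertex closest to point, searching from start onward."""
    px, py = point
    return min(
        range(start, len(shape_points)),
        key=lambda i: (shape_points[i][0] - px) ** 2
        + (shape_points[i][1] - py) ** 2,
    )


def trim_shape(shape_points, stop_points):
    """Keep the shape from the first to the last remaining stop, inclusive.

    stop_points must be ordered by stop_sequence. Everything between the two
    snapped vertices is kept, including parts outside the area of interest.
    """
    first = nearest_vertex_index(stop_points[0], shape_points)
    # Search only forward so the last stop cannot snap before the first one.
    last = nearest_vertex_index(stop_points[-1], shape_points, start=first)
    return shape_points[first:last + 1]


# Hypothetical shape that leaves the area around (3, 1) and re-enters it.
shape = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0)]
# Remaining stops, slightly off the shape line.
remaining_stops = [(1, 0.1), (5, -0.1)]

trimmed = trim_shape(shape, remaining_stops)
```

The trimmed shape starts and ends near the remaining stops but still contains the vertex at (3, 1), so the path between the two stops stays correct.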

dhersz · 2021-12-16T18:47:10Z

Yes, your example is perfect. We would have to keep the shape segment that extends from the first to the last filtered stop. Judging by the conversation in #43, I think the decision whether or not to trim the shapes would have to be made inside filter_by_stop_id(), maybe with a trim_shapes argument or something like it. Then filter_by_sf() would also take a trim_shapes argument that is ultimately passed to filter_by_stop_id().

Creating a robust snapping function has been in the plans for a while. The initial plan was to port {gtfs2gps}'s function, but I never actually took the initiative to do so. If you want to look into porting this function or developing your own, I'd be more than happy to take the PR.

Meanwhile, I think a good way forward is to change the behaviour and inputs of filter_by_stop_id() and then adjust filter_by_sf() to start working with this function.

luukvdmeer mentioned this issue Dec 16, 2021

Filtering by stop_id does not really filter by stop_id #43

Closed

dhersz added the v1.1.0 label Jan 25, 2022

dhersz added v1.2.0 and removed v1.1.0 labels May 23, 2022
