-
-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add loop line merger #1083
base: main
Are you sure you want to change the base?
Add loop line merger #1083
Conversation
Full logs: https://github.com/onthegomap/planetiler/actions/runs/11892070532 |
I added some tests for the public functions. I don't really know how private functions are tested in Java. Should I for example write specific tests for the private function |
It really only needs tests for the public api. If there's an internal method you wanted to make sure works the way you expected you can make it package protected (no modifier) so tests can use it since they're in the same package. |
I think something like this would make sense for an API to control those parameters: var merger = new LoopLineMerger()
.setMinLength(0.1)
.setLoopMinLength(0.1)
.setPrecision(new PrecisionModel(16)); // to control rounding
merger.add(lineStrings);
var merged = merger.getMergedLineStrings(); |
I can help look into the performance, I think there will be some low hanging fruit to make it fast. JTS line merge also only considers full linestrings, it doesn't break them up into segments so this one will always be a little slower. The first thing that jumps out at me is creating full LineStrings and reversing/comparing them while building up the lists. I think we might want to have the intermediate code work with lists of coordinates that can be concatenated efficiently, then construct the LineStrings at the very end. |
Great, thanks for the feedback! I added some more tests and marked the methods that I want to test separately as protected. |
I added a |
Thanks for adding the benchmarks. I changed the api just now. Let me have a look at the benchmark... |
Pushed a few updates to the benchmark, basically it creates between 10 and 1000 line segments in the range from (0,0) to (100, 100) with between 10 and 1000 intermediate points then uses both line merge. I disabled the 1000 lines with 1000 points because it currently does not finish with loop line merger 😬 Let me know if you can think of more realistic scenarios? |
I had a look at what is given to line merging if one wants to render
|
I guess these statistics reflect that in OpenStreetMap a typical Regarding the length of the linestrings, actually almost half of them are too short to occupy distinct start and end points on the 1/16 grid:
The bins here are 0.5 * 1/16 wide. That means that the linestrings in the first bracket will snap to the same point on the grid for start and end node. |
Cool, I think we probably want to benchmark a few "average" cases (in terms of num lines/num points) and also a worst-case to make sure it doesn't hang. Feel free to update the benchmark test data generation to be more realistic! |
Thanks for adding the benchmarks @msbarry! The numbers on my machine are currently: |
Some stats when I run over the planet using openmaptiles profile:
so for worst case we might want a case that's on the order of 500k line segments each with 2 points, and a case that's 200 lines with 5000 points each. For average case, maybe 50x2 10x10 and 2x50? |
These number sound perfect. Thanks for investigating |
Oh actually I'll try to grab those actual largest input geometries so we can just test against the real thing. |
Thanks for implementing A*. I looks like it works on my test road networks. Here are the benchmark results from my laptop. Nice that the loop(20) finishes now! The code looks good to me overall. What I am not quite sure is the heuristics part. We are sure that this will always find the shortest path, right? |
How should we handle these sort of junctions: This is after loop removal but before min length line dropping. After min length the connecting segment is gone: When we do simplification, the separation between the two connected lines will increase further. It feels like when we have situations like "short edge" connects to 2 "long edges" on both ends we might want to not drop the short edge. Or actually we can remove the short edge, but keep the connecting lines unmerged such that we have endpoints which will keep the lines there after douglas peuker. What do you think? |
By the way another nice test case is I95 south of Washington. It has 3 carriage ways that sometimes are quite far separated... https://www.openstreetmap.org/#map=15/38.43446/-77.40842 |
A* uses the heuristic to do a "best-first" search from A to B instead of breadth-first. The "best" path is determined using the current path length + the heuristic (straight line distance from the end of the path to B). As long as the heuristic is the minimum possible cost from the end node to B, it's still guaranteed to produce the shortest path. There might be a better name to give it in the code though...
I think that's very similar to the same case that I'm concerned about trying to make transportation_name segments as long as possible (and the failing test I added Instead of splitting segments at 3+ way intersections, we need to pick 2 segments to connect to make a longer one. There might be some heuristic we could use when looking at each intersection, but another idea I was thinking about was to find the "longest shortest path between 2 points" through the graph (I think this is called the "diameter" of the graph), emit that as a linestring, removing it from each of the nodes, then repeat until the remaining lines are shorter than min length. I'm not sure if there's an efficient way to do that though... |
Currently we do this:
With this approach you are right that if minLength > loopMinLength, some loops will be dropped in the "remove all short edges" step. I see this as expected behavior and would recommend to set minLength < loopMinLength. Regarding your proposal to decide at 3-way junctions which two edges to merge I think that is a great idea overall because it leads to longer connected linestrings. But what I can see as a problem is that line simplification will lead to a visual gap. Something like this: Maybe a simple solution to this would be to flip the order, i.e. simply -> merge 3-way junctions for longest possible strings. What do you think? |
Here is a nice crazy 3-junction group of highway=primary roughly at https://www.openstreetmap.org/#map=12/47.5090/8.7469 Another approach could be to merge pairs of lines at 3-way+ junctions that have a mutual angle closest to 180 degrees. I.e., lines that might be the logical continuation of each other. This would help with label placement in MapLibre because there you want flat angles too... And it would be easy to implement since it happens all locally at the nodes. |
I made a separate pull request for angle merging and progressive stub removal in wipfli#18. Let me know if you think these are good directions... |
The angle heuristic sounds like a promising way to break up 3+ way intersections, will be curious to test it on transportation_name layer. Re: simplification we could simplify individual edge geometries without touching the endpoints as a part of line merging process. That would avoid breaking up connections. The only problem is that because we group by attributes first then merge lines with the same attribute so that could break connections that should exist between lines with different tags... |
I have seen this behavior for example here, where I have two layers. One for highway=primary and one for highway=secondary. Before simplification it looks like this (https://www.openstreetmap.org/#map=11/47.1629/8.0278): And after simplification like this: Red lines are nicely connected and purple lines are nicely connected, but between them things are misaligned after simplification. I wonder if we should introduce something like a set of points that cannot be moved by line simplification? |
Nice! The overall approach looks good, made a couple of comments we can either address in that PR or over here after you want to merge it first. I tried it out on transportation_name layer (I had to change FeatureMerge to set setLoopMinLength and minLength to lengthLimit) and the results are looking closer to JTS line merging (except that JTS contains duplicate lines for each lane of the divided highway and loop line merger collapses them to 1 line 👍 ) |
For simplification, I'm trying to figure out what the rule should be for which points we would be required to keep? Within a graph of lines we are merging, it's straightforward because once we've merged all nodes with degree=2 we can simplify each edge individually before the final merge that combines at nodes based on angles. But to preserve connections across multiple graphs it seems like we'd need to preserve any point that also exists in another graph of lines to merge. I'm almost wondering if we need to make the line merger aware of all sets of graphs it needs to merge for it to handle this... |
It might actually prove useful outside of the problem of line merging to have a full graph of the road network somewhere available with edges having all the attributes attached. An interesting problem is the automatic selection of roads and there is quite some literature about it. @brawer has found these two nice papers which we are reading at the moment:
To implement those ideas in Planetiler it would be super useful to have the full network stored as a graph. I am optimistic that results could look quite good even if we do this at the tile level. |
* Progressive stub removal, angle merge after simplification * Review fixes
It wouldn't be too hard to make that change, we would basically need to pull the loop line merger out one level in FeatureMerge loop and move the "group by tags" logic into LoopLineMerge, just want to figure out how much we should try to tackle in this initial PR. Those papers seem pretty promising, and give some solutions I had started thinking about related to graph diameter, etc. Since it appears there is an endless amount of tweaking we can do here, what do you think the goal for this initial PR should be? On my end no worse performance or output quality on transportation and transportation_name layers, and any cases where it does better than JTS is a bonus. Owning the logic in planetiler opens up the door to fixing long-standing issues like properly merging one-way roads, and testing out ideas suggested in those papers later. |
planetiler-core/src/main/java/com/onthegomap/planetiler/FeatureMerge.java
Outdated
Show resolved
Hide resolved
planetiler-core/src/main/java/com/onthegomap/planetiler/FeatureMerge.java
Outdated
Show resolved
Hide resolved
merger = new LoopLineMerger(); | ||
for (var outputSegment : outputSegments) { | ||
merger.add(outputSegment); | ||
} | ||
outputSegments = merger.getMergedByAngle(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's pull merge by angle into loop line merger as an (optional?) step that it runs before returning the results so we don't need to invoke it separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
// result.add(feature1.copyWithNewGeometry((LineString) merged).copyWithExtraAttrs(attrs)); | ||
// attrs.put("end", "yes"); | ||
// result.add(feature1.copyWithNewGeometry(((LineString) merged).getEndPoint()).copyWithExtraAttrs(attrs)); | ||
|
||
// re-simplify since some endpoints of merged segments may be unnecessary | ||
if (line.getNumPoints() > 2 && tolerance >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should tackle moving line simplification into LoopLineMerger in this initial PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we need to do angle merging after simplification to get connected lines through junctions like this
\ /
x ---- x
/ \
planetiler-core/src/main/java/com/onthegomap/planetiler/util/LoopLineMerger.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
private void mergeTwoEdges(Edge a, Edge b) { | ||
assert a.to == b.from; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this assertion is failing in tests, I think it should be assert a.from == b.from;
?
// minLength = 10 * 0.0625; | ||
// removeShortEdges(); | ||
// merge(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need this, otherwise line merger is emitting linestrings shorter than min length which makes the tests fail. If we do angle merging as a part of this process, should we do it before/beetween any of these min length removals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the order should be
- break loops
- drop stubs
- simplify
- merge by angle
- drop by minlength
so simplify should be inside looplinemerger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK great! I can add a simplifyEdges()
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops I see you added that already! Updated it to use a specialized version of DouglasPeuckerSimplifier
that works on List<Coordinate>
to avoid the extra transformation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return result; | ||
} | ||
|
||
public List<LineString> getMergedByAngle() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this public API and make it a configurable part of regular merging.
if (maxLengthToTrack <= 0) { | ||
return Double.POSITIVE_INFINITY; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was an optimization, when things settle a bit more you can try the benchmarks with and without the short-circuiting and if it doesn't make any difference than just remove it so this just computes the full length of the segment.
if (!getEdges().isEmpty() && !end.getEdges().isEmpty()) { | ||
Coordinate a = getEdges().getFirst().coordinates.getFirst(); | ||
Coordinate b = end.getEdges().getFirst().coordinates.getFirst(); | ||
return a.distance(b); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might make sense to just put coordinate on each node so you don't need to look at the edges to get its location.
My initial motivation for this pull request was to be able to render highway=primary at zoom 6 in Switzerland in the Protomaps Basemap with a good amount of line simplification. With the JTS line merger that did not really work, lines looked fragmented and short features were dropped because they were doing loops. So I agree with your goals, and suggest we worry about things like road selection methods or alignement between groupedFeatures at a later point in time. |
eba894f
to
c123593
Compare
Thanks for your review! I think I have the code roughly in the state I want for the highway=primary z6 data I used as a playground. As a next step I will look into the tests and also work on the remaining review comments. |
// TODO remove debug features comment | ||
// Map<String, Object> attrs = new HashMap<>(); | ||
// attrs.put("idx", i++); | ||
// result.add(feature1.copyWithNewGeometry(line.getStartPoint()).copyWithExtraAttrs(attrs)); | ||
// result.add(feature1.copyWithNewGeometry(line).copyWithExtraAttrs(attrs)); | ||
// attrs.put("end", "yes"); | ||
// result.add(feature1.copyWithNewGeometry(line.getEndPoint()).copyWithExtraAttrs(attrs)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove commented code now?
// i = 0; | ||
// for (var outputSegment : outputSegments) { | ||
// Map<String, Object> attrs = new HashMap<>(); | ||
// attrs.put("idx", ++i); | ||
// result.add(feature1.copyWithNewGeometry(outputSegment).copyWithExtraAttrs(attrs)); | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here?
import org.locationtech.jts.geom.PrecisionModel; | ||
|
||
|
||
public class LoopLineMerger { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good, can we just add javadoc to the class explaining the high level idea of what this line merger does, and each public method explaining what the parameter controls?
return result; | ||
} | ||
|
||
public static List<VectorTile.Feature> mergeLineStringsAllParams(List<VectorTile.Feature> features, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't need this extra method any more. FeatureMerge should eventually have a builder API like LoopLineMerge has, but until then should keep the main entry point opinionated about what defaults it uses.
return null; | ||
} | ||
|
||
private Edge mergeTwoEdges(Node node, Edge a, Edge b) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about this small tweak to make a best-effort attempt to preserve line segment directions from the input?
private Edge mergeTwoEdges(Node node, Edge a, Edge b) { | |
private Edge mergeTwoEdges(Node node, Edge edge1, Edge edge2) { | |
// attempt to preserve segment directions from the original line | |
// when: A << N -- B then output C reversed from B to A | |
// when: A >> N -- B then output C from A to B | |
Edge a = edge1.main ? edge2 : edge1; | |
Edge b = edge1.main ? edge1 : edge2; |
This pull request adds a
LoopLineMerger
class which is a replacement for JTSLineMerger
. The new class takes loops in the LineString graph into account. Loops that are shorter than aminLoopLength
will be cut open and only the shortest edge will be kept.Here is a bit of motivation: https://oliverwipfli.ch/improving-linestring-merging-in-planetiler-2024-10-30/
Questions from my side are:
minLoopLength
? At the moment I just set it tolengthLimit
fromFeatureMerge.mergeLineStrings
. Should we maybe add an optionalloopLengthLimitCalculator
?LoopLineMerger
is slower than JTSLineMerger
. I guess it is because they are doing less (no loops taken into account) but also they are doing things more efficiently. Should I investigate in subclassing JTSLineMergeGraph
instead of building my own data structure plus helper functions for the graph?