Fix updates for objects with housenumbers #773
Merged
There has been a long-standing issue that updates of places with housenumbers, as well as of housenumber interpolation objects, do not work properly. These places are added to the Photon database with a special database ID `<place_id>.<housenumber>` in order to allow multiple Photon objects for the same Nominatim place_id. This works fine on import but goes subtly wrong when doing updates, because updates only have information about the new state of a place, not the old one. Thus, it is not really possible to delete the old data for such a place because we don't know what database ID to look it up under.

This PR changes the database ID for such objects to `<place_id>.<seq_nr>`. When a place is inserted that needs to be exploded into multiple Photon documents with different housenumbers, the documents are simply assigned sequential IDs. As a place is always updated as a whole, we can now simply delete all documents matching the pattern `<place_id>.<seq_nr>` by sequentially checking whether such a document exists in the database. If it does, delete it; if not, stop the entire process.

The change does not modify the database schema, so the code happily works with older database dumps. Only when you want to make use of the fixed update function do you need to start off with a new dump created by this new code, or you will see duplicate housenumbers creep into your database.
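
For illustration, the sequential-ID scheme can be sketched roughly as follows. This is a minimal Python model, not the code in this PR: a plain dict stands in for the Photon database, the function names `doc_id`, `insert_place`, `delete_place` and `update_place` are hypothetical, and starting the sequence at 0 is an assumption.

```python
from typing import Dict, List

# Stand-in for the Photon database: maps database ID -> document.
Store = Dict[str, dict]


def doc_id(place_id: int, seq_nr: int) -> str:
    """Build a database ID of the form <place_id>.<seq_nr>."""
    return f"{place_id}.{seq_nr}"


def insert_place(store: Store, place_id: int, housenumbers: List[str]) -> None:
    """Explode one place into one document per housenumber.

    Assumption: sequence numbers start at 0 and have no gaps.
    """
    for seq_nr, housenumber in enumerate(housenumbers):
        store[doc_id(place_id, seq_nr)] = {
            "place_id": place_id,
            "housenumber": housenumber,
        }


def delete_place(store: Store, place_id: int) -> int:
    """Delete documents for a place by probing sequential IDs.

    Stops at the first missing ID, so no knowledge of the old
    housenumbers is needed. Returns the number of deleted documents.
    """
    seq_nr = 0
    while store.pop(doc_id(place_id, seq_nr), None) is not None:
        seq_nr += 1
    return seq_nr


def update_place(store: Store, place_id: int, housenumbers: List[str]) -> None:
    """Replace a place as a whole: drop all old documents, insert the new ones."""
    delete_place(store, place_id)
    insert_place(store, place_id, housenumbers)
```

The probing loop only works because IDs are assigned without gaps. That is also why a dump created with the old `<place_id>.<housenumber>` IDs cannot be cleaned up this way: the probe would miss the old documents and duplicates would remain.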
Tested on a planet database: the update was able to catch up on OSM data at a rate of about one day of data per hour (updating both the Nominatim DB and the Photon DB). This should be sufficient performance-wise.
The PR also finally adds tests for the update process and fixes an off-by-one error in the handling of new-style interpolations.