v3.7.2
This release focuses on improving the scoring quality of the matcher system. Four areas in particular have seen work:
- Improvements to the candidate generation system, which finds possible matches using Elasticsearch. Candidate generation is the step before result scores are computed: it pre-selects possible matches from the OpenSanctions database. It has been reworked to assign higher scores to literal name matches, and to weight the individual terms in a company or person name in more detail (in particular, giving less weight to company type information).
- We've made the `logic-v1` matching implementations for Jaro-Winkler and Metaphone more precise in their ratings: they score higher for close matches and lower for invalid candidates.
- We've introduced a method to assign custom weights to the features in the `logic-v1` algorithm, allowing API users to fine-tune the scoring system to their needs. More information: https://www.opensanctions.org/docs/api/scoring/#tuning
- We've re-introduced the Jaro-Winkler and Soundex implementations from `yente` 3.6.1 and frozen them in place, providing stability to any adopters.
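
As a rough illustration of the custom weights mentioned above, the following sketch builds a matching request that overrides one feature weight. Several details are assumptions and should be checked against the scoring documentation linked above: the `weights` key in the request body, the `/match/{dataset}` path with an `algorithm` query parameter, the feature name `name_literal_match`, and the `http://localhost:8000` base URL for a local instance. The helper functions are hypothetical and not part of `yente`. The code only constructs the request and does not send it.

```python
import json
from typing import Any, Dict
from urllib.parse import urlencode


def build_match_url(base_url: str, dataset: str, algorithm: str = "logic-v1") -> str:
    """Compose the matching endpoint URL (path layout is an assumption)."""
    query = urlencode({"algorithm": algorithm})
    return f"{base_url.rstrip('/')}/match/{dataset}?{query}"


def build_match_body(
    name: str, weights: Dict[str, float], schema: str = "Person"
) -> Dict[str, Any]:
    """Build a request body with one query entity and custom feature weights.

    The "weights" key and the feature names are assumptions for illustration.
    """
    return {
        "weights": dict(weights),
        "queries": {
            "q1": {"schema": schema, "properties": {"name": [name]}},
        },
    }


# Hypothetical local instance and feature name, used for illustration only.
url = build_match_url("http://localhost:8000", "default")
body = build_match_body("Jon Smith", {"name_literal_match": 0.8})

print(url)
print(json.dumps(body, indent=2))
```

The resulting URL and JSON body could then be sent as a POST request with any HTTP client. Features left out of `weights` would presumably keep their default values, but that behaviour is also an assumption to verify.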
What's Changed
- Add schema facet and option to specify which facets are included in the response by @jbothma in #332
- Bump jellyfish from 1.0.0 to 1.0.1 by @dependabot in #333
- Bump elasticsearch[async] from 8.9.0 to 8.10.0 by @dependabot in #334
- Bump fastapi from 0.103.1 to 0.103.2 by @dependabot in #336
New Contributors
Full Changelog: v3.7.0...v3.7.2