Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map NSI-country-names to Osmose country names / Fix failing tests #2404

Merged
merged 2 commits into from
Dec 8, 2024

Conversation

Famlam
Copy link
Collaborator

@Famlam Famlam commented Dec 8, 2024

See #2385

In some cases NSI uses a different country subdivision than Osmose.
This adds a map for such cases.

It doesn't fix cases where:

  1. The subdivision in NSI is finer than in Osmose, e.g. Denmark, which NSI splits in DK-81 etc
  2. Coordinates are used instead

Also my map probably isn't complete, I went through it fairly quickly, but left the code to get the missing entries at the bottom.
Another thing that probably has to be done is to have some library to convert the 3-letter codes (like fra) to 2 letter codes (like fr). Possibly https://github.com/pycountry/pycountry

Also fixes the tests, since the Marseille firefighter case isn't enabled for all of France

p.s. I don't like the map solution very much, since it'll be difficult to maintain, but as mentioned in the issue I don't see many good alternatives...

With the other PR (about incorrect country codes) merged, and ignoring "001", this would be the remaining list of unsupported countries
002
029
039
150
151
155
202
419
830
ag
aldi-sued
aq
aut
bel
bq
bu
conus
crimea
deu
dk-040
dk-81
dk-82
dk-83
dk-84
dk-85
es-ib
esp
fra
gb-abd
gb-abe
gb-bas
gb-bir
gb-con
gb-dev
gb-dnd
gb-dor
gb-east-england
gb-east-midlands
gb-glg
gb-greater-manchester
gb-iow
gb-lan
gb-lon
gb-mik
gb-north-east
gb-north-west
gb-nsm
gb-nwm
gb-ply
gb-som
gb-south-central
gb-south-east-coast
gb-south-west
gb-sry
gb-swd
gb-west-midlands
gb-wil
gb-yorkshire
idn
ie-d
jp-01
jp-08
jp-12
jp-45
jp-46
jp-ntt-east
jp-ntt-west
konsum-dresden
konsum-leipzig
london-cycles
miltonkeynes-cycles
ms,
northern cyprus
nz-auk
nz-bop
nz-can
nz-gis
nz-hkb
nz-mwt
nz-ntl
nz-ota
nz-tas
nz-wgn
nz-wko
ph-00-las_pinas_city
ph-00-mandaluyong_city
ph-00-marikina_city
ph-00-muntinlupa_city
ph-00-navotas_city
ph-00-paranaque_city
ph-00-pasig_city
ph-00-valenzuela_city
ph-01-alaminos_city
ph-01-candon_city
ph-01-dagupan_city
ph-01-san_carlos_city
ph-03-angeles_city
ph-05-naga_city
ph-07-danao_city
ph-07-lapu-lapu_city
ph-07-mandaue_city
ph-07-toledo_city
ph-08-ormoc_city
ph-13-butuan_city
ph-40-batangas_city
ph-40-lipa_city
ph-40-tanauan_city
ph-41-puerto_princesa_city
ph-agn
ph-aur
ph-boh
ph-buk
ph-bul
ph-cas
ph-ceb
ph-dav
ph-din
ph-iln
ph-ils
ph-isa
ph-kal
ph-lag
ph-ley
ph-lun
ph-mas
ph-mdc
ph-msc
ph-msr
ph-nir
ph-nir-bais_city
ph-nir-cadiz_city
ph-nir-la_carlota_city
ph-nir-sagay_city
ph-nir-silay_city
ph-nir-tanjay_city
ph-nsa
ph-nue
ph-nuv
ph-pam
ph-pan
ph-plw
ph-que
ph-sun
ph-sur
ph-zan
ph-zas
ph-zmb
pt-20
pt-30
q19188
q3336843
q644636
sba
stadtmobil-berlin
stadtmobil-hannover
stadtmobil-karlsruhe
stadtmobil-rhein-main
stadtmobil-rhein-neckar
stadtmobil-rhein-ruhr
stadtmobil-stuttgart
stadtmobil-suedbaden
stadtmobil-trier
us-baltimore_and_dc
us-ca-east_bay
us-ca-los_angeles_county
us-ca-orange_county
us-ca-san_francisco
us-ca-san_jose
us-ca-san_luis_obispo_county
us-cat_hood_river
us-fl-florida_keys
us-foodland_eastern
us-ky-peoples_bank_flemingsburg
us-md-baltimore
us-ne-first_state_bank_east
us-ne-first_state_bank_west
us-nv-southern
us-nv-washoe_county
us-ny-new_york_city
us-oh-cuyahoga_county
us-oh-greater_dayton_rta
us-oh-metro_rta
us-peoples_bank_oh

- `Bataillon de marins-pompiers de Marseille` shouldn't match for all of France, since it's restricted to part of France in NSI
- Add a specific test for FR13 instead
- Add a replacement test using LU (which doesn't depend on the other changes in this PR)
@frodrigo
Copy link
Member

frodrigo commented Dec 8, 2024

Loock good to me.

In some case, like bad country code, this could not be fixed on NSI side ?

@Famlam
Copy link
Collaborator Author

Famlam commented Dec 8, 2024

In some case, like bad country code, this could not be fixed on NSI side ?

With the exception of ms, (for which I made a PR at NSI which they just merged) they're unfortunately all valid in their own parser.

Their own parser allows:

  • Coordinates
  • Custom geojson files (which they fortunately frequently name according to the ISO subdivisions that we also use, but sometimes they use a custom subdivision)
  • Codes, see https://ideditor.codes/
    • Wikidata codes for regions
    • 2 letter ISO 3166-1 country codes
    • 3 letter ISO 3166-1 country codes
    • UN M49 codes
    • Some special cases like UK vs GB, EL vs GR

Which means it's all valid, e.g. Q41,300,EL,GR,GRC all refer to Greece. Osmose only supports GR natively.

It's even a bit more complicated because their GB for instance also includes Jersey, while we treat Jersey as independent country... hence my maintenance concerns here.

But for the time being I guess this PR could be a good start...

@frodrigo frodrigo merged commit b00f716 into dev Dec 8, 2024
6 checks passed
@Famlam Famlam deleted the famlam-fix-fr-nsi branch December 8, 2024 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants