Incorporate Geographic Identifiers and Geometries #2

Open
gompertzmakeham opened this issue Feb 26, 2019 · 1 comment
Comments

@gompertzmakeham
Owner

Incorporating geographic identifiers requires dealing with three thorny data volume problems:

  1. Finding a postal code for each interval for each person. When an interval has several candidate postal codes, it is not clear which heuristic should choose among them, particularly because heuristics based on time sorting are computationally expensive at scale.
  2. Finding the local geography code for each postal code requires a non-trivial look-up against a large reference table. Again, running this at scale will be computationally punishing.
  3. Finally, mapping the local geography code to a geometry requires only a look-up against a relatively small reference table, but it then requires incorporating a geometry field.
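
One way to keep problems 2 and 3 manageable might be to resolve each distinct postal code only once, rather than once per person and interval. The Python sketch below illustrates that idea only; the table names, the geography codes, the postal codes, and the WKT strings are all made-up placeholders, and the real reference tables are assumed to be far larger.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical reference tables with made-up values, for illustration only.
POSTAL_TO_GEOGRAPHY: Dict[str, str] = {
    "T1A1A1": "GEO-A",
    "T9Z9Z9": "GEO-B",
}
GEOGRAPHY_TO_GEOMETRY: Dict[str, str] = {
    "GEO-A": "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))",
    "GEO-B": "POLYGON((2 2, 2 3, 3 3, 3 2, 2 2))",
}


def resolve_geometries(
    postal_codes: Iterable[str],
) -> Dict[str, Optional[Tuple[str, Optional[str]]]]:
    """Map each distinct postal code to (geography code, geometry).

    Postal codes missing from the first reference table map to None.
    """
    resolved: Dict[str, Optional[Tuple[str, Optional[str]]]] = {}
    # Deduplicate first so each postal code is looked up only once.
    for code in set(postal_codes):
        geography = POSTAL_TO_GEOGRAPHY.get(code)
        if geography is None:
            resolved[code] = None
            continue
        # The second look-up is against the much smaller geometry table.
        resolved[code] = (geography, GEOGRAPHY_TO_GEOMETRY.get(geography))
    return resolved
```

The resolved mapping can then be joined back to the per-person intervals, so the expensive look-up cost scales with the number of distinct postal codes instead of the number of rows.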
@gompertzmakeham gompertzmakeham self-assigned this Feb 26, 2019
@gompertzmakeham gompertzmakeham added the enhancement New feature or request label Feb 26, 2019
@gompertzmakeham
Owner Author

gompertzmakeham commented Jun 13, 2019

The strategy for efficiently tabulating geography:

  1. Find all unique combinations of days and putative postal codes.
  2. Filter against the postal code list to pull only valid postal codes.
  3. To resolve day collisions, take the alphabetically highest postal code, because unknown postal codes tend to be entered as alphabetically low placeholder codes, e.g. T0T0T0.
  4. For each person's semi-annual interval, find the postal code at the beginning and at the end of the period.
  5. Link in the local geography codes for mapping.
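
Steps 1 to 4 above can be sketched in Python roughly as follows. This is a minimal in-memory illustration, not the project's implementation: the function names, the tuple layout, and the sample postal codes are hypothetical, and "beginning and end of the period" is assumed here to mean the earliest and latest observed days that fall inside the interval.

```python
from datetime import date
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical record layout: (person id, day, putative postal code).
Observation = Tuple[str, date, str]


def daily_postal_codes(
    observations: Iterable[Observation], valid_codes: Set[str]
) -> Dict[Tuple[str, date], str]:
    """Steps 1 to 3: one valid postal code per person per day."""
    daily: Dict[Tuple[str, date], str] = {}
    # Step 1: set() collapses repeated (person, day, postal code) rows.
    for person, day, code in set(observations):
        # Step 2: drop codes that are not in the valid postal code list.
        if code not in valid_codes:
            continue
        key = (person, day)
        # Step 3: on a same-day collision keep the alphabetically highest code.
        if key not in daily or code > daily[key]:
            daily[key] = code
    return daily


def interval_endpoints(
    daily: Dict[Tuple[str, date], str], person: str, start: date, end: date
) -> Tuple[Optional[str], Optional[str]]:
    """Step 4: postal codes on the first and last observed days in [start, end]."""
    days = sorted(
        day for (who, day) in daily if who == person and start <= day <= end
    )
    if not days:
        return None, None
    return daily[(person, days[0])], daily[(person, days[-1])]
```

Taking the maximum per day avoids a full time sort of every record; only the days inside a single person's interval are sorted in step 4.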
