DAS Components
The internal Census Bureau DAS implementation is divided among many
different Git repositories, arranged within the root das_decennial
repo as submodules. The source code included in this redistricting
production release consists of all the source code modules used in
executing the DAS for that production run (on 2020 and 2010 Census
Edited Files). We flattened references to code in different
repositories into a single directory tree. This page summarizes the
individual components found in this flattened tree, marking with
"(repo)" those that constitute separate Git repositories in the
current internal DAS repository.
- `configs` contains default configuration files used by the DAS itself, as INI files read by Python
- `das_framework/ctools` (repo) common tools for working with Census data
- `das_framework` (repo) contains the general (read/protect/write) framework of the DAS engine
- `programs/nodes` implements the classes for the geographic nodes; each instance represents a specific geographic location at a specific level of the geographic hierarchy, with attributes holding, for example, its privacy-protected measurements and, if available, privacy-protected microdata
- `programs/geographic_spines` implements the representation of the geographic hierarchy used by the Top-Down Algorithm (TDA), including optimization of the spine to reduce error in selected off-spine geographies
- `programs/queries` implements the DAS Query classes (especially `DPQuery`), which are the basic units of disclosure avoidance in the DAS
- `programs/optimization` implements code for generating microdata (represented as `numpy` `ndarray` data structures, interpreted as histograms) with minimum distance to the noisy measurements (using the Gurobi solver)
- `programs/invariants` implements the representation of invariants used in the optimization process
- `programs/constraints` implements the various constraints applied during the optimization process for different Census products
- `programs/engine` implements the Top-Down Algorithm (TDA) used for Decennial disclosure avoidance, coordinating the interplay between the other major DAS "protect"-step processes (e.g., taking of noisy measurements, nodes, optimization, queries)
- `programs/reader` reads input files (the CEF files and the geographic files) and transforms them into the geographic tree-of-histogram representation expected by the processing steps implemented in `programs/engine`
- `programs/writers` converts from the Block-level histogram objects generated by `engine` to microdata-formatted MDF files suitable for downstream consumption (e.g., Census tabulation)
- `programs/python_dvs` contains code for the Data Vintaging System, which tracks the provenance of source and generated data files
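
Several of the components above treat microdata and histograms as two views of the same data: a histogram stored as a `numpy` `ndarray` counts how many records fall in each combination of attribute values, and a writer expands those counts back into individual records. The following is a minimal sketch of that round trip, not code from the DAS. The two-attribute schema and the function names `records_to_histogram` and `histogram_to_records` are assumptions made for illustration, and no noise or optimization is applied.

```python
import numpy as np

# Hypothetical schema (illustration only): 2 categories x 3 categories.
SHAPE = (2, 3)


def records_to_histogram(records, shape=SHAPE):
    """Count records into a dense histogram; each record is a tuple of cell indices."""
    hist = np.zeros(shape, dtype=np.int64)
    for rec in records:
        hist[rec] += 1
    return hist


def histogram_to_records(hist):
    """Expand a non-negative integer histogram into one record per counted unit."""
    nonzero = hist > 0
    cells = np.argwhere(nonzero)   # indices of non-empty cells, row-major order
    counts = hist[nonzero]         # matching counts, same row-major order
    expanded = np.repeat(cells, counts, axis=0)
    return [tuple(int(i) for i in row) for row in expanded]


records = [(0, 1), (1, 2), (0, 1), (1, 0)]
hist = records_to_histogram(records)
roundtrip = histogram_to_records(hist)
```

The round trip preserves the multiset of records but not their original order, which is why a histogram is a natural representation when only counts matter.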
Broadly speaking, the DAS code flows, as organized by `das_framework`,
from `reader` to `engine` to `writers`, with the other subfolders
playing supporting roles and defining objects used in one or several
of these major steps.
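
The read/protect/write flow can be pictured with a small toy driver. This is a sketch under stated assumptions rather than the actual `das_framework` API: the class names `ToyReader`, `ToyEngine`, and `ToyWriter`, the function `run_pipeline`, and the INI section and option names are all hypothetical. The "protect" step here is a placeholder transformation, not a disclosure avoidance mechanism.

```python
import configparser

# Hypothetical INI content; section and option names are invented for illustration.
CONFIG_TEXT = """
[reader]
rows = 3

[writer]
prefix = protected
"""


class ToyReader:
    """Stand-in for the read step: produces input records."""

    def __init__(self, config):
        self.rows = config.getint("reader", "rows")

    def read(self):
        return list(range(self.rows))


class ToyEngine:
    """Stand-in for the protect step; a real engine would apply disclosure avoidance."""

    def run(self, data):
        # Placeholder transformation only, NOT a privacy mechanism.
        return [value * 10 for value in data]


class ToyWriter:
    """Stand-in for the write step: formats the engine output for downstream use."""

    def __init__(self, config):
        self.prefix = config.get("writer", "prefix")

    def write(self, data):
        return [f"{self.prefix}:{value}" for value in data]


def run_pipeline(config_text):
    """Parse the INI text, then chain reader -> engine -> writer."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    reader, engine, writer = ToyReader(config), ToyEngine(), ToyWriter(config)
    return writer.write(engine.run(reader.read()))


output = run_pipeline(CONFIG_TEXT)
```

Keeping each step behind its own small class means one stage can be swapped (for example, a different writer) without touching the other two, which is the kind of separation the framework description implies.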