Automated-Data-Quality-Control-STATA

This is a real-world use case: automating quality control for any microeconomic dataset to confirm that the data processing pipeline has not altered the raw data in unintended ways, which protects data integrity and reliability. The script can check multiple individual datasets in a single run and writes a standard-format QA report as an .xlsx file.

[Note] This procedure is currently in use by the World Bank’s Global Education Unit (the Global Education Policy Dashboard team).

What exactly does the procedure include?

The procedure runs checks to report on the following:

Change in the number of unique observations in each dataset between raw and processed data
Change in the number of duplicates between raw and processed data
Change in the number of observations with a missing unique ID between raw and processed data
Change in the number of features/variables between raw and processed data
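
The four checks above can be sketched in a few lines of Stata. This is a minimal illustration, not the code in Raw-Clean_checks.do: it uses Stata's bundled auto dataset as a stand-in for the raw file, a trimmed copy as the processed file, and make as the unique ID. The frame names, the macro names, the output file name, and the definition of "duplicates" (surplus copies of a non-missing ID) are all assumptions made for the example.

```stata
version 16                               // frames need Stata 16 or later

* Stand-in data (assumption): auto plays the raw file, a trimmed copy
* plays the processed file, and make is treated as the unique ID.
local id "make"

capture frame drop raw
capture frame drop clean
frame create raw
frame raw: sysuse auto, clear
frame copy raw clean
frame clean: drop if missing(rep78)      // stand-in processing step
frame clean: drop headroom               // stand-in processing step

foreach stage in raw clean {
    frame `stage' {
        * Number of variables
        quietly describe
        local k_`stage' = r(k)

        * Observations with a missing ID
        quietly count if missing(`id')
        local miss_`stage' = r(N)

        * Distinct non-missing IDs
        tempvar first
        bysort `id': generate byte `first' = (_n == 1) & !missing(`id')
        quietly count if `first'
        local uniq_`stage' = r(N)

        * Duplicates: rows beyond the first for each non-missing ID
        quietly count if !missing(`id')
        local dup_`stage' = r(N) - `uniq_`stage''
    }
}

* Write one row per metric to a small .xlsx report
putexcel set "qa_report_example.xlsx", sheet("QA") replace
putexcel A1 = "metric"
putexcel B1 = "raw"
putexcel C1 = "clean"
putexcel D1 = "change"
local row = 2
foreach m in uniq dup miss k {
    putexcel A`row' = "`m'"
    putexcel B`row' = ``m'_raw'
    putexcel C`row' = ``m'_clean'
    putexcel D`row' = (``m'_clean' - ``m'_raw')
    local ++row
}
```

Keeping the raw and processed versions in separate frames means neither has to be saved and reloaded between checks, which is presumably why the script relies on frames.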

Pre-requisites for error-free implementation

For the script to run without errors, make sure of the following:

Redefine the file paths and data directories (at the top of the script) to match your machine and workflow.
Redefine the macros (globals/locals) at the top of the script to match your data file names and ID variables.
The script uses frames, which were introduced in Stata 16, so Stata 16 or later is required.
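
As a rough guide, a setup block that covers these three points might look like the sketch below. Every path, dataset name, and ID variable here is a hypothetical placeholder, and the macro names are not necessarily the ones used in Raw-Clean_checks.do. The confirm file lines stop the run with an error if an expected file is missing, which is intended.

```stata
* Hypothetical setup block; replace every path and name with your own.
version 16                      // stops with an error on Stata 15 or older

* Folders holding the raw and processed files (assumed layout)
global raw_dir   "C:/project/data/raw"
global clean_dir "C:/project/data/clean"
global out_dir   "C:/project/output"

* Dataset file names (without .dta) and the ID variable of each,
* listed in the same order
local datasets "school teacher"
local ids      "school_code teacher_id"

* Fail early if an expected file is missing
foreach d of local datasets {
    confirm file "${raw_dir}/`d'.dta"
    confirm file "${clean_dir}/`d'.dta"
}
```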

Files in this repository:

Example of the report output.xlsx
LICENSE
README.md
Raw-Clean_checks.do

Mohammed-ElDesouky/Automated-Data-Quality-Control-STATA
