Raw scrapings of https://transparencia.registrocivil.org.br/
The idea is that if we minimize the number of people scraping their website, everyone will benefit, so this repo tries to keep the data as fine-grained as possible. Due to the design of their website, extracting detailed information may be costly.
If you feel any data you need is missing, please open an issue here.
Notice: This repo is just a copy of the data available at the site and isn't responsible for it, please read their documentation.
Also, the site scraping is a continuous, incremental and lengthy process that may introduce additional errors in the data; keep that in mind when analyzing it.
Registrations at https://transparencia.registrocivil.org.br/registros
Monthly entries covering all reported cities and states since 2015. There are multiple sub-types, see below.
name | type | notes |
---|---|---|
start_date | date | yyyy-mm-dd Registration date period start (inclusive) |
end_date | date | yyyy-mm-dd Registration date period end (inclusive) |
state | string | Registration UF code |
state_ibge_code | integer | Registration state ibge code |
city | string | Registration city name, if empty then the totals are state-wide |
city_ibge_code | integer | Registration city ibge code, if empty then the totals are state-wide |
xxxxx_total | integer | Total registrations in the period (e.g. deaths_total) |
created_at | datetime | yyyy-mm-dd hh:mm Approximate time the request to the server was made |
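As a rough illustration of the layout above, the sketch below keeps state-wide rows (empty `city`) apart from city rows, so the two levels are not summed together by accident. The sample rows are made up for the example, and the comma delimiter and the helper name `split_rows` are assumptions; the real files may differ.

```python
import csv
import io

# Made-up sample rows following the column layout above; not real data.
# Assumption: the files are comma-separated with a header row.
SAMPLE = """start_date,end_date,state,state_ibge_code,city,city_ibge_code,deaths_total,created_at
2020-01-01,2020-01-31,SP,35,,,100,2020-02-01 10:00
2020-01-01,2020-01-31,SP,35,São Paulo,3550308,60,2020-02-01 10:00
2020-01-01,2020-01-31,SP,35,Campinas,3509502,10,2020-02-01 10:00
"""


def split_rows(csv_file):
    """Return (state_rows, city_rows); a row with an empty city is state-wide."""
    state_rows, city_rows = [], []
    for row in csv.DictReader(csv_file):
        row["deaths_total"] = int(row["deaths_total"])
        if row["city"]:
            city_rows.append(row)
        else:
            state_rows.append(row)
    return state_rows, city_rows


state_rows, city_rows = split_rows(io.StringIO(SAMPLE))
state_total = sum(r["deaths_total"] for r in state_rows)
city_total = sum(r["deaths_total"] for r in city_rows)
print(state_total, city_total)
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an opened file object, for instance `open("civil_registry_births.csv", newline="", encoding="utf-8")`, and `deaths_total` by the matching `xxxxx_total` column.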
Scrape of all-cause death registrations
Scrape of birth certificate registrations
Scrape of natural-cause deaths at https://transparencia.registrocivil.org.br/especial-covid (from Causas Cardiacas)
Notice: The name covid comes from their panel; the table actually contains natural causes, not only covid deaths.
Daily entries. There are multiple sub-types, see below.
name | type | notes |
---|---|---|
date | date | yyyy-mm-dd Occurrence date |
state | string | Occurrence UF code |
state_ibge_code | integer | Occurrence state ibge code |
city | string | [optional] Occurrence city name |
city_ibge_code | integer | [optional] Occurrence city ibge code |
place | string | [optional] place(s) where the deaths occurred, + separated (hospital, home, public, others) |
gender | string | [optional] F, M |
age_group | string | [optional] age group (9-, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99, 100+, NA) |
deaths_sars | integer | Number of SARS deaths (SRAG) |
deaths_pneumonia | integer | Number of pneumonia deaths (PNEUMONIA) |
deaths_respiratory_failure | integer | Number of respiratory failure deaths (INSUFICIENCIA_RESPIRATORIA) |
deaths_septicemia | integer | Number of septicemia deaths (SEPTICEMIA) |
deaths_indeterminate | integer | Number of indeterminate deaths (INDETERMINADA) |
deaths_others | integer | Number of other deaths (OUTRAS) |
deaths_covid19 | integer | Number of COVID-19 only deaths (COVID) |
deaths_stroke | integer | Number of stroke deaths (AVC) |
deaths_stroke_covid19 | integer | Number of stroke deaths with COVID-19 (COVID_AVC) |
deaths_cardiopathy | integer | Number of cardiopathy deaths (CARDIOPATIA) |
deaths_cardiogenic_shock | integer | Number of cardiogenic shock deaths (CHOQUE_CARD) |
deaths_heart_attack | integer | Number of heart attack deaths (INFARTO) |
deaths_heart_attack_covid19 | integer | Number of heart attack deaths with COVID-19 (COVID_INFARTO) |
deaths_sudden_cardiac | integer | Number of sudden cardiac arrest deaths (SUBITA) |
created_at | datetime | yyyy-mm-dd hh:mm Approximate time the data was produced according to the server |
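A minimal sketch of how one row of this table might be converted to typed values, assuming the formats stated in the notes column (`yyyy-mm-dd`, `yyyy-mm-dd hh:mm`, `+`-separated places). The helper name `parse_row` and the sample values are hypothetical, and the real files may format `created_at` differently.

```python
from datetime import date, datetime


def parse_row(row):
    """Convert the text fields of one row (a dict of strings) to typed values."""
    return {
        "date": date.fromisoformat(row["date"]),
        # "place" is optional and may hold several places joined by "+".
        "places": row["place"].split("+") if row.get("place") else [],
        # Optional columns are mapped to None when missing or empty.
        "gender": row.get("gender") or None,
        "age_group": row.get("age_group") or None,
        "created_at": datetime.strptime(row["created_at"], "%Y-%m-%d %H:%M"),
    }


# Made-up example row; not real data.
example = parse_row({
    "date": "2020-04-01",
    "place": "hospital+home",
    "gender": "",
    "age_group": "60-69",
    "created_at": "2020-04-02 08:30",
})
print(example["places"])
```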
Notice: On the site, there are some displayed aggregations:
Name | Aggregation |
---|---|
COVID-19 | deaths_covid19 + deaths_stroke_covid19 + deaths_heart_attack_covid19 |
Demais óbitos cardiovasculares | deaths_cardiopathy + deaths_cardiogenic_shock + deaths_sudden_cardiac |
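The two aggregations above can be reproduced from a single row along these lines. This is only a sketch: the helper name `site_aggregations` and the sample values are made up, and treating empty cells as zero is an assumption.

```python
COVID_COLUMNS = (
    "deaths_covid19",
    "deaths_stroke_covid19",
    "deaths_heart_attack_covid19",
)
OTHER_CARDIO_COLUMNS = (
    "deaths_cardiopathy",
    "deaths_cardiogenic_shock",
    "deaths_sudden_cardiac",
)


def site_aggregations(row):
    """Sum the columns behind each aggregation displayed on the site."""
    def total(columns):
        # Assumption: a missing or empty cell counts as zero.
        return sum(int(row.get(col) or 0) for col in columns)

    return {
        "COVID-19": total(COVID_COLUMNS),
        "Demais óbitos cardiovasculares": total(OTHER_CARDIO_COLUMNS),
    }


# Made-up example row; not real data.
sample_row = {
    "deaths_covid19": "10",
    "deaths_stroke_covid19": "2",
    "deaths_heart_attack_covid19": "3",
    "deaths_cardiopathy": "4",
    "deaths_cardiogenic_shock": "",
    "deaths_sudden_cardiac": "1",
}
aggregated = site_aggregations(sample_row)
print(aggregated)
```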
Table (without gender or age group) for all 27 Brazilian states, since 2018
Table (without gender or age group) for Brazilian cities over 100,000 population and capitals (about 317), since 2018
Table (with gender and age group) for all 27 Brazilian states, since 2019
Table (with gender and age group) for Brazilian cities over 500,000 population and capitals (about 56), since 2019
Notice: Normally the repo is updated daily, except for the detailed scrapes, which are normally updated weekly (they take more than one day to scrape).
- Added civil_registry_births.csv containing birth certificates
- In order to improve the scraping speed, the detailed covid scrapes now reuse data from older scrapes when no data change is detected in broader queries (quarterly, monthly, etc.). So the created_at columns may reflect older dates, since the data did not change and was reused.
Now includes year 2021
Added cardiac causes, 7 more columns, from deaths_stroke to deaths_sudden_cardiac, as committed at cd7a6b3
Notice: COVID-19 deaths are now split in three columns (deaths_covid19, deaths_stroke_covid19, deaths_heart_attack_covid19), see table above.
Fixes #4 by adding capital cities to detailed, as committed at fbef16
In order to fix #3, the city of "Brasilia" (ibge_code=5300108) now contains the data for the whole state "DF" (ibge_code=53), as committed at a043e3d3
https://www.ibge.gov.br/explica/codigos-dos-municipios.php
Creative Commons Attribution ShareAlike
Please mention the original source and this repo.
- https://brasil.io/dataset/covid19/obito_cartorio/
  - There you can find the equivalent of the civil_registry_covid_states.csv (except no death places) in an easier to digest format
  - Here you can contribute to it: https://github.com/turicas/covid19-br
  - There are plans to bring all the data on this repo to it in the near future
- Portal da Transparência do Registro Civil: https://transparencia.registrocivil.org.br/
- Scraping code is currently kept on a private repo to prevent abuse, but if you are a researcher and want access or more information, contact me at https://twitter.com/capyvara