---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  eval = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```
# opendatascot <img src = "man/figures/logo.svg" alt = "opendatascot logo" align = "right" height = 150/>
<!-- badges: start -->
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip)
<!-- badges: end -->
Use opendatascot to download data from [statistics.gov.scot](http://statistics.gov.scot/home) with a single line of R code. opendatascot removes the need to write SPARQL code; you simply need the URI of a dataset. opendatascot can be used interactively, or as part of a [Reproducible Analytical Pipeline (RAP)](https://analysisfunction.civilservice.gov.uk/support/reproducible-analytical-pipelines/).
## Installation
If you are working within the Scottish Government network, you can install opendatascot in the same way as with other R packages. The easiest way to do this is by using the [pkginstaller](https://github.com/ScotGovAnalysis/pkginstaller/tree/main) add-in. Further guidance is available on [eRDM](https://erdm.scotland.gov.uk:8443/documents/A42404229/details).
Alternatively, opendatascot can be installed directly from GitHub. Note that this method requires the devtools package and may not work from within the Scottish Government network.
```{r, eval = FALSE}
devtools::install_github(
  "ScotGovAnalysis/opendatascot",
  upgrade = "never",
  build_vignettes = TRUE
)
```
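Because the GitHub route depends on devtools, it can help to confirm that devtools is available before running the install. A minimal sketch using base R's `requireNamespace()`; the `has_devtools` variable is only an illustrative name:

```r
# TRUE if devtools can be loaded in this R session, FALSE otherwise
has_devtools <- requireNamespace("devtools", quietly = TRUE)

if (!has_devtools) {
  # Nothing is installed here; this only tells you what to do next
  message("devtools is not installed; run install.packages(\"devtools\") first")
}
```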
Finally, opendatascot can also be installed by downloading the [zip of the repository](https://github.com/ScotGovAnalysis/opendatascot/archive/main.zip) and running the following code, replacing the placeholder in angle brackets (including the brackets themselves) with the location of the downloaded zip:
```{r, eval = FALSE}
devtools::install_local(
  "<FILEPATH OF ZIPPED FILE>/opendatascot-main.zip",
  upgrade = "never",
  build_vignettes = TRUE
)
```
## Usage
Learn more in **vignette("opendatascot")** or **?ods_dataset**.
**ods_all_datasets()** finds all datasets currently loaded onto statistics.gov.scot, along with their publishers.
**ods_dataset()** returns data from a dataset on statistics.gov.scot.
**ods_structure()** finds the full set of categories and values for a particular dataset (helpful for creating new filters for **ods_dataset()**!).
**ods_print_query()** produces the SPARQL query used by **ods_dataset()**.
**ods_find_lower_geographies()** and **ods_find_higher_geographies()** find all geographical areas that are contained by, or contain, a specified geography.
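To inspect the SPARQL behind a download, **ods_print_query()** can be called in place of **ods_dataset()**. The sketch below assumes it accepts the same dataset name and `geography` argument as **ods_dataset()**, which this README does not confirm; check the function's help page for the actual signature. The `requireNamespace()` guard and the `dataset` variable are illustrative additions.

```r
# Dataset name taken from the examples in this README
dataset <- "homelessness-applications"

# Only run when opendatascot is installed
if (requireNamespace("opendatascot", quietly = TRUE)) {
  # Assumption: ods_print_query() takes the same arguments as ods_dataset()
  opendatascot::ods_print_query(dataset, geography = "la")
}
```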
## Examples
Get a dataframe of all datasets on statistics.gov.scot, with their URIs and publishers:
``` {r}
opendatascot::ods_all_datasets()
```
Discover the structure of the homelessness applications dataset, so that we can use it to build a filter later:
``` {r}
opendatascot::ods_structure("homelessness-applications")
```
After viewing the structure, we decide we only want the data for "all-applications" and for the periods "2015/2016" and "2016/2017", so we add these to the filter.
``` {r}
opendatascot::ods_dataset(
  "homelessness-applications",
  applicationType = "all-applications",
  refPeriod = c("2015/2016", "2016/2017")
)
```
If you're only interested in a particular geographical level, you can use the "geography" argument to return only specific levels.
``` {r}
opendatascot::ods_dataset(
  "homelessness-applications",
  geography = "la"
)
```
Options for geography are:<br/>
**"dz"** - returns data zones only<br/>
**"iz"** - returns intermediate zones only<br/>
**"hb"** - returns health boards only<br/>
**"la"** - returns local authorities only<br/>
**"sc"** - returns Scotland as a whole only
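In a scripted pipeline it can be useful to check a geography level before requesting data. The following is a minimal sketch in base R; `geography_levels` and `describe_geography()` are hypothetical names for illustration and are not part of opendatascot.

```r
# Geography levels listed in this README, keyed by the value passed to `geography`
geography_levels <- c(
  dz = "data zones",
  iz = "intermediate zones",
  hb = "health boards",
  la = "local authorities",
  sc = "Scotland as a whole"
)

# Hypothetical helper (not part of opendatascot): returns the description of a
# level, or stops with an error if the level is not one of the options above
describe_geography <- function(level) {
  is_valid <- is.character(level) &&
    length(level) == 1 &&
    level %in% names(geography_levels)
  if (!is_valid) {
    stop(
      "`geography` should be one of: ",
      paste(names(geography_levels), collapse = ", ")
    )
  }
  geography_levels[[level]]
}

describe_geography("la")
```

Calling the helper with a value outside the list, such as `describe_geography("xx")`, stops with an error before any download is attempted.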
## Geography manipulation
If you need to know which geographies contain, or are contained by, another geography, two functions can help:
**ods_find_lower_geographies()** returns a dataframe of all geographies that are contained by the geography you pass it.
**ods_find_higher_geographies()** returns a dataframe of all geographies that contain the geography you pass it.
``` {r}
all_zones_in_iz <- opendatascot::ods_find_lower_geographies("S02000003")
all_zones_in_iz
```
This dataframe can then be used with **ods_dataset()** to get data for these geographies. We just need to select the vector of geography codes and pass it to the refArea filter option:
``` {r}
opendatascot::ods_dataset(
  "house-sales-prices",
  refArea = all_zones_in_iz$geography,
  measureType = "mean",
  refPeriod = "2013"
)
```
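The lookup also works in the other direction. As a sketch, passing the same intermediate zone code to **ods_find_higher_geographies()** should return the geographies that contain it. The `requireNamespace()` guard and the `iz_code` and `containing_areas` names are illustrative additions, and the columns of the returned dataframe are not assumed here.

```r
# Intermediate zone code used in the example above
iz_code <- "S02000003"

# Only run when opendatascot is installed
if (requireNamespace("opendatascot", quietly = TRUE)) {
  # Dataframe of all geographies that contain the intermediate zone
  containing_areas <- opendatascot::ods_find_higher_geographies(iz_code)
  containing_areas
}
```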
## Future development
This package is under active development, so breaking changes may be
necessary. We will make it clear once the package is reasonably stable.
New functionality will be mentioned here when it’s ready. If something
important is missing, feel free to contact the contributors or [add a new
issue](https://github.com/scotgovanalysis/opendatascot/issues).