---
title: "Introduction to R and RStudio"
subtitle: "Session - Showing more {dplyr} functions"
execute:
  eval: true
---
## More {dplyr}
```{r}
#| echo: false
#| label: "libs"
#| include: false
library(readr)
library(dplyr)
```
```{r}
#| echo: false
#| label: "load-data"
beds_data <- read_csv(
  url("https://raw.githubusercontent.com/nhs-r-community/intro_r_data/main/beds_data.csv"),
  col_types = cols(date = col_date(format = "%d/%m/%Y")),
  skip = 3
)
```
The following {dplyr} functions are useful for manipulating data; each is shown with a short example.
## select()
Columns can be selected by name:
```{r}
beds_data |>
  select(org_code,
         org_name)
```
Or by position, including a range written as `from:to`:
```{r}
beds_data |>
  select(3:5)
```
## Deselecting
```{r}
beds_data |>
  select(-org_code)
```
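Several columns can be dropped at once by wrapping them in `c()`. The sketch below uses a small made-up tibble, `toy_beds`, as a stand-in for `beds_data`; its column names and values are assumptions for illustration only.

```{r}
library(dplyr)

# Hypothetical stand-in for beds_data (names and values are made up)
toy_beds <- tibble(
  org_code = c("RA1", "RB2"),
  org_name = c("Trust A", "Trust B"),
  beds_av  = c(120, 80),
  occ_av   = c(100, 75)
)

# Drop both organisation columns in one call
toy_dropped <- toy_beds |>
  select(-c(org_code, org_name))

names(toy_dropped)
```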
## Select everything()
Move a column to the front, then use `everything()` to keep all remaining columns in their existing order:
```{r}
beds_data |>
  select(org_name,
         everything())
```
## Select starts_with()
Select columns whose names start with the same text:
```{r}
beds_data |>
  select(starts_with("org"))
```
`ends_with()` works the same way, matching on the end of the column name.
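As a minimal sketch of `ends_with()`, the chunk below selects columns by suffix. It uses a made-up tibble, `toy_beds`, rather than `beds_data`, so the column names and the `"_av"` suffix are assumptions for illustration.

```{r}
library(dplyr)

# Hypothetical stand-in for beds_data (names and values are made up)
toy_beds <- tibble(
  org_code = c("RA1", "RB2"),
  org_name = c("Trust A", "Trust B"),
  beds_av  = c(120, 80),
  occ_av   = c(100, 75)
)

# Keep only the columns whose names end in "_av"
toy_av <- toy_beds |>
  select(ends_with("_av"))

names(toy_av)
```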
## contains()
Matches a literal string anywhere in the column name, so there is no need for SQL-style %wildcards%:
```{r}
beds_data |>
  select(contains("s_a"))
```
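The sketch below shows the same literal matching on a made-up tibble, `toy_beds` (its column names are assumptions for illustration). It also shows that `contains()` ignores case by default, through its `ignore.case = TRUE` argument.

```{r}
library(dplyr)

# Hypothetical stand-in for beds_data (names and values are made up)
toy_beds <- tibble(
  org_code = c("RA1", "RB2"),
  beds_av  = c(120, 80),
  occ_av   = c(100, 75)
)

# "s_a" appears in "beds_av" but not in "occ_av" or "org_code"
toy_lower <- toy_beds |>
  select(contains("s_a"))

# Upper-case text matches too, because ignore.case defaults to TRUE
toy_upper <- toy_beds |>
  select(contains("S_A"))

names(toy_lower)
names(toy_upper)
```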
## Using n() and n_distinct()
```{r}
beds_data |>
  summarise(number = n(), # number of rows for each org_code
            distinct_number = n_distinct(org_name), # distinct number of org_name
            .by = org_code) |>
  filter(distinct_number > 1) |>
  arrange(desc(distinct_number))
```
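To make the difference between `n()` and `n_distinct()` concrete, the sketch below runs the same summary on a tiny made-up tibble, `toy_orgs`. The codes and names are invented for illustration: one code appears under two different names, which is the situation the filter above looks for.

```{r}
library(dplyr)

# Hypothetical data: "RA1" appears under two different names
toy_orgs <- tibble(
  org_code = c("RA1", "RA1", "RA1", "RB2", "RB2"),
  org_name = c("Trust A", "Trust A (renamed)", "Trust A", "Trust B", "Trust B")
)

toy_counts <- toy_orgs |>
  summarise(number = n(),                           # rows per org_code
            distinct_number = n_distinct(org_name), # unique names per org_code
            .by = org_code)

# Keep only the codes that map to more than one name
toy_flagged <- toy_counts |>
  filter(distinct_number > 1)

toy_counts
toy_flagged
```

Here `n()` counts every row in the group, while `n_distinct()` counts each `org_name` once, so only the code with more than one name passes the filter.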
## End session