session-dplyr-showcase.qmd

---
title: "Introduction to R and Rstudio"
subtitle: "Session  - Showing more {dplyr} functions"
execute:
  eval: true
---

## More {dplyr}

```{r}
#| echo: false
#| label: "libs"
#| include: false
library(readr)
library(dplyr)
```

```{r }
#| echo: false
#| label: "load-data"
beds_data <- read_csv(url("https://raw.githubusercontent.com/nhs-r-community/intro_r_data/main/beds_data.csv"), 
                      col_types = cols(date = col_date(format = "%d/%m/%Y")), 
                      skip = 3)
```

The following are useful functions and some examples of their capabilities for manipulating data.


## select()

Selecting can be by column name

```{r}
beds_data |> 
  select(org_code, 
         org_name)
```

Or position (including a range from:to)

```{r}
beds_data |> 
  select(3:5)
```

## Deselecting

```{r}
beds_data |> 
  select(-org_code)
```

## Select everything()

Re-position a column and then refer to everything else

```{r}
beds_data |> 
  select(org_name,
         everything())
```

## Select starts_with()

Select columns which start with the same text

```{r}
beds_data |> 
  select(starts_with("org"))
```

Also `ends_with()`

## contains()

Searches for strings in the column names without the use of %wildcards%

```{r}
beds_data |> 
  select(contains("s_a"))
```


## Using n() and n_distinct()

```{r }
beds_data |> 
  summarise(number = n(), # distinct number of org_name
            distinct_number = n_distinct(org_name),
            .by = org_code) |> 
  filter(distinct_number > 1) |> 
  arrange(desc(distinct_number))
```

## End session