-
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathun-votes.Rmd
196 lines (172 loc) · 7.68 KB
/
un-votes.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
---
title: "UN Votes: Plotting votes on United Nations resolutions"
description: |
Graphs and analysis using the #TidyTuesday data set for week 13 of 2021
(23/3/2021): "UN Votes"
author:
- name: Ronan Harrington
url: https://github.com/rnnh/
date: 03-30-2021
repository_url: https://github.com/rnnh/TidyTuesday/
preview: un-votes_files/figure-html5/figure2-1.png
output:
distill::distill_article:
self_contained: false
toc: true
---
## Setup and data preparation
Loading the `R` libraries and [data set](https://github.com/rfordatascience/tidytuesday/blob/master/data/2021/2021-03-23/readme.md).
```{r, code_folding=TRUE, include=TRUE}
# Loading libraries
library(forcats)
library(tidytext)
library(tidyverse)
library(tidytuesdayR)
# Loading data
tt <- tt_load("2021-03-23")
```
Wrangling data for visualisation.
```{r, code_folding=TRUE, include=TRUE}
# Extracting US votes on each UN resolution
US_votes <- tt$unvotes %>%
filter(country_code == "US") %>%
select(rcid, vote)
colnames(US_votes) <- c("rcid", "US_vote")
# Combining US votes and resolution issues with the "roll_calls" data set for
# important votes
important_votes <- tt$roll_calls %>%
filter(rcid %in% US_votes$rcid &
importantvote == 1)
important_votes <- left_join(important_votes, tt$unvotes,
by = "rcid")
important_votes <- left_join(important_votes, US_votes,
by = "rcid")
important_votes <- important_votes %>%
filter(country != "United States") %>%
mutate(agree_with_US = vote == US_vote)
important_votes$agree_with_US <- as.factor(important_votes$agree_with_US)
levels(important_votes$agree_with_US)
levels(important_votes$agree_with_US) <- c("Votes in disagreement with US",
"Votes in agreement with US")
important_votes <- left_join(important_votes,
tt$issues, by = "rcid")
important_votes
# Counting the number of votes for each country that coincide with the US vote
# on each resolution
agree_with_US_count <- important_votes %>%
group_by(agree_with_US) %>%
select(country, agree_with_US) %>%
count(agree_with_US, country, sort = TRUE)
agree_with_US_count$country <- as.factor(agree_with_US_count$country)
# Counting the number of votes that are the same/different as the US vote on
# UN resolutions in different categories
votes_by_issue <- important_votes %>%
filter(!is.na(issue)) %>%
select(date, agree_with_US, issue) %>%
group_by(date, issue) %>%
count(agree_with_US)
# Calculating term frequency-inverse document frequency (tf-idf) for the
# descriptions of UN resolutions across different issues
issue_descr_words <- important_votes %>%
filter(!is.na(issue)) %>%
select(issue, descr) %>%
unnest_tokens(word, descr) %>%
count(issue, word, sort = TRUE)
issue_descr_words$issue <- as.factor(issue_descr_words$issue)
# Removing "stop words" from resolution descriptions (e.g. "the", "and", "of")
issue_descr_words <- issue_descr_words %>%
anti_join(stop_words, by = "word")
total_words <- issue_descr_words %>%
group_by(issue) %>%
summarise(total = sum(n))
issue_descr_words <- left_join(issue_descr_words, total_words, by = "issue")
issue_descr_words <- issue_descr_words %>%
bind_tf_idf(word, issue, n)
```
## Plotting the countries that have agreed or disagreed with the most US votes
One of the variables in this data set is `importantvote`, which is used to
record whether or not each United Nations (UN) resolution is classified as
important by the United States (US) State Department. This is taken from its
"Voting Practices in the United Nations" report. This variable is used to select
UN resolutions important to the US. For these important resolutions, the vote
of each country is compared with the US vote. The number of times each country
votes with or against the US is counted; the countries that have
agreed/disagreed on the most resolutions are plotted below.
```{r figure1, code_folding=TRUE, include=TRUE, fig.height=5, fig.width=8}
# Plotting countries that have agreed/disagreed with the most US votes
agree_with_US_count %>%
slice_head(n = 15) %>%
ggplot(aes(n, fct_reorder(country, n), fill = agree_with_US)) +
geom_col(show.legend = FALSE) +
facet_wrap(~agree_with_US, ncol = 2, scales = "free") +
theme_classic() +
labs(x = "Votes on important UN resolutions", y = "Countries",
title = "Votes on important United Nations (UN) resolutions",
subtitle = "Resolution importance classified by the United States (US) State Department")
```
## Plotting votes on UN resolutions on different issues
The comparison of each UN resolution vote with the US vote is also used in this
plot. In this case, it is used to gauge the amount of consensus on each issue.
```{r figure2, code_folding=TRUE, include=TRUE, fig.height=5, fig.width=8}
# Plotting votes on different categories of UN resolutions over time
votes_by_issue %>%
ggplot(aes(x = date, y = n, group = agree_with_US, colour = agree_with_US)) +
geom_line(size = 0.8) +
facet_wrap(~issue, scales = "free") +
theme_classic() +
theme(legend.position = "top") +
theme(legend.text = element_text(size = 12)) +
labs(x = "Time", y = "UN resolution votes",
title = "Votes on UN resolutions over time by issue",
subtitle = "Agreement/disagreement with US vote used to gauge consensus on each issue",
colour = NULL)
```
## Plotting keywords in UN resolution descriptions
The different issues covered by each UN resolution is taken directly from the
data set (`tt$issues`). In this section, significant words used to describe UN
resolutions on each issue are plotted. This is done by...
- taking the UN resolution descriptions in `tt$roll_calls$descrl` as a corpus
- splitting that corpus into a document for each issue in `tt$issues`
- calculating [tf-idf](https://www.tidytextmining.com/tfidf.html) to find
significant words used to describe UN resolutions on each issue
These keywords outline which UN resolution issues pertain to different countries.
For example, resolutions on "Human rights" frequently use the words "situation"
and "Iran"; resolutions on "Economic development" frequently use the words
"embargo" and "Cuba"; resolutions on "Colonialism" frequently use the words
"territories" and "Palestinian".
```{r figure3, code_folding=TRUE, include=TRUE, fig.height=8, fig.width=7}
# Plotting keywords in UN resolution descriptions
issue_descr_words %>%
group_by("issue") %>%
slice_max(tf_idf, n = 35) %>%
ungroup() %>%
ggplot(aes(tf_idf, fct_reorder(word, tf_idf), fill = issue)) +
geom_col(show.legend = FALSE) +
facet_wrap(~issue, ncol = 2, scales = "free") +
labs(x = "tf-idf", y = NULL) +
theme_classic() +
labs(x = "Term frequency-inverse document freqeuncy (tf-idf)",
title = "Keywords in UN resolution descriptions across different categories",
subtitle = "tf-idf applied to resolution descriptions on different issues")
```
## Plotting the percentage of UN votes on each issue in data set
This pie chart illustrates the percentage of UN votes on each issue in the
data set.
```{r figure4, code_folding=TRUE, include=TRUE}
# Plotting percentage of votes in each UN resolution category
important_votes %>%
filter(!is.na(issue)) %>%
select(issue) %>%
group_by(issue) %>%
count(issue) %>%
ungroup() %>%
mutate(perc = round(n/sum(n) * 100)) %>%
ggplot(aes(x = "", y = perc, fill = issue)) +
geom_bar(stat="identity", width=1) +
coord_polar("y", start=0) +
theme_void() +
geom_text(aes(label = paste(perc, "%", sep = "")), position = position_stack(vjust=0.5)) +
labs(title = "Percentage of UN resolution votes on different issues",
subtitle = "Data set covers 6,202 UN resolutions from 1946 to 2019",
fill = "Issue")
```