<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Group 1 Dashboard</title>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css"
integrity="sha384-B0vP5xmATw1+K9KRQjQERJvTumQW0nPEzvF6L/Z6nronJ3oUOFUFpCjEUQouq2+l" crossorigin="anonymous" />
<!-- plotly src code cdn -->
<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
<!-- pyLDAvis stylesheet -->
<link rel="stylesheet" type="text/css"
href="https://cdn.jsdelivr.net/gh/bmabey/[email protected]/pyLDAvis/js/ldavis.v1.0.0.css" />
<link rel="stylesheet" type="text/css" href="deployment/css/style.css" />
</head>
<body>
<div class="wrapper">
<nav class="navbar navbar-expand-lg navbar-light bg-light">
<a class="navbar-brand" href="#index.html"></a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarNavAltMarkup" aria-controls="navbarNavAltMarkup" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarNavAltMarkup">
<div class="navbar-nav">
<!-- took out "home" text -->
<a class="nav-link active" href="#"> <span class="sr-only">(current)</span></a>
<a class="nav-link" href="#AA">Apriori Analysis</a>
<a class="nav-link" href="#SA">Segmentation Analysis</a>
<a class="nav-link" href="#TA">Topic Analysis</a>
</div>
</div>
</nav>
</div>
<div class="jumbotron jumbotron-fluid">
<h1 class="display-4">Analyzing Consumer Behavior</h1>
</div>
<!-- test alternate section heading -->
<div class="section-head">
<h2 id="altintro">Introduction</h2>
<p>
This page summarizes the results and visualizations of our study of consumer behavior.
To explore how big data can help ecommerce companies drive sales and revenue,
we applied machine learning algorithms to Amazon customer data.
</p>
<p>Our research had three goals:</p>
<ol>
<li>Develop a list of items frequently bought together</li>
<li>Create customer segments based on product categories purchased</li>
<li>Build a model to identify main topics included in the customer reviews of a product</li>
</ol>
</div>
<hr class="section-break">
<div class="section-head">
<h2 id="AA">Apriori Analysis</h2>
<hr>
<p>
The Apriori algorithm mines frequent itemsets, and the association rules derived from them, from relational databases.
It relies on two parameters: “support” is how frequently an itemset occurs across all transactions, and “confidence” is the conditional probability that the consequent items are bought given that the antecedent items are bought.
The goal of the analysis is to identify items bought together and display them on the ecommerce website to increase cross-selling and sales.
</p>
</div>
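<!--
Developer note: a minimal sketch of how the support and confidence values described above can be computed.
The function names and the toy transactions are hypothetical illustrations and are not taken from the
project's scripts or data; the analysis behind this page may have been computed differently.

```javascript
// Hypothetical helpers and toy data, for illustration only.
// Each transaction is an array of product IDs bought together.
function support(transactions, itemset) {
  if (transactions.length === 0) return 0;
  // Count the transactions that contain every item of the itemset.
  const hits = transactions.filter((t) => itemset.every((item) => t.includes(item)));
  return hits.length / transactions.length;
}

// confidence(A then B) = support(A and B together) / support(A)
function confidence(transactions, antecedent, consequent) {
  const supportA = support(transactions, antecedent);
  if (supportA === 0) return 0; // the rule never applies in this data
  return support(transactions, [...antecedent, ...consequent]) / supportA;
}

const toyTransactions = [
  ["album", "headphones"],
  ["album", "headphones", "poster"],
  ["album"],
  ["poster"],
];
```

In the toy data, 3 of the 4 baskets contain "album" (support 0.75) and 2 of those 3 also contain
"headphones", so confidence(toyTransactions, ["album"], ["headphones"]) is 2/3.
-->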
<hr>
<div class="container-fluid">
<div class="row">
<div class="col-md-4">
<img src="deployment/apriori/apriori-key-v2.png" class="img-fluid" alt="apriori-graphic-legend">
</div>
<div class="col-md-6">
<img src="deployment/apriori/apriori-network-graph-v6.png" class="img-fluid" alt="apriori-graphic">
</div>
</div>
</div>
<hr>
<div class="container-fluid">
<div class="row">
<div class="col-md-4">
<h4>Item Association by Segment</h4>
</div>
<div class="col-md-8">
<p>
The table below shows the top two results for each product category.
Only music and videos had a confidence higher than 60%. Showing lower-confidence results still carries little risk:
the only downside is a recommendation the consumer may not be interested in, while the upside is increased sales.
</p>
</div>
</div>
</div>
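<!--
Developer note: a minimal sketch of the "top two results per category" selection described above.
The row shape ({ segment, confidence }) and the function name are assumptions made for illustration;
the real structure of apriori_table_data.js is not shown here and may differ.

```javascript
// Assumed row shape: { segment: string, confidence: number }.
function topRulesPerSegment(rows, n = 2) {
  // Group the rows by segment, preserving first-seen segment order.
  const bySegment = new Map();
  for (const row of rows) {
    if (!bySegment.has(row.segment)) bySegment.set(row.segment, []);
    bySegment.get(row.segment).push(row);
  }
  const result = [];
  for (const group of bySegment.values()) {
    // Copy before sorting so the input array is left untouched.
    const ranked = [...group].sort((a, b) => b.confidence - a.confidence);
    result.push(...ranked.slice(0, n));
  }
  return result;
}

// Made-up sample rows, not results from the study.
const sampleRows = [
  { segment: "Music", confidence: 0.4 },
  { segment: "Music", confidence: 0.7 },
  { segment: "Apparel", confidence: 0.3 },
  { segment: "Music", confidence: 0.65 },
];
```

Calling topRulesPerSegment(sampleRows) keeps the two highest-confidence Music rows and the single Apparel row.
-->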
<div class="container-fluid">
<div class="row">
<div class="col-md-3">
<form class="button-form">
<div class="dropdown">
<button class="btn btn-secondary dropdown-toggle" type="button" id="dropdownMenuButton"
data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
Segment
</button>
<div class="dropdown-menu" aria-labelledby="dropdownMenuButton">
<a class="dropdown-item" href="#">Apparel</a>
<a class="dropdown-item" href="#">Furniture</a>
<a class="dropdown-item" href="#">Music</a>
<a class="dropdown-item" href="#">Office Products</a>
<a class="dropdown-item" href="#">Personal Care Appliances</a>
<a class="dropdown-item" href="#">Video Games</a>
<a class="dropdown-item" href="#">Videos</a>
<a class="dropdown-item" href="#">Watches</a>
</div>
</div>
</form>
</div>
</div>
</div>
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<table class="table table-striped">
<thead style="color: #000000">
<tr>
<th>Segment</th>
<th>Antecedent IDs</th>
<th>Antecedent Names</th>
<th>Consequent IDs</th>
<th>Consequent Names</th>
<th>Confidence</th>
</tr>
</thead>
<tbody style="color: #000000"></tbody>
</table>
</div>
</div>
</div>
<hr>
<div class="container-fluid">
<div class="row">
<div class="col-md-4">
<ul>
<li>The histogram shows the number of Apriori associations by itemset size</li>
<li>Three-product itemsets are the most common, with about 5,909 associations</li>
<li>Eight-product itemsets are the least common, with only 9 associations</li>
<li>In total, the Apriori analysis produced 23,590 recommendations</li>
</ul>
</div>
<div class="col-md-8">
<img src="deployment/apriori/apriori-frequency.png"
class="img-fluid" alt="apriori-frequency">
</div>
</div>
</div>
<hr class="section-break">
<div class="section-head">
<h2 id="SA">Segmentation Analysis</h2>
<p>
The data covers eight product categories: apparel, furniture, music, watches, personal care appliances, office products, videos, and video games.
For each customer, it was consolidated into the quantity of products bought in each category. K-means clustering was the machine learning method used,
since it is an unsupervised model that groups data into clusters or, in this case, customer segments.
</p>
</div>
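<!--
Developer note: a minimal sketch of one K-means iteration (assign each customer to the nearest
centroid, then move each centroid to the mean of its members). The function names, the
two-category toy vectors, and the starting centroids are hypothetical; this is not the code or
data used to produce the clusters shown on this page.

```javascript
// Squared Euclidean distance between two equal-length numeric vectors.
function squaredDistance(a, b) {
  return a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0);
}

// Index of the centroid closest to the given point.
function nearestCentroid(point, centroids) {
  let best = 0;
  for (let i = 1; i < centroids.length; i += 1) {
    if (squaredDistance(point, centroids[i]) < squaredDistance(point, centroids[best])) {
      best = i;
    }
  }
  return best;
}

// Move each centroid to the mean of the points assigned to it.
function updateCentroids(points, labels, centroids) {
  return centroids.map((old, c) => {
    const members = points.filter((_, i) => labels[i] === c);
    if (members.length === 0) return old; // leave an empty cluster's centroid in place
    return old.map((_, d) => members.reduce((s, p) => s + p[d], 0) / members.length);
  });
}

// Toy customers: [apparel quantity, furniture quantity].
const toyCustomers = [[5, 0], [4, 1], [0, 6]];
const startCentroids = [[5, 0], [0, 5]];
const toyLabels = toyCustomers.map((p) => nearestCentroid(p, startCentroids));
const nextCentroids = updateCentroids(toyCustomers, toyLabels, startCentroids);
```

A full K-means run repeats these two steps until the labels stop changing.
-->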
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<!-- plotly viz goes here -->
<div id="kmeans-chart"></div>
</div>
</div>
</div>
<hr>
<div class="container-fluid">
<div class="row">
<div class="col-md-4">
<img src="deployment/segment-analysis/clusters-v3-large.png" class="img-fluid" alt="clusters-v1">
</div>
<div class="col-md-1">
</div>
<div class="col-md-4">
<h4>Results:</h4>
<ul>
<li>Cluster 0: Apparel</li>
<li>Cluster 1: Personal Care Appliances</li>
<li>Cluster 2: Furniture</li>
<li>Cluster 3: Office Products</li>
<li>Cluster 4: Multi-Category</li>
</ul>
<h4>Insights:</h4>
<ul>
<li>The largest segment is Cluster 4 (Multi-Category): 44% of customers buy products from multiple categories</li>
<li>Cluster 2 (Furniture) is a priority target for future marketing campaigns that aim to broaden the purchasing categories of existing customers, since furniture buyers are already represented in Cluster 4, the Multi-Category segment</li>
<li>Additional campaigns could target Clusters 0, 1, and 3 with discounts in other product categories to encourage a broader product mix and more sales</li>
</ul>
</div>
</div>
</div>
<hr class="section-break">
<div class="section-head">
<h2 id="TA">Topic Analysis</h2>
<p>
For this analysis, one specific product was selected: Product ID B000M0MJU2, an air mattress.
A Latent Dirichlet Allocation (LDA) machine learning model was used to identify topics within the customer reviews.
To make the results easier to interpret, the analysis was split into bad (1-star) and good (5-star) reviews.
</p>
<p>
The bubble charts below show the output of the analysis. Each bubble represents a different topic; the larger the bubble,
the larger the share of the review corpus that topic covers.
The blue bars show the overall frequency of each word in the corpus; when no topic is selected, they display the most frequently used words.
The red bars give the estimated number of times a given term was generated by a given topic.
The farther apart two bubbles are, the more the topics differ.
</p>
<ul>
<li>Good and bad reviews share similar words across topics, but with different connotations</li>
<li>The analysis can be biased by the person interpreting the outputs, and the meaning of a topic is hard to extract</li>
<li>Topics are hard to tell apart because they share similar words and feedback, so the method is recommended only for a superficial analysis</li>
<li>The corpus needs improvement, such as combining related words, for a more accurate analysis</li>
</ul>
</div>
<hr>
<div class="container-fluid">
<div class="col-md-12">
<h3>5 Star Reviews</h3>
<img id="cloud" src="deployment/topic-analysis/5_star_word_cloud_white-v3.png" class="img-fluid" alt="5-star-word-cloud">
</div>
</div>
<div class="container-fluid" id="special-container">
<div class="col-md-12">
<h3>5 Star LDA</h3>
<div id="five_star_LDA"></div>
</div>
</div>
<hr>
<div class="container-fluid">
<div class="col-md-12">
<h3>1 Star Reviews</h3>
<img id="cloud" src="deployment/topic-analysis/1_star_word_cloud_white-v3.png" class="img-fluid" alt="1-star-word-cloud">
</div>
</div>
<div class="container-fluid" id="special-container">
<div class="col-md-12">
<h3>1 Star LDA</h3>
<div id="one_star_LDA"></div>
</div>
</div>
<hr>
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js"
integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous">
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.6/umd/popper.min.js"
integrity="sha384-wHAiFfRlMFy6i5SRaxvfOCifBUQy1xHdJ/yoi7FRNXMRBu5WHdZYu1hA6ZOblgut" crossorigin="anonymous">
</script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/js/bootstrap.min.js"
integrity="sha384-B0UglyR+jN6CkvvICOB2joaf5I4l3gm9GU6Hc1og6Ls7i6U/mkkaduKaBhlAXv9k" crossorigin="anonymous">
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/4.11.0/d3.js"></script>
<!-- apriori table src paths -->
<script src="deployment/apriori/apriori_table_data.js"></script>
<script src="deployment/apriori/apriori_results_table.js"></script>
<!-- segment analysis src paths -->
<script src="deployment/segment-analysis/kmeans-data.json"></script>
<script src="deployment/segment-analysis/kmeans-chart.js"></script>
<!-- topic analysis source paths -->
<script src="deployment/topic-analysis/chart.js"></script>
<script src="deployment/topic-analysis/data.js"></script>
</body>