---
---
# [Developing Data Products](index.html)
## Welcome
I'm glad that you decided to take Developing Data Products,
part of the [Data Science Specialization](https://www.coursera.org/specializations/jhu-data-science/)
from Johns Hopkins Biostatistics!
A data product is the production output from a statistical
analysis. Data products automate complex analysis tasks or
use technology to expand the utility of a data-informed
model, algorithm, or inference. This course covers the basics
of creating data products using Shiny, R packages, and
interactive graphics. This course focuses on the statistical
fundamentals of creating a data product that can be used to
tell a story about data to a mass audience.
You will learn how to communicate using statistics and
statistical products. Emphasis will be placed on communicating
uncertainty in statistical results. You will learn how to
create simple Shiny web applications and R packages for
your data products. In addition, we'll cover reproducible
presentations and interactive graphics.
We believe that the key word in Data Science is "science".
Our specialization is focused on providing you with three
things: (1) an introduction to the key ideas behind working
with data in a scientific way that will produce new and
reproducible insight, (2) an introduction to the tools that
will allow you to execute a data analytic strategy, from
raw data in a database to a completed report with
interactive graphics, and (3) plenty of hands-on
practice so you can learn the techniques for yourself.
This course represents the final cog in a data science
application, creating an end-usable data product.
We are excited about the opportunity to attempt to scale
Data Science education. We intend for the courses to be
self-contained, fast-paced, and interactive.
## Some Basics
A couple of first-week housekeeping items. First, make sure
that you've taken R Programming and The Data Scientist's
Toolbox. Reproducible Research would be helpful, but is not
mandatory. At a minimum you must know very basic Git, basic
R, and very basic knitr.
An important part of this class is perusing the materials
in the GitHub repository. The most up-to-date
material can be found here:
https://github.com/DataScienceSpecialization/Developing_Data_Products
You should clone this repository as your first step in this
class and make sure to fetch updates periodically. (Please
send pull requests too!) Using Git frequently is one of the
most essential components of the Specialization. We practice
what we preach by using the tools taught in the series,
especially Git, to create the series.
You can clone the whole repo over HTTPS or SSH:
- HTTPS: `git clone https://github.com/DataScienceSpecialization/Developing_Data_Products.git`
- SSH: `git clone git@github.com:DataScienceSpecialization/Developing_Data_Products.git`
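
Once you have a clone, fetching updates periodically could look something like the sketch below. It is only one way to do it: the helper name `update_course_repo` is hypothetical (not part of the course materials), and it assumes the clone lives in a directory called `Developing_Data_Products`, which is the name `git clone` picks by default for this URL.

```shell
# Hypothetical helper (an assumption, not from the course materials):
# bring an existing clone of the course repository up to date.
update_course_repo() {
  # Path to the clone; defaults to the directory name git clone creates.
  repo_dir="${1:-Developing_Data_Products}"

  # Download new commits and fast-forward the current branch.
  # --ff-only makes git stop instead of creating a merge commit
  # when your local history has diverged from the remote.
  git -C "$repo_dir" pull --ff-only
}
```

Run `update_course_repo` from the directory that contains the clone, or pass the path explicitly, for example `update_course_repo ~/courses/Developing_Data_Products` (an illustrative path).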
The lectures are in the index.Rmd lecture files. In this
class, we'll cover how to create these sorts of slides. You
will see all of the R code to recreate the lectures. Going
through the R code is the best way to familiarize yourself
with the lecture materials.
**The lecture material for this class is largely front-loaded
because the latter part of the class is devoted to
developing your data application.** The class should therefore
be doable in about a month or less. Make sure you keep up
with the lectures at the beginning, though, so that you have
some space in your schedule later on for app
development!
If you'd like to keep up with the instructors, I'm
[\@bcaffo](http://twitter.com/bcaffo)
on Twitter, Roger is [\@rdpeng](http://twitter.com/rdpeng), and
Jeff is [\@jtleek](http://twitter.com/jtleek). The
Department of Biostatistics here is
[\@jhubiostat](http://twitter.com/jhubiostat).
---
[**Back to Developing Data Products Home**](index.html)