diff --git a/README.md b/README.md index c400a7fc..6495f35e 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,4 @@ -Bayesian Statistics with Julia and Turing -================ +# Bayesian Statistics using Julia and Turing [![CC BY-SA 4.0](https://img.shields.io/badge/License-CC%20BY--SA%204.0-lightgrey.svg)](http://creativecommons.org/licenses/by-sa/4.0/) @@ -13,11 +12,19 @@ Bayesian for Everyone! -Welcome to the repository of tutorials on how to do **Bayesian Statistics** using [**Julia**](https://www.julialang.org) and [**Turing**](http://turing.ml/). Tutorials are available at [storopoli.github.io/Bayesian-Julia](https://storopoli.github.io/Bayesian-Julia). +Welcome to the repository of tutorials on how to do **Bayesian Statistics** using [**Julia**](https://www.julialang.org) and [**Turing**](http://turing.ml/). +Tutorials are available at [storopoli.github.io/Bayesian-Julia](https://storopoli.github.io/Bayesian-Julia). -**Bayesian statistics** is an approach to inferential statistics based on Bayes' theorem, where available knowledge about parameters in a statistical model is updated with the information in observed data. The background knowledge is expressed as a prior distribution and combined with observational data in the form of a likelihood function to determine the posterior distribution. The posterior can also be used for making predictions about future events. +**Bayesian statistics** is an approach to inferential statistics based on Bayes' theorem, +where available knowledge about parameters in a statistical model is updated with the information in observed data. +The background knowledge is expressed as a prior distribution and combined with observational data in the form of a likelihood function to determine the posterior distribution. +The posterior can also be used for making predictions about future events. -**Bayesian statistics** is a departure from classical inferential statistics that prohibits probability statements about parameters and is based on asymptotically sampling infinite samples from a theoretical population and finding parameter values that maximize the likelihood function. Mostly notorious is null-hypothesis significance testing (NHST) based on *p*-values. Bayesian statistics **incorporate uncertainty** (and prior knowledge) by allowing probability statements about parameters, and the process of parameter value inference is a direct result of the **Bayes' theorem**. +**Bayesian statistics** is a departure from classical inferential statistics that prohibits probability statements about parameters and +is based on asymptotically sampling infinite samples from a theoretical population and finding parameter values that maximize the likelihood function. +Mostly notorious is null-hypothesis significance testing (NHST) based on *p*-values. +Bayesian statistics **incorporate uncertainty** (and prior knowledge) by allowing probability statements about parameters, +and the process of parameter value inference is a direct result of the **Bayes' theorem**. ## Table of Contents @@ -37,7 +44,12 @@ Welcome to the repository of tutorials on how to do **Bayesian Statistics** usin ## Julia -[**Julia**](https://www.julialang.org) is a fast dynamic-typed language that just-in-time (JIT) compiles into native code using LLVM. It ["runs like C but reads like Python"](https://www.nature.com/articles/d41586-019-02310-3), meaning that is *blazing* fast, easy to prototype and read/write code. 
It is multi-paradigm, combining features of imperative, functional, and object-oriented programming. I won't cover Julia basics and any sort of data manipulation using Julia in the tutorials, instead please take a look into the following resources which covers most of the introduction to Julia and how to work with tabular data in Julia:
+[**Julia**](https://www.julialang.org) is a fast dynamic-typed language that just-in-time (JIT) compiles into native code using LLVM.
+It ["runs like C but reads like Python"](https://www.nature.com/articles/d41586-019-02310-3),
+meaning that it is *blazing* fast, easy to prototype and to read/write code.
+It is multi-paradigm, combining features of imperative, functional, and object-oriented programming.
+I won't cover Julia basics or any sort of data manipulation using Julia in the tutorials;
+instead, please take a look at the following resources, which cover most of the introduction to Julia and how to work with tabular data in Julia:

* [**Julia Documentation**](https://docs.julialang.org/): Julia documentation is a very friendly and well-written resource that explains the basic design and functionality of the language.
* [**Julia Data Science**](https://juliadatascience.io): open source and open access book on how to do Data Science using Julia.
@@ -47,7 +59,8 @@ Welcome to the repository of tutorials on how to do **Bayesian Statistics** usin

## Turing

-[**Turing**](http://turing.ml/) is an ecosystem of Julia packages for Bayesian Inference using [probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming). Models specified using Turing are easy to read and write — models work the way you write them. Like everything in Julia, Turing is [fast](https://arxiv.org/abs/2002.02702).
+[**Turing**](http://turing.ml/) is an ecosystem of Julia packages for Bayesian Inference using [probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming).
+Models specified using Turing are easy to read and write — models work the way you write them. Like everything in Julia, Turing is [fast](https://arxiv.org/abs/2002.02702).

## Author

@@ -55,7 +68,11 @@ Jose Storopoli, PhD - [*Lattes* CV](http://lattes.cnpq.br/2281909649311607) - [O

## How to use the content?

-The content is licensed under a very permissive Creative Commons license (CC BY-SA). You are mostly welcome to contribute with [issues](https://www.github.com/storopoli/Bayesian-Julia/issues) and [pull requests](https://github.com/storopoli/Bayesian-Julia/pulls). My hope is to have **more people into Bayesian statistics**. The content is aimed towards social scientists and PhD candidates in social sciences. I chose to provide an **intuitive approach** rather than focusing on rigorous mathematical formulations. I've made it to be how I would have liked to be introduced to Bayesian statistics.
+The content is licensed under a very permissive Creative Commons license (CC BY-SA).
+You are most welcome to contribute with [issues](https://www.github.com/storopoli/Bayesian-Julia/issues) and [pull requests](https://github.com/storopoli/Bayesian-Julia/pulls).
+My hope is to get **more people into Bayesian statistics**. The content is aimed towards social scientists and PhD candidates in social sciences.
+I chose to provide an **intuitive approach** rather than focusing on rigorous mathematical formulations.
+I've made it the way I would have liked to have been introduced to Bayesian statistics.
To configure a local environment:

@@ -85,14 +102,40 @@ To configure a local environment:
11. [**Computational Tricks with Turing (Non-Centered Parametrization and QR Decomposition)**](https://storopoli.github.io/Bayesian-Julia/pages/11_Turing_tricks/)
12. [**Epidemiological Models using ODE Solvers in Turing**](https://storopoli.github.io/Bayesian-Julia/pages/12_epi_models/)

+## Datasets
+
+- `kidiq` (linear regression): data from a survey of adult American women and their children
+  (a subsample from the National Longitudinal Survey of Youth).
+  Source: Gelman and Hill (2007).
+- `wells` (logistic regression): a survey of 3200 residents in a small area of Bangladesh suffering
+  from arsenic contamination of groundwater.
+  Respondents with elevated arsenic levels in their wells had been encouraged to switch their water source
+  to a safe public or private well in the nearby area,
+  and the survey was conducted several years later to
+  learn which of the affected residents had switched wells.
+  Source: Gelman and Hill (2007).
+- `esoph` (ordinal regression): data from a case-control study of (o)esophageal cancer in Ille-et-Vilaine, France.
+  Source: Breslow and Day (1980).
+- `roaches` (Poisson regression): data on the efficacy of a pest management system at reducing the number of roaches in urban apartments.
+  Source: Gelman and Hill (2007).
+- `duncan` (robust regression): data on occupational prestige, which contains outliers.
+  Source: Duncan (1961).
+- `cheese` (hierarchical models): data from cheese ratings.
+  A group of 10 rural and 10 urban raters rated 4 different types of cheese (A, B, C, and D) in two samples.
+  Source: Boatwright, McCulloch and Rossi (1999).
+
## What about other Turing tutorials?

-Despite not being the only Turing tutorial that exists, this tutorial aims to introduce Bayesian inference along with how to use Julia and Turing. Here is a (not complete) list of other Turing tutorials:
+Despite not being the only Turing tutorial that exists, this tutorial aims to introduce Bayesian inference along with how to use Julia and Turing.
+Here is a (non-exhaustive) list of other Turing tutorials:

1. [**Official Turing Tutorials**](https://turing.ml/dev/tutorials/): tutorials on how to implement common models in Turing
2. [**Statistical Rethinking - Turing Models**](https://statisticalrethinkingjulia.github.io/TuringModels.jl/): Julia versions of the Bayesian models described in *Statistical Rethinking* Edition 1 (McElreath, 2016) and Edition 2 (McElreath, 2020)
3. [**Håkan Kjellerstrand Turing Tutorials**](http://hakank.org/julia/turing/): a collection of Julia Turing models

+I also have a free and open-source graduate course on Bayesian Statistics with Turing and Stan code.
+You can find it at [`storopoli/Bayesian-Statistics`](https://github.com/storopoli/Bayesian-Statistics).
+
## How to cite

To cite these tutorials, please use:

@@ -108,6 +151,10 @@ Or in BibTeX format (LaTeX):
  year = {2021}
}

+## References
+
+The references are divided into **books**, **papers**, **software**, and **datasets**.
+
### Books

* Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). *Bayesian Data Analysis*. Chapman and Hall/CRC.

@@ -126,7 +173,7 @@ Or in BibTeX format (LaTeX):
* Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. *Nature*, *567*(7748), 305–307.
* van de Schoot, R., Kaplan, D., Denissen, J., Asendorpf, J. B., Neyer, F. J., & van Aken, M. A. G. (2014).
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research. *Child Development*, *85*(3), 842–860.

-### Software References
+### Software

* Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). Julia: A fresh approach to numerical computing. SIAM Review, 59(1), 65–98.
* Ge, H., Xu, K., & Ghahramani, Z. (2018). Turing: A Language for Flexible Probabilistic Inference. International Conference on Artificial Intelligence and Statistics, 1682–1690. http://proceedings.mlr.press/v84/ge18b.html
@@ -134,9 +181,16 @@ Or in BibTeX format (LaTeX):
* Xu, K., Ge, H., Tebbutt, W., Tarek, M., Trapp, M., & Ghahramani, Z. (2020). AdvancedHMC.jl: A robust, modular and efficient implementation of advanced HMC algorithms. Symposium on Advances in Approximate Bayesian Inference, 1–10. http://proceedings.mlr.press/v118/xu20a.html
* Revels, J., Lubin, M., & Papamarkou, T. (2016). Forward-Mode Automatic Differentiation in Julia. ArXiv:1607.07892 [Cs]. http://arxiv.org/abs/1607.07892

+### Datasets
+
+- Boatwright, P., McCulloch, R., & Rossi, P. (1999). Account-level modeling for trade promotion: An application of a constrained parameter hierarchical model. _Journal of the American Statistical Association_, 94(448), 1063–1073.
+- Breslow, N. E. & Day, N. E. (1980). **Statistical Methods in Cancer Research. Volume 1: The Analysis of Case-Control Studies**. IARC Lyon / Oxford University Press.
+- Duncan, O. D. (1961). A socioeconomic index for all occupations. Class: Critical Concepts, 1, 388–426.
+- Gelman, A., & Hill, J. (2007). **Data analysis using regression and
+  multilevel/hierarchical models**. Cambridge University Press.

## License

This content is licensed under [Creative Commons Attribution-ShareAlike 4.0 International](http://creativecommons.org/licenses/by-sa/4.0/).

-[![CC BY-SA 4.0](https://licensebuttons.net/l/by-sa/4.0/88x31.png)](http://creativecommons.org/licenses/by-sa/4.0/)
+[![CC BY-SA 4.0](https://licensebuttons.net/l/by-sa/4.0/88x31.png)](http://creativecommons.org/licenses/by-sa/4.0/)
\ No newline at end of file
diff --git a/_literate/1_why_Julia.jl b/_literate/1_why_Julia.jl
index 84b2069a..b1831118 100644
--- a/_literate/1_why_Julia.jl
+++ b/_literate/1_why_Julia.jl
@@ -1,8 +1,9 @@
# # Why Julia?

-# [Julia](https://www.julialang.org) (Bezanson, Edelman, Karpinski & Shah, 2017) is a relatively new language, first released in 2012, aims to be both **high-level** and **fast**.
-# Julia is a fast dynamic-typed language that just-in-time (JIT)
-# compiles into native code using LLVM. It ["runs like C but reads like Python"](https://www.nature.com/articles/d41586-019-02310-3),
+# [Julia](https://www.julialang.org) (Bezanson, Edelman, Karpinski & Shah, 2017) is a relatively new language,
+# first released in 2012, that aims to be both **high-level** and **fast**.
+# Julia is a fast dynamic-typed language that just-in-time (JIT) compiles into native code using LLVM.
+# It ["runs like C but reads like Python"](https://www.nature.com/articles/d41586-019-02310-3),
# meaning that it is *blazing* fast, easy to prototype and read/write code.
# It is **multi-paradigm**, combining features of imperative, functional, and object-oriented programming.

@@ -16,22 +17,26 @@
# > Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

# **Why does this need to be an extra language?** Why can't Python (or R) be made that fast, for instance?
-# See the official compact answer to this in the [Julia manual FAQ](https://docs.julialang.org/en/v1/manual/faq/#Why-don't-you-compile-Matlab/Python/R/%E2%80%A6-code-to-Julia?):
+# See the official compact answer to this in the
+# [Julia manual FAQ](https://docs.julialang.org/en/v1/manual/faq/#Why-don't-you-compile-Matlab/Python/R/%E2%80%A6-code-to-Julia?):

-# > The basic issue is that there is nothing special about Julia's compiler: we use a commonplace compiler (LLVM) with no
-# > "secret sauce" that other language developers don't know about.
+# > The basic issue is that there is nothing special about Julia's compiler:
+# > we use a commonplace compiler (LLVM) with no "secret sauce" that other language developers don't know about.
# > Julia's performance advantage derives almost entirely from its front-end: its language semantics allow a well-written Julia program
-# > to give more opportunities to the compiler to generate efficient code and memory layouts. If you tried to compile Matlab or
-# > Python code to Julia, our compiler would be limited by the semantics of Matlab or Python to producing code no better than that of
-# > existing compilers for those languages (and probably worse).
+# > to give more opportunities to the compiler to generate efficient code and memory layouts.
+# > If you tried to compile Matlab or Python code to Julia,
+# > our compiler would be limited by the semantics of Matlab or
+# > Python to producing code no better than that of existing compilers for those languages (and probably worse).
# >
# > Julia's advantage is that good performance is not limited to a small subset of "built-in" types and operations,
# > and one can write high-level type-generic code that works on arbitrary user-defined types while remaining fast and memory-efficient.
-# > Types in languages like Python simply don't provide enough information to the compiler for similar capabilities, so as soon as you used
-# > those languages as a Julia front-end you would be stuck.
+# > Types in languages like Python simply don't provide enough information to the compiler for similar capabilities,
+# > so as soon as you used those languages as a Julia front-end you would be stuck.

-# These are the "official" answers from the Julia community. Now let me share with you my opinion.
-# From my point-of-view Julia has **three main features that makes it a unique language to work with**, specially in scientific computing:
+# These are the "official" answers from the Julia community.
+# Now let me share with you my opinion.
+# From my point of view, Julia has **three main features that make it a unique language to work with**,
+# especially in scientific computing:

# * **Speed**
# * **Ease of Use**
@@ -41,23 +46,27 @@

# ## Speed

-# Yes, Julia is **fast**. **Very fast!** It was made for speed from the drawing board. It bypass any sort of intermediate representation and translate
-# code into machine native code using LLVM compiler. Comparing this with R, that uses either FORTRAN or C, or Python, that uses CPython;
+# Yes, Julia is **fast**. **Very fast!** It was made for speed from the drawing board.
+# It bypasses any sort of intermediate representation and translates code into native machine code using the LLVM compiler.
+# Compare this with R, which uses either FORTRAN or C, or Python, which uses CPython,
# and you'll clearly see that Julia has a major speed advantage over other languages that are common in data science and statistics.
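+# If you want to see this for yourself, Julia ships with introspection macros that show the LLVM IR
+# and the native code generated for a concrete function call.
+# Below is a minimal sketch (the function `mysum` and the input vector are just illustrative examples,
+# not part of the benchmarks discussed in this section):

+# ```julia
+# using InteractiveUtils # exports @code_llvm and @code_native (already loaded in the REPL)
+#
+# function mysum(x)
+#     s = zero(eltype(x))
+#     for v in x
+#         s += v
+#     end
+#     return s
+# end
+#
+# @code_llvm mysum([1.0, 2.0, 3.0])   # the LLVM IR Julia emits for this call
+# @code_native mysum([1.0, 2.0, 3.0]) # the native machine code it compiles down to
+# ```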
-# Julia exposes the machine code to LLVM's compiler which in turn can optimize code as it wishes, like a good compiler such as LLVM excels in.
+# Julia exposes its code to the LLVM compiler, which in turn can optimize it as it wishes,
+# the kind of optimization that a good compiler such as LLVM excels at.
# One notable example: NASA uses Julia to analyze the
# "[Largest Batch of Earth-Sized Planets Ever Found](https://exoplanets.nasa.gov/news/1669/seven-rocky-trappist-1-planets-may-be-made-of-similar-stuff/)".
-# Also, you can find [benchmarks](https://julialang.org/benchmarks/) for a range of common code patterns, such as function calls, string
-# parsing, sorting, numerical loops, random number generation, recursion, and array operations using Julia and also several
-# other languages such as C, Rust, Go, JavaScript, R, Python, Fortran and Java. The figure below was taken from
-# [Julia's website](https://julialang.org/benchmarks/). As you can see Julia is **indeed** fast:
+# Also, you can find [benchmarks](https://julialang.org/benchmarks/) for a range of common code patterns,
+# such as function calls, string parsing, sorting, numerical loops, random number generation, recursion,
+# and array operations using Julia and also several other languages such as C, Rust, Go, JavaScript, R, Python, Fortran and Java.
+# The figure below was taken from [Julia's website](https://julialang.org/benchmarks/).
+# As you can see, Julia is **indeed** fast:

# ![Common Benchmarks](/pages/images/benchmarks.svg)
#
# \center{*Common Benchmarks*} \\

-# Let me demonstrate how fast Julia is. Here is a simple "groupby" operation using random stuff to emulate common data analysis
+# Let me demonstrate how fast Julia is.
+# Here is a simple "groupby" operation using random data to emulate common data analysis
# "split-apply-combine" operations in three languages[^updatedversion] :
#
# * Julia: using [`DataFrames.jl`](https://dataframes.juliadata.org/stable/) - 0.4ms
@@ -121,25 +130,31 @@
# )
# ```

-# So clearly **Julia is the winner here**, being **4x faster than Python** and almost **10x faster than R**. Also note that `Pandas`
-# (along with `NumPy`) and `{dplyr}` are all written in C or C++. Additionally, I didn't let Julia cheat by allowing the compiler
-# optimize for `df` by passing a reference `$df`. So, I guess this is a fair comparison.
+# So clearly **Julia is the winner here**, being **4x faster than Python** and almost **10x faster than R**.
+# Also note that `Pandas` (along with `NumPy`) and `{dplyr}` are all written in C or C++.
+# Additionally, I didn't let Julia cheat by allowing the compiler to optimize for `df` by passing a reference `$df`.
+# So, I guess this is a fair comparison.

# ## Ease of Use

-# What is most striking is that Julia can be as fast as C (and faster than Java in some applications) while **having a very simple and
-# intelligible syntax**. This feature along with its speed is what Julia creators denote as **"the two language problem"** that Julia
-# addresses. The **"two language problem" is a very typical situation in scientific computing** where a researcher or computer scientist
-# devises an algorithm or a solution that he or she prototypes in an easy to code language (like Python) and, if it works, he or she
-# would code in a fast language that is not easy to code (C or FORTRAN). Thus, we have two languages involved in the process
-# of developing a new solution. One which is easy to prototype but is not suited for implementation (mostly due to being slow).
-# And another one which is not so easy to code (and, consequently, not easy to prototype) but suited for implementation
-# (mostly because it is fast). Julia comes to **eliminate such situations** by being the **same language** that you **prototype** (ease of use)
+# What is most striking is that Julia can be as fast as C (and faster than Java in some applications)
+# while **having a very simple and intelligible syntax**.
+# This feature, along with its speed, is how Julia addresses what its creators call **"the two language problem"**.
+# The **"two language problem" is a very typical situation in scientific computing**
+# where a researcher or computer scientist devises an algorithm or a solution that he or she
+# prototypes in an easy-to-code language (like Python) and, if it works,
+# he or she then codes it in a fast language that is not easy to code (C or FORTRAN).
+# Thus, we have two languages involved in the process of developing a new solution.
+# One which is easy to prototype but is not suited for implementation (mostly due to being slow).
+# And another one which is not so easy to code (and, consequently, not easy to prototype)
+# but suited for implementation (mostly because it is fast).
+# Julia **eliminates such situations** by being the **same language** that you use to **prototype** (ease of use)
# and **implement the solution** (speed).

-# Also, Julia lets you use **unicode characters as variables or parameters**. This means no more using `sigma` or `sigma_i`,
-# and instead just use `σ` or `σᵢ` as you would in mathematical notation. When you see code for an algorithm or for a
-# mathematical equation you see a **one-to-one relation to code and math**. This is a **powerful** feature.
+# Also, Julia lets you use **unicode characters as variables or parameters**.
+# This means no more using `sigma` or `sigma_i`; instead, you just use `σ` or `σᵢ` as you would in mathematical notation.
+# When you see code for an algorithm or for a mathematical equation, you see a **one-to-one relation between code and math**.
+# This is a **powerful** feature.

# I think that the "two language problem" and the one-to-one code and math relation are best described by
# one of the creators of Julia, Alan Edelman, in a **TED Talk** (see the video below):
@@ -152,13 +167,14 @@
# [**Metropolis algorithm**](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm)
# for a **bivariate normal distribution**. I would mostly prototype it in a dynamically-typed language such as R or Python.
# Then, deploy the algorithm using a fast but hard to code language such as C++. This is exactly what I'll do now.
-# The algorithm will be coded in **Julia**, **R**, **C++** and [**`Stan`**](https://mc-stan.org). There are two caveats.
-# First, I am coding the **original 1950s Metropolis version**, not the **1970s Metropolis-Hastings**, which implies
-# **symmetrical proposal distributions** just for the sake of the example.
+# The algorithm will be coded in **Julia**, **R**, **C++** and [**`Stan`**](https://mc-stan.org).
+# There are two caveats.
+# First, I am coding the **original 1950s Metropolis version**, not the **1970s Metropolis-Hastings**,
+# which implies **symmetrical proposal distributions** just for the sake of the example.
# Second, the proposals are based on a **uniform distribution** on the current proposal values ± a certain `width`.
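+# To make that second caveat concrete, here is a minimal sketch of a single proposal draw under that scheme
+# (the variable names and values below are just illustrative, not the ones used in the implementations that follow):

+# ```julia
+# using Distributions
+#
+# width = 2.75
+# x_current = 0.0
+# # draw the proposal uniformly within ±width of the current value
+# x_proposal = rand(Uniform(x_current - width, x_current + width))
+# ```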
#
-# Let's start with **Julia** which uses the [`Distributions.jl`](https://juliastats.org/Distributions.jl/stable/) package for
-# its probabilistic distributions along with `logpdf()` defined methods for all of the distributions.
+# Let's start with **Julia** which uses the [`Distributions.jl`](https://juliastats.org/Distributions.jl/stable/)
+# package for its probabilistic distributions along with `logpdf()` defined methods for all of the distributions.

# ```julia
# using Distributions
@@ -220,9 +236,12 @@
# }
# ```

-# Now **C++**. Here I am using the [`Eigen`](https://eigen.tuxfamily.org/) library. Note that, since C++ is a very powerful language
-# to be used as "close to the metal" as possible, I don't have any
-# convenient predefined multivariate normal to use. So I will have to create this from zero[^mvnimplem]. Ok, be **ready**!
+# Now **C++**.
+# Here I am using the [`Eigen`](https://eigen.tuxfamily.org/) library.
+# Note that, since C++ is a very powerful language to be used as "close to the metal" as possible,
+# I don't have any convenient predefined multivariate normal to use.
+# So I will have to create this from scratch[^mvnimplem].\
+# Ok, be **ready**!
# This is a mouthful:

# ```cpp
@@ -316,12 +335,13 @@
#
# $$ \text{PDF}(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = (2\pi)^{-{\frac{k}{2}}}\det({\boldsymbol{\Sigma}})^{-{\frac {1}{2}}}e^{-{\frac{1}{2}}(\mathbf{x}-{\boldsymbol{\mu}})^{T}{\boldsymbol{\Sigma }}^{-1}(\mathbf{x} -{\boldsymbol{\mu}})} \label{mvnpdf} , $$
#
-# where $\boldsymbol{\mu}$ is a vector of means, $k$ is the number of dimensions, $\boldsymbol{\Sigma}$ is a covariance matrix, $\det$ is the determinant and $\mathbf{x}$
-# is a vector of values that the PDF is evaluated for.
+# where $\boldsymbol{\mu}$ is a vector of means, $k$ is the number of dimensions, $\boldsymbol{\Sigma}$ is a covariance matrix,
+# $\det$ is the determinant and $\mathbf{x}$ is a vector of values that the PDF is evaluated for.

-# **SPOILER ALERT**: Julia will beat this C++ Eigen implementation by being almost 100x faster. So I will try to *help* C++ beat Julia (😂)
-# by making a bivariate normal class `BiNormal` in order to avoid the expensive operation of inverting a covariance matrix and computing
-# determinants in every logpdf proposal evaluation.
+# **SPOILER ALERT**: Julia will beat this C++ Eigen implementation by being almost 100x faster.
+# So I will try to *help* C++ beat Julia (😂) by making a bivariate normal class `BiNormal`
+# in order to avoid the expensive operation of inverting a covariance matrix and
+# computing determinants in every logpdf proposal evaluation.
# Also since we are not doing linear algebra computations I've removed Eigen and used C++ STL's ``:

# ```cpp
@@ -364,9 +384,10 @@
#
# no more determinants or matrix inversions. Easy-peasy for C++.

-# Now let's go to the last, but not least: [`Stan`](https://mc-stan.org) is a probabilistic language for specifying probabilistic
-# models (does the same as `Turing.jl` does) and comes also with a very fast C++-based MCMC sampler. `Stan` is a personal favorite
-# of mine and I have a [whole graduate course of Bayesian statistics using `Stan`](https://storopoli.github.io/Estatistica-Bayesiana/).
+# Now, last but not least: [`Stan`](https://mc-stan.org) is a probabilistic language for specifying probabilistic models (it does the same as `Turing.jl`)
+# and also comes with a very fast C++-based MCMC sampler.
+# `Stan` is a personal favorite of mine and I have a [whole graduate course on Bayesian statistics using `Stan`](https://storopoli.github.io/Bayesian-Statistics/).
# Here's the `Stan` implementation:

# ```stan
@@ -398,8 +419,10 @@
# }
# ```

-# Wow, that was lot... Not let's go to the results. I've benchmarked R and `Stan` code using `{bench}` and `{rstan}` packages, C++ using `catch2`, Julia using
-# `BenchmarkTools.jl`. For all benchmarks the parameters were: `S = 10_000` simulations, `width = 2.75` and `ρ = 0.8`.
+# Wow, that was a lot...
+# Now let's go to the results.
+# I've benchmarked the R and `Stan` code using the `{bench}` and `{rstan}` packages, the C++ code using `catch2`, and the Julia code using `BenchmarkTools.jl`.
+# For all benchmarks the parameters were: `S = 10_000` simulations, `width = 2.75` and `ρ = 0.8`.
# From fastest to slowest:
#
# * `Stan` - 3.6ms
@@ -410,18 +433,20 @@

# **Conclusion**: a naïve Julia implementation beats C++
# (while also beating a C++ math-helped faster implementation using bivariate normal PDFs) and gets very close to `Stan`,
-# a highly specialized probabilistic language that compiles and runs on C++ with lots of contributors, funding and development
-# time invested.
+# a highly specialized probabilistic language that compiles and runs on C++ with lots of contributors,
+# funding and development time invested.

-# Despite being *blazing* fast, Julia also **codes very easily**. You can write and read code without much effort.
+# Despite being *blazing* fast, Julia is also **very easy to code in**.\
+# You can write and read code without much effort.

# ## Multiple Dispatch

-# I think that this is the **real gamechanger of Julia language**: The ability to define **function behavior** across many combinations of argument
-# types via [**multiple dispatch**](https://en.wikipedia.org/wiki/Multiple_dispatch). **Multiple dispatch** is a feature
-# that allows a function or method to be **dynamically dispatched** based on the run-time (dynamic) type or,
-# in the more general case, some other attribute of more than one of its arguments. This is a **generalization of
-# single-dispatch polymorphism** where a function or method call is dynamically dispatched based on the derived type of
+# I think that this is the **real game changer of the Julia language**:
+# the ability to define **function behavior** across many combinations of argument types via
+# [**multiple dispatch**](https://en.wikipedia.org/wiki/Multiple_dispatch).
+# **Multiple dispatch** is a feature that allows a function or method to be **dynamically dispatched** based on the run-time (dynamic) type or,
+# in the more general case, some other attribute of more than one of its arguments.
+# This is a **generalization of single-dispatch polymorphism** where a function or method call is dynamically dispatched based on the derived type of
# the object on which the method has been called. Multiple dispatch routes the dynamic dispatch to the
# implementing function or method using the combined characteristics of one or more arguments.
@@ -436,16 +461,17 @@
#
# ### Example: Dogs and Cats
#
-# I will reproduce Karpinski's example. In the talk, Karpinski designs a structure of classes which are very common in
-# object-oriented programming (OOP).
In Julia, we don't have classes but we have **structures** (`struct`) that are meant to be -# "structured data": they define the kind of information that is embedded in the structure, -# that is a set of fields (aka "properties" or "attributes" in other languages), and then individual instances (or "objects") can -# be produced each with its own specific values for the fields defined by the structure. +# I will reproduce Karpinski's example. In the talk, Karpinski designs a structure of classes which are very common in object-oriented programming (OOP). +# In Julia, we don't have classes but we have **structures** (`struct`) that are meant to be "structured data": +# they define the kind of information that is embedded in the structure, +# that is a set of fields (aka "properties" or "attributes" in other languages), +# and then individual instances (or "objects") can be produced each with its own specific values for the fields defined by the structure. # # We create an abstract `type` called `Pet`. # Then, we proceed by creating two derived `struct` from `Pet` that has one field `name` (a `String`). -# These derived `struct` are `Dog` and `Cat`. We also define some methods for what happens in an "encounter" by defining -# a generic function `meets()` and several specific methods of `meets()` that will be multiple dispatched by Julia in runtime +# These derived `struct` are `Dog` and `Cat`. +# We also define some methods for what happens in an "encounter" by defining a generic function `meets()` +# and several specific methods of `meets()` that will be multiple dispatched by Julia in runtime # to define the action that one type `Pet` takes when it meets another `Pet`: abstract type Pet end @@ -475,8 +501,10 @@ encounter(rex, whiskers) encounter(spots, fido) encounter(whiskers, spots) -# It works as expected. Now let's translate this to modern C++ as literally as possible. -# Let's define a class `Pet` with a member variable `name` -- in C ++ we can do this. Then we define a base function `meets()`, +# It works as expected. +# Now let's translate this to modern C++ as literally as possible. +# Let's define a class `Pet` with a member variable `name` -- in C ++ we can do this. +# Then we define a base function `meets()`, # a function `encounter() `for two objects of the type `Pet`, and finally, define derived classes `Dog `and `Cat` # overload `meets()` for them: # diff --git a/index.md b/index.md index 215df670..055ff346 100644 --- a/index.md +++ b/index.md @@ -8,25 +8,40 @@ ![Bayesian for Everyone](images/bayes-meme.jpg) \center{*Bayesian for Everyone*} \\ -Welcome to the repository of tutorials on how to do **Bayesian Statistics** using [**Julia**](https://www.julialang.org) and [**Turing**](http://turing.ml/). Tutorials are available at [storopoli.github.io/Bayesian-Julia](https://storopoli.github.io/Bayesian-Julia). +Welcome to the repository of tutorials on how to do **Bayesian Statistics** using [**Julia**](https://www.julialang.org) and [**Turing**](http://turing.ml/). +Tutorials are available at [storopoli.github.io/Bayesian-Julia](https://storopoli.github.io/Bayesian-Julia). -**Bayesian statistics** is an approach to inferential statistics based on Bayes' theorem, where available knowledge about parameters in a statistical model is updated with the information in observed data. The background knowledge is expressed as a prior distribution and combined with observational data in the form of a likelihood function to determine the posterior distribution. 
The posterior can also be used for making predictions about future events. +**Bayesian statistics** is an approach to inferential statistics based on Bayes' theorem, +where available knowledge about parameters in a statistical model is updated with the information in observed data. +The background knowledge is expressed as a prior distribution and combined with observational data in the form of a likelihood function to determine the posterior distribution. +The posterior can also be used for making predictions about future events. -**Bayesian statistics** is a departure from classical inferential statistics that prohibits probability statements about parameters and is based on asymptotically sampling infinite samples from a theoretical population and finding parameter values that maximize the likelihood function. Mostly notorious is null-hypothesis significance testing (NHST) based on *p*-values. Bayesian statistics **incorporate uncertainty** (and prior knowledge) by allowing probability statements about parameters, and the process of parameter value inference is a direct result of the **Bayes' theorem**. +**Bayesian statistics** is a departure from classical inferential statistics that prohibits probability statements about parameters and +is based on asymptotically sampling infinite samples from a theoretical population and finding parameter values that maximize the likelihood function. +Mostly notorious is null-hypothesis significance testing (NHST) based on *p*-values. +Bayesian statistics **incorporate uncertainty** (and prior knowledge) by allowing probability statements about parameters, +and the process of parameter value inference is a direct result of the **Bayes' theorem**. \toc ## Julia -[**Julia**](https://www.julialang.org) is a fast dynamic-typed language that just-in-time (JIT) compiles into native code using LLVM. It ["runs like C but reads like Python"](https://www.nature.com/articles/d41586-019-02310-3), meaning that is *blazing* fast, easy to prototype and to read/write code. It is multi-paradigm, combining features of imperative, functional, and object-oriented programming. I won't cover Julia basics and any sort of data manipulation using Julia in the tutorials, instead please take a look into the following resources which covers most of the introduction to Julia and how to work with tabular data in Julia: +[**Julia**](https://www.julialang.org) is a fast dynamic-typed language that just-in-time (JIT) compiles into native code using LLVM. +It ["runs like C but reads like Python"](https://www.nature.com/articles/d41586-019-02310-3), +meaning that is *blazing* fast, easy to prototype and to read/write code. +It is multi-paradigm, combining features of imperative, functional, and object-oriented programming. +I won't cover Julia basics and any sort of data manipulation using Julia in the tutorials, +instead please take a look into the following resources which covers most of the introduction to Julia and how to work with tabular data in Julia: * [**Julia Documentation**](https://docs.julialang.org/): Julia documentation is a very friendly and well-written resource that explains the basic design and functionality of the language. +* [**Julia Data Science**](https://juliadatascience.io): open source and open access book on how to do Data Science using Julia. * [**Thinking Julia**](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html): introductory beginner-friendly book that explains the main concepts and functionality behind the Julia language. 
* [**Julia High Performance**](https://www.amazon.com/Julia-High-Performance-Avik-Sengupta/dp/178829811X): book by two of the creators of the Julia Language ([Avik Sengupta](https://www.linkedin.com/in/aviks) and [Alan Edelman](http://www-math.mit.edu/~edelman/)), it covers how to make Julia even faster with some principles and tricks of the trade. * [**An Introduction DataFrames**](https://github.com/bkamins/Julia-DataFrames-Tutorial): the package [`DataFrames.jl`](https://dataframes.juliadata.org/stable/) provides a set of tools for working with tabular data in Julia. Its design and functionality are similar to those of `pandas` (in Python) and `data.frame`, `data.table` and `dplyr` (in R), making it a great general purpose data science tool, especially for those coming to Julia from R or Python.This is a collection of notebooks that introduces `DataFrames.jl` made by one of its core contributors [Bogumił Kamiński](https://github.com/bkamins). ## Turing -[**Turing**](http://turing.ml/) is an ecosystem of Julia packages for Bayesian Inference using [probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming). Models specified using Turing are easy to read and write — models work the way you write them. Like everything in Julia, Turing is [fast](https://arxiv.org/abs/2002.02702). +[**Turing**](http://turing.ml/) is an ecosystem of Julia packages for Bayesian Inference using [probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming). +Models specified using Turing are easy to read and write — models work the way you write them. Like everything in Julia, Turing is [fast](https://arxiv.org/abs/2002.02702). ## Author @@ -34,7 +49,11 @@ José Eduardo Storopoli, PhD - [*Lattes* CV](http://lattes.cnpq.br/2281909649311 ## How to use the content? -The content is licensed under a very permissive Creative Commons license (CC BY-SA). You are mostly welcome to contribute with [issues](https://www.github.com/storopoli/Bayesian-Julia/issues) and [pull requests](https://github.com/storopoli/Bayesian-Julia/pulls). My hope is to have **more people into Bayesian statistics**. The content is aimed towards social scientists and PhD candidates in social sciences. I chose to provide an **intuitive approach** rather than focusing on rigorous mathematical formulations. I've made it to be how I would have liked to be introduced to Bayesian statistics. +The content is licensed under a very permissive Creative Commons license (CC BY-SA). +You are mostly welcome to contribute with [issues](https://www.github.com/storopoli/Bayesian-Julia/issues) and [pull requests](https://github.com/storopoli/Bayesian-Julia/pulls). +My hope is to have **more people into Bayesian statistics**. The content is aimed towards social scientists and PhD candidates in social sciences. +I chose to provide an **intuitive approach** rather than focusing on rigorous mathematical formulations. +I've made it to be how I would have liked to be introduced to Bayesian statistics. To configure a local environment: @@ -64,14 +83,41 @@ Pkg.instantiate() 11. [**Computational Tricks with Turing (Non-Centered Parametrization and QR Decomposition)**](/pages/11_Turing_tricks/) 12. [**Epidemiological Models using ODE Solvers in Turing**](/pages/12_epi_models/) +## Datasets + +- `kidiq` (linear regression): data from a survey of adult American women and their children + (a subsample from the National Longitudinal Survey of Youth). + Source: Gelman and Hill (2007). 
+- `wells` (logistic regression): a survey of 3200 residents in a small area of Bangladesh suffering
+  from arsenic contamination of groundwater.
+  Respondents with elevated arsenic levels in their wells had been encouraged to switch their water source
+  to a safe public or private well in the nearby area,
+  and the survey was conducted several years later to
+  learn which of the affected residents had switched wells.
+  Source: Gelman and Hill (2007).
+- `esoph` (ordinal regression): data from a case-control study of (o)esophageal cancer in Ille-et-Vilaine, France.
+  Source: Breslow and Day (1980).
+- `roaches` (Poisson regression): data on the efficacy of a pest management system at reducing the number of roaches in urban apartments.
+  Source: Gelman and Hill (2007).
+- `duncan` (robust regression): data on occupational prestige, which contains outliers.
+  Source: Duncan (1961).
+- `cheese` (hierarchical models): data from cheese ratings.
+  A group of 10 rural and 10 urban raters rated 4 different types of cheese (A, B, C, and D) in two samples.
+  Source: Boatwright, McCulloch and Rossi (1999).
+
+
## What about other Turing tutorials?

-Despite not being the only Turing tutorial that exists, this tutorial aims to introduce Bayesian inference along with how to use Julia and Turing. Here is a (not complete) list of other Turing tutorials:
+Despite not being the only Turing tutorial that exists, this tutorial aims to introduce Bayesian inference along with how to use Julia and Turing.
+Here is a (non-exhaustive) list of other Turing tutorials:

1. [**Official Turing Tutorials**](https://turing.ml/dev/tutorials/): tutorials on how to implement common models in Turing
2. [**Statistical Rethinking - Turing Models**](https://statisticalrethinkingjulia.github.io/TuringModels.jl/): Julia versions of the Bayesian models described in *Statistical Rethinking* Edition 1 (McElreath, 2016) and Edition 2 (McElreath, 2020)
3. [**Håkan Kjellerstrand Turing Tutorials**](http://hakank.org/julia/turing/): a collection of Julia Turing models

+I also have a free and open-source graduate course on Bayesian Statistics with Turing and Stan code.
+You can find it at [`storopoli/Bayesian-Statistics`](https://github.com/storopoli/Bayesian-Statistics).
+
## How to cite

To cite these tutorials, please use:

@@ -93,6 +139,8 @@ Or in BibTeX format $\LaTeX$:

## References

+The references are divided into **books**, **papers**, **software**, and **datasets**.
+
### Books

* Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). *Bayesian Data Analysis*. Chapman and Hall/CRC.

@@ -111,7 +159,7 @@ Or in BibTeX format $\LaTeX$:
* Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. *Nature*, *567*(7748), 305–307.
* van de Schoot, R., Kaplan, D., Denissen, J., Asendorpf, J. B., Neyer, F. J., & van Aken, M. A. G. (2014). A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research. *Child Development*, *85*(3), 842–860.

-### Software References
+### Software

* Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). Julia: A fresh approach to numerical computing. SIAM Review, 59(1), 65–98.
* Ge, H., Xu, K., & Ghahramani, Z. (2018). Turing: A Language for Flexible Probabilistic Inference. International Conference on Artificial Intelligence and Statistics, 1682–1690. http://proceedings.mlr.press/v84/ge18b.html

@@ -119,6 +167,14 @@ Or in BibTeX format $\LaTeX$:
* Xu, K., Ge, H., Tebbutt, W., Tarek, M., Trapp, M., & Ghahramani, Z.
(2020). AdvancedHMC.jl: A robust, modular and efficient implementation of advanced HMC algorithms. Symposium on Advances in Approximate Bayesian Inference, 1–10. http://proceedings.mlr.press/v118/xu20a.html
* Revels, J., Lubin, M., & Papamarkou, T. (2016). Forward-Mode Automatic Differentiation in Julia. ArXiv:1607.07892 [Cs]. http://arxiv.org/abs/1607.07892

+### Datasets
+
+- Boatwright, P., McCulloch, R., & Rossi, P. (1999). Account-level modeling for trade promotion: An application of a constrained parameter hierarchical model. _Journal of the American Statistical Association_, 94(448), 1063–1073.
+- Breslow, N. E. & Day, N. E. (1980). **Statistical Methods in Cancer Research. Volume 1: The Analysis of Case-Control Studies**. IARC Lyon / Oxford University Press.
+- Duncan, O. D. (1961). A socioeconomic index for all occupations. Class: Critical Concepts, 1, 388–426.
+- Gelman, A., & Hill, J. (2007). **Data analysis using regression and
+  multilevel/hierarchical models**. Cambridge University Press.
+
## License

This content is licensed under [Creative Commons Attribution-ShareAlike 4.0 International](http://creativecommons.org/licenses/by-sa/4.0/).

@@ -143,4 +199,4 @@ list = ["$(p.name) $(p.version)" for p in deps]
sort!(list)
println(join(list, '\n'))
```
-\output{packages}
+\output{packages}
\ No newline at end of file