diff --git a/_posts/2020-09-01-bounds.md b/_posts/2020-09-01-bounds.md
index 920bf8f4a..5bca0f00c 100644
--- a/_posts/2020-09-01-bounds.md
+++ b/_posts/2020-09-01-bounds.md
@@ -82,7 +82,7 @@ This result seems to make sense qualitatively.
## Applying variational inference
Let's now try to apply variational inference to this problem.
-We will use Gen's support for [black box variational inference](https://www.gen.dev/dev/ref/vi/#Black-box-variational-inference-1), which is a class of algorithms introduced by Rajesh Ranganath et al. in a [2013 paper](https://arxiv.org/abs/1401.0118) that requires only the ability to evaluate the unnormalized log probability density of the model.
+We will use Gen's support for [black box variational inference](https://www.gen.dev/docs/dev/ref/vi/#Black-box-variational-inference-1), which is a class of algorithms introduced by Rajesh Ranganath et al. in a [2014 paper](https://arxiv.org/abs/1401.0118) that requires only the ability to evaluate the unnormalized log probability density of the model.
Gen lets you apply black box variational inference using variational approximating families that are themselves defined as probabilistic programs.
The first step is to write the probabilistic program that defines the variational approximating family that we will optimize to match the posterior as closely as possible.
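+
+As a rough sketch (the function name, the address `:x`, and the single-Gaussian family here are illustrative, not the post's actual program), a variational program in Gen is an ordinary `@gen` function whose trainable parameters will be optimized:
+
+```julia
+using Gen
+
+@gen function approx()
+    @param mu::Float64        # variational mean (trainable)
+    @param log_std::Float64   # log standard deviation, kept unconstrained
+    # a single traced choice that mirrors an address in the model
+    @trace(normal(mu, exp(log_std)), :x)
+end
+```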
diff --git a/ecosystem.md b/ecosystem.md
index affe87b9e..2a8572aa7 100644
--- a/ecosystem.md
+++ b/ecosystem.md
@@ -54,4 +54,4 @@ Probability distributions and involutive MCMC kernels on orientations and rotati
Wrapper for employing the [Redner](https://github.com/BachiLi/redner) differentiable renderer in Gen generative models.
#### [GenTraceKernelDSL](https://github.com/probcomp/GenTraceKernelDSL.jl)
-An alternative interface to defining [trace translators](https://www.gen.dev/dev/ref/trace_translators/).
+An alternative interface to defining [trace translators](https://www.gen.dev/docs/dev/ref/trace_translators/).
diff --git a/tutorials/bottom-up-intro/tutorial.md b/tutorials/bottom-up-intro/tutorial.md
index 6a421d304..b3784c7e3 100644
--- a/tutorials/bottom-up-intro/tutorial.md
+++ b/tutorials/bottom-up-intro/tutorial.md
@@ -249,7 +249,7 @@ plot(map(p -> query(p, 14), [0.1, 0.5, 0.9])...)
## 2. Tracing the values of random choices in generative functions
-The ability to *trace* the values of random choices in a probabilistic program (i.e. record the value of each choice in a trace data structure) is one of the basic features of Gen's built-in modeling language. To write a function in this language we use the `@gen` macro provided by Gen. Note that the built-in modeling language is just one way of defining a [generative function](https://probcomp.github.io/Gen/dev/ref/distributions/).
+The ability to *trace* the values of random choices in a probabilistic program (i.e. record the value of each choice in a trace data structure) is one of the basic features of Gen's built-in modeling language. To write a function in this language we use the `@gen` macro provided by Gen. Note that the built-in modeling language is just one way of defining a [generative function](https://www.gen.dev/docs/stable/ref/distributions/).
Below, we write a `@gen function` version of the function `f` defined above, this time using Gen's tracing instead of our own:
@@ -282,7 +282,7 @@ gen_f(0.3)
-To run a `@gen` function and get a trace of the execution, we use the [`simulate`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.simulate) method:
+To run a `@gen` function and get a trace of the execution, we use the [`simulate`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.simulate) method:
```julia
@@ -441,7 +441,7 @@ end
expected: 0.5760000000000001, actual: 0.5754
-We can also get the log probability that an individual trace would be generated by the function ($\log p(t; x)$), using the [`get_score`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.get_score) method.
+We can also get the log probability that an individual trace would be generated by the function ($\log p(t; x)$), using the [`get_score`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.get_score) method.
Let's generate a trace below, get its log probability with `get_score`
@@ -476,13 +476,13 @@ So far, we have run generative functions in two ways:
gen_f(0.3)
```
-2. Using the [`simulate`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.simulate) method:
+2. Using the [`simulate`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.simulate) method:
```julia
trace = simulate(gen_f, (0.3,))
```
-We can also generate a trace that satisfies a set of constraints on the valus of random choices using the [`generate`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.generate) method. Suppose that we want a trace where `:a` is always `true` and `:c` is always `false`. We first construct a choice map containing these constraints:
+We can also generate a trace that satisfies a set of constraints on the values of random choices using the [`generate`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.generate) method. Suppose that we want a trace where `:a` is always `true` and `:c` is always `false`. We first construct a choice map containing these constraints:
```julia
@@ -617,7 +617,7 @@ function my_importance_sampler(gen_fn, args, constraints, num_traces)
end;
```
-A more efficient and numerically robust implementation of importance resampling is provided in Gen's inference library (see [`importance_resampling`](https://probcomp.github.io/Gen/dev/ref/inference/#Gen.importance_resampling)).
+A more efficient and numerically robust implementation of importance resampling is provided in Gen's inference library (see [`importance_resampling`](https://www.gen.dev/docs/stable/ref/inference/#Gen.importance_resampling)).
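+
+For example (a sketch; the particle count of 100 is arbitrary), a single call replaces the manual sampler above:
+
+```julia
+observations = Gen.choicemap((:c, false))
+(trace, log_ml_est) = Gen.importance_resampling(gen_f, (0.3,), observations, 100)
+```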
-Suppose our goal is to sample `:a` and `:b` from the conditional distribution given that we have observed `:c` is `false`. That is, we want to sample choice map $t$ with probability $0$ if $t(c) = \mbox{false}$ and otherwise probability:
+Suppose our goal is to sample `:a` and `:b` from the conditional distribution given that we have observed `:c` is `false`. That is, we want to sample choice map $t$ with probability $0$ if $t(c) = \mbox{true}$ and otherwise probability:
@@ -720,7 +720,7 @@ get_choices(trace)
-Now, we use the [`update`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.update) method, to change the value of `:c` from `true` to `false`:
+Now, we use the [`update`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.update) method to change the value of `:c` from `true` to `false`:
```julia
diff --git a/tutorials/data-driven-proposals/tutorial.md b/tutorials/data-driven-proposals/tutorial.md
index a9768b9c7..bd6db6222 100644
--- a/tutorials/data-driven-proposals/tutorial.md
+++ b/tutorials/data-driven-proposals/tutorial.md
@@ -513,7 +513,7 @@ Run inference using Gen's built-in importance resampling implementation. Use
To see how to use the built-in importance resampling function, run
```?Gen.importance_resampling``` or check out the
-[documentation](https://www.gen.dev/dev/ref/importance/#Gen.importance_resampling).
+[documentation](https://www.gen.dev/docs/dev/ref/importance/#Gen.importance_resampling).
We have provided some starter code.
@@ -709,7 +709,7 @@ visualize_inference(measurements, scene_2doors, start, computation_amt=50, sampl
## 2. Writing a data-driven proposal as a generative function
The inference algorithm above used a variant of
-[`Gen.importance_resampling`](https://probcomp.github.io/Gen/dev/ref/importance/#Gen.importance_resampling)
+[`Gen.importance_resampling`](https://www.gen.dev/docs/stable/ref/importance/#Gen.importance_resampling)
that does not take a custom proposal distribution. It uses the default
proposal distribution associated with the generative model. For generative
functions defined using the built-in modeling DSL, the default proposal
@@ -780,7 +780,7 @@ num_y_bins = 5;
```
We will propose the x-coordinate of the destination from a
-[piecewise_uniform](https://www.gen.dev/dev/ref/distributions/#Gen.piecewise_uniform)
+[piecewise_uniform](https://www.gen.dev/docs/dev/ref/distributions/#Gen.piecewise_uniform)
distribution, where we set higher probability for certain bins based on the
heuristic described above and use a uniform continuous distribution for the
coordinate within a bin. The `compute_bin_probs` function below computes the
@@ -861,7 +861,7 @@ end;
```
We can propose values of random choices from the proposal function using
-[`Gen.propose`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.propose).
+[`Gen.propose`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.propose).
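+
+A sketch of such a call, assuming the proposal defined above is named `custom_dest_proposal` and takes arguments `(measurements, scene)`:
+
+```julia
+(choices, weight, retval) = Gen.propose(custom_dest_proposal, (measurements, scene))
+```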
This method returns the choices, as well as some other information, which we
won't need for our purposes. For now, you can think of `Gen.propose` as
similar to `Gen.generate` except that it does not produce a full execution
@@ -937,7 +937,7 @@ Alone, this is just a heuristic. But we can use it as a proposal for importance
We now use our data-driven proposal within an inference algorithm. There is a
second variant of
-[`Gen.importance_resampling`](https://probcomp.github.io/Gen/dev/ref/importance/#Gen.importance_resampling)
+[`Gen.importance_resampling`](https://www.gen.dev/docs/stable/ref/importance/#Gen.importance_resampling)
that accepts a generative function representing a custom proposal. This
proposal generative function makes traced random choices at the addresses of
a subset of the unobserved random choices made by the generative model. In
@@ -956,7 +956,7 @@ proposal accepts arguments `(measurements, scene)`.
This time, use only 5 importance samples (`amt_computation`). You can run
`?Gen.importance_resampling` or check out the
-[documentation](https://probcomp.github.io/Gen/dev/ref/inference/#Importance-Sampling-1)
+[documentation](https://www.gen.dev/docs/stable/ref/inference/#Importance-Sampling-1)
-to understand how to supply the arguments to invoke this second version of of
+to understand how to supply the arguments to invoke this second version of
importance resampling.
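+
+For reference, the proposal-accepting variant takes the proposal and its argument tuple just before the particle count; in sketch form (`model`, `model_args`, and `observations` below are placeholders, not starter code):
+
+```julia
+# placeholders: substitute the generative model, its arguments, and the
+# observation choice map from your solution
+(trace, lml_est) = Gen.importance_resampling(
+    model, model_args, observations,
+    custom_dest_proposal, (measurements, scene), 5)
+```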
@@ -1075,7 +1075,7 @@ end;
-Our choice of the `score_high` value of 5. was somewhat arbitrary. To use
-more informed value, we can make `score_high` into a [*trainable
-parameter*](https://www.gen.dev/dev/ref/gfi/#Trainable-parameters-1)
+Our choice of the `score_high` value of 5.0 was somewhat arbitrary. To use a
+more informed value, we can make `score_high` into a [*trainable
+parameter*](https://www.gen.dev/docs/dev/ref/gfi/#Trainable-parameters-1)
of the generative function. Below, we write a new version of the proposal
function that makes `score_high` trainable. However, the optimization
algorithms we will use for training work best with *unconstrained* parameters
@@ -1184,7 +1184,7 @@ end;
-Next, we choose type of optimization algorithm we will use for training. Gen
+Next, we choose the type of optimization algorithm we will use for training. Gen
supports a set of gradient-based optimization algorithms (see [Optimizing
Trainable
-Parameters](https://www.gen.dev/dev/ref/parameter_optimization/#Optimizing-Trainable-Parameters-1)).
+Parameters](https://www.gen.dev/docs/dev/ref/parameter_optimization/#Optimizing-Trainable-Parameters-1)).
Here we will use gradient descent with a fixed step size of 0.001.
@@ -1193,7 +1193,7 @@ update = Gen.ParamUpdate(Gen.FixedStepGradientDescent(0.001), custom_dest_propos
```
Finally, we use the
-[`Gen.train!`](https://probcomp.github.io/Gen/dev/ref/inference/#Gen.train!)
+[`Gen.train!`](https://www.gen.dev/docs/stable/ref/inference/#Gen.train!)
method to actually do the training.
For each epoch, `Gen.train!` makes `epoch_size` calls to the data-generator
diff --git a/tutorials/intro-to-modeling/tutorial.md b/tutorials/intro-to-modeling/tutorial.md
index 361ab37de..ef3b18212 100644
--- a/tutorials/intro-to-modeling/tutorial.md
+++ b/tutorials/intro-to-modeling/tutorial.md
@@ -148,7 +148,7 @@ Probabilistic models are represented in Gen as *generative functions*.
Generative functions are used to represent a variety of different types of
probabilistic computations including generative models, inference models,
custom proposal distributions, and variational approximations (see the [Gen
-documentation](https://probcomp.github.io/Gen/dev/ref/gfi/) or the
+documentation](https://www.gen.dev/docs/stable/ref/gfi/) or the
[paper](https://dl.acm.org/doi/10.1145/3314221.3314642)). In this
tutorial,
we focus on implementing _generative models_. A generative model represents
@@ -157,7 +157,7 @@ our data and our problem domain.
The simplest way to construct a generative function is by using the [built-in
-modeling DSL](https://probcomp.github.io/Gen/dev/ref/modeling/). Generative
+modeling DSL](https://www.gen.dev/docs/stable/ref/modeling/). Generative
functions written in the built-in modeling DSL are based on Julia function
definition syntax, but are prefixed with the `@gen` macro:
@@ -312,7 +312,7 @@ times, but each time, the random choice it makes is given a distinct address.
Although the random choices are not included in the return value, they *are*
included in the *execution trace* of the generative function. We can run the
generative function and obtain its trace using the [`
-simulate`](https://probcomp.github.io/Gen/dev/ref/gfi/#Gen.simulate) method
+simulate`](https://www.gen.dev/docs/stable/ref/gfi/#Gen.simulate) method
from the Gen API:
@@ -523,10 +523,10 @@ amplitude, and then generates y-coordinates from a given vector of
x-coordinates by adding noise to the value of the wave at each x-coordinate.
Use a `gamma(1, 1)` prior distribution for the period, and a `gamma(1, 1)`
prior distribution on the amplitude (see
-[`Gen.gamma`](https://probcomp.github.io/Gen/dev/ref/distributions/#Gen.gamma)).
+[`Gen.gamma`](https://www.gen.dev/docs/stable/ref/distributions/#Gen.gamma)).
-Sampling from a Gamma distribution will ensure to give us postive real values.
+Sampling from a Gamma distribution ensures that we get positive real values.
Use a uniform distribution between 0 and $2\pi$ for the phase (see
-[`Gen.uniform`](https://probcomp.github.io/Gen/dev/ref/distributions/#Gen.uniform)).
+[`Gen.uniform`](https://www.gen.dev/docs/stable/ref/distributions/#Gen.uniform)).
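+
+For reference, sampling these three priors inside a `@gen` function looks like this (a sketch only; the sine model itself is the exercise):
+
+```julia
+@gen function wave_priors()
+    period = @trace(gamma(1, 1), :period)
+    amplitude = @trace(gamma(1, 1), :amplitude)
+    phase = @trace(uniform(0, 2 * pi), :phase)
+    return (period, amplitude, phase)
+end
+```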
The sine wave should implement:
@@ -771,7 +771,7 @@ Write an inference program that generates traces of `sine_model` that explain th
-What if we'd want to predict `ys` given `xs`?
+What if we wanted to predict `ys` given `xs`?
Using the API method
-[`generate`](https://www.gen.dev/dev/ref/gfi/#Gen.generate), we
+[`generate`](https://www.gen.dev/docs/dev/ref/gfi/#Gen.generate), we
can generate a trace of a generative function in which the values of certain
random choices are constrained to given values. The constraints are a choice
map that maps the addresses of the constrained random choices to their
diff --git a/tutorials/iterative-inference/tutorial.md b/tutorials/iterative-inference/tutorial.md
index bd888f835..a451f235e 100644
--- a/tutorials/iterative-inference/tutorial.md
+++ b/tutorials/iterative-inference/tutorial.md
@@ -1097,7 +1097,7 @@ For example, let's say we wanted to take a trace and assign each point's
`is_outlier` score to the most likely possibility. We can do this by
iterating over both possible traces, scoring them, and choosing the one with
the higher score. We can do this using Gen's
-[`update`](https://www.gen.dev/dev/ref/gfi/#Update-1) function,
+[`update`](https://www.gen.dev/docs/dev/ref/gfi/#Update-1) function,
which allows us to manually update a trace to satisfy some constraints:
diff --git a/tutorials/particle-filtering/tutorial.md b/tutorials/particle-filtering/tutorial.md
index add743511..548fcea8a 100644
--- a/tutorials/particle-filtering/tutorial.md
+++ b/tutorials/particle-filtering/tutorial.md
@@ -45,12 +45,12 @@ We show how Gen's support for SMC integrates with its support for MCMC, enabling
"bearings only tracking" problem described in [4].
This notebook will also introduce you to the
-[`Unfold`](https://www.gen.dev/dev/ref/combinators/#Unfold-combinator-1) combinator,
+[`Unfold`](https://www.gen.dev/docs/dev/ref/combinators/#Unfold-combinator-1) combinator,
which can be used to improve performance of SMC.
`Unfold` is just one example of the levers that Gen provides for
improving performance; once you understand it, you can check
Gen's documentation to see how similar principles apply to the
-[`Map`](https://www.gen.dev/dev/ref/combinators/#Map-combinator-1) combinator
+[`Map`](https://www.gen.dev/docs/dev/ref/combinators/#Map-combinator-1) combinator
and to the static DSL. (These features are also covered in the previous tutorial,
[Scaling with Combinators and the Static Modeling Language](../scaling-with-combinators-new/tutorial).)
@@ -238,7 +238,7 @@ sample of `num_samples` traces from the weighted collection that the particle
filter produces.
Gen provides methods for initializing and updating the state of a particle
-filter, documented in [Particle Filtering](https://www.gen.dev/dev/ref/pf/).
+filter, documented in [Particle Filtering](https://www.gen.dev/docs/dev/ref/pf/).
- `Gen.initialize_particle_filter`
@@ -300,7 +300,7 @@ and then we introduce one additional bearing measurement by calling
- The new arguments to the generative function for this step. In our case,
this is the number of measurements beyond the first measurement.
-- The [argdiff](https://www.gen.dev/dev/ref/gfi/#Argdiffs-1)
+- The [argdiff](https://www.gen.dev/docs/dev/ref/gfi/#Argdiffs-1)
value, which provides detailed information about the change to the
arguments between the previous step and this step. We will revisit this
value later. For now, we indicate that we do not know how the `T::Int`
@@ -645,7 +645,7 @@ body whenever performing a trace update. This allows the built-in modeling
DSL to be very flexible and to have a simple implementation, at the cost of
performance. There are several ways of improving performance after one has a
prototype written in the built-in modeling DSL. One of these is [Generative
-Function Combinators](https://www.gen.dev/dev/ref/combinators/), which make
+Function Combinators](https://www.gen.dev/docs/dev/ref/combinators/), which make
the flow of information through the generative process more explicit to Gen,
and enable asymptotically more efficient inference programs.
@@ -676,7 +676,7 @@ Julia `for` loop in our model.
This `for` loop has a very specific pattern of information flow—there is a
sequence of states (represented by `x`, `y`, `vx`, and `vy`), and each state is
generated from the previous state. This is exactly the pattern that the
-[Unfold](https://www.gen.dev/dev/ref/combinators/#Unfold-combinator-1)
+[Unfold](https://www.gen.dev/docs/dev/ref/combinators/#Unfold-combinator-1)
generative function combinator is designed to handle.
Below, we re-express the Julia `for` loop over the state sequence using the
diff --git a/tutorials/regenerate/tutorial.md b/tutorials/regenerate/tutorial.md
index 079455854..708e0e996 100644
--- a/tutorials/regenerate/tutorial.md
+++ b/tutorials/regenerate/tutorial.md
@@ -5,7 +5,7 @@ layout: splash
# Reasoning About Regenerate
-Gen provides a primitive called [`regenerate`](https://www.gen.dev/dev/ref/gfi/#Regenerate-1) that allows users to ask for certain random choices in a trace to be re-generated from scratch. `regenerate` is the basis of one variant of the [`metropolis_hastings`](https://www.gen.dev/dev/ref/mcmc/#Gen.metropolis_hastings) operator in Gen's inference library.
+Gen provides a primitive called [`regenerate`](https://www.gen.dev/docs/dev/ref/gfi/#Regenerate-1) that allows users to ask for certain random choices in a trace to be re-generated from scratch. `regenerate` is the basis of one variant of the [`metropolis_hastings`](https://www.gen.dev/docs/dev/ref/mcmc/#Gen.metropolis_hastings) operator in Gen's inference library.
This notebook aims to help you understand the computation that `regenerate` is performing.
@@ -61,7 +61,7 @@ using Gen: regenerate, select, NoChange
(trace, weight, retdiff) = regenerate(trace, (0.3,), (NoChange(),), select(:a));
```
-Note that unlike [`update`](https://www.gen.dev/dev/ref/gfi/#Gen.update), we do not provide the new values for the random choices that we want to change. Instead, we simply pass in a [selection](https://www.gen.dev/dev/ref/selections/#Selections-1) indicating the addresses that we want to propose new values for.
+Note that unlike [`update`](https://www.gen.dev/docs/dev/ref/gfi/#Gen.update), we do not provide the new values for the random choices that we want to change. Instead, we simply pass in a [selection](https://www.gen.dev/docs/dev/ref/selections/#Selections-1) indicating the addresses that we want to propose new values for.
Note that `select(:a)` is equivalent to:
```julia
@@ -91,7 +91,7 @@ get_choices(trace)
-Re-run the regenerate command until you get a trace where `a` is `false`. Note that the address `b` doesn't appear in the resulting trace. Then, run the command again until you get a trace where `a` is `true`. Note that now there is a value for `b`. This value of `b` was sampled along with the new value for `a`---`regenerate` will regenerate new values for the selected adddresses, but also any new addresses that may be introduced as a consequence of stochastic control flow.
+Re-run the regenerate command until you get a trace where `a` is `false`. Note that the address `b` doesn't appear in the resulting trace. Then, run the command again until you get a trace where `a` is `true`. Note that now there is a value for `b`. This value of `b` was sampled along with the new value for `a`---`regenerate` will regenerate new values for the selected addresses, as well as any new addresses that may be introduced as a consequence of stochastic control flow.
-What distribution is `regenerate` sampling the selected values from? It turns out that `regenerate` is using the [*internal proposal distribution family*](https://www.gen.dev/dev/ref/gfi/#Internal-proposal-distribution-family-1) $q(t; x, u)$, just like like `generate`. Recall that for `@gen` functions, the internal proposal distribution is based on *ancestral sampling*. But whereas `generate` was given the expicit choice map of constraints ($u$) as an argument, `regenerate` constructs $u$ by starting with the previous trace $t$ and then removing any selected addresses. In other words, `regenerate` is like `generate`, but where the constraints are the choices made in the previous trace less the selected choices.
+What distribution is `regenerate` sampling the selected values from? It turns out that `regenerate` is using the [*internal proposal distribution family*](https://www.gen.dev/docs/dev/ref/gfi/#Internal-proposal-distribution-family-1) $q(t; x, u)$, just like `generate`. Recall that for `@gen` functions, the internal proposal distribution is based on *ancestral sampling*. But whereas `generate` was given the explicit choice map of constraints ($u$) as an argument, `regenerate` constructs $u$ by starting with the previous trace $t$ and then removing any selected addresses. In other words, `regenerate` is like `generate`, but where the constraints are the choices made in the previous trace less the selected choices.
We can make this concrete. Let us start with a deterministic trace again:
diff --git a/tutorials/rj/tutorial.md b/tutorials/rj/tutorial.md
index fe4b48d96..6e4f9975f 100644
--- a/tutorials/rj/tutorial.md
+++ b/tutorials/rj/tutorial.md
@@ -54,7 +54,7 @@ Given a dataset of `xs`, our model will randomly divide the range `(xmin, xmax)`
-It does this by sampling a number of segments (`:segment_count`), then sampling a vector of _proportions_ from a Dirichlet distribution (`:fractions`). The vector is guaranteed to sum to 1: if there are, say, three segments, this vector might be `[0.3, 0.5, 0.2]`. The length of each segment is the fraction of the interval assigned to it, times the length of the entire interval, e.g. `0.2 * (xmax - xmin)`. For each segmment, we generate a `y` value from a normal distribution. Finally, we sample the `y` values near the piecewise constant function described by the segments.
+It does this by sampling a number of segments (`:segment_count`), then sampling a vector of _proportions_ from a Dirichlet distribution (`:fractions`). The vector is guaranteed to sum to 1: if there are, say, three segments, this vector might be `[0.3, 0.5, 0.2]`. The length of each segment is the fraction of the interval assigned to it, times the length of the entire interval, e.g. `0.2 * (xmax - xmin)`. For each segment, we generate a `y` value from a normal distribution. Finally, we sample the `y` values near the piecewise constant function described by the segments.
### Using `@dist` to define new distributions for convenience
-To sample the number of segments, we need a distribution with support only on the positive integers. We create one using the [Distributions DSL](https://www.gen.dev/dev/ref/distributions/#dist_dsl-1):
+To sample the number of segments, we need a distribution with support only on the positive integers. We create one using the [Distributions DSL](https://www.gen.dev/docs/dev/ref/distributions/#dist_dsl-1):
```julia
diff --git a/tutorials/scaling-with-combinators-new/tutorial.md b/tutorials/scaling-with-combinators-new/tutorial.md
index ac0ac66c1..76fb25972 100644
--- a/tutorials/scaling-with-combinators-new/tutorial.md
+++ b/tutorials/scaling-with-combinators-new/tutorial.md
@@ -5,11 +5,11 @@ layout: splash
# Scaling with Combinators and the Static Modeling Language
-Up until this point, we have been using [Gen's generic built-in modeling language](https://www.gen.dev/dev/ref/modeling/), which is a very flexible modeling language that is shallowly embedded in Julia. However, better performance and scaling characteristics can be obtained using specialized modeling languages or modeling constructs. This notebook introduces two built-in features of Gen:
+Up until this point, we have been using [Gen's generic built-in modeling language](https://www.gen.dev/docs/dev/ref/modeling/), which is a very flexible modeling language that is shallowly embedded in Julia. However, better performance and scaling characteristics can be obtained using specialized modeling languages or modeling constructs. This notebook introduces two built-in features of Gen:
-- A more specialized [Static Modeling Language](https://www.gen.dev/dev/ref/modeling/#Static-Modeling-Language-1) which is built-in to Gen.
+- A more specialized [Static Modeling Language](https://www.gen.dev/docs/dev/ref/modeling/#Static-Modeling-Language-1) which is built into Gen.
-- A class of modeling constructs called [Generative function combinators](https://www.gen.dev/dev/ref/combinators/).
+- A class of modeling constructs called [Generative function combinators](https://www.gen.dev/docs/dev/ref/combinators/).
-These features provide both constant-factor speedups, as well as improvements in asymptotic orders of growth, over the generic built-in modeling language.
+These features provide both constant-factor speedups and improvements in asymptotic orders of growth over the generic built-in modeling language.
@@ -163,7 +163,7 @@ The reason for the quadratic scaling is that the running time of the call to `mh
-However, it should be possible for the algorithm to scale linearly in the number of data points. Briefly, deciding whether to update a given `is_outlier` variable can be done without referencing the other data points. This is because each `is_outiler` variable is conditionally independent of the outlier variables and y-coordinates of the other data points, conditioned on the parameters.
+However, it should be possible for the algorithm to scale linearly in the number of data points. Briefly, deciding whether to update a given `is_outlier` variable can be done without referencing the other data points. This is because each `is_outlier` variable is conditionally independent of the outlier variables and y-coordinates of the other data points, conditioned on the parameters.
-We can make this conditional independence structure explicit using the [Map generative function combinator](https://probcomp.github.io/Gen/dev/ref/combinators/#Map-combinator-1). Combinators like map encapsulate common modeling patterns (e.g., a loop in which each iteration is making independent choices), and when you use them, Gen can take advantage of the restrictions they enforce to implement performance optimizations automatically during inference. The `Map` combinator, like the `map` function in a functional programming language, helps to execute the same generative code repeatedly.
+We can make this conditional independence structure explicit using the [Map generative function combinator](https://www.gen.dev/docs/stable/ref/combinators/#Map-combinator-1). Combinators like map encapsulate common modeling patterns (e.g., a loop in which each iteration is making independent choices), and when you use them, Gen can take advantage of the restrictions they enforce to implement performance optimizations automatically during inference. The `Map` combinator, like the `map` function in a functional programming language, helps to execute the same generative code repeatedly.
## 2. Introducing the map combinator
@@ -181,7 +181,7 @@ To use the map combinator to express the conditional independences in our model,
end;
```
-We then apply the [`Map`](https://probcomp.github.io/Gen/dev/ref/combinators/#Map-combinator-1), which is a Julia function, to this generative function, to obtain a new generative function:
+We then apply the [`Map`](https://www.gen.dev/docs/stable/ref/combinators/#Map-combinator-1), which is a Julia function, to this generative function, to obtain a new generative function:
```julia
@@ -357,7 +357,7 @@ Even though the function `generate_all_points` knows that each of the calls to `
-## 3.Combining the map combinator with the static modeling language
+## 3. Combining the map combinator with the static modeling language
-In order to provide `generate_all_points` with the knowledge that its arguments do not change during an update to the `is_outlier` variable, we need to write the top-level model generative function that calls `generate_all_points` in the [Static Modeling Language](https://probcomp.github.io/Gen/dev/ref/modeling/#Static-Modeling-Language-1), which is a restricted variant of the built-in modeling language that uses static analysis of the computation graph to generate specialized trace data structures and specialized implementations of trace operations. We indicate that a function is to be interpreted using the static language using the `static` annotation:
+In order to provide `generate_all_points` with the knowledge that its arguments do not change during an update to the `is_outlier` variable, we need to write the top-level model generative function that calls `generate_all_points` in the [Static Modeling Language](https://www.gen.dev/docs/stable/ref/modeling/#Static-Modeling-Language-1), which is a restricted variant of the built-in modeling language that uses static analysis of the computation graph to generate specialized trace data structures and specialized implementations of trace operations. We indicate that a function is to be interpreted using the static language using the `static` annotation:
```julia
diff --git a/tutorials/scaling-with-combinators/Scaling with Combinators and the Static Modeling Language.md b/tutorials/scaling-with-combinators/Scaling with Combinators and the Static Modeling Language.md
index 7dbcea726..ccd70c5b9 100644
--- a/tutorials/scaling-with-combinators/Scaling with Combinators and the Static Modeling Language.md
+++ b/tutorials/scaling-with-combinators/Scaling with Combinators and the Static Modeling Language.md
@@ -5,11 +5,11 @@ layout: splash
# Scaling with Combinators and the Static Modeling Language
-Up until this point, we have been using [Gen's generic built-in modeling language](https://probcomp.github.io/Gen/dev/ref/modeling/#Built-in-Modeling-Language-1), which is a very flexible modeling language that is shallowly embedded in Julia. However, better performance and scaling characteristics can be obtained using specialized modeling languages or modeling constructs. This notebook introduces two built-in features of Gen:
+Up until this point, we have been using [Gen's generic built-in modeling language](https://www.gen.dev/docs/stable/ref/modeling/#Built-in-Modeling-Language-1), which is a very flexible modeling language that is shallowly embedded in Julia. However, better performance and scaling characteristics can be obtained using specialized modeling languages or modeling constructs. This notebook introduces two built-in features of Gen:
-- A more specialized [Static Modeling Language](https://probcomp.github.io/Gen/dev/ref/modeling/#Static-Modeling-Language-1) which is built-in to Gen.
+- A more specialized [Static Modeling Language](https://www.gen.dev/docs/stable/ref/modeling/#Static-Modeling-Language-1) which is built into Gen.
-- A class of modeling constructs called [Generative function combinators](https://probcomp.github.io/Gen/dev/ref/combinators/).
+- A class of modeling constructs called [Generative function combinators](https://www.gen.dev/docs/stable/ref/combinators/).
-These features provide both constant-factor speedups, as well as improvements in asymptotic orders of growth, over the generic built-in modeling language.
+These features provide both constant-factor speedups and improvements in asymptotic orders of growth over the generic built-in modeling language.
@@ -158,7 +158,7 @@ The reason for the quadratic scaling is that the running time of the call to `mh
-However, it should be possible for the algorithm to scale linearly in the number of data points. Briefly, deciding whether to update a given `is_outlier` variable can be done without referencing the other data points. This is because each `is_outiler` variable is conditionally independent of the outlier variables and y-coordinates of the other data points, conditioned on the parameters.
+However, it should be possible for the algorithm to scale linearly in the number of data points. Briefly, deciding whether to update a given `is_outlier` variable can be done without referencing the other data points. This is because each `is_outlier` variable is conditionally independent of the outlier variables and y-coordinates of the other data points, conditioned on the parameters.
-We can make this conditional independence structure explicit using the [Map generative function combinator](https://probcomp.github.io/Gen/dev/ref/combinators/#Map-combinator-1). Combinators like map encapsulate common modeling patterns (e.g., a loop in which each iteration is making independent choices), and when you use them, Gen can take advantage of the restrictions they enforce to implement performance optimizations automatically during inference. The `Map` combinator, like the `map` function in a functional programming language, helps to execute the same generative code repeatedly.
+We can make this conditional independence structure explicit using the [Map generative function combinator](https://www.gen.dev/docs/stable/ref/combinators/#Map-combinator-1). Combinators like map encapsulate common modeling patterns (e.g., a loop in which each iteration is making independent choices), and when you use them, Gen can take advantage of the restrictions they enforce to implement performance optimizations automatically during inference. The `Map` combinator, like the `map` function in a functional programming language, helps to execute the same generative code repeatedly.
## 2. Introducing the map combinator
@@ -176,7 +176,7 @@ To use the map combinator to express the conditional independences in our model,
end;
```
-We then apply the [`Map`](https://probcomp.github.io/Gen/dev/ref/combinators/#Map-combinator-1), which is a Julia function, to this generative function, to obtain a new generative function:
+We then apply the [`Map`](https://www.gen.dev/docs/stable/ref/combinators/#Map-combinator-1), which is a Julia function, to this generative function, to obtain a new generative function:
```julia
@@ -342,7 +342,7 @@ Even though the function `generate_all_points` knows that each of the calls to `
-## 3.Combining the map combinator with the static modeling language
+## 3. Combining the map combinator with the static modeling language
-In order to provide `generate_all_points` with the knowledge that its arguments do not change during an update to the `is_outlier` variable, we need to write the top-level model generative function that calls `generate_all_points` in the [Static Modeling Language](https://probcomp.github.io/Gen/dev/ref/modeling/#Static-Modeling-Language-1), which is a restricted variant of the built-in modeling language that uses static analysis of the computation graph to generate specialized trace data structures and specialized implementations of trace operations. We indicate that a function is to be interpreted using the static language using the `static` annotation:
+In order to provide `generate_all_points` with the knowledge that its arguments do not change during an update to the `is_outlier` variable, we need to write the top-level model generative function that calls `generate_all_points` in the [Static Modeling Language](https://www.gen.dev/docs/stable/ref/modeling/#Static-Modeling-Language-1), which is a restricted variant of the built-in modeling language that uses static analysis of the computation graph to generate specialized trace data structures and specialized implementations of trace operations. We indicate that a function is to be interpreted using the static language using the `static` annotation:
```julia
diff --git a/tutorials/tf-mnist/tutorial.md b/tutorials/tf-mnist/tutorial.md
index 5c662f6e1..6660c7d5b 100644
--- a/tutorials/tf-mnist/tutorial.md
+++ b/tutorials/tf-mnist/tutorial.md
@@ -5,7 +5,7 @@ layout: splash
# Modeling with TensorFlow code
-So far, we have seen generative functions that are defined only using the built-in modeling language, which uses the `@gen` keyword. However, Gen can also be extended with other modeling languages, as long as they produce generative functions that implement the [Generative Function Interface](https://probcomp.github.io/Gen/dev/ref/gfi/). The [GenTF](https://github.com/probcomp/GenTF) Julia package provides one such modeling language which allow generative functions to be constructed from user-defined TensorFlow computation graphs. Generative functions written in the built-in language can invoke generative functions defined using the GenTF language.
+So far, we have seen generative functions that are defined only using the built-in modeling language, which uses the `@gen` keyword. However, Gen can also be extended with other modeling languages, as long as they produce generative functions that implement the [Generative Function Interface](https://www.gen.dev/docs/stable/ref/gfi/). The [GenTF](https://github.com/probcomp/GenTF) Julia package provides one such modeling language, which allows generative functions to be constructed from user-defined TensorFlow computation graphs. Generative functions written in the built-in language can invoke generative functions defined using the GenTF language.
This notebook shows how to write a generative function in the GenTF language, how to invoke a GenTF generative function from a `@gen` function, and how to perform basic supervised training of a generative function. Specifically, we will train a softmax regression conditional inference model to generate the label of an MNIST digit given the pixels. Later tutorials will show how to use deep learning and TensorFlow to accelerate inference in generative models, using ideas from "amortized inference".
@@ -208,7 +208,7 @@ training_data_loader = MNISTTrainDataLoader();
[y/n]
-Now, we train the trainable parameters of the `tf_softmax_model` generative function (`W` and `b`) on the MNIST traing data. Note that these parameters are stored as the state of the TensorFlow variables. We will use the [`Gen.train!`](https://probcomp.github.io/Gen/dev/ref/inference/#Gen.train!) method, which supports supervised training of generative functions using stochastic gradient opimization methods. In particular, this method takes the generative function to be trained (`digit_model`), a Julia function of no arguments that generates a batch of training data, and the update to apply to the trainable parameters.
+Now, we train the trainable parameters of the `tf_softmax_model` generative function (`W` and `b`) on the MNIST training data. Note that these parameters are stored as the state of the TensorFlow variables. We will use the [`Gen.train!`](https://www.gen.dev/docs/stable/ref/inference/#Gen.train!) method, which supports supervised training of generative functions using stochastic gradient optimization methods. In particular, this method takes the generative function to be trained (`digit_model`), a Julia function of no arguments that generates a batch of training data, and the update to apply to the trainable parameters.
-The `ParamUpdate` constructor takes the type of update to perform (in this case a gradient descent update with step size 0.00001), and a specification of which trainable parameters should be updated). Here, we request that the `W` and `b` trainable parameters of the `tf_softmax_model` generative function should be trained.
+The `ParamUpdate` constructor takes the type of update to perform (in this case a gradient descent update with step size 0.00001) and a specification of which trainable parameters should be updated. Here, we request that the `W` and `b` trainable parameters of the `tf_softmax_model` generative function should be trained.