
Modelling

Josh Fogg edited this page Jul 23, 2024 · 2 revisions

The mathematical models used for these problems might seem a little opaque, which in turn might make the module difficult to use. Here we've attempted a top-level explanation of what's going on; if anything is still unclear, please do open an issue.

Definitions

In a breeding programme, optimal contribution selection (OCS) asks how much the $i^{\text{th}}$ candidate in a cohort should contribute to the next generation ($w_i$), given their own expected breeding value ($\mu_i$) and relationship to the $j^{\text{th}}$ candidate ($\sigma_{ij}$). If we denote the vector of optimal contributions by $\mathbf{w}$ (with entries summing to one), the problem is formulated as

$$ \max_{\mathbf{w}\in W} \left(\boldsymbol{\mu}^{T}\mathbf{w} - \frac{\lambda}{2} \mathbf{w}^{T}\Sigma\mathbf{w} \right)\quad \text{ subject to }\quad \sum_{i\in\mathcal{S}} w_i = \frac{1}{2},~ \sum_{i\in\mathcal{D}} w_i = \frac{1}{2}, $$

where $\mathcal{S}$ contains candidates who are sires, $\mathcal{D}$ those who are dams, and $\lambda\geq0$ is a parameter which controls the balance between risk and return. The condition $\mathbf{w}\in W$ represents general linear constraints on $\mathbf{w}$ (e.g. bounds). Mathematically, this is a quadratic program: a quadratic objective under linear constraints.
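As a concrete sketch of this quadratic program, the snippet below solves a four-candidate toy instance. The numbers are invented, and SciPy's SLSQP is used as a stand-in for whatever solver the module itself calls; since SciPy minimises, the objective is negated:

```python
import numpy as np
from scipy.optimize import LinearConstraint, minimize

# Toy data (hypothetical): four candidates, two sires then two dams
mu = np.array([1.0, 0.8, 0.9, 0.7])        # expected breeding values
Sigma = np.array([[1.0, 0.5, 0.0, 0.0],    # relationship matrix
                  [0.5, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.5],
                  [0.0, 0.0, 0.5, 1.0]])
lam = 2.0                                  # risk/return balance

def neg_objective(w):
    # negated because SciPy minimises while the model maximises
    return -(mu @ w - 0.5 * lam * w @ Sigma @ w)

# sire contributions sum to 1/2, dam contributions sum to 1/2
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
cons = LinearConstraint(A, [0.5, 0.5], [0.5, 0.5])

res = minimize(neg_objective, np.full(4, 0.25), constraints=cons,
               bounds=[(0.0, 1.0)] * 4, method="SLSQP")
print(res.x)  # -> approximately [0.35 0.15 0.35 0.15] here
```

With $\lambda = 0$ the solver simply piles all contribution onto the highest-valued sire and dam; raising $\lambda$ spreads contributions out, which is the risk/return trade-off described above.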

Modelling Uncertainty

While there are models where $\Sigma$ is known exactly (e.g. Wright's Numerator Relationship Matrix), $\boldsymbol{\mu}$ is estimated and so carries some uncertainty. We can model it as a multivariate normal random variable, $\boldsymbol{\mu} \sim \text{N}(\boldsymbol{\bar\mu}, \Omega)$, where $\boldsymbol{\bar\mu}$ is the vector of mean values and $\Omega$ the matrix of covariances between them.

To ensure the problem remains nice to work with, we can bound $\boldsymbol{\mu}$ in a ball with the constraint

$$ {(\boldsymbol{\mu} - \boldsymbol{\bar{\mu}})}^T \Omega^{-1} {(\boldsymbol{\mu} - \boldsymbol{\bar{\mu}})} \leq \kappa^2 $$

of radius $\kappa$, centred on $\boldsymbol{\bar{\mu}}$, and with shape controlled by $\Omega$. Mathematically, this is what we call a quadratic uncertainty set. We now have $\kappa\geq0$ as a parameter that controls the tolerance for uncertainty in the values of $\boldsymbol{\mu}$.
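One consequence worth spelling out: for a fixed $\mathbf{w}$, the worst case of $\boldsymbol{\mu}^{T}\mathbf{w}$ over this set has the closed form $\boldsymbol{\bar\mu}^{T}\mathbf{w} - \kappa\sqrt{\mathbf{w}^{T}\Omega\mathbf{w}}$, and it is exactly this extra term that the robust problems in the next section build in. A quick numerical check of that formula on invented numbers (nothing from the module itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): means, covariance, contributions, tolerance
mu_bar = np.array([1.0, 0.8, 0.9])
Omega = np.array([[0.2, 0.05, 0.0],
                  [0.05, 0.1, 0.0],
                  [0.0, 0.0, 0.3]])
w = np.array([0.5, 0.25, 0.25])
kappa = 1.0

# Closed-form worst case of mu^T w over the uncertainty set
worst_closed = mu_bar @ w - kappa * np.sqrt(w @ Omega @ w)

# Sample points on the boundary of the set: mu = mu_bar + kappa * L u
# with L L^T = Omega and u on the unit sphere
L = np.linalg.cholesky(Omega)
u = rng.normal(size=(10_000, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)
worst_sampled = ((mu_bar + kappa * u @ L.T) @ w).min()

print(worst_closed, worst_sampled)  # sampled minimum approaches the closed form
```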

Robust Problems

When we put these two ideas together, we arrive at robust optimal contribution selection, which incorporates some "worst case scenario" planning. This involves solving the non-linear optimization problem

$$ \max_{\mathbf{w}\in W} \left(\boldsymbol{\mu}^{T}\mathbf{w} - \frac{\lambda}{2} \mathbf{w}^{T}\Sigma\mathbf{w} - \kappa\sqrt{ \mathbf{w}^T \Omega \mathbf{w} } \right)\quad \text{ subject to }\quad \sum_{i\in\mathcal{S}} w_i = \frac{1}{2},~ \sum_{i\in\mathcal{D}} w_i = \frac{1}{2}. $$
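Since the square-root term is smooth wherever $\mathbf{w}^T\Omega\mathbf{w} > 0$, small instances of this non-linear problem can also be attacked directly with a general-purpose solver, before reaching for the reformulations that follow. A sketch on toy numbers, again using SciPy's SLSQP as a hedged stand-in:

```python
import numpy as np
from scipy.optimize import LinearConstraint, minimize

# Toy data (hypothetical): four candidates, two sires then two dams
mu = np.array([1.0, 0.8, 0.9, 0.7])
Sigma = np.eye(4)        # relationship matrix
Omega = 0.1 * np.eye(4)  # covariance of the uncertainty in mu
lam, kappa = 2.0, 0.5

def neg_objective(w):
    # robust objective, negated because SciPy minimises
    return -(mu @ w - 0.5 * lam * w @ Sigma @ w
             - kappa * np.sqrt(w @ Omega @ w))

# sire contributions sum to 1/2, dam contributions sum to 1/2
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
cons = LinearConstraint(A, [0.5, 0.5], [0.5, 0.5])

res = minimize(neg_objective, np.full(4, 0.25), constraints=cons,
               bounds=[(0.0, 1.0)] * 4, method="SLSQP")
print(res.x)
```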

One way to solve this is conic programming, converting the problem to the form

$$ \max_{\mathbf{w}\in W,\ z\geq0} \left(\boldsymbol{\mu}^{T}\mathbf{w} - \frac{\lambda}{2} \mathbf{w}^{T}\Sigma\mathbf{w} - \kappa z \right)\quad \text{ subject to }\quad \sum_{i\in\mathcal{S}} w_i = \frac{1}{2},~ \sum_{i\in\mathcal{D}} w_i = \frac{1}{2},~ z \geq \sqrt{ \mathbf{w}^T \Omega \mathbf{w} }, $$

where the latter constraint is conic. Alternatively, we can use sequential quadratic programming, solving a series of quadratic problems of the form

$$ \max_{\mathbf{w}\in W,\ z\geq0} \left(\boldsymbol{\mu}^{T}\mathbf{w} - \frac{\lambda}{2} \mathbf{w}^{T}\Sigma\mathbf{w} - \kappa z \right)\quad \text{ subject to }\quad \sum_{i\in\mathcal{S}} w_i = \frac{1}{2},~ \sum_{i\in\mathcal{D}} w_i = \frac{1}{2},\text{ and } z \geq \frac{\mathbf{w}_{(k)}^T\Omega\mathbf{w}}{\sqrt{ \mathbf{w}_{(k)}^T \Omega \mathbf{w}_{(k)} }}\text{ for } k = 0, 1, \ldots, $$

where each added constraint is a linear cut tangent to $z = \sqrt{\mathbf{w}^T\Omega\mathbf{w}}$, $\mathbf{w}_{(k)}$ is the solution of the problem containing only the cuts $0, \ldots, k-1$, and $\mathbf{w}_{(0)}$ is the solution subject to $z \geq 0$ alone.
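A minimal sketch of this sequential scheme, on the same invented toy data and with SciPy's SLSQP for each subproblem (not the module's own implementation):

```python
import numpy as np
from scipy.optimize import LinearConstraint, minimize

# Toy data (hypothetical): four candidates, two sires then two dams
mu = np.array([1.0, 0.8, 0.9, 0.7])
Sigma = np.eye(4)
Omega = 0.1 * np.eye(4)
lam, kappa = 2.0, 0.5
n = len(mu)

def neg_objective(x):
    w, z = x[:n], x[n]
    return -(mu @ w - 0.5 * lam * w @ Sigma @ w - kappa * z)

# Variables are (w, z); the sex-sum constraints only touch w
A = np.array([[1.0, 1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 0.0]])
constraints = [LinearConstraint(A, [0.5, 0.5], [0.5, 0.5])]
bounds = [(0.0, 1.0)] * n + [(0.0, None)]  # w in [0, 1], z >= 0

x = np.array([0.25] * n + [0.0])
for k in range(50):
    res = minimize(neg_objective, x, constraints=constraints,
                   bounds=bounds, method="SLSQP")
    x = res.x
    w, z = x[:n], x[n]
    norm = np.sqrt(w @ Omega @ w)
    if norm - z < 1e-9:  # z has caught up with sqrt(w' Omega w)
        break
    # add the cut  z >= (w_(k)' Omega w) / sqrt(w_(k)' Omega w_(k))
    cut = np.concatenate([Omega @ w / norm, [-1.0]]).reshape(1, -1)
    constraints.append(LinearConstraint(cut, -np.inf, 0.0))

print(w, z)
```

Each pass adds one cut tangent to $z = \sqrt{\mathbf{w}^T\Omega\mathbf{w}}$ at the previous solution, so the relaxations tighten from outside until $z$ matches the square root.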