Foggalong · Foggalong · Apr 19, 2024 · Apr 17, 2024 · Apr 18, 2024 · Apr 19, 2024
diff --git a/.vscode/settings.json b/.vscode/settings.json
@@ -8,6 +8,7 @@
         "Gregor's",
         "gurobi",
         "gurobipy",
+        "Heckel",
         "hendo",
         "linalg",
         "loadtxt",

diff --git a/Scratch/Timing Test/README.md b/Scratch/Timing Test/README.md
@@ -0,0 +1,3 @@
+# Timing Test
+
+Quick timing test of a non-trivial problem to show that it's the same in Python vs Jupyter.
diff --git a/Scratch/Timing Test/timing-test.ipynb b/Scratch/Timing Test/timing-test.ipynb
@@ -0,0 +1,361 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Robust Optimization for Genetic Selection\n",
+    "\n",
+    "This notebook provides a didactic for using Python to solve the multi-objective optimization problem which arises in the context of robust genetic selection. It has been tested with Python 3.10 specifically and there are some standard packages which this depends on, imported below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np                  # defines matrix structures\n",
+    "from qpsolvers import solve_qp      # used for quadratic optimization\n",
+    "import gurobipy as gp               # Gurobi optimization interface (1)\n",
+    "from gurobipy import GRB            # Gurobi optimization interface (2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Utility functions and output settings used in this notebook are defined in the two cells below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# want to round rather than truncate when printing\n",
+    "np.set_printoptions(threshold=np.inf)\n",
+    "\n",
+    "# only show numpy output to five decimal places\n",
+    "np.set_printoptions(formatter={'float_kind':\"{:.5f}\".format})"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Real (Simulated) Data\n",
+    "\n",
+    "Now we've looked at how these problems can be approached with Gurobi, we can try an example with realistic simulated data. Here we have data for a cohort of 50 candidates split across three data files, each containing space-separated values.\n",
+    "\n",
+    "1. In `A50.txt` the matrix $\\Sigma$ is described, with columns for $i$, $j$, and $a_{ij}$, where only the upper triangle is described since the matrix is symmetric.\n",
+    "2. In `EBV50.txt` the vector $\\bar{\\mu}$ is described, with only one column containing the posterior mean over 1000 samples.\n",
+    "3. In `S50.txt` the matrix $\\Omega$ is described, with columns for $i$, $j$, and $\\sigma_{ij}$, where only the upper triangle is described since the matrix is symmetric.\n",
+    "\n",
+    "Note that this particular problem does not contain a separate file for sex data; odd indexed candidates are sires and even indexed candidates are dams. For now, we also don't have a $l$ or $u$ bounding the possible weights.\n",
+    "\n",
+    "\n",
+    "The first step then is clearly going to be loading the matrices in particular from file. We create the following function to do so.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def load_problem(A_filename, E_filename, S_filename, dimension=False):\n",
+    "    \"\"\"\n",
+    "    Used to load genetic selection problems into NumPy. It takes three\n",
+    "    string inputs for filenames where Sigma, Mu, and Omega are stored,\n",
+    "    as well as an optional integer input for problem dimension if this\n",
+    "    is known. If it's know know, it's worked out based on E_filename.\n",
+    "\n",
+    "    As output, it returns (A, E, S, n), where A and S are n-by-n NumPy\n",
+    "    arrays, E is a length n NumPy array, and n is an integer.\n",
+    "    \"\"\"\n",
+    "\n",
+    "    def load_symmetric_matrix(filename, dimension):\n",
+    "        \"\"\"\n",
+    "        Since NumPy doesn't have a stock way to load matrices\n",
+    "        stored in coordinate format format, this adds one.\n",
+    "        \"\"\"\n",
+    "\n",
+    "        matrix = np.zeros([dimension, dimension])\n",
+    "\n",
+    "        with open(filename, 'r') as file:\n",
+    "            for line in file:\n",
+    "                i, j, entry = line.split(\" \")\n",
+    "                # data files indexed from 1, not 0\n",
+    "                matrix[int(i)-1, int(j)-1] = entry\n",
+    "                matrix[int(j)-1, int(i)-1] = entry\n",
+    "\n",
+    "        return matrix\n",
+    "\n",
+    "\n",
+    "    # if dimension wasn't supplied, need to find that\n",
+    "    if not dimension:\n",
+    "        # get dimension from EBV, since it's the smallest file\n",
+    "        with open(E_filename, 'r') as file:\n",
+    "            dimension = sum(1 for _ in file)\n",
+    "\n",
+    "    # EBV isn't in coordinate format so can be loaded directly\n",
+    "    E = np.loadtxt(E_filename)  \n",
+    "    # A and S are stored by coordinates so need special loader\n",
+    "    A = load_symmetric_matrix(A_filename, dimension)\n",
+    "    S = load_symmetric_matrix(S_filename, dimension)\n",
+    "\n",
+    "    return A, E, S, dimension"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For the given example problem we can now solve it with Gurobi using the exact same methods as before, for both the standard genetic selection problem and the robust version of the problem. The following cell does both alongside each other to accentuate the differences."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Set parameter Username\n",
+      "Academic license - for non-commercial use only - expires 2025-02-26\n",
+      "Set parameter TimeLimit to value 300\n",
+      "Set parameter MIPGap to value 0.009\n",
+      "Set parameter MIPGap to value 0.009\n",
+      "Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - \"Ubuntu 22.04.4 LTS\")\n",
+      "\n",
+      "CPU model: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz, instruction set [SSE2|AVX|AVX2]\n",
+      "Thread count: 4 physical cores, 8 logical processors, using up to 8 threads\n",
+      "\n",
+      "Optimize a model with 4 rows, 50 columns and 100 nonzeros\n",
+      "Model fingerprint: 0x534ff85e\n",
+      "Model has 1275 quadratic objective terms\n",
+      "Coefficient statistics:\n",
+      "  Matrix range     [1e+00, 1e+00]\n",
+      "  Objective range  [7e-01, 1e+00]\n",
+      "  QObjective range [5e-02, 1e+00]\n",
+      "  Bounds range     [0e+00, 0e+00]\n",
+      "  RHS range        [5e-01, 5e-01]\n",
+      "Presolve removed 2 rows and 0 columns\n",
+      "Presolve time: 0.01s\n",
+      "Presolved: 2 rows, 50 columns, 50 nonzeros\n",
+      "Presolved model has 1275 quadratic objective terms\n",
+      "Ordering time: 0.00s\n",
+      "\n",
+      "Barrier statistics:\n",
+      " Free vars  : 49\n",
+      " AA' NZ     : 1.274e+03\n",
+      " Factor NZ  : 1.326e+03\n",
+      " Factor Ops : 4.553e+04 (less than 1 second per iteration)\n",
+      " Threads    : 1\n",
+      "\n",
+      "                  Objective                Residual\n",
+      "Iter       Primal          Dual         Primal    Dual     Compl     Time\n",
+      "   0   3.76397773e+05 -4.22619805e+05  2.50e+04 1.12e+00  9.99e+05     0s\n",
+      "   1  -5.01117677e+01 -9.99230833e+02  2.68e+01 1.25e-04  1.09e+03     0s\n",
+      "   2  -8.81632395e-01 -9.63667643e+02  2.68e-05 1.25e-10  1.93e+01     0s\n",
+      "   3  -8.81606636e-01 -2.03034027e+00  5.19e-09 2.41e-14  2.30e-02     0s\n",
+      "   4  -9.04332054e-01 -1.04180295e+00  4.73e-10 2.19e-15  2.75e-03     0s\n",
+      "   5  -9.59656693e-01 -1.01318293e+00  1.67e-15 8.88e-16  1.07e-03     0s\n",
+      "   6  -9.73793052e-01 -9.76849441e-01  2.22e-16 1.33e-15  6.11e-05     0s\n",
+      "   7  -9.76163966e-01 -9.76239391e-01  1.39e-15 6.66e-16  1.51e-06     0s\n",
+      "   8  -9.76213879e-01 -9.76217890e-01  9.91e-15 5.55e-16  8.02e-08     0s\n",
+      "   9  -9.76215229e-01 -9.76215592e-01  1.65e-13 5.55e-16  7.25e-09     0s\n",
+      "  10  -9.76215438e-01 -9.76215482e-01  7.03e-13 6.66e-16  8.86e-10     0s\n",
+      "  11  -9.76215473e-01 -9.76215476e-01  4.24e-13 6.66e-16  4.91e-11     0s\n",
+      "\n",
+      "Barrier solved model in 11 iterations and 0.03 seconds (0.00 work units)\n",
+      "Optimal objective -9.76215473e-01\n",
+      "\n",
+      "Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - \"Ubuntu 22.04.4 LTS\")\n",
+      "\n",
+      "CPU model: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz, instruction set [SSE2|AVX|AVX2]\n",
+      "Thread count: 4 physical cores, 8 logical processors, using up to 8 threads\n",
+      "\n",
+      "Optimize a model with 5 rows, 51 columns and 101 nonzeros\n",
+      "Model fingerprint: 0x8c48581e\n",
+      "Model has 1275 quadratic objective terms\n",
+      "Model has 50 quadratic constraints\n",
+      "Coefficient statistics:\n",
+      "  Matrix range     [1e+00, 1e+00]\n",
+      "  QMatrix range    [7e-06, 1e+00]\n",
+      "  Objective range  [7e-01, 1e+00]\n",
+      "  QObjective range [5e-02, 1e+00]\n",
+      "  Bounds range     [0e+00, 0e+00]\n",
+      "  RHS range        [5e-01, 5e-01]\n",
+      "Presolve removed 3 rows and 0 columns\n",
+      "\n",
+      "Continuous model is non-convex -- solving as a MIP\n",
+      "\n",
+      "Presolve removed 3 rows and 0 columns\n",
+      "Presolve time: 0.01s\n",
+      "Presolved: 2552 rows, 1327 columns, 7600 nonzeros\n",
+      "Presolved model has 1275 quadratic objective terms\n",
+      "Presolved model has 1 quadratic constraint(s)\n",
+      "Presolved model has 1275 bilinear constraint(s)\n",
+      "Variable types: 1327 continuous, 0 integer (0 binary)\n",
+      "Found heuristic solution: objective -0.9763584\n",
+      "\n",
+      "Root relaxation: objective -1.285097e+00, 31 iterations, 0.00 seconds (0.00 work units)\n",
+      "\n",
+      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
+      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
+      "\n",
+      "     0     0   -1.28510    0   36   -0.97636   -1.28510  31.6%     -    0s\n",
+      "     0     0   -1.13066    0   69   -0.97636   -1.13066  15.8%     -    0s\n",
+      "     0     0   -1.05344    0   51   -0.97636   -1.05344  7.89%     -    0s\n",
+      "     0     0   -1.01483    0   51   -0.97636   -1.01483  3.94%     -    0s\n",
+      "     0     0   -0.99552    0   37   -0.97636   -0.99552  1.96%     -    0s\n",
+      "     0     0   -0.99119    0 1175   -0.97636   -0.99119  1.52%     -    0s\n",
+      "     0     0   -0.99098    0 1118   -0.97636   -0.99098  1.50%     -    0s\n",
+      "     0     0   -0.98587    0   47   -0.97636   -0.98587  0.97%     -    1s\n",
+      "     0     2   -0.98587    0   47   -0.97636   -0.98587  0.97%     -    1s\n",
+      "   184   201   -0.98413   26  341   -0.97636   -0.98567  0.95%   156    5s\n",
+      "   434   418   -0.97843   58  533   -0.97636   -0.98567  0.95%   318   10s\n",
+      "H  609   566                      -0.9768443   -0.98567  0.90%   296   12s\n",
+      "H  699   541                      -0.9768479   -0.98567  0.90%   282   14s\n",
+      "   704   545   -0.97802   28    0   -0.97685   -0.98567  0.90%   280   15s\n",
+      "\n",
+      "Cutting planes:\n",
+      "  RLT: 39\n",
+      "  PSD: 143\n",
+      "\n",
+      "Explored 704 nodes (208237 simplex iterations) in 15.37 seconds (17.22 work units)\n",
+      "Thread count was 8 (of 8 available processors)\n",
+      "\n",
+      "Solution count 1: -0.976848 \n",
+      "\n",
+      "Optimal solution found (tolerance 9.00e-03)\n",
+      "Best objective -9.768443464472e-01, best bound -9.854709478274e-01, gap 0.8831%\n",
+      "\n",
+      "SIRE WEIGHTS\t\t\t DAM WEIGHTS\n",
+      "--------------------\t\t --------------------\n",
+      " i   w_std    w_rbs\t\t  i   w_std    w_rbs\n",
+      "00  0.00000  0.00002             01  0.08319  0.08339\n",
+      "02  0.11649  0.11508             03  0.00000  0.00001\n",
+      "04  0.00000  0.00002             05  0.00000  0.00002\n",
+      "06  0.00000  0.00001             07  0.06817  0.06830\n",
+      "08  0.00000  0.00001             09  0.03762  0.03701\n",
+      "10  0.00000  0.00002             11  0.00000  0.00002\n",
+      "12  0.00000  0.00001             13  0.00355  0.00564\n",
+      "14  0.11542  0.11450             15  0.06966  0.06913\n",
+      "16  0.00000  0.00006             17  0.00000  0.00001\n",
+      "18  0.00000  0.00002             19  0.00000  0.00003\n",
+      "20  0.00000  0.00002             21  0.00001  0.00062\n",
+      "22  0.00000  0.00017             23  0.00000  0.00004\n",
+      "24  0.03537  0.03493             25  0.00000  0.00001\n",
+      "26  0.00000  0.00001             27  0.00000  0.00002\n",
+      "28  0.00000  0.00002             29  0.00000  0.00002\n",
+      "30  0.00000  0.00001             31  0.03580  0.03612\n",
+      "32  0.08447  0.08517             33  0.00000  0.00003\n",
+      "34  0.00000  0.00001             35  0.02849  0.02747\n",
+      "36  0.13489  0.13487             37  0.07446  0.07264\n",
+      "38  0.00000  0.00001             39  0.00000  0.00001\n",
+      "40  0.00000  0.00003             41  0.07312  0.07388\n",
+      "42  0.00000  0.00004             43  0.00000  0.00003\n",
+      "44  0.01336  0.01485             45  0.00000  0.00002\n",
+      "46  0.00000  0.00009             47  0.02593  0.02555\n",
+      "48  0.00000  0.00003             49  0.00000  0.00000\n"
+     ]
+    }
+   ],
+   "source": [
+    "sigma, mubar, omega, n = load_problem(\n",
+    "    \"../Example/A50.txt\",\n",
+    "    \"../Example/EBV50.txt\",\n",
+    "    \"../Example/S50.txt\",\n",
+    "    50)\n",
+    "\n",
+    "lam = 0.5\n",
+    "kappa = 2\n",
+    "\n",
+    "# define the M so that column i is [1;0] if i is a sire (so even) and [0;1] otherwise \n",
+    "M = np.zeros((2, n))\n",
+    "M[0, range(0,50,2)] = 1\n",
+    "M[1, range(1,50,2)] = 1\n",
+    "# define the right hand side of the constraint Mx = m\n",
+    "m = np.array([[0.5], [0.5]])\n",
+    "\n",
+    "# create models for standard and robust genetic selection\n",
+    "model_std = gp.Model(\"n50standard\")\n",
+    "model_rbs = gp.Model(\"n50robust\")\n",
+    "\n",
+    "# initialise w for both models, z for robust model\n",
+    "w_std = model_std.addMVar(shape=n, vtype=GRB.CONTINUOUS, name=\"w\") \n",
+    "w_rbs = model_rbs.addMVar(shape=n, vtype=GRB.CONTINUOUS, name=\"w\")\n",
+    "z_rbs = model_rbs.addVar(name=\"z\")\n",
+    "\n",
+    "# define the objective functions for both models\n",
+    "model_std.setObjective(\n",
+    "    0.5*w_std@(sigma@w_std) - lam*w_std.transpose()@mubar,\n",
+    "GRB.MINIMIZE)\n",
+    "\n",
+    "model_rbs.setObjective(\n",
+    "    # Gurobi does offer a way to set one objective in terms of another, i.e.\n",
+    "    # we could use `model_std.getObjective() - lam*kappa*z_rbs` to define this\n",
+    "    # robust objective, but it results in a significant slowdown in code.\n",
+    "    0.5*w_rbs@(sigma@w_rbs) - lam*w_rbs.transpose()@mubar - lam*kappa*z_rbs,\n",
+    "GRB.MINIMIZE)\n",
+    "\n",
+    "# add sum-to-half constraints to both models\n",
+    "model_std.addConstr(M @ w_std == m, name=\"sum-to-half\")\n",
+    "model_rbs.addConstr(M @ w_std == m, name=\"sum-to-half\")\n",
+    "\n",
+    "# add quadratic uncertainty constraint to the robust model\n",
+    "model_rbs.addConstr(z_rbs**2 <= np.inner(w_rbs, omega@w_rbs), name=\"uncertainty\")\n",
+    "model_rbs.addConstr(z_rbs >= 0, name=\"z positive\")\n",
+    "\n",
+    "# since working with non-trivial size, set a time limit\n",
+    "time_limit = 60*5  # 5 minutes\n",
+    "model_std.setParam(GRB.Param.TimeLimit, time_limit)\n",
+    "model_std.setParam(GRB.Param.TimeLimit, time_limit)\n",
+    "\n",
+    "# for the same reason, also set a duality gap tolerance\n",
+    "duality_gap = 0.009\n",
+    "model_std.setParam('MIPGap', duality_gap)\n",
+    "model_rbs.setParam('MIPGap', duality_gap)\n",
+    "\n",
+    "# solve both problems with Gurobi\n",
+    "model_std.optimize()\n",
+    "model_rbs.optimize()\n",
+    "\n",
+    "# HACK code which prints the results for comparison in a nice format\n",
+    "print(\"\\nSIRE WEIGHTS\\t\\t\\t DAM WEIGHTS\")\n",
+    "print(\"-\"*20 + \"\\t\\t \" + \"-\"*20)\n",
+    "print(\" i   w_std    w_rbs\\t\\t  i   w_std    w_rbs\")\n",
+    "for candidate in range(25):\n",
+    "    print(f\"{candidate*2:02d}  {w_std.X[candidate*2]:.5f}  {w_rbs.X[candidate*2]:.5f} \\\n",
+    "            {candidate*2+1:02d}  {w_std.X[candidate*2+1]:.5f}  {w_rbs.X[candidate*2+1]:.5f}\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
Original file line number	Diff line number	Diff line change
		@@ -0,0 +1,3 @@
		# Timing Test

		Quick timing test of a non-trivial problem to show that it's the same in Python vs Jupyter.