Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more complete derivations #1

Merged
merged 5 commits into from
Apr 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .vscode/settings.json
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
"Gregor's",
"gurobi",
"gurobipy",
"Heckel",
"hendo",
"linalg",
"loadtxt",
Expand Down
3 changes: 3 additions & 0 deletions Scratch/Timing Test/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# Timing Test

Quick timing test of a non-trivial problem to show that it's the same in Python vs Jupyter.
361 changes: 361 additions & 0 deletions Scratch/Timing Test/timing-test.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,361 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Robust Optimization for Genetic Selection\n",
"\n",
"This notebook provides a didactic for using Python to solve the multi-objective optimization problem which arises in the context of robust genetic selection. It has been tested with Python 3.10 specifically and there are some standard packages which this depends on, imported below."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np # defines matrix structures\n",
"from qpsolvers import solve_qp # used for quadratic optimization\n",
"import gurobipy as gp # Gurobi optimization interface (1)\n",
"from gurobipy import GRB # Gurobi optimization interface (2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Utility functions and output settings used in this notebook are defined in the two cells below."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# want to round rather than truncate when printing\n",
"np.set_printoptions(threshold=np.inf)\n",
"\n",
"# only show numpy output to five decimal places\n",
"np.set_printoptions(formatter={'float_kind':\"{:.5f}\".format})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Real (Simulated) Data\n",
"\n",
"Now we've looked at how these problems can be approached with Gurobi, we can try an example with realistic simulated data. Here we have data for a cohort of 50 candidates split across three data files, each containing space-separated values.\n",
"\n",
"1. In `A50.txt` the matrix $\\Sigma$ is described, with columns for $i$, $j$, and $a_{ij}$, where only the upper triangle is described since the matrix is symmetric.\n",
"2. In `EBV50.txt` the vector $\\bar{\\mu}$ is described, with only one column containing the posterior mean over 1000 samples.\n",
"3. In `S50.txt` the matrix $\\Omega$ is described, with columns for $i$, $j$, and $\\sigma_{ij}$, where only the upper triangle is described since the matrix is symmetric.\n",
"\n",
"Note that this particular problem does not contain a separate file for sex data; odd indexed candidates are sires and even indexed candidates are dams. For now, we also don't have a $l$ or $u$ bounding the possible weights.\n",
"\n",
"\n",
"The first step then is clearly going to be loading the matrices in particular from file. We create the following function to do so.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def load_problem(A_filename, E_filename, S_filename, dimension=False):\n",
" \"\"\"\n",
" Used to load genetic selection problems into NumPy. It takes three\n",
" string inputs for filenames where Sigma, Mu, and Omega are stored,\n",
" as well as an optional integer input for problem dimension if this\n",
" is known. If it's know know, it's worked out based on E_filename.\n",
"\n",
" As output, it returns (A, E, S, n), where A and S are n-by-n NumPy\n",
" arrays, E is a length n NumPy array, and n is an integer.\n",
" \"\"\"\n",
"\n",
" def load_symmetric_matrix(filename, dimension):\n",
" \"\"\"\n",
" Since NumPy doesn't have a stock way to load matrices\n",
" stored in coordinate format format, this adds one.\n",
" \"\"\"\n",
"\n",
" matrix = np.zeros([dimension, dimension])\n",
"\n",
" with open(filename, 'r') as file:\n",
" for line in file:\n",
" i, j, entry = line.split(\" \")\n",
" # data files indexed from 1, not 0\n",
" matrix[int(i)-1, int(j)-1] = entry\n",
" matrix[int(j)-1, int(i)-1] = entry\n",
"\n",
" return matrix\n",
"\n",
"\n",
" # if dimension wasn't supplied, need to find that\n",
" if not dimension:\n",
" # get dimension from EBV, since it's the smallest file\n",
" with open(E_filename, 'r') as file:\n",
" dimension = sum(1 for _ in file)\n",
"\n",
" # EBV isn't in coordinate format so can be loaded directly\n",
" E = np.loadtxt(E_filename) \n",
" # A and S are stored by coordinates so need special loader\n",
" A = load_symmetric_matrix(A_filename, dimension)\n",
" S = load_symmetric_matrix(S_filename, dimension)\n",
"\n",
" return A, E, S, dimension"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the given example problem we can now solve it with Gurobi using the exact same methods as before, for both the standard genetic selection problem and the robust version of the problem. The following cell does both alongside each other to accentuate the differences."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Set parameter Username\n",
"Academic license - for non-commercial use only - expires 2025-02-26\n",
"Set parameter TimeLimit to value 300\n",
"Set parameter MIPGap to value 0.009\n",
"Set parameter MIPGap to value 0.009\n",
"Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - \"Ubuntu 22.04.4 LTS\")\n",
"\n",
"CPU model: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz, instruction set [SSE2|AVX|AVX2]\n",
"Thread count: 4 physical cores, 8 logical processors, using up to 8 threads\n",
"\n",
"Optimize a model with 4 rows, 50 columns and 100 nonzeros\n",
"Model fingerprint: 0x534ff85e\n",
"Model has 1275 quadratic objective terms\n",
"Coefficient statistics:\n",
" Matrix range [1e+00, 1e+00]\n",
" Objective range [7e-01, 1e+00]\n",
" QObjective range [5e-02, 1e+00]\n",
" Bounds range [0e+00, 0e+00]\n",
" RHS range [5e-01, 5e-01]\n",
"Presolve removed 2 rows and 0 columns\n",
"Presolve time: 0.01s\n",
"Presolved: 2 rows, 50 columns, 50 nonzeros\n",
"Presolved model has 1275 quadratic objective terms\n",
"Ordering time: 0.00s\n",
"\n",
"Barrier statistics:\n",
" Free vars : 49\n",
" AA' NZ : 1.274e+03\n",
" Factor NZ : 1.326e+03\n",
" Factor Ops : 4.553e+04 (less than 1 second per iteration)\n",
" Threads : 1\n",
"\n",
" Objective Residual\n",
"Iter Primal Dual Primal Dual Compl Time\n",
" 0 3.76397773e+05 -4.22619805e+05 2.50e+04 1.12e+00 9.99e+05 0s\n",
" 1 -5.01117677e+01 -9.99230833e+02 2.68e+01 1.25e-04 1.09e+03 0s\n",
" 2 -8.81632395e-01 -9.63667643e+02 2.68e-05 1.25e-10 1.93e+01 0s\n",
" 3 -8.81606636e-01 -2.03034027e+00 5.19e-09 2.41e-14 2.30e-02 0s\n",
" 4 -9.04332054e-01 -1.04180295e+00 4.73e-10 2.19e-15 2.75e-03 0s\n",
" 5 -9.59656693e-01 -1.01318293e+00 1.67e-15 8.88e-16 1.07e-03 0s\n",
" 6 -9.73793052e-01 -9.76849441e-01 2.22e-16 1.33e-15 6.11e-05 0s\n",
" 7 -9.76163966e-01 -9.76239391e-01 1.39e-15 6.66e-16 1.51e-06 0s\n",
" 8 -9.76213879e-01 -9.76217890e-01 9.91e-15 5.55e-16 8.02e-08 0s\n",
" 9 -9.76215229e-01 -9.76215592e-01 1.65e-13 5.55e-16 7.25e-09 0s\n",
" 10 -9.76215438e-01 -9.76215482e-01 7.03e-13 6.66e-16 8.86e-10 0s\n",
" 11 -9.76215473e-01 -9.76215476e-01 4.24e-13 6.66e-16 4.91e-11 0s\n",
"\n",
"Barrier solved model in 11 iterations and 0.03 seconds (0.00 work units)\n",
"Optimal objective -9.76215473e-01\n",
"\n",
"Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - \"Ubuntu 22.04.4 LTS\")\n",
"\n",
"CPU model: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz, instruction set [SSE2|AVX|AVX2]\n",
"Thread count: 4 physical cores, 8 logical processors, using up to 8 threads\n",
"\n",
"Optimize a model with 5 rows, 51 columns and 101 nonzeros\n",
"Model fingerprint: 0x8c48581e\n",
"Model has 1275 quadratic objective terms\n",
"Model has 50 quadratic constraints\n",
"Coefficient statistics:\n",
" Matrix range [1e+00, 1e+00]\n",
" QMatrix range [7e-06, 1e+00]\n",
" Objective range [7e-01, 1e+00]\n",
" QObjective range [5e-02, 1e+00]\n",
" Bounds range [0e+00, 0e+00]\n",
" RHS range [5e-01, 5e-01]\n",
"Presolve removed 3 rows and 0 columns\n",
"\n",
"Continuous model is non-convex -- solving as a MIP\n",
"\n",
"Presolve removed 3 rows and 0 columns\n",
"Presolve time: 0.01s\n",
"Presolved: 2552 rows, 1327 columns, 7600 nonzeros\n",
"Presolved model has 1275 quadratic objective terms\n",
"Presolved model has 1 quadratic constraint(s)\n",
"Presolved model has 1275 bilinear constraint(s)\n",
"Variable types: 1327 continuous, 0 integer (0 binary)\n",
"Found heuristic solution: objective -0.9763584\n",
"\n",
"Root relaxation: objective -1.285097e+00, 31 iterations, 0.00 seconds (0.00 work units)\n",
"\n",
" Nodes | Current Node | Objective Bounds | Work\n",
" Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time\n",
"\n",
" 0 0 -1.28510 0 36 -0.97636 -1.28510 31.6% - 0s\n",
" 0 0 -1.13066 0 69 -0.97636 -1.13066 15.8% - 0s\n",
" 0 0 -1.05344 0 51 -0.97636 -1.05344 7.89% - 0s\n",
" 0 0 -1.01483 0 51 -0.97636 -1.01483 3.94% - 0s\n",
" 0 0 -0.99552 0 37 -0.97636 -0.99552 1.96% - 0s\n",
" 0 0 -0.99119 0 1175 -0.97636 -0.99119 1.52% - 0s\n",
" 0 0 -0.99098 0 1118 -0.97636 -0.99098 1.50% - 0s\n",
" 0 0 -0.98587 0 47 -0.97636 -0.98587 0.97% - 1s\n",
" 0 2 -0.98587 0 47 -0.97636 -0.98587 0.97% - 1s\n",
" 184 201 -0.98413 26 341 -0.97636 -0.98567 0.95% 156 5s\n",
" 434 418 -0.97843 58 533 -0.97636 -0.98567 0.95% 318 10s\n",
"H 609 566 -0.9768443 -0.98567 0.90% 296 12s\n",
"H 699 541 -0.9768479 -0.98567 0.90% 282 14s\n",
" 704 545 -0.97802 28 0 -0.97685 -0.98567 0.90% 280 15s\n",
"\n",
"Cutting planes:\n",
" RLT: 39\n",
" PSD: 143\n",
"\n",
"Explored 704 nodes (208237 simplex iterations) in 15.37 seconds (17.22 work units)\n",
"Thread count was 8 (of 8 available processors)\n",
"\n",
"Solution count 1: -0.976848 \n",
"\n",
"Optimal solution found (tolerance 9.00e-03)\n",
"Best objective -9.768443464472e-01, best bound -9.854709478274e-01, gap 0.8831%\n",
"\n",
"SIRE WEIGHTS\t\t\t DAM WEIGHTS\n",
"--------------------\t\t --------------------\n",
" i w_std w_rbs\t\t i w_std w_rbs\n",
"00 0.00000 0.00002 01 0.08319 0.08339\n",
"02 0.11649 0.11508 03 0.00000 0.00001\n",
"04 0.00000 0.00002 05 0.00000 0.00002\n",
"06 0.00000 0.00001 07 0.06817 0.06830\n",
"08 0.00000 0.00001 09 0.03762 0.03701\n",
"10 0.00000 0.00002 11 0.00000 0.00002\n",
"12 0.00000 0.00001 13 0.00355 0.00564\n",
"14 0.11542 0.11450 15 0.06966 0.06913\n",
"16 0.00000 0.00006 17 0.00000 0.00001\n",
"18 0.00000 0.00002 19 0.00000 0.00003\n",
"20 0.00000 0.00002 21 0.00001 0.00062\n",
"22 0.00000 0.00017 23 0.00000 0.00004\n",
"24 0.03537 0.03493 25 0.00000 0.00001\n",
"26 0.00000 0.00001 27 0.00000 0.00002\n",
"28 0.00000 0.00002 29 0.00000 0.00002\n",
"30 0.00000 0.00001 31 0.03580 0.03612\n",
"32 0.08447 0.08517 33 0.00000 0.00003\n",
"34 0.00000 0.00001 35 0.02849 0.02747\n",
"36 0.13489 0.13487 37 0.07446 0.07264\n",
"38 0.00000 0.00001 39 0.00000 0.00001\n",
"40 0.00000 0.00003 41 0.07312 0.07388\n",
"42 0.00000 0.00004 43 0.00000 0.00003\n",
"44 0.01336 0.01485 45 0.00000 0.00002\n",
"46 0.00000 0.00009 47 0.02593 0.02555\n",
"48 0.00000 0.00003 49 0.00000 0.00000\n"
]
}
],
"source": [
"sigma, mubar, omega, n = load_problem(\n",
" \"../Example/A50.txt\",\n",
" \"../Example/EBV50.txt\",\n",
" \"../Example/S50.txt\",\n",
" 50)\n",
"\n",
"lam = 0.5\n",
"kappa = 2\n",
"\n",
"# define the M so that column i is [1;0] if i is a sire (so even) and [0;1] otherwise \n",
"M = np.zeros((2, n))\n",
"M[0, range(0,50,2)] = 1\n",
"M[1, range(1,50,2)] = 1\n",
"# define the right hand side of the constraint Mx = m\n",
"m = np.array([[0.5], [0.5]])\n",
"\n",
"# create models for standard and robust genetic selection\n",
"model_std = gp.Model(\"n50standard\")\n",
"model_rbs = gp.Model(\"n50robust\")\n",
"\n",
"# initialise w for both models, z for robust model\n",
"w_std = model_std.addMVar(shape=n, vtype=GRB.CONTINUOUS, name=\"w\") \n",
"w_rbs = model_rbs.addMVar(shape=n, vtype=GRB.CONTINUOUS, name=\"w\")\n",
"z_rbs = model_rbs.addVar(name=\"z\")\n",
"\n",
"# define the objective functions for both models\n",
"model_std.setObjective(\n",
" 0.5*w_std@(sigma@w_std) - lam*w_std.transpose()@mubar,\n",
"GRB.MINIMIZE)\n",
"\n",
"model_rbs.setObjective(\n",
" # Gurobi does offer a way to set one objective in terms of another, i.e.\n",
" # we could use `model_std.getObjective() - lam*kappa*z_rbs` to define this\n",
" # robust objective, but it results in a significant slowdown in code.\n",
" 0.5*w_rbs@(sigma@w_rbs) - lam*w_rbs.transpose()@mubar - lam*kappa*z_rbs,\n",
"GRB.MINIMIZE)\n",
"\n",
"# add sum-to-half constraints to both models\n",
"model_std.addConstr(M @ w_std == m, name=\"sum-to-half\")\n",
"model_rbs.addConstr(M @ w_std == m, name=\"sum-to-half\")\n",
"\n",
"# add quadratic uncertainty constraint to the robust model\n",
"model_rbs.addConstr(z_rbs**2 <= np.inner(w_rbs, omega@w_rbs), name=\"uncertainty\")\n",
"model_rbs.addConstr(z_rbs >= 0, name=\"z positive\")\n",
"\n",
"# since working with non-trivial size, set a time limit\n",
"time_limit = 60*5 # 5 minutes\n",
"model_std.setParam(GRB.Param.TimeLimit, time_limit)\n",
"model_std.setParam(GRB.Param.TimeLimit, time_limit)\n",
"\n",
"# for the same reason, also set a duality gap tolerance\n",
"duality_gap = 0.009\n",
"model_std.setParam('MIPGap', duality_gap)\n",
"model_rbs.setParam('MIPGap', duality_gap)\n",
"\n",
"# solve both problems with Gurobi\n",
"model_std.optimize()\n",
"model_rbs.optimize()\n",
"\n",
"# HACK code which prints the results for comparison in a nice format\n",
"print(\"\\nSIRE WEIGHTS\\t\\t\\t DAM WEIGHTS\")\n",
"print(\"-\"*20 + \"\\t\\t \" + \"-\"*20)\n",
"print(\" i w_std w_rbs\\t\\t i w_std w_rbs\")\n",
"for candidate in range(25):\n",
" print(f\"{candidate*2:02d} {w_std.X[candidate*2]:.5f} {w_rbs.X[candidate*2]:.5f} \\\n",
" {candidate*2+1:02d} {w_std.X[candidate*2+1]:.5f} {w_rbs.X[candidate*2+1]:.5f}\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading