File Formats
RobustOCS works with a standard set of file formats to read in problem data. These can be generated through the AlphaSimR package for simulation experiments. If anything here is unclear, please do open an issue.
Below are file descriptions for each of the problem variables, using the toy n = 4 problem as an example. All files have the standard text file `.txt` extension unless specified otherwise.
The sex of each candidate in the cohort is stored in a space-separated values file. Each row of the file has an integer index of a candidate in the first column and a string for the sex of that candidate in the second column. Male candidates (sires) are denoted by an `M` and female candidates (dams) are denoted by an `F`.
For example, the file contents
1 M
2 F
3 M
4 F
correspond to the index sets of sires, $\{1, 3\}$, and dams, $\{2, 4\}$.
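As an illustration of the format, the following is a minimal sketch of reading such a file into two index lists. The function name `read_sex_file` is hypothetical and is not part of the RobustOCS API; the sketch only assumes the two-column layout described above.

```python
def read_sex_file(lines):
    """Split candidates into sire and dam index lists (one-based indices).

    Hypothetical helper for illustration; not a RobustOCS function.
    """
    sires, dams = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        index, sex = line.split()
        if sex == "M":
            sires.append(int(index))
        elif sex == "F":
            dams.append(int(index))
        else:
            raise ValueError(f"unexpected sex label {sex!r}")
    return sires, dams


# The toy n = 4 file from above, given as a list of lines.
example = ["1 M", "2 F", "3 M", "4 F"]
sires, dams = read_sex_file(example)
print(sires, dams)  # [1, 3] [2, 4]
```

With a real file, the same function can be passed an open file handle, since iterating over a handle yields lines.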
EBVs, regardless whether actual (
For example, the file contents
1
2
1
2
correspond to the vector
The covariance matrix for expected breeding values (i.e. the
Note that the file format indexes the matrix rows and columns from one, not from zero as is the case in Python. For example, the file contents
1 1 0.11111111111111
2 2 0.11111111111111
3 3 4.0
4 4 4.0
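correspond to the covariance matrix

$$\begin{bmatrix} 0.11111111111111 & 0 & 0 & 0 \\ 0 & 0.11111111111111 & 0 & 0 \\ 0 & 0 & 4.0 & 0 \\ 0 & 0 & 0 & 4.0 \end{bmatrix}.$$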
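The shift from one-based file indices to zero-based Python indices can be sketched as follows. This is an illustrative reader rather than RobustOCS's own loader: the name `read_coordinate_matrix` is hypothetical, and mirroring each entry across the diagonal is an assumption made here because covariance matrices are symmetric.

```python
def read_coordinate_matrix(lines, n):
    """Build a dense n-by-n matrix from 'row col value' lines.

    Hypothetical helper for illustration; not a RobustOCS function.
    File indices start at one, so each is shifted down by one.
    """
    matrix = [[0.0] * n for _ in range(n)]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        i, j, value = line.split()
        row, col = int(i) - 1, int(j) - 1  # one-based to zero-based
        matrix[row][col] = float(value)
        matrix[col][row] = float(value)  # assumption: symmetric storage
    return matrix


# The toy n = 4 file from above, given as a list of lines.
example = [
    "1 1 0.11111111111111",
    "2 2 0.11111111111111",
    "3 3 4.0",
    "4 4 4.0",
]
omega = read_coordinate_matrix(example, 4)
for row in omega:
    print(row)
```

Entries that do not appear in the file are left at zero, which is why only the four diagonal entries need to be listed for this example.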
RobustOCS can work with two standard formats for the relationship matrix; which one we use depends on how we choose to model relationships in the cohort. Firstly, we can model the relationships using a covariance matrix in which case
1 1 1
2 2 1
3 3 1
4 4 1
corresponds to the 4-by-4 identity matrix.
We can also model the relationships using a pedigree tree, which can be stored in a `*.ped` datafile, a CSV file with columns `i`, `p`, and `q`, where unknown parents are represented by a zero. When computing WNRM it doesn't matter whether each of
Consider an example with four candidates:
- a sire with unknown parentage,
- a dam parented by the first and an unknown dam,
- a dam parented by the first and the second,
- a dam parented by the first and the third.
We can represent these relationships using a digraph, whose associated pedigree file is shown below.
i,p,q
1,0,0
2,1,0
3,1,2
4,1,3
Once the tree is loaded, RobustOCS generates
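To make the link between a pedigree file and a relationship matrix concrete, here is a minimal sketch of Wright's numerator relationship matrix computed with the standard tabular method. It is not RobustOCS's own implementation: the function name `numerator_relationship_matrix` is hypothetical, and the sketch assumes candidates are numbered 1 to n in file order with parents always listed before their offspring. Whether RobustOCS applies any further scaling is not covered here.

```python
import csv
import io


def numerator_relationship_matrix(pedigree):
    """Tabular method for Wright's numerator relationship matrix.

    pedigree: list of (i, p, q) tuples with one-based ids, 0 = unknown
    parent, ordered so that parents appear before offspring.
    Hypothetical helper for illustration; not a RobustOCS function.
    """
    n = len(pedigree)
    # Row and column 0 stand for the unknown parent and stay at zero.
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, p, q in pedigree:
        for j in range(1, i):
            # Relationship with an earlier candidate: mean over the parents.
            A[i][j] = A[j][i] = 0.5 * (A[j][p] + A[j][q])
        # Diagonal: one plus half the relationship between the parents.
        A[i][i] = 1.0 + 0.5 * A[p][q]
    # Drop the padding row and column.
    return [row[1:] for row in A[1:]]


# The pedigree file from above.
ped_text = """i,p,q
1,0,0
2,1,0
3,1,2
4,1,3
"""
rows = csv.DictReader(io.StringIO(ped_text))
pedigree = [(int(r["i"]), int(r["p"]), int(r["q"])) for r in rows]

wnrm = numerator_relationship_matrix(pedigree)
for row in wnrm:
    print(row)
```

For this pedigree the diagonal entries come out as 1.0, 1.0, 1.25, and 1.375; the values above one for the third and fourth candidates reflect that their parents are related.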
Below are file descriptions for the output files, using the toy n = 4 problem as an example.
The `solveROCS` quick-start function has an optional parameter `solution_output="filename"`, which, when used, saves the solution vector to `filename.csv` in the local directory. This includes the optimal contribution for each candidate alongside the candidate's identifier, which is loaded alongside the sex data. For the
candidate,contribution
1,0.3822569445737661
2,0.3822569445664524
3,0.11774305542623399
4,0.11774305543354763
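A saved solution can be read back with Python's standard `csv` module. The sketch below is illustrative only: the helper name `read_solution` is hypothetical, and the sire and dam index lists are taken from the toy sex file earlier on this page. It checks that the sire and dam contributions each sum to one half, matching the two equality constraints with right-hand side 0.5 in the model for this toy problem.

```python
import csv
import io
import math


def read_solution(text):
    """Map candidate identifier to contribution from the CSV text.

    Hypothetical helper for illustration; not a RobustOCS function.
    """
    reader = csv.DictReader(io.StringIO(text))
    return {int(r["candidate"]): float(r["contribution"]) for r in reader}


# The toy n = 4 solution file from above.
solution_csv = """candidate,contribution
1,0.3822569445737661
2,0.3822569445664524
3,0.11774305542623399
4,0.11774305543354763
"""
weights = read_solution(solution_csv)

# Index sets from the toy sex file: candidates 1 and 3 are sires, 2 and 4 are dams.
sires, dams = [1, 3], [2, 4]
sire_total = sum(weights[i] for i in sires)
dam_total = sum(weights[i] for i in dams)

# Compare with a tolerance, since the solver output is floating point.
print(math.isclose(sire_total, 0.5, abs_tol=1e-9))
print(math.isclose(dam_total, 0.5, abs_tol=1e-9))
```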
By using the optional `model_output="filename"` parameter with any of the solver functions, RobustOCS creates an MPS file `filename.mps`. This includes the model in a format that can be read into other optimization software, but the file is not intended to be human-readable, so it is unlikely to have any other use. For the
NAME robust-genetics
ROWS
N Obj
E r0
E r1
G r2
G r3
G r4
G r5
G r6
G r7
G r8
G r9
G r10
G r11
G r12
COLUMNS
c0 Obj -1
c0 r0 1
c0 r3 -0.05555555556
c0 r4 -0.04761904762
c0 r5 -0.03961550665
c0 r6 -0.04446456755
c0 r7 -0.04235026123
c0 r8 -0.04290175015
c0 r9 -0.04263010846
c0 r10 -0.04249123841
c0 r11 -0.04242101681
c0 r12 -0.0424561939
c1 Obj -1
c1 r1 1
c1 r3 -0.05555555556
c1 r4 -0.04761904762
c1 r5 -0.03961550665
c1 r6 -0.04446456755
c1 r7 -0.04235026123
c1 r8 -0.04290175015
c1 r9 -0.04263010846
c1 r10 -0.04249123841
c1 r11 -0.04242101681
c1 r12 -0.0424561939
c2 Obj -2
c2 r0 1
c2 r2 -2
c2 r4 -0.2857142857
c2 r5 -0.5738417606
c2 r6 -0.3992755682
c2 r7 -0.4753905958
c2 r8 -0.4555369945
c2 r9 -0.4653160956
c2 r10 -0.4703154171
c2 r11 -0.472843395
c2 r12 -0.4715770195
c3 Obj -2
c3 r1 1
c3 r2 -2
c3 r4 -0.2857142857
c3 r5 -0.5738417606
c3 r6 -0.3992755682
c3 r7 -0.4753905958
c3 r8 -0.4555369945
c3 r9 -0.4653160956
c3 r10 -0.4703154171
c3 r11 -0.472843395
c3 r12 -0.4715770196
c4 Obj 1
c4 r2 1.414213562
c4 r3 0.2357022604
c4 r4 0.2857142857
c4 r5 0.4391994692
c4 r6 0.3395559593
c4 r7 0.3811586449
c4 r8 0.3699825127
c4 r9 0.3754615893
c4 r10 0.3782821592
c4 r11 0.3797133209
c4 r12 0.3789959814
RHS
RHS_V r0 0.5
RHS_V r1 0.5
BOUNDS
UP BOUND c0 1
UP BOUND c1 1
UP BOUND c2 1
UP BOUND c3 1
QUADOBJ
c0 c0 0.5
c1 c1 0.5
c2 c2 0.5
c3 c3 0.5
ENDATA
Note that with SQP methods, the model file produced will be for the final model the solver constructed. This includes all of the constraints necessary to approximate the relaxed robust objective term.
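Although an MPS file is meant for solvers, a quick summary of its `ROWS` section shows how many constraints the final model contains. The following is a rough sketch using only the standard library, not a full MPS parser: the function name `count_mps_rows` is hypothetical, and it assumes section headers appear as the first token on their own line.

```python
from collections import Counter

# Section headers that can appear in an MPS file of this kind.
SECTIONS = {"NAME", "ROWS", "COLUMNS", "RHS", "RANGES", "BOUNDS", "QUADOBJ", "ENDATA"}


def count_mps_rows(lines):
    """Tally row types (N, E, G, L) listed in the ROWS section.

    Hypothetical helper for illustration; not a RobustOCS function.
    """
    counts = Counter()
    section = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] in SECTIONS:
            section = fields[0]  # entering a new section
            continue
        if section == "ROWS":
            counts[fields[0]] += 1  # first field is the row type
    return counts


# A shortened excerpt in the same layout as the file above.
excerpt = """NAME robust-genetics
ROWS
 N Obj
 E r0
 E r1
 G r2
 G r3
COLUMNS
 c0 Obj -1
 c0 r0 1
RHS
 RHS_V r0 0.5
ENDATA
""".splitlines()

counts = count_mps_rows(excerpt)
print(dict(counts))  # {'N': 1, 'E': 2, 'G': 2}
```

Applied to the full toy model shown above, the same tally would give one objective row (`Obj`), two equality rows (`r0`, `r1`), and eleven inequality rows (`r2` to `r12`).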
As mentioned, the examples with 50, 1000, and 10,000 candidates are generated using AlphaSimR to have a realistic structure. The original simulation was of a 12,000 candidate cohort. Based on that simulated data we constructed:
- $\boldsymbol{\bar\mu}$, a vector of length 12,000 computed as the posterior mean over 1000 samples of the expected breeding values,
- $\Sigma$, a 12,000-by-12,000 matrix measuring co-ancestry between individuals based on the pedigree data,
- $\Omega$, a 12,000-by-12,000 covariance matrix between those 1000 EBV samples.
In each case for the 50, 1000, and 10,000 examples we then took the
This documentation relates to the latest version of the package on GitHub. For past versions, download the zip bundled with the release from here. If anything in this wiki looks incorrect or you think key information is missing, please do open an issue.