Skip to content

Censored Trait Analysis

Hao Cheng edited this page Nov 24, 2021 · 20 revisions

Censored traits are allowed if the upper bounds are provided as an array in the argument censored_trait, and lower bounds are provided as responses (phenotypes).

Data Simulation

Below we will convert a continuous trait "y1" to a censored trait by setting a phenotypic value (e.g.,y11) to a range [lower_bound, upper_bound] as follows,

  1. [y11 - 0.5, y11 + 0.5] if 0.0 < y11 <1.0
  2. [2.0, Inf] if y11 > 2.0
  3. [-Inf, -3.0] if y11 < -3.0
  4. [y11, y11] otherwise

This is only used to demonstrate the censored trait analysis using JWAS.

using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets

phenofile  = dataset("phenotypes.csv")
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])

phenotypes[!, :lower_bound] .= round.(phenotypes[!, :y1],digits=2);
phenotypes[!, :upper_bound] .= round.(phenotypes[!, :y1],digits=2);
phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :upper_bound] .= phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :y1] .+ 0.5;
phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :lower_bound] .= phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :y1] .- 0.5;
phenotypes[phenotypes[!, :y1] .> 2.0,:upper_bound] .= Inf;
phenotypes[phenotypes[!, :y1] .> 2.0,:lower_bound] .= 2.0;
phenotypes[phenotypes[!, :y1] .< -3.0,:upper_bound].= -3.0;
phenotypes[phenotypes[!, :y1] .< -3.0,:lower_bound].= -Inf;

Univariate Linear Mixed Model (Genomic data)

# Step 1: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets

# Step 2: Read data 
pedfile    = dataset("pedigree.csv")
genofile   = dataset("genotypes.csv")
pedigree   = get_pedigree(pedfile,separator=",",header=true);
genotypes  = get_genotypes(genofile,separator=',',method="BayesC");
first(phenotypes,5)

# Step 3: Build Model Equations

model_equation  ="lower_bound = intercept + x1 + x2 + x2*x3 + ID + dam + genotypes"
model = build_model(model_equation);

# Step 4: Set Factors or Covariates
set_covariate(model,"x1");

# Step 5: Set Random or Fixed Effects
set_random(model,"x2");
set_random(model,"ID dam",pedigree);

# Step 6: Run Analysis
out=runMCMC(model,phenotypes,censored_trait=phenotypes[!,:upper_bound],double_precision=true);

# Step 7: Check Accuruacy
results    = innerjoin(out["EBV_lower_bound"], phenotypes, on = :ID) 
accuruacy  = cor(results[!,:EBV],results[!,:bv1])

Univariate Linear Additive Genetic Model

To run analysis without genomic data, just remove "genotypes" in the model_equation from the script above.

Note that if you have observations without upper or lower bounds, you can have them in your phenotype file as

lower_bound, upper_bound
120,Inf
-Inf,130
Clone this wiki locally