I wanted to open the issue on the simultaneous deconvolution model, so I understand it better.
The likelihood model
Is the following model valid?
Let $y(t) = (y_1(t), \dotsc, y_V(t))^T$ be a vector of proportions of variants and $A_{mv} = 1$ when variant $v$ has mutation $m$ and $A_{mv} = 0$ otherwise.
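We have $p_m(t) = (A y(t))_m = \sum_v A_{mv} y_v(t) \le 1$, which is the probability of observing mutation $m$ at time $t$. The bound holds because the entries of $A$ are 0 or 1 and the proportions $y_v(t)$ are non-negative and sum to at most one.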
Now, for some loci we observe some values. We want to use the quasibinomial model.
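To make the question concrete, here is a minimal NumPy sketch of how I read the model, not a description of the existing implementation. The names `mutation_probabilities`, `quasibinomial_loglik`, and `overdispersion`, and the use of observed mutation frequencies `x` in $[0, 1]$, are my assumptions. "Quasibinomial" is taken to mean the binomial log-likelihood evaluated at fractional observations and divided by a dispersion parameter.

```python
import numpy as np

def mutation_probabilities(A, y):
    """A: (M, V) 0/1 mutation-by-variant matrix; y: (V,) variant proportions.
    Returns p of shape (M,), with p_m = sum_v A[m, v] * y[v]."""
    return A @ y

def quasibinomial_loglik(x, p, overdispersion=1.0, eps=1e-9):
    """Quasi-log-likelihood of observed frequencies x given probabilities p.
    Assumption: binomial cross-entropy form scaled by 1 / overdispersion."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return np.sum(x * np.log(p) + (1.0 - x) * np.log1p(-p)) / overdispersion

# Toy example: 3 mutations, 3 variants (made-up numbers).
A = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 0, 1]])
y = np.array([0.5, 0.3, 0.2])
p = mutation_probabilities(A, y)   # [0.5, 0.8, 0.2]
x = np.array([0.45, 0.85, 0.15])   # hypothetical observed frequencies
ll = quasibinomial_loglik(x, p)
```

The dispersion only rescales the objective, so it does not move the point estimate of $y(t)$; it would matter for uncertainty estimates.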
Handling non-existent values
How do we handle missing values in $x^{(n)}$, i.e., when some mutations are not observed? Do we set them to zero (more realistic if they went undetected because of low abundance), or do we treat them as missing at random (assuming that sequencing failed for reasons independent of the given mutation and its abundance)?
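The two options lead to different objectives, which a small sketch can show. This is only an illustration under my own assumptions: missing entries are encoded as `NaN`, and the function names `masked_loglik` and `zero_filled_loglik` are hypothetical.

```python
import numpy as np

def masked_loglik(x, p, eps=1e-9):
    """Missing-at-random reading: NaN entries of x are dropped from the sum."""
    observed = ~np.isnan(x)
    po = np.clip(p[observed], eps, 1.0 - eps)
    xo = x[observed]
    return np.sum(xo * np.log(po) + (1.0 - xo) * np.log1p(-po))

def zero_filled_loglik(x, p, eps=1e-9):
    """Low-abundance reading: NaN entries of x are replaced by 0 and kept."""
    x_filled = np.where(np.isnan(x), 0.0, x)
    return masked_loglik(x_filled, p, eps)

x = np.array([0.45, np.nan, 0.15])  # second mutation not observed
p = np.array([0.5, 0.8, 0.2])       # model probabilities (made-up numbers)

ll_masked = masked_loglik(x, p)
ll_zero = zero_filled_loglik(x, p)
```

With zero-filling, each missing mutation adds a $\log(1 - p_m)$ term, so the fit is pushed towards low abundance for every variant carrying that mutation. With masking, the missing mutation does not constrain the fit at all.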
Modeling the proportions of variants
A loose thought: it could be cool to make the loss a function of the $y(t)$ time series alone, so that we can easily swap the logistic model for something else, e.g., #15.
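Roughly what I have in mind, as a sketch rather than a proposal for the actual API: the loss only sees the matrix of proportions, and any model mapping parameters and time points to proportions can be plugged in. All names here (`logistic_proportions`, `constant_proportions`, `loss_from_proportions`, `make_loss`) are hypothetical, and the multinomial logistic form $y_v(t) \propto \exp(a_v + b_v t)$ is my guess at what "the logistic model" means.

```python
import numpy as np

def logistic_proportions(params, ts):
    """Assumed multinomial logistic growth: y_v(t) proportional to exp(a_v + b_v * t).
    params = (a, b), each of shape (V,); ts of shape (T,). Returns (T, V)."""
    a, b = params
    logits = a[None, :] + ts[:, None] * b[None, :]
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def constant_proportions(params, ts):
    """Alternative model: proportions do not change over time."""
    w = np.exp(params - params.max())
    return np.tile(w / w.sum(), (len(ts), 1))

def loss_from_proportions(Y, A, X, eps=1e-9):
    """Negative quasi-log-likelihood. Y: (T, V), A: (M, V), X: (T, M).
    NaN entries of X are skipped (one possible missing-data choice)."""
    P = np.clip(Y @ A.T, eps, 1.0 - eps)
    observed = ~np.isnan(X)
    Xo = np.where(observed, X, 0.0)
    ll = Xo * np.log(P) + (1.0 - Xo) * np.log1p(-P)
    return -np.sum(ll[observed])

def make_loss(proportion_model, A, X, ts):
    """Bind data to a loss over the parameters of any proportion model."""
    return lambda params: loss_from_proportions(proportion_model(params, ts), A, X)

# Toy data generated from the logistic model itself (made-up numbers).
A = np.array([[1, 1, 0],
              [0, 1, 1]])
ts = np.array([0.0, 1.0, 2.0])
true_params = (np.zeros(3), np.array([0.0, 0.5, 1.0]))
X = logistic_proportions(true_params, ts) @ A.T

logistic_loss = make_loss(logistic_proportions, A, X, ts)
constant_loss = make_loss(constant_proportions, A, X, ts)
```

Swapping the model is then a one-argument change in `make_loss`, and the likelihood code never needs to know how $y(t)$ was produced.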