Simultaneous deconvolution #20

Open
pawel-czyz opened this issue Nov 14, 2024 · 0 comments
Labels
enhancement New feature or request

pawel-czyz commented Nov 14, 2024

I wanted to open the issue on the simultaneous deconvolution model, so I understand it better.

The likelihood model

Is the following model valid?
Let $y(t) = (y_1(t), \dotsc, y_V(t))^T$ be a vector of proportions of variants and $A_{mv} = 1$ when variant $v$ has mutation $m$ and $A_{mv} = 0$ otherwise.
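We have $p_m(t) = (A y(t))_m = \sum_v A_{mv} y_v(t) \le 1$, which is the probability of observing mutation $m$ at time $t$. The bound holds because $A_{mv} \in \{0, 1\}$ and the proportions $y_v(t)$ sum to one.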
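
To make the indexing concrete, here is a minimal NumPy sketch of $p(t) = A y(t)$ at a single time point. The matrix `A`, the proportions `y`, and the helper name `mutation_probabilities` are made-up illustrations, not anything taken from the package.

```python
import numpy as np


def mutation_probabilities(A, y):
    """Return p with p[m] = sum_v A[m, v] * y[v] (hypothetical helper)."""
    return np.asarray(A, dtype=float) @ np.asarray(y, dtype=float)


# Toy example with M = 3 mutations (rows) and V = 2 variants (columns).
# A[m, v] = 1 if variant v carries mutation m, and 0 otherwise.
A = np.array(
    [
        [1, 0],  # mutation 0: only in variant 0
        [1, 1],  # mutation 1: shared by both variants
        [0, 1],  # mutation 2: only in variant 1
    ]
)

# Variant proportions at one time point; they are non-negative and sum to 1.
y = np.array([0.3, 0.7])

p = mutation_probabilities(A, y)
```

A mutation shared by all variants gets probability 1, and a mutation private to one variant gets that variant's proportion.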

Now, for some loci we have observed values, and we want to model them with the quasibinomial model.

Handling non-existent values

How do we handle missing values in $x^{(n)}$, i.e., when some mutations are not observed? Do we set them to zero (more realistic if they went undetected due to low abundance), or do we treat them as missing at random (assuming that sequencing failed for reasons independent of the given mutation and its abundance)?
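
The two options can be compared in a small sketch. I am assuming here that the quasibinomial loss is the binomial cross-entropy between observed frequencies and $p_m$, with the overdispersion factor left out since it only rescales the loss; `quasibinomial_loss` and the toy numbers are hypothetical.

```python
import numpy as np


def quasibinomial_loss(x, p, mask=None, eps=1e-9):
    """Binomial cross-entropy between frequencies x and probabilities p.

    Entries where mask is False contribute nothing to the loss.
    The overdispersion factor is omitted (assumed constant scaling).
    """
    x = np.asarray(x, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    if mask is None:
        mask = np.ones(x.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    # Replace masked entries with a placeholder so NaNs do not propagate.
    x_filled = np.where(mask, x, 0.0)
    terms = -(x_filled * np.log(p) + (1 - x_filled) * np.log1p(-p))
    return float(np.sum(np.where(mask, terms, 0.0)))


# Toy data: the second mutation was not observed (encoded as NaN).
x = np.array([0.2, np.nan, 0.9])
p = np.array([0.25, 0.5, 0.8])
observed = ~np.isnan(x)

# Option 1: treat unobserved mutations as zero frequency.
loss_zero = quasibinomial_loss(np.where(observed, x, 0.0), p)

# Option 2: missing at random, i.e. drop unobserved entries from the loss.
loss_masked = quasibinomial_loss(x, p, mask=observed)
```

With zero-filling, each missing entry adds $-\log(1 - p_m)$ to the loss and so pushes $p_m$ towards zero. With masking, the entry has no influence on the fit.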

Modeling the proportions of variants

A loose thought: it would be nice to make the loss a function of the $y(t)$ time series, so that we can easily swap the logistic model for something else, e.g., #15.
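
A rough sketch of what that separation could look like: the loss takes an array of proportions of shape `(T, V)` and does not care which model produced it. All names here are hypothetical, the softmax-of-linear-logits form is my assumption of what "the logistic model" means, and the cross-entropy stands in for the quasibinomial loss without the overdispersion factor.

```python
import numpy as np


def logistic_proportions(t, rates, offsets):
    """Softmax of linear-in-time logits; returns an array of shape (T, V)."""
    logits = np.outer(t, rates) + offsets
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def loss_from_proportions(y, A, x, eps=1e-9):
    """Cross-entropy loss as a function of the proportions y only.

    y: (T, V) variant proportions, A: (M, V) mutation matrix,
    x: (T, M) observed mutation frequencies.
    """
    p = np.clip(y @ A.T, eps, 1 - eps)  # (T, M) mutation probabilities
    return float(-np.sum(x * np.log(p) + (1 - x) * np.log1p(-p)))


A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
t = np.linspace(0.0, 10.0, 5)

# Proportions from the (assumed) logistic model.
y_logistic = logistic_proportions(
    t, rates=np.array([0.0, 0.3]), offsets=np.array([0.0, -1.0])
)

# Synthetic, noiseless observations generated from the logistic proportions.
x = y_logistic @ A.T

# Any other model only needs to return a (T, V) array of proportions.
y_flat = np.full_like(y_logistic, 0.5)

loss_logistic = loss_from_proportions(y_logistic, A, x)
loss_flat = loss_from_proportions(y_flat, A, x)
```

Swapping the growth model then only means replacing `logistic_proportions` with another function that returns proportions; the loss code stays untouched.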

@pawel-czyz pawel-czyz added the enhancement New feature or request label Nov 14, 2024