Conversion Rate A B Testing #25

Open · utterances-bot opened this issue Jan 8, 2022 · 7 comments

utterances-bot commented Jan 8, 2022
Conversion Rate A B Testing

https://blog.alexandervolkmann.com/2022/01/05/conversion-rate-A-B-testing.html

jthaman commented Jan 8, 2022

I make an intuitive argument that p is not estimable in this model. What do you think?

1 − p is like the proportion of no-conversions in the data, or the proportion of lines with arrowheads in your figure. But almost any value of p can be consistent with the data you collect up to 7 days because, for all we know, every censored observation could be a no-conversion. Equally consistent with the data, every censored observation could be a conversion. The data set just doesn't have enough information to tell you about p. In terms of your figure, you only see data inside the blue box, so you learn nothing about how many arrowhead lines you got.

I guess I need to get some data and fit the model because it just seems impossible to me, unless the prior is jerking p around too much.

volkale (Owner) commented Jan 9, 2022

Thanks for your comment @jthaman.
Note that everything is based on the model assumptions, i.e. we have a conversion probability p, and the time lag (conditional on a conversion) is distributed according to a zero-inflated geometric distribution. You are right that if the observation window is not large enough relative to the "typical" lag (or, in the extreme case, if we don't observe any conversions at all), the data are compatible with many different parameters of this model class, and they won't be very informative for your posterior inference.
In the simulated data that I used, however, there were enough positive (i.e. conversion) examples to infer the value of the parameter p to the precision depicted in the posterior distribution plot. Hope that answers the question.
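For illustration, here is a minimal sketch of that data-generating process: each user converts with probability p, converted users get a lag drawn from a zero-inflated geometric distribution, and we only observe conversions that land inside the 7-day window. The parameter values and variable names are made up for the example, not the ones from the blog post.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameters, not those used in the blog post
p = 0.3        # conversion probability
pi0 = 0.2      # zero-inflation: probability of converting on day 0
theta = 0.4    # success probability of the geometric lag component
horizon = 7    # observation window in days
n = 10_000     # number of users

converts = rng.random(n) < p
# Zero-inflated geometric lag: day 0 with probability pi0,
# otherwise a Geometric(theta) number of days (support 1, 2, ...)
lag = np.where(rng.random(n) < pi0, 0, rng.geometric(theta, size=n))
lag = np.where(converts, lag, np.inf)   # non-converters never convert

observed = lag <= horizon               # conversion seen within the window
print("true p:", p)
print("naive 7-day conversion rate:", observed.mean())
```

The naive rate understates p because some conversions arrive after day 7, but the observed conversion days pin down the lag distribution, and that is what makes p recoverable.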

volkale (Owner) commented Jan 9, 2022

I added a script that lets you reproduce the results and plots of this blog post here. Unfortunately, I made the rookie mistake of not setting a random seed when I produced the simulations for the article >_< , but you should get very similar numbers and plots.
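For anyone rerunning the script, seeding the generator up front is all it takes to make the simulation reproducible (a generic sketch, not a line from the actual script):

```python
import numpy as np

rng = np.random.default_rng(2022)  # fixed seed: every rerun draws the same samples
```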

jthaman commented Jan 10, 2022

Thanks for the script, and the blog post. I'll try to work through some examples to better demonstrate the estimability issue. I don't speak Python, so it will take some time…

jthaman commented Jan 27, 2022

I was able to reproduce some interesting results on my own in R, so I retract my earlier statements about the model not being estimable.

You might be interested to know that this work falls into the field of cure models: https://www.annualreviews.org/doi/abs/10.1146/annurev-statistics-031017-100101
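For readers who want to check the estimability claim directly, one way to sketch it is as a maximum-likelihood problem (in Python, the blog's language, rather than the full Bayesian fit): a censored user contributes (1 − p) + p · P(lag > 7) to the likelihood, while a user converting on day t contributes p · P(lag = t). The exact zero-inflated-geometric parameterization and all variable names here are assumptions for the example, not taken from the post.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
HORIZON, N = 7, 10_000
p_true, pi0_true, theta_true = 0.3, 0.2, 0.4   # hypothetical truth

# Simulate as in the earlier sketch
converts = rng.random(N) < p_true
lag = np.where(rng.random(N) < pi0_true, 0, rng.geometric(theta_true, size=N))
lag = np.where(converts, lag, np.inf)
conv_days = lag[lag <= HORIZON].astype(int)    # observed conversion days
n_censored = int((lag > HORIZON).sum())        # no conversion within the window

def neg_log_lik(params):
    p, pi0, theta = params
    # zero-inflated geometric pmf: P(0) = pi0, P(t) = (1-pi0)*theta*(1-theta)**(t-1)
    f = np.where(conv_days == 0, pi0,
                 (1 - pi0) * theta * (1 - theta) ** (conv_days - 1))
    surv = (1 - pi0) * (1 - theta) ** HORIZON  # P(lag > HORIZON) for a converter
    # converted users contribute p*f; censored ones contribute (1-p) + p*surv
    return -(np.log(p * f).sum() + n_censored * np.log((1 - p) + p * surv))

res = minimize(neg_log_lik, x0=[0.5, 0.5, 0.5],
               bounds=[(1e-4, 1 - 1e-4)] * 3, method="L-BFGS-B")
print("true (p, pi0, theta):", (p_true, pi0_true, theta_true))
print("MLE  (p, pi0, theta):", res.x)
```

With enough observed conversions the estimate lands close to the truth, matching the point above that the within-window conversion days identify the lag distribution and, through it, p.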

volkale (Owner) commented Feb 9, 2022

Great :)
Thanks for the pointer and the paper, very interesting!

ploshay commented Feb 27, 2022

I think it is possible to define a conversion window (a week, for example) before running the test, and not take into account anything that happens after that week.
This approach will work if 90–95% of conversions happen within the 7-day period.
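One caveat worth quantifying: the fixed-window estimate targets p times the share of conversions that arrive within the window, not p itself. A back-of-the-envelope sketch with made-up numbers:

```python
p = 0.30   # hypothetical true conversion probability
F7 = 0.92  # hypothetical share of conversions arriving within 7 days
print("fixed-window rate:", p * F7)   # 0.276, i.e. a relative bias of 1 - F7 = 8%
```

As long as that share is similar in both variants, the A/B comparison itself stays meaningful; the absolute rates are just shifted down.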
