Closed lower bound questions are misleading in UI vs API, and possibly unnecessary?? #1685
-
Replies: 6 comments 1 reply
-
@jkraybill you're confusing pdfs and cdfs. Your screenshot shows the pdf, which indeed can be >0 at the lower bound. Without using Dirac deltas (which we don't support), it's impossible for the cdf at a closed lower bound to be anything but 0. This is actually symmetrical with the cdf being 1 at closed upper bounds.

Here is a screenshot of the same question, using the cdf view:

As you can see, the cdf starts at 0 and ends at 1. To be extra clear: currently the first two cdf values for the Community Prediction are
-
@SylvainChevalier thanks so much for this explanation. It helped, but I am still unclear on a couple of things, if you don't mind a couple more questions.

When I submit a CDF, is cdf[0] the P(x <= min) or P(x < min)? In the above example, you said that if x=1, you would get scored on cdf[1] - cdf[0], which to me implies that cdf[0] is the P(x < min). But if that is the case, then wouldn't cdf[200] be P(x < max), making 16 an unscoreable outcome in the above example?

I feel like I'm missing something pretty fundamental here; if this is well documented somewhere, please let me know. Basically, I'm unclear how both 1 and 16 can be valid, scoreable outcomes in the example above, and more generally, in discrete-answer questions with double-closed bounds, how both P(min) and P(max) can be estimable.

Thanks for any assistance you can offer in helping me understand this. Happy to take it to Discord if that's more appropriate.
-
@SylvainChevalier OK, after reading the code and docs more, I think I understand: for all CDFs (closed or not), cdf[i] represents P(X < x) where x is (min + ((max - min) / 200) * i). The only exception is cdf[200], which represents P(X <= max)? If you confirm, sorry I wasted your time with the earlier question!
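To make the index-to-value mapping in the comment above concrete, here is a minimal Python sketch. The function name `grid_point` is hypothetical and not taken from the Metaculus codebase; it only evaluates the formula quoted above, assuming 201 evenly spaced cdf points, and uses the bounds 1 and 16 from the question discussed in this thread.

```python
def grid_point(i, lower, upper, n_intervals=200):
    """Return the x-location that cdf index i refers to.

    Hypothetical helper: assumes 201 evenly spaced cdf points
    (indices 0..n_intervals) between the question bounds.
    Algebraically the same as lower + ((upper - lower) / n_intervals) * i.
    """
    return lower + (upper - lower) * i / n_intervals


# Bounds 1 and 16 are the ones mentioned in the thread.
first = grid_point(0, 1, 16)     # the lower bound
middle = grid_point(100, 1, 16)  # halfway between the bounds
last = grid_point(200, 1, 16)    # the upper bound
```

Under this mapping, index 0 sits exactly on the lower bound and index 200 exactly on the upper bound.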
-
I don't mind questions! I agree this is somewhat confusing, because we're using a continuous question to represent a discrete event. If X was truly a continuous variable that cannot be <1, then we would have

But in our case, it is actually very possible for the outcome to turn out to be 1 (or 16)! There are two ways to think about it:
In the meantime, we can mostly ignore the issue in practice, since what matters for scoring is the value of the pdf you put on the resolution value. And that can be >0, even if your cdf there is indeed 0. The main downside is confusion as soon as one thinks too hard about it. Sorry about that!
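The point about the pdf being >0 where the cdf is 0 can be illustrated with a toy example. This is only a sketch: it assumes the pdf weight on a bucket is the difference between neighbouring cdf values, as the cdf[1] - cdf[0] discussion earlier in the thread suggests. The helper name `bucket_masses` and the 4-point cdf are made up for illustration and say nothing about how scoring is actually implemented.

```python
def bucket_masses(cdf):
    """Probability mass in each bucket, taken as differences of adjacent cdf values."""
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Toy 4-point cdf (a real submission would have 201 points).
cdf = [0.0, 0.4, 0.7, 1.0]
masses = bucket_masses(cdf)

# The cdf at the closed lower bound is 0, yet the first bucket,
# which starts at that bound, still carries positive mass.
lower_bound_cdf = cdf[0]
first_bucket_mass = masses[0]
```

Here `lower_bound_cdf` is 0 while `first_bucket_mass` is positive, which is the situation described above.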
-
Awesome, thanks. So to help me clarify my thinking, I've got another question. Imagine we have a discrete question that has only three possible outcomes [0,1,2], each of which has equal probability, but it's been launched as a continuous question with closed bounds. If I want to maximise my score (and therefore pdf) at each of those resolution values, I believe I would set cdf[1], cdf[101], and cdf[200] to 0.3333, 0.6666, and 1 respectively. cdf[0] would have to be 0, and every other cdf value would be set to the minimum allowable value of cdf[x] = cdf[x-1] + (0.01 / 200). Is that the correct way to maximise my score for this question via the submitted cdf, or am I still missing something?
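For anyone wanting to try this, here is a rough Python sketch of the construction described above. The function and constant names are hypothetical, and the minimum step of 0.01 / 200 is taken from the comment rather than checked against the API. One assumption differs slightly from the numbers in the comment: the sketch splits the mass left over after the minimum steps equally between the three buckets, so cdf[1] and cdf[101] come out a little different from 0.3333 and 0.6666.

```python
N_INTERVALS = 200               # 201 cdf points, indices 0..200
MIN_STEP = 0.01 / N_INTERVALS   # minimum increase per step, as quoted in the comment


def three_outcome_cdf(spike_indices=(1, 101, 200)):
    """Build a 201-point cdf with equal mass on the buckets ending at spike_indices.

    Every other bucket gets only the minimum allowed step.
    """
    n_flat = N_INTERVALS - len(spike_indices)
    # Mass left for the spikes once every flat bucket has its minimum step.
    spike_mass = (1.0 - n_flat * MIN_STEP) / len(spike_indices)

    cdf = [0.0]  # closed lower bound: cdf[0] must be 0
    for i in range(1, N_INTERVALS + 1):
        step = spike_mass if i in spike_indices else MIN_STEP
        cdf.append(cdf[-1] + step)

    cdf[-1] = 1.0  # closed upper bound: remove floating-point drift
    return cdf


cdf = three_outcome_cdf()
```

With these assumptions, the three buckets covering the outcomes 0, 1, and 2 each carry about 0.33005 of the probability, and the remaining 197 buckets carry 0.00005 each.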
-
Yes, that sounds correct.