is it possible to `predict recall` in pure sql? #67

LazerJesus · 2024-04-30T18:14:33Z

the title says it all.
is it possible to predict recall in pure sql?

my usecase is that i'd like to maintain a column recallProbabilityNow and update that every few minutes.
of course, i can just load all the rows and do the compute in python/js, but it would be cool (and fast, i guess) to run this inside the db.

The text was updated successfully, but these errors were encountered:

fasiha · 2024-04-30T19:10:52Z

@LazerJesus which version are we talking about? (I think the answer is going to be "not really" but depending on which version you have different options 😅)

LazerJesus · 2024-04-30T22:14:18Z

i am on v2

fasiha · 2024-08-28T21:24:28Z

First, sorry for the delay! Life got busy (but happy busy 😁).

Second, to do this in SQL we have to use an approximation. We pretend that $E[p^\delta] \approx (E[p])^\delta$.

In words, we know that the expected value of a random variable $p$ (probability of recall) raised to a power is not equal to the expected value itself raised to the power. (There's a brief discussion of this at https://fasiha.github.io/ebisu/#why-we-work-with-random-variables.)

However, in practice, maybe the difference isn't that bad. Here's how to compute this approximation in both Python and SQL: I create a simple SQLite db that contains a timestamp (seconds in the Unix epoch) and the Ebisu model as JSON and then, given the current time, calculate the approximate recall probability entirely in SQLite:

import sqlite3
import time
import json
from math import exp
import ebisu  # type: ignore

HOURS_PER_SECOND = 1 / 3600
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

cursor.execute("""
CREATE TABLE IF NOT EXISTS data (
    quiz TEXT,
    timestamp FLOAT, -- seconds in Unix epoch
    ebisu TEXT
)
""")

quizName = 'foo'
model = ebisu.defaultModel(24, 2.5)  # 24 hours halflife, α = β = 2.5
t = time.time()  # seconds in Unix epoch

cursor.execute("INSERT INTO data (quiz, timestamp, ebisu) VALUES (?, ?, ?)",
               (quizName, t, json.dumps(model)))

elapsedHours = 10
newT = t + elapsedHours / HOURS_PER_SECOND  # some time in the future, Unix seconds

exact = ebisu.predictRecall(model, elapsedHours,
                            exact=True)  # full Bayesian expectation
approx = exp(-((newT - t) * HOURS_PER_SECOND) / model[-1])

# replace the first ? with `strftime('%s', 'now')` if you want SQL to compute the current
# time (seconds in Unix epoch) instead of passing it in
query = """
SELECT exp(-((? - timestamp) / (60 * 60)) / json_extract(ebisu, '$[2]')) AS approx
FROM data
WHERE quiz=?
"""
cursor.execute(query, (newT, quizName))
fromSql = cursor.fetchone()[0]

relativeError = abs(fromSql - approx) / approx
print(f'{approx=:0.6f}, {fromSql=:0.6f} ({relativeError=}; {exact=:0.2f})')

Output:

approx=0.659241, fromSql=0.659241 (relativeError=0.0; exact=0.73)

There's no error between Python and SQL calculating the exp, but you see there's quite a bit of difference between the approximation, 66%, and the exact prediction 73%. Probably in practice this difference isn't that big a deal, since Ebisu v2 is bad at the exact probability (though hopefully the relative ranking is acceptable).

Does this help?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

is it possible to `predict recall` in pure sql? #67

is it possible to `predict recall` in pure sql? #67

LazerJesus commented Apr 30, 2024

fasiha commented Apr 30, 2024

LazerJesus commented Apr 30, 2024

fasiha commented Aug 28, 2024

is it possible to predict recall in pure sql? #67

is it possible to predict recall in pure sql? #67

Comments

LazerJesus commented Apr 30, 2024

fasiha commented Apr 30, 2024

LazerJesus commented Apr 30, 2024

fasiha commented Aug 28, 2024

is it possible to `predict recall` in pure sql? #67

is it possible to `predict recall` in pure sql? #67