-
-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automated retries #1776
base: master
Are you sure you want to change the base?
Automated retries #1776
Conversation
eb65eb5
to
7c875d1
Compare
7c875d1
to
c0157fc
Compare
8757074
to
09cb758
Compare
f79f837
to
ccbde0d
Compare
ccbde0d
to
15c799d
Compare
0da5cad
to
95e4000
Compare
I have tested 95e4000 again. I used this patch to test the locking and make sure only one client at a time will be able to retry. Essentially, it allows retries of QR codes, makes retrying faster and attempts to retry every invoice twice: diff --git a/api/resolvers/wallet.js b/api/resolvers/wallet.js
index 8e6f3f0e..fb9edd2f 100644
--- a/api/resolvers/wallet.js
+++ b/api/resolvers/wallet.js
@@ -469,7 +469,6 @@ const resolvers = {
SELECT * FROM "Invoice"
WHERE "userId" = ${me.id}
AND "actionState" = 'FAILED'
- AND "userCancel" = false
AND "cancelledAt" < now() - ${`${WALLET_RETRY_AFTER_MS} milliseconds`}::interval
AND "cancelledAt" > now() - ${`${WALLET_RETRY_BEFORE_MS} milliseconds`}::interval
AND "paymentAttempt" < ${WALLET_MAX_RETRIES}
diff --git a/lib/constants.js b/lib/constants.js
index 9ec20245..de55ac05 100644
--- a/lib/constants.js
+++ b/lib/constants.js
@@ -201,7 +201,7 @@ export const WALLET_CREATE_INVOICE_TIMEOUT_MS = 45_000
// interval between which failed invoices are returned to a client for automated retries.
// retry-after must be high enough such that intermediate failed invoices that will already
// be retried by the client due to sender or receiver fallbacks are not returned to the client.
-export const WALLET_RETRY_AFTER_MS = 60_000 // 1 minute
+export const WALLET_RETRY_AFTER_MS = 5_000 // 1 minute
export const WALLET_RETRY_BEFORE_MS = 86_400_000 // 24 hours
// we want to attempt a payment three times so we retry two times
export const WALLET_MAX_RETRIES = 2
diff --git a/wallets/index.js b/wallets/index.js
index 42fb55be..ab2580c3 100644
--- a/wallets/index.js
+++ b/wallets/index.js
@@ -255,7 +255,9 @@ function RetryHandler ({ children }) {
return
}
- for (const inv of failedInvoices) {
+ console.log('XXX failedInvoices', failedInvoices)
+
+ for (const inv of [...failedInvoices, ...failedInvoices]) {
try {
await retry(inv)
} catch (err) {
@@ -277,7 +279,7 @@ function RetryHandler ({ children }) {
console.error('retry poll failed:', err)
}
if (!stopped) queuePoll()
- }, NORMAL_POLL_INTERVAL)
+ }, 1_000)
}
const stopPolling = () => {
stopped = true 2025-02-04.19-29-10.mp4 |
#1776 is the year civilization was automatically retied in america |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I'll QA fully tomorrow and fix any nitpicks (if I have any) tomorrow.
// we want to attempt a payment three times so we retry two times | ||
export const WALLET_MAX_RETRIES = 2 | ||
// when a pending retry for an invoice should be considered expired and can be attempted again | ||
export const WALLET_RETRY_TIMEOUT_MS = 60_000 // 1 minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't be subject to multiple WALLET_CREATE_INVOICE_TIMEOUT_MS
during a retry right? I remember you mentioning something like that in a previous pr. (Just double checking that this time lock isn't too short.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohhh, good catch!
We actually can be subject to multiple WALLET_CREATE_INVOICE_TIMEOUT_MS
during a retry because the timeout is applied in walletCreateInvoice
which is called in the receiver wallet loop until we found a wallet that returned an invoice:
stacker.news/wallets/server.js
Lines 33 to 51 in 95e9850
for (const { def, wallet } of wallets) { | |
const logger = walletLogger({ wallet, models }) | |
try { | |
logger.info( | |
`↙ incoming payment: ${formatSats(msatsToSats(msats))}`, | |
{ | |
amount: formatMsats(msats) | |
}) | |
let invoice | |
try { | |
invoice = await walletCreateInvoice( | |
{ wallet, def }, | |
{ msats, description, descriptionHash, expiry }, | |
{ logger, models }) | |
} catch (err) { | |
throw new Error('failed to create invoice: ' + err.message) | |
} |
This also affects WALLET_RETRY_AFTER_MS
afaict. It exists to avoid returning failed invoices which will already get retried by a client due to sender or receiver fallbacks (the time between FAILED
and RETRYING
for an invoice). But since we transition an invoice to RETRYING
at the end of the mutation and therefore after we called walletCreateInvoice
, this means an invoice can also stay failed for n
* WALLET_CREATE_INVOICE_TIMEOUT_MS
which is currently larger than WALLET_RETRY_AFTER_MS
for n=2 even though we can assume it will get retried.
I think the solution is to also put a timeout on the full loop instead of only on each external wallet call. To give each wallet independent of order the same amount of time, we could divide the loop timeout by the amount of receiving wallets available. This means that the more receiving wallets you have, the less time is allocated to each wallet. This might be confusing since adding a second receiving wallet can make the first receiving wallet more likely to fail since it effectively has now only half the amount of time to respond as before. But I think having a timeout on the loop that doesn't depend on how many wallets someone has is more important and we can of course tweak the timeouts as we like.
For example, we could use a 1 minute timeout on the full loop so if you have two receiving wallets, each wallet gets 30 seconds to return an invoice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is at least less of an issue than I thought initially. Worst case here is that some unused invoices get created. Initially I thought that multiple payments might be made, but only one call to retryPaidAction
will succeed because the invoice is transitioned (and returned) conditional on it being in a FAILED
state.
tl;dr we can keep this as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think doing #1734 would be a better idea than any more messing with timeouts anyway
If there are no send wallets attached, payment failures never show up in notifications because they are never retried. I think we'll probably want another conditional on the notification that says: if it's been 10 minutes (or whatever) and 0 retries have been attempted, display notification. Perhaps we can reuse |
Also as noted in chat, the |
From chat: I think the move to deal with having a sending wallet or not is:
|
4f7d9d4
to
2c14275
Compare
2c14275 now immediately shows failed payments for posts or comments in /notifications and also allows auto-retries for them: 2025-02-09.15-17-00.mp4I also made sure that auto-retries work even if you already retried manually once. It works because we only increment the payment counter if it was automatically retried. This is also why we can retry manually forever. However, this commit does implement auto-retries even if
I think it's better to have automated retries for everything. It doesn't mess with showing something in the notifications before all automated retries were attempted. If you're currently looking at your notifications and an invoice is automatically retried, it will update the notification you're looking at. Or am I missing something why we shouldn't allow automated retries for We don't do it for zaps because there we don't know if they expected that the zap gets aborted if they close the QR code. But if you posted something and already tried to pay once, I think we can assume they want to continue trying to pay for it.1 edit: I still want to think more about this but it's definitely good to know that we can never accidentally pay twice thanks to our lock in retries as mentioned in #1776 (comment) (but I also want to think through this myself more). Footnotes
|
Description
close #1492 based on #1785, #1787
All failed invoices are returned to every client periodically via a new query
failedInvoices
. To make sure an invoice is only retried by one client at a time, we implemented expiring locks: we first try to update a new timestamp columnretryPendingSince
on the invoice. Only one client at a time will be able to successfully update the invoice. This lock expires after a minute after which another retry for the same invoice can be attempted by a client.To stop after three payment attempts (= two retries), a new integer column
Invoice.paymentAttempt
tracks at which payment attempt we are. This number is increased when we retry an invoice and passnewAttempt: true
which we do when we retry these fetched failed invoices. When the number is increased, the payment will start from the beginning with all sender and receiver wallets available.TODO:
added
userCancel
column, see #1785client only polls when it has send wallets
added
"cancelledAt" < now() - interval '${WALLET_RETRY_AFTER_MS} milliseconds'
filteradded
WALLET_RETRY_BEFORE_MS
used in this filter:new TODO since feedback:
hasNewNotes
Additional Context
see https://github.com/stackernews/stacker.news/pull/1776/files#r1907791409Checklist
Are your changes backwards compatible? Please answer below:
yes
On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:
7
. Tested automated retries for posting, replies and zapping with this patch:Simulate multiple clients with this patch:
Test p2p zaps with this patch that makes forwards fail and disables the fallback to CCs:
For frontend changes: Tested on mobile, light and dark mode? Please answer below:
n/a
Did you introduce any new environment variables? If so, call them out explicitly here:
no