feature: network/host rate limiting #1413
Comments
While working on bug #1376 I noticed that the code is aggressive about fetching an icon for the feed. It fires off a number of requests into the high-priority queue searching for an icon. This does not cover the case where more than one feed points at the same server: the same server may be hit several times within a small time window. I noticed this with, I think, feedproxy.com, a service a number of sites use for serving feeds. Occasionally it would return a 4xx when the number of requests exceeded a limit. With or without PR #1398, it should be possible, for simple cases, to defer a query to a server unless X seconds have elapsed since it was last contacted. It may be difficult to implement if the same server has one or more aliases serviced by a single network blocking device.
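The "defer unless X seconds have elapsed" idea above can be sketched as a small per-host throttle. This is a minimal illustration in Python (Liferea itself is written in C; the class and method names here are hypothetical, not Liferea's actual code):

```python
import time
from urllib.parse import urlparse

class DomainThrottle:
    """Defer requests to a host unless `min_interval` seconds have
    elapsed since that host was last contacted (hypothetical sketch)."""

    def __init__(self, min_interval=5.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock        # injectable for testing
        self.last_contact = {}    # host -> timestamp of last request

    def delay_for(self, url):
        """Seconds the caller should wait before fetching `url`
        (0.0 means: go now, and the contact is recorded)."""
        host = urlparse(url).hostname
        now = self.clock()
        last = self.last_contact.get(host)
        if last is not None and now - last < self.min_interval:
            return self.min_interval - (now - last)  # caller defers
        self.last_contact[host] = now                # record contact
        return 0.0
```

The caller would re-queue any URL that gets a non-zero delay; this handles the simple case but, as noted above, not server aliases that share one blocking device.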
the other thing that could be done, in addition to what i proposed, is to add an option to limit the number of simultaneous requests - this isn't the perfect answer in the case of a server having aliases, but it would help
@atomGit When bitchute.com blocks, does it send HTTP 429? If yes, support for reacting to HTTP 429 was added in the previous release.
i'd have to create another test case because i don't recall if the response was 429, but if that is the response, isn't it too late to do anything about it by that point? how did your fix address that?
@atomGit I just implemented normal HTTP 429 handling. On the first HTTP 429 the client makes no further requests to the domain until a given time. The time is either a default back-off interval of 5 min or the interval specified in the server's response. With this, the server in the worst case gets only one unwanted request every 5 minutes.
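The back-off behaviour described above can be sketched as a per-host block table. This Python sketch assumes the standard `Retry-After` mechanism for HTTP 429 (RFC 6585); the class and method names are hypothetical, not Liferea's actual implementation:

```python
import time
from urllib.parse import urlparse

DEFAULT_BACKOFF = 5 * 60  # default cool-down: 5 minutes

class BackoffTable:
    """After an HTTP 429, skip the domain until its cool-down expires."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.blocked_until = {}   # host -> earliest next request time

    def allowed(self, url):
        host = urlparse(url).hostname
        return self.clock() >= self.blocked_until.get(host, 0.0)

    def note_429(self, url, retry_after=None):
        """Record a 429; `retry_after` is the Retry-After value in
        seconds, if the server sent one."""
        host = urlparse(url).hostname
        delay = retry_after if retry_after is not None else DEFAULT_BACKOFF
        self.blocked_until[host] = self.clock() + delay
```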
ah, ok - that wouldn't work for me since i fetch all feeds in one go once or twice a day - i keep auto-updating disabled

i think if you shuffled the urls to query and had an option to limit the number of simultaneous requests, that might work ... or ... if feeds are grouped by domain (alphabetically, for ex.), then adding a configurable delay between multiple requests to the same domain might work ... or maybe some combination thereof

ps: openrss.org is another domain that might be very finicky about rapid, consecutive requests (openrss can generate feeds for some sites that don't offer them)
@atomGit I understand your thinking, but I do not see how to automatically choose the right rate limit, and I do not want to maintain more complex network stack logic. I'll think a bit on a maximum-rate-per-domain logic, especially for background requests.
liferea suffers from the same problem some other readers do; it updates feeds too quickly, and this can cause various problems when a given host is hit with multiple requests in quick succession
bitchute.com is one such example where, if there are more than x requests in n seconds (and i don't know what x and n are), then feed fetching is temporarily blocked
i had the same problem in a script i wrote to check for broken hyperlinks in a website and got around it by first shuffling the array of urls, then keeping a rolling list of urls with a timestamp for when they were queried, and dropping the next url to be checked to the bottom of the array if the same domain was checked less than x seconds ago
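The approach just described — shuffle the URLs, track a per-domain timestamp, and push a URL to the back of the queue if its domain was contacted too recently — can be sketched as follows. This is an illustrative Python reconstruction, not the commenter's actual script; `check` is a caller-supplied per-URL routine, and the clock/sleep parameters exist only to make the sketch testable:

```python
import random
import time
from collections import deque
from urllib.parse import urlparse

def check_all(urls, check, min_gap=5.0, clock=time.monotonic, sleep=time.sleep):
    """Visit every URL, keeping same-domain visits at least
    `min_gap` seconds apart."""
    queue = deque(random.sample(urls, len(urls)))   # shuffled copy
    last_seen = {}                                  # host -> timestamp
    while queue:
        url = queue.popleft()
        host = urlparse(url).hostname
        last = last_seen.get(host)
        wait = 0.0 if last is None else min_gap - (clock() - last)
        if wait > 0:
            if any(urlparse(u).hostname != host for u in queue):
                queue.append(url)   # drop to the bottom, try another domain
                continue
            sleep(wait)             # only this domain left: wait it out
        last_seen[host] = clock()
        check(url)
```

When other domains remain in the queue, a too-soon URL is simply re-queued at the bottom; only when nothing but the throttled domain is left does the loop actually sleep.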