Context
I use the package to distinguish crawlers from human users in an HTTP server. The goal is to prevent crawlers from "spoiling" one-time links shared in Discord and similar chats, which fetch every link posted to a chat in order to build a preview. Because the link is one-time, the crawler's request consumes it, and it no longer opens when a human user clicks it. I solved this by blocking access from crawlers to such links. If you need more details, please see starius/pasta#8.
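To illustrate the setup, here is a minimal sketch of such a handler. The `isCrawler` function is a hypothetical stand-in for whatever User-Agent matching this package actually exposes; only the blocking logic around it reflects the approach described above.

```go
package main

import "net/http"

// isCrawler is a placeholder for the package's User-Agent matcher;
// it is assumed here, not part of this issue.
func isCrawler(userAgent string) bool {
	// ... match userAgent against the crawler patterns ...
	return false
}

func oneTimeLinkHandler(w http.ResponseWriter, r *http.Request) {
	if isCrawler(r.UserAgent()) {
		// A chat preview bot must not consume the one-time link,
		// so refuse the request instead of serving the content.
		http.Error(w, "previews are not served for one-time links", http.StatusForbidden)
		return
	}
	// ... serve the content and invalidate the link ...
}
```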
Danger of false positives
If a legitimate browser sends a User-Agent that accidentally matches one of the patterns, the user won't be able to access the link, because the site will treat the request as originating from a crawler.
I expect other users of this package would also benefit from minimizing false positives.
Proposed solution
Let's add a CI test that runs the most common User-Agents through the patterns and fails if any of them matches; a sketch of such a test follows below.
The list of User-Agents can be loaded from here: https://github.com/microlinkhq/top-user-agents/tree/master/src
If somebody adds a pattern that matches any of them, the problem will be detected early and prevented.
Likewise, if a popular browser starts sending a User-Agent that accidentally matches one of the patterns, that will also trigger a test failure.
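A rough sketch of what that test could look like, assuming the User-Agent list has been vendored into `testdata/top-user-agents.json` as a JSON array of strings (the format used by microlinkhq/top-user-agents) and reusing the hypothetical `isCrawler` matcher from above:

```go
package crawlertest

import (
	"encoding/json"
	"os"
	"testing"
)

func TestNoFalsePositivesOnTopUserAgents(t *testing.T) {
	// Load the vendored list of the most common real-browser User-Agents.
	data, err := os.ReadFile("testdata/top-user-agents.json")
	if err != nil {
		t.Fatalf("loading User-Agent list: %v", err)
	}

	var userAgents []string
	if err := json.Unmarshal(data, &userAgents); err != nil {
		t.Fatalf("parsing User-Agent list: %v", err)
	}

	for _, ua := range userAgents {
		if isCrawler(ua) {
			// A popular browser User-Agent matched a crawler pattern:
			// exactly the false positive this test is meant to catch.
			t.Errorf("User-Agent wrongly classified as crawler: %q", ua)
		}
	}
}
```

Whether the list is vendored or downloaded during CI is a separate choice; vendoring keeps the test deterministic, while fetching the latest list also catches newly popular browsers.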