We should consider factoring the absolute number of changed characters or words into how textual changes contribute to priority. In extremely large pages, even a large change (which is worth looking at) can seem small percentage-wise. For example, only 1.1% of the text here changed, but that’s still 1,785 characters!
Maybe the easiest way to do this is to put a ceiling on how many characters of a page we’ll consider, e.g. pretend a page can never be longer than 5,000 (?) characters. That way, the example change above would equate to 35.7% changed rather than 1.1% changed.
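A minimal sketch of the ceiling idea, to make the arithmetic concrete. This is not the code in `analyze.py`; the function name `capped_change_ratio`, its parameters, and the 5,000-character default are hypothetical, and the page length used in the example (about 162,273 characters) is only back-calculated from 1,785 characters being roughly 1.1% of the text.

```python
def capped_change_ratio(changed_chars: int, total_chars: int,
                        max_length: int = 5000) -> float:
    """Return the fraction of text changed, treating any page longer than
    `max_length` characters as if it were exactly `max_length` long.

    All names and the default ceiling are illustrative assumptions.
    """
    if total_chars <= 0:
        # Nothing to compare against, so report no change.
        return 0.0
    # Apply the ceiling: very long pages are measured against max_length.
    effective_length = min(total_chars, max_length)
    # A change bigger than the ceiling should not exceed 100%.
    return min(changed_chars / effective_length, 1.0)


# Approximate page length inferred from the numbers in this issue.
page_length = 162_273
uncapped = capped_change_ratio(1785, page_length, max_length=page_length)
capped = capped_change_ratio(1785, page_length)
print(f"uncapped: {uncapped:.1%}, capped: {capped:.1%}")
```

Pages shorter than the ceiling are unaffected, so this only boosts the score for changes on very long pages. Clamping at 1.0 is a design choice here so that the result can still be read as a percentage.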
Example change: https://monitoring.envirodatagov.org/page/6767f063-29f7-4c50-93d0-b851d0292c98/4da08f36-ab67-463d-8517-cf191857dc02..0eae6081-9fac-4f00-b914-f19c0218e7fe
Currently, we only look at the percentage changed:
web-monitoring-task-sheets/analyst_sheets/analyze.py, lines 324 to 325 in 54a6759