You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This note describes a hash-based mass-searching algorithm, finding (count, location of first match) entries from a dictionary against a string $s$ of length $n$. The presented implementation makes use of all substrings of $s$ whose lengths are powers of $2$ to construct an offline algorithm that can, in some cases, reach a complexity of $O(n \log^2n)$ even if there are $O(n^2)$ possible matches. If there is a limit on the dictionary size $m$, then the precomputation complexity is $O(m + n \log^2n)$, and the search complexity is bounded by $O(n\log^2n + m\log n)$, even if it performs in practice like $O(n\log^2n + \sqrt{nm}\log n)$. Other applications, such as finding the number of distinct substrings of $s$ for each length between $1$ and $n$, can be done with the same algorithm in $O(n\log^2n)$.