ExpoSizeStringSearch

Exponential Length Substrings in Pattern Matching

This note describes a hash-based mass-searching algorithm that, for each entry of a dictionary, finds the pair (count, location of first match) against a string $s$ of length $n$. The presented implementation uses all substrings of $s$ whose lengths are powers of $2$ to construct an offline algorithm that can, in some cases, reach a complexity of $O(n \log^2 n)$ even if there are $O(n^2)$ possible matches. If the dictionary size is limited to $m$, the precomputation complexity is $O(m + n \log^2 n)$ and the search complexity is bounded by $O(n \log^2 n + m \log n)$, although in practice it behaves like $O(n \log^2 n + \sqrt{nm} \log n)$. Other applications, such as counting the distinct substrings of $s$ for each length between $1$ and $n$, can be handled by the same algorithm in $O(n \log^2 n)$.
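To make the role of power-of-two-length substrings concrete, here is a minimal C++ sketch. It is not the repository's implementation: every name in it (`Occurrence`, `powerOfTwoChunks`, `buildPowerOfTwoTable`, `lookupPowerOfTwoPattern`) is hypothetical, and it uses a single-modulus polynomial hash, which can collide on adversarial input. It only shows two ingredients: hashing every substring of $s$ whose length is a power of $2$ (about $n \log n$ of them), and splitting a pattern length into power-of-two chunks. Chaining chunks together to answer patterns of arbitrary length, which the algorithm in this note relies on, is left out.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Illustrative sketch only; all names are hypothetical, not from the repository.
constexpr std::uint64_t kMod = 1000000007ULL;
constexpr std::uint64_t kBase = 911382323ULL;  // arbitrary base smaller than kMod

struct Occurrence {
    int count = 0;   // number of matches in s
    int first = -1;  // 0-based start of the first match, -1 if none
};

// Key: (substring length, polynomial hash of the substring).
using Table = std::map<std::pair<std::uint64_t, std::uint64_t>, Occurrence>;

// Chunk lengths for a pattern of length len, largest power of two first
// (for example 13 -> 8, 4, 1).
std::vector<std::uint64_t> powerOfTwoChunks(std::uint64_t len) {
    std::vector<std::uint64_t> chunks;
    for (int bit = 63; bit >= 0; --bit) {
        if ((len >> bit) & 1) chunks.push_back(std::uint64_t{1} << bit);
    }
    return chunks;
}

// Polynomial hash of a whole string, consistent with the window hashes below.
std::uint64_t hashOf(const std::string& t) {
    std::uint64_t h = 0;
    for (unsigned char c : t) h = (h * kBase + c + 1) % kMod;
    return h;
}

// Records (count, first position) for every substring of s whose length is a
// power of two, using prefix hashes to hash each window in O(1).
Table buildPowerOfTwoTable(const std::string& s) {
    const std::uint64_t n = s.size();
    std::vector<std::uint64_t> prefix(n + 1, 0), power(n + 1, 1);
    for (std::uint64_t i = 0; i < n; ++i) {
        prefix[i + 1] = (prefix[i] * kBase + static_cast<unsigned char>(s[i]) + 1) % kMod;
        power[i + 1] = power[i] * kBase % kMod;
    }
    Table table;
    for (std::uint64_t len = 1; len <= n; len *= 2) {
        for (std::uint64_t i = 0; i + len <= n; ++i) {
            const std::uint64_t h =
                (prefix[i + len] + kMod - prefix[i] * power[len] % kMod) % kMod;
            Occurrence& occ = table[{len, h}];
            if (occ.count == 0) occ.first = static_cast<int>(i);
            ++occ.count;
        }
    }
    return table;
}

// Answers a pattern whose length is exactly a power of two with one lookup.
// Patterns of any other length are reported as not found by this sketch.
Occurrence lookupPowerOfTwoPattern(const Table& table, const std::string& t) {
    const auto it = table.find({static_cast<std::uint64_t>(t.size()), hashOf(t)});
    return it == table.end() ? Occurrence{} : it->second;
}
```

For instance, after `buildPowerOfTwoTable("abababab")`, looking up `"ab"` yields a count of 4 with first match at position 0. A `std::map` keeps the sketch short; a hash table would avoid the extra logarithmic factor on each access.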

comparison
llm_intersection_local
nlp
nonImplicitImplementations
snort_benchmark
.gitattributes
E3S_draft.pdf
LICENSE
README.md
aggro.cpp
countDistinctSubstrings.cpp
offline.cpp
substringDistribution.cpp

vlad-ulmeanu01/ExpoSizeStringSearch
