Add method to output all matched substrings #5

Open
mganss opened this issue Mar 20, 2015 · 5 comments
@mganss

mganss commented Mar 20, 2015

Something like this:

public IEnumerable<KeyValuePair<string, int>> Search(string text)
{ ... }

where the key is the matched pattern and the value is the start index into the searched string.
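To make the proposed contract concrete, here is a naive reference sketch that returns every occurrence of every keyword together with its start index. `NaiveSearch` and its extra `keywords` parameter are hypothetical names used only for illustration; this is a brute-force scan, not Aho-Corasick and not the library's code, though it could serve as a test oracle for a faster implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of AhoCorasick.Net.
public static class NaiveSearch
{
    // Yields (keyword, startIndex) for every occurrence of every keyword in text.
    // Results are ordered by start index, then by the order of the keywords given.
    public static IEnumerable<KeyValuePair<string, int>> Search(
        IEnumerable<string> keywords, string text)
    {
        var list = new List<string>(keywords);
        for (var start = 0; start < text.Length; start++)
        {
            foreach (var keyword in list)
            {
                // Skip empty keywords and keywords that would run past the end.
                if (keyword.Length == 0 || start + keyword.Length > text.Length)
                    continue;

                // Ordinal comparison of text[start .. start + keyword.Length) with keyword.
                if (string.CompareOrdinal(text, start, keyword, 0, keyword.Length) == 0)
                    yield return new KeyValuePair<string, int>(keyword, start);
            }
        }
    }
}
```

The cost is proportional to text length times total keyword length, so this is only useful for checking results on small inputs.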

I tried adding this method assuming that IsFinished means a node is in the dictionary ("blue node" as in the description on Wikipedia). But that doesn't seem to be the case so I gave up 😢

@alexandrnikitin
Owner

The IsFinished field indicates that some keyword ends at that node. It's possible to add such a feature, but it seems too specific. Also, my intention was to make the implementation as fast as possible, so it would be better to measure the performance impact first.

@mganss
Author

mganss commented Mar 21, 2015

IMO finding all the matching substrings and their locations is a fundamental feature of the Aho-Corasick algorithm and a requirement for many use cases.

@alexandrnikitin
Owner

Pushed a new version with the "Search" feature. You can try it using Install-Package AhoCorasick.Net -Version 0.6.0-beta

@mganss
Author

mganss commented Mar 30, 2015

This

var ac = new AhoCorasickTree(new[] { "a", "ab", "bab", "bc", "bca", "c", "caa" });
var m = ac.Search("abccab").ToList();

finds only "a", "ab", "bc", "a", "ab". It should find "a", "ab", "bc", "c", "c", "a", "ab". This is the example from Wikipedia.
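The two missing "c" matches are keywords that end inside a longer path, so they are only reachable through failure links. The following is a minimal, self-contained sketch of one textbook way to report them, using dictionary-suffix links. `AhoCorasickSketch`, `Node`, `Failure`, `DictSuffix` and `Keyword` are hypothetical names chosen for this illustration; this is not the AhoCorasick.Net implementation, and it favors clarity over speed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration class; not the AhoCorasick.Net implementation.
public sealed class AhoCorasickSketch
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public Node Failure;     // node for the longest proper suffix that is a trie path
        public Node DictSuffix;  // nearest node on the failure chain where a keyword ends
        public string Keyword;   // non-null if a keyword ends exactly at this node
    }

    private readonly Node _root = new Node();

    public AhoCorasickSketch(IEnumerable<string> keywords)
    {
        // 1. Build the trie.
        foreach (var keyword in keywords)
        {
            if (string.IsNullOrEmpty(keyword)) continue;
            var node = _root;
            foreach (var c in keyword)
            {
                if (!node.Children.TryGetValue(c, out var next))
                {
                    next = new Node();
                    node.Children[c] = next;
                }
                node = next;
            }
            node.Keyword = keyword;
        }

        // 2. Breadth-first pass to set failure and dictionary-suffix links.
        var queue = new Queue<Node>();
        foreach (var child in _root.Children.Values)
        {
            child.Failure = _root;
            queue.Enqueue(child);
        }
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            foreach (var pair in node.Children)
            {
                var child = pair.Value;
                var f = node.Failure;
                while (f != _root && !f.Children.ContainsKey(pair.Key)) f = f.Failure;
                child.Failure = f.Children.TryGetValue(pair.Key, out var target) ? target : _root;
                child.DictSuffix = child.Failure.Keyword != null
                    ? child.Failure
                    : child.Failure.DictSuffix;
                queue.Enqueue(child);
            }
        }
    }

    // Yields (keyword, startIndex) for every match, ordered by end position.
    public IEnumerable<KeyValuePair<string, int>> Search(string text)
    {
        var node = _root;
        for (var i = 0; i < text.Length; i++)
        {
            var c = text[i];
            while (node != _root && !node.Children.ContainsKey(c)) node = node.Failure;
            if (node.Children.TryGetValue(c, out var next)) node = next;

            // Report the keyword ending here (if any), then every shorter keyword
            // that is a suffix of the current path.
            for (var hit = node.Keyword != null ? node : node.DictSuffix;
                 hit != null;
                 hit = hit.DictSuffix)
            {
                yield return new KeyValuePair<string, int>(
                    hit.Keyword, i - hit.Keyword.Length + 1);
            }
        }
    }
}
```

With the keywords from the example above, `new AhoCorasickSketch(keywords).Search("abccab")` should yield "a", "ab", "bc", "c", "c", "a", "ab" with start indexes 0, 0, 1, 2, 3, 4, 4. Reporting only the node where the automaton currently sits, without walking the dictionary-suffix chain, skips the "c" that ends together with "bc".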

@alexandrnikitin
Owner

Pushed the fix in version 0.6.1-beta.
