Filter a VertexClustering object based on cluster sizes #507
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions. |
Hello! I would like to contribute to igraph and noticed that this is labeled as a good first issue, which is what I am looking for, since it will be my very first contribution. Is this still open and, if so, is everything I need to get started in the wiki? |
Yes, the issue is still open and feel free to get started on it. There's no wiki page but the README contains some instructions for getting started with development (after checking out the repo and all its submodules); see here. I'd suggest that you open a PR with your draft as soon as possible (and mark it as draft) and commit often so we can give you early feedback if needed. As for this specific task, you can use my SO response as a starting point, but it needs to be made a proper method of the |
Wouldn't it make sense to try to design some sort of generalized filtering mechanism instead of a one-trick method that only handles sizes? What if:
Generally speaking, the area where most open source projects don't do so well is having a coherent, predictable, well thought out design. If you work with a system like Mathematica often enough, this becomes painfully obvious. I would like igraph to stand out from the crowd by being better in this area. |
I was considering this, but I don't want to overstretch the scope of this issue. It's always possible to add such a filtering method later and then rewrite That being said, I can imagine adding a |
Alright, that makes sense. But it's still a useful exercise to think about how a generalization could be formulated and then implemented. Creating the subgraph seems like a heavyweight operation. Would it make sense to add a helper function to the C core which can retrieve the IDs of edges contained within a subgraph induced by some vertices? Then it will become relatively easy to compute things like:
More generally, it might make sense for a filtering method to have access to:
The first is trivial, as the clusters are generally understood in terms of their vertices and the API provides access to these through |
Sounds like a good idea.
I think that all that the filtering function needs access to is the graph object itself and the members of the current cluster being considered. The graph might not even be needed as you can always add it using
This would allow But again, I still maintain that this is outside the scope of the current issue, which only asks for filtering based on cluster sizes, so in order not to derail the current discussion, I'd recommend opening a separate issue for a more generic filtering method. |
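For whoever picks up the separate issue, here is a rough sketch of what a predicate-based filter could look like. It is only an illustration in plain Python: `filter_clusters` and `size_between` are hypothetical names, not existing igraph functions, and the clusters are modelled as plain lists of vertex indices (which is what iterating over a `VertexClustering` yields). The predicate receives the graph and the members of the cluster being considered, as suggested above.

```python
def filter_clusters(clusters, predicate, graph=None):
    """Return the clusters for which predicate(graph, members) is true.

    `clusters` is any iterable of vertex-index lists. `graph` is passed
    through untouched, so a predicate that ignores it works without one.
    (Hypothetical helper, not part of igraph.)
    """
    return [members for members in clusters if predicate(graph, members)]


def size_between(lower, upper):
    """Build a predicate that accepts clusters with lower <= size <= upper."""
    def predicate(graph, members):
        return lower <= len(members) <= upper
    return predicate


# Four clusters of sizes 4, 1, 2 and 1.
clusters = [[0, 1, 2, 6], [3], [4, 5], [7]]
large = filter_clusters(clusters, size_between(2, 4))
print(large)  # [[0, 1, 2, 6], [4, 5]]
```

With a design along these lines, the size-based filter requested in this issue would just be one particular predicate.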
Is anyone still working on this issue currently? |
What is the feature or improvement you would like to see?
It is a common requirement when clustering a large graph to plot only the larger communities but not the smaller ones, as illustrated by this SO question.
Maybe it would be even more flexible if we allowed the user to pass an upper and a lower size limit, or an arbitrary filtering predicate based on the sizes.
I've posted a quick sketch of a possible implementation in the SO question.
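To make the request concrete, here is a minimal sketch of the size-based filtering, written against a plain membership list so that it runs without igraph. This is not the sketch from the SO question, and `filter_membership_by_size` is a hypothetical name; a real implementation would presumably be a method of `VertexClustering`.

```python
from collections import Counter


def filter_membership_by_size(membership, min_size=None, max_size=None):
    """Keep only vertices whose cluster size lies in [min_size, max_size].

    Returns (kept_vertices, new_membership). The surviving clusters are
    renumbered consecutively from 0 in order of first appearance.
    (Hypothetical helper, not part of igraph.)
    """
    sizes = Counter(membership)

    def size_ok(size):
        if min_size is not None and size < min_size:
            return False
        if max_size is not None and size > max_size:
            return False
        return True

    kept_vertices = []
    new_membership = []
    renumbered = {}  # old cluster id -> new cluster id
    for vertex, cluster in enumerate(membership):
        if not size_ok(sizes[cluster]):
            continue
        new_id = renumbered.setdefault(cluster, len(renumbered))
        kept_vertices.append(vertex)
        new_membership.append(new_id)
    return kept_vertices, new_membership


# Cluster sizes: 0 -> 4 vertices, 1 -> 1, 2 -> 2, 3 -> 1.
membership = [0, 0, 0, 1, 2, 2, 0, 3]
kept, new_membership = filter_membership_by_size(membership, min_size=2)
print(kept)            # [0, 1, 2, 4, 5, 6]
print(new_membership)  # [0, 0, 0, 1, 1, 0]
```

In python-igraph the input would be `clustering.membership`, and the two return values could then be fed to `graph.induced_subgraph(kept)` and `VertexClustering(subgraph, new_membership)` for plotting. That last step assumes the induced subgraph keeps the relative order of the retained vertices, which should be checked before relying on it.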
Use cases for the feature
See above; the SO question provides a possible use-case.
References
N/A