Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaced Vector data structure by BitSet in greedy_color #379

Merged
merged 3 commits into from
May 28, 2024

Conversation

AntoineBut
Copy link
Contributor

This PR adresses #378 .

Replacing Vector with a Set yields immediate speedup of 50-100x on my machine.

As the we are using dense sets of integers from 1 to the number of colors to color the graph (i.e relatively small integers), BitSet implementation seems to be the most appropriate here. (and is also faster on my machine, even though I have not tested on an extensive benchmark suite)

Im open to feedback of course (I am still pretty new here) :))

@AntoineBut AntoineBut changed the title Replaced Vector data structure by BitSet Replaced Vector data structure by BitSet in greedy_color May 26, 2024
Copy link

codecov bot commented May 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.31%. Comparing base (615afe4) to head (0104925).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #379   +/-   ##
=======================================
  Coverage   97.31%   97.31%           
=======================================
  Files         120      120           
  Lines        6953     6954    +1     
=======================================
+ Hits         6766     6767    +1     
  Misses        187      187           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@gdalle gdalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the contribution!

src/traversals/greedy_color.jl Show resolved Hide resolved
@gdalle
Copy link
Member

gdalle commented May 27, 2024

@AntoineBut can you check what my change from BitSet to Set does in terms of performance?
It might be faster in theory but slower in practice, at least on instances of moderate size.

@AntoineBut
Copy link
Contributor Author

AntoineBut commented May 28, 2024

@gdalle Here are results from a rapid benchmarking :

Setup :
3 types of synthetic graphs, all in 3 sizes :
1 graph imported from Snap

julia> doro_small = dorogovtsev_mendes(10)
{10, 17} undirected simple Int64 graph

julia> doro_med = dorogovtsev_mendes(300)
{300, 597} undirected simple Int64 graph

julia> doro_large = dorogovtsev_mendes(100_000)
{100000, 199997} undirected simple Int64 graph

julia> bara_small = barabasi_albert(10, 2)
{10, 16} undirected simple Int64 graph

julia> bara_med = barabasi_albert(300, 2)
{300, 596} undirected simple Int64 graph

julia> bara_large = barabasi_albert(100_000, 2)
{100000, 199996} undirected simple Int64 graph

julia> complete_small = complete_graph(10)
{10, 45} undirected simple Int64 graph

julia> complete_med = complete_graph(300)
{300, 44850} undirected simple Int64 graph

julia> complete_large = complete_graph(10_000)
{10000, 49995000} undirected simple Int64 graph

# Real-world data graph 
julia> twitch_graph = loadgraph("/Users/antoine/Documents/GitHub/ParallelGraphs.jl/benchmark/data/large_twitch_edges.csv", "twitch", EdgeListFormat())
{168116, 6797558} directed simple Int64 graph

Results :

  Set{T} BitSet
Barabasi Albert    
Small 662.556 ns (9 allocations: 928 bytes) 411.692 ns (6 allocations: 624 bytes)
Med 13.791 μs (10 allocations: 11.28 KiB) 7.208 μs (7 allocations: 10.98 KiB)
Large 7.714 ms (15 allocations: 3.15 MiB) 4.736 ms (12 allocations: 3.15 MiB)
Dorogostev Mendes    
Small 638.643 ns (9 allocations: 928 bytes) 404.084 ns (6 allocations: 624 bytes)
Med 13.958 μs (10 allocations: 11.28 KiB) 7.229 μs (7 allocations: 10.98 KiB)
Large 7.721 ms (15 allocations: 3.15 MiB) 4.740 ms (12 allocations: 3.15 MiB)
Complete Graph    
Small 827.465 ns (9 allocations: 928 bytes) 479.492 ns (6 allocations: 624 bytes)
Med 378.917 μs (18 allocations: 21.38 KiB) 128.459 μs (7 allocations: 8.44 KiB)
Large 457.570 ms (30 allocations: 438.27 KiB) 123.255 ms (12 allocations: 246.50 KiB)
Twitch Graph    
  46.081 ms (18 allocations: 5.29 MiB) 31.722 ms (12 allocations: 5.29 MiB)

I tried to represent graphs that behave differently with respect to coloring, some of them are colored with only 5 colors (even on thousands of vertices) and the complete graph is colored in N colors.
BitSet was faster in all configurations I tried, with speedup ranging from x1,4 to x3,5.

I hope that adresses your concerns!
Let me know if there is anything else :)

@gdalle
Copy link
Member

gdalle commented May 28, 2024

Just out of curiosity, could you try with a BitVector? Then we'll pick the best and merge. My gut tells me it may be as fast as BitSet sometimes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants