Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert betweenness centrality code to Ruby #999

Open
1 task
ferrisoxide opened this issue Sep 5, 2024 · 2 comments
Open
1 task

Convert betweenness centrality code to Ruby #999

ferrisoxide opened this issue Sep 5, 2024 · 2 comments
Labels

Comments

@ferrisoxide
Copy link
Collaborator

ferrisoxide commented Sep 5, 2024

Description

Convert the existing betweeness_centrality_lambda service to Ruby, to be called directly in the app proper.

This is to both reduce latency as well as allow for larger datasets.

Ideally we should aim to write the algorithm and contribute it to the Ruby equivalent of the Python NetworkX library. Note, there is already a ticket for this within the Ruby NetworkX library. See:

SciRuby/networkx.rb#7

Proposed Solution

  1. Fork https://github.com/SciRuby/networkx.rb
  2. Implement betweenness centrality by porting the algorithm from NetworkX
  3. Create PR and present to the SciRuby team for consideration

Acceptance Criteria

  • MUST produce the same betweenness centrality values as the AWS Lambda service.
@ferrisoxide ferrisoxide converted this from a draft issue Sep 5, 2024
@ferrisoxide
Copy link
Collaborator Author

ferrisoxide commented Sep 5, 2024

DEV NOTE

The following link data:

[[1057, 1081],
 [1057, 1081],
 [1081, 1092],
 [1082, 1092],
 [1085, 1092],
 [1086, 1092],
 [1087, 1092],
 [1088, 1092],
 [1081, 1082],
 [1081, 1085],
 [1081, 1086],
 [1081, 1087],
 [1081, 1088],
 [1081, 1092],
 [1081, 1087],
 [1082, 1087],
 [1085, 1087],
 [1086, 1087],
 [1087, 1088],
 [1087, 1092],
 [1081, 1085],
 [1082, 1085],
 [1085, 1086],
 [1085, 1087],
 [1085, 1088],
 [1085, 1092],
 [1081, 1086],
 [1082, 1086],
 [1085, 1086],
 [1086, 1087],
 [1086, 1088],
 [1086, 1092],
 [1081, 1088],
 [1082, 1088],
 [1085, 1088],
 [1086, 1088],
 [1087, 1088],
 [1088, 1092],
 [1081, 1082],
 [1082, 1085],
 [1082, 1086],
 [1082, 1087],
 [1082, 1088],
 [1082, 1092],
 [883, 1071],
 [883, 1081],
 [883, 1087],
 [883, 1090],
 [883, 1097],
 [883, 1081],
 [1071, 1081],
 [1081, 1087],
 [1081, 1090],
 [1081, 1097],
 [883, 1087],
 [1071, 1087],
 [1081, 1087],
 [1087, 1090],
 [1087, 1097],
 [883, 1071],
 [1071, 1081],
 [1071, 1087],
 [1071, 1090],
 [1071, 1097],
 [883, 1090],
 [1071, 1090],
 [1081, 1090],
 [1087, 1090],
 [1090, 1097],
 [883, 1097],
 [1071, 1097],
 [1081, 1097],
 [1087, 1097],
 [1090, 1097],
 [1060, 1081],
 [1066, 1081],
 [1081, 1089],
 [1081, 1091],
 [1081, 1095],
 [1060, 1095],
 [1066, 1095],
 [1081, 1095],
 [1089, 1095],
 [1091, 1095],
 [1060, 1066],
 [1060, 1081],
 [1060, 1089],
 [1060, 1091],
 [1060, 1095],
 [1060, 1066],
 [1066, 1081],
 [1066, 1089],
 [1066, 1091],
 [1066, 1095],
 [1060, 1091],
 [1066, 1091],
 [1081, 1091],
 [1089, 1091],
 [1091, 1095],
 [1060, 1089],
 [1066, 1089],
 [1081, 1089],
 [1089, 1091],
 [1089, 1095],
 [1061, 1069],
 [1061, 1079],
 [1061, 1081],
 [1061, 1079],
 [1069, 1079],
 [1079, 1081],
 [1061, 1081],
 [1069, 1081],
 [1079, 1081],
 [1061, 1069],
 [1069, 1079],
 [1069, 1081],
 [884, 1081],
 [884, 1081],
 [1064, 1096],
 [1070, 1096],
 [1081, 1096],
 [1087, 1096],
 [1064, 1081],
 [1070, 1081],
 [1081, 1087],
 [1081, 1096],
 [1064, 1087],
 [1070, 1087],
 [1081, 1087],
 [1087, 1096],
 [1064, 1070],
 [1070, 1081],
 [1070, 1087],
 [1070, 1096],
 [1064, 1070],
 [1064, 1081],
 [1064, 1087],
 [1064, 1096],
 [1075, 1081],
 [1075, 1081],
 [897, 1081],
 [897, 1081],
 [883, 1058],
 [883, 1068],
 [883, 1078],
 [883, 1081],
 [883, 1087],
 [883, 1093],
 [883, 1094],
 [883, 1081],
 [1058, 1081],
 [1068, 1081],
 [1078, 1081],
 [1081, 1087],
 [1081, 1093],
 [1081, 1094],
 [883, 1087],
 [1058, 1087],
 [1068, 1087],
 [1078, 1087],
 [1081, 1087],
 [1087, 1093],
 [1087, 1094],
 [883, 1068],
 [1058, 1068],
 [1068, 1078],
 [1068, 1081],
 [1068, 1087],
 [1068, 1093],
 [1068, 1094],
 [883, 1093],
 [1058, 1093],
 [1068, 1093],
 [1078, 1093],
 [1081, 1093],
 [1087, 1093],
 [1093, 1094],
 [883, 1094],
 [1058, 1094],
 [1068, 1094],
 [1078, 1094],
 [1081, 1094],
 [1087, 1094],
 [1093, 1094],
 [883, 1078],
 [1058, 1078],
 [1068, 1078],
 [1078, 1081],
 [1078, 1087],
 [1078, 1093],
 [1078, 1094],
 [883, 1058],
 [1058, 1068],
 [1058, 1078],
 [1058, 1081],
 [1058, 1087],
 [1058, 1093],
 [1058, 1094],
 [1062, 1081],
 [1062, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [1067, 1073],
 [1073, 1081],
 [1073, 1085],
 [1073, 1087],
 [1067, 1081],
 [1073, 1081],
 [1081, 1085],
 [1081, 1087],
 [1067, 1087],
 [1073, 1087],
 [1081, 1087],
 [1085, 1087],
 [1067, 1085],
 [1073, 1085],
 [1081, 1085],
 [1085, 1087],
 [1067, 1073],
 [1067, 1081],
 [1067, 1085],
 [1067, 1087],
 [1074, 1081],
 [1074, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [1070, 1084],
 [1081, 1084],
 [1083, 1084],
 [1070, 1081],
 [1081, 1083],
 [1081, 1084],
 [1070, 1083],
 [1081, 1083],
 [1083, 1084],
 [1070, 1081],
 [1070, 1083],
 [1070, 1084],
 [883, 1059],
 [883, 1063],
 [883, 1076],
 [883, 1080],
 [883, 1081],
 [883, 1081],
 [1059, 1081],
 [1063, 1081],
 [1076, 1081],
 [1080, 1081],
 [883, 1076],
 [1059, 1076],
 [1063, 1076],
 [1076, 1080],
 [1076, 1081],
 [883, 1063],
 [1059, 1063],
 [1063, 1076],
 [1063, 1080],
 [1063, 1081],
 [883, 1059],
 [1059, 1063],
 [1059, 1076],
 [1059, 1080],
 [1059, 1081],
 [883, 1080],
 [1059, 1080],
 [1063, 1080],
 [1076, 1080],
 [1080, 1081]]

Should resolve to the following betweenness centrality values:

{"1057"=>0.0,
 "1081"=>0.7771367521367523,
 "1092"=>0.0,
 "1082"=>0.0,
 "1085"=>0.003418803418803419,
 "1086"=>0.0,
 "1087"=>0.08098290598290597,
 "1088"=>0.0,
 "883"=>0.029487179487179487,
 "1071"=>0.0,
 "1090"=>0.0,
 "1097"=>0.0,
 "1060"=>0.0,
 "1066"=>0.0,
 "1089"=>0.0,
 "1091"=>0.0,
 "1095"=>0.0,
 "1061"=>0.0,
 "1069"=>0.0,
 "1079"=>0.0,
 "884"=>0.0,
 "1064"=>0.0,
 "1096"=>0.0,
 "1070"=>0.0038461538461538464,
 "1075"=>0.0,
 "897"=>0.0,
 "1058"=>0.0,
 "1068"=>0.0,
 "1078"=>0.0,
 "1093"=>0.0,
 "1094"=>0.0,
 "1062"=>0.0,
 "1067"=>0.0,
 "1073"=>0.0,
 "1074"=>0.0,
 "1084"=>0.0,
 "1083"=>0.0,
 "1059"=>0.0,
 "1063"=>0.0,
 "1076"=>0.0,
 "1080"=>0.0}

@ferrisoxide
Copy link
Collaborator Author

DEV NOTE

A naive implementation, based on the RGL gem, was suggested by Github Co-pilot:

require 'rgl/adjacency'
require 'rgl/traversal'
require 'rgl/dijkstra'
require 'rgl/connected_components'

    class Graph
      def initialize
        @graph = RGL::DirectedAdjacencyGraph.new
      end

      def add_edge(source, target)
        @graph.add_edge(source, target)
      end

      def betweenness_centrality
        nodes = @graph.vertices
        centrality = Hash.new(0)

        nodes.each do |s|
          stack = []
          paths = {}
          sigma = Hash.new(0)
          delta = Hash.new(0)
          distance = Hash.new(-1)

          sigma[s] = 1
          distance[s] = 0
          queue = [s]

          while queue.any?
            v = queue.shift
            stack.push(v)
            @graph.adjacent_vertices(v).each do |w|
              if distance[w] < 0
                queue.push(w)
                distance[w] = distance[v] + 1
              end
              if distance[w] == distance[v] + 1
                sigma[w] += sigma[v]
                paths[w] ||= []
                paths[w] << v
              end
            end
          end

          while stack.any?
            w = stack.pop
            if paths[w]
              paths[w].each do |v|
                delta[v] += (sigma[v].to_f / sigma[w]) * (1 + delta[w])
              end
            end
            centrality[w] += delta[w] if w != s
          end
        end

        debugger
        rescale(centrality, @graph.num_vertices, true, directed: !@graph.directed?)
      end

      def rescale(betweenness, n, normalized, directed: false, k: nil, endpoints: false)
        scale = nil

        if normalized
          if endpoints
            if n < 2
              scale = nil  # no normalization
            else
              # Scale factor should include endpoint nodes
              scale = 1.0 / (n * (n - 1))
            end
          elsif n <= 2
            scale = nil  # no normalization b=0 for all nodes
          else
            scale = 1.0 / ((n - 1) * (n - 2))
          end
        else  # rescale by 2 for undirected graphs
          if !directed
            scale = 0.5
          else
            scale = nil
          end
        end

        if scale
          scale *= n / k if k
          betweenness.each do |v, value|
            betweenness[v] *= scale
          end
        end

        betweenness
      end
    end

      graph = Graph.new
      links.each do |(target, source)|
        graph.add_edge(target, source)
      end
      graph.betweenness_centrality

NB This does not produce correct results, but it does identify the nodes that are central

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Backlog
Development

No branches or pull requests

1 participant