JLaborda · JLaborda · Jul 12, 2024 · Jun 12, 2023 · Nov 27, 2023 · Nov 27, 2023
diff --git a/.github/workflows/CI-CD-pipeline.yml b/.github/workflows/CI-CD-pipeline.yml
@@ -46,7 +46,7 @@ jobs:
          cache: maven
       - name: Build with Maven
         run: mvn clean verify --batch-mode
-        # Codecoverage
+        # Code-coverage
       - name: Install dependencies
         run: mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V
       - name: Run tests and collect coverage

diff --git a/.github/workflows/maven-publish.yml b/.github/workflows/maven-publish.yml
@@ -1,7 +1,7 @@
 # This workflow will build a package using Maven and then publish it to GitHub packages when a release is created
 # For more information see: https://github.com/actions/setup-java/blob/main/docs/advanced-usage.md#apache-maven-with-a-settings-path
 
-name: Maven Package
+name: Publish Package
 
 on:
   release:

diff --git a/.gitignore b/.gitignore
@@ -28,3 +28,29 @@ replay_pid*
 /res/networks/others/
 /scripts/
 /results/
+
+# Experiment outputs and errors
+*.e*
+*.o*
+
+# target folder
+/target/*
+
+# Code coverage
+cov.xml
+
+# .idea folder
+.idea/
+
+# .DS_Store
+.DS_Store
+
+# large_datasets
+res/large_datasets/
+
+# parameters folder
+res/parameters/
+
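+# scripts (ignore the folder's contents rather than the folder itself, so the negation below can take effect)
+res/scripts/*
+!res/scripts/experiments/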
diff --git a/README.md b/README.md
@@ -2,9 +2,9 @@
 [![codecov](https://codecov.io/gh/JLaborda/cges/branch/main/graph/badge.svg?token=C9GeO49RsE)](https://codecov.io/gh/JLaborda/cges)
 
 # CGES
-Circular Greedy Equivalence Search (CGES) is a distributed structural learning algorithm for Bayesian Networks developed by Jorge Daniel Laborda, Pablo Torrijos, José M. Puerta and José A. Gámez.
+Circular/Ring Greedy Equivalence Search (CGES or rGES) is a distributed structural learning algorithm for Bayesian Networks developed by Jorge Daniel Laborda, Pablo Torrijos, José M. Puerta and José A. Gámez.
 This repository contains the code implementation of the algorithm described in the research article titled [A Ring-Based Distributed Algorithm for
-Learning High-Dimensional Bayesian Networks]. The algorithm focuses on structural learning of Bayesian Networks in high-dimensional domains, aiming to reduce complexity and improve efficiency. The algorithm is limited to discrete problems.
+Learning High-Dimensional Bayesian Networks](https://link.springer.com/chapter/10.1007/978-3-031-45608-4_10). This algorithm focuses on structural learning of Bayesian Networks in high-dimensional domains, aiming to reduce complexity and improve efficiency. It is limited to discrete problems.
 
 ## Table of Contents
 - [Introduction](#introduction)
@@ -17,8 +17,18 @@ Learning High-Dimensional Bayesian Networks]. The algorithm focuses on structura
 
 ## Introduction
 In this research project, we propose an algorithm, named cGES, for learning Bayesian Networks in high-dimensional domains. The algorithm utilizes a divide-and-conquer approach, parallelism, and fusion techniques to address the challenges associated with structural learning in high-dimensional datasets. The code in this repository implements the cGES algorithm and provides a practical tool for researchers and practitioners interested in Bayesian Network learning.
+
 ![Figura-cges-mejorado](https://github.com/JLaborda/cges/assets/15078416/5c16635d-3ef2-4f46-bb87-4c6863f24cc6)
 
+We have also added algorithms that follow a star topology to the project. We call these algorithms Star Greedy Equivalence Search (sGES). The algorithms designed with this topology are the following:
+* Random Broadcasting (srGES): The input connections between processes are determined randomly at the end of each iteration. In other words, each process receives the DAG of a randomly selected process as its input.
+* Best Broadcasting (sbGES): The best DAG of the iteration is passed as input to each process.
+In both scenarios, we avoid self-feedback by prohibiting a process's output from being its own input in the next iteration. The following figure shows the star topology structure:
+
+![cges-star](https://github.com/user-attachments/assets/c72f8a5f-4a16-47b4-9b78-38612f6568d3)
+
+We have also tested other broadcasting strategies, but they are much less efficient.
+
 ## Requirements
 - [Java 8](https://www.oracle.com/java/technologies/java8.html)
 - [Tetrad 7.1.2-2](https://github.com/cmu-phil/tetrad) (Provided in this repository)
@@ -37,38 +47,63 @@ docker build -t cges .
 ```
 
 ## Usage
-1. [Instructions for how to use the code]
-2. [Description of input/output formats]
 The parameters you need to provide to either the jar file, or to the docker container are: 
-   1. The path to the file with the parameters you want your experiments to execute.
-   2. The index (number of line - 1) of the file for which the experiment will be executed.
-   3. (Optional) 
-   The parameter file needs to have the following information separated by a blank space in each line:
+1. The path to the file containing the parameters of the experiments you want to run.
+2. The zero-based index (line number - 1) of the line in that file whose experiment will be executed.
+3. (Optional) The path of the file where the results will be saved.
 
-   ```
-   algorithm_name net_name net_path dataset_path number_cges_threads edge_limitation random_seed 
-   ```
-   You have at your disposal a file of parameters for the networks andes, link and munin in the './res/parameters/' folder. Feel free to modify it as you wish to run any experiment you want.
+Each line in the parameter file is a sequence of key-value pairs separated by single spaces, in this format:
+```
+algName 'value' netName 'value' clusteringName 'value' numberOfClusters 'value' broadcasting 'value' databasePath 'value' netPath 'value' seed 'value'
+```
+The seed is only used in the random broadcasting setup. There is no need to add blank values for parameters that are not used.
 
-   You can run any experiment by using these sentences and
-   ```
-   java -jar [jar-file-with-dependencies] [parameters-file] [index-of-file] [result_path](optional)
-   ```
-   If you wish to use the docker container, use the following:
-   ```
-   docker run [cges_container_name] [parameters-file] [index-of-file] [result_path](optional)
-   ```
+Here is an example of a valid parameter-file line that runs the sbGES algorithm:
+```
+algName cges netName alarm clusteringName HierarchicalClustering numberOfClusters 2 broadcasting BEST_BROADCASTING databasePath ./res/datasets/alarm/alarm1.csv netPath ./res/networks/alarm/alarm.xbif
+```
+
+Here is another example, which runs the srGES algorithm:
+```
+algName cges netName alarm clusteringName HierarchicalClustering numberOfClusters 4 broadcasting RANDOM_BROADCASTING seed 11 databasePath ./res/datasets/alarm/alarm1.csv netPath ./res/networks/alarm/alarm.xbif
+```
+
+Here is an example of a params line to execute a control algorithm like GES:
+```
+algName ges netName alarm databasePath ./res/datasets/alarm/alarm2.csv netPath ./res/networks/alarm/alarm.xbif
+```
+
+The allowed values of each parameter are:
+* algName: [cges, ges, fges, fges-faithfulness]. Use cges for all the new algorithms in this project.
+* netName: [The name of the network].
+* clusteringName: [HierarchicalClustering, RandomClustering]. We recommend HierarchicalClustering.
+* numberOfClusters: [Any number, preferably even]. We suggest sticking to one of the following values: [2,4,8,16].
+* broadcasting: [NO_BROADCASTING, RANDOM_BROADCASTING, BEST_BROADCASTING]. NO_BROADCASTING is for the rGES or cGES algorithm. RANDOM_BROADCASTING is for the srGES. BEST_BROADCASTING is for sbGES.
+* seed: (optional) Any number. It's only used in RANDOM_BROADCASTING.
+* databasePath: The local path of the data you are using.
+* netPath: The local path of the original Bayesian network from which the data was sampled, in XBIF format.
+
+An example parameter file is provided in './example-params.txt'. Feel free to modify it to run any experiment you want.
+
+You can run any experiment with the following command:
+```
+java -jar [jar-file-with-dependencies] [parameters-file] [index-of-file] [result_path](optional)
+```
+If you wish to use the docker container, use the following:
+```
+docker run [cges_container_name] [parameters-file] [index-of-file] [result_path](optional)
+```
 
 ## Example
 **Package**
    ```
    mvn package
-   java -jar target/CGES-1.0-jar-with-dependencies.jar ./res/parameters/andes_parameters.txt 2 ./MyResults.txt
+   java -jar target/CGES-1.0-jar-with-dependencies.jar ./example-params.txt 2 ./MyResults.txt
    ```
 **Docker Container**
    ```
    docker build -t cges .
-   docker run -v $(pwd)/res:/res -v $(pwd)/results:/results --rm cges /res/parameters/andes_parameters.txt 2 results/myResults.csv
+   docker run -v $(pwd)/res:/res -v $(pwd)/results:/results --rm cges ./example-params.txt 2 results/myResults.csv
    ```
 
 ## Contributing
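
The Random Broadcasting (srGES) rule described in the README changes can be illustrated with a short sketch. It only shows one way to pick, for every process, a different process whose DAG becomes its next input, so that no process feeds itself. The class name `RandomBroadcastSketch` and the method `pickInputs` are hypothetical and do not come from the repository; the project's actual implementation may differ.

```java
import java.util.Random;

public class RandomBroadcastSketch {

    // Hypothetical helper: for each of n processes, returns the index of the
    // process whose DAG it receives in the next iteration. A process is never
    // assigned its own index, which models the "no self-feedback" rule.
    static int[] pickInputs(int n, Random rng) {
        if (n < 2) {
            throw new IllegalArgumentException("Need at least two processes");
        }
        int[] source = new int[n];
        for (int i = 0; i < n; i++) {
            int j = rng.nextInt(n - 1);        // uniform over the n - 1 other processes
            source[i] = (j >= i) ? j + 1 : j;  // shift indices at or above i to skip i itself
        }
        return source;
    }

    public static void main(String[] args) {
        // The seed plays the role of the `seed` parameter (assumption: any long value works).
        int[] source = pickInputs(4, new Random(11));
        for (int i = 0; i < source.length; i++) {
            System.out.println("process " + i + " <- DAG of process " + source[i]);
        }
    }
}
```

Drawing from `n - 1` candidates and skipping the process's own index keeps the choice uniform over the other processes without needing a retry loop.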

diff --git a/example-params.txt b/example-params.txt
@@ -0,0 +1,3 @@
+algName cges netName alarm clusteringName HierarchicalClustering numberOfClusters 16 broadcasting RANDOM_BROADCASTING seed 11 databasePath ./res/datasets/alarm/alarm5.csv netPath ./res/networks/alarm/alarm.xbif
+algName ges netName win95pts databasePath ./res/datasets/win95pts/win95ptsALL.csv netPath ./res/networks/win95pts/win95pts.xbif
+algName cges netName alarm clusteringName HierarchicalClustering numberOfClusters 16 broadcasting RANDOM_BROADCASTING seed 19 databasePath ./res/datasets/alarm/alarm5.csv netPath ./res/networks/alarm/alarm.xbif
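
To make the key-value format of these parameter lines concrete, here is a minimal sketch of how one line could be split into a map. It assumes that keys and values never contain spaces and that unused parameters are simply absent from the line. The class `ParamLineSketch` and the method `parseLine` are hypothetical names chosen for illustration, not the repository's actual parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParamLineSketch {

    // Hypothetical helper: splits one parameter-file line into key/value pairs.
    // Assumes whitespace-separated tokens that alternate key, value, key, value, ...
    static Map<String, String> parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException(
                    "Expected 'key value' pairs but found an odd number of tokens: " + line);
        }
        Map<String, String> params = new LinkedHashMap<>(); // keeps the order of the line
        for (int i = 0; i < tokens.length; i += 2) {
            params.put(tokens[i], tokens[i + 1]);
        }
        return params;
    }

    public static void main(String[] args) {
        String line = "algName ges netName win95pts "
                + "databasePath ./res/datasets/win95pts/win95ptsALL.csv "
                + "netPath ./res/networks/win95pts/win95pts.xbif";
        Map<String, String> params = parseLine(line);
        System.out.println("algName = " + params.get("algName"));
        // Parameters not present on the line are just missing from the map.
        System.out.println("has seed? " + params.containsKey("seed"));
    }
}
```

A map lookup handles omitted parameters naturally, which matches the rule that unused parameters do not need blank values. Paths containing spaces would need a different tokenization.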