clone-comparer-script

This project is a script that runs multiple code clone detection tools and compares their results against each other. It wraps several projects into one script. The full script has only been tested on Ubuntu and RHEL Linux, but it will likely run on macOS as well. On Windows, you might try Cygwin.

Setup

To run this script, you'll need to do the following:

  1. Clone the repository with the submodules. This will install both Cyclone and the comparer tool.
git clone --recurse-submodules https://github.com/iresbaylor/clone-comparer-script.git
  2. Install TXL. You'll need to do it as the superuser or NiCad won't be able to build correctly.

  3. Download a copy of NiCad and place it under tools/NiCad/. Follow the instructions to initialize it.

  4. Get a copy of the Moss script, name it moss, and put it under tools/Moss/

  5. As instructed in the Moss documentation, retrieve a user ID and update the $userId parameter in the moss script.

  6. Install PyPy for Python 3. Make sure the pypy3 executable is in your path.

  7. Run init.sh to create the Python virtual environment.

You may get an error when installing psycopg2 about needing to install a PostgreSQL package. Do the following to fix it:

# macOS - set the LDFLAGS environment variable
export LDFLAGS=$(pg_config --ldflags)

# Ubuntu - install PostgreSQL development libraries
sudo apt-get install libpq-dev

# RHEL - install PostgreSQL development libraries
sudo yum install postgresql-devel
  8. Install Maven and Java 15 to run the comparer tool. Make sure both executables are in your path.

  9. Build the project by navigating to tools/clone-comparer and running:

mvn clean verify
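
Before running anything, it can help to confirm that the setup steps above actually took effect. The following is a minimal sketch, not part of the repository: `check_prereqs` is a hypothetical helper that only checks the executables (pypy3, mvn, java) and the locations (tools/NiCad/, tools/Moss/moss) named in the steps above. It does not verify versions or that NiCad was initialized.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the repository): report which
# prerequisites from the setup steps are missing.
# Usage: check_prereqs [repository_root]
check_prereqs() {
  local root="${1:-.}" missing=0 cmd

  # Executables the setup steps say must be on the PATH.
  for cmd in pypy3 mvn java; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing executable: $cmd"
      missing=1
    fi
  done

  # Locations the setup steps say must be populated.
  if [ ! -d "$root/tools/NiCad" ]; then
    echo "missing directory: tools/NiCad"
    missing=1
  fi
  if [ ! -f "$root/tools/Moss/moss" ]; then
    echo "missing file: tools/Moss/moss"
    missing=1
  fi

  # Return 0 only if nothing was reported missing.
  return "$missing"
}
```

From the repository root, `check_prereqs . && echo "ready"` prints one line per missing item and prints "ready" only when every check passes.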

Running the project

  1. Assemble a list of URLs for GitHub repositories you want to compare. Drop them in a file (see repositories.txt for an example).

  2. Source the build environment:

source tools/codeDuplicationParser/venv/bin/activate
  3. Run the comparer. Use -h to see the required arguments.
./run.sh [-hk] -m <mode> (single/double) -f <repository_file>
  4. The results are written to output-single-<timestamp>.csv or output-double-<timestamp>.csv, depending on which mode you ran.
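
Because each run writes a new timestamped file, several result files can accumulate after repeated runs. The sketch below shows one way to pick out the most recent one. `latest_output` is a hypothetical helper, not part of the repository. It assumes only the file naming pattern given above and relies on file modification time rather than on the format of the timestamp, which is not documented here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the repository): print the most
# recently modified results file for a given mode.
# Usage: latest_output <single|double> [directory]
latest_output() {
  local mode="$1" dir="${2:-.}"
  # ls -t sorts by modification time, newest first; head keeps the first entry.
  # Prints nothing if no matching file exists.
  ls -t "$dir"/output-"$mode"-*.csv 2>/dev/null | head -n 1
}
```

For example, `head -n 5 "$(latest_output single)"` previews the first rows of the newest single-mode results in the current directory.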