The ADE (Acordar Download and Extraction) repository contains the code (in Java and Python) for the download and parsing phases of the ACORDAR collection. These are the first steps toward a more complete reproducibility study based on the ACORDAR paper.
The execution pipeline is the following:
- Collection download, using the code in the `/python/download` directory
- File recovering, using the code in the `/python/file_recovering` directory
- Dataset parsing, using the code in the `/java` and `/python/mining` directories
- Statistics retrieval, using the Jupyter Notebooks in the `/python/analysis/` directory
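Before starting the pipeline, it can help to confirm that a checkout contains every directory listed above. The following is a minimal sketch, not part of the repository: the function name `missing_pipeline_dirs` and the `PIPELINE_DIRS` table are hypothetical, and only the directory paths come from the list above.

```python
from pathlib import Path

# Directories named in the pipeline description, in execution order.
# The step labels are illustrative; only the paths come from this README.
PIPELINE_DIRS = [
    ("download", ["python/download"]),
    ("file recovering", ["python/file_recovering"]),
    ("parsing", ["java", "python/mining"]),
    ("statistics", ["python/analysis"]),
]


def missing_pipeline_dirs(repo_root):
    """Return the pipeline directories that are absent under repo_root."""
    root = Path(repo_root)
    missing = []
    for _step, rel_paths in PIPELINE_DIRS:
        for rel in rel_paths:
            if not (root / rel).is_dir():
                missing.append(rel)
    return missing


if __name__ == "__main__":
    # Assumes the script is launched from the root of the ADE checkout.
    for rel in missing_pipeline_dirs("."):
        print(f"missing: {rel}")
```

An empty result means all four steps have their code directory in place. The check does not validate the contents of those directories.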
Inside the `/java` and `/python` folders you can find other README.md files with all the instructions for running the code.
Before running any of the code, take a look at the ACORDAR repository and clone it. You will need the `datasets.json` file, which contains the list of all the datasets in the collection.
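As a quick sanity check after cloning, the file can be loaded and its top-level entries counted. This is a minimal sketch under stated assumptions: the helper names `load_dataset_list` and `count_entries` are hypothetical, and because the internal layout of `datasets.json` is not described here, the code assumes only that the file is valid JSON.

```python
import json
from pathlib import Path


def load_dataset_list(json_path):
    """Load datasets.json and return its parsed content.

    Nothing is assumed about the layout beyond it being valid JSON.
    """
    with Path(json_path).open(encoding="utf-8") as handle:
        return json.load(handle)


def count_entries(parsed):
    """Count top-level entries when the parsed JSON is a list or an object."""
    if isinstance(parsed, (list, dict)):
        return len(parsed)
    raise TypeError(f"unexpected top-level JSON type: {type(parsed).__name__}")
```

To use it, pass the location of `datasets.json` inside your clone of the ACORDAR repository to `load_dataset_list`, then hand the result to `count_entries`.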