Evaluation Scripts #408

Merged
merged 131 commits into from
Sep 2, 2024
55e1420
send one request to update security relevance of a batch of commits (…
lauraschauer Jul 22, 2024
19d16fe
Reduces throwing errors when the LLM returns a verbose answer in comm…
lauraschauer Jul 30, 2024
5af0501
cleaner error logging when git command fails
lauraschauer Aug 9, 2024
552509b
Advisory References: commit
lauraschauer Aug 14, 2024
3be03fb
fixes bug when extracting text from Jira Refs
lauraschauer Aug 14, 2024
c7dbd7f
removes 'dubious ownership' error when trying to access git repositor…
lauraschauer Jul 30, 2024
ba72de1
prints error to console instead of crashing at None, -1 returned from…
lauraschauer Aug 5, 2024
9108729
[FIX] makes sure that commit classification information is only retri…
lauraschauer Aug 26, 2024
a57769b
sorts and cleans up gitignore
lauraschauer Aug 29, 2024
bcf7279
orders and cleans up gitignore
lauraschauer Aug 29, 2024
e76a19a
renames folder with pipeline code to pipeline
lauraschauer Aug 29, 2024
3ed03ed
comments out cluttering print statement
lauraschauer Aug 29, 2024
2870cf3
adds the pipeline folder as a volume
lauraschauer Aug 29, 2024
98074ff
adds the full code folder as a volume to the worker container
lauraschauer Aug 29, 2024
53d53a2
adjusts import paths after renaming folder to pipeline
lauraschauer Aug 29, 2024
a409c12
adds console output for steps of pipeline
lauraschauer Aug 29, 2024
a07ea30
more adjustment after renaming to pipeline
lauraschauer Aug 29, 2024
aa71baa
updates and sorts gitignore
lauraschauer Aug 29, 2024
f077121
can successfully generate reports with the pipeline
lauraschauer Aug 29, 2024
04746d2
cleans up code
lauraschauer Aug 29, 2024
44c4d5b
adds readme
lauraschauer Aug 29, 2024
95bbacc
adds LLM statistics to statistics object and
lauraschauer Jul 15, 2024
a670bc8
adds LLM statistics to statistics object and
lauraschauer Jul 15, 2024
5697bfe
adds matteo's script
lauraschauer Jul 11, 2024
b2c6690
script runs
lauraschauer Jul 11, 2024
aa22101
results from run
lauraschauer Jul 15, 2024
03e1cef
adds collection of LLM stats to the LLM class
lauraschauer Jul 15, 2024
ef1c1a2
adds folder structure for reports and 1 first report generated in con…
lauraschauer Jul 15, 2024
67fca69
breaks up matteo's script into smaller files
lauraschauer Jul 15, 2024
edf0894
improves code
lauraschauer Jul 15, 2024
1c5803d
adds LLM generated docstrings to start understanding what's happening
lauraschauer Jul 15, 2024
aee608f
adds function to analyse llm statistics across Prospector reports
lauraschauer Jul 16, 2024
ba799b6
adds CL argument for llm statistics evaluation
lauraschauer Jul 16, 2024
ab6f22b
makes analysis code more robust to JSON report files missing fields
lauraschauer Jul 17, 2024
fe18896
adds seaborn and matplotlib to dev requirements
lauraschauer Jul 17, 2024
f18c644
logs to console
lauraschauer Jul 17, 2024
153ba2e
adds readme
lauraschauer Jul 18, 2024
5d9fa02
bundels run variables in config.yaml and adds readme
lauraschauer Jul 18, 2024
5043ce6
this is the dataset used in D6.3
lauraschauer Jul 18, 2024
35e93ae
renames dataset
lauraschauer Jul 18, 2024
c5f8b70
adds dataset description to readme
lauraschauer Jul 18, 2024
7244462
mounts volume so that reports are available on the host
lauraschauer Jul 19, 2024
910463e
adds function to update latex table from D6.3 report
lauraschauer Jul 19, 2024
59c0c66
puts matteo's code into functions and writes results to files instead…
lauraschauer Jul 19, 2024
4711219
puts matteo's code into functions and writes results to files instead…
lauraschauer Jul 19, 2024
d3338c7
fixes precision of the rules function
lauraschauer Jul 19, 2024
f074472
correctly display number of dispatched jobs
lauraschauer Jul 19, 2024
d174f20
moves execution time measurement for LLM from child to parent function
lauraschauer Jul 22, 2024
6cbb208
adds function to stop all jobs on queue
lauraschauer Jul 22, 2024
eef559f
removes cluttering console output from redis queue, adjusts folder pa…
lauraschauer Jul 22, 2024
9402ba1
moves functions to extract version intervals from cves to utils
lauraschauer Jul 22, 2024
7727e01
writes llm stats analysis to file
lauraschauer Jul 23, 2024
c5d627a
corrects git cache path and adds docstring to do_clone
lauraschauer Jul 23, 2024
1d81fc1
adds progress bar to clone_repo_multiple
lauraschauer Jul 24, 2024
dc9deac
adds script to clone repos
lauraschauer Jul 24, 2024
a9f961e
executes prospector container with host user so that created files (e…
lauraschauer Jul 25, 2024
cfe7b7e
adds comments, docstrings and improves logging of repo cloning
lauraschauer Jul 25, 2024
0beec15
reformatting
lauraschauer Jul 25, 2024
74e0df5
creates own evaluation logger
lauraschauer Jul 25, 2024
4ba005b
restructuring to have better file hierarchy and avoid cyclic imports
lauraschauer Jul 25, 2024
f727ae5
adds script to trigger repository cloning for CVEs
lauraschauer Jul 29, 2024
d36bbc4
rewrites method to generate D6.3 summary execution table
lauraschauer Jul 29, 2024
df890c5
new function to analyse reports now yields (almost) correct results
lauraschauer Jul 30, 2024
2c64853
adds new cli flag to count which reports are missing
lauraschauer Jul 30, 2024
6dd5574
updates gitignore
lauraschauer Jul 30, 2024
154a696
changes config file to resemble the prospector config file, removes s…
lauraschauer Jul 30, 2024
1f8fa29
added flag to remove jobs from queue
lauraschauer Jul 30, 2024
971f7b7
increased default timeout
lauraschauer Jul 30, 2024
cb61e13
adds script to compare pairs of prospector reports (eg. the reports f…
lauraschauer Aug 1, 2024
4d75870
clean analysis code
lauraschauer Aug 2, 2024
7cb6e57
set parameter so that already cloned repos are not cloned again
lauraschauer Aug 2, 2024
634b404
sets using LLMs to infer repo URLs to false in any case and refactors…
lauraschauer Aug 2, 2024
8578dfd
adds two more folders to share between the container and host for the…
lauraschauer Aug 2, 2024
312ee0c
removes unnecessary import
lauraschauer Aug 2, 2024
2a69348
renames run_multiple to main and removes unused code from it
lauraschauer Aug 2, 2024
6ca5102
changes back to one reports path from config
lauraschauer Aug 2, 2024
2d9ea2c
moves cloning repositories to fill gitcache into evaluation folder
lauraschauer Aug 2, 2024
8c8b01d
creates new script to analyse statistics
lauraschauer Aug 2, 2024
1e579e7
prints error to console instead of crashing at None, -1 returned from…
lauraschauer Aug 5, 2024
3452233
starts function to create boxplot with statistics
lauraschauer Aug 5, 2024
41f7c98
configuration file for evaluation code
lauraschauer Aug 5, 2024
fd99b6f
adds gitignore for evaluation folder
lauraschauer Aug 5, 2024
2dea986
adds script to extract lines with error reasons from log
lauraschauer Aug 5, 2024
fee7a4a
updates readme and removes unused cl flags from main
lauraschauer Aug 5, 2024
3ad62e8
creates boxplot of overall execution time for all 4 categories
lauraschauer Aug 5, 2024
1a09525
adds function to analyse which reports change category in different e…
lauraschauer Aug 5, 2024
60ed59c
[DOC] updates readme
lauraschauer Aug 5, 2024
62dcee1
[DOC] cleans up utils file
lauraschauer Aug 5, 2024
2f2540e
[ADD] adds function to plot commit classification times as boxplots
lauraschauer Aug 5, 2024
6deb2e5
[ADD] function to check whether two ground truth datasets contain the…
lauraschauer Aug 7, 2024
7820c32
[ADD] function to see correlation between time needed for execution a…
lauraschauer Aug 7, 2024
fe67a7d
[DOC] adds logging statements for git commands and which period of ta…
lauraschauer Aug 9, 2024
ca24925
[FIX] makes sure that only commits referenced in the advisory's refer…
lauraschauer Aug 9, 2024
ad94c05
[ADD] function to choose which set of strong rules should be used + f…
lauraschauer Aug 9, 2024
22319bd
[ADD] information about which CVEs are missing reports for both direc…
lauraschauer Aug 9, 2024
49976cc
[DOC] restructures folders to have all reports not in prospector fold…
lauraschauer Aug 9, 2024
a9caf27
[FIX] fixes bug when extracting text from Jira Refs
lauraschauer Aug 14, 2024
27a61a3
[FIX] fixes mistake in filtering out the 'commit::master' references
lauraschauer Aug 14, 2024
06cce63
[DOC] updates readme
lauraschauer Aug 14, 2024
58ff072
[DOC] removes unused code
lauraschauer Aug 14, 2024
f57080e
[FIX] uses 20 instead of 2 workers
lauraschauer Aug 14, 2024
49301b3
[DOC] adds images for readme
lauraschauer Aug 14, 2024
eea2a2e
[ADD] adds script to compare differently classified reports
lauraschauer Aug 14, 2024
ff9591f
[ADD] options to select certain CVEs for analysis
lauraschauer Aug 14, 2024
aa76ef0
[FIX] mounts the code into the container instead of copying, so that …
lauraschauer Aug 14, 2024
9afd5ee
[IMP] small code changes
lauraschauer Aug 14, 2024
e2deaba
[FIX] gives host permission to read and write to log files
lauraschauer Aug 14, 2024
7d16871
[FIX] new method to generate better flow analysis
lauraschauer Aug 16, 2024
7775f65
[FIX] changes weight of commit classification rule to 16 instead of 3…
lauraschauer Aug 16, 2024
158c0bb
[IMP] refactors the code so that the batch of reports can be chosen i…
lauraschauer Aug 19, 2024
e03ef04
[ADD] adds comparison of keywords in reports to flow file
lauraschauer Aug 19, 2024
36d1ac7
[IMP] removes unused code
lauraschauer Aug 19, 2024
45e449b
[ADD] function to produce checkmarks table as in D6.3
lauraschauer Aug 20, 2024
b3c1103
[FIX] makes sure that version interval is only supplied when 'version…
lauraschauer Aug 20, 2024
7f33fea
[ADD] function to create sankey diagram with flows between categories
lauraschauer Aug 21, 2024
08d1d58
[FIX] changes the config parameter use_comparison to batch, which can…
lauraschauer Aug 21, 2024
6383842
[FIX] changes to using clearer config variables for batch selection
lauraschauer Aug 26, 2024
6c35088
[FIX] adjusts code so that with LLm support can run with 10 workers, …
lauraschauer Aug 26, 2024
e0f16d8
[FIX] removes unused variables
lauraschauer Aug 26, 2024
0ec2004
[FIX] adjusts titles of plots, uses different paths (alltogether unim…
lauraschauer Aug 26, 2024
95f7111
[IMP] improves the execution time analysis plots by adding a violin p…
lauraschauer Aug 27, 2024
7c6bfd5
[ADD] copies code for using the API to create jobs
lauraschauer Aug 28, 2024
64eeb01
[ADD] code to use the API to enqueue jobs
lauraschauer Aug 28, 2024
36e9318
[FIX] temporary fix to not use database for cc when use_backend set t…
lauraschauer Aug 28, 2024
579a622
[DOC] cleans up unimportant files
lauraschauer Aug 28, 2024
b5023d5
[FIX] after rebasing on pipeline branch
lauraschauer Aug 30, 2024
bfe4ffa
[IMP] changes main file to use registry pattern for selection of func…
lauraschauer Aug 30, 2024
5642c5f
[IMP] adjust config.yaml file to structure changes from last commit
lauraschauer Aug 30, 2024
b56718c
[IMP] adjusted all files and functions to the new registry pattern
lauraschauer Aug 30, 2024
e0e4944
[IMP] cleans up code and removes all unused or unimportant files
lauraschauer Aug 30, 2024
09b92d9
Merge branch 'main' into evaluation-scripts
copernico Sep 2, 2024
4 changes: 3 additions & 1 deletion .gitignore
@@ -62,6 +62,8 @@ prospector/test_report.json
prospector/.idea/*
prospector/*.html
prospector/*.json
prospector/evaluation
prospector/evaluation/data/input/*
prospector/evaluation/data/reports/*
prospector/evaluation/config.yaml
.DS_Store
prospector/pipeline/reports/*
3 changes: 2 additions & 1 deletion prospector/datamodel/advisory.py
@@ -108,7 +108,7 @@

# for k, v in self.references.items():
# print(k, v)
logger.debug("References: " + str(self.references))
# logger.debug("References: " + str(self.references))

# TODO: I should extract interesting stuff from the references immediately ad maintain them just for a fast lookup
logger.debug(f"Relevant references: {len(self.references)}")
@@ -210,6 +210,7 @@
}
limit += 1

# Filter out references that are not commit hashes, eg. commit::master
return [
ref.split("::")[1]
for ref in self.references
22 changes: 17 additions & 5 deletions prospector/datamodel/nlp.py
@@ -50,10 +50,14 @@ def extract_words_from_text(text: str) -> List[str]:
]


def find_similar_words(adv_words: Set[str], commit_msg: str, exclude: str) -> Set[str]:
def find_similar_words(
adv_words: Set[str], commit_msg: str, exclude: str
) -> Set[str]:
"""Extract nouns from commit message that appears in the advisory text"""
commit_words = {
word for word in extract_words_from_text(commit_msg) if word not in exclude
word
for word in extract_words_from_text(commit_msg)
if word not in exclude
}
return commit_words.intersection(adv_words)
# return [word for word in extract_words_from_text(commit_msg) if word in adv_words]
@@ -63,7 +67,9 @@ def extract_versions(text: str) -> List[str]:
"""
Extract all versions mentioned in the text
"""
return list(set(re.findall(r"(\d+(?:\.\d+)+)", text))) # Should be more accurate
return list(
set(re.findall(r"(\d+(?:\.\d+)+)", text))
) # Should be more accurate
# return re.findall(r"[0-9]+\.[0-9]+[0-9a-z.]*", text)


@@ -134,7 +140,8 @@ def extract_filename(text: str, relevant_extensions: List[str]) -> List[str]:
# This regex covers cases with various camelcase filenames and underscore, dash names
if bool(
re.search(
r"(?:[a-z]|[A-Z])[a-zA-Z]+[A-Z]\w*|(?:[a-zA-Z]{2,}[_-])+[a-zA-Z]{2,}", text
r"(?:[a-z]|[A-Z])[a-zA-Z]+[A-Z]\w*|(?:[a-zA-Z]{2,}[_-])+[a-zA-Z]{2,}",
text,
)
):
return [text], None
@@ -195,7 +202,12 @@ def extract_cve_references(text: str) -> List[str]:
Extract CVE identifiers
"""
return list(
set([result.group(0) for result in re.finditer(r"CVE-\d{4}-\d{4,8}", text)])
set(
[
result.group(0)
for result in re.finditer(r"CVE-\d{4}-\d{4,8}", text)
]
)
)


10 changes: 3 additions & 7 deletions prospector/docker-compose.yml
@@ -16,20 +16,16 @@ services:
GIT_CACHE: /tmp/gitcache
CVE_DATA_PATH: /app/cve_data
REDIS_URL: redis://redis:6379/0
#POSTGRES_HOST: db
#POSTGRES_PORT: 5432
#POSTGRES_USER: postgres
#POSTGRES_PASSWORD: example
#POSTGRES_DBNAME: postgres
#NVD_API_KEY: ${NVD_API_KEY}

worker:
build:
context: .
dockerfile: docker/worker/Dockerfile
volumes:
- ./:/app
# - ./pipeline/reports:/app/pipeline/reports
- ./data_sources/reports:/app/data_sources/reports
- ./evaluation/data/reports/:/app/evaluation/data/reports
- ./../../../data/gitcache:/tmp/gitcache
depends_on:
- redis
environment:
10 changes: 9 additions & 1 deletion prospector/docker/Dockerfile
@@ -1,7 +1,15 @@
FROM python:3.10-slim

COPY . /app
RUN mkdir -p /app
COPY ./requirements.txt /app/
WORKDIR /app
# Create log files with permissions for host user
RUN touch evaluation.log
RUN touch prospector.log
RUN chown ${UID}:${GID} evaluation.log
RUN chown ${UID}:${GID} prospector.log

# Install dependencies with pip
RUN pip install --upgrade pip
RUN apt update && apt install -y --no-install-recommends gcc g++ libffi-dev python3-dev libpq-dev git curl
RUN pip install --no-cache-dir -r requirements.txt
@@ -8,7 +8,7 @@ command=/usr/local/bin/python3 /usr/local/bin/rq worker {{env['RQ_QUEUE']}} -u r
process_name=%(program_name)s%(process_num)01d

; If you want to run more than one worker instance, increase this
numprocs=2
numprocs=10
redirect_stderr=true

; This is the directory from which RQ is ran. Be sure to point this to the
60 changes: 60 additions & 0 deletions prospector/evaluation/README.md
@@ -0,0 +1,60 @@
# Evaluate Prospector

This folder contains the scripts used for evaluating Prospector's reports and data needed for it (created and used in Summer 2024). The folder is structured as follows:

1. **Data** folder: contains input data, Prospector reports and results of the analysis of the Prospector reports.
2. **Scripts**: The scripts used for running Prospector on a batch of CVEs, and for analysing the created reports.

Prospector is run in the following way in this evaluation:

First, the five docker containers must be started with `make docker-setup` or manually with `docker` commands. Once they are running, `docker ps` should show the following:

```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c73aed108475 prospector_backend "python ./service/ma…" 47 minutes ago Up 47 minutes 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp prospector_backend_1
2e9da86b09a8 prospector_worker "/usr/local/bin/star…" 47 minutes ago Up 47 minutes prospector_worker_1
b219fd6219ed adminer "entrypoint.sh php -…" 47 minutes ago Up 47 minutes 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp prospector_adminer_1
9aacdc04f7c5 postgres "docker-entrypoint.s…" 47 minutes ago Up 47 minutes 0.0.0.0:5432->5432/tcp, :::5432->5432/tcp db
7c540450ab76 redis:alpine "docker-entrypoint.s…" 47 minutes ago Up 47 minutes 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp prospector_redis_1
```

[`dispatch_jobs.py`](#running-prospector-on-multiple-cves-dispatch_jobspy) creates jobs with the `prospector()` function in them and enqueues
them in a Redis Queue, from which the `prospector_worker` container fetches jobs and executes them. To visualise what is going on, run
`docker attach prospector_worker_1` to see the usual console output. In order to change something inside the container, run `docker exec -it prospector_worker_1 bash` to open an interactive bash shell.

You can set the number of workers in `docker/worker/etc_supervisor_confd_rqworker.conf.j2`.
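
The dispatch flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `dispatch_jobs.py`: the `prospector()` body, the `dispatch_jobs()` helper, the in-memory `fake_enqueue` and the CVE IDs are all placeholders, and the list stands in for the Redis queue so that the sketch runs without any container.

```python
from typing import Callable, Iterable, List, Tuple


def prospector(cve_id: str) -> str:
    """Hypothetical stand-in for the real `prospector()` entry point."""
    return f"report for {cve_id}"


def dispatch_jobs(cve_ids: Iterable[str], enqueue: Callable[..., object]) -> int:
    """Enqueue one job per CVE ID and return the number of dispatched jobs."""
    dispatched = 0
    for cve_id in cve_ids:
        # A job is the function plus its arguments; a worker calls it later.
        enqueue(prospector, cve_id)
        dispatched += 1
    return dispatched


# In-memory stand-in for the queue (assumption: the real queue lives in Redis).
pending: List[Tuple[Callable[..., str], tuple]] = []


def fake_enqueue(func: Callable[..., str], *args: str) -> None:
    pending.append((func, args))


# Placeholder CVE IDs, not taken from any dataset.
count = dispatch_jobs(["CVE-0000-0001", "CVE-0000-0002"], fake_enqueue)

# What a worker does, in essence: fetch each job and execute it.
results = [func(*args) for func, args in pending]
print(count, results)
```

With RQ, the `enqueue` argument would presumably be the `enqueue` method of an `rq.Queue` connected to the Redis container, and the worker loop would be replaced by the `rq worker` processes started in `prospector_worker`.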

## Configuration File

The configuration file has two parts to it: a main part and a Prospector settings part, which is a copy of a part of the original Prospector `config.yaml` file.

The main part at the top sets three paths: where the input data can be found, where Prospector reports are saved, and where analysis results are saved.

The Prospector part allows you to set the settings for Prospector (independent from the Prospector settings used when running Prospector with `./run_prospector`). **Watch out**: Since the `prospector_worker` container is created in the beginning with the current state of the `config.yaml`, simply saving any changes in `config.yaml` and dispatching new jobs will still run them with the old configuration. For new configuration parameters to take effect, either destroy the containers with `make docker-clean` and rebuild them with `make docker-setup` or open an interactive shell to the container and make your changes to the code in there.

## Script Files explained

### Running Prospector on multiple CVEs (`dispatch_jobs.py`)

The code for running Prospector is in `dispatch_jobs.py`. It extracts CVE IDs from the data file at the path constructed as `input_data_path` + the `-i` CL parameter. It then dispatches one job per CVE ID to the queue, from where the jobs get executed. The path to the input file is split into two components (`input_data_path` in `config.yaml` and the `-i` parameter) because you might keep several input data files of the same format in one folder. This saves you from typing the full path while still letting you switch between files from one run to the next.
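
As a rough sketch of how the two path components could be combined and CVE IDs pulled out of the input data: the helper names, the file name `example.csv`, the sample rows and the use of `os.path.join` are assumptions for illustration. Only the `-i` flag, the `input_data_path` setting and the CVE regex (the same one used in `datamodel/nlp.py`) come from the project.

```python
import argparse
import os
import re
from typing import List, Sequence

# Same CVE pattern as extract_cve_references() in prospector/datamodel/nlp.py
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,8}")


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Only the short flag `-i` is documented; `dest` is a local naming choice.
    parser.add_argument("-i", dest="input_file", required=True)
    return parser.parse_args(argv)


def build_input_path(input_data_path: str, input_file: str) -> str:
    """Join the folder from config.yaml with the file given on the command line."""
    return os.path.join(input_data_path, input_file)


def extract_cve_ids(text: str) -> List[str]:
    """Return unique CVE IDs in order of first appearance."""
    return list(dict.fromkeys(CVE_PATTERN.findall(text)))


args = parse_args(["-i", "example.csv"])  # hypothetical file name
path = build_input_path("evaluation/data/input", args.input_file)

# Placeholder rows; the real column layout of the input file is not shown here.
sample = "CVE-0000-0001;x\nCVE-0000-0002;y\nCVE-0000-0001;z\n"
ids = extract_cve_ids(sample)
print(path, ids)
```

Deduplicating the IDs keeps a CVE that appears on several rows from being dispatched more than once.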

The reports are generated in the worker container, and saved to `prospector_reports_path_container`. This folder is mounted into the container, so you can see any newly generated reports in the same folder on the host.

Do not confuse this parameter with `prospector_reports_path_host`, which sets the path to a batch of reports on the host (used for analysis). Your workflow should be as follows:

1. Dispatch reports
2. When the report generation has finished, move the reports to any other folder (preferably outside of the `prospector/` folder to keep the build context for the container from getting too big).
3. Analyse the reports by setting the `prospector_reports_path_host` to the folder where you moved the reports to.
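
Step 2 of this workflow can also be scripted. The following is a minimal sketch under the assumption that reports are `.json` files sitting directly in the reports folder; `move_reports()` and all folder and file names are hypothetical, and the temporary directory only exists so the example runs on its own.

```python
import shutil
import tempfile
from pathlib import Path


def move_reports(src_dir: Path, dest_dir: Path) -> int:
    """Move all JSON reports from src_dir to dest_dir and return how many moved."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    for report in sorted(src_dir.glob("*.json")):
        shutil.move(str(report), str(dest_dir / report.name))
        moved += 1
    return moved


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "reports"  # stands in for the mounted reports folder
    dest = Path(tmp) / "batch_1"  # stands in for a folder outside prospector/
    src.mkdir()
    (src / "CVE-0000-0001.json").write_text("{}")  # placeholder report

    moved = move_reports(src, dest)
    remaining = list(src.glob("*.json"))
    moved_names = sorted(p.name for p in dest.iterdir())

print(moved, moved_names)
```

Afterwards, `prospector_reports_path_host` would point at the destination folder.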

### Analysing the generated reports (`analyse.py`)

Start an analysis with

```bash
python3 evaluation/main.py -i <your_input_data_csv_file> -a
```

This will start the `analyse_prospector_reports()` function in `analyse.py`, which re-creates the summary execution table from [AssureMOSS D6.3](https://assuremoss.eu/en/resources/Deliverables/D6.3):
![D6.3 Summary Execution Table](images/summary_execution_table.png)

It also creates a detailed JSON file (listing CVEs in each category) in `data/results/summary_execution/` to inspect which CVEs are in which category.
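
To illustrate the shape such a per-category listing might take, here is a small sketch. The category names, the CVE IDs and the `group_by_category()` helper are placeholders; the real categories are those of the D6.3 summary execution table and are not reproduced here.

```python
import json
from collections import defaultdict
from typing import Dict, List


def group_by_category(classified: Dict[str, str]) -> Dict[str, List[str]]:
    """Turn a CVE -> category mapping into a category -> sorted CVE list mapping."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for cve_id, category in sorted(classified.items()):
        grouped[category].append(cve_id)
    return dict(grouped)


# Placeholder classification results (hypothetical categories and IDs).
classified = {
    "CVE-0000-0002": "category_a",
    "CVE-0000-0001": "category_a",
    "CVE-0000-0003": "category_b",
}

summary = group_by_category(classified)
# The per-category counts correspond to the rows of a summary table.
counts = {category: len(ids) for category, ids in summary.items()}
detailed_json = json.dumps(summary, indent=2, sort_keys=True)
print(counts)
print(detailed_json)
```

Keeping the CVE lists next to the counts makes it possible to check which individual reports ended up in which category.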
Empty file.