Comparing changes
base repository: ajinabraham/libsast
base: 3.1.1
...
head repository: ajinabraham/libsast
compare: master
  • 13 commits
  • 11 files changed
  • 2 contributors

Commits on Nov 14, 2024

  1. test billiard
     ajinabraham committed Nov 14, 2024
     ce0f68f
  2. Support usage with queue
     ajinabraham committed Nov 14, 2024
     1d75514
  3. Merge pull request #51 from ajinabraham/3.1.2
     3.1.2
     ajinabraham authored Nov 14, 2024
     18cb1b0
  4. update PatternMatcher and ChoiceMatcher internal apis
     ajinabraham committed Nov 14, 2024
     7af4c50
  5. code qa
     ajinabraham committed Nov 14, 2024
     192b564
  6. Merge pull request #52 from ajinabraham/3.1.3
     Update PatternMatcher and ChoiceMatcher internal apis
     ajinabraham authored Nov 14, 2024
     21f98eb
  7. make semgrep optional
     ajinabraham committed Nov 14, 2024
     56ebeaf
  8. Semgrep is optional, action QA
     ajinabraham committed Nov 14, 2024
     ed4476c
  9. Merge pull request #53 from ajinabraham/3.1.4
     Make semgrep optional, also update actions.
     ajinabraham authored Nov 14, 2024
     3171b66
  10. Support multiple multiprocessing options
      ajinabraham committed Nov 14, 2024
      330d5bc
  11. Merge pull request #54 from ajinabraham/3.1.5
      Support multiple multiprocessing options
      ajinabraham authored Nov 14, 2024
      b8b4048
  12. Expose multiprocessing to cli args
      ajinabraham committed Nov 14, 2024
      272cdc7
  13. Merge pull request #55 from ajinabraham/3.1.6
      Expose multiprocessing to cli args
      ajinabraham authored Nov 14, 2024
      73f3fc4

2 changes: 1 addition & 1 deletion .github/workflows/codeql.yml
@@ -24,7 +24,7 @@ jobs:

     steps:
     - name: Checkout
-      uses: actions/checkout@v3
+      uses: actions/checkout@v4.2.2

     - name: Initialize CodeQL
       uses: github/codeql-action/init@v2
4 changes: 2 additions & 2 deletions .github/workflows/publish.yml
@@ -10,9 +10,9 @@ jobs:
     runs-on: ubuntu-latest

     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4.2.2
     - name: Set up Python
-      uses: actions/setup-python@v3
+      uses: actions/setup-python@v5.3.0
       with:
         python-version: '3.x'
     - name: Install dependencies
6 changes: 3 additions & 3 deletions .github/workflows/python_test.yml
@@ -19,9 +19,9 @@ jobs:
         python-version: ['3.10', '3.11', '3.12']

     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4.2.2
     - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v3
+      uses: actions/setup-python@v5.3.0
       with:
         python-version: ${{ matrix.python-version }}
     - name: Install dependencies
@@ -33,7 +33,7 @@ jobs:
         tox -e lint
     - name: Install libsast
       run: |
-        poetry install --no-interaction --no-ansi
+        poetry install --no-interaction --no-ansi --with semgrep
     - name: Bandit Scan
       run: |
         poetry run bandit -ll libsast -r
5 changes: 4 additions & 1 deletion README.md
@@ -17,7 +17,10 @@ Made with ![Love](https://cloud.githubusercontent.com/assets/4301109/16754758/82

 ## Install

-`pip install libsast`
+```bash
+pip install semgrep==1.86.0 #For semgrep support
+pip install libsast
+```

 Pattern Matcher is cross-platform, but Semgrep supports only Mac and Linux.
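
Note on the install change: with semgrep now installed separately, calling code may want to check that it is present before enabling semgrep rules. The sketch below shows one way to do that. `semgrep_available` is a hypothetical helper, not part of libsast, and this diff does not show how libsast itself detects semgrep.

```python
import importlib.util
import sys


def semgrep_available() -> bool:
    """Return True if the optional semgrep package looks importable here.

    Hypothetical helper for illustration; not taken from libsast.
    """
    # The README states that Semgrep supports only Mac and Linux.
    if sys.platform == 'win32':
        return False
    # find_spec returns None when a top-level package is not installed.
    return importlib.util.find_spec('semgrep') is not None


if __name__ == '__main__':
    print('semgrep available:', semgrep_available())
```

A caller could use the result to decide whether to pass semgrep rules at all, so that a pattern-matcher-only install still works.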

2 changes: 1 addition & 1 deletion libsast/__init__.py
@@ -12,7 +12,7 @@
 __title__ = 'libsast'
 __authors__ = 'Ajin Abraham'
 __copyright__ = f'Copyright {year} Ajin Abraham, opensecurity.in'
-__version__ = '3.1.1'
+__version__ = '3.1.6'
 __version_info__ = tuple(int(i) for i in __version__.split('.'))
 __all__ = [
     'Scanner',
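
Note on `__version_info__`: the tuple is derived from the version string, which makes numeric comparison between releases straightforward. A minimal sketch of the same expression as a standalone function follows. The function name `version_info` and the version `3.1.10` are made up for illustration.

```python
def version_info(version: str) -> tuple:
    """Split a dotted version string into a tuple of ints."""
    return tuple(int(i) for i in version.split('.'))


old = version_info('3.1.1')
new = version_info('3.1.6')
print(new > old)

# Tuples compare element by element as integers, so a two-digit
# patch number sorts correctly, which plain string comparison does not.
print(version_info('3.1.10') > version_info('3.1.6'))
print('3.1.10' > '3.1.6')
```

Note that the `int()` conversion assumes purely numeric components; a suffix such as `rc1` would raise `ValueError`.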
7 changes: 7 additions & 0 deletions libsast/__main__.py
@@ -78,6 +78,12 @@ def main():
                         help='No of CPU cores to use. Use all cores by default',
                         type=int,
                         required=False)
+    parser.add_argument('-mp', '--multiprocessing',
+                        help=('Multiprocessing strategy to use.'
+                              ' Options: default, thread, billiard'),
+                        default='default',
+                        type=str,
+                        required=False)
     parser.add_argument('-v', '--version',
                         help='Show libsast version',
                         required=False,
@@ -94,6 +100,7 @@ def main():
         'ignore_paths': args.ignore_paths,
         'show_progress': args.show_progress,
         'cpu_core': args.cpu_core,
+        'multiprocessing': args.multiprocessing,
     }
     result = Scanner(options, args.path).scan()
     output(args.output, result)
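
Note on the new flag: the argument is a free-form string, and in the matcher changes any value other than `thread` or `billiard` falls through to the process-pool branch. The sketch below reproduces the flag in isolation and adds `choices=` as one possible way to reject typos early. The `choices` argument and the `libsast-sketch` program name are assumptions for illustration and are not in this diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser containing only the new multiprocessing flag."""
    parser = argparse.ArgumentParser(prog='libsast-sketch')
    parser.add_argument('-mp', '--multiprocessing',
                        help=('Multiprocessing strategy to use.'
                              ' Options: default, thread, billiard'),
                        default='default',
                        # Assumption: `choices` is not in the diff.
                        choices=('default', 'thread', 'billiard'),
                        type=str,
                        required=False)
    return parser


args = build_parser().parse_args(['-mp', 'thread'])
# Mirrors how the diff forwards the value into the options dict.
options = {'multiprocessing': args.multiprocessing}
print(options)
```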
44 changes: 31 additions & 13 deletions libsast/core_matcher/choice_matcher.py
@@ -24,6 +24,7 @@ def __init__(self, options: dict) -> None:
         self.scan_rules = get_rules(options.get('choice_rules'))
         self.show_progress = options.get('show_progress')
         self.cpu = options.get('cpu_core')
+        self.multiprocessing = options.get('multiprocessing')
         self.alternative_path = options.get('alternative_path')
         exts = options.get('choice_extensions')
         self.exts = [ext.lower() for ext in exts] if exts else []
@@ -40,9 +41,9 @@ def scan(self, paths: list) -> dict:

     def read_file_contents(self, paths: list) -> list:
         """Load file(s) content."""
-        if not (self.scan_rules and paths):
-            return
-        self.validate_rules()
+        if not paths:
+            return []
+
         choice_args = []
         for rule in self.scan_rules:
             scan_paths = paths
@@ -63,17 +64,34 @@ def read_file_contents(self, paths: list) -> list:
                 futures.append(future)
             return [future.result() for future in futures]

-    def regex_scan(self, file_contents) -> list:
+    def regex_scan(self, file_contents: list, rules=None) -> dict:
         """Process regex matches on the file contents."""
-        # Use ProcessPoolExecutor for regex processing
-        with ProcessPoolExecutor(max_workers=self.cpu) as cpu_executor:
-
-            results = []
-            for content in file_contents:
-                # Process Choice Matcher on the file contents
-                process_future = cpu_executor.submit(
-                    self.choice_matcher, content)
-                results.append(process_future.result())
+        if rules:
+            self.scan_rules = get_rules(rules)
+        if not (self.scan_rules and file_contents):
+            return {}
+        self.validate_rules()
+
+        if self.multiprocessing == 'billiard':
+            # Use billiard's pool for regex (support queues)
+            from billiard import Pool
+            with Pool(processes=self.cpu) as pool:
+                # Run regex on file data
+                results = pool.map(
+                    self.choice_matcher,
+                    file_contents)
+        elif self.multiprocessing == 'thread':
+            # Use a ThreadPool for regex check
+            with ThreadPoolExecutor() as io_executor:
+                results = list(io_executor.map(
+                    self.choice_matcher,
+                    file_contents))
+        else:
+            # Use ProcessPoolExecutor for regex processing
+            with ProcessPoolExecutor(max_workers=self.cpu) as cpu_executor:
+                results = list(cpu_executor.map(
+                    self.choice_matcher,
+                    file_contents))

         self.add_finding(results)
         return self.findings
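
Note on the move from `submit(...).result()` to `map(...)`: the removed loop called `result()` immediately after each `submit`, which blocks until that one task finishes before the next is submitted, so the pool runs one task at a time. The sketch below contrasts the two shapes. It uses a thread pool and a sleeping stand-in function (`slow_square`, hypothetical) so it runs anywhere; the diff's default branch uses `ProcessPoolExecutor` instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(n: int) -> int:
    """Stand-in for a per-file matcher; sleeps to simulate work."""
    time.sleep(0.05)
    return n * n


def serial_submit(items: list) -> list:
    """Old shape: waiting on each future right after submitting it."""
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = []
        for item in items:
            # result() blocks here, so tasks never overlap.
            results.append(executor.submit(slow_square, item).result())
        return results


def mapped(items: list) -> list:
    """New shape: map submits every task first, then collects in order."""
    with ThreadPoolExecutor(max_workers=4) as executor:
        return list(executor.map(slow_square, items))


if __name__ == '__main__':
    for func in (serial_submit, mapped):
        start = time.perf_counter()
        func(list(range(8)))
        print(func.__name__, round(time.perf_counter() - start, 2), 'seconds')
```

Both functions return results in input order; only the amount of overlap between tasks differs.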
47 changes: 35 additions & 12 deletions libsast/core_matcher/pattern_matcher.py
@@ -25,6 +25,7 @@ def __init__(self, options: dict) -> None:
         self.scan_rules = get_rules(options.get('match_rules'))
         self.show_progress = options.get('show_progress')
         self.cpu = options.get('cpu_core')
+        self.multiprocessing = options.get('multiprocessing')
         exts = options.get('match_extensions')
         self.exts = [ext.lower() for ext in exts] if exts else []
         self.findings = {}
@@ -40,9 +41,8 @@ def scan(self, paths: list) -> dict:

     def read_file_contents(self, paths: list) -> list:
         """Load file(s) content."""
-        if not (self.scan_rules and paths):
-            return
-        self.validate_rules()
+        if not paths:
+            return []

         # Filter files by extension and size, prepare list for processing
         files_to_scan = {
@@ -60,16 +60,39 @@ def read_file_contents(self, paths: list) -> list:
                 self._read_file_content, files_to_scan))
         return file_contents

-    def regex_scan(self, file_contents: list) -> dict:
+    def regex_scan(self, file_contents: list, rules=None) -> dict:
         """Scan file(s) content."""
-        # Use a ProcessPool for CPU-bound regex
-        with ProcessPoolExecutor(max_workers=self.cpu) as cpu_executor:
-
-            # Run regex on file data
-            results = cpu_executor.map(
-                self.pattern_matcher,
-                file_contents,
-            )
+        if rules:
+            self.scan_rules = get_rules(rules)
+        if not (self.scan_rules and file_contents):
+            return {}
+        self.validate_rules()
+
+        if self.multiprocessing == 'billiard':
+            # Use billiard's pool for CPU-bound regex (support queues)
+            from billiard import Pool
+            with Pool(processes=self.cpu) as cpu_executor:
+                # Run regex on file data
+                results = cpu_executor.map(
+                    self.pattern_matcher,
+                    file_contents,
+                )
+        elif self.multiprocessing == 'thread':
+            # Use a ThreadPool for regex check
+            with ThreadPoolExecutor() as io_executor:
+                # Run regex on file data
+                results = io_executor.map(
+                    self.pattern_matcher,
+                    file_contents,
+                )
+        else:
+            # Use a ProcessPool for CPU-bound regex
+            with ProcessPoolExecutor(max_workers=self.cpu) as cpu_executor:
+                # Run regex on file data
+                results = cpu_executor.map(
+                    self.pattern_matcher,
+                    file_contents,
+                )

         # Compile findings
         self.add_finding(results)
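
Note on the three-way dispatch: both matchers now repeat the same `billiard` / `thread` / default branching. The sketch below pulls that branching into a single function to show its shape. `run_matcher` and its parameter names are hypothetical and not part of libsast. The billiard import is kept lazy, as in the diff, so the third-party package is only needed when that strategy is selected.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_matcher(matcher, file_contents, strategy='default', workers=None):
    """Apply `matcher` to each item using the selected strategy.

    Hypothetical helper; mirrors the branching added in the diff.
    """
    if strategy == 'billiard':
        # Lazy import, as in the diff: only required for this strategy.
        from billiard import Pool
        with Pool(processes=workers) as pool:
            return pool.map(matcher, file_contents)
    if strategy == 'thread':
        # As in the diff, the worker count is not passed to the thread pool.
        with ThreadPoolExecutor() as executor:
            return list(executor.map(matcher, file_contents))
    # Any other value, including 'default', uses a process pool.
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(matcher, file_contents))


if __name__ == '__main__':
    print(run_matcher(str.upper, ['alpha', 'beta'], strategy='thread'))
```

Materializing the results with `list(...)` inside the `with` block is a choice made in this sketch; the pattern matcher in the diff passes the lazy `map` result on to `add_finding` instead.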
9 changes: 5 additions & 4 deletions libsast/scanner.py
@@ -26,6 +26,7 @@ def __init__(self, options: dict, paths: list) -> None:
             'ignore_paths': [],
             'show_progress': False,
             'cpu_core': 1,
+            'multiprocessing': 'default',
             # Overwrite with options from invocation
             **(options or {}),
         }
@@ -53,7 +54,7 @@ def __init__(self, options: dict, paths: list) -> None:
     def scan(self) -> dict:
         """Start Scan."""
         results = {}
-        valid_paths = self.get_scan_files(self.paths)
+        valid_paths = self.get_scan_files()

         if not valid_paths:
             return {}
@@ -67,13 +68,13 @@ def scan(self) -> dict:

         return results

-    def get_scan_files(self, paths):
+    def get_scan_files(self):
         """Get files valid for scanning."""
-        if not isinstance(paths, list):
+        if not isinstance(self.paths, list):
             raise InvalidPathError('Path should be a list')

         all_files = set()
-        for path in paths:
+        for path in self.paths:
             pobj = Path(path)
             if pobj.is_dir():
                 all_files.update({
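
Note on the options default: `Scanner.__init__` lists its defaults first and then unpacks the caller's options over them, so the new `'multiprocessing': 'default'` entry applies only when the caller omits the key. A minimal sketch of that merge follows. `DEFAULTS` here contains only the keys visible in this hunk, and `merge_options` is a hypothetical name.

```python
# Only the default keys visible in the hunk; the real dict has more.
DEFAULTS = {
    'ignore_paths': [],
    'show_progress': False,
    'cpu_core': 1,
    'multiprocessing': 'default',
}


def merge_options(options):
    """Overlay caller options on the defaults; later keys win."""
    return {**DEFAULTS, **(options or {})}


print(merge_options(None)['multiprocessing'])
print(merge_options({'multiprocessing': 'thread'})['multiprocessing'])
# A key passed explicitly as None still replaces the default value.
print(merge_options({'cpu_core': None})['cpu_core'])
```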
13 changes: 12 additions & 1 deletion poetry.lock

Some generated files are not rendered by default.

8 changes: 7 additions & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "libsast"
-version = "3.1.1"
+version = "3.1.6"
 description = "A generic SAST library built on top of semgrep and regex"
 keywords = ["libsast", "SAST", "Python SAST", "SAST API", "Regex SAST", "Pattern Matcher"]
 authors = ["Ajin Abraham <ajin@opensecurity.in>"]
@@ -26,6 +26,12 @@ libsast = "libsast.__main__:main"
 python = "^3.8"
 requests = "*"
 pyyaml = ">=6.0"
+billiard = "^4.2.1"
+
+[tool.poetry.group.semgrep]
+optional = true
+
+[tool.poetry.group.semgrep.dependencies]
 semgrep = {version = "1.86.0", markers = "sys_platform != 'win32'"}

 [tool.poetry.group.dev.dependencies]