Add tests for federated SPARQL queries between the curies mapping service and popular triplestores #53

Open
wants to merge 32 commits into base: main
Changes from 7 commits
32 commits
843541f
Add tests to check if federated queries between the curies mapping se…
vemonet Apr 11, 2023
899d586
Update MANIFEST.in
cthoyt Apr 13, 2023
6863cc3
Merge branch 'main' into pr/53
cthoyt Apr 13, 2023
e77a846
Update test_sparql.py
cthoyt Apr 13, 2023
67d009e
Remove redundant code
cthoyt Apr 13, 2023
79dd180
Merge branch 'main' into pr/53
cthoyt Apr 13, 2023
7a61e43
Update test_sparql.py
cthoyt Apr 13, 2023
423e9f6
Code cleanup
cthoyt Apr 13, 2023
e4443c0
Update test_sparql.py
cthoyt Apr 13, 2023
418fd74
Update test_sparql.py
cthoyt Apr 13, 2023
41dfdb7
Merge branch 'main' into pr/53
cthoyt Apr 13, 2023
eaf2906
Update test_sparql.py
cthoyt Apr 13, 2023
5f555bf
Update test_sparql.py
cthoyt Apr 13, 2023
62a1f49
try to fix a bit the URLs that have been changed without checking
vemonet Apr 13, 2023
e32c6da
fix the blazegraph local URL in test
vemonet Apr 13, 2023
984f10c
fix federated queries test, only test_from_virtuoso_to_mapping_servic…
vemonet Apr 14, 2023
0e709cf
fix CSV parsing, which fixes all tests
vemonet Apr 14, 2023
85f8652
Use the same query for test from the mapping service to external trip…
vemonet Apr 14, 2023
432fc06
improve how triples are defined in init script
vemonet Apr 14, 2023
477fe35
Add externally configurable tests
cthoyt Apr 14, 2023
5e5e1e9
Add second generic test
cthoyt Apr 14, 2023
78fb063
Better configure queries
cthoyt Apr 14, 2023
3561a57
Update src/curies/mapping_service/utils.py
cthoyt Apr 14, 2023
58e6021
Cleanup code
cthoyt Apr 14, 2023
18cb2d7
pass flake8
cthoyt Apr 14, 2023
f2b9f74
add federated queries tests for fuseki
vemonet Apr 14, 2023
dac165c
merge
vemonet Apr 14, 2023
efbfa00
Remove non-generic tests
cthoyt Apr 14, 2023
3657376
Update test_sparql.py
cthoyt Apr 14, 2023
7e6d339
Make tests generic and not rely on docker bioregistry
cthoyt Apr 14, 2023
7b66804
Switch to cases
cthoyt Apr 14, 2023
4dd192c
Merge branch 'main' into add-federated-queries-test-with-docker
cthoyt Dec 11, 2023
6 changes: 6 additions & 0 deletions .github/workflows/tests.yml
@@ -64,6 +64,12 @@ jobs:
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Start local triplestores for testing with docker
        if: matrix.os == 'ubuntu-latest'
        run: |
          docker-compose up -d
          sleep 20
          ./tests/resources/init_triplestores.sh
      - name: Install dependencies
        run: pip install tox
      - name: Test with pytest and generate coverage file
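
A fixed `sleep 20` can be too short on a slow runner and needlessly long on a fast one. One possible alternative, sketched here in Python, is to poll a readiness check until it passes or a deadline expires. `wait_until` and its parameters are hypothetical names and are not part of this PR; the probe is any zero-argument callable that returns a bool.

```python
import time
from typing import Callable


def wait_until(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call ``probe`` repeatedly until it returns True or ``timeout`` seconds pass.

    Hypothetical helper: returns True on success and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait a little before asking again so the service is not hammered.
        sleep(interval)
```

The probe is always called at least once, so a service that is already up returns immediately without sleeping.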
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -14,4 +14,4 @@ recursive-include docs/source *.png
global-exclude *.py[cod] __pycache__ *.so *.dylib .DS_Store *.gpickle

include README.md LICENSE
- exclude tox.ini .flake8 .bumpversion.cfg .readthedocs.yml codecov.yml
+ exclude tox.ini .flake8 .bumpversion.cfg .readthedocs.yml codecov.yml docker-compose.yml
12 changes: 12 additions & 0 deletions README.md
@@ -251,6 +251,18 @@ $ cd curies
$ pip install -e .
```

To test that federated queries against the curies mapping service SPARQL endpoint work properly with popular triplestores, you will need to start the triplestores locally with `docker` (otherwise the tests defined in `tests/test_sparql.py` will be skipped):

```bash
$ docker compose up -d
```

The first time you start the triplestores you will need to initialize them by running a script:

```bash
$ ./tests/resources/init_triplestores.sh
```
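
The skipping mentioned above relies on an availability check evaluated when the test module is imported. The following is a minimal sketch of that pattern, assuming a plain HTTP probe; `endpoint_available` is a hypothetical stand-in, and the real `sparql_service_available` in `curies.mapping_service.utils` may work differently.

```python
import unittest
import urllib.error
import urllib.request


def endpoint_available(url: str, timeout: float = 2.0) -> bool:
    """Return True if ``url`` answers an HTTP request (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so it is running.
        return True
    except (OSError, ValueError):
        # Connection refused, DNS failure, timeout, or a malformed URL.
        return False


@unittest.skipUnless(
    endpoint_available("http://localhost:8889/blazegraph/namespace/kb/sparql"),
    reason="No local Blazegraph is running",
)
class TestNeedsBlazegraph(unittest.TestCase):
    """Skipped as a whole when the probe above fails."""

    def test_placeholder(self):
        self.assertTrue(True)
```

Because the decorator argument is evaluated at import time, the triplestores must already be running before the test run starts.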

### 🥼 Testing

After cloning the repository and installing `tox` with `pip install tox`, the unit tests in the `tests/` folder can be
30 changes: 30 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,30 @@
version: "3"
services:

  mapping-service:
    build:
      context: .
      dockerfile: tests/resources/Dockerfile
    ports:
      - 8888:8888
    volumes:
      - ./src:/app/src
      - ./tests:/app/tests

  blazegraph:
    image: metaphacts/blazegraph-basic:2.2.0-20160908.003514-6-jetty9.4.44-jre8-45dbfff
    ports:
      - 8889:8080
Review comment (Member): Is there a reason we can't just use the default ports for each service that we expose outside of Docker?


  virtuoso:
    image: openlink/virtuoso-opensource-7:latest
    ports:
      - 8890:8890
    environment:
      - DBA_PASSWORD=${VIRTUOSO_PASSWORD:-dba}
      - SPARQL_UPDATE=true
      - VIRT_Database_ErrorLogLevel=7 # 7 is maximum logs
      - VIRT_HTTPServer_HTTPLogFile=/http.log
    # https://docs.openlinksw.com/virtuoso/loggingandrecording/

# TODO: add Apache Fuseki
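
Each service in this compose file is reachable under two URLs: one from the host through the published port, and one from other containers through the service name and container port. The sketch below derives both from a single description. `ComposeService` is a hypothetical helper for illustration only, and the endpoint path is taken from the constants in `tests/test_sparql.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComposeService:
    """One service from docker-compose.yml (hypothetical helper)."""

    name: str  # service name, also its hostname on the compose network
    host_port: int  # left side of the "ports" mapping
    container_port: int  # right side of the "ports" mapping
    path: str  # path of the SPARQL endpoint

    @property
    def local_url(self) -> str:
        """URL as seen from the host machine, e.g. by the test runner."""
        return f"http://localhost:{self.host_port}{self.path}"

    @property
    def docker_url(self) -> str:
        """URL as seen from another container, e.g. inside a SERVICE clause."""
        return f"http://{self.name}:{self.container_port}{self.path}"


blazegraph = ComposeService("blazegraph", 8889, 8080, "/blazegraph/namespace/kb/sparql")
print(blazegraph.local_url)
print(blazegraph.docker_url)
```

Keeping both forms next to each other makes it harder to use a host URL inside a federated query by mistake, where only the container-internal URL resolves.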
3 changes: 2 additions & 1 deletion setup.cfg
@@ -76,7 +76,8 @@ tests =
  pandas =
      pandas
  bioregistry =
-     bioregistry>=0.5.136
+     bioregistry[web]>=0.5.136
+     flasgger
  flask =
      flask
      defusedxml
11 changes: 11 additions & 0 deletions tests/resources/Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.10

# Dockerfile used to spawn a mapping service SPARQL endpoint for testing

WORKDIR /app

ADD . .

RUN pip install -e ".[fastapi,rdflib,bioregistry]"

CMD [ "bioregistry", "web", "--port", "8888", "--host", "0.0.0.0" ]
9 changes: 9 additions & 0 deletions tests/resources/init_triplestores.sh
@@ -0,0 +1,9 @@
## Script to initialize the triplestores started with docker
# Run it from the root of the repo: ./tests/resources/init_triplestores.sh

# Enable federated query for Virtuoso and load a triple for testing
docker compose exec virtuoso isql -U dba -P dba exec='GRANT "SPARQL_SELECT_FED" TO "SPARQL";'
docker compose exec virtuoso isql -U dba -P dba exec='SPARQL INSERT IN <https://purl.uniprot.org> { <https://purl.uniprot.org/uniprot/P07862> <https://w3id.org/biolink/vocab/category> <https://w3id.org/biolink/vocab/GeneProduct> };'

# Load a triple to local blazegraph for testing
docker compose exec blazegraph curl -X POST http://localhost:8080/blazegraph/namespace/kb/sparql -d 'update=insert data {<http://identifiers.org/ensembl/ENSG00000006453> <https://w3id.org/biolink/vocab/category> <https://w3id.org/biolink/vocab/Gene> . }'
cthoyt marked this conversation as resolved.
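
For comparison, the Blazegraph `curl` call in the script can be expressed in Python. This is a sketch under stated assumptions rather than part of the PR: `build_update_request` is a hypothetical helper, the endpoint is the host-side URL from `tests/test_sparql.py` rather than the in-container `localhost:8080`, and the body is percent-encoded with `urlencode`, whereas `curl -d` sends the string as given.

```python
import urllib.parse
import urllib.request


def build_update_request(endpoint: str, update: str) -> urllib.request.Request:
    """Build (but do not send) a form-encoded SPARQL update POST request."""
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


UPDATE = (
    "insert data {"
    "<http://identifiers.org/ensembl/ENSG00000006453> "
    "<https://w3id.org/biolink/vocab/category> "
    "<https://w3id.org/biolink/vocab/Gene> . }"
)
request = build_update_request(
    "http://localhost:8889/blazegraph/namespace/kb/sparql", UPDATE
)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```

Building the request separately from sending it keeps the encoding step easy to inspect before anything touches the triplestore.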
126 changes: 126 additions & 0 deletions tests/test_sparql.py
@@ -0,0 +1,126 @@
"""Tests federated SPARQL queries between the curies mapping service and popular triplestores."""

import unittest
from typing import Set, Tuple

from curies.mapping_service.utils import (
    get_sparql_record_so_tuples,
    get_sparql_records,
    sparql_service_available,
)
from tests.test_mapping_service import VALID_CONTENT_TYPES

# NOTE: federated queries need to use docker internal URL
DOCKER_BIOREGISTRY = "http://mapping-service:8888/sparql"
LOCAL_BIOREGISTRY = "http://localhost:8888/sparql"
LOCAL_BLAZEGRAPH = "http://localhost:8889/blazegraph/namespace/kb/sparql"
DOCKER_BLAZEGRAPH = "http://blazegraph:8080/blazegraph/namespace/kb/sparql"
LOCAL_VIRTUOSO = "http://localhost:8890/sparql"
DOCKER_VIRTUOSO = "http://virtuoso:8890/sparql"


def get(endpoint: str, sparql: str, accept: str) -> Set[Tuple[str, str]]:
    """Get a response from a given SPARQL query."""
    records = get_sparql_records(endpoint=endpoint, sparql=sparql, accept=accept)
    return get_sparql_record_so_tuples(records)


SPARQL_VALUES = f"""\
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?s ?o WHERE {{
    SERVICE <{DOCKER_BIOREGISTRY}> {{
        VALUES ?s {{ <http://purl.obolibrary.org/obo/CHEBI_24867> <http://purl.obolibrary.org/obo/CHEBI_24868> }} .
        ?s owl:sameAs ?o .
    }}
}}
""".rstrip()

SPARQL_SIMPLE = f"""\
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?s ?o WHERE {{
    SERVICE <{DOCKER_BIOREGISTRY}> {{
        <http://purl.obolibrary.org/obo/CHEBI_24867> owl:sameAs ?o .
        ?s owl:sameAs ?o .
    }}
}}
""".rstrip()


@unittest.skipUnless(
    sparql_service_available(LOCAL_BIOREGISTRY), reason="No local Bioregistry is running"
)
class TestSPARQL(unittest.TestCase):
    """Tests federated SPARQL queries between the curies mapping service and blazegraph/virtuoso triplestores.

    Run and init the required triplestores locally:
    1. docker compose up
    2. ./tests/resources/init_triplestores.sh
    """

    def assert_endpoint(self, endpoint: str, query: str, *, accept: str):
        """Assert the endpoint returns favorable results."""
        records = get(endpoint, query, accept=accept)
        self.assertIn(
            ("http://purl.obolibrary.org/obo/CHEBI_24867", "https://bioregistry.io/chebi:24867"),
            records,
        )

    @unittest.skipUnless(
        sparql_service_available(LOCAL_BLAZEGRAPH), reason="No local BlazeGraph is running"
    )
    def test_from_blazegraph_to_bioregistry(self):
        """Test a federated query from a Blazegraph triplestore to the curies service."""
        for mimetype in VALID_CONTENT_TYPES:
            with self.subTest(mimetype=mimetype):
                self.assert_endpoint(LOCAL_BLAZEGRAPH, SPARQL_SIMPLE, accept=mimetype)
                self.assert_endpoint(LOCAL_BLAZEGRAPH, SPARQL_VALUES, accept=mimetype)

    @unittest.skipUnless(
        sparql_service_available(LOCAL_VIRTUOSO), reason="No local Virtuoso is running"
    )
    def test_from_virtuoso_to_bioregistry(self):
        """Test a federated query from an OpenLink Virtuoso triplestore to the curies service."""
        for mimetype in VALID_CONTENT_TYPES:
            with self.subTest(mimetype=mimetype):
                self.assert_endpoint(LOCAL_VIRTUOSO, SPARQL_SIMPLE, accept=mimetype)
                # TODO: Virtuoso fails to resolve VALUES in federated query
                # self.assert_endpoint(LOCAL_VIRTUOSO, SPARQL_VALUES, accept=mimetype)

    @unittest.skipUnless(
        sparql_service_available(LOCAL_BIOREGISTRY), reason="No local Bioregistry is running"
    )
    def test_from_bioregistry_to_virtuoso(self):
        """Test a federated query from the curies service to an OpenLink Virtuoso triplestore."""
        query = f"""\
            SELECT ?s ?o WHERE {{
                <https://identifiers.org/uniprot/P07862> <http://www.w3.org/2002/07/owl#sameAs> ?s .
                SERVICE <{DOCKER_VIRTUOSO}> {{
                    ?s ?p ?o .
                }}
            }}
        """.rstrip()
        for mimetype in VALID_CONTENT_TYPES:
            with self.subTest(mimetype=mimetype):
                records = get(LOCAL_BIOREGISTRY, query, accept=mimetype)
                self.assertGreater(len(records), 0)

    @unittest.skipUnless(
        sparql_service_available(LOCAL_BIOREGISTRY), reason="No local Bioregistry is running"
    )
    def test_from_bioregistry_to_blazegraph(self):
        """Test a federated query from the curies service to a Blazegraph triplestore."""
        query = f"""\
            PREFIX owl: <http://www.w3.org/2002/07/owl#>
            PREFIX bl: <https://w3id.org/biolink/vocab/>
            SELECT ?s ?o WHERE {{
                <https://www.ensembl.org/id/ENSG00000006453> owl:sameAs ?s .

                SERVICE <{DOCKER_BLAZEGRAPH}> {{
                    ?s bl:category ?o .
                }}
            }}
        """.rstrip()
        for mimetype in VALID_CONTENT_TYPES:
            with self.subTest(mimetype=mimetype):
                records = get(LOCAL_BIOREGISTRY, query, accept=mimetype)
                self.assertGreater(len(records), 0)
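
The query templates in this file mix SPARQL braces with f-string interpolation, which is why every literal brace is doubled. The sketch below isolates that pattern in a small builder modeled on `SPARQL_SIMPLE`; `federated_sameas_query` is a hypothetical helper, not something this PR adds.

```python
def federated_sameas_query(service_url: str, subject_uri: str) -> str:
    """Build a federated owl:sameAs query (hypothetical helper)."""
    # Inside an f-string, doubled braces render as literal braces,
    # while single braces around a name interpolate a Python value.
    return f"""\
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?s ?o WHERE {{
    SERVICE <{service_url}> {{
        <{subject_uri}> owl:sameAs ?o .
        ?s owl:sameAs ?o .
    }}
}}
""".rstrip()


query = federated_sameas_query(
    "http://mapping-service:8888/sparql",
    "http://purl.obolibrary.org/obo/CHEBI_24867",
)
print(query)
```

The service URL passed in has to be the Docker-internal one, since the `SERVICE` clause is resolved by the triplestore container and not by the test runner on the host.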