We use accio to rewrite a federated query into multiple single-source queries, then combine the results locally with datafusion.
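As a rough illustration of the rewrite-then-combine idea, the sketch below answers a two-database join by hand. It is not how accio or datafusion work internally: the two sources are in-memory SQLite databases, the single-source queries are hand-written rather than produced by a rewriter, and the local combine step is a plain Python hash join standing in for datafusion. The `nation` and `region` tables and their columns are made up for the example.

```python
import sqlite3


def make_db(schema_sql, insert_sql, rows):
    """Create an in-memory SQLite database holding a single table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Two independent sources, standing in for two remote databases db1 and db2.
db1 = make_db(
    "CREATE TABLE nation (n_name TEXT, n_regionkey INTEGER)",
    "INSERT INTO nation VALUES (?, ?)",
    [("CANADA", 1), ("FRANCE", 3), ("JAPAN", 2)],
)
db2 = make_db(
    "CREATE TABLE region (r_regionkey INTEGER, r_name TEXT)",
    "INSERT INTO region VALUES (?, ?)",
    [(1, "AMERICA"), (2, "ASIA"), (3, "EUROPE")],
)

# Federated query we want to answer:
#   SELECT n.n_name, r.r_name
#   FROM db1.nation n, db2.region r
#   WHERE n.n_regionkey = r.r_regionkey

# Step 1: one single-source query per database (hand-written here).
nations = db1.execute("SELECT n_name, n_regionkey FROM nation").fetchall()
regions = db2.execute("SELECT r_regionkey, r_name FROM region").fetchall()

# Step 2: combine the partial results locally with a hash join on the region key.
region_by_key = {key: name for key, name in regions}
result = sorted(
    (n_name, region_by_key[key]) for n_name, key in nations if key in region_by_key
)
print(result)
```

Each source only ever sees a query over its own tables; the join across sources happens on the client side.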
Currently, this feature is only available when connectorx is built from source, as follows:
- Clone connectorx: `git clone [email protected]:sfu-db/connector-x.git`.
- Build connectorx from source by following the instructions. Note: for the final step, build the wheel with the `build-python-wheel-fed` command instead of `build-python-wheel`.
- Install connectorx: `pip install ${YOUR_LOCAL_CONNECTORX_PATH}/connectorx-python/target/wheels/${YOUR_WHEEL_FILE}`.
- Clone accio: `git clone [email protected]:sfu-db/accio.git`.
- Build accio: `cd accio/rewriter && mvn package -Dmaven.test.skip=true`.
- Move the jar file to `${YOUR_LOCAL_PYTHON_PATH}/site-packages/connectorx/dependencies/federated-rewriter.jar`.
- Configure accio and set the configuration path as `FED_CONFIG_PATH`. Example configurations can be found here.
- Run federated queries using connectorx!
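Once the steps above are done, a quick sanity check before the first federated query can save some debugging. The sketch below is only illustrative: `expected_jar_path`, `setup_problems`, and `run_federated_query` are hypothetical helper names, it assumes connectorx lives in the interpreter's `platlib` site-packages directory, and it assumes `FED_CONFIG_PATH` is read as an environment variable. The final helper passes a dictionary of database aliases to `cx.read_sql`, which is how connectorx accepts multiple sources for a federated query; the aliases and connection strings in the comment are placeholders.

```python
import os
import sysconfig
from pathlib import Path

# Location of the rewriter jar relative to site-packages, as given in the steps above.
JAR_RELATIVE_PATH = Path("connectorx") / "dependencies" / "federated-rewriter.jar"


def expected_jar_path(site_packages=None):
    """Return the path where the rewriter jar is expected to be."""
    # Assumption: connectorx is installed under the interpreter's platlib directory.
    base = Path(site_packages) if site_packages else Path(sysconfig.get_paths()["platlib"])
    return base / JAR_RELATIVE_PATH


def setup_problems(site_packages=None, env=None):
    """Hypothetical helper: list what is still missing before running a federated query."""
    env = os.environ if env is None else env
    problems = []
    jar = expected_jar_path(site_packages)
    if not jar.is_file():
        problems.append(f"rewriter jar not found at {jar}")
    # Assumption: FED_CONFIG_PATH is an environment variable pointing at the accio config.
    if not env.get("FED_CONFIG_PATH"):
        problems.append("FED_CONFIG_PATH is not set")
    return problems


def run_federated_query(connections, query):
    """Run a query across several databases.

    `connections` maps an alias used in the query to a connection string, e.g.
    {"db1": "postgresql://user:pass@host1:5432/a", "db2": "postgresql://user:pass@host2:5432/b"}
    with a query that refers to tables as db1.<table> and db2.<table>.
    """
    import connectorx as cx  # imported lazily so the checks above run without connectorx

    return cx.read_sql(connections, query)
```

Calling `setup_problems()` with no arguments checks the current interpreter and environment; an empty list means both the jar and the configuration path were found.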
Alternatively, accio provides wrappers that can run federated queries directly on various query engines. In particular, it uses connectorx as the data fetching method when datafusion or polars is the federation engine. For more details, check out here.