This repo contains the example pytorch code for the paper [Enhancing Table Retrieval with Dual Graph Representations].
This paper/code aims to enhance table retrieval with Dual Graph Rrepresentations. We first decouple a table into the row view and column view, then build dual graphs with the consideration of table contexts. Afterward, intra-graph and inter-graph interactions are iteratively performed, and an adaptive fusion strategy is tailor-made for sophisticated table representations. In this way, the input query can match the target tables and achieve the ultimate ranking results more accurately.
See below for an overview of the model architecture:
- python3 (tested on 3.7.11)
- pytorch (tested on 1.11.0)
- networkx (tested on 2.5)
- dlg (tested on 0.8.0.post2)
Install requirements:
pip install -r requirements.txt
The code requires that you have access to the WikiTables dataset and Webquerytable dataset.
First, download and unzip FastText vectors:
wget https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.zip
unzip wiki.en.zip
Next, install trec_eval tool:
git clone https://github.com/usnistgov/trec_eval.git
cd trec_eval
make
To run cross validation on WikiTables dataset:
python run.py --dataset wikitables
Training DualG on Webquerytable dataset:
python run.py --dataset webquerytable
Model checkpoints and logs will be saved to saved/model_*.pt
and saved/results_*/*.log
, respectively.
Detailed parameter information used in configs/*.json
.
@inproceedings{liu2023enhancing,
title={Enhancing Table Retrieval with Dual Graph Representations},
author={Liu, Tianyun and Zhang, Xinghua and Zhang, Zhenyu and Wang, Yubin and Li, Quangang and Zhang, Shuai and Liu, Tingwen},
booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
pages={107--123},
year={2023}
}