Commit

Add paper QueryFormer
paul356 committed Feb 29, 2024
1 parent cffcd33 commit c20aea3
Showing 2 changed files with 6 additions and 6 deletions.
10 changes: 5 additions & 5 deletions _org/2024-02-02-feb-papers.org
@@ -8,14 +8,14 @@ nav_order: {{ page.date }}
---
#+END_EXPORT

|-------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------------------------------------------------------------------------|
| Title | Authors | Synthesis | Publisher | Keywords |
|-------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------------------------------------------------------------------------|
|-------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------------------------------------------------------------------------|
| Title | Authors | Synthesis | Publisher | Keywords |
|-------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------------------------------------------------------------------------|
| Access Path Selection in a Relational Database Management System | P. Griffiths Selinger, M. M. Astrahan, D. D. Chamberlin, R. R. Lorie, T. G. Price | This paper presents a framework for selecting access paths for single-table scans and n-table joins. The method first enumerates the single-relation scan plans, then enumerates second-level joins built from those plans, then third-level joins, and so on until all tables are included in the join plan. To reduce the number of plans considered, it introduces heuristics for early pruning that favor joins with high selectivity factors. | SIGMOD 1979 | Access Path Selection, Query Optimizer |
| A Relational Model of Data for Large Shared Data Banks | Edgar F. Codd | This paper proposes using the relational model for data management systems. It defines the notion of a relation and the operations applicable to relations, such as permutations, projections, and joins. It also suggests separating the data sublanguage into R and H: R is a high-level language for describing the characteristics of the data set, while H is taken care of by the implementers of data systems. It also touches on some aspects of schema design and data redundancy. | Information Retrieval 1970 | Relational Model |
| Table-GPT: Table-tuned GPT for Diverse Table Tasks | Peng Li, Yeye He, Surajit Chaudhuri, et al. | This paper presents a methodology for training GPT models on database tasks. The authors synthesize diverse database tasks, as (Instruction, Table, Completion) triples, to serve as training instructions. They use these synthesized tasks to tune GPT-3.5 into a table-tuned GPT-3.5. The results show improvements on complicated database tasks, though they also show two regressions. | arXiv 2023 | Large Language Model, GPT, Table-GPT |
| SlabCity: Whole-Query Optimization using Program Synthesis | Rui Dong, Jie Liu, Cong Yan, Xinyu Wang, et al. | This paper presents a synthesis-based SQL query rewrite framework that consists of a rewriter, an equivalence checker, and a performance ranker. The framework optimizes input queries, and in some cases it can also determine whether the rewritten query is equivalent to the input query. When it cannot decide, or the time bound is used up, the decision is left to the user. | PVLDB 2023 | SQL Rewrite, Program Synthesis, Equivalence Checker |
| Tiresias: Enabling Predictive Autonomous Storage and Indexing | Michael Abebe, Horatiu Lazu, Khuzaima Daudjee | This paper presents an online predictive system that can predict the upcoming query types and assess the benefit of different storage configurations such as an index, row store, or column store. The authors build a module that estimates data access arrivals based on SPAR and a hybrid ensemble. To assess whether adjusting the storage configuration is worthwhile, they also introduce a latency/cost assessment module based on a model that combines a linear model and a neural network. The authors report that their system can improve OLTP workload throughput and reduce OLAP workload latency. | VLDB 2022 | Autonomous Storage and Indexing, Latency Prediction, Data Access Arrival Prediction |
| To Partition, or Not to Partition, That is the Join Question in a Real System | Maximilian Bandle, Jana Giceva, Thomas Neumann | This paper revisits the claimed performance advantage of radix-based joins using the TPC-H workload, which is close to real system workloads. The test results contradict what previous researchers claimed. Payload size and pipeline depth, which previous researchers neglected, are the key factors that keep radix-based joins from working well in real systems. | SIGMOD 2021 | Radix Join, Hash Join, In-memory Database, TPC-H |
| | | | | |
|-------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------------------------------------------------------------------------|
| QueryFormer: A Tree Transformer Model for Query Plan Representation | Yue Zhao, Gao Cong, Jiachen Shi, Chunyan Miao | This paper presents QueryFormer, a new query plan representation model. The model uses the attention mechanism of the transformer to represent query plans. It uses techniques such as Height Encoding, Tree-Bias, and a Super Node to address the challenges posed by the tree structure of query plans. The experiments show that this representation model can be trained efficiently and leads to better cardinality and cost estimates. | VLDB 2022 | Attention, Query Plan Representation, Learned Database Model |
|-------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------------------------------------------------------------------------|
2 changes: 1 addition & 1 deletion _posts/2024-02-02-feb-papers.md
@@ -12,4 +12,4 @@ nav_order: {{ page.date }}
| SlabCity: Whole-Query Optimization using Program Synthesis | Rui Dong, Jie Liu, Cong Yan, Xinyu Wang, et al. | This paper presents a synthesis-based SQL query rewrite framework that consists of a rewriter, an equivalence checker, and a performance ranker. The framework optimizes input queries, and in some cases it can also determine whether the rewritten query is equivalent to the input query. When it cannot decide, or the time bound is used up, the decision is left to the user. | PVLDB 2023 | SQL Rewrite, Program Synthesis, Equivalence Checker |
| Tiresias: Enabling Predictive Autonomous Storage and Indexing | Michael Abebe, Horatiu Lazu, Khuzaima Daudjee | This paper presents an online predictive system that can predict the upcoming query types and assess the benefit of different storage configurations such as an index, row store, or column store. The authors build a module that estimates data access arrivals based on SPAR and a hybrid ensemble. To assess whether adjusting the storage configuration is worthwhile, they also introduce a latency/cost assessment module based on a model that combines a linear model and a neural network. The authors report that their system can improve OLTP workload throughput and reduce OLAP workload latency. | VLDB 2022 | Autonomous Storage and Indexing, Latency Prediction, Data Access Arrival Prediction |
| To Partition, or Not to Partition, That is the Join Question in a Real System | Maximilian Bandle, Jana Giceva, Thomas Neumann | This paper revisits the claimed performance advantage of radix-based joins using the TPC-H workload, which is close to real system workloads. The test results contradict what previous researchers claimed. Payload size and pipeline depth, which previous researchers neglected, are the key factors that keep radix-based joins from working well in real systems. | SIGMOD 2021 | Radix Join, Hash Join, In-memory Database, TPC-H |
| | | | | |
| QueryFormer: A Tree Transformer Model for Query Plan Representation | Yue Zhao, Gao Cong, Jiachen Shi, Chunyan Miao | This paper presents QueryFormer, a new query plan representation model. The model uses the attention mechanism of the transformer to represent query plans. It uses techniques such as Height Encoding, Tree-Bias, and a Super Node to address the challenges posed by the tree structure of query plans. The experiments show that this representation model can be trained efficiently and leads to better cardinality and cost estimates. | VLDB 2022 | Attention, Query Plan Representation, Learned Database Model |
