Querying partitioned tables #31

Dietr1ch · 2025-01-11T22:34:38Z

I want to merge multiple tables with the same schema into a single one.

Say I have a table Sales(transaction_id, date, amount) and sharded files in my file system:

sales/
  - 2024/
    - 12/
      - 30.csv
      - 31.csv
  - 2025/
    - 01/
      - 01.csv
      - 02.csv
      - 03.csv

Is there a convenient way to treat sales/**/*.csv as a single table?

So far it seems that bdt query supports 2 flags for input tables,

--table path/to/single_file.csv
- A single table is read with name single_file
--tables path/to/directory/
- Multiple tables are read, each one with its own basename
  - This imports N tables
  - I don't see much value here; why not use the shell to expand something like path/to/directory/*.csv?

I'd like a new input flag that takes a table name followed by a set of (compatible) files:

# The shell expands the glob before bdt sees it (in bash, ** needs `shopt -s globstar`)
bdt query \
  --partitioned_table sales sales/**/*.csv \
  --sql "
    select
      count(*)
    from
      sales
  "

After expansion, this would be a flag with 1+N arguments, --partitioned_table sales sales/2024/12/30.csv sales/2024/12/31.csv sales/2025/01/01.csv sales/2025/01/02.csv sales/2025/01/03.csv, and it would make the table sales available.

Is there a way to get this today? I tried the --tables flag, but instead got N different tables that were hard to work with as a unit.

It's not hard to create a single file that concatenates all the tables, but it'd be nice not to need one. Skipping that step would allow writing queries straight from the shell: with a tiny rewrite, --partitioned_table sales sales/2024/12/*.csv would get me info about sales in December 2024 without any unnecessary disk writes.
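
For reference, the concatenation workaround mentioned above could look roughly like the sketch below. concat_csv is a hypothetical helper name, not part of bdt. It assumes that every shard begins with the same one-line header and that the files are passed in the order they should appear.

```shell
# Hypothetical helper (not part of bdt): merge CSV shards that share a header.
# Assumption: each file's first line is the header, identical across shards.
concat_csv() {
  # NR counts lines across all files, FNR restarts at 1 for each file.
  # Keep the very first line (the header) and drop every later file's header.
  awk 'FNR == 1 && NR != 1 { next } { print }' "$@"
}

# Possible usage, assuming --table names the table after the file's basename:
#   concat_csv sales/2024/12/*.csv > /tmp/sales.csv
#   bdt query --table /tmp/sales.csv --sql "select count(*) from sales"
```

This still writes a temporary file, which is exactly the extra step the proposed flag would avoid, so it is only a stopgap.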
