You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is there a convenient way to treat sales/**/*.csv as a single table?
So far it seems that bdt query supports 2 flags for input tables,
--table path/to/single_file.csv
A single table is read with name single_file
--tables path/to/directory/
Multiple tables are read, each one with it's own basename
This imports N tables
I don't see much value here, why not using the shell to expand something like path/to/directory/*.csv?
I kind of want a new input file flag that expects a table name, and a set of (compatible) files,
bdt query \--partitioned_table sales sales/**/*.csv \ # Shell will expand these globs--sql" select count(*) from sales"
Which would use a flag with 1+N arguments, --partitioned_table sales sales/2024/12/30.csv sales/2024/12/31.csv sales/2025/01/01.csv sales/2025/01/02.csv sales/2025/01/03.csv, and make the table sales available.
Is there a way to get this today? I tried the --tables flag, but instead got N different tables that were hard to work with as a unit.
It's not hard to create a single file that concatenates all tables, but I'd nice not needing to create it as it'd allow writing queries from the shell, with a tiny rewrite --partitioned_table sales sales/2024/12/*.csv would get me info about sales in December 2024 without any made-up disk writes.
The text was updated successfully, but these errors were encountered:
I want to merge multiple tables with the same into a single one.
Say I have a table
Sales(transaction_id, date, amount)
and sharded files in my file system,Is there a convenient way to treat
sales/**/*.csv
as a single table?So far it seems that
bdt query
supports 2 flags for input tables,--table path/to/single_file.csv
single_file
--tables path/to/directory/
path/to/directory/*.csv
?I kind of want a new input file flag that expects a table name, and a set of (compatible) files,
Which would use a flag with 1+N arguments,
--partitioned_table sales sales/2024/12/30.csv sales/2024/12/31.csv sales/2025/01/01.csv sales/2025/01/02.csv sales/2025/01/03.csv
, and make the tablesales
available.Is there a way to get this today? I tried the
--tables
flag, but instead got N different tables that were hard to work with as a unit.It's not hard to create a single file that concatenates all tables, but I'd nice not needing to create it as it'd allow writing queries from the shell, with a tiny rewrite
--partitioned_table sales sales/2024/12/*.csv
would get me info about sales in December 2024 without any made-up disk writes.The text was updated successfully, but these errors were encountered: