-
Notifications
You must be signed in to change notification settings - Fork 73
Luau Development
Joel Natividad edited this page Dec 21, 2024
·
1 revision
The luau
command provides powerful data transformation capabilities using Luau scripting (version 0.653). It serves as qsv's Domain-Specific Language (DSL) for data wrangling, allowing you to create new columns, filter rows, and perform complex calculations.
qsv luau <subcommand> [options] <args>... [<input>]
Subcommands:
map - Create new columns by mapping Luau script results
filter - Filter rows based on Luau script conditions
There are 3 ways to reference columns in your Luau scripts:
- Direct column name: Amount (can be disabled with -g)
- Using col table: col.Amount or col["Total Balance"]
- Column index (1-based): col[1] (requires --colindex or --no-headers)
-
_IDX
: Current row number (read-only) -
_INDEX
: Enable random access mode for indexed files -
_ROWCOUNT
: Total number of rows -
_LASTROW
: Last row number in the file
Scripts can have three sections:
-
BEGIN { ... }!
: Initialization code - Main script: Row processing logic after BEGIN
-
END { ... }!
: Final calculations/cleanup
-- Running sum
qsv_cumsum("name", value) -- Returns cumulative sum
-- Running product
qsv_cumprod("name", value) -- Returns cumulative product
-- Running min/max
qsv_cummin("name", value) -- Returns running minimum
qsv_cummax("name", value) -- Returns running maximum
-- Boolean operations
qsv_cumany("name", condition) -- Returns true if any value was true
qsv_cumall("name", condition) -- Returns true if all values were true
-- Access previous values
qsv_lag("name", value, lag_periods, default)
-- Calculate differences
qsv_diff("name", value, periods)
-- Load CSV as lookup table
qsv_loadcsv("table_name", "path/to/file.csv", "key_column")
-- Load JSON data
qsv_loadjson("table_name", "path/to/file.json")
-- Register remote lookup table
qsv_register_lookup("table_name", "lookup_uri", cache_age_secs)
-- Logging
qsv_log(level, message, ...) -- Levels: info, warn, error, debug, trace
-- Flow control
qsv_break(message) -- Stop processing with message
qsv_skip() -- Skip current row in output
-- File operations
qsv_writefile(filename, data)
qsv_fileexists(filepath)
-- Environment variables
qsv_setenv(name, value)
qsv_getenv(name)
-- Shell commands
qsv_cmd(qsv_args) -- Execute qsv command
qsv_shellcmd(cmd, args) -- Execute allowed shell commands
- Basic Column Creation
# Add a new column that doubles the Amount
echo 'Amount\n10\n20\n30' | qsv luau map 'doubled' 'Amount * 2'
- Running Totals with BEGIN/END Blocks
# Create a script file named totals.lua
cat > totals.lua << 'EOF'
BEGIN {
total = 0
grand_total = 0
}!
total = total + Amount
grand_total = grand_total + total
return total
END {
return ("Grand total: " .. grand_total)
}!
EOF
# Run the script
qsv luau map 'Running_Total' 'file:totals.lua' input.csv
- Using Lookup Tables
# Create a script file named lookup.lua
cat > lookup.lua << 'EOF'
BEGIN {
-- Load lookup table from CSV
states = qsv_register_lookup("states", "states.csv", 3600)
if not states then
qsv_break("Failed to load states lookup table")
end
}!
-- Look up state name from abbreviation
return states[StateCode]["Name"]
EOF
# Run the script
qsv luau map 'State_Name' 'file:lookup.lua' input.csv
- Random Access Processing
# Create a script file named percent_of_total.lua
cat > percent_of_total.lua << 'EOF'
BEGIN {
qsv_autoindex() -- Create index for random access
total_amount = 0
-- First pass: calculate total
for idx = 1, _LASTROW do
_INDEX = idx
total_amount = total_amount + tonumber(Amount)
end
-- Reset to start processing rows
_INDEX = 1
}!
-- Calculate percentage of total for each row
local pct = (tonumber(Amount) / total_amount) * 100
_INDEX = _INDEX + 1
return string.format("%.1f%%", pct)
END {
return ("Processed " .. _ROWCOUNT .. " rows, Total Amount: " .. total_amount)
}!
EOF
# Run the script
qsv luau map 'Percent_Of_Total' 'file:percent_of_total.lua' input.csv
- Multiple Column Output
# Create multiple columns in one pass
qsv luau map 'col1,col2,col3' '{value + 1, value * 2, value ^ 2}'
- Use
--no-globals
when possible for better performance - Create indexes for random access operations
- Use
--colindex
when working with numeric column positions - Consider using the progress bar (-p) for long-running operations
- Invalid scripts return
<ERROR>
in map mode - While developing/debugging Luau scripts, set the
QSV_LOG_LEVEL
environment variable for detailed error messages - Use
--max-errors
to control error tolerance