-
Notifications
You must be signed in to change notification settings - Fork 14
/
Copy pathother.malloynb
47 lines (42 loc) · 1.22 KB
/
other.malloynb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
>>>markdown
# Bucketing with 'Other'
Often you want to limit the number of group-by values in a table, and bucket everything else into an 'other' category.
In the `top_states_by_eleveation` query below, we have a query with two stages. The first stage calculates the top states and nests the data to be aggregated. The second pipeline stage produces the actual aggregation.
>>>malloy
source: airports is duckdb.table('../data/airports.parquet') extend {
measure:
airport_count is count()
avg_elevation is elevation.avg()
view: top_states_by_elevation is {
group_by: state
aggregate: avg_elevation
calculate: row_num is row_number()
nest: data is {
group_by: code, elevation
}
} -> {
group_by: state is
pick state when row_num < 5
else 'OTHER'
aggregate:
avg_elevation is data.elevation.avg()
airport_count is data.count()
}
}
>>>markdown
## Basic Query
>>>malloy
#(docs) size=large limit=5000
run: airports -> top_states_by_elevation
>>>markdown
## Nested Query
>>>malloy
#(docs) size=large limit=5000
run: airports -> {
group_by: `Facility Type` is fac_type
aggregate:
airport_count
avg_elevation
nest: top_states_by_elevation
}
>>>markdown