Decimal type design #5274
---
rug looks like it's GPL due to wrapping GNU code, so we can't use that, right? Are you aware of license-compatible crates that do what we need? When I was looking around for cockroach there were a few packages that could represent decimals, but none that could actually operate on them correctly and not slowly (much less fast). I'm on board with any change that increases our correctness, and we can iterate on performance after that. However, the next smallest increment of work that gets us more correctness may be huge, and we should delay it for as long as possible. From the cockroach/Go/apd perspective, there are two main performance problems.
This is not a small amount of work. It's on the order of months to get a crate written, tested, and integrated into mz. Are there specific bugs or user issues that have occurred, or that are preventing new uses of mz?
---
We had some questions raised about overflow. I'm not personally facing anything urgent, but it seemed that at some point you'd want an arbitrary-precision thing, for the folks who want to opt in to "certainly correct" answers and out of "best performance/memory use".
---
I hope this doesn't sound too mean, but there is one thing not on the list, namely "minimize implementation effort": i.e., how much work do we want to put in to, e.g., track precision correctly. IIRC that was one of the highest priorities at the time, because we had other things to do. It could now be demoted, and one way forward would be to actually track precision correctly. I understand we might prefer the APD approach instead (and I like that too), but if we needed something working by week's end, we could try to fix what we have.
---
As it turns out—I knew this once, but had forgotten—so our options now are:
None of these options are particularly appealing. Neither (1) nor (2) uses an optimized memory representation, so they're going to hurt memory-usage wise: https://github.com/gcc-mirror/gcc/blob/ea74a3f548eb321429c371e327e778e63d9128a0/libdecnumber/decNumber.h#L76-L84 (3) is hard for the obvious reason. (4) starts to look more and more appealing, but the task needs an owner.
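For context on the memory cost: a general arbitrary-precision decimal like decNumber (linked above) carries a per-value header plus a variable-length significand, whereas a fixed-scale datum is just 16 inline bytes. A rough Rust sketch — the field names loosely mirror decNumber's header but are illustrative, not the actual C layout:

```rust
// Hedged sketch: why a general arbitrary-precision decimal costs more
// memory than a bare i128. Field names loosely mirror decNumber's header
// (digits, exponent, flags) but this is illustrative, not the real
// C layout from libdecnumber.
#[allow(dead_code)]
struct ArbitraryDecimal {
    digits: i32,     // number of significant digits
    exponent: i32,   // where the decimal point goes
    flags: u8,       // sign, NaN, infinity markers
    units: Vec<u16>, // variable-length significand, heap-allocated
}

fn main() {
    // A fixed-scale datum is 16 bytes, stored inline, no pointer chasing.
    println!("i128: {} bytes", std::mem::size_of::<i128>());
    // The general form pays for a header plus a separate heap allocation
    // for the digits themselves.
    println!(
        "arbitrary-precision header: {} bytes, plus the heap units",
        std::mem::size_of::<ArbitraryDecimal>()
    );
}
```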
---
How is (4) not the same as (3)? Cockroach had similar problems with its first decimal implementation. I started fixing a few of them and then realized that it was fundamentally wrong and that it would be easier to implement from scratch. Do you have reason to believe we can fix the current library without it turning into a full rewrite? I'm also on board with the argument: just make incremental improvements to address specific bugs, and that's good enough. (Although I'm still worried some of those will be rather large, due to the significant mismatch between what we have and what we need.)
---
One thing to note is that most of the modern data ecosystem only supports decimals with precision up to 38 (e.g., Spark, Flink, Arrow, Snowflake) because they use i128s underneath (usually with i64s backing decimals with precision <= 18), so I imagine most users would be comfortable with this limitation, especially if removing it would decrease performance for common use cases. Looking at Nubank, even though arbitrary precision
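To make that limit concrete, here's a hedged sketch (illustrative, not Materialize code) of why 38 is the natural precision cap for an `i128`-backed decimal:

```rust
// Hedged sketch (not Materialize code): why 38 is the natural precision
// cap for an i128-backed decimal. i128::MAX has 39 decimal digits, so any
// 38-digit significand fits, with headroom left over for the sign.
fn main() {
    // i128::MAX = 170141183460469231731687303715884105727
    let digits = i128::MAX.to_string().len();
    println!("i128::MAX has {digits} decimal digits"); // 39

    // The largest 38-digit significand fits comfortably.
    let largest_38_digit: i128 = 10i128.pow(38) - 1;
    assert!(largest_38_digit < i128::MAX);

    // And an i64 covers precision <= 18, as noted above.
    let largest_18_digit: i64 = 10i64.pow(18) - 1;
    assert!(largest_18_digit < i64::MAX);
}
```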
---
Our decimal type was added in a hurry back in summer 2019 to get the TPC-H queries working; those queries make heavy use of decimal types to represent money. As a result, its design priorities wound up being:

This is no longer the right priority order. Users expect decimal arithmetic to be exact, even at the boundaries, so correctness should now be the top priority. And compatibility with PostgreSQL has shaped up to be more important than compatibility with the SQL standard.
To quickly summarize the current design: decimals are stored as a 128-bit signed integer (Rust's `i128` type). The scale information, i.e., "where the decimal point goes", is stored separately in the type of the datum. This is at variance with PostgreSQL, where scale information is carried in the decimal value itself. So where we can say "every datum in this column is a decimal with two decimal places", PostgreSQL can only say "every datum in this column is some decimal", and you have to actually go look at each row in the column to see what its scale is.

This design was largely engineered for efficient differential reductions. With a uniform scale, to sum up a pile of decimals, we just drop the decimal values into differential's `diff` field and do a fast, in-place accumulation of a bunch of `i128`s.

There are a few limitations with this design across the board:
- Carrying decimal scale information in the type does not fit well with PostgreSQL's type system. Decimals are special-cased all throughout the planner—in casting logic, in function selection logic, and so on—in order to hack in support for unifying scale information. And the rules for this are ad hoc, since we've been making them up as we go.
- There is no good type to hold the accumulation of many decimal values. When we sum int4s, we output an int8. When we sum int8s, we output a decimal. But when we sum decimals... we output another decimal. Hopefully you weren't using all the bits in your `i128`s.
- The precision information in the decimal type is a total joke. It's not used for anything except futzing with the output scale of multiplication and division operations in a really confusing way.
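To make the summary above concrete, here's a minimal sketch of the fixed-scale scheme; the type and field names are illustrative, not Materialize's actual code:

```rust
// Minimal sketch of the current design: datums are scaled i128
// significands, and the scale lives in the column's type, shared by
// every datum. Names here are illustrative, not Materialize's code.
#[derive(Clone, Copy)]
struct DecimalColumnType {
    scale: u32, // digits after the decimal point, uniform per column
}

fn main() {
    let ty = DecimalColumnType { scale: 2 };
    // 1.25, 2.50, 3.75 stored as scaled integers: 125, 250, 375.
    let data: Vec<i128> = vec![125, 250, 375];

    // Because the scale is uniform, SUM is plain integer accumulation,
    // which is exactly what dropping values into differential's `diff`
    // field relies on.
    let sum: i128 = data.iter().sum();
    println!("sum = {}.{:02}", sum / 100, sum % 100); // sum = 7.50

    // The catch noted above: the sum is still an i128 at the same scale,
    // so a large enough aggregation can overflow. checked_add makes the
    // hazard explicit instead of wrapping or panicking.
    let checked: Option<i128> = data.iter().copied().try_fold(0i128, i128::checked_add);
    assert!(checked.is_some());
    let _ = ty.scale; // in real code, scale drives formatting and casts
}
```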
@frankmcsherry recently floated the idea of using arbitrary-precision numbers (e.g., rug) as the backing storage for decimals, rather than `i128`s. I thought it would be good to start sketching out a design for this. Do we still want to keep decimal scale information in the type? Should we look into arbitrary-precision decimal libraries, rather than arbitrary-precision integer libraries, to offload even more work? (Last time I looked, there didn't seem to be any compelling Rust APD libraries.) Will the performance hit be acceptable?

/cc @frankmcsherry @mjibson