Document number: P0554R1
Date: 2019-06-17
Reply-to: John McFarlane, [[email protected]](mailto:[email protected])
Audience: SG6, LEWGI
- Revisions
- Introduction
- Goals
- Definitions
- Prior Art
- Componentization
- Composition
- Advanced Topics
- Acknowledgements
- References
Revision 1 (This revision):
- Replaced fixed_point with scaled_integer
- Replaced safe_integer with overflow_integer
- Dropped explanation of fixed-point arithmetic
- Dropped discussion of fixed-point division
- Added section on construction of elastic_integer from promotion_integer and narrow_integer
- Removed several advanced topics: Leaky Abstractions, Generic Numeric Functions
- Paragraph discussing P0880
- Remove explicit reference in favor of inline hyperlinks
- Numerous grammar and formatting fixes
One of the goals of the Numerics TS is to add general-purpose arithmetic types to the library [P0101]. These types are to provide much needed safety and correctness features which are absent from fundamental types.
This paper puts the case for expressing those features as individual components which can then be composed into types with the desired feature set. This approach is referred to as the compositional approach and is characterized by class templates with type parameters:
// approximate a real number by scaling an int32_t by 2^-16
using f = scaled_integer<int32_t, power<-16, 2>>;
The compositional approach is contrasted with the comprehensive approach which aims to achieve similar goals using class templates with non-type parameters:
// approximate a real number with 15 integer and 16 fractional digits
using f = scaled_integer<31, power<-16, 2>>;
The design space of arithmetic types aims to satisfy goals which fall into three competing categories: efficiency, correctness and usability.
One aim of a library-based solution should be to maintain the efficiency of fundamental arithmetic types. Some run-time costs are inevitable additions to existing computation. All other run-time costs should be avoided in pursuit of zero-cost abstraction.
Further, a new wave of specialized hardware is reaching maturity in the form of heterogeneous processing units. These include the GPUs which provide the majority of computing power on modern PCs and mobile devices. The design and capabilities of these units are continually changing in ways which are difficult to predict. An interface which can easily be employed to harness these capabilities is highly valuable, and it may improve on the performance of existing fundamental types.
Over the past few decades, floating-point performance has largely caught up with integer performance. However, this has come at the cost of increased energy consumption. As performance per watt becomes the limiting factor for cloud and mobile platforms, so the economic impetus to use integers to approximate real numbers increases.
In summary, three efficiency-driven goals are:
- zero-cost abstraction over existing fundamental types;
- extensibility and
- increased applicability of integer hardware units.
Native numeric types have a variety of characteristics that can make them difficult to use correctly. Among those within the scope of this document are:
Integers:
- operator overflow from: cast to float, shift left, multiplication, addition;
- conversion overflow from: cast to narrower types and unsigned types;
- undefined signed overflow vs modulo unsigned overflow;
- surprising promotion rules;
- divide-by-zero;
- underflow/flushing;
- variation across architectures;
- endianness;
- most negative value;
- inconsistent/inaccurate rounding;
- lack of fractional digits.
Floating-point:
- precision varied by scale;
- silent transition to special values;
- special values causing loss of strict weak ordering;
- lack of associativity and commutativity;
- non-determinism;
- divide-by-zero.
One reason the list is shorter for floating-point is the extra work it performs at run-time. But opportunities exist to address many of these items without run-time cost: increasingly, smarter numeric types can shift cost to the compiler. However, one problem is that the range of solutions is wide, with no single solution to satisfy all use cases.
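A few of the items listed above can be observed directly with compile-time checks. The following is a minimal sketch rather than an exhaustive catalogue; the variable names are chosen for illustration only, and the floating-point check assumes IEEE 754 double arithmetic.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Surprising promotion: the sum of two uint8_t values has type int.
static_assert(std::is_same_v<decltype(std::uint8_t{} + std::uint8_t{}), int>,
              "operands narrower than int are promoted before arithmetic");

// Unsigned overflow is well defined and wraps modulo 2^N.
constexpr unsigned wrapped = std::numeric_limits<unsigned>::max() + 1u;
static_assert(wrapped == 0u, "unsigned arithmetic wraps around");

// Floating-point addition is not associative: 1.0 is absorbed by -1e20
// when the two are added first (assumes IEEE 754 double).
constexpr double left_first  = (1e20 + -1e20) + 1.0;
constexpr double right_first = 1e20 + (-1e20 + 1.0);
static_assert(left_first != right_first, "grouping changes the result");
```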
There are two overwhelming reasons why any new arithmetic types should emulate existing fundamental ones:
- They require little more than a rudimentary grasp of elementary algebra in order to start learning.
- They are a language feature with which almost all users have a familiarity (if not a mastery).
It is true that many users complain that arithmetic types are hard to use. What they often mean is that they are hard to use correctly: the learning curve is mostly smooth but hits a bump whenever a Correctness issue is encountered.
A class is recognized as an arithmetic type if it possesses certain similarities with fundamental arithmetic types. In particular, it must have value semantics and arithmetic operators. It is often composed of simpler types such as fundamental arithmetic types. However, it is only said to have a compositional interface if the choice of type is a feature of its interface. Thus compositional interfaces tend to be part of class templates.
For example,
template<class Rep>
class A {
Rep data;
};
template<int Bits, bool Signed>
class B;
template<>
class B<32, false> {
uint32_t data;
};
A<uint32_t> a;
B<32, false> b;
Both a and b are composed of the same data type, but only A, the type of a, is described as having a compositional interface.
An important property of compositional interfaces is that they can be nested:
template<class T>
class C;
C<A<T>> c;
Examples of existing types with a compositional interface include:
- integer types which test for overflow at run-time [Chromium, P0228, type_safe];
- fixed-point real number approximations [P0037, fp];
- arbitrarily wide integers [Boost.Multiprecision];
- auto-widening integers [elastic_integer];
- complex numbers (std::complex) and
- statically scaled time intervals (std::chrono::duration).
Counter-examples include Lawrence Crowl's fixed-point paper [P0106] which proposes a set of comprehensive class templates. These produce types which solve all or most of the integer correctness issues listed in the Correctness section.
Other counter-examples include:
- arbitrarily wide integers [P0104, wide_int, Boost.Multiprecision] and
- an integer type which detects overflow conditions [bounded::integer].
(Note that [Boost.Multiprecision] earns a place on both lists as it uses both compositional and non-compositional techniques.)
This section introduces several examples of arithmetic components and explores how they behave when specialized using fundamental types. Each is compared to equivalent comprehensive classes and functionality from [P0106].
There are many examples of libraries which aim to add run-time error detection to integers. These are useful for addressing:
- uninitialized data;
- divide by zero;
- overflow due to arithmetic operations;
- overflow due to cast from wider integer;
- overflow due to cast from floating-point type and
- most negative value.
The integral (signed) and cardinal (unsigned) class templates from [P0106] perform such a role using non-type template parameters, e.g.
template<int Crng, overflow Covf = overflow::exception>
class integral;
where Crng specifies the range of the value in bits and overflow is an enum which specifies what happens if overflow is detected. The values of overflow include:
- undefined - much like signed integer overflow behavior;
- abort - forces program termination;
- exception - throws an exception;
- saturate - limits the value to the maximum/minimum allowed by the range.
The complete list of values is found in [P0105] and covers a wide range of use cases. However, it is necessarily finite. If any new mode is desired, it must wait on a revision to the standard.
The compositional equivalent might look like
template<class Rep = int, class Tag = contract_overflow_tag>
class overflow_integer {
Rep rep_;
public:
// ...
};
where Tag is an alternative to P0106's overflow enumeration and provides a choice of behaviors to exhibit when overflow is detected. Choices may include:
- contract_overflow_tag - makes out-of-range a pre/post-condition violation;
- saturate_overflow_tag - clamps out-of-range values to minimum/maximum value;
- modulo_overflow_tag - wraps out-of-range value a la unsigned and
- sticky_overflow_tag - saturates and stays saturated (like IEEE 754).
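To illustrate why a tag type is more open-ended than an enumeration, here is a minimal sketch of tag-dispatched overflow handling. It is not the proposed interface: handle_overflow and throwing_overflow_tag are hypothetical names, the default tag is saturation (the paper's default, contract_overflow_tag, would need contract support), only addition is provided, and Rep is assumed to be a signed type narrower than long long.

```cpp
#include <limits>
#include <stdexcept>
#include <type_traits>

struct saturate_overflow_tag {};
struct throwing_overflow_tag {};  // hypothetical mode added without touching the class

// Saturation: clamp an out-of-range value to the limits of Rep.
template<class Rep>
constexpr Rep handle_overflow(saturate_overflow_tag, long long value) {
    using limits = std::numeric_limits<Rep>;
    return value > limits::max()    ? limits::max()
         : value < limits::lowest() ? limits::lowest()
                                    : static_cast<Rep>(value);
}

// Hypothetical user-supplied behavior: throw on an out-of-range value.
template<class Rep>
constexpr Rep handle_overflow(throwing_overflow_tag, long long value) {
    using limits = std::numeric_limits<Rep>;
    if (value > limits::max() || value < limits::lowest()) {
        throw std::overflow_error("overflow_integer: value out of range");
    }
    return static_cast<Rep>(value);
}

template<class Rep = int, class Tag = saturate_overflow_tag>
class overflow_integer {
    // Assumption of this sketch: results are computed in long long.
    static_assert(std::is_signed_v<Rep> && sizeof(Rep) < sizeof(long long),
                  "sketch supports signed types narrower than long long");
    Rep rep_;
public:
    constexpr overflow_integer(long long value)
        : rep_(handle_overflow<Rep>(Tag{}, value)) {}
    constexpr Rep data() const { return rep_; }

    // Add in the wider type, then let the tag decide what to do.
    friend constexpr overflow_integer operator+(overflow_integer a, overflow_integer b) {
        return overflow_integer(static_cast<long long>(a.rep_) + b.rep_);
    }
};
```

Supporting a further mode in this sketch only requires another tag type and another handle_overflow overload; the class template itself is unchanged.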
In practice, all of these integer types (overflow_integer, integral and cardinal) behave similarly to fundamental integers until overflow occurs:
overflow_integer<uint8_t> n{255}; // interface hints that n resembles uint8_t
++ n; // but when its range is exceeded, the overflow is trapped (by default, a contract violation)
overflow_integer<int, saturate_overflow_tag> i = 1e100; // i == INT_MAX
Crucially, overflow_integer<T> gives the user a strong hint that they can expect similar characteristics to T. If the user is familiar with int and chooses it to specialize overflow_integer, they should have a good idea of how it will behave.
But the most remarkable safety feature of [P0106]'s types is not supported by overflow_integer at all...
Elasticity is the ability to expand range in order to contain a growing value. For instance, when two 8-bit numbers are summed, the result cannot exceed 9 bits. And when two 20-bit numbers are multiplied, the result cannot exceed 40 bits. Thus, an elastic type tracks the exact number of bits used and returns suitably widened results from binary arithmetic operations. This eliminates a common class of overflow errors without the need for run-time checks.
[P0106]'s integer types achieve elasticity by adjusting range:
integral<4> n = 15; // 4-digit signed integer
auto nn = n * n; // 8-digit signed integer
Consider an equivalent compositional type, elastic_integer:
template<int Digits, class Narrowest = int>
class elastic_integer;
Digits is equivalent to Crng - the number of digits of capacity. Narrowest serves a similar purpose to the Rep parameter of overflow_integer. However, it is not guaranteed to be the type of elastic_integer's member variable. For example,
elastic_integer<10, int16_t> n;
uses an int16_t to store a value in the range, [-1023..1023]. To avoid overflow, elastic_integer's binary arithmetic operators 'stretch' the value of Digits:
auto nn = n * n; // elastic_integer<20, int16_t>
As int16_t cannot store a 20-bit value, a wider type is used.
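The 'stretching' of Digits can be sketched in a few lines. This is a simplified illustration, not the proposed elastic_integer: the names elastic_sketch and signed_rep_t are hypothetical, the Narrowest parameter is omitted (the narrowest signed fundamental type is always chosen), and widths above 63 digits are rejected.

```cpp
#include <cstdint>
#include <type_traits>

// Narrowest signed fundamental type with at least Digits value bits.
template<int Digits>
using signed_rep_t =
    std::conditional_t<(Digits <= 7), std::int8_t,
    std::conditional_t<(Digits <= 15), std::int16_t,
    std::conditional_t<(Digits <= 31), std::int32_t, std::int64_t>>>;

template<int Digits>
class elastic_sketch {
    static_assert(Digits > 0 && Digits <= 63, "no fundamental type is wide enough");
public:
    using rep = signed_rep_t<Digits>;
    constexpr explicit elastic_sketch(rep value) : value_(value) {}
    constexpr rep data() const { return value_; }
private:
    rep value_;
};

// A sum needs one more digit than its widest operand.
template<int A, int B>
constexpr auto operator+(elastic_sketch<A> a, elastic_sketch<B> b) {
    using result = elastic_sketch<(A > B ? A : B) + 1>;
    using rep = typename result::rep;
    return result(static_cast<rep>(static_cast<rep>(a.data()) + static_cast<rep>(b.data())));
}

// A product needs as many digits as both operands combined.
template<int A, int B>
constexpr auto operator*(elastic_sketch<A> a, elastic_sketch<B> b) {
    using result = elastic_sketch<A + B>;
    using rep = typename result::rep;
    return result(static_cast<rep>(static_cast<rep>(a.data()) * static_cast<rep>(b.data())));
}
```

Because the result type is computed from the operand types, no run-time check is involved: the widening is decided entirely at compile time.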
Narrowest might seem superfluous, given that Digits can override the type's width. However, it serves two important purposes.
Firstly, storage requirements vary depending on use case. When serializing to file, compactness may be preferable, in which case int8_t or uint8_t is appropriate. In most other situations, performance is paramount and int makes a better choice. It is hoped that Narrowest makes the most sense to users who are already familiar with this trade-off.
Secondly, Narrowest determines the set of types from which a member variable will be chosen. If int8_t is specified, then int16_t, int32_t and int64_t will be used as width dictates, whereas uint8_t will expand to uint16_t, uint32_t and uint64_t.
There is nothing to prevent Narrowest being a custom arithmetic type of the user's choosing. One stipulation, however, is that elastic_integer is able to choose a wider type as required. The details of how this is achieved are covered in [P1751].
Fixed-point numbers have always held certain efficiency and correctness advantages over floating-point types. Modern language features mean that they are also much simpler to work with.
[P0106]'s integral and cardinal class templates are complemented by fixed-point templates, negatable and nonnegative respectively, e.g.:
template<int Crng, int Crsl, rounding Crnd = rounding::tie_to_odd, overflow Covf = overflow::exception>
class negatable;
They come with two additional template parameters. Crsl specifies resolution. For example,
nonnegative<8, -4> n;
instantiates a variable with 8 integer bits and 4 fractional bits. Its maximum, minimum and lowest values are 255.9375, 0.0625 and 0 respectively.
Crnd specifies the rounding mode as described in [P0105]. The default, rounding::tie_to_odd, represents the best information preservation. The list of alternatives supplied by rounding is extensive but, just as with overflow, it is finite.
An alternative, compositional fixed-point solution is scaled_integer, proposed in [P0037]:
template<int Exponent = 0, int Radix = 2>
class power;
template<class Rep = int, class Scale = power<>>
class scaled_integer;
power is a helper type, rather like a parameterised tag type. It represents a compile-time scaling factor. Exponent governs scaling in much the same way as the exponent of an IEEE 754 floating-point type. It is equivalent to Crsl. Radix can be changed to 10 in order to produce a decimal fixed-point type suitable for financial (and other) applications where accurate representation of decimal fractions is essential.
Aside from its compositional interface, the most obvious difference between scaled_integer and negatable is that scaled_integer does not handle rounding. Rounding and scaling are orthogonal concerns. It should be possible to devise a rounding_integer component which handles rounding separately. See Rounding for further discussion.
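The role of the Scale parameter can be illustrated with a minimal sketch. It reuses the names power and scaled_integer from the declarations above but is not the [P0037] design: the helpers int_pow and rep_per_unit, the from_rep tag and the conversion through double are assumptions made for brevity, conversion truncates rather than rounds, and only multiplication is shown.

```cpp
#include <type_traits>

template<int Exponent = 0, int Radix = 2>
struct power {
    static constexpr int exponent = Exponent;
    static constexpr int radix = Radix;
};

// radix^n for n >= 0 (hypothetical helper).
constexpr double int_pow(int radix, int n) {
    double result = 1.0;
    for (int i = 0; i < n; ++i) result *= radix;
    return result;
}

// Number of stored integer steps per unit of real value, i.e. Radix^-Exponent.
template<class Scale>
constexpr double rep_per_unit() {
    return Scale::exponent < 0 ? int_pow(Scale::radix, -Scale::exponent)
                               : 1.0 / int_pow(Scale::radix, Scale::exponent);
}

template<class Rep = int, class Scale = power<>>
class scaled_integer {
    Rep rep_;
public:
    struct from_rep {};  // hypothetical tag: construct from an already scaled integer

    // Scale the real value, then truncate toward zero (no rounding policy).
    constexpr scaled_integer(double value)
        : rep_(static_cast<Rep>(value * rep_per_unit<Scale>())) {}
    constexpr scaled_integer(from_rep, Rep rep) : rep_(rep) {}

    constexpr Rep data() const { return rep_; }
    constexpr explicit operator double() const { return rep_ / rep_per_unit<Scale>(); }
};

// Multiplication multiplies the stored integers and adds the exponents;
// no shifting is needed at run time.
template<class RepA, int ExponentA, class RepB, int ExponentB, int Radix>
constexpr auto operator*(scaled_integer<RepA, power<ExponentA, Radix>> a,
                         scaled_integer<RepB, power<ExponentB, Radix>> b) {
    using rep = decltype(a.data() * b.data());
    using result = scaled_integer<rep, power<ExponentA + ExponentB, Radix>>;
    return result(typename result::from_rep{}, a.data() * b.data());
}
```

Note that the product's Rep is whatever the underlying multiplication yields, so this sketch does nothing to prevent overflow of the stored integer.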
In the previous section, three arithmetic components are introduced.
Each component can be instantiated using fundamental integer types such as int, resulting in incremental enhancements to int.
This section illustrates ways in which nested composition can combine these enhancements.
Fixed-point arithmetic is especially susceptible to overflow. Integers typically represent small quantities whereas fixed-point values often saturate their range:
int32_t toes = 10; // uses 4 bits
scaled_integer<int32_t, power<-30>> probability = 0.9; // uses 30 bits
This means that hand-rolled fixed-point arithmetic involves keeping track of not only scale but also range:
// square a floating-point value with 16 bits of fractional precision
float square(float input) {
    auto fixed = static_cast<int32_t>(input * 65536.f); // scale by 2^16
    auto prod = int64_t{fixed} * fixed; // result must be widened
    return prod / 4294967296.f; // gotcha: scale has changed to 2^32
}
This error-prone work can be automated with a template composed of scaled_integer and elastic_integer:
template<int Digits, int Exponent, class Narrowest = int>
using elastic_scaled_integer =
scaled_integer<
elastic_integer<Digits, Narrowest>,
power<Exponent>>;
The same square function is now clearer:
float square(float input) {
auto fixed = elastic_scaled_integer<31, -16>{input};
auto prod = fixed * fixed;
return static_cast<float>(prod);
}
In both cases, a modern optimizing compiler produces the same x86-64 assembler instructions [godbolt], e.g.:
square(float):
mulss xmm0, DWORD PTR .LC0[rip]
cvttss2si eax, xmm0
pxor xmm0, xmm0
cdqe
imul rax, rax
cvtsi2ssq xmm0, rax
mulss xmm0, DWORD PTR .LC1[rip]
ret
.LC0:
.long 1199570944
.LC1:
.long 796917760
Multiplication is the easiest arithmetic operator to deal with. Addition and subtraction require that both operands first be brought to the same scale. Division is an open topic explored in [P1368].
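As a hand-rolled illustration of the rescaling that addition requires, consider adding a value with 8 fractional bits to one with 16 fractional bits. This is a sketch only; the function name add_q8_to_q16 is hypothetical.

```cpp
#include <cstdint>

// a_q8 is scaled by 2^-8 and b_q16 by 2^-16; the result is scaled by 2^-16.
constexpr std::int64_t add_q8_to_q16(std::int32_t a_q8, std::int32_t b_q16) {
    // Bring a to the scale of b by multiplying by 2^8, widening first so that
    // the rescaled value cannot overflow.
    return std::int64_t{a_q8} * 256 + b_q16;
}

// 1.5 is 384 at scale 2^-8; 0.5 is 32768 at scale 2^-16; 2.0 is 131072 at scale 2^-16.
static_assert(add_q8_to_q16(384, 32768) == 131072, "1.5 + 0.5 == 2.0");
```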
With the correct set of components, it's possible to instantiate an integer type which has excellent correctness and usability characteristics. The following composite is practically overflow-proof and performs minimal run-time checks:
template<int IntegerDigits, class Rep = int, class OverflowTag = contract_overflow_tag>
using safe_integer =
overflow_integer<
elastic_integer<IntegerDigits, Rep>,
OverflowTag>;
The order of the nesting is crucial in ensuring that checks are minimized. This is because there is overlap in the concerns of the two components: overflow_integer traps binary arithmetic overflow whereas elastic_integer avoids it. Detection of something that isn't going to happen is a needless cost. By being aware of the types involved in operations, overflow_integer can be more selective about checks. Consider overflow_integer's multiplication operator:
template<class RepA, class RepB, class OverflowTag>
auto operator*(overflow_integer<RepA, OverflowTag> const & a,
overflow_integer<RepB, OverflowTag> const & b)
{
using result_type = decltype(a.data() * b.data());
constexpr auto digits_a = std::numeric_limits<RepA>::digits;
constexpr auto digits_b = std::numeric_limits<RepB>::digits;
constexpr auto digits_result = std::numeric_limits<result_type>::digits;
if constexpr (digits_a + digits_b >= digits_result) {
// perform overflow test
}
else {
// skip overflow test
}
return a.data() * b.data();
}
We see that unnecessary run-time checks are eliminated. This holds not only for elastic_integer but also for fundamental types that are implicitly promoted:
overflow_integer<int8_t> a, b;
auto c = a * b; // c is overflow_integer<int>; no need for run-time check
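The compile-time decision at the heart of that operator can be isolated and tested on its own. The following sketch uses the same condition as the operator above; the function name needs_multiply_check is hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <utility>

// True if multiplying a RepA by a RepB could overflow the type that the
// built-in (or user-defined) operator* returns for those operands.
template<class RepA, class RepB>
constexpr bool needs_multiply_check() {
    using result_type = decltype(std::declval<RepA>() * std::declval<RepB>());
    return std::numeric_limits<RepA>::digits + std::numeric_limits<RepB>::digits
        >= std::numeric_limits<result_type>::digits;
}

// int8_t operands are promoted to int, which has room to spare: no check.
static_assert(!needs_multiply_check<std::int8_t, std::int8_t>(), "promotion avoids overflow");

// int operands produce an int result of the same width: a check is required.
static_assert(needs_multiply_check<int, int>(), "int * int may overflow");
```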
This sub-section combines all three of scaled_integer, overflow_integer and elastic_integer to make an even more feature-rich, deeply-nested composite:
template<
int Digits,
int Exponent = 0,
class Narrowest = int,
class OverflowTag = contract_overflow_tag>
using safe_scaled_integer =
scaled_integer<
overflow_integer<
elastic_integer<
Digits,
Narrowest>,
OverflowTag>,
power<Exponent>>;
This type is a run-time-efficient, overflow-proof real number approximation:
safe_scaled_integer<5, 0> a = 30;
safe_scaled_integer<2, 0> b = 2;
auto c = a * b; // safe_scaled_integer<7, 0>{60}
auto d = a + b; // safe_scaled_integer<6, 0>{32}
auto e = c / d; // safe_scaled_integer<7, 6>{1.875}
a -= 10; // safe_scaled_integer<5>{20}
++ b; // safe_scaled_integer<2>{3}
++ b; // contract violation!
The above is an abstraction over the following code...
int a = limit<overflow::exception>(-32, 31, 30);
int b = limit<overflow::exception>(-4, 3, 2);
int c = a * b;
int d = a + b;
int e = (c * (1 << 6)) / d;
a = limit<overflow::exception>(-32, 31, a - 10);
b = limit<overflow::exception>(-4, 3, b + 1);
b = limit<overflow::exception>(-4, 3, b + 1);
...where limit is a function template which handles overflow at run-time. (A similar facility is detailed in [P0105].)
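For readers who wish to run the expanded code, a stand-in for limit might look like the following. This is an assumption-laden sketch, not the facility from [P0105]: the enumeration is reduced to two hypothetical modes and the function handles int only.

```cpp
#include <stdexcept>

// Hypothetical, reduced stand-in for the overflow enumeration.
enum class overflow { exception, saturate };

// Returns value if it lies in [lower, upper]; otherwise applies the chosen mode.
template<overflow Mode>
constexpr int limit(int lower, int upper, int value) {
    if (value >= lower && value <= upper) {
        return value;
    }
    if (Mode == overflow::exception) {
        throw std::overflow_error("limit: value out of range");
    }
    return value < lower ? lower : upper;  // saturate
}
```

With this definition, the final increment of b in the listing above throws, which corresponds to the contract violation in the safe_scaled_integer version.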
Components themselves can often be further broken down into constituent parts.
An example is how elastic_integer can be built up as an alias of two other types, promotion_integer and narrow_integer:
template<class Rep = int>
class promotion_integer;
template<int MinDigits, class Narrowest = int>
class narrow_integer;
Just like elastic_integer, promotion_integer returns widened results from arithmetic operations in order to avoid overflow. But unlike elastic_integer, it doesn't remember the number of digits being used by its Rep value.
Conversely, narrow_integer remembers the number of digits being used by its Rep value, but it doesn't return widened results from arithmetic operations.
Clearly, these types were made for each other. Together, they make elastic_integer as an alias:
template<int MinDigits, class Narrowest = int>
using elastic_integer = promotion_integer<narrow_integer<MinDigits, Narrowest>>;
Why bother to break elastic_integer down like this? Well, unlike fundamental integer types, many numeric types track their own digits. They don't need elastic_integer's MinDigits parameter. In fact, it confuses matters quite a bit.
Many statically-allocated 'bignum' types track their width as part of their type. One such example is the wide_integer type from [P0539]:
template<size_t Bits, typename S>
class wide_integer;
The combination of promotion_integer and wide_integer works well but elastic_integer or narrow_integer with their additional MinDigits parameter would only get in the way:
template<int MinDigits, typename S = signed>
using wide_elastic_integer = promotion_integer<wide_integer<MinDigits, S>>;
This composite, wide_elastic_integer, is a powerful abstraction for safely storing large numbers and operating on them with less chance of overflow.
It is not claimed that a complete set of solutions has been presented so far. However, it is hoped that much of the appeal of the compositional approach is apparent. This section is a grab bag of topics of relevance to the types of a numerics TS. These topics are discussed with the compositional approach in mind.
[P0106]'s negatable and nonnegative offer a choice of rounding modes whereas their integer equivalents do not. However, rounding is orthogonal to real number approximation and is best embodied in its own component, rounding_integer:
template<class Rep, class RoundingTag = nearest_rounding_mode>
class rounding_integer;
Another component which has barely been mentioned so far is wide_integer. Multiplication of elastic_integer results in double-wide types. This quickly hits limits:
elastic_integer<> n; // typically 31 digits
auto n2 = n * n; // 62 digits
auto n4 = n2 * n2; // 124 digits
There is no standard facility to represent values wider than 64 bits with fundamental types. This limit routinely causes expressions to fail compilation. For this reason, either a wide_integer is needed or elastic_integer must handle multi-word values. Papers [P0104] and [wide_int] both explore this topic.
This paper is also light on details regarding operator overloads. However, they play a vital role in providing the semantics of arithmetic types. It is a non-trivial undertaking to support a complete set of operators for a single interface yet the compositional approach requires multiple interfaces. Thus the number of operator overloads (and special functions) is considerable. Any language features which can reduce this boilerplate are welcome.
Many functions are lightweight wrappers over a Rep type. An example of a feature which could reduce such boilerplate is the spaceship operator.
Another proposal which aims to reduce the number of operators that need to be written is P0880. Unfortunately, it is of limited use as it requires homogeneous operand types for binary operators and many of the most interesting types, notably scaled_integer and promotion_integer, insist on heterogeneous operands.
Integral constants are very useful for picking the best type for certain tasks. Compound operations on elastic_integer lead quickly to overflow. This is not helped when the initial input to the expression is already 31 digits in width due to it being initialized by an int value:
elastic_integer<2> x = 3; // x uses 2 digits
auto y = x * 2; // suddenly result uses over 33 digits to store value 6
If the value (and therefore the range) is known at compile time, a much narrower result type can be generated:
auto z = x * std::integral_constant<int, 2>{}; // only 4 digits used in result
Binary bit shift operations are a major beneficiary of integral_constant operands. In this example
auto a = scaled_integer<int, power<0>>{1};
auto b = a << std::integral_constant<int, 10>{}; // scaled_integer<int, power<10>>{1024}
both a and b store the same underlying value. No actual shift is required.
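The idea that a shift by a compile-time constant changes only the type can be sketched with a stripped-down stand-in for scaled_integer. The name scaled_sketch is hypothetical, and the type supports nothing but this one operation.

```cpp
#include <type_traits>

// The value represented is rep * 2^Exponent.
template<int Exponent>
struct scaled_sketch {
    int rep;
};

// Shifting left by a constant adds to the exponent; the stored integer is copied unchanged.
template<int Exponent, int Shift>
constexpr scaled_sketch<Exponent + Shift>
operator<<(scaled_sketch<Exponent> value, std::integral_constant<int, Shift>) {
    return {value.rep};
}
```

A shift by a run-time int could not be handled this way, because the exponent of the result type must be known during compilation.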
Another technique which uses compile-time constants to avoid run-time computation involves division.
Any division can be approximated to varying degrees by different combinations of multiplication and bit shift.
For example, to divide an int16_t by 7 using an int32_t:
int16_t divide_by_seven(int16_t n) {
return (n * 37449) >> 18;
}
Unfortunately, the coefficient is costly to calculate and the operation requires additional width. If the divisor and headroom are known at compile-time, the calculation can potentially outperform a division operation.
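How close the approximation comes depends on the coefficient chosen. The sketch below counts, for non-negative int16_t inputs only, how many results differ from n / divisor; the helper names are hypothetical. With the coefficient 37449 (2^18 / 7 rounded down), multiples of 7 such as n = 7 come out one too low, whereas rounding the coefficient up to 37450 matches n / 7 across that range. Negative inputs are left out because an arithmetic right shift rounds toward negative infinity while integer division truncates toward zero.

```cpp
#include <cstdint>

// Approximate n / divisor as (n * coefficient) >> shift, using 32-bit arithmetic.
constexpr std::int32_t approx_divide(std::int32_t n, std::int32_t coefficient, int shift) {
    return (n * coefficient) >> shift;
}

// Count the inputs in [0, 32767] for which the approximation differs from n / divisor.
constexpr int count_mismatches(std::int32_t divisor, std::int32_t coefficient, int shift) {
    int mismatches = 0;
    for (std::int32_t n = 0; n <= 32767; ++n) {
        if (approx_divide(n, coefficient, shift) != n / divisor) {
            ++mismatches;
        }
    }
    return mismatches;
}
```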
It is possible to improve on std::integral_constant
in various ways.
Consider:
template<
class Integral,
Integral Value,
int Digits = calculate_digits(Value),
int Zeros = calculate_zeros(Value)>
class const_integer {};
This type resembles integral_constant with two additional non-type parameters:
- Digits - the number of digits required to store Value;
- Zeros - the number of trailing binary 0s between the least significant binary 1 and the radix position.
For example:
auto m = const_integer<int, 10>{}; // Value = 0b1010, Digits = 4, Zeros = 1
auto n = const_integer<int, 96>{}; // Value = 0b1100000, Digits = 7, Zeros = 5
Now consider a user-defined literal which provides a shorthand for const_integer values:
auto m = 10_c; // const_integer<int, 10>{}
auto n = 96_c; // const_integer<int, 96>{}
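One possible way to implement const_integer and its literal is sketched below. The bodies of calculate_digits and calculate_zeros, the static members, and the parse_decimal helper are assumptions of this sketch; the literal handles decimal digits only (no hexadecimal or binary prefixes) and always yields an int.

```cpp
#include <type_traits>

// Number of binary digits needed to store the magnitude of value.
constexpr int calculate_digits(long long value) {
    int digits = 0;
    for (; value != 0; value /= 2) ++digits;
    return digits;
}

// Number of trailing binary zeros in value (0 for the value zero).
constexpr int calculate_zeros(long long value) {
    if (value == 0) return 0;
    int zeros = 0;
    for (; value % 2 == 0; value /= 2) ++zeros;
    return zeros;
}

template<class Integral, Integral Value,
         int Digits = calculate_digits(Value),
         int Zeros = calculate_zeros(Value)>
struct const_integer {
    static constexpr Integral value = Value;
    static constexpr int digits = Digits;
    static constexpr int zeros = Zeros;
};

// Hypothetical helper: fold the characters of a decimal literal into an int.
template<char... Chars>
constexpr int parse_decimal() {
    constexpr char text[] = {Chars...};
    int value = 0;
    for (char c : text) {
        if (c >= '0' && c <= '9') value = value * 10 + (c - '0');
    }
    return value;
}

// Numeric literal operator template: 10_c has type const_integer<int, 10>.
template<char... Chars>
constexpr auto operator""_c() {
    return const_integer<int, parse_decimal<Chars...>()>{};
}

static_assert(decltype(10_c)::digits == 4 && decltype(10_c)::zeros == 1, "10 == 0b1010");
static_assert(decltype(96_c)::digits == 7 && decltype(96_c)::zeros == 5, "96 == 0b1100000");
```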
Finally, combine these user-defined literals with class template argument deduction. The Digits and Zeros parameters of const_integer can be matched with the template parameters of composite types:
auto e = elastic_integer(10_c); // elastic_integer<4, int>{10}
auto f = scaled_integer(96_c); // scaled_integer<int, power<5>>{96}
The result is an easy-to-read formulation of tailored composite types. Unfortunately, class template argument deduction does not extend to aliases. Because of this, it is not yet possible to write definitions such as:
auto kibi = elastic_scaled_integer(1024_c); // stores value 1024 using 2 bits
auto half = kibi / 2048_c; // stores value 0.5 using 3 bits
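For the non-alias case, matching the Digits parameter of const_integer to that of elastic_integer could be expressed with a deduction guide. The following sketch uses stripped-down stand-ins for both templates: the parameters of const_integer are supplied explicitly, storage selection is omitted, and the converting constructor is an assumption of this sketch.

```cpp
#include <type_traits>

// Stand-in: Digits and Zeros are given explicitly rather than calculated.
template<class Integral, Integral Value, int Digits, int Zeros>
struct const_integer {};

// Stand-in: stores the value directly in Narrowest.
template<int Digits, class Narrowest = int>
struct elastic_integer {
    Narrowest value;

    template<class Integral, Integral Value, int ConstDigits, int Zeros>
    constexpr elastic_integer(const_integer<Integral, Value, ConstDigits, Zeros>)
        : value(static_cast<Narrowest>(Value)) {
        static_assert(ConstDigits <= Digits, "constant needs more digits than this type provides");
    }
};

// Deduction guide: take Digits and the integer type from the constant.
template<class Integral, Integral Value, int Digits, int Zeros>
elastic_integer(const_integer<Integral, Value, Digits, Zeros>)
    -> elastic_integer<Digits, Integral>;
```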
Thanks to Lawrence Crowl and Jens Maurer for valuable feedback on early drafts. Special thanks to Michael Wong and Davis Herring for extensive and invaluable support and input on this topic.