G.2.1 Model of Floating Point Arithmetic
1
In the strict mode, the predefined operations
of a floating point type shall satisfy the accuracy requirements specified here
and shall avoid or signal overflow in the situations described. This behavior
is presented in terms of a model of floating point arithmetic that builds on
the concept of the canonical form (see A.5.3).
Static Semantics
2
Associated with each floating point type is an
infinite set of model numbers. The model numbers of a type are used to
define the accuracy requirements that have to be satisfied by certain
predefined operations of the type; through certain attributes of the
model numbers, they are also used to explain the meaning of a user-declared
floating point type declaration. The model numbers of a derived type
are those of the parent type; the model numbers of a subtype are those
of its type.
3
{model number} The model numbers of a floating point type T are zero and
all the values expressible in the canonical form (for the type T), in which
mantissa has T'Model_Mantissa digits and exponent has a value greater than
or equal to T'Model_Emin. (These attributes are defined in G.2.2.)
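The parameters that fix the set of model numbers can be inspected directly. The following is a minimal sketch, assuming a hypothetical type Real (not part of this subclause) and an implementation that supports the attributes of A.5.3 and G.2.2; the values printed depend on the implementation.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Model_Parameters is
   --  Real is a hypothetical type used only for illustration.
   type Real is digits 6;
begin
   Put_Line ("Machine_Radix  =" & Integer'Image (Real'Machine_Radix));
   Put_Line ("Model_Mantissa =" & Integer'Image (Real'Model_Mantissa));
   Put_Line ("Model_Emin     =" & Integer'Image (Real'Model_Emin));
   --  Smallest positive model number:
   --  Machine_Radix ** (Model_Emin - 1).
   Put_Line ("Model_Small    =" & Real'Image (Real'Model_Small));
   --  Distance from 1.0 to the next model number above it:
   --  Machine_Radix ** (1 - Model_Mantissa).
   Put_Line ("Model_Epsilon  =" & Real'Image (Real'Model_Epsilon));
end Show_Model_Parameters;
```

Model_Small and Model_Epsilon are derived from the other three attributes, so they give a quick view of how densely the model numbers are spaced near zero and near 1.0.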
3.a
Discussion: The model is
capable of describing the behavior of most existing hardware that has
a mantissa-exponent representation. As applied to a type T, it is parameterized
by the values of T'Machine_Radix, T'Model_Mantissa, T'Model_Emin, T'Safe_First,
and T'Safe_Last. The values of these attributes are determined by how,
and how well, the hardware behaves. They in turn determine the set of
model numbers and the safe range of the type, which figure in the accuracy
and range (overflow avoidance) requirements.
3.b
In hardware that is free of arithmetic
anomalies, T'Model_Mantissa, T'Model_Emin, T'Safe_First, and T'Safe_Last
will yield the same values as T'Machine_Mantissa, T'Machine_Emin, T'Base'First,
and T'Base'Last, respectively, and the model numbers in the safe range
of the type T will coincide with the machine numbers of the type T. In
less perfect hardware, it is not possible for the model-oriented attributes
to have these optimal values, since the hardware, by definition, and
therefore the implementation, cannot conform to the stringencies of the
resulting model; in this case, the values yielded by the model-oriented
parameters have to be made more conservative (i.e., have to be penalized),
with the result that the model numbers are more widely separated than
the machine numbers, and the safe range is a subrange of the base range.
The implementation will then be able to conform to the requirements of
the weaker model defined by the sparser set of model numbers and the
smaller safe range.
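Whether an implementation has had to penalize the model-oriented attributes can be checked by comparing them with their machine counterparts. The sketch below assumes a hypothetical type Real; the outcome is implementation dependent and nothing here prescribes it.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Check_Model_Penalty is
   --  Real is a hypothetical type used only for illustration.
   type Real is digits 6;

   --  True when every model-oriented attribute has its optimal value,
   --  i.e. the value of the corresponding machine attribute.
   Unpenalized : constant Boolean :=
     Real'Model_Mantissa = Real'Machine_Mantissa
     and then Real'Model_Emin = Real'Machine_Emin
     and then Real'Safe_First = Real'Base'First
     and then Real'Safe_Last = Real'Base'Last;
begin
   if Unpenalized then
      Put_Line ("Model attributes match the machine attributes.");
   else
      Put_Line ("Model attributes are more conservative than the machine's.");
   end if;
end Check_Model_Penalty;
```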
4
{model interval} A model interval of a floating point type is any interval
whose bounds are model numbers of the type.
{model interval (associated with a value)} The model interval of a type T
associated with a value v is the smallest model interval of T that includes
v. (The model interval associated with a model number of a type consists of
that number only.)
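Because the model interval associated with a model number is that number alone, a value can be tested for being a model number with the Model attribute. This is a sketch under two assumptions: the implementation supports the Numerics Annex definition of S'Model (G.2.2), and the hypothetical type Real and function Is_Model_Number are illustrative names only. S'Model may raise Constraint_Error for values outside the safe range.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Model_Number_Test is
   --  Real is a hypothetical type used only for illustration.
   type Real is digits 6;

   --  Real'Model yields X itself when X is a model number, and an
   --  adjacent model number otherwise.
   function Is_Model_Number (X : Real) return Boolean is
   begin
      return Real'Model (X) = X;
   end Is_Model_Number;

   One   : constant Real := 1.0;
   Third : constant Real := 1.0 / 3.0;
begin
   Put_Line ("1.0     : " & Boolean'Image (Is_Model_Number (One)));
   Put_Line ("1.0/3.0 : " & Boolean'Image (Is_Model_Number (Third)));
end Show_Model_Number_Test;
```

The value 1.0 is always a model number. Whether the stored value of Third is one depends on whether the machine numbers of Real coincide with its model numbers.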
Implementation Requirements
5
The accuracy requirements for the evaluation of
certain predefined operations of floating point types are as follows.
5.a
Discussion: This subclause does
not cover the accuracy of an operation of a static expression; such operations
have to be evaluated exactly (see 4.9). It also does
not cover the accuracy of the predefined attributes of a floating point subtype
that yield a value of the type; such operations also yield exact results (see
3.5.8 and A.5.3).
6
{operand interval} An operand interval is the model interval, of the type
specified for the operand of an operation, associated with the value of the
operand.
7
For any predefined
arithmetic operation that yields a result of a floating point type T,
the required bounds on the result are given by a model interval of T
(called the result interval) defined in terms of the operand values
as follows:
8
- {result interval (for the evaluation of
a predefined arithmetic operation)} The
result interval is the smallest model interval of T that includes the
minimum and the maximum of all the values obtained by applying the (exact)
mathematical operation to values arbitrarily selected from the respective
operand intervals.
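The rule can be sketched for addition, where the extremes of the exact sum over the operand intervals occur at the interval endpoints. This is an illustration, not a definitive implementation. It assumes a hypothetical result type Real, a hypothetical wider type Wide that has the same machine radix and enough precision to stand in for exact real arithmetic on these operands, and helper functions Lower and Upper introduced here.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Result_Interval is
   type Real is digits 6;    --  hypothetical result type T
   type Wide is digits 15;   --  hypothetical stand-in for exact reals

   MM    : constant Integer := Real'Model_Mantissa;
   Small : constant Wide    := Real'Model_Small;

   --  Largest model number of Real that is <= V.
   function Lower (V : Wide) return Wide is
      E : Integer;
   begin
      if V = 0.0 then
         return 0.0;
      elsif abs V < Small then
         --  Between zero and the smallest model number in magnitude.
         if V > 0.0 then
            return 0.0;
         else
            return -Small;
         end if;
      end if;
      --  Scale so that one unit is one step of the model mantissa,
      --  round down, then scale back.
      E := Wide'Exponent (V);
      return Wide'Scaling (Wide'Floor (Wide'Scaling (V, MM - E)), E - MM);
   end Lower;

   --  Smallest model number of Real that is >= V.
   function Upper (V : Wide) return Wide is
   begin
      return -Lower (-V);
   end Upper;

   X : constant Wide := 0.1;
   Y : constant Wide := 0.2;

   --  Operand intervals are [Lower (X), Upper (X)] and
   --  [Lower (Y), Upper (Y)]; the result interval is the smallest
   --  model interval containing the sums of their endpoints.
   Sum_Lo : constant Wide := Lower (Lower (X) + Lower (Y));
   Sum_Hi : constant Wide := Upper (Upper (X) + Upper (Y));
begin
   Put_Line ("Result interval lower bound: " & Wide'Image (Sum_Lo));
   Put_Line ("Result interval upper bound: " & Wide'Image (Sum_Hi));
end Show_Result_Interval;
```

For operations that are not monotonic in each operand, the minimum and maximum need not occur at the endpoints, so this shortcut applies to addition only as written.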
9
The result interval of an exponentiation is obtained
by applying the above rule to the sequence of multiplications defined
by the exponent, assuming arbitrary association of the factors, and to
the final division in the case of a negative exponent.
10
The result interval of a conversion of a numeric
value to a floating point type T is the model interval of T associated
with the operand value, except when the source expression is of a fixed
point type with a small that is not a power of T'Machine_Radix
or is a fixed point multiplication or division either of whose operands
has a small that is not a power of T'Machine_Radix; in these cases,
the result interval is implementation defined.
10.a
Implementation defined: The
result interval in certain cases of fixed-to-float conversion.
11
{Overflow_Check [partial]}
{check, language-defined (Overflow_Check)}
For any of the foregoing operations, the implementation
shall deliver a value that belongs to the result interval when both bounds
of the result interval are in the safe range of the result type T, as
determined by the values of T'Safe_First and T'Safe_Last; otherwise,
12
- {Constraint_Error (raised by failure of
run-time check)} if T'Machine_Overflows
is True, the implementation shall either deliver a value that belongs
to the result interval or raise Constraint_Error;
13
- if T'Machine_Overflows is False, the result is implementation
defined.
13.a
Implementation defined: The
result of a floating point arithmetic operation in overflow situations,
when the Machine_Overflows attribute of the result type is False.
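A program that depends on overflow detection can consult Machine_Overflows before relying on Constraint_Error. The following sketch assumes a hypothetical type Real and deliberately multiplies a value at the edge of the safe range; which branch runs, and what is printed, depends on the implementation.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Overflow is
   --  Real is a hypothetical type used only for illustration.
   type Real is digits 6;

   X : Real := Real'Safe_Last;
   Y : Real;
begin
   if Real'Machine_Overflows then
      --  Either a value in the result interval is delivered
      --  or Constraint_Error is raised.
      begin
         Y := X * 2.0;
         Put_Line ("Delivered a value: " & Real'Image (Y));
      exception
         when Constraint_Error =>
            Put_Line ("Constraint_Error raised on overflow.");
      end;
   else
      --  The result of X * 2.0 would be implementation defined here.
      Put_Line ("Machine_Overflows is False; no overflow check is promised.");
   end if;
end Show_Overflow;
```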
14
For any predefined relation on operands of a
floating point type T, the implementation may deliver any value (i.e.,
either True or False) obtained by applying the (exact) mathematical comparison
to values arbitrarily chosen from the respective operand intervals.
15
The result of a membership test is defined in
terms of comparisons of the operand value with the lower and upper bounds
of the given range or type mark (the usual rules apply to these comparisons).
Implementation Permissions
16
If the underlying floating point hardware implements
division as multiplication by a reciprocal, the result interval for division
(and exponentiation by a negative exponent) is implementation defined.
16.a
Implementation defined: The
result interval for division (or exponentiation by a negative exponent),
when the floating point hardware implements division as multiplication
by a reciprocal.
Wording Changes from Ada 83
16.b
The Ada 95 model numbers of a
floating point type that are in the safe range of the type are comparable
to the Ada 83 safe numbers of the type. There is no analog of the Ada
83 model numbers. The Ada 95 model numbers, when not restricted to the
safe range, are an infinite set.
Inconsistencies With Ada 83
16.c
{inconsistencies with Ada 83}
Giving the model numbers the hardware radix, instead
of always a radix of two, allows (in conjunction with other changes)
some borderline declared types to be represented with less precision
than in Ada 83 (i.e., with single precision, whereas Ada 83 would have
used double precision). Because the lower precision satisfies the requirements
of the model (and did so in Ada 83 as well), this change is viewed as
a desirable correction of an anomaly, rather than a worrisome inconsistency.
(Of course, the wider representation chosen in Ada 83 also remains eligible
for selection in Ada 95.)
16.d
As an example of this phenomenon,
assume that Float is represented in single precision and that a double
precision type is also available. Also assume hexadecimal hardware with
clean properties, for example certain IBM hardware. Then,
16.e
type T is digits Float'Digits range -Float'Last .. Float'Last;
16.f
results in T being represented
in double precision in Ada 83 and in single precision in Ada 95. The
latter is intuitively correct; the former is counterintuitive. The reason
why the double precision type is used in Ada 83 is that Float has model
and safe numbers (in Ada 83) with 21 binary digits in their mantissas,
as is required to model the hypothesized hexadecimal hardware using a
binary radix; thus Float'Last, which is not a model number, is slightly
outside the range of safe numbers of the single precision type, making
that type ineligible for selection as the representation of T even though
it provides adequate precision. In Ada 95, Float'Last (the same value
as before) is a model number and is in the safe range of Float on the
hypothesized hardware, making Float eligible for the representation of
T.
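On a given implementation, the representation chosen for such a type can be observed by comparing its attributes with those of Float. This is a minimal sketch built around the declaration above; the procedure name is hypothetical, and equal values only suggest, rather than prove, that T shares the representation of Float.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Representation is
   type T is digits Float'Digits range -Float'Last .. Float'Last;
begin
   Put_Line ("Float'Machine_Mantissa ="
             & Integer'Image (Float'Machine_Mantissa));
   Put_Line ("T'Machine_Mantissa     ="
             & Integer'Image (T'Machine_Mantissa));
   Put_Line ("Float'Size =" & Integer'Image (Float'Size));
   Put_Line ("T'Size     =" & Integer'Image (T'Size));
end Show_Representation;
```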
Extensions to Ada 83
16.g
{extensions to Ada 83}
Giving the model numbers the hardware radix allows
for practical implementations on decimal hardware.
Wording Changes from Ada 83
16.h
The wording of the model of floating
point arithmetic has been simplified to a large extent.