Floating-point numbers print inconsistently. Why does it work sometimes?

1 answer

User

#1 · Posted 2024-09-29 21:45:52

You are printing numpy.float64 objects, not Python's built-in float type, which uses David Gay's dtoa algorithm.

Starting with version 1.14, numpy uses the dragon4 algorithm to print floating point values, tuned so that its output matches that of the David Gay algorithm used for the Python float type:

Numpy scalars use the dragon4 algorithm in "unique" mode (see below) for str/repr, in a way that tries to match python float output.

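NumPy's float-formatting documentation (for example, the unique parameter of numpy.format_float_positional) records this in more detail: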

unique : boolean, optional
If True, use a digit-generation strategy which gives the shortest representation which uniquely identifies the floating-point number from other values of the same type, by judicious rounding. If precision was omitted, print out all necessary digits, otherwise digit generation is cut off after precision digits and the remaining value is rounded.

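So 0.2 can be uniquely represented by printing just 0.2, but the next value in the sequence (0.30000000000000004) cannot: the shorter string 0.3 identifies a different float, so the extra digits have to be included to uniquely identify the exact value.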
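A minimal sketch of that difference, using Python's built-in float rather than numpy.float64 (an assumption made to keep the example dependency-free; since numpy 1.14 the printed digits are meant to match). The variable names are illustrative only:

```python
from decimal import Decimal

a = 0.1 + 0.1   # exactly the float that the literal 0.2 denotes
b = 0.1 + 0.2   # NOT the float that the literal 0.3 denotes

print(repr(a))            # 0.2
print(repr(b))            # 0.30000000000000004

# "0.2" is enough to identify a, but "0.3" identifies a different float than b.
print(float("0.2") == a)  # True
print(float("0.3") == b)  # False

# The exact binary values behind both floats, written out in decimal.
print(Decimal(a))
print(Decimal(b))
```

Neither value is stored as a short decimal; the printer only chooses the shortest string that still maps back to the same float.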

How this is done is actually quite involved; you can read a full write-up in the Printing Floating-Point Numbers series by Ryan Juckett, an engineer on Bungie's Destiny game.

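But basically, the code that outputs the string has to find the shortest representation among all the decimal numbers that cluster around a given floating-point value, that is, those that cannot be read back as the next or the previous possible floating-point value: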

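This image comes from The Shortest Decimal String That Round-Trips: Examples by Rick Regan, which covers some other cases as well. The numbers in blue are possible float64 values, and the numbers in green are possible decimal representations. Note the grey halfway-point markers: any representation that falls between the two halfway points around a float value is fair game, because all of those representations produce the same value.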
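The halfway-point idea can be checked directly. The sketch below assumes Python 3.9 or later (for math.nextafter) and uses exact rational arithmetic; the helper name lands_on_x is hypothetical, and exact ties at a halfway point are deliberately ignored:

```python
import math
from fractions import Fraction

x = 0.1 + 0.2
below = math.nextafter(x, -math.inf)   # previous representable float64
above = math.nextafter(x, math.inf)    # next representable float64

# Exact halfway points between x and each of its neighbours.
lower_mid = (Fraction(below) + Fraction(x)) / 2
upper_mid = (Fraction(x) + Fraction(above)) / 2

def lands_on_x(text):
    """True if the decimal string lies strictly between the two halfway points."""
    return lower_mid < Fraction(text) < upper_mid

candidates = [
    "0.3",
    "0.30000000000000002",
    "0.30000000000000004",
    "0.30000000000000007",
    "0.30000000000000008",
]
for text in candidates:
    # Both columns agree: a string parses to x exactly when it sits in the interval.
    print(text, lands_on_x(text), float(text) == x)
```

Several 17-digit strings fall inside the interval, but no shorter string does; 0.3 belongs to the float just below.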

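Both the David Gay and the Dragon4 algorithms aim to find the shortest decimal string that, when parsed again, produces exactly the same float value. From the Python 3.1 What's New section on the David Gay approach: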

Python now uses David Gay’s algorithm for finding the shortest floating point representation that doesn’t change its value. This should help mitigate some of the confusion surrounding binary floating point numbers.
The significance is easily seen with a number like 1.1 which does not have an exact equivalent in binary floating point. Since there is no exact equivalent, an expression like float('1.1') evaluates to the nearest representable value which is 0x1.199999999999ap+0 in hex or 1.100000000000000088817841970012523233890533447265625 in decimal. That nearest value was and still is used in subsequent floating point calculations.
What is new is how the number gets displayed. Formerly, Python used a simple approach. The value of repr(1.1) was computed as format(1.1, '.17g') which evaluated to '1.1000000000000001'. The advantage of using 17 digits was that it relied on IEEE-754 guarantees to assure that eval(repr(1.1)) would round-trip exactly to its original value. The disadvantage is that many people found the output to be confusing (mistaking intrinsic limitations of binary floating point representation as being a problem with Python itself).
The new algorithm for repr(1.1) is smarter and returns '1.1'. Effectively, it searches all equivalent string representations (ones that get stored with the same underlying float value) and returns the shortest representation.
The new algorithm tends to emit cleaner representations when possible, but it does not change the underlying values. So, it is still the case that 1.1 + 2.2 != 3.3 even though the representations may suggest otherwise.
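The search described in the quote can be imitated with a brute-force loop. This is only an illustrative sketch, not David Gay's algorithm or Dragon4: the function name shortest_round_trip is hypothetical, it relies on format() with increasing precision, and in rare edge cases it may pick different digits than repr():

```python
def shortest_round_trip(x):
    """Try 1..16 significant digits and return the first string that parses back to x."""
    for digits in range(1, 17):
        text = format(x, f".{digits}g")
        if float(text) == x:
            return text
    # 17 significant digits always round-trip for a finite float64.
    return format(x, ".17g")

print(format(1.1, ".17g"))             # 1.1000000000000001 (the old repr)
print(shortest_round_trip(1.1))        # 1.1
print(shortest_round_trip(0.1 + 0.2))  # 0.30000000000000004

# The underlying values are unchanged, so the sum still differs from 3.3.
print(1.1 + 2.2 == 3.3)                # False
print(shortest_round_trip(1.1 + 2.2))  # 3.3000000000000003
```

The real algorithms reach the same kind of answer without repeated formatting and parsing, which is why they are considerably more involved.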
