Floating-point numbers print inconsistently. Why does it sometimes work?

Posted 2024-09-29 21:45:52


Using the following (nearly minimal) example:

import numpy as np
for x in np.arange(0,2,0.1):
    print(x)

we get:

0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6000000000000001
0.7000000000000001
0.8
0.9
1.0
1.1
1.2000000000000002
1.3
1.4000000000000001
1.5
1.6
1.7000000000000002
1.8
1.9000000000000001

as output.

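I know that "floating-point precision issues" are to blame for the X.X000001 outputs, but what I don't understand is why it sometimes works. Clearly 0.3 can't be represented exactly in base 2 by a floating-point number, and I can't see any pattern in which numbers are displayed with just one decimal digit.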

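How does Python know that 0.1 is enough to display a number? What kind of magic tells it to truncate the remaining digits? Why does it only work some of the time?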


1 answer
User
Answer #1 · posted 2024-09-29 21:45:52

You are printing numpy.float64 objects, not Python's built-in float type, which uses David Gay's dtoa algorithm.

As of version 1.14, numpy uses the dragon4 algorithm to print floating point values, tuned so that its output matches that of the David Gay algorithm used for the Python float type:

Numpy scalars use the dragon4 algorithm in "unique" mode (see below) for str/repr, in a way that tries to match python float output.

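The documentation for numpy's float-formatting functions (for example numpy.format_float_positional) describes this in more detail: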

unique : boolean, optional

If True, use a digit-generation strategy which gives the shortest representation which uniquely identifies the floating-point number from other values of the same type, by judicious rounding. If precision was omitted, print out all necessary digits, otherwise digit generation is cut off after precision digits and the remaining value is rounded.

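So 0.2 can be uniquely represented by printing just 0.2, but the next value in the sequence (0.30000000000000004) can't; you have to include the extra digits to uniquely identify the exact value.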
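A minimal sketch of why only some values need the extra digits, using plain Python floats. It assumes that element i of np.arange(0, 2, 0.1) equals i * 0.1, which is consistent with the output shown in the question; the helper name lands_on_short_decimal is made up for this illustration.

```python
# Assumption: element i of np.arange(0, 2, 0.1) has the same value as i * 0.1.
def lands_on_short_decimal(i):
    """Return True if i * 0.1 is the same double as the one nearest to i / 10."""
    # i / 10 is a single correctly rounded division, so it yields the double
    # closest to the decimal value; i * 0.1 carries the error already in 0.1.
    return i * 0.1 == i / 10

# Indices whose product drifted onto a neighbouring double and therefore
# cannot be printed with a single decimal digit.
needs_extra_digits = [i for i in range(20) if not lands_on_short_decimal(i)]

for i in range(5):
    print(i, repr(i * 0.1), repr(i / 10))
```

For i = 2 both expressions give the same double, so the short string 0.2 identifies it. For i = 3 the product is a different double than the one nearest to 0.3, so printing 0.3 would name the wrong value.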

The how of this is actually quite involved; you can read a full write-up in the Printing Floating-Point Numbers series by Ryan Juckett, an engineer on Bungie's Destiny game.

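But basically, the code that produces the string needs to find the shortest representation among all the decimal numbers clustered around the floating-point value, that is, those that can't be interpreted as the next or previous possible float: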

[Image: floating point number line for 0.1, with the next and previous possible float values and possible representations]

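This image comes from The Shortest Decimal String That Round-Trips: Examples, by Rick Regan, which also covers some other cases. The numbers in blue are possible float64 values, and the numbers in green are possible decimal representations. Note the grey halfway-point markers: any representation that falls between the two halfway points around a float value is fair game, because all of those representations produce the same value.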
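The idea of "the shortest string that still lands between the halfway points" can be sketched with a brute-force search. This is not David Gay's algorithm or Dragon4, only a naive illustration: it tries an increasing number of significant digits until the string parses back to the same float. The function name shortest_roundtrip is hypothetical.

```python
def shortest_roundtrip(x: float) -> str:
    """Brute-force search for a shortest decimal string that parses back to x.

    Naive sketch only: real implementations generate the digits directly
    instead of formatting and re-parsing up to 17 times.
    """
    for digits in range(1, 18):
        candidate = format(x, f".{digits}g")  # x rounded to `digits` significant digits
        if float(candidate) == x:             # does the string identify the same double?
            return candidate
    # 17 significant digits always round-trip for a finite double,
    # so this fallback is only reached for NaN.
    return repr(x)

for value in (0.1, 0.2, 3 * 0.1, 1.1):
    print(repr(value), "->", shortest_roundtrip(value))
```

Because the sketch relies on the 'g' format, very large or very small values come back in exponent notation, which may differ in layout from what repr prints even though the digits identify the same float.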

Both the David Gay and Dragon4 algorithms aim to find the shortest decimal string output that produces exactly the same float value again. From the Python 3.1 What's New section on the David Gay approach:

Python now uses David Gay’s algorithm for finding the shortest floating point representation that doesn’t change its value. This should help mitigate some of the confusion surrounding binary floating point numbers.

The significance is easily seen with a number like 1.1 which does not have an exact equivalent in binary floating point. Since there is no exact equivalent, an expression like float('1.1') evaluates to the nearest representable value which is 0x1.199999999999ap+0 in hex or 1.100000000000000088817841970012523233890533447265625 in decimal. That nearest value was and still is used in subsequent floating point calculations.

What is new is how the number gets displayed. Formerly, Python used a simple approach. The value of repr(1.1) was computed as format(1.1, '.17g') which evaluated to '1.1000000000000001'. The advantage of using 17 digits was that it relied on IEEE-754 guarantees to assure that eval(repr(1.1)) would round-trip exactly to its original value. The disadvantage is that many people found the output to be confusing (mistaking intrinsic limitations of binary floating point representation as being a problem with Python itself).

The new algorithm for repr(1.1) is smarter and returns '1.1'. Effectively, it searches all equivalent string representations (ones that get stored with the same underlying float value) and returns the shortest representation.

The new algorithm tends to emit cleaner representations when possible, but it does not change the underlying values. So, it is still the case that 1.1 + 2.2 != 3.3 even though the representations may suggest otherwise.
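The values in the quoted passage can be reproduced with the standard library alone. A short sketch; the variable names are only for illustration:

```python
from decimal import Decimal

x = 1.1

exact_hex = x.hex()               # the stored binary value, in hex notation
exact_decimal = str(Decimal(x))   # the same stored value, written out exactly in decimal
old_style = format(x, ".17g")     # the 17-digit form that repr() produced before Python 3.1
new_style = repr(x)               # the shortest string that round-trips

print(exact_hex)
print(exact_decimal)
print(old_style, new_style)

# Both strings identify the same underlying float; only the display differs.
same_value = float(old_style) == float(new_style) == x

# The cleaner display does not change the arithmetic.
sum_matches = (1.1 + 2.2 == 3.3)
print(same_value, sum_matches)
```

The old and new strings parse back to the same float, and 1.1 + 2.2 still compares unequal to 3.3, as the quoted passage says.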
