Tag archive: hash

Why does Python's hash of infinity have the digits of π?

Question: Why does Python's hash of infinity have the digits of π?

The hash of infinity in Python has digits matching pi:

>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is that just a coincidence or is it intentional?


Answer 0

_PyHASH_INF is defined as a constant equal to 314159.

I can’t find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don’t use the same meaningful value for other hashes, it shouldn’t matter.


Answer 1

Summary: It’s not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.


The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

(Same results with PyPy too.)


In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is stored in the tp_hash slot of the built-in float type (PyTypeObject PyFloat_Type). That slot points to float_hash, which simply does return _Py_HashDouble(v->ob_fval), and _Py_HashDouble in turn contains:

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159
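
The relationship described above can be checked from Python itself. This is a minimal sketch, assuming a Python 3 interpreter where sys.hash_info is available; the variable names are only illustrative.

```python
import sys

pos_inf = float('inf')
neg_inf = float('-inf')

# The hash of +inf should be the constant exposed as sys.hash_info.inf,
# and the hash of -inf should be its negation.
pos_hash = hash(pos_inf)
neg_hash = hash(neg_inf)

print(pos_hash == sys.hash_info.inf)
print(neg_hash == -sys.hash_info.inf)
```

Comparing against sys.hash_info.inf rather than the literal 314159 keeps the check valid even if an implementation picks a different constant.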

In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says:

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470. This was a misleading bug — the true “bug” was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent’s earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn’t an error (didn’t waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn’t systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
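
As a quick illustration of where the two constants come from, the following sketch truncates π and e after five decimal places, the same trick used in the question. The variable names are hypothetical.

```python
import math

# Scale by 1e5 and truncate to keep the first six significant digits.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)

print(pi_digits, e_digits)  # 314159 271828
```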



Answer 2

Indeed,

sys.hash_info.inf

returns 314159. The value is not generated; it’s built into the source code. In fact,

hash(float('-inf'))

returns -271828, the leading digits of e negated, in Python 2 (it’s -314159 now).

The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.


How do I sum all the values in a dictionary?

Question: How do I sum all the values in a dictionary?

Let’s say I have a dictionary in which the keys map to integers like:

d = {'key1': 1,'key2': 14,'key3': 47}

Is there a syntactically minimalistic way to return the sum of the values in d—i.e. 62 in this case?


Answer 0

As you’d expect:

sum(d.values())

Answer 1

In Python 2 you can avoid making a temporary copy of all the values by using the itervalues() dictionary method, which returns an iterator over the dictionary’s values:

sum(d.itervalues())

In Python 3 you can just use d.values(), because that method now returns a view object instead of building a list (and itervalues() was removed since it was no longer needed).

To make it easier to write version-independent code that always iterates over the dictionary’s values without copying them, a utility function can be helpful:

import sys

def itervalues(d):
    return iter(getattr(d, ('itervalues', 'values')[sys.version_info[0]>2])())

sum(itervalues(d))

This is essentially what Benjamin Peterson’s six module does.
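
To illustrate the point about Python 3, here is a small sketch, assuming Python 3, showing that d.values() is a live view rather than a copied list. The extra 'key4' entry is made up for the demonstration.

```python
d = {'key1': 1, 'key2': 14, 'key3': 47}

# In Python 3, values() returns a view onto the dictionary, not a new list.
values_view = d.values()
is_list = isinstance(values_view, list)

# The view reflects changes made after it was created.
d['key4'] = 8
total = sum(values_view)

print(is_list, total)  # False 70
```

Because the view does not copy anything, sum(d.values()) in Python 3 plays the same role as sum(d.itervalues()) did in Python 2.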


Answer 2

Sure there is. Here is a way to sum the values of a dictionary.

>>> d = {'key1':1,'key2':14,'key3':47}
>>> sum(d.values())
62

Answer 3

d = {'key1': 1,'key2': 14,'key3': 47}
sum1 = sum(d[item] for item in d)
print(sum1)

You can do it using a for loop, written here as a generator expression over the keys.


Answer 4

I feel sum(d.values()) is the most efficient way to get the sum.

You can also try the reduce function to calculate the sum along with a lambda expression:

from functools import reduce  # required in Python 3, where reduce is no longer a builtin
reduce(lambda x,y:x+y,d.values())
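
One practical difference between the two approaches shows up with an empty dictionary. The sketch below assumes Python 3; the names empty, empty_sum and empty_reduce are only illustrative.

```python
from functools import reduce

d = {'key1': 1, 'key2': 14, 'key3': 47}
total_sum = sum(d.values())
total_reduce = reduce(lambda x, y: x + y, d.values())

empty = {}
# sum() starts from 0, so an empty dictionary is handled without extra work.
empty_sum = sum(empty.values())

# reduce() with no initial value raises TypeError on an empty iterable,
# so an explicit starting value of 0 is passed here.
empty_reduce = reduce(lambda x, y: x + y, empty.values(), 0)

print(total_sum, total_reduce, empty_sum, empty_reduce)  # 62 62 0 0
```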

Answer 5

sum(d.values()), where d is your dictionary variable.


Answer 6

phihag’s answer (and similar ones) won’t work in Python 3.

For Python 3:

d = {'key1': 1,'key2': 14,'key3': 47}
sum(list(d.values()))

Update: there are complaints that it doesn’t work. I have attached a screenshot from my terminal. It could be a version mismatch or something similar.


Answer 7

You could consider a for loop for this:

d = {'data': 100, 'data2': 200, 'data3': 500}
total = 0
for i in d.values():
    total += i

This gives total = 800.


Answer 8

You can get all the values in the dictionary with values(), convert them to a list, and then use the sum() function to get their sum.

Example:

c={"a":123,"b":4,"d":4,"c":-1001,"x":2002,"y":1001}

sum(list(c.values()))
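
As far as I can tell, sum() accepts any iterable, so the list() conversion should be optional. A minimal check using the same dictionary, assuming Python 3:

```python
c = {"a": 123, "b": 4, "d": 4, "c": -1001, "x": 2002, "y": 1001}

# Sum with and without the intermediate list; both should agree.
via_list = sum(list(c.values()))
direct = sum(c.values())

print(via_list, direct)  # 2133 2133
```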