What is __future__ in Python used for, and how/when to use it, and how it works

Question: What is __future__ in Python used for, and how/when to use it, and how it works

__future__ frequently appears in Python modules. I do not understand what __future__ is for and how/when to use it, even after reading Python’s __future__ doc.

Can anyone explain with examples?

A few answers regarding the basic usage of __future__ I’ve received seemed correct.

However, I need to understand one more thing regarding how __future__ works:

The most confusing concept for me is how a current Python release includes features for future releases, and how a program using a feature from a future release can be compiled successfully in the current version of Python.

I am guessing that the current release is packaged with potential features for the future. However, the features are available only by using __future__ because they are not the current standard. Let me know if I am right.


Answer 0

With the inclusion of the __future__ module, you can gradually get accustomed to incompatible changes or to changes that introduce new keywords.

E.g., to use context managers, you had to do from __future__ import with_statement in 2.5, as the with keyword was new and shouldn’t be used as a variable name any longer. To use with as a Python keyword in Python 2.5, you need the import above; from 2.6 on, with is always a keyword.

Another example is

from __future__ import division
print 8/7  # prints 1.1428571428571428
print 8//7 # prints 1

Without the __future__ stuff, both print statements would print 1.

The internal difference is that without that import, / is mapped to the __div__() method, while with it, __truediv__() is used. (In any case, // calls __floordiv__().)
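The dispatch described above can be observed directly. The following is a minimal sketch, assuming Python 3, where true division is already the default and no future statement is needed. The Probe class is a hypothetical name used only for illustration:

```python
class Probe(object):
    """Hypothetical helper that reports which special method an operator called."""

    def __truediv__(self, other):
        return "__truediv__"

    def __floordiv__(self, other):
        return "__floordiv__"


p = Probe()
slash_result = p / 2          # the / operator is dispatched to __truediv__
double_slash_result = p // 2  # the // operator is dispatched to __floordiv__
print(slash_result, double_slash_result)
```

On Python 2 without the future statement, the same / expression would look for __div__ instead, which this class does not define.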

Apropos print: print becomes a function in 3.x, losing its special property as a keyword. So it is the other way round.

>>> print

>>> from __future__ import print_function
>>> print
<built-in function print>
>>>

Answer 1

When you do

from __future__ import whatever

you’re not actually using an import statement, but a future statement. You’re reading the wrong docs, as you’re not actually importing that module.

Future statements are special — they change how your Python module is parsed, which is why they must be at the top of the file. They give new — or different — meaning to words or symbols in your file. From the docs:

A future statement is a directive to the compiler that a particular module should be compiled using syntax or semantics that will be available in a specified future release of Python. The future statement is intended to ease migration to future versions of Python that introduce incompatible changes to the language. It allows use of the new features on a per-module basis before the release in which the feature becomes standard.

If you actually want to import the __future__ module, just do

import __future__

and then access it as usual.
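The rule that a future statement must come at the top of the file can be checked with the built-in compile() function. This is a small sketch, assuming Python 3; the helper name compiles and the two source strings are made up for the illustration:

```python
def compiles(source):
    """Return True if the source text compiles, False if it is a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Future statement as the first statement of the module: accepted.
at_top = "from __future__ import division\nx = 1 / 2\n"
# Future statement after ordinary code: rejected by the compiler.
too_late = "x = 1 / 2\nfrom __future__ import division\n"

top_ok = compiles(at_top)
late_ok = compiles(too_late)
print(top_ok, late_ok)
```

The second source is rejected at compile time, before any code runs, which fits the description of a future statement as a compiler directive rather than an ordinary import.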


Answer 2

__future__ is a pseudo-module which programmers can use to enable new language features which are not compatible with the current interpreter. For example, the expression 11/4 currently (in Python 2) evaluates to 2. If the module in which it is executed had enabled true division by executing:

from __future__ import division

the expression 11/4 would evaluate to 2.75. By importing the __future__ module and evaluating its variables, you can see when a new feature was first added to the language and when it will become the default:

  >>> import __future__
  >>> __future__.division
  _Feature((2, 2, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 8192)

Answer 3

It lets you use features that will appear in newer versions while still running an older release of Python.

For example

>>> from __future__ import print_function

will allow you to use print as a function:

>>> print('# of entries', len(dictionary), file=sys.stderr)

Answer 4

There are some great answers already, but none of them gives a complete list of what the __future__ statement currently supports.

Put simply, the __future__ statement forces Python interpreters to use newer features of the language.


The features that it currently supports are the following:

nested_scopes

Prior to Python 2.1, the following code would raise a NameError:

def f():
    ...
    def g(value):
        ...
        return g(value-1) + 1
    ...

The from __future__ import nested_scopes directive will allow for this feature to be enabled.

generators

Introduced generator functions such as the one below to save state between successive function calls:

def fib():
    a, b = 0, 1
    while 1:
       yield b
       a, b = b, a+b

division

Classic division is used in Python 2.x versions. This means that x/y returns the floor when both operands are integers (“floor division”) and a reasonable approximation of the quotient otherwise (“true division”). Starting in Python 3.0, true division is specified by x/y, whereas floor division is specified by x//y.

The from __future__ import division directive forces the use of Python 3.0 style division.

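absolute_import

Makes imports absolute by default (PEP 328): inside a package, import foo refers to the top-level module foo rather than a sibling module, and siblings must be imported with explicit relative imports such as from . import foo. The same PEP also allows parentheses to enclose multiple imported names, which works without the future statement. For example: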

from Tkinter import (Tk, Frame, Button, Entry, Canvas, Text,
    LEFT, DISABLED, NORMAL, RIDGE, END)

Instead of:

from Tkinter import Tk, Frame, Button, Entry, Canvas, Text, \
    LEFT, DISABLED, NORMAL, RIDGE, END

Or:

from Tkinter import Tk, Frame, Button, Entry, Canvas, Text
from Tkinter import LEFT, DISABLED, NORMAL, RIDGE, END

with_statement

Adds the statement with as a keyword in Python to eliminate the need for try/finally statements. Common uses of this are when doing file I/O such as:

with open('workfile', 'r') as f:
     read_data = f.read()

print_function

Forces the use of Python 3 parenthesis-style print() function call instead of the print MESSAGE style statement.

unicode_literals

Makes unprefixed string literals such as 'Hello world' unicode strings instead of byte strings in Python 2, as they are in Python 3. Byte strings are then written with the b prefix, as in b'Hello world'.

generator_stop

Implements PEP 479: a StopIteration exception raised inside a generator function is converted into a RuntimeError instead of silently ending the generator.
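The generator_stop behaviour can be seen with a small sketch, assuming Python 3.7 or later, where it is the default and needs no future statement. The first_items function is a hypothetical example written for this illustration:

```python
def first_items(iterables):
    """Yield the first item of each iterable (deliberately careless about empty ones)."""
    for iterable in iterables:
        # next() raises StopIteration when the iterable is empty
        yield next(iter(iterable))


try:
    result = list(first_items([[1], [], [3]]))
except RuntimeError as exc:
    # The stray StopIteration is reported as an error instead of ending the generator.
    result = "RuntimeError: {}".format(exc)

print(result)
```

Without this feature, the StopIteration leaking out of next() would simply have ended the generator early, and the bug would have gone unnoticed.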

One other use not mentioned above is that the __future__ statement also requires the use of Python 2.1+ interpreters since using an older version will throw a runtime exception.


Answer 5

Or is it like saying: “Since this is Python 2.7, use the different print function that was also added to Python 2.7 after it was added in Python 3. Then my print will no longer be a statement (e.g. print "message") but a function (e.g. print("message", options)). That way, when my code is run in Python 3, print will not break.”

In

from __future__ import print_function

print_function is the name of the feature (not a module) that enables the new implementation of print, behaving as it does in Python 3.

This has more explanation: http://python3porting.com/noconv.html


Answer 6

One of the uses which I found to be very useful is the print_function from the __future__ module.

In Python 2.7, I wanted characters from different print statements to be printed on the same line without spaces.

It can be done using a comma (“,”) at the end, but that also appends an extra space. With the future statement, it can instead be written as:

from __future__ import print_function
...
print (v_num,end="")
...

This will print the value of v_num from each iteration in a single line without spaces.


Answer 7

From Python 3.0 onward, print is no longer a statement; it’s a function instead, as described in PEP 3105.

I also think Python 3.0 still supports this special functionality. Let’s see its usability through a traditional “pyramid program” in Python:

from __future__ import print_function

class Star(object):
    def __init__(self,count):
        self.count = count

    def start(self):
        for i in range(1,self.count):
            for j in range (i): 
                print('*', end='') # PEP 3105: print As a Function 
            print()

a = Star(5)
a.start()

Output:
*
**
***
****

If we use the ordinary print statement, we won’t be able to achieve the same output, since each print comes with an extra newline. So every time the inner for loop executes, it will print * onto the next line.


pip install from git repo branch

Question: pip install from git repo branch

Trying to pip install a repo’s specific branch. Google tells me to

pip install git+https://github.com/user/repo.git@branch

The branch’s name is issue/34/oscar-0.6 so I did pip install https://github.com/tangentlabs/django-oscar-paypal.git@/issue/34/oscar-0.6 but it’s returning a 404.

How do I install this branch?


Answer 0

Prepend the prefix git+ to the URL (see VCS Support):

pip install git+https://github.com/tangentlabs/django-oscar-paypal.git@issue/34/oscar-0.6

And specify the branch name without the leading /.
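The two fixes above (the git+ prefix and no leading slash) can be captured in a tiny helper. This is only a sketch: pip_vcs_url is a hypothetical function written for illustration, not part of pip, and it assumes a GitHub-style URL layout:

```python
def pip_vcs_url(user, repo, branch):
    """Build a git+https URL for pip from a GitHub user, repository and branch name."""
    # A leading slash in the branch name produces a wrong URL, so drop it.
    branch = branch.lstrip("/")
    return "git+https://github.com/{}/{}.git@{}".format(user, repo, branch)


url = pip_vcs_url("tangentlabs", "django-oscar-paypal", "/issue/34/oscar-0.6")
print("pip install " + url)
```

Slashes inside the branch name are kept as they are; only the leading one is removed.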


Answer 1

Using pip with git+ to clone a repository can be extremely slow (test with https://github.com/django/django@stable/1.6.x for example, it will take a few minutes). The fastest thing I’ve found, which works with GitHub and BitBucket, is:

pip install https://github.com/user/repository/archive/branch.zip

For Django master, this becomes:

pip install https://github.com/django/django/archive/master.zip

For Django stable/1.7.x:

pip install https://github.com/django/django/archive/stable/1.7.x.zip

With BitBucket it’s about the same predictable pattern:

pip install https://bitbucket.org/izi/django-admin-tools/get/default.zip

Here, the master branch is generally named default. This will make installing from your requirements.txt much faster.

Some other answers mention variations required when placing the package to be installed into your requirements.txt. Note that with this archive syntax, the leading -e and trailing #egg=blah-blah are not required, and you can just simply paste the URL, so your requirements.txt looks like:

https://github.com/user/repository/archive/branch.zip

Answer 2

Instructions to install from private repo using ssh credentials:

$ pip install git+ssh://git@github.com/myuser/foo.git@my_version

Answer 3

Just to add an extra: if you want to install it from your pip requirements file, it can be added like this:

-e git+https://github.com/tangentlabs/django-oscar-paypal.git@issue/34/oscar-0.6#egg=django-oscar-paypal

It will be saved as an egg though.


Answer 4

You used the egg files install procedure. This procedure supports installing over git, git+http, git+https, git+ssh, git+git and git+file. Some of these are mentioned.

It’s good you can use branches, tags, or hashes to install.

@Steve_K noted it can be slow to install with “git+” and proposed installing via zip file:

pip install https://github.com/user/repository/archive/branch.zip

Alternatively, I suggest you may install using the .whl file if this exists.

pip install https://github.com/user/repository/archive/branch.whl

It’s a pretty new format, newer than egg files. It requires the wheel and setuptools>=0.8 packages. You can find more here.


Answer 5

This worked like a charm:

pip3 install git+https://github.com/deepak1725/fabric8-analytics-worker.git@develop

Where:

develop: branch

fabric8-analytics-worker.git: repo

deepak1725: user


How to create a GUID/UUID in Python

Question: How to create a GUID/UUID in Python

How do I create a GUID in Python that is platform independent? I hear there is a method using ActivePython on Windows but it’s Windows only because it uses COM. Is there a method using plain Python?


Answer 0

The uuid module, in Python 2.5 and up, provides RFC-compliant UUID generation. See the module docs and the RFC for details.

Example (works on Python 2 and 3):

>>> import uuid
>>> uuid.uuid4()
UUID('bd65600d-8669-4903-8a14-af88203add38')
>>> str(uuid.uuid4())
'f50ec0b7-f960-400d-91f0-c42a6d44e3d0'
>>> uuid.uuid4().hex
'9fe2c4e93f654fdbb24c02b15259716c'
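A few properties of these values can be checked in code. A brief sketch using only the standard uuid module; the variable names are arbitrary:

```python
import uuid

u = uuid.uuid4()
text = str(u)           # canonical 8-4-4-4-12 form with dashes
same = uuid.UUID(text)  # parse the string back into a UUID object

print(u.version)              # version field of a randomly generated UUID
print(len(text), len(u.hex))  # lengths with and without the dashes
print(same == u)              # UUID objects compare by value
```

Keeping the UUID object around and converting with str() or .hex only at the edges avoids mixing the dashed and undashed forms.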

Answer 1

If you’re using Python 2.5 or later, the uuid module is already included with the Python standard distribution.

Ex:

>>> import uuid
>>> uuid.uuid4()
UUID('5361a11b-615c-42bf-9bdb-e2c3790ada14')

Answer 2

Copied from : https://docs.python.org/2/library/uuid.html (Since the links posted were not active and they keep updating)

>>> import uuid

>>> # make a UUID based on the host ID and current time
>>> uuid.uuid1()
UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')

>>> # make a UUID using an MD5 hash of a namespace UUID and a name
>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')

>>> # make a random UUID
>>> uuid.uuid4()
UUID('16fd2706-8baf-433b-82eb-8c7fada847da')

>>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
>>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')

>>> # make a UUID from a string of hex digits (braces and hyphens ignored)
>>> x = uuid.UUID('{00010203-0405-0607-0809-0a0b0c0d0e0f}')

>>> # convert a UUID to a string of hex digits in standard form
>>> str(x)
'00010203-0405-0607-0809-0a0b0c0d0e0f'

>>> # get the raw 16 bytes of the UUID
>>> x.bytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f'

>>> # make a UUID from a 16-byte string
>>> uuid.UUID(bytes=x.bytes)
UUID('00010203-0405-0607-0809-0a0b0c0d0e0f')
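One practical difference between the variants above is that the name-based ones are deterministic while uuid4() is random. A minimal sketch; the name "example.org" is an arbitrary value chosen for the illustration:

```python
import uuid

# Name-based UUIDs: the same namespace and name always give the same result.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "python.org")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "python.org")
c = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")

# Random UUIDs: every call gives a new value.
r1 = uuid.uuid4()
r2 = uuid.uuid4()

print(a == b, a == c, r1 == r2)
print(a.version, r1.version)
```

This makes uuid3/uuid5 a reasonable fit when the same input should always map to the same identifier, and uuid4 a fit when each call should produce a fresh one.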

Answer 3

I use GUIDs as random keys for database type operations.

The hexadecimal form, with the dashes and extra characters, seems unnecessarily long to me. But I also like that strings representing hexadecimal numbers are very safe, in that they do not contain characters such as ‘+’ or ‘=’ that can cause problems in some situations.

Instead of hexadecimal, I use a url-safe base64 string. The following does not conform to any UUID/GUID spec though (other than having the required amount of randomness).

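import base64
import uuid

# Get a random 128-bit ID as a URL-safe Base64 string (22 characters)
def get_a_uuid():
    r_uuid = base64.urlsafe_b64encode(uuid.uuid4().bytes)
    # urlsafe_b64encode returns bytes on Python 3: strip the '=' padding,
    # then decode to text (str.replace('=', '') would raise TypeError there)
    return r_uuid.rstrip(b'=').decode('ascii')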
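If the short form ever needs to be turned back into a UUID, the stripped padding has to be restored first. The following sketch assumes Python 3; encode_uuid and decode_uuid are hypothetical helper names, not part of the answer's code:

```python
import base64
import uuid


def encode_uuid(u):
    """Return the 16 bytes of a UUID as unpadded URL-safe Base64 text."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def decode_uuid(text):
    """Rebuild the UUID from the unpadded URL-safe Base64 text."""
    # Base64 input length must be a multiple of 4, so put the padding back.
    padded = text + "=" * (-len(text) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


original = uuid.uuid4()
short = encode_uuid(original)
restored = decode_uuid(short)
print(short, restored == original)
```

The encoded form is 22 characters instead of 32 hex digits, and the round trip loses no information.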

Answer 4

If you need to pass a UUID as the primary key for your model or a unique field, the code below returns the UUID object:

import uuid
uuid.uuid4()

If you need to pass a UUID as a URL parameter, you can do it like this:

import uuid
str(uuid.uuid4())

If you want the hex value of a UUID, you can do this:

import uuid
uuid.uuid4().hex

Answer 5

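This function is fully configurable and generates a random ID based on the format specified. The result only looks like a UUID: it uses letters beyond the hex digits, and uniqueness is likely rather than guaranteed.

E.g. [8, 4, 4, 4, 12] is the format used here, and it will generate an ID like the following: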

LxoYNyXe-7hbQ-caJt-DSdU-PDAht56cMEWi

import random as r

def generate_uuid():
    random_string = ''
    random_str_seq = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    uuid_format = [8, 4, 4, 4, 12]
    for n in uuid_format:
        for i in range(0, n):
            random_string += str(random_str_seq[r.randint(0, len(random_str_seq) - 1)])
        if n != 12:
            random_string += '-'
    return random_string

Answer 6

2019 Answer (for Windows):

If you want a permanent UUID that identifies a machine uniquely on Windows, you can use this trick: (Copied from my answer at https://stackoverflow.com/a/58416992/8874388).

from typing import Optional
import re
import subprocess
import uuid

def get_windows_uuid() -> Optional[uuid.UUID]:
    try:
        # Ask Windows for the device's permanent UUID. Throws if command missing/fails.
        txt = subprocess.check_output("wmic csproduct get uuid").decode()

        # Attempt to extract the UUID from the command's result.
        match = re.search(r"\bUUID\b[\s\r\n]+([^\s\r\n]+)", txt)
        if match is not None:
            txt = match.group(1)
            if txt is not None:
                # Remove the surrounding whitespace (newlines, space, etc)
                # and useless dashes etc, by only keeping hex (0-9 A-F) chars.
                txt = re.sub(r"[^0-9A-Fa-f]+", "", txt)

                # Ensure we have exactly 32 characters (16 bytes).
                if len(txt) == 32:
                    return uuid.UUID(txt)
    except:
        pass # Silence subprocess exception.

    return None

print(get_windows_uuid())

This asks Windows (through the wmic command) for the computer’s permanent UUID, then processes the string to ensure it’s a valid UUID, and lastly returns a Python object (https://docs.python.org/3/library/uuid.html) which gives you convenient ways to use the data (such as 128-bit integer, hex string, etc).

Good luck!

PS: The subprocess call could probably be replaced with ctypes directly calling Windows kernel/DLLs. But for my purposes this function is all I need. It does strong validation and produces correct results.


Answer 7

Check this post; it helped me a lot. In short, the best option for me was:

import random 
import string 

# defining function for random 
# string id with parameter 
def ran_gen(size, chars=string.ascii_uppercase + string.digits): 
    return ''.join(random.choice(chars) for x in range(size)) 

# function call for random string 
# generation with size 8 and string  
print (ran_gen(8, "AEIOSUMA23")) 

Because I needed just 4-6 random characters instead of a bulky GUID.


How do I detect whether a Python variable is a function?

Question: How do I detect whether a Python variable is a function?

I have a variable, x, and I want to know whether it is pointing to a function or not.

I had hoped I could do something like:

>>> isinstance(x, function)

But that gives me:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'function' is not defined

The reason I picked that is because

>>> type(x)
<type 'function'>

Answer 0

If this is for Python 2.x or for Python 3.2+, you can also use callable(). It used to be deprecated, but is now undeprecated, so you can use it again. You can read the discussion here: http://bugs.python.org/issue10518. You can do this with:

callable(obj)

If this is for Python 3.x but before 3.2, check if the object has a __call__ attribute. You can do this with:

hasattr(obj, '__call__')

The oft-suggested types.FunctionType approach is not correct because it fails to cover many cases that you would presumably want it to pass, like with builtins:

>>> isinstance(open, types.FunctionType)
False

>>> callable(open)
True

The proper way to check properties of duck-typed objects is to ask them if they quack, not to see if they fit in a duck-sized container. Don’t use types.FunctionType unless you have a very specific idea of what a function is.
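To make the "ask them if they quack" point concrete, here is a small sketch, assuming Python 3.2 or later, that applies callable() to several kinds of objects. The Greeter class and the plain function are hypothetical examples:

```python
class Greeter(object):
    """Hypothetical class whose instances can be called like functions."""

    def __call__(self):
        return "hello"


def plain():
    pass


candidates = {
    "def function": plain,
    "lambda": lambda: None,
    "builtin": open,
    "class": Greeter,
    "instance with __call__": Greeter(),
    "integer": 42,
    "string": "text",
}

# callable() answers the practical question: can this object be called?
results = {name: callable(obj) for name, obj in candidates.items()}
for name, answer in results.items():
    print(name, answer)
```

Everything except the integer and the string is reported as callable, including the builtin that an isinstance check against types.FunctionType rejects.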


Answer 1

Builtin types that don’t have constructors in the built-in namespace (e.g. functions, generators, methods) are in the types module. You can use types.FunctionType in an isinstance call:

In [1]: import types
In [2]: types.FunctionType
Out[2]: <type 'function'>
In [3]: def f(): pass
   ...:
In [4]: isinstance(f, types.FunctionType)
Out[4]: True
In [5]: isinstance(lambda x : None, types.FunctionType)
Out[5]: True

Note that this uses a very specific notion of “function” that is usually not what you need. For example, it rejects zip (technically a class):

>>> type(zip), isinstance(zip, types.FunctionType)
(<class 'type'>, False)

open (built-in functions have a different type):

>>> type(open), isinstance(open, types.FunctionType)
(<class 'builtin_function_or_method'>, False)

and random.shuffle (technically a method of a hidden random.Random instance):

>>> type(random.shuffle), isinstance(random.shuffle, types.FunctionType)
(<class 'method'>, False)

If you’re doing something specific to types.FunctionType instances, like decompiling their bytecode or inspecting closure variables, use types.FunctionType, but if you just need an object to be callable like a function, use callable.


Answer 2

Since Python 2.1 you can import isfunction from the inspect module.

>>> from inspect import isfunction
>>> def f(): pass
>>> isfunction(f)
True
>>> isfunction(lambda x: x)
True
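Note that isfunction is as narrow as types.FunctionType: it accepts only functions written in Python. A short sketch, assuming Python 3, comparing it with related predicates from the same module; the function f and class C are made-up examples:

```python
import inspect


def f():
    pass


class C(object):
    def method(self):
        pass


checks = {
    "isfunction(f)": inspect.isfunction(f),                # Python-level function
    "isfunction(len)": inspect.isfunction(len),            # builtin, so not accepted
    "isbuiltin(len)": inspect.isbuiltin(len),              # builtins have their own test
    "ismethod(C().method)": inspect.ismethod(C().method),  # bound method
    "isfunction(C)": inspect.isfunction(C),                # classes are callable, not functions
}
for name, answer in checks.items():
    print(name, answer)
```

If the goal is simply "can I call this?", callable() covers all of these cases in one check.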

Answer 3

The accepted answer was thought to be correct at the time it was offered. As it turns out, there is no substitute for callable(), which was removed in Python 3.0 and restored in Python 3.2. Specifically, callable() checks the tp_call field of the type of the object being tested. There is no plain Python equivalent. Most of the suggested tests are correct most of the time:

>>> class Spam(object):
...     def __call__(self):
...         return 'OK'
>>> can_o_spam = Spam()


>>> can_o_spam()
'OK'
>>> callable(can_o_spam)
True
>>> hasattr(can_o_spam, '__call__')
True
>>> import collections
>>> isinstance(can_o_spam, collections.Callable)
True

We can throw a monkey-wrench into this by removing the __call__ from the class. And just to keep things extra exciting, add a fake __call__ to the instance!

>>> del Spam.__call__
>>> can_o_spam.__call__ = lambda *args: 'OK?'

Notice this really isn’t callable:

>>> can_o_spam()
Traceback (most recent call last):
  ...
TypeError: 'Spam' object is not callable

callable() returns the correct result:

>>> callable(can_o_spam)
False

But hasattr is wrong:

>>> hasattr(can_o_spam, '__call__')
True

can_o_spam does have that attribute after all; it’s just not used when calling the instance.

Even more subtle, isinstance() also gets this wrong:

>>> isinstance(can_o_spam, collections.Callable)
True

Because we used this check earlier and only later deleted the method, abc.ABCMeta returns the cached result. Arguably this is a bug in abc.ABCMeta. That said, there’s really no way it could produce a more accurate result than callable() itself, since the typeobject->tp_call slot is not accessible in any other way.

Just use callable()
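
The same point in a compact, self-contained form, as a sketch for Python 3 (the variable names are mine): an instance attribute named __call__ fools hasattr but not callable().

```python
class Spam:
    """A class that deliberately does not define __call__."""


spam = Spam()
spam.__call__ = lambda: "OK?"  # instance attribute only; the type has no __call__

has_attribute = hasattr(spam, "__call__")
says_callable = callable(spam)

try:
    spam()
    call_succeeded = True
except TypeError:
    # Calling looks up __call__ on the type, not on the instance.
    call_succeeded = False

print(has_attribute, says_callable, call_succeeded)  # True False False
```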


回答 4

以下应返回布尔值:

callable(x)

The following should return a boolean:

callable(x)

回答 5

Python的2to3工具(http://docs.python.org/dev/library/2to3.html)建议:

import collections
isinstance(obj, collections.Callable)

似乎是hasattr(x, '__call__')因为http://bugs.python.org/issue7006选择了它而不是方法。

Python’s 2to3 tool (http://docs.python.org/dev/library/2to3.html) suggests:

import collections
isinstance(obj, collections.Callable)

It seems this was chosen instead of the hasattr(x, '__call__') method because of http://bugs.python.org/issue7006.


回答 6

callable(x) 如果可以在Python中调用传递的对象,但该函数返回true,但该函数在Python 3.0中不存在,并且正确地讲不能区分以下两者:

class A(object):
    def __call__(self):
        return 'Foo'

def B():
    return 'Bar'

a = A()
b = B

print type(a), callable(a)
print type(b), callable(b)

您将获得<class 'A'> True<type function> True作为输出。

isinstance可以很好地确定某物是否是一个函数(try isinstance(b, types.FunctionType));如果您真的想知道是否可以调用某些东西,可以使用hasattr(b, '__call__')也可以尝试一下。

test_as_func = True
try:
    b()
except TypeError:
    test_as_func = False
except:
    pass

当然,这不会告诉您它是否可以调用,但是TypeError在执行时会引发一个,还是一开始就不可调用。这可能对您来说并不重要。

callable(x) will return True if the object passed can be called in Python, but the function does not exist in Python 3.0 (it was restored in Python 3.2), and properly speaking it will not distinguish between:

class A(object):
    def __call__(self):
        return 'Foo'

def B():
    return 'Bar'

a = A()
b = B

print type(a), callable(a)
print type(b), callable(b)

You’ll get <class 'A'> True and <type 'function'> True as output.

isinstance works perfectly well to determine if something is a function (try isinstance(b, types.FunctionType)); if you’re really interested in knowing if something can be called, you can either use hasattr(b, '__call__') or just try it.

test_as_func = True
try:
    b()
except TypeError:
    test_as_func = False
except:
    pass

This, of course, won’t let you distinguish an object that is callable but raises a TypeError when it executes from one that isn’t callable in the first place. That may not matter to you.
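
If the distinction does matter, one possible sketch is to ask callable() first and only then make the call, so that a TypeError raised inside the function propagates instead of being read as "not callable". The helper call_if_callable and the function broken are hypothetical names used only here.

```python
def call_if_callable(obj, *args, **kwargs):
    """Return (True, result) if obj is callable, else (False, None)."""
    if not callable(obj):
        return False, None
    # Any exception raised by the call itself is deliberately not caught here.
    return True, obj(*args, **kwargs)


def broken():
    raise TypeError("raised inside the function body")


print(call_if_callable(len, "abc"))  # (True, 3)
print(call_if_callable(42))          # (False, None)

try:
    call_if_callable(broken)
except TypeError as exc:
    print("callable, but it raised:", exc)
```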


回答 7

如果要检测语法上看起来像函数的所有内容:函数,方法,内置fun / meth,lambda …,但要排除可调用对象(__call__定义了方法的对象),请尝试以下方法:

import types
isinstance(x, (types.FunctionType, types.BuiltinFunctionType, types.MethodType, types.BuiltinMethodType, types.UnboundMethodType))

我将其与模块中的is*()检查代码进行了比较,inspect并且上面的表达式更加完整,尤其是当您的目标是过滤掉任何功能或检测对象的常规属性时。

If you want to detect everything that syntactically looks like a function: a function, method, built-in fun/meth, lambda … but exclude callable objects (objects with __call__ method defined), then try this one:

import types
isinstance(x, (types.FunctionType, types.BuiltinFunctionType, types.MethodType, types.BuiltinMethodType, types.UnboundMethodType))

I compared this with the code of is*() checks in inspect module and the expression above is much more complete, especially if your goal is filtering out any functions or detecting regular properties of an object.
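
Note that types.UnboundMethodType exists only in Python 2. A rough Python 3 version of the same check might look like the sketch below; the names FUNCTION_LIKE, looks_like_function and CallableThing are my own and not part of the original answer.

```python
import types

# Same tuple as above, minus the Python 2-only UnboundMethodType.
FUNCTION_LIKE = (
    types.FunctionType,
    types.BuiltinFunctionType,
    types.MethodType,
    types.BuiltinMethodType,
)


def looks_like_function(x):
    return isinstance(x, FUNCTION_LIKE)


class CallableThing:
    def __call__(self):
        return "called"

    def method(self):
        return "method"


print(looks_like_function(lambda: 0))               # True
print(looks_like_function(len))                     # True
print(looks_like_function([].append))               # True
print(looks_like_function(CallableThing().method))  # True
print(looks_like_function(CallableThing()))         # False: callable instance
print(looks_like_function(CallableThing))           # False: class
```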


回答 8

尝试使用callable(x)

Try using callable(x).


回答 9

如果您已学习C++,则必须熟悉function objectfunctor,表示可以be called as if it is a function

在C ++中, an ordinary function是一个函数对象,一个函数指针也是如此;更一般而言,define的类的对象也是如此operator()。在C ++ 11和更高版本中the lambda expression也是functor如此。

相似,在Python中,这些functors都是callableAn ordinary function可以调用,a lambda expression可以调用,可以调用,functional.partial可以调用的实例class with a __call__() method


好的,回到问题: I have a variable, x, and I want to know whether it is pointing to a function or not.

如果要判断天气,对象的作用就像一个函数,则callable建议的方法@John Feminella还可以。

如果要judge whether a object is just an ordinary function or not(不是可调用的类实例或lambda表达式),则xtypes.XXX建议使用by @Ryan是更好的选择。

然后,我使用这些代码进行实验:

#!/usr/bin/python3
# 2017.12.10 14:25:01 CST
# 2017.12.10 15:54:19 CST

import functools
import types
import pprint

定义一个类和一个普通函数。

class A():
    def __call__(self, a,b):
        print(a,b)
    def func1(self, a, b):
        print("[classfunction]:", a, b)
    @classmethod
    def func2(cls, a,b):
        print("[classmethod]:", a, b)
    @staticmethod
    def func3(a,b):
        print("[staticmethod]:", a, b)

def func(a,b):
    print("[function]", a,b)

定义函子:

#(1.1) built-in function
builtins_func = open
#(1.2) ordinary function
ordinary_func = func
#(1.3) lambda expression
lambda_func  = lambda a : func(a,4)
#(1.4) functools.partial
partial_func = functools.partial(func, b=4)

#(2.1) callable class instance
class_callable_instance = A()
#(2.2) ordinary class function
class_ordinary_func = A.func1
#(2.3) bound class method
class_bound_method = A.func2
#(2.4) static class method
class_static_func = A.func3

定义函子列表和类型列表:

## list of functors
xfuncs = [builtins_func, ordinary_func, lambda_func, partial_func, class_callable_instance, class_ordinary_func, class_bound_method, class_static_func]
## list of type
xtypes = [types.BuiltinFunctionType, types.FunctionType, types.MethodType, types.LambdaType, functools.partial]

判断函子是否可调用。如您所见,它们都是可调用的。

res = [callable(xfunc)  for xfunc in xfuncs]
print("functors callable:")
print(res)

"""
functors callable:
[True, True, True, True, True, True, True, True]
"""

判断函子的类型(types.XXX)。那么函子的类型并不完全相同。

res = [[isinstance(xfunc, xtype) for xtype in xtypes] for xfunc in xfuncs]

## output the result
print("functors' types")
for (row, xfunc) in zip(res, xfuncs):
    print(row, xfunc)

"""
functors' types
[True, False, False, False, False] <built-in function open>
[False, True, False, True, False] <function func at 0x7f1b5203e048>
[False, True, False, True, False] <function <lambda> at 0x7f1b5081fd08>
[False, False, False, False, True] functools.partial(<function func at 0x7f1b5203e048>, b=4)
[False, False, False, False, False] <__main__.A object at 0x7f1b50870cc0>
[False, True, False, True, False] <function A.func1 at 0x7f1b5081fb70>
[False, False, True, False, False] <bound method A.func2 of <class '__main__.A'>>
[False, True, False, True, False] <function A.func3 at 0x7f1b5081fc80>
"""

我使用数据绘制了可调用函子类型的表。

然后,您可以选择合适的函子类型。

如:

def func(a,b):
    print("[function]", a,b)

>>> callable(func)
True
>>> isinstance(func,  types.FunctionType)
True
>>> isinstance(func, (types.BuiltinFunctionType, types.FunctionType, functools.partial))
True
>>> 
>>> isinstance(func, (types.MethodType, functools.partial))
False

If you have learned C++, you must be familiar with the function object or functor, which means any object that can be called as if it were a function.

In C++, an ordinary function is a function object, and so is a function pointer; more generally, so is an object of a class that defines operator(). In C++11 and later, a lambda expression is a functor too.

Similarly, in Python, those functors are all callable. An ordinary function is callable, a lambda expression is callable, a functools.partial object is callable, and instances of a class with a __call__() method are callable.


OK, back to the question: I have a variable, x, and I want to know whether it is pointing to a function or not.

If you want to judge whether the object acts like a function, then the callable method suggested by @John Feminella is fine.

If you want to judge whether an object is just an ordinary function or not (not a callable class instance), then the types.XXX approach suggested by @Ryan is a better choice.

Then I ran an experiment using this code:

#!/usr/bin/python3
# 2017.12.10 14:25:01 CST
# 2017.12.10 15:54:19 CST

import functools
import types
import pprint

Define a class and an ordinary function.

class A():
    def __call__(self, a,b):
        print(a,b)
    def func1(self, a, b):
        print("[classfunction]:", a, b)
    @classmethod
    def func2(cls, a,b):
        print("[classmethod]:", a, b)
    @staticmethod
    def func3(a,b):
        print("[staticmethod]:", a, b)

def func(a,b):
    print("[function]", a,b)

Define the functors:

#(1.1) built-in function
builtins_func = open
#(1.2) ordinary function
ordinary_func = func
#(1.3) lambda expression
lambda_func  = lambda a : func(a,4)
#(1.4) functools.partial
partial_func = functools.partial(func, b=4)

#(2.1) callable class instance
class_callable_instance = A()
#(2.2) ordinary class function
class_ordinary_func = A.func1
#(2.3) bound class method
class_bound_method = A.func2
#(2.4) static class method
class_static_func = A.func3

Define the functors’ list and the types’ list:

## list of functors
xfuncs = [builtins_func, ordinary_func, lambda_func, partial_func, class_callable_instance, class_ordinary_func, class_bound_method, class_static_func]
## list of type
xtypes = [types.BuiltinFunctionType, types.FunctionType, types.MethodType, types.LambdaType, functools.partial]

Judge whether each functor is callable. As you can see, they are all callable.

res = [callable(xfunc)  for xfunc in xfuncs]
print("functors callable:")
print(res)

"""
functors callable:
[True, True, True, True, True, True, True, True]
"""

Judge each functor’s type (types.XXX). The types of the functors are not all the same.

res = [[isinstance(xfunc, xtype) for xtype in xtypes] for xfunc in xfuncs]

## output the result
print("functors' types")
for (row, xfunc) in zip(res, xfuncs):
    print(row, xfunc)

"""
functors' types
[True, False, False, False, False] <built-in function open>
[False, True, False, True, False] <function func at 0x7f1b5203e048>
[False, True, False, True, False] <function <lambda> at 0x7f1b5081fd08>
[False, False, False, False, True] functools.partial(<function func at 0x7f1b5203e048>, b=4)
[False, False, False, False, False] <__main__.A object at 0x7f1b50870cc0>
[False, True, False, True, False] <function A.func1 at 0x7f1b5081fb70>
[False, False, True, False, False] <bound method A.func2 of <class '__main__.A'>>
[False, True, False, True, False] <function A.func3 at 0x7f1b5081fc80>
"""

I drew a table of callable functor types using this data.

Then you can choose the functor types that are suitable.

Such as:

def func(a,b):
    print("[function]", a,b)

>>> callable(func)
True
>>> isinstance(func,  types.FunctionType)
True
>>> isinstance(func, (types.BuiltinFunctionType, types.FunctionType, functools.partial))
True
>>> 
>>> isinstance(func, (types.MethodType, functools.partial))
False

回答 10

作为公认的答案,约翰·费米内拉说:

检查鸭子型物体属性的正确方法是询问它们是否发出嘎嘎声,而不是查看它们是否适合鸭子大小的容器。“直接比较”方法将对许多功能(例如内置函数)给出错误的答案。

即使有两个库严格区分功能,我还是绘制了一个详尽的可比较表:

8.9。类型-动态类型创建和内置类型的名称-Python 3.7.0文档

30.13。inspect —检查活动对象— Python 3.7.0文档

#import inspect             #import types
['isabstract',
 'isasyncgen',              'AsyncGeneratorType',
 'isasyncgenfunction', 
 'isawaitable',
 'isbuiltin',               'BuiltinFunctionType',
                            'BuiltinMethodType',
 'isclass',
 'iscode',                  'CodeType',
 'iscoroutine',             'CoroutineType',
 'iscoroutinefunction',
 'isdatadescriptor',
 'isframe',                 'FrameType',
 'isfunction',              'FunctionType',
                            'LambdaType',
                            'MethodType',
 'isgenerator',             'GeneratorType',
 'isgeneratorfunction',
 'ismethod',
 'ismethoddescriptor',
 'ismodule',                'ModuleType',        
 'isroutine',            
 'istraceback',             'TracebackType',
                            'MappingProxyType',
]

“鸭式打字”是通用的首选解决方案:

def detect_function(obj):
    return hasattr(obj,"__call__")

In [26]: detect_function(detect_function)
Out[26]: True
In [27]: callable(detect_function)
Out[27]: True

至于内建函数

In [43]: callable(hasattr)
Out[43]: True

再走一步检查内置功能或用户定义的功能

#check inspect.isfunction and type.FunctionType
In [46]: inspect.isfunction(detect_function)
Out[46]: True
In [47]: inspect.isfunction(hasattr)
Out[47]: False
In [48]: isinstance(detect_function, types.FunctionType)
Out[48]: True
In [49]: isinstance(getattr, types.FunctionType)
Out[49]: False
#so they both just applied to judge the user-definded

确定是否 builtin function

In [50]: isinstance(getattr, types.BuiltinFunctionType)
Out[50]: True
In [51]: isinstance(detect_function, types.BuiltinFunctionType)
Out[51]: False

摘要

采用callable鸭式检查功能,如果您有进一步指定的需求,
请使用types.BuiltinFunctionType

As John Feminella stated in the accepted answer:

The proper way to check properties of duck-typed objects is to ask them if they quack, not to see if they fit in a duck-sized container. The “compare it directly” approach will give the wrong answer for many functions, like builtins.

Even so, there are two libraries that distinguish functions strictly, so I drew up an exhaustive comparison table:

8.9. types — Dynamic type creation and names for built-in types — Python 3.7.0 documentation

30.13. inspect — Inspect live objects — Python 3.7.0 documentation

#import inspect             #import types
['isabstract',
 'isasyncgen',              'AsyncGeneratorType',
 'isasyncgenfunction', 
 'isawaitable',
 'isbuiltin',               'BuiltinFunctionType',
                            'BuiltinMethodType',
 'isclass',
 'iscode',                  'CodeType',
 'iscoroutine',             'CoroutineType',
 'iscoroutinefunction',
 'isdatadescriptor',
 'isframe',                 'FrameType',
 'isfunction',              'FunctionType',
                            'LambdaType',
                            'MethodType',
 'isgenerator',             'GeneratorType',
 'isgeneratorfunction',
 'ismethod',
 'ismethoddescriptor',
 'ismodule',                'ModuleType',        
 'isroutine',            
 'istraceback',             'TracebackType',
                            'MappingProxyType',
]

Duck typing is the preferred solution for general purposes:

def detect_function(obj):
    return hasattr(obj,"__call__")

In [26]: detect_function(detect_function)
Out[26]: True
In [27]: callable(detect_function)
Out[27]: True

As for built-in functions:

In [43]: callable(hasattr)
Out[43]: True

Going one step further, to check whether it is a built-in function or a user-defined function:

#check inspect.isfunction and types.FunctionType
In [46]: inspect.isfunction(detect_function)
Out[46]: True
In [47]: inspect.isfunction(hasattr)
Out[47]: False
In [48]: isinstance(detect_function, types.FunctionType)
Out[48]: True
In [49]: isinstance(getattr, types.FunctionType)
Out[49]: False
#so both of them apply only to user-defined functions

Determine whether it is a built-in function:

In [50]: isinstance(getattr, types.BuiltinFunctionType)
Out[50]: True
In [51]: isinstance(detect_function, types.BuiltinFunctionType)
Out[51]: False

Summary

Use callable to duck-type check a function.
Use types.BuiltinFunctionType if you have a more specific requirement.
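
Putting the pieces together, a small classifier along these lines could combine the inspect predicates from the table with callable as a fallback. This is only a sketch: the function name classify and its labels are invented, and the order of the checks is one reasonable choice among several.

```python
import functools
import inspect


def classify(obj):
    """Return a rough, human-readable label for obj."""
    if inspect.isbuiltin(obj):
        return "built-in function or method"
    if inspect.isfunction(obj):
        return "user-defined function"
    if inspect.ismethod(obj):
        return "bound method"
    if inspect.isclass(obj):
        return "class"
    if callable(obj):
        return "other callable object"
    return "not callable"


for candidate in (len, classify, "abc".upper, int, functools.partial(int, base=2), 3):
    print(repr(candidate), "->", classify(candidate))
```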


回答 11

函数只是带有__call__方法的类,因此您可以执行

hasattr(obj, '__call__')

例如:

>>> hasattr(x, '__call__')
True

>>> x = 2
>>> hasattr(x, '__call__')
False

这是“最佳”方法,但是根据您为什么需要知道它是否可调用或注释的原因,您可以将其放在try / execpt块中:

try:
    x()
except TypeError:
    print "was not callable"

如果try / except比使用Python更有意义,那是有争议的if hasattr(x, '__call__'): x()。我想说hasattr是更准确的,因为您不会偶然捕获到错误的TypeError,例如:

>>> def x():
...     raise TypeError
... 
>>> hasattr(x, '__call__')
True # Correct
>>> try:
...     x()
... except TypeError:
...     print "x was not callable"
... 
x was not callable # Wrong!

A function is just an object whose type defines a __call__ method, so you can do

hasattr(obj, '__call__')

For example:

>>> hasattr(x, '__call__')
True

>>> x = 2
>>> hasattr(x, '__call__')
False

That is the “best” way of doing it, but depending on why you need to know whether it’s callable or not, you could just put it in a try/except block:

try:
    x()
except TypeError:
    print "was not callable"

It’s arguable whether try/except is more Pythonic than doing if hasattr(x, '__call__'): x(). I would say hasattr is more accurate, since you won’t accidentally catch the wrong TypeError, for example:

>>> def x():
...     raise TypeError
... 
>>> hasattr(x, '__call__')
True # Correct
>>> try:
...     x()
... except TypeError:
...     print "x was not callable"
... 
x was not callable # Wrong!

回答 12

这是另外两种方式:

def isFunction1(f) :
    return type(f) == type(lambda x: x);

def isFunction2(f) :
    return 'function' in str(type(f));

这是我想到的第二种方法:

>>> type(lambda x: x);
<type 'function'>
>>> str(type(lambda x: x));
"<type 'function'>"
# Look Maa, function! ... I ACTUALLY told my mom about this!

Here’s a couple of other ways:

def isFunction1(f):
    return type(f) == type(lambda x: x)

def isFunction2(f):
    return 'function' in str(type(f))

Here’s how I came up with the second:

>>> type(lambda x: x);
<type 'function'>
>>> str(type(lambda x: x));
"<type 'function'>"
# Look Maa, function! ... I ACTUALLY told my mom about this!

回答 13

'__call__'您可以检查用户定义的函数是否具有属性,等等func_name,而不是进行检查(这不是函数所独有的)func_doc。这不适用于方法。

>>> def x(): pass
... 
>>> hasattr(x, 'func_name')
True

另一种检查isfunction()方法是使用inspect模块中的方法。

>>> import inspect
>>> inspect.isfunction(x)
True

要检查对象是否为方法,请使用 inspect.ismethod()

Instead of checking for '__call__' (which is not exclusive to functions), you can check whether a user-defined function has attributes func_name, func_doc, etc. This does not work for methods.

>>> def x(): pass
... 
>>> hasattr(x, 'func_name')
True

Another way of checking is to use the isfunction() function from the inspect module.

>>> import inspect
>>> inspect.isfunction(x)
True

To check if an object is a method, use inspect.ismethod()


回答 14

由于类也具有__call__方法,因此我建议另一个解决方案:

class A(object):
    def __init__(self):
        pass
    def __call__(self):
        print 'I am a Class'

MyClass = A()

def foo():
    pass

print hasattr(foo.__class__, 'func_name') # Returns True
print hasattr(A.__class__, 'func_name')   # Returns False as expected

print hasattr(foo, '__call__') # Returns True
print hasattr(A, '__call__')   # (!) Returns True while it is not a function

Since classes also have a __call__ method, I recommend another solution:

class A(object):
    def __init__(self):
        pass
    def __call__(self):
        print 'I am a Class'

MyClass = A()

def foo():
    pass

print hasattr(foo.__class__, 'func_name') # Returns True
print hasattr(A.__class__, 'func_name')   # Returns False as expected

print hasattr(foo, '__call__') # Returns True
print hasattr(A, '__call__')   # (!) Returns True while it is not a function

回答 15

请注意,Python类也是可调用的。

要获取功能(按功能,我们指的是标准功能和lambda),请使用:

import types

def is_func(obj):
    return isinstance(obj, (types.FunctionType, types.LambdaType))


def f(x):
    return x


assert is_func(f)
assert is_func(lambda x: x)

Note that Python classes are also callable.

To get functions (and by functions we mean standard functions and lambdas) use:

import types

def is_func(obj):
    return isinstance(obj, (types.FunctionType, types.LambdaType))


def f(x):
    return x


assert is_func(f)
assert is_func(lambda x: x)
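
One detail worth knowing: types.LambdaType is just another name for types.FunctionType, so listing both in the tuple is harmless but redundant. A quick sketch of a single-type version of is_func:

```python
import types


def is_func(obj):
    # FunctionType alone already covers lambdas.
    return isinstance(obj, types.FunctionType)


print(types.LambdaType is types.FunctionType)  # True
print(is_func(lambda x: x))                    # True
print(is_func(int))                            # False: a class
print(is_func(len))                            # False: a built-in function
```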

回答 16

无论函数是一个类,因此您都可以使用实例x的类的名称并进行比较:


if(x.__class__.__name__ == 'function'):
     print "it's a function"

Every function is an instance of a class, so you can take the name of the class of instance x and compare:


if(x.__class__.__name__ == 'function'):
     print "it's a function"

回答 17

在某些答案中使用hasattr(obj, '__call__')callable(.)提到的解决方案有一个主要缺点:都返回True类和带有__call__()方法的类实例。例如。

>>> import collections
>>> Test = collections.namedtuple('Test', [])
>>> callable(Test)
True
>>> hasattr(Test, '__call__')
True

检查对象是否是用户定义的函数(除了那个以外的任何东西)的一种正确方法是使用isfunction(.)

>>> import inspect
>>> inspect.isfunction(Test)
False
>>> def t(): pass
>>> inspect.isfunction(t)
True

如果需要检查其他类型,请查看一下检查—检查活动对象

The solutions using hasattr(obj, '__call__') and callable(.) mentioned in some of the answers have a major drawback: both also return True for classes and for instances of classes with a __call__() method. E.g.:

>>> import collections
>>> Test = collections.namedtuple('Test', [])
>>> callable(Test)
True
>>> hasattr(Test, '__call__')
True

One proper way of checking whether an object is a user-defined function (and nothing but that) is to use isfunction(.):

>>> import inspect
>>> inspect.isfunction(Test)
False
>>> def t(): pass
>>> inspect.isfunction(t)
True

If you need to check for other types, have a look at inspect — Inspect live objects.


回答 18

精确功能检查器

callable是一个非常好的解决方案。但是,我想以相反的方式对待约翰·费米内拉。而不是像这样说:

检查鸭子型物体属性的正确方法是询问它们是否发出嘎嘎声,而不是查看它们是否适合鸭子大小的容器。“直接比较”方法将对许多功能(例如内置函数)给出错误的答案。

我们将这样处理:

检查某物是否是鸭子的正确方法不是通过若干过滤器查看它是否会发出嘎嘎声,而是通过多种过滤器查看它是否真的是鸭子,而不是仅仅从表面上检查它是否看起来像鸭子。

我们将如何实施

‘types’模块具有大量用于检测功能的类,其中最有用的是type.FunctionType,但是还有很多其他类,例如方法类型,内置类型和lambda类型。我们还将“ functools.partial”对象视为函数。

我们检查所有类型是否简单的简单方法是使用isinstance条件。以前,我想创建一个继承自上述所有类的基类,但我无法做到这一点,因为Python不允许我们继承上述某些类。

下表列出了哪些类别可以对哪些功能进行分类:

以上功能表由kinght-金

起作用的代码

现在,这是完成我们上面描述的所有工作的代码。

from types import BuiltinFunctionType, BuiltinMethodType,  FunctionType, MethodType, LambdaType
from functools import partial

def is_function(obj):
  return isinstance(obj, (BuiltinFunctionType, BuiltinMethodType,  FunctionType, MethodType, LambdaType, partial))

#-------------------------------------------------

def my_func():
  pass

def add_both(x, y):
  return x + y

class a:
  def b(self):
    pass

check = [

is_function(lambda x: x + x),
is_function(my_func),
is_function(a.b),
is_function(partial),
is_function(partial(add_both, 2))

]

print(check)  # prints [True, True, True, False, True]

一个错误是is_function(partial),因为它是一个类,而不是一个函数,而这恰好是函数,而不是类。这是预览版,可让您试用其中的代码。

结论

如果希望通过对绝对值的鸭式输入来检查,callable(obj)是检查对象是否为函数的首选方法

我们的自定义is_function(obj),如果您不将可调用类实例算作一个函数,而是仅定义为内置函数或使用lambdadef,则可能需要进行一些编辑才能检查对象是否为函数的首选方法或部分

而且我认为这一切都包含在内。祝你有美好的一天!

An Exact Function Checker

callable is a very good solution. However, I wanted to treat this the opposite way from John Feminella. Instead of treating it like this:

The proper way to check properties of duck-typed objects is to ask them if they quack, not to see if they fit in a duck-sized container. The “compare it directly” approach will give the wrong answer for many functions, like builtins.

We’ll treat it like this:

The proper way to check if something is a duck is not to see if it can quack, but rather to see if it truly is a duck through several filters, instead of just checking if it seems like a duck from the surface.

How Would We Implement It

The ‘types’ module has plenty of classes to detect functions, the most useful being types.FunctionType, but there are also plenty of others, like a method type, a built in type, and a lambda type. We also will consider a ‘functools.partial’ object as being a function.

The simple way we check if it is a function is by using an isinstance condition on all of these types. Previously, I wanted to make a base class which inherits from all of the above, but I am unable to do that, as Python does not allow us to inherit from some of the above classes.

Here’s a table of what classes can classify what functions:

Above function table by kinght-金

The Code Which Does It

Now, this is the code which does all of the work we described from above.

from types import BuiltinFunctionType, BuiltinMethodType,  FunctionType, MethodType, LambdaType
from functools import partial

def is_function(obj):
  return isinstance(obj, (BuiltinFunctionType, BuiltinMethodType,  FunctionType, MethodType, LambdaType, partial))

#-------------------------------------------------

def my_func():
  pass

def add_both(x, y):
  return x + y

class a:
  def b(self):
    pass

check = [

is_function(lambda x: x + x),
is_function(my_func),
is_function(a.b),
is_function(partial),
is_function(partial(add_both, 2))

]

print(check)  # prints [True, True, True, False, True]

The one False was is_function(partial), because that’s a class, not a function, and this checker accepts exactly functions, not classes.

Conclusion

callable(obj) is the preferred method to check whether an object is a function if you want to go by duck typing rather than absolutes.

Our custom is_function(obj), maybe with some edits, is the preferred method to check whether an object is a function if you don’t count callable class instances as functions, but only functions that are built in or defined with lambda, def, or partial.

And I think that wraps it all up. Have a good day!


回答 19

在Python3中,我想出了type (f) == type (lambda x:x)产生Trueif f是否为函数的结果False。但是我想我更喜欢isinstance (f, types.FunctionType),感觉不太特别。我想做type (f) is function,但这不起作用。

In Python 3 I came up with type(f) == type(lambda x: x), which yields True if f is a function and False if it is not. But I think I prefer isinstance(f, types.FunctionType), which feels less ad hoc. I wanted to do type(f) is function, but that doesn’t work.
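
The reason the last form fails is that the function type is not bound to any built-in name; it is only reachable through types.FunctionType or by calling type() on an existing function. A short sketch (f is just a placeholder function):

```python
import builtins
import types


def f():
    pass


print(hasattr(builtins, "function"))  # False: there is no built-in name "function"
print(type(f).__name__)               # function
print(type(f) is types.FunctionType)  # True
```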


回答 20

在先前的答复之后,我想到了这一点:

from pprint import pprint

def print_callables_of(obj):
    li = []
    for name in dir(obj):
        attr = getattr(obj, name)
        if hasattr(attr, '__call__'):
            li.append(name)
    pprint(li)

Following previous replies, I came up with this:

from pprint import pprint

def print_callables_of(obj):
    li = []
    for name in dir(obj):
        attr = getattr(obj, name)
        if hasattr(attr, '__call__'):
            li.append(name)
    pprint(li)
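
A variant of the same idea, sketched with callable() instead of hasattr and returning the list rather than printing it. The name callables_of is made up; the None default in getattr only guards against attributes that raise AttributeError on access.

```python
def callables_of(obj):
    """Return the names of the attributes of obj that can be called."""
    names = []
    for name in dir(obj):
        attr = getattr(obj, name, None)
        if callable(attr):
            names.append(name)
    return names


print(callables_of([]))
```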

回答 21

您可以尝试以下方法:

if obj.__class__.__name__ in ['function', 'builtin_function_or_method']:
    print('probably a function')

甚至更怪异的东西:

if "function" in obj.__class__.__name__.lower():
    print('probably a function')

You could try this:

if obj.__class__.__name__ in ['function', 'builtin_function_or_method']:
    print('probably a function')

or even something more bizarre:

if "function" in obj.__class__.__name__.lower():
    print('probably a function')

回答 22

如果该值可调用,代码将继续执行调用,则只需执行调用并catch即可TypeError

def myfunc(x):
  try:
    x()
  except TypeError:
    raise Exception("Not callable")

If the code will go on to perform the call if the value is callable, just perform the call and catch TypeError.

def myfunc(x):
  try:
    x()
  except TypeError:
    raise Exception("Not callable")

回答 23

以下是检查它的“替代方法”。它也适用于lambda。

def a():pass
type(a) #<class 'function'>
str(type(a))=="<class 'function'>" #True

b = lambda x:x*2
str(type(b))=="<class 'function'>" #True

The following is a “repr way” to check it. Also it works with lambda.

def a():pass
type(a) #<class 'function'>
str(type(a))=="<class 'function'>" #True

b = lambda x:x*2
str(type(b))=="<class 'function'>" #True

回答 24

这对我有用:

str(type(a))=="<class 'function'>"

This works for me:

str(type(a))=="<class 'function'>"

如果PyPy快6.3倍,为什么我不应该在CPython上使用PyPy?

问题:如果PyPy快6.3倍,为什么我不应该在CPython上使用PyPy?

我已经听到很多有关PyPy项目的信息。他们声称它比其站点上的CPython解释器快6.3倍。

每当我们谈论诸如Python之类的动态语言时,速度都是头等大事。为了解决这个问题,他们说PyPy快6.3倍。

第二个问题是并行性,臭名昭著的Global Interpreter Lock(GIL)。为此,PyPy表示可以提供无GIL的Python

如果PyPy可以解决这些巨大的挑战,那么它的哪些弱点正在阻碍广泛采用?也就是说,是什么原因导致我这样的人,一个典型的Python开发,切换到PyPy 现在

I’ve been hearing a lot about the PyPy project. They claim it is 6.3 times faster than the CPython interpreter on their site.

Whenever we talk about dynamic languages like Python, speed is one of the top issues. To solve this, they say PyPy is 6.3 times faster.

The second issue is parallelism, the infamous Global Interpreter Lock (GIL). For this, PyPy says it can give GIL-less Python.

If PyPy can solve these great challenges, what are its weaknesses that are preventing wider adoption? That is to say, what’s preventing someone like me, a typical Python developer, from switching to PyPy right now?


回答 0

注意: PyPy现在比2013年提出这个问题时更加成熟,并且得到了更好的支持。避免从过时的信息中得出结论。


  1. 正如其他人很快提到的,PyPy 对C扩展提供了长期的支持。它具有支持,但通常速度低于Python,并且充其量也只是个问题。因此,许多模块只需要 CPython。PyPy不支持numpy PyPy现在支持numpy。某些扩展仍然不受支持(Pandas,SciPy等),请在进行更改之前先查看支持的软件包的列表
  2. 目前,对Python 3的支持尚处于试验阶段。 刚刚达到稳定!自2014年6月20日起,PyPy3 2.3.1-Fulcrum退出了
  3. PyPy有时并不真正更快“脚本”,其中有很多人使用Python进行。这些是运行时间短的程序,它们执行简单和小的操作。由于PyPy是JIT编译器,因此其主要优点来自运行时间长和简单的类型(例如数字)。坦率地说,与CPython相比,PyPy的JIT之前速度非常差
  4. 惯性。迁移到PyPy通常需要重新配置工具,对于某些人和组织而言,这简直就是太多的工作。

我会说,这些是影响我的主要原因。

NOTE: PyPy is more mature and better supported now than it was in 2013, when this question was asked. Avoid drawing conclusions from out-of-date information.


  1. PyPy, as others have been quick to mention, has tenuous support for C extensions. It has support, but typically at slower-than-Python speeds and it’s iffy at best. Hence a lot of modules simply require CPython. PyPy now supports numpy (originally it did not). Some extensions are still not supported (Pandas, SciPy, etc.); take a look at the list of supported packages before making the change.
  2. Python 3 support has just reached stable (it was previously experimental)! As of 20th June 2014, PyPy3 2.3.1 – Fulcrum is out!
  3. PyPy sometimes isn’t actually faster for “scripts”, which a lot of people use Python for. These are the short-running programs that do something simple and small. Because PyPy is a JIT compiler its main advantages come from long run times and simple types (such as numbers). Frankly, PyPy’s pre-JIT speeds are pretty bad compared to CPython.
  4. Inertia. Moving to PyPy often requires retooling, which for some people and organizations is simply too much work.

Those are the main reasons that affect me, I’d say.
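
To check point 3 for your own workload rather than take it on faith, one option is to run the same small script under both interpreters and compare the timings. The sketch below is only an illustration: the function work is an invented stand-in for real code, and the numbers it prints will depend entirely on your machine and interpreter.

```python
import platform
import timeit


def work(n=100_000):
    """Stand-in workload: simple numeric code, the kind a JIT tends to help with."""
    return sum(i * i for i in range(n))


implementation = platform.python_implementation()  # e.g. "CPython" or "PyPy"

# One call approximates a short-lived script; repeated calls give a JIT time to warm up.
single_call = timeit.timeit(work, number=1)
repeated = timeit.timeit(work, number=20)

print(f"{implementation}: first call {single_call:.4f}s, 20 calls {repeated:.4f}s")
```

Run it once with each interpreter and compare the two lines of output.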


回答 1

该网站也没有权利要求PyPy比CPython的快6.3倍。报价:

所有基准的几何平均值比CPython快0.16或6.3倍

这与您所做的一揽子声明完全不同,当您了解差异时,您将至少了解一组不能仅仅说“使用PyPy”的原因。听起来好像我很挑剔,但是了解为什么这两个陈述完全不同是至关重要的。

分解:

  • 他们所做的陈述仅适用于他们所使用的基准。它完全没有说明您的程序(除非您的程序与其基准之一完全相同)。

  • 该声明大约是一组基准的平均值。没有人声称运行PyPy甚至可以为他们测试过的程序带来6.3倍的改进。

  • 没有人声称PyPy甚至可以运行CPython运行的所有程序,更不用说更快了。

That site does not claim PyPy is 6.3 times faster than CPython. To quote:

The geometric average of all benchmarks is 0.16 or 6.3 times faster than CPython

This is a very different statement to the blanket statement you made, and when you understand the difference, you’ll understand at least one set of reasons why you can’t just say “use PyPy”. It might sound like I’m nit-picking, but understanding why these two statements are totally different is vital.

To break that down:

  • The statement they make only applies to the benchmarks they’ve used. It says absolutely nothing about your program (unless your program is exactly the same as one of their benchmarks).

  • The statement is about an average of a group of benchmarks. There is no claim that running PyPy will give a 6.3 times improvement even for the programs they have tested.

  • There is no claim that PyPy will even run all the programs that CPython runs at all, let alone faster.
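
To make the distinction concrete, here is a minimal sketch with made-up numbers. The benchmark names and ratios are hypothetical and are not taken from the PyPy speed site. It only illustrates how a geometric mean can report a large average speedup while an individual benchmark is still slower.

```python
import math

# Hypothetical ratios (PyPy time / CPython time); values invented for illustration.
ratios = {"bench_a": 0.02, "bench_b": 0.10, "bench_c": 0.50, "bench_d": 1.60}


def geometric_mean(values):
    """Return the geometric mean of an iterable of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


gm = geometric_mean(ratios.values())
# Benchmarks with a ratio above 1.0 ran slower on PyPy in this made-up data set.
slower = [name for name, ratio in ratios.items() if ratio > 1.0]

print(f"geometric mean ratio: {gm:.2f} (about {1 / gm:.1f}x faster on average)")
print("slower on PyPy:", slower)
```

With these invented figures the average looks like a 5x speedup, yet bench_d is slower, which is exactly why the average says nothing about one particular program.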


回答 2

由于pypy并非100%兼容,因此需要8 gig的ram进行编译,这是一个不断变化的目标,并且处于高度试验阶段,而cpython是稳定的,这是模块构建器默认的目标,长达20年(包括无法在pypy上运行的c扩展名) ),并且已经广泛部署。

Pypy可能永远不会成为参考实现,但是它是一个很好的工具。

Because pypy is not 100% compatible, takes 8 gigs of ram to compile, is a moving target, and highly experimental, where cpython is stable, the default target for module builders for 2 decades (including c extensions that don’t work on pypy), and already widely deployed.

Pypy will likely never be the reference implementation, but it is a good tool to have.


回答 3

第二个问题更容易回答:如果所有代码都是纯Python,则基本上可以使用PyPy替代。但是,许多广泛使用的库(包括一些标准库)都是用C编写的,并作为Python扩展进行编译。其中有些可以与PyPy一起使用,有些则不能。PyPy提供了与Python相同的“面向前”工具-也就是说,它是Python-,但是它的内在功能是不同的,因此与这些内在功能连接的工具将不起作用。

关于第一个问题,我想这有点像第一个Catch-22:PyPy一直在迅速发展,以提高速度并增强与其他代码的互操作性。这使其比官方更具实验性。

我认为,如果PyPy进入稳定状态,则有可能开始被更广泛地使用。我也认为Python摆脱C的支持是很棒的。但这不会一会儿发生。PyPy还没有达到临界质量的地方是几乎对自己有用的,足以做你想要的一切,这将激励人们以填补空白。

The second question is easier to answer: you basically can use PyPy as a drop-in replacement if all your code is pure Python. However, many widely used libraries (including some of the standard library) are written in C and compiled as Python extensions. Some of these can be made to work with PyPy, some can’t. PyPy provides the same “forward-facing” tool as Python — that is, it is Python — but its innards are different, so tools that interface with those innards won’t work.

As for the first question, I imagine it is sort of a Catch-22 with the first: PyPy has been evolving rapidly in an effort to improve speed and enhance interoperability with other code. This has made it more experimental than official.

I think it’s possible that if PyPy gets into a stable state, it may start getting more widely used. I also think it would be great for Python to move away from its C underpinnings. But it won’t happen for a while. PyPy hasn’t yet reached the critical mass where it is almost useful enough on its own to do everything you’d want, which would motivate people to fill in the gaps.


回答 4

我对此主题做了一个小型基准测试。尽管许多其他发布者在兼容性方面都提出了很好的观点,但我的经验是,PyPy仅仅移动一些位并没有那么快。对于Python的许多用途,它实际上仅存在于在两个或多个服务之间转换位。例如,很少有Web应用程序对数据集执行CPU密集型分析。相反,它们从客户端获取一些字节,将其存储在某种数据库中,然后再将其返回给其他客户端。有时,数据格式会更改。

BDFL和CPython开发人员是一群非常聪明的人,并设法帮助CPython在这种情况下表现出色。这是一个无耻的博客插件:http : //www.hydrogen18.com/blog/unpickling-buffers.html。我正在使用Stackless,它是从CPython派生的,并保留了完整的C模块接口。在那种情况下,我发现使用PyPy没有任何优势。

I did a small benchmark on this topic. While many of the other posters have made good points about compatibility, my experience has been that PyPy isn’t that much faster for just moving around bits. For many uses of Python, it really only exists to translate bits between two or more services. For example, not many web applications are performing CPU intensive analysis of datasets. Instead, they take some bytes from a client, store them in some sort of database, and later return them to other clients. Sometimes the format of the data is changed.

The BDFL and the CPython developers are a remarkably intelligent group of people and have managed to help CPython perform excellently in such a scenario. Here’s a shameless blog plug: http://www.hydrogen18.com/blog/unpickling-buffers.html . I’m using Stackless, which is derived from CPython and retains the full C module interface. I didn’t find any advantage to using PyPy in that case.


回答 5

问:如果与CPython相比,PyPy可以解决这些巨大的挑战(速度,内存消耗,并行性),那么它的哪些弱点在阻止更广泛的采用?

答:首先,很少有证据表明PyPy团队可以解决问题的速度一般。长期证据表明,PyPy运行某些Python代码要比CPython慢​​,而且这一缺点似乎深深地植根于PyPy。

其次,在相当多的情况下,当前版本的PyPy消耗的内存比CPython多得多。因此,PyPy尚未解决内存消耗问题。

无论PyPy解决所提到的巨大挑战,并在一般更快,较少的内存饿了,和更友好的并行与CPython是一个悬而未决的问题无法在短期内得到解决。有人押注,PyPy将永远无法提供一种通用解决方案,使它在所有情况下均能统治CPython 2.7和3.3。

如果PyPy总体上要比CPython更好,这是值得怀疑的,那么影响其广泛采用的主要弱点将是与CPython的兼容性。还存在一些问题,例如CPython可在更广泛的CPU和OS上运行,但是与PyPy的性能和CPython兼容性目标相比,这些问题的重要性要小得多。


问:为什么现在不能放弃用PyPy替换CPython?

答:PyPy并非100%与CPython兼容,因为它没有在后台模拟CPython。有些程序可能仍依赖于PyPy中缺少的CPython的独特功能,例如C绑定,Python对象和方法的C实现,或CPython垃圾收集器的增量性质。

Q: If PyPy can solve these great challenges (speed, memory consumption, parallelism) in comparison to CPython, what are its weaknesses that are preventing wider adoption?

A: First, there is little evidence that the PyPy team can solve the speed problem in general. Long-term evidence shows that PyPy runs certain Python code slower than CPython, and this drawback seems to be deeply rooted in PyPy.

Secondly, the current version of PyPy consumes much more memory than CPython in a rather large set of cases. So PyPy hasn’t solved the memory consumption problem yet.

Whether PyPy solves the mentioned great challenges and will in general be faster, less memory hungry, and more friendly to parallelism than CPython is an open question that cannot be solved in the short term. Some people are betting that PyPy will never be able to offer a general solution enabling it to dominate CPython 2.7 and 3.3 in all cases.

If PyPy succeeds in being better than CPython in general, which is questionable, the main weakness affecting its wider adoption will be its compatibility with CPython. There are also issues such as the fact that CPython runs on a wider range of CPUs and OSes, but these matter much less than PyPy’s performance and CPython-compatibility goals.


Q: Why can’t I use PyPy as a drop-in replacement for CPython now?

A: PyPy isn’t 100% compatible with CPython because it isn’t simulating CPython under the hood. Some programs may still depend on CPython-specific features that are absent in PyPy, such as C bindings, C implementations of Python objects and methods, or the incremental nature of CPython’s garbage collector.


回答 6

CPython具有引用计数和垃圾收集,PyPy仅具有垃圾收集。

因此,对象倾向于更早地删除,并__del__在CPython中以更可预测的方式调用。一些软件依赖于这种行为,因此它们还没有准备好迁移到PyPy。

某些其他软件可同时使用这两种软件,但CPython使用较少的内存,因为较早时释放了未使用的对象。(我没有任何度量来表明这有多重要,还有哪些其他实现细节会影响内存使用。)

CPython has reference counting and garbage collection, PyPy has garbage collection only.

So objects tend to be deleted earlier and __del__ is called in a more predictable way in CPython. Some software relies on this behavior, thus they are not ready for migrating to PyPy.

Some other software works with both, but uses less memory with CPython, because unused objects are freed earlier. (I don’t have any measurements to indicate how significant this is and what other implementation details affect the memory use.)
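
A minimal sketch of how code can avoid depending on when __del__ runs: release the resource explicitly with a context manager and keep __del__ only as a fallback. The Resource class and its log list are hypothetical names used for illustration.

```python
class Resource:
    """Hypothetical resource that records when it is released."""

    def __init__(self, log):
        self.log = log
        self.closed = False

    def close(self):
        # Idempotent: closing twice has no further effect.
        if not self.closed:
            self.closed = True
            self.log.append("closed")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions

    def __del__(self):
        # Fallback only: when this runs depends on the interpreter's memory management.
        self.close()


log = []
with Resource(log) as res:
    log.append("working")

# The resource is closed here, without waiting for the object to be collected.
print(log)
```

Code written this way does not care whether the object is reclaimed immediately (reference counting) or at some later collection.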


回答 7

对于许多项目,在速度方面,不同的python之间实际上有0%的差异。那就是那些受工程时间支配并且所有python都具有相同数量的库支持的库。

For a lot of projects, there is actually 0% difference between the different Pythons in terms of speed: namely, projects whose cost is dominated by engineering time and for which all Pythons have the same amount of library support.


回答 8

简单地说:PyPy提供了CPython所缺乏的速度,但却牺牲了它的兼容性。但是,大多数人选择Python是因为它具有灵活性和“含电池”功能(高兼容性),而不是因为它的速度(尽管它仍然是首选)。

To put this simply: PyPy provides the speed that CPython lacks, but sacrifices compatibility. Most people, however, choose Python for its flexibility and its “batteries included” philosophy (high compatibility), not for its speed (although speed is still welcome).


回答 9

我发现了一些例子,其中PyPy比Python慢​​。但是:仅在Windows上。

C:\Users\User>python -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 294 msec per loop

C:\Users\User>pypy -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 1.33 sec per loop

因此,如果您想到的是PyPy,请忘记Windows。在Linux上,您可以实现出色的加速。示例(列出1到1,000,000之间的所有素数):

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

PyPy的运行速度比Python快10(!)倍。但不在Windows上。那里只有3倍的速度。

I’ve found examples where PyPy is slower than CPython, but only on Windows.

C:\Users\User>python -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 294 msec per loop

C:\Users\User>pypy -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 1.33 sec per loop

So, if you think of PyPy, forget Windows. On Linux, you can achieve awesome accelerations. Example (list all primes between 1 and 1,000,000):

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

This runs 10(!) times faster on PyPy than on CPython, but not on Windows, where it is only 3x as fast.


回答 10

PyPy已经支持Python 3一段时间了,但是根据Anthony Shaw在2018年4月2日发布的HackerNoon帖子中所述,PyPy3仍然比PyPy(Python 2)慢几倍。

对于许多科学计算,尤其是矩阵计算,numpy是更好的选择(请参阅FAQ:我应该安装numpy还是numpypy?)。

Pypy不支持gmpy2。您可以改用gmpy_cffi, 尽管我尚未测试过它的速度,并且该项目在2014年发布了一个版本。

对于Project Euler问题,我经常使用PyPy,对于简单的数值计算通常from __future__ import division足以满足我的目的,但是截至2018年,Python 3支持仍在开发中,最好的选择是在64位Linux上。Windows PyPy3.5 v6.0(截至2018年12月)为最新版本。

PyPy has had Python 3 support for a while, but according to this HackerNoon post by Anthony Shaw from April 2nd, 2018, PyPy3 is still several times slower than PyPy (Python 2).

For many scientific calculations, particularly matrix calculations, numpy is a better choice (see FAQ: Should I install numpy or numpypy?).

Pypy does not support gmpy2. You can instead make use of gmpy_cffi though I haven’t tested its speed and the project had one release in 2014.

For Project Euler problems, I make frequent use of PyPy, and for simple numerical calculations often from __future__ import division is sufficient for my purposes, but Python 3 support is still being worked on as of 2018, with your best bet being on 64-bit Linux. Windows PyPy3.5 v6.0, the latest as of December 2018, is in beta.


回答 11

支持的Python版本

引用PythonZen

可读性很重要。

例如,Python 3.7引入了数据类,Python 3.8引入了fstring =

Python 3.7和Python 3.8中可能还有其他更重要的功能。关键是PyPy目前不支持Python 3.7或Python 3.8。

Supported Python Versions

To cite the Zen of Python:

Readability counts.

For example, Python 3.7 introduced dataclasses and Python 3.8 introduced the = specifier for f-strings.

There might be other features in Python 3.7 and Python 3.8 which are more important to you. The point is that PyPy does not support Python 3.7 or Python 3.8 at the moment.
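
For readers who have not used the two features mentioned above, here is a short sketch. It needs Python 3.8 or later, and the Point class is a hypothetical example.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


p = Point(1, 2)

# The dataclass decorator generates __init__, __repr__ and __eq__.
print(p)

# The "=" specifier (Python 3.8+) prints the expression text together with its value.
print(f"{p.x=}")
```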


从字符串列表中删除空字符串

问题:从字符串列表中删除空字符串

我想从python中的字符串列表中删除所有空字符串。

我的想法如下:

while '' in str_list:
    str_list.remove('')

还有其他pythonic方式可以做到这一点吗?

I want to remove all empty strings from a list of strings in python.

My idea looks like this:

while '' in str_list:
    str_list.remove('')

Is there any more pythonic way to do this?


回答 0

我会使用filter

str_list = filter(None, str_list)
str_list = filter(bool, str_list)
str_list = filter(len, str_list)
str_list = filter(lambda item: item, str_list)

Python 3从返回一个迭代器filter,因此应包装在对的调用中list()

str_list = list(filter(None, str_list))

I would use filter:

str_list = filter(None, str_list)
str_list = filter(bool, str_list)
str_list = filter(len, str_list)
str_list = filter(lambda item: item, str_list)

In Python 3, filter returns an iterator, so the result should be wrapped in a call to list():

str_list = list(filter(None, str_list))
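
A small sketch of why the list() call matters in Python 3: the object returned by filter is lazy and can be consumed only once. The variable names are illustrative.

```python
str_list = ["a", "", "b", ""]

lazy = filter(None, str_list)  # nothing is evaluated yet

first_pass = list(lazy)   # consumes the iterator
second_pass = list(lazy)  # the iterator is already exhausted

print(first_pass)
print(second_pass)
```

Materializing the result once with list() avoids surprises when the value is iterated again or indexed.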

回答 1

使用列表理解是最Python的方式:

>>> strings = ["first", "", "second"]
>>> [x for x in strings if x]
['first', 'second']

如果必须就地修改列表,因为还有其他引用必须看到更新的数据,则使用分片分配:

strings[:] = [x for x in strings if x]

Using a list comprehension is the most Pythonic way:

>>> strings = ["first", "", "second"]
>>> [x for x in strings if x]
['first', 'second']

If the list must be modified in-place, because there are other references which must see the updated data, then use a slice assignment:

strings[:] = [x for x in strings if x]
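
A minimal sketch of the difference between slice assignment and plain rebinding, assuming a second name refers to the same list. The names alias and other are illustrative.

```python
# Slice assignment mutates the existing list, so every reference sees the change.
strings = ["first", "", "second"]
alias = strings
strings[:] = [x for x in strings if x]

# Plain assignment binds the name to a new list; the old list is left untouched.
rebound = ["first", "", "second"]
other = rebound
rebound = [x for x in rebound if x]

print(alias)  # updated through the shared list
print(other)  # still contains the empty string
```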

回答 2

过滤器实际上对此有一个特殊的选择:

filter(None, sequence)

它将滤除所有评估为False的元素。此处无需使用实际的可调用对象,例如bool,len等。

和map(bool,…)一样快

filter actually has a special option for this:

filter(None, sequence)

It will filter out all elements that evaluate to False. No need to use an actual callable here such as bool, len and so on.

It’s equally fast as map(bool, …)


回答 3

>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']

>>> ' '.join(lstr).split()
['hello', 'world']

>>> filter(None, lstr)
['hello', ' ', 'world', ' ']

比较时间

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
4.226747989654541
>>> timeit('filter(None, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.0278358459472656

请注意,filter(None, lstr)它不会删除带有空格的空字符串' ',只会修剪掉''而同时' '.join(lstr).split()删除它们。

要使用filter()删除的空格字符串,需要花费更多时间:

>>> timeit('filter(None, [l.replace(" ", "") for l in lstr])', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
18.101892948150635
>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']

>>> ' '.join(lstr).split()
['hello', 'world']

>>> filter(None, lstr)
['hello', ' ', 'world', ' ']

Compare time

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
4.226747989654541
>>> timeit('filter(None, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.0278358459472656

Notice that filter(None, lstr) does not remove empty strings with a space ' ', it only prunes away '' while ' '.join(lstr).split() removes both.

To use filter() with white space strings removed, it takes a lot more time:

>>> timeit('filter(None, [l.replace(" ", "") for l in lstr])', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
18.101892948150635

回答 4

@ Ib33X的回复很棒。如果要删除每个空字符串,请剥离后。您也需要使用strip方法。否则,如果有空格,它将也返回空字符串。如,“”对于该答案也将有效。这样,就可以实现。

strings = ["first", "", "second ", " "]
[x.strip() for x in strings if x.strip()]

答案是["first", "second"]
如果要改用filtermethod,可以执行like
list(filter(lambda item: item.strip(), strings))。这给出了相同的结果。

The reply from @Ib33X is great. If you also want to remove strings that are empty after stripping, you need to use the strip method too. Otherwise, whitespace-only strings are kept: for example, " " would pass the filter in that answer. This can be achieved with:

strings = ["first", "", "second ", " "]
[x.strip() for x in strings if x.strip()]

The result is ["first", "second"].
If you want to use filter instead, you can write
list(filter(lambda item: item.strip(), strings)). This keeps the same elements, but returns them unstripped: ["first", "second "].


回答 5

代替if x,我将使用if X!=”来消除空字符串。像这样:

str_list = [x for x in str_list if x != '']

这将在列表中保留“无”数据类型。此外,如果您的列表中有整数,并且0是其中的一个,它也将被保留。

例如,

str_list = [None, '', 0, "Hi", '', "Hello"]
[x for x in str_list if x != '']
[None, 0, "Hi", "Hello"]

Instead of if x, I would use if x != '' in order to eliminate only empty strings. Like this:

str_list = [x for x in str_list if x != '']

This preserves None values in your list. Also, if your list contains integers and 0 is one of them, it is preserved as well.

For example,

str_list = [None, '', 0, "Hi", '', "Hello"]
[x for x in str_list if x != '']
[None, 0, "Hi", "Hello"]

回答 6

根据列表的大小,如果您使用list.remove()而不是创建新列表,则可能是最有效的:

l = ["1", "", "3", ""]

while True:
  try:
    l.remove("")
  except ValueError:
    break

这具有不创建新列表的优点,但是具有每次都必须从头开始搜索的缺点,尽管与while '' in l上面建议的用法不同,它每次出现时仅需要搜索一次''(当然,有一种方法可以保持最佳状态)两种方法,但更为复杂)。

Depending on the size of your list, it may be most efficient if you use list.remove() rather than create a new list:

l = ["1", "", "3", ""]

while True:
  try:
    l.remove("")
  except ValueError:
    break

This has the advantage of not creating a new list, but the disadvantage of having to search from the beginning each time, although unlike using while '' in l as proposed above, it only requires searching once per occurrence of '' (there is certainly a way to keep the best of both methods, but it is more complicated).
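
One possible way to combine both advantages (no new list, no repeated searches from the start) is a single-pass in-place compaction. This is only a sketch; remove_empty_in_place is a hypothetical helper name, not something from the answer above.

```python
def remove_empty_in_place(items):
    """Remove every '' from items in place, scanning the list only once."""
    write = 0
    for item in items:
        if item != "":
            # Move kept items toward the front; write never overtakes the read position.
            items[write] = item
            write += 1
    # Drop the leftover tail in one operation.
    del items[write:]


l = ["1", "", "3", ""]
remove_empty_in_place(l)
print(l)
```

In practice the slice assignment l[:] = [x for x in l if x] is shorter and also keeps the same list object, at the cost of building a temporary list.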


回答 7

请记住,如果要将空格保留在字符串中,则可以使用某些方法无意中将其删除。如果你有这个清单

[‘hello world’,”,’,’hello’]您可能想要的内容[‘hello world’,’hello’]

首先修剪列表以将任何类型的空格转换为空字符串:

space_to_empty = [x.strip() for x in _text_list]

然后从列表中删除空字符串

space_clean_list = [x for x in space_to_empty if x]

Keep in mind that if you want to keep the whitespace inside a string, some approaches may remove it unintentionally. If you have this list:

['hello world', ' ', '', 'hello']

what you probably want is ['hello world', 'hello'].

First, strip each item so that whitespace-only strings become empty strings:

space_to_empty = [x.strip() for x in _text_list]

Then remove the empty strings from the list:

space_clean_list = [x for x in space_to_empty if x]

回答 8

用途filter

newlist=filter(lambda x: len(x)>0, oldlist) 

如所指出的,使用过滤器的缺点是它比替代方法慢。而且,lambda通常很昂贵。

或者,您可以选择最简单,最迭代的方法:

# I am assuming listtext is the original list containing (possibly) empty items
newlist = []
for item in listtext:
    if item:
        newlist.append(str(item))
# You can remove str() based on the content of your original list

这是最直观的方法,并且可以在适当的时间内完成。

Use filter:

newlist=filter(lambda x: len(x)>0, oldlist) 

As pointed out, the drawback of using filter is that it is slower than the alternatives; also, lambda is usually costly.

Or you can go for the simplest and most explicit loop of all:

# I am assuming listtext is the original list containing (possibly) empty items
newlist = []
for item in listtext:
    if item:
        newlist.append(str(item))
# You can remove str() based on the content of your original list

This is the most intuitive of the methods and runs in decent time.


回答 9

正如Aziz Alto 所报告的filter(None, lstr)那样,不会删除带有空格的空字符串,' '但是如果您确定lstr仅包含字符串,则可以使用filter(str.strip, lstr)

>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> filter(str.strip, lstr)
['hello', 'world']

比较我的电脑上的时间

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.356455087661743
>>> timeit('filter(str.strip, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
5.276503801345825

删除''和清空带有空格的字符串的最快解决方案' '仍然是' '.join(lstr).split()

如评论中所述,如果您的字符串包含空格,则情况会有所不同。

>>> lstr = ['hello', '', ' ', 'world', '    ', 'see you']
>>> lstr
['hello', '', ' ', 'world', '    ', 'see you']
>>> ' '.join(lstr).split()
['hello', 'world', 'see', 'you']
>>> filter(str.strip, lstr)
['hello', 'world', 'see you']

您会看到filter(str.strip, lstr)保留带空格的字符串,但' '.join(lstr).split()会拆分这些字符串。

As reported by Aziz Alto, filter(None, lstr) does not remove whitespace-only strings such as ' ', but if you are sure lstr contains only strings, you can use filter(str.strip, lstr):

>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> filter(str.strip, lstr)
['hello', 'world']

Compare time on my pc

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.356455087661743
>>> timeit('filter(str.strip, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
5.276503801345825

The fastest solution to remove '' and empty strings with a space ' ' remains ' '.join(lstr).split().

As reported in a comment the situation is different if your strings contain spaces.

>>> lstr = ['hello', '', ' ', 'world', '    ', 'see you']
>>> lstr
['hello', '', ' ', 'world', '    ', 'see you']
>>> ' '.join(lstr).split()
['hello', 'world', 'see', 'you']
>>> filter(str.strip, lstr)
['hello', 'world', 'see you']

You can see that filter(str.strip, lstr) preserves strings that contain spaces, whereas ' '.join(lstr).split() splits them.
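
The "only strings" condition matters. A sketch in Python 3 of what happens when the list also contains None, with one possible defensive variant. The mixed list is an invented example.

```python
mixed = ["hello", "", None, " ", "see you"]

error = None
try:
    # str.strip is called on every element, so a non-string element raises.
    list(filter(str.strip, mixed))
except TypeError as exc:
    error = exc

# Defensive variant: keep only real strings that are non-empty after stripping.
safe = [s for s in mixed if isinstance(s, str) and s.strip()]

print(type(error).__name__)
print(safe)
```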


回答 10

总结最佳答案:

1.消除空洞而无需剥离:

也就是说,保留所有空格字符串:

slist = list(filter(None, slist))

优点:

  • 最简单
  • 最快(请参见下面的基准)。

2.去除剥离后的空容器…

2.a …当字符串在单词之间不包含空格时:

slist = ' '.join(slist).split()

优点:

  • 小代码
  • 快速(但由于内存原因,对于大型数据集而言并非最快,这与@ paolo-melchiorre结果相反)

2.b …字符串在单词之间包含空格吗?

slist = list(filter(str.strip, slist))

优点:

  • 最快的;
  • 代码的可理解性。

2018年机器上的基准测试:

## Build test-data
#
import random, string
nwords = 10000
maxlen = 30
null_ratio = 0.1
rnd = random.Random(0)                  # deterministic results
words = [' ' * rnd.randint(0, maxlen)
         if rnd.random() > (1 - null_ratio)
         else
         ''.join(rnd.choices(string.ascii_letters, k=rnd.randint(0, maxlen)))
         for _i in range(nwords)
        ]

## Test functions
#
def nostrip_filter(slist):
    return list(filter(None, slist))

def nostrip_comprehension(slist):
    return [s for s in slist if s]

def strip_filter(slist):
    return list(filter(str.strip, slist))

def strip_filter_map(slist): 
    return list(filter(None, map(str.strip, slist))) 

def strip_filter_comprehension(slist):  # waste memory
    return list(filter(None, [s.strip() for s in slist]))

def strip_filter_generator(slist):
    return list(filter(None, (s.strip() for s in slist)))

def strip_join_split(slist):  # words without(!) spaces
    return ' '.join(slist).split()

## Benchmarks
#
%timeit nostrip_filter(words)
142 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit nostrip_comprehension(words)
263 µs ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter(words)
653 µs ± 37.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_map(words)
642 µs ± 36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_comprehension(words)
693 µs ± 42.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_generator(words)
750 µs ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_join_split(words)
796 µs ± 103 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Sum up best answers:

1. Eliminate empties WITHOUT stripping:

That is, all-space strings are retained:

slist = list(filter(None, slist))

PROs:

  • simplest;
  • fastest (see benchmarks below).

2. To eliminate empties after stripping …

2.a … when strings do NOT contain spaces between words:

slist = ' '.join(slist).split()

PROs:

  • small code
  • fast (BUT not the fastest with big datasets due to memory, contrary to @paolo-melchiorre’s results)

2.b … when strings contain spaces between words?

slist = list(filter(str.strip, slist))

PROs:

  • fastest;
  • understandability of the code.

Benchmarks on a 2018 machine:

## Build test-data
#
import random, string
nwords = 10000
maxlen = 30
null_ratio = 0.1
rnd = random.Random(0)                  # deterministic results
words = [' ' * rnd.randint(0, maxlen)
         if rnd.random() > (1 - null_ratio)
         else
         ''.join(rnd.choices(string.ascii_letters, k=rnd.randint(0, maxlen)))
         for _i in range(nwords)
        ]

## Test functions
#
def nostrip_filter(slist):
    return list(filter(None, slist))

def nostrip_comprehension(slist):
    return [s for s in slist if s]

def strip_filter(slist):
    return list(filter(str.strip, slist))

def strip_filter_map(slist): 
    return list(filter(None, map(str.strip, slist))) 

def strip_filter_comprehension(slist):  # waste memory
    return list(filter(None, [s.strip() for s in slist]))

def strip_filter_generator(slist):
    return list(filter(None, (s.strip() for s in slist)))

def strip_join_split(slist):  # words without(!) spaces
    return ' '.join(slist).split()

## Benchmarks
#
%timeit nostrip_filter(words)
142 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit nostrip_comprehension(words)
263 µs ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter(words)
653 µs ± 37.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_map(words)
642 µs ± 36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_comprehension(words)
693 µs ± 42.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_generator(words)
750 µs ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_join_split(words)
796 µs ± 103 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

回答 11

对于包含空格和空值的列表,请使用简单的列表理解-

>>> s = ['I', 'am', 'a', '', 'great', ' ', '', '  ', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', '', 'a', '', 'joke', '', ' ', '', '?', '', '', '', '?']

因此,您可以看到,此列表包含空格和null元素的组合。使用摘要-

>>> d = [x for x in s if x.strip()]
>>> d
['I', 'am', 'a', 'great', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', 'a', 'joke', '?', '?']

For a list with a combination of whitespace-only and empty values, use a simple list comprehension:

>>> s = ['I', 'am', 'a', '', 'great', ' ', '', '  ', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', '', 'a', '', 'joke', '', ' ', '', '?', '', '', '', '?']

As you can see, this list has a combination of whitespace-only and empty elements. Using the snippet:

>>> d = [x for x in s if x.strip()]
>>> d
['I', 'am', 'a', 'great', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', 'a', 'joke', '?', '?']

如何以常规格式打印日期?

问题:如何以常规格式打印日期?

这是我的代码:

import datetime
today = datetime.date.today()
print(today)

打印:2008-11-22这正是我想要的。

但是,我有一个列表要附加到该列表中,然后突然所有内容都变得“异常”。这是代码:

import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist)

打印以下内容:

[datetime.date(2008, 11, 22)]

我怎样才能得到一个简单的约会2008-11-22

This is my code:

import datetime
today = datetime.date.today()
print(today)

This prints: 2008-11-22 which is exactly what I want.

But, I have a list I’m appending this to and then suddenly everything goes “wonky”. Here is the code:

import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist)

This prints the following:

[datetime.date(2008, 11, 22)]

How can I get just a simple date like 2008-11-22?


回答 0

为什么:日期是对象

在Python中,日期是对象。因此,当您操作它们时,您将操作对象,而不是字符串,时间戳或其他任何对象。

Python中的任何对象都有两个字符串表示形式:

  • 可以使用str()函数获取“打印”所使用的常规表示形式。在大多数情况下,它是最常见的人类可读格式,用于简化显示。所以str(datetime.datetime(2008, 11, 22, 19, 53, 42))给你'2008-11-22 19:53:42'

  • 用于表示对象性质(作为数据)的替代表示。它可以使用该repr()函数获得,并且很容易知道在开发或调试时要处理的数据类型。repr(datetime.datetime(2008, 11, 22, 19, 53, 42))给你'datetime.datetime(2008, 11, 22, 19, 53, 42)'

发生的事情是,当您使用“打印”打印日期时,会使用它,str()以便可以看到一个不错的日期字符串。但是在打印后mylist,您已经打印了一个对象列表,Python尝试使用来表示数据集repr()

方法:您想怎么做?

好吧,当您操作日期时,请一直使用日期对象。他们获得了数千种有用的方法,并且大多数Python API都希望日期成为对象。

要显示它们时,只需使用str()。在Python中,良好的做法是显式转换所有内容。因此,仅在打印时,使用即可获取日期的字符串表示形式str(date)

最后一件事。当您尝试打印日期时,您打印了mylist。如果要打印日期,则必须打印日期对象,而不是其容器(列表)。

EG,您想将所有日期打印在列表中:

for date in mylist :
    print str(date)

请注意,在这种特定情况下,您甚至可以省略,str()因为打印将为您使用它。但这不应该成为一种习惯:-)

实际案例,使用您的代码

import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print mylist[0] # print the date object, not the container ;-)
2008-11-22

# It's better to always use str() because :

print "This is a new day : ", mylist[0] # will work
>>> This is a new day : 2008-11-22

print "This is a new day : " + mylist[0] # will crash
>>> cannot concatenate 'str' and 'datetime.date' objects

print "This is a new day : " + str(mylist[0]) 
>>> This is a new day : 2008-11-22

高级日期格式

日期具有默认表示形式,但是您可能需要以特定格式打印日期。在这种情况下,您可以使用strftime()方法获得自定义的字符串表示形式。

strftime() 需要一个字符串模式来说明如何格式化日期。

EG:

print today.strftime('We are the %d, %b %Y')
>>> 'We are the 22, Nov 2008'

a之后的所有字母"%"代表某种格式:

  • %d 是天数
  • %m 是月份号
  • %b 是月份的缩写
  • %y 是年份的后两位数字
  • %Y 是整年

等等

查看官方文档McCutchen的快速参考资料,您可能一无所知

PEP3101开始,每个对象都可以具有自己的格式,该格式可以由任何字符串的方法格式自动使用。对于日期时间,格式与strftime中使用的格式相同。因此,您可以像上面这样做:

print "We are the {:%d, %b %Y}".format(today)
>>> 'We are the 22, Nov 2008'

这种形式的优点是您还可以同时转换其他对象。
引入了格式化字符串文字(自Python 3.6,2016-12-23起),可以这样写:

import datetime
f"{datetime.datetime.now():%Y-%m-%d}"
>>> '2017-06-15'

本土化

如果您以正确的方式使用日期,日期会自动适应当地的语言和文化,但这有点复杂。也许是关于SO(堆栈溢出)的另一个问题;-)

The WHY: dates are objects

In Python, dates are objects. Therefore, when you manipulate them, you manipulate objects, not strings or timestamps.

Any object in Python has TWO string representations:

  • The regular representation, used by “print”, can be obtained with the str() function. It is usually the most common human-readable format and is used to ease display. So str(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you '2008-11-22 19:53:42'.

  • The alternative representation, used to represent the nature of the object (as data). It can be obtained with the repr() function and is handy for knowing what kind of data you’re manipulating while you are developing or debugging. repr(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you 'datetime.datetime(2008, 11, 22, 19, 53, 42)'.

What happened is that when you printed the date using “print”, it used str(), so you saw a nice date string. But when you printed mylist, you printed a list of objects, and Python represented each element of the list using repr().
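
A short sketch of that behaviour, using fixed dates so the output is reproducible. The variable names are illustrative.

```python
import datetime

mylist = [datetime.date(2008, 11, 22), datetime.date(2008, 11, 23)]

# str() of a list uses repr() for each element.
as_list = str(mylist)

# Converting each element with str() gives the human-readable form.
joined = ", ".join(str(d) for d in mylist)

print(as_list)
print(joined)
```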

The How: what do you want to do with that?

Well, when you manipulate dates, keep using the date objects all the way through. They have plenty of useful methods, and most Python APIs expect dates to be objects.

When you want to display them, just use str(). In Python, the good practice is to explicitly cast everything. So just when it’s time to print, get a string representation of your date using str(date).

One last thing. When you tried to print the dates, you printed mylist. If you want to print a date, you must print the date objects, not their container (the list).

E.g., to print all the dates in a list:

for date in mylist:
    print(str(date))

Note that in that specific case, you can even omit str() because print will use it for you. But it should not become a habit :-)

Practical case, using your code

import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist[0])  # print the date object, not the container ;-)
# 2008-11-22

# It's better to always use str() because:

print("This is a new day : ", mylist[0])  # will work
# This is a new day :  2008-11-22

print("This is a new day : " + mylist[0])  # will crash
# TypeError: can only concatenate str (not "datetime.date") to str

print("This is a new day : " + str(mylist[0]))
# This is a new day : 2008-11-22

Advanced date formatting

Dates have a default representation, but you may want to print them in a specific format. In that case, you can get a custom string representation using the strftime() method.

strftime() expects a string pattern explaining how you want to format your date.

E.g.:

print(today.strftime('We are the %d, %b %Y'))
# We are the 22, Nov 2008

Each letter after a "%" stands for a format of some part of the date:

  • %d is the day number
  • %m is the month number
  • %b is the month abbreviation
  • %y is the last two digits of the year
  • %Y is the full year

etc.

Have a look at the official documentation or McCutchen’s quick reference; you can’t know them all.
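
A quick sketch of a few of these directives on a fixed date. Only numeric directives are used here, so the result does not depend on the locale. The variable names are illustrative.

```python
import datetime

day = datetime.date(2008, 11, 22)

european = day.strftime("%d/%m/%Y")      # day/month/full year
short = day.strftime("%y-%m-%d")         # two-digit year
via_format = "{:%Y-%m-%d}".format(day)   # same directives through str.format

print(european)
print(short)
print(via_format)
```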

Since PEP 3101, every object can define its own format, which is used automatically by the format method of any string. For datetime objects, the format is the same one used in strftime. So you can do the same as above like this:

print("We are the {:%d, %b %Y}".format(today))
# We are the 22, Nov 2008

The advantage of this form is that you can also convert other objects at the same time.
With the introduction of formatted string literals (since Python 3.6, 2016-12-23), this can be written as:

import datetime
print(f"{datetime.datetime.now():%Y-%m-%d}")
# 2017-06-15

Localization

Dates can automatically adapt to the local language and culture if you use them the right way, but it’s a bit complicated. Maybe for another question on SO(Stack Overflow) ;-)


回答 1

import datetime
print datetime.datetime.now().strftime("%Y-%m-%d %H:%M")

编辑:

在Cees建议之后,我也开始使用时间:

import time
print time.strftime("%Y-%m-%d %H:%M")
import datetime
print datetime.datetime.now().strftime("%Y-%m-%d %H:%M")

Edit:

Following Cees’s suggestion, I have started using time as well:

import time
print time.strftime("%Y-%m-%d %H:%M")

回答 2

date,datetime和time对象均支持strftime(format)方法,以在显式格式字符串的控制下创建表示时间的字符串。

这是格式代码及其指令和含义的列表。

    %a  Locale's abbreviated weekday name.
    %A  Locale's full weekday name.
    %b  Locale's abbreviated month name.
    %B  Locale's full month name.
    %c  Locale's appropriate date and time representation.
    %d  Day of the month as a decimal number [01,31].
    %f  Microsecond as a decimal number [0,999999], zero-padded on the left
    %H  Hour (24-hour clock) as a decimal number [00,23].
    %I  Hour (12-hour clock) as a decimal number [01,12].
    %j  Day of the year as a decimal number [001,366].
    %m  Month as a decimal number [01,12].
    %M  Minute as a decimal number [00,59].
    %p  Locale's equivalent of either AM or PM.
    %S  Second as a decimal number [00,61].
    %U  Week number of the year (Sunday as the first day of the week)
    %w  Weekday as a decimal number [0(Sunday),6].
    %W  Week number of the year (Monday as the first day of the week)
    %x  Locale's appropriate date representation.
    %X  Locale's appropriate time representation.
    %y  Year without century as a decimal number [00,99].
    %Y  Year with century as a decimal number.
    %z  UTC offset in the form +HHMM or -HHMM.
    %Z  Time zone name (empty string if the object is naive).
    %%  A literal '%' character.

这就是我们可以使用Python中的datetime和time模块来做的事情

    import time
    import datetime

    print "Time in seconds since the epoch: %s" %time.time()
    print "Current date and time: ", datetime.datetime.now()
    print "Or like this: ", datetime.datetime.now().strftime("%y-%m-%d-%H-%M")


    print "Current year: ", datetime.date.today().strftime("%Y")
    print "Month of year: ", datetime.date.today().strftime("%B")
    print "Week number of the year: ", datetime.date.today().strftime("%W")
    print "Weekday of the week: ", datetime.date.today().strftime("%w")
    print "Day of year: ", datetime.date.today().strftime("%j")
    print "Day of the month : ", datetime.date.today().strftime("%d")
    print "Day of week: ", datetime.date.today().strftime("%A")

这将打印出如下内容:

    Time in seconds since the epoch:    1349271346.46
    Current date and time:              2012-10-03 15:35:46.461491
    Or like this:                       12-10-03-15-35
    Current year:                       2012
    Month of year:                      October
    Week number of the year:            40
    Weekday of the week:                3
    Day of year:                        277
    Day of the month :                  03
    Day of week:                        Wednesday

The date, datetime, and time objects all support a strftime(format) method, to create a string representing the time under the control of an explicit format string.

Here is a list of the format codes with their directive and meaning.

    %a  Locale’s abbreviated weekday name.
    %A  Locale’s full weekday name.      
    %b  Locale’s abbreviated month name.     
    %B  Locale’s full month name.
    %c  Locale’s appropriate date and time representation.   
    %d  Day of the month as a decimal number [01,31].    
    %f  Microsecond as a decimal number [0,999999], zero-padded on the left
    %H  Hour (24-hour clock) as a decimal number [00,23].    
    %I  Hour (12-hour clock) as a decimal number [01,12].    
    %j  Day of the year as a decimal number [001,366].   
    %m  Month as a decimal number [01,12].   
    %M  Minute as a decimal number [00,59].      
    %p  Locale’s equivalent of either AM or PM.
    %S  Second as a decimal number [00,61].
    %U  Week number of the year (Sunday as the first day of the week)
    %w  Weekday as a decimal number [0(Sunday),6].   
    %W  Week number of the year (Monday as the first day of the week)
    %x  Locale’s appropriate date representation.    
    %X  Locale’s appropriate time representation.    
    %y  Year without century as a decimal number [00,99].    
    %Y  Year with century as a decimal number.   
    %z  UTC offset in the form +HHMM or -HHMM.
    %Z  Time zone name (empty string if the object is naive).    
    %%  A literal '%' character.

This is what we can do with the datetime and time modules in Python

    import time
    import datetime

    print "Time in seconds since the epoch: %s" %time.time()
    print "Current date and time: ", datetime.datetime.now()
    print "Or like this: ", datetime.datetime.now().strftime("%y-%m-%d-%H-%M")


    print "Current year: ", datetime.date.today().strftime("%Y")
    print "Month of year: ", datetime.date.today().strftime("%B")
    print "Week number of the year: ", datetime.date.today().strftime("%W")
    print "Weekday of the week: ", datetime.date.today().strftime("%w")
    print "Day of year: ", datetime.date.today().strftime("%j")
    print "Day of the month : ", datetime.date.today().strftime("%d")
    print "Day of week: ", datetime.date.today().strftime("%A")

That will print out something like this:

    Time in seconds since the epoch:    1349271346.46
    Current date and time:              2012-10-03 15:35:46.461491
    Or like this:                       12-10-03-15-35
    Current year:                       2012
    Month of year:                      October
    Week number of the year:            40
    Weekday of the week:                3
    Day of year:                        277
    Day of the month :                  03
    Day of week:                        Wednesday
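The listing above uses Python 2 print statements. A rough Python 3 equivalent, assuming a fixed timestamp taken from the sample output so the values can be checked, and limited to locale-independent codes:

```python
import datetime

# Fixed timestamp matching the sample output above.
stamp = datetime.datetime(2012, 10, 3, 15, 35, 46)

fields = {
    "Current year": stamp.strftime("%Y"),
    "Week number of the year": stamp.strftime("%W"),
    "Weekday of the week": stamp.strftime("%w"),
    "Day of year": stamp.strftime("%j"),
    "Day of the month": stamp.strftime("%d"),
}

# Left-align the labels in a fixed-width column, as in the sample output.
for label, value in fields.items():
    print("{:<26}{}".format(label + ":", value))
```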

Answer 3

Use date.strftime. The formatting arguments are described in the documentation.

This one is what you wanted:

some_date.strftime('%Y-%m-%d')

This one takes Locale into account. (do this)

some_date.strftime('%c')

Answer 4

This is shorter:

>>> import time
>>> time.strftime("%Y-%m-%d %H:%M")
'2013-11-19 09:38'

Answer 5

import datetime

# convert date time to regular format.

d_date = datetime.datetime.now()
reg_format_date = d_date.strftime("%Y-%m-%d %I:%M:%S %p")
print(reg_format_date)

# some other date formats.
reg_format_date = d_date.strftime("%d %B %Y %I:%M:%S %p")
print(reg_format_date)
reg_format_date = d_date.strftime("%Y-%m-%d %H:%M:%S")
print(reg_format_date)

OUTPUT

2016-10-06 01:21:34 PM
06 October 2016 01:21:34 PM
2016-10-06 13:21:34

Answer 6

Or even

from datetime import datetime, date

"{:%d.%m.%Y}".format(datetime.now())

Out: ‘25.12.2013’

or

"{} - {:%d.%m.%Y}".format("Today", datetime.now())

Out: ‘Today - 25.12.2013’

"{:%A}".format(date.today())

Out: ‘Wednesday’

'{}__{:%Y.%m.%d__%H-%M}.log'.format(__name__, datetime.now())

Out: ‘__main____2014.06.09__16-56.log’


Answer 7

Simple answer –

datetime.date.today().isoformat()
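A short sketch of what isoformat() produces, using fixed values (illustrative only) so the output is predictable. The timespec argument assumes Python 3.6 or later.

```python
import datetime

day = datetime.date(2013, 12, 25)
moment = datetime.datetime(2013, 12, 25, 9, 38, 5)

date_text = day.isoformat()
full_text = moment.isoformat()
# sep replaces the "T" separator; timespec trims the precision.
short_text = moment.isoformat(sep=" ", timespec="minutes")

print(date_text)   # 2013-12-25
print(full_text)   # 2013-12-25T09:38:05
print(short_text)  # 2013-12-25 09:38
```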

Answer 8

Using type-specific datetime formatting (see nk9’s answer using str.format()) in a formatted string literal (available since Python 3.6, released 2016-12-23):

>>> import datetime
>>> f"{datetime.datetime.now():%Y-%m-%d}"
'2017-06-15'

The date/time format directives are not documented as part of the Format String Syntax but rather in the strftime() documentation for date, datetime, and time. They are based on the 1989 C Standard, but include some ISO 8601 directives since Python 3.6.


Answer 9

You need to convert the date time object to a string.

The following code worked for me:

import datetime
collection = []
dateTimeString = str(datetime.date.today())
collection.append(dateTimeString)
print collection

Let me know if you need any more help.


Answer 10

You can do:

mylist.append(str(today))

Answer 11

I dislike importing extra modules just for convenience. I would rather work with the module already in use, which in this case is datetime, than bring in the time module as well.

>>> import datetime
>>> a = datetime.datetime(2015, 4, 1, 11, 23, 22)
>>> a.strftime('%Y-%m-%d %H:%M')
'2015-04-01 11:23'

Answer 12

Considering the fact you asked for something simple to do what you wanted, you could just:

import datetime
str(datetime.date.today())

Answer 13

For those wanting locale-based date and not including time, use:

>>> some_date.strftime('%x')
'07/11/2019'

Answer 14

You may want to append it as a string?

import datetime 
mylist = [] 
today = str(datetime.date.today())
mylist.append(today) 
print mylist

Answer 15

Since print today shows what you want, the today object’s __str__ method already returns the string you are looking for.

So you can do mylist.append(today.__str__()) as well.
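To make the difference concrete, here is a small sketch. Printing a list shows the repr() of each element, while print on the object itself uses str(). The fixed date is an illustrative value, and str(today) is the more common spelling of today.__str__().

```python
import datetime

today = datetime.date(2008, 11, 22)  # fixed value for a reproducible example

as_objects = [today]       # the list stores the date object itself
as_strings = [str(today)]  # the list stores the formatted text

print(as_objects)  # [datetime.date(2008, 11, 22)]
print(as_strings)  # ['2008-11-22']
```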


Answer 16

You can use easy_date to make it easy:

import date_converter
my_date = date_converter.date_to_string(today, '%Y-%m-%d')

Answer 17

A quick disclaimer for my answer – I’ve only been learning Python for about 2 weeks, so I am by no means an expert; therefore, my explanation may not be the best and I may use incorrect terminology. Anyway, here it goes.

I noticed in your code that when you declared your variable today = datetime.date.today() you chose to name your variable with the name of a built-in function.

When your next line of code mylist.append(today) appended your list, it appended the entire string datetime.date.today(), which you had previously set as the value of your today variable, rather than just appending today().

A simple solution, albeit maybe not one most coders would use when working with the datetime module, is to change the name of your variable.

Here’s what I tried:

import datetime
mylist = []
present = datetime.date.today()
mylist.append(present)
print present

and it prints yyyy-mm-dd.


Answer 18

Here is how to display the date as (year/month/day) :

from datetime import datetime
now = datetime.now()

print '%s/%s/%s' % (now.year, now.month, now.day)
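One thing to be aware of with this approach: the year, month, and day attributes are plain integers, so single-digit months and days are not zero-padded. A sketch comparing it with strftime, using an illustrative fixed datetime:

```python
from datetime import datetime

now = datetime(2018, 6, 3, 14, 5)  # illustrative fixed value

unpadded = "%s/%s/%s" % (now.year, now.month, now.day)
padded = now.strftime("%Y/%m/%d")

print(unpadded)  # 2018/6/3
print(padded)    # 2018/06/03
```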

Answer 19

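from datetime import date

# Hyphens are not allowed in Python identifiers, so the function is named time_format.
def time_format():
  return str(date.today())

print(time_format())

This prints the date in ISO format, for example 2018-06-23, if that’s what you want :)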


Answer 20

import datetime
import time

months = ["Unknown","January","February","March","April","May","June","July","August","September","October","November","December"]
datetimeWrite = (time.strftime("%d-%m-%Y "))
date = time.strftime("%d")
month= time.strftime("%m")
choices = {'01': 'Jan', '02':'Feb','03':'Mar','04':'Apr','05':'May','06': 'Jun','07':'Jul','08':'Aug','09':'Sep','10':'Oct','11':'Nov','12':'Dec'}
result = choices.get(month, 'default')
year = time.strftime("%Y")
Date = date+"-"+result+"-"+year
print Date

In this way you can get Date formatted like this example: 22-Jun-2017
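The same layout can probably be reached with a single strftime call, since %b already yields the abbreviated month name. This is a sketch under the assumption that the process runs in an English or C locale; in other locales %b gives a localized abbreviation, which the lookup table above avoids.

```python
import datetime

day = datetime.date(2017, 6, 22)  # illustrative fixed value

# %d = zero-padded day, %b = locale's abbreviated month name, %Y = four-digit year
formatted = day.strftime("%d-%b-%Y")
print(formatted)  # 22-Jun-2017 in an English or C locale
```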


Answer 21

I don’t fully understand the question, but you can use pandas to get times in the right format:

>>> import pandas as pd
>>> pd.to_datetime('now')
Timestamp('2018-10-07 06:03:30')
>>> print(pd.to_datetime('now'))
2018-10-07 06:03:47
>>> pd.to_datetime('now').date()
datetime.date(2018, 10, 7)
>>> print(pd.to_datetime('now').date())
2018-10-07
>>> 

And:

>>> l=[]
>>> l.append(pd.to_datetime('now').date())
>>> l
[datetime.date(2018, 10, 7)]
>>> map(str,l)
<map object at 0x0000005F67CCDF98>
>>> list(map(str,l))
['2018-10-07']

This stores strings, but they are easy to convert back:

>>> l=list(map(str,l))
>>> list(map(pd.to_datetime,l))
[Timestamp('2018-10-07 00:00:00')]

How do I check the version of a Python module?

Question: How do I check the version of a Python module?

I just installed the python modules: construct and statlib with setuptools like this:

# Install setuptools to be able to download the following
sudo apt-get install python-setuptools

# Install statlib for lightweight statistical tools
sudo easy_install statlib

# Install construct for packing/unpacking binary data
sudo easy_install construct

I want to be able to (programmatically) check their versions. Is there an equivalent to python --version I can run from the command line?

My python version is 2.7.3.


Answer 0

I suggest using pip in place of easy_install. With pip, you can list all installed packages and their versions with

pip freeze

On most Linux systems, you can pipe this to grep (or findstr on Windows) to find the row for the particular package you’re interested in:

Linux:
$ pip freeze | grep lxml
lxml==2.3

Windows:
c:\> pip freeze | findstr lxml
lxml==2.3

For an individual module, you can try the __version__ attribute, however there are modules without it:

$ python -c "import requests; print(requests.__version__)"
2.14.2
$ python -c "import lxml; print(lxml.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute '__version__'

Lastly, as the commands in your question are prefixed with sudo, it appears you’re installing to the global Python environment. I strongly advise looking into Python virtual environment managers, for example virtualenvwrapper.


Answer 1

You can try

>>> import statlib
>>> print statlib.__version__

>>> import construct
>>> print construct.__version__

Answer 2

Use the pkg_resources module distributed with the setuptools library. Note that the string that you pass to the get_distribution method should correspond to the PyPI entry.

>>> import pkg_resources
>>> pkg_resources.get_distribution("construct").version
'2.5.2'

and if you want to run it from the command line you can do:

python -c "import pkg_resources; print(pkg_resources.get_distribution('construct').version)"

Note that the string that you pass to the get_distribution method should be the package name as registered in PyPI, not the module name that you are trying to import.

Unfortunately these aren’t always the same (e.g. you do pip install memcached, but import memcache).


Answer 3

I think this can help: pip show is a built-in pip subcommand (no extra package needs to be installed), and you can use it to find the version.

# to get a package's version, run the command below
pip show YOUR_PACKAGE_NAME | grep Version

Answer 4

A better way to do that is:

For the details of a specific package:

pip show <package_name>

It lists the package name, version, author, location, etc.


$ pip show numpy
Name: numpy
Version: 1.13.3
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://www.numpy.org
Author: NumPy Developers
Author-email: numpy-discussion@python.org
License: BSD
Location: c:\users\prowinjvm\appdata\local\programs\python\python36\lib\site-packages
Requires:

For more Details: >>> pip help


pip should be up to date for this to work:

pip install --upgrade pip

On Windows, the recommended command is:

python -m pip install --upgrade pip


Answer 5

In Python 3, with parentheses around the print argument:

>>> import celery
>>> print(celery.__version__)
3.1.14

Answer 6

module.__version__ is a good first thing to try, but it doesn’t always work.

If you don’t want to shell out, and you’re using pip 8 or 9, you can still use pip.get_installed_distributions() to get versions from within Python:

update: the solution here works in pip 8 and 9, but in pip 10 the function has been moved from pip.get_installed_distributions to pip._internal.utils.misc.get_installed_distributions to explicitly indicate that it’s not for external use. It’s not a good idea to rely on it if you’re using pip 10+.

import pip

pip.get_installed_distributions()  # -> [distribute 0.6.16 (...), ...]

[
    pkg.key + ': ' + pkg.version
    for pkg in pip.get_installed_distributions()
    if pkg.key in ['setuptools', 'statlib', 'construct']
] # -> nicely filtered list of ['setuptools: 3.3', ...]
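Since pip's internals are not meant for external use, a similar filtered listing can be sketched with the standard library instead. This assumes Python 3.8+ (importlib.metadata); the helper name installed_versions is hypothetical, and name matching is a simple case-insensitive comparison that does not normalize hyphens and underscores.

```python
from importlib.metadata import distributions


def installed_versions(names):
    """Return {distribution name: version} for the requested names that are installed."""
    wanted = {name.lower() for name in names}
    found = {}
    for dist in distributions():
        metadata = dist.metadata
        # Guard against distributions with missing or broken metadata.
        name = metadata.get("Name") if metadata is not None else None
        if name and name.lower() in wanted:
            found[name.lower()] = dist.version
    return found


print(installed_versions(["setuptools", "statlib", "construct"]))
```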

Answer 7

The previous answers did not solve my problem, but this code did:

import sys 
for name, module in sorted(sys.modules.items()): 
  if hasattr(module, '__version__'): 
    print name, module.__version__ 
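The loop above uses a Python 2 print statement. A Python 3 sketch of the same idea follows; the function name loaded_module_versions is hypothetical, and it only reports modules that have already been imported in the current process.

```python
import sys


def loaded_module_versions():
    """Map each already-imported module that defines __version__ to that value."""
    versions = {}
    # Copy the items first so imports triggered during the loop cannot break iteration.
    for name, module in sorted(list(sys.modules.items())):
        try:
            version = getattr(module, "__version__", None)
        except Exception:
            # Some lazily loaded modules can raise on attribute access.
            version = None
        if version is not None:
            versions[name] = version
    return versions


for name, version in loaded_module_versions().items():
    print(name, version)
```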

Answer 8

You can use importlib_metadata library for this.

If you’re on python <3.8, first install it with:

pip install importlib_metadata

Since Python 3.8 it is included in the standard library as importlib.metadata.

Then, to check a package’s version (in this example lxml) run:

>>> from importlib_metadata import version
>>> version('lxml')
'4.3.1'

Keep in mind that this works only for installed distributions that ship metadata (for example, anything installed with pip). Also, you must pass the distribution name to the version function, rather than the name of a module that the distribution provides (although they are usually the same).


Answer 9

Use dir() to find out if the module has a __version__ attribute at all.

>>> import selenium
>>> dir(selenium)
['__builtins__', '__doc__', '__file__', '__name__',
 '__package__', '__path__', '__version__']
>>> selenium.__version__
'3.141.0'
>>> selenium.__path__
['/venv/local/lib/python2.7/site-packages/selenium']

Answer 10

If the above methods do not work, it is worth trying the following in python:

import modulename

modulename.version
modulename.version_info

See Get Python Tornado Version?

Note, the .version worked for me on a few others besides tornado as well.
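Because the attribute name varies between projects, one option is to try several candidates in turn. The sketch below is only an illustration: the candidate list and the helper name find_version are assumptions, not a convention that every package follows.

```python
import types

# Attribute names that some projects use for their version (an assumed, non-exhaustive list).
CANDIDATE_ATTRIBUTES = ("__version__", "version", "VERSION", "version_info")


def find_version(module):
    """Return (attribute_name, value) for the first version-like attribute, else None."""
    for attribute in CANDIDATE_ATTRIBUTES:
        value = getattr(module, attribute, None)
        # Skip submodules and functions that merely happen to be called "version".
        if value is None or callable(value) or isinstance(value, types.ModuleType):
            continue
        return attribute, value
    return None


# Demonstration with a stand-in module object rather than a real package.
demo = types.ModuleType("demo")
demo.version = "4.5.1"
print(find_version(demo))  # ('version', '4.5.1')
```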


Answer 11

First add python and pip to your environment variables so that you can run them from the command prompt. Then start the interpreter with the python command and import the package:

>>> import scrapy

Then print the version:

>>> print(scrapy.__version__)

This works for any package that defines __version__.


Answer 12

Assuming we are using Jupyter Notebook (if using Terminal, drop the exclamation marks):

1) if the package (e.g. xgboost) was installed with pip:

!pip show xgboost
!pip freeze | grep xgboost
!pip list | grep xgboost

2) if the package (e.g. caffe) was installed with conda:

!conda list caffe

Answer 13

Some modules don’t have a __version__ attribute, so the easiest way is to check in the terminal: pip list


Answer 14

Python 3.8 added a new metadata module to the importlib package, which can do that as well.

Here is an example from docs:

>>> from importlib.metadata import version
>>> version('requests')
'2.22.0'
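If the distribution is not installed, version() raises PackageNotFoundError. A minimal sketch of handling that case, assuming Python 3.8+; the wrapper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


print(installed_version("requests"))
print(installed_version("no-such-distribution-xyz-12345"))  # None
```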

Answer 15

I suggest opening a Python shell in a terminal (using the Python version you are interested in), importing the library, and getting its __version__ attribute.

>>> import statlib
>>> statlib.__version__

>>> import construct
>>> construct.__version__

Note 1: We must regard the python version. If we have installed different versions of python, we have to open the terminal in the python version we are interested in. For example, opening the terminal with python3.8 can (surely will) give a different version of a library than opening with python3.5 or python2.7.

Note 2: We avoid using the print function, because its behavior depends on python2 or python3. We do not need it, the terminal will show the value of the expression.


Answer 16

This works in Jupyter Notebook on Windows, too! As long as Jupyter is launched from a bash-compliant command line such as Git Bash (MingW64), the solutions given in many of the answers can be used in Jupyter Notebook on Windows systems with one tiny tweak.

I’m running windows 10 Pro with Python installed via Anaconda, and the following code works when I launch Jupyter via Git Bash (but does not when I launch from the Anaconda prompt).

The tweak: Add an exclamation mark (!) in front of pip to make it !pip.

>>>!pip show lxml | grep Version
Version: 4.1.0

>>>!pip freeze | grep lxml
lxml==4.1.0

>>>!pip list | grep lxml
lxml                               4.1.0                  

>>>!pip show lxml
Name: lxml
Version: 4.1.0
Summary: Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
Home-page: http://lxml.de/
Author: lxml dev team
Author-email: lxml-dev@lxml.de
License: BSD
Location: c:\users\karls\anaconda2\lib\site-packages
Requires: 
Required-by: jupyter-contrib-nbextensions

Answer 17

Quick Python program to list all packages (you can copy the output to requirements.txt):

from pip._internal.utils.misc import get_installed_distributions
print_log = ''
for module in sorted(get_installed_distributions(), key=lambda x: x.key): 
    print_log +=  module.key + '~=' + module.version  + '\n'
print(print_log)

The output would look like:

asn1crypto~=0.24.0
attrs~=18.2.0
automat~=0.7.0
beautifulsoup4~=4.7.1
botocore~=1.12.98

Answer 18

(see also https://stackoverflow.com/a/56912280/7262247)

I found it quite unreliable to use the various tools available (including the best one, pkg_resources, mentioned in Jakub Kukul’s answer), as most of them do not cover all cases. For example:

  • built-in modules
  • modules not installed but just added to the python path (by your IDE for example)
  • two versions of the same module available (one in python path superseding the one installed)

Since we needed a reliable way to get the version of any package, module or submodule, I ended up writing getversion. It is quite simple to use:

from getversion import get_module_version
import foo
version, details = get_module_version(foo)

See the documentation for details.


Answer 19

To get a list of non-standard (pip) modules imported in the current module:

[{pkg.key : pkg.version} for pkg in pip.get_installed_distributions() 
   if pkg.key in set(sys.modules) & set(globals())]

Result:

>>> import sys, pip, nltk, bs4
>>> [{pkg.key : pkg.version} for pkg in pip.get_installed_distributions() if pkg.key in set(sys.modules) & set(globals())]
[{'pip': '9.0.1'}, {'nltk': '3.2.1'}, {'bs4': '0.0.1'}]

Note:

This code was put together from solutions both on this page and from How to list imported modules?


Answer 20

Building on Jakub Kukul’s answer I found a more reliable way to solve this problem.

The main problem with that approach is that it requires the packages to be installed “conventionally” (and that does not include using pip install --user), or to be in the system PATH at Python initialisation.

To get around that you can use pkg_resources.find_distributions(path_to_search). This basically searches for distributions that would be importable if path_to_search was in the system PATH.

We can iterate through this generator like this:

avail_modules = {}
distros = pkg_resources.find_distributions(path_to_search)
for d in distros:
    avail_modules[d.key] = d.version

This will return a dictionary having modules as keys and their version as value. This approach can be extended to a lot more than version number.

Thanks to Jakub Kukul for pointing in the right direction.


Answer 21

In Summary:

conda list   

(It will provide all the libraries along with version details).

And:

pip show tensorflow

(It gives complete library details).


回答 22

为Windows用户编写此答案。正如所有其他答案中所建议的那样,您可以使用以下语句:

import [type the module name]
print(module.__version__)      # module + '.' + double underscore + version + double underscore

但是,有些模块即使使用上述方法后也不会打印其版本。因此,您可以简单地执行以下操作:

  1. 打开命令提示符。
  2. 使用cd [ 文件地址 ] 导航到文件地址/目录,在该文件中,您已安装了python和所有支持的模块。
  3. 使用命令“ pip install [模块名称] ”,然后按Enter。
  4. 这将向您显示一条消息:“ 已满足要求:文件地址\文件夹名称(带有版本) ”。
  5. 例如,请参见下面的屏幕截图:我必须知道预安装模块的版本,名称为“ Selenium-Screenshot”。它正确显示为1.5.0:

Writing this answer for Windows users. As suggested in the other answers, you can use statements such as:

import [type the module name]
print(module.__version__)      # module + '.' + double underscore + version + double underscore

But, there are some modules which don’t print their version even after using the method above. So, what you can simply do is:

  1. Open the command prompt.
  2. Navigate to the directory where Python and its supporting modules are installed by using cd [file address].
  3. Run the command “pip install [module name]” and hit Enter.
  4. This will show you a message such as “Requirement already satisfied: file address\folder name (with version)”.
  5. For example, I needed to know the version of a pre-installed module named “Selenium-Screenshot”, and the message correctly showed it as 1.5.0.


回答 23

您可以尝试以下方法:

pip list

这将输出所有软件包及其版本。 输出量

You can try this:

pip list

This will output all the packages with their versions.


回答 24

您可以先安装类似这样的软件包,然后检查其版本

pip install package
import package
print(package.__version__)

它应该给你包的版本

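You can first install the package from a shell like this, and then check its version from Python:

pip install package    # run in a shell, not inside Python

import package         # then, in a Python session
print(package.__version__)

This should give you the package version, provided the package defines __version__.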
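Since not every package defines __version__, a hedged sketch of a fallback is shown below. It first tries the attribute and then asks importlib.metadata (standard library, Python 3.8+) for the installed distribution's version. The helper name get_version and its dist_name parameter are assumptions made for illustration.

```python
import importlib
from importlib import metadata


def get_version(module_name, dist_name=None):
    """Return a version string for an importable module, or None.

    dist_name is the name used on PyPI when it differs from the import
    name (for example, import name "bs4" vs. distribution "beautifulsoup4").
    """
    module = importlib.import_module(module_name)
    version = getattr(module, "__version__", None)
    if version is not None:
        return str(version)
    try:
        # Fall back to the metadata recorded at install time.
        return metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        return None
```

For example, get_version("bs4", "beautifulsoup4") would consult the distribution metadata if the module itself exposes no __version__.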


如何确定Python中对象的大小?

问题:如何确定Python中对象的大小?

我想知道如何在Python中获取对象的大小,例如字符串,整数等。

相关问题:Python列表(元组)中每个元素有多少个字节?

我使用的XML文件包含指定值大小的大小字段。我必须解析此XML并进行编码。当我想更改特定字段的值时,我将检查该值的大小字段。在这里,我想比较输入的新值是否与XML中的值相同。我需要检查新值的大小。如果是字符串,我可以说它的长度。但是如果是int,float等,我会感到困惑。

I want to know how to get size of objects like a string, integer, etc. in Python.

Related question: How many bytes per element are there in a Python list (tuple)?

I am using an XML file which contains size fields that specify the size of a value. I must parse this XML and do my coding. When I want to change the value of a particular field, I will check the size field of that value. Here I want to compare whether the new value that I’m going to enter is of the same size as in the XML. I need to check the size of the new value. In the case of a string I can say it’s the length. But in the case of int, float, etc. I am confused.


回答 0

只需使用模块中定义的sys.getsizeof函数即可sys

sys.getsizeof(object[, default])

返回对象的大小(以字节为单位)。该对象可以是任何类型的对象。所有内置对象都将返回正确的结果,但是对于第三方扩展,这不一定成立,因为它是特定于实现的。

default参数允许定义一个值,如果对象类型不提供检索大小的方法并导致,则将返回该值 TypeError

getsizeof__sizeof__如果对象由垃圾收集器管理,则调用该对象的 方法并添加额外的垃圾收集器开销。

用法示例,在python 3.0中:

>>> import sys
>>> x = 2
>>> sys.getsizeof(x)
24
>>> sys.getsizeof(sys.getsizeof)
32
>>> sys.getsizeof('this')
38
>>> sys.getsizeof('this also')
48

如果您使用的是python <2.6及以下版本,则sys.getsizeof可以使用此扩展模块。虽然从未使用过。

Just use the sys.getsizeof function defined in the sys module.

sys.getsizeof(object[, default]):

Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

The default argument allows to define a value which will be returned if the object type does not provide means to retrieve the size and would cause a TypeError.

getsizeof calls the object’s __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.

Usage example, in python 3.0:

>>> import sys
>>> x = 2
>>> sys.getsizeof(x)
24
>>> sys.getsizeof(sys.getsizeof)
32
>>> sys.getsizeof('this')
38
>>> sys.getsizeof('this also')
48

If you are in python < 2.6 and don’t have sys.getsizeof you can use this extensive module instead. Never used it though.
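To make concrete what sys.getsizeof does and does not count, here is a minimal sketch (the variable names are illustrative and not part of the original answer). It compares two one-element lists whose elements differ hugely in size; on CPython the two lists report the same size, because only the list object itself is measured.

```python
import sys

small = ["a"]            # list holding one short string
big = ["a" * 10_000]     # list holding one very long string

# getsizeof counts only the list object (header + item pointers),
# not the strings the list refers to.
shallow_small = sys.getsizeof(small)
shallow_big = sys.getsizeof(big)
print(shallow_small == shallow_big)

# The referenced element has to be measured separately.
element_size = sys.getsizeof(big[0])
print(element_size > 10_000)

# The optional second argument is returned only when an object
# cannot report its size; ordinary objects ignore it.
fallback = sys.getsizeof(object(), -1)
```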


回答 1

如何确定Python中对象的大小?

答案“仅使用sys.getsizeof”不是一个完整的答案。

该答案确实直接适用于内置对象,但没有考虑这些对象可能包含的内容,特别是不包含哪些类型,例如自定义对象,元组,列表,字典和集合所包含的类型。它们可以互相包含实例,以及数字,字符串和其他对象。

更完整的答案

使用Anaconda发行版中的64位Python 3.6和sys.getsizeof,我确定了以下对象的最小大小,并请注意set和dict预分配了空间,因此空的对象直到设定的数量后才再次增长。因语言的实现而异):

Python 3:

Empty
Bytes  type        scaling notes
28     int         +4 bytes about every 30 powers of 2
37     bytes       +1 byte per additional byte
49     str         +1-4 per additional character (depending on max width)
48     tuple       +8 per additional item
64     list        +8 for each additional
224    set         5th increases to 736; 21st, 2272; 85th, 8416; 341st, 32992
240    dict        6th increases to 368; 22nd, 1184; 43rd, 2280; 86th, 4704; 171st, 9320
136    func def    does not include default args and other attrs
1056   class def   no slots 
56     class inst  has a __dict__ attr, same scaling as dict above
888    class def   with slots
16     __slots__   seems to store in mutable tuple-like structure
                   first slot grows to 48, and so on.

您如何解释呢?好吧,说您有一套10件物品。如果每个项目都是100字节,那么整个数据结构有多大?该集合本身为736,因为它的大小增加了一倍,达到736字节。然后,添加项目的大小,因此总计1736字节

有关函数和类定义的一些警告:

请注意,每个类定义都有一个__dict__用于类attrs 的代理(48字节)结构。每个插槽property在类定义中都有一个描述符(如)。

开槽实例在其第一个元素上以48个字节开头,并且每增加一个字节就增加8个字节。只有空的带槽对象具有16个字节,而没有数据的实例意义不大。

此外,每个函数定义都有代码对象(可能是文档字符串)和其他可能的属性,甚至是__dict__

还要注意,我们sys.getsizeof()之所以使用,是因为我们关心的是边际空间使用情况,其中包括docs中对象的垃圾回收开销:

__sizeof__如果对象是由垃圾收集器管理的,则getsizeof()调用对象的方法并增加额外的垃圾收集器开销。

还要注意,调整列表的大小(例如重复添加到列表中)会使它们预先分配空间,类似于集合和字典。从listobj.c源代码

    /* This over-allocates proportional to the list size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
     * Note: new_allocated won't overflow because the largest possible value
     *       is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.
     */
    new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);

历史数据

Python 2.7分析,通过guppy.hpy和确认sys.getsizeof

Bytes  type        empty + scaling notes
24     int         NA
28     long        NA
37     str         + 1 byte per additional character
52     unicode     + 4 bytes per additional character
56     tuple       + 8 bytes per additional item
72     list        + 32 for first, 8 for each additional
232    set         sixth item increases to 744; 22nd, 2280; 86th, 8424
280    dict        sixth item increases to 1048; 22nd, 3352; 86th, 12568 *
120    func def    does not include default args and other attrs
64     class inst  has a __dict__ attr, same scaling as dict above
16     __slots__   class with slots has no dict, seems to store in 
                   mutable tuple-like structure.
904    class def   has a proxy __dict__ structure for class attrs
104    old class   makes sense, less stuff, has real dict though.

请注意,字典(而非集合)在Python 3.6中得到了更紧凑的表示形式

我认为在64位计算机上,每个附加项目要引用8个字节是很有意义的。这8个字节指向所包含项在内存中的位置。如果我没记错的话,Python 2的unicode的4个字节是固定宽度的,但是在Python 3中,str变成的unicode的宽度等于字符的最大宽度。

(有关插槽的更多信息,请参见此答案

更完整的功能

我们需要一个功能来搜索列表,元组,集合,字典,obj.__dict__‘s和中的元素obj.__slots__,以及我们可能尚未想到的其他内容。

我们希望依靠gc.get_referents此搜索,因为它可以在C级别上运行(使其变得非常快)。缺点是get_referents可以返回冗余成员,因此我们需要确保不会重复计算。

类,模块和函数是单例-它们在内存中存在一次。我们对它们的大小不太感兴趣,因为我们对此无能为力-它们是程序的一部分。因此,如果碰巧引用了它们,我们将避免计算它们。

我们将使用类型的黑名单,因此我们不将整个程序包括在我们的大小计数中。

import sys
from types import ModuleType, FunctionType
from gc import get_referents

# Custom objects know their class.
# Function objects seem to know way too much, including modules.
# Exclude modules as well.
BLACKLIST = type, ModuleType, FunctionType


def getsize(obj):
    """sum size of object & members."""
    if isinstance(obj, BLACKLIST):
        raise TypeError('getsize() does not take argument of type: '+ str(type(obj)))
    seen_ids = set()
    size = 0
    objects = [obj]
    while objects:
        need_referents = []
        for obj in objects:
            if not isinstance(obj, BLACKLIST) and id(obj) not in seen_ids:
                seen_ids.add(id(obj))
                size += sys.getsizeof(obj)
                need_referents.append(obj)
        objects = get_referents(*need_referents)
    return size

为了与下面的白名单功能形成对比,大多数对象都知道如何遍历自身以进行垃圾回收(当我们想知道某些对象在内存中有多昂贵时,这正是我们要寻找的东西。gc.get_referents。)但是,如果我们不谨慎的话,这一措施的范围将比我们预期的要广泛得多。

例如,函数对创建它们的模块非常了解。

另一个对比点是,字典中作为键的字符串通常会被保留,因此不会重复。检查id(key)还将使我们避免计算重复项,这将在下一部分中进行。黑名单解决方案会跳过对全部为字符串的键的计数。

白名单类型,递归访问者(旧的实现)

为了亲自涵盖其中的大多数类型,我编写了此递归函数以尝试估算大多数Python对象的大小,包括大多数内建函数,集合模块中的类型以及自定义类型(有槽或其他),而不是依赖于gc模块。 。

这种功能可以对要计算内存使用情况的类型进行更细粒度的控制,但存在将类型排除在外的危险:

import sys
from numbers import Number
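from collections import deque
try:  # Python 3.3+ (the ABC aliases in `collections` were removed in 3.10)
    from collections.abc import Set, Mapping
except ImportError:  # Python 2
    from collections import Set, Mapping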

try: # Python 2
    zero_depth_bases = (basestring, Number, xrange, bytearray)
    iteritems = 'iteritems'
except NameError: # Python 3
    zero_depth_bases = (str, bytes, Number, range, bytearray)
    iteritems = 'items'

def getsize(obj_0):
    """Recursively iterate to sum size of object & members."""
    _seen_ids = set()
    def inner(obj):
        obj_id = id(obj)
        if obj_id in _seen_ids:
            return 0
        _seen_ids.add(obj_id)
        size = sys.getsizeof(obj)
        if isinstance(obj, zero_depth_bases):
            pass # bypass remaining control flow and return
        elif isinstance(obj, (tuple, list, Set, deque)):
            size += sum(inner(i) for i in obj)
        elif isinstance(obj, Mapping) or hasattr(obj, iteritems):
            size += sum(inner(k) + inner(v) for k, v in getattr(obj, iteritems)())
        # Check for custom object instances - may subclass above too
        if hasattr(obj, '__dict__'):
            size += inner(vars(obj))
        if hasattr(obj, '__slots__'): # can have __slots__ with __dict__
            size += sum(inner(getattr(obj, s)) for s in obj.__slots__ if hasattr(obj, s))
        return size
    return inner(obj_0)

我相当随意地测试了它(我应该对其进行单元测试):

>>> getsize(['a', tuple('bcd'), Foo()])
344
>>> getsize(Foo())
16
>>> getsize(tuple('bcd'))
194
>>> getsize(['a', tuple('bcd'), Foo(), {'foo': 'bar', 'baz': 'bar'}])
752
>>> getsize({'foo': 'bar', 'baz': 'bar'})
400
>>> getsize({})
280
>>> getsize({'foo':'bar'})
360
>>> getsize('foo')
40
>>> class Bar():
...     def baz():
...         pass
>>> getsize(Bar())
352
>>> getsize(Bar().__dict__)
280
>>> sys.getsizeof(Bar())
72
>>> getsize(Bar.__dict__)
872
>>> sys.getsizeof(Bar.__dict__)
280

此实现违反了类定义和函数定义,因为我们没有使用它们的所有属性,但是由于它们在该进程的内存中应该只存在一次,因此它们的大小实际上并没有太大关系。

How do I determine the size of an object in Python?

The answer, “Just use sys.getsizeof” is not a complete answer.

That answer does work for builtin objects directly, but it does not account for what those objects may contain, specifically, what types such as custom objects, tuples, lists, dicts, and sets contain. They can contain instances of each other, as well as numbers, strings and other objects.

A More Complete Answer

Using 64 bit Python 3.6 from the Anaconda distribution, with sys.getsizeof, I have determined the minimum size of the following objects, and note that sets and dicts preallocate space so empty ones don’t grow again until after a set amount (which may vary by implementation of the language):

Python 3:

Empty
Bytes  type        scaling notes
28     int         +4 bytes about every 30 powers of 2
37     bytes       +1 byte per additional byte
49     str         +1-4 per additional character (depending on max width)
48     tuple       +8 per additional item
64     list        +8 for each additional
224    set         5th increases to 736; 21st, 2272; 85th, 8416; 341st, 32992
240    dict        6th increases to 368; 22nd, 1184; 43rd, 2280; 86th, 4704; 171st, 9320
136    func def    does not include default args and other attrs
1056   class def   no slots 
56     class inst  has a __dict__ attr, same scaling as dict above
888    class def   with slots
16     __slots__   seems to store in mutable tuple-like structure
                   first slot grows to 48, and so on.

How do you interpret this? Well, say you have a set with 10 items in it. If each item is 100 bytes, how big is the whole data structure? The set itself is 736 bytes because it has sized up one time to 736 bytes. Then you add the size of the items (10 × 100 = 1000 bytes), so that’s 1736 bytes in total.
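The arithmetic above (container size plus the sizes of the items it holds) can be sketched directly with sys.getsizeof. The sample set below is made up for illustration, and the exact byte counts depend on the Python version and build, so none are shown.

```python
import sys

# Ten distinct strings stored in a set (illustrative data).
items = {str(i) * 10 for i in range(10)}

container = sys.getsizeof(items)                    # the set's own table
contents = sum(sys.getsizeof(x) for x in items)     # the ten strings
total = container + contents

print(container, contents, total)
```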

Some caveats for function and class definitions:

Note each class definition has a proxy __dict__ (48 bytes) structure for class attrs. Each slot has a descriptor (like a property) in the class definition.

Slotted instances start out with 48 bytes on their first element, and increase by 8 each additional. Only empty slotted objects have 16 bytes, and an instance with no data makes very little sense.

Also, each function definition has code objects, maybe docstrings, and other possible attributes, even a __dict__.

Also note that we use sys.getsizeof() because we care about the marginal space usage, which includes the garbage collection overhead for the object, from the docs:

getsizeof() calls the object’s __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.

Also note that resizing lists (e.g. repetitively appending to them) causes them to preallocate space, similarly to sets and dicts. From the listobj.c source code:

    /* This over-allocates proportional to the list size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
     * Note: new_allocated won't overflow because the largest possible value
     *       is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.
     */
    new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
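The over-allocation described in that comment can be observed from Python by watching when sys.getsizeof changes during repeated appends. This is a small sketch with a hypothetical helper name, growth_points; the sizes it reports vary between CPython versions.

```python
import sys


def growth_points(n):
    """Append n items to a list and record (length, size) each time
    the list's reported size changes, i.e. each time it reallocates."""
    lst = []
    last = sys.getsizeof(lst)
    points = []
    for _ in range(n):
        lst.append(None)
        size = sys.getsizeof(lst)
        if size != last:
            points.append((len(lst), size))
            last = size
    return points


if __name__ == "__main__":
    for length, size in growth_points(100):
        print(length, size)
```

Because spare capacity is reserved on each resize, far fewer size changes are recorded than appends performed.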

Historical data

Python 2.7 analysis, confirmed with guppy.hpy and sys.getsizeof:

Bytes  type        empty + scaling notes
24     int         NA
28     long        NA
37     str         + 1 byte per additional character
52     unicode     + 4 bytes per additional character
56     tuple       + 8 bytes per additional item
72     list        + 32 for first, 8 for each additional
232    set         sixth item increases to 744; 22nd, 2280; 86th, 8424
280    dict        sixth item increases to 1048; 22nd, 3352; 86th, 12568 *
120    func def    does not include default args and other attrs
64     class inst  has a __dict__ attr, same scaling as dict above
16     __slots__   class with slots has no dict, seems to store in 
                   mutable tuple-like structure.
904    class def   has a proxy __dict__ structure for class attrs
104    old class   makes sense, less stuff, has real dict though.

Note that dictionaries (but not sets) got a more compact representation in Python 3.6

I think 8 bytes per additional item to reference makes a lot of sense on a 64 bit machine. Those 8 bytes point to the place in memory the contained item is at. The 4 bytes are fixed width for unicode in Python 2, if I recall correctly, but in Python 3, str becomes a unicode of width equal to the max width of the characters.

(And for more on slots, see this answer )

A More Complete Function

We want a function that searches the elements in lists, tuples, sets, dicts, obj.__dict__'s, and obj.__slots__, as well as other things we may not have yet thought of.

We want to rely on gc.get_referents to do this search because it works at the C level (making it very fast). The downside is that get_referents can return redundant members, so we need to ensure we don’t double count.

Classes, modules, and functions are singletons – they exist one time in memory. We’re not so interested in their size, as there’s not much we can do about them – they’re a part of the program. So we’ll avoid counting them if they happen to be referenced.

We’re going to use a blacklist of types so we don’t include the entire program in our size count.

import sys
from types import ModuleType, FunctionType
from gc import get_referents

# Custom objects know their class.
# Function objects seem to know way too much, including modules.
# Exclude modules as well.
BLACKLIST = type, ModuleType, FunctionType


def getsize(obj):
    """sum size of object & members."""
    if isinstance(obj, BLACKLIST):
        raise TypeError('getsize() does not take argument of type: '+ str(type(obj)))
    seen_ids = set()
    size = 0
    objects = [obj]
    while objects:
        need_referents = []
        for obj in objects:
            if not isinstance(obj, BLACKLIST) and id(obj) not in seen_ids:
                seen_ids.add(id(obj))
                size += sys.getsizeof(obj)
                need_referents.append(obj)
        objects = get_referents(*need_referents)
    return size

To contrast this with the following whitelisted function: most objects know how to traverse themselves for the purposes of garbage collection, which is approximately what we’re looking for when we want to know how expensive in memory certain objects are. This functionality is used by gc.get_referents. However, this measure is going to be much more expansive in scope than we intended if we are not careful.

For example, functions know quite a lot about the modules they are created in.

Another point of contrast is that strings that are keys in dictionaries are usually interned so they are not duplicated. Checking for id(key) will also allow us to avoid counting duplicates, which we do in the next section. The blacklist solution skips counting keys that are strings altogether.
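The remark that gc.get_referents can return redundant members, and that ids must be tracked to avoid double counting, can be illustrated with a short sketch. The objects below are placeholders chosen for the example.

```python
import gc

a = object()
b = object()
container = [a, b, a]   # `a` is referenced twice

# get_referents reports every reference the list holds, so `a` shows up twice.
refs = gc.get_referents(container)
print(len(refs))

# De-duplicating by id() leaves one entry per distinct object.
unique = {id(r): r for r in refs}
print(len(unique))
```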

Whitelisted Types, Recursive visitor (old implementation)

To cover most of these types myself, instead of relying on the gc module, I wrote this recursive function to try to estimate the size of most Python objects, including most builtins, types in the collections module, and custom types (slotted and otherwise).

This sort of function gives much more fine-grained control over the types we’re going to count for memory usage, but has the danger of leaving types out:

import sys
from numbers import Number
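from collections import deque
try:  # Python 3.3+ (the ABC aliases in `collections` were removed in 3.10)
    from collections.abc import Set, Mapping
except ImportError:  # Python 2
    from collections import Set, Mapping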

try: # Python 2
    zero_depth_bases = (basestring, Number, xrange, bytearray)
    iteritems = 'iteritems'
except NameError: # Python 3
    zero_depth_bases = (str, bytes, Number, range, bytearray)
    iteritems = 'items'

def getsize(obj_0):
    """Recursively iterate to sum size of object & members."""
    _seen_ids = set()
    def inner(obj):
        obj_id = id(obj)
        if obj_id in _seen_ids:
            return 0
        _seen_ids.add(obj_id)
        size = sys.getsizeof(obj)
        if isinstance(obj, zero_depth_bases):
            pass # bypass remaining control flow and return
        elif isinstance(obj, (tuple, list, Set, deque)):
            size += sum(inner(i) for i in obj)
        elif isinstance(obj, Mapping) or hasattr(obj, iteritems):
            size += sum(inner(k) + inner(v) for k, v in getattr(obj, iteritems)())
        # Check for custom object instances - may subclass above too
        if hasattr(obj, '__dict__'):
            size += inner(vars(obj))
        if hasattr(obj, '__slots__'): # can have __slots__ with __dict__
            size += sum(inner(getattr(obj, s)) for s in obj.__slots__ if hasattr(obj, s))
        return size
    return inner(obj_0)

And I tested it rather casually (I should unittest it):

>>> getsize(['a', tuple('bcd'), Foo()])
344
>>> getsize(Foo())
16
>>> getsize(tuple('bcd'))
194
>>> getsize(['a', tuple('bcd'), Foo(), {'foo': 'bar', 'baz': 'bar'}])
752
>>> getsize({'foo': 'bar', 'baz': 'bar'})
400
>>> getsize({})
280
>>> getsize({'foo':'bar'})
360
>>> getsize('foo')
40
>>> class Bar():
...     def baz():
...         pass
>>> getsize(Bar())
352
>>> getsize(Bar().__dict__)
280
>>> sys.getsizeof(Bar())
72
>>> getsize(Bar.__dict__)
872
>>> sys.getsizeof(Bar.__dict__)
280

This implementation breaks down on class definitions and function definitions because we don’t go after all of their attributes, but since they should only exist once in memory for the process, their size really doesn’t matter too much.


回答 2

Pympler封装的asizeof模块可以做到这一点。

用法如下:

from pympler import asizeof
asizeof.asizeof(my_object)

sys.getsizeof与之不同,它适用于您自己创建的对象。它甚至可以与numpy一起使用。

>>> asizeof.asizeof(tuple('bcd'))
200
>>> asizeof.asizeof({'foo': 'bar', 'baz': 'bar'})
400
>>> asizeof.asizeof({})
280
>>> asizeof.asizeof({'foo':'bar'})
360
>>> asizeof.asizeof('foo')
40
>>> asizeof.asizeof(Bar())
352
>>> asizeof.asizeof(Bar().__dict__)
280
>>> A = rand(10)
>>> B = rand(10000)
>>> asizeof.asizeof(A)
176
>>> asizeof.asizeof(B)
80096

正如提到的

可以通过设置option来包含类,函数,方法,模块等对象的(字节)代码大小code=True

如果您需要其他有关实时数据的视图,Pympler的

该模块muppy用于对Python应用程序进行在线监视,该模块Class Tracker提供对所选Python对象生命周期的离线分析。

The Pympler package’s asizeof module can do this.

Use as follows:

from pympler import asizeof
asizeof.asizeof(my_object)

Unlike sys.getsizeof, it works for your self-created objects. It even works with numpy.

>>> asizeof.asizeof(tuple('bcd'))
200
>>> asizeof.asizeof({'foo': 'bar', 'baz': 'bar'})
400
>>> asizeof.asizeof({})
280
>>> asizeof.asizeof({'foo':'bar'})
360
>>> asizeof.asizeof('foo')
40
>>> asizeof.asizeof(Bar())
352
>>> asizeof.asizeof(Bar().__dict__)
280
>>> A = rand(10)
>>> B = rand(10000)
>>> asizeof.asizeof(A)
176
>>> asizeof.asizeof(B)
80096

As mentioned,

The (byte)code size of objects like classes, functions, methods, modules, etc. can be included by setting option code=True.

And if you need other view on live data, Pympler’s

module muppy is used for on-line monitoring of a Python application and module Class Tracker provides off-line analysis of the lifetime of selected Python objects.


回答 3

对于numpy数组,getsizeof它不起作用-对于我来说,由于某种原因它总是返回40:

from pylab import *
from sys import getsizeof
A = rand(10)
B = rand(10000)

然后(在ipython中):

In [64]: getsizeof(A)
Out[64]: 40

In [65]: getsizeof(B)
Out[65]: 40

令人高兴的是:

In [66]: A.nbytes
Out[66]: 80

In [67]: B.nbytes
Out[67]: 80000

For numpy arrays, getsizeof doesn’t work – for me it always returns 40 for some reason:

from pylab import *
from sys import getsizeof
A = rand(10)
B = rand(10000)

Then (in ipython):

In [64]: getsizeof(A)
Out[64]: 40

In [65]: getsizeof(B)
Out[65]: 40

Happily, though:

In [66]: A.nbytes
Out[66]: 80

In [67]: B.nbytes
Out[67]: 80000

回答 4

这可能比看起来要复杂得多,具体取决于您要如何计算事物。例如,如果您有一个整数列表,您是否想要包含整数引用的列表的大小?(即仅列出,而不列出其中的内容),还是要包括指向的实际数据,在这种情况下,您需要处理重复的引用,以及当两个对象包含对引用的引用时如何防止重复计算同一对象。

您可能想看看其中一种python内存分析器,例如pysizer,看看它们是否满足您的需求。

This can be more complicated than it looks, depending on how you want to count things. For instance, if you have a list of ints, do you want the size of the list containing the references to the ints (i.e. the list only, not what is contained in it)? Or do you want to include the actual data pointed to? In the latter case you need to deal with duplicate references, and with how to prevent double-counting when two objects contain references to the same object.

You may want to take a look at one of the python memory profilers, such as pysizer to see if they meet your needs.
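The double-counting problem described above can be shown with a tiny sketch. The data is illustrative: a list that refers to the same large string three times.

```python
import sys

shared = "x" * 1000
data = [shared, shared, shared]   # three references to one object

# Naive total: the shared string is counted three times.
naive = sys.getsizeof(data) + sum(sys.getsizeof(i) for i in data)

# De-duplicated total: each distinct object (by id) is counted once.
unique = {id(i): i for i in data}
deduped = sys.getsizeof(data) + sum(sys.getsizeof(i) for i in unique.values())

print(naive, deduped)
```

The difference between the two totals is exactly two extra copies of the shared string's size.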


回答 5

Raymond Hettinger 在此宣布sys.getsizeof,Python 3.8(2019年第一季度)将更改的某些结果:

在64位版本中,Python容器要小8字节。

tuple ()  48 -> 40
list  []  64 -> 56
set()    224 -> 216
dict  {} 240 -> 232

这是在问题33597Inada Naoki(methane围绕Compact PyGC_Head和PR 7043开展的工作之后

这个想法将PyGC_Head的大小减少到两个单词

目前,PyGC_Head包含三个词gc_prevgc_nextgc_refcnt

  • gc_refcnt 收集时用于尝试删除。
  • gc_prev 用于跟踪和取消跟踪。

因此,如果我们可以避免在尝试删除时进行跟踪/取消跟踪,gc_prev并且gc_refcnt可以共享相同的内存空间。

参见commit d5c875b

Py_ssize_t从中删除一名成员PyGC_Head
所有GC跟踪的对象(例如,元组,列表,字典)的大小都减少了4或8个字节。

Python 3.8 (released October 2019) changed some of the results of sys.getsizeof, as announced here by Raymond Hettinger:

Python containers are 8 bytes smaller on 64-bit builds.

tuple ()  48 -> 40
list  []  64 -> 56
set()    224 -> 216
dict  {} 240 -> 232

This comes after issue 33597 and Inada Naoki (methane)'s work around Compact PyGC_Head, and PR 7043.

This idea reduces PyGC_Head size to two words.

Currently, PyGC_Head takes three words; gc_prev, gc_next, and gc_refcnt.

  • gc_refcnt is used when collecting, for trial deletion.
  • gc_prev is used for tracking and untracking.

So if we can avoid tracking/untracking while trial deletion, gc_prev and gc_refcnt can share same memory space.

See commit d5c875b:

Removed one Py_ssize_t member from PyGC_Head.
All GC tracked objects (e.g. tuple, list, dict) size is reduced 4 or 8 bytes.
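To check which of the figures above applies to a given interpreter, a quick sketch like the following prints the sizes of empty containers on the running Python. The helper name empty_container_sizes is an assumption for illustration, and no particular output is implied since it depends on version and platform.

```python
import sys


def empty_container_sizes():
    """Return sys.getsizeof for a few empty built-in containers."""
    return {
        "tuple": sys.getsizeof(()),
        "list": sys.getsizeof([]),
        "set": sys.getsizeof(set()),
        "dict": sys.getsizeof({}),
    }


if __name__ == "__main__":
    print(sys.version_info[:2], empty_container_sizes())
```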


回答 6

我本人多次遇到此问题,然后写了一个小函数(受@ aaron-hall的启发)和测试,实现了sys.getsizeof的期望:

https://github.com/bosswissam/pysize

如果您对背景故事感兴趣,请在这里

编辑:附加下面的代码,以方便参考。要查看最新代码,请检查github链接。

    import sys

    def get_size(obj, seen=None):
        """Recursively finds size of objects"""
        size = sys.getsizeof(obj)
        if seen is None:
            seen = set()
        obj_id = id(obj)
        if obj_id in seen:
            return 0
        # Important mark as seen *before* entering recursion to gracefully handle
        # self-referential objects
        seen.add(obj_id)
        if isinstance(obj, dict):
            size += sum([get_size(v, seen) for v in obj.values()])
            size += sum([get_size(k, seen) for k in obj.keys()])
        elif hasattr(obj, '__dict__'):
            size += get_size(obj.__dict__, seen)
        elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
            size += sum([get_size(i, seen) for i in obj])
        return size

Having run into this problem many times myself, I wrote up a small function (inspired by @aaron-hall’s answer) & tests that does what I would have expected sys.getsizeof to do:

https://github.com/bosswissam/pysize

If you’re interested in the backstory, here it is

EDIT: Attaching the code below for easy reference. To see the most up-to-date code, please check the github link.

    import sys

    def get_size(obj, seen=None):
        """Recursively finds size of objects"""
        size = sys.getsizeof(obj)
        if seen is None:
            seen = set()
        obj_id = id(obj)
        if obj_id in seen:
            return 0
        # Important mark as seen *before* entering recursion to gracefully handle
        # self-referential objects
        seen.add(obj_id)
        if isinstance(obj, dict):
            size += sum([get_size(v, seen) for v in obj.values()])
            size += sum([get_size(k, seen) for k in obj.keys()])
        elif hasattr(obj, '__dict__'):
            size += get_size(obj.__dict__, seen)
        elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
            size += sum([get_size(i, seen) for i in obj])
        return size

回答 7

这是我根据先前的答案编写的一个快速脚本,用于列出所有变量的大小

import sys

for i in dir():
    print(i, sys.getsizeof(eval(i)))

Here is a quick script I wrote based on the previous answers to list sizes of all variables

import sys

for i in dir():
    print(i, sys.getsizeof(eval(i)))
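A variant of the script above that avoids eval is sketched here: it reads names and values straight from a namespace dictionary such as globals(). The function name variable_sizes is hypothetical, and the sizes are still shallow sys.getsizeof values.

```python
import sys


def variable_sizes(namespace):
    """Return (name, shallow size) pairs for a namespace dict,
    largest first."""
    pairs = ((name, sys.getsizeof(value)) for name, value in namespace.items())
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, size in variable_sizes(dict(globals())):
        print(name, size)
```

Copying globals() with dict(...) before iterating avoids errors if the namespace changes during the loop.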

回答 8

您可以序列化对象以得出与对象大小密切相关的度量:

import pickle

## let o be the object, whose size you want to measure
size_estimate = len(pickle.dumps(o))

如果您要测量无法腌制的对象(例如,由于lambda表达式),则可以使用混浊解决方案。

You can serialize the object to derive a measure that is closely related to the size of the object:

import pickle

## let o be the object, whose size you want to measure
size_estimate = len(pickle.dumps(o))

If you want to measure objects that cannot be pickled (e.g. because of lambda expressions) cloudpickle can be a solution.
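It may help to see the pickle-based estimate next to sys.getsizeof figures for the same object, since the serialized length measures the encoded form rather than the in-memory footprint. The sample object below is made up for illustration.

```python
import pickle
import sys

o = list(range(1000))

# Length of the serialized byte string.
pickled_size = len(pickle.dumps(o))

# Shallow size of the list object, and a simple deep size that
# adds the int objects it refers to.
shallow_size = sys.getsizeof(o)
deep_size = shallow_size + sum(sys.getsizeof(i) for i in o)

print(pickled_size, shallow_size, deep_size)
```

The three numbers generally differ, so the pickle length is best read as a relative indicator rather than a byte-accurate memory figure.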


回答 9

如果不想包含链接(嵌套)对象的大小,请使用sys.getsizeof()

但是,如果您要计算嵌套在列表,字典,集合,元组中的子对象(通常这就是您要查找的内容),请使用递归的deep sizeof()函数,如下所示:

import sys
def sizeof(obj):
    size = sys.getsizeof(obj)
    if isinstance(obj, dict): return size + sum(map(sizeof, obj.keys())) + sum(map(sizeof, obj.values()))
    if isinstance(obj, (list, tuple, set, frozenset)): return size + sum(map(sizeof, obj))
    return size

您还可以在漂亮的工具箱中找到此功能,以及许多其他有用的单行代码:

https://github.com/mwojnars/nifty/blob/master/util.py

Use sys.getsizeof() if you DON’T want to include sizes of linked (nested) objects.

However, if you want to count sub-objects nested in lists, dicts, sets, tuples – and usually THIS is what you’re looking for – use the recursive deep sizeof() function as shown below:

import sys
def sizeof(obj):
    size = sys.getsizeof(obj)
    if isinstance(obj, dict): return size + sum(map(sizeof, obj.keys())) + sum(map(sizeof, obj.values()))
    if isinstance(obj, (list, tuple, set, frozenset)): return size + sum(map(sizeof, obj))
    return size

You can also find this function in the nifty toolbox, together with many other useful one-liners:

https://github.com/mwojnars/nifty/blob/master/util.py


回答 10

如果您不需要对象的确切大小,但大致了解对象的大小,一种快速(又脏)的方法是让程序运行,睡眠较长时间并检查内存使用情况(例如:Mac的活动监视器)通过此特定的python进程。当您尝试在python进程中查找单个大对象的大小时,这将是有效的。例如,我最近想检查一个新数据结构的内存使用情况,并将其与Python的set数据结构进行比较。首先,我将元素(大型公共领域书中的单词)写到一个集合中,然后检查流程的大小,然后对其他数据结构执行相同的操作。我发现一组Python进程占用的内存是新数据结构的两倍。再一次,你不会 不能准确地说出进程使用的内存等于对象的大小。随着对象的大小变大,与要监视的对象的大小相比,该过程的其余部分所消耗的内存可以忽略不计,这变得接近。

If you don’t need the exact size of the object but roughly to know how big it is, one quick (and dirty) way is to let the program run, sleep for an extended period of time, and check the memory usage (ex: Mac’s activity monitor) by this particular python process. This would be effective when you are trying to find the size of one single large object in a python process. For example, I recently wanted to check the memory usage of a new data structure and compare it with that of Python’s set data structure. First I wrote the elements (words from a large public domain book) to a set, then checked the size of the process, and then did the same thing with the other data structure. I found out the Python process with a set is taking twice as much memory as the new data structure. Again, you wouldn’t be able to exactly say the memory used by the process is equal to the size of the object. As the size of the object gets large, this becomes close as the memory consumed by the rest of the process becomes negligible compared to the size of the object you are trying to monitor.
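A related way to get a rough figure from inside the process, rather than from an external monitor, is the standard library's tracemalloc module. This is a different technique from the one described above, sketched under the assumption that the object is built by a single callable; the helper name allocated_bytes is hypothetical.

```python
import tracemalloc


def allocated_bytes(build):
    """Call build() and return (object, bytes allocated by Python
    while building it), as reported by tracemalloc."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    obj = build()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return obj, after - before


if __name__ == "__main__":
    words, used = allocated_bytes(lambda: set(range(100_000)))
    print(len(words), used)
```

Like the process-level approach, this measures everything allocated while the object was built, so it is an estimate and not an exact object size.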


回答 11

您可以使用如下所述的getSizeof()来确定对象的大小

import sys
str1 = "one"
int_element=5
print("Memory size of '"+str1+"' = "+str(sys.getsizeof(str1))+ " bytes")
print("Memory size of '"+ str(int_element)+"' = "+str(sys.getsizeof(int_element))+ " bytes")

You can make use of sys.getsizeof() as shown below to determine the size of an object:

import sys
str1 = "one"
int_element=5
print("Memory size of '"+str1+"' = "+str(sys.getsizeof(str1))+ " bytes")
print("Memory size of '"+ str(int_element)+"' = "+str(sys.getsizeof(int_element))+ " bytes")

回答 12

I use this trick. It may not be accurate for small objects, but I think it’s much more accurate than sys.getsizeof() for a complex object (like a pygame surface):

import pygame as pg
import os
import psutil
import time


process = psutil.Process(os.getpid())
pg.init()    
vocab = ['hello', 'me', 'you', 'she', 'he', 'they', 'we',
         'should', 'why?', 'necessarily', 'do', 'that']

font = pg.font.SysFont("monospace", 100, True)

dct = {}

newMem = process.memory_info().rss  # don't mind this line
Str = f'store ' + f'Nothing \tsurface use about '.expandtabs(15) + \
      f'0\t bytes'.expandtabs(9)  # don't mind this assignment too

usedMem = process.memory_info().rss

for word in vocab:
    dct[word] = font.render(word, True, pg.Color("#000000"))

    time.sleep(0.1)  # wait a moment

    # get total used memory of this script:
    newMem = process.memory_info().rss
    Str = f'store ' + f'{word}\tsurface use about '.expandtabs(15) + \
          f'{newMem - usedMem}\t bytes'.expandtabs(9)

    print(Str)
    usedMem = newMem

On my Windows 10 machine with Python 3.7.3, the output is:

store hello          surface use about 225280    bytes
store me             surface use about 61440     bytes
store you            surface use about 94208     bytes
store she            surface use about 81920     bytes
store he             surface use about 53248     bytes
store they           surface use about 114688    bytes
store we             surface use about 57344     bytes
store should         surface use about 172032    bytes
store why?           surface use about 110592    bytes
store necessarily    surface use about 311296    bytes
store do             surface use about 57344     bytes
store that           surface use about 110592    bytes
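
If installing psutil is not an option, a similar before/after comparison can be sketched with the standard library’s tracemalloc module. Note that this is a different technique: it tracks memory allocated through Python’s allocator rather than the RSS of the whole process, so memory allocated directly by C libraries (which may include much of a pygame surface) may not be counted. The measure_allocation helper is hypothetical, written only for this illustration.

```python
import tracemalloc


def measure_allocation(factory):
    """Return (obj, nbytes), where nbytes is the growth in traced memory
    while factory() ran. Hypothetical helper for illustration."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    obj = factory()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return obj, after - before


data, used = measure_allocation(lambda: [str(i) for i in range(100_000)])
print(f"list of {len(data)} strings uses about {used} bytes")
```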

How can I get the system hostname using Python?

Question: How can I get the system hostname using Python?

I’m writing a chat program for a local network. I would like to be able to identify computers and get the user-set computer name with Python.


Answer 0

Use socket and its gethostname() functionality. This will get the hostname of the computer where the Python interpreter is running:

import socket
print(socket.gethostname())

Answer 1

Both of these are pretty portable:

import platform
platform.node()

import socket
socket.gethostname()

Any solutions using the HOST or HOSTNAME environment variables are not portable. Even if it works on your system when you run it, it may not work when run in special environments such as cron.
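
A small sketch for comparing the two portable calls with the environment variables on a given machine. What the environment lookups print depends entirely on the system, so no output is shown:

```python
import os
import platform
import socket

# Both calls ask the operating system rather than the environment.
print("platform.node():     ", platform.node())
print("socket.gethostname():", socket.gethostname())

# os.environ.get() returns None when the variable is not set,
# for example when the script runs under cron.
for var in ("HOST", "HOSTNAME"):
    print(f"{var} env variable:", os.environ.get(var))
```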


Answer 2

You will probably load the os module anyway, so another suggestion would be:

import os
myhost = os.uname()[1]
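
Note that os.uname() is only available on Unix-like systems, so on Windows the snippet above raises AttributeError. A minimal sketch that falls back to socket.gethostname() when os.uname is missing (get_hostname is a hypothetical helper name, not part of the standard library):

```python
import os
import socket


def get_hostname():
    # os.uname exists only on Unix-like platforms.
    if hasattr(os, "uname"):
        return os.uname()[1]
    return socket.gethostname()


myhost = get_hostname()
print(myhost)
```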

Answer 3

What about:

import platform

h = platform.uname()[1]

Actually, you may want to have a look at all the results of platform.uname().


Answer 4

os.getenv('HOSTNAME') and os.environ['HOSTNAME'] don’t always work. In cron jobs and WSDL, HTTP HOSTNAME isn’t set. Use this instead:

import socket
socket.gethostbyaddr(socket.gethostname())[0]

It always (even on Windows) returns a fully qualified host name, even if you defined a short alias in /etc/hosts.

If you defined an alias in /etc/hosts then socket.gethostname() will return the alias. platform.uname()[1] does the same thing.

I ran into a case where the above didn’t work. This is what I’m using now:

import socket
hostname = socket.gethostname()
if '.' in hostname:
    # Contains a dot, so it already looks fully qualified
    name = hostname
else:
    name = socket.gethostbyaddr(hostname)[0]

It first calls gethostname to see whether it returns something that looks like a fully qualified host name (that is, it contains a dot). If not, it uses my original solution.
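
socket.gethostbyaddr() can raise an exception when the name cannot be resolved, so a slightly more defensive version of the same idea might look like the sketch below. The helper name qualified_hostname is made up here, and falling back to the unqualified name is an assumption about what is acceptable for your program. The standard library also offers socket.getfqdn(), which may be worth trying.

```python
import socket


def qualified_hostname():
    name = socket.gethostname()
    if "." in name:
        # Already looks fully qualified.
        return name
    try:
        return socket.gethostbyaddr(name)[0]
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError;
        # fall back to the short name if resolution fails.
        return name


print(qualified_hostname())
```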


Answer 5

If I’m correct, you’re looking for the socket.gethostname function:

>>> import socket
>>> socket.gethostname()
'terminus'

Answer 6

At least since Python 3.3:

You can use the field nodename and avoid using array indexing:

os.uname().nodename

Although even the documentation of os.uname suggests using socket.gethostname().


Answer 7

socket.gethostname() could do it.


Answer 8

On some systems, the hostname is set in the environment. If that is the case for you, the os module can pull it out of the environment via os.getenv. For example, if HOSTNAME is the environment variable containing what you want, the following will get it:

import os
system_name = os.getenv('HOSTNAME')

Update: As noted in the comments, this doesn’t always work, as not everyone’s environment is set up this way. I believe that at the time I initially answered this I was using this solution as it was the first thing I’d found in a web search and it worked for me at the time. Due to the lack of portability I probably wouldn’t use this now. However, I am leaving this answer for reference purposes. FWIW, it does eliminate the need for other imports if your environment has the system name and you are already importing the os module. Test it – if it doesn’t work in all the environments in which you expect your program to operate, use one of the other solutions provided.
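
One way to keep the environment lookup while guarding against the portability problem is to fall back to another source when the variable is missing. The sketch below is only an illustration: the helper name is hypothetical, and checking COMPUTERNAME (a Windows variable) and then socket.gethostname() is an assumption about a reasonable fallback order, not something from the original answer.

```python
import os
import socket


def hostname_from_env_or_socket():
    # HOSTNAME is set by some Unix shells, COMPUTERNAME by Windows;
    # either may be missing, in which case os.getenv() returns None.
    return (
        os.getenv("HOSTNAME")
        or os.getenv("COMPUTERNAME")
        or socket.gethostname()
    )


system_name = hostname_from_env_or_socket()
print(system_name)
```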


Answer 9

I needed the name of the PC for my PyLog conf file. The socket library was not available there, but the os library was.

For Windows I used:

import os
os.getenv('COMPUTERNAME', 'defaultValue')

Where defaultValue is a fallback string that is returned instead of None when the variable is not set.


Answer 10

You have to execute this line of code:

import socket
sock_name = socket.gethostname()

Then you can use the name to look up the address:

print(socket.gethostbyname(sock_name))
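
socket.gethostbyname() raises socket.gaierror when the name does not resolve, so a self-contained version might guard the lookup as sketched below. Returning None on failure is an assumption made for the example. Depending on how /etc/hosts is set up, the result may also be a loopback address rather than the machine's LAN address.

```python
import socket

sock_name = socket.gethostname()
try:
    addr = socket.gethostbyname(sock_name)
except socket.gaierror:
    # The hostname does not resolve (e.g. no DNS or hosts entry).
    addr = None

print(sock_name, "->", addr)
```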