How do I specify a newline character in Python when writing to a file?

Question: How do I specify a newline character in Python when writing to a file?

In comparison to Java (in a string), you would do something like "First Line\r\nSecond Line".

So how would you do that in Python, for purposes of writing multiple lines to a regular file?


Answer 0

It depends on how correct you want to be. \n will usually do the job. If you really want to get it right, you look up the newline character in the os package. (It’s actually called linesep.)

Note: when writing to files using the Python API, do not use the os.linesep. Just use \n; Python automatically translates that to the proper newline character for your platform.
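To make the note above concrete, here is a minimal sketch of how the newline argument of the built-in open() controls what ends up on disk. The helper names write_lines and read_raw and the temporary file names are made up for illustration; only open() and its newline parameter come from Python itself.

```python
import os
import tempfile


def write_lines(path, lines, newline=None):
    # newline=None : "\n" is translated to os.linesep on write (the default)
    # newline="\n" : "\n" is written unchanged on every platform
    # newline="\r\n": "\n" is always written as CRLF
    with open(path, "w", newline=newline) as f:
        for line in lines:
            f.write(line + "\n")


def read_raw(path):
    # Read bytes so that no newline translation hides what is on disk.
    with open(path, "rb") as f:
        return f.read()


tmp_dir = tempfile.mkdtemp()
crlf_path = os.path.join(tmp_dir, "crlf.txt")
lf_path = os.path.join(tmp_dir, "lf.txt")

write_lines(crlf_path, ["First Line", "Second Line"], newline="\r\n")
write_lines(lf_path, ["First Line", "Second Line"], newline="\n")

print(read_raw(crlf_path))
print(read_raw(lf_path))
```

So the code always writes "\n", and the choice of line ending is made once, where the file is opened.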


Answer 1

The new line character is \n. It is used inside a string.

Example:

    print('First line \n Second line') 

where \n is the newline character.

This would yield the result:

First line
 Second line

If you use Python 2, you do not use the parentheses on the print function.


Answer 2

You can either write in the new lines separately or within a single string, which is easier.

Example 1

Input

line1 = "hello how are you"
line2 = "I am testing the new line escape sequence"
line3 = "this seems to work"

You can write the ‘\n’ separately:

file.write(line1)
file.write("\n")
file.write(line2)
file.write("\n")
file.write(line3)
file.write("\n")

Output

hello how are you
I am testing the new line escape sequence
this seems to work

Example 2

Input

As others have pointed out in the previous answers, place the \n at the relevant points in your string:

line = "hello how are you\nI am testing the new line escape sequence\nthis seems to work"

file.write(line)

Output

hello how are you
I am testing the new line escape sequence
this seems to work
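The snippets in this answer assume an already opened object called file. A self-contained sketch of the same idea might look like the following; the temporary path is an assumption for the demo, and writelines() is used in place of repeated write() calls.

```python
import os
import tempfile

lines = [
    "hello how are you",
    "I am testing the new line escape sequence",
    "this seems to work",
]

# Hypothetical output location, created only for this demo.
path = os.path.join(tempfile.mkdtemp(), "out.txt")

with open(path, "w") as f:
    # writelines() adds no separators itself, so append "\n" to each line.
    f.writelines(line + "\n" for line in lines)

with open(path) as f:
    content = f.read()

print(content == "\n".join(lines) + "\n")
```

The with block closes the file even if a write fails, which the bare file.write() calls do not guarantee.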

Answer 3

If you are entering several lines of text at once, I find this to be the most readable format.

file.write("\
Life's but a walking shadow, a poor player\n\
That struts and frets his hour upon the stage\n\
And then is heard no more: it is a tale\n\
Told by an idiot, full of sound and fury,\n\
Signifying nothing.\n\
")

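The \ at the end of each source line escapes the line break itself, so it does not become part of the string; without it, a line break inside a single-quoted string literal would be a syntax error.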


Answer 4

In Python you can just use the new-line character, i.e. \n


Answer 5

Simplest solution

If you call print() without any arguments, it outputs a blank line.

print()

You can redirect the output to a file like this (considering your example):

# Python 3; the Python 2 statement form is: print >> f, 'First line'
with open('out.txt', 'w') as f:
    print('First line', file=f)
    print(file=f)  # writes just a newline
    print('Second line', file=f)

Not only is it OS-agnostic (without even having to use the os package), it’s also more readable than putting \n within strings.

Explanation

The print() function has an optional keyword argument called end, which defaults to '\n'. So when you call print('hello'), Python actually writes 'hello' + '\n'. This means that calling print() without any arguments writes '' + '\n', which results in a newline. When the target is a file opened in text mode, that '\n' is translated to the platform's line ending on write.

Alternative

Use multi-line strings.

s = """First line
Second line
Third line"""
with open('out.txt', 'w') as f:
    print(s, file=f)

Answer 6

Platform-independent line break (Linux, Windows, and iOS):

import os
keyword = 'physical'+ os.linesep + 'distancing'
print(keyword)

Output:

physical
distancing

Answer 7

The same way with '\n', though you’d probably not need the '\r'. Is there a reason you have it in your Java version? If you do need/want it, you can use it in the same way in Python too.


Answer 8

\n – simple newline character insertion works:

# Here's the test example - string with newline char:
In [36]: test_line = "Hi!!!\n testing first line.. \n testing second line.. \n and third line....."

# Output:
In [37]: print(test_line)

Hi!!!
 testing first line..
 testing second line..
 and third line.....

Answer 9

Most escape characters in string literals from Java are also valid in Python, such as “\r” and “\n”.


Answer 10

As mentioned in other answers: “The new line character is \n. It is used inside a string”.

I find the simplest and most readable way is to use the format method, with nl as the name for a newline, and to break the string you want to print into the exact layout in which it will be printed:

Python 2 (also works in Python 3):

print("line1{nl}"
      "line2{nl}"
      "line3".format(nl="\n"))

Python 3.6+ (f-strings):

nl = "\n"
print(f"line1{nl}"
      f"line2{nl}"
      f"line3")

That will output:

line1
line2
line3

This way it performs the task, and also gives high readability of the code :)


Answer 11

\n separates the lines of a string. In the following example, I keep writing the records in a loop. Each record is separated by \n.

f = open("jsonFile.txt", "w")

for row_index in range(2, sheet.nrows):

  mydict1 = {
    "PowerMeterId" : row_index + 1,
    "Service": "Electricity",
    "Building": "JTC FoodHub",
    "Floor": str(Floor),
    "Location": Location,
    "ReportType": "Electricity",
    "System": System,
    "SubSystem": "",
    "Incomer": "",
    "Category": "",
    "DisplayName": DisplayName,
    "Description": Description,
    "Tag": tag,
    "IsActive": 1,
    "DataProviderType": int(0),
    "DataTable": ""
  }
  mydict1.pop("_id", None)
  f.write(str(mydict1) + '\n')

f.close()
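One caveat: str(mydict1) writes the Python repr of the dict (single quotes), which is not valid JSON even though the file is named jsonFile.txt. If JSON is what you want, a possible sketch is to write one json.dumps() result per line. The sample records and the temporary path below are invented for the demo.

```python
import json
import os
import tempfile

# Made-up sample rows standing in for the spreadsheet data.
records = [
    {"PowerMeterId": 3, "Service": "Electricity", "IsActive": 1},
    {"PowerMeterId": 4, "Service": "Electricity", "IsActive": 1},
]

path = os.path.join(tempfile.mkdtemp(), "records.jsonl")

with open(path, "w") as f:
    for record in records:
        # One JSON document per line, each terminated by "\n".
        f.write(json.dumps(record) + "\n")

with open(path) as f:
    loaded = [json.loads(line) for line in f]

print(loaded == records)
```

Because every record ends with "\n", the file can be read back line by line without loading it all at once.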

Answer 12

Worth noting that when you inspect a string using the interactive python shell or a Jupyter notebook, the \n and other backslashed strings like \t are rendered literally:

>>> gotcha = 'Here is some random message...'
>>> gotcha += '\nAdditional content:\n\t{}'.format('Yet even more great stuff!')
>>> gotcha
'Here is some random message...\nAdditional content:\n\tYet even more great stuff!'

The newlines, tabs, and other special non-printed characters are rendered as whitespace only when printed, or written to a file:

>>> print('{}'.format(gotcha))
Here is some random message...
Additional content:
    Yet even more great stuff!
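The same distinction can be checked without an interactive prompt: the shell shows repr() of the value, while print() writes the string itself. A small sketch, with a shortened message chosen for the demo:

```python
gotcha = "Here is some random message...\nAdditional content:\n\tmore"

shown = repr(gotcha)    # what the interactive shell echoes
printed = str(gotcha)   # what print() or file.write() emits

print(shown)
print(printed)
print(len("\n"))        # the escape sequence is a single character
```

In other words, the backslash sequences are only a way of spelling the characters in source code and in repr output; the string in memory contains real newline and tab characters.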

Why is IoC / DI not common in Python?

Question: Why is IoC / DI not common in Python?

In Java, IoC / DI is a very common practice that is used extensively in web applications, nearly all available frameworks, and Java EE. On the other hand, there are also lots of big Python web applications, but besides Zope (which I’ve heard is really horrible to code for), IoC doesn’t seem to be very common in the Python world. (Please name some examples if you think that I’m wrong.)

There are of course several clones of popular Java IoC frameworks available for Python, springpython for example. But none of them seems to be used in practice. At least, I’ve never stumbled upon a Django or sqlalchemy+<insert your favorite wsgi toolkit here> based web application that uses something like that.

In my opinion IoC has reasonable advantages and would make it easy to replace the django-default-user-model for example, but extensive usage of interface classes and IoC in Python looks a bit odd and not »pythonic«. But maybe someone has a better explanation, why IoC isn’t widely used in Python.


Answer 0

I don’t actually think that DI/IoC are that uncommon in Python. What is uncommon, however, are DI/IoC frameworks/containers.

Think about it: what does a DI container do? It allows you to

  1. wire together independent components into a complete application …
  2. … at runtime.

We have names for “wiring together” and “at runtime”:

  1. scripting
  2. dynamic

So, a DI container is nothing but an interpreter for a dynamic scripting language. Actually, let me rephrase that: a typical Java/.NET DI container is nothing but a crappy interpreter for a really bad dynamic scripting language with butt-ugly, sometimes XML-based, syntax.

When you program in Python, why would you want to use an ugly, bad scripting language when you have a beautiful, brilliant scripting language at your disposal? Actually, that’s a more general question: when you program in pretty much any language, why would you want to use an ugly, bad scripting language when you have Jython and IronPython at your disposal?

So, to recap: the practice of DI/IoC is just as important in Python as it is in Java, for exactly the same reasons. The implementation of DI/IoC however, is built into the language and often so lightweight that it completely vanishes.
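As a rough illustration of "so lightweight that it completely vanishes", here is a sketch of constructor injection in plain Python. All class names (SmtpMailer, RecordingMailer, SignupService) are hypothetical and not taken from any library; the wiring is just ordinary top-level code.

```python
class SmtpMailer:
    """Hypothetical production dependency."""

    def send(self, to, body):
        raise RuntimeError("would talk to a real mail server")


class RecordingMailer:
    """Hypothetical stand-in with the same send() signature (duck typing)."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class SignupService:
    def __init__(self, mailer):
        # The dependency is injected; the class never constructs it itself.
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "Welcome!")
        return email


# The "container": plain code that wires the parts together at runtime.
mailer = RecordingMailer()
service = SignupService(mailer)
service.register("ada@example.com")
print(mailer.sent)
```

Swapping RecordingMailer() for SmtpMailer() in the wiring code is the whole "configuration"; no interface declaration or XML file is involved.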

(Here’s a brief aside for an analogy: in assembly, a subroutine call is a pretty major deal – you have to save your local variables and registers to memory, save your return address somewhere, change the instruction pointer to the subroutine you are calling, arrange for it to somehow jump back into your subroutine when it is finished, put the arguments somewhere where the callee can find them, and so on. IOW: in assembly, “subroutine call” is a Design Pattern, and before there were languages like Fortran which had subroutine calls built in, people were building their own “subroutine frameworks”. Would you say that subroutine calls are “uncommon” in Python, just because you don’t use subroutine frameworks?)

BTW: for an example of what it looks like to take DI to its logical conclusion, take a look at Gilad Bracha‘s Newspeak Programming Language and his writings on the subject:


Answer 1

Part of it is the way the module system works in Python. You can get a sort of “singleton” for free, just by importing it from a module. Define an actual instance of an object in a module, and then any client code can import it and actually get a working, fully constructed / populated object.

This is in contrast to Java, where you don’t import actual instances of objects. This means you are always having to instantiate them yourself, (or use some sort of IoC/DI style approach). You can mitigate the hassle of having to instantiate everything yourself by having static factory methods (or actual factory classes), but then you still incur the resource overhead of actually creating new ones each time.


Answer 2

IoC and DI are super common in mature Python code. You just don’t need a framework to implement DI thanks to duck typing.

The best example is how you set up a Django application using settings.py:

# settings.py
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': REDIS_URL + '/1',
    },
    'local': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'snowflake',
    }
}

Django Rest Framework utilizes DI heavily:

class FooView(APIView):
    # The "injected" dependencies:
    permission_classes = (IsAuthenticated, )
    throttle_classes = (ScopedRateThrottle, )
    parser_classes = (parsers.FormParser, parsers.JSONParser, parsers.MultiPartParser)
    renderer_classes = (renderers.JSONRenderer,)

    def get(self, request, *args, **kwargs):
        pass

    def post(self, request, *args, **kwargs):
        pass

Let me remind you (source):

“Dependency Injection” is a 25-dollar term for a 5-cent concept. […] Dependency injection means giving an object its instance variables. […].


Answer 3

Django makes great use of inversion of control. For instance, the database server is selected by the configuration file, then the framework provides appropriate database wrapper instances to database clients.

The difference is that Python has first-class types. Data types, including classes, are themselves objects. If you want something to use a particular class, simply name the class. For example:

if config_dbms_name == 'postgresql':
    import psycopg
    self.database_interface = psycopg.connect  # store the callable that opens a connection, not the module
elif config_dbms_name == 'mysql':
    ...

Later code can then create a database interface by writing:

my_db_connection = self.database_interface()
# Do stuff with database.

Instead of the boilerplate factory functions that Java and C++ need, Python does it with one or two lines of ordinary code. This is the strength of functional versus imperative programming.
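A sketch of the "simply name the class" idea that runs without any database driver: classes are ordinary objects, so a dictionary can map a configuration value to the class to instantiate. PostgresBackend, MySQLBackend, BACKENDS, and make_backend are hypothetical names used only for illustration.

```python
class PostgresBackend:
    """Hypothetical backend; a real one would wrap a database driver."""
    name = "postgresql"


class MySQLBackend:
    """Hypothetical backend with the same (empty) constructor."""
    name = "mysql"


# Classes are first-class objects, so they can be dictionary values.
BACKENDS = {
    "postgresql": PostgresBackend,
    "mysql": MySQLBackend,
}


def make_backend(config_dbms_name):
    backend_class = BACKENDS[config_dbms_name]  # look the class up by name
    return backend_class()                      # then call it like any function


db = make_backend("mysql")
print(type(db).__name__)
```

The lookup table plays the role that a factory class or container configuration would play elsewhere.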


Answer 4

It seems that people really don’t get what dependency injection and inversion of control mean anymore.

The practice of inversion of control is to have classes or functions that depend on other classes or functions, but instead of creating the instances within the class or function code, they receive them as parameters, so that loose coupling can be achieved. That has many benefits, such as better testability and adherence to the Liskov substitution principle.

You see, by working with interfaces and injection, your code becomes more maintainable, since you can change behavior easily: you won’t have to rewrite a single line of your class (maybe a line or two in the DI configuration) to change its behavior, because the classes that implement the interface your class expects can vary independently as long as they follow the interface. One of the best strategies for keeping code decoupled and easy to maintain is to follow at least the single responsibility, substitution, and dependency inversion principles.

What’s a DI library good for if you can instantiate an object yourself inside a package and import it to inject it yourself? The chosen answer is right: since Java has no procedural sections (code outside of classes), all of that goes into boring configuration XML files, hence the need for a class that instantiates and injects dependencies in a lazy-loading fashion so you don’t ruin your performance, while in Python you just code the injections in the “procedural” (code outside classes) sections of your code.


Answer 5

Haven’t used Python in several years, but I would say that it has more to do with it being a dynamically typed language than anything else. For a simple example, in Java, if I wanted to test that something wrote to standard out appropriately I could use DI and pass in any PrintStream to capture the text being written and verify it. When I’m working in Ruby, however, I can dynamically replace the ‘puts’ method on STDOUT to do the verify, leaving DI completely out of the picture. If the only reason I’m creating an abstraction is to test the class that’s using it (think File system operations or the clock in Java) then DI/IoC creates unnecessary complexity in the solution.
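The answer's example is in Java and Ruby; a rough Python counterpart of both approaches might look like this. The report function is made up for the demo, while io.StringIO and contextlib.redirect_stdout are standard library.

```python
import contextlib
import io
import sys


def report(total, out=None):
    # DI style: the caller may pass any object that has a write() method.
    out = out if out is not None else sys.stdout
    out.write("total: {}\n".format(total))


# 1) Inject a fake stream and inspect it afterwards.
fake = io.StringIO()
report(3, out=fake)

# 2) No injection: temporarily swap sys.stdout instead.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    report(4)

print(repr(fake.getvalue()), repr(captured.getvalue()))
```

The second form is why test-only abstractions are often unnecessary in a dynamic language, although the first keeps the dependency visible in the function signature.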


Answer 6

Actually, it is quite easy to write sufficiently clean and compact code with DI (I wonder whether it will be/stay pythonic then, but anyway :) ). For example, I actually prefer this way of coding:

def polite(name_str):
    return "dear " + name_str

def rude(name_str):
    return name_str + ", you, moron"

def greet(name_str, call=polite):
    # `call` is the injected dependency; it defaults to polite
    print("Hello, " + call(name_str) + "!")

>>> greet("Peter")
Hello, dear Peter!
>>> greet("Jack", rude)
Hello, Jack, you, moron!

Yes, this can be viewed as just a simple form of parameterizing functions/classes, but it does its work. So, maybe Python’s default-included batteries are enough here too.

P.S. I have also posted a larger example of this naive approach at Dynamically evaluating simple boolean logic in Python.


Answer 7

IoC/DI is a design concept, but unfortunately it’s often taken as a concept that applies to certain languages (or typing systems). I’d love to see dependency injection containers become far more popular in Python. There’s Spring, but that’s a super-framework and seems to be a direct port of the Java concepts without much consideration for “The Python Way.”

Given Annotations in Python 3, I decided to have a crack at a full featured, but simple, dependency injection container: https://github.com/zsims/dic . It’s based on some concepts from a .NET dependency injection container (which IMO is fantastic if you’re ever playing in that space), but mutated with Python concepts.


Answer 8

I think due to the dynamic nature of python people don’t often see the need for another dynamic framework. When a class inherits from the new-style ‘object’ you can create a new variable dynamically (https://wiki.python.org/moin/NewClassVsClassicClass).

i.e. In plain python:

#application.py
class Application(object):
    def __init__(self):
        pass

#main.py
Application.postgres_connection = PostgresConnection()

#other.py
postgres_connection = Application.postgres_connection
db_data = postgres_connection.fetchone()

However, have a look at https://github.com/noodleflake/pyioc; this might be what you are looking for.

i.e. In pyioc

from libs.service_locator import ServiceLocator

#main.py
ServiceLocator.register(PostgresConnection)

#other.py
postgres_connection = ServiceLocator.resolve(PostgresConnection)
db_data = postgres_connection.fetchone()

Answer 9

I back Jörg W Mittag’s answer: “The Python implementation of DI/IoC is so lightweight that it completely vanishes”.

To back up this statement, take a look at the famous Martin Fowler’s example ported from Java to Python: Python:Design_Patterns:Inversion_of_Control

As you can see from the above link, a “Container” in Python can be written in 8 lines of code:

class Container:
    def __init__(self, system_data):
        for component_name, component_class, component_args in system_data:
            # Python 3 check; Python 2 old-style classes needed types.ClassType
            if isinstance(component_class, type):
                args = [self.__dict__[arg] for arg in component_args]
                self.__dict__[component_name] = component_class(*args)
            else:
                self.__dict__[component_name] = component_class
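To show how such a container might be driven, here is a self-contained sketch. It uses a Python 3 class check (isinstance(..., type)) in place of types.ClassType, and the MovieFinder and MovieLister components are hypothetical stand-ins in the spirit of Fowler's example, not code from the linked page.

```python
class Container:
    def __init__(self, system_data):
        for component_name, component_class, component_args in system_data:
            if isinstance(component_class, type):
                # Resolve each argument from components registered earlier.
                args = [self.__dict__[arg] for arg in component_args]
                self.__dict__[component_name] = component_class(*args)
            else:
                # Plain values (strings, numbers, ...) are stored as-is.
                self.__dict__[component_name] = component_class


class MovieFinder:  # hypothetical component
    def __init__(self, filename):
        self.filename = filename


class MovieLister:  # hypothetical component that depends on a finder
    def __init__(self, finder):
        self.finder = finder


# Order matters: a component must come after the things it depends on.
system_data = [
    ("filename", "movies.txt", []),
    ("finder", MovieFinder, ["filename"]),
    ("lister", MovieLister, ["finder"]),
]

container = Container(system_data)
print(container.lister.finder.filename)
```

The configuration is just a list of tuples, so it can live in a module, be built in a loop, or be loaded from a file.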

Answer 10

My 2 cents is that in most Python applications you don’t need it and, even if you did, chances are that many Java haters (and incompetent fiddlers who believe themselves to be developers) consider it something bad, just because it’s popular in Java.

An IoC system is actually useful when you have complex networks of objects, where each object may be a dependency of several others and, in turn, itself depend on other objects. In such a case you’ll want to define all these objects once and have a mechanism to put them together automatically, based on as many implicit rules as possible. If you also have configuration to be defined in a simple way by the application user/administrator, that’s an additional reason to desire an IoC system that can read its components from something like a simple XML file (which would be the configuration).

The typical Python application is much simpler, just a bunch of scripts, without such a complex architecture. Personally I’m aware of what an IoC actually is (contrary to those who wrote certain answers here) and I’ve never felt the need for it in my limited Python experience (also I don’t use Spring everywhere, not when the advantages it gives don’t justify its development overhead).

That said, there are Python situations where the IoC approach is actually useful and, in fact, I read here that Django uses it.

The same reasoning above could be applied to Aspect Oriented Programming in the Java world, with the difference that the number of cases where AOP is really worthwhile is even more limited.


Answer 11

pytest fixtures are all based on DI (source)


Answer 12

I agree with @Jorg that DI/IoC is possible, easier, and even more beautiful in Python. What’s missing is frameworks supporting it, but there are a few exceptions. A couple of examples that come to mind:

  • Django comments let you wire in your own Comment class with your custom logic and forms. [More Info]

  • Django lets you attach a custom Profile object to your User model. This is not completely IoC but is a good approach. Personally I’d like to be able to replace the whole User model, as the comments framework does. [More Info]


Answer 13

In my opinion, things like dependency injection are symptoms of a rigid and over-complex framework. When the main body of code becomes much too weighty to change easily, you find yourself having to pick small parts of it, define interfaces for them, and then allow people to change behaviour via the objects that plug into those interfaces. That’s all well and good, but it’s better to avoid that sort of complexity in the first place.

It’s also the symptom of a statically-typed language. When the only tool you have to express abstraction is inheritance, then that’s pretty much what you use everywhere. Having said that, C++ is pretty similar but never picked up the fascination with Builders and Interfaces everywhere that Java developers did. It is easy to get over-exuberant with the dream of being flexible and extensible at the cost of writing far too much generic code with little real benefit. I think it’s a cultural thing.

Typically I think Python people are used to picking the right tool for the job, which is a coherent and simple whole, rather than the One True Tool (With A Thousand Possible Plugins) that can do anything but offers a bewildering array of possible configuration permutations. There are still interchangeable parts where necessary, but with no need for the big formalism of defining fixed interfaces, due to the flexibility of duck-typing and the relative simplicity of the language.


Answer 14

Unlike Java’s strongly typed nature, Python’s duck typing makes it very easy to pass objects around.

Java developers focus on constructing the class structure and the relations between objects, while keeping things flexible. IoC is extremely important for achieving this.

Python developers focus on getting the work done. They just wire up classes when they need them. They don’t even have to worry about the type of the class. As long as it can quack, it’s a duck! This nature leaves no room for IoC.


How can I find all matches to a regular expression in Python?

Question: How can I find all matches to a regular expression in Python?

In a program I’m writing I have Python use the re.search() function to find matches in a block of text and print the results. However, the program exits once it finds the first match in the block of text.

How do I do this repeatedly where the program doesn’t stop until ALL matches have been found? Is there a separate function to do this?


Answer 0

Use re.findall or re.finditer instead.

re.findall(pattern, string) returns a list of matching strings.

re.finditer(pattern, string) returns an iterator over MatchObject objects.

Example:

re.findall( r'all (.*?) are', 'all cats are smarter than dogs, all dogs are dumber than cats')
# Output: ['cats', 'dogs']

[x.group() for x in re.finditer( r'all (.*?) are', 'all cats are smarter than dogs, all dogs are dumber than cats')]
# Output: ['all cats are', 'all dogs are']
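A minimal sketch of the difference between the two calls, reusing the pattern and text from the example above. The variable names are illustrative only; the point is that findall hands back just the captured group when the pattern contains one, while finditer gives match objects from which the full match and its position can also be read.

```python
import re

text = "all cats are smarter than dogs, all dogs are dumber than cats"
pattern = re.compile(r"all (.*?) are")

# With exactly one capturing group, findall returns the group contents only.
captured = pattern.findall(text)

# finditer yields match objects: full match, captured group and (start, end) span.
matches = [(m.group(0), m.group(1), m.span()) for m in pattern.finditer(text)]

for full, group, (start, end) in matches:
    print(full, "|", group, "|", start, end)
```

If only the matched strings are needed, findall is shorter; finditer is the better fit when positions or several groups matter.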

Python字典:keys()和values()总是相同的顺序吗?

问题:Python字典:keys()和values()总是相同的顺序吗?

看起来字典的keys()values()方法返回的列表始终是一对一映射(假设在调用这两种方法之间字典没有改变)。

例如:

>>> d = {'one':1, 'two': 2, 'three': 3}
>>> k, v = d.keys(), d.values()
>>> for i in range(len(k)):
    print d[k[i]] == v[i]

True
True
True

如果您没有在调用keys()和调用之间更改字典values(),那么假设上述for循环将始终显示True是否错误?我找不到任何证明文件。

It looks like the lists returned by keys() and values() methods of a dictionary are always a 1-to-1 mapping (assuming the dictionary is not altered between calling the 2 methods).

For example:

>>> d = {'one':1, 'two': 2, 'three': 3}
>>> k, v = d.keys(), d.values()
>>> for i in range(len(k)):
    print d[k[i]] == v[i]

True
True
True

If you do not alter the dictionary between calling keys() and calling values(), is it wrong to assume the above for-loop will always print True? I could not find any documentation confirming this.


回答 0

发现了这一点:

如果items()keys()values()iteritems()iterkeys(),和 itervalues()被称为中间没有修改的字典,列表会直接对应。

2.x文档3.x文档上

Found this:

If items(), keys(), values(), iteritems(), iterkeys(), and itervalues() are called with no intervening modifications to the dictionary, the lists will directly correspond.

On 2.x documentation and 3.x documentation.


回答 1

是的,您观察到的确实是一个有保证的属性-如果未更改dict,则keys(),values()和items()会以一致的顺序返回列表。iterkeys()&c也以与相应列表相同的顺序进行迭代。

Yes, what you observed is indeed a guaranteed property — keys(), values() and items() return lists in congruent order if the dict is not altered. iterkeys() &c also iterate in the same order as the corresponding lists.


回答 2

是的,在python 2.x中可以保证

如果对键,值和项目视图进行了迭代而没有对字典进行任何中间修改,则项目的顺序将直接对应。

Yes it is guaranteed in python 2.x:

If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond.


回答 3

是。从CPython 3.6开始,字典按插入顺序返回项目

忽略说这是实现细节的部分。此行为在CPython 3.6中得到保证,并且从Python 3.7开始的所有其他Python实现都需要此行为。

Yes. Starting with CPython 3.6, dictionaries return items in the order you inserted them.

Ignore the part that says this is an implementation detail. This behaviour is guaranteed in CPython 3.6 and is required for all other Python implementations starting with Python 3.7.


回答 4

对文档的良好引用。无论文档/实现如何,都可以通过以下方法保证订单的准确性:

k, v = zip(*d.iteritems())

Good references to the docs. Here’s how you can guarantee the order regardless of the documentation / implementation:

k, v = zip(*d.iteritems())
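For Python 3, where iteritems() no longer exists, the same idea can be sketched with items(). This is an illustration under the assumption of Python 3.7 or later (insertion-ordered dicts) and a non-empty dictionary, since unpacking an empty zip into two names would fail.

```python
d = {"one": 1, "two": 2, "three": 3}

# Split the (key, value) pairs into two tuples that are aligned by construction.
keys, values = zip(*d.items())

# Cross-check the property asked about: keys() and values() line up pairwise.
pairs_match = all(d[k] == v for k, v in zip(d.keys(), d.values()))

print(keys, values, pairs_match)
```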

回答 5

根据http://docs.python.org/dev/py3k/library/stdtypes.html#dictionary-view-objects,dict的keys(),values()和items()方法将返回其顺序的相应迭代器对应。但是,对于同一件事,我无法找到python 2.x官方文档的引用。

据我所知,答案是肯定的,但仅适用于python 3.0+

According to http://docs.python.org/dev/py3k/library/stdtypes.html#dictionary-view-objects , the keys(), values() and items() methods of a dict will return corresponding iterators whose orders correspond. However, I am unable to find a reference to the official documentation for python 2.x for the same thing.

So as far as I can tell, the answer is yes, but only in python 3.0+


回答 6

就其价值而言,我编写的一些常用的生产代码都基于此假设,但我从未遇到过任何问题。我知道那不是真的:-)

如果您不想冒险,我会尽可能使用iteritems()。

for key, value in myDictionary.iteritems():
    print key, value

For what it’s worth, some heavily used production code I have written is based on this assumption and I never had a problem with it. I know that doesn’t make it true though :-)

If you don’t want to take the risk I would use iteritems() if you can.

for key, value in myDictionary.iteritems():
    print key, value

回答 7

我对这些答案不满意,因为我想确保即使使用不同的字典,导出的值也具有相同的顺序。

在这里,您可以预先指定键顺序,即使字典更改,返回的值也将始终具有相同的顺序,或者您使用不同的字典。

keys = dict1.keys()
ordered_keys1 = [dict1[cur_key] for cur_key in keys]
ordered_keys2 = [dict2[cur_key] for cur_key in keys]

I wasn’t satisfied with these answers since I wanted to ensure the exported values had the same ordering even when using different dicts.

Here you specify the key order upfront, so the returned values will always have the same order, even if the dict changes or you use a different dict.

keys = dict1.keys()
ordered_keys1 = [dict1[cur_key] for cur_key in keys]
ordered_keys2 = [dict2[cur_key] for cur_key in keys]

了解地图功能

问题:了解地图功能

map(function, iterable, ...)

将函数应用于每个iterable项目,并返回结果列表。如果传递了其他可迭代参数,则函数必须采用那么多参数,并且并行地将其应用于所有可迭代对象的项。

如果一个可迭代项短于另一个可迭代项,则假定它扩展为None。

如果function是None,则假定身份函数;如果有多个参数,则map()返回一个由元组组成的列表,其中包含所有可迭代对象中的对应项(一种转置操作)。

可迭代参数可以是序列或任何可迭代对象。结果总是一个列表。

这在制作笛卡尔积时起什么作用?

content = map(tuple, array)

将元组放在任何地方会有什么作用?我也注意到,如果没有地图功能的输出abc,并与它,它的a, b, c

我想完全了解此功能。参考定义也很难理解。花哨的绒毛太多。

map(function, iterable, ...)

Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel.

If one iterable is shorter than another it is assumed to be extended with None items.

If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation).

The iterable arguments may be a sequence or any iterable object; the result is always a list.

What role does this play in making a Cartesian product?

content = map(tuple, array)

What effect does putting a tuple anywhere in there have? I also noticed that without the map function the output is abc and with it, it’s a, b, c.

I want to fully understand this function. The reference definition is also hard to understand. Too much fancy fluff.


回答 0

map不是特别的pythonic。我建议改用列表推导:

map(f, iterable)

基本上等同于:

[f(x) for x in iterable]

map单独不能执行笛卡尔积,因为其输出列表的长度始终与输入列表相同。您可以通过列表理解来简单地做笛卡尔积:

[(a, b) for a in iterable_a for b in iterable_b]

语法有点混乱-基本上等同于:

result = []
for a in iterable_a:
    for b in iterable_b:
        result.append((a, b))

map isn’t particularly pythonic. I would recommend using list comprehensions instead:

map(f, iterable)

is basically equivalent to:

[f(x) for x in iterable]

map on its own can’t do a Cartesian product, because the length of its output list is always the same as its input list. You can trivially do a Cartesian product with a list comprehension though:

[(a, b) for a in iterable_a for b in iterable_b]

The syntax is a little confusing — that’s basically equivalent to:

result = []
for a in iterable_a:
    for b in iterable_b:
        result.append((a, b))
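The equivalences described above can be checked with a short sketch. The function square and the sample lists are made up for illustration, and the list() call around map is an assumption that the code runs on Python 3, where map is lazy.

```python
def square(x):
    return x * x

numbers = [1, 2, 3]

# On Python 3, map() is lazy, so list() is needed to materialise the results.
via_map = list(map(square, numbers))
via_comprehension = [square(x) for x in numbers]

# Nested comprehension: every pairing of an item from each input.
letters = ["a", "b"]
cartesian = [(n, c) for n in numbers for c in letters]

print(via_map == via_comprehension, cartesian)
```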

回答 1

map尽管我想象一个精通函数式编程的人可能会提出一些无法理解的使用生成方法的方法,但它与笛卡尔积完全无关map

map Python 3中的等效于此:

def map(func, iterable):
    for i in iterable:
        yield func(i)

Python 2的唯一区别是它将建立完整的结果列表以立即返回所有结果而不是yielding。

尽管Python约定通常更喜欢列表推导(或生成器表达式)来实现与调用相同的结果map,尤其是如果您使用lambda表达式作为第一个参数:

[func(i) for i in iterable]

作为您在问题注释中要求的示例(“将字符串转换为数组”),通过“数组”,您可能想要一个元组或一个列表(它们的行为都与其他语言的数组类似) —

>>> a = "hello, world"
>>> list(a)
['h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
>>> tuple(a)
('h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd')

map如果您从一个字符串列表而不是单个字符串开始,可以在这里使用- map可以单独列出所有字符串:

>>> a = ["foo", "bar", "baz"]
>>> list(map(list, a))
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

请注意,这map(list, a)在Python 2 中是等效的,但是在Python 3中,list如果您想要执行其他操作而不是将其馈入for循环(或诸如sum仅需要可迭代的处理函数,而无需序列)的处理函数,则需要调用。但也请注意,通常首选列表理解:

>>> [list(b) for b in a]
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

map doesn’t relate to a Cartesian product at all, although I imagine someone well versed in functional programming could come up with some impossible-to-understand way of generating one using map.

map in Python 3 is equivalent to this:

def map(func, iterable):
    for i in iterable:
        yield func(i)

and the only difference in Python 2 is that it will build up a full list of results to return all at once instead of yielding.

Although Python convention usually prefers list comprehensions (or generator expressions) to achieve the same result as a call to map, particularly if you’re using a lambda expression as the first argument:

[func(i) for i in iterable]

As an example of what you asked for in the comments on the question – “turn a string into an array”, by ‘array’ you probably want either a tuple or a list (both of them behave a little like arrays from other languages) –

>>> a = "hello, world"
>>> list(a)
['h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
>>> tuple(a)
('h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd')

A use of map here would be if you start with a list of strings instead of a single string – map can listify all of them individually:

>>> a = ["foo", "bar", "baz"]
>>> list(map(list, a))
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

Note that map(list, a) is equivalent in Python 2, but in Python 3 you need the list call if you want to do anything other than feed it into a for loop (or a processing function such as sum that only needs an iterable, and not a sequence). But also note again that a list comprehension is usually preferred:

>>> [list(b) for b in a]
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]
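One consequence of map returning an iterator in Python 3 is that the result can only be consumed once. The sketch below, with illustrative variable names, shows that a second pass over the same map object comes back empty, and that functions needing only an iterable (such as sum) can take the map object directly.

```python
words = ["foo", "bar", "baz"]

mapped = map(list, words)
first_pass = list(mapped)    # consumes the iterator
second_pass = list(mapped)   # nothing left to yield

# sum() only needs an iterable, so no list() call is required here.
total_length = sum(map(len, words))

print(first_pass, second_pass, total_length)
```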

回答 2

map 通过将函数应用于源的每个元素来创建新列表:

xs = [1, 2, 3]

# all of those are equivalent — the output is [2, 4, 6]
# 1. map
ys = map(lambda x: x * 2, xs)
# 2. list comprehension
ys = [x * 2 for x in xs]
# 3. explicit loop
ys = []
for x in xs:
    ys.append(x * 2)

n-ary map等效于将输入可迭代对象压缩在一起,然后将转换函数应用于该中间压缩列表的每个元素。它不是笛卡尔积:

xs = [1, 2, 3]
ys = [2, 4, 6]

def f(x, y):
    return (x * 2, y // 2)

# output: [(2, 1), (4, 2), (6, 3)]
# 1. map
zs = map(f, xs, ys)
# 2. list comp
zs = [f(x, y) for x, y in zip(xs, ys)]
# 3. explicit loop
zs = []
for x, y in zip(xs, ys):
    zs.append(f(x, y))

我在zip这里使用过,但是map当可迭代项的大小不同时,行为实际上稍有不同–如其文档中所述,它将可迭代项扩展为contains None

map creates a new list by applying a function to every element of the source:

xs = [1, 2, 3]

# all of those are equivalent — the output is [2, 4, 6]
# 1. map
ys = map(lambda x: x * 2, xs)
# 2. list comprehension
ys = [x * 2 for x in xs]
# 3. explicit loop
ys = []
for x in xs:
    ys.append(x * 2)

n-ary map is equivalent to zipping input iterables together and then applying the transformation function on every element of that intermediate zipped list. It’s not a Cartesian product:

xs = [1, 2, 3]
ys = [2, 4, 6]

def f(x, y):
    return (x * 2, y // 2)

# output: [(2, 1), (4, 2), (6, 3)]
# 1. map
zs = map(f, xs, ys)
# 2. list comp
zs = [f(x, y) for x, y in zip(xs, ys)]
# 3. explicit loop
zs = []
for x, y in zip(xs, ys):
    zs.append(f(x, y))

I’ve used zip here, but map behaviour actually differs slightly when iterables aren’t the same size — as noted in its documentation, it extends iterables to contain None.
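The padding with None described here is Python 2 behaviour. As a hedged sketch for Python 3: map with several iterables stops at the shortest one, and itertools.zip_longest can be used when the padded form is wanted. The sample lists are illustrative.

```python
from itertools import zip_longest

xs = [1, 2, 3]
ys = [10, 20]

# Python 3: map stops when the shortest iterable is exhausted.
truncated = list(map(lambda x, y: (x, y), xs, ys))

# zip_longest pads the shorter input with None (the default fill value).
padded = list(zip_longest(xs, ys))

print(truncated, padded)
```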


回答 3

简化一下,您可以想象map()这样做:

def mymap(func, lst):
    result = []
    for e in lst:
        result.append(func(e))
    return result

如您所见,它接受一个函数和一个列表,并返回一个新列表,并将该函数应用于输入列表中的每个元素。我说“简化一点”是因为实际上map()可以处理多个迭代:

如果传递了其他可迭代参数,则函数必须采用那么多参数,并且并行地将其应用于所有可迭代对象的项。如果一个可迭代项短于另一个可迭代项,则假定它扩展为None。

对于问题的第二部分:这在制造笛卡尔积中起什么作用?好,map() 可以用于生成列表的笛卡尔积,如下所示:

lst = [1, 2, 3, 4, 5]

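from functools import reduce  # required on Python 3; also available on Python 2.6+
from operator import add
# list() around the inner map keeps this working on Python 3, where map is lazy
reduce(add, map(lambda i: list(map(lambda j: (i, j), lst)), lst))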

…但是说实话,使用product()是解决问题的一种更简单自然的方法:

from itertools import product
list(product(lst, lst))

无论哪种方式,结果都是上述lst定义的笛卡尔乘积:

[(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
 (2, 1), (2, 2), (2, 3), (2, 4), (2, 5),
 (3, 1), (3, 2), (3, 3), (3, 4), (3, 5),
 (4, 1), (4, 2), (4, 3), (4, 4), (4, 5),
 (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]

Simplifying a bit, you can imagine map() doing something like this:

def mymap(func, lst):
    result = []
    for e in lst:
        result.append(func(e))
    return result

As you can see, it takes a function and a list, and returns a new list with the result of applying the function to each of the elements in the input list. I said “simplifying a bit” because in reality map() can process more than one iterable:

If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. If one iterable is shorter than another it is assumed to be extended with None items.

For the second part in the question: What role does this play in making a Cartesian product? well, map() could be used for generating the cartesian product of a list like this:

lst = [1, 2, 3, 4, 5]

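from functools import reduce  # required on Python 3; also available on Python 2.6+
from operator import add
# list() around the inner map keeps this working on Python 3, where map is lazy
reduce(add, map(lambda i: list(map(lambda j: (i, j), lst)), lst))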

… But to tell the truth, using product() is a much simpler and natural way to solve the problem:

from itertools import product
list(product(lst, lst))

Either way, the result is the cartesian product of lst as defined above:

[(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
 (2, 1), (2, 2), (2, 3), (2, 4), (2, 5),
 (3, 1), (3, 2), (3, 3), (3, 4), (3, 5),
 (4, 1), (4, 2), (4, 3), (4, 4), (4, 5),
 (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]

回答 4

map()函数可以将相同的过程应用于可迭代数据结构中的每个项目,例如列表,生成器,字符串和其他内容。

让我们看一个例子: map()可以遍历列表中的每个项目并对每个项目应用一个函数,这样它将返回(返回)新列表。

假设您有一个接受数字的函数,将数字加1并返回:

def add_one(num):
  new_num = num + 1
  return new_num

您还有一个数字列表:

my_list = [1, 3, 6, 7, 8, 10]

如果要递增列表中的每个数字,可以执行以下操作:

>>> map(add_one, my_list)
[2, 4, 7, 8, 9, 11]

注意:至少map()需要两个参数。首先是函数名称,其次是列表。

让我们看看其他一些不错的事情map()map()可以采用多个可迭代项(列表,字符串等),并将每个可迭代项中的元素作为参数传递给函数。

我们有三个列表:

list_one = [1, 2, 3, 4, 5]
list_two = [11, 12, 13, 14, 15]
list_three = [21, 22, 23, 24, 25]

map() 可以使您成为一个新列表,其中包含在特定索引处添加的元素。

现在记住map(),需要一个功能。这次我们将使用内置sum()函数。运行map()给出以下结果:

>>> map(sum, zip(list_one, list_two, list_three))
[33, 36, 39, 42, 45]

记住:
在Python 2中map(),将根据最长的列表进行迭代(遍历列表中的元素),并传递None给较短列表的函数,因此您的函数应查找None并处理它们,否则会出错。在Python 3中map(),使用最短列表结束后将停止。同样,在Python 3中,map()返回迭代器,而不是列表。

The map() function is there to apply the same procedure to every item in an iterable data structure, like lists, generators, strings, and other stuff.

Let’s look at an example: map() can iterate over every item in a list and apply a function to each item, then it will return (give you back) the new list.

Imagine you have a function that takes a number, adds 1 to that number and returns it:

def add_one(num):
  new_num = num + 1
  return new_num

You also have a list of numbers:

my_list = [1, 3, 6, 7, 8, 10]

If you want to increment every number in the list, you can do the following:

>>> map(add_one, my_list)
[2, 4, 7, 8, 9, 11]

Note: At minimum map() needs two arguments. First a function name and second something like a list.

Let’s see some other cool things map() can do. map() can take multiple iterables (lists, strings, etc.) and pass an element from each iterable to a function as an argument.

We have three lists:

list_one = [1, 2, 3, 4, 5]
list_two = [11, 12, 13, 14, 15]
list_three = [21, 22, 23, 24, 25]

map() can make you a new list that holds the addition of elements at a specific index.

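Now remember that map() needs a function. This time we’ll use the builtin sum() function. Because sum() takes a single iterable rather than several separate arguments, the three lists are zipped together first, so each call receives one group of elements. Running map() gives the following result:

>>> map(sum, zip(list_one, list_two, list_three))
[33, 36, 39, 42, 45]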

REMEMBER:
In Python 2, map() iterates (goes through the elements of the lists) according to the longest list and passes None to the function in place of the missing elements of the shorter lists, so your function should check for None and handle it; otherwise you will get errors. In Python 3, map() stops as soon as the shortest list is exhausted. Also, in Python 3, map() returns an iterator, not a list.
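A small Python 3 sketch of element-wise addition across the three lists, shown three ways. Note that sum() accepts one iterable, so it is applied to each zipped group rather than being handed three separate arguments; operator.add and the lambda are illustrative choices, not the only options.

```python
from operator import add

list_one = [1, 2, 3, 4, 5]
list_two = [11, 12, 13, 14, 15]
list_three = [21, 22, 23, 24, 25]

# Two iterables: the function receives one element from each per call.
pairwise = list(map(add, list_one, list_two))

# Three iterables: the function must accept three arguments.
three_way = list(map(lambda a, b, c: a + b + c, list_one, list_two, list_three))

# Equivalent formulation: zip the lists, then sum each group.
via_zip = [sum(group) for group in zip(list_one, list_two, list_three)]

print(pairwise, three_way, via_zip)
```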


回答 5

Python3-映射(函数,可迭代)

没有完全提及的一件事(尽管@BlooB kinda提到了)是地图返回地图对象而不是列表。在初始化和迭代的时间性能方面,这是一个很大的差异。考虑这两个测试。

import time

def test1(iterable):
    # time.perf_counter() replaces time.clock(), which was removed in Python 3.8
    a = time.perf_counter()
    map(str, iterable)
    a = time.perf_counter() - a

    b = time.perf_counter()
    [ str(x) for x in iterable ]
    b = time.perf_counter() - b

    print(a,b)


def test2(iterable):
    a = time.perf_counter()
    [ x for x in map(str, iterable)]
    a = time.perf_counter() - a

    b = time.perf_counter()
    [ str(x) for x in iterable ]
    b = time.perf_counter() - b

    print(a,b)


test1(range(2000000))  # Prints ~1.7e-5s   ~8s
test2(range(2000000))  # Prints ~9s        ~8s

如您所见,初始化map函数几乎不需要时间。但是,通过map对象进行迭代比仅简单地迭代可迭代对象要花费更长的时间。这意味着传递给map()的函数不会应用到每个元素,直到在迭代中到达该元素为止。如果要使用列表,请使用列表理解。如果您打算在for循环中进行迭代并且会在某个时候中断,请使用map。

Python3 – map(func, iterable)

One thing that wasn’t mentioned completely (although @BlooB kinda mentioned it) is that map returns a map object NOT a list. This is a big difference when it comes to time performance on initialization and iteration. Consider these two tests.

import time

def test1(iterable):
    # time.perf_counter() replaces time.clock(), which was removed in Python 3.8
    a = time.perf_counter()
    map(str, iterable)
    a = time.perf_counter() - a

    b = time.perf_counter()
    [ str(x) for x in iterable ]
    b = time.perf_counter() - b

    print(a,b)


def test2(iterable):
    a = time.perf_counter()
    [ x for x in map(str, iterable)]
    a = time.perf_counter() - a

    b = time.perf_counter()
    [ str(x) for x in iterable ]
    b = time.perf_counter() - b

    print(a,b)


test1(range(2000000))  # Prints ~1.7e-5s   ~8s
test2(range(2000000))  # Prints ~9s        ~8s

As you can see, creating the map object takes almost no time at all. However, iterating through the map object takes longer than simply iterating through the iterable. This means that the function passed to map() is not applied to an element until that element is reached in the iteration. If you want a list, use a list comprehension. If you plan to iterate in a for loop and will break at some point, then use map.
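The lazy behaviour can also be observed without timing anything, which avoids machine-dependent numbers. In this sketch the helper noisy_str is hypothetical: it records every call so that the number of elements actually processed can be counted before and after an early break.

```python
calls = []

def noisy_str(x):
    # Record each invocation so laziness becomes visible.
    calls.append(x)
    return str(x)

lazy = map(noisy_str, range(1000))
calls_after_creation = len(calls)   # nothing has been converted yet

found = None
for s in lazy:
    if s == "3":
        found = s
        break

calls_after_break = len(calls)      # only the elements up to the break were processed

print(calls_after_creation, found, calls_after_break)
```

A list comprehension in the same position would have converted all 1000 elements before the loop started.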


如何在python中获取当前时间并分解为年,月,日,小时,分钟?

问题:如何在python中获取当前时间并分解为年,月,日,小时,分钟?

我想获得当前时间在Python,并将它们分配到变量喜欢yearmonthdayhourminute。如何在Python 2.7中完成?

I would like to get the current time in Python and assign them into variables like year, month, day, hour, minute. How can this be done in Python 2.7?


回答 0

datetime模块是您的朋友:

import datetime
now = datetime.datetime.now()
print now.year, now.month, now.day, now.hour, now.minute, now.second
# 2015 5 6 8 53 40

您不需要单独的变量,返回datetime对象上的属性就可以满足您的所有需求。

The datetime module is your friend:

import datetime
now = datetime.datetime.now()
print(now.year, now.month, now.day, now.hour, now.minute, now.second)
# 2015 5 6 8 53 40

You don’t need separate variables, the attributes on the returned datetime object have all you need.
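If separate variables are wanted anyway, as the question asks, a short sketch is to unpack the attributes in one step. The helper name split_datetime is hypothetical; the code runs on both Python 2.7 and Python 3.

```python
from datetime import datetime

def split_datetime(moment):
    """Return (year, month, day, hour, minute) for a datetime object."""
    return moment.year, moment.month, moment.day, moment.hour, moment.minute

# Current time, unpacked into individual variables.
year, month, day, hour, minute = split_datetime(datetime.now())

# A fixed timestamp makes the result reproducible.
example = split_datetime(datetime(2015, 5, 6, 8, 53, 40))

print(example)
```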


回答 1

这是一个单线,最大字符数不超过80个字符。

import time
year, month, day, hour, min = map(int, time.strftime("%Y %m %d %H %M").split())

Here’s a one-liner that comes in just under the 80 char line max.

import time
year, month, day, hour, min = map(int, time.strftime("%Y %m %d %H %M").split())

回答 2

tzamandatetime答案要干净得多,但是您可以使用原始的python 模块来实现:time

import time
strings = time.strftime("%Y,%m,%d,%H,%M,%S")
t = strings.split(',')
numbers = [ int(x) for x in t ]
print numbers

输出:

[2016, 3, 11, 8, 29, 47]

The datetime answer by tzaman is much cleaner, but you can do it with the original python time module:

import time
strings = time.strftime("%Y,%m,%d,%H,%M,%S")
t = strings.split(',')
numbers = [ int(x) for x in t ]
print numbers

Output:

[2016, 3, 11, 8, 29, 47]

回答 3

通过解压缩timetupledatetime对象,您应该得到想要的东西:

from datetime import datetime

n = datetime.now()
t = n.timetuple()
y, m, d, h, min, sec, wd, yd, i = t

By unpacking the timetuple of a datetime object, you should get what you want:

from datetime import datetime

n = datetime.now()
t = n.timetuple()
y, m, d, h, min, sec, wd, yd, i = t
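The nine unpacked values correspond to the named fields of time.struct_time, so they can also be read by attribute, which avoids shadowing the builtin min. A sketch with a fixed datetime (chosen only so the values are reproducible):

```python
from datetime import datetime

moment = datetime(2015, 5, 6, 8, 53, 40)
t = moment.timetuple()

# The same values are available by name or by position.
by_name = (t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min)
by_index = tuple(t[:5])

# Remaining fields: seconds, weekday (Monday == 0), day of year, DST flag.
rest = (t.tm_sec, t.tm_wday, t.tm_yday, t.tm_isdst)

print(by_name, rest)
```

For a naive datetime the DST flag is -1, meaning "unknown".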

回答 4

对于python 3

import datetime
now = datetime.datetime.now()
print(now.year, now.month, now.day, now.hour, now.minute, now.second)

For python 3

import datetime
now = datetime.datetime.now()
print(now.year, now.month, now.day, now.hour, now.minute, now.second)

回答 5

让我们看看如何从当前时间获取并打印python中的日,月,年:

import datetime

now = datetime.datetime.now()
year = '{:02d}'.format(now.year)
month = '{:02d}'.format(now.month)
day = '{:02d}'.format(now.day)
hour = '{:02d}'.format(now.hour)
minute = '{:02d}'.format(now.minute)
day_month_year = '{}-{}-{}'.format(year, month, day)

print('day_month_year: ' + day_month_year)

结果:

day_month_year: 2019-03-26

Let’s see how to get and print the day, month and year in Python from the current time:

import datetime

now = datetime.datetime.now()
year = '{:02d}'.format(now.year)
month = '{:02d}'.format(now.month)
day = '{:02d}'.format(now.day)
hour = '{:02d}'.format(now.hour)
minute = '{:02d}'.format(now.minute)
day_month_year = '{}-{}-{}'.format(year, month, day)

print('day_month_year: ' + day_month_year)

result:

day_month_year: 2019-03-26

回答 6

import time
year = time.strftime("%Y") # or "%y"

回答 7

三个用于访问和操纵日期和时间的库,即日期时间,箭头和摆锤,都使这些项在命名元组中可用,命名元组的元素可通过名称或索引访问。此外,可以完全相同的方式访问项目。(我想如果我更聪明,我不会感到惊讶。)

>>> YEARS, MONTHS, DAYS, HOURS, MINUTES = range(5)
>>> import datetime
>>> import arrow
>>> import pendulum
>>> [datetime.datetime.now().timetuple()[i] for i in [YEARS, MONTHS, DAYS, HOURS, MINUTES]]
[2017, 6, 16, 19, 15]
>>> [arrow.now().timetuple()[i] for i in [YEARS, MONTHS, DAYS, HOURS, MINUTES]]
[2017, 6, 16, 19, 15]
>>> [pendulum.now().timetuple()[i] for i in [YEARS, MONTHS, DAYS, HOURS, MINUTES]]
[2017, 6, 16, 19, 16]

Three libraries for accessing and manipulating dates and times, namely datetime, arrow and pendulum, all make these items available in namedtuples whose elements are accessible either by name or index. Moreover, the items are accessible in precisely the same way. (I suppose if I were more intelligent I wouldn’t be surprised.)

>>> YEARS, MONTHS, DAYS, HOURS, MINUTES = range(5)
>>> import datetime
>>> import arrow
>>> import pendulum
>>> [datetime.datetime.now().timetuple()[i] for i in [YEARS, MONTHS, DAYS, HOURS, MINUTES]]
[2017, 6, 16, 19, 15]
>>> [arrow.now().timetuple()[i] for i in [YEARS, MONTHS, DAYS, HOURS, MINUTES]]
[2017, 6, 16, 19, 15]
>>> [pendulum.now().timetuple()[i] for i in [YEARS, MONTHS, DAYS, HOURS, MINUTES]]
[2017, 6, 16, 19, 16]

回答 8

您可以使用gmtime

from time import gmtime

detailed_time = gmtime() 
#returns a struct_time object for current time

year = detailed_time.tm_year
month = detailed_time.tm_mon
day = detailed_time.tm_mday
hour = detailed_time.tm_hour
minute = detailed_time.tm_min

注意:可以将时间戳传递给gmtime,默认为time()返回的当前时间

eg.
gmtime(1521174681)

参见struct_time

You can use gmtime

from time import gmtime

detailed_time = gmtime() 
#returns a struct_time object for current time

year = detailed_time.tm_year
month = detailed_time.tm_mon
day = detailed_time.tm_mday
hour = detailed_time.tm_hour
minute = detailed_time.tm_min

Note: A timestamp can be passed to gmtime; the default is the current time as returned by time().

eg.
gmtime(1521174681)

See struct_time
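One point worth making explicit: gmtime returns UTC, while localtime returns the same instant in the machine's local time zone. A sketch using the timestamp from the example above; the local result depends on the system's time zone settings, so only the UTC fields are predictable.

```python
import time

stamp = 1521174681

utc = time.gmtime(stamp)       # always UTC
local = time.localtime(stamp)  # depends on the system time zone

utc_fields = (utc.tm_year, utc.tm_mon, utc.tm_mday, utc.tm_hour, utc.tm_min)

print(utc_fields, local.tm_hour)
```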


回答 9

这是一个比较老的问题,但是我想出了一个我认为其他人可能会喜欢的解决方案。

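from datetime import datetime

def get_current_datetime_as_dict():
    n = datetime.now()
    t = n.timetuple()
    # timetuple() yields nine fields; the last three are the weekday,
    # the day of the year and the DST flag
    field_names = ["year",
                   "month",
                   "day",
                   "hour",
                   "min",
                   "sec",
                   "weekday",
                   "yd",
                   "isdst"]
    return dict(zip(field_names, t))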

timetuple()可以与另一个数组一起压缩,这将创建带标签的元组。将其转换为字典,然后可以使用生成的产品get_current_datetime_as_dict()['year']

这比这里的其他一些解决方案有更多的开销,但是我发现能够在代码中为清楚起见而访问命名值真是太好了。

This is an older question, but I came up with a solution I thought others might like.

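from datetime import datetime

def get_current_datetime_as_dict():
    n = datetime.now()
    t = n.timetuple()
    # timetuple() yields nine fields; the last three are the weekday,
    # the day of the year and the DST flag
    field_names = ["year",
                   "month",
                   "day",
                   "hour",
                   "min",
                   "sec",
                   "weekday",
                   "yd",
                   "isdst"]
    return dict(zip(field_names, t))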

timetuple() can be zipped with another array, which creates labeled tuples. Cast that to a dictionary and the resultant product can be consumed with get_current_datetime_as_dict()['year'].

This has a little more overhead than some of the other solutions on here, but I’ve found it’s so nice to be able to access named values for clarity’s sake in the code.


回答 10

您可以使用datetime模块获取Python 2.7中的当前日期和时间

import datetime
print datetime.datetime.now()

输出:

2015-05-06 14:44:14.369392

You can use the datetime module to get the current date and time in Python 2.7:

import datetime
print datetime.datetime.now()

Output :

2015-05-06 14:44:14.369392

了解Keras LSTM

问题:了解Keras LSTM

我试图调和对LSTM的理解,并在克里斯托弗·奥拉(Christopher Olah)在Keras中实现的这篇文章中指出了这一点。我正在关注Jason Brownlee为Keras教程撰写博客。我主要感到困惑的是

  1. 将数据系列重塑为 [samples, time steps, features]和,
  2. 有状态的LSTM

让我们参考下面粘贴的代码专注于以上两个问题:

# reshape into X=t and Y=t+1
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], look_back, 1))
testX = numpy.reshape(testX, (testX.shape[0], look_back, 1))
########################
# The IMPORTANT BIT
##########################
# create and fit the LSTM network
batch_size = 1
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(100):
    model.fit(trainX, trainY, nb_epoch=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()

注意:create_dataset采用长度为N的序列,并返回一个N-look_back数组,其中每个元素都是一个look_back长度序列。

什么是时间步骤和功能?

可以看出TrainX是一个3D数组,其中Time_steps和Feature是最后两个维度(在此特定代码中为3和1)。关于下图,这是否意味着我们正在考虑many to one粉红色盒数为3的情况?还是字面上的意思是链长为3(即仅考虑了3个绿色框)。在此处输入图片说明

当我们考虑多元序列时,features参数是否有意义?例如同时模拟两个金融股票?

有状态的LSTM

有状态LSTM是否意味着我们在批次运行之间保存了单元内存值?如果是这样,batch_size则为1,并且在两次训练之间将内存重置,那么说它是有状态的就意味着什么。我猜想这与训练数据没有被改组的事实有关,但是我不确定如何做。

有什么想法吗?图片参考:http : //karpathy.github.io/2015/05/21/rnn-efficiency/

编辑1:

@van对红色和绿色方框相等的评论有点困惑。因此,为了确认一下,以下API调用是否与展开的图相对应?特别注意第二张图(batch_size被任意选择): 在此处输入图片说明 在此处输入图片说明

编辑2:

对于已经完成Udacity深度学习类但仍对time_step参数感到困惑的人,请查看以下讨论:https ://discussions.udacity.com/t/rnn-lstm-use-implementation/163169

更新:

原来model.add(TimeDistributed(Dense(vocab_len)))是我要找的东西。这是一个示例:https : //github.com/sachinruk/ShakespeareBot

更新2:

我在这里总结了我对LSTM的大部分理解:https : //www.youtube.com/watch?v= ywinX5wgdEU

I am trying to reconcile my understanding of LSTMs, as laid out in this post by Christopher Olah, with their implementation in Keras. I am following the blog written by Jason Brownlee for the Keras tutorial. What I am mainly confused about is:

  1. The reshaping of the data series into [samples, time steps, features] and,
  2. The stateful LSTMs

Let’s concentrate on the above two questions with reference to the code pasted below:

# reshape into X=t and Y=t+1
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], look_back, 1))
testX = numpy.reshape(testX, (testX.shape[0], look_back, 1))
########################
# The IMPORTANT BIT
##########################
# create and fit the LSTM network
batch_size = 1
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(100):
    model.fit(trainX, trainY, nb_epoch=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()

Note: create_dataset takes a sequence of length N and returns a N-look_back array of which each element is a look_back length sequence.

What are Time Steps and Features?

As can be seen, trainX is a 3-D array with time steps and features being the last two dimensions respectively (3 and 1 in this particular code). With respect to the image referenced below, does this mean that we are considering the many to one case, where the number of pink boxes is 3? Or does it literally mean the chain length is 3 (i.e. only 3 green boxes considered)?

Does the features argument become relevant when we consider multivariate series? e.g. modelling two financial stocks simultaneously?

Stateful LSTMs

Do stateful LSTMs mean that we save the cell memory values between runs of batches? If so, batch_size is one and the memory is reset between the training runs, so what is the point of calling it stateful? I’m guessing this is related to the fact that the training data is not shuffled, but I’m not sure how.

Any thoughts? Image reference: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Edit 1:

I am a bit confused by @van’s comment about the red and green boxes being equal. So just to confirm, do the following API calls correspond to the unrolled diagrams? Note especially the second diagram (batch_size was arbitrarily chosen).

Edit 2:

For people who have done Udacity’s deep learning course and are still confused about the time_step argument, look at the following discussion: https://discussions.udacity.com/t/rnn-lstm-use-implementation/163169

Update:

It turns out model.add(TimeDistributed(Dense(vocab_len))) was what I was looking for. Here is an example: https://github.com/sachinruk/ShakespeareBot

Update2:

I have summarised most of my understanding of LSTMs here: https://www.youtube.com/watch?v=ywinX5wgdEU


回答 0

首先,你选择伟大的教程(12)开始。

Time-step的含义Time-steps==3X.shape(描述数据形状)表示三个粉红色的框。由于在Keras中,每个步骤都需要输入,因此绿色框的数量通常应等于红色框的数量。除非您破解结构。

多对多与多对一:在keras中,return_sequences初始化LSTMor GRU或时有一个参数SimpleRNN。当return_sequencesFalse(默认情况下)时,则如图所示多对一。其返回形状为(batch_size, hidden_unit_length),代表最后一个状态。如果return_sequences是的True话,那就是很多很多。它的返回形状是(batch_size, time_step, hidden_unit_length)

features参数是否相关:Feature参数表示“您的红框有多大”或每步的输入维数是多少?例如,如果您要从8种市场信息中进行预测,则可以使用生成数据feature==8

有状态:您可以查找源代码。初始化状态时,如果为stateful==True,则将最后一次训练的状态用作初始状态,否则将生成新状态。我还没打开stateful呢。但是,我不同意的是,当batch_size只能为1 stateful==True

当前,您将使用收集的数据生成数据。将您的库存信息以流的形式显示,而不是等待一天收集所有顺序的图像,而是想在通过网络进行培训/预测时在线生成输入数据。如果您有400只股票共享同一网络,则可以设置batch_size==400

First of all, you chose great tutorials (1, 2) to start with.

What Time-step means: Time-steps==3 in X.shape (describing the data shape) means there are three pink boxes. Since each step in Keras requires an input, the number of green boxes should usually equal the number of red boxes, unless you hack the structure.

many to many vs. many to one: In Keras, there is a return_sequences parameter when you’re initializing LSTM, GRU or SimpleRNN. When return_sequences is False (the default), it is many to one, as shown in the picture. Its return shape is (batch_size, hidden_unit_length), which represents the last state. When return_sequences is True, it is many to many. Its return shape is (batch_size, time_step, hidden_unit_length).

Does the features argument become relevant: The features argument means “how big is your red box”, i.e. what the input dimension is at each step. If you want to predict from, say, 8 kinds of market information, then you can generate your data with feature==8.

Stateful: You can look up the source code. When initializing the state, if stateful==True, the state from the last training will be used as the initial state; otherwise a new state is generated. I haven’t turned on stateful yet. However, I disagree that the batch_size can only be 1 when stateful==True.

Currently, you generate your data from collected data. Imagine your stock information is coming in as a stream: rather than waiting a day to collect the whole sequence, you would like to generate input data online while training/predicting with the network. If you have 400 stocks sharing the same network, then you can set batch_size==400.
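The [samples, time steps, features] layout discussed in this answer can be sketched in plain Python, without Keras or NumPy. Both functions below are hypothetical stand-ins written for illustration: create_dataset is not the tutorial's helper, only a simple sliding-window version of the same idea, and the nested lists play the role of the 3-D array that the LSTM layer expects.

```python
def create_dataset(series, look_back):
    """Hypothetical sliding-window helper: windows of length look_back plus next-step targets."""
    xs, ys = [], []
    for i in range(len(series) - look_back):
        xs.append(series[i:i + look_back])
        ys.append(series[i + look_back])
    return xs, ys

def to_samples_steps_features(windows):
    """Wrap every scalar in its own list so that features == 1."""
    return [[[value] for value in window] for window in windows]

series = [10, 20, 30, 40, 50, 60]
x, y = create_dataset(series, look_back=3)
x3d = to_samples_steps_features(x)

# (samples, time steps, features)
shape = (len(x3d), len(x3d[0]), len(x3d[0][0]))

print(shape, x3d[0], y)
```

With two measured quantities per step (for example two stock prices), each innermost list would hold two values and the last dimension would be 2.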


回答 1

作为已接受答案的补充,此答案显示了keras行为以及如何获得每张照片。

一般Keras行为

标准keras内部处理总是如下图所示(我在其中使用features=2,压力和温度为例):

多对多

在此图中,我将步骤数增加到5,以避免与其他维度混淆。

对于此示例:

  • 我们有N个油箱
  • 我们每小时花费5个小时采取措施(时间步长)
  • 我们测量了两个功能:
    • 压力P
    • 温度T

输入数组的形状应为(N,5,2)

        [     Step1      Step2      Step3      Step4      Step5
Tank A:    [[Pa1,Ta1], [Pa2,Ta2], [Pa3,Ta3], [Pa4,Ta4], [Pa5,Ta5]],
Tank B:    [[Pb1,Tb1], [Pb2,Tb2], [Pb3,Tb3], [Pb4,Tb4], [Pb5,Tb5]],
  ....
Tank N:    [[Pn1,Tn1], [Pn2,Tn2], [Pn3,Tn3], [Pn4,Tn4], [Pn5,Tn5]],
        ]

滑动窗输入

通常,LSTM层应该处理整个序列。划分窗口可能不是最好的主意。该层具有有关序列前进过程的内部状态。Windows消除了学习长序列的可能性,从而将所有序列限制为窗口大小。

在窗口中,每个窗口都是一个较长的原始序列的一部分,但是Keras会将它们视为独立的序列:

        [     Step1    Step2    Step3    Step4    Step5
Window  A:  [[P1,T1], [P2,T2], [P3,T3], [P4,T4], [P5,T5]],
Window  B:  [[P2,T2], [P3,T3], [P4,T4], [P5,T5], [P6,T6]],
Window  C:  [[P3,T3], [P4,T4], [P5,T5], [P6,T6], [P7,T7]],
  ....
        ]

请注意,在这种情况下,最初只有一个序列,但是您将其分为许多序列以创建窗口。

“什么是序列”的概念是抽象的。重要的部分是:

  • 您可以批量处理许多单独的序列
  • 使序列成为序列的原因是它们是逐步演化的(通常是时间步长)

通过“单层”实现每种情况

实现许多标准:

标准多对多

您可以使用一个简单的LSTM层来实现很多对很多return_sequences=True

outputs = LSTM(units, return_sequences=True)(inputs)

#output_shape -> (batch_size, steps, units)

实现多对一:

使用完全相同的图层,keras将执行完全相同的内部预处理,但是当您使用return_sequences=False(或简单地忽略此参数)时,keras会自动放弃最后一步的步骤:

多对一

outputs = LSTM(units)(inputs)

#output_shape -> (batch_size, units) --> steps were discarded, only the last was returned

实现一对多

现在,仅keras LSTM层不支持此功能。您将必须创建自己的策略来重复步骤。有两种好的方法:

  • 通过重复张量创建恒定的多步输入
  • 使用a stateful=True反复获取一个步骤的输出,并将其用作下一步的输入(需要output_features == input_features

一对多与重复向量

为了适应keras的标准行为,我们需要分步进行输入,因此,我们只需重复输入所需的长度即可:

一对多重复

outputs = RepeatVector(steps)(inputs) #where inputs is (batch,features)
outputs = LSTM(units,return_sequences=True)(outputs)

#output_shape -> (batch_size, steps, units)

了解状态=真

现在出现一种可能的用法 stateful=True(避免避免一次加载无法容纳计算机内存的数据)

有状态允许我们分阶段输入序列的“部分”。区别在于:

  • 在中stateful=False,第二批包含完整的新序列,独立于第一批
  • 在中stateful=True,第二批继续第一批,扩展了相同的序列。

这就像在Windows中划分序列一样,有两个主要区别:

  • 这些窗户不叠加!
  • stateful=True 将看到这些窗口作为单个长序列连接

在中stateful=True,每个新批次将被解释为继续前一个批次(直到您调用model.reset_states())。

  • 批次2中的序列1将继续批次1中的序列1。
  • 批次2中的序列2将继续批次1中的序列2。
  • 批次2中的序列n将继续批次1中的序列n。

输入示例,批次1包含步骤1和2,批次2包含步骤3至5:

                   BATCH 1                           BATCH 2
        [     Step1      Step2        |    [    Step3      Step4      Step5
Tank A:    [[Pa1,Ta1], [Pa2,Ta2],     |       [Pa3,Ta3], [Pa4,Ta4], [Pa5,Ta5]],
Tank B:    [[Pb1,Tb1], [Pb2,Tb2],     |       [Pb3,Tb3], [Pb4,Tb4], [Pb5,Tb5]],
  ....                                |
Tank N:    [[Pn1,Tn1], [Pn2,Tn2],     |       [Pn3,Tn3], [Pn4,Tn4], [Pn5,Tn5]],
        ]                                  ]

注意批次1和批次2中的储罐对齐!这就是我们需要的原因shuffle=False(当然,除非我们仅使用一个序列)。

您可以无限期地拥有任意数量的批次。(对于每批具有可变长度,请使用input_shape=(None,features)

一对多与有状态= True

对于我们这里的情况,每批将只使用1步,因为我们希望获得一个输出步并将其作为输入。

请注意,图片中的行为不是由“引起的” stateful=True。我们将在下面的手动循环中强制执行该操作。在此示例中,stateful=True是“允许”我们停止序列,操纵我们想要的并从我们停止的地方继续进行操作的东西。

一对多状态

老实说,对于这种情况,重复方法可能是更好的选择。但是,由于我们正在研究stateful=True,所以这是一个很好的例子。最好的使用方法是下一个“多对多”案例。

层:

outputs = LSTM(units=features, 
               stateful=True, 
               return_sequences=True, #just to keep a nice output shape even with length 1
               input_shape=(None,features))(inputs) 
    #units = features because we want to use the outputs as inputs
    #None because we want variable length

#output_shape -> (batch_size, steps, units) 

现在,我们将需要一个手动循环进行预测:

input_data = someDataWithShape((batch, 1, features))

#important, we're starting new sequences, not continuing old ones:
model.reset_states()

output_sequence = []
last_step = input_data
for i in range(steps_to_predict): #assuming steps_to_predict is the number of future steps

    new_step = model.predict(last_step)
    output_sequence.append(new_step)
    last_step = new_step

#end of the sequences
model.reset_states()

有状态=真对多对多

现在,在这里,我们得到一个非常好的应用程序:给定一个输入序列,尝试预测其未来未知的步骤。

我们使用的方法与上述“一对多”方法相同,不同之处在于:

  • 我们将使用序列本身作为目标数据,向前迈出一步
  • 我们知道序列的一部分(因此我们丢弃了这部分结果)。

多对多状态

图层(与上面相同):

outputs = LSTM(units=features, 
               stateful=True, 
               return_sequences=True, 
               input_shape=(None,features))(inputs) 
    #units = features because we want to use the outputs as inputs
    #None because we want variable length

#output_shape -> (batch_size, steps, units) 

训练:

我们将训练模型以预测序列的下一步:

totalSequences = someSequencesShaped((batch, steps, features))
    #batch size is usually 1 in these cases (often you have only one Tank in the example)

X = totalSequences[:,:-1] #the entire known sequence, except the last step
Y = totalSequences[:,1:] #one step ahead of X

#loop for resetting states at the start/end of the sequences:
for epoch in range(epochs):
    model.reset_states()
    model.train_on_batch(X,Y)

预测:

我们预测的第一阶段涉及“调整状态”。这就是为什么即使我们已经知道序列的这一部分,我们也要再次预测整个序列:

model.reset_states() #starting a new sequence
predicted = model.predict(totalSequences)
firstNewStep = predicted[:,-1:] #the last step of the predictions is the first future step

现在我们像一对多情况一样进入循环。但是请不要在这里重置状态!。我们希望模型知道序列的哪一步(并且由于上面我们所做的预测,它知道它在第一步)

output_sequence = [firstNewStep]
last_step = firstNewStep
for i in range(steps_to_predict): #assuming steps_to_predict is the number of future steps

    new_step = model.predict(last_step)
    output_sequence.append(new_step)
    last_step = new_step

#end of the sequences
model.reset_states()

这些答案和文件中使用了这种方法:

实现复杂的配置

在上面的所有示例中,我都展示了“一层”的行为。

当然,您可以在彼此之上堆叠许多层,而不必全部遵循相同的模式,然后创建自己的模型。

出现的一个有趣的例子是“自动编码器”,它具有“多对一编码器”,后跟“一对多”解码器:

编码器:

inputs = Input((steps,features))

#a few many to many layers:
outputs = LSTM(hidden1,return_sequences=True)(inputs)
outputs = LSTM(hidden2,return_sequences=True)(outputs)    

#many to one layer:
outputs = LSTM(hidden3)(outputs)

encoder = Model(inputs,outputs)

解码器:

使用“重复”方法;

inputs = Input((hidden3,))

#repeat to make one to many:
outputs = RepeatVector(steps)(inputs)

#a few many to many layers:
outputs = LSTM(hidden4,return_sequences=True)(outputs)

#last layer
outputs = LSTM(features,return_sequences=True)(outputs)

decoder = Model(inputs,outputs)

自动编码器:

inputs = Input((steps,features))
outputs = encoder(inputs)
outputs = decoder(outputs)

autoencoder = Model(inputs,outputs)

与一起训练 fit(X,X)

补充说明

如果您想了解有关LSTM中如何计算步数的详细信息,或有关上述stateful=True情况的详细信息,则可以在此答案中阅读更多内容:关于“了解Keras LSTM”的疑问

As a complement to the accepted answer, this answer shows keras behaviors and how to achieve each picture.

General Keras behavior

The standard keras internal processing is always a many to many as in the following picture (where I used features=2, pressure and temperature, just as an example):

ManyToMany

In this image, I increased the number of steps to 5, to avoid confusion with the other dimensions.

For this example:

  • We have N oil tanks
  • We took measurements hourly for 5 hours (time steps)
  • We measured two features:
    • Pressure P
    • Temperature T

Our input array should then be something shaped as (N,5,2):

        [     Step1      Step2      Step3      Step4      Step5
Tank A:    [[Pa1,Ta1], [Pa2,Ta2], [Pa3,Ta3], [Pa4,Ta4], [Pa5,Ta5]],
Tank B:    [[Pb1,Tb1], [Pb2,Tb2], [Pb3,Tb3], [Pb4,Tb4], [Pb5,Tb5]],
  ....
Tank N:    [[Pn1,Tn1], [Pn2,Tn2], [Pn3,Tn3], [Pn4,Tn4], [Pn5,Tn5]],
        ]
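As a rough illustration of that layout, the sketch below builds an (N, 5, 2) array as plain nested Python lists, with N = 3. The readings are made up, and `make_tank` and `shape_of` are hypothetical helpers for this example, not Keras functions.

```python
# Hypothetical readings: 3 tanks, 5 hourly steps, 2 features (pressure, temperature).
def make_tank(base_pressure, base_temperature, steps=5):
    """Return one sequence shaped (steps, 2): one [pressure, temperature] pair per step."""
    return [[base_pressure + t, base_temperature + 0.5 * t] for t in range(steps)]


def shape_of(nested):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


batch = [make_tank(100.0, 20.0), make_tank(110.0, 21.0), make_tank(120.0, 22.0)]
print(shape_of(batch))   # (sequences, steps, features)
print(batch[1][2])       # tank B, step 3
```

The first axis indexes the tank (one independent sequence each), the second the time step, and the third the feature.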

Inputs for sliding windows

Often, LSTM layers are supposed to process entire sequences, so dividing them into windows may not be the best idea. The layer has internal states that track how a sequence evolves as it steps forward. Windows eliminate the possibility of learning long sequences, limiting all sequences to the window size.

With windows, each window is part of a long original sequence, but Keras will treat each one as an independent sequence:

        [     Step1    Step2    Step3    Step4    Step5
Window  A:  [[P1,T1], [P2,T2], [P3,T3], [P4,T4], [P5,T5]],
Window  B:  [[P2,T2], [P3,T3], [P4,T4], [P5,T5], [P6,T6]],
Window  C:  [[P3,T3], [P4,T4], [P5,T5], [P6,T6], [P7,T7]],
  ....
        ]

Notice that in this case you initially have only one sequence, but you’re dividing it into many sequences to create windows.
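A minimal sketch of how such windows could be cut from one long sequence, using plain lists. The function name `sliding_windows` and the string placeholders are assumptions for illustration only.

```python
def sliding_windows(sequence, window_size):
    """Split one long sequence into overlapping windows of `window_size` steps."""
    return [sequence[i:i + window_size]
            for i in range(len(sequence) - window_size + 1)]


# One original sequence of 7 steps, each step holding [pressure, temperature].
sequence = [[f"P{t}", f"T{t}"] for t in range(1, 8)]

windows = sliding_windows(sequence, window_size=5)
for window in windows:
    print(window)
```

Each window becomes its own row of the batch, which is why the layer cannot see any dependency longer than the window size.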

The concept of “what is a sequence” is abstract. The important parts are:

  • you can have batches with many individual sequences
  • what makes the sequences be sequences is that they evolve in steps (usually time steps)

Achieving each case with “single layers”

Achieving standard many to many:

StandardManyToMany

You can achieve many to many with a simple LSTM layer, using return_sequences=True:

outputs = LSTM(units, return_sequences=True)(inputs)

#output_shape -> (batch_size, steps, units)

Achieving many to one:

Using the exact same layer, keras will do the exact same internal preprocessing, but when you use return_sequences=False (or simply omit this argument), keras will automatically discard all steps before the last one:

ManyToOne

outputs = LSTM(units)(inputs)

#output_shape -> (batch_size, units) --> steps were discarded, only the last was returned
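To make the difference between the two output shapes concrete without Keras, here is a toy recurrence. It is deliberately not an LSTM: `toy_rnn` and its update rule are assumptions chosen only so the numbers are easy to follow.

```python
def toy_rnn(sequence, return_sequences=False):
    """A toy recurrence (not an LSTM): state = 0.5 * state + current input."""
    state = 0.0
    outputs = []
    for x in sequence:
        state = 0.5 * state + x
        outputs.append(state)
    # Many to many keeps every step; many to one keeps only the last step.
    return outputs if return_sequences else outputs[-1]


steps = [1.0, 2.0, 3.0]
all_steps = toy_rnn(steps, return_sequences=True)
last_only = toy_rnn(steps)
print(all_steps, last_only)
```

The same computation runs in both cases; the flag only decides how much of the result is returned.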

Achieving one to many

Now, this is not supported by keras LSTM layers alone. You will have to create your own strategy to multiply the steps. There are two good approaches:

  • Create a constant multi-step input by repeating a tensor
  • Use stateful=True to recurrently take the output of one step and serve it as the input of the next step (needs output_features == input_features)

One to many with repeat vector

In order to fit to keras standard behavior, we need inputs in steps, so, we simply repeat the inputs for the length we want:

OneToManyRepeat

outputs = RepeatVector(steps)(inputs) #where inputs is (batch,features)
outputs = LSTM(units,return_sequences=True)(outputs)

#output_shape -> (batch_size, steps, units)
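The shape change performed by the repeat step can be sketched with plain lists. `repeat_vector` below is a hypothetical stand-in written for this example; it is not the Keras layer itself.

```python
def repeat_vector(batch, steps):
    """Turn (batch, features) into (batch, steps, features) by copying each vector."""
    return [[list(vector) for _ in range(steps)] for vector in batch]


inputs = [[1.0, 2.0], [3.0, 4.0]]           # shape (2, 2)
repeated = repeat_vector(inputs, steps=3)   # shape (2, 3, 2)
print(repeated)
```

Every time step receives the same input vector, so any variation across the output steps has to come from the recurrent state.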

Understanding stateful = True

Now comes one of the possible usages of stateful=True (besides avoiding loading data that can’t fit your computer’s memory at once)

Stateful allows us to input “parts” of the sequences in stages. The difference is:

  • In stateful=False, the second batch contains whole new sequences, independent from the first batch
  • In stateful=True, the second batch continues the first batch, extending the same sequences.

It’s like dividing the sequences in windows too, with these two main differences:

  • these windows do not overlap!
  • stateful=True will see these windows connected as a single long sequence

In stateful=True, every new batch will be interpreted as continuing the previous batch (until you call model.reset_states()).

  • Sequence 1 in batch 2 will continue sequence 1 in batch 1.
  • Sequence 2 in batch 2 will continue sequence 2 in batch 1.
  • Sequence n in batch 2 will continue sequence n in batch 1.

Example of inputs, batch 1 contains steps 1 and 2, batch 2 contains steps 3 to 5:

                   BATCH 1                           BATCH 2
        [     Step1      Step2        |    [    Step3      Step4      Step5
Tank A:    [[Pa1,Ta1], [Pa2,Ta2],     |       [Pa3,Ta3], [Pa4,Ta4], [Pa5,Ta5]],
Tank B:    [[Pb1,Tb1], [Pb2,Tb2],     |       [Pb3,Tb3], [Pb4,Tb4], [Pb5,Tb5]],
  ....                                |
Tank N:    [[Pn1,Tn1], [Pn2,Tn2],     |       [Pn3,Tn3], [Pn4,Tn4], [Pn5,Tn5]],
        ]                                  ]

Notice the alignment of tanks in batch 1 and batch 2! That’s why we need shuffle=False (unless we are using only one sequence, of course).

You can have any number of batches, indefinitely. (To allow variable lengths in each batch, use input_shape=(None,features).)
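The effect of carrying state across batches can be sketched with a toy cell. This is not an LSTM: `ToyStatefulCell` and its update rule are assumptions for illustration, with `reset_states` named after the Keras method it imitates.

```python
class ToyStatefulCell:
    """A toy recurrence (not an LSTM) that keeps its state between calls."""

    def __init__(self):
        self.state = 0.0

    def reset_states(self):
        self.state = 0.0

    def run(self, chunk):
        outputs = []
        for x in chunk:
            self.state = 0.5 * self.state + x
            outputs.append(self.state)
        return outputs


sequence = [1.0, 2.0, 3.0, 4.0, 5.0]
cell = ToyStatefulCell()

# The whole sequence in one go.
whole = cell.run(sequence)

# Stateful behavior: steps 1-2, then steps 3-5, with the state carried over.
cell.reset_states()
in_two_batches = cell.run(sequence[:2]) + cell.run(sequence[2:])

# Non-stateful behavior: the state is reset before every batch.
cell.reset_states()
independent = cell.run(sequence[:2])
cell.reset_states()
independent += cell.run(sequence[2:])

print(whole == in_two_batches, whole == independent)
```

With the state carried over, the two batches reproduce the single long sequence exactly; with a reset in between, the second batch starts from scratch.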

One to many with stateful=True

For our case here, we are going to use only 1 step per batch, because we want to get one output step and feed it back in as the next input.

Please notice that the behavior in the picture is not “caused by” stateful=True. We will force that behavior in a manual loop below. In this example, stateful=True is what “allows” us to stop the sequence, manipulate what we want, and continue from where we stopped.

OneToManyStateful

Honestly, the repeat approach is probably a better choice for this case. But since we’re looking into stateful=True, this is a good example. The best way to use this is the next “many to many” case.

Layer:

outputs = LSTM(units=features, 
               stateful=True, 
               return_sequences=True, #just to keep a nice output shape even with length 1
               input_shape=(None,features))(inputs) 
    #units = features because we want to use the outputs as inputs
    #None because we want variable length

#output_shape -> (batch_size, steps, units) 

Now, we’re going to need a manual loop for predictions:

input_data = someDataWithShape((batch, 1, features))

#important, we're starting new sequences, not continuing old ones:
model.reset_states()

output_sequence = []
last_step = input_data
for i in range(steps_to_predict): #assuming steps_to_predict is the number of future steps

    new_step = model.predict(last_step)
    output_sequence.append(new_step)
    last_step = new_step

#end of the sequences
model.reset_states()
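The feedback pattern in this loop does not depend on Keras. The sketch below runs it with a stand-in predictor; `predict_future` and `add_one` are hypothetical names, and `add_one` merely plays the role that model.predict plays above.

```python
def predict_future(predict_next, first_step, steps_to_predict):
    """Feed each prediction back in as the next input, collecting every step."""
    output_sequence = [first_step]
    last_step = first_step
    for _ in range(steps_to_predict):
        new_step = predict_next(last_step)
        output_sequence.append(new_step)
        last_step = new_step
    return output_sequence


def add_one(step):
    """Stand-in for a trained model: always predicts the input plus one."""
    return step + 1.0


future = predict_future(add_one, first_step=5.0, steps_to_predict=3)
print(future)
```

Because each prediction is built on the previous one, any error the predictor makes is carried forward into all later steps.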

Many to many with stateful=True

Now, here, we get a very nice application: given an input sequence, try to predict its future unknown steps.

We’re using the same method as in the “one to many” above, with the difference that:

  • we will use the sequence itself to be the target data, one step ahead
  • we know part of the sequence (so we discard this part of the results).

ManyToManyStateful

Layer (same as above):

outputs = LSTM(units=features, 
               stateful=True, 
               return_sequences=True, 
               input_shape=(None,features))(inputs) 
    #units = features because we want to use the outputs as inputs
    #None because we want variable length

#output_shape -> (batch_size, steps, units) 

Training:

We are going to train our model to predict the next step of the sequences:

totalSequences = someSequencesShaped((batch, steps, features))
    #batch size is usually 1 in these cases (often you have only one Tank in the example)

X = totalSequences[:,:-1] #the entire known sequence, except the last step
Y = totalSequences[:,1:] #one step ahead of X

#loop for resetting states at the start/end of the sequences:
for epoch in range(epochs):
    model.reset_states()
    model.train_on_batch(X,Y)
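The one-step shift between X and Y can be checked on plain lists. The numbers below are made up, and the nested-list slicing stands in for the array slicing used in the training code.

```python
# Hypothetical batch: 2 sequences, 4 steps, 1 feature each.
total_sequences = [
    [[10.0], [11.0], [12.0], [13.0]],
    [[20.0], [21.0], [22.0], [23.0]],
]

# Equivalent of totalSequences[:, :-1] and totalSequences[:, 1:] for nested lists.
X = [seq[:-1] for seq in total_sequences]   # every step except the last
Y = [seq[1:] for seq in total_sequences]    # the same steps, shifted one ahead

for x_seq, y_seq in zip(X, Y):
    print(x_seq, "->", y_seq)
```

At every position, the target is simply the input that comes next in the original sequence.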

Predicting:

The first stage of our prediction involves “adjusting the states”. That’s why we’re going to predict the entire sequence again, even though we already know this part of it:

model.reset_states() #starting a new sequence
predicted = model.predict(totalSequences)
firstNewStep = predicted[:,-1:] #the last step of the predictions is the first future step

Now we go into the loop, as in the one to many case. But don’t reset states here! We want the model to know which step of the sequence it is at (and it knows it’s at the first new step because of the prediction we just made above).

output_sequence = [firstNewStep]
last_step = firstNewStep
for i in range(steps_to_predict): #assuming steps_to_predict is the number of future steps

    new_step = model.predict(last_step)
    output_sequence.append(new_step)
    last_step = new_step

#end of the sequences
model.reset_states()

This approach was used in these answers and file:

Achieving complex configurations

In all examples above, I showed the behavior of “one layer”.

You can, of course, stack many layers on top of each other, not necessarily all following the same pattern, and create your own models.

One interesting example that has been appearing is the “autoencoder” that has a “many to one encoder” followed by a “one to many” decoder:

Encoder:

inputs = Input((steps,features))

#a few many to many layers:
outputs = LSTM(hidden1,return_sequences=True)(inputs)
outputs = LSTM(hidden2,return_sequences=True)(outputs)    

#many to one layer:
outputs = LSTM(hidden3)(outputs)

encoder = Model(inputs,outputs)

Decoder:

Using the “repeat” method:

inputs = Input((hidden3,))

#repeat to make one to many:
outputs = RepeatVector(steps)(inputs)

#a few many to many layers:
outputs = LSTM(hidden4,return_sequences=True)(outputs)

#last layer
outputs = LSTM(features,return_sequences=True)(outputs)

decoder = Model(inputs,outputs)

Autoencoder:

inputs = Input((steps,features))
outputs = encoder(inputs)
outputs = decoder(outputs)

autoencoder = Model(inputs,outputs)

Train with fit(X,X)

Additional explanations

If you want details about how steps are calculated in LSTMs, or details about the stateful=True cases above, you can read more in this answer: Doubts regarding `Understanding Keras LSTMs`


回答 2

当您在RNN的最后一层中有return_sequences时,您不能使用简单的Dense层,而应使用TimeDistributed。

这是一段示例代码,可能会对其他人有所帮助。

    words = keras.layers.Input(batch_shape=(None, self.maxSequenceLength), name = "input")

    # Build a matrix of size vocabularySize x EmbeddingDimension 
    # where each row corresponds to a "word embedding" vector.
    # This layer will replace each word-id with a word-vector of size EmbeddingDimension.
    embeddings = keras.layers.embeddings.Embedding(self.vocabularySize, self.EmbeddingDimension,
        name = "embeddings")(words)
    # Pass the word-vectors to the GRU layers.
    # We are setting the hidden-state size to 512.
    # The output will be batchSize x maxSequenceLength x hiddenStateSize
    hiddenStates = keras.layers.GRU(512, return_sequences = True, 
                                        input_shape=(self.maxSequenceLength,
                                        self.EmbeddingDimension),
                                        name = "rnn")(embeddings)
    hiddenStates2 = keras.layers.GRU(128, return_sequences = True, 
                                        input_shape=(self.maxSequenceLength, self.EmbeddingDimension),
                                        name = "rnn2")(hiddenStates)

    denseOutput = TimeDistributed(keras.layers.Dense(self.vocabularySize), 
        name = "linear")(hiddenStates2)
    predictions = TimeDistributed(keras.layers.Activation("softmax"), 
        name = "softmax")(denseOutput)  

    # Build the computational graph by specifying the input, and output of the network.
    model = keras.models.Model(inputs = words, outputs = predictions)
    # model.compile(loss='kullback_leibler_divergence', \
    model.compile(loss='sparse_categorical_crossentropy', \
        optimizer = keras.optimizers.Adam(lr=0.009, \
            beta_1=0.9,\
            beta_2=0.999, \
            epsilon=None, \
            decay=0.01, \
            amsgrad=False))

When you set return_sequences=True in the last RNN layer, you cannot use a plain Dense layer; use TimeDistributed instead.

Here is an example piece of code that might help others.

    words = keras.layers.Input(batch_shape=(None, self.maxSequenceLength), name = "input")

    # Build a matrix of size vocabularySize x EmbeddingDimension 
    # where each row corresponds to a "word embedding" vector.
    # This layer will replace each word-id with a word-vector of size EmbeddingDimension.
    embeddings = keras.layers.embeddings.Embedding(self.vocabularySize, self.EmbeddingDimension,
        name = "embeddings")(words)
    # Pass the word-vectors to the GRU layers.
    # We are setting the hidden-state size to 512.
    # The output will be batchSize x maxSequenceLength x hiddenStateSize
    hiddenStates = keras.layers.GRU(512, return_sequences = True, 
                                        input_shape=(self.maxSequenceLength,
                                        self.EmbeddingDimension),
                                        name = "rnn")(embeddings)
    hiddenStates2 = keras.layers.GRU(128, return_sequences = True, 
                                        input_shape=(self.maxSequenceLength, self.EmbeddingDimension),
                                        name = "rnn2")(hiddenStates)

    denseOutput = TimeDistributed(keras.layers.Dense(self.vocabularySize), 
        name = "linear")(hiddenStates2)
    predictions = TimeDistributed(keras.layers.Activation("softmax"), 
        name = "softmax")(denseOutput)  

    # Build the computational graph by specifying the input, and output of the network.
    model = keras.models.Model(inputs = words, outputs = predictions)
    # model.compile(loss='kullback_leibler_divergence', \
    model.compile(loss='sparse_categorical_crossentropy', \
        optimizer = keras.optimizers.Adam(lr=0.009, \
            beta_1=0.9,\
            beta_2=0.999, \
            epsilon=None, \
            decay=0.01, \
            amsgrad=False))

如何在Python中获取“时区感知”的datetime.today()值?

问题:如何在Python中获取“时区感知”的datetime.today()值?

我正在尝试从的值中减去一个日期值,datetime.today()以计算某物是多久以前的。但它抱怨:

TypeError: can't subtract offset-naive and offset-aware datetimes

该值datetime.today()似乎不是“时区感知”的,而我的其他日期值是。如何获得datetime.today()时区感知的值?

现在,这给了我当地时间,正好是PST,即UTC-8个小时。最坏的情况是,有没有一种方法可以手动将时区值输入datetime返回的对象datetime.today()并将其设置为UTC-8?

当然,理想的解决方案是让它自动知道时区。

I am trying to subtract one date value from the value of datetime.today() to calculate how long ago something was. But it complains:

TypeError: can't subtract offset-naive and offset-aware datetimes

The value datetime.today() doesn’t seem to be “timezone aware”, while my other date value is. How do I get a value of datetime.today() that is timezone aware?

Right now, it’s giving me the time in local time, which happens to be PST, i.e. UTC – 8 hours. Worst case, is there a way I can manually enter a timezone value into the datetime object returned by datetime.today() and set it to UTC-8?

Of course, the ideal solution would be for it to automatically know the timezone.


回答 0

在标准库中,没有跨平台的方法来创建感知时区而不创建自己的时区类。

在Windows上有win32timezone.utcnow(),但这是pywin32的一部分。我宁愿建议使用pytz库,该库具有大多数时区的不断更新的数据库。

使用本地时区可能非常棘手(请参见下面的“更多阅读”链接),因此您可能希望在整个应用程序中使用UTC,尤其是对于算术运算(如计算两个时间点之间的差)。

您可以像这样获取当前日期/时间:

import pytz
from datetime import datetime
datetime.utcnow().replace(tzinfo=pytz.utc)

记住这一点datetime.today()datetime.now()返回本地时间,而不是UTC时间,因此.replace(tzinfo=pytz.utc)向他们申请是不正确的。

另一个好的方法是:

datetime.now(pytz.utc)

这有点短,并且执行相同的操作。


进一步阅读/观看为什么在许多情况下更喜欢UTC:

In the standard library, there is no cross-platform way to create aware timezones without creating your own timezone class.

On Windows, there’s win32timezone.utcnow(), but that’s part of pywin32. I would rather suggest using the pytz library, which has a constantly updated database of most timezones.

Working with local timezones can be very tricky (see “Further reading” links below), so you may rather want to use UTC throughout your application, especially for arithmetic operations like calculating the difference between two time points.

You can get the current date/time like so:

import pytz
from datetime import datetime
datetime.utcnow().replace(tzinfo=pytz.utc)

Mind that datetime.today() and datetime.now() return the local time, not the UTC time, so applying .replace(tzinfo=pytz.utc) to them would not be correct.

Another nice way to do it is:

datetime.now(pytz.utc)

which is a bit shorter and does the same thing.


Further reading/watching why to prefer UTC in many cases:


回答 1

获取特定时区的当前时间:

import datetime
import pytz
my_date = datetime.datetime.now(pytz.timezone('US/Pacific'))

Get the current time, in a specific timezone:

import datetime
import pytz
my_date = datetime.datetime.now(pytz.timezone('US/Pacific'))

回答 2

在Python 3中,标准库使将UTC指定为时区变得容易得多:

>>> import datetime
>>> datetime.datetime.now(datetime.timezone.utc)
datetime.datetime(2016, 8, 26, 14, 34, 34, 74823, tzinfo=datetime.timezone.utc)

如果您想要一个仅使用标准库并且在Python 2和Python 3中均可使用的解决方案,请参见jfs的答案

如果您需要当地时区而不是UTC,请参见MihaiCapotă的答案

In Python 3, the standard library makes it much easier to specify UTC as the timezone:

>>> import datetime
>>> datetime.datetime.now(datetime.timezone.utc)
datetime.datetime(2016, 8, 26, 14, 34, 34, 74823, tzinfo=datetime.timezone.utc)
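To tie this back to the TypeError in the question, the sketch below subtracts an earlier aware datetime from the current aware time, and shows that mixing in a naive value fails. The earlier timestamp is an arbitrary example value.

```python
from datetime import datetime, timezone

# An arbitrary earlier moment, stored as an aware datetime.
aware_then = datetime(2016, 8, 26, 12, 0, 0, tzinfo=timezone.utc)

# Aware minus aware works and yields a timedelta.
aware_now = datetime.now(timezone.utc)
elapsed = aware_now - aware_then
print(elapsed)

# Naive minus aware raises the TypeError from the question.
try:
    datetime.today() - aware_then
    mixed_ok = True
except TypeError as error:
    mixed_ok = False
    print(error)
```

The subtraction only works when both operands are aware (or both naive), which is why making the "now" side aware resolves the error.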

If you want a solution that uses only the standard library and that works in both Python 2 and Python 3, see jfs’ answer.

If you need the local timezone, not UTC, see Mihai Capotă’s answer


回答 3

这是一个适用于Python 2和3的stdlib解决方案:

from datetime import datetime

now = datetime.now(utc) # Timezone-aware datetime.utcnow()
today = datetime(now.year, now.month, now.day, tzinfo=utc) # Midnight

其中today是一个已知的datetime实例,表示UTC中的一天的开始(午夜),并且utc是tzinfo对象(来自文档的示例):

from datetime import tzinfo, timedelta

ZERO = timedelta(0)

class UTC(tzinfo):
    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO

utc = UTC()

相关:在给定UTC时间获得午夜(一天的开始)几种方法的性能比较。注意:要获取具有非固定UTC偏移量的时区的午夜更为复杂。

Here’s a stdlib solution that works on both Python 2 and 3:

from datetime import datetime

now = datetime.now(utc) # Timezone-aware datetime.utcnow()
today = datetime(now.year, now.month, now.day, tzinfo=utc) # Midnight

where today is an aware datetime instance representing the beginning of the day (midnight) in UTC and utc is a tzinfo object (example from the documentation):

from datetime import tzinfo, timedelta

ZERO = timedelta(0)

class UTC(tzinfo):
    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO

utc = UTC()

Related: performance comparison of several ways to get midnight (start of a day) for a given UTC time. Note: it is more complex to get midnight for a time zone with a non-fixed UTC offset.
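On Python 3 alone, the same UTC midnight can be obtained without a custom tzinfo class. This is a sketch that assumes datetime.timezone.utc is available (Python 3.2+); `utc_midnight` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def utc_midnight(moment):
    """Return the start of `moment`'s UTC day; expects an aware datetime."""
    as_utc = moment.astimezone(timezone.utc)
    return as_utc.replace(hour=0, minute=0, second=0, microsecond=0)


sample = datetime(2016, 8, 26, 14, 34, 34, 74823, tzinfo=timezone.utc)
print(utc_midnight(sample))

# 01:00 at UTC+3 is still the previous day in UTC.
early_local = datetime(2016, 8, 26, 1, 0, tzinfo=timezone(timedelta(hours=3)))
print(utc_midnight(early_local))
```

Converting to UTC before truncating matters: the second example lands on 25 August, not 26 August.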


回答 4

构造表示当前时间的时区感知日期时间对象的另一种方法:

import datetime
import pytz

pytz.utc.localize( datetime.datetime.utcnow() )  

Another method to construct time zone aware datetime object representing current time:

import datetime
import pytz

pytz.utc.localize( datetime.datetime.utcnow() )  

回答 5

从Python 3.3开始,仅使用标准库的单行代码就可以使用。您可以datetime使用来获取本地时区感知对象astimezone(如johnchen902建议):

from datetime import datetime, timezone

aware_local_now = datetime.now(timezone.utc).astimezone()

print(aware_local_now)
# 2020-03-03 09:51:38.570162+01:00

print(repr(aware_local_now))
# datetime.datetime(2020, 3, 3, 9, 51, 38, 570162, tzinfo=datetime.timezone(datetime.timedelta(0, 3600), 'CET'))

A one-liner using only the standard library works starting with Python 3.3. You can get a local timezone aware datetime object using astimezone (as suggested by johnchen902):

from datetime import datetime, timezone

aware_local_now = datetime.now(timezone.utc).astimezone()

print(aware_local_now)
# 2020-03-03 09:51:38.570162+01:00

print(repr(aware_local_now))
# datetime.datetime(2020, 3, 3, 9, 51, 38, 570162, tzinfo=datetime.timezone(datetime.timedelta(0, 3600), 'CET'))
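A quick way to convince yourself that astimezone() only changes the representation, not the moment in time, is to compare the local result with the UTC value it came from. The variable names are just for this sketch.

```python
from datetime import datetime, timezone

utc_now = datetime.now(timezone.utc)
local_now = utc_now.astimezone()   # no argument: convert to the system's local zone

# Aware datetimes compare by instant, so these are equal even if the clocks differ.
same_instant = (local_now == utc_now)
offset = local_now.utcoffset()

print(same_instant, offset)
```

The printed offset depends on the machine's timezone settings, so it will differ from system to system.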

回答 6

如果您使用的是Django,则可以将日期设置为非tz感知(仅UTC)。

在settings.py中注释以下行:

USE_TZ = True

If you are using Django, you can make dates non-timezone-aware (UTC only).

Comment out the following line in settings.py:

USE_TZ = True

回答 7

pytz是一个Python库,可以使用Python 2.3或更高版本进行准确的跨平台时区计算。

使用stdlib,这是不可能的。

SO上看到类似的问题。

pytz is a Python library that allows accurate and cross platform timezone calculations using Python 2.3 or higher.

With the stdlib, this is not possible.

See a similar question on SO.


回答 8

这是使用stdlib生成它的一种方法:

import time
from datetime import datetime

FORMAT='%Y-%m-%dT%H:%M:%S%z'
date=datetime.strptime(time.strftime(FORMAT, time.localtime()),FORMAT)

date将存储本地日期和相对于UTC偏移量,而不是UTC时区的日期,因此,如果需要确定日期在哪个时区生成,可以使用此解决方案。在此示例中以及我的本地时区:

date
datetime.datetime(2017, 8, 1, 12, 15, 44, tzinfo=datetime.timezone(datetime.timedelta(0, 7200)))

date.tzname()
'UTC+02:00'

关键是将%z指令添加到表示形式FORMAT中,以指示生成的时间结构的UTC偏移量。其他表示形式可以在datetime模块文档中查询

如果您需要UTC时区的日期,则可以将time.localtime()替换为time.gmtime()

date=datetime.strptime(time.strftime(FORMAT, time.gmtime()),FORMAT)

date    
datetime.datetime(2017, 8, 1, 10, 23, 51, tzinfo=datetime.timezone.utc)

date.tzname()
'UTC'

编辑

这仅适用于python3。z指令在python 2 _strptime.py代码上不可用

Here is one way to generate it with the stdlib:

import time
from datetime import datetime

FORMAT='%Y-%m-%dT%H:%M:%S%z'
date=datetime.strptime(time.strftime(FORMAT, time.localtime()),FORMAT)

date will store the local date and the offset from UTC, not the date at UTC timezone, so you can use this solution if you need to identify which timezone the date is generated at. In this example and in my local timezone:

date
datetime.datetime(2017, 8, 1, 12, 15, 44, tzinfo=datetime.timezone(datetime.timedelta(0, 7200)))

date.tzname()
'UTC+02:00'

The key is adding the %z directive to the FORMAT string, to indicate the UTC offset of the generated time struct. Other format directives are listed in the datetime module docs.

If you need the date in the UTC timezone, you can replace time.localtime() with time.gmtime():

date=datetime.strptime(time.strftime(FORMAT, time.gmtime()),FORMAT)

date    
datetime.datetime(2017, 8, 1, 10, 23, 51, tzinfo=datetime.timezone.utc)

date.tzname()
'UTC'

Edit

This works only on Python 3. The %z directive is not available in the Python 2 _strptime.py code.
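The role of %z is easier to see with a fixed input string than with the current time. This Python 3 sketch parses a hard-coded example timestamp, so the result does not depend on the machine's timezone.

```python
from datetime import datetime, timedelta

FORMAT = '%Y-%m-%dT%H:%M:%S%z'

# A hard-coded example timestamp with a +02:00 offset.
parsed = datetime.strptime('2017-08-01T12:15:44+0200', FORMAT)

print(parsed.utcoffset())        # the offset recovered by %z
print(parsed.strftime(FORMAT))   # formats back to the same string
```

Without %z in the format, strptime would return a naive datetime and the offset would be lost.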


回答 9

使用可识别时区的Python datetime.datetime.now()中所述的dateutil :

import datetime
from dateutil.tz import tzlocal
# Get the current date/time with the timezone.
now = datetime.datetime.now(tzlocal())

Use dateutil as described in Python datetime.datetime.now() that is timezone aware:

import datetime
from dateutil.tz import tzlocal
# Get the current date/time with the timezone.
now = datetime.datetime.now(tzlocal())

回答 10

在时区中获取可识别时区的日期utc足以使日期减法起作用。

但是,如果您想在当前时区使用时区感知日期,tzlocal则可以采用以下方法:

from tzlocal import get_localzone  # pip install tzlocal
from datetime import datetime
datetime.now(get_localzone())

PS dateutil具有类似的功能(dateutil.tz.tzlocal)。但是,尽管共享名称,但它具有完全不同的代码库,正如JF Sebastian 指出的那样,可能会产生错误的结果。

Getting a timezone-aware date in utc timezone is enough for date subtraction to work.

But if you want a timezone-aware date in your current time zone, tzlocal is the way to go:

from tzlocal import get_localzone  # pip install tzlocal
from datetime import datetime
datetime.now(get_localzone())

P.S. dateutil has a similar function (dateutil.tz.tzlocal). But in spite of sharing the name, it has a completely different code base, which, as noted by J.F. Sebastian, can give wrong results.


回答 11

这是一个使用可读时区的解决方案,该解决方案适用于today():

from pytz import timezone

datetime.now(timezone('Europe/Berlin'))
datetime.now(timezone('Europe/Berlin')).today()

您可以列出所有时区,如下所示:

import pytz

pytz.all_timezones
pytz.common_timezones # or

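Here is a solution that uses a readable timezone name:

from datetime import datetime
from pytz import timezone

datetime.now(timezone('Europe/Berlin'))
# Caution: today() is a classmethod, so chaining it here returns a new, naive local datetime.
datetime.now(timezone('Europe/Berlin')).today()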

You can list all timezones as follows:

import pytz

pytz.all_timezones
pytz.common_timezones # or

回答 12

特别是对于非UTC时区:

唯一具有自己方法的时区是timezone.utc,但是如果需要,您可以使用timedeltatimezone,并使用强制使用UTC偏移量来伪装时区.replace

In [1]: from datetime import datetime, timezone, timedelta

In [2]: def force_timezone(dt, utc_offset=0):
   ...:     return dt.replace(tzinfo=timezone(timedelta(hours=utc_offset)))
   ...:

In [3]: dt = datetime(2011,8,15,8,15,12,0)

In [4]: str(dt)
Out[4]: '2011-08-15 08:15:12'

In [5]: str(force_timezone(dt, -8))
Out[5]: '2011-08-15 08:15:12-08:00'

在这里使用timezone(timedelta(hours=n))时区是真正的灵丹妙药,它还有许多其他有用的应用程序。

Especially for non-UTC timezones:

The only timezone that has its own method is timezone.utc, but you can fudge a timezone with any UTC offset if you need to by using timedelta & timezone, and forcing it using .replace.

In [1]: from datetime import datetime, timezone, timedelta

In [2]: def force_timezone(dt, utc_offset=0):
   ...:     return dt.replace(tzinfo=timezone(timedelta(hours=utc_offset)))
   ...:

In [3]: dt = datetime(2011,8,15,8,15,12,0)

In [4]: str(dt)
Out[4]: '2011-08-15 08:15:12'

In [5]: str(force_timezone(dt, -8))
Out[5]: '2011-08-15 08:15:12-08:00'

Using timezone(timedelta(hours=n)) as the time zone is the real silver bullet here, and it has lots of other useful applications.
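One caveat worth seeing in code: .replace(tzinfo=...) relabels the wall-clock time, while .astimezone(...) converts the same instant. The sketch below contrasts the two on an aware UTC value; the fixed UTC-8 zone named `pst` is an assumption for this example.

```python
from datetime import datetime, timedelta, timezone

pst = timezone(timedelta(hours=-8))   # fixed UTC-8 offset, no DST handling
utc_dt = datetime(2011, 8, 15, 8, 15, 12, tzinfo=timezone.utc)

relabelled = utc_dt.replace(tzinfo=pst)   # same wall clock, different instant
converted = utc_dt.astimezone(pst)        # same instant, different wall clock

print(relabelled)
print(converted)
print(relabelled - utc_dt)
```

Forcing the timezone with replace is appropriate when a naive value is already known to be in that zone; use astimezone when the datetime is already aware.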


回答 13

如果您在python中获得了当前时间和日期,则在导入日期和时间后,您将在python中获得当前日期和时间,如下所示。

from datetime import datetime
import pytz
import time
str(datetime.strftime(datetime.now(pytz.utc),"%Y-%m-%d %H:%M:%S%t"))

To get the current date and time in Python, import datetime and the pytz package; you can then format the current UTC time like this:

from datetime import datetime
import pytz
import time
str(datetime.strftime(datetime.now(pytz.utc),"%Y-%m-%d %H:%M:%S%z"))  # %z appends the UTC offset, e.g. +0000

回答 14

Another alternative, in my mind a better one, is using Pendulum instead of pytz. Consider the following simple code:

>>> import pendulum

>>> dt = pendulum.now().to_iso8601_string()
>>> print (dt)
2018-03-27T13:59:49+03:00
>>>

To install Pendulum and see its documentation, go here. It has tons of options (such as simple ISO 8601, RFC 3339 and many other format support), better performance, and tends to yield simpler code.


Answer 15

Use the timezone as shown below for a timezone-aware date time. The default is UTC:

from django.utils import timezone
today = timezone.now()

Answer 16

Tyler from ‘howchoo’ wrote a really great article that helped me get a better idea of datetime objects; link below:

Working with Datetime

Essentially, I just added the following to the end of both my datetime objects:

.replace(tzinfo=pytz.utc)

Example:

import pytz
from datetime import datetime

# replace() only attaches the UTC label; it does not convert the value,
# so this is correct only when the naive value is already in UTC
date = datetime.now().replace(tzinfo=pytz.utc)

Answer 17

Try pnp_datetime: every time value it accepts and returns is timezone-aware, so it does not cause offset-naive versus offset-aware issues.

>>> from pnp_datetime.pnp_datetime import Pnp_Datetime
>>>
>>> Pnp_Datetime.utcnow()
datetime.datetime(2020, 6, 5, 12, 26, 18, 958779, tzinfo=<UTC>)

Answer 18

It should be emphasized that since Python 3.6, you only need the standard library to get a timezone-aware datetime object that represents local time (the setting of your OS). Use astimezone():

import datetime

datetime.datetime(2010, 12, 25, 10, 59).astimezone()
# e.g.
# datetime.datetime(2010, 12, 25, 10, 59, tzinfo=datetime.timezone(datetime.timedelta(seconds=3600), 'Mitteleuropäische Zeit'))

datetime.datetime(2010, 12, 25, 12, 59).astimezone().isoformat()
# e.g.
# '2010-12-25T12:59:00+01:00'

# I'm on CET/CEST

(see @johnchen902’s comment). Note there’s a small caveat though, astimezone(None) gives aware datetime, unaware of DST.
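
For the current time specifically, the same standard-library approach can be sketched as follows. The variable names are illustrative, and the local offset printed depends on the machine's settings, so no value is shown for it:

```python
from datetime import datetime, timezone

utc_now = datetime.now(timezone.utc)   # aware datetime in UTC
local_now = utc_now.astimezone()       # same instant in the OS-configured local zone

print(utc_now.tzinfo)                  # UTC
print(local_now.utcoffset())           # depends on the machine's settings
print(utc_now == local_now)            # True: aware datetimes compare by instant
```

Both objects are aware, so they can be compared and subtracted without the offset-naive versus offset-aware TypeError.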


What is a “callable”?

Question: What is a “callable”?

Now that it’s clear what a metaclass is, there is an associated concept that I use all the time without knowing what it really means.

I suppose everybody has at some point made a mistake with parentheses, resulting in an “object is not callable” exception. What’s more, using __init__ and __new__ leads one to wonder what this bloody __call__ can be used for.

Could you give me some explanations, including examples with the magic method?


Answer 0

A callable is anything that can be called.

The built-in callable (PyCallable_Check in object.c) checks whether the argument is either:

  • an instance of a class with a __call__ method, or
  • of a type that has a non-null tp_call (C struct) member, which indicates callability otherwise (such as in functions, methods, etc.)

The method named __call__ is (according to the documentation)

Called when the instance is “called” as a function

Example

class Foo:
  def __call__(self):
    print('called')

foo_instance = Foo()
foo_instance()  # this calls the __call__ method

Answer 1

From Python’s sources object.c:

/* Test whether an object can be called */

int
PyCallable_Check(PyObject *x)
{
    if (x == NULL)
        return 0;
    if (PyInstance_Check(x)) {
        PyObject *call = PyObject_GetAttrString(x, "__call__");
        if (call == NULL) {
            PyErr_Clear();
            return 0;
        }
        /* Could test recursively but don't, for fear of endless
           recursion if some joker sets self.__call__ = self */
        Py_DECREF(call);
        return 1;
    }
    else {
        return x->ob_type->tp_call != NULL;
    }
}

It says:

  1. If an object is an instance of some class then it is callable iff it has __call__ attribute.
  2. Else the object x is callable iff x->ob_type->tp_call != NULL

Description of the tp_call field:

ternaryfunc tp_call An optional pointer to a function that implements calling the object. This should be NULL if the object is not callable. The signature is the same as for PyObject_Call(). This field is inherited by subtypes.

You can always use the built-in callable function to determine whether a given object is callable; or, better yet, just call it and catch TypeError later. callable was removed in Python 3.0 and 3.1 (it returned in Python 3.2); on those versions use callable = lambda o: hasattr(o, '__call__') or isinstance(o, collections.Callable).

Example, a simplistic cache implementation:

class Cached:
    def __init__(self, function):
        self.function = function
        self.cache = {}

    def __call__(self, *args):
        try: return self.cache[args]
        except KeyError:
            ret = self.cache[args] = self.function(*args)
            return ret    

Usage:

@Cached
def ack(x, y):
    return ack(x-1, ack(x, y-1)) if x*y else (x + y + 1) 
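
The same idea can be sketched for Python 3 with a counter that shows when the cache is actually hit. The names CountingCache, misses and fib are illustrative and not part of the answer above:

```python
class CountingCache:
    """Callable wrapper that memoizes results keyed by positional arguments."""

    def __init__(self, function):
        self.function = function
        self.cache = {}
        self.misses = 0  # number of times the wrapped function actually ran

    def __call__(self, *args):
        if args not in self.cache:
            self.misses += 1
            self.cache[args] = self.function(*args)
        return self.cache[args]

@CountingCache
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))        # 832040
print(fib.misses)     # 31: each n from 0 to 30 is computed once
print(callable(fib))  # True: the instance defines __call__
```

After decoration the name fib refers to a CountingCache instance, so the recursive calls inside the function body also go through __call__ and benefit from the cache. For real code, functools.lru_cache in the standard library offers the same memoization.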

Example from standard library, file site.py, definition of built-in exit() and quit() functions:

class Quitter(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Use %s() or %s to exit' % (self.name, eof)
    def __call__(self, code=None):
        # Shells like IDLE catch the SystemExit, but listen when their
        # stdin wrapper is closed.
        try:
            sys.stdin.close()
        except:
            pass
        raise SystemExit(code)
__builtin__.quit = Quitter('quit')
__builtin__.exit = Quitter('exit')

Answer 2

A callable is an object that lets you use round parentheses ( ) and optionally pass some parameters, just like a function.

Every time you define a function, Python creates a callable object. For example, you could define func in either of these ways (the result is the same):

class a(object):
    def __call__(self, *args):
        print('Hello')

func = a()

# or ... 
def func(*args):
    print('Hello')

You could use this method instead of methods like doit or run; I think obj() is clearer to read than obj.doit().


Answer 3

Let me explain backwards:

Consider this…

foo()

… as syntactic sugar for:

foo.__call__()

Where foo can be any object that responds to __call__. When I say any object, I mean it: built-in types, your own classes and their instances.

In the case of built-in types, when you write:

int('10')
unicode(10)

You’re essentially doing:

int.__call__('10')
unicode.__call__(10)

That’s also why you don’t have foo = new int in Python: you just make the class object return an instance of it on __call__. The way Python solves this is very elegant in my opinion.
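
The point about classes can be checked directly in Python 3. The sketch below uses a hypothetical Point class; it illustrates that calling a class is handled by the __call__ of the class's own type, which in turn creates and initialises the instance:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p1 = Point(1, 2)                        # the usual spelling
p2 = type(Point).__call__(Point, 1, 2)  # roughly what the call expands to

print(type(Point) is type)              # True
print((p1.x, p1.y) == (p2.x, p2.y))     # True
print(int.__call__('10'))               # 10
```

Note that the lookup happens on the type of the object being called, which is why type(Point).__call__ is used here rather than an attribute stored on an instance.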


Answer 4

A Callable is an object that has the __call__ method. This means you can fake callable functions or do neat things like Partial Function Application where you take a function and add something that enhances it or fills in some of the parameters, returning something that can be called in turn (known as Currying in functional programming circles).

Certain typographic errors will have the interpreter attempting to call something you did not intend, such as a string. This produces an error because the interpreter attempts to call a non-callable object. You can see this happening in a Python interpreter with something like the transcript below.

[nigel@k9 ~]$ python
Python 2.5 (r25:51908, Nov  6 2007, 15:55:44) 
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 'aaa'()    # <== Here we attempt to call a string.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
>>> 
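
As a small illustration of partial function application, here is a sketch using functools.partial from the standard library. The function power and the name square are made up for the example. It also reproduces the error from the transcript in a form that can be caught:

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)  # fills in one parameter, returns a new callable

print(callable(square))  # True
print(square(7))         # 49

text = 'aaa'
try:
    text()               # a str has no __call__, so this fails
except TypeError as exc:
    print(exc)           # 'str' object is not callable
```

The object returned by partial is not a function, but it defines __call__ and so can be used wherever a function is expected.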

Answer 5

__call__ makes any object callable as a function.

This example will output 8:

class Adder(object):
  def __init__(self, val):
    self.val = val

  def __call__(self, val):
    return self.val + val

func = Adder(5)
print(func(3))

Answer 6

Quite simply, a “callable” is something that can be called like a method. The built-in function callable() will tell you whether something appears to be callable, as will checking for a __call__ attribute. Functions are callable, as are classes; class instances can be callable. See more about this here and here.


Answer 7

In Python, a callable is an object whose type has a __call__ method:

>>> class Foo:
...  pass
... 
>>> class Bar(object):
...  pass
... 
>>> type(Foo).__call__(Foo)
<__main__.Foo instance at 0x711440>
>>> type(Bar).__call__(Bar)
<__main__.Bar object at 0x712110>
>>> def foo(bar):
...  return bar
... 
>>> type(foo).__call__(foo, 42)
42

As simple as that :)

This of course can be overloaded:

>>> class Foo(object):
...  def __call__(self):
...   return 42
... 
>>> f = Foo()
>>> f()
42

Answer 8

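To check whether a function or a method of a class is callable, that is, whether we can call it, use the built-in callable():

class A:
    def __init__(self, val):
        self.val = val
    def bar(self):
        print("bar")

obj = A(1)
callable(obj.bar)       # True: a bound method
callable(obj.__init__)  # True: __init__ is a bound method too
callable(obj.val)       # False: a plain int attribute
def foo(): return "s"
callable(foo)           # True
callable(foo())         # False: the returned string is not callable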

Answer 9

It’s something you can put “(args)” after and expect it to work. A callable is usually a method or a class. Methods get called, classes get instantiated.


Answer 10

Callables implement the __call__ special method, so any object with such a method is callable.


Answer 11

callable is itself a built-in function, that is, an object of type builtin_function_or_method, which has a __call__ method:

>>> type(callable)
<class 'builtin_function_or_method'>
>>>

Example: print is a callable object. Like any built-in function, it has a __call__ method: when you invoke print, Python invokes that __call__ method, passing the arguments if any.

>>> type(print)
<class 'builtin_function_or_method'>
>>> print.__call__(10)
10
>>> print(10)
10
>>>

Thank you. Regards, Maris


Understanding __get__ and __set__ and Python descriptors

Question: Understanding __get__ and __set__ and Python descriptors

I am trying to understand what Python’s descriptors are and what they are useful for. I understand how they work, but here are my doubts. Consider the following code:

class Celsius(object):
    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)


class Temperature(object):
    celsius = Celsius()
  1. Why do I need the descriptor class?

  2. What is instance and owner here? (in __get__). What is the purpose of these parameters?

  3. How would I call/use this example?


Answer 0

The descriptor is how Python’s property type is implemented. A descriptor simply implements __get__, __set__, etc. and is then added to another class in its definition (as you did above with the Temperature class). For example:

temp=Temperature()
temp.celsius #calls celsius.__get__

Accessing the property you assigned the descriptor to (celsius in the above example) calls the appropriate descriptor method.

instance in __get__ is the instance of the class (so above, __get__ would receive temp), while owner is the class with the descriptor (so it would be Temperature).

You need to use a descriptor class to encapsulate the logic that powers it. That way, if the descriptor is used to cache some expensive operation (for example), it could store the value on itself and not its class.

An article about descriptors can be found here.

EDIT: As jchl pointed out in the comments, if you simply try Temperature.celsius, instance will be None.
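
The roles of instance and owner can be observed with a small probe descriptor. This is a sketch for illustration only: the Probe class is hypothetical, and Temperature is redefined here so the example is self-contained:

```python
class Probe:
    """Descriptor that simply reports the arguments __get__ receives."""
    def __get__(self, instance, owner):
        return (instance, owner)

class Temperature:
    celsius = Probe()

temp = Temperature()

inst, own = temp.celsius                 # accessed through an instance
print(inst is temp, own is Temperature)  # True True

inst, own = Temperature.celsius          # accessed through the class
print(inst is None, own is Temperature)  # True True
```

Access through the class passes None as instance, which is the case mentioned in the edit above.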


Answer 1

Why do I need the descriptor class?

It gives you extra control over how attributes work. If you’re used to getters and setters in Java, for example, then it’s Python’s way of doing that. One advantage is that it looks to users just like an attribute (there’s no change in syntax). So you can start with an ordinary attribute and then, when you need to do something fancy, switch to a descriptor.

An attribute is just a mutable value. A descriptor lets you execute arbitrary code when reading or setting (or deleting) a value. So you could imagine using it to map an attribute to a field in a database, for example – a kind of ORM.

Another use might be refusing to accept a new value by throwing an exception in __set__ – effectively making the “attribute” read only.
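
A minimal sketch of such a read-only attribute follows. The class names ReadOnly and Circle and the error message are assumptions made for the example:

```python
class ReadOnly:
    """Data descriptor that returns a fixed value and rejects assignment."""
    def __init__(self, value):
        self._value = value

    def __get__(self, instance, owner):
        return self._value

    def __set__(self, instance, value):
        raise AttributeError("this attribute is read only")

class Circle:
    pi = ReadOnly(3.14159)

c = Circle()
print(c.pi)        # 3.14159
try:
    c.pi = 3       # goes through ReadOnly.__set__
except AttributeError as exc:
    print(exc)     # this attribute is read only
```

This protects assignment through instances only. Assigning to Circle.pi on the class itself would replace the descriptor object rather than call __set__.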

What is instance and owner here? (in __get__). What is the purpose of these parameters?

This is pretty subtle (and the reason I am writing a new answer here – I found this question while wondering the same thing and didn’t find the existing answer that great).

A descriptor is defined on a class, but is typically called from an instance. When it’s called from an instance both instance and owner are set (and you can work out owner from instance so it seems kinda pointless). But when called from a class, only owner is set – which is why it’s there.

This is only needed for __get__ because it’s the only one that can be called on a class. If you set the class value you set the descriptor itself. Similarly for deletion. Which is why the owner isn’t needed there.

How would I call/use this example?

Well, here’s a cool trick using similar classes:

class Celsius:

    def __get__(self, instance, owner):
        return 5 * (instance.fahrenheit - 32) / 9

    def __set__(self, instance, value):
        instance.fahrenheit = 32 + 9 * value / 5


class Temperature:

    celsius = Celsius()

    def __init__(self, initial_f):
        self.fahrenheit = initial_f


t = Temperature(212)
print(t.celsius)
t.celsius = 0
print(t.fahrenheit)

(I’m using Python 3; for python 2 you need to make sure those divisions are / 5.0 and / 9.0). That gives:

100.0
32.0

Now there are other, arguably better ways to achieve the same effect in python (e.g. if celsius were a property, which is the same basic mechanism but places all the source inside the Temperature class), but that shows what can be done…


Answer 2

I am trying to understand what Python’s descriptors are and what they can be useful for.

Descriptors are class attributes (like properties or methods) with any of the following special methods:

  • __get__ (non-data descriptor method, for example on a method/function)
  • __set__ (data descriptor method, for example on a property instance)
  • __delete__ (data descriptor method)

These descriptor objects can be used as attributes on other object class definitions. (That is, they live in the __dict__ of the class object.)

Descriptor objects can be used to programmatically manage the results of a dotted lookup (e.g. foo.descriptor) in a normal expression, an assignment, and even a deletion.

Functions/methods, bound methods, property, classmethod, and staticmethod all use these special methods to control how they are accessed via the dotted lookup.

A data descriptor, like property, can allow for lazy evaluation of attributes based on a simpler state of the object, allowing instances to use less memory than if you precomputed each possible attribute.

Another data descriptor, a member_descriptor, created by __slots__, allows memory savings by letting the class store data in a mutable tuple-like data structure instead of the more flexible but space-consuming __dict__.
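
The descriptors created by __slots__ can be inspected directly. The sketch below assumes CPython, where the type name is member_descriptor; the Point class is made up for the example:

```python
class Point:
    __slots__ = ('x', 'y')   # no per-instance __dict__ is created

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(type(Point.__dict__['x']).__name__)  # member_descriptor (on CPython)
print(hasattr(p, '__dict__'))              # False
print(hasattr(Point.x, '__set__'))         # True: it is a data descriptor
```

Because the instance has no __dict__, assigning an attribute that is not listed in __slots__ raises AttributeError.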

Non-data descriptors, usually instance and class methods, get their implicit first arguments (usually named self and cls, respectively) from their non-data descriptor method, __get__. Static methods use the same mechanism to receive no implicit first argument.

Most users of Python need to learn only the simple usage, and have no need to learn or understand the implementation of descriptors further.

In Depth: What Are Descriptors?

A descriptor is an object with any of the following methods (__get__, __set__, or __delete__), intended to be used via dotted-lookup as if it were a typical attribute of an instance. For an owner-object, obj_instance, with a descriptor object:

  • obj_instance.descriptor invokes
    descriptor.__get__(self, obj_instance, owner_class) returning a value
    This is how all methods and the get on a property work.

  • obj_instance.descriptor = value invokes
    descriptor.__set__(self, obj_instance, value) returning None
    This is how the setter on a property works.

  • del obj_instance.descriptor invokes
    descriptor.__delete__(self, obj_instance) returning None
    This is how the deleter on a property works.

obj_instance is the instance whose class contains the descriptor object’s instance. self is the instance of the descriptor (probably just one for the class of the obj_instance)

To define this with code, an object is a descriptor if the set of its attributes intersects with any of the required attributes:

def has_descriptor_attrs(obj):
    return set(['__get__', '__set__', '__delete__']).intersection(dir(obj))

def is_descriptor(obj):
    """obj can be instance of descriptor or the descriptor class"""
    return bool(has_descriptor_attrs(obj))

A Data Descriptor has a __set__ and/or __delete__.
A Non-Data-Descriptor has neither __set__ nor __delete__.

def has_data_descriptor_attrs(obj):
    return set(['__set__', '__delete__']) & set(dir(obj))

def is_data_descriptor(obj):
    return bool(has_data_descriptor_attrs(obj))

Builtin Descriptor Object Examples:

  • classmethod
  • staticmethod
  • property
  • functions in general

Non-Data Descriptors

We can see that classmethod and staticmethod are Non-Data-Descriptors:

>>> is_descriptor(classmethod), is_data_descriptor(classmethod)
(True, False)
>>> is_descriptor(staticmethod), is_data_descriptor(staticmethod)
(True, False)

Both only have the __get__ method:

>>> has_descriptor_attrs(classmethod), has_descriptor_attrs(staticmethod)
(set(['__get__']), set(['__get__']))

Note that all functions are also Non-Data-Descriptors:

>>> def foo(): pass
... 
>>> is_descriptor(foo), is_data_descriptor(foo)
(True, False)

Data Descriptor, property

However, property is a Data-Descriptor:

>>> is_data_descriptor(property)
True
>>> has_descriptor_attrs(property)
set(['__set__', '__get__', '__delete__'])

Dotted Lookup Order

These are important distinctions, as they affect the lookup order for a dotted lookup.

obj_instance.attribute
  1. First the above looks to see if the attribute is a Data-Descriptor on the class of the instance,
  2. If not, it looks to see if the attribute is in the obj_instance‘s __dict__, then
  3. it finally falls back to a Non-Data-Descriptor.

The consequence of this lookup order is that Non-Data-Descriptors like functions/methods can be overridden by instances.
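
The three lookup steps above can be sketched with two toy descriptors. All class names are illustrative. Values are written straight into the instance __dict__ so that __set__ is bypassed and only the lookup order is tested:

```python
class DataDesc:
    def __get__(self, instance, owner):
        return 'data descriptor'
    def __set__(self, instance, value):
        raise AttributeError('read only')

class NonDataDesc:
    def __get__(self, instance, owner):
        return 'non-data descriptor'

class Owner:
    data = DataDesc()
    nondata = NonDataDesc()

obj = Owner()
# Write straight into the instance __dict__, bypassing __set__.
obj.__dict__['data'] = 'instance value'
obj.__dict__['nondata'] = 'instance value'

print(obj.data)     # data descriptor  (step 1 wins over the instance __dict__)
print(obj.nondata)  # instance value   (step 2 wins over the non-data descriptor)
```

This is the same mechanism that lets an instance attribute shadow a method of the same name, while a property cannot be shadowed that way.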

Recap and Next Steps

We have learned that descriptors are objects with any of __get__, __set__, or __delete__. These descriptor objects can be used as attributes on other object class definitions. Now we will look at how they are used, using your code as an example.


Analysis of Code from the Question

Here’s your code, followed by your questions and answers to each:

class Celsius(object):
    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)

class Temperature(object):
    celsius = Celsius()
  1. Why do I need the descriptor class?

Your descriptor ensures you always have a float for this class attribute of Temperature, and that you can’t use del to delete the attribute:

>>> t1 = Temperature()
>>> del t1.celsius
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __delete__

Otherwise, your descriptors ignore the owner-class and instances of the owner, instead, storing state in the descriptor. You could just as easily share state across all instances with a simple class attribute (so long as you always set it as a float to the class and never delete it, or are comfortable with users of your code doing so):

class Temperature(object):
    celsius = 0.0

The following gets you exactly the same behavior as your example (see the response to question 3 below), but uses a Python builtin (property), and would be considered more idiomatic:

class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)

  2. What is instance and owner here? (in __get__). What is the purpose of these parameters?

instance is the instance of the owner that is calling the descriptor. The owner is the class in which the descriptor object is used to manage access to the data point. See the descriptions of the special methods that define descriptors next to the first paragraph of this answer for more descriptive variable names.

  3. How would I call/use this example?

Here’s a demonstration:

>>> t1 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1
>>> 
>>> t1.celsius
1.0
>>> t2 = Temperature()
>>> t2.celsius
1.0

You can’t delete the attribute:

>>> del t2.celsius
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __delete__

And you can’t assign a variable that can’t be converted to a float:

>>> t1.celsius = '0x02'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in __set__
ValueError: invalid literal for float(): 0x02

Otherwise, what you have here is a global state for all instances, that is managed by assigning to any instance.

The expected way that most experienced Python programmers would accomplish this outcome would be to use the property decorator, which makes use of the same descriptors under the hood, but brings the behavior into the implementation of the owner class (again, as defined above):

class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)

This has exactly the same behavior as the original piece of code:

>>> t1 = Temperature()
>>> t2 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1.0
>>> t2.celsius
1.0
>>> del t1.celsius
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't delete attribute
>>> t1.celsius = '0x02'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in celsius
ValueError: invalid literal for float(): 0x02

Conclusion

We’ve covered the attributes that define descriptors, the difference between data- and non-data-descriptors, builtin objects that use them, and specific questions about use.

So again, how would you use the question’s example? I hope you wouldn’t. I hope you would start with my first suggestion (a simple class attribute) and move on to the second suggestion (the property decorator) if you feel it is necessary.


Answer 3

Before going into the details of descriptors it may be important to know how attribute lookup in Python works. This assumes that the class has no metaclass and that it uses the default implementation of __getattribute__ (both can be used to “customize” the behavior).

The best illustration of attribute lookup (in Python 3.x or for new-style classes in Python 2.x) in this case is from Understanding Python metaclasses (ionel’s codelog). The image uses : as a substitute for “non-customizable attribute lookup”.

This represents the lookup of an attribute foobar on an instance of Class:

[Figure: instance attribute lookup diagram from the blog post mentioned above]

Two conditions are important here:

  • If the class of instance has an entry for the attribute name and it has __get__ and __set__.
  • If the instance has no entry for the attribute name but the class has one and it has __get__.

That’s where descriptors come into it:

  • Data descriptors which have both __get__ and __set__.
  • Non-data descriptors which only have __get__.

In both cases the returned value goes through __get__ called with the instance as first argument and the class as second argument.

The lookup is even more complicated for class attribute lookup (see for example Class attribute lookup (in the above mentioned blog)).

Let’s move to your specific questions:

Why do I need the descriptor class?

In most cases you don’t need to write descriptor classes! However, you are probably a very regular user of them. Functions are one example: functions are descriptors, and that is how they can be used as methods with self implicitly passed as the first argument.

def test_function(self):
    return self

class TestClass(object):
    def test_method(self):
        ...

If you look up test_method on an instance you’ll get back a “bound method”:

>>> instance = TestClass()
>>> instance.test_method
<bound method TestClass.test_method of <__main__.TestClass object at ...>>

Similarly you could also bind a function by invoking its __get__ method manually (not really recommended, just for illustrative purposes):

>>> test_function.__get__(instance, TestClass)
<bound method test_function of <__main__.TestClass object at ...>>

You can even call this “self-bound method”:

>>> test_function.__get__(instance, TestClass)()
<__main__.TestClass at ...>

Note that I did not provide any arguments and the function did return the instance I had bound!

Functions are Non-data descriptors!

A built-in example of a data descriptor is property. Neglecting getter, setter, and deleter, the property descriptor is (from the Descriptor HowTo Guide, “Properties”):

class Property(object):
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

Since it’s a data descriptor it’s invoked whenever you look up the “name” of the property and it simply delegates to the functions decorated with @property, @name.setter, and @name.deleter (if present).

There are several other descriptors in the standard library, for example staticmethod, classmethod.

The point of descriptors is easy (although you rarely need them): Abstract common code for attribute access. property is an abstraction for instance variable access, function provides an abstraction for methods, staticmethod provides an abstraction for methods that don’t need instance access and classmethod provides an abstraction for methods that need class access rather than instance access (this is a bit simplified).

Another example would be a class property.
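As a rough sketch of what such a class property could look like: the ClassProperty and Config names below are hypothetical and not taken from the answer, and the descriptor is read-only because it defines only __get__.

```python
class ClassProperty:
    """Hypothetical read-only property whose getter receives the class."""
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner=None):
        # On instance lookup owner is normally supplied; fall back just in case.
        if owner is None:
            owner = type(instance)
        return self.fget(owner)


class Config:
    _registry = ["a", "b"]

    @ClassProperty
    def size(cls):
        return len(cls._registry)


on_class = Config.size       # looked up on the class, instance is None
on_instance = Config().size  # looked up on an instance, same result
```

Because this is a non-data descriptor, assigning Config.size = ... on the class would simply replace it.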

One fun example (using __set_name__ from Python 3.6) could also be a property that only allows a specific type:

class TypedProperty(object):
    __slots__ = ('_name', '_type')
    def __init__(self, typ):
        self._type = typ

    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        return instance.__dict__[self._name]

    def __set__(self, instance, value):
        if not isinstance(value, self._type):
            raise TypeError(f"Expected class {self._type}, got {type(value)}")
        instance.__dict__[self._name] = value

    def __delete__(self, instance):
        del instance.__dict__[self._name]

    def __set_name__(self, klass, name):
        self._name = name

Then you can use the descriptor in a class:

class Test(object):
    int_prop = TypedProperty(int)

And playing a bit with it:

>>> t = Test()
>>> t.int_prop = 10
>>> t.int_prop
10

>>> t.int_prop = 20.0
TypeError: Expected class <class 'int'>, got <class 'float'>

Or a “lazy property”:

class LazyProperty(object):
    __slots__ = ('_fget', '_name')
    def __init__(self, fget):
        self._fget = fget

    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        try:
            return instance.__dict__[self._name]
        except KeyError:
            value = self._fget(instance)
            instance.__dict__[self._name] = value
            return value

    def __set_name__(self, klass, name):
        self._name = name

class Test(object):
    @LazyProperty
    def lazy(self):
        print('calculating')
        return 10

>>> t = Test()
>>> t.lazy
calculating
10
>>> t.lazy
10

In these cases, moving the logic into a common descriptor might make sense; however, they could also be solved by other means, possibly at the cost of repeating some code.

What is instance and owner here? (in __get__). What is the purpose of these parameters?

It depends on how you look up the attribute. If you look up the attribute on an instance then:

  • the second argument is the instance on which you look up the attribute
  • the third argument is the class of the instance

In case you look up the attribute on the class (assuming the descriptor is defined on the class):

  • the second argument is None
  • the third argument is the class where you look up the attribute

So basically the third argument is necessary if you want to customize the behavior when you do class-level look-up (because the instance is None).
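The two cases can be observed directly with a descriptor that just returns what it received. Recorder and Owner are hypothetical names for this sketch; note that the answer counts self as the first argument, so "second" and "third" correspond to instance and owner here.

```python
class Recorder:
    """Hypothetical descriptor that returns the arguments __get__ received."""
    def __get__(self, instance, owner=None):
        return (instance, owner)


class Owner:
    attr = Recorder()


obj = Owner()
via_instance = obj.attr  # instance is obj, owner is Owner
via_class = Owner.attr   # instance is None, owner is Owner
```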

How would I call/use this example?

Your example is basically a property that only allows values that can be converted to float and that is shared between all instances of the class (and on the class – although one can only use “read” access on the class otherwise you would replace the descriptor instance):

>>> t1 = Temperature()
>>> t2 = Temperature()

>>> t1.celsius = 20   # setting it on one instance
>>> t2.celsius        # looking it up on another instance
20.0

>>> Temperature.celsius  # looking it up on the class
20.0

That’s why descriptors generally use the second argument (instance) to store the value, which avoids sharing it. In some cases sharing a value between instances might be desired (although I cannot think of a scenario at this moment), but it makes practically no sense for a celsius property on a temperature class, except maybe as a purely academic exercise.
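A minimal sketch of the per-instance variant, assuming Python 3.6+ for __set_name__. FloatAttribute is a hypothetical name, and the 0.0 default for an unset value is an assumption made here to mirror the original example.

```python
class FloatAttribute:
    """Hypothetical descriptor that stores a float on each instance."""
    def __set_name__(self, owner, name):
        self._name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level lookup returns the descriptor itself
        return instance.__dict__.get(self._name, 0.0)

    def __set__(self, instance, value):
        # State lives in the instance __dict__, not on the descriptor.
        instance.__dict__[self._name] = float(value)


class Temperature:
    celsius = FloatAttribute()


t1 = Temperature()
t2 = Temperature()
t1.celsius = 20  # only t1 is affected
```

Here t2.celsius still returns the default, because nothing is stored on the shared descriptor object.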


Answer 4

Why do I need the descriptor class?

Inspired by Fluent Python by Luciano Ramalho

Imagine you have a class like this:

class LineItem:
     price = 10.9
     weight = 2.1
     def __init__(self, name, price, weight):
          self.name = name
          self.price = price
          self.weight = weight

item = LineItem("apple", 2.9, 2.1)
item.price = -0.9  # negative price: you would owe the customer a refund even though you delivered the apple :(
item.weight = -0.8 # negative weight makes no sense

We should validate weight and price so that they cannot be assigned a negative number. We can write less code if we use a descriptor as a proxy, like this:

class Quantity(object):
    __index = 0  # class-level counter used to build a unique storage name

    def __init__(self):
        self.__index = self.__class__.__index
        self._storage_name = "quantity#{}".format(self.__index)
        self.__class__.__index += 1

    def __set__(self, instance, value):
        # Reject zero and negative values before storing on the instance.
        if value > 0:
            setattr(instance, self._storage_name, value)
        else:
            raise ValueError('value should be > 0')

    def __get__(self, instance, owner):
        return getattr(instance, self._storage_name)

Then define the LineItem class like this:

class LineItem(object):
     weight = Quantity()
     price = Quantity()

     def __init__(self, name, weight, price):
         self.name = name
         self.weight = weight
         self.price = price

We can then extend the Quantity class to cover other common kinds of validation.
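One possible way to generalize this, sketched under the assumption of Python 3.6+ (so __set_name__ can replace the manual counter). The Validated and NonBlank names and the validate() hook are illustrative choices for this sketch, not part of the answer above.

```python
import abc


class Validated(abc.ABC):
    """Hypothetical base descriptor: validates, then stores under a private name."""
    def __set_name__(self, owner, name):
        self._storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self._storage_name)

    def __set__(self, instance, value):
        setattr(instance, self._storage_name, self.validate(value))

    @abc.abstractmethod
    def validate(self, value):
        """Return the validated value or raise ValueError."""


class Quantity(Validated):
    def validate(self, value):
        if value <= 0:
            raise ValueError("value must be > 0")
        return value


class NonBlank(Validated):
    def validate(self, value):
        value = value.strip()
        if not value:
            raise ValueError("value cannot be blank")
        return value


class LineItem:
    name = NonBlank()
    weight = Quantity()
    price = Quantity()

    def __init__(self, name, weight, price):
        self.name = name
        self.weight = weight
        self.price = price


item = LineItem(" apple ", 2.1, 2.9)
```

Each new rule then only needs a validate() method; the storage logic stays in one place.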


Answer 5

I tried (with minor changes as suggested) the code from Andrew Cooke’s answer. (I am running Python 2.7.)

The code:

#!/usr/bin/env python
class Celsius:
    # Fahrenheit <-> Celsius: C = 5 * (F - 32) / 9 and F = 32 + 9 * C / 5
    def __get__(self, instance, owner): return 5 * (instance.fahrenheit - 32) / 9.0
    def __set__(self, instance, value): instance.fahrenheit = 32 + 9 * value / 5.0

class Temperature:
    def __init__(self, initial_f): self.fahrenheit = initial_f
    celsius = Celsius()

if __name__ == "__main__":

    t = Temperature(212)
    print(t.celsius)
    t.celsius = 0
    print(t.fahrenheit)

The result:

C:\Users\gkuhn\Desktop>python test2.py
<__main__.Celsius instance at 0x02E95A80>
212

With Python prior to 3, make sure you subclass from object. That makes the descriptor work correctly, because the __get__ magic does not work for old-style classes.
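For comparison, a sketch of the same classes as new-style classes (explicitly inheriting from object, which is also the default in Python 3). The conversion formulas are the standard Fahrenheit/Celsius ones; the variable names boiling_c and freezing_f are introduced only for this illustration.

```python
class Celsius(object):
    def __get__(self, instance, owner):
        # Convert the stored Fahrenheit value to Celsius on read.
        return 5 * (instance.fahrenheit - 32) / 9.0

    def __set__(self, instance, value):
        # Convert the assigned Celsius value to Fahrenheit on write.
        instance.fahrenheit = 32 + 9 * value / 5.0


class Temperature(object):
    celsius = Celsius()

    def __init__(self, initial_f):
        self.fahrenheit = initial_f


t = Temperature(212)
boiling_c = t.celsius      # __get__ is invoked instead of returning the descriptor
t.celsius = 0              # __set__ is invoked and updates fahrenheit
freezing_f = t.fahrenheit
```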


Answer 6

See the pure-Python equivalent of property in the Descriptor HowTo Guide: https://docs.python.org/3/howto/descriptor.html#properties

class Property(object):
    "Emulate PyProperty_Type() in Objects/descrobject.c"

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
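The getter, setter and deleter methods above each return a new property object rather than modifying the existing one. A small sketch with the builtin property shows the same behavior; the Temperature class and its attribute names are hypothetical and used only for illustration.

```python
class Temperature:
    def __init__(self):
        self._celsius = 0.0

    def _get_celsius(self):
        return self._celsius

    def _set_celsius(self, value):
        self._celsius = float(value)

    # A property with only a getter is read-only.
    read_only = property(_get_celsius)
    # setter() returns a copy of the property with fset filled in.
    celsius = read_only.setter(_set_celsius)


t = Temperature()
t.celsius = 21  # goes through _set_celsius
```

This is why the decorated function in an @name.setter block has to reuse the property's name: the new property object replaces the old one in the class namespace.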