Tag archive: metaprogramming

How does Pony (ORM) do its tricks?

Question: How does Pony (ORM) do its tricks?

Pony ORM does the nice trick of converting a generator expression into SQL. Example:

>>> select(p for p in Person if p.name.startswith('Paul'))
        .order_by(Person.name)[:2]

SELECT "p"."id", "p"."name", "p"."age"
FROM "Person" "p"
WHERE "p"."name" LIKE "Paul%"
ORDER BY "p"."name"
LIMIT 2

[Person[3], Person[1]]
>>>

I know Python has wonderful introspection and metaprogramming built in, but how is this library able to translate the generator expression without preprocessing? It looks like magic.

[update]

Blender wrote:

Here is the file that you’re after. It seems to reconstruct the generator using some introspection wizardry. I’m not sure if it supports 100% of Python’s syntax, but this is pretty cool. – Blender

I was thinking they were exploring some feature of the generator expression protocol, but looking at this file and seeing the ast module involved… No, they are not inspecting the program source on the fly, are they? Mind-blowing…

@BrenBarn: If I try to call the generator outside the select function call, the result is:

>>> x = (p for p in Person if p.age > 20)
>>> x.next()
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<interactive input>", line 1, in <genexpr>
  File "C:\Python27\lib\site-packages\pony\orm\core.py", line 1822, in next
    % self.entity.__name__)
  File "C:\Python27\lib\site-packages\pony\utils.py", line 92, in throw
    raise exc
TypeError: Use select(...) function or Person.select(...) method for iteration
>>>

Seems like they are doing more arcane incantations, like inspecting the select function call and processing the Python abstract syntax tree on the fly.

I would still like to see someone explain it; the source is way beyond my wizardry level.


Answer 0

Pony ORM author here.

Pony translates a Python generator into a SQL query in three steps:

  1. Decompiling the generator bytecode and rebuilding the generator AST (abstract syntax tree)
  2. Translating the Python AST into “abstract SQL”, a universal list-based representation of a SQL query
  3. Converting the abstract SQL representation into a specific database-dependent SQL dialect

The most complex part is the second step, where Pony must understand the “meaning” of Python expressions. Seems you are most interested in the first step, so let me explain how decompiling works.

Let’s consider this query:

>>> from pony.orm.examples.estore import *
>>> select(c for c in Customer if c.country == 'USA').show()

Which will be translated into the following SQL:

SELECT "c"."id", "c"."email", "c"."password", "c"."name", "c"."country", "c"."address"
FROM "Customer" "c"
WHERE "c"."country" = 'USA'

And below is the result of this query which will be printed out:

id|email              |password|name          |country|address  
--+-------------------+--------+--------------+-------+---------
1 |john@example.com   |***     |John Smith    |USA    |address 1
2 |matthew@example.com|***     |Matthew Reed  |USA    |address 2
4 |rebecca@example.com|***     |Rebecca Lawson|USA    |address 4

The select() function accepts a Python generator as its argument and then analyzes its bytecode. We can get the bytecode instructions of this generator using the standard Python dis module:

>>> gen = (c for c in Customer if c.country == 'USA')
>>> import dis
>>> dis.dis(gen.gi_frame.f_code)
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                26 (to 32)
              6 STORE_FAST               1 (c)
              9 LOAD_FAST                1 (c)
             12 LOAD_ATTR                0 (country)
             15 LOAD_CONST               0 ('USA')
             18 COMPARE_OP               2 (==)
             21 POP_JUMP_IF_FALSE        3
             24 LOAD_FAST                1 (c)
             27 YIELD_VALUE         
             28 POP_TOP             
             29 JUMP_ABSOLUTE            3
        >>   32 LOAD_CONST               1 (None)
             35 RETURN_VALUE

Pony ORM has the function decompile() within module pony.orm.decompiling which can restore an AST from the bytecode:

>>> from pony.orm.decompiling import decompile
>>> ast, external_names = decompile(gen)

Here, we can see the textual representation of the AST nodes:

>>> ast
GenExpr(GenExprInner(Name('c'), [GenExprFor(AssName('c', 'OP_ASSIGN'), Name('.0'),
[GenExprIf(Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]))])]))

Let’s now see how the decompile() function works.

The decompile() function creates a Decompiler object, which implements the Visitor pattern. The decompiler instance receives the bytecode instructions one by one. For each instruction, the decompiler object calls one of its own methods, whose name is the same as the name of the current bytecode instruction.

When Python evaluates an expression, it uses a stack that stores the intermediate results of the calculation. The decompiler object also has its own stack, but this stack stores AST nodes for the expression rather than the results of evaluating it.

When the decompiler method for the next bytecode instruction is called, it takes AST nodes from the stack, combines them into a new AST node, and then puts this node on top of the stack.

For example, let’s see how the subexpression c.country == 'USA' is processed. The corresponding bytecode fragment is:

              9 LOAD_FAST                1 (c)
             12 LOAD_ATTR                0 (country)
             15 LOAD_CONST               0 ('USA')
             18 COMPARE_OP               2 (==)

So, the decompiler object does the following:

  1. Calls decompiler.LOAD_FAST('c'). This method puts the Name('c') node on top of the decompiler stack.
  2. Calls decompiler.LOAD_ATTR('country'). This method takes the Name('c') node from the stack, creates the Getattr(Name('c'), 'country') node and puts it on top of the stack.
  3. Calls decompiler.LOAD_CONST('USA'). This method puts the Const('USA') node on top of the stack.
  4. Calls decompiler.COMPARE_OP('=='). This method takes two nodes (Getattr and Const) from the stack, and then puts Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]) on top of the stack.

After all bytecode instructions are processed, the decompiler stack contains a single AST node which corresponds to the whole generator expression.
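
The stack walk described above can be sketched in a few lines. This is a toy illustration, not Pony's actual Decompiler class: the class name ToyDecompiler, the tuple-based node format and the hand-written instruction list are all assumptions made for the example.

```python
# Toy stack-based decompiler: a sketch of the idea, not Pony's code.
# AST nodes are plain tuples and the instructions are written by hand.

class ToyDecompiler:
    def __init__(self):
        self.stack = []  # holds AST nodes, not computed values

    def decompile(self, instructions):
        # Dispatch each instruction to the method named after its opcode.
        for opname, arg in instructions:
            getattr(self, opname)(arg)
        return self.stack

    def LOAD_FAST(self, varname):
        self.stack.append(("Name", varname))

    def LOAD_ATTR(self, attrname):
        obj = self.stack.pop()
        self.stack.append(("Getattr", obj, attrname))

    def LOAD_CONST(self, value):
        self.stack.append(("Const", value))

    def COMPARE_OP(self, op):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(("Compare", left, [(op, right)]))


# The bytecode fragment for: c.country == 'USA'
instructions = [
    ("LOAD_FAST", "c"),
    ("LOAD_ATTR", "country"),
    ("LOAD_CONST", "USA"),
    ("COMPARE_OP", "=="),
]
stack = ToyDecompiler().decompile(instructions)
print(stack[0])
```

After the four instructions the stack holds exactly one node, the Compare node for the whole subexpression.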

Since Pony ORM only needs to decompile generators and lambdas, this is not that complex: the instruction flow for a generator is relatively straightforward, just a bunch of nested loops.

Currently Pony ORM covers the whole generator instruction set except for two things:

  1. Inline if expressions: a if b else c
  2. Compound comparisons: a < b < c

If Pony encounters such an expression, it raises a NotImplementedError exception. Even in this case you can make it work by passing the generator expression as a string. When you pass a generator as a string, Pony doesn’t use the decompiler module. Instead, it gets the AST using the standard Python compiler.parse function (the compiler module is part of Python 2).
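
The compiler module exists only in Python 2. As a rough Python 3 analogue of the "parse a string into an AST" step, the standard ast module can be used. The sketch below only shows that parsing step; it says nothing about how Pony itself handles strings, and the node classes differ from the compiler-style nodes shown earlier.

```python
import ast

# The query is only parsed, never executed, so Customer need not exist.
source = "(c for c in Customer if c.country == 'USA')"
tree = ast.parse(source, mode="eval")

genexp = tree.body                        # the generator expression node
condition = genexp.generators[0].ifs[0]   # the "if" filter of the first loop

print(type(genexp).__name__)
print(ast.dump(condition))
```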

Hope this answers your question.


Can a line of Python code know its indentation nesting level?

Question: Can a line of Python code know its indentation nesting level?

From something like this:

print(get_indentation_level())

    print(get_indentation_level())

        print(get_indentation_level())

I would like to get something like this:

1
2
3

Can the code read itself in this way?

All I want is the output from the more nested parts of the code to be more nested. In the same way that this makes code easier to read, it would make the output easier to read.

Of course I could implement this manually, using e.g. .format(), but what I had in mind was a custom print function which would print(i*' ' + string) where i is the indentation level. This would be a quick way to make readable output on my terminal.

Is there a better way to do this which avoids painstaking manual formatting?


Answer 0

If you want indentation in terms of nesting level rather than spaces and tabs, things get tricky. For example, in the following code:

if True:
    print(
get_nesting_level())

the call to get_nesting_level is actually nested one level deep, despite the fact that there is no leading whitespace on the line of the get_nesting_level call. Meanwhile, in the following code:

print(1,
      2,
      get_nesting_level())

the call to get_nesting_level is nested zero levels deep, despite the presence of leading whitespace on its line.

In the following code:

if True:
  if True:
    print(get_nesting_level())

if True:
    print(get_nesting_level())

the two calls to get_nesting_level are at different nesting levels, despite the fact that the leading whitespace is identical.

In the following code:

if True: print(get_nesting_level())

is that nested zero levels, or one? In terms of INDENT and DEDENT tokens in the formal grammar, it’s zero levels deep, but you might not feel the same way.


If you want to do this, you’re going to have to tokenize the whole file up to the point of the call and count INDENT and DEDENT tokens. The tokenize module would be very useful for such a function:

import inspect
import tokenize

def get_nesting_level():
    caller_frame = inspect.currentframe().f_back
    filename, caller_lineno, _, _, _ = inspect.getframeinfo(caller_frame)
    with open(filename) as f:
        indentation_level = 0
        for token_record in tokenize.generate_tokens(f.readline):
            token_type, _, (token_lineno, _), _, _ = token_record
            if token_lineno > caller_lineno:
                break
            elif token_type == tokenize.INDENT:
                indentation_level += 1
            elif token_type == tokenize.DEDENT:
                indentation_level -= 1
        return indentation_level
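
The same token counting can be tried without any frame inspection by running it over a source string. The helper below is a sketch under that assumption; the name nesting_level_at and the sample snippets are made up for the example.

```python
import io
import tokenize


def nesting_level_at(source, lineno):
    """Count INDENT minus DEDENT for all tokens starting at or before lineno."""
    level = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.start[0] > lineno:
            break
        if tok.type == tokenize.INDENT:
            level += 1
        elif tok.type == tokenize.DEDENT:
            level -= 1
    return level


# Lines 3 and 6 have identical leading whitespace but different nesting.
sample = (
    "if True:\n"
    "  if True:\n"
    "    print(1)\n"
    "\n"
    "if True:\n"
    "    print(2)\n"
)
print(nesting_level_at(sample, 3))
print(nesting_level_at(sample, 6))

# A one-line "if" body produces no INDENT token at all.
print(nesting_level_at("if True: print(1)\n", 1))
```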

Answer 1

Yeah, that’s definitely possible, here’s a working example:

import inspect

def get_indentation_level():
    callerframerecord = inspect.stack()[1]
    frame = callerframerecord[0]
    info = inspect.getframeinfo(frame)
    cc = info.code_context[0]
    return len(cc) - len(cc.lstrip())

if 1:
    print(get_indentation_level())
    if 1:
        print(get_indentation_level())
        if 1:
            print(get_indentation_level())

Answer 2

You can use current_frame.f_lineno (with current_frame = sys._getframe(0)) to get the line number. Then, to find the indentation level, you need to find the previous line with zero indentation; subtracting that line’s number from the current line number gives the number of indented lines in between:

import sys
current_frame = sys._getframe(0)

def get_ind_num():
    with open(__file__) as f:
        lines = f.readlines()
    current_line_no = current_frame.f_lineno
    to_current = lines[:current_line_no]
    previous_zoro_ind = len(to_current) - next(i for i, line in enumerate(to_current[::-1]) if not line[0].isspace())
    return current_line_no - previous_zoro_ind

Demo:

if True:
    print(get_ind_num())
    if True:
        print(get_ind_num())
        if True:
            print(get_ind_num())
            if True: print(get_ind_num())
# Output
1
3
5
6

If you want the indentation level based on the preceding lines that end with a colon, a small change is enough:

def get_ind_num():
    with open(__file__) as f:
        lines = f.readlines()

    current_line_no = current_frame.f_lineno
    to_current = lines[:current_line_no]
    previous_zoro_ind = len(to_current) - next(i for i, line in enumerate(to_current[::-1]) if not line[0].isspace())
    return sum(1 for line in lines[previous_zoro_ind-1:current_line_no] if line.strip().endswith(':'))

Demo:

if True:
    print(get_ind_num())
    if True:
        print(get_ind_num())
        if True:
            print(get_ind_num())
            if True: print(get_ind_num())
# Output
1
2
3
3

As an alternative, here is a function that returns the amount of indentation (leading whitespace characters):

import sys
from itertools import takewhile
current_frame = sys._getframe(0)

def get_ind_num():
    with open(__file__) as f:
        lines = f.readlines()
    return sum(1 for _ in takewhile(str.isspace, lines[current_frame.f_lineno - 1]))

Answer 3

To solve the “real” problem that led to your question, you could implement a context manager that keeps track of the indentation level and makes the with block structure in the code correspond to the indentation levels of the output. This way the code indentation still reflects the output indentation without coupling the two too much. It is still possible to refactor the code into different functions and to have other indentation based on code structure without messing with the output indentation.

#!/usr/bin/env python
# coding: utf8
from __future__ import absolute_import, division, print_function


class IndentedPrinter(object):

    def __init__(self, level=0, indent_with='  '):
        self.level = level
        self.indent_with = indent_with

    def __enter__(self):
        self.level += 1
        return self

    def __exit__(self, *_args):
        self.level -= 1

    def print(self, arg='', *args, **kwargs):
        print(self.indent_with * self.level + str(arg), *args, **kwargs)


def main():
    indented = IndentedPrinter()
    indented.print(indented.level)
    with indented:
        indented.print(indented.level)
        with indented:
            indented.print('Hallo', indented.level)
            with indented:
                indented.print(indented.level)
            indented.print('and back one level', indented.level)


if __name__ == '__main__':
    main()

Output:

0
  1
    Hallo 2
      3
    and back one level 2
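
As a variation on the class above, the same bookkeeping can be written with contextlib.contextmanager. This is only a sketch: the names Indenter, block and format are assumptions for the example, and format returns the indented string instead of printing it so that it is easy to check.

```python
import contextlib


class Indenter:
    def __init__(self, indent_with="  "):
        self.level = 0
        self.indent_with = indent_with

    @contextlib.contextmanager
    def block(self):
        # Increase the level for the duration of the with block and
        # restore it even if the block raises.
        self.level += 1
        try:
            yield self
        finally:
            self.level -= 1

    def format(self, text):
        return self.indent_with * self.level + str(text)


ind = Indenter()
print(ind.format("top"))
with ind.block():
    print(ind.format("one level in"))
    with ind.block():
        print(ind.format("two levels in"))
print(ind.format("back at top"))
```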

Answer 4

>>> import inspect
>>> help(inspect.indentsize)
Help on function indentsize in module inspect:

indentsize(line)
    Return the indent size, in spaces, at the start of a line of text.
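
A short usage sketch for the function above. Note that it measures leading whitespace in columns, with tabs expanded, and not the nesting level; the sample lines are made up for the example.

```python
import inspect

# Leading whitespace is measured in columns after tabs are expanded.
print(inspect.indentsize("        print(x)"))  # eight leading spaces
print(inspect.indentsize("\tprint(x)"))        # one leading tab
print(inspect.indentsize("print(x)"))          # no indentation
```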

Python dictionary from an object's fields

Question: Python dictionary from an object's fields

Do you know if there is a built-in function to build a dictionary from an arbitrary object? I’d like to do something like this:

>>> class Foo:
...     bar = 'hello'
...     baz = 'world'
...
>>> f = Foo()
>>> props(f)
{ 'bar' : 'hello', 'baz' : 'world' }

NOTE: It should not include methods. Only fields.


Answer 0

Note that best practice in Python 2.7 is to use new-style classes (not needed with Python 3), i.e.

class Foo(object):
   ...

Also, there’s a difference between an ‘object’ and a ‘class’. To build a dictionary from an arbitrary object, it’s sufficient to use __dict__. Usually, you’ll declare your methods at class level and your attributes at instance level, so __dict__ should be fine. For example:

>>> class A(object):
...   def __init__(self):
...     self.b = 1
...     self.c = 2
...   def do_nothing(self):
...     pass
...
>>> a = A()
>>> a.__dict__
{'c': 2, 'b': 1}

A better approach (suggested by robert in comments) is the builtin vars function:

>>> vars(a)
{'c': 2, 'b': 1}

Alternatively, depending on what you want to do, it might be nice to inherit from dict. Then your class is already a dictionary, and if you want you can override __getattr__ and/or __setattr__ to call through and set the dict. For example:

class Foo(dict):
    def __init__(self):
        pass
    def __getattr__(self, attr):
        return self[attr]

    # etc...

Answer 1

Instead of x.__dict__, it’s actually more pythonic to use vars(x).


Answer 2

The dir builtin will give you all the object’s attributes, including special methods like __str__, __dict__ and a whole bunch of others which you probably don’t want. But you can do something like:

>>> class Foo(object):
...     bar = 'hello'
...     baz = 'world'
...
>>> f = Foo()
>>> [name for name in dir(f) if not name.startswith('__')]
[ 'bar', 'baz' ]
>>> dict((name, getattr(f, name)) for name in dir(f) if not name.startswith('__')) 
{ 'bar': 'hello', 'baz': 'world' }

So you can extend this to return only data attributes and not methods by defining your props function like this:

import inspect

def props(obj):
    pr = {}
    for name in dir(obj):
        value = getattr(obj, name)
        if not name.startswith('__') and not inspect.ismethod(value):
            pr[name] = value
    return pr

Answer 3

I’ve settled on a combination of both answers:

dict((key, value) for key, value in f.__dict__.items()
    if not callable(value) and not key.startswith('__'))

Answer 4

I thought I’d take some time to show you how you can translate an object to dict via dict(obj).

class A(object):
    d = '4'
    e = '5'
    f = '6'

    def __init__(self):
        self.a = '1'
        self.b = '2'
        self.c = '3'

    def __iter__(self):
        # first start by grabbing the Class items
        iters = dict((x,y) for x,y in A.__dict__.items() if x[:2] != '__')

        # then update the class items with the instance items
        iters.update(self.__dict__)

        # now 'yield' through the items
        for x,y in iters.items():
            yield x,y

a = A()
print(dict(a)) 
# prints "{'a': '1', 'c': '3', 'b': '2', 'e': '5', 'd': '4', 'f': '6'}"

The key section of this code is the __iter__ function.

As the comments explain, the first thing we do is grab the class items and exclude anything that starts with ‘__’.

Once you’ve created that dict, you can use the dict update method and pass in the instance __dict__.

This gives you a complete class+instance dictionary of members. Now all that’s left is to iterate over them and yield the pairs.

Also, if you plan on using this a lot, you can create an @iterable class decorator.

def iterable(cls):
    def iterfn(self):
        iters = dict((x,y) for x,y in cls.__dict__.items() if x[:2] != '__')
        iters.update(self.__dict__)

        for x,y in iters.items():
            yield x,y

    cls.__iter__ = iterfn
    return cls

@iterable
class B(object):
    d = 'd'
    e = 'e'
    f = 'f'

    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'

b = B()
print(dict(b))

Answer 5

“To build a dictionary from an arbitrary object, it’s sufficient to use __dict__.”

That approach misses attributes that the object inherits from its class. For example,

class c(object):
    x = 3
a = c()

hasattr(a, 'x') is true, but 'x' does not appear in a.__dict__
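
One way to pick up class-level fields as well is to walk the class hierarchy and then overlay the instance dictionary. The helper below is a sketch, not a built-in: the name fields is made up, and the filter (skip dunder names and anything callable) is an assumption that will also drop fields holding functions, lambdas or nested classes.

```python
def fields(obj):
    """Collect non-callable, non-dunder attributes from the class chain and the instance."""
    result = {}
    # Walk from the most generic base class to the most specific one
    # so that subclasses override their parents.
    for klass in reversed(type(obj).__mro__):
        for name, value in vars(klass).items():
            if not name.startswith('__') and not callable(value):
                result[name] = value
    # Instance attributes win over class attributes.
    result.update(vars(obj))
    return result


class C(object):
    x = 3

    def method(self):
        return self.x


a = C()
a.y = 1
print('x' in vars(a))
print(sorted(fields(a).items()))
```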


Answer 6

Late answer but provided for completeness and the benefit of googlers:

def props(x):
    return dict((key, getattr(x, key)) for key in dir(x) if key not in dir(x.__class__))

This will not show methods defined in the class, but it will still show fields including those assigned to lambdas or those which start with a double underscore.


Answer 7

I think the easiest way is to define a __getitem__ method on the class. If you need to write to the object, you can define a custom __setattr__. Here is an example for __getitem__:

class A(object):
    def __init__(self):
        self.b = 1
        self.c = 2
    def __getitem__(self, item):
        return self.__dict__[item]

# Usage: 
a = A()
a.__getitem__('b')  # Outputs 1
a.__dict__  # Outputs {'c': 2, 'b': 1}
vars(a)  # Outputs {'c': 2, 'b': 1}

__dict__ exposes the object’s attributes as a dictionary, and that dictionary can be used to get the item you need.


Answer 8

A downside of using __dict__ is that it is shallow; it won’t convert nested objects to dictionaries.

If you’re using Python 3.5 or higher, you can use the third-party jsons library:

>>> import jsons
>>> jsons.dump(f)
{'bar': 'hello', 'baz': 'world'}

Answer 9

If you want to list part of your attributes, override __dict__:

def __dict__(self):
    d = {
    'attr_1' : self.attr_1,
    ...
    }
    return d

# Call __dict__
d = instance.__dict__()

This helps a lot if your instance holds a large block of data and you want to push d to a message queue such as Redis.


Answer 10

PYTHON 3:

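import decimal
import json
from datetime import datetime
from enum import Enum


class DateTimeDecoder(json.JSONDecoder):

   def __init__(self, *args, **kargs):
        # Register dict_to_object so every decoded JSON object passes through it.
        json.JSONDecoder.__init__(self, object_hook=self.dict_to_object,
                         *args, **kargs)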

   def dict_to_object(self, d):
       if '__type__' not in d:
          return d

       type = d.pop('__type__')
       try:
          dateobj = datetime(**d)
          return dateobj
       except:
          d['__type__'] = type
          return d

def json_default_format(value):
    try:
        if isinstance(value, datetime):
            return {
                '__type__': 'datetime',
                'year': value.year,
                'month': value.month,
                'day': value.day,
                'hour': value.hour,
                'minute': value.minute,
                'second': value.second,
                'microsecond': value.microsecond,
            }
        if isinstance(value, decimal.Decimal):
            return float(value)
        if isinstance(value, Enum):
            return value.name
        else:
            return vars(value)
    except Exception as e:
        raise ValueError

Now you can use the above code inside your own class:

class Foo():
  def toJSON(self):
        return json.loads(
            json.dumps(self, sort_keys=True, indent=4, separators=(',', ': '), default=json_default_format), cls=DateTimeDecoder)


Foo().toJSON() 

Answer 11

vars() is great, but it doesn’t work for objects nested inside other objects.

To convert an object with nested objects to a dict:

def to_dict(self):
    return json.loads(json.dumps(self, default=lambda o: o.__dict__))
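
A self-contained sketch of the method in use. The Engine and Car classes are made up for the example, and the approach assumes every nested value is either JSON-serializable or an object with a __dict__.

```python
import json


class Engine(object):
    def __init__(self, power):
        self.power = power


class Car(object):
    def __init__(self, name, engine):
        self.name = name
        self.engine = engine

    def to_dict(self):
        # json.dumps calls the default hook for every object it cannot
        # serialize, so nested objects are replaced by their __dict__.
        return json.loads(json.dumps(self, default=lambda o: o.__dict__))


car = Car("roadster", Engine(200))
shallow = vars(car)     # the engine value is still an Engine instance
nested = car.to_dict()  # the engine value is a plain dict
print(nested)
```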