Tag archive: conventions

Is it pythonic to import inside functions?

Question: Is it pythonic to import inside functions?

PEP 8 says:

  • Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

On occasion, I violate PEP 8. Sometimes I import stuff inside functions. As a general rule, I do this if there is an import that is only used within a single function.

Any opinions?

EDIT (the reason I feel importing in functions can be a good idea):

Main reason: It can make the code clearer.

  • When looking at the code of a function I might ask myself: “What is function/class xxx?” (xxx being used inside the function). If I have all my imports at the top of the module, I have to go look there to determine what xxx is. This is more of an issue when using from m import xxx. Seeing m.xxx in the function probably tells me more. Depending on what m is: Is it a well-known top-level module/package (import m)? Or is it a sub-module/package (from a.b.c import m)?
  • In some cases having that extra information (“What is xxx?”) close to where xxx is used can make the function easier to understand.

Answer 0

In the long run I think you’ll appreciate having most of your imports at the top of the file; that way you can tell at a glance how complicated your module is by what it needs to import.

If I’m adding new code to an existing file I’ll usually do the import where it’s needed and then if the code stays I’ll make things more permanent by moving the import line to the top of the file.

One other point, I prefer to get an ImportError exception before any code is run — as a sanity check, so that’s another reason to import at the top.

I use pyChecker to check for unused modules.
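
The sanity-check point can be illustrated with a minimal sketch. The module name not_installed_reporting_lib_xyz and the function render_report are made up for this example; the assumption is simply that no such module is installed. With the import deferred into the function, the file loads cleanly and the failure surfaces only when the function is finally called:

```python
def render_report(data):
    # Deferred import of a hypothetical module that is assumed NOT to be installed.
    # Nothing fails while the file is being loaded.
    import not_installed_reporting_lib_xyz
    return not_installed_reporting_lib_xyz.render(data)


print("module loaded without error")

failure = None
try:
    render_report({"rows": 3})
except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
    failure = exc
    print("failed only when called:", exc)
```

Had the same import been at the top of the file, the ImportError would have been raised before any other code ran.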


Answer 1

There are two occasions where I violate PEP 8 in this regard:

  • Circular imports: module A imports module B, but something in module B needs module A (though this is often a sign that I need to refactor the modules to eliminate the circular dependency)
  • Inserting a pdb breakpoint: import pdb; pdb.set_trace(). This is handy because I don’t want to put import pdb at the top of every module I might want to debug, and it is easy to remember to remove the import when I remove the breakpoint.

Outside of these two cases, it’s a good idea to put everything at the top. It makes the dependencies clearer.
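
As a rough sketch of the circular-import case, the snippet below writes two throwaway modules to a temporary directory. The names mod_a, mod_b, greet, relay and call_b are all hypothetical, chosen only for the demonstration. mod_a imports mod_b at the top, while mod_b defers its import of mod_a until the function body runs:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module sources, written to a temp dir purely for this demo.
MOD_A = """
import mod_b                  # top-level import: mod_a depends on mod_b

def greet():
    return "hello from mod_a"

def call_b():
    return mod_b.relay()
"""

MOD_B = """
def relay():
    import mod_a              # deferred: runs only when relay() is called,
    return mod_a.greet()      # by which time mod_a is fully initialised
"""

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "mod_a.py").write_text(MOD_A)
    Path(tmp, "mod_b.py").write_text(MOD_B)
    sys.path.insert(0, tmp)
    try:
        mod_a = importlib.import_module("mod_a")
        result = mod_a.call_b()
    finally:
        sys.path.remove(tmp)

print(result)
```

The deferred import works because the cycle is only closed at call time. As the answer notes, refactoring the shared code into a third module is usually the cleaner fix.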


Answer 2

Here are the four import use cases that we use

  1. import (and from x import y and import x as y) at the top

  2. Choices for Import. At the top.

    import settings
    if settings.something:
        import this as foo
    else:
        import that as foo
    
  3. Conditional Import. Used with JSON, XML libraries and the like. At the top.

    try:
        import this as foo
    except ImportError:
        import that as foo
    
  4. Dynamic Import. So far, we only have one example of this.

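    import importlib
    import settings
    # import_module returns the module named by settings.some_module;
    # its attributes (here x) are then read from the module object.
    module = importlib.import_module(settings.some_module)
    x = module.x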
    

    Note that this dynamic import doesn’t bring in code, but brings in complex data structures written in Python. It’s kind of like a pickled piece of data except we pickled it by hand.

    This is also, more-or-less, at the top of a module
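
For a concrete version of case 3, here is a small sketch that prefers the third-party simplejson package when it happens to be installed and otherwise falls back to the standard-library json module. The alias json_impl is an arbitrary name chosen for this example:

```python
try:
    import simplejson as json_impl  # optional third-party package, if present
except ImportError:
    import json as json_impl        # standard-library fallback

# Either module exposes the same loads/dumps interface.
decoded = json_impl.loads('{"answer": 42}')
print(json_impl.__name__, decoded)
```

Because the choice is made once at the top of the module, the rest of the code can use json_impl without caring which implementation was found.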


Here’s what we do to make the code clearer:

  • Keep the modules short.

  • If I have all my imports at the top of the module, I have to go look there to determine what a name is. If the module is short, that’s easy to do.

  • In some cases having that extra information close to where a name is used can make the function easier to understand. If the module is short, that’s easy to do.


Answer 3

One thing to bear in mind: needless imports can cause performance problems. So if this is a function that will be called frequently, you’re better off just putting the import at the top. Of course this is an optimization, so if there’s a valid case to be made that importing inside a function is more clear than importing at the top of a file, that trumps performance in most cases.

If you’re doing IronPython, I’m told that it’s better to import inside functions (since compiling code in IronPython can be slow). Thus, you may be able to get away with importing inside functions then. But other than that, I’d argue that it’s just not worth it to fight convention.

“As a general rule, I do this if there is an import that is only used within a single function.”

Another point I’d like to make is that this may be a potential maintenance problem. What happens if you add a function that uses a module that was previously used by only one function? Are you going to remember to add the import to the top of the file? Or are you going to scan each and every function for imports?

FWIW, there are cases where it makes sense to import inside a function. For example, if you want to set the language in cx_Oracle, you need to set an NLS_LANG environment variable before it is imported. Thus, you may see code like this:

import os

oracle = None

def InitializeOracle(lang):
    global oracle
    os.environ['NLS_LANG'] = lang
    import cx_Oracle
    oracle = cx_Oracle

Answer 4

I’ve broken this rule before for modules that are self-testing. That is, they are normally just used for support, but I define a main for them so that if you run them by themselves you can test their functionality. In that case I sometimes import getopt and cmd just in main, because I want it to be clear to someone reading the code that these modules have nothing to do with the normal operation of the module and are only being included for testing.


Answer 5

Coming from the question about loading the module twice – Why not both?

An import at the top of the script will indicate the dependencies, and another import in the function will make this function more atomic, while seemingly not causing any performance disadvantage, since a consecutive import is cheap.
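
The reason a repeated import is cheap can be seen in a short sketch: once a module has been loaded, later import statements find it in sys.modules and only bind a name. The function names here are illustrative, and colorsys is used only as a small standard-library module:

```python
import sys


def first_call():
    import colorsys  # loads the module if this process has not imported it yet
    return colorsys


def second_call():
    import colorsys  # found in sys.modules; only binds the local name
    return colorsys


m1 = first_call()
m2 = second_call()
same_object = m1 is m2
cached = m1 is sys.modules["colorsys"]
print(same_object, cached)
```

Both names refer to the single module object held in the sys.modules cache, so the module body is not executed again.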


Answer 6

As long as it’s import and not from x import *, you should put them at the top. It adds just one name to the global namespace, and you stick to PEP 8. Plus, if you later need it somewhere else, you don’t have to move anything around.

It’s no big deal, but since there’s almost no difference I’d suggest doing what PEP 8 says.


Answer 7

Have a look at the alternative approach that’s used in sqlalchemy: dependency injection:

@util.dependencies("sqlalchemy.orm.query")
def merge_result(query, *args):
    #...
    query.Query(...)

Notice how the imported library is declared in a decorator, and passed as an argument to the function!

This approach makes the code cleaner, and also works 4.5 times faster than an import statement!

Benchmark: https://gist.github.com/kolypto/589e84fbcfb6312532658df2fabdb796
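
To make the idea concrete without relying on SQLAlchemy internals, here is a rough, hypothetical re-creation of such a decorator. It is not SQLAlchemy's implementation; the names dependencies and dump_sorted are invented for this sketch, which imports the named modules on first call and passes them in as leading arguments:

```python
import functools
import importlib


def dependencies(*module_names):
    """Hypothetical decorator: resolve modules lazily and inject them as arguments."""
    def decorator(func):
        resolved = []  # filled on first call, reused afterwards

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not resolved:
                resolved.extend(importlib.import_module(n) for n in module_names)
            return func(*resolved, *args, **kwargs)

        return wrapper
    return decorator


@dependencies("json")
def dump_sorted(json_module, data):
    # json_module is injected by the decorator, not imported in this scope.
    return json_module.dumps(data, sort_keys=True)


print(dump_sorted({"b": 1, "a": 2}))
```

The declaration of what the function needs sits right next to the function, and the import cost is paid only on the first call.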


Answer 8

In modules that are both ‘normal’ modules and can be executed (i.e. have an if __name__ == '__main__': section), I usually import modules that are only used when executing the module inside the main section.

Example:

def really_useful_function(data):
    ...


def main():
    from pathlib import Path
    from argparse import ArgumentParser
    from dataloader import load_data_from_directory

    parser = ArgumentParser()
    parser.add_argument('directory')
    args = parser.parse_args()
    data = load_data_from_directory(Path(args.directory))
    print(really_useful_function(data))


if __name__ == '__main__':
    main()

Answer 9

There’s another (probably “corner”) case where it may be beneficial to import inside rarely used functions: shortening startup time.

I hit that wall once with a rather complex program running on a small IoT server, which accepted commands from a serial line and performed operations, possibly very complex ones.

Placing import statements at the top of files meant that all imports were processed before the server started; since the import list included jinja2, lxml, signxml and other “heavyweights” (and the SoC was not very powerful), this meant minutes passed before the first instruction was actually executed.

On the other hand, by placing most imports in functions I was able to have the server “alive” on the serial line in seconds. Of course, when the modules were actually needed I had to pay the price (note: this can also be mitigated by spawning a background task that does the imports in idle time).
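
The background-import mitigation might look roughly like the sketch below. The standard-library modules in HEAVY_MODULES are stand-ins for the heavy third-party packages mentioned above, and the name preload is made up for this example:

```python
import importlib
import sys
import threading

# Stand-ins for heavy dependencies; a real program would list its own.
HEAVY_MODULES = ["decimal", "fractions", "statistics"]


def preload(names):
    """Import each module so later function-level imports hit the cache."""
    for name in names:
        importlib.import_module(name)


loader = threading.Thread(target=preload, args=(HEAVY_MODULES,), daemon=True)
loader.start()

# ... the server would start answering on the serial line here ...

loader.join()  # only so this demo can report the result
all_loaded = all(name in sys.modules for name in HEAVY_MODULES)
print(all_loaded)
```

A function-level import that runs while the thread is still loading the same module simply waits on the import lock, so the two approaches can be combined.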


How to add the contents of an iterable to a set?

Question: How to add the contents of an iterable to a set?

What is the “one […] obvious way” to add all items of an iterable to an existing set?


Answer 0

You can add elements of a list to a set like this:

>>> foo = set(range(0, 4))
>>> foo
set([0, 1, 2, 3])
>>> foo.update(range(2, 6))
>>> foo
set([0, 1, 2, 3, 4, 5])
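
The session above is from Python 2. A Python 3 sketch of the same idea follows; it also shows that update accepts any iterable, including a generator, and several iterables in one call:

```python
foo = set(range(0, 4))
foo.update(range(2, 6))

# Several iterables of different kinds in a single call.
foo.update([10, 11], (x * 100 for x in range(1, 3)))

ordered = sorted(foo)
print(ordered)
```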

Answer 1

For the benefit of anyone who might believe e.g. that doing aset.add() in a loop would have performance competitive with doing aset.update(), here’s an example of how you can test your beliefs quickly before going public:

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 294 usec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "for i in it:a.add(i)"
1000 loops, best of 3: 950 usec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a |= set(it)"
1000 loops, best of 3: 458 usec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 598 usec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "for i in it:a.add(i)"
1000 loops, best of 3: 1.89 msec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a |= set(it)"
1000 loops, best of 3: 891 usec per loop

Looks like the cost per item of the loop approach is over THREE times that of the update approach.

Using |= set() costs about 1.5x what update does but half of what adding each individual item in a loop does.


Answer 2

You can use the set() function to convert an iterable into a set, and then use standard set update operator (|=) to add the unique values from your new set into the existing one.

>>> a = { 1, 2, 3 }
>>> b = ( 3, 4, 5 )
>>> a |= set(b)
>>> a
set([1, 2, 3, 4, 5])
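
The set() conversion matters because the |= operator only accepts another set, whereas update takes any iterable. A short Python 3 sketch of the difference:

```python
a = {1, 2, 3}
b = (3, 4, 5)

error = None
try:
    a |= b  # the operator form rejects a plain tuple
except TypeError as exc:
    error = exc

a |= set(b)  # works once the iterable is converted to a set
a.update(b)  # works directly, no conversion needed
print(sorted(a), type(error).__name__)
```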

Answer 3

Just a quick update, timings using python 3:

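#!/usr/bin/env python3
from timeit import Timer

a = set(range(1, 100000))
b = list(range(50000, 150000))

def one_by_one(s, l):
    for i in l:
        s.add(i)

def cast_to_list_and_back(s, l):
    # Rebinds the local name only; the caller's set is left unchanged.
    s = set(list(s) + l)

def update_set(s, l):
    s.update(l)

# Assumed harness (the post does not show how the functions were timed, nor the
# repetition count): run each function on a fresh copy of a.
for func in (one_by_one, cast_to_list_and_back, update_set):
    print(func.__name__, Timer(lambda: func(set(a), b)).timeit(number=100))

Results are: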

one_by_one 10.184448844986036
cast_to_list_and_back 7.969255169969983
update_set 2.212590195937082

Answer 4

Use list comprehension.

Short circuiting the creation of iterable using a list for example :)

>>> x = [1, 2, 3, 4]
>>> 
>>> k = x.__iter__()
>>> k
<listiterator object at 0x100517490>
>>> l = [y for y in k]
>>> l
[1, 2, 3, 4]
>>> 
>>> z = set([1,2])
>>> z.update(l)
>>> z
set([1, 2, 3, 4])
>>> 

[Edit: missed the set part of question]


Answer 5

for item in items:
   extant_set.add(item)

For the record, I think the assertion that “There should be one– and preferably only one –obvious way to do it.” is bogus. It makes an assumption that many technical minded people make, that everyone thinks alike. What is obvious to one person is not so obvious to another.

I would argue that my proposed solution is clearly readable, and does what you ask. I don’t believe there are any performance hits involved with it–though I admit I might be missing something. But despite all of that, it might not be obvious and preferable to another developer.


Python __str__ versus __unicode__

Question: Python __str__ versus __unicode__

Is there a Python convention for when you should implement __str__() versus __unicode__()? I’ve seen classes override __unicode__() more frequently than __str__() but it doesn’t appear to be consistent. Are there specific rules when it is better to implement one versus the other? Is it necessary/good practice to implement both?


Answer 0

__str__() is the old method — it returns bytes. __unicode__() is the new, preferred method — it returns characters. The names are a bit confusing, but in 2.x we’re stuck with them for compatibility reasons. Generally, you should put all your string formatting in __unicode__(), and create a stub __str__() method:

def __str__(self):
    return unicode(self).encode('utf-8')

In 3.0, str contains characters, so the same methods are named __bytes__() and __str__(). These behave as expected.
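
A minimal Python 3 sketch of that pairing follows. The class name Greeting is made up for illustration: __str__ returns text, and __bytes__ encodes that text explicitly.

```python
class Greeting:
    def __init__(self, text):
        self.text = text

    def __str__(self):
        # Characters (str in Python 3).
        return self.text

    def __bytes__(self):
        # Bytes, produced by encoding the text form explicitly.
        return str(self).encode("utf-8")


g = Greeting("h\u00e9llo")
as_text = str(g)
as_bytes = bytes(g)
print(as_text, as_bytes)
```

This mirrors the Python 2 stub shown above, with the roles renamed: the text method is now __str__ and the encoded form lives in __bytes__.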


Answer 1

If I didn’t especially care about micro-optimizing stringification for a given class I’d always implement __unicode__ only, as it’s more general. When I do care about such minute performance issues (which is the exception, not the rule), having __str__ only (when I can prove there never will be non-ASCII characters in the stringified output) or both (when both are possible), might help.

These I think are solid principles, but in practice it’s very common to KNOW there will be nothing but ASCII characters without doing effort to prove it (e.g. the stringified form only has digits, punctuation, and maybe a short ASCII name;-) in which case it’s quite typical to move on directly to the “just __str__” approach (but if a programming team I worked with proposed a local guideline to avoid that, I’d be +1 on the proposal, as it’s easy to err in these matters AND “premature optimization is the root of all evil in programming”;-).


Answer 2

With the world getting smaller, chances are that any string you encounter will contain Unicode eventually. So for any new apps, you should at least provide __unicode__(). Whether you also override __str__() is then just a matter of taste.


Answer 3

If you are working in both python2 and python3 in Django, I recommend the python_2_unicode_compatible decorator:

Django provides a simple way to define __str__() and __unicode__() methods that work on Python 2 and 3: you must define a __str__() method returning text and to apply the python_2_unicode_compatible() decorator.

As noted in earlier comments to another answer, some versions of future.utils also support this decorator. On my system, I needed to install a newer future module for python2 and install future for python3. After that, here is a functional example:

#! /usr/bin/env python

from future.utils import python_2_unicode_compatible
from sys import version_info

@python_2_unicode_compatible
class SomeClass():
    def __str__(self):
        return "Called __str__"


if __name__ == "__main__":
    some_inst = SomeClass()
    print(some_inst)
    if (version_info > (3,0)):
        print("Python 3 does not support unicode()")
    else:
        print(unicode(some_inst))

Here is example output (where venv2/venv3 are virtualenv instances):

~/tmp$ ./venv3/bin/python3 demo_python_2_unicode_compatible.py 
Called __str__
Python 3 does not support unicode()

~/tmp$ ./venv2/bin/python2 demo_python_2_unicode_compatible.py 
Called __str__
Called __str__

Answer 4

Python 2: Implement __str__() only, and return a unicode.

When __unicode__() is omitted and someone calls unicode(o) or u"%s"%o, Python calls o.__str__() and converts to unicode using the system encoding. (See documentation of __unicode__().)

The opposite is not true. If you implement __unicode__() but not __str__(), then when someone calls str(o) or "%s"%o, Python returns repr(o).


Rationale

Why would it work to return a unicode from __str__()?
If __str__() returns a unicode, Python automatically converts it to str using the system encoding.

What’s the benefit?
① It frees you from worrying about what the system encoding is (i.e., locale.getpreferredencoding(…)). Personally, I find that messy, and I think it’s something the system should take care of anyway. ② If you are careful, your code may come out cross-compatible with Python 3, in which __str__() returns unicode.

Isn’t it deceptive to return a unicode from a function called __str__()?
A little. However, you might be already doing it. If you have from __future__ import unicode_literals at the top of your file, there’s a good chance you’re returning a unicode without even knowing it.

What about Python 3?
Python 3 does not use __unicode__(). However, if you implement __str__() so that it returns unicode under either Python 2 or Python 3, then that part of your code will be cross-compatible.

What if I want unicode(o) to be substantively different from str()?
Implement both __str__() (possibly returning str) and __unicode__(). I imagine this would be rare, but you might want substantively different output (e.g., ASCII versions of special characters, like ":)" for u"☺").

I realize some may find this controversial.


Answer 5

It’s worth pointing out to those unfamiliar with the __unicode__ function some of the default behaviors surrounding it back in Python 2.x, especially when defined side by side with __str__.

class A :
    def __init__(self) :
        self.x = 123
        self.y = 23.3

    #def __str__(self) :
    #    return "STR      {}      {}".format( self.x , self.y)
    def __unicode__(self) :
        return u"UNICODE  {}      {}".format( self.x , self.y)

a1 = A()
a2 = A()

print( "__repr__ checks")
print( a1 )
print( a2 )

print( "\n__str__ vs __unicode__ checks")
print( str( a1 ))
print( unicode(a1))
print( "{}".format( a1 ))
print( u"{}".format( a1 ))

yields the following console output…

__repr__ checks
<__main__.A instance at 0x103f063f8>
<__main__.A instance at 0x103f06440>

__str__ vs __unicode__ checks
<__main__.A instance at 0x103f063f8>
UNICODE  123      23.3
<__main__.A instance at 0x103f063f8>
UNICODE  123      23.3

Now when I uncomment the __str__ method:

__repr__ checks
STR      123      23.3
STR      123      23.3

__str__ vs __unicode__ checks
STR      123      23.3
UNICODE  123      23.3
STR      123      23.3
UNICODE  123      23.3

Is there a standardized method to swap two variables in Python?

Question: Is there a standardized method to swap two variables in Python?

In Python, I’ve seen two variable values swapped using this syntax:

left, right = right, left

Is this considered the standard way to swap two variable values or is there some other means by which two variables are by convention most usually swapped?


Answer 0

Python evaluates expressions from left to right. Notice that while evaluating an assignment, the right-hand side is evaluated before the left-hand side.

http://docs.python.org/3/reference/expressions.html#evaluation-order

That means the following for the expression a,b = b,a :

  • The right-hand side b,a is evaluated, that is to say a tuple of two elements is created in memory. The two elements are the objects designated by the identifiers b and a, which existed before the instruction was encountered during execution of the program.
  • Just after the creation of this tuple, it has not yet been assigned to any name, but that doesn’t matter: Python internally knows where it is.
  • Then the left-hand side is evaluated, that is to say the tuple is assigned to the left-hand side.
  • As the left-hand side is composed of two identifiers, the tuple is unpacked so that the first identifier a is assigned to the first element of the tuple (the object that was formerly b before the swap, because it had the name b),
    and the second identifier b is assigned to the second element of the tuple (the object that was formerly a before the swap, because its identifier was a).

This mechanism has effectively swapped the objects assigned to the identifiers a and b.

So, to answer your question: YES, it’s the standard way to swap two identifiers on two objects.
By the way, the objects are not variables, they are objects.
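
The two steps described above can be spelled out as a small sketch. The intermediate name packed is introduced only to make the hidden tuple visible, and id() is used to confirm that the objects themselves are untouched while the names are rebound:

```python
a = ["first"]
b = ["second"]
id_a, id_b = id(a), id(b)

packed = (b, a)  # step 1: the right-hand side becomes a tuple of the existing objects
a, b = packed    # step 2: the tuple is unpacked, binding the names left to right

names_swapped = (id(a) == id_b) and (id(b) == id_a)
print(a, b, names_swapped)
```

No object is copied or modified; only the bindings of the two identifiers change.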


Answer 1

That is the standard way to swap two variables, yes.


Answer 2

I know three ways to swap variables, but a, b = b, a is the simplest. They are:

XOR (for integers)

x = x ^ y
y = y ^ x
x = x ^ y

Or concisely,

x ^= y
y ^= x
x ^= y

Temporary variable

w = x
x = y
y = w
del w

Tuple swap

x, y = y, x

Answer 3

I would not say it is a standard way to swap because it will cause some unexpected errors.

nums[i], nums[nums[i] - 1] = nums[nums[i] - 1], nums[i]

nums[i] will be modified first and then affect the second variable nums[nums[i] - 1].
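
A small sketch of this pitfall, using a two-element list chosen for illustration. The function names are made up. Targets on the left are assigned one after another, so the index expression nums[i] - 1 in the second target is evaluated after nums[i] has already changed:

```python
def swap_naive(nums, i):
    # nums[i] is assigned first, so the second target's index uses the NEW nums[i].
    nums[i], nums[nums[i] - 1] = nums[nums[i] - 1], nums[i]
    return nums


def swap_reordered(nums, i):
    # Assigning the dependent target first avoids the problem.
    nums[nums[i] - 1], nums[i] = nums[i], nums[nums[i] - 1]
    return nums


def swap_indexed(nums, i):
    # Computing the second index up front is the most explicit fix.
    j = nums[i] - 1
    nums[i], nums[j] = nums[j], nums[i]
    return nums


print(swap_naive([2, 1], 0))      # [2, 1]: the list is left unchanged
print(swap_reordered([2, 1], 0))  # [1, 2]
print(swap_indexed([2, 1], 0))    # [1, 2]
```

The tuple swap itself is not at fault here; the surprise comes from target expressions that depend on each other.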


Answer 4

This does not work for the rows of multidimensional NumPy arrays, because data[0] and data[1] are references (views) into the array rather than copies.

import numpy as np

# swaps
data = np.random.random(2)
print(data)
data[0], data[1] = data[1], data[0]
print(data)

# does not swap
data = np.random.random((2, 2))
print(data)
data[0], data[1] = data[1], data[0]
print(data)

See also Swap slices of Numpy arrays


Answer 5

To get around the problems explained by eyquem, you could use the copy module to return a tuple containing (reversed) copies of the values, via a function:

from copy import copy

def swapper(x, y):
  return (copy(y), copy(x))

Same function as a lambda:

swapper = lambda x, y: (copy(y), copy(x))

Then, assign those to the desired names, like this:

x, y = swapper(x, y)

NOTE: if you wanted to you could import/use deepcopy instead of copy.


Answer 6

You can combine tuple and XOR swaps: x, y = x ^ x ^ y, x ^ y ^ y

x, y = 10, 20

print('Before swapping: x = %s, y = %s '%(x,y))

x, y = x ^ x ^ y, x ^ y ^ y

print('After swapping: x = %s, y = %s '%(x,y))

or

x, y = 10, 20

print('Before swapping: x = %s, y = %s '%(x,y))

print('After swapping: x = %s, y = %s '%(x ^ x ^ y, x ^ y ^ y))

Using lambda:

x, y = 10, 20

print('Before swapping: x = %s, y = %s' % (x, y))

swapper = lambda x, y : ((x ^ x ^ y), (x ^ y ^ y))

print('After swapping: x = %s, y = %s ' % swapper(x, y))

Output:

Before swapping: x = 10, y = 20
After swapping: x = 20, y = 10