问题:列表理解与地图

有理由更喜欢使用map()列表理解吗?反之亦然?它们中的一个通常比另一个效率更高,或者通常被认为比另一个更Python化吗?

Is there a reason to prefer using map() over list comprehension or vice versa? Is either of them generally more efficient or considered generally more pythonic than the other?


回答 0

map在某些情况下(如果您不是出于此目的而使用lambda,而是在map和listcomp中使用相同的函数),在微观上可能会更快。在其他情况下,列表理解可能会更快,并且大多数(并非全部)pythonista用户认为列表更直接,更清晰。

使用完全相同的函数时map的微小速度优势的一个示例:

$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

当地图需要使用lambda时,如何完全颠倒性能比较的示例:

$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop

map may be microscopically faster in some cases (when you’re NOT making a lambda for the purpose, but using the same function in map and a listcomp). List comprehensions may be faster in other cases and most (not all) pythonistas consider them more direct and clearer.

An example of the tiny speed advantage of map when using exactly the same function:

$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

An example of how performance comparison gets completely reversed when map needs a lambda:

$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop

回答 1

案例

  • 常见情况:几乎总是,您将要在python中使用列表推导,因为对于新手程序员来说,阅读代码会更加明显。(这不适用于可能适用其他习惯用法的其他语言。)由于列表推导是python中用于迭代的事实上的标准,因此您对python程序员所做的工作甚至会更加明显。他们是预期的
  • 较少见的情况:但是,如果您已经定义了一个函数,则使用通常是合理的map,尽管它被认为是“非pythonic”的。例如,map(sum, myLists)比更加优雅/简洁[sum(x) for x in myLists]。您可以不必编写一个虚拟变量(例如sum(x) for x...or sum(_) for _...sum(readableName) for readableName...),而只需键入两次即可进行迭代,从而获得了优雅。同样的道理也适用于filterreduce从什么itertools模块:如果你已经有一个方便的功能,您可以继续前进,做一些函数式编程。在某些情况下,这会提高可读性,而在其他情况下(例如,新手程序员,多个参数),则会失去可读性。但是,无论如何,代码的可读性在很大程度上取决于注释。
  • 几乎永远不会map在进行函数编程时,您可能希望将函数用作纯抽象函数,在这种情况下您正在映射map,currying map,或者从map以函数的形式进行讨论中受益。例如,在Haskell中,一个称为functor的接口可以fmap概括任何数据结构上的映射。这在python中非常罕见,因为python语法迫使您使用生成器样式来谈论迭代;您不能轻易将其概括。(这有时是好事,有时是坏事。)您可能会想出一些罕见的python例子,这map(f, *lists)是合理的事情。我能想到的最接近的示例是sumEach = partial(map,sum),这是一个单行代码,大致相当于:

def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • 仅使用for-loop:您当然也可以使用for循环。尽管从函数式编程的角度来看并不那么优雅,但有时​​非局部变量使命令式编程语言(例如python)中的代码更清晰,因为人们已经非常习惯于以这种方式读取代码。通常,当您仅执行任何不构建列表的复杂操作(例如列表理解和映射)(例如,求和或制作树等)时,for循环也是最有效的。就内存而言,它是高效的(不必在时间上,我希望在最坏的情况下,它是一个恒定的因素,除非出现一些罕见的病理性垃圾收集问题)。

“ Python主义”

我不喜欢“ pythonic”一词,因为我发现pythonic在我眼中并不总是那么优雅。然而,mapfilter和类似的功能(如非常有用的itertools模块)很可能在风格方面考虑unpythonic。

懒惰

就效率而言,就像大多数函数式编程构造一样,MAP可以是LAZY,实际上在python中是懒惰的。这意味着您可以执行此操作(在python3中),并且计算机不会耗尽内存,并且不会丢失所有未保存的数据:

>>> map(str, range(10**100))
<map object at 0x2201d50>

尝试通过列表理解做到这一点:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

请注意,列表推导本质上也是惰性的,但是python选择将其实现为非惰性的。不过,python确实以生成器表达式的形式支持惰性列表推导,如下所示:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

您基本上可以将[...]语法视为将生成器表达式传递给list构造函数,例如list(x for x in range(5))

简短的人为例子

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

列表推导是非延迟的,因此可能需要更多内存(除非您使用生成器推导)。方括号[...]通常使事情变得显而易见,尤其是在括号中。另一方面,有时您最终会变得像打字一样冗长[x for x in...。只要您使迭代器变量简短,如果不缩进代码,列表解析通常会更加清晰。但是您总是可以缩进代码。

print(
    {x:x**2 for x in (-y for y in range(5))}
)

或分手:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

python3的效率比较

map 现在很懒:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

因此,如果您将不使用所有数据,或者不提前知道需要多少数据,那么map在python3中(以及python2或python3中的生成器表达式)将避免计算它们的值,直到最后一刻。通常,这通常会超过使用带来的任何开销map。不利之处在于,与大多数功能语言相反,这在python中非常有限:只有按“顺序”从左至右访问数据时,您才能获得此好处,因为python生成器表达式只能按order求值x[0], x[1], x[2], ...

但是,假设我们有一个f想要的预制函数map,并且我们忽略了map通过立即强制使用来进行赋值的懒惰list(...)。我们得到一些非常有趣的结果:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

结果为AAA / BBB / CCC格式,其中A在带有python 3。?。?的约2010年英特尔工作站上执行,而B和C则是在python 3.2.1的约2013年AMD工作站上执行,具有截然不同的硬件。结果似乎是,地图和列表理解的性能可比,这受其他随机因素的影响最大。我们可以告诉的唯一的事情似乎是,奇怪的是,虽然我们期待列表解析[...]比生成器表达式更好地发挥(...)map也更高效,生成器表达式(再次假设计算所有的值/使用)。

重要的是要意识到这些测试假设一个非常简单的功能(身份功能)。但是这很好,因为如果功能复杂,那么与程序中的其他因素相比,性能开销可以忽略不计。(测试其他简单的东西可能仍然很有趣,例如f=lambda x:x+x

如果您精通python汇编语言,则可以使用该dis模块来查看这是否是幕后真正发生的事情:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

似乎使用[...]语法比更好list(...)。遗憾的是,map该类在拆卸方面有点不透明,但是我们可以通过速度测试来确定。

Cases

  • Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you’re doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you’re doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected.
  • Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered ‘unpythonic’. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)… but the readability of your code highly depends on your comments anyway.
  • Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you’re mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can’t generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:

def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) — at least efficient in terms of memory (not necessarily in terms of time, where I’d expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).

“Pythonism”

I dislike the word “pythonic” because I don’t find that pythonic is always elegant in my eyes. Nevertheless, map and filter and similar functions (like the very useful itertools module) are probably considered unpythonic in terms of style.

Laziness

In terms of efficiency, like most functional programming constructs, MAP CAN BE LAZY, and in fact is lazy in python. That means you can do this (in python3) and your computer will not run out of memory and lose all your unsaved data:

>>> map(str, range(10**100))
<map object at 0x2201d50>

Try doing that with a list comprehension:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

Do note that list comprehensions are also inherently lazy, but python has chosen to implement them as non-lazy. Nevertheless, python does support lazy list comprehensions in the form of generator expressions, as follows:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

You can basically think of the [...] syntax as passing in a generator expression to the list constructor, like list(x for x in range(5)).

Brief contrived example

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

List comprehensions are non-lazy, so may require more memory (unless you use generator comprehensions). The square brackets [...] often make things obvious, especially when in a mess of parentheses. On the other hand, sometimes you end up being verbose like typing [x for x in.... As long as you keep your iterator variables short, list comprehensions are usually clearer if you don’t indent your code. But you could always indent your code.

print(
    {x:x**2 for x in (-y for y in range(5))}
)

or break things up:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

Efficiency comparison for python3

map is now lazy:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right “in order”, because python generator expressions can only be evaluated the order x[0], x[1], x[2], ....

However let’s say that we have a pre-made function f we’d like to map, and we ignore the laziness of map by immediately forcing evaluation with list(...). We get some very interesting results:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

In results are in the form AAA/BBB/CCC where A was performed with on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed with a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient that generator expressions (again assuming that all values are evaluated/used).

It is important to realize that these tests assume a very simple function (the identity function); however this is fine because if the function were complicated, then performance overhead would be negligible compared to other factors in the program. (It may still be interesting to test with other simple things like f=lambda x:x+x)

If you’re skilled at reading python assembly, you can use the dis module to see if that’s actually what’s going on behind the scenes:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

It seems it is better to use [...] syntax than list(...). Sadly the map class is a bit opaque to disassembly, but we can make due with our speed test.


回答 2

Python 2:您应该使用mapfilter而不是列表推导。

即使它们不是“ Pythonic”的,您还是还是偏爱它们的一个客观原因是:
它们需要函数/ lambda作为参数,从而引入了新的作用域

我被这个不止一次地咬了:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

但如果相反,我曾说过:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

那一切都会好起来的

您可能会说我在相同范围内使用相同的变量名很愚蠢。

我不是 最初的代码很好-两者x不在同一范围内。
直到我内部块移到代码的不同部分之后,问题才出现(阅读:维护期间的问题,而不是开发过程中的问题),而且我没想到。

是的,如果您从未犯过此错误,则列表理解会更优雅。
但是从个人经验(和看到其他人犯同样的错误)中,我已经看到它发生了很多次,我认为当这些错误潜入您的代码中时,您不应该经历这种痛苦。

结论:

使用mapfilter。它们可以防止与范围相关的细微难以诊断的错误。

边注:

如果适合您的情况,请不要忘记考虑使用imapifilter(中的itertools)!

Python 2: You should use map and filter instead of list comprehensions.

An objective reason why you should prefer them even though they’re not “Pythonic” is this:
They require functions/lambdas as arguments, which introduce a new scope.

I’ve gotten bitten by this more than once:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

but if instead I had said:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

then everything would’ve been fine.

You could say I was being silly for using the same variable name in the same scope.

I wasn’t. The code was fine originally — the two xs weren’t in the same scope.
It was only after I moved the inner block to a different section of the code that the problem came up (read: problem during maintenance, not development), and I didn’t expect it.

Yes, if you never make this mistake then list comprehensions are more elegant.
But from personal experience (and from seeing others make the same mistake) I’ve seen it happen enough times that I think it’s not worth the pain you have to go through when these bugs creep into your code.

Conclusion:

Use map and filter. They prevent subtle hard-to-diagnose scope-related bugs.

Side note:

Don’t forget to consider using imap and ifilter (in itertools) if they are appropriate for your situation!


回答 3

实际上,map列表理解在Python 3语言中的行为大不相同。看一下下面的Python 3程序:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

您可能希望它打印两次“ [1,4,9]”行,但是打印“ [1,4,9]”后跟“ []”。第一次查看时,squares它似乎表现为三个元素的序列,但是第二次查看时为空元素。

在Python 2语言中,会map返回一个普通的旧列表,就像列表推导在两种语言中一样。症结在于,mapPython 3(和imapPython 2)中的return值不是一个列表-它是一个迭代器!

与遍历列表不同,遍历迭代器时将消耗元素。这就是为什么squares在最后print(list(squares))一行看起来空白。

总结一下:

  • 在处理迭代器时,必须记住它们是有状态的,并且在遍历它们时会发生变化。
  • 列表更容易预测,因为它们仅在您显式对其进行更改时才会更改;他们是容器
  • 还有一个好处:数字,字符串和元组甚至可以更容易预测,因为它们根本无法更改;它们是价值

Actually, map and list comprehensions behave quite differently in the Python 3 language. Take a look at the following Python 3 program:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

You might expect it to print the line “[1, 4, 9]” twice, but instead it prints “[1, 4, 9]” followed by “[]”. The first time you look at squares it seems to behave as a sequence of three elements, but the second time as an empty one.

In the Python 2 language map returns a plain old list, just like list comprehensions do in both languages. The crux is that the return value of map in Python 3 (and imap in Python 2) is not a list – it’s an iterator!

The elements are consumed when you iterate over an iterator unlike when you iterate over a list. This is why squares looks empty in the last print(list(squares)) line.

To summarize:

  • When dealing with iterators you have to remember that they are stateful and that they mutate as you traverse them.
  • Lists are more predictable since they only change when you explicitly mutate them; they are containers.
  • And a bonus: numbers, strings, and tuples are even more predictable since they cannot change at all; they are values.

回答 4

我发现列表理解通常比我要表达的要表达的要多map-它们都可以完成,但是前者节省了试图理解什么可能是复杂lambda表达的精神负担。

在某个地方(我无法找到它)也有一次采访,其中Guido列出了lambdas和函数功能,这是他最后悔接受Python的事情,因此您可以凭借这些观点认为它们是非Python的其中。

I find list comprehensions are generally more expressive of what I’m trying to do than map – they both get it done, but the former saves the mental load of trying to understand what could be a complex lambda expression.

There’s also an interview out there somewhere (I can’t find it offhand) where Guido lists lambdas and the functional functions as the thing he most regrets about accepting into Python, so you could make the argument that they’re un-Pythonic by virtue of that.


回答 5

这是一种可能的情况:

map(lambda op1,op2: op1*op2, list1, list2)

与:

[op1*op2 for op1,op2 in zip(list1,list2)]

我猜想zip()是一个不幸的和不必要的开销,如果您坚持使用列表推导而不是地图,则需要沉迷于此。如果有人肯定或否定这一点,那就太好了。

Here is one possible case:

map(lambda op1,op2: op1*op2, list1, list2)

versus:

[op1*op2 for op1,op2 in zip(list1,list2)]

I am guessing the zip() is an unfortunate and unnecessary overhead you need to indulge in if you insist on using list comprehensions instead of the map. Would be great if someone clarifies this whether affirmatively or negatively.


回答 6

如果您打算编写任何异步,并行或分布式代码,则您可能会更喜欢map列表理解-因为大多数异步,并行或分布式程序包都提供了map使python过载的功能map。然后,通过将适当的map函数传递给其余代码,您可能不必修改原始串行代码即可使其并行运行(等)。

If you plan on writing any asynchronous, parallel, or distributed code, you will probably prefer map over a list comprehension — as most asynchronous, parallel, or distributed packages provide a map function to overload python’s map. Then by passing the appropriate map function to the rest of your code, you may not have to modify your original serial code to have it run in parallel (etc).


回答 7

因此,由于Python 3 是迭代器,因此您需要牢记所需的东西:迭代器或list对象。

正如@AlexMartelli已经提到的那样map()仅当您不使用lambda函数时,它比列表理解要快。

我将向您介绍一些时间比较。

Python 3.5.2和CPython
我使用了Jupiter笔记本,尤其是%timeit内置的魔术命令
测量:s == 1000 ms == 1000 * 1000 µs = 1000 * 1000 * 1000 ns

设定:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

内置功能:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda 功能:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

还有诸如生成器表达式之类的东西,请参阅PEP-0289。所以我认为将其添加到比较中将很有用

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

您需要list对象:

如果是自定义函数,list(map())则使用列表理解;如果有内置函数,则使用列表理解

您不需要list对象,只需要一个可迭代的对象:

始终使用map()

So since Python 3, is an iterator, you need to keep in mind what do you need: an iterator or list object.

As @AlexMartelli already mentioned, map() is faster than list comprehension only if you don’t use lambda function.

I will present you some time comparisons.

Python 3.5.2 and CPython
I’ve used Jupiter notebook and especially %timeit built-in magic command
Measurements: s == 1000 ms == 1000 * 1000 µs = 1000 * 1000 * 1000 ns

Setup:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

Built-in function:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda function:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

There is also such thing as generator expression, see PEP-0289. So i thought it would be useful to add it to comparison

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

You need list object:

Use list comprehension if it’s custom function, use list(map()) if there is builtin function

You don’t need list object, you just need iterable one:

Always use map()!


回答 8

我进行了一项快速测试,比较了三种调用对象方法的方法。在这种情况下,时差可以忽略不计,并且与所讨论的功能有关(请参阅@Alex Martelli的回复)。在这里,我查看了以下方法:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

我查看vals了整数(Python int)和浮点数(Python )的列表(存储在变量中),float以增加列表大小。DummyNum考虑以下虚拟类:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

具体来说,add方法。该__slots__属性是Python中的一种简单优化,用于定义类(属性)所需的总内存,从而减小了内存大小。这是结果图。

映射Python对象方法的性能

如前所述,所使用的技术差异很小,您应该以一种对您最易读的方式或在特定情况下进行编码。在这种情况下,列表推导(map_comprehension技巧)对于对象中两种类型的加法最快,尤其是对于较短的列表。

访问此pastebin,获取用于生成图和数据的源。

I ran a quick test comparing three methods for invoking the method of an object. The time difference, in this case, is negligible and is a matter of the function in question (see @Alex Martelli’s response). Here, I looked at the following methods:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

I looked at lists (stored in the variable vals) of both integers (Python int) and floating point numbers (Python float) for increasing list sizes. The following dummy class DummyNum is considered:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

Specifically, the add method. The __slots__ attribute is a simple optimization in Python to define the total memory needed by the class (attributes), reducing memory size. Here are the resulting plots.

Performance of mapping Python object methods

As stated previously, the technique used makes a minimal difference and you should code in a way that is most readable to you, or in the particular circumstance. In this case, the list comprehension (map_comprehension technique) is fastest for both types of additions in an object, especially with shorter lists.

Visit this pastebin for the source used to generate the plot and data.


回答 9

我认为最Python的方式是使用列表理解而不是mapand filter。原因是列表理解比map和更清晰filter

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

如您所见,理解并不需要额外的lambda表达式map。此外,理解还允许容易地过滤,同时map需要filter允许过滤。

I consider that the most Pythonic way is to use a list comprehension instead of map and filter. The reason is that list comprehensions are clearer than map and filter.

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

As you an see, a comprehension does not require extra lambda expressions as map needs. Furthermore, a comprehension also allows filtering easily, while map requires filter to allow filtering.


回答 10

我尝试了@ alex-martelli的代码,但发现了一些差异

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop
python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

映射即使在非常大的范围内也要花费相同的时间,而使用列表理解则要花费很多时间,这从我的代码中可以明显看出。因此,除了被视为“ unpythonic”之外,我还没有遇到任何与使用map有关的性能问题。

I tried the code by @alex-martelli but found some discrepancies

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop
python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

map takes the same amount of time even for very large ranges while using list comprehension takes a lot of time as is evident from my code. So apart from being considered “unpythonic”, I have not faced any performance issues relating to usage of map.


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。