标签归档:map-function

列表理解和功能比“ for循环”快吗?

问题:列表理解和功能比“ for循环”快吗?

在Python的性能方面,是一个列表理解或功能,如map()filter()reduce()比for循环快?从技术上讲,为什么它们以C速度运行,而for循环以python虚拟机速度运行

假设在我正在开发的游戏中,我需要使用for循环绘制复杂而庞大的地图。这个问题肯定是相关的,例如,如果列表理解确实确实更快,那么它将是避免滞后的更好选择(尽管代码具有视觉上的复杂性)。

In terms of performance in Python, is a list-comprehension, or functions like map(), filter() and reduce() faster than a for loop? Why, technically, they run in a C speed, while the for loop runs in the python virtual machine speed?.

Suppose that in a game that I’m developing I need to draw complex and huge maps using for loops. This question would be definitely relevant, for if a list-comprehension, for example, is indeed faster, it would be a much better option in order to avoid lags (Despite the visual complexity of the code).


回答 0

以下是粗略的准则和基于经验的有根据的猜测。您应该timeit或配置您的具体用例以获得确切的数字,并且这些数字有时可能与以下内容不一致。

列表理解通常比精确等效的for循环(实际上是构建列表)要快一点,这很可能是因为它不必append在每次迭代时都查找列表及其方法。但是,列表理解仍然会执行字节码级别的循环:

>>> dis.dis(<the code object for `[x for x in range(10)]`>)
 1           0 BUILD_LIST               0
             3 LOAD_FAST                0 (.0)
       >>    6 FOR_ITER                12 (to 21)
             9 STORE_FAST               1 (x)
            12 LOAD_FAST                1 (x)
            15 LIST_APPEND              2
            18 JUMP_ABSOLUTE            6
       >>   21 RETURN_VALUE

由于创建和扩展列表的开销很大,因此使用列表推导代替构建列表的循环,无意义地累积无意义的值的列表然后将其扔掉通常会较慢。列表理解并不是比一个好的旧循环本质上更快的魔术。

至于功能列表处理功能:虽然这些都是用C语言编写,并可能超越Python编写的相同的功能,它们是不是一定是最快的选择。如果该函数也是用C编写的,则可以提高速度。但是在大多数情况下,使用lambda(或其他Python函数)重复设置Python堆栈框架等开销会消耗掉所有的节省。在没有函数调用的情况下,简单地在线完成相同的工作(例如,用列表理解代替mapor filter)通常会稍快一些。

假设在我正在开发的游戏中,我需要使用for循环绘制复杂而庞大的地图。这个问题肯定是相关的,例如,如果列表理解确实确实更快,那么它将是避免滞后的更好选择(尽管代码具有视觉上的复杂性)。

很有可能,如果用良好的非“优化” Python编写时这样的代码还不够快,那么就没有足够的Python级微优化可以使它足够快了,您应该开始考虑降级为C。微观优化通常可以大大加快Python代码的速度,对此(绝对值)的限制很低。而且,即使在达到极限之前,咬住子弹并写一些C语言也变得更具成本效益(加速15%,加速300%)。

The following are rough guidelines and educated guesses based on experience. You should timeit or profile your concrete use case to get hard numbers, and those numbers may occasionally disagree with the below.

A list comprehension is usually a tiny bit faster than the precisely equivalent for loop (that actually builds a list), most likely because it doesn’t have to look up the list and its append method on every iteration. However, a list comprehension still does a bytecode-level loop:

>>> dis.dis(<the code object for `[x for x in range(10)]`>)
 1           0 BUILD_LIST               0
             3 LOAD_FAST                0 (.0)
       >>    6 FOR_ITER                12 (to 21)
             9 STORE_FAST               1 (x)
            12 LOAD_FAST                1 (x)
            15 LIST_APPEND              2
            18 JUMP_ABSOLUTE            6
       >>   21 RETURN_VALUE

Using a list comprehension in place of a loop that doesn’t build a list, nonsensically accumulating a list of meaningless values and then throwing the list away, is often slower because of the overhead of creating and extending the list. List comprehensions aren’t magic that is inherently faster than a good old loop.

As for functional list processing functions: While these are written in C and probably outperform equivalent functions written in Python, they are not necessarily the fastest option. Some speed up is expected if the function is written in C too. But most cases using a lambda (or other Python function), the overhead of repeatedly setting up Python stack frames etc. eats up any savings. Simply doing the same work in-line, without function calls (e.g. a list comprehension instead of map or filter) is often slightly faster.

Suppose that in a game that I’m developing I need to draw complex and huge maps using for loops. This question would be definitely relevant, for if a list-comprehension, for example, is indeed faster, it would be a much better option in order to avoid lags (Despite the visual complexity of the code).

Chances are, if code like this isn’t already fast enough when written in good non-“optimized” Python, no amount of Python level micro optimization is going to make it fast enough and you should start thinking about dropping to C. While extensive micro optimizations can often speed up Python code considerably, there is a low (in absolute terms) limit to this. Moreover, even before you hit that ceiling, it becomes simply more cost efficient (15% speedup vs. 300% speed up with the same effort) to bite the bullet and write some C.


回答 1

如果查看python.org上信息,则可以看到以下摘要:

Version Time (seconds)
Basic loop 3.47
Eliminate dots 2.45
Local variable & no dots 1.79
Using map function 0.54

但是您确实应该详细阅读以上文章,以了解造成性能差异的原因。

我也强烈建议您使用timeit对代码计时。在一天的结束时,可能会出现例如for在满足条件时需要跳出循环的情况。它可能比调用找出结果更快map

If you check the info on python.org, you can see this summary:

Version Time (seconds)
Basic loop 3.47
Eliminate dots 2.45
Local variable & no dots 1.79
Using map function 0.54

But you really should read the above article in details to understand the cause of the performance difference.

I also strongly suggest you should time your code by using timeit. At the end of the day, there can be a situation where, for example, you may need to break out of for loop when a condition is met. It could potentially be faster than finding out the result by calling map.


回答 2

你具体问左右map()filter()reduce(),但我相信你想了解一般的函数式编程。在对一组点中所有点之间的距离进行计算的问题上进行了自我测试之后,结果表明,函数编程(使用starmap内置itertools模块中的函数)比for循环要慢一些(使用1.25倍的时间)。事实)。这是我使用的示例代码:

import itertools, time, math, random

class Point:
    def __init__(self,x,y):
        self.x, self.y = x, y

point_set = (Point(0, 0), Point(0, 1), Point(0, 2), Point(0, 3))
n_points = 100
pick_val = lambda : 10 * random.random() - 5
large_set = [Point(pick_val(), pick_val()) for _ in range(n_points)]
    # the distance function
f_dist = lambda x0, x1, y0, y1: math.sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)
    # go through each point, get its distance from all remaining points 
f_pos = lambda p1, p2: (p1.x, p2.x, p1.y, p2.y)

extract_dists = lambda x: itertools.starmap(f_dist, 
                          itertools.starmap(f_pos, 
                          itertools.combinations(x, 2)))

print('Distances:', list(extract_dists(point_set)))

t0_f = time.time()
list(extract_dists(large_set))
dt_f = time.time() - t0_f

功能版本是否比程序版本更快?

def extract_dists_procedural(pts):
    n_pts = len(pts)
    l = []    
    for k_p1 in range(n_pts - 1):
        for k_p2 in range(k_p1, n_pts):
            l.append((pts[k_p1].x - pts[k_p2].x) ** 2 +
                     (pts[k_p1].y - pts[k_p2].y) ** 2)
    return l

t0_p = time.time()
list(extract_dists_procedural(large_set)) 
    # using list() on the assumption that
    # it eats up as much time as in the functional version

dt_p = time.time() - t0_p

f_vs_p = dt_p / dt_f
if f_vs_p >= 1.0:
    print('Time benefit of functional progamming:', f_vs_p, 
          'times as fast for', n_points, 'points')
else:
    print('Time penalty of functional programming:', 1 / f_vs_p, 
          'times as slow for', n_points, 'points')

You ask specifically about map(), filter() and reduce(), but I assume you want to know about functional programming in general. Having tested this myself on the problem of computing distances between all points within a set of points, functional programming (using the starmap function from the built-in itertools module) turned out to be slightly slower than for-loops (taking 1.25 times as long, in fact). Here is the sample code I used:

import itertools, time, math, random

class Point:
    def __init__(self,x,y):
        self.x, self.y = x, y

point_set = (Point(0, 0), Point(0, 1), Point(0, 2), Point(0, 3))
n_points = 100
pick_val = lambda : 10 * random.random() - 5
large_set = [Point(pick_val(), pick_val()) for _ in range(n_points)]
    # the distance function
f_dist = lambda x0, x1, y0, y1: math.sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)
    # go through each point, get its distance from all remaining points 
f_pos = lambda p1, p2: (p1.x, p2.x, p1.y, p2.y)

extract_dists = lambda x: itertools.starmap(f_dist, 
                          itertools.starmap(f_pos, 
                          itertools.combinations(x, 2)))

print('Distances:', list(extract_dists(point_set)))

t0_f = time.time()
list(extract_dists(large_set))
dt_f = time.time() - t0_f

Is the functional version faster than the procedural version?

def extract_dists_procedural(pts):
    n_pts = len(pts)
    l = []    
    for k_p1 in range(n_pts - 1):
        for k_p2 in range(k_p1, n_pts):
            l.append((pts[k_p1].x - pts[k_p2].x) ** 2 +
                     (pts[k_p1].y - pts[k_p2].y) ** 2)
    return l

t0_p = time.time()
list(extract_dists_procedural(large_set)) 
    # using list() on the assumption that
    # it eats up as much time as in the functional version

dt_p = time.time() - t0_p

f_vs_p = dt_p / dt_f
if f_vs_p >= 1.0:
    print('Time benefit of functional progamming:', f_vs_p, 
          'times as fast for', n_points, 'points')
else:
    print('Time penalty of functional programming:', 1 / f_vs_p, 
          'times as slow for', n_points, 'points')

回答 3

我写了一个简单的脚本来测试速度,这就是我发现的结果。实际上对于我来说,for循环最快。真的让我感到惊讶,请检查波纹管(计算平方和)。

from functools import reduce
import datetime


def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next**2, numbers, 0)


def square_sum2(numbers):
    a = 0
    for i in numbers:
        i = i**2
        a += i
    return a

def square_sum3(numbers):
    sqrt = lambda x: x**2
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return(sum([int(i)**2 for i in numbers]))


time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.302000 #Reduce
0:00:00.144000 #For loop
0:00:00.318000 #Map
0:00:00.390000 #List comprehension

I wrote a simple script that test the speed and this is what I found out. Actually for loop was fastest in my case. That really suprised me, check out bellow (was calculating sum of squares).

from functools import reduce
import datetime


def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next**2, numbers, 0)


def square_sum2(numbers):
    a = 0
    for i in numbers:
        i = i**2
        a += i
    return a

def square_sum3(numbers):
    sqrt = lambda x: x**2
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return(sum([int(i)**2 for i in numbers]))


time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.302000 #Reduce
0:00:00.144000 #For loop
0:00:00.318000 #Map
0:00:00.390000 #List comprehension

回答 4

我修改了@Alisa的代码,并用来cProfile说明为什么列表理解速度更快:

from functools import reduce
import datetime

def reduce_(numbers):
    return reduce(lambda sum, next: sum + next * next, numbers, 0)

def for_loop(numbers):
    a = []
    for i in numbers:
        a.append(i*2)
    a = sum(a)
    return a

def map_(numbers):
    sqrt = lambda x: x*x
    return sum(map(sqrt, numbers))

def list_comp(numbers):
    return(sum([i*i for i in numbers]))

funcs = [
        reduce_,
        for_loop,
        map_,
        list_comp
        ]

if __name__ == "__main__":
    # [1, 2, 5, 3, 1, 2, 5, 3]
    import cProfile
    for f in funcs:
        print('=' * 25)
        print("Profiling:", f.__name__)
        print('=' * 25)
        pr = cProfile.Profile()
        for i in range(10**6):
            pr.runcall(f, [1, 2, 5, 3, 1, 2, 5, 3])
        pr.create_stats()
        pr.print_stats()

结果如下:

=========================
Profiling: reduce_
=========================
         11000000 function calls in 1.501 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.162    0.000    1.473    0.000 profiling.py:4(reduce_)
  8000000    0.461    0.000    0.461    0.000 profiling.py:5(<lambda>)
  1000000    0.850    0.000    1.311    0.000 {built-in method _functools.reduce}
  1000000    0.028    0.000    0.028    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: for_loop
=========================
         11000000 function calls in 1.372 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.879    0.000    1.344    0.000 profiling.py:7(for_loop)
  1000000    0.145    0.000    0.145    0.000 {built-in method builtins.sum}
  8000000    0.320    0.000    0.320    0.000 {method 'append' of 'list' objects}
  1000000    0.027    0.000    0.027    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: map_
=========================
         11000000 function calls in 1.470 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.264    0.000    1.442    0.000 profiling.py:14(map_)
  8000000    0.387    0.000    0.387    0.000 profiling.py:15(<lambda>)
  1000000    0.791    0.000    1.178    0.000 {built-in method builtins.sum}
  1000000    0.028    0.000    0.028    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: list_comp
=========================
         4000000 function calls in 0.737 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.318    0.000    0.709    0.000 profiling.py:18(list_comp)
  1000000    0.261    0.000    0.261    0.000 profiling.py:19(<listcomp>)
  1000000    0.131    0.000    0.131    0.000 {built-in method builtins.sum}
  1000000    0.027    0.000    0.027    0.000 {method 'disable' of '_lsprof.Profiler' objects}

恕我直言:

  • reduce并且map总体上来说很慢。不仅如此,summap返回sum列表相比,在返回的迭代器上使用慢
  • for_loop 使用append,这在某种程度上当然很慢
  • 列表理解不仅花费了最少的时间来构建列表,而且与之sum相比,它也变得更快map

I modified @Alisa’s code and used cProfile to show why list comprehension is faster:

from functools import reduce
import datetime

def reduce_(numbers):
    return reduce(lambda sum, next: sum + next * next, numbers, 0)

def for_loop(numbers):
    a = []
    for i in numbers:
        a.append(i*2)
    a = sum(a)
    return a

def map_(numbers):
    sqrt = lambda x: x*x
    return sum(map(sqrt, numbers))

def list_comp(numbers):
    return(sum([i*i for i in numbers]))

funcs = [
        reduce_,
        for_loop,
        map_,
        list_comp
        ]

if __name__ == "__main__":
    # [1, 2, 5, 3, 1, 2, 5, 3]
    import cProfile
    for f in funcs:
        print('=' * 25)
        print("Profiling:", f.__name__)
        print('=' * 25)
        pr = cProfile.Profile()
        for i in range(10**6):
            pr.runcall(f, [1, 2, 5, 3, 1, 2, 5, 3])
        pr.create_stats()
        pr.print_stats()

Here’s the results:

=========================
Profiling: reduce_
=========================
         11000000 function calls in 1.501 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.162    0.000    1.473    0.000 profiling.py:4(reduce_)
  8000000    0.461    0.000    0.461    0.000 profiling.py:5(<lambda>)
  1000000    0.850    0.000    1.311    0.000 {built-in method _functools.reduce}
  1000000    0.028    0.000    0.028    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: for_loop
=========================
         11000000 function calls in 1.372 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.879    0.000    1.344    0.000 profiling.py:7(for_loop)
  1000000    0.145    0.000    0.145    0.000 {built-in method builtins.sum}
  8000000    0.320    0.000    0.320    0.000 {method 'append' of 'list' objects}
  1000000    0.027    0.000    0.027    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: map_
=========================
         11000000 function calls in 1.470 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.264    0.000    1.442    0.000 profiling.py:14(map_)
  8000000    0.387    0.000    0.387    0.000 profiling.py:15(<lambda>)
  1000000    0.791    0.000    1.178    0.000 {built-in method builtins.sum}
  1000000    0.028    0.000    0.028    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: list_comp
=========================
         4000000 function calls in 0.737 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.318    0.000    0.709    0.000 profiling.py:18(list_comp)
  1000000    0.261    0.000    0.261    0.000 profiling.py:19(<listcomp>)
  1000000    0.131    0.000    0.131    0.000 {built-in method builtins.sum}
  1000000    0.027    0.000    0.027    0.000 {method 'disable' of '_lsprof.Profiler' objects}

IMHO:

  • reduce and map are in general pretty slow. Not only that, using sum on the iterators that map returned is slow, compared to suming a list
  • for_loop uses append, which is of course slow to some extent
  • list-comprehension not only spent the least time building the list, it also makes sum much quicker, in contrast to map

回答 5

Alphii答案增加一点扭曲,实际上for循环将是第二好,并且比循环慢6倍map

from functools import reduce
import datetime


def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next**2, numbers, 0)


def square_sum2(numbers):
    a = 0
    for i in numbers:
        a += i**2
    return a

def square_sum3(numbers):
    a = 0
    map(lambda x: a+x**2, numbers)
    return a

def square_sum4(numbers):
    a = 0
    return [a+i**2 for i in numbers]

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])

主要的更改是消除了缓慢的sum通话,以及int()在最后一种情况下可能不必要的通话。实际上,将for循环和map放在相同的术语中就可以说是事实。请记住,lambda是功能性概念,从理论上讲不应该具有副作用,但是,它们可以具有诸如添加到的副作用a。在这种情况下使用Python 3.6.1,Ubuntu 14.04,Intel(R)Core(TM)i7-4770 CPU @ 3.40GHz时的结果

0:00:00.257703 #Reduce
0:00:00.184898 #For loop
0:00:00.031718 #Map
0:00:00.212699 #List comprehension

Adding a twist to Alphii answer, actually the for loop would be second best and about 6 times slower than map

from functools import reduce
import datetime


def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next**2, numbers, 0)


def square_sum2(numbers):
    a = 0
    for i in numbers:
        a += i**2
    return a

def square_sum3(numbers):
    a = 0
    map(lambda x: a+x**2, numbers)
    return a

def square_sum4(numbers):
    a = 0
    return [a+i**2 for i in numbers]

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])

Main changes have been to eliminate the slow sum calls, as well as the probably unnecessary int() in the last case. Putting the for loop and map in the same terms makes it quite fact, actually. Remember that lambdas are functional concepts and theoretically shouldn’t have side effects, but, well, they can have side effects like adding to a. Results in this case with Python 3.6.1, Ubuntu 14.04, Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

0:00:00.257703 #Reduce
0:00:00.184898 #For loop
0:00:00.031718 #Map
0:00:00.212699 #List comprehension

回答 6

我设法修改了@alpiii的一些代码,并发现List理解比for循环快一点。它可能是由引起的int(),在列表理解和for循环之间是不公平的。

from functools import reduce
import datetime

def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next*next, numbers, 0)

def square_sum2(numbers):
    a = []
    for i in numbers:
        a.append(i*2)
    a = sum(a)
    return a

def square_sum3(numbers):
    sqrt = lambda x: x*x
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return(sum([i*i for i in numbers]))

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.101122 #Reduce

0:00:00.089216 #For loop

0:00:00.101532 #Map

0:00:00.068916 #List comprehension

I have managed to modify some of @alpiii’s code and discovered that List comprehension is a little faster than for loop. It might be caused by int(), it is not fair between list comprehension and for loop.

from functools import reduce
import datetime

def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next*next, numbers, 0)

def square_sum2(numbers):
    a = []
    for i in numbers:
        a.append(i*2)
    a = sum(a)
    return a

def square_sum3(numbers):
    sqrt = lambda x: x*x
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return(sum([i*i for i in numbers]))

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.101122 #Reduce

0:00:00.089216 #For loop

0:00:00.101532 #Map

0:00:00.068916 #List comprehension

映射python字典中的值

问题:映射python字典中的值

给定一个字典,{ k1: v1, k2: v2 ... }我想{ k1: f(v1), k2: f(v2) ... }提供一个函数f

有没有这样的内置功能?还是我必须做

dict([(k, f(v)) for (k, v) in my_dictionary.iteritems()])

理想情况下,我只会写

my_dictionary.map_values(f)

要么

my_dictionary.mutate_values_with(f)

也就是说,对原始词典进行了突变还是创建副本对我来说都没有关系。

Given a dictionary { k1: v1, k2: v2 ... } I want to get { k1: f(v1), k2: f(v2) ... } provided I pass a function f.

Is there any such built in function? Or do I have to do

dict([(k, f(v)) for (k, v) in my_dictionary.iteritems()])

Ideally I would just write

my_dictionary.map_values(f)

or

my_dictionary.mutate_values_with(f)

That is, it doesn’t matter to me if the original dictionary is mutated or a copy is created.


回答 0

没有这样的功能;最简单的方法是使用dict理解:

my_dictionary = {k: f(v) for k, v in my_dictionary.items()}

在python 2.7中,请使用.iteritems()方法而不是.items()节省内存。dict理解语法直到python 2.7才引入。

注意,列表上也没有这种方法。您将不得不使用列表推导或map()函数。

这样,您也可以使用该map()函数来处理字典:

my_dictionary = dict(map(lambda kv: (kv[0], f(kv[1])), my_dictionary.iteritems()))

但这确实不是那么可读。

There is no such function; the easiest way to do this is to use a dict comprehension:

my_dictionary = {k: f(v) for k, v in my_dictionary.items()}

In python 2.7, use the .iteritems() method instead of .items() to save memory. The dict comprehension syntax wasn’t introduced until python 2.7.

Note that there is no such method on lists either; you’d have to use a list comprehension or the map() function.

As such, you could use the map() function for processing your dict as well:

my_dictionary = dict(map(lambda kv: (kv[0], f(kv[1])), my_dictionary.iteritems()))

but that’s not that readable, really.


回答 1

这些工具非常适合这种简单但重复的逻辑。

http://toolz.readthedocs.org/en/latest/api.html#toolz.dicttoolz.valmap

使您正确地放在想要的位置。

import toolz
def f(x):
  return x+1

toolz.valmap(f, my_list)

These toolz are great for this kind of simple yet repetitive logic.

http://toolz.readthedocs.org/en/latest/api.html#toolz.dicttoolz.valmap

Gets you right where you want to be.

import toolz
def f(x):
  return x+1

toolz.valmap(f, my_list)

回答 2

您可以就地执行此操作,而不是创建一个新的字典,这对于大型词典(如果您不需要副本)可能更可取。

def mutate_dict(f,d):
    for k, v in d.iteritems():
        d[k] = f(v)

my_dictionary = {'a':1, 'b':2}
mutate_dict(lambda x: x+1, my_dictionary)

结果my_dictionary包含:

{'a': 2, 'b': 3}

You can do this in-place, rather than create a new dict, which may be preferable for large dictionaries (if you do not need a copy).

def mutate_dict(f,d):
    for k, v in d.iteritems():
        d[k] = f(v)

my_dictionary = {'a':1, 'b':2}
mutate_dict(lambda x: x+1, my_dictionary)

results in my_dictionary containing:

{'a': 2, 'b': 3}

回答 3

由于PEP-0469将iteritems()重命名为items(),而PEP-3113删除了Tuple参数解包,因此在Python 3.x中,您应该这样编写Martijn Pieters♦答案

my_dictionary = dict(map(lambda item: (item[0], f(item[1])), my_dictionary.items()))

Due to PEP-0469 which renamed iteritems() to items() and PEP-3113 which removed Tuple parameter unpacking, in Python 3.x you should write Martijn Pieters♦ answer like this:

my_dictionary = dict(map(lambda item: (item[0], f(item[1])), my_dictionary.items()))

回答 4

虽然我的原始答案没有指出要点(通过尝试使用defaultdict的工厂中的Accessing key解决方案来解决此问题),但我对其进行了重新设计以提出针对当前问题的实际解决方案。

这里是:

class walkableDict(dict):
  def walk(self, callback):
    try:
      for key in self:
        self[key] = callback(self[key])
    except TypeError:
      return False
    return True

用法:

>>> d = walkableDict({ k1: v1, k2: v2 ... })
>>> d.walk(f)

想法是将原始dict子类化以赋予其所需的功能:在所有值上“映射”一个功能。

加号的是,该字典可用于存储原始数据,就好像它是一个dict,同时根据请求通过回调转换任何数据。

当然,可以随意使用所需的名称来命名类和函数(此答案中选择的名称受PHP array_walk()函数的启发)。

注意:tryexcept块和return语句都不是功能必需的,它们可以进一步模仿PHP的行为array_walk

While my original answer missed the point (by trying to solve this problem with the solution to Accessing key in factory of defaultdict), I have reworked it to propose an actual solution to the present question.

Here it is:

class walkableDict(dict):
  def walk(self, callback):
    try:
      for key in self:
        self[key] = callback(self[key])
    except TypeError:
      return False
    return True

Usage:

>>> d = walkableDict({ k1: v1, k2: v2 ... })
>>> d.walk(f)

The idea is to subclass the original dict to give it the desired functionality: “mapping” a function over all the values.

The plus point is that this dictionary can be used to store the original data as if it was a dict, while transforming any data on request with a callback.

Of course, feel free to name the class and the function the way you want (the name chosen in this answer is inspired by PHP’s array_walk() function).

Note: Neither the tryexcept block nor the return statements are mandatory for the functionality, they are there to further mimic the behavior of the PHP’s array_walk.


回答 5

为了避免从lambda内部进行索引,例如:

rval = dict(map(lambda kv : (kv[0], ' '.join(kv[1])), rval.iteritems()))

您也可以这样做:

rval = dict(map(lambda(k,v) : (k, ' '.join(v)), rval.iteritems()))

To avoid doing indexing from inside lambda, like:

rval = dict(map(lambda kv : (kv[0], ' '.join(kv[1])), rval.iteritems()))

You can also do:

rval = dict(map(lambda(k,v) : (k, ' '.join(v)), rval.iteritems()))

回答 6

刚遇到这个用例。我实现了gens的answer,添加了一种递归方法来处理也是dict的值:

def mutate_dict_in_place(f, d):
    for k, v in d.iteritems():
        if isinstance(v, dict):
            mutate_dict_in_place(f, v)
        else:
            d[k] = f(v)

# Exemple handy usage
def utf8_everywhere(d):
    mutate_dict_in_place((
        lambda value:
            value.decode('utf-8')
            if isinstance(value, bytes)
            else value
        ),
        d
    )

my_dict = {'a': b'byte1', 'b': {'c': b'byte2', 'd': b'byte3'}}
utf8_everywhere(my_dict)
print(my_dict)

这在处理在Python 2中将字符串编码为字节的json或yaml文件时非常有用

Just came accross this use case. I implemented gens’s answer, adding a recursive approach for handling values that are also dicts:

def mutate_dict_in_place(f, d):
    for k, v in d.iteritems():
        if isinstance(v, dict):
            mutate_dict_in_place(f, v)
        else:
            d[k] = f(v)

# Exemple handy usage
def utf8_everywhere(d):
    mutate_dict_in_place((
        lambda value:
            value.decode('utf-8')
            if isinstance(value, bytes)
            else value
        ),
        d
    )

my_dict = {'a': b'byte1', 'b': {'c': b'byte2', 'd': b'byte3'}}
utf8_everywhere(my_dict)
print(my_dict)

This can be useful when dealing with json or yaml files that encode strings as bytes in Python 2


了解地图功能

问题:了解地图功能

map(function, iterable, ...)

将函数应用于每个iterable项目,并返回结果列表。如果传递了其他可迭代参数,则函数必须采用那么多参数,并且并行地将其应用于所有可迭代对象的项。

如果一个可迭代项短于另一个可迭代项,则假定它扩展为None。

如果function是None,则假定身份函数;如果有多个参数,则map()返回一个由元组组成的列表,其中包含所有可迭代对象中的对应项(一种转置操作)。

可迭代参数可以是序列或任何可迭代对象。结果总是一个列表。

这在制作笛卡尔积时起什么作用?

content = map(tuple, array)

将元组放在任何地方会有什么作用?我也注意到,如果没有地图功能的输出abc,并与它,它的a, b, c

我想完全了解此功能。参考定义也很难理解。花哨的绒毛太多。

map(function, iterable, ...)

Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel.

If one iterable is shorter than another it is assumed to be extended with None items.

If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation).

The iterable arguments may be a sequence or any iterable object; the result is always a list.

What role does this play in making a Cartesian product?

content = map(tuple, array)

What effect does putting a tuple anywhere in there have? I also noticed that without the map function the output is abc and with it, it’s a, b, c.

I want to fully understand this function. The reference definitions is also hard to understand. Too much fancy fluff.


回答 0

map不是特别的pythonic。我建议改用列表推导:

map(f, iterable)

基本上等同于:

[f(x) for x in iterable]

map单独不能执行笛卡尔积,因为其输出列表的长度始终与输入列表相同。您可以通过列表理解来简单地做笛卡尔积:

[(a, b) for a in iterable_a for b in iterable_b]

语法有点混乱-基本上等同于:

result = []
for a in iterable_a:
    for b in iterable_b:
        result.append((a, b))

map isn’t particularly pythonic. I would recommend using list comprehensions instead:

map(f, iterable)

is basically equivalent to:

[f(x) for x in iterable]

map on its own can’t do a Cartesian product, because the length of its output list is always the same as its input list. You can trivially do a Cartesian product with a list comprehension though:

[(a, b) for a in iterable_a for b in iterable_b]

The syntax is a little confusing — that’s basically equivalent to:

result = []
for a in iterable_a:
    for b in iterable_b:
        result.append((a, b))

回答 1

map尽管我想象一个精通函数式编程的人可能会提出一些无法理解的使用生成方法的方法,但它与笛卡尔积完全无关map

map Python 3中的等效于此:

def map(func, iterable):
    for i in iterable:
        yield func(i)

Python 2的唯一区别是它将建立完整的结果列表以立即返回所有结果而不是yielding。

尽管Python约定通常更喜欢列表推导(或生成器表达式)来实现与调用相同的结果map,尤其是如果您使用lambda表达式作为第一个参数:

[func(i) for i in iterable]

作为您在问题注释中要求的示例(“将字符串转换为数组”),通过“数组”,您可能想要一个元组或一个列表(它们的行为都与其他语言的数组类似) —

 >>> a = "hello, world"
 >>> list(a)
['h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
>>> tuple(a)
('h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd')

map如果您从一个字符串列表而不是单个字符串开始,可以在这里使用- map可以单独列出所有字符串:

>>> a = ["foo", "bar", "baz"]
>>> list(map(list, a))
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

请注意,这map(list, a)在Python 2 中是等效的,但是在Python 3中,list如果您想要执行其他操作而不是将其馈入for循环(或诸如sum仅需要可迭代的处理函数,而无需序列)的处理函数,则需要调用。但也请注意,通常首选列表理解:

>>> [list(b) for b in a]
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

map doesn’t relate to a Cartesian product at all, although I imagine someone well versed in functional programming could come up with some impossible to understand way of generating a one using map.

map in Python 3 is equivalent to this:

def map(func, iterable):
    for i in iterable:
        yield func(i)

and the only difference in Python 2 is that it will build up a full list of results to return all at once instead of yielding.

Although Python convention usually prefers list comprehensions (or generator expressions) to achieve the same result as a call to map, particularly if you’re using a lambda expression as the first argument:

[func(i) for i in iterable]

As an example of what you asked for in the comments on the question – “turn a string into an array”, by ‘array’ you probably want either a tuple or a list (both of them behave a little like arrays from other languages) –

 >>> a = "hello, world"
 >>> list(a)
['h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
>>> tuple(a)
('h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd')

A use of map here would be if you start with a list of strings instead of a single string – map can listify all of them individually:

>>> a = ["foo", "bar", "baz"]
>>> list(map(list, a))
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

Note that map(list, a) is equivalent in Python 2, but in Python 3 you need the list call if you want to do anything other than feed it into a for loop (or a processing function such as sum that only needs an iterable, and not a sequence). But also note again that a list comprehension is usually preferred:

>>> [list(b) for b in a]
[['f', 'o', 'o'], ['b', 'a', 'r'], ['b', 'a', 'z']]

回答 2

map 通过将函数应用于源的每个元素来创建新列表:

xs = [1, 2, 3]

# all of those are equivalent — the output is [2, 4, 6]
# 1. map
ys = map(lambda x: x * 2, xs)
# 2. list comprehension
ys = [x * 2 for x in xs]
# 3. explicit loop
ys = []
for x in xs:
    ys.append(x * 2)

n-ary map等效于将输入可迭代对象压缩在一起,然后将转换函数应用于该中间压缩列表的每个元素。它不是笛卡尔积:

xs = [1, 2, 3]
ys = [2, 4, 6]

def f(x, y):
    return (x * 2, y // 2)

# output: [(2, 1), (4, 2), (6, 3)]
# 1. map
zs = map(f, xs, ys)
# 2. list comp
zs = [f(x, y) for x, y in zip(xs, ys)]
# 3. explicit loop
zs = []
for x, y in zip(xs, ys):
    zs.append(f(x, y))

我在zip这里使用过,但是map当可迭代项的大小不同时,行为实际上稍有不同–如其文档中所述,它将可迭代项扩展为contains None

map creates a new list by applying a function to every element of the source:

xs = [1, 2, 3]

# all of those are equivalent — the output is [2, 4, 6]
# 1. map
ys = map(lambda x: x * 2, xs)
# 2. list comprehension
ys = [x * 2 for x in xs]
# 3. explicit loop
ys = []
for x in xs:
    ys.append(x * 2)

n-ary map is equivalent to zipping input iterables together and then applying the transformation function on every element of that intermediate zipped list. It’s not a Cartesian product:

xs = [1, 2, 3]
ys = [2, 4, 6]

def f(x, y):
    return (x * 2, y // 2)

# output: [(2, 1), (4, 2), (6, 3)]
# 1. map
zs = map(f, xs, ys)
# 2. list comp
zs = [f(x, y) for x, y in zip(xs, ys)]
# 3. explicit loop
zs = []
for x, y in zip(xs, ys):
    zs.append(f(x, y))

I’ve used zip here, but map behaviour actually differs slightly when iterables aren’t the same size — as noted in its documentation, it extends iterables to contain None.


回答 3

简化一下,您可以想象map()这样做:

def mymap(func, lst):
    result = []
    for e in lst:
        result.append(func(e))
    return result

如您所见,它接受一个函数和一个列表,并返回一个新列表,并将该函数应用于输入列表中的每个元素。我说“简化一点”是因为实际上map()可以处理多个迭代:

如果传递了其他可迭代参数,则函数必须采用那么多参数,并且并行地将其应用于所有可迭代对象的项。如果一个可迭代项短于另一个可迭代项,则假定它扩展为None。

对于问题的第二部分:这在制造笛卡尔积中起什么作用?好,map() 可以用于生成列表的笛卡尔积,如下所示:

lst = [1, 2, 3, 4, 5]

from operator import add
reduce(add, map(lambda i: map(lambda j: (i, j), lst), lst))

…但是说实话,使用product()是解决问题的一种更简单自然的方法:

from itertools import product
list(product(lst, lst))

无论哪种方式,结果都是上述lst定义的笛卡尔乘积:

[(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
 (2, 1), (2, 2), (2, 3), (2, 4), (2, 5),
 (3, 1), (3, 2), (3, 3), (3, 4), (3, 5),
 (4, 1), (4, 2), (4, 3), (4, 4), (4, 5),
 (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]

Simplifying a bit, you can imagine map() doing something like this:

def mymap(func, lst):
    result = []
    for e in lst:
        result.append(func(e))
    return result

As you can see, it takes a function and a list, and returns a new list with the result of applying the function to each of the elements in the input list. I said “simplifying a bit” because in reality map() can process more than one iterable:

If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. If one iterable is shorter than another it is assumed to be extended with None items.

For the second part in the question: What role does this play in making a Cartesian product? well, map() could be used for generating the cartesian product of a list like this:

lst = [1, 2, 3, 4, 5]

from operator import add
reduce(add, map(lambda i: map(lambda j: (i, j), lst), lst))

… But to tell the truth, using product() is a much simpler and natural way to solve the problem:

from itertools import product
list(product(lst, lst))

Either way, the result is the cartesian product of lst as defined above:

[(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
 (2, 1), (2, 2), (2, 3), (2, 4), (2, 5),
 (3, 1), (3, 2), (3, 3), (3, 4), (3, 5),
 (4, 1), (4, 2), (4, 3), (4, 4), (4, 5),
 (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]

回答 4

map()函数可以将相同的过程应用于可迭代数据结构中的每个项目,例如列表,生成器,字符串和其他内容。

让我们看一个例子: map()可以遍历列表中的每个项目并对每个项目应用一个函数,这样它将返回(返回)新列表。

假设您有一个接受数字的函数,将数字加1并返回:

def add_one(num):
  new_num = num + 1
  return new_num

您还有一个数字列表:

my_list = [1, 3, 6, 7, 8, 10]

如果要递增列表中的每个数字,可以执行以下操作:

>>> map(add_one, my_list)
[2, 4, 7, 8, 9, 11]

注意:至少map()需要两个参数。首先是函数名称,其次是列表。

让我们看看其他一些不错的事情map()map()可以采用多个可迭代项(列表,字符串等),并将每个可迭代项中的元素作为参数传递给函数。

我们有三个列表:

list_one = [1, 2, 3, 4, 5]
list_two = [11, 12, 13, 14, 15]
list_three = [21, 22, 23, 24, 25]

map() 可以使您成为一个新列表,其中包含在特定索引处添加的元素。

现在记住map(),需要一个功能。这次我们将使用内置sum()函数。运行map()给出以下结果:

>>> map(sum, list_one, list_two, list_three)
[33, 36, 39, 42, 45]

记住:
在Python 2中map(),将根据最长的列表进行迭代(遍历列表中的元素),并传递None给较短列表的函数,因此您的函数应查找None并处理它们,否则会出错。在Python 3中map(),使用最短列表结束后将停止。同样,在Python 3中,map()返回迭代器,而不是列表。

The map() function is there to apply the same procedure to every item in an iterable data structure, like lists, generators, strings, and other stuff.

Let’s look at an example: map() can iterate over every item in a list and apply a function to each item, than it will return (give you back) the new list.

Imagine you have a function that takes a number, adds 1 to that number and returns it:

def add_one(num):
  new_num = num + 1
  return new_num

You also have a list of numbers:

my_list = [1, 3, 6, 7, 8, 10]

if you want to increment every number in the list, you can do the following:

>>> map(add_one, my_list)
[2, 4, 7, 8, 9, 11]

Note: At minimum map() needs two arguments. First a function name and second something like a list.

Let’s see some other cool things map() can do. map() can take multiple iterables (lists, strings, etc.) and pass an element from each iterable to a function as an argument.

We have three lists:

list_one = [1, 2, 3, 4, 5]
list_two = [11, 12, 13, 14, 15]
list_three = [21, 22, 23, 24, 25]

map() can make you a new list that holds the addition of elements at a specific index.

Now remember map(), needs a function. This time we’ll use the builtin sum() function. Running map() gives the following result:

>>> map(sum, list_one, list_two, list_three)
[33, 36, 39, 42, 45]

REMEMBER:
In Python 2 map(), will iterate (go through the elements of the lists) according to the longest list, and pass None to the function for the shorter lists, so your function should look for None and handle them, otherwise you will get errors. In Python 3 map() will stop after finishing with the shortest list. Also, in Python 3, map() returns an iterator, not a list.


回答 5

Python3-映射(函数,可迭代)

没有完全提及的一件事(尽管@BlooB kinda提到了)是地图返回地图对象而不是列表。在初始化和迭代的时间性能方面,这是一个很大的差异。考虑这两个测试。

import time
def test1(iterable):
    a = time.clock()
    map(str, iterable)
    a = time.clock() - a

    b = time.clock()
    [ str(x) for x in iterable ]
    b = time.clock() - b

    print(a,b)


def test2(iterable):
    a = time.clock()
    [ x for x in map(str, iterable)]
    a = time.clock() - a

    b = time.clock()
    [ str(x) for x in iterable ]
    b = time.clock() - b

    print(a,b)


test1(range(2000000))  # Prints ~1.7e-5s   ~8s
test2(range(2000000))  # Prints ~9s        ~8s

如您所见,初始化map函数几乎不需要时间。但是,通过map对象进行迭代比仅简单地迭代可迭代对象要花费更长的时间。这意味着传递给map()的函数不会应用到每个元素,直到在迭代中到达该元素为止。如果要使用列表,请使用列表理解。如果您打算在for循环中进行迭代并且会在某个时候中断,请使用map。

Python3 – map(func, iterable)

One thing that wasn’t mentioned completely (although @BlooB kinda mentioned it) is that map returns a map object NOT a list. This is a big difference when it comes to time performance on initialization and iteration. Consider these two tests.

import time
def test1(iterable):
    a = time.clock()
    map(str, iterable)
    a = time.clock() - a

    b = time.clock()
    [ str(x) for x in iterable ]
    b = time.clock() - b

    print(a,b)


def test2(iterable):
    a = time.clock()
    [ x for x in map(str, iterable)]
    a = time.clock() - a

    b = time.clock()
    [ str(x) for x in iterable ]
    b = time.clock() - b

    print(a,b)


test1(range(2000000))  # Prints ~1.7e-5s   ~8s
test2(range(2000000))  # Prints ~9s        ~8s

As you can see initializing the map function takes almost no time at all. However iterating through the map object takes longer than simply iterating through the iterable. This means that the function passed to map() is not applied to each element until the element is reached in the iteration. If you want a list use list comprehension. If you plan to iterate through in a for loop and will break at some point, then use map.


获取map()以在Python 3.x中返回列表

问题:获取map()以在Python 3.x中返回列表

我正在尝试将列表映射为十六进制,然后在其他地方使用该列表。在python 2.6中,这很简单:

答: Python 2.6:

>>> map(chr, [66, 53, 0, 94])
['B', '5', '\x00', '^']

但是,在Python 3.1中,以上代码返回一个map对象。

B: Python 3.1:

>>> map(chr, [66, 53, 0, 94])
<map object at 0x00AF5570>

如何在Python 3.x上检索映射列表(如上面的A所示)?

另外,还有更好的方法吗?我的初始列表对象大约有45个项目,并且id喜欢将它们转换为十六进制。

I’m trying to map a list into hex, and then use the list elsewhere. In python 2.6, this was easy:

A: Python 2.6:

>>> map(chr, [66, 53, 0, 94])
['B', '5', '\x00', '^']

However, in Python 3.1, the above returns a map object.

B: Python 3.1:

>>> map(chr, [66, 53, 0, 94])
<map object at 0x00AF5570>

How do I retrieve the mapped list (as in A above) on Python 3.x?

Alternatively, is there a better way of doing this? My initial list object has around 45 items and id like to convert them to hex.


回答 0

做这个:

list(map(chr,[66,53,0,94]))

在Python 3+中,许多迭代可迭代对象的进程本身都会返回迭代器。在大多数情况下,这最终会节省内存,并且应该使处理速度更快。

如果您要做的只是最终遍历此列表,则无需将其转换为列表,因为您仍然可以map像这样遍历该对象:

# Prints "ABCD"
for ch in map(chr,[65,66,67,68]):
    print(ch)

Do this:

list(map(chr,[66,53,0,94]))

In Python 3+, many processes that iterate over iterables return iterators themselves. In most cases, this ends up saving memory, and should make things go faster.

If all you’re going to do is iterate over this list eventually, there’s no need to even convert it to a list, because you can still iterate over the map object like so:

# Prints "ABCD"
for ch in map(chr,[65,66,67,68]):
    print(ch)

回答 1

Python 3.5的新功能:

[*map(chr, [66, 53, 0, 94])]

多亏了其他拆包概述

更新

一直在寻找更短的途径,我发现这也行得通:

*map(chr, [66, 53, 0, 94]),

开箱也适用于元组。注意最后的逗号。这使其成为1个元素的元组。也就是说,它相当于(*map(chr, [66, 53, 0, 94]),)

它比带有方括号的版本短一个字符,但我认为最好写,因为您从星号-扩展语法开始,所以我觉得它比较软。:)

New and neat in Python 3.5:

[*map(chr, [66, 53, 0, 94])]

Thanks to Additional Unpacking Generalizations

UPDATE

Always seeking for shorter ways, I discovered this one also works:

*map(chr, [66, 53, 0, 94]),

Unpacking works in tuples too. Note the comma at the end. This makes it a tuple of 1 element. That is, it’s equivalent to (*map(chr, [66, 53, 0, 94]),)

It’s shorter by only one char from the version with the list-brackets, but, in my opinion, better to write, because you start right ahead with the asterisk – the expansion syntax, so I feel it’s softer on the mind. :)


回答 2

您为什么不这样做:

[chr(x) for x in [66,53,0,94]]

这称为列表理解。您可以在Google上找到很多信息,但是这里是list comprehensions上的Python(2.6)文档的链接。不过,您可能对Python 3文档更感兴趣。

Why aren’t you doing this:

[chr(x) for x in [66,53,0,94]]

It’s called a list comprehension. You can find plenty of information on Google, but here’s the link to the Python (2.6) documentation on list comprehensions. You might be more interested in the Python 3 documenation, though.


回答 3

返回列表的地图功能具有保存键入的优点,尤其是在交互式会话期间。您可以定义返回列表的lmap函数(类似于python2的函数imap):

lmap = lambda func, *iterable: list(map(func, *iterable))

然后打电话 lmap而不是即可map完成工作:比 lmap(str, x)短5个字符(在这种情况下为30%),list(map(str, x))并且肯定比短[str(v) for v in x]。您也可以创建类似的功能filter

对原始问题有一条评论:

我建议重命名为Geting map()以返回Python 3. *中的列表,因为它适用于所有Python3版本。有没有办法做到这一点?– meawoppl 1月24日17:58

有可能做到这一点,但它是一个非常糟糕的主意。只是为了好玩,您可以(但不应)执行以下操作:

__global_map = map #keep reference to the original map
lmap = lambda func, *iterable: list(__global_map(func, *iterable)) # using "map" here will cause infinite recursion
map = lmap
x = [1, 2, 3]
map(str, x) #test
map = __global_map #restore the original map and don't do that again
map(str, x) #iterator

List-returning map function has the advantage of saving typing, especially during interactive sessions. You can define lmap function (on the analogy of python2’s imap) that returns list:

lmap = lambda func, *iterable: list(map(func, *iterable))

Then calling lmap instead of map will do the job: lmap(str, x) is shorter by 5 characters (30% in this case) than list(map(str, x)) and is certainly shorter than [str(v) for v in x]. You may create similar functions for filter too.

There was a comment to the original question:

I would suggest a rename to Getting map() to return a list in Python 3.* as it applies to all Python3 versions. Is there a way to do this? – meawoppl Jan 24 at 17:58

It is possible to do that, but it is a very bad idea. Just for fun, here’s how you may (but should not) do it:

__global_map = map #keep reference to the original map
lmap = lambda func, *iterable: list(__global_map(func, *iterable)) # using "map" here will cause infinite recursion
map = lmap
x = [1, 2, 3]
map(str, x) #test
map = __global_map #restore the original map and don't do that again
map(str, x) #iterator

回答 4

我的旧注释转换为更好的可见性:为了更好地“更好地做到这一点” map,如果您的输入已知为ASCII序数,则通常可以更快地转换为bytesla bytes(list_of_ordinals).decode('ascii')。这样就可以得到一个str值,但是如果您需要a list来实现可变性或类似功能,则可以将其转换(并且转换速度仍然更快)。例如,在微ipython基准中转换45个输入:

>>> %%timeit -r5 ordinals = list(range(45))
... list(map(chr, ordinals))
...
3.91 µs ± 60.2 ns per loop (mean ± std. dev. of 5 runs, 100000 loops each)

>>> %%timeit -r5 ordinals = list(range(45))
... [*map(chr, ordinals)]
...
3.84 µs ± 219 ns per loop (mean ± std. dev. of 5 runs, 100000 loops each)

>>> %%timeit -r5 ordinals = list(range(45))
... [*bytes(ordinals).decode('ascii')]
...
1.43 µs ± 49.7 ns per loop (mean ± std. dev. of 5 runs, 1000000 loops each)

>>> %%timeit -r5 ordinals = list(range(45))
... bytes(ordinals).decode('ascii')
...
781 ns ± 15.9 ns per loop (mean ± std. dev. of 5 runs, 1000000 loops each)

如果您将其保留为str,则最快的map解决方案会花费大约20%的时间;即使转换回列表,它仍然不到最快map解决方案的40%。批量转换通过bytesbytes.decode然后批量转换回以list节省大量工作,但是如上所述,仅当您所有的输入都是ASCII序号(或每个字符特定于区域设置编码的字节中的序号latin-1)时,该方法才有效。

Converting my old comment for better visibility: For a “better way to do this” without map entirely, if your inputs are known to be ASCII ordinals, it’s generally much faster to convert to bytes and decode, a la bytes(list_of_ordinals).decode('ascii'). That gets you a str of the values, but if you need a list for mutability or the like, you can just convert it (and it’s still faster). For example, in ipython microbenchmarks converting 45 inputs:

>>> %%timeit -r5 ordinals = list(range(45))
... list(map(chr, ordinals))
...
3.91 µs ± 60.2 ns per loop (mean ± std. dev. of 5 runs, 100000 loops each)

>>> %%timeit -r5 ordinals = list(range(45))
... [*map(chr, ordinals)]
...
3.84 µs ± 219 ns per loop (mean ± std. dev. of 5 runs, 100000 loops each)

>>> %%timeit -r5 ordinals = list(range(45))
... [*bytes(ordinals).decode('ascii')]
...
1.43 µs ± 49.7 ns per loop (mean ± std. dev. of 5 runs, 1000000 loops each)

>>> %%timeit -r5 ordinals = list(range(45))
... bytes(ordinals).decode('ascii')
...
781 ns ± 15.9 ns per loop (mean ± std. dev. of 5 runs, 1000000 loops each)

If you leave it as a str, it takes ~20% of the time of the fastest map solutions; even converting back to list it’s still less than 40% of the fastest map solution. Bulk convert via bytes and bytes.decode then bulk converting back to list saves a lot of work, but as noted, only works if all your inputs are ASCII ordinals (or ordinals in some one byte per character locale specific encoding, e.g. latin-1).


回答 5

list(map(chr, [66, 53, 0, 94]))

map(func,* iterables)->地图对象创建一个迭代器,该迭代器使用每个可迭代对象的参数来计算函数。当最短的迭代次数用尽时停止。

“进行迭代”

表示它将返回迭代器。

“使用每个可迭代对象的参数来计算函数”

意味着迭代器的next()函数将为每个可迭代对象取一个值,并将每个值传递给该函数的一个位置参数。

因此,您可以从map()函数中获得一个迭代器,然后jsut将其传递给内置函数list()或使用列表推导。

list(map(chr, [66, 53, 0, 94]))

map(func, *iterables) –> map object Make an iterator that computes the function using arguments from each of the iterables. Stops when the shortest iterable is exhausted.

“Make an iterator”

means it will return an iterator.

“that computes the function using arguments from each of the iterables”

means that the next() function of the iterator will take one value of each iterables and pass each of them to one positional parameter of the function.

So you get an iterator from the map() funtion and jsut pass it to the list() builtin function or use list comprehensions.


回答 6

除了上述答案外Python 3,我们还可以简单地listmapas中创建结果值a

li = []
for x in map(chr,[66,53,0,94]):
    li.append(x)

print (li)
>>>['B', '5', '\x00', '^']

我们可以通过另一个例子来概括我被打动的情况,对地图的操作也可以像regex问题中一样的方式处理,我们可以编写函数以获取list要映射的项目并同时获取结果集。例如

b = 'Strings: 1,072, Another String: 474 '
li = []
for x in map(int,map(int, re.findall('\d+', b))):
    li.append(x)

print (li)
>>>[1, 72, 474]

In addition to above answers in Python 3, we may simply create a list of result values from a map as

li = []
for x in map(chr,[66,53,0,94]):
    li.append(x)

print (li)
>>>['B', '5', '\x00', '^']

We may generalize by another example where I was struck, operations on map can also be handled in similar fashion like in regex problem, we can write function to obtain list of items to map and get result set at the same time. Ex.

b = 'Strings: 1,072, Another String: 474 '
li = []
for x in map(int,map(int, re.findall('\d+', b))):
    li.append(x)

print (li)
>>>[1, 72, 474]

回答 7

您可以尝试通过仅迭代对象中的每个项目并将其存储在另一个变量中来从地图对象获取列表。

a = map(chr, [66, 53, 0, 94])
b = [item for item in a]
print(b)
>>>['B', '5', '\x00', '^']

You can try getting a list from the map object by just iterating each item in the object and store it in a different variable.

a = map(chr, [66, 53, 0, 94])
b = [item for item in a]
print(b)
>>>['B', '5', '\x00', '^']

回答 8

使用python中的列表理解和基本的地图函数实用程序,还可以做到这一点:

chi = [x for x in map(chr,[66,53,0,94])]

Using list comprehension in python and basic map function utility, one can do this also:

chi = [x for x in map(chr,[66,53,0,94])]


列表理解与地图

问题:列表理解与地图

有理由更喜欢使用map()列表理解吗?反之亦然?它们中的一个通常比另一个效率更高,或者通常被认为比另一个更Python化吗?

Is there a reason to prefer using map() over list comprehension or vice versa? Is either of them generally more efficient or considered generally more pythonic than the other?


回答 0

map在某些情况下(如果您不是出于此目的而使用lambda,而是在map和listcomp中使用相同的函数),在微观上可能会更快。在其他情况下,列表理解可能会更快,并且大多数(并非全部)pythonista用户认为列表更直接,更清晰。

使用完全相同的函数时map的微小速度优势的一个示例:

$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

当地图需要使用lambda时,如何完全颠倒性能比较的示例:

$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop

map may be microscopically faster in some cases (when you’re NOT making a lambda for the purpose, but using the same function in map and a listcomp). List comprehensions may be faster in other cases and most (not all) pythonistas consider them more direct and clearer.

An example of the tiny speed advantage of map when using exactly the same function:

$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

An example of how performance comparison gets completely reversed when map needs a lambda:

$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop

回答 1

案例

  • 常见情况:几乎总是,您将要在python中使用列表推导,因为对于新手程序员来说,阅读代码会更加明显。(这不适用于可能适用其他习惯用法的其他语言。)由于列表推导是python中用于迭代的事实上的标准,因此您对python程序员所做的工作甚至会更加明显。他们是预期的
  • 较少见的情况:但是,如果您已经定义了一个函数,则使用通常是合理的map,尽管它被认为是“非pythonic”的。例如,map(sum, myLists)比更加优雅/简洁[sum(x) for x in myLists]。您可以不必编写一个虚拟变量(例如sum(x) for x...or sum(_) for _...sum(readableName) for readableName...),而只需键入两次即可进行迭代,从而获得了优雅。同样的道理也适用于filterreduce从什么itertools模块:如果你已经有一个方便的功能,您可以继续前进,做一些函数式编程。在某些情况下,这会提高可读性,而在其他情况下(例如,新手程序员,多个参数),则会失去可读性。但是,无论如何,代码的可读性在很大程度上取决于注释。
  • 几乎永远不会map在进行函数编程时,您可能希望将函数用作纯抽象函数,在这种情况下您正在映射map,currying map,或者从map以函数的形式进行讨论中受益。例如,在Haskell中,一个称为functor的接口可以fmap概括任何数据结构上的映射。这在python中非常罕见,因为python语法迫使您使用生成器样式来谈论迭代;您不能轻易将其概括。(这有时是好事,有时是坏事。)您可能会想出一些罕见的python例子,这map(f, *lists)是合理的事情。我能想到的最接近的示例是sumEach = partial(map,sum),这是一个单行代码,大致相当于:

def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • 仅使用for-loop:您当然也可以使用for循环。尽管从函数式编程的角度来看并不那么优雅,但有时​​非局部变量使命令式编程语言(例如python)中的代码更清晰,因为人们已经非常习惯于以这种方式读取代码。通常,当您仅执行任何不构建列表的复杂操作(例如列表理解和映射)(例如,求和或制作树等)时,for循环也是最有效的。就内存而言,它是高效的(不必在时间上,我希望在最坏的情况下,它是一个恒定的因素,除非出现一些罕见的病理性垃圾收集问题)。

“ Python主义”

我不喜欢“ pythonic”一词,因为我发现pythonic在我眼中并不总是那么优雅。然而,mapfilter和类似的功能(如非常有用的itertools模块)很可能在风格方面考虑unpythonic。

懒惰

就效率而言,就像大多数函数式编程构造一样,MAP可以是LAZY,实际上在python中是懒惰的。这意味着您可以执行此操作(在python3中),并且计算机不会耗尽内存,并且不会丢失所有未保存的数据:

>>> map(str, range(10**100))
<map object at 0x2201d50>

尝试通过列表理解做到这一点:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

请注意,列表推导本质上也是惰性的,但是python选择将其实现为非惰性的。不过,python确实以生成器表达式的形式支持惰性列表推导,如下所示:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

您基本上可以将[...]语法视为将生成器表达式传递给list构造函数,例如list(x for x in range(5))

简短的人为例子

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

列表推导是非延迟的,因此可能需要更多内存(除非您使用生成器推导)。方括号[...]通常使事情变得显而易见,尤其是在括号中。另一方面,有时您最终会变得像打字一样冗长[x for x in...。只要您使迭代器变量简短,如果不缩进代码,列表解析通常会更加清晰。但是您总是可以缩进代码。

print(
    {x:x**2 for x in (-y for y in range(5))}
)

或分手:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

python3的效率比较

map 现在很懒:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

因此,如果您将不使用所有数据,或者不提前知道需要多少数据,那么map在python3中(以及python2或python3中的生成器表达式)将避免计算它们的值,直到最后一刻。通常,这通常会超过使用带来的任何开销map。不利之处在于,与大多数功能语言相反,这在python中非常有限:只有按“顺序”从左至右访问数据时,您才能获得此好处,因为python生成器表达式只能按order求值x[0], x[1], x[2], ...

但是,假设我们有一个f想要的预制函数map,并且我们忽略了map通过立即强制使用来进行赋值的懒惰list(...)。我们得到一些非常有趣的结果:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

结果为AAA / BBB / CCC格式,其中A在带有python 3。?。?的约2010年英特尔工作站上执行,而B和C则是在python 3.2.1的约2013年AMD工作站上执行,具有截然不同的硬件。结果似乎是,地图和列表理解的性能可比,这受其他随机因素的影响最大。我们可以告诉的唯一的事情似乎是,奇怪的是,虽然我们期待列表解析[...]比生成器表达式更好地发挥(...)map也更高效,生成器表达式(再次假设计算所有的值/使用)。

重要的是要意识到这些测试假设一个非常简单的功能(身份功能)。但是这很好,因为如果功能复杂,那么与程序中的其他因素相比,性能开销可以忽略不计。(测试其他简单的东西可能仍然很有趣,例如f=lambda x:x+x

如果您精通python汇编语言,则可以使用该dis模块来查看这是否是幕后真正发生的事情:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

似乎使用[...]语法比更好list(...)。遗憾的是,map该类在拆卸方面有点不透明,但是我们可以通过速度测试来确定。

Cases

  • Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you’re doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you’re doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected.
  • Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered ‘unpythonic’. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)… but the readability of your code highly depends on your comments anyway.
  • Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you’re mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can’t generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:

def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) — at least efficient in terms of memory (not necessarily in terms of time, where I’d expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).

“Pythonism”

I dislike the word “pythonic” because I don’t find that pythonic is always elegant in my eyes. Nevertheless, map and filter and similar functions (like the very useful itertools module) are probably considered unpythonic in terms of style.

Laziness

In terms of efficiency, like most functional programming constructs, MAP CAN BE LAZY, and in fact is lazy in python. That means you can do this (in python3) and your computer will not run out of memory and lose all your unsaved data:

>>> map(str, range(10**100))
<map object at 0x2201d50>

Try doing that with a list comprehension:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

Do note that list comprehensions are also inherently lazy, but python has chosen to implement them as non-lazy. Nevertheless, python does support lazy list comprehensions in the form of generator expressions, as follows:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

You can basically think of the [...] syntax as passing in a generator expression to the list constructor, like list(x for x in range(5)).

Brief contrived example

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

List comprehensions are non-lazy, so may require more memory (unless you use generator comprehensions). The square brackets [...] often make things obvious, especially when in a mess of parentheses. On the other hand, sometimes you end up being verbose like typing [x for x in.... As long as you keep your iterator variables short, list comprehensions are usually clearer if you don’t indent your code. But you could always indent your code.

print(
    {x:x**2 for x in (-y for y in range(5))}
)

or break things up:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

Efficiency comparison for python3

map is now lazy:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right “in order”, because python generator expressions can only be evaluated the order x[0], x[1], x[2], ....

However let’s say that we have a pre-made function f we’d like to map, and we ignore the laziness of map by immediately forcing evaluation with list(...). We get some very interesting results:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

In results are in the form AAA/BBB/CCC where A was performed with on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed with a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient that generator expressions (again assuming that all values are evaluated/used).

It is important to realize that these tests assume a very simple function (the identity function); however this is fine because if the function were complicated, then performance overhead would be negligible compared to other factors in the program. (It may still be interesting to test with other simple things like f=lambda x:x+x)

If you’re skilled at reading python assembly, you can use the dis module to see if that’s actually what’s going on behind the scenes:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

It seems it is better to use [...] syntax than list(...). Sadly the map class is a bit opaque to disassembly, but we can make due with our speed test.


回答 2

Python 2:您应该使用mapfilter而不是列表推导。

即使它们不是“ Pythonic”的,您还是还是偏爱它们的一个客观原因是:
它们需要函数/ lambda作为参数,从而引入了新的作用域

我被这个不止一次地咬了:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

但如果相反,我曾说过:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

那一切都会好起来的

您可能会说我在相同范围内使用相同的变量名很愚蠢。

我不是 最初的代码很好-两者x不在同一范围内。
直到我内部块移到代码的不同部分之后,问题才出现(阅读:维护期间的问题,而不是开发过程中的问题),而且我没想到。

是的,如果您从未犯过此错误,则列表理解会更优雅。
但是从个人经验(和看到其他人犯同样的错误)中,我已经看到它发生了很多次,我认为当这些错误潜入您的代码中时,您不应该经历这种痛苦。

结论:

使用mapfilter。它们可以防止与范围相关的细微难以诊断的错误。

边注:

如果适合您的情况,请不要忘记考虑使用imapifilter(中的itertools)!

Python 2: You should use map and filter instead of list comprehensions.

An objective reason why you should prefer them even though they’re not “Pythonic” is this:
They require functions/lambdas as arguments, which introduce a new scope.

I’ve gotten bitten by this more than once:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

but if instead I had said:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

then everything would’ve been fine.

You could say I was being silly for using the same variable name in the same scope.

I wasn’t. The code was fine originally — the two xs weren’t in the same scope.
It was only after I moved the inner block to a different section of the code that the problem came up (read: problem during maintenance, not development), and I didn’t expect it.

Yes, if you never make this mistake then list comprehensions are more elegant.
But from personal experience (and from seeing others make the same mistake) I’ve seen it happen enough times that I think it’s not worth the pain you have to go through when these bugs creep into your code.

Conclusion:

Use map and filter. They prevent subtle hard-to-diagnose scope-related bugs.

Side note:

Don’t forget to consider using imap and ifilter (in itertools) if they are appropriate for your situation!


回答 3

实际上,map列表理解在Python 3语言中的行为大不相同。看一下下面的Python 3程序:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

您可能希望它打印两次“ [1,4,9]”行,但是打印“ [1,4,9]”后跟“ []”。第一次查看时,squares它似乎表现为三个元素的序列,但是第二次查看时为空元素。

在Python 2语言中,会map返回一个普通的旧列表,就像列表推导在两种语言中一样。症结在于,mapPython 3(和imapPython 2)中的return值不是一个列表-它是一个迭代器!

与遍历列表不同,遍历迭代器时将消耗元素。这就是为什么squares在最后print(list(squares))一行看起来空白。

总结一下:

  • 在处理迭代器时,必须记住它们是有状态的,并且在遍历它们时会发生变化。
  • 列表更容易预测,因为它们仅在您显式对其进行更改时才会更改;他们是容器
  • 还有一个好处:数字,字符串和元组甚至可以更容易预测,因为它们根本无法更改;它们是价值

Actually, map and list comprehensions behave quite differently in the Python 3 language. Take a look at the following Python 3 program:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

You might expect it to print the line “[1, 4, 9]” twice, but instead it prints “[1, 4, 9]” followed by “[]”. The first time you look at squares it seems to behave as a sequence of three elements, but the second time as an empty one.

In the Python 2 language map returns a plain old list, just like list comprehensions do in both languages. The crux is that the return value of map in Python 3 (and imap in Python 2) is not a list – it’s an iterator!

The elements are consumed when you iterate over an iterator unlike when you iterate over a list. This is why squares looks empty in the last print(list(squares)) line.

To summarize:

  • When dealing with iterators you have to remember that they are stateful and that they mutate as you traverse them.
  • Lists are more predictable since they only change when you explicitly mutate them; they are containers.
  • And a bonus: numbers, strings, and tuples are even more predictable since they cannot change at all; they are values.

回答 4

我发现列表理解通常比我要表达的要表达的要多map-它们都可以完成,但是前者节省了试图理解什么可能是复杂lambda表达的精神负担。

在某个地方(我无法找到它)也有一次采访,其中Guido列出了lambdas和函数功能,这是他最后悔接受Python的事情,因此您可以凭借这些观点认为它们是非Python的其中。

I find list comprehensions are generally more expressive of what I’m trying to do than map – they both get it done, but the former saves the mental load of trying to understand what could be a complex lambda expression.

There’s also an interview out there somewhere (I can’t find it offhand) where Guido lists lambdas and the functional functions as the thing he most regrets about accepting into Python, so you could make the argument that they’re un-Pythonic by virtue of that.


回答 5

这是一种可能的情况:

map(lambda op1,op2: op1*op2, list1, list2)

与:

[op1*op2 for op1,op2 in zip(list1,list2)]

我猜想zip()是一个不幸的和不必要的开销,如果您坚持使用列表推导而不是地图,则需要沉迷于此。如果有人肯定或否定这一点,那就太好了。

Here is one possible case:

map(lambda op1,op2: op1*op2, list1, list2)

versus:

[op1*op2 for op1,op2 in zip(list1,list2)]

I am guessing the zip() is an unfortunate and unnecessary overhead you need to indulge in if you insist on using list comprehensions instead of the map. Would be great if someone clarifies this whether affirmatively or negatively.


回答 6

如果您打算编写任何异步,并行或分布式代码,则您可能会更喜欢map列表理解-因为大多数异步,并行或分布式程序包都提供了map使python过载的功能map。然后,通过将适当的map函数传递给其余代码,您可能不必修改原始串行代码即可使其并行运行(等)。

If you plan on writing any asynchronous, parallel, or distributed code, you will probably prefer map over a list comprehension — as most asynchronous, parallel, or distributed packages provide a map function to overload python’s map. Then by passing the appropriate map function to the rest of your code, you may not have to modify your original serial code to have it run in parallel (etc).


回答 7

因此,由于Python 3 map()是迭代器,因此您需要牢记所需的东西:迭代器或list对象。

正如@AlexMartelli已经提到的那样map()仅当您不使用lambda函数时,它比列表理解要快。

我将向您介绍一些时间比较。

Python 3.5.2和CPython
我使用了Jupiter笔记本,尤其是%timeit内置的魔术命令
测量:s == 1000 ms == 1000 * 1000 µs = 1000 * 1000 * 1000 ns

设定:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

内置功能:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda 功能:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

还有诸如生成器表达式之类的东西,请参阅PEP-0289。所以我认为将其添加到比较中将很有用

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

您需要list对象:

如果是自定义函数,list(map())则使用列表理解;如果有内置函数,则使用列表理解

您不需要list对象,只需要一个可迭代的对象:

始终使用map()

So since Python 3, map() is an iterator, you need to keep in mind what do you need: an iterator or list object.

As @AlexMartelli already mentioned, map() is faster than list comprehension only if you don’t use lambda function.

I will present you some time comparisons.

Python 3.5.2 and CPython
I’ve used Jupiter notebook and especially %timeit built-in magic command
Measurements: s == 1000 ms == 1000 * 1000 µs = 1000 * 1000 * 1000 ns

Setup:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

Built-in function:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda function:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

There is also such thing as generator expression, see PEP-0289. So i thought it would be useful to add it to comparison

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

You need list object:

Use list comprehension if it’s custom function, use list(map()) if there is builtin function

You don’t need list object, you just need iterable one:

Always use map()!


回答 8

我进行了一项快速测试,比较了三种调用对象方法的方法。在这种情况下,时差可以忽略不计,并且与所讨论的功能有关(请参阅@Alex Martelli的回复)。在这里,我查看了以下方法:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

我查看vals了整数(Python int)和浮点数(Python )的列表(存储在变量中),float以增加列表大小。DummyNum考虑以下虚拟类:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

具体来说,add方法。该__slots__属性是Python中的一种简单优化,用于定义类(属性)所需的总内存,从而减小了内存大小。这是结果图。

如前所述,所使用的技术差异很小,您应该以一种对您最易读的方式或在特定情况下进行编码。在这种情况下,列表推导(map_comprehension技巧)对于对象中两种类型的加法最快,尤其是对于较短的列表。

访问此pastebin,获取用于生成图和数据的源。

I ran a quick test comparing three methods for invoking the method of an object. The time difference, in this case, is negligible and is a matter of the function in question (see @Alex Martelli’s response). Here, I looked at the following methods:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

I looked at lists (stored in the variable vals) of both integers (Python int) and floating point numbers (Python float) for increasing list sizes. The following dummy class DummyNum is considered:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

Specifically, the add method. The __slots__ attribute is a simple optimization in Python to define the total memory needed by the class (attributes), reducing memory size. Here are the resulting plots.

As stated previously, the technique used makes a minimal difference and you should code in a way that is most readable to you, or in the particular circumstance. In this case, the list comprehension (map_comprehension technique) is fastest for both types of additions in an object, especially with shorter lists.

Visit this pastebin for the source used to generate the plot and data.


回答 9

我认为最Python的方式是使用列表理解而不是mapand filter。原因是列表理解比map和更清晰filter

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

如您所见,理解并不需要额外的lambda表达式map。此外,理解还允许容易地过滤,同时map需要filter允许过滤。

I consider that the most Pythonic way is to use a list comprehension instead of map and filter. The reason is that list comprehensions are clearer than map and filter.

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

As you an see, a comprehension does not require extra lambda expressions as map needs. Furthermore, a comprehension also allows filtering easily, while map requires filter to allow filtering.


回答 10

我尝试了@ alex-martelli的代码,但发现了一些差异

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop
python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

映射即使在非常大的范围内也要花费相同的时间,而使用列表理解则要花费很多时间,这从我的代码中可以明显看出。因此,除了被视为“ unpythonic”之外,我还没有遇到任何与使用map有关的性能问题。

I tried the code by @alex-martelli but found some discrepancies

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop
python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

map takes the same amount of time even for very large ranges while using list comprehension takes a lot of time as is evident from my code. So apart from being considered “unpythonic”, I have not faced any performance issues relating to usage of map.