Python 实用宝典

Question 1

This is quite n00bish, but I’m trying to learn/understand functional programming in python. The following code:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo, bar):
    print foo, bar

map(maptest, foos, bars)

produces:

1.0 1
2.0 2
3.0 3
4.0 None
5.0 None

Q. Is there a way to use map or any other functional tools in python to produce the following without loops etc.

1.0 [1,2,3]
2.0 [1,2,3]
3.0 [1,2,3]
4.0 [1,2,3]
5.0 [1,2,3]

Just as a side note how would the implementation change if there is a dependency between foo and bar. e.g.

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3,4,5]

and print:

1.0 [2,3,4,5]
2.0 [1,3,4,5]
3.0 [1,2,4,5]
...

P.S: I know how to do it naively using if, loops and/or generators, but I’d like to learn how to achieve the same using functional tools. Is it just a case of adding an if statement to maptest or apply another filter map to bars internally within maptest?

Question 2

The easiest way would be not to pass bars through the different functions, but to access it directly from maptest:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo):
    print foo, bars

map(maptest, foos)

With your original maptest function you could also use a lambda function in map:

map((lambda foo: maptest(foo, bars)), foos)

Question 3

Are you familiar with other functional languages? i.e. are you trying to learn how python does functional programming, or are you trying to learn about functional programming and using python as the vehicle?

Also, do you understand list comprehensions?

map(f, sequence)

is directly equivalent (*) to:

[f(x) for x in sequence]

In fact, I think map() was once slated for removal from python 3.0 as being redundant (that didn’t happen).

map(f, sequence1, sequence2)

is mostly equivalent to:

[f(x1, x2) for x1, x2 in zip(sequence1, sequence2)]

(there is a difference in how it handles the case where the sequences are of different length. As you saw, map() fills in None when one of the sequences runs out, whereas zip() stops when the shortest sequence stops)

So, to address your specific question, you’re trying to produce the result:

foos[0], bars
foos[1], bars
foos[2], bars
# etc.

You could do this by writing a function that takes a single argument and prints it, followed by bars:

def maptest(x):
     print x, bars
map(maptest, foos)

Alternatively, you could create a list that looks like this:

[bars, bars, bars, ] # etc.

and use your original maptest:

def maptest(x, y):
    print x, y

One way to do this would be to explicitely build the list beforehand:

barses = [bars] * len(foos)
map(maptest, foos, barses)

Alternatively, you could pull in the itertools module. itertools contains many clever functions that help you do functional-style lazy-evaluation programming in python. In this case, we want itertools.repeat, which will output its argument indefinitely as you iterate over it. This last fact means that if you do:

map(maptest, foos, itertools.repeat(bars))

you will get endless output, since map() keeps going as long as one of the arguments is still producing output. However, itertools.imap is just like map(), but stops as soon as the shortest iterable stops.

itertools.imap(maptest, foos, itertools.repeat(bars))

Hope this helps :-)

(*) It’s a little different in python 3.0. There, map() essentially returns a generator expression.

Question 4

Here’s the solution you’re looking for:

>>> foos = [1.0, 2.0, 3.0, 4.0, 5.0]
>>> bars = [1, 2, 3]
>>> [(x, bars) for x in foos]
[(1.0, [1, 2, 3]), (2.0, [1, 2, 3]), (3.0, [1, 2, 3]), (4.0, [1, 2, 3]), (5.0, [
1, 2, 3])]

I’d recommend using a list comprehension (the [(x, bars) for x in foos] part) over using map as it avoids the overhead of a function call on every iteration (which can be very significant). If you’re just going to use it in a for loop, you’ll get better speeds by using a generator comprehension:

>>> y = ((x, bars) for x in foos)
>>> for z in y:
...     print z
...
(1.0, [1, 2, 3])
(2.0, [1, 2, 3])
(3.0, [1, 2, 3])
(4.0, [1, 2, 3])
(5.0, [1, 2, 3])

The difference is that the generator comprehension is lazily loaded.

UPDATE In response to this comment:

Of course you know, that you don’t copy bars, all entries are the same bars list. So if you modify any one of them (including original bars), you modify all of them.

I suppose this is a valid point. There are two solutions to this that I can think of. The most efficient is probably something like this:

tbars = tuple(bars)
[(x, tbars) for x in foos]

Since tuples are immutable, this will prevent bars from being modified through the results of this list comprehension (or generator comprehension if you go that route). If you really need to modify each and every one of the results, you can do this:

from copy import copy
[(x, copy(bars)) for x in foos]

However, this can be a bit expensive both in terms of memory usage and in speed, so I’d recommend against it unless you really need to add to each one of them.

Question 5

Functional programming is about creating side-effect-free code.

map is a functional list transformation abstraction. You use it to take a sequence of something and turn it into a sequence of something else.

You are trying to use it as an iterator. Don’t do that. :)

Here is an example of how you might use map to build the list you want. There are shorter solutions (I’d just use comprehensions), but this will help you understand what map does a bit better:

def my_transform_function(input):
    return [input, [1, 2, 3]]

new_list = map(my_transform, input_list)

Notice at this point, you’ve only done a data manipulation. Now you can print it:

for n,l in new_list:
    print n, ll

— I’m not sure what you mean by ‘without loops.’ fp isn’t about avoiding loops (you can’t examine every item in a list without visiting each one). It’s about avoiding side-effects, thus writing fewer bugs.

Question 6

>>> from itertools import repeat
>>> for foo, bars in zip(foos, repeat(bars)):
...     print foo, bars
... 
1.0 [1, 2, 3]
2.0 [1, 2, 3]
3.0 [1, 2, 3]
4.0 [1, 2, 3]
5.0 [1, 2, 3]

Question 7

import itertools

foos=[1.0, 2.0, 3.0, 4.0, 5.0]
bars=[1, 2, 3]

print zip(foos, itertools.cycle([bars]))

Question 8

Here’s an overview of the parameters to the map(function, *sequences) function:

function is the name of your function.
sequences is any number of sequences, which are usually lists or tuples. map will iterate over them simultaneously and give the current values to function. That’s why the number of sequences should equal the number of parameters to your function.

It sounds like you’re trying to iterate for some of function‘s parameters but keep others constant, and unfortunately map doesn’t support that. I found an old proposal to add such a feature to Python, but the map construct is so clean and well-established that I doubt something like that will ever be implemented.

Use a workaround like global variables or list comprehensions, as others have suggested.

Question 9

Would this do it?

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest2(bar):
  print bar

def maptest(foo):
  print foo
  map(maptest2, bars)

map(maptest, foos)

Question 10

How about this:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo, bar):
    print foo, bar

map(maptest, foos, [bars]*len(foos))

Question 11

Does anyone here have any useful code which uses reduce() function in python? Is there any code other than the usual + and * that we see in the examples?

Refer Fate of reduce() in Python 3000 by GvR

Question 12

The other uses I’ve found for it besides + and * were with and and or, but now we have any and all to replace those cases.

foldl and foldr do come up in Scheme a lot…

Here’s some cute usages:

Flatten a list

Goal: turn [[1, 2, 3], [4, 5], [6, 7, 8]] into [1, 2, 3, 4, 5, 6, 7, 8].

reduce(list.__add__, [[1, 2, 3], [4, 5], [6, 7, 8]], [])

List of digits to a number

Goal: turn [1, 2, 3, 4, 5, 6, 7, 8] into 12345678.

Ugly, slow way:

int("".join(map(str, [1,2,3,4,5,6,7,8])))

Pretty reduce way:

reduce(lambda a,d: 10*a+d, [1,2,3,4,5,6,7,8], 0)

Question 13

reduce() can be used to find Least common multiple for 3 or more numbers:

#!/usr/bin/env python
from fractions import gcd
from functools import reduce

def lcm(*args):
    return reduce(lambda a,b: a * b // gcd(a, b), args)

Example:

>>> lcm(100, 23, 98)
112700
>>> lcm(*range(1, 20))
232792560

Question 14

reduce() could be used to resolve dotted names (where eval() is too unsafe to use):

>>> import __main__
>>> reduce(getattr, "os.path.abspath".split('.'), __main__)
<function abspath at 0x009AB530>

Question 15

Find the intersection of N given lists:

input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

result = reduce(set.intersection, map(set, input_list))

returns:

result = set([3, 4, 5])

via: Python – Intersection of two lists

Question 16

I think reduce is a silly command. Hence:

reduce(lambda hold,next:hold+chr(((ord(next.upper())-65)+13)%26+65),'znlorabggbbhfrshy','')

Question 17

The usage of reduce that I found in my code involved the situation where I had some class structure for logic expression and I needed to convert a list of these expression objects to a conjunction of the expressions. I already had a function make_and to create a conjunction given two expressions, so I wrote reduce(make_and,l). (I knew the list wasn’t empty; otherwise it would have been something like reduce(make_and,l,make_true).)

This is exactly the reason that (some) functional programmers like reduce (or fold functions, as such functions are typically called). There are often already many binary functions like +, *, min, max, concatenation and, in my case, make_and and make_or. Having a reduce makes it trivial to lift these operations to lists (or trees or whatever you got, for fold functions in general).

Of course, if certain instantiations (such as sum) are often used, then you don’t want to keep writing reduce. However, instead of defining the sum with some for-loop, you can just as easily define it with reduce.

Readability, as mentioned by others, is indeed an issue. You could argue, however, that only reason why people find reduce less “clear” is because it is not a function that many people know and/or use.

Question 18

Function composition: If you already have a list of functions that you’d like to apply in succession, such as:

color = lambda x: x.replace('brown', 'blue')
speed = lambda x: x.replace('quick', 'slow')
work = lambda x: x.replace('lazy', 'industrious')
fs = [str.lower, color, speed, work, str.title]

Then you can apply them all consecutively with:

>>> call = lambda s, func: func(s)
>>> s = "The Quick Brown Fox Jumps Over the Lazy Dog"
>>> reduce(call, fs, s)
'The Slow Blue Fox Jumps Over The Industrious Dog'

In this case, method chaining may be more readable. But sometimes it isn’t possible, and this kind of composition may be more readable and maintainable than a f1(f2(f3(f4(x)))) kind of syntax.

Question 19

You could replace value = json_obj['a']['b']['c']['d']['e'] with:

value = reduce(dict.__getitem__, 'abcde', json_obj)

If you already have the path a/b/c/.. as a list. For example, Change values in dict of nested dicts using items in a list.

Question 20

@Blair Conrad: You could also implement your glob/reduce using sum, like so:

files = sum([glob.glob(f) for f in args], [])

This is less verbose than either of your two examples, is perfectly Pythonic, and is still only one line of code.

So to answer the original question, I personally try to avoid using reduce because it’s never really necessary and I find it to be less clear than other approaches. However, some people get used to reduce and come to prefer it to list comprehensions (especially Haskell programmers). But if you’re not already thinking about a problem in terms of reduce, you probably don’t need to worry about using it.

Question 21

reduce can be used to support chained attribute lookups:

reduce(getattr, ('request', 'user', 'email'), self)

Of course, this is equivalent to

self.request.user.email

but it’s useful when your code needs to accept an arbitrary list of attributes.

(Chained attributes of arbitrary length are common when dealing with Django models.)

Question 22

reduce is useful when you need to find the union or intersection of a sequence of set-like objects.

>>> reduce(operator.or_, ({1}, {1, 2}, {1, 3}))  # union
{1, 2, 3}
>>> reduce(operator.and_, ({1}, {1, 2}, {1, 3}))  # intersection
{1}

(Apart from actual sets, an example of these are Django’s Q objects.)

On the other hand, if you’re dealing with bools, you should use any and all:

>>> any((True, False, True))
True

Question 23

After grepping my code, it seems the only thing I’ve used reduce for is calculating the factorial:

reduce(operator.mul, xrange(1, x+1) or (1,))

Question 24

I’m writing a compose function for a language, so I construct the composed function using reduce along with my apply operator.

In a nutshell, compose takes a list of functions to compose into a single function. If I have a complex operation that is applied in stages, I want to put it all together like so:

complexop = compose(stage4, stage3, stage2, stage1)

This way, I can then apply it to an expression like so:

complexop(expression)

And I want it to be equivalent to:

stage4(stage3(stage2(stage1(expression))))

Now, to build my internal objects, I want it to say:

Lambda([Symbol('x')], Apply(stage4, Apply(stage3, Apply(stage2, Apply(stage1, Symbol('x'))))))

(The Lambda class builds a user-defined function, and Apply builds a function application.)

Now, reduce, unfortunately, folds the wrong way, so I wound up using, roughly:

reduce(lambda x,y: Apply(y, x), reversed(args + [Symbol('x')]))

To figure out what reduce produces, try these in the REPL:

reduce(lambda x, y: (x, y), range(1, 11))
reduce(lambda x, y: (y, x), reversed(range(1, 11)))

Question 25

reduce can be used to get the list with the maximum nth element

reduce(lambda x,y: x if x[2] > y[2] else y,[[1,2,3,4],[5,2,5,7],[1,6,0,2]])

would return [5, 2, 5, 7] as it is the list with max 3rd element +

Question 26

Reduce isn’t limited to scalar operations; it can also be used to sort things into buckets. (This is what I use reduce for most often).

Imagine a case in which you have a list of objects, and you want to re-organize it hierarchically based on properties stored flatly in the object. In the following example, I produce a list of metadata objects related to articles in an XML-encoded newspaper with the articles function. articles generates a list of XML elements, and then maps through them one by one, producing objects that hold some interesting info about them. On the front end, I’m going to want to let the user browse the articles by section/subsection/headline. So I use reduce to take the list of articles and return a single dictionary that reflects the section/subsection/article hierarchy.

from lxml import etree
from Reader import Reader

class IssueReader(Reader):
    def articles(self):
        arts = self.q('//div3')  # inherited ... runs an xpath query against the issue
        subsection = etree.XPath('./ancestor::div2/@type')
        section = etree.XPath('./ancestor::div1/@type')
        header_text = etree.XPath('./head//text()')
        return map(lambda art: {
            'text_id': self.id,
            'path': self.getpath(art)[0],
            'subsection': (subsection(art)[0] or '[none]'),
            'section': (section(art)[0] or '[none]'),
            'headline': (''.join(header_text(art)) or '[none]')
        }, arts)

    def by_section(self):
        arts = self.articles()

        def extract(acc, art):  # acc for accumulator
            section = acc.get(art['section'], False)
            if section:
                subsection = acc.get(art['subsection'], False)
                if subsection:
                    subsection.append(art)
                else:
                    section[art['subsection']] = [art]
            else:
                acc[art['section']] = {art['subsection']: [art]}
            return acc

        return reduce(extract, arts, {})

I give both functions here because I think it shows how map and reduce can complement each other nicely when dealing with objects. The same thing could have been accomplished with a for loop, … but spending some serious time with a functional language has tended to make me think in terms of map and reduce.

By the way, if anybody has a better way to set properties like I’m doing in extract, where the parents of the property you want to set might not exist yet, please let me know.

Question 27

Not sure if this is what you are after but you can search source code on Google.

Follow the link for a search on ‘function:reduce() lang:python’ on Google Code search

At first glance the following projects use reduce()

MoinMoin
Zope
Numeric
ScientificPython

etc. etc. but then these are hardly surprising since they are huge projects.

The functionality of reduce can be done using function recursion which I guess Guido thought was more explicit.

Update:

Since Google’s Code Search was discontinued on 15-Jan-2012, besides reverting to regular Google searches, there’s something called Code Snippets Collection that looks promising. A number of other resources are mentioned in answers this (closed) question Replacement for Google Code Search?.

Update 2 (29-May-2017):

A good source for Python examples (in open-source code) is the Nullege search engine.

Question 28

import os

files = [
    # full filenames
    "var/log/apache/errors.log",
    "home/kane/images/avatars/crusader.png",
    "home/jane/documents/diary.txt",
    "home/kane/images/selfie.jpg",
    "var/log/abc.txt",
    "home/kane/.vimrc",
    "home/kane/images/avatars/paladin.png",
]

# unfolding of plain filiname list to file-tree
fs_tree = ({}, # dict of folders
           []) # list of files
for full_name in files:
    path, fn = os.path.split(full_name)
    reduce(
        # this fucction walks deep into path
        # and creates placeholders for subfolders
        lambda d, k: d[0].setdefault(k,         # walk deep
                                     ({}, [])), # or create subfolder storage
        path.split(os.path.sep),
        fs_tree
    )[1].append(fn)

print fs_tree
#({'home': (
#    {'jane': (
#        {'documents': (
#           {},
#           ['diary.txt']
#        )},
#        []
#    ),
#    'kane': (
#       {'images': (
#          {'avatars': (
#             {},
#             ['crusader.png',
#             'paladin.png']
#          )},
#          ['selfie.jpg']
#       )},
#       ['.vimrc']
#    )},
#    []
#  ),
#  'var': (
#     {'log': (
#         {'apache': (
#            {},
#            ['errors.log']
#         )},
#         ['abc.txt']
#     )},
#     [])
#},
#[])

Question 29

def dump(fname,iterable):
  with open(fname,'w') as f:
    reduce(lambda x, y: f.write(unicode(y,'utf-8')), iterable)

Question 30

I used reduce to concatenate a list of PostgreSQL search vectors with the || operator in sqlalchemy-searchable:

vectors = (self.column_vector(getattr(self.table.c, column_name))
           for column_name in self.indexed_columns)
concatenated = reduce(lambda x, y: x.op('||')(y), vectors)
compiled = concatenated.compile(self.conn)

Question 31

I have an old Python implementation of pipegrep that uses reduce and the glob module to build a list of files to process:

files = []
files.extend(reduce(lambda x, y: x + y, map(glob.glob, args)))

I found it handy at the time, but it’s really not necessary, as something similar is just as good, and probably more readable

files = []
for f in args:
    files.extend(glob.glob(f))

Question 32

Let say that there are some yearly statistic data stored a list of Counters. We want to find the MIN/MAX values in each month across the different years. For example, for January it would be 10. And for February it would be 15. We need to store the results in a new Counter.

from collections import Counter

stat2011 = Counter({"January": 12, "February": 20, "March": 50, "April": 70, "May": 15,
           "June": 35, "July": 30, "August": 15, "September": 20, "October": 60,
           "November": 13, "December": 50})

stat2012 = Counter({"January": 36, "February": 15, "March": 50, "April": 10, "May": 90,
           "June": 25, "July": 35, "August": 15, "September": 20, "October": 30,
           "November": 10, "December": 25})

stat2013 = Counter({"January": 10, "February": 60, "March": 90, "April": 10, "May": 80,
           "June": 50, "July": 30, "August": 15, "September": 20, "October": 75,
           "November": 60, "December": 15})

stat_list = [stat2011, stat2012, stat2013]

print reduce(lambda x, y: x & y, stat_list)     # MIN
print reduce(lambda x, y: x | y, stat_list)     # MAX

Question 33

I have objects representing some kind of overlapping intervals (genomic exons), and redefined their intersection using __and__:

class Exon:
    def __init__(self):
        ...
    def __and__(self,other):
        ...
        length = self.length + other.length  # (e.g.)
        return self.__class__(...length,...)

Then when I have a collection of them (for instance, in the same gene), I use

intersection = reduce(lambda x,y: x&y, exons)

Question 34

I just found useful usage of reduce: splitting string without removing the delimiter. The code is entirely from Programatically Speaking blog. Here’s the code:

reduce(lambda acc, elem: acc[:-1] + [acc[-1] + elem] if elem == "\n" else acc + [elem], re.split("(\n)", "a\nb\nc\n"), [])

Here’s the result:

['a\n', 'b\n', 'c\n', '']

Note that it handles edge cases that popular answer in SO doesn’t. For more in-depth explanation, I am redirecting you to original blog post.

Question 35

Using reduce() to find out if a list of dates are consecutive:

from datetime import date, timedelta


def checked(d1, d2):
    """
    We assume the date list is sorted.
    If d2 & d1 are different by 1, everything up to d2 is consecutive, so d2
    can advance to the next reduction.
    If d2 & d1 are not different by 1, returning d1 - 1 for the next reduction
    will guarantee the result produced by reduce() to be something other than
    the last date in the sorted date list.

    Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is consider consecutive
    Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is consider not consecutive

    """
    #if (d2 - d1).days == 1 or (d2 - d1).days == 0:  # for Definition 1
    if (d2 - d1).days == 1:                          # for Definition 2
        return d2
    else:
        return d1 + timedelta(days=-1)

# datelist = [date(2014, 1, 1), date(2014, 1, 3),
#             date(2013, 12, 31), date(2013, 12, 30)]

# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
#             date(2014, 2, 21), date(2014, 2, 22)]

datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]

datelist.sort()

if datelist[-1] == reduce(checked, datelist):
    print "dates are consecutive"
else:
    print "dates are not consecutive"

Question 36

What is the most idiomatic way to achieve something like the following, in Haskell:

foldl (+) 0 [1,2,3,4,5]
--> 15

Or its equivalent in Ruby:

[1,2,3,4,5].inject(0) {|m,x| m + x}
#> 15

Obviously, Python provides the reduce function, which is an implementation of fold, exactly as above, however, I was told that the ‘pythonic’ way of programming was to avoid lambda terms and higher-order functions, preferring list-comprehensions where possible. Therefore, is there a preferred way of folding a list, or list-like structure in Python that isn’t the reduce function, or is reduce the idiomatic way of achieving this?

Question 37

The Pythonic way of summing an array is using sum. For other purposes, you can sometimes use some combination of reduce (from the functools module) and the operator module, e.g.:

def product(xs):
    return reduce(operator.mul, xs, 1)

Be aware that reduce is actually a foldl, in Haskell terms. There is no special syntax to perform folds, there’s no builtin foldr, and actually using reduce with non-associative operators is considered bad style.

Using higher-order functions is quite pythonic; it makes good use of Python’s principle that everything is an object, including functions and classes. You are right that lambdas are frowned upon by some Pythonistas, but mostly because they tend not to be very readable when they get complex.

Question 38

Haskell

foldl (+) 0 [1,2,3,4,5]

Python

reduce(lambda a,b: a+b, [1,2,3,4,5], 0)

Obviously, that is a trivial example to illustrate a point. In Python you would just do sum([1,2,3,4,5]) and even Haskell purists would generally prefer sum [1,2,3,4,5].

For non-trivial scenarios when there is no obvious convenience function, the idiomatic pythonic approach is to explicitly write out the for loop and use mutable variable assignment instead of using reduce or a fold.

That is not at all the functional style, but that is the “pythonic” way. Python is not designed for functional purists. See how Python favors exceptions for flow control to see how non-functional idiomatic python is.

Question 39

In Python 3, the reduce has been removed: Release notes. Nevertheless you can use the functools module

import operator, functools
def product(xs):
    return functools.reduce(operator.mul, xs, 1)

On the other hand, the documentation expresses preference towards for-loop instead of reduce, hence:

def product(xs):
    result = 1
    for i in xs:
        result *= i
    return result

Question 40

You can reinvent the wheel as well:

def fold(f, l, a):
    """
    f: the function to apply
    l: the list to fold
    a: the accumulator, who is also the 'zero' on the first call
    """ 
    return a if(len(l) == 0) else fold(f, l[1:], f(a, l[0]))

print "Sum:", fold(lambda x, y : x+y, [1,2,3,4,5], 0)

print "Any:", fold(lambda x, y : x or y, [False, True, False], False)

print "All:", fold(lambda x, y : x and y, [False, True, False], True)

# Prove that result can be of a different type of the list's elements
print "Count(x==True):", 
print fold(lambda x, y : x+1 if(y) else x, [False, True, True], 0)

Question 41

Not really answer to the question, but one-liners for foldl and foldr:

a = [8,3,4]

## Foldl
reduce(lambda x,y: x**y, a)
#68719476736

## Foldr
reduce(lambda x,y: y**x, a[::-1])
#14134776518227074636666380005943348126619871175004951664972849610340958208L

Question 42

Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), which gives the possibility to name the result of an expression, we can use a list comprehension to replicate what other languages call fold/foldleft/reduce operations:

Given a list, a reducing function and an accumulator:

items = [1, 2, 3, 4, 5]
f = lambda acc, x: acc * x
accumulator = 1

we can fold items with f in order to obtain the resulting accumulation:

[accumulator := f(accumulator, x) for x in items]
# accumulator = 120

or in a condensed formed:

acc = 1; [acc := acc * x for x in [1, 2, 3, 4, 5]]
# acc = 120

Note that this is actually also a “scanleft” operation as the result of the list comprehension represents the state of the accumulation at each step:

acc = 1
scanned = [acc := acc * x for x in [1, 2, 3, 4, 5]]
# scanned = [1, 2, 6, 24, 120]
# acc = 120

Question 43

The actual answer to this (reduce) problem is: Just use a loop!

initial_value = 0
for x in the_list:
    initial_value += x #or any function.

This will be faster than a reduce and things like PyPy can optimize loops like that.

BTW, the sum case should be solved with the sum function

Question 44

I believe some of the respondents of this question have missed the broader implication of the fold function as an abstract tool. Yes, sum can do the same thing for a list of integers, but this is a trivial case. fold is more generic. It is useful when you have a sequence of data structures of varying shape and want to cleanly express an aggregation. So instead of having to build up a for loop with an aggregate variable and manually recompute it each time, a fold function (or the Python version, which reduce appears to correspond to) allows the programmer to express the intent of the aggregation much more plainly by simply providing two things:

A default starting or “seed” value for the aggregation.
A function that takes the current value of the aggregation (starting with the “seed”) and the next element in the list, and returns the next aggregation value.

Question 45

I may be quite late to the party, but we can create custom foldr using simple lambda calculus and curried function. Here is my implementation of foldr in python.

def foldr(func):
    def accumulator(acc):
        def listFunc(l):
            if l:
                x = l[0]
                xs = l[1:]
                return func(x)(foldr(func)(acc)(xs))
            else:
                return acc
        return listFunc
    return accumulator  


def curried_add(x):
    def inner(y):
        return x + y
    return inner

def curried_mult(x):
    def inner(y):
        return x * y
    return inner

print foldr(curried_add)(0)(range(1, 6))
print foldr(curried_mult)(1)(range(1, 6))

Even though the implementation is recursive (might be slow), it will print the values 15 and 120 respectively

Question 46

Partial application is cool. What functionality does functools.partial offer that you can’t get through lambdas?

>>> sum = lambda x, y : x + y
>>> sum(1, 2)
3
>>> incr = lambda y : sum(1, y)
>>> incr(2)
3
>>> def sum2(x, y):
    return x + y

>>> incr2 = functools.partial(sum2, 1)
>>> incr2(4)
5

Is functools somehow more efficient, or readable?

Question 47

What functionality does functools.partial offer that you can’t get through lambdas?

Not much in terms of extra functionality (but, see later) – and, readability is in the eye of the beholder.
Most people who are familiar with functional programming languages (those in the Lisp/Scheme families in particular) appear to like lambda just fine – I say “most”, definitely not all, because Guido and I assuredly are among those “familiar with” (etc) yet think of lambda as an eyesore anomaly in Python…
He was repentant of ever having accepted it into Python whereas planned to remove it from Python 3, as one of “Python’s glitches”.
I fully supported him in that. (I love lambda in Scheme… while its limitations in Python, and the weird way it just doesn’t fit in with the rest of the language, make my skin crawl).

Not so, however, for the hordes of lambda lovers — who staged one of the closest things to a rebellion ever seen in Python’s history, until Guido backtracked and decided to leave lambda in.
Several possible additions to functools (to make functions returning constants, identity, etc) didn’t happen (to avoid explicitly duplicating more of lambda‘s functionality), though partial did of course remain (it’s no total duplication, nor is it an eyesore).

Remember that lambda‘s body is limited to be an expression, so it’s got limitations. For example…:

>>> import functools
>>> f = functools.partial(int, base=2)
>>> f.args
()
>>> f.func
<type 'int'>
>>> f.keywords
{'base': 2}
>>>

functools.partial‘s returned function is decorated with attributes useful for introspection — the function it’s wrapping, and what positional and named arguments it fixes therein. Further, the named arguments can be overridden right back (the “fixing” is rather, in a sense, the setting of defaults):

>>> f('23', base=10)
23

So, as you see, it’s definely not as simplistic as lambda s: int(s, base=2)!-)

Yes, you could contort your lambda to give you some of this – e.g., for the keyword-overriding,

>>> f = lambda s, **k: int(s, **dict({'base': 2}, **k))

but I dearly hope that even the most ardent lambda-lover doesn’t consider this horror more readable than the partial call!-). The “attribute setting” part is even harder, because of the “body’s a single expression” limitation of Python’s lambda (plus the fact that assignment can never be part of a Python expression)… you end up “faking assignments within an expression” by stretching list comprehension well beyond its design limits…:

>>> f = [f for f in (lambda f: int(s, base=2),)
           if setattr(f, 'keywords', {'base': 2}) is None][0]

Now combine the named-arguments overridability, plus the setting of three attributes, into a single expression, and tell me just how readable that is going to be…!

Question 48

Well, here’s an example that shows a difference:

In [132]: sum = lambda x, y: x + y

In [133]: n = 5

In [134]: incr = lambda y: sum(n, y)

In [135]: incr2 = partial(sum, n)

In [136]: print incr(3), incr2(3)
8 8

In [137]: n = 9

In [138]: print incr(3), incr2(3)
12 8

These posts by Ivan Moore expand on the “limitations of lambda” and closures in python:

Question 49

In the latest versions of Python (>=2.7), you can pickle a partial, but not a lambda:

>>> pickle.dumps(partial(int))
'cfunctools\npartial\np0\n(c__builtin__\nint\np1\ntp2\nRp3\n(g1\n(tNNtp4\nb.'
>>> pickle.dumps(lambda x: int(x))
Traceback (most recent call last):
  File "<ipython-input-11-e32d5a050739>", line 1, in <module>
    pickle.dumps(lambda x: int(x))
  File "/usr/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/usr/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 748, in save_global
    (obj, module, name))
PicklingError: Can't pickle <function <lambda> at 0x1729aa0>: it's not found as __main__.<lambda>

Question 50

Is functools somehow more efficient..?

As a partly answer to this I decided to test the performance. Here is my example:

from functools import partial
import time, math

def make_lambda():
    x = 1.3
    return lambda: math.sin(x)

def make_partial():
    x = 1.3
    return partial(math.sin, x)

Iter = 10**7

start = time.clock()
for i in range(0, Iter):
    l = make_lambda()
stop = time.clock()
print('lambda creation time {}'.format(stop - start))

start = time.clock()
for i in range(0, Iter):
    l()
stop = time.clock()
print('lambda execution time {}'.format(stop - start))

start = time.clock()
for i in range(0, Iter):
    p = make_partial()
stop = time.clock()
print('partial creation time {}'.format(stop - start))

start = time.clock()
for i in range(0, Iter):
    p()
stop = time.clock()
print('partial execution time {}'.format(stop - start))

on Python 3.3 it gives:

lambda creation time 3.1743163756961392
lambda execution time 3.040552701787919
partial creation time 3.514482823352731
partial execution time 1.7113973411608114

Which means that partial needs a bit more time for creation but considerably less time for execution. This can well be the effect of the early and late binding which are discussed in the answer from ars.

Question 51

Besides the extra functionality Alex mentioned, another advantage of functools.partial is speed. With partial you can avoid constructing (and destructing) another stack frame.

Neither the function generated by partial nor lambdas have docstrings by default (though you can set the doc string for any objects via __doc__).

You can find more details in this blog: Partial Function Application in Python

Question 52

I understand the intent quickest in the third example.

When I parse lambdas, I’m expecting more complexity/oddity than offered by the standard library directly.

Also, you’ll notice that the third example is the only one which doesn’t depend on the full signature of sum2; thus making it slightly more loosely coupled.

Question 53

I am not able to get my head on how the partial works in functools. I have the following code from here:

>>> sum = lambda x, y : x + y
>>> sum(1, 2)
3
>>> incr = lambda y : sum(1, y)
>>> incr(2)
3
>>> def sum2(x, y):
    return x + y

>>> incr2 = functools.partial(sum2, 1)
>>> incr2(4)
5

Now in the line

incr = lambda y : sum(1, y)

I get that whatever argument I pass to incr it will be passed as y to lambda which will return sum(1, y) i.e 1 + y.

I understand that. But I didn’t understand this incr2(4).

How does the 4 gets passed as x in partial function? To me, 4 should replace the sum2. What is the relation between x and 4?

Question 54

Roughly, partial does something like this (apart from keyword args support etc):

def partial(func, *part_args):
    def wrapper(*extra_args):
        args = list(part_args)
        args.extend(extra_args)
        return func(*args)

    return wrapper

So, by calling partial(sum2, 4) you create a new function (a callable, to be precise) that behaves like sum2, but has one positional argument less. That missing argument is always substituted by 4, so that partial(sum2, 4)(2) == sum2(4, 2)

As for why it’s needed, there’s a variety of cases. Just for one, suppose you have to pass a function somewhere where it’s expected to have 2 arguments:

class EventNotifier(object):
    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        ''' callback should accept two positional arguments, event and params '''
        self._listeners.append(callback)
        # ...

    def notify(self, event, *params):
        for f in self._listeners:
            f(event, params)

But a function you already have needs access to some third context object to do its job:

def log_event(context, event, params):
    context.log_event("Something happened %s, %s", event, params)

So, there are several solutions:

A custom object:

class Listener(object):
   def __init__(self, context):
       self._context = context

   def __call__(self, event, params):
       self._context.log_event("Something happened %s, %s", event, params)


 notifier.add_listener(Listener(context))

Lambda:

log_listener = lambda event, params: log_event(context, event, params)
notifier.add_listener(log_listener)

With partials:

context = get_context()  # whatever
notifier.add_listener(partial(log_event, context))

Of those three, partial is the shortest and the fastest. (For a more complex case you might want a custom object though).

Question 55

partials are incredibly useful.

For instance, in a ‘pipe-lined’ sequence of function calls (in which the returned value from one function is the argument passed to the next).

Sometimes a function in such a pipeline requires a single argument, but the function immediately upstream from it returns two values.

In this scenario, functools.partial might allow you to keep this function pipeline intact.

Here’s a specific, isolated example: suppose you want to sort some data by each data point’s distance from some target:

# create some data
import random as RND
fnx = lambda: RND.randint(0, 10)
data = [ (fnx(), fnx()) for c in range(10) ]
target = (2, 4)

import math
def euclid_dist(v1, v2):
    x1, y1 = v1
    x2, y2 = v2
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2)

To sort this data by distance from the target, what you would like to do of course is this:

data.sort(key=euclid_dist)

but you can’t–the sort method’s key parameter only accepts functions that take a single argument.

so re-write euclid_dist as a function taking a single parameter:

from functools import partial

p_euclid_dist = partial(euclid_dist, target)

p_euclid_dist now accepts a single argument,

>>> p_euclid_dist((3, 3))
  1.4142135623730951

so now you can sort your data by passing in the partial function for the sort method’s key argument:

data.sort(key=p_euclid_dist)

# verify that it works:
for p in data:
    print(round(p_euclid_dist(p), 3))

    1.0
    2.236
    2.236
    3.606
    4.243
    5.0
    5.831
    6.325
    7.071
    8.602

Or for instance, one of the function’s arguments changes in an outer loop but is fixed during iteration in the inner loop. By using a partial, you don’t have to pass in the additional parameter during iteration of the inner loop, because the modified (partial) function doesn’t require it.

>>> from functools import partial

>>> def fnx(a, b, c):
      return a + b + c

>>> fnx(3, 4, 5)
      12

create a partial function (using keyword arg)

>>> pfnx = partial(fnx, a=12)

>>> pfnx(b=4, c=5)
     21

you can also create a partial function with a positional argument

>>> pfnx = partial(fnx, 12)

>>> pfnx(4, 5)
      21

but this will throw (e.g., creating partial with keyword argument then calling using positional arguments)

>>> pfnx = partial(fnx, a=12)

>>> pfnx(4, 5)
      Traceback (most recent call last):
      File "<pyshell#80>", line 1, in <module>
      pfnx(4, 5)
      TypeError: fnx() got multiple values for keyword argument 'a'

another use case: writing distributed code using python’s multiprocessing library. A pool of processes is created using the Pool method:

>>> import multiprocessing as MP

>>> # create a process pool:
>>> ppool = MP.Pool()

Pool has a map method, but it only takes a single iterable, so if you need to pass in a function with a longer parameter list, re-define the function as a partial, to fix all but one:

>>> ppool.map(pfnx, [4, 6, 7, 8])

Question 56

short answer, partial gives default values to the parameters of a function that would otherwise not have default values.

from functools import partial

def foo(a,b):
    return a+b

bar = partial(foo, a=1) # equivalent to: foo(a=1, b)
bar(b=10)
#11 = 1+10
bar(a=101, b=10)
#111=101+10

Question 57

Partials can be used to make new derived functions that have some input parameters pre-assigned

To see some real world usage of partials, refer to this really good blog post:
http://chriskiehl.com/article/Cleaner-coding-through-partially-applied-functions/

A simple but neat beginner’s example from the blog, covers how one might use partial on re.search to make code more readable. re.search method’s signature is:

search(pattern, string, flags=0)

By applying partial we can create multiple versions of the regular expression search to suit our requirements, so for example:

is_spaced_apart = partial(re.search, '[a-zA-Z]\s\=')
is_grouped_together = partial(re.search, '[a-zA-Z]\=')

Now is_spaced_apart and is_grouped_together are two new functions derived from re.search that have the pattern argument applied(since pattern is the first argument in the re.search method’s signature).

The signature of these two new functions(callable) is:

is_spaced_apart(string, flags=0)     # pattern '[a-zA-Z]\s\=' applied
is_grouped_together(string, flags=0) # pattern '[a-zA-Z]\=' applied

This is how you could then use these partial functions on some text:

for text in lines:
    if is_grouped_together(text):
        some_action(text)
    elif is_spaced_apart(text):
        some_other_action(text)
    else:
        some_default_action()

You can refer the link above to get a more in depth understanding of the subject, as it covers this specific example and much more..

Question 58

In my opinion, it’s a way to implement currying in python.

from functools import partial
def add(a,b):
    return a + b

def add2number(x,y,z):
    return x + y + z

if __name__ == "__main__":
    add2 = partial(add,2)
    print("result of add2 ",add2(1))
    add3 = partial(partial(add2number,1),2)
    print("result of add3",add3(1))

The result is 3 and 4.

Question 59

Also worth to mention, that when partial function passed another function where we want to “hard code” some parameters, that should be rightmost parameter

def func(a,b):
    return a*b
prt = partial(func, b=7)
    print(prt(4))
#return 28

but if we do the same, but changing a parameter instead

def func(a,b):
    return a*b
 prt = partial(func, a=7)
    print(prt(4))

it will throw error, “TypeError: func() got multiple values for argument ‘a'”

Question 60

This answer is more of an example code. All the above answers give good explanations regarding why one should use partial. I will give my observations and use cases about partial.

from functools import partial
 def adder(a,b,c):
    print('a:{},b:{},c:{}'.format(a,b,c))
    ans = a+b+c
    print(ans)
partial_adder = partial(adder,1,2)
partial_adder(3)  ## now partial_adder is a callable that can take only one argument

Output of the above code should be:

a:1,b:2,c:3
6

Notice that in the above example a new callable was returned that will take parameter (c) as it’s argument. Note that it is also the last argument to the function.

args = [1,2]
partial_adder = partial(adder,*args)
partial_adder(3)

Output of the above code is also:

a:1,b:2,c:3
6

Notice that * was used to unpack the non-keyword arguments and the callable returned in terms of which argument it can take is same as above.

Another observation is: Below example demonstrates that partial returns a callable which will take the undeclared parameter (a) as an argument.

def adder(a,b=1,c=2,d=3,e=4):
    print('a:{},b:{},c:{},d:{},e:{}'.format(a,b,c,d,e))
    ans = a+b+c+d+e
    print(ans)
partial_adder = partial(adder,b=10,c=2)
partial_adder(20)

Output of the above code should be:

a:20,b:10,c:2,d:3,e:4
39

Similarly,

kwargs = {'b':10,'c':2}
partial_adder = partial(adder,**kwargs)
partial_adder(20)

Above code prints

a:20,b:10,c:2,d:3,e:4
39

I had to use it when I was using Pool.map_async method from multiprocessing module. You can pass only one argument to the worker function so I had to use partial to make my worker function look like a callable with only one input argument but in reality my worker function had multiple input arguments.

问题：使用python map和其他功能工具

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

问题：使用reduce（）的有用代码？[关闭]

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

回答 15

回答 16

回答 17

回答 18

回答 19

回答 20

回答 21

回答 22

回答 23

问题：函数式编程中的“ pythonic”等同于“ fold”函数是什么？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

问题：Python：为什么functools.partial是必需的？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：functools部分如何做？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

问题：如何在Python 3中使用过滤，映射和归约

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：为什么Python对于函数式编程不是很好？[关闭]

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

问题：列表理解与lambda +过滤器

回答 0

回答 1