Python 实用宝典

Question 1

I have two lists that i need to combine where the second list has any duplicates of the first list ignored. .. A bit hard to explain, so let me show an example of what the code looks like, and what i want as a result.

first_list = [1, 2, 2, 5]

second_list = [2, 5, 7, 9]

# The result of combining the two lists should result in this list:
resulting_list = [1, 2, 2, 5, 7, 9]

You’ll notice that the result has the first list, including its two “2” values, but the fact that second_list also has an additional 2 and 5 value is not added to the first list.

Normally for something like this i would use sets, but a set on first_list would purge the duplicate values it already has. So i’m simply wondering what the best/fastest way to achieve this desired combination.

Thanks.

Question 2

You need to append to the first list those elements of the second list that aren’t in the first – sets are the easiest way of determining which elements they are, like this:

first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]

in_first = set(first_list)
in_second = set(second_list)

in_second_but_not_in_first = in_second - in_first

result = first_list + list(in_second_but_not_in_first)
print(result)  # Prints [1, 2, 2, 5, 9, 7]

Or if you prefer one-liners 8-)

print(first_list + list(set(second_list) - set(first_list)))

Question 3

resulting_list = list(first_list)
resulting_list.extend(x for x in second_list if x not in resulting_list)

Question 4

You can use sets:

first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]

resultList= list(set(first_list) | set(second_list))

print(resultList)
# Results in : resultList = [1,2,5,7,9]

Question 5

You can bring this down to one single line of code if you use numpy:

a = [1,2,3,4,5,6,7]
b = [2,4,7,8,9,10,11,12]

sorted(np.unique(a+b))

>>> [1,2,3,4,5,6,7,8,9,10,11,12]

Question 6

first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]

print( set( first_list + second_list ) )

Question 7

resulting_list = first_list + [i for i in second_list if i not in first_list]

Question 8

Simplest to me is:

first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]

merged_list = list(set(first_list+second_list))
print(merged_list)

#prints [1, 2, 5, 7, 9]

Question 9

You can also combine RichieHindle’s and Ned Batchelder’s responses for an average-case O(m+n) algorithm that preserves order:

first_list = [1, 2, 2, 5]
second_list = [2, 5, 7, 9]

fs = set(first_list)
resulting_list = first_list + [x for x in second_list if x not in fs]

assert(resulting_list == [1, 2, 2, 5, 7, 9])

Note that x in s has a worst-case complexity of O(m), so the worst-case complexity of this code is still O(m*n).

Question 10

This might help

def union(a,b):
    for e in b:
        if e not in a:
            a.append(e)

The union function merges the second list into first, with out duplicating an element of a, if it’s already in a. Similar to set union operator. This function does not change b. If a=[1,2,3] b=[2,3,4]. After union(a,b) makes a=[1,2,3,4] and b=[2,3,4]

Question 11

Based on the recipe :

resulting_list = list(set().union(first_list, second_list))

Question 12

    first_list = [1, 2, 2, 5]
    second_list = [2, 5, 7, 9]

    newList=[]
    for i in first_list:
        newList.append(i)
    for z in second_list:
        if z not in newList:
            newList.append(z)
    newList.sort()
    print newList

[1, 2, 2, 5, 7, 9]

Question 13

I would like to know if there is a better way to print all objects in a Python list than this :

myList = [Person("Foo"), Person("Bar")]
print("\n".join(map(str, myList)))
Foo
Bar

I read this way is not really good :

myList = [Person("Foo"), Person("Bar")]
for p in myList:
    print(p)

Isn’t there something like :

print(p) for p in myList

If not, my question is… why ? If we can do this kind of stuff with comprehensive lists, why not as a simple statement outside a list ?

Question 14

Assuming you are using Python 3.x:

print(*myList, sep='\n')

You can get the same behavior on Python 2.x using from __future__ import print_function, as noted by mgilson in comments.

With the print statement on Python 2.x you will need iteration of some kind, regarding your question about print(p) for p in myList not working, you can just use the following which does the same thing and is still one line:

for p in myList: print p

For a solution that uses '\n'.join(), I prefer list comprehensions and generators over map() so I would probably use the following:

print '\n'.join(str(p) for p in myList)

Question 15

I use this all the time :

#!/usr/bin/python

l = [1,2,3,7] 
print "".join([str(x) for x in l])

Question 16

[print(a) for a in list] will give a bunch of None types at the end though it prints out all the items

Question 17

For Python 2.*:

If you overload the function __str__() for your Person class, you can omit the part with map(str, …). Another way for this is creating a function, just like you wrote:

def write_list(lst):
    for item in lst:
        print str(item) 

...

write_list(MyList)

There is in Python 3.* the argument sep for the print() function. Take a look at documentation.

Question 18

Expanding @lucasg’s answer (inspired by the comment it received):

To get a formatted list output, you can do something along these lines:

l = [1,2,5]
print ", ".join('%02d'%x for x in l)

01, 02, 05

Now the ", " provides the separator (only between items, not at the end) and the formatting string '02d'combined with %x gives a formatted string for each item x – in this case, formatted as an integer with two digits, left-filled with zeros.

Question 19

To display each content, I use:

mylist = ['foo', 'bar']
indexval = 0
for i in range(len(mylist)):     
    print(mylist[indexval])
    indexval += 1

Example of using in a function:

def showAll(listname, startat):
   indexval = startat
   try:
      for i in range(len(mylist)):
         print(mylist[indexval])
         indexval = indexval + 1
   except IndexError:
      print('That index value you gave is out of range.')

Hope I helped.

Question 20

I think this is the most convenient if you just want to see the content in the list:

myList = ['foo', 'bar']
print('myList is %s' % str(myList))

Simple, easy to read and can be used together with format string.

Question 21

OP’s question is: does something like following exists, if not then why

print(p) for p in myList # doesn't work, OP's intuition

answer is, it does exist which is:

[p for p in myList] #works perfectly

Basically, use [] for list comprehension and get rid of print to avoiding printing None. To see why print prints None see this

Question 22

I recently made a password generator and although I’m VERY NEW to python, I whipped this up as a way to display all items in a list (with small edits to fit your needs…

    x = 0
    up = 0
    passwordText = ""
    password = []
    userInput = int(input("Enter how many characters you want your password to be: "))
    print("\n\n\n") # spacing

    while x <= (userInput - 1): #loops as many times as the user inputs above
            password.extend([choice(groups.characters)]) #adds random character from groups file that has all lower/uppercase letters and all numbers
            x = x+1 #adds 1 to x w/o using x ++1 as I get many errors w/ that
            passwordText = passwordText + password[up]
            up = up+1 # same as x increase


    print(passwordText)

Like I said, IM VERY NEW to Python and I’m sure this is way to clunky for a expert, but I’m just here for another example

Question 23

Assuming you are fine with your list being printed [1,2,3], then an easy way in Python3 is:

mylist=[1,2,3,'lorem','ipsum','dolor','sit','amet']

print(f"There are {len(mylist):d} items in this lorem list: {str(mylist):s}")

Running this produces the following output:

There are 8 items in this lorem list: [1, 2, 3, ‘lorem’, ‘ipsum’, ‘dolor’, ‘sit’, ‘amet’]

Question 24

I found, that there is related question, about how to find if at least one item exists in a list:
How to check if one of the following items is in a list?

But what is the best and pythonic way to find whether all items exists in a list?

Searching through the docs I found this solution:

>>> l = ['a', 'b', 'c']
>>> set(['a', 'b']) <= set(l)
True
>>> set(['a', 'x']) <= set(l)
False

Other solution would be this:

>>> l = ['a', 'b', 'c']
>>> all(x in l for x in ['a', 'b'])
True
>>> all(x in l for x in ['a', 'x'])
False

But here you must do more typing.

Is there any other solutions?

Question 25

Operators like <= in Python are generally not overriden to mean something significantly different than “less than or equal to”. It’s unusual for the standard library does this–it smells like legacy API to me.

Use the equivalent and more clearly-named method, set.issubset. Note that you don’t need to convert the argument to a set; it’ll do that for you if needed.

set(['a', 'b']).issubset(['a', 'b', 'c'])

Question 26

I would probably use set in the following manner :

set(l).issuperset(set(['a','b']))

or the other way round :

set(['a','b']).issubset(set(l))

I find it a bit more readable, but it may be over-kill. Sets are particularly useful to compute union/intersection/differences between collections, but it may not be the best option in this situation …

Question 27

I like these two because they seem the most logical, the latter being shorter and probably fastest (shown here using set literal syntax which has been backported to Python 2.7):

all(x in {'a', 'b', 'c'} for x in ['a', 'b'])
#   or
{'a', 'b'}.issubset({'a', 'b', 'c'})

Question 28

What if your lists contain duplicates like this:

v1 = ['s', 'h', 'e', 'e', 'p']
v2 = ['s', 's', 'h']

Sets do not contain duplicates. So, the following line returns True.

set(v2).issubset(v1)

To count for duplicates, you can use the code:

v1 = sorted(v1)
v2 = sorted(v2)


def is_subseq(v2, v1):
    """Check whether v2 is a subsequence of v1."""
    it = iter(v1)
    return all(c in it for c in v2)

So, the following line returns False.

is_subseq(v2, v1)

Question 29

This was what I was searching online but unfortunately found not online but while experimenting on python interpreter.

>>> case  = "caseCamel"
>>> label = "Case Camel"
>>> list  = ["apple", "banana"]
>>>
>>> (case or label) in list
False
>>> list = ["apple", "caseCamel"]
>>> (case or label) in list
True
>>> (case and label) in list
False
>>> list = ["case", "caseCamel", "Case Camel"]
>>> (case and label) in list
True
>>>

and if you have a looong list of variables held in a sublist variable

>>>
>>> list  = ["case", "caseCamel", "Case Camel"]
>>> label = "Case Camel"
>>> case  = "caseCamel"
>>>
>>> sublist = ["unique banana", "very unique banana"]
>>>
>>> # example for if any (at least one) item contained in superset (or statement)
...
>>> next((True for item in sublist if next((True for x in list if x == item), False)), False)
False
>>>
>>> sublist[0] = label
>>>
>>> next((True for item in sublist if next((True for x in list if x == item), False)), False)
True
>>>
>>> # example for whether a subset (all items) contained in superset (and statement)
...
>>> # a bit of demorgan's law
...
>>> next((False for item in sublist if item not in list), True)
False
>>>
>>> sublist[1] = case
>>>
>>> next((False for item in sublist if item not in list), True)
True
>>>
>>> next((True for item in sublist if next((True for x in list if x == item), False)), False)
True
>>>
>>>

Question 30

An example of how to do this using a lambda expression would be:

issublist = lambda x, y: 0 in [_ in x for _ in y]

Question 31

Not OP’s case, but – for anyone who wants to assert intersection in dicts and ended up here due to poor googling (e.g. me) – you need to work with dict.items:

>>> a = {'key': 'value'}
>>> b = {'key': 'value', 'extra_key': 'extra_value'}
>>> all(item in a.items() for item in b.items())
True
>>> all(item in b.items() for item in a.items())
False

That’s because dict.items returns tuples of key/value pairs, and much like any object in Python, they’re interchangeably comparable

Question 32

Given a list of numbers, how does one find differences between every (i)-th elements and its (i+1)-th?

Is it better to use a lambda expression or maybe a list comprehension?

For example:

Given a list t=[1,3,6,...], the goal is to find a list v=[2,3,...] because 3-1=2, 6-3=3, etc.

Question 33

>>> t
[1, 3, 6]
>>> [j-i for i, j in zip(t[:-1], t[1:])]  # or use itertools.izip in py2k
[2, 3]

Question 34

The other answers are correct but if you’re doing numerical work, you might want to consider numpy. Using numpy, the answer is:

v = numpy.diff(t)

Question 35

If you don’t want to use numpy nor zip, you can use the following solution:

>>> t = [1, 3, 6]
>>> v = [t[i+1]-t[i] for i in range(len(t)-1)]
>>> v
[2, 3]

Question 36

You can use itertools.tee and zip to efficiently build the result:

from itertools import tee
# python2 only:
#from itertools import izip as zip

def differences(seq):
    iterable, copied = tee(seq)
    next(copied)
    for x, y in zip(iterable, copied):
        yield y - x

Or using itertools.islice instead:

from itertools import islice

def differences(seq):
    nexts = islice(seq, 1, None)
    for x, y in zip(seq, nexts):
        yield y - x

You can also avoid using the itertools module:

def differences(seq):
    iterable = iter(seq)
    prev = next(iterable)
    for element in iterable:
        yield element - prev
        prev = element

All these solution work in constant space if you don’t need to store all the results and support infinite iterables.

Here are some micro-benchmarks of the solutions:

In [12]: L = range(10**6)

In [13]: from collections import deque
In [15]: %timeit deque(differences_tee(L), maxlen=0)
10 loops, best of 3: 122 ms per loop

In [16]: %timeit deque(differences_islice(L), maxlen=0)
10 loops, best of 3: 127 ms per loop

In [17]: %timeit deque(differences_no_it(L), maxlen=0)
10 loops, best of 3: 89.9 ms per loop

And the other proposed solutions:

In [18]: %timeit [x[1] - x[0] for x in zip(L[1:], L)]
10 loops, best of 3: 163 ms per loop

In [19]: %timeit [L[i+1]-L[i] for i in range(len(L)-1)]
1 loops, best of 3: 395 ms per loop

In [20]: import numpy as np

In [21]: %timeit np.diff(L)
1 loops, best of 3: 479 ms per loop

In [35]: %%timeit
    ...: res = []
    ...: for i in range(len(L) - 1):
    ...:     res.append(L[i+1] - L[i])
    ...: 
1 loops, best of 3: 234 ms per loop

Note that:

zip(L[1:], L) is equivalent to zip(L[1:], L[:-1]) since zip already terminates on the shortest input, however it avoids a whole copy of L.
Accessing the single elements by index is very slow because every index access is a method call in python
numpy.diff is slow because it has to first convert the list to a ndarray. Obviously if you start with an ndarray it will be much faster:
```
In [22]: arr = np.array(L)

In [23]: %timeit np.diff(arr)
100 loops, best of 3: 3.02 ms per loop
```

Question 37

Using the := walrus operator available in Python 3.8+:

>>> t = [1, 3, 6]
>>> prev = t[0]; [-prev + (prev := x) for x in t[1:]]
[2, 3]

Question 38

I would suggest using

v = np.diff(t)

this is simple and easy to read.

But if you want v to have the same length as t then

v = np.diff([t[0]] + t) # for python 3.x

or

v = np.diff(t + [t[-1]])

FYI: this will only work for lists.

for numpy arrays

v = np.diff(np.append(t[0], t))

Question 39

A functional approach:

>>> import operator
>>> a = [1,3,5,7,11,13,17,21]
>>> map(operator.sub, a[1:], a[:-1])
[2, 2, 2, 4, 2, 4, 4]

Using generator:

>>> import operator, itertools
>>> g1,g2 = itertools.tee((x*x for x in xrange(5)),2)
>>> list(itertools.imap(operator.sub, itertools.islice(g1,1,None), g2))
[1, 3, 5, 7]

Using indices:

>>> [a[i+1]-a[i] for i in xrange(len(a)-1)]
[2, 2, 2, 4, 2, 4, 4]

Question 40

Ok. I think I found the proper solution:

v = [x[1]-x[0] for x in zip(t[1:],t[:-1])]

Question 41

Solution with periodic boundaries

Sometimes with numerical integration you will want to difference a list with periodic boundary conditions (so the first element calculates the difference to the last. In this case the numpy.roll function is helpful:

v-np.roll(v,1)

Solutions with zero prepended

Another numpy solution (just for completeness) is to use

numpy.ediff1d(v)

This works as numpy.diff, but only on a vector (it flattens the input array). It offers the ability to prepend or append numbers to the resulting vector. This is useful when handling accumulated fields that is often the case fluxes in meteorological variables (e.g. rain, latent heat etc), as you want a resulting list of the same length as the input variable, with the first entry untouched.

Then you would write

np.ediff1d(v,to_begin=v[0])

Of course, you can also do this with the np.diff command, in this case though you need to prepend zero to the series with the prepend keyword:

np.diff(v,prepend=0.0)

All the above solutions return a vector that is the same length as the input.

Question 42

My way

>>>v = [1,2,3,4,5]
>>>[v[i] - v[i-1] for i, value in enumerate(v[1:], 1)]
[1, 1, 1, 1]

Question 43

I am trying to create a nice column list in python for use with commandline admin tools which I create.

Basicly, I want a list like:

[['a', 'b', 'c'], ['aaaaaaaaaa', 'b', 'c'], ['a', 'bbbbbbbbbb', 'c']]

To turn into:

a            b            c
aaaaaaaaaa   b            c
a            bbbbbbbbbb   c

Using plain tabs wont do the trick here because I don’t know the longest data in each row.

This is the same behavior as ‘column -t’ in Linux..

$ echo -e "a b c\naaaaaaaaaa b c\na bbbbbbbbbb c"
a b c
aaaaaaaaaa b c
a bbbbbbbbbb c

$ echo -e "a b c\naaaaaaaaaa b c\na bbbbbbbbbb c" | column -t
a           b           c
aaaaaaaaaa  b           c
a           bbbbbbbbbb  c

I have looked around for various python libraries to do this but can’t find anything useful.

Question 44

data = [['a', 'b', 'c'], ['aaaaaaaaaa', 'b', 'c'], ['a', 'bbbbbbbbbb', 'c']]

col_width = max(len(word) for row in data for word in row) + 2  # padding
for row in data:
    print "".join(word.ljust(col_width) for word in row)

a            b            c            
aaaaaaaaaa   b            c            
a            bbbbbbbbbb   c

What this does is calculate the longest data entry to determine the column width, then use .ljust() to add the necessary padding when printing out each column.

Question 45

Since Python 2.6+, you can use a format string in the following way to set the columns to a minimum of 20 characters and align text to right.

table_data = [
    ['a', 'b', 'c'],
    ['aaaaaaaaaa', 'b', 'c'], 
    ['a', 'bbbbbbbbbb', 'c']
]
for row in table_data:
    print("{: >20} {: >20} {: >20}".format(*row))

Output:

               a                    b                    c
      aaaaaaaaaa                    b                    c
               a           bbbbbbbbbb                    c

Question 46

I came here with the same requirements but @lvc and @Preet’s answers seems more inline with what column -t produces in that columns have different widths:

>>> rows =  [   ['a',           'b',            'c',    'd']
...         ,   ['aaaaaaaaaa',  'b',            'c',    'd']
...         ,   ['a',           'bbbbbbbbbb',   'c',    'd']
...         ]
...

>>> widths = [max(map(len, col)) for col in zip(*rows)]
>>> for row in rows:
...     print "  ".join((val.ljust(width) for val, width in zip(row, widths)))
...
a           b           c  d
aaaaaaaaaa  b           c  d
a           bbbbbbbbbb  c  d

Question 47

This is a little late to the party, and a shameless plug for a package I wrote, but you can also check out the Columnar package.

It takes a list of lists of input and a list of headers and outputs a table-formatted string. This snippet creates a docker-esque table:

from columnar import columnar

headers = ['name', 'id', 'host', 'notes']

data = [
    ['busybox', 'c3c37d5d-38d2-409f-8d02-600fd9d51239', 'linuxnode-1-292735', 'Test server.'],
    ['alpine-python', '6bb77855-0fda-45a9-b553-e19e1a795f1e', 'linuxnode-2-249253', 'The one that runs python.'],
    ['redis', 'afb648ba-ac97-4fb2-8953-9a5b5f39663e', 'linuxnode-3-3416918', 'For queues and stuff.'],
    ['app-server', 'b866cd0f-bf80-40c7-84e3-c40891ec68f9', 'linuxnode-4-295918', 'A popular destination.'],
    ['nginx', '76fea0f0-aa53-4911-b7e4-fae28c2e469b', 'linuxnode-5-292735', 'Traffic Cop'],
]

table = columnar(data, headers, no_borders=True)
print(table)

Or you can get a little fancier with colors and borders.

To read more about the column-sizing algorithm and see the rest of the API you can check out the link above or see the Columnar GitHub Repo

Question 48

You have to do this with 2 passes:

get the maximum width of each column.
formatting the columns using our knowledge of max width from the first pass using str.ljust() and str.rjust()

Question 49

Transposing the columns like that is a job for zip:

>>> a = [['a', 'b', 'c'], ['aaaaaaaaaa', 'b', 'c'], ['a', 'bbbbbbbbbb', 'c']]
>>> list(zip(*a))
[('a', 'aaaaaaaaaa', 'a'), ('b', 'b', 'bbbbbbbbbb'), ('c', 'c', 'c')]

To find the required length of each column, you can use max:

>>> trans_a = zip(*a)
>>> [max(len(c) for c in b) for b in trans_a]
[10, 10, 1]

Which you can use, with suitable padding, to construct strings to pass to print:

>>> col_lenghts = [max(len(c) for c in b) for b in trans_a]
>>> padding = ' ' # You might want more
>>> padding.join(s.ljust(l) for s,l in zip(a[0], col_lenghts))
'a          b          c'

Question 50

To get fancier tables like

---------------------------------------------------
| First Name | Last Name        | Age | Position  |
---------------------------------------------------
| John       | Smith            | 24  | Software  |
|            |                  |     | Engineer  |
---------------------------------------------------
| Mary       | Brohowski        | 23  | Sales     |
|            |                  |     | Manager   |
---------------------------------------------------
| Aristidis  | Papageorgopoulos | 28  | Senior    |
|            |                  |     | Reseacher |
---------------------------------------------------

you can use this Python recipe:

'''
From http://code.activestate.com/recipes/267662-table-indentation/
PSF License
'''
import cStringIO,operator

def indent(rows, hasHeader=False, headerChar='-', delim=' | ', justify='left',
           separateRows=False, prefix='', postfix='', wrapfunc=lambda x:x):
    """Indents a table by column.
       - rows: A sequence of sequences of items, one sequence per row.
       - hasHeader: True if the first row consists of the columns' names.
       - headerChar: Character to be used for the row separator line
         (if hasHeader==True or separateRows==True).
       - delim: The column delimiter.
       - justify: Determines how are data justified in their column. 
         Valid values are 'left','right' and 'center'.
       - separateRows: True if rows are to be separated by a line
         of 'headerChar's.
       - prefix: A string prepended to each printed row.
       - postfix: A string appended to each printed row.
       - wrapfunc: A function f(text) for wrapping text; each element in
         the table is first wrapped by this function."""
    # closure for breaking logical rows to physical, using wrapfunc
    def rowWrapper(row):
        newRows = [wrapfunc(item).split('\n') for item in row]
        return [[substr or '' for substr in item] for item in map(None,*newRows)]
    # break each logical row into one or more physical ones
    logicalRows = [rowWrapper(row) for row in rows]
    # columns of physical rows
    columns = map(None,*reduce(operator.add,logicalRows))
    # get the maximum of each column by the string length of its items
    maxWidths = [max([len(str(item)) for item in column]) for column in columns]
    rowSeparator = headerChar * (len(prefix) + len(postfix) + sum(maxWidths) + \
                                 len(delim)*(len(maxWidths)-1))
    # select the appropriate justify method
    justify = {'center':str.center, 'right':str.rjust, 'left':str.ljust}[justify.lower()]
    output=cStringIO.StringIO()
    if separateRows: print >> output, rowSeparator
    for physicalRows in logicalRows:
        for row in physicalRows:
            print >> output, \
                prefix \
                + delim.join([justify(str(item),width) for (item,width) in zip(row,maxWidths)]) \
                + postfix
        if separateRows or hasHeader: print >> output, rowSeparator; hasHeader=False
    return output.getvalue()

# written by Mike Brown
# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/148061
def wrap_onspace(text, width):
    """
    A word-wrap function that preserves existing line breaks
    and most spaces in the text. Expects that existing line
    breaks are posix newlines (\n).
    """
    return reduce(lambda line, word, width=width: '%s%s%s' %
                  (line,
                   ' \n'[(len(line[line.rfind('\n')+1:])
                         + len(word.split('\n',1)[0]
                              ) >= width)],
                   word),
                  text.split(' ')
                 )

import re
def wrap_onspace_strict(text, width):
    """Similar to wrap_onspace, but enforces the width constraint:
       words longer than width are split."""
    wordRegex = re.compile(r'\S{'+str(width)+r',}')
    return wrap_onspace(wordRegex.sub(lambda m: wrap_always(m.group(),width),text),width)

import math
def wrap_always(text, width):
    """A simple word-wrap function that wraps text on exactly width characters.
       It doesn't split the text in words."""
    return '\n'.join([ text[width*i:width*(i+1)] \
                       for i in xrange(int(math.ceil(1.*len(text)/width))) ])

if __name__ == '__main__':
    labels = ('First Name', 'Last Name', 'Age', 'Position')
    data = \
    '''John,Smith,24,Software Engineer
       Mary,Brohowski,23,Sales Manager
       Aristidis,Papageorgopoulos,28,Senior Reseacher'''
    rows = [row.strip().split(',')  for row in data.splitlines()]

    print 'Without wrapping function\n'
    print indent([labels]+rows, hasHeader=True)
    # test indent with different wrapping functions
    width = 10
    for wrapper in (wrap_always,wrap_onspace,wrap_onspace_strict):
        print 'Wrapping function: %s(x,width=%d)\n' % (wrapper.__name__,width)
        print indent([labels]+rows, hasHeader=True, separateRows=True,
                     prefix='| ', postfix=' |',
                     wrapfunc=lambda x: wrapper(x,width))

    # output:
    #
    #Without wrapping function
    #
    #First Name | Last Name        | Age | Position         
    #-------------------------------------------------------
    #John       | Smith            | 24  | Software Engineer
    #Mary       | Brohowski        | 23  | Sales Manager    
    #Aristidis  | Papageorgopoulos | 28  | Senior Reseacher 
    #
    #Wrapping function: wrap_always(x,width=10)
    #
    #----------------------------------------------
    #| First Name | Last Name  | Age | Position   |
    #----------------------------------------------
    #| John       | Smith      | 24  | Software E |
    #|            |            |     | ngineer    |
    #----------------------------------------------
    #| Mary       | Brohowski  | 23  | Sales Mana |
    #|            |            |     | ger        |
    #----------------------------------------------
    #| Aristidis  | Papageorgo | 28  | Senior Res |
    #|            | poulos     |     | eacher     |
    #----------------------------------------------
    #
    #Wrapping function: wrap_onspace(x,width=10)
    #
    #---------------------------------------------------
    #| First Name | Last Name        | Age | Position  |
    #---------------------------------------------------
    #| John       | Smith            | 24  | Software  |
    #|            |                  |     | Engineer  |
    #---------------------------------------------------
    #| Mary       | Brohowski        | 23  | Sales     |
    #|            |                  |     | Manager   |
    #---------------------------------------------------
    #| Aristidis  | Papageorgopoulos | 28  | Senior    |
    #|            |                  |     | Reseacher |
    #---------------------------------------------------
    #
    #Wrapping function: wrap_onspace_strict(x,width=10)
    #
    #---------------------------------------------
    #| First Name | Last Name  | Age | Position  |
    #---------------------------------------------
    #| John       | Smith      | 24  | Software  |
    #|            |            |     | Engineer  |
    #---------------------------------------------
    #| Mary       | Brohowski  | 23  | Sales     |
    #|            |            |     | Manager   |
    #---------------------------------------------
    #| Aristidis  | Papageorgo | 28  | Senior    |
    #|            | poulos     |     | Reseacher |
    #---------------------------------------------

The Python recipe page contains a few improvements on it.

Question 51

pandas based solution with creating dataframe:

import pandas as pd
l = [['a', 'b', 'c'], ['aaaaaaaaaa', 'b', 'c'], ['a', 'bbbbbbbbbb', 'c']]
df = pd.DataFrame(l)

print(df)
            0           1  2
0           a           b  c
1  aaaaaaaaaa           b  c
2           a  bbbbbbbbbb  c

To remove index and header values to create output what you want you could use to_string method:

result = df.to_string(index=False, header=False)

print(result)
          a           b  c
 aaaaaaaaaa           b  c
          a  bbbbbbbbbb  c

Question 52

Scolp is a new library that lets you pretty print streaming columnar data easily while auto-adjusting column width.

(Disclaimer: I am the author)

Question 53

This sets independent, best-fit column widths based on the max-metric used in other answers.

data = [['a', 'b', 'c'], ['aaaaaaaaaa', 'b', 'c'], ['a', 'bbbbbbbbbb', 'c']]
padding = 2
col_widths = [max(len(w) for w in [r[cn] for r in data]) + padding for cn in range(len(data[0]))]
format_string = "{{:{}}}{{:{}}}{{:{}}}".format(*col_widths)
for row in data:
    print(format_string.format(*row))

Question 54

For lazy people

that are using Python 3.* and Pandas/Geopandas; universal simple in-class approach (for ‘normal’ script just remove self):

Function colorize:

    def colorize(self,s,color):
        s = color+str(s)+"\033[0m"
        return s

Header:

print('{0:<23} {1:>24} {2:>26} {3:>26} {4:>11} {5:>11}'.format('Road name','Classification','Function','Form of road','Length','Distance') )

and then data from Pandas/Geopandas dataframe:

            for index, row in clipped.iterrows():
                rdName      = self.colorize(row['name1'],"\033[32m")
                rdClass     = self.colorize(row['roadClassification'],"\033[93m")
                rdFunction  = self.colorize(row['roadFunction'],"\033[33m")
                rdForm      = self.colorize(row['formOfWay'],"\033[94m")
                rdLength    = self.colorize(row['length'],"\033[97m")
                rdDistance  = self.colorize(row['distance'],"\033[96m")
                print('{0:<30} {1:>35} {2:>35} {3:>35} {4:>20} {5:>20}'.format(rdName,rdClass,rdFunction,rdForm,rdLength,rdDistance) )

Meaning of {0:<30} {1:>35} {2:>35} {3:>35} {4:>20} {5:>20}:

0, 1, 2, 3, 4, 5 -> columns, there are 6 in total in this case

30, 35, 20 -> width of column (note that you’ll have to add length of \033[96m – this for Python is a string as well), just experiment :)

>, < -> justify: right, left (there is = for filling with zeros as well)

If you want to distinct e.g. max value, you’ll have to switch to special Pandas style function, but suppose that’s far enough to present data on terminal window.

Result:

Question 55

A slight variation on a previous answer (I don’t have enough rep to comment on it). The format library lets you specify the width and alignment of an element but not where it starts, ie, you can say “be 20 columns wide” but not “start in column 20”. Which leads to this issue:

table_data = [
    ['a', 'b', 'c'],
    ['aaaaaaaaaa', 'b', 'c'], 
    ['a', 'bbbbbbbbbb', 'c']
]

print("first row: {: >20} {: >20} {: >20}".format(*table_data[0]))
print("second row: {: >20} {: >20} {: >20}".format(*table_data[1]))
print("third row: {: >20} {: >20} {: >20}".format(*table_data[2]))

Output

first row:                    a                    b                    c
second row:           aaaaaaaaaa                    b                    c
third row:                    a           bbbbbbbbbb                    c

The answer of course is to format the literal strings as well, which combines slightly weirdly with the format:

table_data = [
    ['a', 'b', 'c'],
    ['aaaaaaaaaa', 'b', 'c'], 
    ['a', 'bbbbbbbbbb', 'c']
]

print(f"{'first row:': <20} {table_data[0][0]: >20} {table_data[0][1]: >20} {table_data[0][2]: >20}")
print("{: <20} {: >20} {: >20} {: >20}".format(*['second row:', *table_data[1]]))
print("{: <20} {: >20} {: >20} {: >20}".format(*['third row:', *table_data[1]]))

Output

first row:                              a                    b                    c
second row:                    aaaaaaaaaa                    b                    c
third row:                     aaaaaaaaaa                    b                    c

Question 56

I found this answer super-helpful and elegant, originally from here:

matrix = [["A", "B"], ["C", "D"]]

print('\n'.join(['\t'.join([str(cell) for cell in row]) for row in matrix]))

Output

A   B
C   D

Question 57

Here is a variation of the Shawn Chin’s answer. The width is fixed per column, not over all columns. There is also a border below the first row and between the columns. (icontract library is used to enforce the contracts.)

@icontract.pre(
    lambda table: not table or all(len(row) == len(table[0]) for row in table))
@icontract.post(lambda table, result: result == "" if not table else True)
@icontract.post(lambda result: not result.endswith("\n"))
def format_table(table: List[List[str]]) -> str:
    """
    Format the table as equal-spaced columns.

    :param table: rows of cells
    :return: table as string
    """
    cols = len(table[0])

    col_widths = [max(len(row[i]) for row in table) for i in range(cols)]

    lines = []  # type: List[str]
    for i, row in enumerate(table):
        parts = []  # type: List[str]

        for cell, width in zip(row, col_widths):
            parts.append(cell.ljust(width))

        line = " | ".join(parts)
        lines.append(line)

        if i == 0:
            border = []  # type: List[str]

            for width in col_widths:
                border.append("-" * width)

            lines.append("-+-".join(border))

    result = "\n".join(lines)

    return result

Here is an example:

>>> table = [['column 0', 'another column 1'], ['00', '01'], ['10', '11']]
>>> result = packagery._format_table(table=table)
>>> print(result)
column 0 | another column 1
---------+-----------------
00       | 01              
10       | 11

Question 58

updated @Franck Dernoncourt fancy recipe to be python 3 and PEP8 compliant

import io
import math
import operator
import re
import functools

from itertools import zip_longest


def indent(
    rows,
    has_header=False,
    header_char="-",
    delim=" | ",
    justify="left",
    separate_rows=False,
    prefix="",
    postfix="",
    wrapfunc=lambda x: x,
):
    """Indents a table by column.
       - rows: A sequence of sequences of items, one sequence per row.
       - hasHeader: True if the first row consists of the columns' names.
       - headerChar: Character to be used for the row separator line
         (if hasHeader==True or separateRows==True).
       - delim: The column delimiter.
       - justify: Determines how are data justified in their column.
         Valid values are 'left','right' and 'center'.
       - separateRows: True if rows are to be separated by a line
         of 'headerChar's.
       - prefix: A string prepended to each printed row.
       - postfix: A string appended to each printed row.
       - wrapfunc: A function f(text) for wrapping text; each element in
         the table is first wrapped by this function."""

    # closure for breaking logical rows to physical, using wrapfunc
    def row_wrapper(row):
        new_rows = [wrapfunc(item).split("\n") for item in row]
        return [[substr or "" for substr in item] for item in zip_longest(*new_rows)]

    # break each logical row into one or more physical ones
    logical_rows = [row_wrapper(row) for row in rows]
    # columns of physical rows
    columns = zip_longest(*functools.reduce(operator.add, logical_rows))
    # get the maximum of each column by the string length of its items
    max_widths = [max([len(str(item)) for item in column]) for column in columns]
    row_separator = header_char * (
        len(prefix) + len(postfix) + sum(max_widths) + len(delim) * (len(max_widths) - 1)
    )
    # select the appropriate justify method
    justify = {"center": str.center, "right": str.rjust, "left": str.ljust}[
        justify.lower()
    ]
    output = io.StringIO()
    if separate_rows:
        print(output, row_separator)
    for physicalRows in logical_rows:
        for row in physicalRows:
            print( output, prefix + delim.join(
                [justify(str(item), width) for (item, width) in zip(row, max_widths)]
            ) + postfix)
        if separate_rows or has_header:
            print(output, row_separator)
            has_header = False
    return output.getvalue()


# written by Mike Brown
# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/148061
def wrap_onspace(text, width):
    """
    A word-wrap function that preserves existing line breaks
    and most spaces in the text. Expects that existing line
    breaks are posix newlines (\n).
    """
    return functools.reduce(
        lambda line, word, i_width=width: "%s%s%s"
        % (
            line,
            " \n"[
                (
                    len(line[line.rfind("\n") + 1 :]) + len(word.split("\n", 1)[0])
                    >= i_width
                )
            ],
            word,
        ),
        text.split(" "),
    )


def wrap_onspace_strict(text, i_width):
    """Similar to wrap_onspace, but enforces the width constraint:
       words longer than width are split."""
    word_regex = re.compile(r"\S{" + str(i_width) + r",}")
    return wrap_onspace(
        word_regex.sub(lambda m: wrap_always(m.group(), i_width), text), i_width
    )


def wrap_always(text, width):
    """A simple word-wrap function that wraps text on exactly width characters.
       It doesn't split the text in words."""
    return "\n".join(
        [
            text[width * i : width * (i + 1)]
            for i in range(int(math.ceil(1.0 * len(text) / width)))
        ]
    )


if __name__ == "__main__":
    labels = ("First Name", "Last Name", "Age", "Position")
    data = """John,Smith,24,Software Engineer
           Mary,Brohowski,23,Sales Manager
           Aristidis,Papageorgopoulos,28,Senior Reseacher"""
    rows = [row.strip().split(",") for row in data.splitlines()]

    print("Without wrapping function\n")
    print(indent([labels] + rows, has_header=True))

    # test indent with different wrapping functions
    width = 10
    for wrapper in (wrap_always, wrap_onspace, wrap_onspace_strict):
        print("Wrapping function: %s(x,width=%d)\n" % (wrapper.__name__, width))

        print(
            indent(
                [labels] + rows,
                has_header=True,
                separate_rows=True,
                prefix="| ",
                postfix=" |",
                wrapfunc=lambda x: wrapper(x, width),
            )
        )

    # output:
    #
    # Without wrapping function
    #
    # First Name | Last Name        | Age | Position
    # -------------------------------------------------------
    # John       | Smith            | 24  | Software Engineer
    # Mary       | Brohowski        | 23  | Sales Manager
    # Aristidis  | Papageorgopoulos | 28  | Senior Reseacher
    #
    # Wrapping function: wrap_always(x,width=10)
    #
    # ----------------------------------------------
    # | First Name | Last Name  | Age | Position   |
    # ----------------------------------------------
    # | John       | Smith      | 24  | Software E |
    # |            |            |     | ngineer    |
    # ----------------------------------------------
    # | Mary       | Brohowski  | 23  | Sales Mana |
    # |            |            |     | ger        |
    # ----------------------------------------------
    # | Aristidis  | Papageorgo | 28  | Senior Res |
    # |            | poulos     |     | eacher     |
    # ----------------------------------------------
    #
    # Wrapping function: wrap_onspace(x,width=10)
    #
    # ---------------------------------------------------
    # | First Name | Last Name        | Age | Position  |
    # ---------------------------------------------------
    # | John       | Smith            | 24  | Software  |
    # |            |                  |     | Engineer  |
    # ---------------------------------------------------
    # | Mary       | Brohowski        | 23  | Sales     |
    # |            |                  |     | Manager   |
    # ---------------------------------------------------
    # | Aristidis  | Papageorgopoulos | 28  | Senior    |
    # |            |                  |     | Reseacher |
    # ---------------------------------------------------
    #
    # Wrapping function: wrap_onspace_strict(x,width=10)
    #
    # ---------------------------------------------
    # | First Name | Last Name  | Age | Position  |
    # ---------------------------------------------
    # | John       | Smith      | 24  | Software  |
    # |            |            |     | Engineer  |
    # ---------------------------------------------
    # | Mary       | Brohowski  | 23  | Sales     |
    # |            |            |     | Manager   |
    # ---------------------------------------------
    # | Aristidis  | Papageorgo | 28  | Senior    |
    # |            | poulos     |     | Reseacher |
    # ---------------------------------------------

Question 59

I realize this question is old but I didn’t understand Antak’s answer and didn’t want to use a library so I rolled my own solution.

Solution assumes records is a 2D array, records are all the same length, and that fields are all strings.

def stringifyRecords(records):
    column_widths = [0] * len(records[0])
    for record in records:
        for i, field in enumerate(record):
            width = len(field)
            if width > column_widths[i]: column_widths[i] = width

    s = ""
    for record in records:
        for column_width, field in zip(column_widths, record):
            s += field.ljust(column_width+1)
        s += "\n"

    return s

Question 60

I have code which looks something like this:

thing_index = thing_list.index(thing)
otherfunction(thing_list, thing_index)

ok so that’s simplified but you get the idea. Now thing might not actually be in the list, in which case I want to pass -1 as thing_index. In other languages this is what you’d expect index() to return if it couldn’t find the element. In fact it throws a ValueError.

I could do this:

try:
    thing_index = thing_list.index(thing)
except ValueError:
    thing_index = -1
otherfunction(thing_list, thing_index)

But this feels dirty, plus I don’t know if ValueError could be raised for some other reason. I came up with the following solution based on generator functions, but it seems a little complex:

thing_index = ( [(i for i in xrange(len(thing_list)) if thing_list[i]==thing)] or [-1] )[0]

Is there a cleaner way to achieve the same thing? Let’s assume the list isn’t sorted.

Question 61

There is nothing “dirty” about using try-except clause. This is the pythonic way. ValueError will be raised by the .index method only, because it’s the only code you have there!

To answer the comment:
In Python, easier to ask forgiveness than to get permission philosophy is well established, and no index will not raise this type of error for any other issues. Not that I can think of any.

Question 62

thing_index = thing_list.index(elem) if elem in thing_list else -1

One line. Simple. No exceptions.

Question 63

The dict type has a get function, where if the key doesn’t exist in the dictionary, the 2nd argument to get is the value that it should return. Similarly there is setdefault, which returns the value in the dict if the key exists, otherwise it sets the value according to your default parameter and then returns your default parameter.

You could extend the list type to have a getindexdefault method.

class SuperDuperList(list):
    def getindexdefault(self, elem, default):
        try:
            thing_index = self.index(elem)
            return thing_index
        except ValueError:
            return default

Which could then be used like:

mylist = SuperDuperList([0,1,2])
index = mylist.getindexdefault( 'asdf', -1 )

Question 64

There is nothing wrong with your code that uses ValueError. Here’s yet another one-liner if you’d like to avoid exceptions:

thing_index = next((i for i, x in enumerate(thing_list) if x == thing), -1)

问题：合并两个列表并删除重复项，而不删除原始列表中的重复项

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

问题：Python方式打印列表项

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

问题：如何检查以下所有项目是否都在列表中？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

问题：查找列表元素之间的差异

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

问题：在python中创建漂亮的列输出

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

对于懒惰的人

结果：

For lazy people

Result:

回答 11

回答 12

回答 13

回答 14

回答 15

问题：在python中处理list.index（可能不存在）的最佳方法？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

问题：[]和{}与list（）和dict（），哪个更好？

回答 0

回答 1

回答 2

问题：索引所有除外 python中的一项