Is there a portable way to get the current user’s username in Python (i.e., one that works under both Linux and Windows, at least)? It would work like os.getuid:
I googled around and was surprised not to find a definitive answer (although perhaps I was just googling poorly). The pwd module provides a relatively easy way to achieve this under, say, Linux, but it is not present on Windows. Some of the search results suggested that getting the username under Windows can be complicated in certain circumstances (e.g., running as a Windows service), although I haven’t verified that.
P.S. Per the comment below: “this function looks at the values of various environment variables to determine the user name. Therefore, this function should not be relied on for access control purposes (or possibly any other purpose, since it allows any user to impersonate any other).”
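The quoted warning appears to describe getpass.getuser() from the standard library, which consults environment variables first. A minimal Python 3 sketch of using it defensively follows; the wrapper name current_username and its default parameter are assumptions made for illustration, not part of any library.

```python
import getpass


def current_username(default="unknown"):
    """Return the login name reported by getpass.getuser().

    getpass.getuser() reads environment variables first, so the result
    is a convenience value only and must not be used for access control.
    """
    try:
        return getpass.getuser()
    except Exception:
        # Raised when no environment variable is set and no password
        # database entry is available (for example in a bare container).
        return default


print(current_username())
```

The fallback value keeps the call from failing in stripped-down environments; whether a placeholder is acceptable depends on the application.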
To me, using the os module looks best for portability; it works on both Linux and Windows.
import os
# Gives user's home directory
userhome = os.path.expanduser('~')
print "User's home Dir: " + userhome
# Gives username by splitting path based on OS
print "username: " + os.path.split(userhome)[-1]
On UNIX it returns your username, but on Windows, it returns your user’s group, slash, your username.
—
I.E.
UNIX returns: “username”
Windows returns: “domain/username”
—
It’s interesting, but probably not ideal unless you are doing something in the terminal anyway, in which case you would probably be using os.system to begin with. For example, a while ago I needed to add my user to a group, so I did the following (this is on Linux, mind you):
import os
os.system("sudo usermod -aG \"group_name\" $(whoami)")
print "You have been added to \"group_name\"! Please log out for this to take effect"
I feel like that is easier to read and you don’t have to import pwd or getpass.
I also feel like having “domain/user” could be helpful in certain applications in Windows.
I wrote the plx module some time ago to get the user name in a portable way on Unix and Windows (among other things):
http://www.decalage.info/en/python/plx
Usage:
import plx
username = plx.get_username()
(it requires win32 extensions on Windows)
Answer 10
Using only the standard Python library:
from os import environ, getcwd
getUser = lambda: environ["USERNAME"] if "C:" in getcwd() else environ["USER"]
user = getUser()
Works on Windows, Mac, or Linux.
Alternatively, you can drop one line by invoking the lambda immediately:
from os import environ, getcwd
user = (lambda: environ["USERNAME"] if "C:" in getcwd() else environ["USER"])()
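Checking for "C:" in the working directory is a heuristic that can misfire, for instance when the script runs from another drive. A sketch of the same environment-variable idea keyed off os.name instead is shown here; the function name username_from_env and its two parameters are hypothetical and exist only to make the example testable.

```python
import os


def username_from_env(env=None, os_name=None):
    """Look up the user name in the environment, or return None.

    env and os_name default to os.environ and os.name; they are
    parameters only so the lookup can be exercised with fake values.
    """
    env = os.environ if env is None else env
    os_name = os.name if os_name is None else os_name
    # Windows conventionally sets USERNAME; POSIX shells set USER.
    key = "USERNAME" if os_name == "nt" else "USER"
    return env.get(key)


print(username_from_env())
```

Like any environment-based lookup, the result can be changed by the user and is unsuitable for access control.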
I’m trying to port an open-source library to Python 3. (SymPy, if anyone is wondering.)
So, I need to run 2to3 automatically when building for Python 3. To do that, I need to use distribute. Therefore, I need to port the current system, which (according to the doctest) is distutils.
The Problem
Unfortunately, I’m not sure what the difference is between these modules: distutils, distribute, setuptools. The documentation is sketchy at best, as they all seem to be forks of one another, intended to be compatible in most circumstances (but actually, not all)… and so on, and so forth.
The Question
Could someone explain the differences? What am I supposed to use? What is the most modern solution? (As an aside, I’d also appreciate some guide on porting to Distribute, but that’s a tad beyond the scope of the question…)
As of March 2020, most of the other answers to this question are several years out-of-date. When you come across advice on Python packaging issues, remember to look at the date of publication, and don’t trust out-of-date information.
The Python Packaging User Guide is worth a read. Every page has a “last updated” date displayed, so you can check the recency of the manual, and it’s quite comprehensive. The fact that it’s hosted on a subdomain of python.org of the Python Software Foundation just adds credence to it. The Project Summaries page is especially relevant here.
Summary of tools:
Here’s a summary of the Python packaging landscape:
Supported tools:
distutils is still the standard tool for packaging in Python. It is included in the standard library (Python 2 and Python 3). It is useful for simple Python distributions, but lacks features. It introduces the distutils Python package that can be imported in your setup.py script.
setuptools was developed to overcome Distutils’ limitations, and is not included in the standard library. It introduced a command-line utility called easy_install. It also introduced the setuptools Python package that can be imported in your setup.py script, and the pkg_resources Python package that can be imported in your code to locate data files installed with a distribution. One of its gotchas is that it monkey-patches the distutils Python package. It should work well with pip. It sees regular releases.
scikit-build is an improved build system generator that internally uses CMake to build compiled Python extensions. Because scikit-build isn’t based on distutils, it doesn’t really have any of its limitations. When ninja-build is present, scikit-build can compile large projects over three times faster than the alternatives. It should work well with pip.
Deprecated/abandoned tools:
distribute was a fork of setuptools. It shared the same namespace, so if you had Distribute installed, import setuptools would actually import the package distributed with Distribute. Distribute was merged back into Setuptools 0.7, so you don’t need to use Distribute any more. In fact, the version on PyPI is just a compatibility layer that installs Setuptools.
distutils2 was an attempt to take the best of distutils, setuptools and distribute and become the standard tool included in Python’s standard library. The idea was that distutils2 would be distributed for old Python versions, and that distutils2 would be renamed to packaging for Python 3.3, which would include it in its standard library. These plans did not go as intended, however, and currently, distutils2 is an abandoned project. The latest release was in March 2012, and its PyPI home page has finally been updated to reflect its death.
Others:
There are other tools; if you are interested, read the Project Summaries page in the Python Packaging User Guide. I won’t list them all, so as not to repeat that page and to keep the answer focused on the question, which was only about distribute, distutils, setuptools and distutils2.
Recommendation:
If all of this is new to you, and you don’t know where to start, I would recommend learning setuptools, along with pip and virtualenv, which all work very well together.
I’m a distutils maintainer and distutils2/packaging contributor. I did a talk about Python packaging at ConFoo 2011 and these days I’m writing an extended version of it. It’s not published yet, so here are excerpts that should help define things.
Distutils is the standard tool used for packaging. It works rather well for simple needs, but is limited and not trivial to extend.
Setuptools is a project born from the desire to fill missing distutils functionality and explore new directions. In some subcommunities, it’s a de facto standard. It uses monkey-patching and magic that is frowned upon by Python core developers.
Distribute is a fork of Setuptools that was started by developers feeling that its development pace was too slow and that it was not possible to evolve it. Its development was considerably slowed when distutils2 was started by the same group. 2013-August update: distribute is merged back into setuptools and discontinued.
Distutils2 is a new distutils library, started as a fork of the distutils codebase, with good ideas taken from Setuptools (of which some were thoroughly discussed in PEPs), and a basic installer inspired by pip. The actual name you use to import Distutils2 is packaging in the Python 3.3+ standard library, or distutils2 in 2.4+ and 3.1–3.2. (A backport will be available soon.) Distutils2 did not make the Python 3.3 release, and it was put on hold.
NOTE: Answer deprecated, Distribute now obsolete. This answer is no longer valid since the Python Packaging Authority was formed and has done a lot of work cleaning this up.
Yep, you got it. :-o I think at this time the preferred package is Distribute, which is a fork of setuptools, which is an extension of distutils (the original packaging system). Setuptools was not being maintained, so it was forked and renamed; however, when installed it uses the package name of setuptools! I think most Python developers now use Distribute, and I can say for sure that I do.
I realize that I have replied to your secondary question without addressing unquestioned assumptions in your original problem:
I’m trying to port an open-source library (SymPy, if anyone is wondering) to Python 3. To do this, I need to run 2to3 automatically when building for Python 3.
Updating this question in late 2014 where fortunately the Python packaging chaos has been greatly cleaned up by Continuum’s “conda” package manager.
In particular, conda quickly enables the creation of conda “environments”. You can configure your environments with different versions of Python. For example:
conda create -n py34 python=3.4 anaconda
conda create -n py26 python=2.6 anaconda
will create two (“py34” or “py26”) Python environments with different versions of Python.
Afterwards you can invoke the environment with the specific version of Python with:
source activate <env name>
This feature seems especially useful in your case, where you have to deal with different versions of Python.
Moreover, conda has the following features:
Python agnostic
Cross platform
No admin privileges required
Smart dependency management (by way of a SAT solver)
Nicely deals with C, Fortran and system level libraries that you may have to link against
That last point is especially important if you are in the scientific computing arena.
If you only have one reference to a string and you concatenate another string to the end, CPython now special cases this and tries to extend the string in place.
The end result is that the operation is amortized O(n).
e.g.
s = ""
for i in range(n):
    s += str(i)
used to be O(n^2), but now it is O(n).
From the source (bytesobject.c):
void
PyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)
{
    PyBytes_Concat(pv, w);
    Py_XDECREF(w);
}
/* The following function breaks the notion that strings are immutable:
   it changes the size of a string. We get away with this only if there
   is only one module referencing the object. You can also think of it
   as creating a new string object and destroying the old one, only
   more efficiently. In any case, don't use this if the string may
   already be known to some other part of the code...
   Note that if there's not enough memory to resize the string, the original
   string object at *pv is deallocated, *pv is set to NULL, an "out of
   memory" exception is set, and -1 is returned. Else (on success) 0 is
   returned, and the value in *pv may or may not be the same as on input.
   As always, an extra byte is allocated for a trailing \0 byte (newsize
   does *not* include that), and a trailing \0 byte is stored.
*/
int
_PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)
{
    register PyObject *v;
    register PyBytesObject *sv;
    v = *pv;
    if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {
        *pv = 0;
        Py_DECREF(v);
        PyErr_BadInternalCall();
        return -1;
    }
    /* XXX UNREF/NEWREF interface should be more symmetrical */
    _Py_DEC_REFTOTAL;
    _Py_ForgetReference(v);
    *pv = (PyObject *)
        PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);
    if (*pv == NULL) {
        PyObject_Del(v);
        PyErr_NoMemory();
        return -1;
    }
    _Py_NewReference(*pv);
    sv = (PyBytesObject *) *pv;
    Py_SIZE(sv) = newsize;
    sv->ob_sval[newsize] = '\0';
    sv->ob_shash = -1; /* invalidate cached hash value */
    return 0;
}
It’s easy enough to verify empirically.
$ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
1000000 loops, best of 3: 1.85 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
10000 loops, best of 3: 16.8 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
10000 loops, best of 3: 158 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
1000 loops, best of 3: 1.71 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 14.6 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
10 loops, best of 3: 173 msec per loop
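The same measurement can be scripted with the timeit module rather than typed at the shell. This is a rough Python 3 sketch (range instead of xrange); the helper name time_concat is an assumption, and the numbers it prints will differ by machine and interpreter.

```python
import timeit


def time_concat(n, repeat=3):
    """Return the best-of-`repeat` time in seconds for n appends via +=."""
    def build():
        s = ""
        for _ in range(n):
            s += "a"
        return s

    # number=1 runs the loop once per measurement; min() takes the best run.
    return min(timeit.repeat(build, number=1, repeat=repeat))


for n in (10, 100, 1000, 10000, 100000):
    print(n, time_concat(n))
```

If the time grows roughly tenfold for each tenfold increase in n, the loop is behaving linearly on that interpreter.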
It’s important, however, to note that this optimisation isn’t part of the Python spec. It’s only in the CPython implementation, as far as I know. The same empirical testing on PyPy or Jython, for example, might show the older O(n**2) performance.
$ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
10000 loops, best of 3: 90.8 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
1000 loops, best of 3: 896 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
100 loops, best of 3: 9.03 msec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
10 loops, best of 3: 89.5 msec per loop
So far so good, but then,
$ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 12.8 sec per loop
Ouch, even worse than quadratic. So PyPy is doing something that works well with short strings but performs poorly for larger strings.
Don’t prematurely optimize. If you have no reason to believe there’s a speed bottleneck caused by string concatenations then just stick with + and +=:
s = 'foo'
s += 'bar'
s += 'baz'
That said, if you’re aiming for something like Java’s StringBuilder, the canonical Python idiom is to add items to a list and then use str.join to concatenate them all at the end:
l = []
l.append('foo')
l.append('bar')
l.append('baz')
s = ''.join(l)
That joins str1 and str2 with a space as the separator. You can also do "".join([str1, str2, ...]). str.join() takes a single iterable, so you have to put the strings in a list or a tuple.
That’s about as efficient as it gets for a builtin method.
If you need to do many append operations to build a large string, you can use StringIO or cStringIO. The interface is like a file, i.e., you write to it to append text.
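As a sketch of the file-like approach: in Python 3 the StringIO and cStringIO modules are replaced by io.StringIO, which is what this example uses. The helper name build_with_stringio is an assumption for illustration.

```python
import io


def build_with_stringio(pieces):
    """Append every piece to an in-memory text buffer and return the result."""
    buffer = io.StringIO()
    for piece in pieces:
        buffer.write(piece)  # write() appends at the current position
    return buffer.getvalue()


print(build_with_stringio(["foo", "bar", "baz"]))
```

For a plain list of strings, ''.join(pieces) does the same job; the buffer is mainly useful when the pieces are produced by code that already expects a file-like object.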
If you’re just appending two strings then just use +.
It really depends on your application. If you’re looping through hundreds of words and want to append them all into a list, .join() is better. But if you’re putting together a long sentence, you’re better off using +=.
Answer 7
There is basically no difference. The only consistent trend is that Python seems to be getting slower with every version... :(
List
%%timeit
x = []
for i in range(100000000):  # xrange on Python 2.7
    x.append('a')
x = ''.join(x)
Python 2.7
1 loop, best of 3: 7.34 s per loop
Python 3.4
1 loop, best of 3: 7.99 s per loop
Python 3.5
1 loop, best of 3: 8.48 s per loop
Python 3.6
1 loop, best of 3: 9.93 s per loop
String
%%timeit
x = ''
for i in range(100000000):  # xrange on Python 2.7
    x += 'a'
If you’re using an older Python (<2.6) for some reason or are just curious to know how it works, here’s one nice approach, taken from http://code.activestate.com/recipes/252178/:
def all_perms(elements):
    if len(elements) <= 1:
        yield elements
    else:
        for perm in all_perms(elements[1:]):
            for i in range(len(elements)):
                # nb elements[0:1] works in both string and list contexts
                yield perm[:i] + elements[0:1] + perm[i:]
A couple of alternative approaches are listed in the documentation of itertools.permutations. Here’s one:
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
And another, based on itertools.product:
from itertools import product
def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)
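On Python 2.6 and later there is no need to reimplement any of this, because itertools.permutations is available directly. A short Python 3 usage sketch follows; the variable names are illustrative only.

```python
from itertools import permutations

items = [1, 2, 3]

# All orderings of the full list, produced lazily as tuples.
all_orderings = list(permutations(items))

# Passing r yields the ordered selections of length r instead.
pairs = list(permutations(items, 2))

print(all_orderings)
print(pairs)
```

The function treats elements as distinct by position, so repeated values in the input produce repeated tuples in the output.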
#!/usr/bin/env python
def perm(a, k=0):
    if k == len(a):
        print a
    else:
        for i in xrange(k, len(a)):
            a[k], a[i] = a[i], a[k]
            perm(a, k+1)
            a[k], a[i] = a[i], a[k]
perm([1,2,3])
Because I’m swapping the contents of the list, a mutable sequence type is required as input. E.g., perm(list("ball")) will work and perm("ball") won’t, because you can’t change a string.
This Python implementation is inspired by the algorithm presented in the book Computer Algorithms by Horowitz, Sahni and Rajasekaran.
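The same swap-and-recurse idea can be written as a generator, so that callers receive the permutations instead of having them printed. This is a Python 3 sketch rather than the book's version; the name swap_perms is an assumption.

```python
def swap_perms(a, k=0):
    """Yield every permutation of the mutable list a by swapping in place."""
    if k == len(a):
        yield list(a)  # copy, because a keeps changing afterwards
    else:
        for i in range(k, len(a)):
            a[k], a[i] = a[i], a[k]      # choose a[i] for position k
            yield from swap_perms(a, k + 1)
            a[k], a[i] = a[i], a[k]      # undo the swap


for p in swap_perms(list("abc")):
    print("".join(p))
```

Because every swap is undone, the input list is back in its original order once the generator is exhausted.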
This solution implements a generator, to avoid holding all the permutations in memory:
def permutations (orig_list):
    if not isinstance(orig_list, list):
        orig_list = list(orig_list)
    yield orig_list
    if len(orig_list) == 1:
        return
    for n in sorted(orig_list):
        new_list = orig_list[:]
        pos = new_list.index(n)
        del(new_list[pos])
        new_list.insert(0, n)
        for resto in permutations(new_list[1:]):
            if new_list[:1] + resto != orig_list:
                yield new_list[:1] + resto
Answer 6
In a functional style:
def addperm(x,l):
    return [ l[0:i] + [x] + l[i:] for i in range(len(l)+1) ]
def perm(l):
    if len(l) == 0:
        return [[]]
    return [x for y in perm(l[1:]) for x in addperm(l[0],y) ]
print perm([ i for i in range(3)])
The following code is an in-place permutation of a given list, implemented as a generator. Since it only returns references to the list, the list should not be modified outside the generator.
The solution is non-recursive, so it uses little memory. It also works well with multiple copies of elements in the input list.
def permute_in_place(a):
    a.sort()
    yield list(a)
    if len(a) <= 1:
        return
    first = 0
    last = len(a)
    while 1:
        i = last - 1
        while 1:
            i = i - 1
            if a[i] < a[i+1]:
                j = last - 1
                while not (a[i] < a[j]):
                    j = j - 1
                a[i], a[j] = a[j], a[i] # swap the values
                r = a[i+1:last]
                r.reverse()
                a[i+1:last] = r
                yield list(a)
                break
            if i == first:
                a.reverse()
                return
if __name__ == '__main__':
    for n in range(5):
        for a in permute_in_place(range(1, n+1)):
            print a
        print
    for a in permute_in_place([0, 0, 1, 1, 1]):
        print a
    print
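The listing above is Python 2. A compact Python 3 sketch of the same lexicographic "next permutation" step is given here; it works on a sorted copy rather than in place, and the name unique_permutations is an assumption. Repeated elements produce each distinct arrangement once.

```python
def unique_permutations(items):
    """Yield the distinct permutations of items in lexicographic order."""
    a = sorted(items)
    n = len(a)
    while True:
        yield list(a)
        # Find the rightmost index i with a[i] < a[i + 1].
        i = n - 2
        while i >= 0 and a[i] >= a[i + 1]:
            i -= 1
        if i < 0:
            return  # a is in descending order: that was the last one
        # Find the rightmost element greater than a[i] and swap them.
        j = n - 1
        while a[j] <= a[i]:
            j -= 1
        a[i], a[j] = a[j], a[i]
        # Reverse the tail so it becomes the smallest possible suffix.
        a[i + 1:] = reversed(a[i + 1:])


for p in unique_permutations([0, 0, 1, 1, 1]):
    print(p)
```

Only one working list is kept between steps, so memory use stays proportional to the input length.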
Answer 8
One fairly obvious way, in my opinion, might be:
def permutList(l):
    if not l:
        return [[]]
    res = []
    for e in l:
        temp = l[:]
        temp.remove(e)
        res.extend([[e] + r for r in permutList(temp)])
    return res
list2Perm = [1, 2.0, 'three']
listPerm = [[a, b, c]
for a in list2Perm
for b in list2Perm
for c in list2Perm
if ( a != b and b != c and a != c )
]
print listPerm
I used an algorithm based on the factorial number system. For a list of length n, you can assemble each permutation item by item, selecting from the items left at each stage. You have n choices for the first item, n-1 for the second, and only one for the last, so you can use the digits of a number in the factorial number system as the indices. This way the numbers 0 through n!-1 correspond to all possible permutations in lexicographic order.
from math import factorial
def permutations(l):
    permutations=[]
    length=len(l)
    for x in xrange(factorial(length)):
        available=list(l)
        newPermutation=[]
        for radix in xrange(length, 0, -1):
            placeValue=factorial(radix-1)
            index=x/placeValue
            newPermutation.append(available.pop(index))
            x-=index*placeValue
        permutations.append(newPermutation)
    return permutations
permutations(range(3))
This method is non-recursive, but it is slightly slower on my computer and xrange raises an error when n! is too large to be converted to a C long integer (n=13 for me). It was enough when I needed it, but it’s no itertools.permutations by a long shot.
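Since each number from 0 to n!-1 maps to one permutation, a single permutation can also be computed directly from its index without building the whole list. This is a Python 3 sketch of that mapping; the function name nth_permutation is an assumption, and divmod stands in for the separate division and subtraction used above.

```python
from math import factorial


def nth_permutation(items, k):
    """Return the k-th permutation of items (0-based, lexicographic by position)."""
    available = list(items)
    n = len(available)
    if not 0 <= k < factorial(n):
        raise IndexError("k must be in range(factorial(len(items)))")
    result = []
    for radix in range(n, 0, -1):
        # The leading factorial-base digit selects one of the remaining items.
        index, k = divmod(k, factorial(radix - 1))
        result.append(available.pop(index))
    return result


print([nth_permutation([0, 1, 2], k) for k in range(6)])
```

Python 3 integers are unbounded, so this version does not hit the C long limit mentioned above, although enumerating every index is still factorial work.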
Note that this algorithm has factorial time complexity, O(n!), where n is the length of the input list.
It prints the results as it runs:
global result
result = []
def permutation(li):
    if li == [] or li == None:
        return
    if len(li) == 1:
        result.append(li[0])
        print result
        result.pop()
        return
    for i in range(0,len(li)):
        result.append(li[i])
        permutation(li[:i] + li[i+1:])
        result.pop()
One can indeed iterate over the first element of each permutation, as in tzwenn’s answer. It is however more efficient to write this solution this way:
def all_perms(elements):
    if len(elements) <= 1:
        yield elements # Only permutation possible = no permutation
    else:
        # Iteration over the first element in the result permutation:
        for (index, first_elmt) in enumerate(elements):
            other_elmts = elements[:index]+elements[index+1:]
            for permutation in all_perms(other_elmts):
                yield [first_elmt] + permutation
This solution is about 30 % faster, apparently thanks to the recursion ending at len(elements) <= 1 instead of 0.
It is also much more memory-efficient, as it uses a generator function (through yield), like in Riccardo Reyes’s solution.
This is inspired by the Haskell implementation using list comprehension:
def permutation(list):
    if len(list) == 0:
        return [[]]
    else:
        return [[x] + ys for x in list for ys in permutation(delete(list, x))]
def delete(list, item):
    lc = list[:]
    lc.remove(item)
    return lc
Regular implementation (no yield – will do everything in memory):
def getPermutations(array):
    if len(array) == 1:
        return [array]
    permutations = []
    for i in range(len(array)):
        # get all perm's of subarray w/o current item
        perms = getPermutations(array[:i] + array[i+1:])
        for p in perms:
            permutations.append([array[i], *p])
    return permutations
Yield implementation:
def getPermutations(array):
    if len(array) == 1:
        yield array
    else:
        for i in range(len(array)):
            perms = getPermutations(array[:i] + array[i+1:])
            for p in perms:
                yield [array[i], *p]
The basic idea is to go over all the elements in the array for the 1st position, and then in 2nd position go over all the rest of the elements without the chosen element for the 1st, etc. You can do this with recursion, where the stop criteria is getting to an array of 1 element – in which case you return that array.
For performance, a numpy solution inspired by Knuth (p. 22):
from numpy import empty, uint8
from math import factorial
def perms(n):
    f = 1
    p = empty((2*n-1, factorial(n)), uint8)
    for i in range(n):
        p[i, :f] = i
        p[i+1:2*i+1, :f] = p[:i, :f] # build the blocks
        for j in range(i):
            p[:i+1, f*(j+1):f*(j+2)] = p[j+1:j+i+2, :f] # copy the blocks
        f = f*(i+1)
    return p[:n, :]
Copying large blocks of memory saves time;
it’s 20x faster than list(itertools.permutations(range(n))):
In [1]: %timeit -n10 list(permutations(range(10)))
10 loops, best of 3: 815 ms per loop
In [2]: %timeit -n100 perms(10)
100 loops, best of 3: 40 ms per loop
from __future__ import print_function
def perm(n):
    p = []
    for i in range(0,n+1):
        p.append(i)
    while True:
        for i in range(1,n+1):
            print(p[i], end=' ')
        print("")
        i = n - 1
        found = 0
        while (not found and i>0):
            if p[i]<p[i+1]:
                found = 1
            else:
                i = i - 1
        k = n
        while p[i]>p[k]:
            k = k - 1
        aux = p[i]
        p[i] = p[k]
        p[k] = aux
        # integer division so the bound is an int on Python 3 as well
        for j in range(1,(n-i)//2+1):
            aux = p[i+j]
            p[i+j] = p[n-j+1]
            p[n-j+1] = aux
        if not found:
            break
perm(5)
def permute(xs, low=0):
    if low + 1 >= len(xs):
        yield xs
    else:
        for p in permute(xs, low + 1):
            yield p
        for i in range(low + 1, len(xs)):
            xs[low], xs[i] = xs[i], xs[low]
            for p in permute(xs, low + 1):
                yield p
            xs[low], xs[i] = xs[i], xs[low]
for p in permute([1, 2, 3, 4]):
    print p
This algorithm is the most efficient one; it avoids passing and manipulating arrays in recursive calls, and it works in both Python 2 and 3:
def permute(items):
    length = len(items)
    def inner(ix=[]):
        do_yield = len(ix) == length - 1
        for i in range(0, length):
            if i in ix: #avoid duplicates
                continue
            if do_yield:
                yield tuple([items[y] for y in ix + [i]])
            else:
                for p in inner(ix + [i]):
                    yield p
    return inner()
Usage:
for p in permute((1,2,3)):
    print(p)
(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)
def pzip(c, seq):
    result = []
    for item in seq:
        for i in range(len(item)+1):
            result.append(item[i:]+c+item[:i])
    return result
def perm(line):
    seq = [c for c in line]
    if len(seq) <= 1:
        return seq
    else:
        return pzip(seq[0], perm(seq[1:]))
Answer 21
Another approach (without libraries):
def permutation(input):
    if len(input) == 1:
        return input if isinstance(input, list) else [input]
    result = []
    for i in range(len(input)):
        first = input[i]
        rest = input[:i] + input[i + 1:]
        rest_permutation = permutation(rest)
        for p in rest_permutation:
            result.append(first + p)
    return result
The trotter package is different from most implementations in that it generates pseudo lists that don’t actually contain permutations but rather describe mappings between permutations and respective positions in an ordering, making it possible to work with very large ‘lists’ of permutations, as shown in this demo which performs pretty instantaneous operations and look-ups in a pseudo-list ‘containing’ all the permutations of the letters in the alphabet, without using more memory or processing than a typical web page.
In any case, to generate a list of permutations, we can do the following.
import trotter
my_permutations = trotter.Permutations(3, [1, 2, 3])
print(my_permutations)
for p in my_permutations:
    print(p)
def calcperm(arr, size):
    result = set([()])
    for dummy_idx in range(size):
        temp = set()
        for dummy_lst in result:
            for dummy_outcome in arr:
                if dummy_outcome not in dummy_lst:
                    new_seq = list(dummy_lst)
                    new_seq.append(dummy_outcome)
                    temp.add(tuple(new_seq))
        result = temp
    return result
To save you folks possible hours of searching and experimenting, here’s a non-recursive permutations solution in Python which also works with Numba (as of v. 0.41):
@numba.njit()
def permutations(A, k):
    r = [[i for i in range(0)]]
    for i in range(k):
        r = [[a] + b for a in A for b in r if (a in b)==False]
    return r
permutations([1,2,3],3)
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
To give an impression about performance:
%timeit permutations(np.arange(5),5)
243 µs ± 11.1 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
time: 406 ms
%timeit list(itertools.permutations(np.arange(5),5))
15.9 µs ± 8.61 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
time: 12.9 s
So use this version only if you have to call it from an njitted function; otherwise prefer the itertools implementation.
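For the plain (non-Numba) case, the itertools route mentioned above can be sketched like this. The wrapper name all_permutations is only for illustration; itertools.permutations itself is the standard-library call.

```python
import itertools

def all_permutations(items, k=None):
    """Return the permutations of items as a list of tuples.

    k is the length of each permutation; None means len(items).
    """
    return list(itertools.permutations(items, k))

print(all_permutations([1, 2, 3]))
```

itertools.permutations returns a lazy iterator, so wrapping it in list() is only needed when you actually want everything in memory at once.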
I see a lot of iteration going on inside these recursive functions, not exactly pure recursion…
so for those of you who cannot abide by even a single loop, here’s a gross, totally unnecessary fully recursive solution
def all_insert(x, e, i=0):
    return [x[0:i]+[e]+x[i:]] + all_insert(x,e,i+1) if i<len(x)+1 else []
def for_each(X, e):
    return all_insert(X[0], e) + for_each(X[1:],e) if X else []
def permute(x):
    return [x] if len(x) < 2 else for_each( permute(x[1:]) , x[0])
perms = permute([1,2,3])
Answer 26
Another solution:
def permutation(flag, k=1):
    N = len(flag)
    for i in xrange(0, N):
        if flag[i] != 0:
            continue
        flag[i] = k
        if k == N:
            print flag
        permutation(flag, k+1)
        flag[i] = 0
permutation([0, 0, 0])
Answer 27
My solution in Python:
def permutes(input, offset):
    if len(input) == offset:
        return [''.join(input)]
    result = []
    for i in range(offset, len(input)):
        input[offset], input[i] = input[i], input[offset]
        result = result + permutes(input, offset+1)
        input[offset], input[i] = input[i], input[offset]
    return result
# input is a "string"
# return value is a list of strings
def permutations(input):
    return permutes(list(input), 0)
# Main Program
print( permutations("wxyz") )
Answer 28
def permutation(word, first_char=None):
    if word is None or len(word) == 0:
        return []
    if len(word) == 1:
        return [word]
    result = []
    first_char = word[0]
    for sub_word in permutation(word[1:], first_char):
        result += insert(first_char, sub_word)
    return sorted(result)
def insert(ch, sub_word):
    arr = [ch + sub_word]
    for i in range(len(sub_word)):
        arr.append(sub_word[i:] + ch + sub_word[:i])
    return arr
assert permutation(None) == []
assert permutation('') == []
assert permutation('1') == ['1']
assert permutation('12') == ['12', '21']
print(permutation('abc'))
from collections import Counter
def permutations(nums):
    ans = [[]]
    cache = Counter(nums)
    for idx, x in enumerate(nums):
        result = []
        for items in ans:
            cache1 = Counter(items)
            for id, n in enumerate(nums):
                if cache[n] != cache1[n] and items + [n] not in result:
                    result.append(items + [n])
        ans = result
    return ans
permutations([1, 2, 2])
> [[1, 2, 2], [2, 1, 2], [2, 2, 1]]
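The "not in result" check above scans a list on every step. A different technique for the same problem, shown here only as a sketch and not as the answer's own method, is to sort the input and skip a repeated value at the same recursion depth. The name unique_permutations is made up for this example.

```python
def unique_permutations(nums):
    """Return the distinct permutations of nums (which may contain duplicates)."""
    items = sorted(nums)
    used = [False] * len(items)
    current, out = [], []

    def backtrack():
        if len(current) == len(items):
            out.append(list(current))
            return
        for i, value in enumerate(items):
            if used[i]:
                continue
            # Skip a duplicate value unless the equal value before it is already placed
            if i > 0 and items[i] == items[i - 1] and not used[i - 1]:
                continue
            used[i] = True
            current.append(value)
            backtrack()
            current.pop()
            used[i] = False

    backtrack()
    return out

print(unique_permutations([1, 2, 2]))
```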
I’ve very recently migrated to Py 3.5.
This code was working properly in Python 2.7:
with open(fname, 'rb') as f:
    lines = [x.strip() for x in f.readlines()]
for line in lines:
    tmp = line.strip().lower()
    if 'some-pattern' in tmp: continue
    # ... code
After upgrading to 3.5, I’m getting the:
TypeError: a bytes-like object is required, not 'str'
error on the last line (the pattern search code).
I’ve tried using the .decode() function on either side of the statement, also tried:
if tmp.find('some-pattern') != -1: continue
– to no avail.
I was able to resolve almost all 2:3 issues quickly, but this little statement is bugging me.
As has already been mentioned, you are reading the file in binary mode and then creating a list of bytes. In the following for loop you are comparing a string to bytes, and that is where the code fails.
Decoding the bytes while adding them to the list should work. The changed code should look as follows:
with open(fname, 'rb') as f:
    lines = [x.decode('utf8').strip() for x in f.readlines()]
Python 3 introduced a distinct bytes type, which is why your code worked in Python 2. In Python 2 there was no separate data type for bytes; bytes was simply an alias for str.
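As a rough sketch of the two ways out: either open the file in text mode so every line is already a str, or stay in binary mode and search for a bytes pattern. The function names and the sample file are made up for illustration, and UTF-8 is an assumption about the file's encoding.

```python
import os
import tempfile

def keep_lines_text(path, pattern='some-pattern'):
    """Open in text mode: lines are str, so a str pattern works."""
    kept = []
    with open(path, 'r', encoding='utf-8') as f:  # assumes UTF-8 content
        for line in f:
            tmp = line.strip().lower()
            if pattern in tmp:
                continue
            kept.append(tmp)
    return kept

def keep_lines_bytes(path, pattern=b'some-pattern'):
    """Stay in binary mode: lines are bytes, so the pattern must be bytes too."""
    kept = []
    with open(path, 'rb') as f:
        for line in f:
            tmp = line.strip().lower()
            if pattern in tmp:
                continue
            kept.append(tmp)
    return kept

# Throwaway sample file, only for the demonstration
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'Keep Me\nhas SOME-PATTERN inside\nalso kept\n')

text_result = keep_lines_text(sample_path)
bytes_result = keep_lines_bytes(sample_path)
os.remove(sample_path)

print(text_result)
print(bytes_result)
```

Mixing the two (a str pattern against bytes lines) is exactly what raises the TypeError from the question.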
I got this error when I was trying to convert a char (or string) to bytes, the code was something like this with Python 2.7:
# -*- coding: utf-8 -*-
print( bytes('ò') )
This is how Python 2.7 deals with unicode characters.
This won’t work with Python 3.6, since bytes requires an extra argument for the encoding. That can be a little tricky, since different encodings may produce different results.
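A small illustration of that point in Python 3, using two common encodings picked as examples: the same character becomes two bytes in UTF-8 and one byte in Latin-1.

```python
text = 'ò'

# The second argument to bytes() is the encoding and is required for str input
utf8_bytes = bytes(text, 'utf-8')
latin1_bytes = bytes(text, 'latin-1')

print(utf8_bytes)    # b'\xc3\xb2'
print(latin1_bytes)  # b'\xf2'

# Decoding must use the same encoding that produced the bytes
assert utf8_bytes.decode('utf-8') == text
assert latin1_bytes.decode('latin-1') == text
```

text.encode('utf-8') is equivalent to bytes(text, 'utf-8') and is the more common spelling.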
import datetime
dt = '21/03/2012'
day, month, year = (int(x) for x in dt.split('/'))
ans = datetime.date(year, month, day)
print (ans.strftime("%A"))
Answer 5
A solution without imports, for dates after 1700/1/1:
def weekDay(year, month, day):
    offset = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
    week = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
    afterFeb = 1
    if month > 2: afterFeb = 0
    aux = year - 1700 - afterFeb
    # dayOfWeek for 1700/1/1 = 5, Friday
    dayOfWeek = 5
    # partial sum of days between current date and 1700/1/1
    dayOfWeek += (aux + afterFeb) * 365
    # leap year correction (integer division, so it also works in Python 3)
    dayOfWeek += aux // 4 - aux // 100 + (aux + 100) // 400
    # sum monthly and day offsets
    dayOfWeek += offset[month - 1] + (day - 1)
    dayOfWeek %= 7
    return dayOfWeek, week[dayOfWeek]
print(weekDay(2013, 6, 15) == (6, 'Saturday'))
print(weekDay(1969, 7, 20) == (0, 'Sunday'))
print(weekDay(1945, 4, 30) == (1, 'Monday'))
print(weekDay(1900, 1, 1) == (1, 'Monday'))
print(weekDay(1789, 7, 14) == (2, 'Tuesday'))
datetime library sometimes gives errors with strptime() so I switched to dateutil library. Here’s an example of how you can use it :
from dateutil import parser
parser.parse('January 11, 2010').strftime("%a")
The output that you get from this is 'Mon'. If you want the output as ‘Monday’, use the following :
parser.parse('January 11, 2010').strftime("%A")
This worked for me pretty quickly. I was having problems while using the datetime library because I wanted to store the weekday name instead of the weekday number, and the format from the datetime library was causing problems. If you’re not having problems with this, great! If you are, you can definitely go for this, as it has a simpler syntax as well. Hope this helps.
Answer 9
Assuming you are given the day, month, and year, you could do:
import datetime
DayL = ['Mon','Tues','Wednes','Thurs','Fri','Satur','Sun']
date = DayL[datetime.date(year,month,day).weekday()] + 'day'
#Set day, month, year to your value
#Now, date is set as an actual day, not a number from 0 to 6.
print(date)
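If you would rather not maintain the list of name prefixes by hand, the standard library's calendar.day_name holds the full names in the same Monday-first order that weekday() uses. A minimal sketch follows; the function name weekday_name is made up here, and the names come out in English only under the default locale.

```python
import calendar
import datetime

def weekday_name(year, month, day):
    """Full weekday name for a date, via calendar.day_name (Monday is index 0)."""
    return calendar.day_name[datetime.date(year, month, day).weekday()]

print(weekday_name(2012, 3, 21))
```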
If you have reason to avoid the use of the datetime module, then this function will work.
Note: The change from the Julian to the Gregorian calendar is assumed to have occurred in 1582. If this is not true for your calendar of interest then change the line if year > 1582: accordingly.
def dow(year,month,day):
    """ day of week, Sunday = 1, Saturday = 7
    http://en.wikipedia.org/wiki/Zeller%27s_congruence """
    m, q = month, day
    if m == 1:
        m = 13
        year -= 1
    elif m == 2:
        m = 14
        year -= 1
    K = year % 100
    J = year // 100
    f = (q + int(13*(m + 1)/5.0) + K + int(K/4.0))
    fg = f + int(J/4.0) - 2 * J
    fj = f + 5 - J
    if year > 1582:
        h = fg % 7
    else:
        h = fj % 7
    if h == 0:
        h = 7
    return h
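One way to gain confidence in a formula like this is to cross-check it against datetime over a range of dates. The sketch below re-implements only the Gregorian branch with integer division; the names zeller_dow and datetime_dow are hypothetical, and the numbering (Sunday = 1 through Saturday = 7) follows the function above.

```python
import datetime

def zeller_dow(year, month, day):
    """Day of week via Zeller's congruence (Gregorian): Sunday = 1 ... Saturday = 7."""
    if month < 3:
        # January and February count as months 13 and 14 of the previous year
        month += 12
        year -= 1
    K = year % 100   # year within the century
    J = year // 100  # zero-based century
    h = (day + 13 * (month + 1) // 5 + K + K // 4 + J // 4 - 2 * J) % 7
    return 7 if h == 0 else h

def datetime_dow(year, month, day):
    """Same numbering, derived from isoweekday() (Monday = 1 ... Sunday = 7)."""
    return datetime.date(year, month, day).isoweekday() % 7 + 1

# Sample a date every 37 days from 1900 onwards and collect disagreements
start = datetime.date(1900, 1, 1)
mismatches = []
for offset in range(0, 80000, 37):
    d = start + datetime.timedelta(days=offset)
    if zeller_dow(d.year, d.month, d.day) != datetime_dow(d.year, d.month, d.day):
        mismatches.append(d)

print(len(mismatches))
```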
When I read Django code I often see in models what is called a “slug”. I am not quite sure what this is, but I do know it has something to do with URLs. How and when is this slug-thing supposed to be used?
A “slug” is a way of generating a valid URL, generally using data already obtained. For instance, a slug uses the title of an article to generate a URL. I advise to generate the slug by means of a function, given the title (or another piece of data), rather than setting it manually.
An example:
<title> The 46 Year Old Virgin </title>
<content> A silly comedy movie </content>
<slug> the-46-year-old-virgin </slug>
Now let’s pretend that we have a Django model such as:
class Article(models.Model):
    title = models.CharField(max_length=100)
    content = models.TextField(max_length=1000)
    slug = models.SlugField(max_length=40)
How would you reference this object with a URL and with a meaningful name? You could for instance use Article.id so the URL would look like this:
www.example.com/article/23
Or, you might want to reference the title like this:
www.example.com/article/The 46 Year Old Virgin
Since spaces aren’t valid in URLs, they must be replaced by %20, which results in:
The term “slug” has to do with casting metal—lead, in this case—out of which the press fonts were made. Every paper then had its fonts factory regularly re-melted and recast in fresh molds, since after many prints they became worn out. Apprentices like me started their career there, and went all the way to the top (not anymore).
Typographs had to compose the text of an article in a backward manner with lead characters stacked in a wise. So at printing time the letters would be straight on the paper. All typographs could read the newspaper mirrored as fast as the printed one. Therefore the slugs, (like snails) also the slow stories (the last to be fixed) were many on the bench waiting, solely identified by their first letters, mostly the whole title generally more readable. Some “hot” news were waiting there on the bench, for possible last minute correction, (Evening paper) before last assembly and definitive printing.
Django emerged from the offices of the Lawrence journal in Kansas. Where probably some printing jargon still lingers. A-django-enthusiast-&-friendly-old-slug-boy-from-France.
The term ‘slug’ comes from the world of newspaper production.
It’s an informal name given to a story during the production process. As the story winds its path from the beat reporter (assuming these even exist any more?) through to editor through to the “printing presses”, this is the name it is referenced by, e.g., “Have you fixed those errors in the ‘kate-and-william’ story?”.
Some systems (such as Django) use the slug as part of the URL to locate the story, an example being www.mysite.com/archives/kate-and-william.
Even Stack Overflow itself does this, with the GEB-ish(a) self-referential https://stackoverflow.com/questions/427102/what-is-a-slug-in-django/427201#427201, although you can replace the slug with blahblah and it will still find it okay.
It may even date back earlier than that, since screenplays had “slug lines” at the start of each scene, which basically sets the background for that scene (where, when, and so on). It’s very similar in that it’s a precis or preamble of what follows.
On a Linotype machine, a slug was a single line piece of metal which was created from the individual letter forms. By making a single slug for the whole line, this greatly improved on the old character-by-character compositing.
Although the following is pure conjecture, an early meaning of slug was for a counterfeit coin (which would have to be pressed somehow). I could envisage that usage being transformed to the printing term (since the slug had to be pressed using the original characters) and from there, changing from the ‘piece of metal’ definition to the ‘story summary’ definition. From there, it’s a short step from proper printing to the online world.
(a) “Godel, Escher, Bach”, by one Douglas Hofstadter, which I (at least) consider one of the great modern intellectual works. You should also check out his other work, “Metamagical Themas”.
Slug is a newspaper term. A slug is a short label for something, containing only letters, numbers, underscores or hyphens. They’re generally used in URLs. (as in Django docs)
A slug field in Django is used to store and generate valid URLs for your dynamically created web pages.
Just like the way you added this question on Stack Overflow and a dynamic page was generated and when you see in the address bar you will see your question title with “-” in place of the spaces. That’s exactly the job of a slug field.
The title entered by you was something like this -> What is a “slug” in Django?
On storing it into a slug field it becomes “what-is-a-slug-in-django” (see URL of this page)
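To make the title-to-slug step concrete, here is a rough pure-Python approximation of that kind of transformation. It is a sketch for illustration only, not Django's own slugify; the name simple_slugify and the exact rules (drop non-ASCII, strip punctuation, hyphenate whitespace) are assumptions of this example.

```python
import re
import unicodedata

def simple_slugify(title):
    """Turn a title into a lowercase, hyphen-separated, URL-friendly label."""
    # Drop accents and any other non-ASCII characters
    ascii_title = (unicodedata.normalize('NFKD', title)
                   .encode('ascii', 'ignore')
                   .decode('ascii'))
    # Remove everything except word characters, whitespace and hyphens
    cleaned = re.sub(r'[^\w\s-]', '', ascii_title.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen
    return re.sub(r'[-\s]+', '-', cleaned).strip('-_')

print(simple_slugify('What is a "slug" in Django?'))
```

In a real project you would normally call Django's slugify (imported from django.template.defaultfilters in the answers here) rather than rolling your own.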
“Slug” is a newspaper term, but what it means here is the final bit of the URL. For example, a post with the title, “A bit about Django” would become, “bit-about-django” automatically (you can, of course, change it easily if you don’t like the auto-generated slug).
It’s a descriptive part of the URL that is there to make it more human descriptive, but without necessarily being required by the web server – in What is a “slug” in Django? the slug is ‘in-django-what-is-a-slug’, but the slug is not used to determine the page served (on this site at least)
A slug is a URL-friendly short label for specific content. It contains only letters, numbers, underscores or hyphens. Slugs are commonly saved with the respective content and passed as part of the URL.
A slug can be created using SlugField.
Ex:
class Article(models.Model):
    title = models.CharField(max_length=100)
    slug = models.SlugField(max_length=100)
If you want to use the title as the slug, Django has a simple function called slugify:
from django.template.defaultfilters import slugify
class Article(models.Model):
    title = models.CharField(max_length=100)
    def slug(self):
        return slugify(self.title)
If it needs to be unique, add unique=True to the slug field.
For instance, from the previous example:
class Article(models.Model):
    title = models.CharField(max_length=100)
    slug = models.SlugField(max_length=100, unique=True)
Too lazy to handle slugs yourself? Don’t worry, this plugin will help you:
django-autoslug
A short label for something, containing only letters, numbers, underscores or hyphens. They’re generally used in URLs. For example, in a typical blog entry URL:
So, I started learning to code in Python and later Django. The first times it was hard looking at tracebacks and actually figure out what I did wrong and where the syntax error was. Some time has passed now and some way along the way, I guess I got a routine in debugging my Django code. As this was done early in my coding experience, I sat down and wondered if how I was doing this was ineffective and could be done faster. I usually manage to find and correct the bugs in my code, but I wonder if I should be doing it faster?
I usually just use the debug info Django gives when enabled. When things do end up as I thought it would, I break the code flow a lot with a syntax error, and look at the variables at that point in the flow to figure out, where the code does something other than what I wanted.
But can this be improved? Are there some good tools or better ways to debug your Django code?
There are a bunch of ways to do it, but the most straightforward is to simply use the Python debugger. Just add the following line to a Django view function:
import pdb; pdb.set_trace()
or
breakpoint() #from Python3.7
If you try to load that page in your browser, the browser will hang and you get a prompt to carry on debugging on actual executing code.
However there are other options (I am not recommending them):
* return HttpResponse({variable to inspect})
* print {variable to inspect}
* raise Exception({variable to inspect})
But the Python Debugger (pdb) is highly recommended for all types of Python code. If you are already into pdb, you’d also want to have a look at IPDB that uses ipython for debugging.
I really like Werkzeug’s interactive debugger. It’s similar to Django’s debug page, except that you get an interactive shell on every level of the traceback. If you use django-extensions, you get a runserver_plus management command which starts the development server and gives you Werkzeug’s debugger on exceptions.
Of course, you should only run this locally, as it gives anyone with a browser the rights to execute arbitrary python code in the context of the server.
Answer 2
A small tool for template tags:
@register.filter
def pdb(element):
    import pdb; pdb.set_trace()
    return element
Now, inside a template you can do {{ template_var|pdb }} and enter a pdb session (given you’re running the local devel server) where you can inspect element to your heart’s content.
It’s a very nice way to see what’s happened to your object when it arrives at the template.
Then you need good logging using the Python logging facility. You can send logging output to a log file, but an easier option is sending log output to firepython. To use this you need to use the Firefox browser with the firebug extension. Firepython includes a firebug plugin that will display any server-side logging in a Firebug tab.
Firebug itself is also critical for debugging the Javascript side of any app you develop. (Assuming you have some JS code of course).
I also liked django-viewtools for debugging views interactively using pdb, but I don’t use it that much.
There are more useful tools like dozer for tracking down memory leaks (there are also other good suggestions given in answers here on SO for memory tracking).
Almost everything has been mentioned so far, so I’ll only add that instead of pdb.set_trace() one can use ipdb.set_trace() which uses iPython and therefore is more powerful (autocomplete and other goodies). This requires ipdb package, so you only need to pip install ipdb
I’ve pushed django-pdb to PyPI.
It’s a simple app that means you don’t need to edit your source code every time you want to break into pdb.
Installation is just…
pip install django-pdb
Add 'django_pdb' to your INSTALLED_APPS
You can now run: manage.py runserver --pdb to break into pdb at the start of every view…
bash: manage.py runserver --pdb
Validating models...
0 errors found
Django version 1.3, using settings 'testproject.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
GET /
function "myview" in testapp/views.py:6
args: ()
kwargs: {}
> /Users/tom/github/django-pdb/testproject/testapp/views.py(7)myview()
-> a = 1
(Pdb)
And run: manage.py test --pdb to break into pdb on test failures/errors…
bash: manage.py test testapp --pdb
Creating test database for alias 'default'...
E
======================================================================
>>> test_error (testapp.tests.SimpleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File ".../django-pdb/testproject/testapp/tests.py", line 16, in test_error
one_plus_one = four
NameError: global name 'four' is not defined
======================================================================
> /Users/tom/github/django-pdb/testproject/testapp/tests.py(16)test_error()
-> one_plus_one = four
(Pdb)
The project’s hosted on GitHub, contributions are welcome of course.
The easiest way to debug python – especially for programmers that are used to Visual Studio – is using PTVS (Python Tools for Visual Studio).
The steps are simple:
Your breakpoint is hit, you can view/change the variables as easy as debugging C#/C++ programs.
That’s all :)
If you want to debug Django using PTVS, you need to do the following:
In Project settings – General tab, set “Startup File” to “manage.py”, the entry point of the Django program.
In Project settings – Debug tab, set “Script Arguments” to “runserver --noreload”. The key point is the “--noreload” here. If you don’t set it, your breakpoints won’t be hit.
I use PyCharm and stand by it all the way. It cost me a little but I have to say the advantage that I get out of it is priceless. I tried debugging from console and I do give people a lot of credit who can do that, but for me being able to visually debug my application(s) is great.
I have to say though, PyCharm does take a lot of memory. But then again, nothing good is free in life. They just came with their latest version 3. It also plays very well with Django, Flask and Google AppEngine. So, all in all, I’d say it’s a great handy tool to have for any developer.
If you are not using it yet, I’d recommend to get the trial version for 30 days to take a look at the power of PyCharm. I’m sure there are other tools also available, such as Aptana. But I guess I just also like the way PyCharm looks. I feel very comfortable debugging my apps there.
From my perspective, we could break down common code debugging tasks into three distinct usage patterns:
Something has raised an exception: runserver_plus‘ Werkzeug debugger to the rescue. The ability to run custom code at all the trace levels is a killer. And if you’re completely stuck, you can create a Gist to share with just a click.
Page is rendered, but the result is wrong: again, Werkzeug rocks. To make a breakpoint in code, just type assert False in the place you want to stop at.
Code works wrong, but the quick look doesn’t help. Most probably, an algorithmic problem. Sigh. Then I usually fire up a console debugger PuDB: import pudb; pudb.set_trace(). The main advantage over [i]pdb is that PuDB (while looking as you’re in 80’s) makes setting custom watch expressions a breeze. And debugging a bunch of nested loops is much simpler with a GUI.
Ah, yes, the templates’ woes. The most common (to me and my colleagues) problem is a wrong context: either you don’t have a variable, or your variable doesn’t have some attribute. If you’re using debug toolbar, just inspect the context at the “Templates” section, or, if it’s not sufficient, set a break in your views’ code just after your context is filled up.
One thing I love about epdb for debugging Django or other Python webservers is the epdb.serve() command. This sets a trace and serves this on a local port that you can connect to. Typical use case:
I have a view that I want to go through step-by-step. I’ll insert the following at the point I want to set the trace.
import epdb; epdb.serve()
Once this code gets executed, I open a Python interpreter and connect to the serving instance. I can analyze all the values and step through the code using the standard pdb commands like n, s, etc.
In [2]: import epdb; epdb.connect()
(Epdb) request
<WSGIRequest
path:/foo,
GET:<QueryDict: {}>,
POST:<QueryDict: {}>,
...
>
(Epdb) request.session.session_key
'i31kq7lljj3up5v7hbw9cff0rga2vlq5'
(Epdb) list
85 raise some_error.CustomError()
86
87 # Example login view
88 def login(request, username, password):
89 import epdb; epdb.serve()
90 -> return my_login_method(username, password)
91
92 # Example view to show session key
93 def get_session_key(request):
94 return request.session.session_key
95
And tons more that you can learn about typing epdb help at any time.
If you want to serve or connect to multiple epdb instances at the same time, you can specify the port to listen on (default is 8080). I.e.
host defaults to ‘localhost’ if not specified. I threw it in here to demonstrate how you can use this to debug something other than a local instance, like a development server on your local LAN. Obviously, if you do this be careful that the set trace never makes it onto your production server!
As a quick note, you can still do the same thing as the accepted answer with epdb (import epdb; epdb.set_trace()) but I wanted to highlight the serve functionality since I’ve found it so useful.
“There are IDEs like PyCharm that have their own debuggers. They offer similar or equal set of features … However to use them you have to use those specific IDEs (and some of then are non-free or may not be available for all platforms). Pick the right tool for your needs.”
Finally, if you’d like to see a nice graphical printout of your call stack in Django, checkout:
https://github.com/joerick/pyinstrument. Just add pyinstrument.middleware.ProfilerMiddleware to MIDDLEWARE_CLASSES, then add ?profile to the end of the request URL to activate the profiler.
Can also run pyinstrument from command line or by importing as a module.
Add import pdb; pdb.set_trace() or breakpoint() (from Python 3.7) at the corresponding line in the Python code and execute it. The execution will stop with an interactive shell. In the shell you can execute Python code (e.g. print variables) or use commands such as:
c continue execution
n step to the next line within the same function
s step to the next line in this function or a called function
wdb works with python 2 (2.6, 2.7), python 3 (3.2, 3.3, 3.4, 3.5) and pypy. Even better, it is possible to debug a python 2 program with a wdb server running on python 3 and vice-versa or debug a program running on a computer with a debugging server running on another computer inside a web page on a third computer!
Even betterer, it is now possible to pause a currently running python process/thread using code injection from the web interface. (This requires gdb and ptrace enabled)
In other words it’s a very enhanced version of pdb directly in your browser with nice features.
Install and run the server, and in your code add:
import wdb
wdb.set_trace()
According to the author, main differences with respect to pdb are:
For those who don’t know the project, wdb is a python debugger like pdb, but with a slick web front-end and a lot of additional features, such as:
Source syntax highlighting
Visual breakpoints
Interactive code completion using jedi
Persistent breakpoints
Deep objects inspection using mouse
Multithreading / Multiprocessing support
Remote debugging
Watch expressions
In debugger code edition
Popular web servers integration to break on error
In exception breaking during trace (not post-mortem) in contrary to the werkzeug debugger for instance
Breaking in currently running programs through code injection (on supported systems)
It has a great browser-based user interface. A joy to use! :)
I use PyCharm and various debugging tools. I also have a nice set of articles about setting those things up easily, aimed at novices. You may start here. It covers PDB and GUI debugging in general with Django projects. I hope someone will benefit from them.
I find Visual Studio Code is awesome for debugging Django apps. The standard python launch.json parameters run python manage.py with the debugger attached, so you can set breakpoints and step through your code as you like.
Answer 20
For those who might accidentally commit pdb to a live site, I can suggest this extension of #Koobz’s answer:
@register.filter
def pdb(element):
    from django.conf import settings
    if settings.DEBUG:
        import pdb
        pdb.set_trace()
    return element
As mentioned in other posts here – setting breakpoints in your code and walking thru the code to see if it behaves as you expected is a great way to learn something like Django until you have a good sense of how it all behaves – and what your code is doing.
To do this I would recommend using WingIDE. Like the other IDEs mentioned, it is nice and easy to use, has a nice layout, and makes it easy to set breakpoints and evaluate or modify the stack. It is perfect for visualizing what your code is doing as you step through it. I’m a big fan of it.
Also I use PyCharm – it has excellent static code analysis and can help sometimes spot problems before you realize they are there.
And while not explicitly a debug or analysis tool – one of my favorites is SQL Printing Middleware available from Django Snippets at https://djangosnippets.org/snippets/290/
This will display the SQL queries that your view has generated. This will give you a good sense of what the ORM is doing and if your queries are efficient or you need to rework your code (or add caching).
I find it invaluable for keeping an eye on query performance while developing and debugging my application.
Just one other tip – I modified it slightly for my own use to only show the summary and not the SQL statement…. So I always use it while developing and testing. I also added that if the len(connection.queries) is greater than a pre-defined threshold it displays an extra warning.
Then if I spot something bad (from a performance or number of queries perspective) is happening I turn back on the full display of the SQL statements to see exactly what is going on. Very handy when you are working on a large Django project with multiple developers.
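The summary-plus-warning tweak described above might look roughly like the sketch below. It is not the snippet's actual code: summarize_queries and QUERY_WARNING_THRESHOLD are hypothetical names, and the function takes a plain list of dicts shaped like Django's connection.queries (each with 'sql' and 'time' keys) so that it can be tried without a running project.

```python
# Hypothetical threshold; pick whatever suits your project.
QUERY_WARNING_THRESHOLD = 20


def summarize_queries(queries, threshold=QUERY_WARNING_THRESHOLD, show_sql=False):
    """Build a short report from a list shaped like django.db.connection.queries.

    Each entry is assumed to be a dict with 'sql' and 'time' keys,
    where 'time' is the duration in seconds (Django stores it as a string).
    """
    total_time = sum(float(q["time"]) for q in queries)
    lines = ["%d queries in %.3f seconds" % (len(queries), total_time)]
    # Extra warning when a single request issues too many queries.
    if len(queries) > threshold:
        lines.append("WARNING: more than %d queries for one request" % threshold)
    # The full SQL is opt-in, so the default output stays a short summary.
    if show_sql:
        lines.extend(q["sql"] for q in queries)
    return "\n".join(lines)


# Stand-in data for illustration only.
sample = [{"sql": "SELECT 1", "time": "0.002"}] * 3
print(summarize_queries(sample, threshold=2))
```

In a middleware you would pass connection.queries to such a function once the response has been built. Keep in mind that Django only records queries there when DEBUG is True.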
You can leverage nosetests and pdb together, rather than injecting pdb.set_trace() in your views manually. The advantage is that you can observe error conditions when they first occur, potentially in third-party code.
Here’s an error for me today.
TypeError at /db/hcm91dmo/catalog/records/
render_option() argument after * must be a sequence, not int
....
Error during template rendering
In template /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/crispy_forms/templates/bootstrap3/field.html, error at line 28
render_option() argument after * must be a sequence, not int
18
19 {% if field|is_checkboxselectmultiple %}
20 {% include 'bootstrap3/layout/checkboxselectmultiple.html' %}
21 {% endif %}
22
23 {% if field|is_radioselect %}
24 {% include 'bootstrap3/layout/radioselect.html' %}
25 {% endif %}
26
27 {% if not field|is_checkboxselectmultiple and not field|is_radioselect %}
28 {% if field|is_checkbox and form_show_labels %}
Now, I know this means that I goofed the constructor for the form, and I even have a good idea of which field is the problem. But can I use pdb to see what crispy forms is complaining about, within a template?
Yes, I can. Using the --pdb option on nosetests:
tests$ nosetests test_urls_catalog.py --pdb
As soon as I hit any exception (including ones handled gracefully), pdb stops where it happens and I can look around.
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/forms/forms.py", line 537, in __str__
return self.as_widget()
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/forms/forms.py", line 593, in as_widget
return force_text(widget.render(name, self.value(), attrs=attrs))
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/forms/widgets.py", line 513, in render
options = self.render_options(choices, [value])
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/forms/widgets.py", line 543, in render_options
output.append(self.render_option(selected_choices, *option))
TypeError: render_option() argument after * must be a sequence, not int
INFO lib.capture_middleware log write_to_index(http://localhost:8082/db/hcm91dmo/catalog/records.html)
INFO lib.capture_middleware log write_to_index:end
> /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/forms/widgets.py(543)render_options()
-> output.append(self.render_option(selected_choices, *option))
(Pdb) import pprint
(Pdb) pprint.PrettyPrinter(indent=4).pprint(self)
<django.forms.widgets.Select object at 0x115fe7d10>
(Pdb) pprint.PrettyPrinter(indent=4).pprint(vars(self))
{ 'attrs': { 'class': 'select form-control'},
'choices': [[('_', 'any type'), (7, (7, 'type 7', 'RECTYPE_TABLE'))]],
'is_required': False}
(Pdb)
Now it’s clear that my choices argument to the crispy field constructor was wrong: it was a list within a list, rather than a list/tuple of tuples.
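To make the difference concrete, here is a small, hedged sketch in plain Python. find_bad_choices is a hypothetical helper, not part of Django or crispy-forms, and it only covers flat choices (no option groups): it flags any entry that is not a (value, label) pair with a string label. The flat list is an illustrative guess at the intended shape, not the author's actual fix.

```python
def find_bad_choices(choices):
    """Return the indexes of entries that are not flat (value, label) pairs.

    Assumption: labels are plain strings, so grouped choices are
    reported as bad too.
    """
    bad = []
    for index, entry in enumerate(choices):
        is_pair = isinstance(entry, (list, tuple)) and len(entry) == 2
        if not is_pair or not isinstance(entry[1], str):
            bad.append(index)
    return bad


# The value seen in the pdb session: one list wrapping the real choices.
nested = [[("_", "any type"), (7, (7, "type 7", "RECTYPE_TABLE"))]]
# Illustrative flat shape: a list (or tuple) of (value, label) tuples.
flat = [("_", "any type"), (7, "type 7")]

print(find_bad_choices(nested))  # [0]
print(find_bad_choices(flat))    # []
```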
import numpy as np
np.set_printoptions(threshold=np.inf)
I suggest using np.inf instead of np.nan, which others have suggested. np.nan worked in older NumPy releases, but newer releases reject a NaN threshold with an error, and setting the threshold to “infinity” also makes it obvious to everybody reading your code what you mean. A threshold of “not a number” seems vague to me.
If you use NumPy 1.15 (released 2018-07-23) or newer, you can use the printoptions context manager:
with numpy.printoptions(threshold=numpy.inf):
print(arr)
(of course, replace numpy by np if that’s how you imported numpy)
The use of a context manager (the with-block) ensures that after the context manager is finished, the print options will revert to whatever they were before the block started. It ensures the setting is temporary, and only applied to code within the block.
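A quick way to convince yourself that the setting really is temporary is to compare the threshold before and after the block. The sketch below assumes NumPy 1.15 or newer and uses np.array2string, which follows the current print options, so the results can be inspected as strings instead of being printed.

```python
import numpy as np

arr = np.arange(2000)  # larger than the default threshold of 1000 elements

threshold_before = np.get_printoptions()["threshold"]

with np.printoptions(threshold=np.inf):
    inside = np.array2string(arr)   # rendered with the temporary setting

threshold_after = np.get_printoptions()["threshold"]
outside = np.array2string(arr)      # rendered with the restored setting

print("..." in inside)    # False: nothing was elided inside the block
print("..." in outside)   # True: truncation is back after the block
print(threshold_before == threshold_after)  # True
```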
This is a slight modification of neok's answer (I removed the option to pass additional arguments to set_printoptions).
It shows how you can use contextlib.contextmanager to easily create such a contextmanager with fewer lines of code:
import numpy as np
from contextlib import contextmanager
@contextmanager
def show_complete_array():
    oldoptions = np.get_printoptions()
    np.set_printoptions(threshold=np.inf)
    try:
        yield
    finally:
        np.set_printoptions(**oldoptions)
In your code it can be used like this:
a = np.arange(1001)
print(a) # shows the truncated array
with show_complete_array():
    print(a) # shows the complete array
print(a) # shows the truncated array (again)
Complementing the answer about the maximum number of elements (fixed with numpy.set_printoptions(threshold=numpy.inf)), there is also a limit on the number of characters displayed per line. In some environments, such as when calling python from bash (rather than an interactive session), this can be fixed by setting the linewidth parameter as follows.
import numpy as np
np.set_printoptions(linewidth=2000) # default = 75
Mat = np.arange(20000,20150).reshape(2,75) # 150 elements (75 columns)
print(Mat)
In this case, it is your window that limits the number of characters before the line wraps.
For those using Sublime Text and wanting to see results within the output window, you should add the build option "word_wrap": false to the sublime-build file.
If you want to print the full array in a one-off way (without toggling np.set_printoptions), but want something simpler (less code) than the context manager, just do
for row in arr:
    print(row)
Answer 13
A slight modification: (since you are going to print a huge list)
import numpy as np
np.set_printoptions(threshold=np.inf, linewidth=200)
x = np.arange(1000)
print(x)
This will increase the number of characters per line (the default linewidth is 75). Use any linewidth value that suits your coding environment. Fitting more characters on each line saves you from having to go through a huge number of output lines.
avoids the side effect of requiring a reset of numpy.set_printoptions(threshold=sys.maxsize) and you don’t get the numpy.array and brackets. I find this convenient for dumping a wide array into a log file
If an array is too large to be printed, NumPy automatically skips the central part of the array and only prints the corners:
To disable this behaviour and force NumPy to print the entire array, you can change the printing options using set_printoptions.
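As a rough illustration of both behaviours, the sketch below renders a 100 x 100 array once with the default options and once with the threshold raised to sys.maxsize. The array shape is an arbitrary choice for the example, and the previous options are saved and restored so the change does not leak into the rest of the program.

```python
import sys

import numpy as np

arr = np.arange(10000).reshape(100, 100)

# Default options: only the corners are rendered, with "..." in between.
truncated = np.array2string(arr)

saved = np.get_printoptions()
np.set_printoptions(threshold=sys.maxsize)
try:
    # Every element is rendered now.
    full = np.array2string(arr)
finally:
    # Put the earlier options back.
    np.set_printoptions(**saved)

print("..." in truncated)  # True
print("..." in full)       # False
print(len(full.splitlines()) > len(truncated.splitlines()))  # True
```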