The communicate() method returns an array of bytes:
>>> command_stdout
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\n'
However, I’d like to work with the output as a normal Python string, so that I can print it like this:
>>> print(command_stdout)
-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2
I thought that’s what the binascii.b2a_qp() method is for, but when I tried it, I got the same byte array again:
>>> binascii.b2a_qp(command_stdout)
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\n'
How do I convert the bytes value back to string? I mean, using the “batteries” instead of doing it manually. And I’d like it to be OK with Python 3.
Answer 0
You need to decode the bytes object to produce a string:
>>> b"abcde"
b'abcde'
# utf-8 is used here because it is a very common encoding, but you
# need to use the encoding your data is actually in.
>>> b"abcde".decode("utf-8")
'abcde'
If you don’t know the encoding, then to read binary input into a string in a way that works on both Python 3 and Python 2, use the ancient MS-DOS CP437 encoding:
import sys

PY3K = sys.version_info >= (3, 0)
lines = []
for line in stream:
    if not PY3K:
        # Python 2: str is already a byte string, keep it as-is
        lines.append(line)
    else:
        lines.append(line.decode('cp437'))
Because encoding is unknown, expect non-English symbols to translate to characters of cp437 (English characters are not translated, because they match in most single byte encodings and UTF-8).
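Why cp437 never raises here can be checked directly. This is a minimal sketch, assuming CPython's bundled cp437 codec, which assigns a character to every one of the 256 byte values:

```python
# Every possible byte value, 0x00 through 0xFF.
raw = bytes(range(256))

# cp437 maps each byte to some character, so decoding cannot fail...
text = raw.decode('cp437')

# ...and encoding the result restores the original bytes exactly.
restored = text.encode('cp437')

print(len(text), restored == raw)
```

The decoded text is only meaningful if the data really was cp437; the round trip merely shows that no information is lost.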
Decoding arbitrary binary input as UTF-8 is unsafe, because you may get this:
>>> b'\x00\x01\xffsd'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 2: invalid start byte
The same applies to latin-1, which was popular (the default?) for Python 2. See the missing points in Codepage Layout – it is where Python chokes with infamous ordinal not in range.
UPDATE 20150604: There are rumors that Python 3 has the surrogateescape error strategy for encoding stuff into binary data without data loss and crashes, but it needs conversion tests, [binary] -> [str] -> [binary], to validate both performance and reliability.
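The [binary] -> [str] -> [binary] conversion test mentioned above could look roughly like this. It is only a sketch of a reliability check on one sample value, not a performance test:

```python
# Bytes that are not valid UTF-8 (0xff and a stray 0x80).
raw = b'\x00\x01\xffsd\x80abc'

# surrogateescape maps each undecodable byte to a lone surrogate
# code point in the range U+DC80..U+DCFF instead of raising.
text = raw.decode('utf-8', 'surrogateescape')

# Encoding with the same handler turns those surrogates back into bytes.
restored = text.encode('utf-8', 'surrogateescape')

print(repr(text), restored == raw)
```

Note that the intermediate string contains lone surrogates, so encoding it with the default strict handler (for example when printing to a UTF-8 terminal) raises UnicodeEncodeError.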
UPDATE 20170116: Thanks to comment by Nearoo – there is also a possibility to slash escape all unknown bytes with backslashreplace error handler. That works only for Python 3, so even with this workaround you will still get inconsistent output from different Python versions:
import sys

PY3K = sys.version_info >= (3, 0)
lines = []
for line in stream:
    if not PY3K:
        lines.append(line)
    else:
        lines.append(line.decode('utf-8', 'backslashreplace'))
UPDATE 20170119: I decided to implement slash escaping decode that works for both Python 2 and Python 3. It should be slower than the cp437 solution, but it should produce identical results on every Python version.
# --- preparation
import codecs

def slashescape(err):
    """ codecs error handler. err is UnicodeDecode instance. return
    a tuple with a replacement for the unencodable part of the input
    and a position where encoding should continue"""
    #print err, dir(err), err.start, err.end, err.object[:err.start]
    thebyte = err.object[err.start:err.end]
    repl = u'\\x'+hex(ord(thebyte))[2:]
    return (repl, err.end)

codecs.register_error('slashescape', slashescape)

# --- processing
stream = [b'\x80abc']
lines = []
for line in stream:
    lines.append(line.decode('utf-8', 'slashescape'))
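One caveat with the handler above: ord() only accepts a single byte, while a decode error can span several bytes (for example a truncated multi-byte sequence). The following is a Python 3 only sketch of a variant that escapes every offending byte and compares its output with the built-in backslashreplace handler. The name slashescape_multi is made up for this illustration.

```python
import codecs

def slashescape_multi(err):
    """Hypothetical variant: escape each undecodable byte as \\xNN."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # In Python 3, iterating over bytes yields integers.
    repl = ''.join('\\x%02x' % b for b in bad)
    return (repl, err.end)

codecs.register_error('slashescape_multi', slashescape_multi)

samples = [b'\x80abc', b'abc\xe2\x82', b'\xff\xfe']
results = []
for raw in samples:
    custom = raw.decode('utf-8', 'slashescape_multi')
    builtin = raw.decode('utf-8', 'backslashreplace')
    results.append((custom, builtin))
    print(repr(custom), custom == builtin)
```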
Aaron’s answer was correct, except that you need to know which encoding to use. And I believe that Windows uses ‘windows-1252’. It will only matter if you have some unusual (non-ASCII) characters in your content, but then it will make a difference.
By the way, the fact that it does matter is the reason that Python moved to using two different types for binary and text data: it can’t convert magically between them, because it doesn’t know the encoding unless you tell it! The only way YOU would know is to read the Windows documentation (or read it here).
Trying to decode such byte soup using utf-8 encoding raises UnicodeDecodeError.
It can be worse. The decoding may fail silently and produce mojibake if you use a wrong, incompatible encoding:
>>> '—'.encode('utf-8').decode('cp1252')
'â€”'
The data is corrupted but your program remains unaware that a failure has occurred.
In general, the character encoding to use is not embedded in the byte sequence itself. You have to communicate this information out-of-band. Some outcomes are more likely than others, which is why the chardet module exists: it can guess the character encoding. A single Python script may use multiple character encodings in different places.
ls output can be converted to a Python string using the os.fsdecode() function, which succeeds even for undecodable filenames (it uses sys.getfilesystemencoding() and the surrogateescape error handler on Unix):
import os
import subprocess
output = os.fsdecode(subprocess.check_output('ls'))
To get the original bytes, you could use os.fsencode().
If you pass the universal_newlines=True parameter, then subprocess uses locale.getpreferredencoding(False) to decode bytes; e.g., it can be cp1252 on Windows.
Different commands may use different character encodings for their output; e.g., the dir internal command (cmd) may use cp437. To decode its output, you could pass the encoding explicitly (Python 3.6+):
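As a portable illustration of passing the encoding explicitly, the sketch below uses a stand-in child process (the current Python interpreter) rather than dir, so that it runs on any platform. The child writes one raw cp437 byte, and the parent asks subprocess to decode the output as cp437.

```python
import subprocess
import sys

# Stand-in child process: writes "caf" followed by the cp437 byte 0x82.
child_code = "import sys; sys.stdout.buffer.write(b'caf\\x82')"

output = subprocess.check_output(
    [sys.executable, '-c', child_code],
    encoding='cp437',  # decode the child's stdout as cp437 (Python 3.6+)
)
print(output)
```

For a real Windows command, the same encoding keyword would be passed along with whatever arguments the command needs.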
Since this question is actually asking about subprocess output, you have a more direct approach available since Popen accepts an encoding keyword (in Python 3.6+):
>>> from subprocess import Popen, PIPE
>>> text = Popen(['ls', '-l'], stdout=PIPE, encoding='utf-8').communicate()[0]
>>> type(text)
str
>>> print(text)
total 0
-rw-r--r-- 1 wim badger 0 May 31 12:45 some_file.txt
The general answer for other users is to decode bytes to text:
>>> b'abcde'.decode()
'abcde'
With no argument, sys.getdefaultencoding() will be used. If your data is not sys.getdefaultencoding(), then you must specify the encoding explicitly in the decode call:
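For instance, here is a small sketch with Latin-1 data, which is not valid in the default encoding (UTF-8 on Python 3):

```python
import sys

data = 'café'.encode('latin-1')  # b'caf\xe9'

# Naming the real encoding gives the expected text.
text = data.decode('latin-1')

# Relying on the default fails for this data.
try:
    data.decode()
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

print(sys.getdefaultencoding(), text, default_failed)
```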
All your line endings will be doubled (to \r\r\n), leading to extra empty lines. Python’s text-read functions usually normalize line endings so that strings use only \n. If you receive binary data from a Windows system, Python does not have a chance to do that. Thus,
def cleanLists(self, lista):
    lista = [x.strip() for x in lista]
    lista = [x.replace('\n', '') for x in lista]
    lista = [x.replace('\b', '') for x in lista]
    lista = [x.encode('utf8') for x in lista]
    lista = [x.decode('utf8') for x in lista]
    return lista
Answer 13
For Python 3, this is a much safer and Pythonic approach to convert from byte to string:
def byte_to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):  # Check if it's in bytes
        print(bytes_or_str.decode('utf-8'))
    else:
        print("Object not of byte type")

byte_to_str(b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\n')
Output:
total 0
-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2
To write or read binary data from/to the standard streams, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').
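Answer 15
def toString(string):
    try:
        return string.decode("utf-8")
    except (AttributeError, ValueError):
        # str has no decode() in Python 3 (AttributeError);
        # undecodable bytes raise UnicodeDecodeError, a ValueError.
        return string

b = b'97.080.500'
s = '97.080.500'
print(toString(b))
print(toString(s))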
For your specific case of “run a shell command and get its output as text instead of bytes”, on Python 3.7+, you should use subprocess.run and pass in text=True (as well as capture_output=True to capture the output):
command_result = subprocess.run(["ls", "-l"], capture_output=True, text=True)
command_result.stdout # is a `str` containing your program's stdout
text used to be called universal_newlines, and was changed (well, aliased) in Python 3.7. If you want to support Python versions before 3.7, pass in universal_newlines=True instead of text=True
If you want to convert any bytes, not just a string converted to bytes:
import base64
import json

with open("bytesfile", "rb") as infile:
    # b85encode returns bytes made of ASCII characters; decode them to get a str
    text = base64.b85encode(infile.read()).decode("ascii")
with open("bytesfile", "rb") as infile:
    text2 = json.dumps(list(infile.read()))
The JSON variant is not very efficient, however. It will turn a 2 MB picture into 9 MB.
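To get a feel for the size overhead of the two approaches, this sketch encodes 1024 in-memory bytes instead of reading a file (the sample data is made up) and checks that both forms can be turned back into the original bytes:

```python
import base64
import json

data = bytes(range(256)) * 4  # 1024 bytes of sample data

b85_text = base64.b85encode(data).decode('ascii')
json_text = json.dumps(list(data))

# Both representations are reversible.
from_b85 = base64.b85decode(b85_text)
from_json = bytes(json.loads(json_text))

print(len(data), len(b85_text), len(json_text))
```

Base85 uses five characters for every four bytes, so the text is 25% larger than the input, whereas the JSON list spends several characters on each byte.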
It is my understanding that the range() function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator.
This being the case, I would have expected the following line to take an inordinate amount of time, because in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated:
1000000000000000 in range(1000000000000001)
Furthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous).
I have also tried things like this, but the calculation is still almost instant:
1000000000000000000000 in range(0,1000000000000000000001,10) # count by tens
If I try to implement my own range function, the result is not so nice!!
def my_crappy_range(N):
    i = 0
    while i < N:
        yield i
        i += 1
    return
What is the range() object doing under the hood that makes it so fast?
Martijn Pieters’ answer was chosen for its completeness, but also see abarnert’s first answer for a good discussion of what it means for range to be a full-fledged sequence in Python 3, and some information/warning regarding potential inconsistency for __contains__ function optimization across Python implementations. abarnert’s other answer goes into some more detail and provides links for those interested in the history behind the optimization in Python 3 (and lack of optimization of xrange in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested.
The Python 3 range() object doesn’t produce numbers immediately; it is a smart sequence object that produces numbers on demand. All it contains is your start, stop and step values, then as you iterate over the object the next integer is calculated each iteration.
The object also implements the object.__contains__ hook, and calculates if your number is part of its range. Calculating is a (near) constant time operation *. There is never a need to scan through all possible integers in the range.
The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).
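The memory claim can be observed with sys.getsizeof. This is a sketch that assumes CPython, where getsizeof reports the size of the range object itself and not of the integers it refers to:

```python
import sys

small = range(10)
huge = range(10**15)
as_list = list(range(1000))

# The two range objects report the same size, whatever span they cover.
print(sys.getsizeof(small), sys.getsizeof(huge))

# A list has to hold a reference to every element, so it grows with its length.
print(sys.getsizeof(as_list))
```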
So at a minimum, your range() object would do:
class my_range(object):
    def __init__(self, start, stop=None, step=1):
        if stop is None:
            start, stop = 0, start
        self.start, self.stop, self.step = start, stop, step
        if step < 0:
            lo, hi, step = stop, start, -step
        else:
            lo, hi = start, stop
        self.length = 0 if lo > hi else ((hi - lo - 1) // step) + 1

    def __iter__(self):
        current = self.start
        if self.step < 0:
            while current > self.stop:
                yield current
                current += self.step
        else:
            while current < self.stop:
                yield current
                current += self.step

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if i < 0:
            i += self.length
        if 0 <= i < self.length:
            return self.start + i * self.step
        raise IndexError('Index out of range: {}'.format(i))

    def __contains__(self, num):
        if self.step < 0:
            if not (self.stop < num <= self.start):
                return False
        else:
            if not (self.start <= num < self.stop):
                return False
        return (num - self.start) % self.step == 0
This is still missing several things that a real range() supports (such as the .index() or .count() methods, hashing, equality testing, or slicing), but should give you an idea.
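For comparison, here is a short sketch of those extra features on the built-in range (behaviour as of Python 3.3+, where ranges compare equal if they represent the same sequence of values):

```python
r = range(0, 20, 2)

position = r.index(10)       # where 10 sits in the sequence
occurrences = r.count(10)    # how often 10 occurs (0 or 1 for a range)
sliced = r[2:8:2]            # slicing yields another range, not a list

# Equality and hashing are based on the represented sequence.
same = range(0, 3, 2) == range(0, 4, 2)   # both represent 0, 2
same_hash = hash(range(0, 3, 2)) == hash(range(0, 4, 2))

print(position, occurrences, sliced, same, same_hash)
```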
I also simplified the __contains__ implementation to only focus on integer tests; if you give a real range() object a non-integer value (including subclasses of int), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.
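The fallback for non-int values can be seen with a few numeric types from the standard library. The range is kept small here because these tests take the slow scanning path:

```python
from decimal import Decimal
from fractions import Fraction

r = range(0, 10, 2)

# None of these are exact ints, so range compares them with each
# element in turn; they match because they equal the integer 4.
float_hit = 4.0 in r
decimal_hit = Decimal(4) in r
fraction_hit = Fraction(8, 2) in r

# A value that equals no element is reported as absent after a full scan.
float_miss = 4.5 in r

print(float_hit, decimal_hit, fraction_hit, float_miss)
```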
* Near constant time because Python integers are unbounded and so math operations also grow in time as N grows, making this a O(log N) operation. Since it’s all executed in optimised C code and Python stores integer values in 30-bit chunks, you’d run out of memory before you saw any performance impact due to the size of the integers involved here.
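Answer 1
The fundamental misunderstanding here is in thinking that range is a generator. It’s not. In fact, it’s not any kind of iterator.
You can tell this pretty easily:
>>> a = range(5)
>>> print(list(a))
[0, 1, 2, 3, 4]
>>> print(list(a))
[0, 1, 2, 3, 4]
If it were a generator, iterating over it once would exhaust it:
>>> b = my_crappy_range(5)
>>> print(list(b))
[0, 1, 2, 3, 4]
>>> print(list(b))
[]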
The difference between a range and a list is that a range is a lazy or dynamic sequence; it doesn’t remember all of its values, it just remembers its start, stop, and step, and creates the values on demand on __getitem__.
(As a side note, if you print(iter(a)), you’ll notice that CPython 3 gives range its own range_iterator type rather than reusing list’s list_iterator. Each call to iter() creates a fresh, independent iterator object, which is why iterating never uses up the range itself.)
Now, there’s nothing that says that Sequence.__contains__ has to be constant time—in fact, for obvious examples of sequences like list, it isn’t. But there’s nothing that says it can’t be. And it’s easier to implement range.__contains__ to just check it mathematically ((val - start) % step, but with some extra complexity to deal with negative steps) than to actually generate and test all the values, so why shouldn’t it do it the better way?
But there doesn’t seem to be anything in the language that guarantees this will happen. As Ashwini Chaudhary points out, if you give it a non-integral value, instead of converting to integer and doing the mathematical test, it will fall back to iterating all the values and comparing them one by one. And just because CPython 3.2+ and PyPy 3.x versions happen to contain this optimization, and it’s an obvious good idea and easy to do, there’s no reason that IronPython or NewKickAssPython 3.x couldn’t leave it out. (And in fact CPython 3.0-3.1 didn’t include it.)
If range actually were a generator, like my_crappy_range, then it wouldn’t make sense to test __contains__ this way, or at least the way it makes sense wouldn’t be obvious. If you’d already iterated the first 3 values, is 1 still in the generator? Should testing for 1 cause it to iterate and consume all the values up to 1 (or up to the first value >= 1)?
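What actually happens with a generator is easy to try. In this sketch a generator expression stands in for my_crappy_range:

```python
gen = (i for i in range(5))

# The containment test pulls values from the generator until it finds 2.
found = 2 in gen

# Everything up to and including 2 is now gone.
remaining = list(gen)

# The generator is exhausted, so a second test can only fail.
found_again = 2 in gen

# A range is unaffected by repeated tests.
r = range(5)
range_results = (2 in r, 2 in r, list(r))

print(found, remaining, found_again, range_results)
```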
In CPython, range(...).__contains__ (a method wrapper) will eventually delegate to a simple calculation which checks if the value can possibly be in the range. The reason for the speed here is we’re using mathematical reasoning about the bounds, rather than a direct iteration of the range object. To explain the logic used:
Check that the number is between start and stop, and
Check that the stride value doesn’t “step over” our number.
For example, 994 is in range(4, 1000, 2) because:
4 <= 994 < 1000, and
(994 - 4) % 2 == 0.
The full C code is included below, which is a bit more verbose because of memory management and reference counting details, but the basic idea is there:
static int
range_contains_long(rangeobject *r, PyObject *ob)
{
    int cmp1, cmp2, cmp3;
    PyObject *tmp1 = NULL;
    PyObject *tmp2 = NULL;
    PyObject *zero = NULL;
    int result = -1;

    zero = PyLong_FromLong(0);
    if (zero == NULL) /* MemoryError in int(0) */
        goto end;

    /* Check if the value can possibly be in the range. */

    cmp1 = PyObject_RichCompareBool(r->step, zero, Py_GT);
    if (cmp1 == -1)
        goto end;
    if (cmp1 == 1) { /* positive steps: start <= ob < stop */
        cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
        cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
    }
    else { /* negative steps: stop < ob <= start */
        cmp2 = PyObject_RichCompareBool(ob, r->start, Py_LE);
        cmp3 = PyObject_RichCompareBool(r->stop, ob, Py_LT);
    }

    if (cmp2 == -1 || cmp3 == -1) /* TypeError */
        goto end;
    if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
        result = 0;
        goto end;
    }

    /* Check that the stride does not invalidate ob's membership. */
    tmp1 = PyNumber_Subtract(ob, r->start);
    if (tmp1 == NULL)
        goto end;
    tmp2 = PyNumber_Remainder(tmp1, r->step);
    if (tmp2 == NULL)
        goto end;
    /* result = ((int(ob) - start) % step) == 0 */
    result = PyObject_RichCompareBool(tmp2, zero, Py_EQ);
  end:
    Py_XDECREF(tmp1);
    Py_XDECREF(tmp2);
    Py_XDECREF(zero);
    return result;
}

static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}
As a final note – look at the range_contains function at the bottom of the code snippet. If the exact type check fails then we don’t use the clever algorithm described, instead falling back to a dumb iteration search of the range using _PySequence_IterSearch! You can check this behaviour in the interpreter (I’m using v3.5.0 here):
>>> x, r = 1000000000000000, range(1000000000000001)
>>> class MyInt(int):
...     pass
...
>>> x_ = MyInt(x)
>>> x in r # calculates immediately :)
True
>>> x_ in r # iterates for ages.. :(
^\Quit (core dumped)
To add to Martijn’s answer, this is the relevant part of the source (in C, as the range object is written in native code):
static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}
So for PyLong objects (which is int in Python 3), it will use the range_contains_long function to determine the result. And that function essentially checks if ob is in the specified range (although it looks a bit more complex in C).
If it’s not an int object, it falls back to iterating until it finds the value (or not).
The whole logic could be translated to pseudo-Python like this:
def range_contains (rangeObj, obj):
    if isinstance(obj, int):
        return range_contains_long(rangeObj, obj)

    # default logic by iterating
    return any(obj == x for x in rangeObj)

def range_contains_long (r, num):
    if r.step > 0:
        # positive step: r.start <= num < r.stop
        cmp2 = r.start <= num
        cmp3 = num < r.stop
    else:
        # negative step: r.start >= num > r.stop
        cmp2 = num <= r.start
        cmp3 = r.stop < num

    # outside of the range boundaries
    if not cmp2 or not cmp3:
        return False

    # num must be on a valid step inside the boundaries
    return (num - r.start) % r.step == 0
If you’re wondering why this optimization was added to range.__contains__, and why it wasn’t added to xrange.__contains__ in 2.7:
First, as Ashwini Chaudhary discovered, issue 1766304 was opened explicitly to optimize [x]range.__contains__. A patch for this was accepted and checked in for 3.2, but not backported to 2.7 because “xrange has behaved like this for such a long time that I don’t see what it buys us to commit the patch this late.” (2.7 was nearly out at that point.)
Meanwhile:
Originally, xrange was a not-quite-sequence object. As the 3.1 docs say:
Range objects have very little behavior: they only support indexing, iteration, and the len function.
This wasn’t quite true; an xrange object actually supported a few other things that come automatically with indexing and len,* including __contains__ (via linear search). But nobody thought it was worth making them full sequences at the time.
Then, as part of implementing the Abstract Base Classes PEP, it was important to figure out which builtin types should be marked as implementing which ABCs, and xrange/range claimed to implement collections.Sequence, even though it still only handled the same “very little behavior”. Nobody noticed that problem until issue 9213. The patch for that issue not only added index and count to 3.2’s range, it also re-worked the optimized __contains__ (which shares the same math with index, and is directly used by count).** This change went in for 3.2 as well, and was not backported to 2.x, because “it’s a bugfix that adds new methods”. (At this point, 2.7 was already past rc status.)
So, there were two chances to get this optimization backported to 2.7, but they were both rejected.
* In fact, you even get iteration for free with indexing alone, but in 2.3 xrange objects got a custom iterator.
** The first version actually reimplemented it, and got the details wrong—e.g., it would give you MyIntSubclass(2) in range(5) == False. But Daniel Stutzbach’s updated version of the patch restored most of the previous code, including the fallback to the generic, slow _PySequence_IterSearch that pre-3.2 range.__contains__ was implicitly using when the optimization doesn’t apply.
Answer 5
The other answers explained it well already, but I’d like to offer another experiment illustrating the nature of range objects:
>>> r = range(5)
>>> for i in r:
...     print(i, 2 in r, list(r))
...
0 True [0, 1, 2, 3, 4]
1 True [0, 1, 2, 3, 4]
2 True [0, 1, 2, 3, 4]
3 True [0, 1, 2, 3, 4]
4 True [0, 1, 2, 3, 4]
As you can see, a range object is an object that remembers its range and can be used many times (even while iterating over it), not just a one-time generator.
It’s all about a lazy approach to the evaluation and some extra optimization of range.
Values in ranges don’t need to be computed until real use, or even further due to extra optimization.
By the way, your integer is not that big; consider sys.maxsize.
sys.maxsize in range(sys.maxsize) is pretty fast
due to the optimization: it is easy to compare the given integer against just the min and max of the range.
But:
Decimal(sys.maxsize) in range(sys.maxsize) is pretty slow.
(In this case there is no optimization in range, so when Python receives an unexpected Decimal, it compares it against all the numbers.)
You should be aware that this is an implementation detail; it should not be relied upon, because it may change in the future.
The object returned by range() is actually a range object. This object is iterable, so you can iterate over its values sequentially, just like a generator, list, or tuple.
But it also implements the __contains__ interface which is actually what gets called when an object appears on the right hand side of the in operator. The __contains__() method returns a bool of whether or not the item on the left-hand-side of the in is in the object. Since range objects know their bounds and stride, this is very easy to implement in O(1).
Thanks to this optimization, a given integer only has to be compared against the minimum and maximum of the range.
The reason the range() containment test is so fast in Python 3 is that it uses mathematical reasoning about the bounds, rather than a direct iteration of the range object.
To explain the logic:
Check whether the number is between start and stop.
Check whether the step value doesn’t skip over our number.
Take an example, 997 is in range(4, 1000, 3) because:
4 <= 997 < 1000, and (997 - 4) % 3 == 0.
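The two checks can be written as a small function and compared against a brute-force scan. The function name in_arith_range is made up for this sketch, and it handles plain integers with a non-zero step only:

```python
def in_arith_range(num, start, stop, step):
    """Membership test using only the bounds check and the stride check."""
    if step > 0:
        in_bounds = start <= num < stop
    else:
        in_bounds = stop < num <= start
    return in_bounds and (num - start) % step == 0

print(in_arith_range(997, 4, 1000, 3))   # the example above
print(in_arith_range(998, 4, 1000, 3))   # in bounds, but stepped over

# Compare with actually generating the values, for a few small ranges.
mismatches = [
    (num, args)
    for args in [(4, 30, 3), (30, 4, -3), (0, 10, 1), (5, 5, 2)]
    for num in range(-5, 40)
    if in_arith_range(num, *args) != (num in list(range(*args)))
]
print(mismatches)
```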
Answer 9
Try x-1 in (i for i in range(x)) for large x values, which uses a generator comprehension to avoid invoking the range.__contains__ optimisation.
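A rough way to see the difference is to time both forms with timeit. The value of x here is kept modest so that the slow variant finishes quickly; the absolute numbers will vary by machine.

```python
import timeit

x = 200_000

# Uses range.__contains__: a bounds check and a modulo.
fast = timeit.timeit(lambda: x - 1 in range(x), number=5)

# The generator expression hides the range, so every value is produced
# and compared one by one.
slow = timeit.timeit(lambda: x - 1 in (i for i in range(x)), number=5)

print('range: {:.6f}s  generator: {:.6f}s'.format(fast, slow))
```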
The SimpleHTTPServer module has been merged into http.server in Python 3.0. The 2to3 tool will automatically adapt imports when converting your sources to 3.0.
So, your command is python -m http.server, or depending on your installation, it can be:
How do I force Python’s print function to output to the screen?
This is not a duplicate of Disable output buffering – the linked question is attempting unbuffered output, while this is more general. The top answers in that question are too powerful or involved for this one (they’re not good answers for this), and this question can be found on Google by a relative newbie.
Since Python 3.3, you can force the normal print() function to flush without the need to use sys.stdout.flush(); just set the “flush” keyword argument to true. From the documentation:
Print objects to the stream file, separated by sep and followed by end. sep, end and file, if present, must be given as keyword arguments.
All non-keyword arguments are converted to strings like str() does and written to the stream, separated by sep and followed by end. Both sep and end must be strings; they can also be None, which means to use the default values. If no objects are given, print() will just write end.
The file argument must be an object with a write(string) method; if it is not present or None, sys.stdout will be used. Whether output is buffered is usually determined by file, but if the flush keyword argument is true, the stream is forcibly flushed.
In Python 3, call print(..., flush=True) (the flush argument is not available in Python 2’s print function, and there is no analogue for the print statement).
Call file.flush() on the output file (we can wrap python 2’s print function to do this), for example, sys.stdout
apply this to every print function call in the module with a partial function, print = partial(print, flush=True) applied to the module global.
apply this to the process with a flag (-u) passed to the interpreter command
apply this to every python process in your environment with PYTHONUNBUFFERED=TRUE (and unset the variable to undo this).
Python 3.3+
Using Python 3.3 or higher, you can just provide flush=True as a keyword argument to the print function:
print('foo', flush=True)
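To see that the keyword really triggers a flush on the target file, here is a sketch with a hypothetical CountingStream class that records how often flush() is called:

```python
import io

class CountingStream(io.StringIO):
    """Hypothetical helper: an in-memory stream that counts flush() calls."""

    def __init__(self):
        super().__init__()
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        super().flush()

stream = CountingStream()

print('foo', file=stream)              # no flush requested
without_flush = stream.flush_calls

print('bar', file=stream, flush=True)  # print() calls stream.flush()
with_flush = stream.flush_calls

print(without_flush, with_flush)
```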
Python 2 (or < 3.3)
They did not backport the flush argument to Python 2.7. So if you’re using Python 2 (or less than 3.3), and want code that’s compatible with both 2 and 3, may I suggest the following compatibility code. (Note the __future__ import must be at/very “near the top of your module”):
from __future__ import print_function
import sys

if sys.version_info[:2] < (3, 3):
    old_print = print
    def print(*args, **kwargs):
        flush = kwargs.pop('flush', False)
        old_print(*args, **kwargs)
        if flush:
            file = kwargs.get('file', sys.stdout)
            # Why might file=None? IDK, but it works for print(i, file=None)
            file.flush() if file is not None else sys.stdout.flush()
The above compatibility code will cover most uses, but for a much more thorough treatment, see the six module.
Alternatively, you can just call file.flush() after printing, for example, with the print statement in Python 2:
Note again, this only changes the current global scope, because the print name on the current global scope will overshadow the builtin print function (or unreference the compatibility function, if using one in Python 2, in that current global scope).
If you want to do this inside a function instead of on a module’s global scope, you should give it a different name, e.g.:
import functools

def foo():
    printf = functools.partial(print, flush=True)
    printf('print stuff like this')
If you declare it a global in a function, you’re changing it on the module’s global namespace, so you should just put it in the global namespace, unless that specific behavior is exactly what you want.
Changing the default for the process
I think the best option here is to use the -u flag to get unbuffered output.
Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode.
Note that there is internal buffering in file.readlines() and File Objects (for line in sys.stdin) which is not influenced by this option. To work around this, you will want to use file.readline() inside a while 1: loop.
Changing the default for the shell operating environment
You can get this behavior for all python processes in the environment or environments that inherit from the environment if you set the environment variable to a nonempty string:
If this is set to a non-empty string it is equivalent to specifying the -u option.
Addendum
Here’s the help on the print function from Python 2.7.12; note that there is no flush argument:
>>> from __future__ import print_function
>>> help(print)
print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout)

    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file: a file-like object (stream); defaults to the current sys.stdout.
    sep:  string inserted between values, default a space.
    end:  string appended after the last value, default a newline.
Using the -u command-line switch works, but it is a little bit clumsy. It would mean that the program would potentially behave incorrectly if the user invoked the script without the -u option. I usually use a custom stdout, like this:
class flushfile:
    def __init__(self, f):
        self.f = f

    def write(self, x):
        self.f.write(x)
        self.f.flush()

import sys
sys.stdout = flushfile(sys.stdout)
… Now all your print calls (which use sys.stdout implicitly), will be automatically flushed.
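One possible refinement, shown here only as a sketch: a wrapper that exposes nothing but write() can trip up code that expects other stream attributes (encoding, isatty(), and so on). The variant below forwards unknown attributes to the wrapped stream. The names FlushingWriter and CountingBuffer are invented for this example; CountingBuffer exists only to make the flushing observable.

```python
import io

class FlushingWriter:
    """Wrap a stream, flush after every write, delegate everything else."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        count = self._stream.write(text)
        self._stream.flush()  # push the data out immediately
        return count

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself,
        # e.g. encoding, isatty, fileno, getvalue.
        return getattr(self._stream, name)

class CountingBuffer(io.StringIO):
    """In-memory stream that counts flush() calls (demo helper)."""

    def __init__(self):
        super().__init__()
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        super().flush()

buffer = CountingBuffer()
writer = FlushingWriter(buffer)
print("hello", file=writer)  # print() calls writer.write() under the hood
```

To apply it globally you would assign sys.stdout = FlushingWriter(sys.stdout), just as with the simpler class above.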
For more advanced Enum techniques try the aenum library (2.7, 3.3+, same author as enum34. Code is not perfectly compatible between py2 and py3, e.g. you’ll need __order__ in python 2).
To use enum34, do $ pip install enum34
To use aenum, do $ pip install aenum
Installing enum (no numbers) will install a completely different and incompatible version.
from enum import Enum # for enum34, or the stdlib version
# from aenum import Enum # for the aenum version
Animal = Enum('Animal', 'ant bee cat dog')
Animal.ant # returns <Animal.ant: 1>
Animal['ant'] # returns <Animal.ant: 1> (string lookup)
Animal.ant.name # returns 'ant' (inverse lookup)
or equivalently:
class Animal(Enum):
    ant = 1
    bee = 2
    cat = 3
    dog = 4
In earlier versions, one way of accomplishing enums is:
Support for converting the values back to names can be added this way:
def enum(*sequential, **named):
    enums = dict(zip(sequential, range(len(sequential))), **named)
    reverse = dict((value, key) for key, value in enums.iteritems())
    enums['reverse_mapping'] = reverse
    return type('Enum', (), enums)
This overwrites anything with that name, but it is useful for rendering your enums in output. It will throw KeyError if the reverse mapping doesn’t exist. With the first example:
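For readers on Python 3, here is a sketch of the same helper with dict.items() in place of iteritems(), plus a short usage run. The Numbers enum and its member names are invented for this illustration.

```python
def enum(*sequential, **named):
    # Positional names get 0, 1, 2, ...; keyword names keep their given values.
    enums = dict(zip(sequential, range(len(sequential))), **named)
    # Map each value back to its name for pretty-printing.
    reverse = {value: key for key, value in enums.items()}
    enums['reverse_mapping'] = reverse
    return type('Enum', (), enums)

Numbers = enum('ZERO', 'ONE', 'TWO', TEN=10)  # hypothetical example enum

two = Numbers.TWO
name_of_two = Numbers.reverse_mapping[two]

try:
    Numbers.reverse_mapping[99]
    missing_raises = False
except KeyError:
    missing_raises = True
```

Note that two names sharing one value would collapse to a single entry in reverse_mapping, since it is an ordinary dict.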
Before PEP 435, Python didn’t have an equivalent but you could implement your own.
Myself, I like keeping it simple (I’ve seen some horribly complex examples on the net), something like this …
class Animal:
    DOG = 1
    CAT = 2
x = Animal.DOG
In Python 3.4 (PEP 435), you can make Enum the base class. This gets you a little bit of extra functionality, described in the PEP. For example, enum members are distinct from integers, and they are composed of a name and a value.
class Animal(Enum):
    DOG = 1
    CAT = 2
print(Animal.DOG)
# Animal.DOG
print(repr(Animal.DOG))
# <Animal.DOG: 1>
print(Animal.DOG.value)
# 1
print(Animal.DOG.name)
# DOG
If you don’t want to type the values, use the following shortcut:
class Animal(Enum):
    DOG, CAT = range(2)
Enum implementations can be converted to lists and are iterable. The order of its members is the declaration order and has nothing to do with their values. For example:
class Animal(Enum):
    DOG = 1
    CAT = 2
    COW = 0
list(Animal)
# [<Animal.DOG: 1>, <Animal.CAT: 2>, <Animal.COW: 0>]
[animal.value for animal in Animal]
# [1, 2, 0]
Animal.CAT in Animal
# True
Answer 2
Here is one implementation:
class Enum(set):
    def __getattr__(self, name):
        if name in self:
            return name
        raise AttributeError
If you need the numeric values, here’s the quickest way:
dog, cat, rabbit = range(3)
In Python 3.x you can also add a starred placeholder at the end, which will soak up all the remaining values of the range in case you don’t mind wasting memory and cannot count:
The best solution for you would depend on what you require from your fake enum.
Simple enum:
If you need the enum as only a list of names identifying different items, the solution by Mark Harrison (above) is great:
Pen, Pencil, Eraser = range(0, 3)
Using a range also allows you to set any starting value:
Pen, Pencil, Eraser = range(9, 12)
In addition to the above, if you also require that the items belong to a container of some sort, then embed them in a class:
class Stationery:
    Pen, Pencil, Eraser = range(0, 3)
To use the enum item, you would now need to use the container name and the item name:
stype = Stationery.Pen
Complex enum:
For long lists of enum or more complicated uses of enum, these solutions will not suffice. You could look to the recipe by Will Ware for Simulating Enumerations in Python published in the Python Cookbook. An online version of that is available here.
The typesafe enum pattern which was used in Java pre-JDK 5 has a
number of advantages. Much like in Alexandru’s answer, you create a
class and class level fields are the enum values; however, the enum
values are instances of the class rather than small integers. This has
the advantage that your enum values don’t inadvertently compare equal
to small integers, you can control how they’re printed, add arbitrary
methods if that’s useful and make assertions using isinstance:
class Animal:
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return self.name
    def __repr__(self):
        return "<Animal: %s>" % self
Animal.DOG = Animal("dog")
Animal.CAT = Animal("cat")
>>> x = Animal.DOG
>>> x
<Animal: dog>
>>> x == 1
False
A recent thread on python-dev pointed out there are a couple of enum libraries in the wild, including:
>>> State = Enum(['Unclaimed', 'Claimed'])
>>> State.Claimed
1
>>> State[1]
'Claimed'
>>> State
('Unclaimed', 'Claimed')
>>> range(len(State))
[0, 1]
>>> [(k, State[k]) for k in range(len(State))]
[(0, 'Unclaimed'), (1, 'Claimed')]
>>> [(k, getattr(State, k)) for k in State]
[('Unclaimed', 0), ('Claimed', 1)]
Python doesn’t have a built-in equivalent to enum, and other answers have ideas for implementing your own (you may also be interested in the over the top version in the Python cookbook).
However, in situations where an enum would be called for in C, I usually end up just using simple strings: because of the way objects/attributes are implemented, (C)Python is optimized to work very fast with short strings anyway, so there wouldn’t really be any performance benefit to using integers. To guard against typos / invalid values you can insert checks in selected places.
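A small sketch of what such a check might look like. The state names, the VALID_STATES set and the set_state helper are all invented for this illustration and do not come from the answer itself.

```python
# Hypothetical set of allowed string "enum" values.
VALID_STATES = frozenset({"draft", "published", "retracted"})

def set_state(document, state):
    """Store a state on a dict-like document, rejecting unknown strings."""
    if state not in VALID_STATES:
        raise ValueError(
            "invalid state %r; expected one of %s" % (state, sorted(VALID_STATES))
        )
    document["state"] = state
    return document

doc = set_state({}, "draft")

try:
    set_state({}, "drafft")  # typo is caught at the boundary
    typo_rejected = False
except ValueError:
    typo_rejected = True
```

Placing the check where values enter the system (function arguments, parsed input) keeps the rest of the code free to compare plain strings.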
On 2013-05-10, Guido agreed to accept PEP 435 into the Python 3.4 standard library. This means that Python finally has builtin support for enumerations!
There is a backport available for Python 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4. It’s on PyPI as enum34.
Declaration:
>>> from enum import Enum
>>> class Color(Enum):
...     red = 1
...     green = 2
...     blue = 3
class Animal:
    class Dog: pass
    class Cat: pass
x = Animal.Dog
It’s more bug-proof than using integers since you don’t have to worry about ensuring that the integers are unique (e.g. if you said Dog = 1 and Cat = 1 you’d be screwed).
It’s more bug-proof than using strings since you don’t have to worry about typos (e.g. x == "catt" fails silently, but x == Animal.Catt is a runtime exception).
Answer 11
def M_add_class_attribs(attribs):
    def foo(name, bases, dict_):
        for v, k in attribs:
            dict_[k] = v
        return type(name, bases, dict_)
    return foo
def enum(*names):
    class Foo(object):
        __metaclass__ = M_add_class_attribs(enumerate(names))
        def __setattr__(self, name, value):  # this makes it read-only
            raise NotImplementedError
    return Foo()
Hmmm… I suppose the closest thing to an enum would be a dictionary, defined either like this:
months = {
    'January': 1,
    'February': 2,
    ...
}
or
months = dict(
    January=1,
    February=2,
    ...
)
Then, you can use the symbolic name for the constants like this:
mymonth = months['January']
There are other options, like a list of tuples, or a tuple of tuples, but the dictionary is the only one that provides you with a “symbolic” (constant string) way to access the value.
Edit: I like Alexandru’s answer too!
Answer 13
Another very simple Python enum implementation, using namedtuple:
from collections import namedtuple
def enum(*keys):
    return namedtuple('Enum', keys)(*keys)
MyEnum = enum('FOO', 'BAR', 'BAZ')
or, alternatively,
# With sequential number values
def enum(*keys):
    return namedtuple('Enum', keys)(*range(len(keys)))
# From a dict / keyword args
def enum(**kwargs):
    return namedtuple('Enum', kwargs.keys())(*kwargs.values())
Like the method above that subclasses set, this allows:
'FOO' in MyEnum
other = MyEnum.FOO
assert other == MyEnum.FOO
Enumerations are created using the class syntax, which makes them easy
to read and write. An alternative creation method is described in
Functional API. To define an enumeration, subclass Enum as follows:
from enum import Enum
class Color(Enum):
    red = 1
    green = 2
    blue = 3
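To make the class syntax a little more concrete, here is a brief sketch of how members of such an enumeration can be looked up and iterated with the standard-library enum module (Python 3.4+). The variable names are only illustrative.

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3

by_value = Color(2)        # lookup by value
by_name = Color['blue']    # lookup by name
names = [member.name for member in Color]  # iteration follows definition order
equals_int = (Color.red == 1)  # plain Enum members do not compare equal to ints
```

Lookups return the existing member objects, so identity comparison with is works as expected.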
Answer 15
What I use:
class Enum(object):
    def __init__(self, names, separator=None):
        self.names = names.split(separator)
        for value, name in enumerate(self.names):
            setattr(self, name.upper(), value)
    def tuples(self):
        return tuple(enumerate(self.names))
How to use:
>>> state = Enum('draft published retracted')
>>> state.DRAFT
0
>>> state.RETRACTED
2
>>> state.FOO
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Enum' object has no attribute 'FOO'
>>> state.tuples()
((0, 'draft'), (1, 'published'), (2, 'retracted'))
It gives you a class, and the class contains all the enums. The enums can be compared to each other, but don’t have any particular value; you can’t use them as an integer value. (I resisted this at first because I am used to C enums, which are integer values. But if you can’t use it as an integer, you can’t use it as an integer by mistake so overall I think it is a win.) Each enum is a unique value. You can print enums, you can iterate over them, you can test that an enum value is “in” the enum. It’s pretty complete and slick.
Edit (cfi): The above link is not Python 3 compatible. Here’s my port of enum.py to Python 3:
def cmp(a, b):
    if a < b: return -1
    if b < a: return 1
    return 0
def Enum(*names):
    ##assert names, "Empty enums are not supported" # <- Don't like empty enums? Uncomment!
    class EnumClass(object):
        __slots__ = names
        def __iter__(self): return iter(constants)
        def __len__(self): return len(constants)
        def __getitem__(self, i): return constants[i]
        def __repr__(self): return 'Enum' + str(names)
        def __str__(self): return 'enum ' + str(constants)
    class EnumValue(object):
        __slots__ = ('__value')
        def __init__(self, value): self.__value = value
        Value = property(lambda self: self.__value)
        EnumType = property(lambda self: EnumType)
        def __hash__(self): return hash(self.__value)
        def __cmp__(self, other):
            # C fans might want to remove the following assertion
            # to make all enums comparable by ordinal value {;))
            assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
            return cmp(self.__value, other.__value)
        def __lt__(self, other): return self.__cmp__(other) < 0
        def __eq__(self, other): return self.__cmp__(other) == 0
        def __invert__(self): return constants[maximum - self.__value]
        def __nonzero__(self): return bool(self.__value)
        def __repr__(self): return str(names[self.__value])
    maximum = len(names) - 1
    constants = [None] * len(names)
    for i, each in enumerate(names):
        val = EnumValue(i)
        setattr(EnumClass, each, val)
        constants[i] = val
    constants = tuple(constants)
    EnumType = EnumClass()
    return EnumType
if __name__ == '__main__':
    print('\n*** Enum Demo ***')
    print('--- Days of week ---')
    Days = Enum('Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su')
    print(Days)
    print(Days.Mo)
    print(Days.Fr)
    print(Days.Mo < Days.Fr)
    print(list(Days))
    for each in Days:
        print('Day:', each)
    print('--- Yes/No ---')
    Confirmation = Enum('No', 'Yes')
    answer = Confirmation.No
    print('Your answer is not', ~answer)
I have had occasion to need an Enum class, for the purpose of decoding a binary file format. The features I happened to want were a concise enum definition, the ability to freely create instances of the enum by either integer value or string, and a useful representation. Here’s what I ended up with:
>>> class Enum(int):
...     def __new__(cls, value):
...         if isinstance(value, str):
...             return getattr(cls, value)
...         elif isinstance(value, int):
...             return cls.__index[value]
...     def __str__(self): return self.__name
...     def __repr__(self): return "%s.%s" % (type(self).__name__, self.__name)
...     class __metaclass__(type):
...         def __new__(mcls, name, bases, attrs):
...             attrs['__slots__'] = ['_Enum__name']
...             cls = type.__new__(mcls, name, bases, attrs)
...             cls._Enum__index = _index = {}
...             for base in reversed(bases):
...                 if hasattr(base, '_Enum__index'):
...                     _index.update(base._Enum__index)
...             # create all of the instances of the new class
...             for attr in attrs.keys():
...                 value = attrs[attr]
...                 if isinstance(value, int):
...                     evalue = int.__new__(cls, value)
...                     evalue._Enum__name = attr
...                     _index[value] = evalue
...                     setattr(cls, attr, evalue)
...             return cls
...
A whimsical example of using it:
>>> class Citrus(Enum):
...     Lemon = 1
...     Lime = 2
...
>>> Citrus.Lemon
Citrus.Lemon
>>>
>>> Citrus(1)
Citrus.Lemon
>>> Citrus(5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 6, in __new__
KeyError: 5
>>> class Fruit(Citrus):
...     Apple = 3
...     Banana = 4
...
>>> Fruit.Apple
Fruit.Apple
>>> Fruit.Lemon
Citrus.Lemon
>>> Fruit(1)
Citrus.Lemon
>>> Fruit(3)
Fruit.Apple
>>> "%d %s %r" % ((Fruit.Apple,)*3)
'3 Apple Fruit.Apple'
>>> Fruit(1) is Citrus.Lemon
True
Key features:
str(), int() and repr() all produce the most useful output possible, respectively the name of the enumeration, its integer value, and a Python expression that evaluates back to the enumeration.
Enumerated values returned by the constructor are limited strictly to the predefined values, no accidental enum values.
Enumerated values are singletons; they can be strictly compared with is
>>> from flufl.enum import Enum
>>> class Colors(Enum):
...     red = 1
...     green = 2
...     blue = 3
>>> for color in Colors: print color
Colors.red
Colors.green
Colors.blue
Answer 21
def enum(*sequential, **named):
    enums = dict(zip(sequential, [object() for _ in range(len(sequential))]), **named)
    return type('Enum', (), enums)
When using other implementations cited here (also when using named instances in my example) you must be sure you never try to compare objects from different enums. Here’s a possible pitfall:
>>> Numbers = enum_base(int, ONE=1, TWO=2, THREE=3)
>>> Numbers.ONE
1
>>> x = Numbers.TWO
>>> 10 + x
12
>>> type(Numbers)
<type 'type'>
>>> type(Numbers.ONE)
<class 'Enum'>
>>> isinstance(x, Numbers)
True
It’s elegant and clean looking, but it’s just a function that creates a class with the specified attributes.
With a little modification to the function, we can get it to act a little more ‘enumy’:
NOTE: I created the following examples by trying to reproduce the
behavior of pygtk’s new style ‘enums’ (like Gtk.MessageType.WARNING)
def enum_base(t, **enums):
    '''enums with a base class'''
    T = type('Enum', (t,), {})
    for key, val in enums.items():
        setattr(T, key, T(val))
    return T
This creates an enum based off a specified type. In addition to giving attribute access like the previous function, it behaves as you would expect an Enum to with respect to types. It also inherits the base class.
Another interesting thing that can be done with this method is to customize specific behavior by overriding built-in methods:
def enum_repr(t, **enums):
    '''enums with a base class and repr() output'''
    class Enum(t):
        def __repr__(self):
            return '<enum {0} of type Enum({1})>'.format(self._name, t.__name__)
    for key, val in enums.items():
        i = Enum(val)
        i._name = key
        setattr(Enum, key, i)
    return Enum
>>> Numbers = enum_repr(int, ONE=1, TWO=2, THREE=3)
>>> repr(Numbers.ONE)
'<enum ONE of type Enum(int)>'
>>> str(Numbers.ONE)
'1'
The enum package from PyPI provides a robust implementation of enums. An earlier answer mentioned PEP 354; this was rejected, but the proposal was implemented: http://pypi.python.org/pypi/enum.
Alexandru’s suggestion of using class constants for enums works quite well.
I also like to add a dictionary for each set of constants to lookup a human-readable string representation.
This serves two purposes: a) it provides a simple way to pretty-print your enum and b) the dictionary logically groups the constants so that you can test for membership.
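A minimal sketch of that pattern, under the assumption that the constants live on a plain class; the Shape class, its members and the describe helper are invented for this illustration.

```python
class Shape:
    # Class-level constants acting as enum values.
    CIRCLE = 0
    SQUARE = 1
    TRIANGLE = 2

    # Companion dictionary: value -> human-readable name.
    NAMES = {
        CIRCLE: "circle",
        SQUARE: "square",
        TRIANGLE: "triangle",
    }

def describe(shape):
    """Pretty-print a Shape constant, rejecting values outside the set."""
    if shape not in Shape.NAMES:  # membership test via the dictionary
        raise ValueError("unknown shape: %r" % (shape,))
    return Shape.NAMES[shape]

label = describe(Shape.SQUARE)
is_member = 7 in Shape.NAMES
```

The dictionary has to be kept in sync with the constants by hand, which is the main cost of this approach.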
Many doctests included here to illustrate what’s different about this approach.
def enum(*names):
    """
SYNOPSIS
Well-behaved enumerated type, easier than creating custom classes
DESCRIPTION
Create a custom type that implements an enumeration. Similar in concept
to a C enum but with some additional capabilities and protections. See
http://code.activestate.com/recipes/413486-first-class-enums-in-python/.
PARAMETERS
names Ordered list of names. The order in which names are given
will be the sort order in the enum type. Duplicate names
are not allowed. Unicode names are mapped to ASCII.
RETURNS
Object of type enum, with the input names and the enumerated values.
EXAMPLES
>>> letters = enum('a','e','i','o','u','b','c','y','z')
>>> letters.a < letters.e
True
## index by property
>>> letters.a
a
## index by position
>>> letters[0]
a
## index by name, helpful for bridging string inputs to enum
>>> letters['a']
a
## sorting by order in the enum() create, not character value
>>> letters.u < letters.b
True
## normal slicing operations available
>>> letters[-1]
z
## error since there are not 100 items in enum
>>> letters[99]
Traceback (most recent call last):
...
IndexError: tuple index out of range
## error since name does not exist in enum
>>> letters['ggg']
Traceback (most recent call last):
...
ValueError: tuple.index(x): x not in tuple
## enums must be named using valid Python identifiers
>>> numbers = enum(1,2,3,4)
Traceback (most recent call last):
...
AssertionError: Enum values must be string or unicode
>>> a = enum('-a','-b')
Traceback (most recent call last):
...
TypeError: Error when calling the metaclass bases
__slots__ must be identifiers
## create another enum
>>> tags = enum('a','b','c')
>>> tags.a
a
>>> letters.a
a
## can't compare values from different enums
>>> letters.a == tags.a
Traceback (most recent call last):
...
AssertionError: Only values from the same enum are comparable
>>> letters.a < tags.a
Traceback (most recent call last):
...
AssertionError: Only values from the same enum are comparable
## can't update enum after create
>>> letters.a = 'x'
Traceback (most recent call last):
...
AttributeError: 'EnumClass' object attribute 'a' is read-only
## can't update enum after create
>>> del letters.u
Traceback (most recent call last):
...
AttributeError: 'EnumClass' object attribute 'u' is read-only
## can't have non-unique enum values
>>> x = enum('a','b','c','a')
Traceback (most recent call last):
...
AssertionError: Enums must not repeat values
## can't have zero enum values
>>> x = enum()
Traceback (most recent call last):
...
AssertionError: Empty enums are not supported
## can't have enum values that look like special function names
## since these could collide and lead to non-obvious errors
>>> x = enum('a','b','c','__cmp__')
Traceback (most recent call last):
...
AssertionError: Enum values beginning with __ are not supported
LIMITATIONS
Enum values of unicode type are not preserved, mapped to ASCII instead.
"""
    ## must have at least one enum value
    assert names, 'Empty enums are not supported'
    ## enum values must be strings
    assert len([i for i in names if not isinstance(i, types.StringTypes) and not \
        isinstance(i, unicode)]) == 0, 'Enum values must be string or unicode'
    ## enum values must not collide with special function names
    assert len([i for i in names if i.startswith("__")]) == 0,\
        'Enum values beginning with __ are not supported'
    ## each enum value must be unique from all others
    assert names == uniquify(names), 'Enums must not repeat values'
    class EnumClass(object):
        """ See parent function for explanation """
        __slots__ = names
        def __iter__(self):
            return iter(constants)
        def __len__(self):
            return len(constants)
        def __getitem__(self, i):
            ## this makes xx['name'] possible
            if isinstance(i, types.StringTypes):
                i = names.index(i)
            ## handles the more normal xx[0]
            return constants[i]
        def __repr__(self):
            return 'enum' + str(names)
        def __str__(self):
            return 'enum ' + str(constants)
        def index(self, i):
            return names.index(i)
    class EnumValue(object):
        """ See parent function for explanation """
        __slots__ = ('__value')
        def __init__(self, value):
            self.__value = value
        value = property(lambda self: self.__value)
        enumtype = property(lambda self: enumtype)
        def __hash__(self):
            return hash(self.__value)
        def __cmp__(self, other):
            assert self.enumtype is other.enumtype, 'Only values from the same enum are comparable'
            return cmp(self.value, other.value)
        def __invert__(self):
            return constants[maximum - self.value]
        def __nonzero__(self):
            ## return bool(self.value)
            ## Original code led to bool(x[0])==False, not correct
            return True
        def __repr__(self):
            return str(names[self.value])
    maximum = len(names) - 1
    constants = [None] * len(names)
    for i, each in enumerate(names):
        val = EnumValue(i)
        setattr(EnumClass, each, val)
        constants[i] = val
    constants = tuple(constants)
    enumtype = EnumClass()
    return enumtype
While the original enum proposal, PEP 354, was rejected years ago, it keeps coming back up. Some kind of enum was intended to be added to 3.2, but it got pushed back to 3.3 and then forgotten. And now there’s a PEP 435 intended for inclusion in Python 3.4. The reference implementation of PEP 435 is flufl.enum.
As of April 2013, there seems to be a general consensus that something should be added to the standard library in 3.4—as long as people can agree on what that “something” should be. That’s the hard part. See the threads starting here and here, and a half dozen other threads in the early months of 2013.
Meanwhile, every time this comes up, a slew of new designs and implementations appear on PyPI, ActiveState, etc., so if you don’t like the FLUFL design, try a PyPI search.
Answer 29
Use the following.
TYPE = {'EAN13': u'EAN-13',
        'CODE39': u'Code 39',
        'CODE128': u'Code 128',
        'i25': u'Interleaved 2 of 5',}
>>> TYPE.items()
[('EAN13', u'EAN-13'), ('i25', u'Interleaved 2 of 5'), ('CODE39', u'Code 39'), ('CODE128', u'Code 128')]
>>> TYPE.keys()
['EAN13', 'i25', 'CODE39', 'CODE128']
>>> TYPE.values()
[u'EAN-13', u'Interleaved 2 of 5', u'Code 39', u'Code 128']
setup.py is a python file, which usually tells you that the module/package you are about to install has been packaged and distributed with Distutils, which is the standard for distributing Python Modules.
This allows you to easily install Python packages. Often it’s enough to write:
$ pip install .
pip will use setup.py to install your module. Avoid calling setup.py directly.
from setuptools import setup
setup(
    name='foo',
    version='1.0',
    description='A useful module',
    author='Man Foo',
    author_email='foomail@foo.com',
    packages=['foo'],  # same as name
    install_requires=['bar', 'greek'],  # external packages as dependencies
)
It helps to install a python package foo on your machine (can also be in virtualenv) so that you can import the package foo from other projects and also from [I]Python prompts.
It does a job similar to that of pip, easy_install, etc.
Using setup.py
Let’s start with some definitions:
Package: A folder/directory that contains an __init__.py file.
Module: A valid Python file with a .py extension.
Distribution: How one package relates to other packages and modules.
Let’s say you want to install a package named foo. Then you do,
If, instead, you don’t want to actually install it but would still like to use it, then do:
$ python setup.py develop
This command will create symlinks to the source directory within site-packages instead of copying things. Because of this, it is quite fast (particularly for large packages).
from setuptools import setup
setup(
    name='foo',
    version='1.0',
    description='A useful module',
    author='Man Foo',
    author_email='foomail@foo.com',
    packages=['foo'],  # same as name
    install_requires=['bar', 'greek'],  # external packages as dependencies
    scripts=[
        'scripts/cool',
        'scripts/skype',
    ]
)
Add more detail to setup.py to round it out:
from setuptools import setup

with open("README", 'r') as f:
    long_description = f.read()

setup(
    name='foo',
    version='1.0',
    description='A useful module',
    license="MIT",
    long_description=long_description,
    author='Man Foo',
    author_email='foomail@foo.com',
    url="http://www.foopackage.com/",
    packages=['foo'],  # same as name
    install_requires=['bar', 'greek'],  # external packages as dependencies
    scripts=[
        'scripts/cool',
        'scripts/skype',
    ]
)
The long_description is shown on pypi.org as the README description of your package.
Finally, you’re ready to upload your package to PyPI so that others can install it using pip install yourpackage.
The first step is to claim your package name and space on PyPI using:
$ python setup.py register
Once your package name is registered, nobody else can claim or use it. After successful registration, upload your package there (to the cloud) with:
$ python setup.py upload
Optionally, you can also sign your package with GPG by,
setup.py is Python’s answer to a multi-platform installer and makefile.
If you’re familiar with command-line installations, then make && make install translates to python setup.py build && python setup.py install.
Some packages are pure Python and are only byte-compiled. Others may contain native code, which requires a native compiler (like gcc or cl) and a Python interfacing module (like swig or pyrex).
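As a small illustration of what "byte-compiled" means for a pure-Python module, the sketch below compiles a one-line source file with the standard library's py_compile module. The file name pure_module.py is a placeholder, and this shows the byte-compilation step in isolation rather than what any particular setup.py runs.

```python
import os
import py_compile
import tempfile

# Write a tiny pure-Python module to a temporary directory.
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "pure_module.py")  # placeholder name
with open(source_path, "w") as f:
    f.write("ANSWER = 6 * 7\n")

# py_compile.compile returns the path of the bytecode file it wrote;
# doraise=True turns compile errors into exceptions instead of messages.
bytecode_path = py_compile.compile(source_path, doraise=True)
bytecode_exists = os.path.exists(bytecode_path)
```

No compiler toolchain is involved here, which is the practical difference from packages that ship native code.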
setup.py is a Python script that is usually shipped with libraries or programs written in that language. Its purpose is the correct installation of the software.
Many packages use the distutils framework in conjunction with setup.py.
setup.py can be used in two scenarios: first, when you want to install a Python package; second, when you want to create your own Python package. A standard Python package usually has a couple of important files, such as setup.py, setup.cfg and MANIFEST.in. When you create the package, these three files determine the content of PKG-INFO under the egg-info folder: name, version, description, other required installations (usually listed in a .txt file) and a few other parameters. setup.cfg is read by setup.py while the package is created (for example, as a tar.gz). MANIFEST.in is where you define what should be included in your package. In any case, you can do a bunch of things with setup.py, like:
python setup.py build
python setup.py install
python setup.py sdist <distname> upload [-r urltorepo] (to upload package to pypi or local repo)
There are a bunch of other commands that can be used with setup.py. For help, run:
python setup.py --help-commands
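The PKG-INFO file mentioned above stores its metadata as email-style "Key: value" header lines, so it can be read with the standard library's email parser. The sample text below is made up for illustration (it mirrors the foo example used elsewhere on this page) and is not taken from a real package.

```python
from email.parser import Parser

# Illustrative PKG-INFO content; these values are assumptions for the
# example, not the metadata of any real distribution.
PKG_INFO_TEXT = """\
Metadata-Version: 2.1
Name: foo
Version: 1.0
Summary: A useful module
Author: Man Foo
Requires-Dist: bar
Requires-Dist: greek
"""

metadata = Parser().parsestr(PKG_INFO_TEXT)

name = metadata["Name"]
version = metadata["Version"]
# A field may repeat, so get_all collects every occurrence in order.
requirements = metadata.get_all("Requires-Dist")
```

For an installed package, importlib.metadata offers a higher-level way to read the same fields.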
Answer 6
When you download a package that contains a setup.py, open your Terminal (Mac, Linux) or Command Prompt (Windows). Using cd, with the Tab key to help with completion, change to the folder where you downloaded the files and where setup.py is located:
To install a Python package you’ve downloaded, you extract the archive and run the setup.py script inside:
python setup.py install
To me, this has always felt odd. It would be more natural to point a package manager at the download, as one would do in Ruby and Node.js, e.g. gem install rails-4.1.1.gem
A package manager is more comfortable too, because it’s familiar and reliable. On the other hand, each setup.py is novel, because it’s specific to the package. It demands faith in convention “I trust this setup.py takes the same commands as others I have used in the past”. That’s a regrettable tax on mental willpower.
I’m not saying the setup.py workflow is less secure than a package manager (I understand pip just runs the setup.py inside), but I certainly feel it’s awkward and jarring. There’s a harmony to all commands going to the same package manager application. You might even grow fond of it.
setup.py is a Python file like any other. It could take any name, but by convention it is named setup.py so that the procedure is the same for every package.
Most frequently setup.py is used to install a Python module, but it serves other purposes too:
Modules:
Perhaps the most famous use of setup.py is installing modules. Although modules can be installed using pip, old Python versions did not include pip by default, so it had to be installed separately.
If you wanted to install a module but did not want to install pip, just about the only alternative was to install the module from its setup.py file. This could be achieved via python setup.py install, which would install the Python module to the root directory (without pip, easy_install, etc.).
This method is often used when pip fails. For example, if a build of the desired package for the correct Python version is not available via pip, perhaps because it is no longer maintained, downloading the source and running python setup.py install performs the same task, except when compiled binaries are required (it will, however, disregard the Python version unless an error is returned).
Another use of setup.py is to install a package from source. If a module is still under development, wheel files may not be available and the only way to install it is directly from the source.
Building Python extensions:
When a module has been written, it can be converted into a module ready for distribution using a distutils setup script. Once built, it can be installed using the command above.
A setup script is easy to write, and once the file has been properly configured it can be compiled by running python setup.py build (see link for all commands).
Once again, it is named setup.py for ease of use and by convention, but it can take any name.
Cython:
Another famous use of setup.py files is building compiled extensions. These require a setup script with user-defined values. They allow fast execution (but are platform dependent once compiled). Here is a simple example from the documentation:
from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = 'Hello world app',
    ext_modules = cythonize("hello.pyx"),
)
This can be compiled via python setup.py build
Cx_Freeze:
Another module requiring a setup script is cx_Freeze, which converts Python scripts to executables. The script accepts many options, such as descriptions, names, icons, and packages to include or exclude, and once run it produces a distributable application. An example from the documentation:
import sys
from cx_Freeze import setup, Executable

build_exe_options = {"packages": ["os"], "excludes": ["tkinter"]}

base = None
if sys.platform == "win32":
    base = "Win32GUI"

setup(  name = "guifoo",
        version = "0.1",
        description = "My GUI application!",
        options = {"build_exe": build_exe_options},
        executables = [Executable("guifoo.py", base=base)])
This can be compiled via python setup.py build.
So what is a setup.py file?
Quite simply it is a script that builds or configures something in the Python environment.
A distributed package should contain only one setup script, but it is not uncommon to combine several tasks into that single script. Notice that this often involves distutils, but not always (as my last example showed). The thing to remember is that it just configures a Python package or script in some way.
It takes the name so the same command can always be used when building or installing.
To make it simple, setup.py is run as "__main__" when you call the install functions the other answers mentioned. Inside setup.py, you should put everything needed to install your package.
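What "run as __main__" means can be seen with the standard library's runpy module. This sketch writes a small stand-in script (the file name setup_like.py is made up) and executes it the way a script is executed from the command line; it does not run a real setup.py.

```python
import os
import runpy
import tempfile

# A stand-in script that records the name it was executed under.
script_dir = tempfile.mkdtemp()
script_path = os.path.join(script_dir, "setup_like.py")  # hypothetical file
with open(script_path, "w") as f:
    f.write(
        "observed_name = __name__\n"
        "ran_as_script = False\n"
        "if __name__ == '__main__':\n"
        "    ran_as_script = True\n"
    )

# run_path executes the file and returns its resulting globals.
script_globals = runpy.run_path(script_path, run_name="__main__")
observed_name = script_globals["observed_name"]
ran_as_script = script_globals["ran_as_script"]
```

Anything placed at the top level of setup.py therefore executes whenever one of its commands is invoked.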
Common setup.py functions
The following two sections discuss two things many setup.py modules have.
setuptools.setup
This function allows you to specify project attributes such as the name of the project and the version. Most importantly, it allows you to install other packages if they’re packaged properly. See this webpage for an example of setuptools.setup
These attributes of setuptools.setup enable installing these types of packages:
In an ideal world, setuptools.setup would handle everything for you. Unfortunately, this isn’t always the case. Sometimes you have to do specific things, like installing dependencies with the subprocess module, to get the system you’re installing on into the right state for your package. Try to avoid this: such steps get confusing and often differ between operating systems and even distributions.
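If you do end up shelling out, a cautious pattern is to build the command as a list and invoke pip through the running interpreter. The helper names below are hypothetical, and the sketch only runs a harmless probe command rather than actually installing anything.

```python
import subprocess
import sys


def build_pip_install_command(requirement):
    """Return the argument list for installing one requirement with pip."""
    # sys.executable keeps the install tied to the running interpreter.
    return [sys.executable, "-m", "pip", "install", requirement]


def run_command(command):
    """Run a command, raising CalledProcessError on a non-zero exit."""
    completed = subprocess.run(
        command, check=True, capture_output=True, text=True
    )
    return completed.stdout


# Harmless stand-in so the sketch runs without installing anything.
probe_output = run_command([sys.executable, "-c", "print('ready')"])
install_command = build_pip_install_command("bar")
```

Passing a list instead of a shell string avoids quoting differences between operating systems, which is one of the sources of confusion mentioned above.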