import timefrom itertools import islice, izipdef pairs_1(t):return zip(t[::2], t[1::2])def pairs_2(t):return izip(t[::2], t[1::2])def pairs_3(t):return izip(islice(t,None,None,2), islice(t,1,None,2))
A = range(10000)
B = xrange(len(A))def pairs_4(t):# ignore value of t!
t = Breturn izip(islice(t,None,None,2), islice(t,1,None,2))for f in pairs_1, pairs_2, pairs_3, pairs_4:# time the pairing
s = time.time()for i in range(1000):
p = f(A)
t1 = time.time()- s# time using the pairs
s = time.time()for i in range(1000):
p = f(A)for a, b in p:pass
t2 = time.time()- sprint t1, t2, t2-t1
Often enough, I’ve found the need to process a list by pairs. I was wondering which would be the pythonic and efficient way to do it, and found this on Google:
pairs = zip(t[::2], t[1::2])
I thought that was pythonic enough, but after a recent discussion involving idioms versus efficiency, I decided to do some tests:
import time
from itertools import islice, izip
def pairs_1(t):
return zip(t[::2], t[1::2])
def pairs_2(t):
return izip(t[::2], t[1::2])
def pairs_3(t):
return izip(islice(t,None,None,2), islice(t,1,None,2))
A = range(10000)
B = xrange(len(A))
def pairs_4(t):
# ignore value of t!
t = B
return izip(islice(t,None,None,2), islice(t,1,None,2))
for f in pairs_1, pairs_2, pairs_3, pairs_4:
# time the pairing
s = time.time()
for i in range(1000):
p = f(A)
t1 = time.time() - s
# time using the pairs
s = time.time()
for i in range(1000):
p = f(A)
for a, b in p:
pass
t2 = time.time() - s
print t1, t2, t2-t1
If I’m interpreting them correctly, that should mean that the implementation of lists, list indexing, and list slicing in Python is very efficient. It’s a result both comforting and unexpected.
Is there another, “better” way of traversing a list in pairs?
Note that if the list has an odd number of elements then the last one will not be in any of the pairs.
Which would be the right way to ensure that all elements are included?
I added these two suggestions from the answers to the tests:
def pairwise(t):
it = iter(t)
return izip(it, it)
def chunkwise(t, size=2):
it = iter(t)
return izip(*[it]*size)
It took me a moment to grok that the first answer uses two iterators while the second uses a single one.
To deal with sequences with an odd number of elements, the suggestion has been to augment the original sequence adding one element (None) that gets paired with the previous last element, something that can be achieved with itertools.izip_longest().
Finally
Note that, in Python 3.x, zip() behaves as itertools.izip(), and itertools.izip() is gone.
回答 0
我最喜欢的方式:
from itertools import izipdef pairwise(t):
it = iter(t)return izip(it,it)# for "pairs" of any lengthdef chunkwise(t, size=2):
it = iter(t)return izip(*[it]*size)
当您要配对所有元素时,您显然可能需要一个fillvalue:
from itertools import izip_longestdef blockwise(t, size=2, fillvalue=None):
it = iter(t)return izip_longest(*[it]*size, fillvalue=fillvalue)
from itertools import izip
def pairwise(t):
it = iter(t)
return izip(it,it)
# for "pairs" of any length
def chunkwise(t, size=2):
it = iter(t)
return izip(*[it]*size)
When you want to pair all elements you obviously might need a fillvalue:
from itertools import izip_longest
def blockwise(t, size=2, fillvalue=None):
it = iter(t)
return izip_longest(*[it]*size, fillvalue=fillvalue)
I’d say that your initial solution pairs = zip(t[::2], t[1::2]) is the best one because it is easiest to read (and in Python 3, zip automatically returns an iterator instead of a list).
To ensure that all elements are included, you could simply extend the list by None.
Then, if the list has an odd number of elements, the last pair will be (item, None).
I start with small disclaimer – don’t use the code below. It’s not Pythonic at all, I wrote just for fun. It’s similar to @THC4k pairwise function but it uses iter and lambda closures. It doesn’t use itertools module and doesn’t support fillvalue. I put it here because someone might find it interesting:
As far as most pythonic goes, I’d say the recipes supplied in the python source docs (some of which look a lot like the answers that @JochenRitzel provided) is probably your best bet ;)
def grouper(iterable, n, fillvalue=None):
"Collect data into fixed-length chunks or blocks"
# grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
args = [iter(iterable)] * n
return izip_longest(fillvalue=fillvalue, *args)
Is there another, “better” way of traversing a list in pairs?
I can’t say for sure but I doubt it: Any other traversal would include more Python code which has to be interpreted. The built-in functions like zip() are written in C which is much faster.
Which would be the right way to ensure that all elements are included?
Check the length of the list and if it’s odd (len(list) & 1 == 1), copy the list and append an item.
回答 5
>>> my_list =[1,2,3,4,5,6,7,8,9,10]>>> my_pairs = list()>>>while(my_list):... a = my_list.pop(0); b = my_list.pop(0)... my_pairs.append((a,b))...>>>print(my_pairs)[(1,2),(3,4),(5,6),(7,8),(9,10)]
Here is an example of creating pairs/legs by using a generator. Generators are free from stack limits
def pairwise(data):
zip(data[::2], data[1::2])
Example:
print(list(pairwise(range(10))))
Output:
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
回答 8
万一有人需要明智的答案算法,这里是:
>>>def getPairs(list):... out =[]...for i in range(len(list)-1):... a = list.pop(0)...for j in a:... out.append([a, j])...return b>>>>>> k =[1,2,3,4]>>> l = getPairs(k)>>> l[[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]
Just in case someone needs the answer algorithm-wise, here it is:
>>> def getPairs(list):
... out = []
... for i in range(len(list)-1):
... a = list.pop(0)
... for j in a:
... out.append([a, j])
... return b
>>>
>>> k = [1, 2, 3, 4]
>>> l = getPairs(k)
>>> l
[[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
But take note that your original list will also be reduced to its last element, because you used pop on it.
What is the most idiomatic way to do the following?
def xstr(s):
if s is None:
return ''
else:
return s
s = xstr(a) + xstr(b)
update: I’m incorporating Tryptich’s suggestion to use str(s), which makes this routine work for other types besides strings. I’m awfully impressed by Vinay Sajip’s lambda suggestion, but I want to keep my code relatively simple.
def xstr(s):
if s is None:
return ''
else:
return str(s)
Variation on the above if you need to be compatible with Python 2.4
xstr = lambda s: s is not None and s or ''
回答 11
如果要格式化字符串,则可以执行以下操作:
from string importFormatterclassNoneAsEmptyFormatter(Formatter):def get_value(self, key, args, kwargs):
v = super().get_value(key, args, kwargs)return''if v isNoneelse v
fmt =NoneAsEmptyFormatter()
s = fmt.format('{}{}', a, b)
If it is about formatting strings, you can do the following:
from string import Formatter
class NoneAsEmptyFormatter(Formatter):
def get_value(self, key, args, kwargs):
v = super().get_value(key, args, kwargs)
return '' if v is None else v
fmt = NoneAsEmptyFormatter()
s = fmt.format('{}{}', a, b)
回答 12
def xstr(s):return s if s else''
s ="%s%s"%(xstr(a), xstr(b))
customer ="John"# even though its None still it will work properly.
name = customer
if name isNoneprint"Name is blank"else:print"Customer name : "+ str(name)
We can always avoid type casting in scenarios explained below.
customer = "John"
name = str(customer)
if name is None
print "Name is blank"
else:
print "Customer name : " + name
In the example above in case variable customer’s value is None the it further gets casting while getting assigned to ‘name’. The comparison in ‘if’ clause will always fail.
customer = "John" # even though its None still it will work properly.
name = customer
if name is None
print "Name is blank"
else:
print "Customer name : " + str(name)
Above example will work properly. Such scenarios are very common when values are being fetched from URL, JSON or XML or even values need further type casting for any manipulation.
I’d like to know the best way (more compact and “pythonic” way) to do a special treatment for the last element in a for loop. There is a piece of code that should be called only between elements, being suppressed in the last one.
Here is how I currently do it:
for i, data in enumerate(data_list):
code_that_is_done_for_every_element
if i != len(data_list) - 1:
code_that_is_done_between_elements
Is there any better way?
Note: I don’t want to make it with hacks such as using reduce. ;)
回答 0
在大多数情况下,使第一个迭代成为特殊情况而不是最后一个案例更容易(且更便宜):
first =Truefor data in data_list:if first:
first =Falseelse:
between_items()
item()
这将适用于任何可迭代的对象,即使对于那些没有len():
file = open('/path/to/file')for line in file:
process_line(line)# No way of telling if this is the last line!
def lookahead(iterable):"""Pass through all values from the given iterable, augmented by the
information if there are more values to come after the current one
(True), or if it is the last value (False).
"""# Get an iterator and pull the first value.
it = iter(iterable)
last = next(it)# Run the iterator to exhaustion (starting from the second value).for val in it:# Report the *previous* value (more to come).yield last,True
last = val
# Report the last value.yield last,False
然后,您可以像这样使用它:
>>>for i, has_more in lookahead(range(3)):...print(i, has_more)0True1True2False
Most of the times it is easier (and cheaper) to make the first iteration the special case instead of the last one:
first = True
for data in data_list:
if first:
first = False
else:
between_items()
item()
This will work for any iterable, even for those that have no len():
file = open('/path/to/file')
for line in file:
process_line(line)
# No way of telling if this is the last line!
Apart from that, I don’t think there is a generally superior solution as it depends on what you are trying to do. For example, if you are building a string from a list, it’s naturally better to use str.join() than using a for loop “with special case”.
Using the same principle but more compact:
for i, line in enumerate(data_list):
if i > 0:
between_items()
item()
Looks familiar, doesn’t it? :)
For @ofko, and others who really need to find out if the current value of an iterable without len() is the last one, you will need to look ahead:
def lookahead(iterable):
"""Pass through all values from the given iterable, augmented by the
information if there are more values to come after the current one
(True), or if it is the last value (False).
"""
# Get an iterator and pull the first value.
it = iter(iterable)
last = next(it)
# Run the iterator to exhaustion (starting from the second value).
for val in it:
# Report the *previous* value (more to come).
yield last, True
last = val
# Report the last value.
yield last, False
Then you can use it like this:
>>> for i, has_more in lookahead(range(3)):
... print(i, has_more)
0 True
1 True
2 False
Although that question is pretty old, I came here via google and I found a quite simple way: List slicing. Let’s say you want to put an ‘&’ between all list entries.
s = ""
l = [1, 2, 3]
for i in l[:-1]:
s = s + str(i) + ' & '
s = s + str(l[-1])
The ‘code between’ is an example of the Head-Tail pattern.
You have an item, which is followed by a sequence of ( between, item ) pairs. You can also view this as a sequence of (item, between) pairs followed by an item. It’s generally simpler to take the first element as special and all the others as the “standard” case.
Further, to avoid repeating code, you have to provide a function or other object to contain the code you don’t want to repeat. Embedding an if statement in a loop which is always false except one time is kind of silly.
def item_processing( item ):
# *the common processing*
head_tail_iter = iter( someSequence )
head = next(head_tail_iter)
item_processing( head )
for item in head_tail_iter:
# *the between processing*
item_processing( item )
This is more reliable because it’s slightly easier to prove, It doesn’t create an extra data structure (i.e., a copy of a list) and doesn’t require a lot of wasted execution of an if condition which is always false except once.
If you’re simply looking to modify the last element in data_list then you can simply use the notation:
L[-1]
However, it looks like you’re doing more than that. There is nothing really wrong with your way. I even took a quick glance at some Django code for their template tags and they do basically what you’re doing.
回答 4
如果项目是唯一的:
for x in list:#codeif x == list[-1]:#code
其他选择:
pos =-1for x in list:
pos +=1#codeif pos == len(list)-1:#codefor x in list:#code#code - e.g. print xif len(list)>0:for x in list[:-1]#codefor x in list[-1]:#code
pos = -1
for x in list:
pos += 1
#code
if pos == len(list) - 1:
#code
for x in list:
#code
#code - e.g. print x
if len(list) > 0:
for x in list[:-1]
#code
for x in list[-1]:
#code
def last_iter(it):# Ensure it's an iterator and get the first field
it = iter(it)
prev = next(it)for item in it:# Lag by one item so I know I'm not at the endyield0, prev
prev = item
# Last itemyield1, prev
def test(data):
result = list(last_iter(data))ifnot result:returnif len(result)>1:assert set(x[0]for x in result[:-1])== set([0]), result
assert result[-1][0]==1
test([])
test([1])
test([1,2])
test(range(5))
test(xrange(4))for is_last, item in last_iter("Hi!"):print is_last, item
This is similar to Ants Aasma’s approach but without using the itertools module. It’s also a lagging iterator which looks-ahead a single element in the iterator stream:
def last_iter(it):
# Ensure it's an iterator and get the first field
it = iter(it)
prev = next(it)
for item in it:
# Lag by one item so I know I'm not at the end
yield 0, prev
prev = item
# Last item
yield 1, prev
def test(data):
result = list(last_iter(data))
if not result:
return
if len(result) > 1:
assert set(x[0] for x in result[:-1]) == set([0]), result
assert result[-1][0] == 1
test([])
test([1])
test([1, 2])
test(range(5))
test(xrange(4))
for is_last, item in last_iter("Hi!"):
print is_last, item
from itertools import tee, izip, chain
def pairwise(seq):
a,b = tee(seq)
next(b,None)return izip(a,b)def annotated_last(seq):"""Returns an iterable of pairs of input item and a boolean that show if
the current item is the last item in the sequence."""
MISSING = object()for current_item, next_item in pairwise(chain(seq,[MISSING])):yield current_item, next_item is MISSING:for item, is_last_item in annotated_last(data_list):if is_last_item:# current item is the last item
You can use a sliding window over the input data to get a peek at the next value and use a sentinel to detect the last value. This works on any iterable, so you don’t need to know the length beforehand. The pairwise implementation is from itertools recipes.
from itertools import tee, izip, chain
def pairwise(seq):
a,b = tee(seq)
next(b, None)
return izip(a,b)
def annotated_last(seq):
"""Returns an iterable of pairs of input item and a boolean that show if
the current item is the last item in the sequence."""
MISSING = object()
for current_item, next_item in pairwise(chain(seq, [MISSING])):
yield current_item, next_item is MISSING:
for item, is_last_item in annotated_last(data_list):
if is_last_item:
# current item is the last item
Is there no possibility to iterate over all-but the last element, and treat the last one outside of the loop? After all, a loop is created to do something similar to all elements you loop over; if one element needs something special, it shouldn’t be in the loop.
EDIT: since the question is more about the “in between”, either the first element is the special one in that it has no predecessor, or the last element is special in that it has no successor.
回答 8
我喜欢@ ethan-t的方法,但是while True从我的角度来看很危险。
data_list =[1,2,3,2,1]# sample data
L = list(data_list)# destroy L instead of data_listwhile L:
e = L.pop(0)if L:print(f'process element {e}')else:print(f'process last element {e}')del L
data_list =[1,2,3,2,1]if data_list:whileTrue:
e = data_list.pop(0)if data_list:print(f'process element {e}')else:print(f'process last element {e}')breakelse:print('list is empty')
好的方面是它很快。坏-它是可破坏的(data_list变空)。
最直观的解决方案:
data_list =[1,2,3,2,1]# sample datafor i, e in enumerate(data_list):if i != len(data_list)-1:print(f'process element {e}')else:print(f'process last element {e}')
I like the approach of @ethan-t, but while True is dangerous from my point of view.
data_list = [1, 2, 3, 2, 1] # sample data
L = list(data_list) # destroy L instead of data_list
while L:
e = L.pop(0)
if L:
print(f'process element {e}')
else:
print(f'process last element {e}')
del L
Here, data_list is so that last element is equal by value to the first one of the list. L can be exchanged with data_list but in this case it results empty after the loop. while True is also possible to use if you check that list is not empty before the processing or the check is not needed (ouch!).
data_list = [1, 2, 3, 2, 1]
if data_list:
while True:
e = data_list.pop(0)
if data_list:
print(f'process element {e}')
else:
print(f'process last element {e}')
break
else:
print('list is empty')
The good part is that it is fast. The bad – it is destructible (data_list becomes empty).
Most intuitive solution:
data_list = [1, 2, 3, 2, 1] # sample data
for i, e in enumerate(data_list):
if i != len(data_list) - 1:
print(f'process element {e}')
else:
print(f'process last element {e}')
iterable =[1,2,3]# Your date
iterator = iter(iterable)# get the data iteratortry:# wrap all in a try / exceptwhile1:
item = iterator.next()print item # put the "for loop" code hereexceptStopIteration, e :# make the process on the last element hereprint item
输出:
1233
但实际上,就您而言,我觉得这太过分了。
无论如何,切片可能会让您更幸运:
for item in iterable[:-1]:print item
print"last :", iterable[-1]#outputs12
last :3
要不就 :
for item in iterable :print item
print iterable[-1]#outputs123
last :3
最终,采用KISS方式为您做事,这将适用于任何可迭代的事物,包括那些没有__len__:
item =''for item in iterable :print item
print item
There is nothing wrong with your way, unless you will have 100 000 loops and wants save 100 000 “if” statements. In that case, you can go that way :
iterable = [1,2,3] # Your date
iterator = iter(iterable) # get the data iterator
try : # wrap all in a try / except
while 1 :
item = iterator.next()
print item # put the "for loop" code here
except StopIteration, e : # make the process on the last element here
print item
Outputs :
1
2
3
3
But really, in your case I feel like it’s overkill.
In any case, you will probably be luckier with slicing :
for item in iterable[:-1] :
print item
print "last :", iterable[-1]
#outputs
1
2
last : 3
or just :
for item in iterable :
print item
print iterable[-1]
#outputs
1
2
3
last : 3
Eventually, a KISS way to do you stuff, and that would work with any iterable, including the ones without __len__ :
item = ''
for item in iterable :
print item
print item
Ouputs:
1
2
3
3
If feel like I would do it that way, seems simple to me.
回答 10
使用切片和is检查最后一个元素:
for data in data_list:<code_that_is_done_for_every_element>ifnot data is data_list[-1]:<code_that_is_done_between_elements>
for data in data_list:
<code_that_is_done_for_every_element>
if not data is data_list[-1]:
<code_that_is_done_between_elements>
Caveat emptor: This only works if all elements in the list are actually different (have different locations in memory). Under the hood, Python may detect equal elements and reuse the same objects for them. For instance, for strings of the same value and common integers.
回答 11
如果您要查看清单,对我来说,这也可行:
for j in range(0, len(Array)):if len(Array)- j >1:
notLast()
Google brought me to this old question and I think I could add a different approach to this problem.
Most of the answers here would deal with a proper treatment of a for loop control as it was asked, but if the data_list is destructible, I would suggest that you pop the items from the list until you end up with an empty list:
while True:
element = element_list.pop(0)
do_this_for_all_elements()
if not element:
do_this_only_for_last_element()
break
do_this_for_all_elements_but_last()
you could even use while len(element_list) if you don’t need to do anything with the last element. I find this solution more elegant then dealing with next().
回答 13
对我而言,处理列表结尾处的特殊情况的最简单,最Python的方法是:
for data in data_list[:-1]:
handle_element(data)
handle_special_element(data_list[-1])
There can be multiple ways. slicing will be fastest. Adding one more which uses .index() method:
>>> l1 = [1,5,2,3,5,1,7,43]
>>> [i for i in l1 if l1.index(i)+1==len(l1)]
[43]
回答 17
假设输入为迭代器,以下是使用itertools中的tee和izip的方法:
from itertools import tee, izip
items, between = tee(input_iterator,2)# Input must be an iterator.
first = items.next()
do_to_every_item(first)# All "do to every" operations done to first item go here.for i, b in izip(items, between):
do_between_items(b)# All "between" operations go here.
do_to_every_item(i)# All "do to every" operations go here.
演示:
>>>def do_every(x):print"E", x
...>>>def do_between(x):print"B", x
...>>> test_input = iter(range(5))>>>>>>from itertools import tee, izip
>>>>>> items, between = tee(test_input,2)>>> first = items.next()>>> do_every(first)
E 0>>>for i,b in izip(items, between):... do_between(b)... do_every(i)...
B 0
E 1
B 1
E 2
B 2
E 3
B 3
E 4>>>
Assuming input as an iterator, here’s a way using tee and izip from itertools:
from itertools import tee, izip
items, between = tee(input_iterator, 2) # Input must be an iterator.
first = items.next()
do_to_every_item(first) # All "do to every" operations done to first item go here.
for i, b in izip(items, between):
do_between_items(b) # All "between" operations go here.
do_to_every_item(i) # All "do to every" operations go here.
Demo:
>>> def do_every(x): print "E", x
...
>>> def do_between(x): print "B", x
...
>>> test_input = iter(range(5))
>>>
>>> from itertools import tee, izip
>>>
>>> items, between = tee(test_input, 2)
>>> first = items.next()
>>> do_every(first)
E 0
>>> for i,b in izip(items, between):
... do_between(b)
... do_every(i)
...
B 0
E 1
B 1
E 2
B 2
E 3
B 3
E 4
>>>
回答 18
我想到的最简单的解决方案是:
for item in data_list:try:print(new)exceptNameError:pass
new = item
print('The last item: '+ str(new))
因此,我们总是通过延迟处理一次迭代来向前看一项。要跳过第一次迭代期间的操作,我只是捕捉到了错误。
当然,您需要考虑一下,以便在NameError需要时提出它。
同时保持`counstruct
try:
new
exceptNameError:passelse:# continue here if no error was raised
for item in data_list:
try:
print(new)
except NameError: pass
new = item
print('The last item: ' + str(new))
So we always look ahead one item by delaying the the processing one iteration. To skip doing something during the first iteration I simply catch the error.
Of course you need to think a bit, in order for the NameError to be raised when you want it.
Also keep the `counstruct
try:
new
except NameError: pass
else:
# continue here if no error was raised
This relies that the name new wasn’t previously defined. If you are paranoid you can ensure that new doesn’t exist using:
try:
del new
except NameError:
pass
Alternatively you can of course also use an if statement (if notfirst: print(new) else: notfirst = True). But as far as I know the overhead is bigger.
Using `timeit` yields:
...: try: new = 'test'
...: except NameError: pass
...:
100000000 loops, best of 3: 16.2 ns per loop
so I expect the overhead to be unelectable.
回答 19
计数一次,并跟上剩余的项目数:
remaining = len(data_list)for data in data_list:
code_that_is_done_for_every_element
remaining -=1if remaining:
code_that_is_done_between_elements
Count the items once and keep up with the number of items remaining:
remaining = len(data_list)
for data in data_list:
code_that_is_done_for_every_element
remaining -= 1
if remaining:
code_that_is_done_between_elements
This way you only evaluate the length of the list once. Many of the solutions on this page seem to assume the length is unavailable in advance, but that is not part of your question. If you have the length, use it.
回答 20
我想到的一个简单的解决方案是:
for i inMyList:# Check if 'i' is the last element in the listif i ==MyList[-1]:# Do something different for the lastelse:# Do something for all other elements
第二个同样简单的解决方案可以通过使用计数器来实现:
# Count the no. of elements in the listListLength= len(MyList)# Initialize a counter
count =0for i inMyList:# increment counter
count +=1# Check if 'i' is the last element in the list# by using the counterif count ==ListLength:# Do something different for the lastelse:# Do something for all other elements
for i in MyList:
# Check if 'i' is the last element in the list
if i == MyList[-1]:
# Do something different for the last
else:
# Do something for all other elements
A second equally simple solution could be achieved by using a counter:
# Count the no. of elements in the list
ListLength = len(MyList)
# Initialize a counter
count = 0
for i in MyList:
# increment counter
count += 1
# Check if 'i' is the last element in the list
# by using the counter
if count == ListLength:
# Do something different for the last
else:
# Do something for all other elements
x = get_first_list()if x:# do something with x[0]# inevitably forget the [0] part, and have a bug to fix
y = get_second_list()if y:# do something with y[0]# inevitably forget the [0] part AGAIN, and have another bug to fix
我想做的事情肯定可以通过一个函数来完成(可能会做到):
def first_item(list_or_none):if list_or_none:return list_or_none[0]
x = first_item(get_first_list())if x:# do something with x
y = first_item(get_second_list())if y:# do something with y
I’m sure there’s a simpler way of doing this that’s just not occurring to me.
I’m calling a bunch of methods that return a list. The list may be empty. If the list is non-empty, I want to return the first item; otherwise, I want to return None. This code works:
It seems to me that there should be a simple one-line idiom for doing this, but for the life of me I can’t think of it. Is there?
Edit:
The reason that I’m looking for a one-line expression here is not that I like incredibly terse code, but because I’m having to write a lot of code like this:
x = get_first_list()
if x:
# do something with x[0]
# inevitably forget the [0] part, and have a bug to fix
y = get_second_list()
if y:
# do something with y[0]
# inevitably forget the [0] part AGAIN, and have another bug to fix
What I’d like to be doing can certainly be accomplished with a function (and probably will be):
def first_item(list_or_none):
if list_or_none: return list_or_none[0]
x = first_item(get_first_list())
if x:
# do something with x
y = first_item(get_second_list())
if y:
# do something with y
I posted the question because I’m frequently surprised by what simple expressions in Python can do, and I thought that writing a function was a silly thing to do if there was a simple expression could do the trick. But seeing these answers, it seems like a function is the simple solution.
回答 0
Python 2.6以上
next(iter(your_list),None)
如果your_list可以None:
next(iter(your_list or[]),None)
Python 2.4
def get_first(iterable, default=None):if iterable:for item in iterable:return item
return default
例:
x = get_first(get_first_list())if x:...
y = get_first(get_second_list())if y:...
另一个选择是内联以上函数:
for x in get_first_list()or[]:# process xbreak# process at most one itemfor y in get_second_list()or[]:# process ybreak
为了避免break您可以写:
for x in yield_first(get_first_list()):
x # process xfor y in yield_first(get_second_list()):
y # process y
哪里:
def yield_first(iterable):for item in iterable or[]:yield item
return
The most python idiomatic way is to use the next() on a iterator since list is iterable. just like what @J.F.Sebastian put in the comment on Dec 13, 2011.
next(iter(the_list), None) This returns None if the_list is empty. see next() Python 2.6+
Unless you need to return None to terminate a function early, it’s unnecessary to explicitly return None. Quite succinctly, just return the first entry, should it exist.
some_list = get_list()
if some_list:
return list[0]
And finally, perhaps this was implied, but just to be explicit (because explicit is better than implicit), you should not have your function get the list from another function; just pass it in as a parameter. So, the final result would be
Frankly speaking, I do not think there is a better idiom: your is clear and terse – no need for anything “better”. Maybe, but this is really a matter of taste, you could change if len(list) > 0: with if list: – an empty list will always evaluate to False.
On a related note, Python is not Perl (no pun intended!), you do not have to get the coolest code possible.
Actually, the worst code I have seen in Python, was also very cool :-) and completely unmaintainable.
By the way, most of the solution I have seen here do not take into consideration when list[0] evaluates to False (e.g. empty string, or zero) – in this case, they all return None and not the correct element.
The most Pythonic approach is what the most upvoted answer demonstrated, and it was the first thing to come to my mind when I read the question. Here’s how to use it, first if the possibly empty list is passed into a function:
def get_first(l):
return l[0] if l else None
And if the list is returned from a get_list function:
l = get_list()
return l[0] if l else None
Other ways demonstrated to do this here, with explanations
for
When I began trying to think of clever ways to do this, this is the second thing I thought of:
for item in get_list():
return item
This presumes the function ends here, implicitly returning None if get_list returns an empty list. The below explicit code is exactly equivalent:
for item in get_list():
return item
return None
if some_list
The following was also proposed (I corrected the incorrect variable name) which also uses the implicit None. This would be preferable to the above, as it uses the logical check instead of an iteration that may not happen. This should be easier to understand immediately what is happening. But if we’re writing for readability and maintainability, we should also add the explicit return None at the end:
some_list = get_list()
if some_list:
return some_list[0]
slice or [None] and select zeroth index
This one is also in the most up-voted answer:
return (get_list()[:1] or [None])[0]
The slice is unnecessary, and creates an extra one-item list in memory. The following should be more performant. To explain, or returns the second element if the first is False in a boolean context, so if get_list returns an empty list, the expression contained in the parentheses will return a list with ‘None’, which will then be accessed by the 0 index:
return (get_list() or [None])[0]
The next one uses the fact that and returns the second item if the first is True in a boolean context, and since it references my_list twice, it is no better than the ternary expression (and technically not a one-liner):
my_list = get_list()
return (my_list and my_list[0]) or None
next
Then we have the following clever use of the builtin next and iter
return next(iter(get_list()), None)
To explain, iter returns an iterator with a .next method. (.__next__ in Python 3.) Then the builtin next calls that .next method, and if the iterator is exhausted, returns the default we give, None.
redundant ternary expression (a if b else c) and circling back
The below was proposed, but the inverse would be preferable, as logic is usually better understood in the positive instead of the negative. Since get_list is called twice, unless the result is memoized in some way, this would perform poorly:
return None if not get_list() else get_list()[0]
The better inverse:
return get_list()[0] if get_list() else None
Even better, use a local variable so that get_list is only called one time, and you have the recommended Pythonic solution first discussed:
import random
import timeit
def index_first_item(some_list):if some_list:return some_list[0]def return_first_item(some_list):for item in some_list:return item
empty_lists =[]for i in range(10000):
empty_lists.append([])assert empty_lists[0]isnot empty_lists[1]
full_lists =[]for i in range(10000):
full_lists.append(list([random.random()for i in range(10)]))
mixed_lists = empty_lists[:50000]+ full_lists[:50000]
random.shuffle(mixed_lists)if __name__ =='__main__':
ENV ='import firstitem'
test_data =('empty_lists','full_lists','mixed_lists')
funcs =('index_first_item','return_first_item')for data in test_data:print"%s:"% data
for func in funcs:
t = timeit.Timer('firstitem.%s(firstitem.%s)'%(
func, data), ENV)
times = t.repeat()
avg_time = sum(times)/ len(times)print" %s:"% func
for time in times:print" %f seconds"% time
print" %f seconds avg."% avg_time
Out of curiosity, I ran timings on two of the solutions. The solution which uses a return statement to prematurely end a for loop is slightly more costly on my machine with Python 2.5.1, I suspect this has to do with setting up the iterable.
import random
import timeit
def index_first_item(some_list):
if some_list:
return some_list[0]
def return_first_item(some_list):
for item in some_list:
return item
empty_lists = []
for i in range(10000):
empty_lists.append([])
assert empty_lists[0] is not empty_lists[1]
full_lists = []
for i in range(10000):
full_lists.append(list([random.random() for i in range(10)]))
mixed_lists = empty_lists[:50000] + full_lists[:50000]
random.shuffle(mixed_lists)
if __name__ == '__main__':
ENV = 'import firstitem'
test_data = ('empty_lists', 'full_lists', 'mixed_lists')
funcs = ('index_first_item', 'return_first_item')
for data in test_data:
print "%s:" % data
for func in funcs:
t = timeit.Timer('firstitem.%s(firstitem.%s)' % (
func, data), ENV)
times = t.repeat()
avg_time = sum(times) / len(times)
print " %s:" % func
for time in times:
print " %f seconds" % time
print " %f seconds avg." % avg_time
lists =[["first","list"],["second","list"],["third","list"]]def do_something(element):ifnot element:returnelse:# do somethingpassfor li in lists:
do_something(head(li))
BTW: I’d rework your general program flow into something like this:
lists = [
["first", "list"],
["second", "list"],
["third", "list"]
]
def do_something(element):
if not element:
return
else:
# do something
pass
for li in lists:
do_something(head(li))
Several people have suggested doing something like this:
list = get_list()
return list and list[0] or None
That works in many cases, but it will only work if list[0] is not equal to 0, False, or an empty string. If list[0] is 0, False, or an empty string, the method will incorrectly return None.
I’ve created this bug in my own code one too many times !
You could use Extract Method. In other words extract that code into a method which you’d then call.
I wouldn’t try to compress it much more, the one liners seem harder to read than the verbose version. And if you use Extract Method, it’s a one liner ;)
Best: min(d, key=d.get) — no reason to interpose a useless lambda indirection layer or extract items or keys!
回答 1
这实际上是提供OP所需解决方案的答案:
>>> d ={320:1,321:0,322:3}>>> d.items()[(320,1),(321,0),(322,3)]>>># find the minimum by comparing the second element of each tuple>>> min(d.items(), key=lambda x: x[1])(321,0)
Here’s an answer that actually gives the solution the OP asked for:
>>> d = {320:1, 321:0, 322:3}
>>> d.items()
[(320, 1), (321, 0), (322, 3)]
>>> # find the minimum by comparing the second element of each tuple
>>> min(d.items(), key=lambda x: x[1])
(321, 0)
Using d.iteritems() will be more efficient for larger dictionaries, however.
回答 2
对于具有相等最小值的多个键,可以使用列表理解:
d ={320:1,321:0,322:3,323:0}
minval = min(d.values())
res =[k for k, v in d.items()if v==minval][321,323]
If you are not sure that you have not multiple minimum values, I would suggest:
d = {320:1, 321:0, 322:3, 323:0}
print ', '.join(str(key) for min_value in (min(d.values()),) for key in d if d[key]==min_value)
"""Output:
321, 323
"""
Use min with an iterator (for python 3 use items instead of iteritems); instead of lambda use the itemgetter from operator, which is faster than lambda.
from operator import itemgetter
min_key, _ = min(d.iteritems(), key=itemgetter(1))
回答 10
d={}
d[320]=1
d[321]=0
d[322]=3
value = min(d.values())for k in d.keys():if d[k]== value:print k,d[k]
to create an orderable class you have to override 6 special functions, so that it would be called by the min() function
these methods are__lt__ , __le__, __gt__, __ge__, __eq__ , __ne__ in order they are less than, less than or equal, greater than, greater than or equal, equal, not equal.
for example you should implement __lt__ as follows:
Use the zip function to create an iterator of tuples containing values
and keys. Then wrap it with a min function which takes the minimum based
on the first key. This returns a tuple containing (value, key) pair. The index of [1] is used to get the corresponding key
回答 14
# python
d={320:1,321:0,322:3}
reduce(lambda x,y: x if d[x]<=d[y]else y, d.iterkeys())321
# What gets printed if foo is the main program
before import
before functionA
before functionB
before __name__ guard
Function A
Function B 10.0
after __name__ guard
# What gets printed if foo is imported as a regular module
before import
before functionA
before functionB
before __name__ guard
after __name__ guard
Whenever the Python interpreter reads a source file, it does two things:
it sets a few special variables like __name__, and then
it executes all of the code found in the file.
Let’s see how this works and how it relates to your question about the __name__ checks we always see in Python scripts.
Code Sample
Let’s use a slightly different code sample to explore how imports and scripts work. Suppose the following is in a file called foo.py.
# Suppose this is foo.py.
print("before import")
import math
print("before functionA")
def functionA():
print("Function A")
print("before functionB")
def functionB():
print("Function B {}".format(math.sqrt(100)))
print("before __name__ guard")
if __name__ == '__main__':
functionA()
functionB()
print("after __name__ guard")
Special Variables
When the Python interpeter reads a source file, it first defines a few special variables. In this case, we care about the __name__ variable.
When Your Module Is the Main Program
If you are running your module (the source file) as the main program, e.g.
python foo.py
the interpreter will assign the hard-coded string "__main__" to the __name__ variable, i.e.
# It's as if the interpreter inserts this at the top
# of your module when run as the main program.
__name__ = "__main__"
When Your Module Is Imported By Another
On the other hand, suppose some other module is the main program and it imports your module. This means there’s a statement like this in the main program, or in some other module the main program imports:
# Suppose this is in some other main program.
import foo
The interpreter will search for your foo.py file (along with searching for a few other variants), and prior to executing that module, it will assign the name "foo" from the import statement to the __name__ variable, i.e.
# It's as if the interpreter inserts this at the top
# of your module when it's imported from another module.
__name__ = "foo"
Executing the Module’s Code
After the special variables are set up, the interpreter executes all the code in the module, one statement at a time. You may want to open another window on the side with the code sample so you can follow along with this explanation.
Always
It prints the string "before import" (without quotes).
It loads the math module and assigns it to a variable called math. This is equivalent to replacing import math with the following (note that __import__ is a low-level function in Python that takes a string and triggers the actual import):
# Find and load a module given its string name, "math",
# then assign it to a local variable called math.
math = __import__("math")
It prints the string "before functionA".
It executes the def block, creating a function object, then assigning that function object to a variable called functionA.
It prints the string "before functionB".
It executes the second def block, creating another function object, then assigning it to a variable called functionB.
It prints the string "before __name__ guard".
Only When Your Module Is the Main Program
If your module is the main program, then it will see that __name__ was indeed set to "__main__" and it calls the two functions, printing the strings "Function A" and "Function B 10.0".
Only When Your Module Is Imported by Another
(instead) If your module is not the main program but was imported by another one, then __name__ will be "foo", not "__main__", and it’ll skip the body of the if statement.
Always
It will print the string "after __name__ guard" in both situations.
Summary
In summary, here’s what’d be printed in the two cases:
# What gets printed if foo is the main program
before import
before functionA
before functionB
before __name__ guard
Function A
Function B 10.0
after __name__ guard
# What gets printed if foo is imported as a regular module
before import
before functionA
before functionB
before __name__ guard
after __name__ guard
Why Does It Work This Way?
You might naturally wonder why anybody would want this. Well, sometimes you want to write a .py file that can be both used by other programs and/or modules as a module, and can also be run as the main program itself. Examples:
Your module is a library, but you want to have a script mode where it runs some unit tests or a demo.
Your module is only used as a main program, but it has some unit tests, and the testing framework works by importing .py files like your script and running special test functions. You don’t want it to try running the script just because it’s importing the module.
Your module is mostly used as a main program, but it also provides a programmer-friendly API for advanced users.
Beyond those examples, it’s elegant that running a script in Python is just setting up a few magic variables and importing the script. “Running” the script is a side effect of importing the script’s module.
Food for Thought
Question: Can I have multiple __name__ checking blocks? Answer: it’s strange to do so, but the language won’t stop you.
Suppose the following is in foo2.py. What happens if you say python foo2.py on the command-line? Why?
# Suppose this is foo2.py.
def functionA():
print("a1")
from foo2 import functionB
print("a2")
functionB()
print("a3")
def functionB():
print("b")
print("t1")
if __name__ == "__main__":
print("m1")
functionA()
print("m2")
print("t2")
Now, figure out what will happen if you remove the __name__ check in foo3.py:
# Suppose this is foo3.py.
def functionA():
print("a1")
from foo3 import functionB
print("a2")
functionB()
print("a3")
def functionB():
print("b")
print("t1")
print("m1")
functionA()
print("m2")
print("t2")
What will this do when used as a script? When imported as a module?
# Suppose this is in foo4.py
__name__ = "__main__"
def bar():
print("bar")
print("before __name__ guard")
if __name__ == "__main__":
bar()
print("after __name__ guard")
# file one.pydef func():print("func() in one.py")print("top-level in one.py")if __name__ =="__main__":print("one.py is being run directly")else:print("one.py is being imported into another module")
# file two.pyimport one
print("top-level in two.py")
one.func()if __name__ =="__main__":print("two.py is being run directly")else:print("two.py is being imported into another module")
现在,如果您将解释器调用为
python one.py
输出将是
top-level in one.py
one.py is being run directly
如果two.py改为运行:
python two.py
你得到
top-level in one.py
one.py is being imported into another module
top-level in two.py
func()in one.py
two.py is being run directly
When your script is run by passing it as a command to the Python interpreter,
python myscript.py
all of the code that is at indentation level 0 gets executed. Functions and classes that are defined are, well, defined, but none of their code gets run. Unlike other languages, there’s no main() function that gets run automatically – the main() function is implicitly all the code at the top level.
In this case, the top-level code is an if block. __name__ is a built-in variable which evaluates to the name of the current module. However, if a module is being run directly (as in myscript.py above), then __name__ instead is set to the string "__main__". Thus, you can test whether your script is being run directly or being imported by something else by testing
if __name__ == "__main__":
...
If your script is being imported into another module, its various function and class definitions will be imported and its top-level code will be executed, but the code in the then-body of the if clause above won’t get run as the condition is not met. As a basic example, consider the following two scripts:
# file one.py
def func():
print("func() in one.py")
print("top-level in one.py")
if __name__ == "__main__":
print("one.py is being run directly")
else:
print("one.py is being imported into another module")
# file two.py
import one
print("top-level in two.py")
one.func()
if __name__ == "__main__":
print("two.py is being run directly")
else:
print("two.py is being imported into another module")
Now, if you invoke the interpreter as
python one.py
The output will be
top-level in one.py
one.py is being run directly
If you run two.py instead:
python two.py
You get
top-level in one.py
one.py is being imported into another module
top-level in two.py
func() in one.py
two.py is being run directly
Thus, when module one gets loaded, its __name__ equals "one" instead of "__main__".
回答 2
__name__变量(imho)的最简单解释如下:
创建以下文件。
# a.pyimport b
和
# b.pyprint"Hello World from %s!"% __name__
if __name__ =='__main__':print"Hello World again from %s!"% __name__
The simplest explanation for the __name__ variable (imho) is the following:
Create the following files.
# a.py
import b
and
# b.py
print "Hello World from %s!" % __name__
if __name__ == '__main__':
print "Hello World again from %s!" % __name__
Running them will get you this output:
$ python a.py
Hello World from b!
As you can see, when a module is imported, Python sets globals()['__name__'] in this module to the module’s name. Also, upon import all the code in the module is being run. As the if statement evaluates to False this part is not executed.
$ python b.py
Hello World from __main__!
Hello World again from __main__!
As you can see, when a file is executed, Python sets globals()['__name__'] in this file to "__main__". This time, the if statement evaluates to True and is being run.
def main():"""business logic for when running this module as the primary one!"""
setup()
foo = do_important()
bar = do_even_more_important(foo)for baz in bar:
do_super_important(baz)
teardown()# Here's our payoff idiom!if __name__ =='__main__':
main()
The global variable, __name__, in the module that is the entry point to your program, is '__main__'. Otherwise, it’s the name you import the module by.
So, code under the if block will only run if the module is the entry point to your program.
It allows the code in the module to be importable by other modules, without executing the code block beneath on import.
Why do we need this?
Developing and Testing Your Code
Say you’re writing a Python script designed to be used as a module:
def do_important():
"""This function does something very important"""
You could test the module by adding this call of the function to the bottom:
do_important()
and running it (on a command prompt) with something like:
~$ python important.py
The Problem
However, if you want to import the module to another script:
import important
On import, the do_important function would be called, so you’d probably comment out your function call, do_important(), at the bottom.
# do_important() # I must remember to uncomment to execute this!
And then you’ll have to remember whether or not you’ve commented out your test function call. And this extra complexity would mean you’re likely to forget, making your development process more troublesome.
A Better Way
The __name__ variable points to the namespace wherever the Python interpreter happens to be at the moment.
Inside an imported module, it’s the name of that module.
But inside the primary module (or an interactive Python session, i.e. the interpreter’s Read, Eval, Print Loop, or REPL) you are running everything from its "__main__".
So if you check before executing:
if __name__ == "__main__":
do_important()
With the above, your code will only execute when you’re running it as the primary module (or intentionally call it from another script).
An Even Better Way
There’s a Pythonic way to improve on this, though.
What if we want to run this business process from outside the module?
If we put the code we want to exercise as we develop and test in a function like this and then do our check for '__main__' immediately after:
def main():
"""business logic for when running this module as the primary one!"""
setup()
foo = do_important()
bar = do_even_more_important(foo)
for baz in bar:
do_super_important(baz)
teardown()
# Here's our payoff idiom!
if __name__ == '__main__':
main()
We now have a final function for the end of our module that will run if we run the module as the primary module.
It will allow the module and its functions and classes to be imported into other scripts without running the main function, and will also allow the module (and its functions and classes) to be called when running from a different '__main__' module, i.e.
This module represents the (otherwise anonymous) scope in which the
interpreter’s main program executes — commands read either from
standard input, from a script file, or from an interactive prompt. It
is this environment in which the idiomatic “conditional script” stanza
causes a script to run:
if __name__ == '__main__':
main()
回答 4
if __name__ == "__main__"是使用(例如)命令从(例如)命令行运行脚本时运行的部分python myscript.py。
__name__ is a global variable (in Python, global actually means on the module level) that exists in all namespaces. It is typically the module’s name (as a str type).
As the only special case, however, in whatever Python process you run, as in mycode.py:
python mycode.py
the otherwise anonymous global namespace is assigned the value of '__main__' to its __name__.
when it is the primary, entry-point module that is run by a Python process,
will cause your script’s uniquely defined main function to run.
Another benefit of using this construct: you can also import your code as a module in another script and then run the main function if and when your program decides:
import mycode
# ... any amount of other code
mycode.main()
import ab
def main():print('main function: this is where the action is')def x():print('peripheral task: might be useful in other projects')
x()if __name__ =="__main__":
main()
There are lots of different takes here on the mechanics of the code in question, the “How”, but for me none of it made sense until I understood the “Why”. This should be especially helpful for new programmers.
Take file “ab.py”:
def a():
print('A function in ab file');
a()
And a second file “xy.py”:
import ab
def main():
print('main function: this is where the action is')
def x():
print ('peripheral task: might be useful in other projects')
x()
if __name__ == "__main__":
main()
What is this code actually doing?
When you execute xy.py, you import ab. The import statement runs the module immediately on import, so ab‘s operations get executed before the remainder of xy‘s. Once finished with ab, it continues with xy.
The interpreter keeps track of which scripts are running with __name__. When you run a script – no matter what you’ve named it – the interpreter calls it "__main__", making it the master or ‘home’ script that gets returned to after running an external script.
Any other script that’s called from this "__main__" script is assigned its filename as its __name__ (e.g., __name__ == "ab.py"). Hence, the line if __name__ == "__main__": is the interpreter’s test to determine if it’s interpreting/parsing the ‘home’ script that was initially executed, or if it’s temporarily peeking into another (external) script. This gives the programmer flexibility to have the script behave differently if it’s executed directly vs. called externally.
Let’s step through the above code to understand what’s happening, focusing first on the unindented lines and the order they appear in the scripts. Remember that function – or def – blocks don’t do anything by themselves until they’re called. What the interpreter might say if mumbled to itself:
Open xy.py as the ‘home’ file; call it "__main__" in the __name__ variable.
Import and open file with the __name__ == "ab.py".
Oh, a function. I’ll remember that.
Ok, function a(); I just learned that. Printing ‘A function in ab file‘.
End of file; back to "__main__"!
Oh, a function. I’ll remember that.
Another one.
Function x(); ok, printing ‘peripheral task: might be useful in other projects‘.
What’s this? An if statement. Well, the condition has been met (the variable __name__ has been set to "__main__"), so I’ll enter the main() function and print ‘main function: this is where the action is‘.
The bottom two lines mean: “If this is the "__main__" or ‘home’ script, execute the function called main()“. That’s why you’ll see a def main(): block up top, which contains the main flow of the script’s functionality.
Why implement this?
Remember what I said earlier about import statements? When you import a module it doesn’t just ‘recognize’ it and wait for further instructions – it actually runs all the executable operations contained within the script. So, putting the meat of your script into the main() function effectively quarantines it, putting it in isolation so that it won’t immediately run when imported by another script.
Again, there will be exceptions, but common practice is that main() doesn’t usually get called externally. So you may be wondering one more thing: if we’re not calling main(), why are we calling the script at all? It’s because many people structure their scripts with standalone functions that are built to be run independent of the rest of the code in the file. They’re then later called somewhere else in the body of the script. Which brings me to this:
But the code works without it
Yes, that’s right. These separate functions can be called from an in-line script that’s not contained inside a main() function. If you’re accustomed (as I am, in my early learning stages of programming) to building in-line scripts that do exactly what you need, and you’ll try to figure it out again if you ever need that operation again … well, you’re not used to this kind of internal structure to your code, because it’s more complicated to build and it’s not as intuitive to read.
But that’s a script that probably can’t have its functions called externally, because if it did it would immediately start calculating and assigning variables. And chances are if you’re trying to re-use a function, your new script is related closely enough to the old one that there will be conflicting variables.
In splitting out independent functions, you gain the ability to re-use your previous work by calling them into another script. For example, “example.py” might import “xy.py” and call x(), making use of the ‘x’ function from “xy.py”. (Maybe it’s capitalizing the third word of a given text string; creating a NumPy array from a list of numbers and squaring them; or detrending a 3D surface. The possibilities are limitless.)
(As an aside, this question contains an answer by @kindall that finally helped me to understand – the why, not the how. Unfortunately it’s been marked as a duplicate of this one, which I think is a mistake.)
When there are certain statements in our module (M.py) we want to be executed when it’ll be running as main (not imported), we can place those statements (test-cases, print statements) under this if block.
As by default (when module running as main, not imported) the __name__ variable is set to "__main__", and when it’ll be imported the __name__ variable will get a different value, most probably the name of the module ('M').
This is helpful in running different variants of a modules together, and separating their specific input & output statements and also if there are any test-cases.
In short, use this ‘if __name__ == "main" ‘ block to prevent (certain) code from being run when the module is imported.
Put simply, __name__ is a variable defined for each script that defines whether the script is being run as the main module or it is being run as an imported module.
Script1's name is script1
Script 2's name: __main__
As you can see, __name__ tells us which code is the ‘main’ module.
This is great, because you can just write code and not have to worry about structural issues like in C/C++, where, if a file does not implement a ‘main’ function then it cannot be compiled as an executable and if it does, it cannot then be used as a library.
Say you write a Python script that does something great and you implement a boatload of functions that are useful for other purposes. If I want to use them I can just import your script and use them without executing your program (given that your code only executes within the if __name__ == "__main__": context). Whereas in C/C++ you would have to portion out those pieces into a separate module that then includes the file. Picture the situation below;
The arrows are import links. For three modules each trying to include the previous modules code there are six files (nine, counting the implementation files) and five links. This makes it difficult to include other code into a C project unless it is compiled specifically as a library. Now picture it for Python:
You write a module, and if someone wants to use your code they just import it and the __name__ variable can help to separate the executable portion of the program from the library part.
...
<Block A>
if __name__ == '__main__':
<Block B>
...
Blocks A and B are run when we are running x.py.
But just block A (and not B) is run when we are running another module, y.py for example, in which x.py is imported and the code is run from there (like when a function in x.py is called from y.py).
if __name__ =='__main__':# Do something appropriate here, like calling a# main() function defined elsewhere in this module.
main()else:# Do nothing. This module has been imported by another# module that wants to make use of the functions,# classes and other useful bits it has defined.
When you run Python interactively the local __name__ variable is assigned a value of __main__. Likewise, when you execute a Python module from the command line, rather than importing it into another module, its __name__ attribute is assigned a value of __main__, rather than the actual name of the module. In this way, modules can look at their own __name__ value to determine for themselves how they are being used, whether as support for another program or as the main application executed from the command line. Thus, the following idiom is quite common in Python modules:
if __name__ == '__main__':
# Do something appropriate here, like calling a
# main() function defined elsewhere in this module.
main()
else:
# Do nothing. This module has been imported by another
# module that wants to make use of the functions,
# classes and other useful bits it has defined.
It checks if the __name__ attribute of the Python script is "__main__". In other words, if the program itself is executed, the attribute will be __main__, so the program will be executed (in this case the main() function).
However, if your Python script is used by a module, any code outside of the if statement will be executed, so if \__name__ == "\__main__" is used just to check if the program is used as a module or not, and therefore decides whether to run the code.
Before explaining anything about if __name__ == '__main__' it is important to understand what __name__ is and what it does.
What is __name__?
__name__ is a DunderAlias – can be thought of as a global variable (accessible from modules) and works in a similar way to global.
It is a string (global as mentioned above) as indicated by type(__name__) (yielding <class 'str'>), and is an inbuilt standard for both Python 3 and Python 2 versions.
Where:
It can not only be used in scripts but can also be found in both the interpreter and modules/packages.
Interpreter:
>>> print(__name__)
__main__
>>>
Script:
test_file.py:
print(__name__)
Resulting in __main__
Module or package:
somefile.py:
def somefunction():
print(__name__)
test_file.py:
import somefile
somefile.somefunction()
Resulting in somefile
Notice that when used in a package or module, __name__ takes the name of the file. The path of the actual module or package path is not given, but has its own DunderAlias __file__, that allows for this.
You should see that, where __name__, where it is the main file (or program) will always return __main__, and if it is a module/package, or anything that is running off some other Python script, will return the name of the file where it has originated from.
Practice:
Being a variable means that it’s value can be overwritten (“can” does not mean “should”), overwriting the value of __name__ will result in a lack of readability. So do not do it, for any reason. If you need a variable define a new variable.
It is always assumed that the value of __name__ to be __main__ or the name of the file. Once again changing this default value will cause more confusion that it will do good, causing problems further down the line.
It is considered good practice in general to include the if __name__ == '__main__' in scripts.
Now to answer if __name__ == '__main__':
Now we know the behaviour of __name__ things become clearer:
An if is a flow control statement that contains the block of code will execute if the value given is true. We have seen that __name__ can take either
__main__ or the file name it has been imported from.
This means that if __name__ is equal to __main__ then the file must be the main file and must actually be running (or it is the interpreter), not a module or package imported into the script.
If indeed __name__ does take the value of __main__ then whatever is in that block of code will execute.
This tells us that if the file running is the main file (or you are running from the interpreter directly) then that condition must execute. If it is a package then it should not, and the value will not be __main__.
Modules:
__name__ can also be used in modules to define the name of a module
Variants:
It is also possible to do other, less common but useful things with __name__, some I will show here:
Executing only if the file is a module or package:
if __name__ != '__main__':
# Do some useful things
Running one condition if the file is the main one and another if it is not:
if __name__ == '__main__':
# Execute something
else:
# Do some useful things
You can also use it to provide runnable help functions/utilities on packages and modules without the elaborate use of libraries.
It also allows modules to be run from the command line as main scripts, which can be also very useful.
I think it’s best to break the answer in depth and in simple words:
__name__: Every module in Python has a special attribute called __name__.
It is a built-in variable that returns the name of the module.
__main__: Like other programming languages, Python too has an execution entry point, i.e., main. '__main__'is the name of the scope in which top-level code executes. Basically you have two ways of using a Python module: Run it directly as a script, or import it. When a module is run as a script, its __name__ is set to __main__.
Thus, the value of the __name__ attribute is set to __main__ when the module is run as the main program. Otherwise the value of __name__ is set to contain the name of the module.
It is a special for when a Python file is called from the command line. This is typically used to call a “main()” function or execute other appropriate startup code, like commandline arguments handling for instance.
It could be written in several ways. Another is:
def some_function_for_instance_main():
dosomething()
__name__ == '__main__' and some_function_for_instance_main()
I am not saying you should use this in production code, but it serves to illustrate that there is nothing “magical” about if __name__ == '__main__'. It is a good convention for invoking a main function in Python files.
There are a number of variables that the system (Python interpreter) provides for source files (modules). You can get their values anytime you want, so, let us focus on the __name__ variable/attribute:
When Python loads a source code file, it executes all of the code found in it. (Note that it doesn’t call all of the methods and functions defined in the file, but it does define them.)
Before the interpreter executes the source code file though, it defines a few special variables for that file; __name__ is one of those special variables that Python automatically defines for each source code file.
If Python is loading this source code file as the main program (i.e. the file you run), then it sets the special __name__ variable for this file to have a value “__main__”.
If this is being imported from another module, __name__ will be set to that module’s name.
will be executed only when you run the module directly; the code block will not execute if another module is calling/importing it because the value of __name__ will not equal to “main” in that particular instance.
Hope this helps out.
回答 16
if __name__ == "__main__": 基本上是顶级脚本环境,它指定了解释器(“我首先执行的优先级最高”)。
if __name__ == "__main__": is basically the top-level script environment, and it specifies the interpreter that (‘I have the highest priority to be executed first’).
'__main__' is the name of the scope in which top-level code executes. A module’s __name__ is set equal to '__main__' when read from standard input, a script, or from an interactive prompt.
if __name__ == "__main__":
# Execute only if run as a script
main()
I’ve been reading so much throughout the answers on this page. I would say, if you know the thing, for sure you will understand those answers, otherwise, you are still confused.
To be short, you need to know several points:
import a action actually runs all that can be ran in “a”
Because of point 1, you may not want everything to be run in “a” when importing it
To solve the problem in point 2, python allows you to put a condition check
__name__ is an implicit variable in all .py modules; when a.py is imported, the value of __name__ of a.py module is set to its file name “a“; when a.py is run directly using “python a.py“, which means a.py is the entry point, then the value of __name__ of a.py module is set to a string __main__
Based on the mechanism how python sets the variable __name__ for each module, do you know how to achieve point 3? The answer is fairly easy, right? Put a if condition: if __name__ == "__main__": ...; you can even put if __name__ == "a" depending on your functional need
The important thing that python is special at is point 4! The rest is just basic logic.
回答 18
考虑:
print __name__
上面的输出是__main__。
if __name__ =="__main__":print"direct method"
上面的陈述是正确的,并显示“ direct method”。假设他们在另一个类中导入了该类,则不会打印“直接方法”,因为在导入时它将设置__name__ equal to "first model name"。
The above statement is true and prints “direct method”. Suppose if they imported this class in another class it doesn’t print “direct method” because, while importing, it will set __name__ equal to "first model name".
回答 19
您可以使该文件可用作脚本以及可导入模块。
fibo.py(名为的模块fibo)
# Other modules can IMPORT this MODULE to use the function fibdef fib(n):# write Fibonacci series up to n
a, b =0,1while b < n:print(b, end=' ')
a, b = b, a+b
print()# This allows the file to be used as a SCRIPTif __name__ =="__main__":import sys
fib(int(sys.argv[1]))
You can make the file usable as a script as well as an importable module.
fibo.py (a module named fibo)
# Other modules can IMPORT this MODULE to use the function fib
def fib(n): # write Fibonacci series up to n
a, b = 0, 1
while b < n:
print(b, end=' ')
a, b = b, a+b
print()
# This allows the file to be used as a SCRIPT
if __name__ == "__main__":
import sys
fib(int(sys.argv[1]))
is primarily to avoid the import lock problems that would arise from having code directly imported. You want main() to run if your file was directly invoked (that’s the __name__ == "__main__" case), but if your code was imported then the importer has to enter your code from the true main module to avoid import lock problems.
A side-effect is that you automatically sign on to a methodology that supports multiple entry points. You can run your program using main() as the entry point, but you don’t have to. While setup.py expects main(), other tools use alternate entry points. For example, to run your file as a gunicorn process, you define an app() function instead of a main(). Just as with setup.py, gunicorn imports your code so you don’t want it do do anything while it’s being imported (because of the import lock issue).
This answer is for Java programmers learning Python.
Every Java file typically contains one public class. You can use that class in two ways:
Call the class from other files. You just have to import it in the calling program.
Run the class stand alone, for testing purposes.
For the latter case, the class should contain a public static void main() method. In Python this purpose is served by the globally defined label '__main__'.
回答 22
if __name__ == '__main__':仅当模块作为脚本调用时,才会执行以下代码。
例如,考虑以下模块my_test_module.py:
# my_test_module.pyprint('This is going to be printed out, no matter what')if __name__ =='__main__':print('This is going to be printed out, only if user invokes the module as a script')
第一种可能性:导入my_test_module.py另一个模块
# main.pyimport my_test_module
if __name__ =='__main__':print('Hello from main.py')
现在,如果您调用main.py:
python main.py
>>'This is going to be printed out, no matter what'>>'Hello from main.py'
python my_test_module.py
>>>'This is going to be printed out, no matter what'>>>'This is going to be printed out, only if user invokes the module as a script'
The code under if __name__ == '__main__':will only be executed if the module is invoked as a script.
As an example consider the following module my_test_module.py:
# my_test_module.py
print('This is going to be printed out, no matter what')
if __name__ == '__main__':
print('This is going to be printed out, only if user invokes the module as a script')
1st possibility: Import my_test_module.py in another module
# main.py
import my_test_module
if __name__ == '__main__':
print('Hello from main.py')
Now if you invoke main.py:
python main.py
>> 'This is going to be printed out, no matter what'
>> 'Hello from main.py'
Note that only the top-level print() statement in my_test_module is executed.
2nd possibility: Invoke my_test_module.py as a script
Now if you run my_test_module.py as a Python script, both print() statements will be exectued:
python my_test_module.py
>>> 'This is going to be printed out, no matter what'
>>> 'This is going to be printed out, only if user invokes the module as a script'
#Script test.py
apple =42def hello_world():print("I am inside hello_world")if __name__ =="__main__":print("Value of __name__ is: ", __name__)print("Going to call hello_world")
hello_world()
我们可以直接执行为
python test.py
输出量
Value of __name__ is: __main__
Going to call hello_world
I am inside hello_world
现在假设我们从其他脚本中调用上述脚本
#script external_calling.pyimport test
print(test.apple)
test.hello_world()print(test.__name__)
Every module in python has a attribute called __name__. The value of __name__ attribute is __main__ when the module is run directly, like python my_module.py. Otherwise (like when you say import my_module) the value of __name__ is the name of the module.
Small example to explain in short.
#Script test.py
apple = 42
def hello_world():
print("I am inside hello_world")
if __name__ == "__main__":
print("Value of __name__ is: ", __name__)
print("Going to call hello_world")
hello_world()
We can execute this directly as
python test.py
Output
Value of __name__ is: __main__
Going to call hello_world
I am inside hello_world
Now suppose we call above script from other script
#script external_calling.py
import test
print(test.apple)
test.hello_world()
print(test.__name__)
When you execute this
python external_calling.py
Output
42
I am inside hello_world
test
So, above is self explanatory that when you call test from other script, if loop __name__ in test.py will not execute.
All the answers have pretty much explained the functionality. But I will provide one example of its usage which might help clearing out the concept further.
Assume that you have two Python files, a.py and b.py. Now, a.py imports b.py. We run the a.py file, where the “import b.py” code is executed first. Before the rest of the a.py code runs, the code in the file b.py must run completely.
In the b.py code there is some code that is exclusive to that file b.py and we don’t want any other file (other than b.py file), that has imported the b.py file, to run it.
So that is what this line of code checks. If it is the main file (i.e., b.py) running the code, which in this case it is not (a.py is the main file running), then only the code gets executed.
回答 27
创建一个文件a.py:
print(__name__)# It will print out __main__
__name__始终等于__main__该文件直接运行时表明它是主文件。
在同一目录中创建另一个文件b.py:
import a # Prints a
运行。它将打印一个,即被导入文件的名称。
因此,为了显示同一文件的两种不同行为,这是一个常用的技巧:
# Code to be run when imported into another python fileif __name__ =='__main__':# Code to be run only when run directly