Get key by value in dictionary

Question: Get key by value in dictionary

I made a function which will look up ages in a Dictionary and show the matching name:

dictionary = {'george' : 16, 'amber' : 19}
search_age = raw_input("Provide age")
for age in dictionary.values():
    if age == search_age:
        name = dictionary[age]
        print name

I know how to compare and find the age; I just don’t know how to show the name of the person. Additionally, I am getting a KeyError because of line 5. I know it’s not correct, but I can’t figure out how to make it search backwards.


Answer 0

There is none. dict is not intended to be used this way.

dictionary = {'george': 16, 'amber': 19}
search_age = int(input("Provide age"))  # input() returns a str on Python 3; convert it before comparing with the int ages
for name, age in dictionary.items():  # for name, age in dictionary.iteritems():  (for Python 2.x)
    if age == search_age:
        print(name)

Answer 1

mydict = {'george': 16, 'amber': 19}
print mydict.keys()[mydict.values().index(16)]  # Prints george

Or in Python 3.x:

mydict = {'george': 16, 'amber': 19}
print(list(mydict.keys())[list(mydict.values()).index(16)])  # Prints george

Basically, it separates the dictionary’s values in a list, finds the position of the value you have, and gets the key at that position.

More about keys() and .values() in Python 3: How can I get list of values from dict?
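
One caveat with this approach: list.index() raises ValueError when the value is not present. A minimal Python 3 sketch that returns a default instead (the helper name key_for_value and its default parameter are made up for illustration, not part of the answer):

```python
def key_for_value(mapping, value, default=None):
    """Return the first key mapped to `value`, or `default` if there is none."""
    values = list(mapping.values())
    try:
        position = values.index(value)  # raises ValueError when value is absent
    except ValueError:
        return default
    # keys() and values() iterate in the same order, so positions line up.
    return list(mapping.keys())[position]


mydict = {'george': 16, 'amber': 19}
print(key_for_value(mydict, 16))
print(key_for_value(mydict, 99))
```

Returning a default keeps the caller from having to wrap every lookup in its own try/except.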


Answer 2

If you want both the name and the age, you should be using .items(), which gives you (key, value) tuples:

for name, age in mydict.items():
    if age == search_age:
        print(name)

You can unpack the tuple into two separate variables right in the for loop, then match the age.

You should also consider reversing the dictionary if you’re generally going to be looking up by age, and no two people have the same age:

{16: 'george', 19: 'amber'}

so you can look up the name for an age by just doing

mydict[search_age]

I’ve been calling it mydict instead of list because list is the name of a built-in type, and you shouldn’t use that name for anything else.

You can even get a list of all people with a given age in one line:

[name for name, age in mydict.items() if age == search_age]

or if there is only one person with each age:

next((name for name, age in mydict.items() if age == search_age), None)

which will just give you None if there isn’t anyone with that age.

Finally, if the dict is long and you’re on Python 2, you should consider using .iteritems() instead of .items() as Cat Plus Plus did in his answer, since it doesn’t need to make a copy of the list.
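
Putting the pieces of this answer together on Python 3, as a small sketch (the sample data and the helper names names_with_age and first_name_with_age are illustrative, not from the answer):

```python
def names_with_age(people, search_age):
    """Return every name whose age equals search_age."""
    return [name for name, age in people.items() if age == search_age]


def first_name_with_age(people, search_age):
    """Return the first matching name, or None when nobody has that age."""
    return next((name for name, age in people.items() if age == search_age), None)


people = {'george': 16, 'amber': 19, 'garry': 19}
print(names_with_age(people, 19))       # both people aged 19
print(first_name_with_age(people, 16))  # a single name
print(first_name_with_age(people, 42))  # None, nobody is 42
```

The list version is the safer default when ages can repeat; the next() version stops at the first hit.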


Answer 3

I thought it would be interesting to point out which methods are the quickest, and in what scenario:

Here are some tests I ran (on a 2012 MacBook Pro):

>>> def method1(list,search_age):
...     for name,age in list.iteritems():
...             if age == search_age:
...                     return name
... 
>>> def method2(list,search_age):
...     return [name for name,age in list.iteritems() if age == search_age]
... 
>>> def method3(list,search_age):
...     return list.keys()[list.values().index(search_age)]

Results from profile.run() on each method 100000 times:

Method 1:

>>> profile.run("for i in range(0,100000): method1(list,16)")
     200004 function calls in 1.173 seconds

Method 2:

>>> profile.run("for i in range(0,100000): method2(list,16)")
     200004 function calls in 1.222 seconds

Method 3:

>>> profile.run("for i in range(0,100000): method3(list,16)")
     400004 function calls in 2.125 seconds

So this shows that for a small dict, method 1 is the quickest. This is most likely because it returns the first match, as opposed to all of the matches like method 2 (see note below).


Interestingly, performing the same tests on a dict I have with 2700 entries, I get quite different results (this time run 10000 times):

Method 1:

>>> profile.run("for i in range(0,10000): method1(UIC_CRS,'7088380')")
     20004 function calls in 2.928 seconds

Method 2:

>>> profile.run("for i in range(0,10000): method2(UIC_CRS,'7088380')")
     20004 function calls in 3.872 seconds

Method 3:

>>> profile.run("for i in range(0,10000): method3(UIC_CRS,'7088380')")
     40004 function calls in 1.176 seconds

So here, method 3 is much faster. Just goes to show the size of your dict will affect which method you choose.

Notes: Method 2 returns a list of all names, whereas methods 1 and 3 return only the first match. I have not considered memory usage. I’m not sure if method 3 creates 2 extra lists (keys() and values()) and stores them in memory.
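
The timings above come from Python 2. To repeat the comparison on Python 3, one possible sketch uses timeit in place of profile.run() (the helper time_methods and the parameter name d are made up here; absolute numbers will differ by machine and interpreter):

```python
import timeit


def method1(d, search_age):
    # Stop at the first match.
    for name, age in d.items():
        if age == search_age:
            return name


def method2(d, search_age):
    # Collect every match.
    return [name for name, age in d.items() if age == search_age]


def method3(d, search_age):
    # Build both lists, then index into them.
    return list(d.keys())[list(d.values()).index(search_age)]


def time_methods(d, search_age, number=1000):
    """Return {method name: seconds taken for `number` calls}."""
    return {
        f.__name__: timeit.timeit(lambda f=f: f(d, search_age), number=number)
        for f in (method1, method2, method3)
    }


small = {'george': 16, 'amber': 19}
for name, seconds in time_methods(small, 16).items():
    print(name, seconds)
```

Running it once with a small dict and once with a large one should show whether the size effect described above still holds on your setup.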


Answer 4

One-line version (i is the original dictionary, p is the reversed dictionary):

Explanation: i.keys() and i.values() return the dictionary's keys and values respectively, in matching order. zip pairs them up as (value, key) tuples, and dict builds a new dictionary from those pairs.

p = dict(zip(i.values(),i.keys()))

Warning: This works only if the values are hashable and unique.
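
To make the uniqueness warning concrete: with duplicate values the one-liner does not fail, it silently keeps only one key per value. A minimal sketch that checks for this (invert_unique is a hypothetical helper name):

```python
def invert_unique(mapping):
    """Invert a dict, refusing to silently drop keys that share a value."""
    inverted = dict(zip(mapping.values(), mapping.keys()))
    if len(inverted) != len(mapping):
        # Two or more keys had the same value, so some were overwritten.
        raise ValueError("values are not unique; inversion would drop keys")
    return inverted


print(invert_unique({'george': 16, 'amber': 19}))

try:
    invert_unique({'amber': 19, 'garry': 19})
except ValueError as error:
    print("refused:", error)
```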


Answer 5

a = {'a':1,'b':2,'c':3}
{v:k for k, v in a.items()}[1]

or better

{k:v for k, v in a.items() if v == 1}

Answer 6

key = next((k for k in my_dict if my_dict[k] == val), None)

Answer 7

Try this one-liner to reverse a dictionary:

reversed_dictionary = dict(map(reversed, dictionary.items()))

Answer 8

I found this answer very effective, but not very easy to read.

To make it clearer, you can invert the keys and values of the dictionary, so that the keys become values and the values become keys, as seen here.

mydict = {'george':16,'amber':19}
res = dict((v,k) for k,v in mydict.items())  # .items() works on both Python 2 and 3
print(res[16]) # Prints george

or

mydict = {'george':16,'amber':19}
dict((v,k) for k,v in mydict.items())[16]

which is essentially the same as this other answer.


Answer 9

If you want to find the key by the value, you can use a dictionary comprehension to create a lookup dictionary and then use that to find the key from the value.

lookup = {value: key for key, value in self.data.items()}  # iterate over (key, value) pairs; assumes self.data is a dict
lookup[value]

Answer 10

You can get the key by using the dict.keys(), dict.values() and list.index() methods; see the code sample below (Python 2):

names_dict = {'george':16,'amber':19}
search_age = int(raw_input("Provide age"))
key = names_dict.keys()[names_dict.values().index(search_age)]

Answer 11

Here is my take on this problem. :) I have just started learning Python, so I call this:

“The Understandable for beginners” solution.

#Code without comments.

list1 = {'george':16,'amber':19, 'Garry':19}
search_age = raw_input("Provide age: ")
print
search_age = int(search_age)

listByAge = {}

for name, age in list1.items():
    if age == search_age:
        age = str(age)
        results = name + " " +age
        print results

        age2 = int(age)
        listByAge[name] = listByAge.get(name,0)+age2

print
print listByAge

#Code with comments.
#I've added another name with the same age to the list.
list1 = {'george':16,'amber':19, 'Garry':19}
#Original code.
search_age = raw_input("Provide age: ")
print
#Because raw_input gives a string, we need to convert it to int,
#so we can search the dictionary list with it.
search_age = int(search_age)

#Here we define another empty dictionary, to store the results in a more 
#permanent way.
listByAge = {}

#We use double variable iteration, so we get both the name and age 
#on each run of the loop.
for name, age in list1.items():
    #Here we check if the User Defined age = the age parameter 
    #for this run of the loop.
    if age == search_age:
        #Here we convert Age back to string, because we will concatenate it 
        #with the person's name. 
        age = str(age)
        #Here we concatenate.
        results = name + " " +age
        #If you want just the names and ages displayed you can delete
        #the code after "print results". If you want them stored, don't...
        print results

        #Here we create a second variable that uses the value of
        #the age for the current person in the list.
        #For example if "Anna" is "10", age2 = 10,
        #integer value which we can use in addition.
        age2 = int(age)
        #Here we use the method that checks or creates values in dictionaries.
        #We create a new entry for each name that matches the User Defined Age
        #with default value of 0, and then we add the value from age2.
        listByAge[name] = listByAge.get(name,0)+age2

#Here we print the new dictionary with the users with User Defined Age.
print
print listByAge

#Results
Running: *\test.py (Thu Jun 06 05:10:02 2013)

Provide age: 19

amber 19
Garry 19

{'amber': 19, 'Garry': 19}

Execution Successful!

Answer 12

get_key = lambda v, d: next(k for k in d if d[k] == v)  # compare with ==; 'is' tests identity, not equality

Answer 13

Consider using Pandas. As stated in William McKinney’s “Python for Data Analysis”:

Another way to think about a Series is as a fixed-length, ordered dict, as it is a mapping of index values to data values. It can be used in many contexts where you might use a dict.

import pandas as pd
ages = {'george':16,'amber':19}  # named ages rather than list, to avoid shadowing the built-in
lookup_list = pd.Series(ages)

To query your series do the following:

lookup_list[lookup_list.values == 19]

Which yields:

Out[1]: 
amber    19
dtype: int64

If you need to do anything else with the output, transforming the answer into a list might be useful:

answer = lookup_list[lookup_list.values == 19].index
answer = pd.Index.tolist(answer)

Answer 14

Here, recover_key takes a dictionary and the value to find in it. We loop over the dictionary's keys, compare each key's value with the given value, and return the first key that matches.

def recover_key(dicty,value):
    for a_key in dicty.keys():
        if (dicty[a_key] == value):
            return a_key

Answer 15

for name in mydict:
    if mydict[name] == search_age:
        print(name) 
        #or do something else with it. 
        #if in a function append to a temporary list, 
        #then after the loop return the list

Answer 16

We can get the keys of a dict that map to a given value with:

def getKey(dct,value):
     return [key for key in dct if (dct[key] == value)]

Answer 17

It’s been answered, but it could also be done with a fancy ‘map/reduce’ use, e.g.:

def find_key(value, dictionary):
    return reduce(lambda x, y: x if x is not None else y,
                  map(lambda x: x[0] if x[1] == value else None, 
                      dictionary.iteritems()))

Answer 18

Cat Plus Plus mentioned that this isn’t how a dictionary is intended to be used. Here’s why:

The definition of a dictionary is analogous to that of a mapping in mathematics. In this case, a dict is a mapping of K (the set of keys) to V (the values) – but not vice versa. If you dereference a dict, you expect to get exactly one value returned. But, it is perfectly legal for different keys to map onto the same value, e.g.:

d = { k1 : v1, k2 : v2, k3 : v1}

When you look up a key by its corresponding value, you’re essentially inverting the dictionary. But a mapping isn’t necessarily invertible! In this example, asking for the key corresponding to v1 could yield k1 or k3. Should you return both? Just the first one found? That’s why indexof() is undefined for dictionaries.

If you know your data, you could do this. But an API can’t assume that an arbitrary dictionary is invertible, hence the lack of such an operation.
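
If you do want both keys back, one option is to invert into a dictionary of lists. A minimal sketch, using strings as stand-ins for k1, v1 and so on, with a hypothetical helper name invert_multi:

```python
from collections import defaultdict


def invert_multi(mapping):
    """Map each value to the list of all keys that point at it."""
    inverted = defaultdict(list)
    for key, value in mapping.items():
        inverted[value].append(key)
    return dict(inverted)


d = {'k1': 'v1', 'k2': 'v2', 'k3': 'v1'}
print(invert_multi(d))
```

This sidesteps the "which key wins" question by never discarding a key, at the cost of every lookup returning a list.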


Answer 19

Here is my take on it. It collects every match in a list, which is handy when more than one name has the age you are looking for.

myList = {'george':16,'amber':19, 'rachel':19, 
           'david':15 }                         #Setting the dictionary
result=[]                                       #Making ready of the result list
search_age = int(input('Enter age '))

for keywords in myList.keys():
    if myList[keywords] ==search_age:
        result.append(keywords)                #This part, we are making list of results

for res in result:                             #We are now printing the results
    print(res)

And that’s it…


Answer 20

d= {'george':16,'amber':19}

dict((v,k) for k,v in d.items()).get(16)

The output is as follows:

-> prints george

Answer 21

There is no easy way to find a key in a dictionary by ‘looking up’ the value. However, if you know the value, you can iterate through the keys and look up each key's value in the dictionary. If D[element], where D is a dictionary object, is equal to the value you’re trying to look up, you can execute some code.

D = {'Ali': 20, 'Marina': 12, 'George':16}
age = int(input('enter age:\t'))  
for element in D.keys():
    if D[element] == age:
        print(element)

Answer 22

You need to use a dictionary and the reverse of that dictionary, which means you need another data structure. If you are on Python 3, use the enum module; if you are on Python 2.7, use enum34, which is a backport for Python 2.

Example:

from enum import Enum

class Color(Enum): 
    red = 1 
    green = 2 
    blue = 3

>>> print(Color.red) 
Color.red

>>> print(repr(Color.red)) 
<Color.red: 1>

>>> type(Color.red)
<enum 'Color'>
>>> isinstance(Color.green, Color) 
True 

>>> member = Color.red 
>>> member.name 
'red' 
>>> member.value 
1 
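
What makes Enum relevant to this question is that it supports lookup in both directions: by name with Color['red'] and by value with Color(1). A short sketch reusing the Color class from the example (this usage is an illustration added here, not part of the answer):

```python
from enum import Enum


class Color(Enum):
    red = 1
    green = 2
    blue = 3


# Value -> member -> name: the "reverse" direction of the mapping.
print(Color(2).name)

# Name -> member -> value: the "forward" direction.
print(Color['blue'].value)

# An unknown value raises ValueError rather than returning None.
try:
    Color(99)
except ValueError as error:
    print("no such member:", error)
```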

Answer 23

def get_Value(dic,value):
    for name in dic:
        if dic[name] == value:
            del dic[name]
            return name

Answer 24

Just my answer in lambda and filter.

filter( lambda x, dictionary=dictionary, search_age=int(search_age): dictionary[x] == search_age  , dictionary )

Answer 25

This has already been answered, but since several people mentioned reversing the dictionary, here’s how you do it in one line (assuming a 1:1 mapping), along with some performance data:

python 2.6:

reversedict = dict([(value, key) for key, value in mydict.iteritems()])

2.7+:

reversedict = {value:key for key, value in mydict.iteritems()}

If you think it’s not 1:1, you can still create a reasonable reverse mapping with a couple of lines:

from collections import defaultdict

reversedict = defaultdict(list)
for key, value in mydict.iteritems():
    reversedict[value].append(key)

How slow is this? Slower than a simple search, but not nearly as slow as you’d think: on a ‘straight’ 100000-entry dictionary, a ‘fast’ search (i.e. looking for a value that should be early in the keys) was about 10x faster than reversing the entire dictionary, and a ‘slow’ search (towards the end) about 4-5x faster. So after at most about 10 lookups, it has paid for itself.

The second version (with lists per item) takes about 2.5x as long as the simple version.

largedict = dict((x,x) for x in range(100000))

# Should be slow, has to search 90000 entries before it finds it
In [26]: %timeit largedict.keys()[largedict.values().index(90000)]
100 loops, best of 3: 4.81 ms per loop

# Should be fast, has to only search 9 entries to find it. 
In [27]: %timeit largedict.keys()[largedict.values().index(9)]
100 loops, best of 3: 2.94 ms per loop

# How about using iterkeys() instead of keys()?
# These are faster, because you don't have to create the entire keys array.
# You DO have to create the entire values array - more on that later.

In [31]: %timeit islice(largedict.iterkeys(), largedict.values().index(90000))
100 loops, best of 3: 3.38 ms per loop

In [32]: %timeit islice(largedict.iterkeys(), largedict.values().index(9))
1000 loops, best of 3: 1.48 ms per loop

In [24]: %timeit reversedict = dict([(value, key) for key, value in largedict.iteritems()])
10 loops, best of 3: 22.9 ms per loop

In [23]: %%timeit
....: reversedict = defaultdict(list)
....: [reversedict[value].append(key) for key, value in largedict.iteritems()]
....:
10 loops, best of 3: 53.6 ms per loop

Also had some interesting results with ifilter. Theoretically, ifilter should be faster, in that we can use itervalues() and possibly not have to create/go through the entire values list. In practice, the results were… odd…

In [72]: %%timeit
....: myf = ifilter(lambda x: x[1] == 90000, largedict.iteritems())
....: myf.next()[0]
....:
100 loops, best of 3: 15.1 ms per loop

In [73]: %%timeit
....: myf = ifilter(lambda x: x[1] == 9, largedict.iteritems())
....: myf.next()[0]
....:
100000 loops, best of 3: 2.36 us per loop

So, for small offsets, it was dramatically faster than any previous version (2.36 microseconds vs. a minimum of 1.48 milliseconds for previous cases). However, for large offsets near the end of the list, it was dramatically slower (15.1 ms vs. the same 1.48 ms). The small savings at the low end are not worth the cost at the high end, imho.
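
On Python 3, ifilter and iteritems() are gone; the built-in filter() is already lazy, so the same early-exit search can be sketched as follows (first_key is a hypothetical helper name, and no timings are claimed for it):

```python
def first_key(mapping, value):
    """Return the first key whose value equals `value`, or None if absent."""
    matches = filter(lambda item: item[1] == value, mapping.items())
    # next() stops at the first hit; the (None, None) default covers "not found".
    return next(matches, (None, None))[0]


largedict = dict((x, x) for x in range(100000))
print(first_key(largedict, 9))
print(first_key(largedict, -1))
```

As in the measurements above, the cost still grows with how far into the dictionary the match sits, because the scan is linear.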


Answer 26

Sometimes int() may be needed:

titleDic = {'Фильмы':1, 'Музыка':2}

def categoryTitleForNumber(self, num):
    search_title = ''
    for title, titleNum in self.titleDic.items():
        if int(titleNum) == int(num):
            search_title = title
    return search_title

Answer 27

Here is a solution which works both in Python 2 and Python 3:

dict((v, k) for k, v in list.items())[search_age]

The part until [search_age] constructs the reverse dictionary (where values are keys and vice-versa). You could create a helper method which will cache this reversed dictionary like so:

def find_name(age, _rev_lookup=dict((v, k) for k, v in ages_by_name.items())):
    return _rev_lookup[age]

or, even more generally, a factory which creates a by-age name lookup function for one or more of your dictionaries:

def create_name_finder(ages_by_name):
    names_by_age = dict((v, k) for k, v in ages_by_name.items())
    def find_name(age):
      return names_by_age[age]
    return find_name  # return the closure so the caller can use it

so you would be able to do:

find_teen_by_age = create_name_finder({'george':16,'amber':19})
...
find_teen_by_age(search_age)

Note that I renamed list to ages_by_name since the former is a predefined type.


Answer 28

This is how you access the dictionary to do what you want:

list = {'george': 16, 'amber': 19}
search_age = int(raw_input("Provide age"))  # raw_input returns a str; convert it to match the int ages
for age in list:
    if list[age] == search_age:
        print age

Of course, the names are so off that it looks like it would print an age, but it DOES print the name. Since you are accessing by name, it becomes more understandable if you write:

list = {'george': 16, 'amber': 19}
search_age = int(raw_input("Provide age"))
for name in list:
    if list[name] == search_age:
        print name

Better yet:

people = {'george': {'age': 16}, 'amber': {'age': 19}}
search_age = int(raw_input("Provide age"))
for name in people:
    if people[name]['age'] == search_age:
        print name

Answer 29

dictionary = {'george' : 16, 'amber' : 19}
search_age = int(raw_input("Provide age"))  # convert the str input to int before comparing
# On Python 2, filter() returns a list; appending [None] makes key None when nothing matches.
key = (filter( lambda x: dictionary[x] == search_age  , dictionary ) + [None])[0]

For multiple occurrences use:

keys = filter( lambda x: dictionary[x] == search_age  , dictionary )

Post JSON using Python Requests

Question: Post JSON using Python Requests

I need to POST a JSON from a client to a server. I’m using Python 2.7.1 and simplejson. The client is using Requests. The server is CherryPy. I can GET a hard-coded JSON from the server (code not shown), but when I try to POST a JSON to the server, I get “400 Bad Request”.

Here is my client code:

data = {'sender':   'Alice',
    'receiver': 'Bob',
    'message':  'We did it!'}
data_json = simplejson.dumps(data)
payload = {'json_payload': data_json}
r = requests.post("http://localhost:8080", data=payload)

Here is the server code.

class Root(object):

    def __init__(self, content):
        self.content = content
        print self.content  # this works

    exposed = True

    def GET(self):
        cherrypy.response.headers['Content-Type'] = 'application/json'
        return simplejson.dumps(self.content)

    def POST(self):
        self.content = simplejson.loads(cherrypy.request.body.read())

Any ideas?


Answer 0

As of Requests 2.4.2, you can alternatively pass the ‘json’ parameter in the call, which makes this simpler.

>>> import requests
>>> r = requests.post('http://httpbin.org/post', json={"key": "value"})
>>> r.status_code
200
>>> r.json()
{'args': {},
 'data': '{"key": "value"}',
 'files': {},
 'form': {},
 'headers': {'Accept': '*/*',
             'Accept-Encoding': 'gzip, deflate',
             'Connection': 'close',
             'Content-Length': '16',
             'Content-Type': 'application/json',
             'Host': 'httpbin.org',
             'User-Agent': 'python-requests/2.4.3 CPython/3.4.0',
             'X-Request-Id': 'xx-xx-xx'},
 'json': {'key': 'value'},
 'origin': 'x.x.x.x',
 'url': 'http://httpbin.org/post'}

EDIT: This feature has been added to the official documentation. You can view it here: Requests documentation


Answer 1

It turns out I was missing the header information. The following works:

import json
import requests

url = "http://localhost:8080"
data = {'sender': 'Alice', 'receiver': 'Bob', 'message': 'We did it!'}
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, data=json.dumps(data), headers=headers)
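
A rough way to see why the original client may have failed, using only the standard library: passing a dict to data= form-encodes it, so the server receives json_payload=... instead of a JSON document. The sketch below is an approximation of the two request bodies, not the exact bytes Requests sends, and the helper parses_as_json is a name invented for this example.

```python
import json
from urllib.parse import urlencode

data = {'sender': 'Alice', 'receiver': 'Bob', 'message': 'We did it!'}

# Approximately what data={'json_payload': json.dumps(data)} puts in the body.
form_body = urlencode({'json_payload': json.dumps(data)})

# What data=json.dumps(data) puts in the body.
json_body = json.dumps(data)

def parses_as_json(text):
    """Return True if text is a valid JSON document."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

print(form_body[:30])
print(parses_as_json(form_body))
print(parses_as_json(json_body))
```

A server that calls a JSON parser directly on the body, as the POST handler in the question does, can only handle the second form.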

Answer 2

From requests 2.4.2 (https://pypi.python.org/pypi/requests), the “json” parameter is supported. No need to specify “Content-Type”. So the shorter version:

requests.post('http://httpbin.org/post', json={'test': 'cheers'})

Answer 3

The better way is:

import requests

url = "http://xxx.xxxx.xx"
datas = {"cardno": "6248889874650987", "systemIdentify": "s08", "sourceChannel": 12}
headers = {'Content-type': 'application/json'}  # optional: json= already sets this header
rsp = requests.post(url, json=datas, headers=headers)

Answer 4

Works perfectly with Python 3.5+

client:

import requests
data = {'sender':   'Alice',
    'receiver': 'Bob',
    'message':  'We did it!'}
r = requests.post("http://localhost:8080", json={'json_payload': data})

server:

class Root(object):

    def __init__(self, content):
        self.content = content
        print(self.content)  # this works

    exposed = True

    def GET(self):
        cherrypy.response.headers['Content-Type'] = 'application/json'
        return simplejson.dumps(self.content)

    @cherrypy.tools.json_in()
    @cherrypy.tools.json_out()
    def POST(self):
        self.content = cherrypy.request.json
        return {'status': 'success', 'message': 'updated'}

Answer 5

Which parameter to use (data / json / files) depends on the request header named Content-Type (you can usually check it with your browser's developer tools).

When the Content-Type is application/x-www-form-urlencoded, the code should be:

requests.post(url, data=jsonObj)

When the Content-Type is application/json, your code should be one of the following:

requests.post(url, json=jsonObj)
requests.post(url, data=jsonstr, headers={"Content-Type":"application/json"})

When the Content-Type is multipart/form-data, it is used to upload files, so your code should be:

requests.post(url, files=xxxx)

Convert date to datetime in Python

Question: Convert date to datetime in Python

Is there a built-in method for converting a date to a datetime in Python, for example getting the datetime for the midnight of the given date? The opposite conversion is easy: datetime has a .date() method.

Do I really have to manually call datetime(d.year, d.month, d.day)?


Answer 0

You can use datetime.combine(date, time); for the time, you create a datetime.time object initialized to midnight.

from datetime import date
from datetime import datetime

dt = datetime.combine(date.today(), datetime.min.time())
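
As a small check of what datetime.min.time() means here, the sketch below uses a fixed example date (an assumption, chosen so the output is reproducible) and compares three spellings of "midnight".

```python
from datetime import date, datetime, time

d = date(2009, 12, 20)  # fixed example date so the result is reproducible

# Three equivalent ways to say "midnight" for the time part.
midnight_a = datetime.combine(d, datetime.min.time())
midnight_b = datetime.combine(d, time.min)
midnight_c = datetime.combine(d, time())  # time() defaults to 00:00:00

print(midnight_a)  # 2009-12-20 00:00:00
print(midnight_a == midnight_b == midnight_c)  # True
```

time.min or time() may read more directly than datetime.min.time() if the reader is not familiar with datetime.min.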

Answer 1

There are several ways, although I do believe the one you mention (and dislike) is the most readable one.

>>> import datetime
>>> t = datetime.date.today()
>>> datetime.datetime.fromordinal(t.toordinal())
datetime.datetime(2009, 12, 20, 0, 0)
>>> datetime.datetime(t.year, t.month, t.day)
datetime.datetime(2009, 12, 20, 0, 0)
>>> datetime.datetime(*t.timetuple()[:-4])
datetime.datetime(2009, 12, 20, 0, 0)

and so forth, but basically they all hinge on appropriately extracting info from the date object and ploughing it back into the suitable constructor or class method for datetime.
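
To confirm that the three spellings above agree, here is a minimal sketch with a fixed date (an assumption, used instead of date.today() so the result is stable):

```python
import datetime

t = datetime.date(2009, 12, 20)

candidates = [
    datetime.datetime.fromordinal(t.toordinal()),
    datetime.datetime(t.year, t.month, t.day),
    # timetuple() has 9 fields; dropping the last 4 leaves year..minute.
    datetime.datetime(*t.timetuple()[:-4]),
]

# Every variant is midnight on the same day, and converts back to the same date.
print(all(c == candidates[0] for c in candidates))  # True
print(all(c.date() == t for c in candidates))       # True
```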


Answer 2

The accepted answer is correct, but I would prefer to avoid using datetime.min.time() because it’s not obvious to me exactly what it does. If it’s obvious to you, then more power to you. I also feel the same way about the timetuple method and the reliance on the ordering.

In my opinion, the most readable, explicit way of doing this without relying on the reader to be very familiar with the datetime module API is:

from datetime import date, datetime
today = date.today()
today_with_time = datetime(
    year=today.year, 
    month=today.month,
    day=today.day,
)

That’s my take on “explicit is better than implicit.”
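
If this explicit form is wrapped in a helper, one detail is worth handling: datetime is a subclass of date, so a value that is already a datetime would silently lose its time part. The function name as_datetime below is hypothetical, and this is only a sketch of one way to guard against that.

```python
from datetime import date, datetime

def as_datetime(value):
    """Return value as a datetime; plain dates become midnight of that day."""
    if isinstance(value, datetime):
        # Already a datetime (which is also a date): keep its time part.
        return value
    return datetime(year=value.year, month=value.month, day=value.day)

print(as_datetime(date(2016, 3, 1)))
print(as_datetime(datetime(2016, 3, 1, 14, 30)))
```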


Answer 3

You can use the date.timetuple() method and the unpacking operator *.

import datetime
d = datetime.date.today()
args = d.timetuple()[:6]  # year, month, day, hour, minute, second
datetime.datetime(*args)

Answer 4

Today being 2016, I think the cleanest solution is provided by pandas Timestamp:

from datetime import date
import pandas as pd
d = date.today()
pd.Timestamp(d)

Timestamp is the pandas equivalent of datetime and is interchangeable with it in most cases. Check:

from datetime import datetime
isinstance(pd.Timestamp(d), datetime)

But in case you really want a vanilla datetime, you can still do:

pd.Timestamp(d).to_pydatetime()  # to_datetime() was the older name; it has been removed from recent pandas

Timestamps are a lot more powerful than datetimes, among other things when dealing with time zones. Actually, Timestamps are so powerful that it’s a pity they are so poorly documented…


Answer 5

One way to convert from date to datetime that hasn’t been mentioned yet:

from datetime import date, datetime
d = date.today()
datetime.strptime(d.strftime('%Y%m%d'), '%Y%m%d')

Answer 6

You can use easy_date to make it easy:

import date_converter
my_datetime = date_converter.date_to_datetime(my_date)

Answer 7

If you need something quick, datetime_object.date() gives you a date of a datetime object.


Answer 8

I am a newbie to Python, but this code worked for me: it converts the input string I provide to a datetime. Here’s the code. Correct me if I’m wrong.

from datetime import datetime

user_date = '02/15/1989'
if user_date is not None:
    # Parse the month/day/year string into a datetime at midnight.
    user_date = datetime.strptime(user_date, "%m/%d/%Y")
else:
    user_date = datetime.now()
print(user_date)

Is there a difference between “==” and “is”?

Question: Is there a difference between “==” and “is”?

My Google-fu has failed me.

In Python, are the following two tests for equality equivalent?

n = 5
# Test one.
if n == 5:
    print 'Yay!'

# Test two.
if n is 5:
    print 'Yay!'

Does this hold true for objects where you would be comparing instances (a list say)?

Okay, so this kind of answers my question:

L = []
L.append(1)
if L == [1]:
    print 'Yay!'
# Holds true, but...

if L is [1]:
    print 'Yay!'
# Doesn't.

So == tests value where is tests to see if they are the same object?


Answer 0

is will return True if two variables point to the same object, == if the objects referred to by the variables are equal.

>>> a = [1, 2, 3]
>>> b = a
>>> b is a 
True
>>> b == a
True

# Make a new copy of list `a` via the slice operator, 
# and assign it to variable `b`
>>> b = a[:] 
>>> b is a
False
>>> b == a
True

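In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this is not guaranteed to work (the session below is from an older CPython; newer interpreters may fold constants and give a different result):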

>>> 1000 is 10**3
False
>>> 1000 == 10**3
True

The same holds true for string literals:

>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True

Please see this question as well.
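
A note for Python 3 readers: the intern built-in used in the session above lives at sys.intern there. The sketch below assumes Python 3 and builds the strings at run time so that compile-time constant handling does not affect the result; the variable names are made up for the example.

```python
import sys

x = "a"
built_at_runtime = "".join([x, x])   # "aa", created while the program runs
other = "".join(["a", "a"])          # another "aa", created separately

# Equal values, as expected.
print(built_at_runtime == other)  # True

# Interning equal strings always yields one shared object.
print(sys.intern(built_at_runtime) is sys.intern(other))  # True
```

Whether built_at_runtime is other holds without interning is left unspecified on purpose, since that depends on the interpreter.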


Answer 1

There is a simple rule of thumb to tell you when to use == or is.

  • == is for value equality. Use it when you would like to know if two objects have the same value.
  • is is for reference equality. Use it when you would like to know if two references refer to the same object.

In general, when you are comparing something to a simple type, you are usually checking for value equality, so you should use ==. For example, the intention of your example is probably to check whether x has a value equal to 2 (==), not whether x is literally referring to the same object as 2.


Something else to note: because of the way the CPython reference implementation works, you’ll get unexpected and inconsistent results if you mistakenly use is to compare for reference equality on integers:

>>> a = 500
>>> b = 500
>>> a == b
True
>>> a is b
False

That’s pretty much what we expected: a and b have the same value, but are distinct entities. But what about this?

>>> c = 200
>>> d = 200
>>> c == d
True
>>> c is d
True

This is inconsistent with the earlier result. What’s going on here? It turns out the reference implementation of Python caches integer objects in the range -5..256 as singleton instances for performance reasons. Here’s an example demonstrating this:

>>> for i in range(250, 260): a = i; print "%i: %s" % (i, a is int(str(i)));
... 
250: True
251: True
252: True
253: True
254: True
255: True
256: True
257: False
258: False
259: False

This is another obvious reason not to use is: the behavior is left up to implementations when you’re erroneously using it for value equality.


Answer 2

== determines if the values are equal, while is determines if they are the exact same object.


Answer 3

Is there a difference between == and is in Python?

Yes, they have a very important difference.

==: check for equality – the semantics are that equivalent objects (that aren’t necessarily the same object) will test as equal. As the documentation says:

The operators <, >, ==, >=, <=, and != compare the values of two objects.

is: check for identity – the semantics are that the object (as held in memory) is the object. Again, the documentation says:

The operators is and is not test for object identity: x is y is true if and only if x and y are the same object. Object identity is determined using the id() function. x is not y yields the inverse truth value.

Thus, the check for identity is the same as checking for the equality of the IDs of the objects. That is,

a is b

is the same as:

id(a) == id(b)

where id is the builtin function that returns an integer that “is guaranteed to be unique among simultaneously existing objects” (see help(id)) and where a and b are any arbitrary objects.
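
The equivalence can be checked directly for objects that are alive at the same time. This is a minimal sketch; the list names are arbitrary.

```python
a = [1, 2, 3]
b = a          # second name for the same list
c = list(a)    # equal contents, but a new list object

for left, right in [(a, b), (a, c)]:
    # For live objects, "is" and comparing id() values always agree.
    assert (left is right) == (id(left) == id(right))

print(a is b, a == b)  # True True
print(a is c, a == c)  # False True
```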

Other Usage Directions

You should use these comparisons for their semantics. Use is to check identity and == to check equality.

So in general, we use is to check for identity. This is usually useful when we are checking for an object that should only exist once in memory, referred to as a “singleton” in the documentation.

Use cases for is include:

  • None
  • enum values (when using Enums from the enum module)
  • usually modules
  • usually class objects resulting from class definitions
  • usually function objects resulting from function definitions
  • anything else that should only exist once in memory (all singletons, generally)
  • a specific object that you want by identity

Usual use cases for == include:

  • numbers, including integers
  • strings
  • lists
  • sets
  • dictionaries
  • custom mutable objects
  • other builtin immutable objects, in most cases

The general use case for ==, again, is that the object you want may not be the same object; it may instead be an equivalent one.

PEP 8 directions

PEP 8, the official Python style guide for the standard library also mentions two use-cases for is:

Comparisons to singletons like None should always be done with is or is not, never the equality operators.

Also, beware of writing if x when you really mean if x is not None — e.g. when testing whether a variable or argument that defaults to None was set to some other value. The other value might have a type (such as a container) that could be false in a boolean context!
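
The second PEP 8 point can be sketched with a function whose argument defaults to None. The function describe and its messages are invented for this illustration.

```python
def describe(items=None):
    """Distinguish 'argument omitted' from 'empty container passed'."""
    if items is None:
        return "no argument given"
    if not items:
        # Reached for [], (), {}, "" and other falsy values that are not None.
        return "empty container given"
    return "%d item(s) given" % len(items)

print(describe())        # no argument given
print(describe([]))      # empty container given
print(describe([1, 2]))  # 2 item(s) given
```

Had the first test been written as if not items, the empty list would have been treated as if no argument had been passed.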

Inferring equality from identity

If is is true, equality can usually be inferred – logically, if an object is itself, then it should test as equivalent to itself.

In most cases this logic is true, but it relies on the implementation of the __eq__ special method. As the docs say,

The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).

and in the interests of consistency, recommends:

Equality comparison should be reflexive. In other words, identical objects should compare equal:

x is y implies x == y

We can see that this is the default behavior for custom objects:

>>> class Object(object): pass
>>> obj = Object()
>>> obj2 = Object()
>>> obj == obj, obj is obj
(True, True)
>>> obj == obj2, obj is obj2
(False, False)

The contrapositive is also usually true: if two things test as not equal, you can usually infer that they are not the same object.

Since tests for equality can be customized, this inference does not always hold true for all types.

An exception

A notable exception is nan – it always tests as not equal to itself:

>>> nan = float('nan')
>>> nan
nan
>>> nan is nan
True
>>> nan == nan           # !!!!!
False

Checking for identity can be a much quicker check than checking for equality (which might require recursively checking members).

But it cannot be substituted for equality where you may find more than one object as equivalent.

Note that comparing lists and tuples for equality first checks the identity of their elements and treats identical elements as equal (because this is a fast check). This can create contradictions if the logic is inconsistent, as it is for nan:

>>> [nan] == [nan]
True
>>> (nan,) == (nan,)
True

A Cautionary Tale:

The question is attempting to use is to compare integers. You shouldn’t assume that an instance of an integer is the same instance as one obtained by another reference. This story explains why.

A commenter had code that relied on the fact that small integers (-5 to 256 inclusive) are singletons in Python, instead of checking for equality.

Wow, this can lead to some insidious bugs. I had some code that checked if a is b, which worked as I wanted because a and b are typically small numbers. The bug only happened today, after six months in production, because a and b were finally large enough to not be cached. – gwg

It worked in development. It may have passed some unittests.

And it worked in production – until the code checked for an integer larger than 256, at which point it failed in production.

This is a production failure that could have been caught in code review or possibly with a style-checker.

Let me emphasize: do not use is to compare integers.


Answer 4

What’s the difference between is and ==?

== and is are different comparisons! As others already said:

  • == compares the values of the objects.
  • is compares the references of the objects.

In Python names refer to objects, for example in this case value1 and value2 refer to an int instance storing the value 1000:

value1 = 1000
value2 = value1

Because value2 refers to the same object, both is and == will give True:

>>> value1 == value2
True
>>> value1 is value2
True

In the following example the names value1 and value2 refer to different int instances, even if both store the same integer:

>>> value1 = 1000
>>> value2 = 1000

Because the same value (integer) is stored, == will be True; that’s why it’s often called “value comparison”. However, is will return False because these are different objects:

>>> value1 == value2
True
>>> value1 is value2
False

When to use which?

Generally, is is a much faster comparison. That’s why CPython caches (or maybe reuses would be the better term) certain objects like small integers, some strings, etc. But this should be treated as an implementation detail that could (even if unlikely) change at any point without warning.

You should only use is if you:

  • want to check if two objects are really the same object (not just the same “value”). One example can be if you use a singleton object as constant.
  • want to compare a value to a Python constant. The constants in Python are:

    • None
    • True [1]
    • False [1]
    • NotImplemented
    • Ellipsis
    • __debug__
    • classes (for example int is int or int is float)
    • there could be additional constants in built-in modules or 3rd-party modules, for example np.ma.masked from the NumPy module

In every other case you should use == to check for equality.

Can I customize the behavior?

There is an aspect of == that hasn’t been mentioned in the other answers: it’s part of Python’s “Data model”. That means its behavior can be customized using the __eq__ method. For example:

class MyClass(object):
    def __init__(self, val):
        self._value = val

    def __eq__(self, other):
        print('__eq__ method called')
        try:
            return self._value == other._value
        except AttributeError:
            raise TypeError('Cannot compare {0} to objects of type {1}'
                            .format(type(self), type(other)))

This is just an artificial example to illustrate that the method is really called:

>>> MyClass(10) == MyClass(10)
__eq__ method called
True

Note that by default (if no other implementation of __eq__ can be found in the class or the superclasses) __eq__ uses is:

class AClass(object):
    def __init__(self, value):
        self._value = value

>>> a = AClass(10)
>>> b = AClass(10)
>>> a == b
False
>>> a == a
True

So it’s actually important to implement __eq__ if you want “more” than just reference-comparison for custom classes!

On the other hand, you cannot customize is checks. They always just compare whether the two references point to the same object.

Will these comparisons always return a boolean?

Because __eq__ can be re-implemented or overridden, it’s not limited to return True or False. It could return anything (but in most cases it should return a boolean!).

For example with NumPy arrays the == will return an array:

>>> import numpy as np
>>> np.arange(10) == 2
array([False, False,  True, False, False, False, False, False, False, False], dtype=bool)

But is checks will always return True or False!
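
As a variation on the MyClass example above, a common convention is to return NotImplemented for unsupported types instead of raising, which lets Python fall back to its default comparison. The class Money below is hypothetical and only sketches that convention.

```python
class Money:
    """Value object: two instances are equal when their amounts match."""

    def __init__(self, amount):
        self.amount = amount

    def __eq__(self, other):
        if not isinstance(other, Money):
            # Let Python try the other operand, then fall back to identity.
            return NotImplemented
        return self.amount == other.amount

    def __hash__(self):
        # Keep hashing consistent with __eq__ so instances work in sets/dicts.
        return hash(self.amount)

a = Money(10)
b = Money(10)
print(a == b)   # True: same value
print(a is b)   # False: two distinct objects
print(a == 10)  # False: unsupported type, no exception raised
```

Defining __eq__ without __hash__ would make instances unhashable in Python 3, which is why both are defined here.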


[1] As Aaron Hall mentioned in the comments:

Generally you shouldn’t do any is True or is False checks because one normally uses these “checks” in a context that implicitly converts the condition to a boolean (for example in an if statement). So doing the is True comparison and the implicit boolean cast is doing more work than just doing the boolean cast – and you limit yourself to booleans (which isn’t considered pythonic).

As PEP 8 mentions:

Don’t compare boolean values to True or False using ==.

Yes:   if greeting:
No:    if greeting == True:
Worse: if greeting is True:

Answer 5

They are completely different. is checks for object identity, while == checks for equality (a notion that depends on the two operands’ types).

It is only a lucky coincidence that “is” seems to work correctly with small integers (e.g. 5 == 4+1). That is because CPython optimizes the storage of integers in the range (-5 to 256) by making them singletons. This behavior is totally implementation-dependent and not guaranteed to be preserved under all manner of minor transformative operations.

For example, Python 3.5 also makes short strings singletons, but slicing them disrupts this behavior:

>>> "foo" + "bar" == "foobar"
True
>>> "foo" + "bar" is "foobar"
True
>>> "foo"[:] + "bar" == "foobar"
True
>>> "foo"[:] + "bar" is "foobar"
False

Answer 6

https://docs.python.org/library/stdtypes.html#comparisons

is tests for identity; == tests for equality.

Each (small) integer value is mapped to a single object, so every 3 is identical and equal. This is an implementation detail, though, not part of the language spec.


Answer 7

Your answer is correct. The is operator compares the identity of two objects. The == operator compares the values of two objects.

An object’s identity never changes once it has been created; you may think of it as the object’s address in memory.

You can control comparison behaviour of object values by defining a __cmp__ method or a rich comparison method like __eq__.
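
To make the last point concrete, here is a minimal sketch, assuming Python 3 (where `__cmp__` is no longer consulted and only the rich comparison methods apply). The `Point` class is a hypothetical example, not something from the original answer. It shows that `__eq__` changes what `==` reports while `is` keeps comparing identity.

```python
class Point:
    """Hypothetical value type used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Rich comparison hook used by ==; identity (is) never calls this.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)

same_value = (p == q)    # compares the coordinates via __eq__
same_object = (p is q)   # compares identity: two separate objects
```

Here `same_value` is True and `same_object` is False, because `p` and `q` are distinct objects that happen to hold equal coordinates.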


回答 8

看一下Stack Overflow问题,Python的“ is”运算符在使用整数时表现异常

最主要的原因是“ is”检查它们是否是同一对象,而不只是彼此相等(小于256的数字是特例)。

Have a look at Stack Overflow question Python’s “is” operator behaves unexpectedly with integers.

What it mostly boils down to is that “is” checks whether they are the same object, not just equal to each other (in CPython, small integers from -5 to 256 are a special case because they are cached).


回答 9

简而言之,is检查两个引用是否指向同一对象。==检查两个对象是否具有相同的值。

a=[1,2,3]
b=a        #a and b point to the same object
c=list(a)  #c points to different object 

if a==b:
    print('#')   #output:#
if a is b:
    print('##')  #output:## 
if a==c:
    print('###') #output:###
if a is c:
    print('####') #no output as c and a point to different object 

In a nutshell, is checks whether two references point to the same object or not. == checks whether two objects have the same value or not.

a=[1,2,3]
b=a        #a and b point to the same object
c=list(a)  #c points to different object 

if a==b:
    print('#')   #output:#
if a is b:
    print('##')  #output:## 
if a==c:
    print('###') #output:###
if a is c:
    print('####') #no output as c and a point to different object 

回答 10

正如John Feminella所说,大多数时候,您将使用==和!=,因为您的目标是比较值。我只想对剩下的时间做些什么:

NoneType只有一个实例,即None是一个单例。因此foo == Nonefoo is None意思相同。但是,is测试速度更快,并且要使用Pythonic约定foo is None

如果您要对垃圾收集进行自省或处理,或者检查自定义构建的字符串实习小工具是否正常工作,则可能有一个用例foo是is bar

True和False也是(现在)单例,但是没有用例,foo == True也没有用例foo is True

As John Feminella said, most of the time you will use == and != because your objective is to compare values. I’d just like to categorise what you would do the rest of the time:

There is one and only one instance of NoneType i.e. None is a singleton. Consequently foo == None and foo is None mean the same. However the is test is faster and the Pythonic convention is to use foo is None.
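
A small sketch of why the `is None` convention is the safer habit: `==` can be redefined by a class, whereas `is` cannot. The `AlwaysEqual` class below is a deliberately contrived, hypothetical example and not part of the original answer.

```python
class AlwaysEqual:
    """Contrived class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

eq_result = (obj == None)   # delegates to AlwaysEqual.__eq__
is_result = (obj is None)   # pure identity check, cannot be overridden
```

For ordinary objects the two tests agree, but here `eq_result` is True while `is_result` is False, so only the identity test reliably answers "is this actually None?".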

If you are doing some introspection or mucking about with garbage collection or checking whether your custom-built string interning gadget is working or suchlike, then you probably have a use-case for foo is bar.

True and False are also (now) singletons, but there is no use-case for foo == True and no use case for foo is True.


回答 11

他们中的大多数人已经回答了这一点。正如补充说明(基于我的理解和实验,但不是来自书面记录)的声明

==如果变量引用的对象相等

从上面的答案应该理解为

==如果变量引用的对象相等并且属于相同类型/类的对象

。我根据以下测试得出了这个结论:

list1 = [1,2,3,4]
tuple1 = (1,2,3,4)

print(list1)
print(tuple1)
print(id(list1))
print(id(tuple1))

print(list1 == tuple1)
print(list1 is tuple1)

这里的列表和元组的内容相同,但类型/类不同。

Most of them already answered to the point. Just as an additional note (based on my understanding and experimenting but not from a documented source), the statement

== if the objects referred to by the variables are equal

from above answers should be read as

== if the objects referred to by the variables are equal and objects belonging to the same type/class

. I arrived at this conclusion based on the below test:

list1 = [1,2,3,4]
tuple1 = (1,2,3,4)

print(list1)
print(tuple1)
print(id(list1))
print(id(tuple1))

print(list1 == tuple1)
print(list1 is tuple1)

Here the contents of the list and tuple are same but the type/class are different.
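
The result of the test above can be sketched as follows. It also adds one extra comparison (an assumption on my part, not from the original answer) suggesting that the outcome depends on how each type defines equality rather than on a blanket "same type" rule: built-in lists and tuples never compare equal, but an int and a float with the same numeric value do.

```python
list1 = [1, 2, 3, 4]
tuple1 = (1, 2, 3, 4)

# A list and a tuple never compare equal, even with the same contents.
list_vs_tuple = (list1 == tuple1)

# Converting the tuple to a list makes the comparison element-wise.
list_vs_converted = (list1 == list(tuple1))

# Different types can still be equal when their __eq__ allows it.
int_vs_float = (1 == 1.0)

# Two separately created objects are never identical.
identical = (list1 is tuple1)
```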


回答 12

is和equals(==)之间的Python区别

is运算符可能看起来与相等运算符相同,但它们并不相同。

is检查两个变量是否指向同一对象,而==符号检查两个变量的值是否相同。

因此,如果is运算符返回True,则相等性肯定为True,但相反的情况可能为True,也可能不是True。

这是一个演示相似性和差异性的示例。

>>> a = b = [1,2,3]
>>> c = [1,2,3]
>>> a == b
True
>>> a == c
True
>>> a is b
True
>>> a is c
False
>>> a = [1,2,3]
>>> b = [1,2]
>>> a == b
False
>>> a is b
False
>>> del a[2]
>>> a == b
True
>>> a is b
False

Tip: Avoid using the is operator for immutable types such as strings and numbers; the result is unpredictable.

Python difference between is and equals(==)

The is operator may look the same as the equality operator, but the two are not the same.

The is operator checks whether both variables point to the same object, whereas == checks whether the values of the two variables are equal.

So if the is operator returns True, then equality is normally True as well, but the opposite may or may not be True.

Here is an example to demonstrate the similarity and the difference.

>>> a = b = [1,2,3]
>>> c = [1,2,3]
>>> a == b
True
>>> a == c
True
>>> a is b
True
>>> a is c
False
>>> a = [1,2,3]
>>> b = [1,2]
>>> a == b
False
>>> a is b
False
>>> del a[2]
>>> a == b
True
>>> a is b
False

Tip: Avoid using the is operator for immutable types such as strings and numbers; the result is unpredictable.

回答 13

当这篇文章中的其他人详细回答了这个问题时,我将主要强调字符串之间的比较is以及可以给出不同结果的== 字符串,我敦促程序员谨慎使用它们。

为了进行字符串比较,请确保使用==代替is

str = 'hello'
if (str is 'hello'):
    print ('str is hello')
if (str == 'hello'):
    print ('str == hello')

出:

str is hello
str == hello

在下面的例子中==,并is会得到不同的结果:

str = 'hello sam'
if (str is 'hello sam'):
    print ('str is hello sam')
if (str == 'hello sam'):
    print ('str == hello sam')

出:

str == hello sam

结论:

is谨慎使用以比较字符串

As the other people in this post have answered the question in detail, I will mainly emphasize the comparison of strings with is and ==, which can give different results, and I would urge programmers to use them carefully.

For string comparison, make sure to use == instead of is:

str = 'hello'
if (str is 'hello'):
    print ('str is hello')
if (str == 'hello'):
    print ('str == hello')

Out:

str is hello
str == hello

But in the below example == and is will get different results:

str = 'hello sam'
if (str is 'hello sam'):
    print ('str is hello sam')
if (str == 'hello sam'):
    print ('str == hello sam')

Out:

str == hello sam

Conclusion:

Use is carefully to compare between strings
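
As a hedged sketch of the same idea, the example below compares two variables rather than a variable and a literal (comparing against a literal with `is` triggers a SyntaxWarning on Python 3.8 and later). The names `literal` and `built` are illustrative assumptions. Whether the identity test succeeds depends on the interpreter's string interning, so no particular result is promised for it.

```python
literal = 'hello sam'
built = ' '.join(['hello', 'sam'])  # same characters, assembled at runtime

# Equality compares the characters, so this is reliable.
values_match = (literal == built)

# Identity depends on whether the interpreter reuses the same string object;
# treat the result as implementation-dependent.
same_object = (literal is built)
```

Only `values_match` should be used to decide whether two strings hold the same text.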


函数中静态变量的Python等效项是什么?

问题:函数中静态变量的Python等效项是什么?

此C / C ++代码的惯用Python等效项是什么?

void foo()
{
    static int counter = 0;
    counter++;
    printf("counter is %d\n", counter);
}

具体来说,如何在函数级别而非类级别实现静态成员?并将函数放入类中是否会发生任何变化?

What is the idiomatic Python equivalent of this C/C++ code?

void foo()
{
    static int counter = 0;
    counter++;
    printf("counter is %d\n", counter);
}

specifically, how does one implement the static member at the function level, as opposed to the class level? And does placing the function into a class change anything?


回答 0

有点相反,但这应该起作用:

def foo():
    foo.counter += 1
    print "Counter is %d" % foo.counter
foo.counter = 0

如果要将计数器初始化代码放在顶部而不是底部,则可以创建一个装饰器:

def static_vars(**kwargs):
    def decorate(func):
        for k in kwargs:
            setattr(func, k, kwargs[k])
        return func
    return decorate

然后使用如下代码:

@static_vars(counter=0)
def foo():
    foo.counter += 1
    print "Counter is %d" % foo.counter

foo.不幸的是,它仍然需要您使用前缀。

(信用:@ony

A bit reversed, but this should work:

def foo():
    foo.counter += 1
    print "Counter is %d" % foo.counter
foo.counter = 0

If you want the counter initialization code at the top instead of the bottom, you can create a decorator:

def static_vars(**kwargs):
    def decorate(func):
        for k in kwargs:
            setattr(func, k, kwargs[k])
        return func
    return decorate

Then use the code like this:

@static_vars(counter=0)
def foo():
    foo.counter += 1
    print "Counter is %d" % foo.counter

It’ll still require you to use the foo. prefix, unfortunately.

(Credit: @ony)


回答 1

您可以向函数添加属性,并将其用作静态变量。

def myfunc():
  myfunc.counter += 1
  print myfunc.counter

# attribute must be initialized
myfunc.counter = 0

另外,如果您不想在函数外部设置变量,则hasattr()可以避免出现AttributeError异常:

def myfunc():
  if not hasattr(myfunc, "counter"):
     myfunc.counter = 0  # it doesn't exist yet, so initialize it
  myfunc.counter += 1

无论如何,静态变量很少见,您应该为该变量找到一个更好的位置,很可能在类中。

You can add attributes to a function, and use it as a static variable.

def myfunc():
  myfunc.counter += 1
  print myfunc.counter

# attribute must be initialized
myfunc.counter = 0

Alternatively, if you don’t want to setup the variable outside the function, you can use hasattr() to avoid an AttributeError exception:

def myfunc():
  if not hasattr(myfunc, "counter"):
     myfunc.counter = 0  # it doesn't exist yet, so initialize it
  myfunc.counter += 1

Anyway static variables are rather rare, and you should find a better place for this variable, most likely inside a class.


回答 2

还可以考虑:

def foo():
    try:
        foo.counter += 1
    except AttributeError:
        foo.counter = 1

推理:

  • 大量pythonic(“请求宽恕而不是许可”)
  • 使用异常(仅抛出一次)而不是if分支(请考虑StopIteration异常)

One could also consider:

def foo():
    try:
        foo.counter += 1
    except AttributeError:
        foo.counter = 1

Reasoning:

  • much more pythonic (“ask for forgiveness not permission”)
  • use exception (thrown only once) instead of if branch (think StopIteration exception)

回答 3

其他答案已经说明了您应该执行此操作的方式。这是您不应该使用的方法:

>>> def foo(counter=[0]):
...   counter[0] += 1
...   print("Counter is %i." % counter[0]);
... 
>>> foo()
Counter is 1.
>>> foo()
Counter is 2.
>>> 

仅在第一次评估该函数时才初始化缺省值,而不是在每次执行该函数时才初始化缺省值,因此可以使用列表或任何其他可变对象存储静态值。

Other answers have demonstrated the way you should do this. Here’s a way you shouldn’t:

>>> def foo(counter=[0]):
...   counter[0] += 1
...   print("Counter is %i." % counter[0]);
... 
>>> foo()
Counter is 1.
>>> foo()
Counter is 2.
>>> 

Default values are initialized only when the function is first evaluated, not each time it is executed, so you can use a list or any other mutable object to store static values.
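
One reason this counts as a way you shouldn't do it can be sketched like this: the "static" list is exposed as a parameter, so any caller can replace it, and it is also visible through the function's `__defaults__`. The variable names below are illustrative, and the function returns the count instead of printing it so the values are easy to inspect.

```python
def foo(counter=[0]):
    # The default list is created once, when the def statement runs.
    counter[0] += 1
    return counter[0]


first = foo()            # uses the shared default list
second = foo()           # same list again, so the count continues
overridden = foo([10])   # a caller-supplied list bypasses the "static" state
third = foo()            # back to the shared default list

shared_state = foo.__defaults__[0]  # the hidden list is reachable from outside
```

After these calls `first`, `second` and `third` are 1, 2 and 3, while `overridden` is 11 because it never touched the shared list.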


回答 4

很多人已经建议测试“ hasattr”,但是答案很简单:

def func():
    func.counter = getattr(func, 'counter', 0) + 1

没有try / except,没有测试hasattr,只有默认的getattr。

Many people have already suggested testing ‘hasattr’, but there’s a simpler answer:

def func():
    func.counter = getattr(func, 'counter', 0) + 1

No try/except, no testing hasattr, just getattr with a default.


回答 5

这是一个完全封装的版本,不需要外部初始化调用:

def fn():
    fn.counter=vars(fn).setdefault('counter',-1)
    fn.counter+=1
    print (fn.counter)

在Python中,函数是对象,我们可以通过special属性将成员变量简单地添加或Monkey补丁__dict__。内置vars()函数返回special属性__dict__

编辑:请注意,与替代try:except AttributeError答案不同,使用此方法,变量将始终为初始化后的代码逻辑做好准备。我认为以下try:except AttributeError替代方案将减少DRY和/或流程笨拙:

def Fibonacci(n):
   if n<2: return n
   Fibonacci.memo=vars(Fibonacci).setdefault('memo',{}) # use static variable to hold a results cache
   return Fibonacci.memo.setdefault(n,Fibonacci(n-1)+Fibonacci(n-2)) # lookup result in cache, if not available then calculate and store it

EDIT2:仅当从多个位置调用该函数时,才建议使用上述方法。如果只在一个地方调用该函数,则最好使用nonlocal

def TheOnlyPlaceStaticFunctionIsCalled():
    memo={}
    def Fibonacci(n):
       nonlocal memo  # required in Python3. Python2 can see memo
       if n<2: return n
       return memo.setdefault(n,Fibonacci(n-1)+Fibonacci(n-2))
    ...
    print (Fibonacci(200))
    ...

Here is a fully encapsulated version that doesn’t require an external initialization call:

def fn():
    fn.counter=vars(fn).setdefault('counter',-1)
    fn.counter+=1
    print (fn.counter)

In Python, functions are objects and we can simply add, or monkey patch, member variables to them via the special attribute __dict__. The built-in vars() returns the special attribute __dict__.

EDIT: Note, unlike the alternative try:except AttributeError answer, with this approach the variable will always be ready for the code logic following initialization. I think the try:except AttributeError alternative to the following will be less DRY and/or have awkward flow:

def Fibonacci(n):
   if n<2: return n
   Fibonacci.memo=vars(Fibonacci).setdefault('memo',{}) # use static variable to hold a results cache
   return Fibonacci.memo.setdefault(n,Fibonacci(n-1)+Fibonacci(n-2)) # lookup result in cache, if not available then calculate and store it
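
One caveat worth sketching: `dict.setdefault` evaluates its second argument before the call, so in the version above the recursive sum is computed even when `n` is already cached, and the cache does not actually save work. The variant below is one possible alternative, not the original author's code. It keeps the `vars(...).setdefault` trick for creating the static dict but checks the cache before recursing. The lowercase name `fibonacci` is my own choice.

```python
def fibonacci(n):
    # Create the cache on the first call; later calls reuse the same dict.
    memo = vars(fibonacci).setdefault('memo', {})
    if n < 2:
        return n
    if n not in memo:
        # Recurse only on a cache miss, then store the result.
        memo[n] = fibonacci(n - 1) + fibonacci(n - 2)
    return memo[n]


small = fibonacci(10)
large = fibonacci(50)
```

With the membership test in place each value is computed once, so `fibonacci(50)` returns immediately instead of performing an exponential number of calls.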

EDIT2: I only recommend the above approach when the function will be called from multiple locations. If instead the function is only called in one place, it’s better to use nonlocal:

def TheOnlyPlaceStaticFunctionIsCalled():
    memo={}
    def Fibonacci(n):
       nonlocal memo  # required in Python3. Python2 can see memo
       if n<2: return n
       return memo.setdefault(n,Fibonacci(n-1)+Fibonacci(n-2))
    ...
    print (Fibonacci(200))
    ...

回答 6

Python没有静态变量,但是您可以通过定义可调用的类对象然后将其用作函数来伪造它。另请参阅此答案

class Foo(object):
  # Class variable, shared by all instances of this class
  counter = 0

  def __call__(self):
    Foo.counter += 1
    print Foo.counter

# Create an object instance of class "Foo," called "foo"
foo = Foo()

# Make calls to the "__call__" method, via the object's name itself
foo() #prints 1
foo() #prints 2
foo() #prints 3

请注意,这__call__使得类(对象)的实例可以通过其自己的名称来调用。这就是为什么foo()上面的调用会调用类的__call__方法的原因。从文档中

可以通过在任意类的类中定义一个__call__()方法来使其实例化。

Python doesn’t have static variables but you can fake it by defining a callable class object and then using it as a function. Also see this answer.

class Foo(object):
  # Class variable, shared by all instances of this class
  counter = 0

  def __call__(self):
    Foo.counter += 1
    print Foo.counter

# Create an object instance of class "Foo," called "foo"
foo = Foo()

# Make calls to the "__call__" method, via the object's name itself
foo() #prints 1
foo() #prints 2
foo() #prints 3

Note that __call__ makes an instance of a class (object) callable by its own name. That’s why calling foo() above calls the class’ __call__ method. From the documentation:

Instances of arbitrary classes can be made callable by defining a __call__() method in their class.


回答 7

使用生成器函数生成迭代器。

def foo_gen():
    n = 0
    while True:
        n+=1
        yield n

然后像

foo = foo_gen().next
for i in range(0,10):
    print foo()

如果需要上限:

def foo_gen(limit=100000):
    n = 0
    while n < limit:
       n+=1
       yield n

如果迭代器终止(如上面的示例),您也可以直接在其上循环,例如

for i in foo_gen(20):
    print i

当然,在这些简单的情况下,最好使用xrange :)

这是关于yield声明的文档。

Use a generator function to generate an iterator.

def foo_gen():
    n = 0
    while True:
        n+=1
        yield n

Then use it like

foo = foo_gen().next
for i in range(0,10):
    print foo()

If you want an upper limit:

def foo_gen(limit=100000):
    n = 0
    while n < limit:
       n+=1
       yield n

If the iterator terminates (like the example above), you can also loop over it directly, like

for i in foo_gen(20):
    print i

Of course, in these simple cases it’s better to use xrange :)

Here is the documentation on the yield statement.
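
The snippets above are written for Python 2. As a rough Python 3 sketch of the same idea: generator objects no longer have a `.next` method, so the built-in `next()` is used instead, and `itertools.count` from the standard library offers a ready-made unbounded counter. The variable names are illustrative.

```python
import itertools


def foo_gen():
    n = 0
    while True:
        n += 1
        yield n


gen = foo_gen()
# next(gen) resumes the generator; n survives between calls.
first_three = [next(gen) for _ in range(3)]

# itertools.count(1) yields 1, 2, 3, ... without a hand-written generator.
counter = itertools.count(1)
also_three = [next(counter) for _ in range(3)]
```

In Python 3, `range` is already lazy, so it takes the place of `xrange` for the simple bounded case.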


回答 8

其他解决方案通常使用复杂的逻辑来将计数器属性添加到函数,以处理初始化。这不适用于新代码。

在Python 3中,正确的方法是使用以下nonlocal语句:

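def make_foo():
    counter = 0  # lives in the enclosing function scope, so nonlocal can bind to it
    def foo():
        nonlocal counter
        counter += 1
        print(f'counter is {counter}')
    return foo

foo = make_foo()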

有关声明的说明,请参见PEP 3104nonlocal

如果计数器是模块专用的,则应_counter改为命名。

Other solutions attach a counter attribute to the function, usually with convoluted logic to handle the initialization. This is inappropriate for new code.

In Python 3, the right way is to use a nonlocal statement:

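def make_foo():
    counter = 0  # lives in the enclosing function scope, so nonlocal can bind to it
    def foo():
        nonlocal counter
        counter += 1
        print(f'counter is {counter}')
    return foo

foo = make_foo()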

See PEP 3104 for the specification of the nonlocal statement.

If the counter is intended to be private to the module, it should be named _counter instead.


回答 9

将函数的属性用作静态变量有一些潜在的缺点:

  • 每次您要访问变量时,都必须写出函数的全名。
  • 外部代码可以轻松访问变量并弄乱值。

第二个问题的惯用python可能是用一个前导下划线将变量命名,以表示该变量不是可被访问的,而在事发后仍可访问。

另一种选择是使用词法闭包的模式,这nonlocal在python 3 中受关键字支持。

def make_counter():
    i = 0
    def counter():
        nonlocal i
        i = i + 1
        return i
    return counter
counter = make_counter()

可悲的是,我不知道将这种解决方案封装到装饰器中的方法。

Using an attribute of a function as static variable has some potential drawbacks:

  • Every time you want to access the variable, you have to write out the full name of the function.
  • Outside code can access the variable easily and mess with the value.

Idiomatic python for the second issue would probably be naming the variable with a leading underscore to signal that it is not meant to be accessed, while keeping it accessible after the fact.

An alternative would be a pattern using lexical closures, which are supported with the nonlocal keyword in python 3.

def make_counter():
    i = 0
    def counter():
        nonlocal i
        i = i + 1
        return i
    return counter
counter = make_counter()

Sadly I know no way to encapsulate this solution into a decorator.
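
A short usage sketch of the closure pattern, assuming Python 3: each call to `make_counter` creates a fresh `i`, so counters are independent of one another and the state is not reachable as an attribute of the returned function. The names `counter_a` and `counter_b` are illustrative.

```python
def make_counter():
    i = 0
    def counter():
        nonlocal i  # rebind the variable of the enclosing call
        i = i + 1
        return i
    return counter


counter_a = make_counter()
counter_b = make_counter()

# counter_a advances twice; counter_b starts from its own zero.
results = [counter_a(), counter_a(), counter_b()]
exposed = hasattr(counter_a, 'i')  # the state is not a function attribute
```

This addresses both drawbacks listed above: the function body refers to `i` directly, and outside code has no simple way to change the value.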


回答 10

def staticvariables(**variables):
    def decorate(function):
        for variable in variables:
            setattr(function, variable, variables[variable])
        return function
    return decorate

@staticvariables(counter=0, bar=1)
def foo():
    print(foo.counter)
    print(foo.bar)

就像上面的vincent的代码一样,它将用作函数装饰器,并且必须使用函数名称作为前缀来访问静态变量。该代码的优点(尽管可以承认,任何人都可以聪明地解决它)是您可以拥有多个静态变量,并可以以更常规的方式对其进行初始化。

def staticvariables(**variables):
    def decorate(function):
        for variable in variables:
            setattr(function, variable, variables[variable])
        return function
    return decorate

@staticvariables(counter=0, bar=1)
def foo():
    print(foo.counter)
    print(foo.bar)

Much like vincent’s code above, this would be used as a function decorator and static variables must be accessed with the function name as a prefix. The advantage of this code (although admittedly anyone might be smart enough to figure it out) is that you can have multiple static variables and initialise them in a more conventional manner.


回答 11

更具可读性,但更冗长(Python的Zen:显式优于隐式):

>>> def func(_static={'counter': 0}):
...     _static['counter'] += 1
...     print _static['counter']
...
>>> func()
1
>>> func()
2
>>>

请参阅此处以了解其工作原理。

A little bit more readable, but more verbose (Zen of Python: explicit is better than implicit):

>>> def func(_static={'counter': 0}):
...     _static['counter'] += 1
...     print _static['counter']
...
>>> func()
1
>>> func()
2
>>>

See here for an explanation of how this works.


回答 12

_counter = 0
def foo():
   global _counter
   _counter += 1
   print 'counter is', _counter

Python通常使用下划线指示私有变量。C语言中在函数内部声明静态变量的唯一原因是将其隐藏在函数外部,这并不是真正的Python。

_counter = 0
def foo():
   global _counter
   _counter += 1
   print 'counter is', _counter

Python customarily uses underscores to indicate private variables. The only reason in C to declare the static variable inside the function is to hide it outside the function, which is not really idiomatic Python.


回答 13

在尝试了几种方法之后,我最终使用了@warvariuc的答案的改进版本:

import types

def func(_static=types.SimpleNamespace(counter=0)):
    _static.counter += 1
    print(_static.counter)

After trying several approaches I end up using an improved version of @warvariuc’s answer:

import types

def func(_static=types.SimpleNamespace(counter=0)):
    _static.counter += 1
    print(_static.counter)

回答 14

惯用的方法是使用一个,它可以有属性。如果需要不分离实例,请使用单例。

您可以通过多种方法将“静态”变量伪造或修改为Python(到目前为止,尚未提及的一种方法是使用可变的默认参数),但这不是Pythonic惯用的方法。只需使用一个类。

如果您的使用模式合适,则可能是生成器。

The idiomatic way is to use a class, which can have attributes. If you need instances to not be separate, use a singleton.

There are a number of ways you could fake or munge “static” variables into Python (one not mentioned so far is to have a mutable default argument), but this is not the Pythonic, idiomatic way to do it. Just use a class.

Or possibly a generator, if your usage pattern fits.
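
A minimal sketch of the class-based approach, with a hypothetical `Counter` class and method name chosen only for illustration: the state that would be a static variable in C becomes an ordinary instance attribute.

```python
class Counter:
    """Holds the state that a C function would keep in a static local."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


counter = Counter()
values = [counter.increment() for _ in range(3)]

# A second instance has its own, separate state.
other = Counter()
other_value = other.increment()
```

If the state really must be shared by every caller, a single module-level instance can play the role of the singleton mentioned above.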


回答 15

这个问题的提示下,我可以提出另一种选择,它可能会更好用,并且对于方法和函数来说都一样:

@static_var2('seed',0)
def funccounter(statics, add=1):
    statics.seed += add
    return statics.seed

print funccounter()       #1
print funccounter(add=2)  #3
print funccounter()       #4

class ACircle(object):
    @static_var2('seed',0)
    def counter(statics, self, add=1):
        statics.seed += add
        return statics.seed

c = ACircle()
print c.counter()      #1
print c.counter(add=2) #3
print c.counter()      #4
d = ACircle()
print d.counter()      #5
print d.counter(add=2) #7
print d.counter()      #8    

如果您喜欢这种用法,请执行以下操作:

class StaticMan(object):
    def __init__(self):
        self.__dict__['_d'] = {}

    def __getattr__(self, name):
        return self.__dict__['_d'][name]
    def __getitem__(self, name):
        return self.__dict__['_d'][name]
    def __setattr__(self, name, val):
        self.__dict__['_d'][name] = val
    def __setitem__(self, name, val):
        self.__dict__['_d'][name] = val

def static_var2(name, val):
    def decorator(original):
        if not hasattr(original, ':staticman'):    
            def wrapped(*args, **kwargs):
                return original(getattr(wrapped, ':staticman'), *args, **kwargs)
            setattr(wrapped, ':staticman', StaticMan())
            f = wrapped
        else:
            f = original #already wrapped

        getattr(f, ':staticman')[name] = val
        return f
    return decorator

Prompted by this question, may I present another alternative which might be a bit nicer to use and will look the same for both methods and functions:

@static_var2('seed',0)
def funccounter(statics, add=1):
    statics.seed += add
    return statics.seed

print funccounter()       #1
print funccounter(add=2)  #3
print funccounter()       #4

class ACircle(object):
    @static_var2('seed',0)
    def counter(statics, self, add=1):
        statics.seed += add
        return statics.seed

c = ACircle()
print c.counter()      #1
print c.counter(add=2) #3
print c.counter()      #4
d = ACircle()
print d.counter()      #5
print d.counter(add=2) #7
print d.counter()      #8    

If you like the usage, here’s the implementation:

class StaticMan(object):
    def __init__(self):
        self.__dict__['_d'] = {}

    def __getattr__(self, name):
        return self.__dict__['_d'][name]
    def __getitem__(self, name):
        return self.__dict__['_d'][name]
    def __setattr__(self, name, val):
        self.__dict__['_d'][name] = val
    def __setitem__(self, name, val):
        self.__dict__['_d'][name] = val

def static_var2(name, val):
    def decorator(original):
        if not hasattr(original, ':staticman'):    
            def wrapped(*args, **kwargs):
                return original(getattr(wrapped, ':staticman'), *args, **kwargs)
            setattr(wrapped, ':staticman', StaticMan())
            f = wrapped
        else:
            f = original #already wrapped

        getattr(f, ':staticman')[name] = val
        return f
    return decorator

回答 16

另一个(不建议!)在可调用对象(如https://stackoverflow.com/a/279598/916373)上的扭曲,如果您不介意使用时髦的调用签名,则可以这样做

class foo(object):
    counter = 0;
    @staticmethod
    def __call__():
        foo.counter += 1
        print "counter is %i" % foo.counter

>>> foo()()
counter is 1
>>> foo()()
counter is 2

Another (not recommended!) twist on the callable object like https://stackoverflow.com/a/279598/916373, if you don’t mind using a funky call signature, would be to do

class foo(object):
    counter = 0;
    @staticmethod
    def __call__():
        foo.counter += 1
        print "counter is %i" % foo.counter

>>> foo()()
counter is 1
>>> foo()()
counter is 2

回答 17

除了创建具有静态局部变量的函数外,您始终可以创建所谓的“函数对象”并为其提供标准(非静态)成员变量。

由于您提供了用C ++编写的示例,因此我将首先解释C ++中的“函数对象”。“功能对象”就是带有重载的任何类operator()。该类的实例的行为类似于函数。例如,int x = square(5);即使square是一个对象(具有重载operator()),但从技术上讲不是“函数” ,您也可以编写。您可以给功能对象提供可以给类对象提供的任何功能。

# C++ function object
class Foo_class {
    private:
        int counter;     
    public:
        Foo_class() {
             counter = 0;
        }
        void operator() () {  
            counter++;
            printf("counter is %d\n", counter);
        }     
   };
   Foo_class foo;

在Python中,我们也可以重载,operator()只是方法改为命名为__call__

这是一个类定义:

class Foo_class:
    def __init__(self): # __init__ is similar to a C++ class constructor
        self.counter = 0
        # self.counter is like a static member
        # variable of a function named "foo"
    def __call__(self): # overload operator()
        self.counter += 1
        print("counter is %d" % self.counter);
foo = Foo_class() # call the constructor

这是使用的类的示例:

from foo import foo

for i in range(0, 5):
    foo() # function call

打印到控制台的输出是:

counter is 1
counter is 2
counter is 3
counter is 4
counter is 5

如果要让函数接受输入参数,也可以将其添加到其中__call__

# FILE: foo.py - - - - - - - - - - - - - - - - - - - - - - - - -

class Foo_class:
    def __init__(self):
        self.counter = 0
    def __call__(self, x, y, z): # overload operator()
        self.counter += 1
        print("counter is %d" % self.counter);
        print("x, y, z, are %d, %d, %d" % (x, y, z));
foo = Foo_class() # call the constructor

# FILE: main.py - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

from foo import foo

for i in range(0, 5):
    foo(7, 8, 9) # function call

# Console Output - - - - - - - - - - - - - - - - - - - - - - - - - - 

counter is 1
x, y, z, are 7, 8, 9
counter is 2
x, y, z, are 7, 8, 9
counter is 3
x, y, z, are 7, 8, 9
counter is 4
x, y, z, are 7, 8, 9
counter is 5
x, y, z, are 7, 8, 9

Instead of creating a function having a static local variable, you can always create what is called a “function object” and give it a standard (non-static) member variable.

Since you gave an example written in C++, I will first explain what a “function object” is in C++. A “function object” is simply any class with an overloaded operator(). Instances of the class will behave like functions. For example, you can write int x = square(5); even if square is an object (with overloaded operator()) and technically not a “function.” You can give a function object any of the features that you could give a class object.

// C++ function object
#include <cstdio>

class Foo_class {
    private:
        int counter;
    public:
        Foo_class() {
            counter = 0;
        }
        void operator() () {
            counter++;
            printf("counter is %d\n", counter);
        }
};
Foo_class foo;

In Python, we can also overload operator() except that the method is instead named __call__:

Here is a class definition:

class Foo_class:
    def __init__(self): # __init__ is similar to a C++ class constructor
        self.counter = 0
        # self.counter is like a static member
        # variable of a function named "foo"
    def __call__(self): # overload operator()
        self.counter += 1
        print("counter is %d" % self.counter);
foo = Foo_class() # call the constructor

Here is an example of the class being used:

from foo import foo

for i in range(0, 5):
    foo() # function call

The output printed to the console is:

counter is 1
counter is 2
counter is 3
counter is 4
counter is 5

If you want your function to take input arguments, you can add those to __call__ as well:

# FILE: foo.py - - - - - - - - - - - - - - - - - - - - - - - - -

class Foo_class:
    def __init__(self):
        self.counter = 0
    def __call__(self, x, y, z): # overload operator()
        self.counter += 1
        print("counter is %d" % self.counter);
        print("x, y, z, are %d, %d, %d" % (x, y, z));
foo = Foo_class() # call the constructor

# FILE: main.py - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

from foo import foo

for i in range(0, 5):
    foo(7, 8, 9) # function call

# Console Output - - - - - - - - - - - - - - - - - - - - - - - - - - 

counter is 1
x, y, z, are 7, 8, 9
counter is 2
x, y, z, are 7, 8, 9
counter is 3
x, y, z, are 7, 8, 9
counter is 4
x, y, z, are 7, 8, 9
counter is 5
x, y, z, are 7, 8, 9

回答 18

解决方案 n += 1

def foo():
  foo.__dict__.setdefault('count', 0)
  foo.count += 1
  return foo.count

Solution n += 1

def foo():
  foo.__dict__.setdefault('count', 0)
  foo.count += 1
  return foo.count

回答 19

全局声明提供了此功能。在下面的示例(使用“ f”的python 3.5或更高版本)中,计数器变量在函数外部定义。在功能中将其定义为全局表示表示该功能之外的“全局”版本应可用于该功能。因此,每次函数运行时,它都会修改函数外部的值,并将其保留在函数之外。

counter = 0

def foo():
    global counter
    counter += 1
    print("counter is {}".format(counter))

foo() #output: "counter is 1"
foo() #output: "counter is 2"
foo() #output: "counter is 3"

A global declaration provides this functionality. In the example below, the counter variable is defined outside of the function (the example uses str.format, so it does not depend on f-strings, which require Python 3.6 or later). Declaring it as global inside the function signifies that the module-level variable should be made available to the function. So each time the function runs, it modifies the value outside the function, preserving it between calls.

counter = 0

def foo():
    global counter
    counter += 1
    print("counter is {}".format(counter))

foo() #output: "counter is 1"
foo() #output: "counter is 2"
foo() #output: "counter is 3"

回答 20

Python方法内的静态变量

class Count:
    def foo(self):
        try: 
            self.foo.__func__.counter += 1
        except AttributeError: 
            self.foo.__func__.counter = 1

        print self.foo.__func__.counter

m = Count()
m.foo()       # 1
m.foo()       # 2
m.foo()       # 3

A static variable inside a Python method

class Count:
    def foo(self):
        try: 
            self.foo.__func__.counter += 1
        except AttributeError: 
            self.foo.__func__.counter = 1

        print self.foo.__func__.counter

m = Count()
m.foo()       # 1
m.foo()       # 2
m.foo()       # 3

回答 21

我个人更喜欢以下装饰器。给每个人自己。

def staticize(name, factory):
    """Makes a pseudo-static variable in calling function.

    If name `name` exists in calling function, return it. 
    Otherwise, saves return value of `factory()` in 
    name `name` of calling function and return it.

    :param name: name to use to store static object 
    in calling function
    :type name: String
    :param factory: used to initialize name `name` 
    in calling function
    :type factory: function
    :rtype: `type(factory())`

    >>> def steveholt(z):
    ...     a = staticize('a', list)
    ...     a.append(z)
    >>> steveholt.a
    Traceback (most recent call last):
    ...
    AttributeError: 'function' object has no attribute 'a'
    >>> steveholt(1)
    >>> steveholt.a
    [1]
    >>> steveholt('a')
    >>> steveholt.a
    [1, 'a']
    >>> steveholt.a = []
    >>> steveholt.a
    []
    >>> steveholt('zzz')
    >>> steveholt.a
    ['zzz']

    """
    from inspect import stack
    # get scope enclosing calling function
    calling_fn_scope = stack()[2][0]
    # get calling function
    calling_fn_name = stack()[1][3]
    calling_fn = calling_fn_scope.f_locals[calling_fn_name]
    if not hasattr(calling_fn, name):
        setattr(calling_fn, name, factory())
    return getattr(calling_fn, name)

I personally prefer the following to decorators. To each their own.

def staticize(name, factory):
    """Makes a pseudo-static variable in calling function.

    If name `name` exists in calling function, return it. 
    Otherwise, saves return value of `factory()` in 
    name `name` of calling function and return it.

    :param name: name to use to store static object 
    in calling function
    :type name: String
    :param factory: used to initialize name `name` 
    in calling function
    :type factory: function
    :rtype: `type(factory())`

    >>> def steveholt(z):
    ...     a = staticize('a', list)
    ...     a.append(z)
    >>> steveholt.a
    Traceback (most recent call last):
    ...
    AttributeError: 'function' object has no attribute 'a'
    >>> steveholt(1)
    >>> steveholt.a
    [1]
    >>> steveholt('a')
    >>> steveholt.a
    [1, 'a']
    >>> steveholt.a = []
    >>> steveholt.a
    []
    >>> steveholt('zzz')
    >>> steveholt.a
    ['zzz']

    """
    from inspect import stack
    # get scope enclosing calling function
    calling_fn_scope = stack()[2][0]
    # get calling function
    calling_fn_name = stack()[1][3]
    calling_fn = calling_fn_scope.f_locals[calling_fn_name]
    if not hasattr(calling_fn, name):
        setattr(calling_fn, name, factory())
    return getattr(calling_fn, name)

回答 22

此答案基于@claudiu的答案。

我发现,每当我打算访问静态变量时,总是必须在函数名称前加上前缀,我的代码变得不清楚。

即,在我的函数代码中,我更喜欢写:

print(statics.foo)

代替

print(my_function_name.foo)

因此,我的解决方案是:

  1. 添加一个 statics向函数属性
  2. 在功能范围中,将局部变量添加statics为别名my_function.statics

from bunch import *

def static_vars(**kwargs):
    def decorate(func):
        statics = Bunch(**kwargs)
        setattr(func, "statics", statics)
        return func
    return decorate

@static_vars(name = "Martin")
def my_function():
    statics = my_function.statics
    print("Hello, {0}".format(statics.name))

备注

我的方法使用一个名为的类Bunch,它是一个支持属性样式访问的字典,即JavaScript(请参见原始文章)。 2000年前后)

可以通过安装 pip install bunch

也可以这样手写:

class Bunch(dict):
    def __init__(self, **kw):
        dict.__init__(self,kw)
        self.__dict__ = self

This answer builds on @claudiu's answer.

I found that my code became less clear when I had to prefix every access to a static variable with the function name.

Namely, in my function code I would prefer to write:

print(statics.foo)

instead of

print(my_function_name.foo)

So, my solution is to:

  1. add a statics attribute to the function
  2. in the function scope, add a local variable statics as an alias to my_function.statics

from bunch import *

def static_vars(**kwargs):
    def decorate(func):
        statics = Bunch(**kwargs)
        setattr(func, "statics", statics)
        return func
    return decorate

@static_vars(name = "Martin")
def my_function():
    statics = my_function.statics
    print("Hello, {0}".format(statics.name))

Remark

My method uses a class named Bunch, which is a dictionary that supports attribute-style access, a la JavaScript (see the original article about it, around 2000)

It can be installed via pip install bunch

It can also be hand-written like so:

class Bunch(dict):
    def __init__(self, **kw):
        dict.__init__(self,kw)
        self.__dict__ = self
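
As a minimal sketch of the approach above, assuming the hand-written Bunch rather than the pip package, the statics attribute can also hold state that changes between calls. The names count_calls and calls are made up for illustration and are not part of the original answer.

```python
class Bunch(dict):
    """A dict whose keys are also readable and writable as attributes."""
    def __init__(self, **kw):
        dict.__init__(self, kw)
        self.__dict__ = self


def static_vars(**kwargs):
    """Attach a Bunch of initial values to the decorated function."""
    def decorate(func):
        func.statics = Bunch(**kwargs)
        return func
    return decorate


@static_vars(calls=0)
def count_calls():
    # local alias, so the body does not repeat the function name
    statics = count_calls.statics
    statics.calls += 1
    return statics.calls


first = count_calls()
second = count_calls()
```

Because Bunch shares its storage between attributes and keys, count_calls.statics["calls"] and count_calls.statics.calls refer to the same value.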

回答 23

基于丹尼尔的答案(补充):

class Foo(object):
    counter = 0

    def __call__(self, inc_value=0):
        Foo.counter += inc_value
        return Foo.counter

foo = Foo()

def use_foo(x,y):
    if(x==5):
        foo(2)
    elif(y==7):
        foo(3)
    if(foo() == 10):
        print("yello")


use_foo(5,1)
use_foo(5,1)
use_foo(1,7)
use_foo(1,7)
use_foo(1,1)

我想添加此部分的原因是,作为一个实际示例,静态变量不仅用于增加某个值,而且还检查静态var是否等于某个值。

静态变量仍受保护,仅在函数use_foo()的范围内使用

在此示例中,对foo()的调用功能完全相同(相对于相应的c ++等效项):

stat_c +=9; // in c++
foo(9)  #python equiv

if(stat_c==10){ //do something}  // c++

if(foo() == 10):      # python equiv
  #add code here      # python equiv       

Output :
yello
yello

如果将Foo类限制性地定义为单例类,那将是理想的。这将使它更具Pythonic性。

Building on Daniel’s answer (additions):

class Foo(object):
    counter = 0

    def __call__(self, inc_value=0):
        Foo.counter += inc_value
        return Foo.counter

foo = Foo()

def use_foo(x,y):
    if(x==5):
        foo(2)
    elif(y==7):
        foo(3)
    if(foo() == 10):
        print("yello")


use_foo(5,1)
use_foo(5,1)
use_foo(1,7)
use_foo(1,7)
use_foo(1,1)

I wanted to add this part as a real-life example: static variables are used not only to increment by some value, but also to check whether the static variable equals some value.

The static variable is still protected and used only within the scope of the function use_foo().

In this example, a call to foo() behaves exactly like the corresponding C++ equivalent:

stat_c +=9; // in c++
foo(9)  #python equiv

if(stat_c==10){ //do something}  // c++

if(foo() == 10):      # python equiv
  #add code here      # python equiv       

Output :
yello
yello

Ideally, class Foo would be restricted to a singleton. This would make it more Pythonic.


回答 24

当然,这是一个老问题,但我想我可能会提供一些更新。

似乎性能参数已过时。相同的测试套件似乎为siInt_try和isInt_re2提供了相似的结果。当然结果会有所不同,但这是在我的计算机上使用Xeon W3550在内核4.3.01上使用python 3.4.4的会话。我已经运行了几次,结果似乎是相似的。我将全局正则表达式移到了函数静态中,但是性能差异可以忽略不计。

isInt_try: 0.3690
isInt_str: 0.3981
isInt_re: 0.5870
isInt_re2: 0.3632

随着性能问题的解决,try / catch似乎会生成最适合未来和极端情况的代码,因此也许将其包装在函数中

Sure, this is an old question, but I think I can provide an update.

It seems that the performance argument is obsolete. The same test suite appears to give similar results for isInt_try and isInt_re2. Of course results vary, but this is one session on my computer with Python 3.4.4 on kernel 4.3.01 with a Xeon W3550. I have run it several times and the results seem to be similar. I moved the global regex into a function static, but the performance difference is negligible.

isInt_try: 0.3690
isInt_str: 0.3981
isInt_re: 0.5870
isInt_re2: 0.3632

With the performance issue out of the way, it seems that try/except would produce the most future-proof and corner-case-proof code, so maybe just wrap it in a function.
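
The body of isInt_try is not shown in this answer, so the following is only a guess at what such a wrapper might look like. The name is_int and the choice to treat None as "not an integer" are assumptions made for illustration.

```python
def is_int(text):
    """Return True if int() accepts the value, False otherwise."""
    try:
        int(text)
    except (TypeError, ValueError):
        # ValueError: a string such as "4.2" or "abc"
        # TypeError: a value such as None that int() cannot convert
        return False
    return True


checks = [is_int("42"), is_int("-7"), is_int("4.2"), is_int("abc"), is_int(None)]
```

Delegating to int() means the function follows whatever the interpreter itself considers a valid integer literal, which is the future-proofing argument made above.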


What does the star operator mean in a function call?

Question: What does the star operator mean in a function call?

*运算符在Python中是什么意思,例如likezip(*x)或代码f(**k)

  1. 在解释器内部如何处理?
  2. 它会影响性能吗?是快还是慢?
  3. 什么时候有用,什么时候没有?
  4. 应该在函数声明中还是在调用中使用它?

What does the * operator mean in Python, such as in code like zip(*x) or f(**k)?

  1. How is it handled internally in the interpreter?
  2. Does it affect performance at all? Is it fast or slow?
  3. When is it useful and when is it not?
  4. Should it be used in a function declaration or in a call?

回答 0

单颗星*将序列/集合解压缩为位置参数,因此您可以执行以下操作:

def sum(a, b):
    return a + b

values = (1, 2)

s = sum(*values)

这将打开元组的包装,使其实际执行为:

s = sum(1, 2)

双星**只使用字典并因此命名参数来做同样的事情:

values = { 'a': 1, 'b': 2 }
s = sum(**values)

您还可以结合:

def sum(a, b, c, d):
    return a + b + c + d

values1 = (1, 2)
values2 = { 'c': 10, 'd': 15 }
s = sum(*values1, **values2)

将执行为:

s = sum(1, 2, c=10, d=15)

另请参见Python文档的4.7.4-解包参数列表


另外,您可以定义要接受的函数*x**y参数,这使函数可以接受在声明中未专门命名的任何数量的位置和/或命名参数。

例:

def sum(*values):
    s = 0
    for v in values:
        s = s + v
    return s

s = sum(1, 2, 3, 4, 5)

或搭配**

def get_a(**values):
    return values['a']

s = get_a(a=1, b=2)      # returns 1

这可以使您无需声明它们即可指定大量可选参数。

再一次,您可以结合:

def sum(*values, **options):
    s = 0
    for i in values:
        s = s + i
    if "neg" in options:
        if options["neg"]:
            s = -s
    return s

s = sum(1, 2, 3, 4, 5)            # returns 15
s = sum(1, 2, 3, 4, 5, neg=True)  # returns -15
s = sum(1, 2, 3, 4, 5, neg=False) # returns 15

The single star * unpacks the sequence/collection into positional arguments, so you can do this:

def sum(a, b):
    return a + b

values = (1, 2)

s = sum(*values)

This will unpack the tuple so that it actually executes as:

s = sum(1, 2)

The double star ** does the same, only using a dictionary and thus named arguments:

values = { 'a': 1, 'b': 2 }
s = sum(**values)

You can also combine:

def sum(a, b, c, d):
    return a + b + c + d

values1 = (1, 2)
values2 = { 'c': 10, 'd': 15 }
s = sum(*values1, **values2)

will execute as:

s = sum(1, 2, c=10, d=15)

Also see section 4.7.4 – Unpacking Argument Lists of the Python documentation.


Additionally, you can define functions to take *x and **y arguments. This allows a function to accept any number of positional and/or named arguments that aren’t specifically named in the declaration.

Example:

def sum(*values):
    s = 0
    for v in values:
        s = s + v
    return s

s = sum(1, 2, 3, 4, 5)

or with **:

def get_a(**values):
    return values['a']

s = get_a(a=1, b=2)      # returns 1

This allows you to specify a large number of optional parameters without having to declare them.

And again, you can combine:

def sum(*values, **options):
    s = 0
    for i in values:
        s = s + i
    if "neg" in options:
        if options["neg"]:
            s = -s
    return s

s = sum(1, 2, 3, 4, 5)            # returns 15
s = sum(1, 2, 3, 4, 5, neg=True)  # returns -15
s = sum(1, 2, 3, 4, 5, neg=False) # returns 15
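
To see both directions at once, here is a small sketch in which a call unpacks a tuple and a dictionary, and the definition packs them up again. The function name describe is hypothetical and chosen so that it does not shadow the built-in sum.

```python
def describe(*values, **options):
    # values arrives as a tuple, options as a dict
    return values, options


args = (1, 2)
kwargs = {"c": 10, "d": 15}

# equivalent to describe(1, 2, c=10, d=15)
positional, named = describe(*args, **kwargs)
```

Inside describe, positional arguments are always collected into a tuple, regardless of whether the caller unpacked a list, a tuple, or another iterable.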

回答 1

一点:这些不是运算符。表达式中使用运算符从现有值创建新值(例如1 + 2变为3。这里的*和**是函数声明和调用语法的一部分。

One small point: these are not operators. Operators are used in expressions to create new values from existing values (1+2 becomes 3, for example). The * and ** here are part of the syntax of function declarations and calls.


回答 2

对于要“存储”函数调用的情况,我发现这特别有用。

例如,假设我对功能“ add”进行了一些单元测试:

def add(a, b): return a + b
tests = { (1,4):5, (0, 0):0, (-1, 3):3 }
for test, result in tests.items():
   print 'test: adding', test, '==', result, '---', add(*test) == result

除了手动执行类似add(test [0],test [1])之类的丑陋操作外,没有其他方法可以调用add。另外,如果变量数量可变,则所有需要的if语句的代码都会变得很丑陋。

另一个有用的地方是定义Factory对象(为您创建对象的对象)。假设您有一些工厂类,该类使Car对象返回。您可以使myFactory.make_car(’red’,’bmw’,’335ix’)创建Car(’red’,’bmw’,’335ix’),然后返回它。

def make_car(*args):
   return Car(*args)

当您要调用超类的构造函数时,这也很有用。

I find this particularly useful for when you want to ‘store’ a function call.

For example, suppose I have some unit tests for a function ‘add’:

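def add(a, b):
    return a + b

# each key is an argument tuple, each value the expected sum
# (the last case prints False, since -1 + 3 is 2)
tests = {(1, 4): 5, (0, 0): 0, (-1, 3): 3}
for test, result in tests.items():
    print('test: adding', test, '==', result, '---', add(*test) == result)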

There is no other way to call add, other than manually doing something like add(test[0], test[1]), which is ugly. Also, if there is a variable number of arguments, the code could get pretty ugly with all the if-statements you would need.

Another place this is useful is for defining Factory objects (objects that create objects for you). Suppose you have some class Factory that makes Car objects and returns them. You could make it so that myFactory.make_car('red', 'bmw', '335ix') creates Car('red', 'bmw', '335ix'), then returns it.

def make_car(*args):
   return Car(*args)

This is also useful when you want to call a superclass’ constructor.
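
The superclass case can be sketched as follows. The Car class, its three attributes, and the LoggedCar subclass are all hypothetical, since the answer does not define them.

```python
class Car:
    def __init__(self, color, make, model):
        self.color = color
        self.make = make
        self.model = model


class LoggedCar(Car):
    def __init__(self, *args, **kwargs):
        # forward whatever was passed, without repeating Car's signature
        super().__init__(*args, **kwargs)
        self.created_with = (args, kwargs)


def make_car(*args, **kwargs):
    # factory that passes its arguments straight through
    return Car(*args, **kwargs)


car = LoggedCar("red", "bmw", model="335ix")
plain = make_car("red", "bmw", "335ix")
```

If Car later gains a parameter, neither LoggedCar nor make_car needs to change, which is the main appeal of forwarding with * and **.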


回答 3

它称为扩展调用语法。从文档中

如果语法* expression出现在函数调用中,则表达式必须计算为序列。来自此序列的元素被视为它们是附加的位置参数。如果存在位置参数x1,…,xN,并且表达式的计算结果为序列y1,…,yM,则等效于使用M + N个位置参数x1,…,xN,y1,…的调用。 ..,yM。

和:

如果语法** expression出现在函数调用中,则expression必须计算为一个映射,该映射的内容被视为其他关键字参数。如果关键字同时出现在表达式中并作为显式关键字参数出现,则会引发TypeError异常。

It is called the extended call syntax. From the documentation:

If the syntax *expression appears in the function call, expression must evaluate to a sequence. Elements from this sequence are treated as if they were additional positional arguments; if there are positional arguments x1,…, xN, and expression evaluates to a sequence y1, …, yM, this is equivalent to a call with M+N positional arguments x1, …, xN, y1, …, yM.

and:

If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.
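
The TypeError described in the second quote can be reproduced with a short sketch. The function f and its parameters are made up for illustration.

```python
def f(x, y=0):
    return x + y


extra = {"y": 3}

# positional expansion: f(*[1, 2]) is the same call as f(1, 2)
total = f(*[1, 2])

# y is given explicitly and also appears in the mapping
try:
    f(1, y=2, **extra)
    raised = False
except TypeError:
    raised = True
```

The conflict is only detected when the call runs, because the contents of the mapping are not known until then.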


回答 4

在函数调用中,单星号将列表变成单独的参数(例如zip(*x),与zip(x1,x2,x3)if相同x=[x1,x2,x3]),而双星号将字典变成单独的关键字参数(例如f(**k),与f(x=my_x, y=my_y)if相同)k = {'x':my_x, 'y':my_y}

在函数定义中,反之亦然:单星将任意数量的参数转换为列表,而双引号将任意数量的关键字参数转换为字典。例如,def foo(*x)表示“ foo接受任意数量的参数,可以通过列表x来访问它们(即,如果用户调用foo(1,2,3)x将是[1,2,3])”,并且def bar(**k)表示“ bar可以接受任意数量的关键字参数,并且可以通过字典k进行访问。 (即,如果用户调用bar(x=42, y=23)k将为{'x': 42, 'y': 23})”。

In a function call, the single star turns a list into separate arguments (e.g. zip(*x) is the same as zip(x1,x2,x3) if x=[x1,x2,x3]) and the double star turns a dictionary into separate keyword arguments (e.g. f(**k) is the same as f(x=my_x, y=my_y) if k = {'x':my_x, 'y':my_y}).

In a function definition it’s the other way around: the single star turns an arbitrary number of arguments into a tuple, and the double star turns an arbitrary number of keyword arguments into a dictionary. E.g. def foo(*x) means “foo takes an arbitrary number of arguments and they will be accessible through the tuple x (i.e. if the user calls foo(1,2,3), x will be (1,2,3))” and def bar(**k) means “bar takes an arbitrary number of keyword arguments and they will be accessible through the dictionary k (i.e. if the user calls bar(x=42, y=23), k will be {'x': 42, 'y': 23})”.
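
A common use of zip(*x) is to transpose a list of pairs, which the following sketch illustrates. The data in rows is invented for the example, and foo and bar mirror the definitions described above.

```python
rows = [(1, "a"), (2, "b"), (3, "c")]

# zip(*rows) is the same call as zip((1, "a"), (2, "b"), (3, "c"))
numbers, letters = zip(*rows)


def foo(*x):
    return x


def bar(**k):
    return k


packed_positional = foo(1, 2, 3)
packed_keywords = bar(x=42, y=23)
```

Applying zip(*...) a second time to the transposed result gives back the original pairs, which is why the idiom is sometimes described as "unzip".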


How to deal with SettingWithCopyWarning in Pandas?

Question: How to deal with SettingWithCopyWarning in Pandas?

背景

我刚刚将熊猫从0.11升级到0.13.0rc1。现在,该应用程序弹出了许多新警告。其中之一是这样的:

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE

我想知道到底是什么意思?我需要改变什么吗?

如果我坚持使用该如何警告quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE

产生错误的功能

def _decode_stock_quote(list_of_150_stk_str):
    """decode the webpage and return dataframe"""

    from cStringIO import StringIO

    str_of_all = "".join(list_of_150_stk_str)

    quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
    quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
    quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]
    quote_df['TClose'] = quote_df['TPrice']
    quote_df['RT']     = 100 * (quote_df['TPrice']/quote_df['TPCLOSE'] - 1)
    quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
    quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
    quote_df['STK_ID'] = quote_df['STK'].str.slice(13,19)
    quote_df['STK_Name'] = quote_df['STK'].str.slice(21,30)#.decode('gb2312')
    quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

    return quote_df

更多错误讯息

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
E:\FinReporter\FM_EXT.py:450: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
E:\FinReporter\FM_EXT.py:453: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

Background

I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now, the application is popping out many new warnings. One of them looks like this:

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE

I want to know what exactly it means. Do I need to change something?

How should I suppress the warning if I insist on using quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE?

The function that gives errors

def _decode_stock_quote(list_of_150_stk_str):
    """decode the webpage and return dataframe"""

    from cStringIO import StringIO

    str_of_all = "".join(list_of_150_stk_str)

    quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
    quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
    quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]
    quote_df['TClose'] = quote_df['TPrice']
    quote_df['RT']     = 100 * (quote_df['TPrice']/quote_df['TPCLOSE'] - 1)
    quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
    quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
    quote_df['STK_ID'] = quote_df['STK'].str.slice(13,19)
    quote_df['STK_Name'] = quote_df['STK'].str.slice(21,30)#.decode('gb2312')
    quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

    return quote_df

More error messages

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
E:\FinReporter\FM_EXT.py:450: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
E:\FinReporter\FM_EXT.py:453: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

回答 0

SettingWithCopyWarning被创造的标志可能造成混淆的“链接”的任务,比如下面这并不总是按预期方式工作,特别是当第一选择返回一个副本。[ 有关背景讨论,请参见GH5390GH5597。]

df[df['A'] > 2]['B'] = new_val  # new_val not set in df

该警告提出了如下重写建议:

df.loc[df['A'] > 2, 'B'] = new_val

但是,这不适合您的用法,相当于:

df = df[df['A'] > 2]
df['B'] = new_val

显然,您不关心将其写回到原始帧的写操作(因为您正在覆盖对它的引用),但是不幸的是,这种模式无法与第一个链式分配示例区分开。因此,(误报)警告。如果您想进一步阅读,可能会在建立索引文档中解决误报的可能性。您可以通过以下分配安全地禁用此新警告。

import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'

The SettingWithCopyWarning was created to flag potentially confusing “chained” assignments, such as the following, which does not always work as expected, particularly when the first selection returns a copy. [see GH5390 and GH5597 for background discussion.]

df[df['A'] > 2]['B'] = new_val  # new_val not set in df

The warning offers a suggestion to rewrite as follows:

df.loc[df['A'] > 2, 'B'] = new_val

However, this doesn’t fit your usage, which is equivalent to:

df = df[df['A'] > 2]
df['B'] = new_val

While it’s clear that you don’t care about writes making it back to the original frame (since you are overwriting the reference to it), unfortunately this pattern cannot be differentiated from the first chained assignment example. Hence the (false positive) warning. The potential for false positives is addressed in the docs on indexing, if you’d like to read further. You can safely disable this new warning with the following assignment.

import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'

回答 1

SettingWithCopyWarning熊猫如何应对?

这篇文章的读者对象是:

  1. 想了解此警告的含义
  2. 想了解抑制此警告的不同方法
  3. 想了解如何改进其代码并遵循良好做法,以避免将来出现此警告。

设定

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 5)), columns=list('ABCDE'))
df
   A  B  C  D  E
0  5  0  3  3  7
1  9  3  5  2  4
2  7  6  8  8  1

什么是SettingWithCopyWarning

要知道如何处理此警告,重要的是要理解它的含义以及为什么首先提出它。

过滤DataFrame时,可以对帧进行切片/索引以返回一个视图copy,具体取决于内部布局和各种实现细节。顾名思义,“视图”是原始数据的视图,因此修改视图可能会修改原始对象。另一方面,“副本”是原始数据的复制,修改副本不会影响原始数据。

如其他答案所述SettingWithCopyWarning,创建时会标记“链接分配”操作。df在上面的设置中考虑。假设您要选择“ B”列中的所有值,其中“ A”列中的值>5。Pandas允许您以不同的方式执行此操作,其中某些方法比其他方法更正确。例如,

df[df.A > 5]['B']

1    3
2    6
Name: B, dtype: int64

和,

df.loc[df.A > 5, 'B']

1    3
2    6
Name: B, dtype: int64

这些返回相同的结果,因此,如果您仅读取这些值,则没有区别。那么,问题是什么呢?链式分配的问题在于,通常很难预测是否返回视图或副本,因此在尝试分配回值时这在很大程度上成为一个问题。为了建立在前面的示例上,请考虑解释器如何执行此代码:

df.loc[df.A > 5, 'B'] = 4
# becomes
df.loc.__setitem__((df.A > 5, 'B'), 4)

只需__setitem__调用一次即可df。OTOH,请考虑以下代码:

df[df.A > 5]['B'] = 4
# becomes
df.__getitem__(df.A > 5).__setitem__('B', 4)

现在,根据__getitem__返回的视图还是副本,__setitem__操作可能不起作用

通常,您应将其loc用于基于标签的分配以及iloc基于整数/位置的分配,因为该规范保证它们始终在原始文件上运行。此外,要设置单个单元格,应使用atiat

可以在文档中找到更多信息

注意使用进行的
所有布尔索引操作loc也可以使用进行iloc。唯一的区别是iloc期望索引的整数/位置或布尔值的numpy数组,以及列的整数/位置索引。

例如,

df.loc[df.A > 5, 'B'] = 4

可以写成

df.iloc[(df.A > 5).values, 1] = 4

和,

df.loc[1, 'A'] = 100

可以写成

df.iloc[1, 0] = 100

等等。


告诉我如何抑制警告!

考虑对的“ A”列进行的简单操作df。选择“ A”并除以2将发出警告,但该操作将起作用。

df2 = df[['A']]
df2['A'] /= 2
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/__main__.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

df2
     A
0  2.5
1  4.5
2  3.5

有两种方法可以直接静默此警告:

  1. 做一个 deepcopy

    df2 = df[['A']].copy(deep=True)
    df2['A'] /= 2
  2. 更改pd.options.mode.chained_assignment
    可以设置为None"warn""raise""warn"是默认值。None将完全抑制警告,并"raise"抛出SettingWithCopyError,阻止操作进行。

    pd.options.mode.chained_assignment = None
    df2['A'] /= 2

@Peter Cotton在评论中提出了一种不错的方法,即使用上下文管理器以非侵入方式更改模式(从此要点修改),仅在需要时才设置模式,然后将其重置为完成后的原始状态。

class ChainedAssignent:
    def __init__(self, chained=None):
        acceptable = [None, 'warn', 'raise']
        assert chained in acceptable, "chained must be in " + str(acceptable)
        self.swcw = chained

    def __enter__(self):
        self.saved_swcw = pd.options.mode.chained_assignment
        pd.options.mode.chained_assignment = self.swcw
        return self

    def __exit__(self, *args):
        pd.options.mode.chained_assignment = self.saved_swcw

用法如下:

# some code here
with ChainedAssignent():
    df2['A'] /= 2
# more code follows

或者,引发异常

with ChainedAssignent(chained='raise'):
    df2['A'] /= 2

SettingWithCopyError: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

“ XY问题”:我在做什么错?

很多时候,用户试图寻找抑制此异常的方法而没有完全理解为什么首先出现该异常。这是XY问题的一个很好的示例,用户尝试解决问题“ Y”,这实际上是根源问题“ X”的症状。将根据遇到此警告的常见问题提出问题,然后提出解决方案。

问题1
我有一个DataFrame

df
       A  B  C  D  E
    0  5  0  3  3  7
    1  9  3  5  2  4
    2  7  6  8  8  1

我想为“ A”> 5到1000分配值。我的预期输出是

      A  B  C  D  E
0     5  0  3  3  7
1  1000  3  5  2  4
2  1000  6  8  8  1

错误的方法:

df.A[df.A > 5] = 1000         # works, because df.A returns a view
df[df.A > 5]['A'] = 1000      # does not work
df.loc[df.A > 5]['A'] = 1000   # does not work

正确使用方法loc

df.loc[df.A > 5, 'A'] = 1000


问题2 1
我正在尝试将单元格(1,’D’)中的值设置为12345。我的预期输出是

   A  B  C      D  E
0  5  0  3      3  7
1  9  3  5  12345  4
2  7  6  8      8  1

我尝试了多种访问此单元格的方法,例如 df['D'][1]。做这个的最好方式是什么?

1.这个问题与警告并不特别相关,但是最好了解如何正确执行此特定操作,以避免将来可能出现警告的情况。

您可以使用以下任何一种方法来执行此操作。

df.loc[1, 'D'] = 12345
df.iloc[1, 3] = 12345
df.at[1, 'D'] = 12345
df.iat[1, 3] = 12345


问题3
我试图根据某些条件对值进行子集化。我有一个DataFrame

   A  B  C  D  E
1  9  3  5  2  4
2  7  6  8  8  1

我想将“ D”中的值分配给123,以使“ C” ==5。我尝试过

df2.loc[df2.C == 5, 'D'] = 123

看起来不错,但我仍然可以 SettingWithCopyWarning!我该如何解决?

实际上,这可能是因为您的管道中的代码更高。您是否df2从更大的事物(例如

df2 = df[df.A > 5]

?在这种情况下,布尔索引将返回一个视图,因此df2将引用原始视图。您需要做的就是分配df2一个副本

df2 = df[df.A > 5].copy()
# Or,
# df2 = df.loc[df.A > 5, :]


问题4
我试图将列“ C”从

   A  B  C  D  E
1  9  3  5  2  4
2  7  6  8  8  1

但是使用

df2.drop('C', axis=1, inplace=True)

抛出SettingWithCopyWarning。为什么会这样呢?

这是因为df2必须已通过其他切片操作将其创建为视图,例如

df2 = df[df.A > 5]

这里的解决方案是要么做copy()df,或使用loc,如前。

How to deal with SettingWithCopyWarning in Pandas?

This post is meant for readers who,

  1. Would like to understand what this warning means
  2. Would like to understand different ways of suppressing this warning
  3. Would like to understand how to improve their code and follow good practices to avoid this warning in the future.

Setup

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 5)), columns=list('ABCDE'))
df
   A  B  C  D  E
0  5  0  3  3  7
1  9  3  5  2  4
2  7  6  8  8  1

What is the SettingWithCopyWarning?

To know how to deal with this warning, it is important to understand what it means and why it is raised in the first place.

When filtering DataFrames, it is possible to slice/index a frame to return either a view or a copy, depending on the internal layout and various implementation details. A “view” is, as the term suggests, a view into the original data, so modifying the view may modify the original object. On the other hand, a “copy” is a replication of data from the original, and modifying the copy has no effect on the original.

As mentioned by other answers, the SettingWithCopyWarning was created to flag “chained assignment” operations. Consider df in the setup above. Suppose you would like to select all values in column “B” where values in column “A” are > 5. Pandas allows you to do this in different ways, some more correct than others. For example,

df[df.A > 5]['B']

1    3
2    6
Name: B, dtype: int64

And,

df.loc[df.A > 5, 'B']

1    3
2    6
Name: B, dtype: int64

These return the same result, so if you are only reading these values, it makes no difference. So, what is the issue? The problem with chained assignment is that it is generally difficult to predict whether a view or a copy is returned, so this largely becomes an issue when you are attempting to assign values back. To build on the earlier example, consider how this code is executed by the interpreter:

df.loc[df.A > 5, 'B'] = 4
# becomes
df.loc.__setitem__((df.A > 5, 'B'), 4)

This is a single __setitem__ call on df.loc. On the other hand, consider this code:

df[df.A > 5]['B'] = 4
# becomes
df.__getitem__(df.A > 5).__setitem__('B', 4)

Now, depending on whether __getitem__ returned a view or a copy, the __setitem__ operation may not work.

In general, you should use loc for label-based assignment, and iloc for integer/positional based assignment, as the spec guarantees that they always operate on the original. Additionally, for setting a single cell, you should use at and iat.

More can be found in the documentation.

Note
All boolean indexing operations done with loc can also be done with iloc. The only difference is that iloc expects either integers/positions for index or a numpy array of boolean values, and integer/position indexes for the columns.

For example,

df.loc[df.A > 5, 'B'] = 4

Can be written as

df.iloc[(df.A > 5).values, 1] = 4

And,

df.loc[1, 'A'] = 100

Can be written as

df.iloc[1, 0] = 100

And so on.


Just tell me how to suppress the warning!

Consider a simple operation on the “A” column of df. Selecting “A” and dividing by 2 will raise the warning, but the operation will work.

df2 = df[['A']]
df2['A'] /= 2
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/__main__.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

df2
     A
0  2.5
1  4.5
2  3.5

There are a couple ways of directly silencing this warning:

  1. Make a deepcopy

    df2 = df[['A']].copy(deep=True)
    df2['A'] /= 2
    
  2. Change pd.options.mode.chained_assignment
    Can be set to None, "warn", or "raise". "warn" is the default. None will suppress the warning entirely, and "raise" will throw a SettingWithCopyError, preventing the operation from going through.

    pd.options.mode.chained_assignment = None
    df2['A'] /= 2
    

@Peter Cotton, in the comments, came up with a nice way of non-intrusively changing the mode (modified from this gist) using a context manager, to set the mode only as long as it is required, and then reset it back to the original state when finished.

class ChainedAssignent:
    def __init__(self, chained=None):
        acceptable = [None, 'warn', 'raise']
        assert chained in acceptable, "chained must be in " + str(acceptable)
        self.swcw = chained

    def __enter__(self):
        self.saved_swcw = pd.options.mode.chained_assignment
        pd.options.mode.chained_assignment = self.swcw
        return self

    def __exit__(self, *args):
        pd.options.mode.chained_assignment = self.saved_swcw

The usage is as follows:

# some code here
with ChainedAssignent():
    df2['A'] /= 2
# more code follows

Or, to raise the exception

with ChainedAssignent(chained='raise'):
    df2['A'] /= 2

SettingWithCopyError: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

The “XY Problem”: What am I doing wrong?

A lot of the time, users attempt to look for ways of suppressing this warning without fully understanding why it was raised in the first place. This is a good example of an XY problem, where users attempt to solve a problem “Y” that is actually a symptom of a deeper rooted problem “X”. Questions based on common situations in which this warning is encountered are raised below, each followed by a solution.

Question 1
I have a DataFrame

df
       A  B  C  D  E
    0  5  0  3  3  7
    1  9  3  5  2  4
    2  7  6  8  8  1

I want to set the values in column “A” that are > 5 to 1000. My expected output is

      A  B  C  D  E
0     5  0  3  3  7
1  1000  3  5  2  4
2  1000  6  8  8  1

Wrong way to do this:

df.A[df.A > 5] = 1000         # works, because df.A returns a view
df[df.A > 5]['A'] = 1000      # does not work
df.loc[df.A > 5]['A'] = 1000   # does not work

Right way using loc:

df.loc[df.A > 5, 'A'] = 1000


Question 2 [1]
I am trying to set the value in cell (1, ‘D’) to 12345. My expected output is

   A  B  C      D  E
0  5  0  3      3  7
1  9  3  5  12345  4
2  7  6  8      8  1

I have tried different ways of accessing this cell, such as df['D'][1]. What is the best way to do this?

1. This question isn’t specifically related to the warning, but it is good to understand how to do this particular operation correctly so as to avoid situations where the warning could potentially arise in future.

You can use any of the following methods to do this.

df.loc[1, 'D'] = 12345
df.iloc[1, 3] = 12345
df.at[1, 'D'] = 12345
df.iat[1, 3] = 12345


Question 3
I am trying to subset values based on some condition. I have a DataFrame

   A  B  C  D  E
1  9  3  5  2  4
2  7  6  8  8  1

I would like to set values in “D” to 123 where “C” == 5. I tried

df2.loc[df2.C == 5, 'D'] = 123

Which seems fine but I am still getting the SettingWithCopyWarning! How do I fix this?

This is actually probably because of code higher up in your pipeline. Did you create df2 from something larger, like

df2 = df[df.A > 5]

? In this case, boolean indexing will return a view, so df2 will reference the original. What you’d need to do is assign df2 to a copy:

df2 = df[df.A > 5].copy()
# Or,
# df2 = df.loc[df.A > 5, :]


Question 4
I’m trying to drop column “C” in-place from

   A  B  C  D  E
1  9  3  5  2  4
2  7  6  8  8  1

But using

df2.drop('C', axis=1, inplace=True)

Throws SettingWithCopyWarning. Why is this happening?

This is because df2 must have been created as a view from some other slicing operation, such as

df2 = df[df.A > 5]

The solution here is to either make a copy() of df, or use loc, as before.
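
The fixes from Questions 1, 3 and 4 can be sketched together. This is only an illustration on a hand-built frame that matches the setup values, and it assumes pandas is installed. Whether a warning appears for the unsafe variants depends on the pandas version, so the sketch shows only the recommended forms.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 9, 7], "B": [0, 3, 6], "C": [3, 5, 8],
                   "D": [3, 2, 8], "E": [7, 4, 1]})

# Question 1: a single .loc call selects rows and column, then assigns
df.loc[df["A"] > 5, "A"] = 1000

# Questions 3 and 4: take an explicit copy before modifying the subset
df2 = df[df["A"] > 5].copy()
df2.loc[df2["C"] == 5, "D"] = 123
df2.drop("C", axis=1, inplace=True)
```

After this runs, df still has its column “C” and its original “D” values, because every change to df2 was made on an independent copy.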


回答 2

通常,的目的SettingWithCopyWarning是向用户(尤其是新用户)显示他们可能正在使用副本,而不是他们认为的原始内容。这里误报(IOW如果你知道你在做什么,它可能是好的)。一种可能就是简单地关闭(默认警告按照@Garrett的建议)警告。

这是另一个选择:

In [1]: df = DataFrame(np.random.randn(5, 2), columns=list('AB'))

In [2]: dfa = df.ix[:, [1, 0]]

In [3]: dfa.is_copy
Out[3]: True

In [4]: dfa['A'] /= 2
/usr/local/bin/ipython:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  #!/usr/local/bin/python

您可以将is_copy标志设置为False,以有效关闭该对象的检查:

In [5]: dfa.is_copy = False

In [6]: dfa['A'] /= 2

如果您明确复制,则不会发生进一步的警告:

In [7]: dfa = df.ix[:, [1, 0]].copy()

In [8]: dfa['A'] /= 2

OP在上面显示的代码是合法的,并且可能是我也可以做的,但从技术上讲,此警告是一种情况,不是误报。没有警告的另一种方法是通过进行选择操作reindex,例如

quote_df = quote_df.reindex(columns=['STK', ...])

要么,

quote_df = quote_df.reindex(['STK', ...], axis=1)  # v.0.21

In general, the point of the SettingWithCopyWarning is to show users (and especially new users) that they may be operating on a copy and not the original as they think. There are false positives (in other words, if you know what you are doing, it could be OK). One possibility is simply to turn off the warning (which is on by default), as @Garrett suggests.

Here is another option:

In [1]: df = DataFrame(np.random.randn(5, 2), columns=list('AB'))

In [2]: dfa = df.ix[:, [1, 0]]

In [3]: dfa.is_copy
Out[3]: True

In [4]: dfa['A'] /= 2
/usr/local/bin/ipython:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  #!/usr/local/bin/python

You can set the is_copy flag to False, which will effectively turn off the check, for that object:

In [5]: dfa.is_copy = False

In [6]: dfa['A'] /= 2

If you explicitly copy then no further warning will happen:

In [7]: dfa = df.ix[:, [1, 0]].copy()

In [8]: dfa['A'] /= 2

The code the OP is showing above, while legitimate, and probably something I do as well, is technically a case for this warning, and not a false positive. Another way to not have the warning would be to do the selection operation via reindex, e.g.

quote_df = quote_df.reindex(columns=['STK', ...])

Or,

quote_df = quote_df.reindex(['STK', ...], axis=1)  # v.0.21

回答 3

熊猫数据框复制警告

当您去做这样的事情时:

quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]

pandas.ix 在这种情况下将返回一个新的独立数据帧。

您决定在此数据框中更改的任何值都不会更改原始数据框。

这就是熊猫试图警告您的内容。


为什么 .ix是个坏主意

.ix对象试图做的事情不只一件事,而且对于任何阅读过干净代码的人来说,这是一种强烈的气味。

给定此数据框:

df = pd.DataFrame({"a": [1,2,3,4], "b": [1,1,2,2]})

两种行为:

dfcopy = df.ix[:,["a"]]
dfcopy.a.ix[0] = 2

行为一:dfcopy现在是一个独立的数据框。改变它不会改变df

df.ix[0, "a"] = 3

行为二:更改原始数据框。


使用.loc替代

熊猫开发者意识到该.ix对象很臭(推测地),因此创建了两个新对象,这些对象有助于数据的获取和分配。(另一个是.iloc

.loc 速度更快,因为它不会尝试创建数据副本。

.loc 旨在就地修改现有数据框,从而提高内存效率。

.loc 是可预测的,它具有一种行为。


解决方案

在代码示例中,您正在执行的操作是加载一个包含许多列的大文件,然后将其修改为较小的文件。

pd.read_csv功能可以帮助您解决很多问题,还可以加快文件的加载速度。

所以不要这样做

quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]

做这个

columns = ['STK', 'TPrice', 'TPCLOSE', 'TOpen', 'THigh', 'TLow', 'TVol', 'TAmt', 'TDate', 'TTime']
df = pd.read_csv(StringIO(str_of_all), sep=',', usecols=[0,3,2,1,4,5,8,9,30,31])
df.columns = columns

这只会读取您感兴趣的列,并正确命名它们。无需使用邪恶的.ix物体做神奇的事情。

Pandas dataframe copy warning

When you go and do something like this:

quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]

pandas.ix in this case returns a new, stand alone dataframe.

Any values you decide to change in this dataframe, will not change the original dataframe.

This is what pandas tries to warn you about.


Why .ix is a bad idea

The .ix object tries to do more than one thing, and for anyone who has read anything about clean code, this is a strong smell.

Given this dataframe:

df = pd.DataFrame({"a": [1,2,3,4], "b": [1,1,2,2]})

Two behaviors:

dfcopy = df.ix[:,["a"]]
dfcopy.a.ix[0] = 2

Behavior one: dfcopy is now a stand alone dataframe. Changing it will not change df

df.ix[0, "a"] = 3

Behavior two: This changes the original dataframe.


Use .loc instead

The pandas developers recognized that the .ix object was quite smelly (speculatively) and thus created two new objects that help with accessing and assigning data. (The other is .iloc.)

.loc is faster, because it does not try to create a copy of the data.

.loc is meant to modify your existing dataframe inplace, which is more memory efficient.

.loc is predictable, it has one behavior.


The solution

What you are doing in your code example is loading a big file with lots of columns, then modifying it to be smaller.

The pd.read_csv function can help you out with a lot of this and also make the loading of the file a lot faster.

So instead of doing this

quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]

Do this

columns = ['STK', 'TPrice', 'TPCLOSE', 'TOpen', 'THigh', 'TLow', 'TVol', 'TAmt', 'TDate', 'TTime']
df = pd.read_csv(StringIO(str_of_all), sep=',', usecols=[0,3,2,1,4,5,8,9,30,31])
df.columns = columns

This will only read the columns you are interested in, and name them properly. No need for using the evil .ix object to do magical stuff.
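
One caveat worth checking before relying on the snippet above: as far as I know, read_csv ignores the order of the usecols list and returns the columns in file order, so assigning a reordered list of names afterwards can mislabel them. The sketch below is one way around that, assuming a headerless file; the tiny CSV and the `wanted` mapping are made up for illustration.

```python
from io import StringIO
import pandas as pd

# Tiny headerless CSV standing in for str_of_all (made-up values).
str_of_all = "s1,1.0,2.0,3.0\ns2,4.0,5.0,6.0\n"

# Hypothetical mapping: file position -> desired name, listed in the desired output order.
wanted = {0: "STK", 3: "TPrice", 2: "TPCLOSE", 1: "TOpen"}

df = pd.read_csv(StringIO(str_of_all), header=None, usecols=list(wanted))
# The columns come back labelled by position, in file order (0, 1, 2, 3),
# so rename by position first and only then reorder.
df = df.rename(columns=wanted)[list(wanted.values())]
print(df)
```

Renaming through a position-to-name mapping keeps each name attached to the right data regardless of the order in which the columns are read.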


回答 4

在这里,我直接回答这个问题。怎么处理呢?

.copy(deep=False)切片后做一个。参见pandas.DataFrame.copy

等等,切片不返回副本吗?毕竟,这是警告消息要说的内容?阅读详细答案:

import pandas as pd
df = pd.DataFrame({'x':[1,2,3]})

这给出了警告:

df0 = df[df.x>2]
df0['foo'] = 'bar'

这不是:

df1 = df[df.x>2].copy(deep=False)
df1['foo'] = 'bar'

两者df0df1都是DataFrame对象,但它们之间的某些不同之处使熊猫能够打印警告。让我们找出它是什么。

import inspect
slice= df[df.x>2]
slice_copy = df[df.x>2].copy(deep=False)
inspect.getmembers(slice)
inspect.getmembers(slice_copy)

使用选择的差异工具,您将看到,除了几个地址之外,唯一的实质区别是:

|          | slice   | slice_copy |
| _is_copy | weakref | None       |

决定是否发出警告的方法是DataFrame._check_setitem_copy检查_is_copy。所以,你去。制作一个copy使您的DataFrame不_is_copy

建议使用警告.loc,但如果在上使用.loc该框架_is_copy,您仍会收到相同的警告。误导?是。烦人吗 你打赌 有帮助吗?可能在使用链式分配时。但是它不能正确检测链条分配,并且会随意打印警告。

Here I answer the question directly. How to deal with it?

Make a .copy(deep=False) after you slice. See pandas.DataFrame.copy.

Wait, doesn’t a slice return a copy? After all, this is what the warning message is attempting to say? Read the long answer:

import pandas as pd
df = pd.DataFrame({'x':[1,2,3]})

This gives a warning:

df0 = df[df.x>2]
df0['foo'] = 'bar'

This does not:

df1 = df[df.x>2].copy(deep=False)
df1['foo'] = 'bar'

Both df0 and df1 are DataFrame objects, but something about them is different that enables pandas to print the warning. Let’s find out what it is.

import inspect
slice= df[df.x>2]
slice_copy = df[df.x>2].copy(deep=False)
inspect.getmembers(slice)
inspect.getmembers(slice_copy)

Using your diff tool of choice, you will see that beyond a couple of addresses, the only material difference is this:

|          | slice   | slice_copy |
| _is_copy | weakref | None       |

The method that decides whether to warn is DataFrame._check_setitem_copy which checks _is_copy. So here you go. Make a copy so that your DataFrame is not _is_copy.

The warning is suggesting to use .loc, but if you use .loc on a frame that _is_copy, you will still get the same warning. Misleading? Yes. Annoying? You bet. Helpful? Potentially, when chained assignment is used. But it cannot correctly detect chain assignment and prints the warning indiscriminately.


回答 5

这个话题确实让Pandas感到困惑。幸运的是,它有一个相对简单的解决方案。

问题在于,并不总是清楚数据过滤操作(例如loc)是否返回DataFrame的副本或视图。因此,这种过滤后的DataFrame的进一步使用可能会造成混淆。

简单的解决方案是(除非您需要处理非常大的数据集):

每当需要更新任何值时,请始终确保在分配之前隐式复制DataFrame。

df  # Some DataFrame
df = df.loc[:, 0:2]  # Some filtering (unsure whether a view or copy is returned)
df = df.copy()  # Ensuring a copy is made
df[df["Name"] == "John"] = "Johny"  # Assignment can be done now (no warning)

This topic is really confusing with Pandas. Luckily, it has a relatively simple solution.

The problem is that it is not always clear whether data filtering operations (e.g. loc) return a copy or a view of the DataFrame. Further use of such filtered DataFrame could therefore be confusing.

The simple solution is (unless you need to work with very large sets of data):

Whenever you need to update any values, always make sure that you explicitly copy the DataFrame before the assignment.

df  # Some DataFrame
df = df.loc[:, 0:2]  # Some filtering (unsure whether a view or copy is returned)
df = df.copy()  # Ensuring a copy is made
df[df["Name"] == "John"] = "Johny"  # Assignment can be done now (no warning)
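
A small runnable variant of the same idea, on made-up data. Note that the last line of the snippet above assigns "Johny" to every column of the matching rows; the sketch below assumes the intent was to change only the Name column and uses .loc for that.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Mary", "John"], "Age": [30, 25, 41]})

# Filter, then take an explicit copy so later writes cannot touch df.
adults = df[df["Age"] > 28].copy()
adults.loc[adults["Name"] == "John", "Name"] = "Johny"

print(adults["Name"].tolist())
print(df["Name"].tolist())  # the original frame is unchanged
```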


回答 6

为了消除任何疑问,我的解决方案是制作切片的深层副本,而不是常规副本。根据您的上下文,这可能不适用(内存限制/切片的大小,潜在的性能下降-特别是如果复制像对我一样在一个循环中发生,等等。)

需要明确的是,这是我收到的警告:

/opt/anaconda3/lib/python3.6/site-packages/ipykernel/__main__.py:54:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

插图

我怀疑是否由于我将一列放在切片的副本上而引发警告。虽然从技术上讲,它不是在切片副本中尝试设置值,但是这仍然是切片副本的修改。以下是我为确认怀疑而采取的(简化)步骤,希望它能对那些试图了解警告的人有所帮助。

示例1:在原件上放置一列会影响复印

我们已经知道了,但这是健康的提醒。这是不是警告是关于什么的。

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123


>> df2 = df1
>> df2

    A   B
0   111 121
1   112 122
2   113 123

# Dropping a column on df1 affects df2
>> df1.drop('A', axis=1, inplace=True)
>> df2
    B
0   121
1   122
2   123

可以避免对df1进行更改以影响df2

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

>> import copy
>> df2 = copy.deepcopy(df1)
>> df2
    A   B
0   111 121
1   112 122
2   113 123

# Dropping a column on df1 does not affect df2
>> df1.drop('A', axis=1, inplace=True)
>> df2
    A   B
0   111 121
1   112 122
2   113 123

示例2:在副本上放置一列可能会影响原始

这实际上说明了警告。

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

>> df2 = df1
>> df2

    A   B
0   111 121
1   112 122
2   113 123

# Dropping a column on df2 can affect df1
# No slice involved here, but I believe the principle remains the same?
# Let me know if not
>> df2.drop('A', axis=1, inplace=True)
>> df1

    B
0   121
1   122
2   123

可以避免对df2进行更改以影响df1

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

>> import copy
>> df2 = copy.deepcopy(df1)
>> df2

    A   B
0   111 121
1   112 122
2   113 123

>> df2.drop('A', axis=1, inplace=True)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

干杯!

To remove any doubt, my solution was to make a deep copy of the slice instead of a regular copy. This may not be applicable depending on your context (Memory constraints / size of the slice, potential for performance degradation – especially if the copy occurs in a loop like it did for me, etc…)

To be clear, here is the warning I received:

/opt/anaconda3/lib/python3.6/site-packages/ipykernel/__main__.py:54:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Illustration

I suspected that the warning was thrown because of a column I was dropping on a copy of the slice. While not technically trying to set a value in the copy of the slice, that was still a modification of the copy of the slice. Below are the (simplified) steps I took to confirm the suspicion; I hope they will help those of us who are trying to understand the warning.

Example 1: dropping a column on the original affects the copy

We knew that already but this is a healthy reminder. This is NOT what the warning is about.

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123


>> df2 = df1
>> df2

    A   B
0   111 121
1   112 122
2   113 123

# Dropping a column on df1 affects df2
>> df1.drop('A', axis=1, inplace=True)
>> df2
    B
0   121
1   122
2   123

It is possible to keep changes made on df1 from affecting df2:

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

>> import copy
>> df2 = copy.deepcopy(df1)
>> df2
    A   B
0   111 121
1   112 122
2   113 123

# Dropping a column on df1 does not affect df2
>> df1.drop('A', axis=1, inplace=True)
>> df2
    A   B
0   111 121
1   112 122
2   113 123

Example 2: dropping a column on the copy may affect the original

This actually illustrates the warning.

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

>> df2 = df1
>> df2

    A   B
0   111 121
1   112 122
2   113 123

# Dropping a column on df2 can affect df1
# No slice involved here, but I believe the principle remains the same?
# Let me know if not
>> df2.drop('A', axis=1, inplace=True)
>> df1

    B
0   121
1   122
2   123

It is possible to keep changes made on df2 from affecting df1:

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

>> import copy
>> df2 = copy.deepcopy(df1)
>> df2

    A   B
0   111 121
1   112 122
2   113 123

>> df2.drop('A', axis=1, inplace=True)
>> df1

    A   B
0   111 121
1   112 122
2   113 123

Cheers!


回答 7

这应该工作:

quote_df.loc[:,'TVol'] = quote_df['TVol']/TVOL_SCALE

This should work:

quote_df.loc[:,'TVol'] = quote_df['TVol']/TVOL_SCALE

回答 8

有些人可能想简单地消除警告:

import pandas as pd

class SupressSettingWithCopyWarning:
    def __enter__(self):
        pd.options.mode.chained_assignment = None

    def __exit__(self, *args):
        pd.options.mode.chained_assignment = 'warn'

with SupressSettingWithCopyWarning():
    pass  # code that produces the warning goes here

Some may want to simply suppress the warning:

import pandas as pd

class SupressSettingWithCopyWarning:
    def __enter__(self):
        pd.options.mode.chained_assignment = None

    def __exit__(self, *args):
        pd.options.mode.chained_assignment = 'warn'

with SupressSettingWithCopyWarning():
    pass  # code that produces the warning goes here

回答 9

如果您已将切片分配给变量,并希望使用变量进行设置,如下所示:

df2 = df[df['A'] > 2]
df2['B'] = value

而且由于您的条件计算df2时间太长或出于某些其他原因,您不想使用Jeffs解决方案,那么您可以使用以下方法:

df.loc[df2.index.tolist(), 'B'] = value

df2.index.tolist() 返回df2中所有条目的索引,然后将这些索引用于设置原始数据帧中的B列。

If you have assigned the slice to a variable and want to set using the variable as in the following:

df2 = df[df['A'] > 2]
df2['B'] = value

And you do not want to use Jeff's solution because the condition that computes df2 is too long, or for some other reason, then you can use the following:

df.loc[df2.index.tolist(), 'B'] = value

df2.index.tolist() returns the indices from all entries in df2, which will then be used to set column B in the original dataframe.
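
A runnable sketch of this approach on made-up data (`value` is a placeholder). It assumes the index labels of df are unique; with duplicate labels, .loc would also hit rows that were not part of df2.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [0, 0, 0, 0]})
value = 99  # placeholder for whatever should be written

df2 = df[df["A"] > 2]            # the slice, computed once
df.loc[df2.index, "B"] = value   # write through the original frame, not the slice

print(df["B"].tolist())
```

Passing df2.index directly should behave the same as df2.index.tolist() here; the list conversion is kept in the answer above and is harmless.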


回答 10

对我来说,此问题发生在下面的> simplified <示例中。我也能够解决它(希望有一个正确的解决方案):

带有警告的旧代码:

def update_old_dataframe(old_dataframe, new_dataframe):
    for new_index, new_row in new_dataframe.iterrows():
        old_dataframe.loc[new_index] = update_row(old_dataframe.loc[new_index], new_row)

def update_row(old_row, new_row):
    for field in [list_of_columns]:
        # line with warning because of chain indexing old_dataframe[new_index][field]
        old_row[field] = new_row[field]  
    return old_row

这打印了该行的警告 old_row[field] = new_row[field]

由于update_row方法中的行实际上是type Series,因此我将其替换为:

old_row.at[field] = new_row.at[field]

即用于访问/查找的方法Series。尽管两者都可以正常工作并且结果是相同的,但是通过这种方式,我不必禁用警告(=将其保留在其他地方的其他链索引问题中)。

我希望这可以帮助某人。

For me, this issue occurred in the following (simplified) example. I was also able to solve it (hopefully with a correct solution):

old code with warning:

def update_old_dataframe(old_dataframe, new_dataframe):
    for new_index, new_row in new_dataframe.iterrows():
        old_dataframe.loc[new_index] = update_row(old_dataframe.loc[new_index], new_row)

def update_row(old_row, new_row):
    for field in [list_of_columns]:
        # line with warning because of chain indexing old_dataframe[new_index][field]
        old_row[field] = new_row[field]  
    return old_row

This printed the warning for the line old_row[field] = new_row[field]

Since the rows in update_row method are actually type Series, I replaced the line with:

old_row.at[field] = new_row.at[field]

i.e. the accessor for single-value lookups on a Series. Even though both work just fine and the result is the same, this way I don’t have to disable the warnings (so they stay active for other chained-indexing issues elsewhere).

I hope this may help someone.
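
If the per-row loop is not essential, the same update can probably be written as one aligned .loc assignment, which avoids handing a row Series to a helper at all. This is a sketch under that assumption, with made-up frames; it also assumes every index label in new_dataframe already exists in old_dataframe.

```python
import pandas as pd

old_dataframe = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0], "qty": [10, 20, 30]}, index=["a", "b", "c"]
)
new_dataframe = pd.DataFrame({"price": [2.5, 3.5], "qty": [21, 31]}, index=["b", "c"])
list_of_columns = ["price", "qty"]

# Overwrite the matching rows and columns in one step; pandas aligns on labels.
old_dataframe.loc[new_dataframe.index, list_of_columns] = new_dataframe[list_of_columns]

print(old_dataframe)
```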


回答 11

我相信您可以避免像这样的整个问题:

return (
    pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg'))  # dtype={'A': object, 'B': object, 'C': np.float64}
    # No inplace=True here: it would make rename return None and break the chain.
    .rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'})
    .iloc[:, [0,3,2,1,4,5,8,9,30,31]]  # positional selection; .ix was removed in pandas 1.0
    .assign(
        TClose=lambda df: df['TPrice'],
        RT=lambda df: 100 * (df['TPrice']/df['TPCLOSE'] - 1),
        TVol=lambda df: df['TVol']/TVOL_SCALE,
        TAmt=lambda df: df['TAmt']/TAMT_SCALE,
        STK_ID=lambda df: df['STK'].str.slice(13,19),
        STK_Name=lambda df: df['STK'].str.slice(21,30),  # .decode('gb2312')
        TDate=lambda df: df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10]),
    )
)

使用分配。从文档中:将新列分配给DataFrame,返回一个新对象(一个副本),其中除新列外还包含所有原始列。

参见汤姆·奥格斯珀格(Tom Augspurger)关于熊猫方法链接的文章:https ://tomaugspurger.github.io/method-chaining

You could avoid the whole problem like this, I believe:

return (
    pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg'))  # dtype={'A': object, 'B': object, 'C': np.float64}
    # No inplace=True here: it would make rename return None and break the chain.
    .rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'})
    .iloc[:, [0,3,2,1,4,5,8,9,30,31]]  # positional selection; .ix was removed in pandas 1.0
    .assign(
        TClose=lambda df: df['TPrice'],
        RT=lambda df: 100 * (df['TPrice']/df['TPCLOSE'] - 1),
        TVol=lambda df: df['TVol']/TVOL_SCALE,
        TAmt=lambda df: df['TAmt']/TAMT_SCALE,
        STK_ID=lambda df: df['STK'].str.slice(13,19),
        STK_Name=lambda df: df['STK'].str.slice(21,30),  # .decode('gb2312')
        TDate=lambda df: df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10]),
    )
)

Using Assign. From the documentation: Assign new columns to a DataFrame, returning a new object (a copy) with all the original columns in addition to the new ones.

See Tom Augspurger’s article on method chaining in pandas: https://tomaugspurger.github.io/method-chaining
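
A stripped-down, runnable illustration of the assign behaviour quoted from the documentation, using a made-up two-column frame rather than the OP's data: each lambda receives the frame as it stands at that point in the chain, and the input frame is left without the new columns.

```python
import pandas as pd

quotes = pd.DataFrame({"TPrice": [15.0, 5.0], "TPCLOSE": [10.0, 10.0]})

result = quotes.assign(
    TClose=lambda df: df["TPrice"],
    RT=lambda df: 100 * (df["TPrice"] / df["TPCLOSE"] - 1),
)

print(result)
print(list(quotes.columns))  # still only the two original columns
```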


回答 12

后续初学者问题/备注

也许是对其他像我这样的初学者的澄清(我来自R,似乎在幕后工作有所不同)。以下看起来无害且功能正常的代码不断产生SettingWithCopy警告,但我不知道为什么。我已经阅读并理解了带有“链式索引”的内容,但是我的代码不包含任何内容:

def plot(pdb, df, title, **kw):
    df['target'] = (df['ogg'] + df['ugg']) / 2
    # ...

但是后来,太晚了,我查看了plot()函数的调用位置:

    df = data[data['anz_emw'] > 0]
    pixbuf = plot(pdb, df, title)

因此,“ df”不是数据帧,而是某种对象,它会以某种方式记住它是通过索引数据帧而创建的(因此是视图?),这将使plot()中的行成为可能。

 df['target'] = ...

相当于

 data[data['anz_emw'] > 0]['target'] = ...

这是一个链接索引。我说对了吗?

无论如何,

def plot(pdb, df, title, **kw):
    df.loc[:,'target'] = (df['ogg'] + df['ugg']) / 2

固定它。

Followup beginner question / remark

Maybe a clarification for other beginners like me (I come from R, which seems to work a bit differently under the hood). The following harmless-looking and functional code kept producing the SettingWithCopy warning, and I couldn’t figure out why. I had both read and understood the issue with “chained indexing”, but my code doesn’t contain any:

def plot(pdb, df, title, **kw):
    df['target'] = (df['ogg'] + df['ugg']) / 2
    # ...

But then, later, much too late, I looked at where the plot() function is called:

    df = data[data['anz_emw'] > 0]
    pixbuf = plot(pdb, df, title)

So “df” isn’t a data frame but an object that somehow remembers that it was created by indexing a data frame (so is that a view?) which would make the line in plot()

 df['target'] = ...

equivalent to

 data[data['anz_emw'] > 0]['target'] = ...

which is a chained indexing. Did I get that right?

Anyway,

def plot(pdb, df, title, **kw):
    df.loc[:,'target'] = (df['ogg'] + df['ugg']) / 2

fixed it.
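
Another option, sketched here with a hypothetical helper name (`add_target`) and made-up data, is to have the function build and return a new frame with assign instead of writing into whatever the caller passed in. That way it does not matter whether the argument came from a filter.

```python
import pandas as pd

def add_target(df):
    """Return a new frame with a 'target' column; the argument is left untouched."""
    return df.assign(target=(df["ogg"] + df["ugg"]) / 2)

data = pd.DataFrame(
    {"anz_emw": [0, 3, 5], "ogg": [2.0, 4.0, 6.0], "ugg": [0.0, 2.0, 4.0]}
)
subset = data[data["anz_emw"] > 0]   # a filtered frame, as in the call site above
with_target = add_target(subset)

print(with_target)
```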


回答 13

由于这个问题已经在现有答案中得到了充分的解释和讨论,因此我将为pandas上下文管理器提供一种简洁的方法,使用pandas.option_context(指向文档示例的链接)-绝对不需要使用所有dunder方法和其他方法创建自定义类和口哨声。

首先,上下文管理器代码本身:

from contextlib import contextmanager

@contextmanager
def SuppressPandasWarning():
    with pd.option_context("mode.chained_assignment", None):
        yield

再举一个例子:

import pandas as pd
from string import ascii_letters

a = pd.DataFrame({"A": list(ascii_letters[0:4]), "B": range(0,4)})

mask = a["A"].isin(["c", "d"])
# Even shallow copy below is enough to not raise the warning, but why is a mystery to me.
b = a.loc[mask]  # .copy(deep=False)

# Raises the `SettingWithCopyWarning`
b["B"] = b["B"] * 2

# Does not!
with SuppressPandasWarning():
    b["B"] = b["B"] * 2

值得一提的是,这两个方法均未修改a,这对我来说有点令人惊讶,即使是带有df的浅表副本.copy(deep=False)也将阻止发出此警告(据我所知,浅表副本也应至少a也应进行修改,但它不会’t。pandas魔术。)。

As this question is already fully explained and discussed in existing answers I will just provide a neat pandas approach to the context manager using pandas.option_context (links to docs and example) – there is absolutely no need to create a custom class with all the dunder methods and other bells and whistles.

First the context manager code itself:

from contextlib import contextmanager

@contextmanager
def SuppressPandasWarning():
    with pd.option_context("mode.chained_assignment", None):
        yield

Then an example:

import pandas as pd
from string import ascii_letters

a = pd.DataFrame({"A": list(ascii_letters[0:4]), "B": range(0,4)})

mask = a["A"].isin(["c", "d"])
# Even shallow copy below is enough to not raise the warning, but why is a mystery to me.
b = a.loc[mask]  # .copy(deep=False)

# Raises the `SettingWithCopyWarning`
b["B"] = b["B"] * 2

# Does not!
with SuppressPandasWarning():
    b["B"] = b["B"] * 2

Worth noticing is that neither approach modifies a, which is a bit surprising to me, and even a shallow df copy with .copy(deep=False) would prevent this warning from being raised (as far as I understand, a shallow copy should at least modify a as well, but it doesn’t. pandas magic.).


回答 14

.apply()从使用该.query()方法的现有数据帧分配新数据帧时,我一直遇到这个问题。例如:

prop_df = df.query('column == "value"')
prop_df['new_column'] = prop_df.apply(function, axis=1)

将返回此错误。在这种情况下,似乎可以解决该错误的修补程序是将其更改为:

prop_df = df.copy(deep=True)
prop_df = prop_df.query('column == "value"')
prop_df['new_column'] = prop_df.apply(function, axis=1)

但是,由于必须进行新的复制,因此这在使用大型数据帧时效率不高。

如果.apply()在生成新列及其值时使用该方法,则可以通过添加以下方法来解决错误并提高效率.reset_index(drop=True)

prop_df = df.query('column == "value"').reset_index(drop=True)
prop_df['new_column'] = prop_df.apply(function, axis=1)

I had been getting this issue with .apply() when assigning a new dataframe from a pre-existing dataframe on which I’ve used the .query() method. For instance:

prop_df = df.query('column == "value"')
prop_df['new_column'] = prop_df.apply(function, axis=1)

This would trigger the warning. The fix that seems to resolve it in this case is to change the code to:

prop_df = df.copy(deep=True)
prop_df = prop_df.query('column == "value"')
prop_df['new_column'] = prop_df.apply(function, axis=1)

However, this is NOT efficient especially when using large dataframes, due to having to make a new copy.

If you’re using the .apply() method in generating a new column and its values, a fix that resolves the error and is more efficient is by adding .reset_index(drop=True):

prop_df = df.query('column == "value"').reset_index(drop=True)
prop_df['new_column'] = prop_df.apply(function, axis=1)

我正在运行什么操作系统?

问题:我正在运行什么操作系统?

我要查看我是在Windows还是Unix等上,我需要查看什么?

What do I need to look at to see whether I’m on Windows or Unix, etc?


回答 0

>>> import os
>>> os.name
'posix'
>>> import platform
>>> platform.system()
'Linux'
>>> platform.release()
'2.6.22-15-generic'

的输出platform.system()如下:

  • Linux: Linux
  • 苹果电脑: Darwin
  • 视窗: Windows

请参阅:platform—访问基础平台的标识数据

>>> import os
>>> os.name
'posix'
>>> import platform
>>> platform.system()
'Linux'
>>> platform.release()
'2.6.22-15-generic'

The output of platform.system() is as follows:

  • Linux: Linux
  • Mac: Darwin
  • Windows: Windows

See: platform — Access to underlying platform’s identifying data
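
As a small usage sketch, the three values listed above can be mapped to display names. The `FRIENDLY` table and the `os_name` helper are illustrative names, not part of the platform module; any value not in the table is passed through unchanged.

```python
import platform

# Display names for the platform.system() values listed above.
FRIENDLY = {"Linux": "Linux", "Darwin": "macOS", "Windows": "Windows"}

def os_name():
    """Return a display name for the current OS, falling back to the raw value."""
    system = platform.system()
    # platform.system() may return an empty string if the value cannot be determined.
    return FRIENDLY.get(system, system or "unknown")

print(os_name())
```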


回答 1

Dang-lbrandy击败了我,但这并不意味着我无法为您提供Vista的系统结果!

>>> import os
>>> os.name
'nt'
>>> import platform
>>> platform.system()
'Windows'
>>> platform.release()
'Vista'

…而且我不敢相信还没有人为Windows 10发布过一个:

>>> import os
>>> os.name
'nt'
>>> import platform
>>> platform.system()
'Windows'
>>> platform.release()
'10'

Dang — lbrandy beat me to the punch, but that doesn’t mean I can’t provide you with the system results for Vista!

>>> import os
>>> os.name
'nt'
>>> import platform
>>> platform.system()
'Windows'
>>> platform.release()
'Vista'

…and I can’t believe no one’s posted one for Windows 10 yet:

>>> import os
>>> os.name
'nt'
>>> import platform
>>> platform.system()
'Windows'
>>> platform.release()
'10'

回答 2

为了记录,这是在Mac上的结果:

>>> import os
>>> os.name
'posix'
>>> import platform
>>> platform.system()
'Darwin'
>>> platform.release()
'8.11.1'

For the record, here are the results on Mac:

>>> import os
>>> os.name
'posix'
>>> import platform
>>> platform.system()
'Darwin'
>>> platform.release()
'8.11.1'

回答 3

使用python区分操作系统的示例代码:

from sys import platform as _platform

if _platform == "linux" or _platform == "linux2":
    pass  # Linux ("linux2" is what Python 2 reports)
elif _platform == "darwin":
    pass  # Mac OS X
elif _platform == "win32":
    pass  # Windows; sys.platform is "win32" for 64-bit Python too, so no "win64" branch is needed

Sample code to differentiate OS’s using python:

from sys import platform as _platform

if _platform == "linux" or _platform == "linux2":
    pass  # Linux ("linux2" is what Python 2 reports)
elif _platform == "darwin":
    pass  # Mac OS X
elif _platform == "win32":
    pass  # Windows; sys.platform is "win32" for 64-bit Python too, so no "win64" branch is needed
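
A slightly more defensive sketch of the same check, written as a function so it can be exercised with arbitrary strings. The `classify` name and its return labels are made up for illustration; grouping "cygwin" and "msys" with Windows is a design choice of this sketch, not something sys.platform implies.

```python
import sys

def classify(platform_string):
    """Map a sys.platform value to a coarse OS family."""
    if platform_string.startswith("linux"):   # covers "linux" and Python 2's "linux2"
        return "linux"
    if platform_string == "darwin":
        return "macos"
    if platform_string in ("win32", "cygwin", "msys"):
        return "windows"
    return "other"

print(classify(sys.platform))
```

Using startswith for Linux avoids listing each suffixed variant separately.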

回答 4

sys.platform如果已经导入sys并且不想导入其他模块,也可以使用

>>> import sys
>>> sys.platform
'linux2'

You can also use sys.platform if you already have imported sys and you don’t want to import another module

>>> import sys
>>> sys.platform
'linux2'

回答 5

如果您想要用户可读的数据但仍然很详细,则可以使用platform.platform()

>>> import platform
>>> platform.platform()
'Linux-3.3.0-8.fc16.x86_64-x86_64-with-fedora-16-Verne'

您可以拨打以下几种可能的电话来识别自己的位置

import platform
import sys

def linux_distribution():
  try:
    return platform.linux_distribution()
  except:
    return "N/A"

print("""Python version: %s
dist: %s
linux_distribution: %s
system: %s
machine: %s
platform: %s
uname: %s
version: %s
mac_ver: %s
""" % (
sys.version.split('\n'),
str(platform.dist()),
linux_distribution(),
platform.system(),
platform.machine(),
platform.platform(),
platform.uname(),
platform.version(),
platform.mac_ver(),
))

该脚本在几种不同的系统(Linux,Windows,Solaris,MacOS)和体系结构(x86,x64,Itanium,power pc,sparc)上运行的输出可在以下位置找到:https://github.com/hpcugent/easybuild/wiki/OS_flavor_name_version

以Ubuntu 12.04服务器为例:

Python version: ['2.6.5 (r265:79063, Oct  1 2012, 22:04:36) ', '[GCC 4.4.3]']
dist: ('Ubuntu', '10.04', 'lucid')
linux_distribution: ('Ubuntu', '10.04', 'lucid')
system: Linux
machine: x86_64
platform: Linux-2.6.32-32-server-x86_64-with-Ubuntu-10.04-lucid
uname: ('Linux', 'xxx', '2.6.32-32-server', '#62-Ubuntu SMP Wed Apr 20 22:07:43 UTC 2011', 'x86_64', '')
version: #62-Ubuntu SMP Wed Apr 20 22:07:43 UTC 2011
mac_ver: ('', ('', '', ''), '')

If you want user readable data but still detailed, you can use platform.platform()

>>> import platform
>>> platform.platform()
'Linux-3.3.0-8.fc16.x86_64-x86_64-with-fedora-16-Verne'

Here are a few different possible calls you can make to identify where you are:

import platform
import sys

def linux_distribution():
  try:
    return platform.linux_distribution()
  except:
    return "N/A"

print("""Python version: %s
dist: %s
linux_distribution: %s
system: %s
machine: %s
platform: %s
uname: %s
version: %s
mac_ver: %s
""" % (
sys.version.split('\n'),
str(platform.dist()),
linux_distribution(),
platform.system(),
platform.machine(),
platform.platform(),
platform.uname(),
platform.version(),
platform.mac_ver(),
))

The output of this script, run on a few different systems (Linux, Windows, Solaris, MacOS) and architectures (x86, x64, Itanium, power pc, sparc), is available here: https://github.com/hpcugent/easybuild/wiki/OS_flavor_name_version

Ubuntu 12.04 server for example gives:

Python version: ['2.6.5 (r265:79063, Oct  1 2012, 22:04:36) ', '[GCC 4.4.3]']
dist: ('Ubuntu', '10.04', 'lucid')
linux_distribution: ('Ubuntu', '10.04', 'lucid')
system: Linux
machine: x86_64
platform: Linux-2.6.32-32-server-x86_64-with-Ubuntu-10.04-lucid
uname: ('Linux', 'xxx', '2.6.32-32-server', '#62-Ubuntu SMP Wed Apr 20 22:07:43 UTC 2011', 'x86_64', '')
version: #62-Ubuntu SMP Wed Apr 20 22:07:43 UTC 2011
mac_ver: ('', ('', '', ''), '')
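
A note for newer interpreters: platform.dist() and platform.linux_distribution() were removed in Python 3.8, so the script above fails there as written. On Python 3.10 and later, platform.freedesktop_os_release() reads the os-release file instead. The sketch below guards for both its absence and a missing file; the `linux_distribution_info` wrapper is a made-up name.

```python
import platform

def linux_distribution_info():
    """Return the os-release data as a dict, or None if it is unavailable."""
    reader = getattr(platform, "freedesktop_os_release", None)  # Python 3.10+
    if reader is None:
        return None
    try:
        return reader()
    except OSError:  # no os-release file, e.g. on Windows or macOS
        return None

info = linux_distribution_info()
print(info.get("PRETTY_NAME", "unknown") if info else "N/A")
```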

回答 6

短篇故事

使用platform.system()。它返回WindowsLinuxDarwin(对于OSX)。

很长的故事

使用Python获取OS的方法有3种,每种方法各有优缺点:

方法1

>>> import sys
>>> sys.platform
'win32'  # could be 'linux', 'linux2', 'darwin', 'freebsd8' etc

工作原理(来源):内部调用OS API以获取OS定义的OS名称。有关各种特定于操作系统的值,请参见此处

优点:无魔法,低等级。

缺点:取决于操作系统版本,因此最好不要直接使用。

方法二

>>> import os
>>> os.name
'nt'  # for Linux and Mac it prints 'posix'

工作原理(来源):内部会检查python是否具有称为posix或nt的特定于操作系统的模块。

优点:易于检查posix OS

缺点:Linux或OSX之间没有区别。

方法3

>>> import platform
>>> platform.system()
'Windows' # for Linux it prints 'Linux', for Mac it prints 'Darwin'

工作原理(来源):内部将最终调用内部OS API,获取特定于操作系统版本的名称,例如“ win32”或“ win16”或“ linux1”,然后将其标准化为更通用的名称,例如“ Windows”或“ Linux”或通过应用几种启发式方法来“达尔文”。

专业版:Windows,OSX和Linux的最佳便携式方式。

缺点:Python人员必须保持规范化启发式更新。

摘要

  • 如果要检查OS是Windows还是Linux或OSX,那么最可靠的方法是platform.system()
  • 如果你想OS专用电话,但通过内置的Python模块posixnt再使用os.name
  • 如果要获取OS本身提供的原始OS名称,请使用sys.platform

Short Story

Use platform.system(). It returns Windows, Linux or Darwin (for OSX).

Long Story

There are three ways to get the OS in Python, each with its own pros and cons:

Method 1

>>> import sys
>>> sys.platform
'win32'  # could be 'linux', 'linux2', 'darwin', 'freebsd8' etc

How this works (source): Internally it calls OS APIs to get name of the OS as defined by OS. See here for various OS-specific values.

Pro: No magic, low level.

Con: OS version dependent, so best not to use directly.

Method 2

>>> import os
>>> os.name
'nt'  # for Linux and Mac it prints 'posix'

How this works (source): Internally it checks if python has OS-specific modules called posix or nt.

Pro: Simple to check if posix OS

Con: no differentiation between Linux or OSX.

Method 3

>>> import platform
>>> platform.system()
'Windows' # for Linux it prints 'Linux', for Mac it prints 'Darwin'

How this works (source): Internally it will eventually call internal OS APIs, get OS version-specific name like ‘win32’ or ‘win16’ or ‘linux1’ and then normalize to more generic names like ‘Windows’ or ‘Linux’ or ‘Darwin’ by applying several heuristics.

Pro: Best portable way for Windows, OSX and Linux.

Con: Python folks must keep normalization heuristic up to date.

Summary

  • If you want to check if OS is Windows or Linux or OSX then the most reliable way is platform.system().
  • If you want to make OS-specific calls but via built-in Python modules posix or nt then use os.name.
  • If you want to get raw OS name as supplied by OS itself then use sys.platform.

回答 7

新答案如何:

import psutil
psutil.MACOS   #True (OSX is deprecated)
psutil.WINDOWS #False
psutil.LINUX   #False 

如果我正在使用MACOS,这将是输出

How about a new answer:

import psutil
psutil.MACOS   #True (OSX is deprecated)
psutil.WINDOWS #False
psutil.LINUX   #False 

This would be the output if I was using MACOS


回答 8

我开始更系统地列出了使用各种模块可以期望得到的值(可以随意编辑和添加系统):

Linux(64位)+ WSL

os.name                     posix
sys.platform                linux
platform.system()           Linux
sysconfig.get_platform()    linux-x86_64
platform.machine()          x86_64
platform.architecture()     ('64bit', '')
  • 尝试使用archlinux和mint,得到相同的结果
  • 在python2上带有sys.platform内核版本的后缀,例如linux2,其他所有内容保持不变
  • 在Linux的Windows子系统上具有相同的输出(与ubuntu 18.04 LTS一起尝试),除了 platform.architecture() = ('64bit', 'ELF')

WINDOWS(64位)

I started a more systematic listing of the values you can expect from the various modules (feel free to edit and add your system):

Linux (64bit) + WSL

os.name                     posix
sys.platform                linux
platform.system()           Linux
sysconfig.get_platform()    linux-x86_64
platform.machine()          x86_64
platform.architecture()     ('64bit', '')
  • tried with Arch Linux and Mint, got the same results
  • on Python 2, sys.platform is suffixed with the kernel version, e.g. linux2; everything else stays identical
  • same output on Windows Subsystem for Linux (tried with Ubuntu 18.04 LTS), except platform.architecture() = ('64bit', 'ELF')

WINDOWS (64bit)

(with 32bit column running in the 32bit subsystem)

official python installer   64bit                     32bit
-------------------------   -----                     -----
os.name                     nt                        nt
sys.platform                win32                     win32
platform.system()           Windows                   Windows
sysconfig.get_platform()    win-amd64                 win32
platform.machine()          AMD64                     AMD64
platform.architecture()     ('64bit', 'WindowsPE')    ('64bit', 'WindowsPE')

msys2                       64bit                     32bit
-----                       -----                     -----
os.name                     posix                     posix
sys.platform                msys                      msys
platform.system()           MSYS_NT-10.0              MSYS_NT-10.0-WOW
sysconfig.get_platform()    msys-2.11.2-x86_64        msys-2.11.2-i686
platform.machine()          x86_64                    i686
platform.architecture()     ('64bit', 'WindowsPE')    ('32bit', 'WindowsPE')

msys2                       mingw-w64-x86_64-python3  mingw-w64-i686-python3
-----                       ------------------------  ----------------------
os.name                     nt                        nt
sys.platform                win32                     win32
platform.system()           Windows                   Windows
sysconfig.get_platform()    mingw                     mingw
platform.machine()          AMD64                     AMD64
platform.architecture()     ('64bit', 'WindowsPE')    ('32bit', 'WindowsPE')

cygwin                      64bit                     32bit
------                      -----                     -----
os.name                     posix                     posix
sys.platform                cygwin                    cygwin
platform.system()           CYGWIN_NT-10.0            CYGWIN_NT-10.0-WOW
sysconfig.get_platform()    cygwin-3.0.1-x86_64       cygwin-3.0.1-i686
platform.machine()          x86_64                    i686
platform.architecture()     ('64bit', 'WindowsPE')    ('32bit', 'WindowsPE')

Some remarks:

  • there is also distutils.util.get_platform(), which is identical to sysconfig.get_platform()
  • Anaconda on Windows gives the same values as the official Python Windows installer
  • I have neither a Mac nor a true 32-bit system, and was not motivated to test one online

To compare with your system, simply run this script (and please append results here if missing :)

from __future__ import print_function
import os
import sys
import platform
import sysconfig

print("os.name                      ",  os.name)
print("sys.platform                 ",  sys.platform)
print("platform.system()            ",  platform.system())
print("sysconfig.get_platform()     ",  sysconfig.get_platform())
print("platform.machine()           ",  platform.machine())
print("platform.architecture()      ",  platform.architecture())

Answer 9

I am using the WLST tool that comes with weblogic, and it doesn’t implement the platform package.

wls:/offline> import os
wls:/offline> print os.name
java 
wls:/offline> import sys
wls:/offline> print sys.platform
'java1.5.0_11'

Short of patching the system javaos.py (there is an issue with os.system() on Windows 2003 with JDK 1.5), which I can't do because I have to use WebLogic out of the box, this is what I use:

import java.lang

def iswindows():
  # Use a distinct name so the os module is not shadowed.
  os_name = java.lang.System.getProperty("os.name")
  return "win" in os_name.lower()

Answer 10

/usr/bin/python3.2

def cls():
    from subprocess import call
    from platform import system

    os = system()
    if os == 'Linux':
        call('clear', shell = True)
    elif os == 'Windows':
        call('cls', shell = True)

Answer 11

For Jython, the only way I found to get the OS name is to check the os.name Java property (tried with the sys, os and platform modules on Jython 2.5.3 on WinXP):

import sys

def get_os_platform():
    """Return the platform name; on Jython, fall back to the os.name Java property."""
    ver = sys.platform.lower()
    if ver.startswith('java'):
        import java.lang
        ver = java.lang.System.getProperty("os.name").lower()
    print('platform: %s' % (ver))
    return ver

Answer 12

Interesting results on windows 8:

>>> import os
>>> os.name
'nt'
>>> import platform
>>> platform.system()
'Windows'
>>> platform.release()
'post2008Server'

Edit: That’s a bug


Answer 13

Watch out if you’re on Windows with Cygwin where os.name is posix.

>>> import os, platform
>>> print os.name
posix
>>> print platform.system()
CYGWIN_NT-6.3-WOW
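
Since os.name alone cannot tell Cygwin or MSYS2 apart from a real POSIX system, one option is to look at the prefix of sys.platform instead. The sketch below is only an illustration: classify_platform, is_windows_host and the family labels are hypothetical names, and the prefixes are the sys.platform values reported in the answers of this thread.

```python
import sys

# sys.platform prefixes as reported in this thread; the labels are made up.
_FAMILIES = (
    ("win32", "windows-native"),
    ("cygwin", "windows-cygwin"),
    ("msys", "windows-msys"),
    ("linux", "linux"),
    ("darwin", "macos"),
)

def classify_platform(sys_platform=None):
    """Map a sys.platform string to a coarse family label."""
    value = (sys.platform if sys_platform is None else sys_platform).lower()
    for prefix, family in _FAMILIES:
        if value.startswith(prefix):
            return family
    return "unknown"

def is_windows_host(sys_platform=None):
    """True for native Windows Python and for Cygwin/MSYS2 Python on Windows."""
    return classify_platform(sys_platform).startswith("windows-")

print(classify_platform())
```

Matching on the prefix also covers Python 2 values such as linux2.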

Answer 14

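In the same vein:

import platform
# A bare substring test for "win" would also match "Darwin" (macOS), so match the prefix.
is_windows = platform.system().lower().startswith(("win", "cygwin"))

# LV_dll is defined elsewhere; it is not shown in this answer.
if is_windows: lv_dll = LV_dll("my_so_dll.dll")
else:          lv_dll = LV_dll("./my_so_dll.so")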

Answer 15

If you are not looking for the kernel version etc., but for the Linux distribution, you may want to use the following:

In Python 2.6+:

>>> import platform
>>> print platform.linux_distribution()
('CentOS Linux', '6.0', 'Final')
>>> print platform.linux_distribution()[0]
CentOS Linux
>>> print platform.linux_distribution()[1]
6.0

In Python 2.4:

>>> import platform
>>> print platform.dist()
('centos', '6.0', 'Final')
>>> print platform.dist()[0]
centos
>>> print platform.dist()[1]
6.0

Obviously, this only works on Linux. For a more generic cross-platform script, you can combine this with the code samples given in other answers.
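
Both platform.dist() and platform.linux_distribution() were deprecated and are gone as of Python 3.8, so on current interpreters a common substitute is to read /etc/os-release (Python 3.10+ also ships platform.freedesktop_os_release()). The following is a minimal sketch under assumptions: parse_os_release is a hypothetical helper, and SAMPLE is made-up file content used only so the example runs anywhere.

```python
import shlex

def parse_os_release(text):
    """Parse KEY=value lines in the os-release format into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex strips the optional shell-style quotes around the value.
        parts = shlex.split(value)
        info[key] = parts[0] if parts else ""
    return info

# Made-up sample content; on a real system read /etc/os-release instead.
SAMPLE = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID=centos\n'

release = parse_os_release(SAMPLE)
print(release["NAME"], release["VERSION_ID"])
```

On a real Linux machine you would pass the contents of /etc/os-release to the function rather than SAMPLE.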


Answer 16

Try this (os.uname() is only available on Unix-like systems):

import os

os.uname()

You can then index the result:

info=os.uname()
info[0]
info[1]

Answer 17

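List the functions available in the platform module and print what each one returns on your system:

import platform

print(dir(platform))

for x in dir(platform):
    if x[0].isalnum():
        try:
            result = getattr(platform, x)()
            # Format instead of concatenating: many results are tuples, not strings.
            print("platform.{0}: {1}".format(x, result))
        except (TypeError, OSError):
            # Not callable without arguments, or not supported on this OS.
            continue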

Answer 18

You can also get all the information from the platform module alone, without importing the os module.

>>> import platform
>>> platform.os.name
'posix'
>>> platform.uname()
('Darwin', 'mainframe.local', '15.3.0', 'Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64', 'x86_64', 'i386')

A nice and tidy layout for reporting purpose can be achieved using this line:

for i in zip(['system','node','release','version','machine','processor'],platform.uname()):print i[0],':',i[1]

That gives this output:

system : Darwin
node : mainframe.local
release : 15.3.0
version : Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64
machine : x86_64
processor : i386

What is usually missing is the operating system version. If you do not know in advance whether you are running Windows, Linux or macOS, a platform-independent way to get it is this test:

In []: for i in [platform.linux_distribution(),platform.mac_ver(),platform.win32_ver()]:
   ....:     if i[0]:
   ....:         print 'Version: ',i[0]
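
The snippets above use Python 2 print statements. As a rough Python 3 equivalent of the tidy report, the sketch below pairs the same six field names with platform.uname(); uname_report and FIELDS are hypothetical names introduced here for illustration.

```python
import platform

# Same field names as in the one-liner above.
FIELDS = ("system", "node", "release", "version", "machine", "processor")

def uname_report():
    """Return one 'name : value' line per platform.uname() field."""
    return ["{} : {}".format(name, value)
            for name, value in zip(FIELDS, platform.uname())]

for line in uname_report():
    print(line)
```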

Answer 19

I know this is an old question, but I believe my answer may help people looking for a simple, easy-to-understand, Pythonic way to detect the OS in their code. Tested on Python 3.7.

from sys import platform


class UnsupportedPlatform(Exception):
    pass


if "linux" in platform:
    print("linux")
elif "darwin" in platform:
    print("mac")
elif "win" in platform:
    print("windows")
else:
    raise UnsupportedPlatform

Answer 20

If you are running macOS and call platform.system(), you get 'Darwin', because macOS is built on Apple's Darwin OS. Darwin is the core operating system underneath macOS and is essentially macOS without the GUI.


Answer 21

This solution works for both python and jython.

module os_identify.py:

import platform
import os

# This module contains functions to determine the basic type of
# OS we are running on.
# Contrary to the functions in the `os` and `platform` modules,
# these allow to identify the actual basic OS,
# no matter whether running on the `python` or `jython` interpreter.

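def is_linux():
    try:
        # Returns empty strings when not running on Linux.
        return bool(platform.linux_distribution()[0])
    except AttributeError:
        # linux_distribution() was removed in Python 3.8.
        return platform.system() == "Linux"

def is_windows():
    # win32_ver() does not raise on other systems; it returns empty strings.
    return bool(platform.win32_ver()[0])

def is_mac():
    # mac_ver() does not raise on other systems; it returns empty strings.
    return bool(platform.mac_ver()[0])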

def name():
    if is_linux():
        return "Linux"
    elif is_windows():
        return "Windows"
    elif is_mac():
        return "Mac"
    else:
        return "<unknown>" 

Use like this:

import os_identify

print("My OS: " + os_identify.name())

Answer 22

How about a simple Enum implementation like the following? No need for external libs!

import platform
from enum import Enum
class OS(Enum):
    def checkPlatform(osName):
        return osName.lower()== platform.system().lower()

    MAC = checkPlatform("darwin")
    LINUX = checkPlatform("linux")
    WINDOWS = checkPlatform("windows")  # I haven't tested this one

You can then simply check the value of an Enum member:

if OS.LINUX.value:
    print("Cool it is Linux")

P.S. This is Python 3.
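
One detail worth knowing about the Enum above: members that share a value become aliases of each other, so the two members that evaluate to False end up being the same object. A variation that avoids this, sketched here under assumptions, stores the platform.system() names as values and looks the current one up; the current() classmethod is a hypothetical helper, not part of the original answer.

```python
import platform
from enum import Enum

class OS(Enum):
    # Values are the strings platform.system() reports on each system.
    MAC = "Darwin"
    LINUX = "Linux"
    WINDOWS = "Windows"

    @classmethod
    def current(cls, system=None):
        """Return the member for the running OS, or None if it is not listed."""
        system = platform.system() if system is None else system
        try:
            return cls(system)
        except ValueError:
            return None

if OS.current() is OS.LINUX:
    print("Cool it is Linux")
```

The optional system argument exists only so the lookup can be tried with other names without changing machines.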


Answer 23

You can look at the code in pyOSinfo which is part of the pip-date package, to get the most relevant OS information, as seen from your Python distribution.

One of the most common reasons people want to check their OS is terminal compatibility and whether certain system commands are available. Unfortunately, the success of this check depends somewhat on your Python installation and OS. For example, uname is not available in most Windows Python packages. The pyOSinfo program shows you the output of the most commonly used built-in functions already provided by os, sys, platform and site.

So the best way to get only the essential code is looking at that as an example. (I guess I could have just pasted it here, but that would not have been politically correct.)


Answer 24

I am late to the game but, just in case anybody needs it, this is a function I use to adjust my code so it runs on Windows, Linux and macOS:

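import sys

def get_os(osoptions=None):
    '''
    Return a label for the current OS so callers can branch on it.
    '''
    if osoptions is None:
        osoptions = {'linux': 'linux', 'Windows': 'win', 'macos': 'darwin'}
    # Match on the prefix of sys.platform: a substring test for 'win'
    # would also match 'darwin' and report macOS as Windows.
    current = sys.platform.lower()
    for label, prefix in osoptions.items():
        if current.startswith(prefix.lower()):
            return label
    return 'unknown_OS'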
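
A substring test on sys.platform is easy to get wrong, because 'darwin' itself contains 'win'. The small sketch below shows the difference between a substring match and a prefix match; the sample strings and both helper names are assumptions chosen for illustration.

```python
# Sample sys.platform values mentioned in this thread.
SAMPLE_PLATFORMS = ("linux", "win32", "cygwin", "darwin")

def substring_matches(needle):
    """Platforms that contain needle anywhere in the string."""
    return [p for p in SAMPLE_PLATFORMS if needle in p]

def prefix_matches(needle):
    """Platforms that start with needle."""
    return [p for p in SAMPLE_PLATFORMS if p.startswith(needle)]

print(substring_matches("win"))  # ['win32', 'cygwin', 'darwin']
print(prefix_matches("win"))     # ['win32']
```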

Importing modules from parent folder

Question: Importing modules from parent folder

I am running Python 2.5.

This is my folder tree:

ptdraft/
  nib.py
  simulations/
    life/
      life.py

(I also have __init__.py in each folder, omitted here for readability)

How do I import the nib module from inside the life module? I am hoping it is possible to do without tinkering with sys.path.

Note: The main module being run is in the ptdraft folder.


Answer 0

It seems that the problem is not related to the module being in a parent directory or anything like that.

You need to add the directory that contains ptdraft to PYTHONPATH.

You said that import nib worked for you, which probably means that you added ptdraft itself (not its parent) to PYTHONPATH.


Answer 1

You could use relative imports (python >= 2.5):

from ... import nib

(What’s New in Python 2.5) PEP 328: Absolute and Relative Imports

EDIT: added another dot ‘.’ to go up two packages
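
A relative import like this only resolves when life.py is loaded as part of the ptdraft package, for example with python -m from the directory that contains ptdraft; running life.py directly as a script fails. The sketch below rebuilds the folder tree from the question in a temporary directory to show the working case. The helper names and the GREETING constant are hypothetical and exist only for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

def build_demo_tree(root):
    """Create ptdraft/nib.py and ptdraft/simulations/life/life.py under root."""
    pkg = os.path.join(root, "ptdraft")
    sims = os.path.join(pkg, "simulations")
    life = os.path.join(sims, "life")
    os.makedirs(life)
    # An empty __init__.py in every folder, as stated in the question.
    for folder in (pkg, sims, life):
        open(os.path.join(folder, "__init__.py"), "w").close()
    with open(os.path.join(pkg, "nib.py"), "w") as f:
        f.write("GREETING = 'hello from nib'\n")
    with open(os.path.join(life, "life.py"), "w") as f:
        f.write("from ... import nib\nprint(nib.GREETING)\n")

def run_life_as_module(root):
    """Run life.py with -m so Python knows which package it belongs to."""
    result = subprocess.run(
        [sys.executable, "-m", "ptdraft.simulations.life.life"],
        cwd=root, capture_output=True, text=True,
    )
    return result.stdout.strip()

with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    output = run_life_as_module(root)

print(output)
```

The three dots walk up from ptdraft.simulations.life to ptdraft, which is where nib lives.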


Answer 2

Relative imports (as in from .. import mymodule) only work in a package. To import ‘mymodule’ that is in the parent directory of your current module:

import os,sys,inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
sys.path.insert(0,parentdir) 

import mymodule

Edit: the __file__ attribute is not always defined. Instead of os.path.abspath(__file__), I now suggest using the inspect module to retrieve the filename (and path) of the current file.


Answer 3

I posted a similar answer also to the question regarding imports from sibling packages. You can see it here.

Solution without sys.path hacks

Summary

  • Wrap the code into one folder (e.g. packaged_stuff)
  • Create a setup.py script that calls setuptools.setup().
  • Pip install the package in editable state with pip install -e <myproject_folder>
  • Import using from packaged_stuff.modulename import function_name

Setup

I assume the same folder structure as in the question

.
└── ptdraft
    ├── __init__.py
    ├── nib.py
    └── simulations
        ├── __init__.py
        └── life
            ├── __init__.py
            └── life.py

I call the . the root folder, and in my case it is located in C:\tmp\test_imports.

Steps

1) Add a setup.py to the root folder

The contents of the setup.py can be simply

from setuptools import setup, find_packages

setup(name='myproject', version='1.0', packages=find_packages())

Basically “any” setup.py would work. This is just a minimal working example.

2) Use a virtual environment

If you are familiar with virtual environments, activate one and skip to the next step. Virtual environments are not strictly required, but they will really help you in the long run (when you have more than one project ongoing). The most basic steps are (run in the root folder):

  • Create virtual env
    • python -m venv venv
  • Activate virtual env
    • . venv/bin/activate (Linux) or ./venv/Scripts/activate (Win)

To learn more about this, search for “python virtualenv tutorial” or similar. You will probably never need any commands beyond creating, activating and deactivating.

Once you have created and activated a virtual environment, your console should show the name of the virtual environment in parentheses:

PS C:\tmp\test_imports> python -m venv venv
PS C:\tmp\test_imports> .\venv\Scripts\activate
(venv) PS C:\tmp\test_imports>

3) pip install your project in editable state

Install your top level package myproject using pip. The trick is to use the -e flag when doing the install. This way it is installed in an editable state, and all the edits made to the .py files will be automatically included in the installed package.

In the root directory, run

pip install -e . (note the dot, it stands for “current directory”)

You can also see that it is installed by using pip freeze

(venv) PS C:\tmp\test_imports> pip install -e .
Obtaining file:///C:/tmp/test_imports
Installing collected packages: myproject
  Running setup.py develop for myproject
Successfully installed myproject
(venv) PS C:\tmp\test_imports> pip freeze
myproject==1.0

4) Import by prepending mainfolder to every import

In this example, the mainfolder would be ptdraft. This has the advantage that you will not run into name collisions with other module names (from python standard library or 3rd party modules).


Example Usage

nib.py

def function_from_nib():
    print('I am the return value from function_from_nib!')

life.py

from ptdraft.nib import function_from_nib

if __name__ == '__main__':
    function_from_nib()

Running life.py

(venv) PS C:\tmp\test_imports> python .\ptdraft\simulations\life\life.py
I am the return value from function_from_nib!
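
If the import still fails after the editable install, it can help to ask the import system whether it can find the package at all. The sketch below uses importlib.util.find_spec for that; is_importable is a hypothetical helper name, and json is used only as a module that is certain to exist.

```python
import importlib.util

def is_importable(module_name):
    """Return True if the import system can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name is missing or malformed.
        return False

print(is_importable("json"))
```

In the setup described above, is_importable("ptdraft") would be expected to return True once pip install -e . has succeeded inside the active environment.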

Answer 4

You can use an OS-dependent path in the “module search path”, which is listed in sys.path, so you can easily add the parent directory as follows:

import sys
sys.path.insert(0,'..')

If you want to add the grandparent directory:

sys.path.insert(0,'../..')

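This works in both Python 2 and 3. Note that a relative path such as '..' is resolved against the current working directory, not against the location of the script.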


Answer 5

If adding your module folder to PYTHONPATH didn’t work, you can modify the sys.path list in your program, which is where the Python interpreter searches for modules to import. The Python documentation says:

When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:

  • the directory containing the input script (or the current directory).
  • PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
  • the installation-dependent default.

After initialization, Python programs can modify sys.path. The directory containing the script being run is placed at the beginning of the search path, ahead of the standard library path. This means that scripts in that directory will be loaded instead of modules of the same name in the library directory. This is an error unless the replacement is intended.

Knowing this, you can do the following in your program:

import sys
# Add the directory that contains the ptdraft folder to the sys.path list
sys.path.append('/path/to/parent_of_ptdraft/')

# Now you can import your module
from ptdraft import nib
# Or just
import ptdraft

Answer 6

Don’t know much about python 2.
In python 3, the parent folder can be added as follows:

import sys 
sys.path.append('..')

…and then one is able to import modules from it


Answer 7

Here is an answer that’s simple so you can see how it works, small and cross-platform.
It only uses built-in modules (os, sys and inspect) so should work
on any operating system (OS) because Python is designed for that.

Shorter code for answer – fewer lines and variables

from inspect import getsourcefile
import os.path as path, sys
current_dir = path.dirname(path.abspath(getsourcefile(lambda:0)))
sys.path.insert(0, current_dir[:current_dir.rfind(path.sep)])
import my_module  # Replace "my_module" here with the module name.
sys.path.pop(0)

For fewer lines than this, replace the second line with import os.path as path, sys, inspect,
add inspect. in front of getsourcefile (line 3) and remove the first line.
Note that both forms load the whole inspect module, so this does not change time or memory use.

The code for my answer (longer version)

from inspect import getsourcefile
import os.path
import sys

current_path = os.path.abspath(getsourcefile(lambda:0))
current_dir = os.path.dirname(current_path)
parent_dir = current_dir[:current_dir.rfind(os.path.sep)]

sys.path.insert(0, parent_dir)

import my_module  # Replace "my_module" here with the module name.

It uses an example from a Stack Overflow answer How do I get the path of the current
executed file in Python?
to find the source (filename) of running code with a built-in tool.

from inspect import getsourcefile  
from os.path import abspath  

Next, wherever you want to find the source file from you just use:

abspath(getsourcefile(lambda:0))

My code adds a file path to sys.path, the python path list
because this allows Python to import modules from that folder.

After importing a module in the code, it’s a good idea to run sys.path.pop(0) on a new line
when that added folder has a module with the same name as another module that is imported
later in the program. You need to remove the list item added before the import, not other paths.
If your program doesn’t import other modules, it’s safe to not delete the file path because
after a program ends (or restarting the Python shell), any edits made to sys.path disappear.
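
The insert-then-pop pattern described above can be wrapped in a context manager so the cleanup also happens when the import raises. This is a minimal sketch, not part of the original answer: temporary_sys_path is a hypothetical helper and the directory is a made-up path.

```python
import sys
from contextlib import contextmanager

@contextmanager
def temporary_sys_path(directory):
    """Put directory at the front of sys.path for the duration of the block."""
    sys.path.insert(0, directory)
    try:
        yield
    finally:
        # Remove the entry we added, even if other code changed the list.
        try:
            sys.path.remove(directory)
        except ValueError:
            pass

# Made-up path used only to show the effect on sys.path.
with temporary_sys_path("/hypothetical/parent/dir"):
    inside = "/hypothetical/parent/dir" in sys.path
after = "/hypothetical/parent/dir" in sys.path
print(inside, after)  # True False
```

In real use the import of the module from the parent folder would go inside the with block.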

Notes about a filename variable

My answer doesn’t use the __file__ variable to get the file path/filename of running
code because users here have often described it as unreliable. You shouldn’t use it
for importing modules from parent folder in programs used by other people.

Some examples where it doesn’t work (quote from this Stack Overflow question):

  • it can’t be found on some platforms
  • it sometimes isn’t the full file path

  • py2exe doesn’t have a __file__ attribute, but there is a workaround
  • When you run from IDLE with execute() there is no __file__ attribute
  • OS X 10.6 where I get NameError: global name '__file__' is not defined

Answer 8

Here is a more generic solution that adds the parent directory to sys.path (works for me):

import os.path, sys
sys.path.append(os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir))

Answer 9

I found that the following works for importing a package from the script’s parent directory. In the example, I would like to import functions from the app.db package in env.py.

.
└── my_application
    ├── alembic
    │   └── env.py
    └── app
        ├── __init__.py
        └── db

import os
import sys
currentdir = os.path.dirname(os.path.realpath(__file__))
parentdir = os.path.dirname(currentdir)
sys.path.append(parentdir)

Answer 10

The solutions mentioned above are also fine. Another option is to use relative imports. Note that relative imports only work inside a package, in a module that is imported rather than run directly as a script.

If you want to import from a module two levels up (for example, from the top-level package):

from ...module_name import *

If you want to import from a module in the parent package (one level up):

from ..module_name import *

If you want to import from a submodule that lives two levels up:

from ...module_name.another_module import *

This way you can also import one particular name instead of *, if you want to.
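
To make the dot notation concrete, here is a small self-contained sketch. The package layout (demo_pkg, settings, sub, worker) is entirely hypothetical and is created in a temporary directory; the point is only that worker.py, which sits one level below settings.py, can reach it with two dots.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout, created on the fly:
#   demo_pkg/__init__.py
#   demo_pkg/settings.py      -> NAME = 'settings in demo_pkg'
#   demo_pkg/sub/__init__.py
#   demo_pkg/sub/worker.py    -> from ..settings import NAME
files = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/settings.py": "NAME = 'settings in demo_pkg'\n",
    "demo_pkg/sub/__init__.py": "",
    "demo_pkg/sub/worker.py": "from ..settings import NAME\n",
}

with tempfile.TemporaryDirectory() as root:
    for relative_path, source in files.items():
        full_path = os.path.join(root, *relative_path.split("/"))
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w") as f:
            f.write(source)

    sys.path.insert(0, root)        # make demo_pkg importable
    try:
        # Imported as part of the package, so ".." resolves to demo_pkg.
        worker = importlib.import_module("demo_pkg.sub.worker")
    finally:
        sys.path.remove(root)

print(worker.NAME)
```

Running worker.py directly as a script would not work in the same way, because Python then has no parent package to resolve the dots against.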


Answer 11

For me, the shortest one-liner for accessing the parent directory, and my favorite, is:

sys.path.append(os.path.dirname(os.getcwd()))

or:

sys.path.insert(1, os.path.dirname(os.getcwd()))

os.getcwd() returns the current working directory, and os.path.dirname(path) returns the directory that contains the given path, i.e. its parent.

Actually, in my opinion a Python project should be structured so that no module in a child directory uses a module from the parent directory. If something like this happens, it is worth rethinking the project tree.

Another way is to add the parent directory to the PYTHONPATH environment variable.
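
A short sketch of what these calls return. The example path and the helper name parent_of_file_dir are hypothetical; the helper is included only to contrast "parent of the working directory" with "parent of the folder that holds a given file", which are the same thing only when the script is started from its own folder.

```python
import os

# os.path.dirname works purely on the string; it does not touch the file system.
example = os.path.join("projects", "life", "tests")   # hypothetical path
print(os.path.dirname(example))

# Parent of the current working directory: depends on where Python was started.
cwd_parent = os.path.dirname(os.getcwd())
print(cwd_parent)


def parent_of_file_dir(file_path):
    """Parent of the directory that contains file_path, independent of the cwd."""
    return os.path.dirname(os.path.dirname(os.path.abspath(file_path)))
```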


Answer 12

In a Jupyter Notebook

As long as you’re working in a Jupyter Notebook, this short solution might be useful:

%cd ..
import nib

It works even without an __init__.py file.

I tested it with Anaconda3 on Linux and Windows 7.


Answer 13

import sys
sys.path.append('../')


Answer 14

When you are not in a package environment with __init__.py files, the pathlib library (included with Python >= 3.4) makes it very concise and intuitive to append the path of the parent directory to sys.path:

import sys
from pathlib import Path
sys.path.append(str(Path('.').absolute().parent))
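
A slightly more defensive variant, as a sketch only. The helper name add_parent_to_sys_path is made up; it uses Path.resolve() where the snippet above uses absolute(), and it skips the append when the entry is already present. Note that Path('.') refers to the current working directory, not to the folder of the running script.

```python
import sys
from pathlib import Path


def add_parent_to_sys_path(start):
    """Append the parent of directory `start` to sys.path (once) and return it."""
    parent = str(Path(start).resolve().parent)
    if parent not in sys.path:
        sys.path.append(parent)
    return parent


# '.' is the current working directory, as in the answer above.
added = add_parent_to_sys_path(".")
print(added)
```

To base the lookup on the script’s location instead, one could pass Path(__file__).parent as the starting directory, assuming __file__ is available where the code runs.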

Answer 15

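Same sort of style as the previous answer, but in fewer lines :P

import os, sys
# dirname() is applied twice: once for the script's own folder, once for its parent
parentdir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, parentdir)

__file__ is the path of the file the code is running from.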


Answer 16

Work with libraries. Make a library called nib, install it using setup.py, let it reside in site-packages, and your problems are solved. You don’t have to stuff everything you make into a single package. Break it up into pieces.


Answer 17

In a Linux system, you can create a soft link from the “life” folder to the nib.py file. Then, you can simply import it like:

import nib

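How to read/process command line arguments?

Question: How to read/process command line arguments?

I am originally a C programmer. I have seen numerous tricks and “hacks” to read many different arguments.

What are some of the ways Python programmers can do this?
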

Answer 0

The canonical solution in the standard library is argparse (docs):

Here is an example:

from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument("-f", "--file", dest="filename",
                    help="write report to FILE", metavar="FILE")
parser.add_argument("-q", "--quiet",
                    action="store_false", dest="verbose", default=True,
                    help="don't print status messages to stdout")

args = parser.parse_args()

argparse supports (among other things):

  • Multiple options in any order.
  • Short and long options.
  • Default values.
  • Generation of a usage help message.
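
The listed features can be tried out without a command line by passing an explicit list to parse_args(). This is a small sketch that reuses the two options from the example above; the build_parser wrapper and the file name out.txt are assumptions added for illustration.

```python
from argparse import ArgumentParser


def build_parser():
    parser = ArgumentParser(description="Write a report (illustrative only).")
    parser.add_argument("-f", "--file", dest="filename", metavar="FILE",
                        help="write report to FILE")
    parser.add_argument("-q", "--quiet", action="store_false",
                        dest="verbose", default=True,
                        help="don't print status messages to stdout")
    return parser


parser = build_parser()

# No arguments at all: the defaults apply.
defaults = parser.parse_args([])
print(defaults.filename, defaults.verbose)

# Short and long spellings, given in either order, parse to the same result.
a = parser.parse_args(["-q", "--file", "out.txt"])
b = parser.parse_args(["-f", "out.txt", "--quiet"])
print(a == b)
```

Calling parser.parse_args() with no list, as in the original example, reads the arguments from sys.argv instead.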

Answer 1

import sys

print("\n".join(sys.argv))

sys.argv is a list that contains all the arguments passed to the script on the command line.

Basically,

import sys
print(sys.argv[1:])

Answer 2

Just going around evangelizing for argparse, which is better for these reasons. Essentially:

(copied from the link)

  • argparse module can handle positional and optional arguments, while optparse can handle only optional arguments

  • argparse isn’t dogmatic about what your command line interface should look like – options like -file or /file are supported, as are required options. Optparse refuses to support these features, preferring purity over practicality

  • argparse produces more informative usage messages, including command-line usage determined from your arguments, and help messages for both positional and optional arguments. The optparse module requires you to write your own usage string, and has no way to display help for positional arguments.

  • argparse supports actions that consume a variable number of command-line args, while optparse requires that the exact number of arguments (e.g. 1, 2, or 3) be known in advance

  • argparse supports parsers that dispatch to sub-commands, while optparse requires setting allow_interspersed_args and doing the parser dispatch manually

And my personal favorite:

  • argparse allows the type and action parameters to add_argument() to be specified with simple callables, while optparse requires hacking class attributes like STORE_ACTIONS or CHECK_METHODS to get proper argument checking
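
Several of these points (positional arguments, a variable number of arguments, sub-commands, and a plain callable as type) fit into one short sketch. The sub-command names add and greet, the program name demo, and the positive_int validator are invented for this illustration.

```python
import argparse


def positive_int(text):
    """A plain callable used as `type`: convert and validate in one step."""
    value = int(text)
    if value <= 0:
        raise argparse.ArgumentTypeError("must be a positive integer")
    return value


parser = argparse.ArgumentParser(prog="demo")
subparsers = parser.add_subparsers(dest="command")

# Sub-command taking a variable number of positional arguments.
add_cmd = subparsers.add_parser("add", help="add numbers")
add_cmd.add_argument("numbers", nargs="+", type=positive_int)

# Sub-command mixing a positional argument with an option.
greet_cmd = subparsers.add_parser("greet", help="greet someone")
greet_cmd.add_argument("name")
greet_cmd.add_argument("--times", type=positive_int, default=1)

args = parser.parse_args(["add", "1", "2", "3"])
print(args.command, sum(args.numbers))

args2 = parser.parse_args(["greet", "Ada", "--times", "2"])
print(args2.command, args2.name, args2.times)
```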

Answer 3

There is also the argparse stdlib module (an “improvement” on stdlib’s optparse module). Example from the introduction to argparse:

# script.py
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'integers', metavar='int', type=int, choices=range(10),
         nargs='+', help='an integer in the range 0..9')
    parser.add_argument(
        '--sum', dest='accumulate', action='store_const', const=sum,
        default=max, help='sum the integers (default: find the max)')

    args = parser.parse_args()
    print(args.accumulate(args.integers))

Usage:

$ script.py 1 2 3 4
4

$ script.py --sum 1 2 3 4
10

Answer 4

One way to do it is using sys.argv. This will print the script name as the first argument and all the other parameters that you pass to it.

import sys

for arg in sys.argv:
    print(arg)

Answer 5

The docopt library is really slick. It builds an argument dict from the usage string for your app.

For example, from the docopt README:

"""Naval Fate.

Usage:
  naval_fate.py ship new <name>...
  naval_fate.py ship <name> move <x> <y> [--speed=<kn>]
  naval_fate.py ship shoot <x> <y>
  naval_fate.py mine (set|remove) <x> <y> [--moored | --drifting]
  naval_fate.py (-h | --help)
  naval_fate.py --version

Options:
  -h --help     Show this screen.
  --version     Show version.
  --speed=<kn>  Speed in knots [default: 10].
  --moored      Moored (anchored) mine.
  --drifting    Drifting mine.

"""
from docopt import docopt


if __name__ == '__main__':
    arguments = docopt(__doc__, version='Naval Fate 2.0')
    print(arguments)

Answer 6

If you need something fast and not very flexible

main.py:

import sys

first_name = sys.argv[1]
last_name = sys.argv[2]
print("Hello " + first_name + " " + last_name)

Then run python main.py James Smith

to produce the following output:

Hello James Smith
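
The quick version raises an IndexError when an argument is missing. A slightly safer sketch, still without any parsing library, is shown below; the greeting function and its usage message are assumptions added for illustration, and the call uses a hard-coded list so the block runs anywhere.

```python
def greeting(argv):
    """Build the greeting from an argv-style list (argv[0] is the script name)."""
    if len(argv) != 3:
        raise SystemExit("usage: python main.py FIRST_NAME LAST_NAME")
    first_name, last_name = argv[1], argv[2]
    return "Hello " + first_name + " " + last_name


# In a real script this would be greeting(sys.argv), after "import sys".
print(greeting(["main.py", "James", "Smith"]))
```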


Answer 7

import sys

# Default to -h (show help) when no arguments are given:
if len(sys.argv) == 1:
    sys.argv[1:] = ["-h"]
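
The same idea can be sketched without modifying sys.argv, by deciding which list to hand to argparse. The helper name parse_or_help, the program name demo, and the --name option are hypothetical.

```python
import argparse


def parse_or_help(parser, argv):
    """Parse argv (without the script name); show help when it is empty."""
    return parser.parse_args(argv if argv else ["-h"])


parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--name", default="world")

args = parse_or_help(parser, ["--name", "Ada"])
print(args.name)

# With an empty list, argparse prints the help text and raises SystemExit.
try:
    parse_or_help(parser, [])
except SystemExit as exit_info:
    exit_code = exit_info.code
```

In a script, the call would typically be parse_or_help(parser, sys.argv[1:]).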

Answer 8

I use optparse myself, but really like the direction Simon Willison is taking with his recently introduced optfunc library. It works by:

“introspecting a function definition (including its arguments and their default values) and using that to construct a command line argument parser.”

So, for example, this function definition:

def geocode(s, api_key='', geocoder='google', list_geocoders=False):

is turned into this optparse help text:

    Options:
      -h, --help            show this help message and exit
      -l, --list-geocoders
      -a API_KEY, --api-key=API_KEY
      -g GEOCODER, --geocoder=GEOCODER
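
To illustrate the introspection idea, here is a rough sketch built on inspect.signature and argparse. It is not optfunc’s actual implementation and does not generate the short flags shown above; the parser_from_function helper and the body of geocode are assumptions made for this example.

```python
import argparse
import inspect


def parser_from_function(func):
    """Build an ArgumentParser from func's signature (simplified sketch)."""
    parser = argparse.ArgumentParser(prog=func.__name__)
    for name, param in inspect.signature(func).parameters.items():
        flag = "--" + name.replace("_", "-")
        if param.default is inspect.Parameter.empty:
            # No default: treat it as a required positional argument.
            parser.add_argument(name)
        elif isinstance(param.default, bool):
            # Boolean default: a flag that flips the default value.
            action = "store_false" if param.default else "store_true"
            parser.add_argument(flag, dest=name, action=action)
        else:
            parser.add_argument(flag, dest=name, default=param.default)
    return parser


def geocode(s, api_key='', geocoder='google', list_geocoders=False):
    # Placeholder body, only here so the result can be inspected.
    return (s, api_key, geocoder, list_geocoders)


parser = parser_from_function(geocode)
args = parser.parse_args(["London", "--geocoder", "osm", "--list-geocoders"])
result = geocode(**vars(args))
print(result)
```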

Answer 9

I like getopt from stdlib, e.g.:

import getopt
import sys

# usage() is a helper defined elsewhere (not shown here);
# presumably it prints a usage message and exits.
try:
    opts, args = getopt.getopt(sys.argv[1:], 'h', ['help'])
except getopt.GetoptError as err:
    usage(err)

for opt, arg in opts:
    if opt in ('-h', '--help'):
        usage()

if len(args) != 1:
    usage("specify thing...")

Lately I have been wrapping something similar to this to make things less verbose (e.g. making “-h” implicit).
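
A self-contained variant of the same getopt flow, as a sketch. The parse function, its return value, and the error messages are assumptions; the usage() helper is replaced by SystemExit so that the block runs on its own.

```python
import getopt


def parse(argv):
    """Return (show_help, thing) for an argv-style list without the script name."""
    try:
        opts, args = getopt.getopt(argv, "h", ["help"])
    except getopt.GetoptError as err:
        raise SystemExit("error: %s" % err)

    show_help = any(opt in ("-h", "--help") for opt, _ in opts)
    if not show_help and len(args) != 1:
        raise SystemExit("specify exactly one thing")
    return show_help, (args[0] if args else None)


print(parse(["--help"]))
print(parse(["thing.txt"]))
```

In a script, the call would be parse(sys.argv[1:]).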


Answer 10

Pocoo’s click is more intuitive, requires less boilerplate, and is at least as powerful as argparse.

The only weakness I’ve encountered so far is that you can’t do much customization to help pages, but that usually isn’t a requirement and docopt seems like the clear choice when it is.


Answer 11

As you can see in the optparse documentation: “The optparse module is deprecated and will not be developed further; development will continue with the argparse module.”


Answer 12

import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                   help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                   const=sum, default=max,
                   help='sum the integers (default: find the max)')

args = parser.parse_args()
print(args.accumulate(args.integers))

Assuming the Python code above is saved into a file called prog.py
$ python prog.py -h

Ref-link: https://docs.python.org/3.3/library/argparse.html

Answer 13

You may be interested in a little Python module I wrote to make handling of command line arguments even easier (open source and free to use) – Commando


Answer 14

I recommend looking at docopt as a simple alternative to these others.

docopt is a new project that works by parsing your --help usage message rather than requiring you to implement everything yourself. You just have to put your usage message in the POSIX format.


Answer 15

Yet another option is argh. It builds on argparse, and lets you write things like:

import argh

# declaring:

def echo(text):
    "Returns given word as is."
    return text

def greet(name, greeting='Hello'):
    "Greets the user with given name. The greeting is customizable."
    return greeting + ', ' + name

# assembling:

parser = argh.ArghParser()
parser.add_commands([echo, greet])

# dispatching:

if __name__ == '__main__':
    parser.dispatch()

It will automatically generate help and so on, and you can use decorators to provide extra guidance on how the arg-parsing should work.


Answer 16

My solution is entrypoint2. Example:

from entrypoint2 import entrypoint
@entrypoint
def add(file, quiet=True): 
    ''' This function writes report.

    :param file: write report to FILE
    :param quiet: don't print status messages to stdout
    '''
    print(file, quiet)

help text:

usage: report.py [-h] [-q] [--debug] file

This function writes report.

positional arguments:
  file         write report to FILE

optional arguments:
  -h, --help   show this help message and exit
  -q, --quiet  don't print status messages to stdout
  --debug      set logging level to DEBUG