


I have a list of strings like this:

['Aden', 'abel']

I want to sort the items, case-insensitive. So I want to get:

['abel', 'Aden']

But I get the opposite with sorted() or list.sort(), because uppercase appears before lowercase.

How can I ignore the case? I’ve seen solutions that involve lowercasing all list items, but I don’t want to change the case of the list items.

Answer 0

In Python 3.3+ there is the str.casefold method that’s specifically designed for caseless matching:

sorted_list = sorted(unsorted_list, key=str.casefold)

In Python 2 use lower():

sorted_list = sorted(unsorted_list, key=lambda s: s.lower())

It works for both normal and unicode strings, since they both have a lower method.
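As a rough illustration of how the two key functions mentioned above differ (the sample words are made up for this sketch and are not from the question), str.casefold treats "ß" and "ss" as equal while str.lower does not. In both cases only the comparison key is lowered; the strings in the list keep their original case.

```python
# Hypothetical sample data, for illustration only.
names = ["Aden", "abel", "Straße", "STRASSE"]

# casefold maps both "Straße" and "STRASSE" to "strasse", so the two compare
# equal and keep their original relative order (sorted() is stable).
by_casefold = sorted(names, key=str.casefold)

# lower leaves "ß" unchanged, so "strasse" sorts before "straße".
by_lower = sorted(names, key=str.lower)

print(by_casefold)
print(by_lower)
print(names)  # the input list and its strings are unchanged
```

For plain ASCII data such as the list in the question, both key functions should give the same order.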

In Python 2 it works for a mix of normal and unicode strings, since values of the two types can be compared with each other. Python 3 doesn’t work like that, though: you can’t compare a byte string and a unicode string, so in Python 3 you should do the sane thing and only sort lists of one type of string.

>>> lst = ['Aden', u'abe1']
>>> sorted(lst)
['Aden', u'abe1']
>>> sorted(lst, key=lambda s: s.lower())
[u'abe1', 'Aden']

Answer 1

>>> x = ['Aden', 'abel']
>>> sorted(x, key=str.lower) # Or unicode.lower if all items are unicode
['abel', 'Aden']

In Python 3 str is unicode but in Python 2 you can use this more general approach which works for both str and unicode:

>>> sorted(x, key=lambda s: s.lower())
['abel', 'Aden']

Answer 2


You can also try this to sort the list in-place:

>>> x = ['Aden', 'abel']
>>> x.sort(key=lambda y: y.lower())
>>> x
['abel', 'Aden']
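A small sketch of the difference between the two calls used in these answers, assuming the same two-item list as the question (the variable names before, y and result are only for this illustration): sorted() builds a new list, while list.sort() reorders the existing list and returns None.

```python
x = ["Aden", "abel"]
before = list(x)  # copy, to show that sorted() does not touch x

# sorted() returns a new list and leaves x alone.
y = sorted(x, key=str.lower)
untouched = (x == before)

# list.sort() reorders x itself and returns None.
result = x.sort(key=str.lower)

print(y, untouched, x, result)
```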

Answer 3

This works in Python 3 and does not involve lowercasing the result (!).


Answer 4


In Python 3 you can use:

list1.sort(key=lambda x: x.lower())  # case-insensitive
list1.sort()  # case-sensitive

Answer 5

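I did it this way for Python 3.3:

def sortCaseIns(lst):
    # Pair each string with its lowercased form: [lowercased, original]
    lst2 = [[x for x in range(0, 2)] for y in range(0, len(lst))]
    for i in range(0, len(lst)):
        lst2[i][0] = lst[i].lower()
        lst2[i][1] = lst[i]
    # Sort the pairs by the lowercased form
    lst2.sort()
    # Write the originals back into lst in sorted order
    for i in range(0, len(lst)):
        lst[i] = lst2[i][1]

Then you can just call this function:

sortCaseIns(lst)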
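The function in this answer follows what is usually called a decorate-sort-undecorate pattern. A more compact sketch of the same idea, using tuples, might look like this (sort_case_insensitive and words are hypothetical names chosen for the example, not part of the answer):

```python
def sort_case_insensitive(items):
    """Sort a list of strings in place, ignoring case."""
    # Decorate: pair each string with its lowercased form.
    decorated = [(s.lower(), s) for s in items]
    # Sort: tuples compare by the lowercased form first, then by the original.
    decorated.sort()
    # Undecorate: write the originals back in their new order.
    items[:] = [original for _, original in decorated]


words = ["Aden", "abel", "Cara"]
sort_case_insensitive(words)
print(words)
```

Passing key=str.lower to sort() does the decoration internally, which is why the shorter answers in this thread do not need a helper like this.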


Answer 6

Case-insensitive sort, sorting the list in place, in Python 2 OR 3 (tested in Python 2.7.17 and Python 3.6.9):

>>> x = ["aa", "A", "bb", "B", "cc", "C"]
>>> x.sort()
>>> x
['A', 'B', 'C', 'aa', 'bb', 'cc']
>>> x.sort(key=str.lower)           # <===== there it is!
>>> x
['A', 'aa', 'B', 'bb', 'C', 'cc']

The key is key=str.lower. Here are just the commands, for easy copy-pasting so you can test them:

x = ["aa", "A", "bb", "B", "cc", "C"]
x.sort()
x
x.sort(key=str.lower)
x

Note, however, that if your strings are unicode strings (like u'some string'), then in Python 2 only (NOT in Python 3) the above x.sort(key=str.lower) command will fail with the following error:

TypeError: descriptor 'lower' requires a 'str' object but received a 'unicode'

If you get this error, either upgrade to Python 3, where str handles unicode, or convert your unicode strings to ASCII strings first, using a list comprehension, like this:

# for Python2, ensure all elements are ASCII (NOT unicode) strings first
x = [str(element) for element in x]
# for Python2, this sort will only work on ASCII (NOT unicode) strings
x.sort(key=str.lower)



Answer 7


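Try this:

def cSort(inlist, minisort=True):
    sortlist = []   # lowercased keys (or raw entries that have no .lower())
    newlist = []
    sortdict = {}   # lowercased key -> original entries sharing that key
    for entry in inlist:
        try:
            lentry = entry.lower()
        except AttributeError:
            # Not a string: keep the entry itself as its own sort key
            sortlist.append(entry)
        else:
            try:
                sortdict[lentry].append(entry)
            except KeyError:
                sortdict[lentry] = [entry]
                sortlist.append(lentry)

    sortlist.sort()

    for entry in sortlist:
        try:
            thislist = sortdict[entry]
            if minisort: thislist.sort()
            newlist = newlist + thislist
        except KeyError:
            newlist.append(entry)
    return newlist

lst = ['Aden', 'abel']
print(cSort(lst))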


['abel', 'Aden']





I want to print the result of grouping with Pandas.

I have a dataframe:

import pandas as pd
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})

       A  B
0    one  0
1    one  1
2    two  2
3  three  3
4  three  4
5    one  5

When printing after grouping by ‘A’ I have the following:


<pandas.core.groupby.DataFrameGroupBy object at 0x05416E90>

How can I print the dataframe grouped?

If I do:

print(df.groupby('A').head())

I obtain the dataframe as if it was not grouped:

             A  B
one   0    one  0
      1    one  1
two   2    two  2
three 3  three  3
      4  three  4
one   5    one  5

I was expecting something like:

             A  B
one   0    one  0
      1    one  1
      5    one  5
two   2    two  2
three 3  three  3
      4  three  4

Answer 0


Simply do:

grouped_df = df.groupby('A')

for key, item in grouped_df:
    print(grouped_df.get_group(key), "\n\n")
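As a self-contained sketch of the loop above, using the DataFrame from the question: iterating a GroupBy object yields (key, sub-DataFrame) pairs, so the group can be printed directly. Passing sort=False is an addition made here, assuming you want groups in order of first appearance (one, two, three) as in the question's expected output; group_sizes is a hypothetical helper name used only to summarize the result.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["one", "one", "two", "three", "three", "one"],
    "B": range(6),
})

# sort=False keeps the groups in order of first appearance
# instead of sorting the keys alphabetically.
for key, group in df.groupby("A", sort=False):
    print(key)
    print(group, "\n")

# Collect the number of rows per group, in iteration order.
group_sizes = {key: len(group) for key, group in df.groupby("A", sort=False)}
print(group_sizes)
```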

This also works:

grouped_df = df.groupby('A')
gb = grouped_df.groups

for key, values in gb.items():
    # .ix was removed in pandas 1.0; .loc does the same label lookup
    print(df.loc[values], "\n\n")

For selective key grouping, put the keys you want into key_list_from_gb (gb.keys() lists the available keys). For example:

gb = grouped_df.groups

key_list_from_gb = [key1, key2, key3]

for key, values in gb.items():
    if key in key_list_from_gb:
        print(df.loc[values], "\n")

Answer 1


If you’re simply looking for a way to display it, you could use describe():

grp = df.groupby('colName')
grp.describe()

This gives you a neat table.

Answer 2




I confirmed that the behavior of head() changes between version 0.12 and 0.13. That looks like a bug to me. I created an issue.

But a groupby operation doesn’t actually return a DataFrame sorted by group. The .head() method is a little misleading here: it’s just a convenience feature to let you re-examine the object (in this case, df) that you grouped. The result of groupby is a separate kind of object, a GroupBy object. You must apply, transform, or filter to get back to a DataFrame or Series.

If all you wanted to do was sort by the values in column A, you should use df.sort_values('A') (df.sort('A') in the old pandas versions this answer was written for).

Answer 3


Another simple alternative:

for name_of_the_group, group in grouped_dataframe:
   print (name_of_the_group)
   print (group)

Answer 4


Another simple alternative could be:

gb = df.groupby("A")
gb.count() # or,

Answer 5



In addition to previous answers:

Taking your example,

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})

Then simple 1 line code


Answer 6


Thanks to Surya for good insights. I’d clean up his solution and simply do:

for key, value in df.groupby('A'):
    print(key, value)

Answer 7


You cannot see the groupby data directly with a print statement, but you can see it by iterating over the groups with a for loop. Try this code:

group = df.groupby('A')  # group variable contains groupby data
for A, A_df in group:  # A is the group key and A_df is the rows of one group at a time
    print(A)
    print(A_df)

Running this prints each group of the groupby result.

I hope it helps

Answer 8




Call list() on the GroupBy object:

print(list(df.groupby('A')))

gives you:

[('one',      A  B
0  one  0
1  one  1
5  one  5), ('three',        A  B
3  three  3
4  three  4), ('two',      A  B
2  two  2)]

Answer 9

In Jupyter Notebook, if you do the following, it prints a nice grouped version of the object. The apply method helps in creation of a multiindex dataframe.

by = 'A'  # groupby 'by' argument
df.groupby(by).apply(lambda a: a[:])


             A  B
one   0    one  0
      1    one  1
      5    one  5
three 3  three  3
      4  three  4
two   2    two  2

If you want the by column(s) to not appear in the output, just drop the column(s), like so.

df.groupby(by).apply(lambda a: a.drop(by, axis=1)[:])


one   0  0
      1  1
      5  5
three 3  3
      4  4
two   2  2

Here, I am not sure as to why .iloc[:] does not work instead of [:] at the end. So, if there are some issues in future due to updates (or at present), .iloc[:len(a)] also works.

Answer 10


I found a tricky way, just for brainstorming. See the code:

df['a'] = df['A']  # create a shadow column for MultiIndexing
df.sort_values('A', inplace=True)
df.set_index(["A","a"], inplace=True)

the output:

A     a
one   one    0
      one    1
      one    5
three three  3
      three  4
two   two    2

The pro is that it is easy to print, as it returns a DataFrame instead of a GroupBy object, and the output looks nice. The con is that it creates a series of redundant data.

Answer 11

In Python 3:

k = None
for name_of_the_group, group in dict(df_group).items():
    if(k != name_of_the_group):
        print ('\n', name_of_the_group)
    print (group)
    k = name_of_the_group

In more interactive way

Answer 12


To print all (or arbitrarily many) lines of the grouped df:

import pandas as pd
pd.set_option('display.max_rows', 500)

grouped_df = df.groupby(['var1', 'var2'])





I looked at other questions and can’t figure it out…

I did the following to install django-debug-toolbar:

  1. pip install django-debug-toolbar
  2. added to middleware classes:
    # Uncomment the next line for simple clickjacking protection:
    # 'django.middleware.clickjacking.XFrameOptionsMiddleware',



  4. Added debug_toolbar to installed apps

I am not getting any errors or anything, and the toolbar doesn’t show up on any page, not even admin.

I even added the directory of the debug_toolbar templates to my TEMPLATE_DIRS

Answer 0

Stupid question, but you didn’t mention it, so… What is DEBUG set to? It won’t load unless it’s True.

If it’s still not working, try adding ‘’ to INTERNAL_IPS as well.


This is a last-ditch-effort move, you shouldn’t have to do this, but it will clearly show if there’s merely some configuration issue or whether there’s some larger issue.

Add the following to settings.py:

def show_toolbar(request):
    return True

That will effectively remove all checks by debug toolbar to determine if it should or should not load itself; it will always just load. Only leave that in for testing purposes, if you forget and launch with it, all your visitors will get to see your debug toolbar too.

For explicit configuration, also see the official install docs here.


Apparently the syntax for the nuclear option has changed. It’s now in its own dictionary:

def show_toolbar(request):
    return True

DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK": show_toolbar,
}

Their tests use this dictionary.

Answer 1

Debug toolbar wants the IP address in request.META['REMOTE_ADDR'] to be set in the INTERNAL_IPS setting. Throw in a print statement in one of your views, like so:

print("IP Address for debug-toolbar: " + request.META['REMOTE_ADDR'])

And then load that page. Make sure that IP is in your INTERNAL_IPS setting in settings.py.

Normally I’d think you would be able to determine the address easily by looking at your computer’s ip address, but in my case I’m running the server in a Virtual Box with port forwarding…and who knows what happened. Despite not seeing it anywhere in ifconfig on the VB or my own OS, the IP that showed up in the REMOTE_ADDR key was what did the trick of activating the toolbar.

Answer 2

Note: The debug toolbar is only displayed if the MIME type of the response is text/html or application/xhtml+xml and the response contains a closing tag.

Answer 3



The current stable version 0.11.0 requires the following things to be true for the toolbar to be shown:

Settings file:

  1. DEBUG = True
  2. INTERNAL_IPS to include your browser IP address, as opposed to the server address. If browsing locally this should be INTERNAL_IPS = ('',). If browsing remotely just specify your public address.
  3. The debug_toolbar app to be installed, i.e. INSTALLED_APPS = (..., 'debug_toolbar',)
  4. The debug toolbar middleware class to be added i.e. MIDDLEWARE_CLASSES = ('debug_toolbar.middleware.DebugToolbarMiddleware', ...). It should be placed as early as possible in the list.

Template files:

  1. Must be of type text/html
  2. Must have a closing </html> tag

Static files:

If you are serving static content make sure you collect the css, js and html by doing:

./manage.py collectstatic 

Note on upcoming versions of django-debug-toolbar

Newer, development versions have added defaults for settings points 2, 3 and 4 which makes life a bit simpler, however, as with any development version it has bugs. I found that the latest version from git resulted in an ImproperlyConfigured error when running through nginx/uwsgi.

Either way, if you want to install the latest version from github run:

pip install -e git+https://github.com/django-debug-toolbar/django-debug-toolbar.git#egg=django-debug-toolbar 

You can also clone a specific commit by doing:

pip install -e git+https://github.com/django-debug-toolbar/django-debug-toolbar.git@ba5af8f6fe7836eef0a0c85dd1e6d7418bc87f75#egg=django_debug_toolbar

Answer 4

I tried everything, from setting DEBUG = True, to setting INTERNAL_IPS to my client’s IP address, and even configuring Django Debug Toolbar manually (note that recent versions make all configurations automatically, such as adding the middleware and URLs). Nothing worked on a remote development server (though it did work locally). The ONLY thing that worked was configuring the toolbar as follows:

DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK": lambda request: True,
}

This replaces the default method that decides if the toolbar should be shown, and always returns true.

Answer 5



If you’re developing with a Django server in a Docker container, the instructions for enabling the toolbar don’t work. The reason is that the actual address you would need to add to INTERNAL_IPS is going to be something dynamic. Rather than trying to dynamically set the value of INTERNAL_IPS, the straightforward solution is to replace the function that enables the toolbar, in your settings.py, for example:

DEBUG_TOOLBAR_CONFIG = {
    'SHOW_TOOLBAR_CALLBACK': lambda _request: DEBUG
}

This should also work for other dynamic routing situations, like vagrant.

Here are some more details for the curious. The code in django-debug-toolbar that determines whether to show the toolbar examines the value of REMOTE_ADDR like this:

if request.META.get('REMOTE_ADDR', None) not in INTERNAL_IPS:
       return False

so if you don’t actually know the value of REMOTE_ADDR due to your dynamic docker routing, the toolbar will not work. You can use the docker network command to see the dynamic IP values, for example docker network inspect my_docker_network_name
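To make the check above easier to experiment with, here is a minimal sketch of a custom callback that can be exercised without a running server. The addresses are placeholders invented for the example, FakeRequest is a hypothetical stand-in for Django's request object, and show_toolbar is just an illustrative name; none of this is the toolbar's own code.

```python
# Assumption: values as they might appear in settings.py.
# The address below is only a placeholder for this sketch.
INTERNAL_IPS = ("127.0.0.1",)
DEBUG = True


def show_toolbar(request):
    """Hypothetical callback: show the toolbar whenever DEBUG is on, and
    report when the client address would have failed the default check."""
    remote_addr = request.META.get("REMOTE_ADDR")
    if remote_addr not in INTERNAL_IPS:
        print("REMOTE_ADDR %r is not in INTERNAL_IPS" % (remote_addr,))
    return DEBUG


class FakeRequest:
    """Stand-in for a Django request, used only to try the callback."""

    def __init__(self, remote_addr):
        self.META = {"REMOTE_ADDR": remote_addr}


shown_local = show_toolbar(FakeRequest("127.0.0.1"))
shown_other = show_toolbar(FakeRequest("172.18.0.1"))  # placeholder address
```

In a real project the function would be referenced from DEBUG_TOOLBAR_CONFIG under "SHOW_TOOLBAR_CALLBACK", as in the settings shown in this answer; the printed address then tells you what the default check was comparing against.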

Answer 6


I have the toolbar working perfectly with this configuration:

  1. DEBUG = True
  2. INTERNAL_IPS = ('', '',)
  4. The middleware is the first element in MIDDLEWARE_CLASSES:

I hope it helps

Answer 7

Add 10.0.2.2 to your INTERNAL_IPS on Windows; it is used with Vagrant internally:

INTERNAL_IPS = ('10.0.2.2',)

This should work.

Answer 8


I had the same problem and finally resolved it after some googling.

In INTERNAL_IPS, you need to have the client’s IP address.

Answer 9



Another thing that can cause the toolbar to remain hidden is if it cannot find the required static files. The debug_toolbar templates use the {{ STATIC_URL }} template tag, so make sure there is a folder in your static files called debug_toolbar.

The collectstatic management command should take care of this on most installations.

Answer 10


I tried the configuration from pydanny’s cookiecutter-django and it worked for me:

# django-debug-toolbar
MIDDLEWARE_CLASSES = Common.MIDDLEWARE_CLASSES + ('debug_toolbar.middleware.DebugToolbarMiddleware',)
INSTALLED_APPS += ('debug_toolbar',)


# end django-debug-toolbar

I just modified it by adding 'debug_toolbar.apps.DebugToolbarConfig' instead of 'debug_toolbar' as mentioned in the official django-debug-toolbar docs, as I’m using Django 1.7.

Answer 11






An addition to previous answers:

if the toolbar doesn’t show up, but it loads in the html (check your site html in a browser, scroll down)

the issue can be that debug toolbar static files are not found (you can also see this in your site’s access logs then, e.g. 404 errors for /static/debug_toolbar/js/toolbar.js)

It can be fixed the following way then (examples for nginx and apache):

nginx config:

location ~* ^/static/debug_toolbar/.+\.(ico|css|js)$ {
    root [path to your python site-packages here]/site-packages/debug_toolbar;
}

apache config:

Alias /static/debug_toolbar [path to your python site-packages here]/site-packages/debug_toolbar/static/debug_toolbar


Or run:

manage.py collectstatic

More on collectstatic here: https://docs.djangoproject.com/en/dev/ref/contrib/staticfiles/#collectstatic

Or manually move the debug_toolbar folder of debug_toolbar static files to your configured static files folder.

Answer 12


In my case, it was another problem that hasn’t been mentioned here yet: I had GZipMiddleware in my list of middlewares.

As the automatic configuration of debug toolbar puts the debug toolbar’s middleware at the top, it only gets to “see” the gzipped HTML, to which it can’t add the toolbar.

I removed GZipMiddleware in my development settings. Setting up the debug toolbar’s configuration manually and placing the middleware after GZip’s should also work.
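A sketch of the manual ordering described above, as a development settings fragment. Only the two middleware paths are taken as given; the rest of the list is elided, and gzip_pos and toolbar_pos are hypothetical names used to check the order. Django runs response processing in reverse list order, so a middleware listed after GZip handles the response before it is compressed.

```python
# Sketch of a development settings fragment (old-style MIDDLEWARE_CLASSES,
# matching the setting name used elsewhere in this thread).
MIDDLEWARE_CLASSES = (
    "django.middleware.gzip.GZipMiddleware",
    # Listed after GZip, so on the way out the toolbar sees the HTML
    # before GZipMiddleware compresses it.
    "debug_toolbar.middleware.DebugToolbarMiddleware",
    # ... the rest of your middleware ...
)

gzip_pos = MIDDLEWARE_CLASSES.index("django.middleware.gzip.GZipMiddleware")
toolbar_pos = MIDDLEWARE_CLASSES.index(
    "debug_toolbar.middleware.DebugToolbarMiddleware"
)
print(gzip_pos < toolbar_pos)
```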

Answer 13


In my case I just needed to remove the python compiled files (*.pyc)

Answer 14

Django 1.8.5:

I had to add the following to the project's urls.py file to get the debug toolbar to display:

from django.conf.urls import include, patterns, url
from django.conf import settings

if settings.DEBUG:
    import debug_toolbar
    urlpatterns += patterns('',
        url(r'^__debug__/', include(debug_toolbar.urls)),
    )

Django 1.10 and higher:

from django.conf.urls import include, url
from django.conf import settings

if settings.DEBUG:
    import debug_toolbar
    urlpatterns = [
        url(r'^__debug__/', include(debug_toolbar.urls)),
    ] + urlpatterns

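Also don’t forget to add the debug toolbar to your middleware. The Debug Toolbar is mostly implemented in a middleware. Enable it in your settings module as follows (newer Django versions):

MIDDLEWARE = [
    # ...
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    # ...
]

Old-style middleware (the setting name needs the _CLASSES suffix):

MIDDLEWARE_CLASSES = [
    # ...
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    # ...
]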

Answer 15

This wasn’t the case for this specific author but I just have been struggling with the Debug Toolbar not showing and after doing everything they pointed out, I found out it was a problem with MIDDLEWARE order. So putting the middleware early in the list could work. Mine is first:

MIDDLEWARE_CLASSES = (
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'dynpages.middleware.DynpageFallbackMiddleware',
    'utils.middleware.UserThread',
)

Answer 16



You have to make sure there is a closing tag in your templates.

My problem was that there were no regular HTML tags in my templates; I just displayed content in plain text. I solved it by inheriting every HTML file from base.html, which has a tag.

Answer 17


For me this was as simple as typing into the address bar, rather than localhost:8000 which apparently was not matching the INTERNAL_IPS.

Answer 18

I had the same problem and solved it by looking at Apache’s error log. I had Apache running on Mac OS X with mod_wsgi, and the debug_toolbar’s template folder wasn’t being loaded.

Log sample:

==> /private/var/log/apache2/dummy-host2.example.com-error_log <==
[Sun Apr 27 23:23:48 2014] [error] [client] File does not exist: /Library/WebServer/Documents/rblreport/rbl/static/debug_toolbar, referer:

==> /private/var/log/apache2/dummy-host2.example.com-access_log <== - - [27/Apr/2014:23:23:48 -0300] "GET /static/debug_toolbar/css/toolbar.css HTTP/1.1" 404 234 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0"

I just added this line to my VirtualHost file:

Alias /static/debug_toolbar /Library/Python/2.7/site-packages/debug_toolbar/static/debug_toolbar
  • Of course you must change your python path

Answer 19




I had the same problem using Vagrant. I solved this problem by adding ::ffff: to the INTERNAL_IPS as below example.


Remembering that is the IP in my private network in Vagrantfile.

Answer 20





I had this problem and had to install the debug toolbar from source.

Version 1.4 has a problem where it’s hidden if you use PureCSS and apparently other CSS frameworks.

This is the commit which fixes that.

The docs explain how to install from source.

Answer 21

For anyone who is using PyCharm 5: template debug is not working there in some versions. It was fixed in 5.0.4; the affected versions are 5.0.1 and 5.0.2. Check out the issue.

I spent A LOT of time finding that out. Maybe it will help someone.

Answer 22



In the code I was working on, multiple small requests were made during handling of main request (it’s very specific use case). They were requests handled by the same Django’s thread. Django debug toolbar (DjDT) doesn’t expect this behaviour and includes DjDT’s toolbars to the first response and then it removes its state for the thread. So when main request was sent back to the browser, DjDT was not included in the response.

Lessons learned: DjDT saves its state per thread. It removes the state for a thread after the first response.

Answer 23



What got me is an outdated browser!

Noticed that it loads some stylesheets from debug toolbar and guessed it might be a front-end issue.

Answer 24

One stupid thing got me: if you use Apache with mod_wsgi, remember to touch the .wsgi file to force your code to be reloaded. I just wasted 20 minutes of my time debugging this stupid error :(

Can’t install PIL after Mac OS X 10.9

I’ve just updated my Mac OS to 10.9 and I discovered that some (all?) of my Python modules are not here anymore, especially the Image one.

So I try to execute sudo pip install pil, but I get this error:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/usr/include/tk.h:78:11: fatal error: 'X11/Xlib.h' file not found

#      include <X11/Xlib.h>


1 error generated.

error: command 'cc' failed with exit status 1

My Xcode is up-to-date and I don’t have any idea. Is it possible that PIL is not yet 10.9 compatible ?

Answer 0

The following worked for me:

ln -s  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11
sudo pip install pil

But there is a more correct solution below, provided by Will:

Open your terminal and execute: xcode-select --install

Answer 1

Open your terminal and execute:

xcode-select --install

Answer 2

sudo ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11/ /usr/local/include/X11

This helped for me on OS X 10.9.

pip install pillow

But after pip install I still got:

*** ZLIB (PNG/ZIP) support not available

I finally fixed it by running:

xcode-select --install

and then reinstalling Pillow:

pip install pillow

    version      Pillow 2.2.1
    platform     darwin 2.7.5 (default, Aug 25 2013, 00:04:04)
                 [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]
    --- TKINTER support available
    --- JPEG support available
    --- ZLIB (PNG/ZIP) support available
    --- TIFF G3/G4 (experimental) support available
    --- FREETYPE2 support available
    --- LITTLECMS support available
    --- WEBP support available
    --- WEBPMUX support available

Answer 3

Works for me (OS X Yosemite 10.10.2, Python 2.7.9):

xcode-select --install
sudo pip install pillow

Try this to check it:

from PIL import Image
image = Image.open("file.jpg")
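
A check that does not need an image file can be sketched as follows. This is only an illustration, assuming Python 3.4 or later (for importlib.util.find_spec); the helper name pil_available is made up for this example and is not part of Pillow.

```python
import importlib.util


def pil_available():
    """Return True if a top-level 'PIL' package can be found by the import system."""
    # find_spec returns None when the top-level module is not installed.
    return importlib.util.find_spec("PIL") is not None


if __name__ == "__main__":
    if pil_available():
        print("PIL/Pillow is importable")
    else:
        print("PIL/Pillow is not installed for this interpreter")
```

Run it with the same interpreter that pip installed into; a mismatch between the two is a common reason for an import still failing after a successful install.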

Answer 4

Here is what I did. Some steps may not be necessary just for PIL, but I needed libpng and others anyway:

1) Run the Xcode install; use this command or download updates from the App Store:

xcode-select --install

1b) Add the optional Command Line Tools. In Mountain Lion this was an option on the Xcode download page, but now you have to register with your Apple ID and download from: https://developer.apple.com/downloads/

Look for Command Line Tools (OS X Mavericks) for Xcode

2) Install everything needed for Python (using brew; I believe you can use port as well):

brew install readline sqlite gdbm
brew install python --universal --framework
brew install libpng jpeg freetype

Unlink/relink if needed, i.e. if upgrading.

3) Install pip and required modules:

easy_install pip
sudo pip install setuptools --no-use-wheel --upgrade

4) Finally this works with no errors:

sudo pip install Pillow

UPDATE 11/04/14: The PIL repo no longer receives updates or support, so Pillow should be used. The command below is now deprecated, so stick with Pillow.

sudo pip install pil --allow-external pil --allow-unverified pil

UPDATE (OLD): The same thing applies when installing Pillow (a PIL fork), which should be mentioned as it’s quickly becoming a replacement for PIL in most cases. Instead of installing PIL in step 4, run this:

sudo pip install Pillow

Hope this helps someone!

Answer 5

Installing the command line tools fixed the issue for me.

You have to install them separately, as they are no longer part of the Xcode package.

Answer 6

None of those worked for me. I kept receiving:

clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
error: command 'cc' failed with exit status 1

So I found a workaround with the following solution:

export CFLAGS=-Qunused-arguments
export CPPFLAGS=-Qunused-arguments
sudo -E pip install PIL --allow-external PIL --allow-unverified PIL

This way I was able to install.

Answer 7

I had a similar problem: installing Pillow failed with clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future], installing the command line tools failed with Can't install the software because it is not currently available from the Software Update server., and even after installing the command line tools manually, the compilation of PIL failed.

This happens because clang in the newest version of Xcode no longer just warns about unknown compiler flags, but stops the compilation with a hard error.

To fix this, just run export ARCHFLAGS="-Wno-error=unused-command-line-argument-hard-error-in-future" in the terminal before trying to compile (installing PIL).

Answer 8

Simply run:

pip install pil --allow-external pil --allow-unverified pil

Answer 9

These are my steps on Mac OS 10.9.1:

1. sudo su
2. easy_install pip
3. xcode-select --install
4. pip install --no-index -f http://dist.plone.org/thirdparty/ -U PIL

Answer 10

You could use Homebrew (http://brew.sh) to do the install:

brew tap Homebrew/python
brew install pillow

Answer 11

Make sure you have the Command Line Tools for Xcode installed. Then execute:

sudo pip install pil --allow-external pil --allow-unverified pil

Answer 12

I was getting the following error:

building 'PIL._imagingft' extension
_imagingft.c:62:10: fatal error: 'freetype/fterrors.h' file not found

#include <freetype/fterrors.h>


1 error generated.

error: command 'cc' failed with exit status 1

The solution was to symlink freetype2 to freetype, which solved the problem.

Answer 13

I didn’t want to install Xcode (I don’t use it) and I’m loath to fiddle with the Applications directory. I’ve cribbed from the many answers in this post, and the following two steps work for me with 10.9.5:

sudo easy_install pip
sudo pip install pillow

It did seem strange to me that I had to use easy_install to install pip. But pip didn’t want to work for me before that (re-)install.

Answer 14

Found the solution: you have to symlink X11 like this: ln -s /opt/X11/include/X11 /usr/local/include/X11, and then sudo pip install pil should work.

Answer 15

Reusing @DmitryDemidenko’s answer, this is how it worked for me:

ln -s  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11

and then

sudo pip install -U PIL --allow-external PIL --allow-unverified PIL

Answer 16

Execute the commands below. Works like a charm on Mac OS 10.9.5:

easy_install pip

sudo pip install setuptools --no-use-wheel --upgrade

sudo pip install Pillow

Best, Theo

Answer 17

That’s what I did:

First upgrade to Xcode 5 (I am running 10.9). Then, execute the following commands in a terminal:

$ cd /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk
$ ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 usr/include/

Answer 18

A more complete solution requires installing the XQuartz X11 subsystem, which has been built outside of Apple for several years now. Here are the steps I used to get it all working:

  1. Install XQuartz from http://xquartz.macosforge.org/landing/
  2. Run sudo pip install pillow

Answer 19

The accepted answer, xcode-select --install, is the right one, but some people (including me) may encounter: Can't install the software because it is not currently available from the Software Update server. If you are using beta software (I am on Yosemite now and had the same problem), you need to get the CLT separately, since it is not included in Xcode (not even the Xcode beta). Head over to developers.apple.com and get the CLT for your OS ;)

P.S. You don’t need XQuartz for PIL or Pillow to work.

Answer 20

My machine, which was recently upgraded from OS X 10.8 to 10.9, got stuck in a loop between xcrun and lipo.

Rename /usr/bin/lipo to /usr/bin/lipo_broken

Refer to this thread for further information on how to resolve it:

xcrun/lipo freezes with OS X Mavericks and XCode 4.x

Answer 21

Install Pillow instead:

sudo pip install pillow

Answer 22

ln -s /usr/local/include/freetype2 /usr/local/include/freetype
sudo ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future pip install pil

Answer 23

Try this:

ln -s /usr/local/include/freetype2 /usr/local/include/freetype

Answer 24

sudo pip uninstall pillow
pip install pillow

worked for me. I’m running Python 2.7.9 on Yosemite. import PIL now works for me.

Answer 25

Installing PIL (Imaging.1.1.7) on Mac OS X 10.10 Yosemite: I tried numerous fixes recommended here but ran into trouble with each one. I finally solved the problem by editing the setup.py file such that:

TCL_ROOT = "/opt/X11/include"

which passes the appropriate include path for X11 in the compilation of _imagingtk.c, which was causing the problem for me. It worked immediately after the change.

Answer 26

I moved from pyenv to virtualenv and this fixed my problem.

Answer 27

  1. ln -s /opt/X11/include/X11 /usr/local/include/X11
  2. pip install pil without sudo








In Python console:

>>> ~True

Gives me:

-2

Why? Can someone explain this particular case to me in binary?

Answer 0

int(True) is 1.

1 is:

00000001

and ~1 is:

11111110

Which is -2 in two’s complement.¹

¹ Flip all the bits, add 1 to the resulting number and interpret the result as a binary representation of the magnitude and add a negative sign (since the number begins with 1):

11111110 → 00000001 → 00000010 
         ↑          ↑ 
       Flip       Add 1

Which is 2, but the sign is negative since the MSB is 1.

Worth mentioning:

Think about bool and you’ll find that it’s numeric in nature. It has two values, True and False, and they are just “customized” versions of the integers 1 and 0 that only print themselves differently. They are instances of bool, a subclass of the integer type int.

So they behave exactly as 1 and 0, except that bool redefines str and repr to display them differently.

>>> type(True)
<class 'bool'>
>>> isinstance(True, int)
True
>>> True == 1
True
>>> True is 1  # they're still different objects
False

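The points above can be checked directly. This is a small sketch assuming Python 3; note that, as far as I know, Python 3.12 and later emit a DeprecationWarning for ~ applied to a bool, which is silenced here so the example stays quiet.

```python
import warnings

# Newer Python versions warn about ~ on bool; this sketch ignores that warning.
warnings.simplefilter("ignore", DeprecationWarning)

# bool is a subclass of int, so True and False take part in integer arithmetic.
assert issubclass(bool, int)
assert True + True == 2
assert ~True == ~1 == -2      # bitwise inversion acts on the integer value 1
assert ~False == ~0 == -1
assert (not True) is False    # logical negation stays within bool

# str() is where True visibly differs from a plain 1.
shown = (str(True), str(1))
print(shown)  # ('True', '1')
```
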
Answer 1

The Python bool type is a subclass of int (for historical reasons; booleans were only added in Python 2.3).

Since int(True) is 1, ~True is ~1 is -2.

See PEP 285 for why bool is a subclass of int.

If you wanted the boolean inverse, use not:

>>> not True
False
>>> not False
True

If you wanted to know why ~1 is -2, it’s because you are inverting all bits in a signed integer; 00000001 becomes 11111110, which in a signed integer is a negative number (see two’s complement):

>>> # Python 3
>>> import struct
>>> format(struct.pack('b', 1)[0], '08b')
'00000001'
>>> format(struct.pack('b', ~1)[0], '08b')
'11111110'

where the initial 1 bit means the value is negative, and the rest of the bits encode the inverse of the positive number minus one.
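
The same bit patterns can be produced without struct. The following is a minimal sketch; the helper names twos_complement_bits and invert are invented for this illustration, and the 8-bit width is an arbitrary choice, since Python integers have no fixed width.

```python
def twos_complement_bits(value, width=8):
    """Return the two's-complement bit pattern of value in `width` bits."""
    # Masking keeps the low `width` bits; for negative numbers Python behaves
    # as if the sign bit extended infinitely to the left.
    return format(value & ((1 << width) - 1), "0{}b".format(width))


def invert(value):
    """Bitwise inversion expressed arithmetically: ~x == -x - 1."""
    return -value - 1


print(twos_complement_bits(1))    # 00000001
print(twos_complement_bits(~1))   # 11111110
print(invert(1) == ~1)            # True
```

The identity ~x == -x - 1 is why ~1 is -2 and ~0 is -1, regardless of how many bits are displayed.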

Answer 2

~True == -2 is not surprising if True means 1 and ~ means bitwise inversion

provided that


  • fixed the mixing between integer representation and bitwise inversion operator
  • applied another polishing (the shorter the message, the more work needed)




Is there any meaningful distinction between:

class A(object):
    foo = 5   # some default value


class B(object):
    def __init__(self, foo=5):
        self.foo = foo

If you’re creating a lot of instances, is there any difference in performance or space requirements for the two styles? When you read the code, do you consider the meaning of the two styles to be significantly different?

Answer 0

Beyond performance considerations, there is a significant semantic difference. In the class attribute case, there is just one object referred to. In the instance-attribute-set-at-instantiation case, there can be multiple objects referred to. For instance:

>>> class A: foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[5]
>>> class A:
...  def __init__(self): self.foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[]

Answer 1

The difference is that the attribute on the class is shared by all instances. The attribute on an instance is unique to that instance.

If you are coming from C++, attributes on the class are more like static member variables.
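
To make the "static member" comparison concrete, here is a small sketch. The Counter class and its attribute names are made up for illustration only.

```python
class Counter(object):
    created = 0                # class attribute: one value shared by all instances

    def __init__(self, name):
        self.name = name       # instance attribute: one value per instance
        Counter.created += 1   # update the shared value through the class


a = Counter("a")
b = Counter("b")
print(a.name, b.name)                          # a b
print(Counter.created, a.created, b.created)   # 2 2 2
```

Writing Counter.created += 1 rather than self.created += 1 matters: the latter would create an instance attribute on self and leave the shared value untouched.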

Answer 2

Here is a very good post; I summarize it below.

class Bar(object):
    ## No need for dot syntax
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var

## Need dot syntax as we've left scope of class namespace
Bar.class_var
## 1
foo = Bar(2)

## Finds i_var in foo's instance namespace
foo.i_var
## 2

## Doesn't find class_var in instance namespace…
## So looks in class namespace (Bar.__dict__)
foo.class_var
## 1

Class attribute assignment

  • If a class attribute is set by accessing the class, it will override the value for all instances

    foo = Bar(2)
    foo.class_var
    ## 1
    Bar.class_var = 2
    foo.class_var
    ## 2
  • If a class variable is set by accessing an instance, it will override the value only for that instance. This essentially overrides the class variable and turns it into an instance variable available, intuitively, only for that instance.

    foo = Bar(2)
    foo.class_var
    ## 1
    foo.class_var = 2
    foo.class_var
    ## 2
    Bar.class_var
    ## 1

When would you use a class attribute?

  • Storing constants. As class attributes can be accessed as attributes of the class itself, it’s often nice to use them for storing class-wide, class-specific constants

    class Circle(object):
        pi = 3.14159

        def __init__(self, radius):
            self.radius = radius

        def area(self):
            return Circle.pi * self.radius * self.radius

    Circle.pi
    ## 3.14159
    c = Circle(10)
    c.pi
    ## 3.14159
    c.area()
    ## 314.159
  • Defining default values. As a trivial example, we might create a bounded list (i.e., a list that can only hold a certain number of elements or fewer) and choose to have a default cap of 10 items

    class MyClass(object):
        limit = 10

        def __init__(self):
            self.data = []

        def item(self, i):
            return self.data[i]

        def add(self, e):
            if len(self.data) >= self.limit:
                raise Exception("Too many elements")
            self.data.append(e)

    MyClass.limit
    ## 10

Answer 3

Since people in the comments here and in two other questions marked as dups all appear to be confused about this in the same way, I think it’s worth adding an additional answer on top of Alex Coventry’s.

The fact that Alex is assigning a value of a mutable type, like a list, has nothing to do with whether things are shared or not. We can see this with the id function or the is operator:

>>> class A: foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
True
>>> class A:
...     def __init__(self): self.foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
False

(If you’re wondering why I used object() instead of, say, 5, that’s to avoid running into two whole other issues which I don’t want to get into here; for two different reasons, entirely separately-created 5s can end up being the same instance of the number 5. But entirely separately-created object()s cannot.)

So, why is it that a.foo.append(5) in Alex’s example affects b.foo, but a.foo = 5 in my example doesn’t? Well, try a.foo = 5 in Alex’s example, and notice that it doesn’t affect b.foo there either.

a.foo = 5 is just making a.foo into a name for 5. That doesn’t affect b.foo, or any other name for the old value that a.foo used to refer to.* It’s a little tricky that we’re creating an instance attribute that hides a class attribute,** but once you get that, nothing complicated is happening here.

Hopefully it’s now obvious why Alex used a list: the fact that you can mutate a list means it’s easier to show that two variables name the same list, and also means it’s more important in real-life code to know whether you have two lists or two names for the same list.

* The confusion for people coming from a language like C++ is that in Python, values aren’t stored in variables. Values live off in value-land, on their own; variables are just names for values, and assignment just creates a new name for a value. If it helps, think of each Python variable as a shared_ptr<T> instead of a T.

** Some people take advantage of this by using a class attribute as a “default value” for an instance attribute that instances may or may not set. This can be useful in some cases, but it can also be confusing, so be careful with it.
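
The hiding described in the second footnote can be observed through the instance dictionary. This is a short sketch with a throwaway class A; nothing here is specific to any library.

```python
class A(object):
    foo = 5  # class attribute acting as a default


a, b = A(), A()
a.foo = 7                      # creates an instance attribute on a only

print("foo" in a.__dict__)     # True: a now has its own foo
print("foo" in b.__dict__)     # False: b still falls back to the class
print(a.foo, b.foo, A.foo)     # 7 5 5

del a.foo                      # removes the instance attribute...
print(a.foo)                   # 5: ...and the class attribute shows through again
```
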

Answer 4

There is one more situation: the class or instance attribute is a descriptor.

# -*- encoding: utf-8 -*-

class RevealAccess(object):
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        return self.val

class Base(object):
    attr_1 = RevealAccess(10, 'var "x"')

    def __init__(self):
        self.attr_2 = RevealAccess(10, 'var "x"')

def main():
    b = Base()
    print("Access to class attribute, return: ", Base.attr_1)
    print("Access to instance attribute, return: ", b.attr_2)

if __name__ == '__main__':
    main()

The above will output:

('Access to class attribute, return: ', 10)
('Access to instance attribute, return: ', <__main__.RevealAccess object at 0x10184eb50>)

Accessing the same type of object through the class or through the instance returns different results!

I found the reason in the C definition of PyObject_GenericGetAttr, and in a great post:

If the attribute is found in the dictionary of the classes which make up the object’s MRO, then check to see if the attribute being looked up points to a Data Descriptor (which is nothing more than a class implementing both the __get__ and the __set__ methods). If it does, resolve the attribute lookup by calling the __get__ method of the Data Descriptor (lines 28–33).
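
The lookup order quoted above can be sketched in pure Python. The classes Ten, Locked and Base are invented for this illustration; the behaviour shown is that of ordinary objects using the default attribute lookup.

```python
class Ten(object):
    """Non-data descriptor: defines only __get__."""
    def __get__(self, obj, objtype=None):
        return 10


class Locked(object):
    """Data descriptor: defines both __get__ and __set__."""
    def __get__(self, obj, objtype=None):
        return "from descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class Base(object):
    on_class = Ten()       # found on the class, so __get__ is called
    locked = Locked()

    def __init__(self):
        self.on_instance = Ten()   # stored in the instance dict, returned as-is


b = Base()
print(b.on_class)                        # 10
print(isinstance(b.on_instance, Ten))    # True: no __get__ call happened

# A data descriptor wins even over an entry in the instance dict.
b.__dict__["locked"] = "from instance dict"
print(b.locked)                          # from descriptor

# A non-data descriptor loses to the instance dict.
b.__dict__["on_class"] = "from instance dict"
print(b.on_class)                        # from instance dict
```

This is why a descriptor only "works" when it is stored on the class: objects found in the instance dictionary are returned without any __get__ call.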




Is there a way in Python to pass optional parameters to a function when calling it, and to have code in the function definition that runs only if the optional parameter was passed?

Answer 0

The Python 2 documentation, 7.6. Function definitions, gives you a couple of ways to detect whether a caller supplied an optional parameter.

First, you can use the special formal parameter syntax *. If the function definition has a formal parameter preceded by a single *, then Python populates that parameter with any positional parameters that aren’t matched by preceding formal parameters (as a tuple). If the function definition has a formal parameter preceded by **, then Python populates that parameter with any keyword parameters that aren’t matched by preceding formal parameters (as a dict). The function’s implementation can check the contents of these parameters for any “optional parameters” of the sort you want.

For instance, here’s a function opt_fun which takes two positional parameters x1 and x2, and looks for another keyword parameter named “optional”.

>>> def opt_fun(x1, x2, *positional_parameters, **keyword_parameters):
...     if ('optional' in keyword_parameters):
...         print 'optional parameter found, it is ', keyword_parameters['optional']
...     else:
...         print 'no optional parameter, sorry'
>>> opt_fun(1, 2)
no optional parameter, sorry
>>> opt_fun(1,2, optional="yes")
optional parameter found, it is  yes
>>> opt_fun(1,2, another="yes")
no optional parameter, sorry

Second, you can give the parameter a default value, such as None, that a caller would never pass. If the parameter has this default value, you know the caller did not specify the parameter. If the parameter has a non-default value, you know it came from the caller.
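
The second approach might look like the sketch below. The function greet and its parameters are hypothetical, and the sketch assumes callers never pass None deliberately; if None is a legitimate value, a dedicated sentinel object is the safer choice.

```python
def greet(name, greeting=None):
    # None is the sentinel here; this assumes callers never pass None on purpose.
    if greeting is None:
        return "Hello, {} (default greeting)".format(name)
    return "{}, {}".format(greeting, name)


print(greet("Ada"))            # Hello, Ada (default greeting)
print(greet("Ada", "Howdy"))   # Howdy, Ada
```
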

Answer 1

def my_func(mandatory_arg, optional_arg=100):
    print(mandatory_arg, optional_arg)

I find this more readable than using **kwargs.

To determine if an argument was passed at all, I use a custom utility object as the default value:

MISSING = object()

def func(arg=MISSING):
    if arg is MISSING:
        ...

Answer 2

def op(a=4, b=6):
    add = a + b
    print(add)

i) op() prints 4 + 6 = 10
ii) op(99) prints 99 + 6 = 105
iii) op(1, 1) prints 1 + 1 = 2

If no argument or only one argument is passed, the default value is used for each parameter that was not supplied.

Answer 3

If you want to give a parameter a default value, assign it inside the parentheses, like (x=10). The important thing is that compulsory parameters must come first, followed by the parameters with default values.

For example:

(y, x=10)

but

(x=10, y) is wrong
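
The ordering rule is enforced when the function definition is compiled, which can be shown with the built-in compile(). This is just an illustrative sketch; the helper is_valid_signature is invented for the example.

```python
def is_valid_signature(params):
    """Return True if 'def f(<params>): pass' compiles."""
    try:
        compile("def f({}): pass".format(params), "<signature>", "exec")
        return True
    except SyntaxError:
        # Raised, for example, when a parameter without a default
        # follows one that has a default.
        return False


print(is_valid_signature("y, x=10"))   # True
print(is_valid_signature("x=10, y"))   # False
```
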

Answer 4

You can specify a default value for the optional argument that would never be passed to the function, and check for it with the is operator:

class _NO_DEFAULT:
    def __repr__(self):return "<no default>"
_NO_DEFAULT = _NO_DEFAULT()

def func(optional= _NO_DEFAULT):
    if optional is _NO_DEFAULT:
        print("the optional argument was not passed")
    else:
        print("the optional argument was:",optional)

Then, as long as you do not call func(_NO_DEFAULT), you can accurately detect whether the argument was passed or not, and unlike the accepted answer you don’t have to worry about side effects of the ** notation:

# these two work the same as using **
func()
func(optional=1)

# the optional argument can be positional or keyword unlike using **
func(1)

#this correctly raises an error where as it would need to be explicitly checked when using **
func(invalid_arg=7)




I was playing around with timeit and noticed that a simple list comprehension over a small string took longer than the same operation on a list of small single-character strings. Any explanation? It takes almost 1.35 times as long.

>>> from timeit import timeit
>>> timeit("[x for x in 'abc']")
>>> timeit("[x for x in ['a', 'b', 'c']]")

What’s happening on a lower level that’s causing this?

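Answer 0


  • For Python 2, the actual speed difference is closer to 70% (or more) once a lot of the overhead is removed.

  • Object creation is not at fault. Neither method creates a new object, as one-character strings are cached.

  • The difference is unobvious, but is likely caused by a greater number of checks on string indexing, with regard to the type and well-formedness. It is also quite likely due to the need to check what to return.

  • List indexing is remarkably fast.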
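
These claims are easy to re-measure on your own machine. The following is a rough sketch, assuming Python 3; the repetition count N is an arbitrary choice to keep the run short, and no expected timings are given because absolute numbers (and even the ordering) vary with the interpreter version and hardware.

```python
from timeit import timeit

# Fewer repetitions than timeit's default of 1,000,000 keep the run short.
N = 100000

string_time = timeit("[x for x in 'abc']", number=N)
list_time = timeit("[x for x in ['a', 'b', 'c']]", number=N)
# A tuple of constants is included for comparison with the list literal.
tuple_time = timeit("[x for x in ('a', 'b', 'c')]", number=N)

for label, seconds in [("str", string_time), ("list", list_time), ("tuple", tuple_time)]:
    print("{:5s} {:.4f} s for {} runs".format(label, seconds, N))
```
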

>>> python3 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.388 usec per loop

>>> python3 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.436 usec per loop


You must be using Python 2, then.

>>> python2 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.212 usec per loop


对于Python 3:

import dis

def list_iterate():
    [item for item in ["a", "b", "c"]]

#>>>   4           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b118a0, file "", line 4>)
#>>>               3 LOAD_CONST               2 ('list_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('a')
#>>>              12 LOAD_CONST               4 ('b')
#>>>              15 LOAD_CONST               5 ('c')
#>>>              18 BUILD_LIST               3
#>>>              21 GET_ITER
#>>>              22 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              25 POP_TOP
#>>>              26 LOAD_CONST               0 (None)
#>>>              29 RETURN_VALUE

def string_iterate():
    [item for item in "abc"]

#>>>  21           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b17150, file "", line 21>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('abc')
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE



 9 LOAD_CONST   3 ('a')
12 LOAD_CONST   4 ('b')
15 LOAD_CONST   5 ('c')


 9 LOAD_CONST   3 ('abc')


def string_iterate():
    [item for item in ("a", "b", "c")]

#>>>  35           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d068be660, file "", line 35>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               6 (('a', 'b', 'c'))
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE


 9 LOAD_CONST               6 (('a', 'b', 'c'))


>>> python3 -m timeit '[x for x in ("a", "b", "c")]'
1000000 loops, best of 3: 0.369 usec per loop


对于Python 2:

def list_iterate():
    [item for item in ["a", "b", "c"]]

#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('a')
#>>>               6 LOAD_CONST               2 ('b')
#>>>               9 LOAD_CONST               3 ('c')
#>>>              12 BUILD_LIST               3
#>>>              15 GET_ITER            
#>>>         >>   16 FOR_ITER                12 (to 31)
#>>>              19 STORE_FAST               0 (item)
#>>>              22 LOAD_FAST                0 (item)
#>>>              25 LIST_APPEND              2
#>>>              28 JUMP_ABSOLUTE           16
#>>>         >>   31 POP_TOP             
#>>>              32 LOAD_CONST               0 (None)
#>>>              35 RETURN_VALUE        

def string_iterate():
    [item for item in "abc"]

#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('abc')
#>>>               6 GET_ITER            
#>>>         >>    7 FOR_ITER                12 (to 22)
#>>>              10 STORE_FAST               0 (item)
#>>>              13 LOAD_FAST                0 (item)
#>>>              16 LIST_APPEND              2
#>>>              19 JUMP_ABSOLUTE            7
#>>>         >>   22 POP_TOP             
#>>>              23 LOAD_CONST               0 (None)
#>>>              26 RETURN_VALUE        

奇怪的是,我们具有相同的列表构建,但是这样做的速度仍然更快。Python 2的运行速度异常快。

让我们删除理解和重新计时。这_ =是为了防止它被优化。

>>> python3 -m timeit '_ = ["a", "b", "c"]'
10000000 loops, best of 3: 0.0707 usec per loop

>>> python3 -m timeit '_ = "abc"'
100000000 loops, best of 3: 0.0171 usec per loop

我们可以看到初始化不足以说明版本之间的差异(这些数字很小)!因此,我们可以得出结论,Python 3的理解速度较慢。随着Python 3将理解方式更改为具有更安全的作用域,这才有意义。


>>> python3 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.387 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
1000000 loops, best of 3: 0.368 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
10000000 loops, best of 3: 0.164 usec per loop


>>> python3 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.099 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.1 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.0913 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.0854 usec per loop

不,不是。差别太小,尤其是对于Python 3。


>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 3.12 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.77 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 2.32 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.09 msec per loop



>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.71 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 1.36 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.27 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 935 usec per loop


>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop
>>> python2 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 805 usec per loop

>>> python2 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 438 usec per loop


  • bytes

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127)).encode("ascii") for _ in range(100000))' 'deque(iterable, maxlen=0)'                                                                    :(
    1000 loops, best of 3: 571 usec per loop
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127)).encode("ascii") for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127))                 for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 757 usec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127))                 for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 438 usec per loop

    在这里,您可以看到Python 3实际上比Python 2 更快

  • unicode

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = u"".join(   chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 800 usec per loop
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [   chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = u"".join(unichr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 1.07 msec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [unichr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 469 usec per loop

    同样,Python 3更快,尽管这是可以预料的(str在Python 3中引起了很多关注)。



>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop

实际上,我们可以排除蒂姆·彼得(Tim Peter)提出10次支持的答案!

>>> foo = iterable[123]
>>> iterable[36] is foo



>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'iterable[123]'
10000000 loops, best of 3: 0.0397 usec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable[123]'
10000000 loops, best of 3: 0.0374 usec per loop


>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable; 123'
100000000 loops, best of 3: 0.0173 usec per loop





static PyObject *
unicode_getitem(PyObject *self, Py_ssize_t index)
    void *data;
    enum PyUnicode_Kind kind;
    Py_UCS4 ch;
    PyObject *res;

    if (!PyUnicode_Check(self) || PyUnicode_READY(self) == -1) {
        return NULL;
    if (index < 0 || index >= PyUnicode_GET_LENGTH(self)) {
        PyErr_SetString(PyExc_IndexError, "string index out of range");
        return NULL;
    kind = PyUnicode_KIND(self);
    data = PyUnicode_DATA(self);
    ch = PyUnicode_READ(kind, data, index);
    if (ch < 256)
        return get_latin1_char(ch);

    res = PyUnicode_New(1, ch);
    if (res == NULL)
        return NULL;
    kind = PyUnicode_KIND(res);
    data = PyUnicode_DATA(res);
    PyUnicode_WRITE(kind, data, 0, ch);
    assert(_PyUnicode_CheckConsistency(res, 1));
    return res;


ch = PyUnicode_READ(kind, data, index);



kind = PyUnicode_KIND(self);
data = PyUnicode_DATA(self);
ch = PyUnicode_READ(kind, data, index);
return get_latin1_char(ch);


#define PyUnicode_KIND(op) \
    (assert(PyUnicode_Check(op)), \
     assert(PyUnicode_IS_READY(op)),            \
     ((PyASCIIObject *)(op))->state.kind)

(这很无聊,因为断言在调试中会被忽略(因此我可以检查它们是否很快),((PyASCIIObject *)(op))->state.kind)并且(我认为)是间接调用和C级强制转换);

#define PyUnicode_DATA(op) \
    (assert(PyUnicode_Check(op)), \
     PyUnicode_IS_COMPACT(op) ? _PyUnicode_COMPACT_DATA(op) :   \


#define PyUnicode_READ(kind, data, index) \
    ((Py_UCS4) \
    ((kind) == PyUnicode_1BYTE_KIND ? \
        ((const Py_UCS1 *)(data))[(index)] : \
        ((kind) == PyUnicode_2BYTE_KIND ? \
            ((const Py_UCS2 *)(data))[(index)] : \
            ((const Py_UCS4 *)(data))[(index)] \
        ) \


static PyObject*
get_latin1_char(unsigned char ch)
    PyObject *unicode = unicode_latin1[ch];
    if (!unicode) {
        unicode = PyUnicode_New(1, ch);
        if (!unicode)
            return NULL;
        PyUnicode_1BYTE_DATA(unicode)[0] = ch;
        assert(_PyUnicode_CheckConsistency(unicode, 1));
        unicode_latin1[ch] = unicode;
    return unicode;


  • 这被缓存:

    PyObject *unicode = unicode_latin1[ch];
  • 这应该很快。在if (!unicode)没有运行,所以它是在这种情况下相当于字面上

    PyObject *unicode = unicode_latin1[ch];
    return unicode;

坦白地说,在测试asserts 之后(通过禁用它们[我认为它可以在C级断言上运行…]),只有看起来很慢的部分是:



#define PyUnicode_IS_COMPACT(op) \


#define _PyUnicode_COMPACT_DATA(op)                     \
    (PyUnicode_IS_ASCII(op) ?                   \
     ((void*)((PyASCIIObject*)(op) + 1)) :              \
     ((void*)((PyCompactUnicodeObject*)(op) + 1)))


#define _PyUnicode_NONCOMPACT_DATA(op)                  \
    (assert(((PyUnicodeObject*)(op))->data.any),        \
     ((((PyUnicodeObject *)(op))->data.any)))





#define PyUnicode_IS_ASCII(op)                   \
    (assert(PyUnicode_Check(op)),                \
     assert(PyUnicode_IS_READY(op)),             \


好吧,但让我们将其与进行比较PyList_GetItem。(是的,感谢蒂姆·彼得斯(Tim Peters)为我提供了更多的工作要做:P。)

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
    if (!PyList_Check(op)) {
        return NULL;
    if (i < 0 || i >= Py_SIZE(op)) {
        if (indexerr == NULL) {
            indexerr = PyUnicode_FromString(
                "list index out of range");
            if (indexerr == NULL)
                return NULL;
        PyErr_SetObject(PyExc_IndexError, indexerr);
        return NULL;
    return ((PyListObject *)op) -> ob_item[i];


((PyListObject *)op) -> ob_item[i]


#define PyList_Check(op) \
     PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LIST_SUBCLASS)

TABS!TABS !!!)(issue215875分钟内修复并合并。就像…是的。该死的。他们让Skeet感到羞耻。

#define Py_SIZE(ob)             (((PyVarObject*)(ob))->ob_size)
#define PyType_FastSubclass(t,f)  PyType_HasFeature(t,f)
#define PyType_HasFeature(t,f)  ((PyType_GetFlags(t) & (f)) != 0)
#define PyType_HasFeature(t,f)  (((t)->tp_flags & (f)) != 0)


然后是索引和强制转换(((PyListObject *)op) -> ob_item[i]),我们完成了。




  • The actual speed difference is closer to 70% (or more) once a lot of the overhead is removed, for Python 2.

  • Object creation is not at fault. Neither method creates a new object, as one-character strings are cached.

  • The difference is not obvious, but it likely comes from the greater number of checks made when indexing a string, with regard to type and well-formedness. It is also quite likely due to the need to check what to return.

  • List indexing is remarkably fast.

>>> python3 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.388 usec per loop

>>> python3 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.436 usec per loop

This disagrees with what you’ve found…

You must be using Python 2, then.

>>> python2 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.212 usec per loop

Let’s explain the difference between the versions. I’ll examine the compiled code.

For Python 3:

import dis

def list_iterate():
    [item for item in ["a", "b", "c"]]

#>>>   4           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b118a0, file "", line 4>)
#>>>               3 LOAD_CONST               2 ('list_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('a')
#>>>              12 LOAD_CONST               4 ('b')
#>>>              15 LOAD_CONST               5 ('c')
#>>>              18 BUILD_LIST               3
#>>>              21 GET_ITER
#>>>              22 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              25 POP_TOP
#>>>              26 LOAD_CONST               0 (None)
#>>>              29 RETURN_VALUE

def string_iterate():
    [item for item in "abc"]

#>>>  21           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b17150, file "", line 21>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('abc')
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

You see here that the list variant is likely to be slower due to the building of the list each time.

This is the

 9 LOAD_CONST   3 ('a')
12 LOAD_CONST   4 ('b')
15 LOAD_CONST   5 ('c')

part. The string variant only has

 9 LOAD_CONST   3 ('abc')

You can check that this does seem to make a difference:

def string_iterate():
    [item for item in ("a", "b", "c")]

#>>>  35           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d068be660, file "", line 35>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               6 (('a', 'b', 'c'))
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

This produces just

 9 LOAD_CONST               6 (('a', 'b', 'c'))

as tuples are immutable. Test:

>>> python3 -m timeit '[x for x in ("a", "b", "c")]'
1000000 loops, best of 3: 0.369 usec per loop

Great, back up to speed.
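
The comparison above can be repeated without reading full disassembly listings. This is a minimal sketch using dis.get_instructions; the helper name opnames is hypothetical. The opcode names printed depend on the interpreter version, and newer CPython releases may turn the list literal into a constant tuple, so BUILD_LIST may not appear at all.

```python
import dis


def list_iterate():
    [item for item in ["a", "b", "c"]]


def tuple_iterate():
    [item for item in ("a", "b", "c")]


def string_iterate():
    [item for item in "abc"]


def opnames(func):
    """Return the opcode names of func's outer code object, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


# Print one line per variant so the opcode sequences can be compared by eye.
for func in (list_iterate, tuple_iterate, string_iterate):
    print(func.__name__, opnames(func))
```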

For Python 2:

def list_iterate():
    [item for item in ["a", "b", "c"]]

#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('a')
#>>>               6 LOAD_CONST               2 ('b')
#>>>               9 LOAD_CONST               3 ('c')
#>>>              12 BUILD_LIST               3
#>>>              15 GET_ITER            
#>>>         >>   16 FOR_ITER                12 (to 31)
#>>>              19 STORE_FAST               0 (item)
#>>>              22 LOAD_FAST                0 (item)
#>>>              25 LIST_APPEND              2
#>>>              28 JUMP_ABSOLUTE           16
#>>>         >>   31 POP_TOP             
#>>>              32 LOAD_CONST               0 (None)
#>>>              35 RETURN_VALUE        

def string_iterate():
    [item for item in "abc"]

#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('abc')
#>>>               6 GET_ITER            
#>>>         >>    7 FOR_ITER                12 (to 22)
#>>>              10 STORE_FAST               0 (item)
#>>>              13 LOAD_FAST                0 (item)
#>>>              16 LIST_APPEND              2
#>>>              19 JUMP_ABSOLUTE            7
#>>>         >>   22 POP_TOP             
#>>>              23 LOAD_CONST               0 (None)
#>>>              26 RETURN_VALUE        

The odd thing is that we have the same building of the list, but it’s still faster for this. Python 2 is acting strangely fast.

Let’s remove the comprehensions and re-time. The _ = is to prevent it getting optimised out.

>>> python3 -m timeit '_ = ["a", "b", "c"]'
10000000 loops, best of 3: 0.0707 usec per loop

>>> python3 -m timeit '_ = "abc"'
100000000 loops, best of 3: 0.0171 usec per loop

We can see that initialization is not significant enough to account for the difference between the versions (those numbers are small)! We can thus conclude that Python 3 has slower comprehensions. This makes sense as Python 3 changed comprehensions to have safer scoping.

Well, now improve the benchmark (I’m just removing overhead that isn’t iteration). This removes the building of the iterable by pre-assigning it:

>>> python3 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.387 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
1000000 loops, best of 3: 0.368 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
10000000 loops, best of 3: 0.164 usec per loop
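
The same pre-assignment can be done from inside Python with timeit.timeit and its setup argument, which plays the role of -s on the command line. This is a sketch only: NUMBER is a hypothetical repeat count kept small so it finishes quickly, and the timings will differ from the figures quoted here.

```python
import timeit

NUMBER = 100000  # hypothetical repeat count, not the value the CLI would pick

# The setup string runs once; only the statement is timed.
string_time = timeit.timeit("[x for x in iterable]",
                            setup='iterable = "abc"', number=NUMBER)
list_time = timeit.timeit("[x for x in iterable]",
                          setup='iterable = ["a", "b", "c"]', number=NUMBER)

# Convert total seconds into microseconds per loop, like the CLI output.
print("string: {:.3f} usec per loop".format(string_time / NUMBER * 1e6))
print("list:   {:.3f} usec per loop".format(list_time / NUMBER * 1e6))
```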

We can check if calling iter is the overhead:

>>> python3 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.099 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.1 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.0913 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.0854 usec per loop

No. No it is not. The difference is too small, especially for Python 3.

So let’s remove yet more unwanted overhead… by making the whole thing slower! The aim is just to have a longer iteration so the time hides overhead.

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 3.12 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.77 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 2.32 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.09 msec per loop

This hasn’t actually changed much, but it’s helped a little.

So remove the comprehension. It’s overhead that’s not part of the question:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.71 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 1.36 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.27 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 935 usec per loop

That’s more like it! We can get slightly faster still by using deque to iterate. It’s basically the same, but it’s faster:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop
>>> python2 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 805 usec per loop

>>> python2 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 438 usec per loop
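
A short sketch of why deque(iterable, maxlen=0) works as a pure iteration benchmark: a deque that can hold zero items throws each one away as it arrives, so the iterable is walked to the end and nothing is stored. The helper name consume is hypothetical, and the claim that the loop runs in C applies to CPython.

```python
from collections import deque


def consume(iterable):
    """Exhaust iterable, discarding every item."""
    # maxlen=0 means no item is ever kept; on CPython the loop that pulls
    # items from the iterable runs in C rather than in Python bytecode.
    deque(iterable, maxlen=0)


letters = iter("abc")
consume(letters)

# The iterator has nothing left, so the fallback value is returned.
print(next(letters, "exhausted"))
```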

What impresses me is that Unicode is competitive with bytestrings. We can check this explicitly by trying bytes and unicode in both:

  • bytes

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127)).encode("ascii") for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 571 usec per loop
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127)).encode("ascii") for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127))                 for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 757 usec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127))                 for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 438 usec per loop

    Here you see Python 3 actually faster than Python 2.

  • unicode

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = u"".join(   chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 800 usec per loop
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [   chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = u"".join(unichr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 1.07 msec per loop
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [unichr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 469 usec per loop

    Again, Python 3 is faster, although this is to be expected (str has had a lot of attention in Python 3).

In fact, this unicode-bytes difference is very small, which is impressive.

So let’s analyse this one case, seeing as it’s fast and convenient for me:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop

We can actually rule out Tim Peters’s 10-times-upvoted answer!

>>> foo = iterable[123]
>>> iterable[36] is foo

These are not new objects!
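
The same check can be reproduced on a small fixed string. This is a sketch under the assumption that it runs on CPython, where one-character strings in this range are shared; identity of equal strings is an implementation detail, so other interpreters may print False on the second line.

```python
import platform

text = "abcabc"

first_a = text[0]   # the "a" at index 0
second_a = text[3]  # the "a" at index 3

# Equality always holds; identity is a CPython implementation detail.
print(first_a == second_a)
print(first_a is second_a)
print(platform.python_implementation())
```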

But this is worth mentioning: indexing costs. The difference will likely be in the indexing, so remove the iteration and just index:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'iterable[123]'
10000000 loops, best of 3: 0.0397 usec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable[123]'
10000000 loops, best of 3: 0.0374 usec per loop

The difference seems small, but at least half of the cost is overhead:

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable; 123'
100000000 loops, best of 3: 0.0173 usec per loop

so the speed difference is sufficient to decide to blame it. I think.
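
The arithmetic behind that judgement, worked through with the three timings quoted above. It assumes the baseline measured on the list ('iterable; 123') applies equally to the string case, which was not measured separately.

```python
string_index = 0.0397  # usec, 'iterable[123]' on the string
list_index = 0.0374    # usec, 'iterable[123]' on the list
baseline = 0.0173      # usec, 'iterable; 123' (no indexing at all)

# Subtract the shared overhead to isolate the indexing itself.
string_cost = string_index - baseline
list_cost = list_index - baseline

print("string indexing alone: {:.4f} usec".format(string_cost))
print("list indexing alone:   {:.4f} usec".format(list_cost))
print("ratio: {:.2f}".format(string_cost / list_cost))
```

With the overhead removed, string indexing comes out roughly 11% slower than list indexing in these measurements, compared with about 6% on the raw numbers.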

So why is indexing a list so much faster?

Well, I’ll come back to you on that, but my guess is that it’s down to the check for interned strings (or cached characters if it’s a separate mechanism). This will be less fast than optimal. But I’ll go check the source (although I’m not comfortable in C…) :).

So here’s the source:

static PyObject *
unicode_getitem(PyObject *self, Py_ssize_t index)
{
    void *data;
    enum PyUnicode_Kind kind;
    Py_UCS4 ch;
    PyObject *res;

    if (!PyUnicode_Check(self) || PyUnicode_READY(self) == -1) {
        return NULL;
    }
    if (index < 0 || index >= PyUnicode_GET_LENGTH(self)) {
        PyErr_SetString(PyExc_IndexError, "string index out of range");
        return NULL;
    }
    kind = PyUnicode_KIND(self);
    data = PyUnicode_DATA(self);
    ch = PyUnicode_READ(kind, data, index);
    if (ch < 256)
        return get_latin1_char(ch);

    res = PyUnicode_New(1, ch);
    if (res == NULL)
        return NULL;
    kind = PyUnicode_KIND(res);
    data = PyUnicode_DATA(res);
    PyUnicode_WRITE(kind, data, 0, ch);
    assert(_PyUnicode_CheckConsistency(res, 1));
    return res;
}

Walking from the top, we’ll have some checks. These are boring. Then some assigns, which should also be boring. The first interesting line is

ch = PyUnicode_READ(kind, data, index);

but we’d hope that is fast, as we’re reading from a contiguous C array by indexing it. The result, ch, will be less than 256 so we’ll return the cached character in get_latin1_char(ch).

So we’ll run (dropping the first checks)

kind = PyUnicode_KIND(self);
data = PyUnicode_DATA(self);
ch = PyUnicode_READ(kind, data, index);
return get_latin1_char(ch);

Where PyUnicode_KIND is


#define PyUnicode_KIND(op) \
    (assert(PyUnicode_Check(op)), \
     assert(PyUnicode_IS_READY(op)),            \
     ((PyASCIIObject *)(op))->state.kind)

(which is boring because asserts are compiled out of non-debug builds [so I can check that they’re fast] and ((PyASCIIObject *)(op))->state.kind is (I think) an indirection and a C-level cast);

#define PyUnicode_DATA(op) \
    (assert(PyUnicode_Check(op)), \
     PyUnicode_IS_COMPACT(op) ? _PyUnicode_COMPACT_DATA(op) :   \
     _PyUnicode_NONCOMPACT_DATA(op))

(which is also boring for similar reasons, assuming the macros (Something_CAPITALIZED) are all fast),

#define PyUnicode_READ(kind, data, index) \
    ((Py_UCS4) \
    ((kind) == PyUnicode_1BYTE_KIND ? \
        ((const Py_UCS1 *)(data))[(index)] : \
        ((kind) == PyUnicode_2BYTE_KIND ? \
            ((const Py_UCS2 *)(data))[(index)] : \
            ((const Py_UCS4 *)(data))[(index)] \
        ) \
    ))

(which involves indexes but really isn’t slow at all) and

static PyObject*
get_latin1_char(unsigned char ch)
{
    PyObject *unicode = unicode_latin1[ch];
    if (!unicode) {
        unicode = PyUnicode_New(1, ch);
        if (!unicode)
            return NULL;
        PyUnicode_1BYTE_DATA(unicode)[0] = ch;
        assert(_PyUnicode_CheckConsistency(unicode, 1));
        unicode_latin1[ch] = unicode;
    }
    return unicode;
}

Which confirms my suspicion that:

  • This is cached:

    PyObject *unicode = unicode_latin1[ch];
  • This should be fast. The if (!unicode) is not run, so it’s literally equivalent in this case to

    PyObject *unicode = unicode_latin1[ch];
    return unicode;

Honestly, after testing that the asserts are fast (by disabling them [I think it works on the C-level asserts…]), the only plausibly-slow parts are:

PyUnicode_IS_COMPACT(op)
_PyUnicode_COMPACT_DATA(op)
_PyUnicode_NONCOMPACT_DATA(op)

Which are:

#define PyUnicode_IS_COMPACT(op) \
    (((PyASCIIObject*)(op))->state.compact)

(fast, as before),

#define _PyUnicode_COMPACT_DATA(op)                     \
    (PyUnicode_IS_ASCII(op) ?                   \
     ((void*)((PyASCIIObject*)(op) + 1)) :              \
     ((void*)((PyCompactUnicodeObject*)(op) + 1)))

(fast if the macro IS_ASCII is fast), and

#define _PyUnicode_NONCOMPACT_DATA(op)                  \
    (assert(((PyUnicodeObject*)(op))->data.any),        \
     ((((PyUnicodeObject *)(op))->data.any)))

(also fast as it’s an assert plus an indirection plus a cast).

So we’re down (the rabbit hole) to:

PyUnicode_IS_ASCII(op)

which is

#define PyUnicode_IS_ASCII(op)                   \
    (assert(PyUnicode_Check(op)),                \
     assert(PyUnicode_IS_READY(op)),             \
     ((PyASCIIObject*)op)->state.ascii)

Hmm… that seems fast too…

Well, OK, but let’s compare it to PyList_GetItem. (Yeah, thanks Tim Peters for giving me more work to do :P.)

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
{
    if (!PyList_Check(op)) {
        return NULL;
    }
    if (i < 0 || i >= Py_SIZE(op)) {
        if (indexerr == NULL) {
            indexerr = PyUnicode_FromString(
                "list index out of range");
            if (indexerr == NULL)
                return NULL;
        }
        PyErr_SetObject(PyExc_IndexError, indexerr);
        return NULL;
    }
    return ((PyListObject *)op) -> ob_item[i];
}

We can see that on non-error cases this is just going to run:

((PyListObject *)op) -> ob_item[i]

Where PyList_Check is

#define PyList_Check(op) \
     PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LIST_SUBCLASS)

(TABS! TABS!!!) (issue21587) That got fixed and merged in 5 minutes. Like… yeah. Damn. They put Skeet to shame.

#define Py_SIZE(ob)             (((PyVarObject*)(ob))->ob_size)

#define PyType_FastSubclass(t,f)  PyType_HasFeature(t,f)

#ifdef Py_LIMITED_API
#define PyType_HasFeature(t,f)  ((PyType_GetFlags(t) & (f)) != 0)
#else
#define PyType_HasFeature(t,f)  (((t)->tp_flags & (f)) != 0)
#endif

So this is normally really trivial (two indirections and a couple of boolean checks) unless Py_LIMITED_API is on, in which case… ???

Then there’s the indexing and a cast (((PyListObject *)op) -> ob_item[i]) and we’re done.

So there are definitely fewer checks for lists, and the small speed differences certainly imply that it could be relevant.

I think in general, there’s just more type-checking and indirection (->) for Unicode. It seems I’m missing a point, but what?

Answer 1



When you iterate over most container objects (lists, tuples, dicts, …), the iterator delivers the objects in the container.

But when you iterate over a string, a new object has to be created for each character delivered – a string is not “a container” in the same sense a list is a container. The individual characters in a string don’t exist as distinct objects before iteration creates those objects.
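As a rough illustration of that difference, the sketch below iterates over a list and over a string and times both with `timeit`. The variable names are made up for this example, and the timings depend on the machine and Python version, so no numbers are claimed here.

```python
import timeit

letters_list = ["a", "b", "c"]
letters_str = "abc"

# Iterating a list hands back the very objects already stored in it.
from_list = [x for x in letters_list]
same_objects = all(a is b for a, b in zip(from_list, letters_list))

# Iterating a string yields one-character str objects supplied by the iterator.
from_str = [x for x in letters_str]

# Timings vary by machine and interpreter, so only print them.
t_list = timeit.timeit("[x for x in seq]", globals={"seq": letters_list}, number=100_000)
t_str = timeit.timeit("[x for x in seq]", globals={"seq": letters_str}, number=100_000)
print(f"list: {t_list:.4f}s  str: {t_str:.4f}s")
```

Both comprehensions produce equal lists; only the way the items are obtained differs.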

Answer 2




You could be incurring an overhead for creating the iterator for the string, whereas the array already contains an iterator upon instantiation.


>>> from timeit import timeit
>>> timeit("[x for x in ['a','b','c']]")
>>> timeit("[x for x in 'abc']")

This was run using 2.7, but on my MacBook Pro i7. This could be the result of a system configuration difference.







I’m looking for documents that describe in detail how Python garbage collection works.

I’m interested in what is done in each step. What objects are in these 3 collections? What kinds of objects are deleted in each step? What algorithm is used to find reference cycles?

Background: I’m implementing some searches that have to finish in a small amount of time. When the garbage collector starts collecting the oldest generation, it is “much” slower than in other cases, and the searches take longer than intended. I’m looking for a way to predict when it will collect the oldest generation and how long that will take.

It is easy to predict when it will collect the oldest generation with get_count() and get_threshold(). That can also be manipulated with set_threshold(). But I don’t see an easy way to decide whether it is better to force a collect() or to wait for the scheduled collection.
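For reference, a minimal sketch of the `gc` calls mentioned above. The helper `time_full_collection` and the "collect first, then pause the collector during the search" policy are assumptions made for illustration, not a recommendation from the question; the `sum(...)` line merely stands in for the real search.

```python
import gc
import time

def time_full_collection():
    """Force a full collection; return (elapsed seconds, unreachable objects found)."""
    start = time.perf_counter()
    unreachable = gc.collect()  # with no argument this is a full collection
    return time.perf_counter() - start, unreachable

thresholds = gc.get_threshold()  # one threshold per generation
counts = gc.get_count()          # current counters, one per generation
print("thresholds:", thresholds, "counts:", counts)

# Hypothetical policy: pay for the full collection at a quiet moment,
# then keep the automatic collector off while the search runs.
elapsed, unreachable = time_full_collection()
gc.disable()
try:
    result = sum(range(1000))  # stand-in for the time-critical search
finally:
    gc.enable()

print(f"full collection took {elapsed:.6f}s, found {unreachable} unreachable objects")
```

Measuring the forced collection in the real application is probably the only way to learn how long it takes there, since the cost depends on how many tracked objects are alive.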

Answer 0




There’s no definitive resource on how Python does its garbage collection (other than the source code itself), but those 3 links should give you a pretty good idea.


The source is pretty helpful. How much you get out of it depends on how well you read C, but the comments are very helpful. Skip down to the collect() function; the comments explain the process well (albeit in very technical terms).

Should __init__() call the parent class's __init__()?





In Objective-C, I’m used to this construct:

- (id)init {
    if (self = [super init]) {
        // init class
    }
    return self;
}

Should Python also call the parent class’s implementation for __init__?

class NewClass(SomeOtherClass):
    def __init__(self):
        # init class
        pass

Is this also true/false for __new__() and __del__()?

Edit: There’s a very similar question: Inheritance and Overriding __init__ in Python

Answer 0





In Python, calling the super-class’ __init__ is optional. If you call it, it is then also optional whether to use the super identifier, or whether to explicitly name the super class:
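A minimal sketch of both spellings, plus the case where the call is skipped. The classes `Base`, `ChildSuper`, `ChildExplicit`, and `ChildSkips` are hypothetical names used only for this illustration.

```python
class Base:
    def __init__(self):
        self.ready = True

class ChildSuper(Base):
    def __init__(self):
        super().__init__()      # Python 3 spelling; Python 2 needs super(ChildSuper, self)
        self.name = "super"

class ChildExplicit(Base):
    def __init__(self):
        Base.__init__(self)     # name the superclass explicitly
        self.name = "explicit"

class ChildSkips(Base):
    def __init__(self):
        self.name = "skips"     # allowed, but Base.__init__ never runs

print(ChildSuper().ready, ChildExplicit().ready, hasattr(ChildSkips(), "ready"))
```

The first two children end up with the `ready` attribute set by `Base`; the third does not, because nothing ran the base initializer.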


In case of object, calling the super method is not strictly necessary, since the super method is empty. Same for __del__.

On the other hand, for __new__, you should indeed call the super method, and use its return as the newly-created object – unless you explicitly want to return something different.
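As a sketch of the `__new__` case, assuming a hypothetical `Point` class: the override asks the superclass for the new object and returns it, so that `__init__` then runs on that instance.

```python
class Point:
    def __new__(cls, x, y):
        # Use the object returned by the superclass as the new instance.
        instance = super().__new__(cls)
        instance.created_by_new = True
        return instance

    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(1, 2)
print(p.created_by_new, p.x, p.y)
```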

Answer 1


If you need something from super’s __init__ to be done in addition to what is being done in the current class’s __init__, you must call it yourself, since that will not happen automatically. But if you don’t need anything from super’s __init__, no need to call it. Example:

>>> class C(object):
        def __init__(self):
            self.b = 1

>>> class D(C):
        def __init__(self):
            super().__init__() # in Python 2 use super(D, self).__init__()
            self.a = 1

>>> class E(C):
        def __init__(self):
            self.a = 1

>>> d = D()
>>> d.a
1
>>> d.b  # This works because of the call to super's init
1
>>> e = E()
>>> e.a
1
>>> e.b  # This is going to fail since nothing in E initializes b...
Traceback (most recent call last):
  File "<pyshell#70>", line 1, in <module>
    e.b  # This is going to fail since nothing in E initializes b...
AttributeError: 'E' object has no attribute 'b'

__del__ works the same way (but be wary of relying on __del__ for finalization; consider doing it via the with statement instead).
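One way the with-statement alternative could look, as a sketch. `Resource` and its `close` method are hypothetical; the point is only that cleanup happens at a known moment instead of whenever `__del__` happens to run.

```python
class Resource:
    """Hypothetical resource that must be released explicitly."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()    # runs when the block exits, even after an exception
        return False    # do not swallow exceptions

with Resource() as res:
    was_open_inside = res.is_open

print(was_open_inside, res.is_open)
```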

I rarely use __new__. I do all the initialization in __init__.

Answer 2


In Anon’s answer:
“If you need something from super’s __init__ to be done in addition to what is being done in the current class’s __init__, you must call it yourself, since that will not happen automatically”

It’s incredible: he is wording exactly the contrary of the principle of inheritance.

It is not that “something from super’s __init__ (…) will not happen automatically”; it is that it WOULD happen automatically, but it doesn’t, because the base class’s __init__ is overridden by the definition of the derived class’s __init__.

So then, WHY define a derived class’s __init__, since it overrides what one is aiming at when resorting to inheritance?

It’s because one needs to define something that is NOT done in the base class’s __init__, and the only way to obtain that is to put its execution in the derived class’s __init__ function.
In other words, one needs something in addition to what the base class’s __init__ would do automatically if it weren’t overridden.
NOT the contrary.

Then the problem is that the desired instructions in the base class’s __init__ are no longer run at instantiation. To offset this, something special is required: explicitly calling the base class’s __init__, in order to KEEP, NOT TO ADD, the initialization performed by the base class’s __init__. That’s exactly what the official doc says:

An overriding method in a derived class may in fact want to extend rather than simply replace the base class method of the same name. There is a simple way to call the base class method directly: just call BaseClassName.methodname(self, arguments).

That’s the whole story:

  • when the aim is to KEEP the initialization performed by the base class, that is, pure inheritance, nothing special is needed; one must simply avoid defining an __init__ function in the derived class

  • when the aim is to REPLACE the initialization performed by the base class, __init__ must be defined in the derived class

  • when the aim is to ADD processing to the initialization performed by the base class, a derived-class __init__ must be defined that includes an explicit call to the base class’s __init__
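The three cases above can be sketched as follows. The class names `Base`, `Keep`, `Replace`, and `Add` are hypothetical and chosen to mirror the bullets.

```python
class Base:
    def __init__(self):
        self.base_ready = True

class Keep(Base):
    pass                        # no __init__ defined: Base.__init__ runs automatically

class Replace(Base):
    def __init__(self):
        self.own_ready = True   # Base.__init__ is overridden and never runs

class Add(Base):
    def __init__(self):
        Base.__init__(self)     # explicit call keeps the base initialization
        self.own_ready = True   # and this line adds to it

for cls in (Keep, Replace, Add):
    obj = cls()
    print(cls.__name__, hasattr(obj, "base_ready"), hasattr(obj, "own_ready"))
```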

What I find astonishing in Anon’s post is not only that he expresses the contrary of inheritance theory, but that 5 people passing by upvoted it without turning a hair, and moreover that nobody reacted in 2 years in a thread whose interesting subject must be read relatively often.

Answer 3




Edit: (after the code change)
There is no way for us to tell you whether or not you need to call your parent’s __init__ (or any other function). Inheritance obviously would work without such a call. It all depends on the logic of your code: for example, if all your initialization is done in the parent class, you can just skip the child-class __init__ altogether.

Consider the following example:

>>> class A:
    def __init__(self, val):
        self.a = val

>>> class B(A):
    pass

>>> class C(A):
    def __init__(self, val):
        A.__init__(self, val)
        self.a += val

>>> A(4).a
4
>>> B(5).a
5
>>> C(6).a
12

Answer 4




There’s no hard and fast rule. The documentation for a class should indicate whether subclasses should call the superclass method. Sometimes you want to completely replace superclass behaviour, and at other times augment it – i.e. call your own code before and/or after a superclass call.

Update: The same basic logic applies to any method call. Constructors sometimes need special consideration (as they often set up state which determines behaviour) and destructors because they parallel constructors (e.g. in the allocation of resources, e.g. database connections). But the same might apply, say, to the render() method of a widget.
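To make the render() remark concrete, here is a small sketch with hypothetical `Widget` classes: one subclass augments the superclass method with code before and after the call, the other replaces it outright.

```python
class Widget:
    def render(self):
        return ["<widget>"]

class BorderedWidget(Widget):
    def render(self):
        lines = ["<border>"]          # own code before the superclass call
        lines += super().render()     # augment rather than replace
        lines.append("</border>")     # own code after the superclass call
        return lines

class BlankWidget(Widget):
    def render(self):
        return []                     # completely replaces the superclass behaviour

print(BorderedWidget().render())
print(BlankWidget().render())
```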

Further update: What’s the OPP? Do you mean OOP? No – a subclass often needs to know something about the design of the superclass. Not the internal implementation details – but the basic contract that the superclass has with its clients (using classes). This does not violate OOP principles in any way. That’s why protected is a valid concept in OOP in general (though not, of course, in Python).

Answer 5


IMO, you should call it. If your superclass is object, you should not, but in other cases I think it is exceptional not to call it. As already answered by others, it is very convenient if your class doesn’t even have to override __init__ itself, for example when it has no (additional) internal state to initialize.

Answer 6






Yes, you should always call the base class __init__ explicitly as a good coding practice. Forgetting to do this can cause subtle issues or run-time errors. This is true even if __init__ doesn’t take any parameters. This is unlike other languages, where the compiler would implicitly call the base class constructor for you. Python doesn’t do that!

The main reason for always calling the base class __init__ is that the base class typically creates member variables and initializes them to defaults. So if you don’t call the base class __init__, none of that code is executed and you end up with an instance that lacks the base class’s member variables.


class Base:
  def __init__(self):
    print('base init')

class Derived1(Base):
  def __init__(self):
    print('derived1 init')

class Derived2(Base):
  def __init__(self):
    super(Derived2, self).__init__()
    print('derived2 init')

print('Creating Derived1...')
d1 = Derived1()
print('Creating Derived2...')
d2 = Derived2()

This prints:

Creating Derived1...
derived1 init
Creating Derived2...
base init
derived2 init
