如何在Pandas DataFrame中将True / False映射到1/0?

问题:如何在Pandas DataFrame中将True / False映射到1/0?

我在python pandas DataFrame中有一列具有布尔True / False值的列,但是对于进一步的计算,我需要1/0表示形式。有没有一种快速的方法来做到这一点?

I have a column in python pandas DataFrame that has boolean True/False values, but for further calculations I need 1/0 representation. Is there a quick pandas/numpy way to do that?


回答 0

一种将布尔值的单列转换为整数1或0的列的简洁方法:

df["somecolumn"] = df["somecolumn"].astype(int)

A succinct way to convert a single column of boolean values to a column of integers 1 or 0:

df["somecolumn"] = df["somecolumn"].astype(int)

回答 1

只需将您的数据框乘以1(int)

[1]: data = pd.DataFrame([[True, False, True], [False, False, True]])
[2]: print data
          0      1     2
     0   True  False  True
     1   False False  True

[3]: print data*1
         0  1  2
     0   1  0  1
     1   0  0  1

Just multiply your Dataframe by 1 (int)

[1]: data = pd.DataFrame([[True, False, True], [False, False, True]])
[2]: print data
          0      1     2
     0   True  False  True
     1   False False  True

[3]: print data*1
         0  1  2
     0   1  0  1
     1   0  0  1

回答 2

True1在Python,同样False0*

>>> True == 1
True
>>> False == 0
True

通过将它们视为数字,就可以对它们执行所需的任何操作,因为它们数字:

>>> issubclass(bool, int)
True
>>> True * 5
5

因此,回答您的问题,无需任何工作-您已经有了所需的东西。

*请注意,我使用的英文单词,而不是Python关键字isTrue与任何random都不是同一对象1

True is 1 in Python, and likewise False is 0*:

>>> True == 1
True
>>> False == 0
True

You should be able to perform any operations you want on them by just treating them as though they were numbers, as they are numbers:

>>> issubclass(bool, int)
True
>>> True * 5
5

So to answer your question, no work necessary – you already have what you are looking for.

* Note I use is as an English word, not the Python keyword isTrue will not be the same object as any random 1.


回答 3

您也可以直接在框架上执行此操作

In [104]: df = DataFrame(dict(A = True, B = False),index=range(3))

In [105]: df
Out[105]: 
      A      B
0  True  False
1  True  False
2  True  False

In [106]: df.dtypes
Out[106]: 
A    bool
B    bool
dtype: object

In [107]: df.astype(int)
Out[107]: 
   A  B
0  1  0
1  1  0
2  1  0

In [108]: df.astype(int).dtypes
Out[108]: 
A    int64
B    int64
dtype: object

You also can do this directly on Frames

In [104]: df = DataFrame(dict(A = True, B = False),index=range(3))

In [105]: df
Out[105]: 
      A      B
0  True  False
1  True  False
2  True  False

In [106]: df.dtypes
Out[106]: 
A    bool
B    bool
dtype: object

In [107]: df.astype(int)
Out[107]: 
   A  B
0  1  0
1  1  0
2  1  0

In [108]: df.astype(int).dtypes
Out[108]: 
A    int64
B    int64
dtype: object

回答 4

您可以对数据框使用转换:

df = pd.DataFrame(my_data condition)

在1/0中转换真/假

df = df*1

You can use a transformation for your data frame:

df = pd.DataFrame(my_data condition)

transforming True/False in 1/0

df = df*1

回答 5

使用Series.view的转换布尔为整数:

df["somecolumn"] = df["somecolumn"].view('i1')

Use Series.view for convert boolean to integers:

df["somecolumn"] = df["somecolumn"].view('i1')

哪里是logging.config.dictConfig的完整示例?

问题:哪里是logging.config.dictConfig的完整示例?

我想使用dictConfig,但是文档有点抽象。在哪里可以找到与结合使用的字典的具体,可复制和粘贴的示例dictConfig

I’d like to use dictConfig, but the documentation is a little bit abstract. Where can I find a concrete, copy+paste-able example of the dictionary used with dictConfig?


回答 0

怎么样

LOGGING_CONFIG = { 
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': { 
        'standard': { 
            'format': '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
        },
    },
    'handlers': { 
        'default': { 
            'level': 'INFO',
            'formatter': 'standard',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',  # Default is stderr
        },
    },
    'loggers': { 
        '': {  # root logger
            'handlers': ['default'],
            'level': 'WARNING',
            'propagate': False
        },
        'my.packg': { 
            'handlers': ['default'],
            'level': 'INFO',
            'propagate': False
        },
        '__main__': {  # if __name__ == '__main__'
            'handlers': ['default'],
            'level': 'DEBUG',
            'propagate': False
        },
    } 
}

用法:

# Run once at startup:
logging.config.dictConfig(LOGGING_CONFIG)

# Include in each module:
log = logging.getLogger(__name__)
log.debug("Logging is configured.")

如果您从第三方软件包中看到太多日志,请确保在导入第三方软件包logging.config.dictConfig(LOGGING_CONFIG) 之前使用来运行此配置。

参考:https : //docs.python.org/3/library/logging.config.html#configuration-dictionary-schema

How about here! The corresponding documentation reference is configuration-dictionary-schema.

LOGGING_CONFIG = { 
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': { 
        'standard': { 
            'format': '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
        },
    },
    'handlers': { 
        'default': { 
            'level': 'INFO',
            'formatter': 'standard',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',  # Default is stderr
        },
    },
    'loggers': { 
        '': {  # root logger
            'handlers': ['default'],
            'level': 'WARNING',
            'propagate': False
        },
        'my.packg': { 
            'handlers': ['default'],
            'level': 'INFO',
            'propagate': False
        },
        '__main__': {  # if __name__ == '__main__'
            'handlers': ['default'],
            'level': 'DEBUG',
            'propagate': False
        },
    } 
}

Usage:

# Run once at startup:
logging.config.dictConfig(LOGGING_CONFIG)

# Include in each module:
log = logging.getLogger(__name__)
log.debug("Logging is configured.")

In case you see too many logs from third-party packages, be sure to run this config using logging.config.dictConfig(LOGGING_CONFIG) before the third-party packages are imported.

To add additional custom info to each log message using a logging filter, consider this answer.


回答 1

接受的答案很好!但是,如果可以从不太复杂的东西开始呢?日志记录模块是非常强大的功能,并且文档对于新手来说有点不知所措。但是从一开始,您就不需要配置格式化程序和处理程序。您可以在确定需要的内容时添加它。

例如:

import logging.config

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'loggers': {
        '': {
            'level': 'INFO',
        },
        'another.module': {
            'level': 'DEBUG',
        },
    }
}

logging.config.dictConfig(DEFAULT_LOGGING)

logging.info('Hello, log')

The accepted answer is nice! But what if one could begin with something less complex? The logging module is very powerful thing and the documentation is kind of a little bit overwhelming especially for novice. But for the beginning you no need to configure formatters and handlers. You can add it when you figure out what you want.

For example:

import logging.config

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'loggers': {
        '': {
            'level': 'INFO',
        },
        'another.module': {
            'level': 'DEBUG',
        },
    }
}

logging.config.dictConfig(DEFAULT_LOGGING)

logging.info('Hello, log')

回答 2

流处理程序,文件处理程序,旋转文件处理程序和SMTP处理程序的示例

from logging.config import dictConfig

LOGGING_CONFIG = {
    'version': 1,
    'loggers': {
        '': {  # root logger
            'level': 'NOTSET',
            'handlers': ['debug_console_handler', 'info_rotating_file_handler', 'error_file_handler', 'critical_mail_handler'],
        },
        'my.package': { 
            'level': 'WARNING',
            'propagate': False,
            'handlers': ['info_rotating_file_handler', 'error_file_handler' ],
        },
    },
    'handlers': {
        'debug_console_handler': {
            'level': 'DEBUG',
            'formatter': 'info',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
        },
        'info_rotating_file_handler': {
            'level': 'INFO',
            'formatter': 'info',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'info.log',
            'mode': 'a',
            'maxBytes': 1048576,
            'backupCount': 10
        },
        'error_file_handler': {
            'level': 'WARNING',
            'formatter': 'error',
            'class': 'logging.FileHandler',
            'filename': 'error.log',
            'mode': 'a',
        },
        'critical_mail_handler': {
            'level': 'CRITICAL',
            'formatter': 'error',
            'class': 'logging.handlers.SMTPHandler',
            'mailhost' : 'localhost',
            'fromaddr': 'monitoring@domain.com',
            'toaddrs': ['dev@domain.com', 'qa@domain.com'],
            'subject': 'Critical error with application name'
        }
    },
    'formatters': {
        'info': {
            'format': '%(asctime)s-%(levelname)s-%(name)s::%(module)s|%(lineno)s:: %(message)s'
        },
        'error': {
            'format': '%(asctime)s-%(levelname)s-%(name)s-%(process)d::%(module)s|%(lineno)s:: %(message)s'
        },
    },

}

dictConfig(LOGGING_CONFIG)

Example with Stream Handler, File Handler, Rotating File Handler and SMTP Handler

from logging.config import dictConfig

LOGGING_CONFIG = {
    'version': 1,
    'loggers': {
        '': {  # root logger
            'level': 'NOTSET',
            'handlers': ['debug_console_handler', 'info_rotating_file_handler', 'error_file_handler', 'critical_mail_handler'],
        },
        'my.package': { 
            'level': 'WARNING',
            'propagate': False,
            'handlers': ['info_rotating_file_handler', 'error_file_handler' ],
        },
    },
    'handlers': {
        'debug_console_handler': {
            'level': 'DEBUG',
            'formatter': 'info',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
        },
        'info_rotating_file_handler': {
            'level': 'INFO',
            'formatter': 'info',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'info.log',
            'mode': 'a',
            'maxBytes': 1048576,
            'backupCount': 10
        },
        'error_file_handler': {
            'level': 'WARNING',
            'formatter': 'error',
            'class': 'logging.FileHandler',
            'filename': 'error.log',
            'mode': 'a',
        },
        'critical_mail_handler': {
            'level': 'CRITICAL',
            'formatter': 'error',
            'class': 'logging.handlers.SMTPHandler',
            'mailhost' : 'localhost',
            'fromaddr': 'monitoring@domain.com',
            'toaddrs': ['dev@domain.com', 'qa@domain.com'],
            'subject': 'Critical error with application name'
        }
    },
    'formatters': {
        'info': {
            'format': '%(asctime)s-%(levelname)s-%(name)s::%(module)s|%(lineno)s:: %(message)s'
        },
        'error': {
            'format': '%(asctime)s-%(levelname)s-%(name)s-%(process)d::%(module)s|%(lineno)s:: %(message)s'
        },
    },

}

dictConfig(LOGGING_CONFIG)

回答 3

我在下面找到了Django v1.11.15默认配置,希望对您有所帮助

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        },
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
    },
    'formatters': {
        'django.server': {
            '()': 'django.utils.log.ServerFormatter',
            'format': '[%(server_time)s] %(message)s',
        }
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        },
        'django.server': {
            'level': 'INFO',
            'class': 'logging.StreamHandler',
            'formatter': 'django.server',
        },
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        'django': {
            'handlers': ['console', 'mail_admins'],
            'level': 'INFO',
        },
        'django.server': {
            'handlers': ['django.server'],
            'level': 'INFO',
            'propagate': False,
        },
    }
}

I found Django v1.11.15 default config below, hope it helps

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        },
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
    },
    'formatters': {
        'django.server': {
            '()': 'django.utils.log.ServerFormatter',
            'format': '[%(server_time)s] %(message)s',
        }
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        },
        'django.server': {
            'level': 'INFO',
            'class': 'logging.StreamHandler',
            'formatter': 'django.server',
        },
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        'django': {
            'handlers': ['console', 'mail_admins'],
            'level': 'INFO',
        },
        'django.server': {
            'handlers': ['django.server'],
            'level': 'INFO',
            'propagate': False,
        },
    }
}

回答 4

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import logging
import logging.handlers
from logging.config import dictConfig

logger = logging.getLogger(__name__)

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
}
def configure_logging(logfile_path):
    """
    Initialize logging defaults for Project.

    :param logfile_path: logfile used to the logfile
    :type logfile_path: string

    This function does:

    - Assign INFO and DEBUG level to logger file handler and console handler

    """
    dictConfig(DEFAULT_LOGGING)

    default_formatter = logging.Formatter(
        "[%(asctime)s] [%(levelname)s] [%(name)s] [%(funcName)s():%(lineno)s] [PID:%(process)d TID:%(thread)d] %(message)s",
        "%d/%m/%Y %H:%M:%S")

    file_handler = logging.handlers.RotatingFileHandler(logfile_path, maxBytes=10485760,backupCount=300, encoding='utf-8')
    file_handler.setLevel(logging.INFO)

    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.DEBUG)

    file_handler.setFormatter(default_formatter)
    console_handler.setFormatter(default_formatter)

    logging.root.setLevel(logging.DEBUG)
    logging.root.addHandler(file_handler)
    logging.root.addHandler(console_handler)



[31/10/2015 22:00:33] [DEBUG] [yourmodulename] [yourfunction_name():9] [PID:61314 TID:140735248744448] this is logger infomation from hello module
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import logging
import logging.handlers
from logging.config import dictConfig

logger = logging.getLogger(__name__)

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
}
def configure_logging(logfile_path):
    """
    Initialize logging defaults for Project.

    :param logfile_path: logfile used to the logfile
    :type logfile_path: string

    This function does:

    - Assign INFO and DEBUG level to logger file handler and console handler

    """
    dictConfig(DEFAULT_LOGGING)

    default_formatter = logging.Formatter(
        "[%(asctime)s] [%(levelname)s] [%(name)s] [%(funcName)s():%(lineno)s] [PID:%(process)d TID:%(thread)d] %(message)s",
        "%d/%m/%Y %H:%M:%S")

    file_handler = logging.handlers.RotatingFileHandler(logfile_path, maxBytes=10485760,backupCount=300, encoding='utf-8')
    file_handler.setLevel(logging.INFO)

    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.DEBUG)

    file_handler.setFormatter(default_formatter)
    console_handler.setFormatter(default_formatter)

    logging.root.setLevel(logging.DEBUG)
    logging.root.addHandler(file_handler)
    logging.root.addHandler(console_handler)



[31/10/2015 22:00:33] [DEBUG] [yourmodulename] [yourfunction_name():9] [PID:61314 TID:140735248744448] this is logger infomation from hello module

不区分大小写的列表排序,而不降低结果大小?

问题:不区分大小写的列表排序,而不降低结果大小?

我有一个这样的字符串列表:

['Aden', 'abel']

我要对项目排序,不区分大小写。所以我想得到:

['abel', 'Aden']

但与sorted()或相反list.sort(),因为大写字母先于小写字母。

我如何忽略这种情况?我已经看到了涉及降低所有列表项的解决方案,但是我不想更改列表项的大小写。

I have a list of strings like this:

['Aden', 'abel']

I want to sort the items, case-insensitive. So I want to get:

['abel', 'Aden']

But I get the opposite with sorted() or list.sort(), because uppercase appears before lowercase.

How can I ignore the case? I’ve seen solutions which involves lowercasing all list items, but I don’t want to change the case of the list items.


回答 0

在Python 3.3+中,有str.casefold一种专为无条件匹配而设计的方法:

sorted_list = sorted(unsorted_list, key=str.casefold)

在Python 2中使用lower()

sorted_list = sorted(unsorted_list, key=lambda s: s.lower())

它适用于普通字符串和unicode字符串,因为它们都有lower方法。

在Python 2中,它可以将普通字符串和unicode字符串混合使用,因为这两种类型的值可以相互比较。但是,Python 3并不是这样工作的:您无法比较字节字符串和unicode字符串,因此在Python 3中,您应该做明智的事情,并且只能对一种类型的字符串列表进行排序。

>>> lst = ['Aden', u'abe1']
>>> sorted(lst)
['Aden', u'abe1']
>>> sorted(lst, key=lambda s: s.lower())
[u'abe1', 'Aden']

In Python 3.3+ there is the str.casefold method that’s specifically designed for caseless matching:

sorted_list = sorted(unsorted_list, key=str.casefold)

In Python 2 use lower():

sorted_list = sorted(unsorted_list, key=lambda s: s.lower())

It works for both normal and unicode strings, since they both have a lower method.

In Python 2 it works for a mix of normal and unicode strings, since values of the two types can be compared with each other. Python 3 doesn’t work like that, though: you can’t compare a byte string and a unicode string, so in Python 3 you should do the sane thing and only sort lists of one type of string.

>>> lst = ['Aden', u'abe1']
>>> sorted(lst)
['Aden', u'abe1']
>>> sorted(lst, key=lambda s: s.lower())
[u'abe1', 'Aden']

回答 1

>>> x = ['Aden', 'abel']
>>> sorted(x, key=str.lower) # Or unicode.lower if all items are unicode
['abel', 'Aden']

在Python 3中str是unicode,但在Python 2中,您可以使用这种更通用的方法,该方法对str和都适用unicode

>>> sorted(x, key=lambda s: s.lower())
['abel', 'Aden']
>>> x = ['Aden', 'abel']
>>> sorted(x, key=str.lower) # Or unicode.lower if all items are unicode
['abel', 'Aden']

In Python 3 str is unicode but in Python 2 you can use this more general approach which works for both str and unicode:

>>> sorted(x, key=lambda s: s.lower())
['abel', 'Aden']

回答 2

您也可以尝试使用此方法对列表进行就地排序:

>>> x = ['Aden', 'abel']
>>> x.sort(key=lambda y: y.lower())
>>> x
['abel', 'Aden']

You can also try this to sort the list in-place:

>>> x = ['Aden', 'abel']
>>> x.sort(key=lambda y: y.lower())
>>> x
['abel', 'Aden']

回答 3

这在Python 3中有效,并且不涉及小写结果(!)。

values.sort(key=str.lower)

This works in Python 3 and does not involves lowercasing the result (!).

values.sort(key=str.lower)

回答 4

在python3中,您可以使用

list1.sort(key=lambda x: x.lower()) #Case In-sensitive             
list1.sort() #Case Sensitive

In python3 you can use

list1.sort(key=lambda x: x.lower()) #Case In-sensitive             
list1.sort() #Case Sensitive

回答 5

我是通过Python 3.3做到的:

 def sortCaseIns(lst):
    lst2 = [[x for x in range(0, 2)] for y in range(0, len(lst))]
    for i in range(0, len(lst)):
        lst2[i][0] = lst[i].lower()
        lst2[i][1] = lst[i]
    lst2.sort()
    for i in range(0, len(lst)):
        lst[i] = lst2[i][1]

然后,您可以调用此函数:

sortCaseIns(yourListToSort)

I did it this way for Python 3.3:

 def sortCaseIns(lst):
    lst2 = [[x for x in range(0, 2)] for y in range(0, len(lst))]
    for i in range(0, len(lst)):
        lst2[i][0] = lst[i].lower()
        lst2[i][1] = lst[i]
    lst2.sort()
    for i in range(0, len(lst)):
        lst[i] = lst2[i][1]

Then you just can call this function:

sortCaseIns(yourListToSort)

回答 6

不区分大小写的排序,在Python 2 OR 3中对字符串进行排序(在Python 2.7.17和Python 3.6.9中测试):

>>> x = ["aa", "A", "bb", "B", "cc", "C"]
>>> x.sort()
>>> x
['A', 'B', 'C', 'aa', 'bb', 'cc']
>>> x.sort(key=str.lower)           # <===== there it is!
>>> x
['A', 'aa', 'B', 'bb', 'C', 'cc']

关键是key=str.lower。这些命令只是这些命令的外观,以便于复制粘贴,因此您可以对其进行测试:

x = ["aa", "A", "bb", "B", "cc", "C"]
x.sort()
x
x.sort(key=str.lower)
x

请注意,但是,如果您的字符串是unicode字符串(如u'some string'),则仅在Python 2中(在这种情况下,在Python 3中不是),上述x.sort(key=str.lower)命令将失败并输出以下错误:

TypeError: descriptor 'lower' requires a 'str' object but received a 'unicode'

如果出现此错误,请升级到Python 3来处理unicode排序,或者先使用列表推导将unicode字符串转换为ASCII字符串,如下所示:

# for Python2, ensure all elements are ASCII (NOT unicode) strings first
x = [str(element) for element in x]  
# for Python2, this sort will only work on ASCII (NOT unicode) strings
x.sort(key=str.lower)

参考文献:

  1. https://docs.python.org/3/library/stdtypes.html#list.sort
  2. 将Unicode字符串转换为Python中的字符串(包含多余的符号)
  3. https://www.programiz.com/python-programming/list-comprehension

Case-insensitive sort, sorting the string in place, in Python 2 OR 3 (tested in Python 2.7.17 and Python 3.6.9):

>>> x = ["aa", "A", "bb", "B", "cc", "C"]
>>> x.sort()
>>> x
['A', 'B', 'C', 'aa', 'bb', 'cc']
>>> x.sort(key=str.lower)           # <===== there it is!
>>> x
['A', 'aa', 'B', 'bb', 'C', 'cc']

The key is key=str.lower. Here’s what those commands look like with just the commands, for easy copy-pasting so you can test them:

x = ["aa", "A", "bb", "B", "cc", "C"]
x.sort()
x
x.sort(key=str.lower)
x

Note that if your strings are unicode strings, however (like u'some string'), then in Python 2 only (NOT in Python 3 in this case) the above x.sort(key=str.lower) command will fail and output the following error:

TypeError: descriptor 'lower' requires a 'str' object but received a 'unicode'

If you get this error, then either upgrade to Python 3 where they handle unicode sorting, or convert your unicode strings to ASCII strings first, using a list comprehension, like this:

# for Python2, ensure all elements are ASCII (NOT unicode) strings first
x = [str(element) for element in x]  
# for Python2, this sort will only work on ASCII (NOT unicode) strings
x.sort(key=str.lower)

References:

  1. https://docs.python.org/3/library/stdtypes.html#list.sort
  2. Convert a Unicode string to a string in Python (containing extra symbols)
  3. https://www.programiz.com/python-programming/list-comprehension

回答 7

试试这个

def cSort(inlist, minisort=True):
    sortlist = []
    newlist = []
    sortdict = {}
    for entry in inlist:
        try:
            lentry = entry.lower()
        except AttributeError:
            sortlist.append(lentry)
        else:
            try:
                sortdict[lentry].append(entry)
            except KeyError:
                sortdict[lentry] = [entry]
                sortlist.append(lentry)

    sortlist.sort()
    for entry in sortlist:
        try:
            thislist = sortdict[entry]
            if minisort: thislist.sort()
            newlist = newlist + thislist
        except KeyError:
            newlist.append(entry)
    return newlist

lst = ['Aden', 'abel']
print cSort(lst)

输出量

['abel', 'Aden']

Try this

def cSort(inlist, minisort=True):
    sortlist = []
    newlist = []
    sortdict = {}
    for entry in inlist:
        try:
            lentry = entry.lower()
        except AttributeError:
            sortlist.append(lentry)
        else:
            try:
                sortdict[lentry].append(entry)
            except KeyError:
                sortdict[lentry] = [entry]
                sortlist.append(lentry)

    sortlist.sort()
    for entry in sortlist:
        try:
            thislist = sortdict[entry]
            if minisort: thislist.sort()
            newlist = newlist + thislist
        except KeyError:
            newlist.append(entry)
    return newlist

lst = ['Aden', 'abel']
print cSort(lst)

Output

['abel', 'Aden']


如何打印分组对象

问题:如何打印分组对象

我想打印与熊猫分组的结果。

我有一个数据框:

import pandas as pd
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})
print(df)

       A  B
0    one  0
1    one  1
2    two  2
3  three  3
4  three  4
5    one  5

按“ A”分组后进行打印时,我有以下内容:

print(df.groupby('A'))

<pandas.core.groupby.DataFrameGroupBy object at 0x05416E90>

如何打印分组的数据框?

如果我做:

print(df.groupby('A').head())

我获得的数据框好像没有分组一样:

             A  B
A                
one   0    one  0
      1    one  1
two   2    two  2
three 3  three  3
      4  three  4
one   5    one  5

我期待的是这样的:

             A  B
A                
one   0    one  0
      1    one  1
      5    one  5
two   2    two  2
three 3  three  3
      4  three  4

I want to print the result of grouping with Pandas.

I have a dataframe:

import pandas as pd
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})
print(df)

       A  B
0    one  0
1    one  1
2    two  2
3  three  3
4  three  4
5    one  5

When printing after grouping by ‘A’ I have the following:

print(df.groupby('A'))

<pandas.core.groupby.DataFrameGroupBy object at 0x05416E90>

How can I print the dataframe grouped?

If I do:

print(df.groupby('A').head())

I obtain the dataframe as if it was not grouped:

             A  B
A                
one   0    one  0
      1    one  1
two   2    two  2
three 3  three  3
      4  three  4
one   5    one  5

I was expecting something like:

             A  B
A                
one   0    one  0
      1    one  1
      5    one  5
two   2    two  2
three 3  three  3
      4  three  4

回答 0

只需做:

grouped_df = df.groupby('A')

for key, item in grouped_df:
    print(grouped_df.get_group(key), "\n\n")

这也可以

grouped_df = df.groupby('A')    
gb = grouped_df.groups

for key, values in gb.iteritems():
    print(df.ix[values], "\n\n")

对于选择性键分组:key_list_from_gb使用以下命令将所需的键插入,如下所示gb.keys()

gb = grouped_df.groups
gb.keys()

key_list_from_gb = [key1, key2, key3]

for key, values in gb.items():
    if key in key_list_from_gb:
        print(df.ix[values], "\n")

Simply do:

grouped_df = df.groupby('A')

for key, item in grouped_df:
    print(grouped_df.get_group(key), "\n\n")

This also works,

grouped_df = df.groupby('A')    
gb = grouped_df.groups

for key, values in gb.iteritems():
    print(df.ix[values], "\n\n")

For selective key grouping: Insert the keys you want inside the key_list_from_gb, in following, using gb.keys(): For Example,

gb = grouped_df.groups
gb.keys()

key_list_from_gb = [key1, key2, key3]

for key, values in gb.items():
    if key in key_list_from_gb:
        print(df.ix[values], "\n")

回答 1

如果您只是在寻找一种显示方式,可以使用describe():

grp = df.groupby['colName']
grp.describe()

这给您一个整洁的桌子。

If you’re simply looking for a way to display it, you could use describe():

grp = df.groupby['colName']
grp.describe()

This gives you a neat table.


回答 2

我确认了head()版本0.12和0.13之间的更改行为。在我看来,这似乎是个虫子。我创建了一个问题

但是groupby操作实际上并不返回按组排序的DataFrame。该.head()方法在这里有点误导-只是方便的功能,它使您可以重新检查df您分组的对象(在本例中为)。结果groupby是另一种对象,一个GroupBy对象。您必须applytransformfilter返回到DataFrame或Series。

如果您要做的只是按A列中的值排序,则应使用df.sort('A')

I confirmed that the behavior of head() changes between version 0.12 and 0.13. That looks like a bug to me. I created an issue.

But a groupby operation doesn’t actually return a DataFrame sorted by group. The .head() method is a little misleading here — it’s just a convenience feature to let you re-examine the object (in this case, df) that you grouped. The result of groupby is separate kind of object, a GroupBy object. You must apply, transform, or filter to get back to a DataFrame or Series.

If all you wanted to do was sort by the values in columns A, you should use df.sort('A').


回答 3

另一个简单的选择:

for name_of_the_group, group in grouped_dataframe:
   print (name_of_the_group)
   print (group)

Another simple alternative:

for name_of_the_group, group in grouped_dataframe:
   print (name_of_the_group)
   print (group)

回答 4

另外,其他简单的选择可能是:

gb = df.groupby("A")
gb.count() # or,
gb.get_group(your_key)

Also, other simple alternative could be:

gb = df.groupby("A")
gb.count() # or,
gb.get_group(your_key)

回答 5

除了以前的答案:

以你为例

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})

然后是简单的1行代码

df.groupby('A').apply(print)

In addition to previous answers:

Taking your example,

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})

Then simple 1 line code

df.groupby('A').apply(print)

回答 6

感谢Surya的深刻见解。我会清理他的解决方案,然后简单地执行以下操作:

for key, value in df.groupby('A'):
    print(key, value)

Thanks to Surya for good insights. I’d clean up his solution and simply do:

for key, value in df.groupby('A'):
    print(key, value)

回答 7

您不能直接通过print语句查看groupBy数据,但可以使用for循环遍历该组来查看,请尝试使用此代码查看数据中的组

group = df.groupby('A') #group variable contains groupby data
for A,A_df in group: # A is your column and A_df is group of one kind at a time
  print(A)
  print(A_df)

尝试将其作为分组结果后,您将获得输出

希望对您有所帮助

you cannot see the groupBy data directly by print statement but you can see by iterating over the group using for loop try this code to see the group by data

group = df.groupby('A') #group variable contains groupby data
for A,A_df in group: # A is your column and A_df is group of one kind at a time
  print(A)
  print(A_df)

you will get an output after trying this as a groupby result

I hope it helps


回答 8

在GroupBy对象上调用list()

print(list(df.groupby('A')))

给你:

[('one',      A  B
0  one  0
1  one  1
5  one  5), ('three',        A  B
3  three  3
4  three  4), ('two',      A  B
2  two  2)]

Call list() on the GroupBy object

print(list(df.groupby('A')))

gives you:

[('one',      A  B
0  one  0
1  one  1
5  one  5), ('three',        A  B
3  three  3
4  three  4), ('two',      A  B
2  two  2)]

回答 9

在Jupyter Notebook中,如果执行以下操作,它将打印该对象的一个​​很好的分组版本。该apply方法有助于创建多索引数据框。

by = 'A'  # groupby 'by' argument
df.groupby(by).apply(lambda a: a[:])

输出:

             A  B
A                
one   0    one  0
      1    one  1
      5    one  5
three 3  three  3
      4  three  4
two   2    two  2

如果您希望该by列不出现在输出中,请像这样删除列。

df.groupby(by).apply(lambda a: a.drop(by, axis=1)[:])

输出:

         B
A         
one   0  0
      1  1
      5  5
three 3  3
      4  4
two   2  2

在这里,我不确定为什么.iloc[:]不起作用,而不是[:]最后。因此,如果将来由于更新(或当前)而存在一些问题,.iloc[:len(a)]也可以使用。

In Jupyter Notebook, if you do the following, it prints a nice grouped version of the object. The apply method helps in creation of a multiindex dataframe.

by = 'A'  # groupby 'by' argument
df.groupby(by).apply(lambda a: a[:])

Output:

             A  B
A                
one   0    one  0
      1    one  1
      5    one  5
three 3  three  3
      4  three  4
two   2    two  2

If you want the by column(s) to not appear in the output, just drop the column(s), like so.

df.groupby(by).apply(lambda a: a.drop(by, axis=1)[:])

Output:

         B
A         
one   0  0
      1  1
      5  5
three 3  3
      4  4
two   2  2

Here, I am not sure as to why .iloc[:] does not work instead of [:] at the end. So, if there are some issues in future due to updates (or at present), .iloc[:len(a)] also works.


回答 10

我发现了一个棘手的方法,只是为了头脑风暴,请参见代码:

df['a'] = df['A']  # create a shadow column for MultiIndexing
df.sort_values('A', inplace=True)
df.set_index(["A","a"], inplace=True)
print(df)

输出:

             B
A     a
one   one    0
      one    1
      one    5
three three  3
      three  4
two   two    2

优点很容易打印,因为它返回一个数据框而不是Groupby Object。输出看起来不错。缺点是会创建一系列冗余数据。

I found a tricky way, just for brainstorm, see the code:

df['a'] = df['A']  # create a shadow column for MultiIndexing
df.sort_values('A', inplace=True)
df.set_index(["A","a"], inplace=True)
print(df)

the output:

             B
A     a
one   one    0
      one    1
      one    5
three three  3
      three  4
two   two    2

The pros is so easy to print, as it returns a dataframe, instead of Groupby Object. And the output looks nice. While the con is that it create a series of redundant data.


回答 11

在python 3中

k = None
for name_of_the_group, group in dict(df_group):
    if(k != name_of_the_group):
        print ('\n', name_of_the_group)
        print('..........','\n')
    print (group)
    k = name_of_the_group

以更互动的方式

In python 3

k = None
for name_of_the_group, group in dict(df_group):
    if(k != name_of_the_group):
        print ('\n', name_of_the_group)
        print('..........','\n')
    print (group)
    k = name_of_the_group

In more interactive way


回答 12

打印所有(或任意多个)分组的df行:

import pandas as pd
pd.set_option('display.max_rows', 500)

grouped_df = df.group(['var1', 'var2'])
print(grouped_df)

to print all (or arbitrarily many) lines of the grouped df:

import pandas as pd
pd.set_option('display.max_rows', 500)

grouped_df = df.group(['var1', 'var2'])
print(grouped_df)

django-debug-toolbar未显示

问题:django-debug-toolbar未显示

我看着其他问题,无法解决…

我做了以下安装django-debug-toolbar的操作:

  1. pip安装django-debug-toolbar
  2. 添加到中间件类:
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    # Uncomment the next line for simple clickjacking protection:
    # 'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'debug_toolbar.middleware.DebugToolbarMiddleware',
)

3添加了INTERNAL_IPS:

INTERNAL_IPS =(’174.121.34.187’,)

4将debug_toolbar添加到已安装的应用程序

我没有收到任何错误或任何内容,并且该工具栏也没有显示在任何页面上,甚至没有显示在管理页面上。

我什至将debug_toolbar模板的目录添加到了我的 TEMPLATE_DIRS

I looked at other questions and can’t figure it out…

I did the following to install django-debug-toolbar:

  1. pip install django-debug-toolbar
  2. added to middleware classes:
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    # Uncomment the next line for simple clickjacking protection:
    # 'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'debug_toolbar.middleware.DebugToolbarMiddleware',
)

3 Added INTERNAL_IPS:

INTERNAL_IPS = (‘174.121.34.187’,)

4 Added debug_toolbar to installed apps

I am not getting any errors or anything, and the toolbar doesn’t show up on any page, not even admin.

I even added the directory of the debug_toolbar templates to my TEMPLATE_DIRS


回答 0

愚蠢的问题,但您没有提及,所以… DEBUG设置为什么?它不会加载,除非它是True

如果仍然无法使用,请尝试同时添加“ 127.0.0.1” INTERNAL_IPS

更新

这是最后的努力,您不应该这样这样做,但它清楚地表明,如果有一些只是配置问题,或者是否有一些更大的问题。

将以下内容添加到settings.py:

def show_toolbar(request):
    return True
SHOW_TOOLBAR_CALLBACK = show_toolbar

这将有效地删除调试工具栏上的所有检查,以确定是否应该加载自身。它总是会加载。仅将其保留用于测试目的,如果您忘记了并随它一起启动,所有访客也将看到您的调试工具栏。

对于显式配置,另请参阅此处的官方安装文档

编辑(6/17/2015):

显然,核选项的语法已更改。现在在它自己的字典中:

def show_toolbar(request):
    return True
DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK" : show_toolbar,
}

他们的测试使用此词典。

Stupid question, but you didn’t mention it, so… What is DEBUG set to? It won’t load unless it’s True.

If it’s still not working, try adding ‘127.0.0.1’ to INTERNAL_IPS as well.

UPDATE

This is a last-ditch-effort move, you shouldn’t have to do this, but it will clearly show if there’s merely some configuration issue or whether there’s some larger issue.

Add the following to settings.py:

def show_toolbar(request):
    return True
SHOW_TOOLBAR_CALLBACK = show_toolbar

That will effectively remove all checks by debug toolbar to determine if it should or should not load itself; it will always just load. Only leave that in for testing purposes, if you forget and launch with it, all your visitors will get to see your debug toolbar too.

For explicit configuration, also see the official install docs here.

EDIT(6/17/2015):

Apparently the syntax for the nuclear option has changed. It’s now in its own dictionary:

def show_toolbar(request):
    return True
DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK" : show_toolbar,
}

Their tests use this dictionary.


回答 1

调试工具栏希望在INTERNAL_IPS设置中设置request.META [‘REMOTE_ADDR’]中的IP地址。在您的其中一种视图中放入打印语句,例如:

print("IP Address for debug-toolbar: " + request.META['REMOTE_ADDR'])

然后加载该页面。确保IP位于settings.py中的INTERNAL_IPS设置中。

通常,我认为您可以通过查看计算机的ip地址来轻松确定该地址,但是就我而言,我是在具有端口转发功能的Virtual Box中运行服务器……谁知道发生了什么。尽管在VB或我自己的OS上的ifconfig中没有看到它,但是REMOTE_ADDR键中显示的IP是激活工具栏的窍门。

Debug toolbar wants the ip address in request.META[‘REMOTE_ADDR’] to be set in the INTERNAL_IPS setting. Throw in a print statement in one of your views like such:

print("IP Address for debug-toolbar: " + request.META['REMOTE_ADDR'])

And then load that page. Make sure that IP is in your INTERNAL_IPS setting in settings.py.

Normally I’d think you would be able to determine the address easily by looking at your computer’s ip address, but in my case I’m running the server in a Virtual Box with port forwarding…and who knows what happened. Despite not seeing it anywhere in ifconfig on the VB or my own OS, the IP that showed up in the REMOTE_ADDR key was what did the trick of activating the toolbar.


回答 2

如果其他方法都没问题,则可能是您的模板缺少明确的结束<body>标记-

注意:仅当响应的模仿类型是text / html或application / xhtml + xml且包含结束标记时,调试工具栏才会显示。


回答 3

当前的稳定版本0.11.0要求满足以下条件才能显示工具栏:

设置文件:

  1. DEBUG = True
  2. INTERNAL_IPS包括您的浏览器IP地址,而不是服务器地址。如果在本地浏览,则应为INTERNAL_IPS = ('127.0.0.1',)。如果要远程浏览,只需指定您的公共地址
  3. 要安装的debug_toolbar应用程序,即 INSTALLED_APPS = (..., 'debug_toolbar',)
  4. 要添加的调试工具栏中间件类,即MIDDLEWARE_CLASSES = ('debug_toolbar.middleware.DebugToolbarMiddleware', ...)。它应尽早放在列表中。

模板文件:

  1. 必须是类型 text/html
  2. 必须有结束</html>标签

静态文件:

如果您要提供静态内容,请确保通过执行以下步骤来收集CSS,JS和html:

./manage.py collectstatic 


注意即将发布的django-debug-toolbar版本

较新的开发版本为设置点2、3和4添加了默认值,这使工作变得更简单了,但是,与任何开发版本一样,它都有错误。我发现git的最新版本导致ImproperlyConfigured通过nginx / uwsgi运行时错误。

无论哪种方式,如果要从github安装最新版本,请运行:

pip install -e git+https://github.com/django-debug-toolbar/django-debug-toolbar.git#egg=django-debug-toolbar 

您还可以通过执行以下操作来克隆特定的提交:

pip install -e git+https://github.com/django-debug-toolbar/django-debug-toolbar.git@ba5af8f6fe7836eef0a0c85dd1e6d7418bc87f75#egg=django_debug_toolbar

The current stable version 0.11.0 requires the following things to be true for the toolbar to be shown:

Settings file:

  1. DEBUG = True
  2. INTERNAL_IPS to include your browser IP address, as opposed to the server address. If browsing locally this should be INTERNAL_IPS = ('127.0.0.1',). If browsing remotely just specify your public address.
  3. The debug_toolbar app to be installed i.e INSTALLED_APPS = (..., 'debug_toolbar',)
  4. The debug toolbar middleware class to be added i.e. MIDDLEWARE_CLASSES = ('debug_toolbar.middleware.DebugToolbarMiddleware', ...). It should be placed as early as possible in the list.

Template files:

  1. Must be of type text/html
  2. Must have a closing </html> tag

Static files:

If you are serving static content make sure you collect the css, js and html by doing:

./manage.py collectstatic 


Note on upcoming versions of django-debug-toolbar

Newer, development versions have added defaults for settings points 2, 3 and 4 which makes life a bit simpler, however, as with any development version it has bugs. I found that the latest version from git resulted in an ImproperlyConfigured error when running through nginx/uwsgi.

Either way, if you want to install the latest version from github run:

pip install -e git+https://github.com/django-debug-toolbar/django-debug-toolbar.git#egg=django-debug-toolbar 

You can also clone a specific commit by doing:

pip install -e git+https://github.com/django-debug-toolbar/django-debug-toolbar.git@ba5af8f6fe7836eef0a0c85dd1e6d7418bc87f75#egg=django_debug_toolbar

回答 4

我尝试了所有操作,从设置DEBUG = True到设置INTERNAL_IPS到客户端IP地址,甚至手动配置Django Debug Toolbar(请注意,最新版本会自动进行所有配置,例如添加中间件和URL)。在远程开发服务器上没有任何工作(尽管它在本地工作)。唯一起作用的是配置工具栏,如下所示:

DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK" : lambda request: True,
}

这将替换默认方法,该默认方法确定是否应显示工具栏,并始终返回true。

I tried everything, from setting DEBUG = True, to settings INTERNAL_IPS to my client’s IP address, and even configuring Django Debug Toolbar manually (note that recent versions make all configurations automatically, such as adding the middleware and URLs). Nothing worked in a remote development server (though it did work locally). The ONLY thing that worked was configuring the toolbar as follows:

DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK" : lambda request: True,
}

This replaces the default method that decides if the toolbar should be shown, and always returns true.


回答 5

码头工人

如果您要在具有docker的Docker容器中使用Django服务器进行开发,则启用工具栏的说明无效。原因与以下事实有关:您需要添加的实际地址将INTERNAL_IPS是动态的,例如172.24.0.1。而不是尝试动态设置的值INTERNAL_IPS,直接的解决方案是替换您的中启用工具栏的功能,settings.py例如:

DEBUG_TOOLBAR_CONFIG = {
    'SHOW_TOOLBAR_CALLBACK': lambda _request: DEBUG
}


这也应该适用于其他动态路由情况,例如无业游民。


这里有一些好奇的细节。django_debug_tool中的确定是否显示工具栏的代码检查如下值REMOTE_ADDR

if request.META.get('REMOTE_ADDR', None) not in INTERNAL_IPS:
       return False

因此,如果REMOTE_ADDR由于动态docker路由而实际上不知道的值,则该工具栏将无法工作。您可以使用docker network命令查看动态IP值,例如docker network inspect my_docker_network_name

Docker

If you’re developing with a Django server in a Docker container with docker, the instructions for enabling the toolbar don’t work. The reason is related to the fact that the actual address that you would need to add to INTERNAL_IPS is going to be something dynamic, like 172.24.0.1. Rather than trying to dynamically set the value of INTERNAL_IPS, the straightforward solution is to replace the function that enables the toolbar, in your settings.py, for example:

DEBUG_TOOLBAR_CONFIG = {
    'SHOW_TOOLBAR_CALLBACK': lambda _request: DEBUG
}


This should also work for other dynamic routing situations, like vagrant.


Here are some more details for the curious. The code in django_debug_tool that determines whether to show the toolbar examines the value of REMOTE_ADDR like this:

if request.META.get('REMOTE_ADDR', None) not in INTERNAL_IPS:
       return False

so if you don’t actually know the value of REMOTE_ADDR due to your dynamic docker routing, the toolbar will not work. You can use the docker network command to see the dynamic IP values, for example docker network inspect my_docker_network_name


回答 6

我的工具栏工作得非常完美。使用此配置:

  1. DEBUG = True
  2. INTERNAL_IPS = ('127.0.0.1', '192.168.0.1',)
  3. DEBUG_TOOLBAR_CONFIG = {'INTERCEPT_REDIRECTS': False,}
  4. 中间件是MIDDLEWARE_CLASSES
MIDDLEWARE_CLASSES = (
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
)

希望对您有所帮助

I have the toolbar working just perfect. With this configurations:

  1. DEBUG = True
  2. INTERNAL_IPS = ('127.0.0.1', '192.168.0.1',)
  3. DEBUG_TOOLBAR_CONFIG = {'INTERCEPT_REDIRECTS': False,}
  4. The middleware is the first element in MIDDLEWARE_CLASSES:
MIDDLEWARE_CLASSES = (
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
)

I hope it helps


回答 7

10.0.2.2在Windows上添加到您的INTERNAL_IPS,内部与流浪汉一起使用

INTERNAL_IPS =(’10 .0.2.2’,)

这应该工作。

Add 10.0.2.2 to your INTERNAL_IPS on Windows, it is used with vagrant internally

INTERNAL_IPS = ( ‘10.0.2.2’, )

This should work.


回答 8

我遇到了同样的问题,经过谷歌搜索后终于解决了。

在INTERNAL_IPS中,您需要具有客户端的 IP地址。

I had the same problem and finally resolved it after some googling.

In INTERNAL_IPS, you need to have the client’s IP address.


回答 9

导致工具栏保持隐藏状态的另一件事是,它找不到所需的静态文件。debug_toolbar模板使用{{STATIC_URL}}模板标记,因此请确保您的静态文件中有一个名为debug工具栏的文件夹。

在大多数安装中,collectstatic管理命令应注意这一点。

Another thing that can cause the toolbar to remain hidden is if it cannot find the required static files. The debug_toolbar templates use the {{ STATIC_URL }} template tag, so make sure there is a folder in your static files called debug toolbar.

The collectstatic management command should take care of this on most installations.


回答 10

我尝试从pydanny的cookiecutter-django配置,它对有用

# django-debug-toolbar
MIDDLEWARE_CLASSES = Common.MIDDLEWARE_CLASSES + ('debug_toolbar.middleware.DebugToolbarMiddleware',)
INSTALLED_APPS += ('debug_toolbar',)

INTERNAL_IPS = ('127.0.0.1',)

DEBUG_TOOLBAR_CONFIG = {
    'DISABLE_PANELS': [
        'debug_toolbar.panels.redirects.RedirectsPanel',
    ],
    'SHOW_TEMPLATE_CONTEXT': True,
}
# end django-debug-toolbar

我只是通过添加'debug_toolbar.apps.DebugToolbarConfig'而不是django-debug-toolbar官方文档中'debug_toolbar'提到的方式对其进行了修改,因为我使用的是Django 1.7。

I tried the configuration from pydanny’s cookiecutter-django and it worked for me:

# django-debug-toolbar
MIDDLEWARE_CLASSES = Common.MIDDLEWARE_CLASSES + ('debug_toolbar.middleware.DebugToolbarMiddleware',)
INSTALLED_APPS += ('debug_toolbar',)

INTERNAL_IPS = ('127.0.0.1',)

DEBUG_TOOLBAR_CONFIG = {
    'DISABLE_PANELS': [
        'debug_toolbar.panels.redirects.RedirectsPanel',
    ],
    'SHOW_TEMPLATE_CONTEXT': True,
}
# end django-debug-toolbar

I just modified it by adding 'debug_toolbar.apps.DebugToolbarConfig' instead of 'debug_toolbar' as mentioned in the official django-debug-toolbar docs, as I’m using Django 1.7.


回答 11

除了以前的答案:

如果工具栏未显示,但已加载到html中(在浏览器中检查您的站点html,向下滚动)

问题可能是找不到调试工具栏静态文件(然后您也可以在站点的访问日志中看到此信息,例如/static/debug_toolbar/js/toolbar.js的404错误)

然后可以通过以下方式进行修复(nginx和apache的示例):

Nginx的配置:

location ~* ^/static/debug_toolbar/.+.(ico|css|js)$ {
    root [path to your python site-packages here]/site-packages/debug_toolbar;
}

apache配置

Alias /static/debug_toolbar [path to your python site-packages here]/site-packages/debug_toolbar/static/debug_toolbar

要么:

manage.py collectstatic

在这里更多关于collectstatic的内容: https //docs.djangoproject.com/en/dev/ref/contrib/staticfiles/#collectstatic

或手动将debug_toolbar静态文件的debug_toolbar文件夹移动到您设置的静态文件文件夹中

An addition to previous answers:

if the toolbar doesn’t show up, but it loads in the html (check your site html in a browser, scroll down)

the issue can be that debug toolbar static files are not found (you can also see this in your site’s access logs then, e.g. 404 errors for /static/debug_toolbar/js/toolbar.js)

It can be fixed the following way then (examples for nginx and apache):

nginx config:

location ~* ^/static/debug_toolbar/.+.(ico|css|js)$ {
    root [path to your python site-packages here]/site-packages/debug_toolbar;
}

apache config:

Alias /static/debug_toolbar [path to your python site-packages here]/site-packages/debug_toolbar/static/debug_toolbar

Or:

manage.py collectstatic

more on collectstatic here: https://docs.djangoproject.com/en/dev/ref/contrib/staticfiles/#collectstatic

Or manualy move debug_toolbar folder of debug_toolbar static files to your set static files folder


回答 12

就我而言,这是这里尚未提及的另一个问题:我的中间件列表中有GZipMiddleware。

由于调试工具栏的自动配置将调试工具栏的中间件放在顶部,因此它只能“看到” gzip压缩的HTML,因此无法在其中添加工具栏。

我在开发设置中删除了GZipMiddleware。手动设置调试工具栏的配置,并将中间件放置 GZip 之后也应该可以。

In my case, it was another problem that hasn’t been mentioned here yet: I had GZipMiddleware in my list of middlewares.

As the automatic configuration of debug toolbar puts the debug toolbar’s middleware at the top, it only gets the “see” the gzipped HTML, to which it can’t add the toolbar.

I removed GZipMiddleware in my development settings. Setting up the debug toolbar’s configuration manually and placing the middleware after GZip’s should also work.


回答 13

就我而言,我只需要删除python编译文件(*.pyc

In my case I just needed to remove the python compiled files (*.pyc)


回答 14

Django 1.8.5:

我必须将以下内容添加到项目url.py文件中,以显示调试工具栏。之后,将显示调试工具栏。

 from django.conf.urls import include
 from django.conf.urls import patterns
 from django.conf import settings


  if settings.DEBUG:
      import debug_toolbar
      urlpatterns += patterns('',
              url(r'^__debug__/', include(debug_toolbar.urls)),
              )

Django 1.10:及更高版本:

from django.conf.urls import include, url
from django.conf.urls import patterns
from django.conf import settings


if settings.DEBUG:

  import debug_toolbar
  urlpatterns =[
         url(r'^__debug__/', include(debug_toolbar.urls)),
         ] + urlpatterns

同样不要忘记在中间件中包含debug_toolbar。调试工具栏主要在中间件中实现。如下在您的设置模块中启用它:(django较新版本)


MIDDLEWARE = [
# ...
'debug_toolbar.middleware.DebugToolbarMiddleware',
#

旧式中间件:(需要在中间件中具有_CLASSES键盘功能)

MIDDLEWARE_CLASSES = [
# ...
'debug_toolbar.middleware.DebugToolbarMiddleware',
# ...
]

django 1.8.5:

I had to add the following to the project url.py file to get the debug toolbar display. After that debug tool bar is displayed.

 from django.conf.urls import include
 from django.conf.urls import patterns
 from django.conf import settings


  if settings.DEBUG:
      import debug_toolbar
      urlpatterns += patterns('',
              url(r'^__debug__/', include(debug_toolbar.urls)),
              )

django 1.10: and higher:

from django.conf.urls import include, url
from django.conf.urls import patterns
from django.conf import settings


if settings.DEBUG:

  import debug_toolbar
  urlpatterns =[
         url(r'^__debug__/', include(debug_toolbar.urls)),
         ] + urlpatterns

Also don’t forget to include the debug_toolbar to your middleware. The Debug Toolbar is mostly implemented in a middleware. Enable it in your settings module as follows: (django newer versions)


MIDDLEWARE = [
# ...
'debug_toolbar.middleware.DebugToolbarMiddleware',
#

Old-style middleware:(need to have _CLASSES keywork in the Middleware)

MIDDLEWARE_CLASSES = [
# ...
'debug_toolbar.middleware.DebugToolbarMiddleware',
# ...
]

回答 15

对于这个特定的作者来说不是这种情况,但是我一直在苦苦挣扎,因为Debug Toolbar没有显示出来,并且在他们指出所有步骤之后,我发现MIDDLEWARE订单有问题。因此,将中间件放在列表的前面是可行的。我的是第一个:

MIDDLEWARE_CLASSES = ( 'debug_toolbar.middleware.DebugToolbarMiddleware', 'django.middleware.common.CommonMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.middleware.csrf.CsrfViewMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.contrib.messages.middleware.MessageMiddleware', 'dynpages.middleware.DynpageFallbackMiddleware', 'utils.middleware.UserThread', )

This wasn’t the case for this specific author but I just have been struggling with the Debug Toolbar not showing and after doing everything they pointed out, I found out it was a problem with MIDDLEWARE order. So putting the middleware early in the list could work. Mine is first:

MIDDLEWARE_CLASSES = ( 'debug_toolbar.middleware.DebugToolbarMiddleware', 'django.middleware.common.CommonMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.middleware.csrf.CsrfViewMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.contrib.messages.middleware.MessageMiddleware', 'dynpages.middleware.DynpageFallbackMiddleware', 'utils.middleware.UserThread', )


回答 16

您必须确保模板中有一个结束标记。

我的问题是我的模板中没有常规的html标签,我只是以纯文本形式显示内容。我通过从base.html继承每个带有标签的html文件来解决它。

you have to make sure there is a closing tag in your templates.

My problem is that there is no regular html tags in my templates, I just display content in plain text. I solved it by inheriting every html file from base.html, which has a tag.


回答 17

对我来说,这就像127.0.0.1:8000在地址栏中键入内容一样简单,而不是localhost:8000显然与INTERNAL_IPS不匹配。

For me this was as simple as typing 127.0.0.1:8000 into the address bar, rather than localhost:8000 which apparently was not matching the INTERNAL_IPS.


回答 18

我遇到了同样的问题,我通过查看Apache的错误日志解决了它。我用mod_wsgi在Mac OS X上运行了Apache。debug_toolbar的tamplete文件夹未加载

日志样本:

==> /private/var/log/apache2/dummy-host2.example.com-error_log <==
[Sun Apr 27 23:23:48 2014] [error] [client 127.0.0.1] File does not exist: /Library/WebServer/Documents/rblreport/rbl/static/debug_toolbar, referer: http://127.0.0.1/

==> /private/var/log/apache2/dummy-host2.example.com-access_log <==
127.0.0.1 - - [27/Apr/2014:23:23:48 -0300] "GET /static/debug_toolbar/css/toolbar.css HTTP/1.1" 404 234 "http://127.0.0.1/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0"

我只是将以下行添加到我的VirtualHost文件中:

Alias /static/debug_toolbar /Library/Python/2.7/site-packages/debug_toolbar/static/debug_toolbar
  • 当然,您必须更改python路径

I got the same problem, I solved it by looking at the Apache’s error log. I got the apache running on mac os x with mod_wsgi The debug_toolbar’s tamplete folder wasn’t being load

Log sample:

==> /private/var/log/apache2/dummy-host2.example.com-error_log <==
[Sun Apr 27 23:23:48 2014] [error] [client 127.0.0.1] File does not exist: /Library/WebServer/Documents/rblreport/rbl/static/debug_toolbar, referer: http://127.0.0.1/

==> /private/var/log/apache2/dummy-host2.example.com-access_log <==
127.0.0.1 - - [27/Apr/2014:23:23:48 -0300] "GET /static/debug_toolbar/css/toolbar.css HTTP/1.1" 404 234 "http://127.0.0.1/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0"

I just add this line to my VirtualHost file:

Alias /static/debug_toolbar /Library/Python/2.7/site-packages/debug_toolbar/static/debug_toolbar
  • Of course you must change your python path

回答 19

我在使用Vagrant时遇到了同样的问题。我通过添加::ffff:192.168.33.1到INTERNAL_IPS来解决此问题,如下例。

INTERNAL_IPS = (
    '::ffff:192.168.33.1',
)

记住那192.168.33.10是我在Vagrantfile中的专用网络中的IP。

I had the same problem using Vagrant. I solved this problem by adding ::ffff:192.168.33.1 to the INTERNAL_IPS as below example.

INTERNAL_IPS = (
    '::ffff:192.168.33.1',
)

Remembering that 192.168.33.10 is the IP in my private network in Vagrantfile.


回答 20

我遇到了这个问题,不得不从源代码安装调试工具栏。

如果使用PureCSS和其他CSS框架,则1.4版存在一个隐藏的问题。

是修复此问题的提交。

该文档解释了如何从源代码安装。

I had this problem and had to install the debug toolbar from source.

Version 1.4 has a problem where it’s hidden if you use PureCSS and apparently other CSS frameworks.

This is the commit which fixes that.

The docs explain how to install from source.


回答 21

对于使用Pycharm 5的任何人-模板调试在某些版本中均不起作用。在5.0.4修复,影响vesions – 5.0.1,5.0.2退房问题

花很多时间找出答案。也许会帮助某人

For anyone who is using Pycharm 5 – template debug is not working there in some versions. Fixed in 5.0.4, affected vesions – 5.0.1, 5.0.2 Check out issue

Spend A LOT time to find that out. Maybe will help someone


回答 22

在我正在处理的代码中,在处理主请求期间提出了多个小请求(这是非常特殊的用例)。它们是由同一Django线程处理的请求。Django调试工具栏(DjDT)不会出现这种情况,它会在第一个响应中包含DjDT的工具栏,然后删除其线程状态。因此,当主请求发送回浏览器时,响应中不包含DjDT。

经验教训:DjDT保存每个线程的状态。它在第一个响应后删除线程的状态。

In the code I was working on, multiple small requests were made during handling of main request (it’s very specific use case). They were requests handled by the same Django’s thread. Django debug toolbar (DjDT) doesn’t expect this behaviour and includes DjDT’s toolbars to the first response and then it removes its state for the thread. So when main request was sent back to the browser, DjDT was not included in the response.

Lessons learned: DjDT saves it’s state per thread. It removes state for a thread after the first response.


回答 23

什么让我是一个过时的浏览器!

注意,它从调试工具栏加载了一些样式表,并猜测可能是前端问题。

What got me is an outdated browser!

Noticed that it loads some stylesheets from debug toolbar and guessed it might be a front-end issue.


回答 24

一件愚蠢的事让我..如果使用apache wsgi,请记住触摸.wsgi文件以强制重新编译代码。只是浪费了我20分钟的时间来调试愚蠢的错误:(

One stupid thing got me.. that if you use apache wsgi, remember to touch the .wsgi file to force your code recompile. just waste 20 minutes of my time to debug the stupid error :(


Mac OS X 10.9之后无法安装PIL

问题:Mac OS X 10.9之后无法安装PIL

我刚刚将Mac OS更新为10.9,发现其中的某些(全部?)Python模块不再可用,尤其是Image模块。

所以我尝试执行sudo pip install pil,但是出现此错误:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/usr/include/tk.h:78:11: fatal error: 'X11/Xlib.h' file not found

#      include <X11/Xlib.h>

               ^

1 error generated.

error: command 'cc' failed with exit status 1

我的Xcode是最新的,我不知道。PIL可能还不兼容10.9吗?

I’ve just updated my Mac OS to 10.9 and I discovered that some (all?) of my Python modules are not here anymore, especially the Image one.

So I try to execute sudo pip install pil, but I get this error:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/usr/include/tk.h:78:11: fatal error: 'X11/Xlib.h' file not found

#      include <X11/Xlib.h>

               ^

1 error generated.

error: command 'cc' failed with exit status 1

My Xcode is up-to-date and I don’t have any idea. Is it possible that PIL is not yet 10.9 compatible ?


回答 0

以下为我工作:

ln -s  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11
sudo pip install pil

更新:

但是,威尔提供了以下更正确的解决方案。

打开终端并执行: xcode-select --install

Following worked for me:

ln -s  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11
sudo pip install pil

UPDATE:

But there is more correct solution below, provided by Will.

open your terminal and execute: xcode-select --install


回答 1

打开终端并执行:

xcode-select --install

open your terminal and execute:

xcode-select --install


回答 2

sudo ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11/ /usr/local/include/X11

对我有帮助!操作系统x 10.9

pip install pillow

但!点安装后…

*** ZLIB (PNG/ZIP) support not available

最后我通过运行来修复它:

xcode-select --install

然后重新安装枕头

pip install pillow

PIL SETUP SUMMARY
    --------------------------------------------------------------------
    version      Pillow 2.2.1
    platform     darwin 2.7.5 (default, Aug 25 2013, 00:04:04)
                 [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]
    --------------------------------------------------------------------
    --- TKINTER support available
    --- JPEG support available
    --- ZLIB (PNG/ZIP) support available
    --- TIFF G3/G4 (experimental) support available
    --- FREETYPE2 support available
    --- LITTLECMS support available
    --- WEBP support available
    --- WEBPMUX support available
    --------------------------------------------------------------------
sudo ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11/ /usr/local/include/X11

helps for me! os x 10.9

pip install pillow

but! after pip install …

*** ZLIB (PNG/ZIP) support not available

and finally i fix it by running:

xcode-select --install

then reinstall pillow

pip install pillow

PIL SETUP SUMMARY
    --------------------------------------------------------------------
    version      Pillow 2.2.1
    platform     darwin 2.7.5 (default, Aug 25 2013, 00:04:04)
                 [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]
    --------------------------------------------------------------------
    --- TKINTER support available
    --- JPEG support available
    --- ZLIB (PNG/ZIP) support available
    --- TIFF G3/G4 (experimental) support available
    --- FREETYPE2 support available
    --- LITTLECMS support available
    --- WEBP support available
    --- WEBPMUX support available
    --------------------------------------------------------------------

回答 3

适用于我(OS X Yosemite 10.10.2-Python 2.7.9):

xcode-select --install
sudo pip install pillow

尝试检查一下:

from PIL import Image
image = Image.open("file.jpg")
image.show()

Works for me ( OS X Yosemite 10.10.2 – Python 2.7.9 ) :

xcode-select --install
sudo pip install pillow

Try this to check it:

from PIL import Image
image = Image.open("file.jpg")
image.show()

回答 4

这是我所做的,某些步骤可能仅对于PIL并不是必需的,但无论如何我都需要libpng和其他步骤:

1)运行xcode install,使用此命令或从应用商店下载更新:

xcode-select --install

1b)添加命令行工具可选工具,在Mountain Lion中,这是xcode下载页面上的一个选项,但是现在您必须注册您的Apple ID并从以下位置下载: https //developer.apple.com/downloads/

寻找Xcode的命令行工具(OS X Mavericks)

2)安装python所需的一切(使用brew),我相信您也可以使用port:

brew install readline sqlite gdbm
brew install python --universal --framework 
brew install libpng jpeg freetype

必要时取消链接/重新链接,即升级。

3)安装Pip和所需的模块:

easy_install pip 
sudo pip install setuptools --no-use-wheel --upgrade

4)最后,这没有错误:

sudo pip install Pillow

2014年11月4日更新:PIL存储区不再收到更新或支持,因此应使用Pillow。现在不建议使用以下内容,因此请坚持使用Pillow。

sudo pip install pil --allow-external pil --allow-unverified pil

UPDATE(旧):安装Pillow(PIL拨叉)时同样适用,并且在大多数情况下,它很快就可以替代PILlow。而不是在步骤4中安装pip,而是运行以下命令:

sudo pip install Pillow

希望这对某人有帮助!

Here is what I did, some steps may not be necessary just for PIL but I needed libpng and others anyways:

1) Run xcode install, use this command or download updates from the app store:

xcode-select --install

1b) Add the Command Line Tools optional tool, in Mountain Lion this was an option on the xcode Download page, but now you have to register with your apple id and download from: https://developer.apple.com/downloads/

Look for Command Line Tools (OS X Mavericks) for Xcode

2) Install everything needed for python (using brew), I believe you can use port as well:

brew install readline sqlite gdbm
brew install python --universal --framework 
brew install libpng jpeg freetype

Unlink/ relink if needed i.e. if upgrading.

3) Install Pip and required modules:

easy_install pip 
sudo pip install setuptools --no-use-wheel --upgrade

4) Finally this works with no errors:

sudo pip install Pillow

UPDATE 11/04/14: PIL repo no longer receives updates or support so Pillow should be used. The below is now deprecated so stick with Pillow.

sudo pip install pil --allow-external pil --allow-unverified pil

UPDATE (OLD) : The same thing applies when installing Pillow (PIL fork) and should be mentioned as its quickly becoming a replacement in most cases of PIL. Instead of installing pip in step 4, run this instead:

sudo pip install Pillow

Hope this helps someone!


回答 5

安装命令行工具为我解决了这个问题

您必须分别安装它们,因为它们现在不属于xcode软件包中的一部分:

https://developer.apple.com/downloads/index.action?=command%20line%20tools#

installing command line tools fixed the issue for me

you have to install them separately as they are not part of the packages in xcode now:

https://developer.apple.com/downloads/index.action?=command%20line%20tools#


回答 6

这些都不对我有用。我一直收到:

clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
error: command 'cc' failed with exit status 1

因此,我找到了以下解决方案:

sudo export CFLAGS=-Qunused-arguments
sudo export CPPFLAGS=-Qunused-arguments
sudo pip install PIL --allow-external PIL --allow-unverified PIL

这样我就可以安装。

Non of those worked for me.. I kept receiving:

clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
error: command 'cc' failed with exit status 1

So I found a work around with the following solution:

sudo export CFLAGS=-Qunused-arguments
sudo export CPPFLAGS=-Qunused-arguments
sudo pip install PIL --allow-external PIL --allow-unverified PIL

This way I was able to install.


回答 7

我有一个类似的问题:安装枕头失败clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future],安装枕头失败Can't install the software because it is not currently available from the Software Update server.,并且,即使手动安装了命令行工具,PIL的编译也失败了。

发生这种情况是因为最新版本的xcode下的clang不会警告未知的编译器标志,而是通过硬错误停止编译。

要解决此问题,只需export ARCHFLAGS="-Wno-error=unused-command-line-argument-hard-error-in-future"在终端上运行,然后再尝试进行编译(安装pil)。

I had a similar problem: Installing pillow failed with clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future], installing command line tools failed with Can't install the software because it is not currently available from the Software Update server., and even after installing the command line tools manually, the compilation of PIL failed.

This happens cause clang under the newest version of xcode doesn’t warn on unknown compiler flags, but rather stop the compilation with a hard error.

To fix this, just run export ARCHFLAGS="-Wno-error=unused-command-line-argument-hard-error-in-future" on the terminal before trying to compile (installing pil).


回答 8

只需运行

pip install pil --allow-external pil --allow-unverified pil

Simply run

pip install pil --allow-external pil --allow-unverified pil


回答 9

这是我在Mac OS 10.9.1上的步骤

1. sudo su
2. easy_install pip
3. xcode-select --install
4. pip install --no-index -f http://dist.plone.org/thirdparty/ -U PIL

This my steps on mac os 10.9.1

1. sudo su
2. easy_install pip
3. xcode-select --install
4. pip install --no-index -f http://dist.plone.org/thirdparty/ -U PIL

回答 10

您可以使用Homebrew进行安装 http://brew.sh

brew tap Homebrew/python
brew install pillow

You could use Homebrew to do the install http://brew.sh

brew tap Homebrew/python
brew install pillow

回答 11

确保在xcode上安装了命令行工具。然后执行:

sudo pip install pil --allow-external pil --allow-unverified pil

Make sure you have Command Line Tools installed on your xcode. Then execute:

sudo pip install pil --allow-external pil --allow-unverified pil

回答 12

我遇到以下错误

building 'PIL._imagingft' extension
_imagingft.c:62:10: fatal error: 'freetype/fterrors.h' file not found

#include <freetype/fterrors.h>

         ^

1 error generated.

error: command 'cc' failed with exit status 1

解决方案是将freetype2符号链接到freetype,从而解决了该问题。

I was having the following error

building 'PIL._imagingft' extension
_imagingft.c:62:10: fatal error: 'freetype/fterrors.h' file not found

#include <freetype/fterrors.h>

         ^

1 error generated.

error: command 'cc' failed with exit status 1

The solution to this was to symlink freetype2 to freetype and this solved the problem.


回答 13

我不想安装XCode(我不使用它),但我讨厌摆弄Application目录。我从这篇文章的许多答案中脱颖而出,以下两个步骤对我来说适用于10.9.5:

sudo easy_install pip
sudo pip install pillow

我不得不使用easy_install来安装pip确实让我感到奇怪。但是pip不想在重新安装之前为我工作。

I didn’t want to install XCode (I don’t use it) and I’m loath to fiddle with Application directory. I’ve cribbed from the many answers in this post and the following two steps work for me with 10.9.5:

sudo easy_install pip
sudo pip install pillow

It did appear to me strange that I had to use easy_install to install pip. But pip didn’t want to work for me before that (re-)install.


回答 14

找到了解决方案…您必须像这样对X11进行符号链接ln -s /opt/X11/include/X11 /usr/local/include/X11,然后sudo pip install pil才能正常工作。

Found the solution … You’ve to symlink X11 like this ln -s /opt/X11/include/X11 /usr/local/include/X11 and then sudo pip install pil should work.


回答 15

重用@DmitryDemidenko的答案对我有用:

ln -s  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11

然后

sudo pip install -U PIL --allow-external PIL --allow-unverified PIL

Reusing @DmitryDemidenko’s answer that is how it worked for me:

ln -s  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11

and then

sudo pip install -U PIL --allow-external PIL --allow-unverified PIL

回答 16

执行下面的命令行。在Mac OS 10.9.5上像超级按钮一样工作

easy_install点

sudo pip install setuptools –no-use-wheel –upgrade

sudo pip安装枕头

最好的,西奥

Execute the bellow command lines. Works like a charm on Mac OS 10.9.5

easy_install pip

sudo pip install setuptools –no-use-wheel –upgrade

sudo pip install Pillow

Best, Theo


回答 17

那就是我所做的:

首先升级到Xcode 5(我正在运行10.9)。然后,在终端中执行以下命令:

$ /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk
$ ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 usr/include/

That’s what I did:

First upgrade to Xcode 5 (I am running 10.9). Then, execute the following commands in a terminal:

$ /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk
$ ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 usr/include/

回答 18

一个更完整的解决方案需要安装Xquartz X11子系统,该子系统已经在Apple之外构建了几年。这是我用来使其全部工作的步骤

  1. http://xquartz.macosforge.org/landing/安装XQuartz
  2. sudo pip install pillow

A more complete solution requires the installation of the Xquartz X11 subsystem that has been built outside of Apple for several years now. Here are the steps I used to get it all working

  1. Install XQuartz from http://xquartz.macosforge.org/landing/
  2. Run sudo pip install pillow

回答 19

因为公认的答案是正确的答案,xcode-select --install但有些人(包括我)可能会遇到Can't install the software because it is not currently available from the Software Update server 如果您使用的是Beta版软件(因为我现在使用的是优胜美地并且遇到相同的问题),则您需要单独购买CLT,因为它不包含在其中。 XCode(甚至xcode beta)也可以转到developers.apple.com并为您的OS获取CLT工具;)

PS您不需要XQuartz的PIL或Pillow即可工作

As the accepted answer is the right one with xcode-select --install but some people (including me) may encounter Can't install the software because it is not currently available from the Software Update server If you are using beta software (as I am using Yosemite now and had the same problem) you NEED to get the CLT separately since it is NOT included in XCode (even xcode beta) Head over to developers.apple.com and get CLT tools for your OS ;)

P.S. You don’t need XQuartz for PIL or Pillow to work


回答 20

我最近从OS 10.8-> 10.9升级的机器陷入了xcrun和lipo之间的循环。

将/ usr / bin / lipo重命名为/ usr / bin / lipo_broken

请参阅此线程以获取有关如何解决的更多信息:

使用OS X Mavericks和XCode 4.x冻结xcrun / lipo

My machine which was recently upgraded from OS 10.8 -> 10.9 got stuck in a loop between xcrun and lipo.

Rename /usr/bin/lipo to /usr/bin/lipo_broken

Refer to this thread for further information on how to resolve:

xcrun/lipo freezes with OS X Mavericks and XCode 4.x


回答 21

改为安装枕头

sudo pip install pillow

Install Pillow instead:

sudo pip install pillow

回答 22

ln -s /usr/local/include/freetype2 /usr/local/include/freetype
sudo ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future pip install pil
ln -s /usr/local/include/freetype2 /usr/local/include/freetype
sudo ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future pip install pil

回答 23

试试这个:

ln -s /usr/local/include/freetype2 /usr/local/include/freetype

Try this:

ln -s /usr/local/include/freetype2 /usr/local/include/freetype

回答 24

sudo pip uninstall pillow
pip install pillow

为我工作。我在优胜美地上运行Python 2.7.9。import PIL现在为我工作。

sudo pip uninstall pillow
pip install pillow

worked for me. I’m running Python 2.7.9 on Yosemite.import PIL now works for me.


回答 25

在Mac OSC 10.10 Yosemite上安装PIL(Imaging.1.1.7)。我尝试了这里推荐的许多修复程序,但是每个修复程序都遇到了麻烦。我终于通过编辑setup.py文件来解决了这个问题:

TCL_ROOT =“ / opt / X11 / include”

它在_imagingtk.c的编译中通过了X11的适当包含路径,这对我造成了问题。更改后立即工作。

Installing PIL (Imaging.1.1.7) on Mac OSC 10.10 Yosemite. I tried numerous fixes recommended here but ran into trouble with each one. I finally solved this problem by editing the setup.py file such that:

TCL_ROOT = “/opt/X11/include”

which passes the appropriate include path for X11 in the compilation of _imagingtk.c, which was causing the problem for me. Worked immediately after change.


回答 26

我已从pyenv移至virtualenv,这解决了我的问题。

I’ve moved from pyenv to virtualenv and this fixed my problem.


回答 27

  1. ln -s / opt / X11 / include / X11 / usr / local / include / X11
  2. 没有sudo的pip install pil
  1. ln -s /opt/X11/include/X11 /usr/local/include/X11
  2. pip install pil without sudo

为什么〜True导致-2?

问题:为什么〜True导致-2?

在Python控制台中:

~True

给我:

-2

为什么?有人可以用二进制给我解释这个特殊情况吗?

In Python console:

~True

Gives me:

-2

Why? Can someone explain this particular case to me in binary?


回答 0

int(True)1

1 是:

00000001

并且~1是:

11111110

这是-2两个补1

1翻转所有位,在结果数上加1,然后将结果解释为幅度的二进制表示,并添加一个负号(因为数字以1开头):

11111110  00000001  00000010 
                    
       Flip       Add 1

它是2,但由于MSB为1 ,所以符号为负。


值得一提:

想一想bool,您会发现它本质上是数字-它有两个值TrueFalse,它们只是整数1和0的“自定义”版本,它们的打印方式不同。它们是整数类型的子类int

因此它们的行为与1和0完全相同,只是bool重新定义strrepr以不同的方式显示它们。

>>> type(True)
<class 'bool'>
>>> isinstance(True, int)
True

>>> True == 1
True
>>> True is 1  # they're still different objects
False

int(True) is 1.

1 is:

00000001

and ~1 is:

11111110

Which is -2 in Two’s complement1

1 Flip all the bits, add 1 to the resulting number and interpret the result as a binary representation of the magnitude and add a negative sign (since the number begins with 1):

11111110 → 00000001 → 00000010 
         ↑          ↑ 
       Flip       Add 1

Which is 2, but the sign is negative since the MSB is 1.


Worth mentioning:

Think about bool, you’ll find that it’s numeric in nature – It has two values, True and False, and they are just “customized” versions of the integers 1 and 0 that only print themselves differently. They are subclasses of the integer type int.

So they behave exactly as 1 and 0, except that bool redefines str and repr to display them differently.

>>> type(True)
<class 'bool'>
>>> isinstance(True, int)
True

>>> True == 1
True
>>> True is 1  # they're still different objects
False

回答 1

Python bool类型是的子类int(出于历史原因;布尔值仅在Python 2.3中添加)。

由于int(True)1~True~1-2

有关为什么是的子类,请参见PEP 285boolint

如果需要布尔逆,请使用not

>>> not True
False
>>> not False
True

如果您想知道为什么这样~1-2,那是因为您正在反转一个有符号整数中的所有位;00000001变为1111110符号整数中的一个负数,请参见二进制补码

>>> # Python 3
...
>>> import struct
>>> format(struct.pack('b', 1)[0], '08b')
'00000001'
>>> format(struct.pack('b', ~1)[0], '08b')
'11111110'

其中起始1位表示该值为负,其余位则对正数减去一的值进行反编码。

The Python bool type is a subclass of int (for historical reasons; booleans were only added in Python 2.3).

Since int(True) is 1, ~True is ~1 is -2.

See PEP 285 for why bool is a subclass of int.

If you wanted the boolean inverse, use not:

>>> not True
False
>>> not False
True

If you wanted to know why ~1 is -2, it’s because you are inverting all bits in a signed integer; 00000001 becomes 1111110 which in a signed integer is a negative number, see Two’s complement:

>>> # Python 3
...
>>> import struct
>>> format(struct.pack('b', 1)[0], '08b')
'00000001'
>>> format(struct.pack('b', ~1)[0], '08b')
'11111110'

where the initial 1 bit means the value is negative, and the rest of the bits encode the inverse of the positive number minus one.


回答 2

~True == -2,如果不奇怪 True的手段1 ~方式按位反转

只要


编辑:

  • 修复整数表示和按位求反运算符之间的混合
  • 进行另一次抛光(信息越短,需要做的工作越多)

~True == -2 is not surprising if True means 1 and ~ means bitwise inversion

provided that


Edits:

  • fixed the mixing between integer representation and bitwise inversion operator
  • applied another polishing (the shorter the message, the more work needed)

类和实例属性之间有什么区别?

问题:类和实例属性之间有什么区别?

之间是否有有意义的区别:

class A(object):
    foo = 5   # some default value

class B(object):
    def __init__(self, foo=5):
        self.foo = foo

如果要创建很多实例,这两种样式在性能或空间要求上是否有差异?阅读代码时,您是否认为两种样式的含义有明显不同?

Is there any meaningful distinction between:

class A(object):
    foo = 5   # some default value

vs.

class B(object):
    def __init__(self, foo=5):
        self.foo = foo

If you’re creating a lot of instances, is there any difference in performance or space requirements for the two styles? When you read the code, do you consider the meaning of the two styles to be significantly different?


回答 0

除了性能方面的考虑之外,还有显着的语义差异。在类属性的情况下,仅引用一个对象。在实例属性设置实例中,可以有多个引用对象。例如

>>> class A: foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[5]
>>> class A:
...  def __init__(self): self.foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo    
[]

Beyond performance considerations, there is a significant semantic difference. In the class attribute case, there is just one object referred to. In the instance-attribute-set-at-instantiation, there can be multiple objects referred to. For instance

>>> class A: foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[5]
>>> class A:
...  def __init__(self): self.foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo    
[]

回答 1

区别在于该类的属性由所有实例共享。实例上的属性对该实例是唯一的。

如果来自C ++,则类的属性更像静态成员变量。

The difference is that the attribute on the class is shared by all instances. The attribute on an instance is unique to that instance.

If coming from C++, attributes on the class are more like static member variables.


回答 2

这是一篇很好的文章,总结如下。

class Bar(object):
    ## No need for dot syntax
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var

## Need dot syntax as we've left scope of class namespace
Bar.class_var
## 1
foo = MyClass(2)

## Finds i_var in foo's instance namespace
foo.i_var
## 2

## Doesn't find class_var in instance namespace…
## So look's in class namespace (Bar.__dict__)
foo.class_var
## 1

并以视觉形式

类属性分配

  • 如果通过访问该类设置了一个类属性,它将覆盖所有实例的值

    foo = Bar(2)
    foo.class_var
    ## 1
    Bar.class_var = 2
    foo.class_var
    ## 2
  • 如果通过访问实例来设置类变量,则它将覆盖该实例的值。实际上,这会覆盖类变量,并将其转变为仅可用于该实例的直观实例变量。

    foo = Bar(2)
    foo.class_var
    ## 1
    foo.class_var = 2
    foo.class_var
    ## 2
    Bar.class_var
    ## 1

什么时候使用class属性?

  • 存储常数。由于可以将类属性作为类本身的属性进行访问,因此最好使用它们来存储类范围的,特定于类的常量

    class Circle(object):
         pi = 3.14159
    
         def __init__(self, radius):
              self.radius = radius   
        def area(self):
             return Circle.pi * self.radius * self.radius
    
    Circle.pi
    ## 3.14159
    c = Circle(10)
    c.pi
    ## 3.14159
    c.area()
    ## 314.159
  • 定义默认值。举一个简单的例子,我们可以创建一个有界列表(即只能容纳一定数量或更少数量元素的列表),并选择默认上限为10个项目

    class MyClass(object):
        limit = 10
    
        def __init__(self):
            self.data = []
        def item(self, i):
            return self.data[i]
    
        def add(self, e):
            if len(self.data) >= self.limit:
                raise Exception("Too many elements")
            self.data.append(e)
    
     MyClass.limit
     ## 10

Here is a very good post, and summary it as below.

class Bar(object):
    ## No need for dot syntax
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var

## Need dot syntax as we've left scope of class namespace
Bar.class_var
## 1
foo = MyClass(2)

## Finds i_var in foo's instance namespace
foo.i_var
## 2

## Doesn't find class_var in instance namespace…
## So look's in class namespace (Bar.__dict__)
foo.class_var
## 1

And in visual form

Class attribute assignment

  • If a class attribute is set by accessing the class, it will override the value for all instances

    foo = Bar(2)
    foo.class_var
    ## 1
    Bar.class_var = 2
    foo.class_var
    ## 2
    
  • If a class variable is set by accessing an instance, it will override the value only for that instance. This essentially overrides the class variable and turns it into an instance variable available, intuitively, only for that instance.

    foo = Bar(2)
    foo.class_var
    ## 1
    foo.class_var = 2
    foo.class_var
    ## 2
    Bar.class_var
    ## 1
    

When would you use class attribute?

  • Storing constants. As class attributes can be accessed as attributes of the class itself, it’s often nice to use them for storing Class-wide, Class-specific constants

    class Circle(object):
         pi = 3.14159
    
         def __init__(self, radius):
              self.radius = radius   
        def area(self):
             return Circle.pi * self.radius * self.radius
    
    Circle.pi
    ## 3.14159
    c = Circle(10)
    c.pi
    ## 3.14159
    c.area()
    ## 314.159
    
  • Defining default values. As a trivial example, we might create a bounded list (i.e., a list that can only hold a certain number of elements or fewer) and choose to have a default cap of 10 items

    class MyClass(object):
        limit = 10
    
        def __init__(self):
            self.data = []
        def item(self, i):
            return self.data[i]
    
        def add(self, e):
            if len(self.data) >= self.limit:
                raise Exception("Too many elements")
            self.data.append(e)
    
     MyClass.limit
     ## 10
    

回答 3

由于此处评论中的人们以及其他两个标记为重复的问题似乎都以相同的方式引起了混淆,因此我认为有必要在Alex Coventry的基础上再增加一个答案。

Alex分配一个可变类型的值(例如列表)的事实与是否共享事物无关。我们可以通过id函数或is运算符看到这一点:

>>> class A: foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
True
>>> class A:
...     def __init__(self): self.foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
False

(如果您想知道为什么我使用object()而不是说,5那是为了避免遇到两个我不想讨论的其他问题;由于两个不同的原因,完全独立创建的5s最终可能是相同的数字实例,5但完全不能单独创建object())。


那么,为什么a.foo.append(5)在Alex的示例中会影响b.foo,但a.foo = 5在我的示例中却没有呢?那么,尝试a.foo = 5在Alex的例子,并注意不影响b.foo两种

a.foo = 5只是a.foo为…而出名5。这不会影响b.foo,也不会影响以前a.foo引用的旧值的任何其他名称。*我们正在创建一个隐藏类属性的实例属性,这有点棘手,但是一旦得到,就没有什么复杂的了发生在这里。


希望现在可以清楚地知道Alex使用列表的原因:您可以对列表进行变异的事实意味着,更容易显示两个变量命名相同的列表,并且这也意味着在现实生活中的代码中更重要的是要知道您是否具有两个列表或同一列表的两个名称。


*对于来自C ++之类的人的困惑在于,在Python中,值不存储在变量中。值本身就存在于值域中,变量只是值的名称,赋值只是为值创建一个新名称。如果有帮助,请将每个Python变量视为shared_ptr<T>而不是T

**有些人通过将class属性用作实例属性的“默认值”来利用此属性,实例属性可以设置也可以不设置。在某些情况下这可能很有用,但也可能造成混淆,因此请谨慎使用。

Since people in the comments here and in two other questions marked as dups all appear to be confused about this in the same way, I think it’s worth adding an additional answer on top of Alex Coventry’s.

The fact that Alex is assigning a value of a mutable type, like a list, has nothing to do with whether things are shared or not. We can see this with the id function or the is operator:

>>> class A: foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
True
>>> class A:
...     def __init__(self): self.foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
False

(If you’re wondering why I used object() instead of, say, 5, that’s to avoid running into two whole other issues which I don’t want to get into here; for two different reasons, entirely separately-created 5s can end up being the same instance of the number 5. But entirely separately-created object()s cannot.)


So, why is it that a.foo.append(5) in Alex’s example affects b.foo, but a.foo = 5 in my example doesn’t? Well, try a.foo = 5 in Alex’s example, and notice that it doesn’t affect b.foo there either.

a.foo = 5 is just making a.foo into a name for 5. That doesn’t affect b.foo, or any other name for the old value that a.foo used to refer to.* It’s a little tricky that we’re creating an instance attribute that hides a class attribute,** but once you get that, nothing complicated is happening here.


Hopefully it’s now obvious why Alex used a list: the fact that you can mutate a list means it’s easier to show that two variables name the same list, and also means it’s more important in real-life code to know whether you have two lists or two names for the same list.


* The confusion for people coming from a language like C++ is that in Python, values aren’t stored in variables. Values live off in value-land, on their own, variables are just names for values, and assignment just creates a new name for a value. If it helps, think of each Python variable as a shared_ptr<T> instead of a T.

** Some people take advantage of this by using a class attribute as a “default value” for an instance attribute that instances may or may not set. This can be useful in some cases, but it can also be confusing, so be careful with it.


回答 4

还有另一种情况。

类和实例属性是Descriptor

# -*- encoding: utf-8 -*-


class RevealAccess(object):
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        return self.val


class Base(object):
    attr_1 = RevealAccess(10, 'var "x"')

    def __init__(self):
        self.attr_2 = RevealAccess(10, 'var "x"')


def main():
    b = Base()
    print("Access to class attribute, return: ", Base.attr_1)
    print("Access to instance attribute, return: ", b.attr_2)

if __name__ == '__main__':
    main()

以上将输出:

('Access to class attribute, return: ', 10)
('Access to instance attribute, return: ', <__main__.RevealAccess object at 0x10184eb50>)

通过类或实例访问相同类型的实例将返回不同的结果!

而且我在c.PyObject_GenericGetAttr定义中找到了一篇不错的文章

说明

如果在组成类的字典中找到该属性。对象MRO,然后检查要查找的属性是否指向数据描述符(这仅是同时实现__get____set__方法的类)。如果是这样,请通过调用__get__数据描述符的方法来解析属性查找(第28-33行)。

There is one more situation.

Class and instance attributes is Descriptor.

# -*- encoding: utf-8 -*-


class RevealAccess(object):
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        return self.val


class Base(object):
    attr_1 = RevealAccess(10, 'var "x"')

    def __init__(self):
        self.attr_2 = RevealAccess(10, 'var "x"')


def main():
    b = Base()
    print("Access to class attribute, return: ", Base.attr_1)
    print("Access to instance attribute, return: ", b.attr_2)

if __name__ == '__main__':
    main()

Above will output:

('Access to class attribute, return: ', 10)
('Access to instance attribute, return: ', <__main__.RevealAccess object at 0x10184eb50>)

The same type of instance access through class or instance return different result!

And i found in c.PyObject_GenericGetAttr definition,and a great post.

Explain

If the attribute is found in the dictionary of the classes which make up. the objects MRO, then check to see if the attribute being looked up points to a Data Descriptor (which is nothing more that a class implementing both the __get__ and the __set__ methods). If it does, resolve the attribute lookup by calling the __get__ method of the Data Descriptor (lines 28–33).


有没有办法将可选参数传递给函数?

问题:有没有办法将可选参数传递给函数?

Python中有没有一种方法可以在调用函数时将可选参数传递给函数,并且函数定义中的代码基于“仅当传递了可选参数时”

Is there a way in Python to pass optional parameters to a function while calling it and in the function definition have some code based on “only if the optional parameter is passed”


回答 0

Python 2中的文档,7.6。函数定义为您提供了两种方法来检测调用方是否提供了可选参数。

首先,您可以使用特殊的形式参数语法*。如果函数定义的形式参数前面带有single *,则Python会使用前形式参数(作为元组)不匹配的任何位置参数填充该参数。如果函数定义的正式参数以开头**,则Python会使用与先前正式参数不匹配的任何关键字参数(作为dict)来填充该参数。函数的实现可以检查这些参数的内容,以查找所需的任何“可选参数”。

例如,这是一个函数opt_fun,它需要两个位置参数x1x2,并寻找另一个名为“ optional”的关键字参数。

>>> def opt_fun(x1, x2, *positional_parameters, **keyword_parameters):
...     if ('optional' in keyword_parameters):
...         print 'optional parameter found, it is ', keyword_parameters['optional']
...     else:
...         print 'no optional parameter, sorry'
... 
>>> opt_fun(1, 2)
no optional parameter, sorry
>>> opt_fun(1,2, optional="yes")
optional parameter found, it is  yes
>>> opt_fun(1,2, another="yes")
no optional parameter, sorry

第二,您可以提供某个值的默认参数值,None调用者将永远不会使用该值。如果参数具有此默认值,则说明调用者未指定参数。如果参数具有非默认值,则说明它来自调用方。

The Python 2 documentation, 7.6. Function definitions gives you a couple of ways to detect whether a caller supplied an optional parameter.

First, you can use special formal parameter syntax *. If the function definition has a formal parameter preceded by a single *, then Python populates that parameter with any positional parameters that aren’t matched by preceding formal parameters (as a tuple). If the function definition has a formal parameter preceded by **, then Python populates that parameter with any keyword parameters that aren’t matched by preceding formal parameters (as a dict). The function’s implementation can check the contents of these parameters for any “optional parameters” of the sort you want.

For instance, here’s a function opt_fun which takes two positional parameters x1 and x2, and looks for another keyword parameter named “optional”.

>>> def opt_fun(x1, x2, *positional_parameters, **keyword_parameters):
...     if ('optional' in keyword_parameters):
...         print 'optional parameter found, it is ', keyword_parameters['optional']
...     else:
...         print 'no optional parameter, sorry'
... 
>>> opt_fun(1, 2)
no optional parameter, sorry
>>> opt_fun(1,2, optional="yes")
optional parameter found, it is  yes
>>> opt_fun(1,2, another="yes")
no optional parameter, sorry

Second, you can supply a default parameter value of some value like None which a caller would never use. If the parameter has this default value, you know the caller did not specify the parameter. If the parameter has a non-default value, you know it came from the caller.


回答 1

def my_func(mandatory_arg, optional_arg=100):
    print(mandatory_arg, optional_arg)

http://docs.python.org/2/tutorial/controlflow.html#default-argument-values

我发现这比使用更具可读性**kwargs

为了确定是否传递了一个参数,我使用一个自定义实用程序对象作为默认值:

MISSING = object()

def func(arg=MISSING):
    if arg is MISSING:
        ...
def my_func(mandatory_arg, optional_arg=100):
    print(mandatory_arg, optional_arg)

http://docs.python.org/2/tutorial/controlflow.html#default-argument-values

I find this more readable than using **kwargs.

To determine if an argument was passed at all, I use a custom utility object as the default value:

MISSING = object()

def func(arg=MISSING):
    if arg is MISSING:
        ...

回答 2

def op(a=4,b=6):
    add = a+b
    print add

i)op() [o/p: will be (4+6)=10]
ii)op(99) [o/p: will be (99+6)=105]
iii)op(1,1) [o/p: will be (1+1)=2]
Note:
 If none or one parameter is passed the default passed parameter will be considered for the function. 
def op(a=4,b=6):
    add = a+b
    print add

i)op() [o/p: will be (4+6)=10]
ii)op(99) [o/p: will be (99+6)=105]
iii)op(1,1) [o/p: will be (1+1)=2]
Note:
 If none or one parameter is passed the default passed parameter will be considered for the function. 

回答 3

如果要为参数提供一些默认值,请在()中分配值。像(x = 10)。但重要的是,首先应强制参数,然后默认值。

例如。

(y,x = 10)

(x = 10,y)是错误的

If you want give some default value to a parameter assign value in (). like (x =10). But important is first should compulsory argument then default value.

eg.

(y, x =10)

but

(x=10, y) is wrong


回答 4

您可以为可选参数指定一个默认值,该值将不会传递给该函数,并使用is运算符进行检查:

class _NO_DEFAULT:
    def __repr__(self):return "<no default>"
_NO_DEFAULT = _NO_DEFAULT()

def func(optional= _NO_DEFAULT):
    if optional is _NO_DEFAULT:
        print("the optional argument was not passed")
    else:
        print("the optional argument was:",optional)

那么只要您不这样做func(_NO_DEFAULT),就可以准确地检测出是否传递了参数,并且与接受的答案不同,您不必担心**表示法的副作用:

# these two work the same as using **
func()
func(optional=1)

# the optional argument can be positional or keyword unlike using **
func(1) 

#this correctly raises an error where as it would need to be explicitly checked when using **
func(invalid_arg=7)

You can specify a default value for the optional argument with something that would never passed to the function and check it with the is operator:

class _NO_DEFAULT:
    def __repr__(self):return "<no default>"
_NO_DEFAULT = _NO_DEFAULT()

def func(optional= _NO_DEFAULT):
    if optional is _NO_DEFAULT:
        print("the optional argument was not passed")
    else:
        print("the optional argument was:",optional)

then as long as you do not do func(_NO_DEFAULT) you can be accurately detect whether the argument was passed or not, and unlike the accepted answer you don’t have to worry about side effects of ** notation:

# these two work the same as using **
func()
func(optional=1)

# the optional argument can be positional or keyword unlike using **
func(1) 

#this correctly raises an error where as it would need to be explicitly checked when using **
func(invalid_arg=7)

为什么迭代一小串字符串比一小串列表慢?

问题:为什么迭代一小串字符串比一小串列表慢?

我在玩timeit时发现,对小字符串进行简单的列表理解要比对小字符串列表进行相同的操作花费的时间更长。有什么解释吗?时间几乎是原来的1.35倍。

>>> from timeit import timeit
>>> timeit("[x for x in 'abc']")
2.0691067844831528
>>> timeit("[x for x in ['a', 'b', 'c']]")
1.5286479570345861

导致此情况的较低级别发生了什么?

I was playing around with timeit and noticed that doing a simple list comprehension over a small string took longer than doing the same operation on a list of small single character strings. Any explanation? It’s almost 1.35 times as much time.

>>> from timeit import timeit
>>> timeit("[x for x in 'abc']")
2.0691067844831528
>>> timeit("[x for x in ['a', 'b', 'c']]")
1.5286479570345861

What’s happening on a lower level that’s causing this?


回答 0

TL; DR

  • 对于Python 2,一旦消除了很多开销,实际的速度差异就会接近70%(或更高)。

  • 对象创建没有错。这两种方法都不会创建新对象,因为会缓存一个字符的字符串。

  • 区别并不明显,但可能是由于对类型和格式正确的字符串索引进行了大量检查而造成的。由于很有必要检查返回的商品,因此很有可能。

  • 列表索引非常快。



>>> python3 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.388 usec per loop

>>> python3 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.436 usec per loop

这与您发现的内容不同…

然后,您必须使用Python 2。

>>> python2 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.212 usec per loop

让我们解释两个版本之间的区别。我将检查编译后的代码。

对于Python 3:

import dis

def list_iterate():
    [item for item in ["a", "b", "c"]]

dis.dis(list_iterate)
#>>>   4           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b118a0, file "", line 4>)
#>>>               3 LOAD_CONST               2 ('list_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('a')
#>>>              12 LOAD_CONST               4 ('b')
#>>>              15 LOAD_CONST               5 ('c')
#>>>              18 BUILD_LIST               3
#>>>              21 GET_ITER
#>>>              22 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              25 POP_TOP
#>>>              26 LOAD_CONST               0 (None)
#>>>              29 RETURN_VALUE

def string_iterate():
    [item for item in "abc"]

dis.dis(string_iterate)
#>>>  21           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b17150, file "", line 21>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('abc')
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

您会在此处看到,由于每次都建立列表,列表变体可能会变慢。

这是

 9 LOAD_CONST   3 ('a')
12 LOAD_CONST   4 ('b')
15 LOAD_CONST   5 ('c')
18 BUILD_LIST   3

部分。字符串变体仅具有

 9 LOAD_CONST   3 ('abc')

您可以检查一下是否确实有所不同:

def string_iterate():
    [item for item in ("a", "b", "c")]

dis.dis(string_iterate)
#>>>  35           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d068be660, file "", line 35>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               6 (('a', 'b', 'c'))
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

这产生了

 9 LOAD_CONST               6 (('a', 'b', 'c'))

因为元组是不可变的。测试:

>>> python3 -m timeit '[x for x in ("a", "b", "c")]'
1000000 loops, best of 3: 0.369 usec per loop

太好了,赶快行动吧。

对于Python 2:

def list_iterate():
    [item for item in ["a", "b", "c"]]

dis.dis(list_iterate)
#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('a')
#>>>               6 LOAD_CONST               2 ('b')
#>>>               9 LOAD_CONST               3 ('c')
#>>>              12 BUILD_LIST               3
#>>>              15 GET_ITER            
#>>>         >>   16 FOR_ITER                12 (to 31)
#>>>              19 STORE_FAST               0 (item)
#>>>              22 LOAD_FAST                0 (item)
#>>>              25 LIST_APPEND              2
#>>>              28 JUMP_ABSOLUTE           16
#>>>         >>   31 POP_TOP             
#>>>              32 LOAD_CONST               0 (None)
#>>>              35 RETURN_VALUE        

def string_iterate():
    [item for item in "abc"]

dis.dis(string_iterate)
#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('abc')
#>>>               6 GET_ITER            
#>>>         >>    7 FOR_ITER                12 (to 22)
#>>>              10 STORE_FAST               0 (item)
#>>>              13 LOAD_FAST                0 (item)
#>>>              16 LIST_APPEND              2
#>>>              19 JUMP_ABSOLUTE            7
#>>>         >>   22 POP_TOP             
#>>>              23 LOAD_CONST               0 (None)
#>>>              26 RETURN_VALUE        

奇怪的是,我们具有相同的列表构建,但是这样做的速度仍然更快。Python 2的运行速度异常快。

让我们删除理解和重新计时。这_ =是为了防止它被优化。

>>> python3 -m timeit '_ = ["a", "b", "c"]'
10000000 loops, best of 3: 0.0707 usec per loop

>>> python3 -m timeit '_ = "abc"'
100000000 loops, best of 3: 0.0171 usec per loop

我们可以看到初始化不足以说明版本之间的差异(这些数字很小)!因此,我们可以得出结论,Python 3的理解速度较慢。随着Python 3将理解方式更改为具有更安全的作用域,这才有意义。

好吧,现在提高基准(我只是删除不是迭代的开销)。这通过预先分配来删除迭代器的构建:

>>> python3 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.387 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
1000000 loops, best of 3: 0.368 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
10000000 loops, best of 3: 0.164 usec per loop

我们可以检查调用iter是否是开销:

>>> python3 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.099 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.1 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.0913 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.0854 usec per loop

不,不是。差别太小,尤其是对于Python 3。

因此,让我们降低整体速度,从而消除更多不必要的开销!目的是使迭代时间更长,从而节省时间。

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 3.12 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.77 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 2.32 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.09 msec per loop

这实际上并没有太大变化,但有所帮助。

因此,消除理解。开销并不是问题的一部分:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.71 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 1.36 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.27 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 935 usec per loop

这还差不多!通过使用deque迭代,我们仍然可以稍微快一些。基本上是一样的,但是速度更快

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop
>>> python2 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 805 usec per loop

>>> python2 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 438 usec per loop

令我印象深刻的是,Unicode在字节串方面具有竞争力。我们可以通过尝试在bytesunicode两者中进行显式检查:

  • bytes

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127)).encode("ascii") for _ in range(100000))' 'deque(iterable, maxlen=0)'                                                                    :(
    1000 loops, best of 3: 571 usec per loop
    
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127)).encode("ascii") for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127))                 for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 757 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127))                 for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 438 usec per loop
    

    在这里,您可以看到Python 3实际上比Python 2 更快

  • unicode

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = u"".join(   chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 800 usec per loop
    
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [   chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = u"".join(unichr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 1.07 msec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [unichr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 469 usec per loop
    

    同样,Python 3更快,尽管这是可以预料的(str在Python 3中引起了很多关注)。

实际上,这unicodebytes差异很小,令人印象深刻。

因此,让我们分析一下这种情况,因为它对我来说既快速又方便:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop

实际上,我们可以排除蒂姆·彼得(Tim Peter)提出10次支持的答案!

>>> foo = iterable[123]
>>> iterable[36] is foo
True

这些不是新对象!

但这值得一提:索引成本。区别可能在于索引,因此删除迭代并仅索引:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'iterable[123]'
10000000 loops, best of 3: 0.0397 usec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable[123]'
10000000 loops, best of 3: 0.0374 usec per loop

差异似乎很小,但是至少一半的成本是间接费用:

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable; 123'
100000000 loops, best of 3: 0.0173 usec per loop

因此,速度差足以决定对此负责。我认为。

那么为什么索引列表这么快呢?

好吧,我会回来给你这一点,但我的猜测是的是倒在支票实习字符串(或缓存的字符,如果它是一个独立的机构)。这将不如最佳速度快。但是我会去检查源代码(尽管我对C语言不太满意):)。


所以这是来源:

static PyObject *
unicode_getitem(PyObject *self, Py_ssize_t index)
{
    void *data;
    enum PyUnicode_Kind kind;
    Py_UCS4 ch;
    PyObject *res;

    if (!PyUnicode_Check(self) || PyUnicode_READY(self) == -1) {
        PyErr_BadArgument();
        return NULL;
    }
    if (index < 0 || index >= PyUnicode_GET_LENGTH(self)) {
        PyErr_SetString(PyExc_IndexError, "string index out of range");
        return NULL;
    }
    kind = PyUnicode_KIND(self);
    data = PyUnicode_DATA(self);
    ch = PyUnicode_READ(kind, data, index);
    if (ch < 256)
        return get_latin1_char(ch);

    res = PyUnicode_New(1, ch);
    if (res == NULL)
        return NULL;
    kind = PyUnicode_KIND(res);
    data = PyUnicode_DATA(res);
    PyUnicode_WRITE(kind, data, 0, ch);
    assert(_PyUnicode_CheckConsistency(res, 1));
    return res;
}

从顶部走,我们将进行一些检查。这些无聊。然后一些分配,这也应该很无聊。第一个有趣的行是

ch = PyUnicode_READ(kind, data, index);

但是我们希望这很快,因为我们正在通过索引从连续的C数组读取数据。结果ch小于256,因此我们将在中返回缓存的字符get_latin1_char(ch)

因此,我们将运行(删除第一个检查)

kind = PyUnicode_KIND(self);
data = PyUnicode_DATA(self);
ch = PyUnicode_READ(kind, data, index);
return get_latin1_char(ch);

哪里

#define PyUnicode_KIND(op) \
    (assert(PyUnicode_Check(op)), \
     assert(PyUnicode_IS_READY(op)),            \
     ((PyASCIIObject *)(op))->state.kind)

(这很无聊,因为断言在调试中会被忽略(因此我可以检查它们是否很快),((PyASCIIObject *)(op))->state.kind)并且(我认为)是间接调用和C级强制转换);

#define PyUnicode_DATA(op) \
    (assert(PyUnicode_Check(op)), \
     PyUnicode_IS_COMPACT(op) ? _PyUnicode_COMPACT_DATA(op) :   \
     _PyUnicode_NONCOMPACT_DATA(op))

(由于类似的原因,这也很无聊,假设宏(Something_CAPITALIZED)都很快),

#define PyUnicode_READ(kind, data, index) \
    ((Py_UCS4) \
    ((kind) == PyUnicode_1BYTE_KIND ? \
        ((const Py_UCS1 *)(data))[(index)] : \
        ((kind) == PyUnicode_2BYTE_KIND ? \
            ((const Py_UCS2 *)(data))[(index)] : \
            ((const Py_UCS4 *)(data))[(index)] \
        ) \
    ))

(涉及索引,但实际上一点也不慢),并且

static PyObject*
get_latin1_char(unsigned char ch)
{
    PyObject *unicode = unicode_latin1[ch];
    if (!unicode) {
        unicode = PyUnicode_New(1, ch);
        if (!unicode)
            return NULL;
        PyUnicode_1BYTE_DATA(unicode)[0] = ch;
        assert(_PyUnicode_CheckConsistency(unicode, 1));
        unicode_latin1[ch] = unicode;
    }
    Py_INCREF(unicode);
    return unicode;
}

这证实了我的怀疑:

  • 这被缓存:

    PyObject *unicode = unicode_latin1[ch];
  • 这应该很快。在if (!unicode)没有运行,所以它是在这种情况下相当于字面上

    PyObject *unicode = unicode_latin1[ch];
    Py_INCREF(unicode);
    return unicode;
    

坦白地说,在测试asserts 之后(通过禁用它们[我认为它可以在C级断言上运行…]),只有看起来很慢的部分是:

PyUnicode_IS_COMPACT(op)
_PyUnicode_COMPACT_DATA(op)
_PyUnicode_NONCOMPACT_DATA(op)

哪个是:

#define PyUnicode_IS_COMPACT(op) \
    (((PyASCIIObject*)(op))->state.compact)

(和以前一样快),

#define _PyUnicode_COMPACT_DATA(op)                     \
    (PyUnicode_IS_ASCII(op) ?                   \
     ((void*)((PyASCIIObject*)(op) + 1)) :              \
     ((void*)((PyCompactUnicodeObject*)(op) + 1)))

(如果宏IS_ASCII很快,则很快),以及

#define _PyUnicode_NONCOMPACT_DATA(op)                  \
    (assert(((PyUnicodeObject*)(op))->data.any),        \
     ((((PyUnicodeObject *)(op))->data.any)))

(因为它是断言,间接寻址和强制转换,因此速度也很快)。

因此,我们进入(兔子洞)以:

PyUnicode_IS_ASCII

这是

#define PyUnicode_IS_ASCII(op)                   \
    (assert(PyUnicode_Check(op)),                \
     assert(PyUnicode_IS_READY(op)),             \
     ((PyASCIIObject*)op)->state.ascii)

嗯…似乎也很快…


好吧,但让我们将其与进行比较PyList_GetItem。(是的,感谢蒂姆·彼得斯(Tim Peters)为我提供了更多的工作要做:P。)

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
{
    if (!PyList_Check(op)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    if (i < 0 || i >= Py_SIZE(op)) {
        if (indexerr == NULL) {
            indexerr = PyUnicode_FromString(
                "list index out of range");
            if (indexerr == NULL)
                return NULL;
        }
        PyErr_SetObject(PyExc_IndexError, indexerr);
        return NULL;
    }
    return ((PyListObject *)op) -> ob_item[i];
}

我们可以看到,在非错误情况下,这只会运行:

PyList_Check(op)
Py_SIZE(op)
((PyListObject *)op) -> ob_item[i]

哪里PyList_Check

#define PyList_Check(op) \
     PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LIST_SUBCLASS)

TABS!TABS !!!)(issue215875分钟内修复并合并。就像…是的。该死的。他们让Skeet感到羞耻。

#define Py_SIZE(ob)             (((PyVarObject*)(ob))->ob_size)
#define PyType_FastSubclass(t,f)  PyType_HasFeature(t,f)
#ifdef Py_LIMITED_API
#define PyType_HasFeature(t,f)  ((PyType_GetFlags(t) & (f)) != 0)
#else
#define PyType_HasFeature(t,f)  (((t)->tp_flags & (f)) != 0)
#endif

因此,除非Py_LIMITED_API启用,否则通常这确实是微不足道的(两个间接调用和几个布尔检查)……???

然后是索引和强制转换(((PyListObject *)op) -> ob_item[i]),我们完成了。

因此,对列表检查肯定会更少,并且速度差异很小肯定意味着它可能是相关的。


我认为通常来说,(->)Unicode的类型检查和间接性更多。似乎我遗漏了一点,但是

TL;DR

  • The actual speed difference is closer to 70% (or more) once a lot of the overhead is removed, for Python 2.

  • Object creation is not at fault. Neither method creates a new object, as one-character strings are cached.

  • The difference is unobvious, but is likely created from a greater number of checks on string indexing, with regards to the type and well-formedness. It is also quite likely thanks to the need to check what to return.

  • List indexing is remarkably fast.



>>> python3 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.388 usec per loop

>>> python3 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.436 usec per loop

This disagrees with what you’ve found…

You must be using Python 2, then.

>>> python2 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.212 usec per loop

Let’s explain the difference between the versions. I’ll examine the compiled code.

For Python 3:

import dis

def list_iterate():
    [item for item in ["a", "b", "c"]]

dis.dis(list_iterate)
#>>>   4           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b118a0, file "", line 4>)
#>>>               3 LOAD_CONST               2 ('list_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('a')
#>>>              12 LOAD_CONST               4 ('b')
#>>>              15 LOAD_CONST               5 ('c')
#>>>              18 BUILD_LIST               3
#>>>              21 GET_ITER
#>>>              22 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              25 POP_TOP
#>>>              26 LOAD_CONST               0 (None)
#>>>              29 RETURN_VALUE

def string_iterate():
    [item for item in "abc"]

dis.dis(string_iterate)
#>>>  21           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b17150, file "", line 21>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('abc')
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

You see here that the list variant is likely to be slower due to the building of the list each time.

This is the

 9 LOAD_CONST   3 ('a')
12 LOAD_CONST   4 ('b')
15 LOAD_CONST   5 ('c')
18 BUILD_LIST   3

part. The string variant only has

 9 LOAD_CONST   3 ('abc')

You can check that this does seem to make a difference:

def string_iterate():
    [item for item in ("a", "b", "c")]

dis.dis(string_iterate)
#>>>  35           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d068be660, file "", line 35>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               6 (('a', 'b', 'c'))
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

This produces just

 9 LOAD_CONST               6 (('a', 'b', 'c'))

as tuples are immutable. Test:

>>> python3 -m timeit '[x for x in ("a", "b", "c")]'
1000000 loops, best of 3: 0.369 usec per loop

Great, back up to speed.

For Python 2:

def list_iterate():
    [item for item in ["a", "b", "c"]]

dis.dis(list_iterate)
#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('a')
#>>>               6 LOAD_CONST               2 ('b')
#>>>               9 LOAD_CONST               3 ('c')
#>>>              12 BUILD_LIST               3
#>>>              15 GET_ITER            
#>>>         >>   16 FOR_ITER                12 (to 31)
#>>>              19 STORE_FAST               0 (item)
#>>>              22 LOAD_FAST                0 (item)
#>>>              25 LIST_APPEND              2
#>>>              28 JUMP_ABSOLUTE           16
#>>>         >>   31 POP_TOP             
#>>>              32 LOAD_CONST               0 (None)
#>>>              35 RETURN_VALUE        

def string_iterate():
    [item for item in "abc"]

dis.dis(string_iterate)
#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('abc')
#>>>               6 GET_ITER            
#>>>         >>    7 FOR_ITER                12 (to 22)
#>>>              10 STORE_FAST               0 (item)
#>>>              13 LOAD_FAST                0 (item)
#>>>              16 LIST_APPEND              2
#>>>              19 JUMP_ABSOLUTE            7
#>>>         >>   22 POP_TOP             
#>>>              23 LOAD_CONST               0 (None)
#>>>              26 RETURN_VALUE        

The odd thing is that we have the same building of the list, but it’s still faster for this. Python 2 is acting strangely fast.

Let’s remove the comprehensions and re-time. The _ = is to prevent it getting optimised out.

>>> python3 -m timeit '_ = ["a", "b", "c"]'
10000000 loops, best of 3: 0.0707 usec per loop

>>> python3 -m timeit '_ = "abc"'
100000000 loops, best of 3: 0.0171 usec per loop

We can see that initialization is not significant enough to account for the difference between the versions (those numbers are small)! We can thus conclude that Python 3 has slower comprehensions. This makes sense as Python 3 changed comprehensions to have safer scoping.

Well, now improve the benchmark (I’m just removing overhead that isn’t iteration). This removes the building of the iterable by pre-assigning it:

>>> python3 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.387 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
1000000 loops, best of 3: 0.368 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
10000000 loops, best of 3: 0.164 usec per loop

We can check if calling iter is the overhead:

>>> python3 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.099 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.1 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.0913 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.0854 usec per loop

No. No it is not. The difference is too small, especially for Python 3.

So let’s remove yet more unwanted overhead… by making the whole thing slower! The aim is just to have a longer iteration so the time hides overhead.

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 3.12 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.77 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 2.32 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.09 msec per loop

This hasn’t actually changed much, but it’s helped a little.

So remove the comprehension. It’s overhead that’s not part of the question:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.71 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 1.36 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.27 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 935 usec per loop

That’s more like it! We can get slightly faster still by using deque to iterate. It’s basically the same, but it’s faster:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop
>>> python2 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 805 usec per loop

>>> python2 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 438 usec per loop

What impresses me is that Unicode is competitive with bytestrings. We can check this explicitly by trying bytes and unicode in both:

  • bytes

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127)).encode("ascii") for _ in range(100000))' 'deque(iterable, maxlen=0)'                                                                    :(
    1000 loops, best of 3: 571 usec per loop
    
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127)).encode("ascii") for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127))                 for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 757 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127))                 for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 438 usec per loop
    

    Here you see Python 3 actually faster than Python 2.

  • unicode

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = u"".join(   chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 800 usec per loop
    
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [   chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = u"".join(unichr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 1.07 msec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [unichr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 469 usec per loop
    

    Again, Python 3 is faster, although this is to be expected (str has had a lot of attention in Python 3).

In fact, this unicodebytes difference is very small, which is impressive.

So let’s analyse this one case, seeing as it’s fast and convenient for me:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop

We can actually rule out Tim Peter’s 10-times-upvoted answer!

>>> foo = iterable[123]
>>> iterable[36] is foo
True

These are not new objects!

But this is worth mentioning: indexing costs. The difference will likely be in the indexing, so remove the iteration and just index:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'iterable[123]'
10000000 loops, best of 3: 0.0397 usec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable[123]'
10000000 loops, best of 3: 0.0374 usec per loop

The difference seems small, but at least half of the cost is overhead:

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable; 123'
100000000 loops, best of 3: 0.0173 usec per loop

so the speed difference is sufficient to decide to blame it. I think.

So why is indexing a list so much faster?

Well, I’ll come back to you on that, but my guess is that’s is down to the check for interned strings (or cached characters if it’s a separate mechanism). This will be less fast than optimal. But I’ll go check the source (although I’m not comfortable in C…) :).


So here’s the source:

static PyObject *
unicode_getitem(PyObject *self, Py_ssize_t index)
{
    void *data;
    enum PyUnicode_Kind kind;
    Py_UCS4 ch;
    PyObject *res;

    if (!PyUnicode_Check(self) || PyUnicode_READY(self) == -1) {
        PyErr_BadArgument();
        return NULL;
    }
    if (index < 0 || index >= PyUnicode_GET_LENGTH(self)) {
        PyErr_SetString(PyExc_IndexError, "string index out of range");
        return NULL;
    }
    kind = PyUnicode_KIND(self);
    data = PyUnicode_DATA(self);
    ch = PyUnicode_READ(kind, data, index);
    if (ch < 256)
        return get_latin1_char(ch);

    res = PyUnicode_New(1, ch);
    if (res == NULL)
        return NULL;
    kind = PyUnicode_KIND(res);
    data = PyUnicode_DATA(res);
    PyUnicode_WRITE(kind, data, 0, ch);
    assert(_PyUnicode_CheckConsistency(res, 1));
    return res;
}

Walking from the top, we’ll have some checks. These are boring. Then some assigns, which should also be boring. The first interesting line is

ch = PyUnicode_READ(kind, data, index);

but we’d hope that is fast, as we’re reading from a contiguous C array by indexing it. The result, ch, will be less than 256 so we’ll return the cached character in get_latin1_char(ch).

So we’ll run (dropping the first checks)

kind = PyUnicode_KIND(self);
data = PyUnicode_DATA(self);
ch = PyUnicode_READ(kind, data, index);
return get_latin1_char(ch);

Where

#define PyUnicode_KIND(op) \
    (assert(PyUnicode_Check(op)), \
     assert(PyUnicode_IS_READY(op)),            \
     ((PyASCIIObject *)(op))->state.kind)

(which is boring because asserts get ignored in debug [so I can check that they’re fast] and ((PyASCIIObject *)(op))->state.kind) is (I think) an indirection and a C-level cast);

#define PyUnicode_DATA(op) \
    (assert(PyUnicode_Check(op)), \
     PyUnicode_IS_COMPACT(op) ? _PyUnicode_COMPACT_DATA(op) :   \
     _PyUnicode_NONCOMPACT_DATA(op))

(which is also boring for similar reasons, assuming the macros (Something_CAPITALIZED) are all fast),

#define PyUnicode_READ(kind, data, index) \
    ((Py_UCS4) \
    ((kind) == PyUnicode_1BYTE_KIND ? \
        ((const Py_UCS1 *)(data))[(index)] : \
        ((kind) == PyUnicode_2BYTE_KIND ? \
            ((const Py_UCS2 *)(data))[(index)] : \
            ((const Py_UCS4 *)(data))[(index)] \
        ) \
    ))

(which involves indexes but really isn’t slow at all) and

static PyObject*
get_latin1_char(unsigned char ch)
{
    PyObject *unicode = unicode_latin1[ch];
    if (!unicode) {
        unicode = PyUnicode_New(1, ch);
        if (!unicode)
            return NULL;
        PyUnicode_1BYTE_DATA(unicode)[0] = ch;
        assert(_PyUnicode_CheckConsistency(unicode, 1));
        unicode_latin1[ch] = unicode;
    }
    Py_INCREF(unicode);
    return unicode;
}

Which confirms my suspicion that:

  • This is cached:

    PyObject *unicode = unicode_latin1[ch];
    
  • This should be fast. The if (!unicode) is not run, so it’s literally equivalent in this case to

    PyObject *unicode = unicode_latin1[ch];
    Py_INCREF(unicode);
    return unicode;
    

Honestly, after testing the asserts are fast (by disabling them [I think it works on the C-level asserts…]), the only plausibly-slow parts are:

PyUnicode_IS_COMPACT(op)
_PyUnicode_COMPACT_DATA(op)
_PyUnicode_NONCOMPACT_DATA(op)

Which are:

#define PyUnicode_IS_COMPACT(op) \
    (((PyASCIIObject*)(op))->state.compact)

(fast, as before),

#define _PyUnicode_COMPACT_DATA(op)                     \
    (PyUnicode_IS_ASCII(op) ?                   \
     ((void*)((PyASCIIObject*)(op) + 1)) :              \
     ((void*)((PyCompactUnicodeObject*)(op) + 1)))

(fast if the macro IS_ASCII is fast), and

#define _PyUnicode_NONCOMPACT_DATA(op)                  \
    (assert(((PyUnicodeObject*)(op))->data.any),        \
     ((((PyUnicodeObject *)(op))->data.any)))

(also fast as it’s an assert plus an indirection plus a cast).

So we’re down (the rabbit hole) to:

PyUnicode_IS_ASCII

which is

#define PyUnicode_IS_ASCII(op)                   \
    (assert(PyUnicode_Check(op)),                \
     assert(PyUnicode_IS_READY(op)),             \
     ((PyASCIIObject*)op)->state.ascii)

Hmm… that seems fast too…


Well, OK, but let’s compare it to PyList_GetItem. (Yeah, thanks Tim Peters for giving me more work to do :P.)

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
{
    if (!PyList_Check(op)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    if (i < 0 || i >= Py_SIZE(op)) {
        if (indexerr == NULL) {
            indexerr = PyUnicode_FromString(
                "list index out of range");
            if (indexerr == NULL)
                return NULL;
        }
        PyErr_SetObject(PyExc_IndexError, indexerr);
        return NULL;
    }
    return ((PyListObject *)op) -> ob_item[i];
}

We can see that on non-error cases this is just going to run:

PyList_Check(op)
Py_SIZE(op)
((PyListObject *)op) -> ob_item[i]

Where PyList_Check is

#define PyList_Check(op) \
     PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LIST_SUBCLASS)

(TABS! TABS!!!) (issue21587) That got fixed and merged in 5 minutes. Like… yeah. Damn. They put Skeet to shame.

#define Py_SIZE(ob)             (((PyVarObject*)(ob))->ob_size)
#define PyType_FastSubclass(t,f)  PyType_HasFeature(t,f)
#ifdef Py_LIMITED_API
#define PyType_HasFeature(t,f)  ((PyType_GetFlags(t) & (f)) != 0)
#else
#define PyType_HasFeature(t,f)  (((t)->tp_flags & (f)) != 0)
#endif

So this is normally really trivial (two indirections and a couple of boolean checks) unless Py_LIMITED_API is on, in which case… ???

Then there’s the indexing and a cast (((PyListObject *)op) -> ob_item[i]) and we’re done.

So there are definitely fewer checks for lists, and the small speed differences certainly imply that it could be relevant.


I think in general, there’s just more type-checking and indirection (->) for Unicode. It seems I’m missing a point, but what?


回答 1

当您遍历大多数容器对象(列表,元组,字典,…)时,迭代器会容器中传递对象。

但是,当您遍历字符串时,必须为传递的每个字符创建一个对象-字符串不是“容器”,就如同列表是容器一样。在迭代创建对象之前,字符串中的各个字符不作为不同的对象存在。

When you iterate over most container objects (lists, tuples, dicts, …), the iterator delivers the objects in the container.

But when you iterate over a string, a new object has to be created for each character delivered – a string is not “a container” in the same sense a list is a container. The individual characters in a string don’t exist as distinct objects before iteration creates those objects.


回答 2

创建字符串的迭代器可能会招致麻烦。而数组在实例化时已经包含一个迭代器。

编辑:

>>> timeit("[x for x in ['a','b','c']]")
0.3818681240081787
>>> timeit("[x for x in 'abc']")
0.3732869625091553

这是使用2.7运行的,但是在我的Mac book pro i7上。这可能是系统配置不同的结果。

You could be incurring and overhead for creating the iterator for the string. Whereas the array already contains an iterator upon instantiation.

EDIT:

>>> timeit("[x for x in ['a','b','c']]")
0.3818681240081787
>>> timeit("[x for x in 'abc']")
0.3732869625091553

This was ran using 2.7, but on my mac book pro i7. This could be the result of a system configuration difference.


有趣好用的Python教程

退出移动版
微信支付
请使用 微信 扫码支付