Converting from a string to a boolean in Python?

Question: Converting from a string to a boolean in Python?

Does anyone know how to convert from a string to a boolean in Python? I found this link, but it doesn’t look like a proper way to do it, i.e. using built-in functionality.

The reason I’m asking this is because I learned about int("string") from here. But when trying bool("string") it always returns True:

>>> bool("False")
True

Answer 0

Really, you just compare the string to whatever you expect to accept as representing true, so you can do this:

s == 'True'

Or to check against a whole bunch of values:

s.lower() in ['true', '1', 't', 'y', 'yes', 'yeah', 'yup', 'certainly', 'uh-huh']

Be cautious when using the following:

>>> bool("foo")
True
>>> bool("")
False

Empty strings evaluate to False, but everything else evaluates to True. So this should not be used for any kind of parsing purposes.
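
As a minimal sketch of the comparison approach above (the name is_truthy and the particular set of accepted words are illustrative assumptions, not part of the original answer), normalizing the input before the membership test keeps stray whitespace and capitalization from changing the result:

```python
# Hypothetical helper; the accepted words are an assumption for illustration.
TRUTHY = {"true", "1", "t", "y", "yes"}

def is_truthy(s):
    """Return True only if s, stripped and lowercased, is an accepted word."""
    return s.strip().lower() in TRUTHY

print(is_truthy(" True "))   # membership test on the normalized text
print(bool("False"))         # bool() only checks for a non-empty string
```

Anything outside the set, including the empty string, comes back as False here; whether that is acceptable depends on how strict the parsing needs to be.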


Answer 1

Use:

import distutils.util
bool(distutils.util.strtobool(some_string))

True values are y, yes, t, true, on and 1; false values are n, no, f, false, off and 0. Raises ValueError if val is anything else.

Be aware that distutils.util.strtobool() returns an integer (1 or 0), so it needs to be wrapped with bool() to get a Boolean value. Also note that distutils was deprecated in Python 3.10 and removed in Python 3.12, so this only works on older versions.


Answer 2

def str2bool(v):
  return v.lower() in ("yes", "true", "t", "1")

Then call it like so:

>>> str2bool("yes")
True
>>> str2bool("no")
False
>>> str2bool("stuff")
False
>>> str2bool("1")
True
>>> str2bool("0")
False

Handling true and false explicitly:

You could also make your function explicitly check against a True list of words and a False list of words. Then if it is in neither list, you could throw an exception.
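
The explicit variant described above could be sketched roughly as follows. The name str2bool_strict and both word lists are assumptions chosen for illustration; adjust them to whatever your input may contain.

```python
# Hypothetical word lists; adjust them to whatever your input may contain.
TRUE_WORDS = frozenset({"yes", "true", "t", "1"})
FALSE_WORDS = frozenset({"no", "false", "f", "0"})

def str2bool_strict(v):
    """Return True/False for known words; raise ValueError for anything else."""
    key = v.strip().lower()
    if key in TRUE_WORDS:
        return True
    if key in FALSE_WORDS:
        return False
    raise ValueError("not a recognized boolean string: %r" % (v,))
```

Unlike the one-line version, "stuff" no longer silently becomes False; the caller has to decide what to do with unexpected input.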


Answer 3

The JSON parser is also useful in general for converting strings to reasonable Python types.

>>> import json
>>> json.loads("false".lower())
False
>>> json.loads("True".lower())
True
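
Because json.loads happily returns numbers, strings, lists and None as well, a wrapper that accepts only booleans may be safer. This is a sketch under that assumption; json_bool is a hypothetical name, not something the json module provides.

```python
import json

def json_bool(s):
    """Parse 'true'/'false' (any case) via json; reject everything else."""
    try:
        value = json.loads(s.strip().lower())
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        raise ValueError("not valid JSON: %r" % (s,))
    if not isinstance(value, bool):
        raise ValueError("JSON value is not a boolean: %r" % (s,))
    return value
```

With this check, inputs such as "1" or "null" raise ValueError instead of quietly producing a non-boolean value.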

Answer 4

Starting with Python 2.6, there is now ast.literal_eval:

>>> import ast
>>> help(ast.literal_eval)
Help on function literal_eval in module ast:

literal_eval(node_or_string)
    Safely evaluate an expression node or a string containing a Python
    expression.  The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None.

This seems to work, as long as you’re sure your strings are going to be either "True" or "False":

>>> ast.literal_eval("True")
True
>>> ast.literal_eval("False")
False
>>> ast.literal_eval("F")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
    return _convert(node_or_string)
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
>>> ast.literal_eval("'False'")
'False'

I wouldn’t normally recommend this, but it is completely built-in and could be the right thing depending on your requirements.
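
If you do go this route, one possible guard is to check the type of the result, since ast.literal_eval also returns strings, numbers and containers. The sketch below assumes you want only real booleans; literal_bool is a hypothetical name.

```python
import ast

def literal_bool(s):
    """Evaluate s as a Python literal and accept it only if it is a bool."""
    try:
        value = ast.literal_eval(s.strip())
    except (ValueError, SyntaxError):
        raise ValueError("not a Python literal: %r" % (s,))
    if not isinstance(value, bool):
        raise ValueError("literal is not a boolean: %r" % (s,))
    return value
```

This turns cases like "'False'" (a quoted string) or "F" into a single, predictable ValueError.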


Answer 5

If you know the string will be either "True" or "False", you could just use eval(s).

>>> eval("True")
True
>>> eval("False")
False

Only use this if you are sure of the contents of the string though, as it will throw an exception if the string does not contain valid Python, and will also execute code contained in the string.


Answer 6

This version keeps the semantics of constructors like int(value) and provides an easy way to define acceptable string values.

def to_bool(value):
    valid = {'true': True, 't': True, '1': True,
             'false': False, 'f': False, '0': False,
             }   

    if isinstance(value, bool):
        return value

    if not isinstance(value, basestring):
        raise ValueError('invalid literal for boolean. Not a string.')

    lower_value = value.lower()
    if lower_value in valid:
        return valid[lower_value]
    else:
        raise ValueError('invalid literal for boolean: "%s"' % value)


# Test cases
assert to_bool('true'), '"true" is True' 
assert to_bool('True'), '"True" is True' 
assert to_bool('TRue'), '"TRue" is True' 
assert to_bool('TRUE'), '"TRUE" is True' 
assert to_bool('T'), '"T" is True' 
assert to_bool('t'), '"t" is True' 
assert to_bool('1'), '"1" is True' 
assert to_bool(True), 'True is True' 
assert to_bool(u'true'), 'unicode "true" is True'

assert to_bool('false') is False, '"false" is False' 
assert to_bool('False') is False, '"False" is False' 
assert to_bool('FAlse') is False, '"FAlse" is False' 
assert to_bool('FALSE') is False, '"FALSE" is False' 
assert to_bool('F') is False, '"F" is False' 
assert to_bool('f') is False, '"f" is False' 
assert to_bool('0') is False, '"0" is False' 
assert to_bool(False) is False, 'False is False'
assert to_bool(u'false') is False, 'unicode "false" is False'

# Expect ValueError to be raised for each invalid parameter...
for bad in ('', 12, [], 'yes', 'FOObar'):
    try:
        to_bool(bad)
    except ValueError:
        pass
    else:
        raise AssertionError('%r should raise ValueError' % (bad,))

Answer 7

Here’s my version. It checks against lists of both positive and negative values, raising an exception for unknown values. It is not limited to strings; any type should work.

def to_bool(value):
    """
       Converts 'something' to boolean. Raises exception for invalid formats
           Possible True  values: 1, True, "1", "TRue", "yes", "y", "t"
           Possible False values: 0, False, None, [], {}, "", "0", "faLse", "no", "n", "f", 0.0, ...
    """
    if str(value).lower() in ("yes", "y", "true",  "t", "1"): return True
    if str(value).lower() in ("no",  "n", "false", "f", "0", "0.0", "", "none", "[]", "{}"): return False
    raise Exception('Invalid value for boolean conversion: ' + str(value))

Sample runs:

>>> to_bool(True)
True
>>> to_bool("tRUe")
True
>>> to_bool("1")
True
>>> to_bool(1)
True
>>> to_bool(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in to_bool
Exception: Invalid value for boolean conversion: 2
>>> to_bool([])
False
>>> to_bool({})
False
>>> to_bool(None)
False
>>> to_bool("Wasssaaaaa")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in to_bool
Exception: Invalid value for boolean conversion: Wasssaaaaa
>>>

Answer 8

You could always do something like:

myString = "false"
val = (myString == "true")

The bit in parens would evaluate to False. This is just another way to do it without having to make an actual function call.


Answer 9

A cool, simple trick (based on what @Alan Marchiori posted), but using yaml:

import yaml

parsed = yaml.safe_load("true")
print(bool(parsed))

If this is too permissive, it can be refined by testing the type of the result. If the type yaml returns is a str, then it can’t be cast to any other type (that I can think of anyway), so you could handle that separately, or just let it be true.

I won’t make any guesses at speed, but since I am working with yaml data under Qt gui anyway, this has a nice symmetry.


Answer 10

I don’t agree with any solution here, as they are too permissive. This is not normally what you want when parsing a string.

So here is the solution I’m using:

def to_bool(bool_str):
    """Parse the string and return the boolean value encoded or raise an exception"""
    if isinstance(bool_str, str) and bool_str:
        if bool_str.lower() in ['true', 't', '1']: return True
        elif bool_str.lower() in ['false', 'f', '0']: return False

    # if we get here, we couldn't parse it
    raise ValueError("%r is not recognized as a boolean value" % (bool_str,))

And the results:

>>> [to_bool(v) for v in ['true','t','1','F','FALSE','0']]
[True, True, True, False, False, False]
>>> to_bool("")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in to_bool
ValueError: '' is not recognized as a boolean value

Just to be clear, because it looks as if my answer somehow offended somebody:

The point is that you don’t want to test for only one value and assume the other. I don’t think you always want to map absolutely everything to the non-parsed value. That produces error-prone code.

So, if you know what you want, code it in.


Answer 11

A dict (really, a defaultdict) gives you a pretty easy way to do this trick:

from collections import defaultdict
bool_mapping = defaultdict(bool) # Will give you False for non-found values
for val in ['True', 'yes', ...]:
    bool_mapping[val] = True

print(bool_mapping['True']) # True
print(bool_mapping['kitten']) # False

It’s really easy to tailor this method to the exact conversion behavior you want — you can fill it with allowed Truthy and Falsy values and let it raise an exception (or return None) when a value isn’t found, or default to True, or default to False, or whatever you want.
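
Two of the tailoring options mentioned above (return None, or raise) might look like this with a plain dict. This is only a sketch: BOOL_MAPPING, lookup_bool and lookup_bool_strict are hypothetical names, and the accepted words are an assumption.

```python
# Hypothetical mapping; the accepted words are an assumption for illustration.
BOOL_MAPPING = {"true": True, "yes": True, "false": False, "no": False}

def lookup_bool(s, default=None):
    """Return the mapped boolean, or `default` when s is not a known word."""
    return BOOL_MAPPING.get(s.strip().lower(), default)

def lookup_bool_strict(s):
    """Return the mapped boolean; a KeyError signals an unknown word."""
    return BOOL_MAPPING[s.strip().lower()]
```

A plain dict is used here instead of defaultdict because a defaultdict inserts a new key on every failed lookup, which may not be what you want for untrusted input.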


Answer 12

You probably already have a solution, but this is for others who are looking for a way to convert a value to a boolean using “standard” false values, including None, [], {}, and "", in addition to false, no, and 0.

def toBoolean( val ):
    """ 
    Get the boolean value of the provided input.

        If the value is a boolean return the value.
        Otherwise check to see if the value is in 
        ["false", "f", "no", "n", "none", "0", "[]", "{}", "" ]
        and returns True if value is not in the list
    """

    if val is True or val is False:
        return val

    falseItems = ["false", "f", "no", "n", "none", "0", "[]", "{}", "" ]

    return not str( val ).strip().lower() in falseItems

Answer 13

You can simply use the built-in function eval():

a='True'
if a is True:
    print 'a is True, a type is', type(a)
else:
    print "a isn't True, a type is", type(a)
b = eval(a)
if b is True:
    print 'b is True, b type is', type(b)
else:
    print "b isn't True, b type is", type(b)

and the output:

a isn't True, a type is <type 'str'>
b is True, b type is <type 'bool'>

Answer 14

Yet another option

from ansible.module_utils.parsing.convert_bool import boolean
boolean('no')
# False
boolean('yEs')
# True
boolean('true')
# True

But in production, if you don’t need ansible and all its dependencies, a good idea is to look at its source code and copy the part of the logic that you need.


Answer 15

The usual rule for casting to a bool is that a few special literals (False, 0, 0.0, (), [], {}) are false and then everything else is true, so I recommend the following:

def boolify(val):
    if (isinstance(val, basestring) and bool(val)):
        return not val in ('False', '0', '0.0')
    else:
        return bool(val)

Answer 16

This is the version I wrote. It combines several of the other solutions into one.

def to_bool(value):
    """
    Converts 'something' to boolean. Raises exception if it gets a string it doesn't handle.
    Case is ignored for strings. These string values are handled:
      True: 'True', "1", "TRue", "yes", "y", "t"
      False: "", "0", "faLse", "no", "n", "f"
    Non-string values are passed to bool.
    """
    if type(value) == type(''):
        if value.lower() in ("yes", "y", "true",  "t", "1"):
            return True
        if value.lower() in ("no",  "n", "false", "f", "0", ""):
            return False
        raise Exception('Invalid value for boolean conversion: ' + value)
    return bool(value)

If it gets a string, it expects specific values and otherwise raises an Exception. If it doesn’t get a string, it just lets the bool constructor figure it out. I tested these cases:

test_cases = [
    ('true', True),
    ('t', True),
    ('yes', True),
    ('y', True),
    ('1', True),
    ('false', False),
    ('f', False),
    ('no', False),
    ('n', False),
    ('0', False),
    ('', False),
    (1, True),
    (0, False),
    (1.0, True),
    (0.0, False),
    ([], False),
    ({}, False),
    ((), False),
    ([1], True),
    ({1:2}, True),
    ((1,), True),
    (None, False),
    (object(), True),
    ]

Answer 17

If you know that your input will be either “True” or “False” then why not use:

def bool_convert(s):
    return s == "True"

Answer 18

I use

# function
def toBool(x):
    return x in ("True","true",True)

# test cases
[[x, toBool(x)] for x in [True,"True","true",False,"False","false",None,1,0,-1,123]]
"""
Result:
[[True, True],
 ['True', True],
 ['true', True],
 [False, False],
 ['False', False],
 ['false', False],
 [None, False],
 [1, True],
 [0, False],
 [-1, False],
 [123, False]]
"""

Answer 19

I like to use the ternary operator for this, since it’s a bit more succinct for something that feels like it shouldn’t be more than 1 line.

True if myString=="True" else False

Answer 20

I realize this is an old post, but some of the solutions require quite a bit of code. Here’s what I ended up using:

def str2bool(value):
    return {"True": True, "true": True}.get(value, False)

Answer 21

Use the str2bool package (pip install str2bool).


Answer 22

If, like me, you just need a boolean from a variable that holds a string, you can use distutils, as mentioned earlier by @jzwiener. However, I could not import and use the module the way he suggested.

Instead, I ended up using it this way on Python 3.7:

from distutils import util # to handle str to bool conversion
enable_deletion = 'False'
enable_deletion = bool(util.strtobool(enable_deletion))

distutils is part of the Python standard library (up to Python 3.11), so there is no need to install anything. Which is great!👍


Answer 23

I would like to share my simple solution: use eval(). It converts the strings True and False to the proper boolean type, provided the string is in title case (True or False, with the first letter capitalized); otherwise it raises an error.

e.g.

>>> eval('False')
False

>>> eval('True')
True

Of course, for a dynamic variable you can simply use .title() to format the boolean string.

>>> x = 'true'
>>> eval(x.title())
True

This will throw an error.

>>> eval('true')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'true' is not defined

>>> eval('false')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'false' is not defined

Answer 24

Here’s a hairy, built-in way to get many of the same answers. Note that although Python considers "" to be false and all other strings to be true, Tcl has a very different idea about things.

>>> import Tkinter
>>> tk = Tkinter.Tk()
>>> var = Tkinter.BooleanVar(tk)
>>> var.set("false")
>>> var.get()
False
>>> var.set("1")
>>> var.get()
True
>>> var.set("[exec 'rm -r /']")
>>> var.get()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/lib-tk/Tkinter.py", line 324, in get
    return self._tk.getboolean(self._tk.globalgetvar(self._name))
_tkinter.TclError: 0expected boolean value but got "[exec 'rm -r /']"
>>> 

A good thing about this is that it is fairly forgiving about the values you can use. It’s lazy about turning strings into values, and it’s hygienic about what it accepts and rejects (notice that if the above statement were given at a Tcl prompt, it would erase the user’s hard disk).

The bad thing is that it requires Tkinter to be available, which is usually, but not universally, true and, more significantly, requires a Tk instance to be created, which is comparatively heavy.

What is considered true or false depends on the behavior of Tcl_GetBoolean, which considers 0, false, no and off to be false and 1, true, yes and on to be true, case-insensitively. Any other string, including the empty string, causes an exception.


Answer 25

def str2bool(str):
  if isinstance(str, basestring) and str.lower() in ['0','false','no']:
    return False
  else:
    return bool(str)

Idea: check whether the string is one you want evaluated as False; otherwise bool() returns True for any non-empty string.


Answer 26

Here’s something I threw together to evaluate the truthiness of a string:

def as_bool(val):
 if val:
  try:
   if not int(val): val=False
  except: pass
  try:
   if val.lower()=="false": val=False
  except: pass
 return bool(val)

More or less the same results as using eval, but safer.


Answer 27

I just had to do this, so maybe I’m late to the party, but someone may find it useful:

def str_to_bool(input, default):
    """
    | Default | not_default_str | input   | result
    | T       |  "false"        | "true"  |  T
    | T       |  "false"        | "false" |  F
    | F       |  "true"         | "true"  |  T
    | F       |  "true"         | "false" |  F

    """
    if default:
        not_default_str = "false"
    else:
        not_default_str = "true"

    if input.lower() == not_default_str:
        return not default
    else:
        return default

Answer 28

If you have control over the entity that’s returning true/false, one option is to have it return 1/0 instead of true/false, then:

boolean_response = bool(int(response))

The extra cast to int handles responses from a network, which are always strings.
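
A small sketch of this idea, assuming the response is a string such as "1" or "0" (parse_flag is a hypothetical name). Note that any nonzero integer, such as "2", also maps to True, and non-numeric text raises ValueError.

```python
def parse_flag(response):
    """Convert a '1'/'0' style string to a bool; non-numeric text raises ValueError."""
    # int() tolerates surrounding whitespace, e.g. a trailing newline.
    return bool(int(response))
```

If the sender ever falls back to "true"/"false", this fails loudly rather than guessing, which is usually the safer behavior.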


Answer 29

By using Python’s built-in eval() function and the .capitalize() method, you can convert any “true” / “false” string (regardless of initial capitalization) to a true Python boolean.

For example:

true_false = "trUE"
type(true_false)

# OUTPUT: <type 'str'>

true_false = eval(true_false.capitalize())
type(true_false)

# OUTPUT: <type 'bool'>

How to use a decimal range() step value?

Question: How to use a decimal range() step value?

Is there a way to step between 0 and 1 by 0.1?

I thought I could do it like the following, but it failed:

for i in range(0, 1, 0.1):
    print i

Instead, it says that the step argument cannot be zero, which I did not expect.


Answer 0

Rather than using a decimal step directly, it’s much safer to express this in terms of how many points you want. Otherwise, floating-point rounding error is likely to give you a wrong result.

You can use the linspace function from the NumPy library (which isn’t part of the standard library but is relatively easy to obtain). linspace takes a number of points to return, and also lets you specify whether or not to include the right endpoint:

>>> import numpy as np
>>> np.linspace(0,1,11)
array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9,  1. ])
>>> np.linspace(0,1,10,endpoint=False)
array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9])

If you really want to use a floating-point step value, you can, with numpy.arange.

>>> import numpy as np
>>> np.arange(0.0, 1.0, 0.1)
array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9])

Floating-point rounding error will cause problems, though. Here’s a simple case where rounding error causes arange to produce a length-4 array when it should only produce 3 numbers:

>>> np.arange(1, 1.3, 0.1)
array([1. , 1.1, 1.2, 1.3])

Answer 1

Python’s range() can only do integers, not floating point. In your specific case, you can use a list comprehension instead:

[x * 0.1 for x in range(0, 10)]

(Replace the call to range with that expression.)

For the more general case, you may want to write a custom function or generator.
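
One possible shape for such a generator is sketched below. It takes a count rather than a stop value, which is an assumption made here to sidestep the endpoint problem; frange is a hypothetical name.

```python
def frange(start, step, count):
    """Yield `count` values: start, start + step, start + 2*step, ..."""
    for i in range(count):
        # Multiplying by the index avoids accumulating rounding error
        # from repeated addition.
        yield start + i * step

values = list(frange(0.0, 0.1, 10))
print(len(values))
```

The individual values are still ordinary floats, so some of them will not print as exact decimals, but the number of values is always exactly what you asked for.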


Answer 2

Building on ‘xrange([start], stop[, step])’, you can define a generator that accepts and produces any type you choose (stick to types supporting + and <):

>>> def drange(start, stop, step):
...     r = start
...     while r < stop:
...         yield r
...         r += step
...         
>>> i0=drange(0.0, 1.0, 0.1)
>>> ["%g" % x for x in i0]
['0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7', '0.8', '0.9', '1']
>>> 
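
Note the trailing '1' in that output: the loop was meant to stop before 1.0. A small sketch of why repeated addition drifts, using nothing beyond the accumulation that drange performs:

```python
r = 0.0
for _ in range(10):
    r += 0.1  # the same accumulation drange performs

# After ten additions r is slightly below 1.0, so `r < stop` is still true
# and the generator yields one value more than expected.
print(r, r < 1.0)
```

Computing each value as start + i * step from an integer index avoids this drift.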

Answer 3

Increase the magnitude of i for the loop and then reduce it when you need it.

for i * 100 in range(0, 100, 10):
    print i / 100.0

EDIT: I honestly cannot remember why I thought that would work syntactically

for i in range(0, 11, 1):
    print i / 10.0

That should have the desired output.


Answer 4

SciPy has a built-in function arange that generalizes Python’s range() constructor to handle floats.

from scipy import arange


Answer 5

NumPy is a bit overkill, I think.

[p/10 for p in range(0, 10)]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

Generally speaking, to do a step-by-1/x up to y you would do

x=100
y=2
[p/x for p in range(0, int(x*y))]
[0.0, 0.01, 0.02, 0.03, ..., 1.97, 1.98, 1.99]

(1/x produced less rounding noise when I tested).


Answer 6

Similar to R’s seq function, this one returns a sequence in any order given the correct step value. The last value is equal to the stop value.

def seq(start, stop, step=1):
    n = int(round((stop - start)/float(step)))
    # n counts the steps from start to stop, so n + 1 values are returned.
    if n >= 0:
        return [start + step*i for i in range(n + 1)]
    else:
        return []

Results

seq(1, 5, 0.5)

[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

seq(10, 0, -1)

[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

seq(10, 0, -2)

[10, 8, 6, 4, 2, 0]

seq(1, 1)

[1]


Answer 7

The range() built-in function returns a sequence of integer values, I’m afraid, so you can’t use it to do a decimal step.

I’d say just use a while loop:

i = 0.0
while i <= 1.0:
    print i
    i += 0.1

If you’re curious, Python is converting your 0.1 to 0, which is why it’s telling you the argument can’t be zero.


Answer 8

Here’s a solution using itertools:

import itertools

def seq(start, end, step):
    if step == 0:
        raise ValueError("step must not be 0")
    sample_count = int(abs(end - start) / step)
    return itertools.islice(itertools.count(start, step), sample_count)

Usage Example:

for i in seq(0, 1, 0.1):
    print(i)

Answer 9

[x * 0.1 for x in range(0, 10)] 

in Python 2.7x gives you the result of:

[0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9]

but if you use:

[ round(x * 0.1, 1) for x in range(0, 10)]

gives you the desired:

[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
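
The stray digits come from binary floating point, not from the list comprehension. A quick check of a single element, as a sketch (Python 3 shows the same values):

```python
x = 3 * 0.1
print(x)            # not exactly 0.3
print(round(x, 1))  # rounded to one decimal place

assert x != 0.3
assert round(x, 1) == 0.3
```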


Answer 10

import numpy as np
for i in np.arange(0, 1, 0.1): 
    print i 

Answer 11

And if you do this often, you might want to save the generated list r

r=map(lambda x: x/10.0,range(0,10))
for i in r:
    print i

Answer 12

more_itertools is a third-party library that implements a numeric_range tool:

import more_itertools as mit


for x in mit.numeric_range(0, 1, 0.1):
    print("{:.1f}".format(x))

Output

0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9

This tool also works for Decimal and Fraction.
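
If installing a third-party package is not an option, exact types from the standard library can be stepped by hand. This is a sketch with fractions.Fraction, not the more_itertools implementation; the list name tenths is made up for the example.

```python
from fractions import Fraction

# Exact rational steps of 1/10; no binary rounding is involved
# until the values are converted to float for display.
tenths = [Fraction(i, 10) for i in range(10)]
print([float(t) for t in tenths])
```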


Answer 13

My versions use the original range function to create multiplicative indices for the shift. This allows the same syntax as the original range function. I have made two versions, one using float and one using Decimal, because I found that in some cases I wanted to avoid the roundoff drift introduced by floating-point arithmetic.

It is consistent with empty set results as in range/xrange.

Passing only a single numeric value to either function will return the standard range output to the integer ceiling value of the input parameter (so if you gave it 5.5, it would return range(6).)

Edit: the code below is now available as a package on PyPI: Franges

## frange.py
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range

def frange(start, stop = None, step = 1):
    """frange generates a set of floating point values over the 
    range [start, stop) with step size step

    frange([start,] stop [, step ])"""

    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # create a generator expression for the index values
        indices = (i for i in _xrange(0, int((stop-start)/step)))  
        # yield results
        for i in indices:
            yield start + step*i

## drange.py
import decimal
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range

def drange(start, stop = None, step = 1, precision = None):
    """drange generates a set of Decimal values over the
    range [start, stop) with step size step

    drange([start,] stop, [step [,precision]])"""

    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # find precision
        if precision is not None:
            decimal.getcontext().prec = precision
        # convert values to decimals
        start = decimal.Decimal(start)
        stop = decimal.Decimal(stop)
        step = decimal.Decimal(step)
        # create a generator expression for the index values
        indices = (
            i for i in _xrange(
                0,
                # range() needs an int, not a Decimal
                int(((stop-start)/step).to_integral_value())
            )
        )
        # yield results
        for i in indices:
            yield float(start + step*i)

## testranges.py
import frange
import drange
list(frange.frange(0, 2, 0.5)) # [0.0, 0.5, 1.0, 1.5]
list(drange.drange(0, 2, 0.5, precision = 6)) # [0.0, 0.5, 1.0, 1.5]
list(frange.frange(3)) # [0, 1, 2]
list(frange.frange(3.5)) # [0, 1, 2, 3]
list(frange.frange(0,10, -1)) # []

Answer 14

This is my solution to get ranges with float steps.
Using this function it’s not necessary to import numpy, nor install it.
I’m pretty sure that it could be improved and optimized. Feel free to do it and post it here.

from __future__ import division
from math import log

def xfrange(start, stop, step):

    old_start = start #backup this value

    digits = int(round(log(10000, 10)))+1 #get number of digits
    magnitude = 10**digits
    stop = int(magnitude * stop) #convert from 
    step = int(magnitude * step) #0.1 to 10 (e.g.)

    if start == 0:
        start = 10**(digits-1)
    else:
        start = 10**(digits)*start

    data = []   #create array

    #calc number of iterations
    end_loop = int((stop-start)//step)
    if old_start == 0:
        end_loop += 1

    acc = start

    for i in xrange(0, end_loop):
        data.append(acc/magnitude)
        acc += step

    return data

print xfrange(1, 2.1, 0.1)
print xfrange(0, 1.1, 0.1)
print xfrange(-1, 0.1, 0.1)

The output is:

[1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
[-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0]

Answer 15

For the sake of completeness, a functional (recursive) solution:

def frange(a,b,s):
  return [] if s > 0 and a > b or s < 0 and a < b or s==0 else [a]+frange(a+s,b,s)

Answer 16

You can use this function:

def frange(start,end,step):
    return map(lambda x: x*step, range(int(start*1./step),int(end*1./step)))

Answer 17

The trick to avoid the round-off problem is to use a separate number to move through the range, one that starts half a step ahead of start.

# floating point range
def frange(a, b, stp=1.0):
  i = a+stp/2.0
  while i<b:
    yield a
    a += stp
    i += stp
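
A quick check of this generator on the case that trips up numpy.arange elsewhere in this thread. The function is repeated here only so the snippet runs on its own:

```python
def frange(a, b, stp=1.0):
    i = a + stp / 2.0   # probe value, half a step ahead of a
    while i < b:
        yield a
        a += stp
        i += stp


values = list(frange(1, 1.3, 0.1))
print(len(values))  # three values; 1.3 itself is excluded
```

Because the comparison is made on the probe value, a small rounding error in a cannot push an extra element into the result.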

Alternatively, numpy.arange can be used.


Answer 18

This can be done with the NumPy library: its arange() function accepts float steps. It returns a NumPy array, though, which you can convert to a list with tolist() for convenience.

import numpy as np

for i in np.arange(0, 1, 0.1).tolist():
   print i

Answer 19

My answer is similar to others using map(), without need of NumPy, and without using lambda (though you could). To get a list of float values from 0.0 to t_max in steps of dt:

def xdt(n):
    return dt*float(n)
tlist  = map(xdt, range(int(t_max/dt)+1))

Answer 20

Surprised no one has yet mentioned the recommended solution in the Python 3 docs:

See also:

  • The linspace recipe shows how to implement a lazy version of range that is suitable for floating point applications.

Once defined, the recipe is easy to use and does not require numpy or any other external libraries, but functions like numpy.linspace(). Note that rather than a step argument, the third num argument specifies the number of desired values, for example:

print(linspace(0, 10, 5))
# linspace(0, 10, 5)
print(list(linspace(0, 10, 5)))
# [0.0, 2.5, 5.0, 7.5, 10]
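
If you are starting from a step size, the matching num can be derived first. This is a sketch that assumes the step divides the interval evenly; num_from_step is a hypothetical helper, not part of the recipe.

```python
def num_from_step(start, stop, step):
    """Number of points for an inclusive range with the given step."""
    # round() absorbs small floating-point error in the division
    return int(round((stop - start) / step)) + 1


print(num_from_step(0, 10, 2.5))
print(num_from_step(0, 1, 0.1))
```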

I quote a modified version of the full Python 3 recipe from Andrew Barnert below:

import collections.abc
import numbers

class linspace(collections.abc.Sequence):
    """linspace(start, stop, num) -> linspace object

    Return a virtual sequence of num numbers from start to stop (inclusive).

    If you need a half-open range, use linspace(start, stop, num+1)[:-1].
    """
    def __init__(self, start, stop, num):
        if not isinstance(num, numbers.Integral) or num <= 1:
            raise ValueError('num must be an integer > 1')
        self.start, self.stop, self.num = start, stop, num
        self.step = (stop-start)/(num-1)
    def __len__(self):
        return self.num
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(*i.indices(len(self)))]
        if i < 0:
            i = self.num + i
        if i >= self.num:
            raise IndexError('linspace object index out of range')
        if i == self.num-1:
            return self.stop
        return self.start + i*self.step
    def __repr__(self):
        return '{}({}, {}, {})'.format(type(self).__name__,
                                       self.start, self.stop, self.num)
    def __eq__(self, other):
        if not isinstance(other, linspace):
            return False
        return ((self.start, self.stop, self.num) ==
                (other.start, other.stop, other.num))
    def __ne__(self, other):
        return not self==other
    def __hash__(self):
        return hash((type(self), self.start, self.stop, self.num))

Answer 21

To counter the float precision issues, you could use the Decimal type from the decimal module.

This demands an extra effort of converting to Decimal from int or float while writing the code, but you can instead pass str and modify the function if that sort of convenience is indeed necessary.

from decimal import Decimal
from decimal import Decimal as D


def decimal_range(*args):

    zero, one = Decimal('0'), Decimal('1')

    if len(args) == 1:
        start, stop, step = zero, args[0], one
    elif len(args) == 2:
        start, stop, step = args + (one,)
    elif len(args) == 3:
        start, stop, step = args
    else:
        raise ValueError('Expected 1 to 3 arguments, got %s' % len(args))

    if not all([type(arg) == Decimal for arg in (start, stop, step)]):
        raise ValueError('Arguments must be passed as <type: Decimal>')

    # neglect bad cases
    if (start == stop) or (start > stop and step >= zero) or \
                          (start < stop and step <= zero):
        return []

    current = start
    while abs(current) < abs(stop):
        yield current
        current += step

Sample outputs –

list(decimal_range(D('2')))
# [Decimal('0'), Decimal('1')]
list(decimal_range(D('2'), D('4.5')))
# [Decimal('2'), Decimal('3'), Decimal('4')]
list(decimal_range(D('2'), D('4.5'), D('0.5')))
# [Decimal('2'), Decimal('2.5'), Decimal('3.0'), Decimal('3.5'), Decimal('4.0')]
list(decimal_range(D('2'), D('4.5'), D('-0.5')))
# []
list(decimal_range(D('2'), D('-4.5'), D('-0.5')))
# [Decimal('2'),
#  Decimal('1.5'),
#  Decimal('1.0'),
#  Decimal('0.5'),
#  Decimal('0.0'),
#  Decimal('-0.5'),
#  Decimal('-1.0'),
#  Decimal('-1.5'),
#  Decimal('-2.0'),
#  Decimal('-2.5'),
#  Decimal('-3.0'),
#  Decimal('-3.5'),
#  Decimal('-4.0')]
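
The str-based convenience mentioned above could look like this. to_decimal is a hypothetical helper, not part of the answer; it routes floats through str() so that 0.1 becomes Decimal('0.1') rather than the long binary expansion that Decimal(0.1) produces.

```python
from decimal import Decimal


def to_decimal(value):
    """Convert int, float, str or Decimal to Decimal (illustrative helper)."""
    if isinstance(value, Decimal):
        return value
    # str(0.1) is '0.1', so the short decimal form is preserved
    return Decimal(str(value))


print(to_decimal(0.1), to_decimal("4.5"), to_decimal(2))
```

Arguments could be passed through such a helper before the type check in decimal_range.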

Answer 22

Add auto-correction for the possibility of an incorrect sign on step:

def frange(start,step,stop):
    step *= 2*((stop>start)^(step<0))-1
    return [start+i*step for i in range(int((stop-start)/step))]
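
The first line of the function is terse: the XOR of the two comparisons is True when the sign of step already matches the direction of travel, and the expression maps True/False to +1/-1. A sketch that isolates that factor (sign_fix is a made-up name):

```python
def sign_fix(start, stop, step):
    """Return +1 if step already points from start toward stop, else -1."""
    return 2 * ((stop > start) ^ (step < 0)) - 1


print(sign_fix(0, 1, 0.1))    # ascending, positive step
print(sign_fix(0, 1, -0.1))   # ascending, negative step: needs flipping
print(sign_fix(1, 0, 0.1))    # descending, positive step: needs flipping
print(sign_fix(1, 0, -0.1))   # descending, negative step
```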

Answer 23

My solution:

def seq(start, stop, step=1, digit=0):
    x = float(start)
    v = []
    while x <= stop:
        v.append(round(x,digit))
        x += step
    return v

Answer 24

Best Solution: no rounding error
_________________________________________________________________________________

>>> step = .1
>>> N = 10     # number of data points
>>> [ x / pow(step, -1) for x in range(0, N + 1) ]

[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

_________________________________________________________________________________

Or, for a set range instead of set data points (e.g. continuous function), use:

>>> step = .1
>>> rnge = 1     # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ x / pow(step,-1) for x in range(0, N + 1) ]

[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

To implement a function: replace x / pow(step, -1) with f( x / pow(step, -1) ), and define f.
For example:

>>> import math
>>> def f(x):
        return math.sin(x)

>>> step = .1
>>> rnge = 1     # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ f( x / pow(step,-1) ) for x in range(0, N + 1) ]

[0.0, 0.09983341664682815, 0.19866933079506122, 0.29552020666133955, 0.3894183423086505, 
 0.479425538604203, 0.5646424733950354, 0.644217687237691, 0.7173560908995228,
 0.7833269096274834, 0.8414709848078965]

Answer 25

Here start and stop are both inclusive (usually stop is excluded). It needs no imports and uses a generator.

def rangef(start, stop, step, fround=5):
    """
    Yields sequence of numbers from start (inclusive) to stop (inclusive)
    by step (increment) with rounding set to n digits.

    :param start: start of sequence
    :param stop: end of sequence
    :param step: int or float increment (e.g. 1 or 0.001)
    :param fround: float rounding, n decimal places
    :return:
    """
    try:
        i = 0
        while stop >= start and step > 0:
            if i==0:
                yield start
            elif start >= stop:
                yield stop
            elif start < stop:
                if start == 0:
                    yield 0
                if start != 0:
                    yield start
            i += 1
            start += step
            start = round(start, fround)
        else:
            pass
    except TypeError as e:
        yield "type-error({})".format(e)
    else:
        pass


# passing
print(list(rangef(-100.0,10.0,1)))
print(list(rangef(-100,0,0.5)))
print(list(rangef(-1,1,0.2)))
print(list(rangef(-1,1,0.1)))
print(list(rangef(-1,1,0.05)))
print(list(rangef(-1,1,0.02)))
print(list(rangef(-1,1,0.01)))
print(list(rangef(-1,1,0.005)))
# failing: type-error:
print(list(rangef("1","10","1")))
print(list(rangef(1,10,"1")))

Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]


Answer 26

I know I’m late to the party here, but here’s a trivial generator solution that’s working in 3.6:

def floatRange(*args):
    start, step = 0, 1
    if len(args) == 1:
        stop = args[0]
    elif len(args) == 2:
        start, stop = args[0], args[1]
    elif len(args) == 3:
        start, stop, step = args[0], args[1], args[2]
    else:
        raise TypeError("floatRange accepts 1, 2, or 3 arguments. ({0} given)".format(len(args)))
    for num in start, step, stop:
        if not isinstance(num, (int, float)):
            raise TypeError("floatRange only accepts float and integer arguments. ({0} : {1} given)".format(type(num), str(num)))
    for x in range(int((stop-start)/step)):
        yield start + (x * step)
    return

Then you can call it just like the original range()… There’s no error handling, but let me know if there is an error that can reasonably be caught, and I’ll update. Or you can update it. This is StackOverflow.


Answer 27

Here is my solution which works fine with float_range(-1, 0, 0.01) and works without floating point representation errors. It is not very fast, but works fine:

from decimal import Decimal

def get_multiplier(_from, _to, step):
    digits = []
    for number in [_from, _to, step]:
        pre = Decimal(str(number)) % 1
        digit = len(str(pre)) - 2
        digits.append(digit)
    max_digits = max(digits)
    return float(10 ** (max_digits))


def float_range(_from, _to, step, include=False):
    """Generates a range list of floating point values over the Range [start, stop]
       with step size step
       include=True - allows to include right value to if possible
       !! Works fine with floating point representation !!
    """
    mult = get_multiplier(_from, _to, step)
    # print mult
    int_from = int(round(_from * mult))
    int_to = int(round(_to * mult))
    int_step = int(round(step * mult))
    # print int_from,int_to,int_step
    if include:
        result = range(int_from, int_to + int_step, int_step)
        result = [r for r in result if r <= int_to]
    else:
        result = range(int_from, int_to, int_step)
    # print result
    float_result = [r / mult for r in result]
    return float_result


print float_range(-1, 0, 0.01,include=False)

assert float_range(1.01, 2.06, 5.05 % 1, True) ==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01, 2.06]

assert float_range(1.01, 2.06, 5.05 % 1, False)==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01]

Answer 28

I am only a beginner, but I had the same problem when simulating some calculations. Here is how I attempted to work this out; it seems to work with decimal steps.

I am also quite lazy, so I found it hard to write my own range function.

Basically, I changed my xrange(0.0, 1.0, 0.01) to xrange(0, 100, 1) and divided by 100.0 inside the loop. I was also concerned about rounding errors, so I decided to test whether there are any. I have heard that if, for example, a 0.01 obtained from a calculation isn't exactly the float 0.01, comparing them returns False (if I am wrong, please let me know).

So I decided to test whether my solution works for my range by running a short test:

for d100 in xrange(0, 100, 1):
    d = d100 / 100.0
    fl = float("0.00"[:4 - len(str(d100))] + str(d100))
    print d, "=", fl , d == fl

And it printed True for each.

Now, if I’m getting it totally wrong, please let me know.
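The integer-scaling idea in this answer can be sketched as a small generator. This is only a minimal sketch in Python 3 syntax, not the answer's own code; the name scaled_range and its scale parameter are hypothetical, and it assumes all three arguments become whole numbers once multiplied by scale.

```python
def scaled_range(start, stop, step, scale):
    """Yield floats from start up to (but excluding) stop.

    Hypothetical helper: `scale` is the power of ten that turns all three
    arguments into whole numbers (100 for a step of 0.01).
    """
    int_start = round(start * scale)
    int_stop = round(stop * scale)
    int_step = round(step * scale)
    for i in range(int_start, int_stop, int_step):
        # One division per value, so rounding errors never accumulate.
        yield i / scale


values = list(scaled_range(0.0, 1.0, 0.01, 100))
print(values[:3], values[-1], len(values))
```

Because each value comes from a single division rather than repeated addition, every element compares equal to the float literal with the same two decimals.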


Answer 29

This one-liner will not clutter your code. The sign of the step parameter is important.

def frange(start, stop, step):
    return [x*step+start for x in range(0,round(abs((stop-start)/step)+0.5001),
        int((stop-start)/step<0)*-2+1)]

How to get a function name as a string?

Question: How to get a function name as a string?

In Python, how do I get a function name as a string, without calling the function?

def my_function():
    pass

print get_function_name_as_string(my_function) # my_function is not in quotes

should output "my_function".

Is such a function available in Python? If not, any ideas on how to implement get_function_name_as_string in Python?


Answer 0

my_function.__name__

Using __name__ is the preferred method as it applies uniformly. Unlike func_name, it works on built-in functions as well:

>>> import time
>>> time.time.func_name
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'builtin_function_or_method' object has no attribute 'func_name'
>>> time.time.__name__ 
'time'

Also, the double underscores indicate to the reader that this is a special attribute. As a bonus, classes and modules have a __name__ attribute too, so you only have to remember one special name.
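To illustrate the uniformity claimed here, the short sketch below reads __name__ from a plain function, a built-in function, a class, and a module. The names my_function and MyClass are just placeholders for this example.

```python
import time


def my_function():
    pass


class MyClass:
    pass


names = [
    my_function.__name__,  # plain function
    time.time.__name__,    # built-in function
    MyClass.__name__,      # class
    time.__name__,         # module
]
print(names)
```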


Answer 1

To get the current function’s or method’s name from inside it, consider:

import inspect

this_function_name = inspect.currentframe().f_code.co_name

sys._getframe also works instead of inspect.currentframe although the latter avoids accessing a private function.

To get the calling function’s name instead, consider f_back as in inspect.currentframe().f_back.f_code.co_name.


If also using mypy, it can complain that:

error: Item "None" of "Optional[FrameType]" has no attribute "f_code"

To suppress the above error, consider:

import inspect
import types
from typing import cast

this_function_name = cast(types.FrameType, inspect.currentframe()).f_code.co_name
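As an alternative to cast, an explicit None check also keeps a type checker satisfied and handles the case where inspect.currentframe() returns None. The sketch below is an assumption-laden illustration rather than part of the original answer; report and outer are hypothetical names.

```python
import inspect


def report():
    """Return (own name, caller's name), or None where no frame is available."""
    frame = inspect.currentframe()
    if frame is None:
        # Some Python implementations do not provide stack frames.
        return None, None
    own = frame.f_code.co_name
    caller = frame.f_back.f_code.co_name if frame.f_back is not None else None
    return own, caller


def outer():
    return report()


print(outer())
```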

Answer 2

my_function.func_name

There are also other fun properties of functions. Type dir(func_name) to list them. func_name.func_code.co_code is the compiled function, stored as a string.

import dis
dis.dis(my_function)

will display the code in almost human readable format. :)


Answer 3

This function will return the caller’s function name.

def func_name():
    import traceback
    return traceback.extract_stack(None, 2)[0][2]

It is like Albert Vonpupp’s answer with a friendly wrapper.


Answer 4

If you’re interested in class methods too, Python 3.3+ has __qualname__ in addition to __name__.

def my_function():
    pass

class MyClass(object):
    def method(self):
        pass

print(my_function.__name__)         # gives "my_function"
print(MyClass.method.__name__)      # gives "method"

print(my_function.__qualname__)     # gives "my_function"
print(MyClass.method.__qualname__)  # gives "MyClass.method"

Answer 5

I like using a function decorator. I added a class that also times the function. Assume gLog is a standard Python logger:

class EnterExitLog():
    def __init__(self, funcName):
        self.funcName = funcName

    def __enter__(self):
        gLog.debug('Started: %s' % self.funcName)
        self.init_time = datetime.datetime.now()
        return self

    def __exit__(self, type, value, tb):
        gLog.debug('Finished: %s in: %s seconds' % (self.funcName, datetime.datetime.now() - self.init_time))

def func_timer_decorator(func):
    def func_wrapper(*args, **kwargs):
        with EnterExitLog(func.__name__):
            return func(*args, **kwargs)

    return func_wrapper

So now all you have to do with your function is decorate it, and voilà:

@func_timer_decorator
def my_func():
    pass
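One thing to watch with a decorator like this: the decorated function's own __name__ becomes the wrapper's name unless the wrapper is marked with functools.wraps. The following is a minimal sketch of the same timing idea under that assumption; log_timing and logger are hypothetical names, not the answer's gLog setup.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)  # stand-in for the answer's gLog


def log_timing(func):
    @functools.wraps(func)  # copies __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        logger.debug("Started: %s", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.debug("Finished: %s in %.6f seconds", func.__name__, elapsed)

    return wrapper


@log_timing
def my_func():
    return 42


print(my_func.__name__)
```

Without functools.wraps, my_func.__name__ would report the wrapper's name instead of the original function's.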

Answer 6

sys._getframe() is not guaranteed to be available in all implementations of Python (see ref); you can use the traceback module to do the same thing, e.g.:

import traceback
def who_am_i():
   stack = traceback.extract_stack()
   filename, codeline, funcName, text = stack[-2]

   return funcName

Here stack[-1] holds the details of the current function (who_am_i itself), so stack[-2] is the function that called it.
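In Python 3.5 and later, the entries returned by traceback.extract_stack() also expose named attributes, which avoids unpacking the tuple by position. A small sketch, with who_called_me and example as hypothetical names:

```python
import traceback


def who_called_me():
    stack = traceback.extract_stack()
    # stack[-1] is this function's own frame; stack[-2] is whoever called it.
    return stack[-2].name


def example():
    return who_called_me()


print(example())
```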


Answer 7

import inspect

def foo():
   print(inspect.stack()[0][3])

where

  • stack()[0] is the frame record of the current function (stack()[1] would be its caller)

  • index 3 of that record is the name of the function as a string


Answer 8

As an extension of @Demyn’s answer, I created some utility functions which print the current function’s name and current function’s arguments:

import inspect
import logging
import traceback

def get_function_name():
    return traceback.extract_stack(None, 2)[0][2]

def get_function_parameters_and_values():
    frame = inspect.currentframe().f_back
    args, _, _, values = inspect.getargvalues(frame)
    return ([(i, values[i]) for i in args])

def my_func(a, b, c=None):
    logging.info('Running ' + get_function_name() + '(' + str(get_function_parameters_and_values()) +')')
    pass

logger = logging.getLogger()
handler = logging.StreamHandler()
formatter = logging.Formatter(
    '%(asctime)s [%(levelname)s] -> %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

my_func(1, 3) # 2016-03-25 17:16:06,927 [INFO] -> Running my_func([('a', 1), ('b', 3), ('c', None)])

Answer 9

If you just want to get the name of a function, here is some simple code for that. Let's say you have these functions defined:

def function1():
    print("function1")

def function2():
    print("function2")

def function3():
    print("function3")

print(function1.__name__)

The output will be function1.

Now let's say you have these functions in a list:

a = [function1, function2, function3]

To get the names of the functions:

for i in a:
    print(i.__name__)

The output will be:

function1
function2
function3


Answer 10

I’ve seen a few answers that utilized decorators, though I felt a few were a bit verbose. Here’s something I use for logging function names as well as their respective input and output values. I’ve adapted it here to just print the info rather than creating a log file and adapted it to apply to the OP specific example.

def debug(func=None):
    def wrapper(*args, **kwargs):
        try:
            function_name = func.__func__.__qualname__
        except AttributeError:
            function_name = func.__qualname__
        return func(*args, **kwargs, function_name=function_name)
    return wrapper

@debug
def my_function(**kwargs):
    print(kwargs)

my_function()

Output:

{'function_name': 'my_function'}

How to overcome "datetime.datetime not JSON serializable"?

Question: How to overcome "datetime.datetime not JSON serializable"?

I have a basic dict as follows:

sample = {}
sample['title'] = "String"
sample['somedate'] = somedatetimehere

When I try to do jsonify(sample) I get:

TypeError: datetime.datetime(2012, 8, 8, 21, 46, 24, 862000) is not JSON serializable

What can I do such that my dictionary sample can overcome the error above?

Note: Though it may not be relevant, the dictionaries are generated from the retrieval of records out of mongodb where when I print out str(sample['somedate']), the output is 2012-08-08 21:46:24.862000.


Answer 0

Updated for 2018

The original answer accommodated the way MongoDB “date” fields were represented as:

{"$date": 1506816000000}

If you want a generic Python solution for serializing datetime to json, check out @jjmontes’ answer for a quick solution which requires no dependencies.


As you are using mongoengine (per comments) and pymongo is a dependency, pymongo has built-in utilities to help with json serialization:
http://api.mongodb.org/python/1.10.1/api/bson/json_util.html

Example usage (serialization):

from bson import json_util
import json

json.dumps(anObject, default=json_util.default)

Example usage (deserialization):

json.loads(aJsonString, object_hook=json_util.object_hook)

Django

Django provides a native DjangoJSONEncoder serializer that deals with this kind of data properly.

See https://docs.djangoproject.com/en/dev/topics/serialization/#djangojsonencoder

from django.core.serializers.json import DjangoJSONEncoder

return json.dumps(
  item,
  sort_keys=True,
  indent=1,
  cls=DjangoJSONEncoder
)

One difference I’ve noticed between DjangoJSONEncoder and using a custom default like this:

import datetime
import json

def default(o):
    if isinstance(o, (datetime.date, datetime.datetime)):
        return o.isoformat()

return json.dumps(
  item,
  sort_keys=True,
  indent=1,
  default=default
)

is that Django strips a bit of the data (it truncates microseconds to milliseconds):

 "last_login": "2018-08-03T10:51:42.990", # DjangoJSONEncoder 
 "last_login": "2018-08-03T10:51:42.990239", # default

So, you may need to be careful about that in some cases.
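If the millisecond form is what you want from a custom default, the timespec argument of datetime.isoformat() (Python 3.6+) produces it without Django. The sketch below is an assumption-based illustration, not DjangoJSONEncoder itself; unlike the default shown above, it raises TypeError for unsupported types instead of silently returning None.

```python
import datetime
import json


def default(o):
    if isinstance(o, datetime.datetime):
        # Truncate to milliseconds, matching the shorter output shown above.
        return o.isoformat(timespec="milliseconds")
    if isinstance(o, datetime.date):
        return o.isoformat()
    raise TypeError("Type %s is not JSON serializable" % type(o).__name__)


stamp = datetime.datetime(2018, 8, 3, 10, 51, 42, 990239)
text = json.dumps({"last_login": stamp}, default=default)
print(text)
```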


Answer 1

My quick & dirty JSON dump that eats dates and everything:

json.dumps(my_dictionary, indent=4, sort_keys=True, default=str)
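To show what "eats everything" means in practice, the sketch below passes a datetime and a Decimal through default=str. The payload is made up for illustration. Note that every unsupported value becomes a plain string, so the type information is lost on the way out.

```python
import datetime
import decimal
import json

payload = {
    "when": datetime.datetime(2012, 8, 8, 21, 46, 24, 862000),
    "price": decimal.Decimal("9.99"),
}
# str() is called on any value json cannot serialize natively.
text = json.dumps(payload, sort_keys=True, default=str)
print(text)
```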

Answer 2

Building on other answers, a simple solution based on a specific serializer that just converts datetime.datetime and datetime.date objects to strings.

from datetime import date, datetime

def json_serial(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError ("Type %s not serializable" % type(obj))

As seen, the code just checks whether the object is of class datetime.datetime or datetime.date, and then uses .isoformat() to produce a serialized version of it in ISO 8601 format, YYYY-MM-DDTHH:MM:SS (which is easily decoded by JavaScript). If more complex serialized representations are sought, other code could be used instead of .isoformat() (see other answers to this question for examples). The code ends by raising an exception, to deal with the case where it is called with a non-serializable type.

This json_serial function can be used as follows:

from datetime import datetime
from json import dumps

print(dumps(datetime.now(), default=json_serial))

The details about how the default parameter to json.dumps works can be found in Section Basic Usage of the json module documentation.


Answer 3

I have just encountered this problem and my solution is to subclass json.JSONEncoder:

from datetime import datetime
import json

class DateTimeEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()

        return json.JSONEncoder.default(self, o)

In your call, do something like json.dumps(yourobj, cls=DateTimeEncoder). I got the .isoformat() call from one of the answers above.


Answer 4

Convert the date to a string

sample['somedate'] = str( datetime.utcnow() )

Answer 5

For others who do not need or want to use the pymongo library for this, you can achieve datetime JSON conversion easily with this small snippet:

def default(obj):
    """Default JSON serializer."""
    import calendar, datetime

    if isinstance(obj, datetime.datetime):
        if obj.utcoffset() is not None:
            obj = obj - obj.utcoffset()
        millis = int(
            calendar.timegm(obj.timetuple()) * 1000 +
            obj.microsecond / 1000
        )
        return millis
    raise TypeError('Not sure how to serialize %s' % (obj,))

Then use it like so:

import datetime, json
print(json.dumps(datetime.datetime.now(), default=default))

output: 

'1365091796124'
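The same epoch-milliseconds conversion can be sketched with pure integer arithmetic on timedelta objects, which sidesteps float rounding. This is a hedged variant, not the answer's code: to_epoch_millis and EPOCH are hypothetical names, and naive datetimes are assumed to be in UTC, as in the snippet above.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    if dt.tzinfo is None:
        # Assumption: naive values are already in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # timedelta // timedelta yields an exact integer.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def default(obj):
    if isinstance(obj, datetime):
        return to_epoch_millis(obj)
    raise TypeError("Not sure how to serialize %r" % (obj,))


moment = datetime(2013, 4, 4, 16, 9, 56, 124000, tzinfo=timezone.utc)
text = json.dumps({"at": moment}, default=default)
print(text)
```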

Answer 6

Here is my solution:

# -*- coding: utf-8 -*-
import json


class DatetimeEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return super(DatetimeEncoder, self).default(obj)
        except TypeError:
            return str(obj)

Then you can use it like this:

json.dumps(dictionary, cls=DatetimeEncoder)

Answer 7

I have an application with a similar issue; my approach was to JSONize the datetime value as a 6-item list (year, month, day, hour, minutes, seconds); you could go to microseconds as a 7-item list, but I had no need to:

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            encoded_object = list(obj.timetuple())[0:6]
        else:
            encoded_object =json.JSONEncoder.default(self, obj)
        return encoded_object

sample = {}
sample['title'] = "String"
sample['somedate'] = datetime.datetime.now()

print sample
print json.dumps(sample, cls=DateTimeEncoder)

produces:

{'somedate': datetime.datetime(2013, 8, 1, 16, 22, 45, 890000), 'title': 'String'}
{"somedate": [2013, 8, 1, 16, 22, 45], "title": "String"}
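Going the other way is straightforward, because the six items line up with the positional arguments of the datetime constructor. A minimal decoding sketch, assuming the consumer knows that the somedate key holds such a list:

```python
import datetime
import json

text = '{"somedate": [2013, 8, 1, 16, 22, 45], "title": "String"}'
data = json.loads(text)
# Unpack [year, month, day, hour, minute, second] into the constructor.
data["somedate"] = datetime.datetime(*data["somedate"])
print(data["somedate"])
```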

Answer 8

My solution (with less verbosity, I think):

def default(o):
    if type(o) is datetime.date or type(o) is datetime.datetime:
        return o.isoformat()

def jsondumps(o):
    return json.dumps(o, default=default)

Then use jsondumps instead of json.dumps. It will print:

>>> jsondumps({'today': datetime.date.today()})
'{"today": "2013-07-30"}'

If you want, you can later add other special cases to this with a simple twist of the default method. Example:

def default(o):
    if type(o) is datetime.date or type(o) is datetime.datetime:
        return o.isoformat()
    if type(o) is decimal.Decimal:
        return float(o)

Answer 9

This Q repeats time and time again – a simple way to patch the json module such that serialization would support datetime.

import json
import datetime

json.JSONEncoder.default = lambda self,obj: (obj.isoformat() if isinstance(obj, datetime.datetime) else None)

Then use JSON serialization as you always do, this time with datetime being serialized in ISO format.

json.dumps({'created':datetime.datetime.now()})

Resulting in: '{"created": "2015-08-26T14:21:31.853855"}'

See more details and some words of caution at: StackOverflow: JSON datetime between Python and JavaScript


Answer 10

The json.dumps method can accept an optional parameter called default, which is expected to be a function. Every time JSON tries to convert a value it does not know how to convert, it will call the function we passed to it. The function will receive the object in question, and it is expected to return the JSON representation of the object.

def myconverter(o):
 if isinstance(o, datetime.datetime):
    return o.__str__()

print(json.dumps(d, default = myconverter)) 

Answer 11

If you are using Python 3.7 or later, then the best solution is to use datetime.isoformat() and datetime.fromisoformat(); they work with both naive and aware datetime objects:

#!/usr/bin/env python3.7

from datetime import datetime
from datetime import timezone
from datetime import timedelta
import json

def default(obj):
    if isinstance(obj, datetime):
        return { '_isoformat': obj.isoformat() }
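    # super() is not available in a module-level function, so raise directly.
    raise TypeError('Object of type %s is not JSON serializable' % type(obj).__name__)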

def object_hook(obj):
    _isoformat = obj.get('_isoformat')
    if _isoformat is not None:
        return datetime.fromisoformat(_isoformat)
    return obj

if __name__ == '__main__':
    #d = { 'now': datetime(2000, 1, 1) }
    d = { 'now': datetime(2000, 1, 1, tzinfo=timezone(timedelta(hours=-8))) }
    s = json.dumps(d, default=default)
    print(s)
    print(d == json.loads(s, object_hook=object_hook))

output:

{"now": {"_isoformat": "2000-01-01T00:00:00-08:00"}}
True

If you are using Python 3.6 or below, and you only care about the time value (not the timezone), then you can use datetime.timestamp() and datetime.fromtimestamp() instead.

If you are using Python 3.6 or below, and you do care about the timezone, then you can get it via datetime.tzinfo, but you have to serialize this field yourself; the easiest way to do this is to add another field, _tzinfo, to the serialized object.

Finally, beware of precision in all these examples.


Answer 12

You should call the .strftime() method on the result of datetime.now() to turn it into a serializable string.

Here’s an example:

from datetime import datetime

time_dict = {'time': datetime.now().strftime('%Y-%m-%dT%H:%M:%S')}
sample_dict = {'a': 1, 'b': 2}
sample_dict.update(time_dict)
sample_dict

Output:

Out[0]: {'a': 1, 'b': 2, 'time': '2017-10-31T15:16:30'}

Answer 13

Here is a simple solution to overcome the "datetime not JSON serializable" problem.

enco = lambda obj: (
    obj.isoformat()
    if isinstance(obj, datetime.datetime)
    or isinstance(obj, datetime.date)
    else None
)

json.dumps({'date': datetime.datetime.now()}, default=enco)

Output: {"date": "2015-12-16T04:48:20.024609"}


Answer 14

You have to supply a custom encoder class with the cls parameter of json.dumps. To quote from the docs:

>>> import json
>>> class ComplexEncoder(json.JSONEncoder):
...     def default(self, obj):
...         if isinstance(obj, complex):
...             return [obj.real, obj.imag]
...         return json.JSONEncoder.default(self, obj)
...
>>> json.dumps(2 + 1j, cls=ComplexEncoder)
'[2.0, 1.0]'
>>> ComplexEncoder().encode(2 + 1j)
'[2.0, 1.0]'
>>> list(ComplexEncoder().iterencode(2 + 1j))
['[', '2.0', ', ', '1.0', ']']

This uses complex numbers as the example, but you can just as easily create a class to encode dates (although JSON itself has no native date type, so you must pick a string or numeric representation).
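Following the same pattern as the ComplexEncoder from the docs, a date-aware encoder might look like the sketch below. DateEncoder is a hypothetical name, ISO 8601 strings are an assumed choice of representation, and date.fromisoformat requires Python 3.7 or later.

```python
import datetime
import json


class DateEncoder(json.JSONEncoder):
    def default(self, obj):
        # datetime.datetime is a subclass of datetime.date, so one check covers both.
        if isinstance(obj, datetime.date):
            return obj.isoformat()
        # Let the base class raise TypeError for anything else.
        return super().default(obj)


text = json.dumps({"day": datetime.date(2013, 7, 30)}, cls=DateEncoder)
restored = datetime.date.fromisoformat(json.loads(text)["day"])
print(text, restored)
```

Decoding stays the caller's job here: the JSON only carries a string, so the reader has to know which keys hold dates.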


Answer 15

The simplest way to do this is to change the part of the dict that is in datetime format to isoformat. That value will effectively be a string in isoformat which json is ok with.

v_dict = version.dict()
v_dict['created_at'] = v_dict['created_at'].isoformat()

Answer 16

Actually, it is quite simple. If you often need to serialize dates, then work with them as strings. You can easily convert them back to datetime objects if needed.

If you mostly need to work with them as datetime objects, then convert them to strings before serializing.

import json, datetime

date = str(datetime.datetime.now())
print(json.dumps(date))
"2018-12-01 15:44:34.409085"
print(type(date))
<class 'str'>

datetime_obj = datetime.datetime.strptime(date, '%Y-%m-%d %H:%M:%S.%f')
print(datetime_obj)
2018-12-01 15:44:34.409085
print(type(datetime_obj))
<class 'datetime.datetime'>

As you can see, the output is the same in both cases. Only the type is different.


Answer 17

If you are using the result in a view be sure to return a proper response. According to the API, jsonify does the following:

Creates a Response with the JSON representation of the given arguments with an application/json mimetype.

To mimic this behavior with json.dumps you have to add a few extra lines of code.

response = make_response(dumps(sample, cls=CustomEncoder))
response.headers['Content-Type'] = 'application/json'
response.headers['mimetype'] = 'application/json'
return response

You should also return a dict to fully replicate jsonify’s response. So, the entire file will look like this:

from datetime import datetime
from json import JSONEncoder, dumps

from flask import Flask, make_response

app = Flask(__name__)


class CustomEncoder(JSONEncoder):
    def default(self, obj):
        # Duck-type Decimal ('quantize') and date/datetime ('year') objects.
        if set(['quantize', 'year']).intersection(dir(obj)):
            return str(obj)
        # Iterators expose '__next__' on Python 3 (it was 'next' on Python 2).
        elif hasattr(obj, '__next__'):
            return list(obj)
        return JSONEncoder.default(self, obj)

@app.route('/get_reps/', methods=['GET'])
def get_reps():
    sample = ['some text', datetime.now(), 123]
    response = make_response(dumps({'result': sample}, cls=CustomEncoder))
    response.headers['Content-Type'] = 'application/json'
    # Note: 'mimetype' is not a standard HTTP header; clients read Content-Type.
    response.headers['mimetype'] = 'application/json'
    return response

Answer 18

Try this one with an example to parse it:

#!/usr/bin/env python

import datetime
import json

import dateutil.parser  # pip install python-dateutil


class JSONEncoder(json.JSONEncoder):

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        return super(JSONEncoder, self).default(obj)


def test():
    dts = [
        datetime.datetime.now(),
        datetime.datetime.now(datetime.timezone(-datetime.timedelta(hours=4))),
        datetime.datetime.utcnow(),
        datetime.datetime.now(datetime.timezone.utc),
    ]
    for dt in dts:
        dt_isoformat = json.loads(json.dumps(dt, cls=JSONEncoder))
        dt_parsed = dateutil.parser.parse(dt_isoformat)
        assert dt == dt_parsed
        print(f'{dt}, {dt_isoformat}, {dt_parsed}')
        # 2018-07-22 02:22:42.910637, 2018-07-22T02:22:42.910637, 2018-07-22 02:22:42.910637
        # 2018-07-22 02:22:42.910643-04:00, 2018-07-22T02:22:42.910643-04:00, 2018-07-22 02:22:42.910643-04:00
        # 2018-07-22 06:22:42.910645, 2018-07-22T06:22:42.910645, 2018-07-22 06:22:42.910645
        # 2018-07-22 06:22:42.910646+00:00, 2018-07-22T06:22:42.910646+00:00, 2018-07-22 06:22:42.910646+00:00


if __name__ == '__main__':
    test()

Answer 19

My solution …

from datetime import datetime
import json

from pytz import timezone
import pytz


def json_dt_serializer(obj):
    """JSON serializer, by macm.
    """
    rsp = dict()
    if isinstance(obj, datetime):
        rsp['day'] = obj.day
        rsp['hour'] = obj.hour
        rsp['microsecond'] = obj.microsecond
        rsp['minute'] = obj.minute
        rsp['month'] = obj.month
        rsp['second'] = obj.second
        rsp['year'] = obj.year
        rsp['tzinfo'] = str(obj.tzinfo)
        return rsp
    raise TypeError("Type not serializable")


def json_dt_deserialize(obj):
    """JSON deserialize from json_dt_serializer, by macm.
    """
    if isinstance(obj, str):
        obj = json.loads(obj)
    tzone = timezone(obj['tzinfo'])
    tmp_dt = datetime(obj['year'],
                      obj['month'],
                      obj['day'],
                      hour=obj['hour'],
                      minute=obj['minute'],
                      second=obj['second'],
                      microsecond=obj['microsecond'])
    loc_dt = tzone.localize(tmp_dt)
    deserialize = loc_dt.astimezone(tzone)
    return deserialize    

Ok, now some tests.

# Tests
now = datetime.now(pytz.utc)

# Using this solution
rsp = json_dt_serializer(now)
tmp = json_dt_deserialize(rsp)
assert tmp == now
assert isinstance(tmp, datetime) == True
assert isinstance(now, datetime) == True

# using default from json.dumps
tmp = json.dumps(datetime.now(pytz.utc), default=json_dt_serializer)
rsp = json_dt_deserialize(tmp)
assert isinstance(rsp, datetime) == True

# Lets try another timezone
eastern = timezone('US/Eastern')
now = datetime.now(eastern)
rsp = json_dt_serializer(now)
tmp = json_dt_deserialize(rsp)

print(tmp)
# 2015-10-22 09:18:33.169302-04:00

print(now)
# 2015-10-22 09:18:33.169302-04:00

# Wow, Works!
assert tmp == now

Answer 20

Here is my full solution for converting datetime to JSON and back.

import datetime, json

def outputJSON(obj):
    """Default JSON serializer."""

    if isinstance(obj, datetime.datetime):
        if obj.utcoffset() is not None:
            obj = obj - obj.utcoffset()

        return obj.strftime('%Y-%m-%d %H:%M:%S.%f')
    return str(obj)

def inputJSON(obj):
    newDic = {}

    for key in obj:
        try:
            if float(key) == int(float(key)):
                newKey = int(key)
            else:
                newKey = float(key)

            newDic[newKey] = obj[key]
            continue
        except ValueError:
            pass

        try:
            newDic[str(key)] = datetime.datetime.strptime(obj[key], '%Y-%m-%d %H:%M:%S.%f')
            continue
        except (TypeError, ValueError):  # non-string value, or a string that is not a date
            pass

        newDic[str(key)] = obj[key]

    return newDic

x = {'Date': datetime.datetime.utcnow(), 34: 89.9, 12.3: 90, 45: 67, 'Extra': 6}

print(x)

with open('my_dict.json', 'w') as fp:
    json.dump(x, fp, default=outputJSON)

with open('my_dict.json') as f:
    my_dict = json.load(f, object_hook=inputJSON)

print(my_dict)

Output

{'Date': datetime.datetime(2013, 11, 8, 2, 30, 56, 479727), 34: 89.9, 45: 67, 12.3: 90, 'Extra': 6}
{'Date': datetime.datetime(2013, 11, 8, 2, 30, 56, 479727), 34: 89.9, 45: 67, 12.3: 90, 'Extra': 6}

JSON File

{"Date": "2013-11-08 02:30:56.479727", "34": 89.9, "45": 67, "12.3": 90, "Extra": 6}

This has enabled me to import and export strings, ints, floats and datetime objects. It shouldn’t be too hard to extend for other types.


Answer 21

Convert the date to a string:

date = str(datetime.datetime(somedatetimehere)) 

Answer 22

Generally there are several ways to serialize datetimes, like:

  1. ISO string, short and can include timezone info, e.g. @jgbarah’s answer
  2. Timestamp (timezone data is lost), e.g. @JayTaylor’s answer
  3. Dictionary of properties (including timezone).

If you’re okay with the last way, the json_tricks package handles dates, times and datetimes including timezones.

from datetime import datetime
from json_tricks import dumps
foo = {'title': 'String', 'datetime': datetime(2012, 8, 8, 21, 46, 24, 862000)}
dumps(foo)

which gives:

{"title": "String", "datetime": {"__datetime__": null, "year": 2012, "month": 8, "day": 8, "hour": 21, "minute": 46, "second": 24, "microsecond": 862000}}

So all you need to do is

`pip install json_tricks`

and then import from json_tricks instead of json.

The advantage of not storing it as a single string, int or float comes when decoding: if you encounter just a string or especially int or float, you need to know something about the data to know if it’s a datetime. As a dict, you can store metadata so it can be decoded automatically, which is what json_tricks does for you. It’s also easily editable for humans.

Disclaimer: it’s made by me. Because I had the same problem.
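
The dict-with-metadata idea described above can be sketched with only the standard library. This is an illustration of the concept, not how json_tricks is implemented: the helper names encode_datetime and decode_datetime and the "iso" key are assumptions made for this example, and datetime.fromisoformat requires Python 3.7 or later.

```python
import json
from datetime import datetime


def encode_datetime(obj):
    """default= hook: tag datetimes with a marker key (layout chosen for this sketch)."""
    if isinstance(obj, datetime):
        return {"__datetime__": True, "iso": obj.isoformat()}
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


def decode_datetime(dct):
    """object_hook: rebuild a datetime wherever the marker key is present."""
    if dct.get("__datetime__"):
        return datetime.fromisoformat(dct["iso"])
    return dct


foo = {"title": "String", "datetime": datetime(2012, 8, 8, 21, 46, 24, 862000)}
text = json.dumps(foo, default=encode_datetime)
restored = json.loads(text, object_hook=decode_datetime)
print(restored == foo)  # True
```

The marker key is what lets the decoder recognize a datetime without any prior knowledge of the data's schema.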


Answer 23

I got the same error message while writing a serialize property inside a SQLAlchemy class. So instead of:

class Puppy(Base):
    ...
    @property
    def serialize(self):
        return { 'id':self.id,
                 'date_birth':self.date_birth,
                  ...
                }

I simply borrowed jgbarah’s idea of using isoformat() and called it on the original value, so that it now looks like:

                  ...
                 'date_birth':self.date_birth.isoformat(),
                  ...

Answer 24

A quick fix if you want your own formatting:

for key,val in sample.items():
    if isinstance(val, datetime):
        sample[key] = '{:%Y-%m-%d %H:%M:%S}'.format(val)  # you can use a different format here
json.dumps(sample)
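
The loop above only converts top-level values and modifies sample in place. If that is a concern, one alternative is to apply the same format through the default parameter of json.dumps, which is called for any value the encoder cannot handle, including nested ones. This is a sketch under that assumption; format_datetime and the sample data are made up for illustration.

```python
import json
from datetime import datetime


def format_datetime(obj):
    """default= hook applying a custom format to every datetime encountered."""
    if isinstance(obj, datetime):
        return obj.strftime('%Y-%m-%d %H:%M:%S')
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


sample = {"id": 1, "events": [{"at": datetime(2018, 2, 28, 2, 23, 15)}]}
text = json.dumps(sample, default=format_datetime)
print(text)  # {"id": 1, "events": [{"at": "2018-02-28 02:23:15"}]}
```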

Answer 25

If you are on both sides of the communication you can use repr() and eval() functions along with json.

import datetime, json

dt = datetime.datetime.now()
print("This is now: {}".format(dt))

dt1 = json.dumps(repr(dt))
print("This is serialised: {}".format(dt1))

dt2 = json.loads(dt1)
print("This is loaded back from json: {}".format(dt2))

dt3 = eval(dt2)
print("This is the same object as we started: {}".format(dt3))

print("Check if they are equal: {}".format(dt == dt3))

You shouldn’t import datetime as

from datetime import datetime

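since repr() produces a string of the form datetime.datetime(...), which eval cannot resolve when only the class was imported. Alternatively, you can pass the datetime module to eval through its globals argument. Either way, this should work.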
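
Passing the module to eval could look like the sketch below. It uses a fixed date so the result is reproducible, and it assumes, as the answer does, that you control both ends of the communication, since eval executes whatever string it receives.

```python
import datetime
import json

dt = datetime.datetime(2018, 2, 28, 2, 23, 15)

# repr(dt) is the string 'datetime.datetime(2018, 2, 28, 2, 23, 15)'.
serialised = json.dumps(repr(dt))

# Supply the datetime module explicitly so the name resolves no matter
# how datetime was imported in the calling module.
restored = eval(json.loads(serialised), {"datetime": datetime})

print(restored == dt)  # True
```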


Answer 26

I encountered the same problem when externalizing a Django model object to dump as JSON. Here is how you can solve it:

def externalize(model_obj):
  keys = model_obj._meta.get_all_field_names() 
  data = {}
  for key in keys:
    if key == 'date_time':
      date_time_obj = getattr(model_obj, key)
      data[key] = date_time_obj.strftime("%A %d. %B %Y")
    else:
      data[key] = getattr(model_obj, key)
  return data

Answer 27

def j_serial(o):     # self contained
    from datetime import datetime, date
    return str(o).split('.')[0] if isinstance(o, (datetime, date)) else None

Usage of above utility:

import datetime
serial_d = j_serial(datetime.datetime.now())
if serial_d:
    print(serial_d)  # output: 2018-02-28 02:23:15

Answer 28

The superjson library can do this. You can easily customize the JSON serializer for your own Python objects by following the instructions at https://superjson.readthedocs.io/index.html#extend.

The general concept is:

Your code needs to locate the right serialization / deserialization method based on the Python object. Usually, the full class name is a good identifier.

Your ser / deser methods should then be able to transform your object into a regular JSON-serializable object, that is, a combination of generic Python types: dict, list, string, int, float. Implement your deser method as the reverse.
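
The general concept can be sketched with the standard library alone. This does not use or reproduce superjson's API: the REGISTRY dict, the "$class" key, and the ser / deser function names are all assumptions made for illustration.

```python
import json
from datetime import datetime
from decimal import Decimal

# Hypothetical registry for this sketch: full class name -> (serialize, deserialize).
REGISTRY = {
    "datetime.datetime": (lambda d: d.isoformat(), datetime.fromisoformat),
    "decimal.Decimal": (str, Decimal),
}


def full_class_name(obj):
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def ser(obj):
    """default= hook: wrap the value together with its full class name."""
    name = full_class_name(obj)
    if name not in REGISTRY:
        raise TypeError(f"{name} is not JSON serializable")
    serialize, _ = REGISTRY[name]
    return {"$class": name, "value": serialize(obj)}


def deser(dct):
    """object_hook: reverse ser() for dicts that carry a known class name."""
    name = dct.get("$class")
    if name in REGISTRY:
        _, deserialize = REGISTRY[name]
        return deserialize(dct["value"])
    return dct


data = {"when": datetime(2018, 2, 28, 2, 23, 15), "price": Decimal("9.99")}
round_tripped = json.loads(json.dumps(data, default=ser), object_hook=deser)
print(round_tripped == data)  # True
```

Supporting another type then only requires adding one entry to the registry.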


Answer 29

I may not be 100% correct, but this is a simple way to serialize:

#!/usr/bin/python
import datetime,json

sampledict = {}
sampledict['a'] = "some string"
sampledict['b'] = datetime.datetime.now()

print sampledict   # output : {'a': 'some string', 'b': datetime.datetime(2017, 4, 15, 5, 15, 34, 652996)}

#print json.dumps(sampledict)

'''
output : 

Traceback (most recent call last):
  File "./jsonencodedecode.py", line 10, in <module>
    print json.dumps(sampledict)
  File "/usr/lib/python2.7/json/__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: datetime.datetime(2017, 4, 15, 5, 16, 17, 435706) is not JSON serializable


'''

sampledict['b'] = datetime.datetime.now().strftime("%B %d, %Y %H:%M %p")

afterdump = json.dumps(sampledict)

print afterdump  #output : {"a": "some string", "b": "April 15, 2017 05:18 AM"}

print type(afterdump) #<type 'str'>


afterloads = json.loads(afterdump) 

print afterloads # output : {u'a': u'some string', u'b': u'April 15, 2017 05:18 AM'}


print type(afterloads) # output :<type 'dict'> 

How to use glob() to find files recursively?

Question: How to use glob() to find files recursively?

This is what I have:

glob(os.path.join('src','*.c'))

but I want to search the subfolders of src. Something like this would work:

glob(os.path.join('src','*.c'))
glob(os.path.join('src','*','*.c'))
glob(os.path.join('src','*','*','*.c'))
glob(os.path.join('src','*','*','*','*.c'))

But this is obviously limited and clunky.


Answer 0

Python 3.5+

Since you’re on a new Python, you should use pathlib.Path.rglob from the pathlib module.

from pathlib import Path

for path in Path('src').rglob('*.c'):
    print(path.name)

If you don’t want to use pathlib, just use glob.glob with a ** pattern such as 'src/**/*.c', but don’t forget to pass in the recursive keyword parameter.

To match files beginning with a dot (.), such as files in the current directory or hidden files on Unix-based systems, use the os.walk solution below.

Older Python versions

For older Python versions, use os.walk to recursively walk a directory and fnmatch.filter to match against a simple expression:

import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))

Answer 1

Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:

import os, fnmatch


def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename


for filename in find_files('src', '*.c'):
    print('Found C source:', filename)

Also, using a generator allows you to process each file as it is found, instead of finding all the files and then processing them.


Answer 2

I’ve modified the glob module to support ** for recursive globbing, e.g:

>>> import glob2
>>> all_header_files = glob2.glob('src/**/*.c')

https://github.com/miracle2k/python-glob2/

Useful when you want to provide your users with the ability to use the ** syntax, and thus os.walk() alone is not good enough.


Answer 3

Starting with Python 3.4, one can use the glob() method of one of the Path classes in the new pathlib module, which supports ** wildcards. For example:

from pathlib import Path

for file_path in Path('src').glob('**/*.c'):
    print(file_path) # do whatever you need with these files

Update: Starting with Python 3.5, the same syntax is also supported by glob.glob().


Answer 4

import os
import fnmatch


def recursive_glob(treeroot, pattern):
    results = []
    for base, dirs, files in os.walk(treeroot):
        goodfiles = fnmatch.filter(files, pattern)
        results.extend(os.path.join(base, f) for f in goodfiles)
    return results

fnmatch gives you exactly the same patterns as glob, so this is really an excellent replacement for glob.glob with very close semantics. An iterative version (e.g. a generator), in other words a replacement for glob.iglob, is a trivial adaptation (just yield the intermediate results as you go, instead of extending a single results list to return at the end).
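
The generator adaptation mentioned above might look like this. It is a minimal sketch: the name recursive_iglob is chosen here for illustration and does not come from the original answer.

```python
import fnmatch
import os


def recursive_iglob(treeroot, pattern):
    """Yield matching paths one at a time instead of building a list."""
    for base, dirs, files in os.walk(treeroot):
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(base, name)


# Each path is available as soon as it is found.
for path in recursive_iglob('src', '*.c'):
    print(path)
```

Callers that need a list can still wrap the call in list().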


Answer 5

For Python >= 3.5 you can use ** together with recursive=True:

import glob
for x in glob.glob('path/**/*.c', recursive=True):
    print(x)


If recursive is True, the pattern ** will match any files and zero or more directories and subdirectories. If the pattern is followed by an os.sep, only directories and subdirectories match.
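
The two behaviours described in that passage can be observed with a small experiment. The sketch below builds a throwaway directory tree (the layout is made up for illustration) and compares a ** pattern followed by a file pattern with a ** pattern followed by os.sep.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: tmp/x.c, tmp/a/y.c and an empty directory tmp/a/b
    os.makedirs(os.path.join(tmp, "a", "b"))
    for rel in ("x.c", os.path.join("a", "y.c")):
        open(os.path.join(tmp, rel), "w").close()

    # '**' followed by a file pattern: matches at any depth, including tmp itself.
    c_files = glob.glob(os.path.join(tmp, "**", "*.c"), recursive=True)

    # '**' followed by os.sep (a trailing separator): only directories match.
    dirs_only = glob.glob(os.path.join(tmp, "**", ""), recursive=True)

    c_names = sorted(os.path.relpath(p, tmp) for p in c_files)
    all_dirs = all(os.path.isdir(p) for p in dirs_only)

print(c_names)
print(all_dirs)
```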


Answer 6

You’ll want to use os.walk to collect filenames that match your criteria. For example:

import os
cfiles = []
for root, dirs, files in os.walk('src'):
  for file in files:
    if file.endswith('.c'):
      cfiles.append(os.path.join(root, file))

Answer 7

Here’s a solution with nested list comprehensions, os.walk and simple suffix matching instead of glob:

import os
cfiles = [os.path.join(root, filename)
          for root, dirnames, filenames in os.walk('src')
          for filename in filenames if filename.endswith('.c')]

It can be compressed to a one-liner:

import os;cfiles=[os.path.join(r,f) for r,d,fs in os.walk('src') for f in fs if f.endswith('.c')]

or generalized as a function:

import os

def recursive_glob(rootdir='.', suffix=''):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames if filename.endswith(suffix)]

cfiles = recursive_glob('src', '.c')

If you do need full glob style patterns, you can follow Alex’s and Bruno’s example and use fnmatch:

import fnmatch
import os

def recursive_glob(rootdir='.', pattern='*'):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames
            if fnmatch.fnmatch(filename, pattern)]

cfiles = recursive_glob('src', '*.c')

Answer 8

Recently I had to recover my pictures with the extension .jpg. I ran photorec and recovered 4579 directories containing 2.2 million files with a tremendous variety of extensions. With the script below I was able to select the 50133 files with a .jpg extension within minutes:

#!/usr/bin/env python2.7

import glob
import shutil
import os

src_dir = "/home/mustafa/Masaüstü/yedek"
dst_dir = "/home/mustafa/Genel/media"
for mediafile in glob.iglob(os.path.join(src_dir, "*", "*.jpg")): #"*" is for subdirectory
    shutil.copy(mediafile, dst_dir)

Answer 9

Consider pathlib.Path.rglob().

This is like calling Path.glob() with "**/" added in front of the given relative pattern:

import pathlib


for p in pathlib.Path("src").rglob("*.c"):
    print(p)

See also @taleinat’s related post here and a similar post elsewhere.
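
The equivalence stated above can be checked directly. This sketch creates a temporary tree (file names are made up for illustration) and compares rglob("*.c") with glob("**/*.c").

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    for rel in ("x.c", "a/y.c", "a/b/z.c", "a/notes.txt"):
        (root / rel).touch()

    # Collect paths relative to the root so the two results are comparable.
    via_rglob = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.c"))
    via_glob = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.c"))

print(via_rglob == via_glob)  # True
```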


Answer 10

Johan and Bruno provide excellent solutions on the minimal requirement as stated. I have just released Formic which implements Ant FileSet and Globs which can handle this and more complicated scenarios. An implementation of your requirement is:

import formic
fileset = formic.FileSet(include="/src/**/*.c")
for file_name in fileset.qualified_files():
    print(file_name)

Answer 11

Based on other answers, this is my current working implementation, which retrieves nested XML files in a root directory:

import glob
import os

files = []
for root, dirnames, filenames in os.walk(myDir):  # myDir: the root directory to search
    files.extend(glob.glob(root + "/*.xml"))

I’m really having fun with python :)


Answer 12

Another way to do it using just the glob module. Just seed the rglob method with a starting base directory and a pattern to match and it will return a list of matching file names.

import glob
import os

def _getDirs(base):
    return [x for x in glob.iglob(os.path.join( base, '*')) if os.path.isdir(x) ]

def rglob(base, pattern):
    matches = []
    matches.extend(glob.glob(os.path.join(base, pattern)))
    # _getDirs() already returns paths that include base, so recurse on them directly
    for d in _getDirs(base):
        matches.extend(rglob(d, pattern))
    return matches

回答 13

或具有列表理解:

 >>> import os
 >>> root = r"c:\User\xtofl"
 >>> binfiles = [ os.path.join(base, f)
 ...              for base, _, files in os.walk(root)
 ...              for f in files if f.endswith(".jpg") ]

Or with a list comprehension:

 >>> import os
 >>> root = r"c:\User\xtofl"
 >>> binfiles = [ os.path.join(base, f)
 ...              for base, _, files in os.walk(root)
 ...              for f in files if f.endswith(".jpg") ]

回答 14

刚做这个..它将以分层方式打印文件和目录

但是我没有用过fnmatch或walk

#!/usr/bin/python

import os,glob,sys

def dirlist(path, c = 1):

        for i in glob.glob(os.path.join(path, "*")):
                if os.path.isfile(i):
                        filepath, filename = os.path.split(i)
                        print('----' * c + filename)

                elif os.path.isdir(i):
                        dirname = os.path.basename(i)
                        print('----' * c + dirname)
                        c+=1
                        dirlist(i,c)
                        c-=1


path = os.path.normpath(sys.argv[1])
print(os.path.basename(path))
dirlist(path)

I just made this. It prints files and directories in a hierarchical way.

But I didn’t use fnmatch or walk.

#!/usr/bin/python

import os,glob,sys

def dirlist(path, c = 1):

        for i in glob.glob(os.path.join(path, "*")):
                if os.path.isfile(i):
                        filepath, filename = os.path.split(i)
                        print('----' * c + filename)

                elif os.path.isdir(i):
                        dirname = os.path.basename(i)
                        print('----' * c + dirname)
                        c+=1
                        dirlist(i,c)
                        c-=1


path = os.path.normpath(sys.argv[1])
print(os.path.basename(path))
dirlist(path)

回答 15

那使用fnmatch或正则表达式:

import fnmatch, os

def filepaths(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            try:
                matched = pattern.match(basename)
            except AttributeError:
                matched = fnmatch.fnmatch(basename, pattern)
            if matched:
                yield os.path.join(root, basename)

# usage
if __name__ == '__main__':
    from pprint import pprint as pp
    import re
    path = r'/Users/hipertracker/app/myapp'
    pp([x for x in filepaths(path, re.compile(r'.*\.py$'))])
    pp([x for x in filepaths(path, '*.py')])

This one uses fnmatch or a regular expression:

import fnmatch, os

def filepaths(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            try:
                matched = pattern.match(basename)
            except AttributeError:
                matched = fnmatch.fnmatch(basename, pattern)
            if matched:
                yield os.path.join(root, basename)

# usage
if __name__ == '__main__':
    from pprint import pprint as pp
    import re
    path = r'/Users/hipertracker/app/myapp'
    pp([x for x in filepaths(path, re.compile(r'.*\.py$'))])
    pp([x for x in filepaths(path, '*.py')])

回答 16

除了建议的答案,您还可以通过一些懒惰的生成和列表理解魔术来做到这一点:

import os, glob, itertools

results = itertools.chain.from_iterable(glob.iglob(os.path.join(root,'*.c'))
                                               for root, dirs, files in os.walk('src'))

for f in results: print(f)

除了适合一行并且避免在内存中使用不必要的列表之外,这还具有很好的副作用,即您可以以类似于**运算符的方式使用它,例如,可以使用os.path.join(root, 'some/path/*.c')它来获取所有.c文件。具有此结构的src子目录。

In addition to the suggested answers, you can do this with some lazy generation and generator-expression magic:

import os, glob, itertools

results = itertools.chain.from_iterable(glob.iglob(os.path.join(root,'*.c'))
                                               for root, dirs, files in os.walk('src'))

for f in results: print(f)

Besides fitting in one line and avoiding unnecessary lists in memory, this has the nice side effect that you can use it in a way similar to the ** operator; e.g., you could use os.path.join(root, 'some/path/*.c') to get all .c files in all subdirectories of src that have this structure.


回答 17

对于python 3.5及更高版本

import glob

#file_names_array = glob.glob('path/*.c', recursive=True)
#above works for files directly at path/ as guided by NeStack

#updated version
file_names_array = glob.glob('path/**/*.c', recursive=True)

您可能还需要

for full_path_in_src in  file_names_array:
    print (full_path_in_src ) # be like 'abc/xyz.c'
    #Full system path of this would be like => 'path till src/abc/xyz.c'

For Python 3.5 and later:

import glob

# file_names_array = glob.glob('path/*.c', recursive=True)
# The line above only matches files directly in path/ (as pointed out by NeStack).

# Updated version: '**' together with recursive=True also descends into subdirectories.
file_names_array = glob.glob('path/**/*.c', recursive=True)

You might also need:

for full_path_in_src in file_names_array:
    print(full_path_in_src)  # e.g. 'path/abc/xyz.c' (the 'path/' prefix from the pattern is included)
    # Use os.path.abspath(full_path_in_src) if you need the full system path.
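To make the effect of recursive=True concrete, here is a small sketch run against a temporary directory. The file names and the helper find_c_files are hypothetical. It compares the same '**' pattern with and without the flag.

```python
import glob
import os
import tempfile


def find_c_files(base, recursive):
    """Return .c files matched by base/**/*.c, relative to base, sorted."""
    pattern = os.path.join(base, "**", "*.c")
    matches = glob.glob(pattern, recursive=recursive)
    return sorted(os.path.relpath(m, base).replace(os.sep, "/") for m in matches)


with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "abc", "deep"))
    for name in ("top.c", os.path.join("abc", "xyz.c"), os.path.join("abc", "deep", "w.c")):
        open(os.path.join(base, name), "w").close()
    found = find_c_files(base, recursive=True)
    # Without recursive=True, '**' behaves like a plain '*': exactly one directory level.
    non_recursive = find_c_files(base, recursive=False)

print(found)
print(non_recursive)
```

With the flag, '**' matches zero or more directory levels, so files directly in the base directory are included as well.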

回答 18

这是Python 2.7上的有效代码。作为我的devops工作的一部分,我需要编写一个脚本,该脚本会将标有live-appName.properties的配置文件移动到appName.properties。可能还有其他扩展文件,例如live-appName.xml。

以下是用于此目的的工作代码,该代码在给定目录(嵌套级别)中查找文件,然后将其重命名(移动)为所需的文件名

def flipProperties(searchDir):
   print "Flipping properties to point to live DB"
   for root, dirnames, filenames in os.walk(searchDir):
      for filename in fnmatch.filter(filenames, 'live-*.*'):
        targetFileName = os.path.join(root, filename.split("live-")[1])
        print "File "+ os.path.join(root, filename) + " will be moved to " + targetFileName
        shutil.move(os.path.join(root, filename), targetFileName)

从主脚本调用此函数

flipProperties(searchDir)

希望这可以帮助遇到类似问题的人。

This is working code for Python 2.7. As part of my devops work, I was required to write a script that moves config files named live-appName.properties to appName.properties. There could be files with other extensions as well, such as live-appName.xml.

Below is the working code for this, which finds the files under the given directory (at any nesting level) and then renames (moves) each one to the required filename:

import fnmatch
import os
import shutil

def flipProperties(searchDir):
   print "Flipping properties to point to live DB"
   for root, dirnames, filenames in os.walk(searchDir):
      for filename in fnmatch.filter(filenames, 'live-*.*'):
        # split only on the first "live-" so the rest of the name is kept intact
        targetFileName = os.path.join(root, filename.split("live-", 1)[1])
        print "File "+ os.path.join(root, filename) + " will be moved to " + targetFileName
        shutil.move(os.path.join(root, filename), targetFileName)

This function is called from a main script

flipProperties(searchDir)

Hope this helps someone struggling with similar issues.


回答 19

Johan Dahlin答案的简化版本,不带fnmatch

import os

matches = []
for root, dirnames, filenames in os.walk('src'):
  matches += [os.path.join(root, f) for f in filenames if f[-2:] == '.c']

Simplified version of Johan Dahlin’s answer, without fnmatch.

import os

matches = []
for root, dirnames, filenames in os.walk('src'):
  matches += [os.path.join(root, f) for f in filenames if f[-2:] == '.c']

回答 20

这是我的使用列表推导的解决方案在目录和所有子目录中递归搜索多个文件扩展名的解决方案:

import os, glob

def _globrec(path, *exts):
    """ Glob recursively a directory and all subdirectories for multiple file extensions
    Note: On Windows, glob is case-insensitive, i.e. a .jpg pattern also matches
    files ending with .JPG

    Parameters
    ----------
    path : str
        A directory name
    exts : tuple
        File extensions to glob for

    Returns
    -------
    files : list
        list of files matching extensions in exts in path and subfolders

    """
    dirs = [a[0] for a in os.walk(path)]
    f_filter = [d+e for d in dirs for e in exts]    
    return [f for files in [glob.iglob(files) for files in f_filter] for f in files]

my_pictures = _globrec(r'C:\Temp', '\*.jpg','\*.bmp','\*.png','\*.gif')
for f in my_pictures:
    print(f)

Here is my solution using list comprehension to search for multiple file extensions recursively in a directory and all subdirectories:

import os, glob

def _globrec(path, *exts):
    """ Glob recursively a directory and all subdirectories for multiple file extensions
    Note: On Windows, glob is case-insensitive, i.e. a .jpg pattern also matches
    files ending with .JPG

    Parameters
    ----------
    path : str
        A directory name
    exts : tuple
        File extensions to glob for

    Returns
    -------
    files : list
        list of files matching extensions in exts in path and subfolders

    """
    dirs = [a[0] for a in os.walk(path)]
    # os.path.join inserts the right separator, so the patterns need no leading backslash
    f_filter = [os.path.join(d, e) for d in dirs for e in exts]
    return [f for files in [glob.iglob(files) for files in f_filter] for f in files]

my_pictures = _globrec(r'C:\Temp', '*.jpg', '*.bmp', '*.png', '*.gif')
for f in my_pictures:
    print(f)

回答 21

import sys, os, glob

dir_list = ["c:\\books\\heap"]

while len(dir_list) > 0:
    cur_dir = dir_list[0]
    del dir_list[0]
    list_of_files = glob.glob(cur_dir+'\\*')
    for book in list_of_files:
        if os.path.isfile(book):
            print(book)
        else:
            dir_list.append(book)

回答 22

我修改了此发布中的最佳答案..并最近创建了此脚本,该脚本将遍历给定目录(searchdir)中的所有文件及其下的子目录…并打印文件名,rootdir,修改/创建日期和尺寸。

希望这对某人有帮助…他们可以遍历目录并获取fileinfo。

import time
import fnmatch
import os

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print("%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize))

searchdir = r'D:\Your\Directory\Root'
matches = []

for root, dirnames, filenames in os.walk(searchdir):
    ##  for filename in fnmatch.filter(filenames, '*.c'):
    for filename in filenames:
        ##      matches.append(os.path.join(root, filename))
        ##print matches
        fileinfo(os.path.join(root, filename))

I modified the top answer in this thread and recently created this script, which loops through all files in a given directory (searchdir) and the subdirectories under it, and prints the filename, rootdir, modified/creation dates, and size.

Hope this helps someone… and they can walk the directory and get fileinfo.

import time
import fnmatch
import os

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print("%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize))

searchdir = r'D:\Your\Directory\Root'
matches = []

for root, dirnames, filenames in os.walk(searchdir):
    ##  for filename in fnmatch.filter(filenames, '*.c'):
    for filename in filenames:
        ##      matches.append(os.path.join(root, filename))
        ##print matches
        fileinfo(os.path.join(root, filename))

回答 23

这是一个将模式与完整路径而不只是基本文件名匹配的解决方案。

它用于fnmatch.translate将glob样式的模式转换为正则表达式,然后将其与在遍历目录时发现的每个文件的完整路径进行匹配。

re.IGNORECASE是可选的,但在Windows上是理想的,因为文件系统本身不区分大小写。(我没有费心编译正则表达式,因为文档表明它应该在内部缓存。)

import fnmatch
import os
import re

def findfiles(dir, pattern):
    patternregex = fnmatch.translate(pattern)
    for root, dirs, files in os.walk(dir):
        for basename in files:
            filename = os.path.join(root, basename)
            if re.search(patternregex, filename, re.IGNORECASE):
                yield filename

Here is a solution that will match the pattern against the full path and not just the base filename.

It uses fnmatch.translate to convert a glob-style pattern into a regular expression, which is then matched against the full path of each file found while walking the directory.

re.IGNORECASE is optional, but desirable on Windows since the file system itself is not case-sensitive. (I didn’t bother compiling the regex because docs indicate it should be cached internally.)

import fnmatch
import os
import re

def findfiles(dir, pattern):
    patternregex = fnmatch.translate(pattern)
    for root, dirs, files in os.walk(dir):
        for basename in files:
            filename = os.path.join(root, basename)
            if re.search(patternregex, filename, re.IGNORECASE):
                yield filename
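The behaviour of fnmatch.translate is worth checking before relying on it for full paths. The sketch below uses made-up sample paths and shows three things: the '*' wildcard also matches path separators, re.IGNORECASE makes the match case-insensitive, and re.search (as used above) is not anchored at the start of the path while re.match is.

```python
import fnmatch
import re

pattern_regex = fnmatch.translate("src/*.c")

# fnmatch's '*' is not path-aware: it also matches '/' characters.
full_path_hit = re.match(pattern_regex, "src/lib/deep/util.c") is not None

# Case only matters when re.IGNORECASE is left out.
case_hit = re.match(pattern_regex, "SRC/MAIN.C", re.IGNORECASE) is not None
case_miss = re.match(pattern_regex, "SRC/MAIN.C") is None

# The translated regex is anchored at the end only, so search() accepts any prefix.
search_hit = re.search(pattern_regex, "/home/u/src/a.c") is not None
match_hit = re.match(pattern_regex, "/home/u/src/a.c") is not None

print(full_path_hit, case_hit, case_miss, search_hit, match_hit)
```

If a pattern should match from the beginning of the path, re.match is the safer choice; re.search treats the pattern as if it had a leading '*'.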

回答 24

我需要一个解决方案的Python 2.x中,工程上大的目录。
我结束了这一点:

import subprocess
foundfiles= subprocess.check_output("ls src/*.c src/**/*.c", shell=True)
for foundfile in foundfiles.splitlines():
    print foundfile

请注意,如果ls找不到任何匹配文件,您可能需要一些异常处理。

I needed a solution for Python 2.x that works fast on large directories.
I ended up with this:

import subprocess
foundfiles= subprocess.check_output("ls src/*.c src/**/*.c", shell=True)
for foundfile in foundfiles.splitlines():
    print foundfile

Note that you might need some exception handling in case ls doesn’t find any matching file.


How to convert a string to uppercase

Question: How to convert a string to uppercase

我在使用Python将字符串更改为大写时遇到问题。在我的研究中,我知道了,string.ascii_uppercase但是没有用。

如下代码:

 >>s = 'sdsd'
 >>s.ascii_uppercase

给出此错误信息:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
AttributeError: 'str' object has no attribute 'ascii_uppercase'

我的问题是:如何在Python中将字符串转换为大写?

I have a problem changing a string to uppercase in Python. In my research I found string.ascii_uppercase, but it doesn’t work.

The following code:

 >>> s = 'sdsd'
 >>> s.ascii_uppercase

Gives this error message:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
AttributeError: 'str' object has no attribute 'ascii_uppercase'

My question is: how can I convert a string into uppercase in Python?


回答 0

>>> s = 'sdsd'
>>> s.upper()
'SDSD'

请参阅字符串方法

>>> s = 'sdsd'
>>> s.upper()
'SDSD'

See String Methods.


回答 1

要获取字符串的大写版本,可以使用str.upper

s = 'sdsd'
s.upper()
#=> 'SDSD'

另一方面,string.ascii_uppercase是一个包含所有大写ASCII字母的字符串:

import string
string.ascii_uppercase
#=> 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

To get upper case version of a string you can use str.upper:

s = 'sdsd'
s.upper()
#=> 'SDSD'

On the other hand string.ascii_uppercase is a string containing all ASCII letters in upper case:

import string
string.ascii_uppercase
#=> 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
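One detail that is easy to miss: str.upper returns a new string and leaves the original untouched, and it is not limited to ASCII letters. A minimal sketch (the variable names are only for illustration):

```python
import string

s = "sdsd"
shouted = s.upper()  # returns a new string; s itself is unchanged

# string.ascii_uppercase is just a constant you can test membership against.
only_ascii_capitals = all(ch in string.ascii_uppercase for ch in shouted)

# upper() follows Unicode case rules, so the result can even change length.
german = "straße".upper()

print(s, shouted, only_ascii_capitals, german)
```

To keep the uppercase version you have to assign the result, for example s = s.upper().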

回答 2

使字符串大写-只需键入

s.upper()

简单容易!你也可以做同样的事情来降低它

s.lower()

等等

To make the string upper case, simply type

s.upper()

Simple and easy! You can do the same to make it lower case too:

s.lower()

etc.


回答 3

s = 'sdsd'
print(s.upper())
upper = input('type in something lowercase.')
lower = input('type in the same thing caps lock.')
print(upper.upper())
print(lower.lower())

回答 4

用于将大写字母从小写字母转换为大写字母

"string".upper()

"string"您要转换大写的字符串在哪里

对于这个问题,它会像这样:

s.upper()

用于从大写字符串制作小写字母,只需使用

"string".lower()

"string"您要转换小写的字符串在哪里

对于这个问题,它会像这样:

s.lower()

如果要使用整个字符串变量

s="sadf"
# sadf

s=s.upper()
# SADF

To convert a string from lowercase to uppercase, use

"string".upper()

where "string" is the string that you want to convert to uppercase.

For the string in this question, that looks like this:

s.upper()

To convert a string from uppercase to lowercase, use

"string".lower()

where "string" is the string that you want to convert to lowercase.

For the string in this question, that looks like this:

s.lower()

If you want to change the string variable itself, assign the result back to it:

s="sadf"
# sadf

s=s.upper()
# SADF

回答 5

对于有关简单字符串操作的问题,dir内置函数非常方便。它给您提供参数方法的列表,例如,dir(s)返回包含的列表upper

For questions on simple string manipulation the dir built-in function comes in handy. It gives you, among others, a list of methods of the argument, e.g., dir(s) returns a list containing upper.
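A quick sketch of that exploration, filtering the output of dir down to the public, case-related names (the filtering criteria are just an illustration):

```python
s = "sdsd"

# dir() also lists double-underscore names; drop them to see the everyday methods.
public_methods = [name for name in dir(s) if not name.startswith("_")]

# Narrow the list further to anything that mentions upper or lower case.
case_related = [name for name in public_methods if "upper" in name or "lower" in name]

print(case_related)
print(getattr(s, "upper")())
```

Calling help(str.upper) on any name found this way prints its documentation.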


Get the data received in a Flask request

Question: Get the data received in a Flask request

我希望能够将数据发送到我的Flask应用。我尝试访问,request.data但它是一个空字符串。您如何访问请求数据?

from flask import request

@app.route('/', methods=['GET', 'POST'])
def parse_request():
    data = request.data  # data is empty
    # need posted data here

这个问题的答案使我提出了在Python Flask中获取原始POST正文的问题,而不管接下来的Content-Type标头如何,这都是关于获取原始数据而不是已解析数据的问题。

I want to be able to get the data sent to my Flask app. I’ve tried accessing request.data but it is an empty string. How do you access request data?

from flask import request

@app.route('/', methods=['GET', 'POST'])
def parse_request():
    data = request.data  # data is empty
    # need posted data here

The answer to this question led me to ask Get raw POST body in Python Flask regardless of Content-Type header next, which is about getting the raw data rather than the parsed data.


回答 0

文档描述的要求提供的属性。在大多数情况下,request.data由于它用作后备广告,因此将为空:

request.data 如果传入的请求数据带有mimetype Flask无法处理,则将其包含为字符串。

  • request.args:URL查询字符串中的键/值对
  • request.form:正文中的键/值对,来自HTML帖子形式或非JSON编码的JavaScript请求
  • request.files:Flask与正文分开的正文文件form。必须使用HTML表单,enctype=multipart/form-data否则将不会上传文件。
  • request.values:组合argsformargs如果键重叠则首选
  • request.json:解析的JSON数据。该请求必须具有application/json内容类型,或用于request.get_json(force=True)忽略内容类型。

所有这些都是MultiDict实例(除外json)。您可以使用以下方法访问值:

  • request.form['name']:如果您知道密钥存在,请使用索引
  • request.form.get('name')get如果密钥可能不存在,则使用
  • request.form.getlist('name')getlist如果键被多次发送并且需要值列表,则使用该键。get仅返回第一个值。

The docs describe the attributes available on the request. In most common cases request.data will be empty because it’s used as a fallback:

request.data Contains the incoming request data as string in case it came with a mimetype Flask does not handle.

  • request.args: the key/value pairs in the URL query string
  • request.form: the key/value pairs in the body, from a HTML post form, or JavaScript request that isn’t JSON encoded
  • request.files: the files in the body, which Flask keeps separate from form. HTML forms must use enctype=multipart/form-data or files will not be uploaded.
  • request.values: combined args and form, preferring args if keys overlap
  • request.json: parsed JSON data. The request must have the application/json content type, or use request.get_json(force=True) to ignore the content type.

All of these are MultiDict instances (except for json). You can access values using:

  • request.form['name']: use indexing if you know the key exists
  • request.form.get('name'): use get if the key might not exist
  • request.form.getlist('name'): use getlist if the key is sent multiple times and you want a list of values. get only returns the first value.
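The difference between get and getlist comes from the fact that a key can legitimately appear more than once in a query string or form body. The sketch below illustrates the idea with the standard library only; it does not use Flask, and the query string is made up. parse_qs always returns lists, which is roughly what getlist exposes, while get corresponds to taking the first element.

```python
from urllib.parse import parse_qs

query = "name=alice&tag=a&tag=b"  # hypothetical query string with a repeated key
parsed = parse_qs(query)

first_tag = parsed["tag"][0]   # roughly what request.args.get('tag') would return
all_tags = parsed["tag"]       # roughly what request.args.getlist('tag') would return
missing = parsed.get("nope")   # None, like .get() on a key that was not sent

print(parsed, first_tag, all_tags, missing)
```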

回答 1

要获取原始数据,请使用request.data。这仅在无法将其解析为表单数据时才有效,否则它将为空request.form并将具有解析后的数据。

from flask import request
request.data

To get the raw data, use request.data. This only works if it couldn’t be parsed as form data, otherwise it will be empty and request.form will have the parsed data.

from flask import request
request.data

回答 2

对于URL查询参数,请使用request.args

search = request.args.get("search")
page = request.args.get("page")

对于张贴的表单输入,请使用request.form

email = request.form.get('email')
password = request.form.get('password')

对于以内容类型发布的JSON application/json,请使用request.get_json()

data = request.get_json()

For URL query parameters, use request.args.

search = request.args.get("search")
page = request.args.get("page")

For posted form input, use request.form.

email = request.form.get('email')
password = request.form.get('password')

For JSON posted with content type application/json, use request.get_json().

data = request.get_json()

回答 3

这是解析发布的JSON数据并将其回显的示例。

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/foo', methods=['POST']) 
def foo():
    data = request.json
    return jsonify(data)

要使用curl发布JSON:

curl -i -H "Content-Type: application/json" -X POST -d '{"userId":"1", "username": "fizz bizz"}' http://localhost:5000/foo

或使用邮递员:

使用邮递员发布JSON

Here’s an example of parsing posted JSON data and echoing it back.

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/foo', methods=['POST']) 
def foo():
    data = request.json
    return jsonify(data)

To post JSON with curl:

curl -i -H "Content-Type: application/json" -X POST -d '{"userId":"1", "username": "fizz bizz"}' http://localhost:5000/foo

Or to use Postman:

using postman to post JSON


回答 4

如果您发布内容类型为的JSON,请在Flask中application/json使用request.get_json()它。如果内容类型不正确,None则返回。如果数据不是JSON,则会引发错误。

@app.route("/something", methods=["POST"])
def do_something():
    data = request.get_json()

If you post JSON with content type application/json, use request.get_json() to get it in Flask. If the content type is not correct, None is returned (newer Flask releases instead respond with an HTTP error unless silent=True is passed). If the data is not valid JSON, an error is raised.

@app.route("/something", methods=["POST"])
def do_something():
    data = request.get_json()

回答 5

要获取原始帖子正文,而不管内容类型如何,请使用request.get_data()。如果使用request.data,它将调用request.get_data(parse_form_data=True),它将填充request.form MultiDictdata留空。

To get the raw post body regardless of the content type, use request.get_data(). If you use request.data, it calls request.get_data(parse_form_data=True), which will populate the request.form MultiDict and leave data empty.


回答 6

request.form使用普通字典,请使用request.form.to_dict(flat=False)

要返回API的JSON数据,请将其传递给jsonify

本示例将表单数据作为JSON数据返回。

@app.route('/form_to_json', methods=['POST'])
def form_to_json():
    data = request.form.to_dict(flat=False)
    return jsonify(data)

这是带有curl的POST表单数据的示例,返回为JSON:

$ curl http://127.0.0.1:5000/data -d "name=ivanleoncz&role=Software Developer"
{
  "name": "ivanleoncz", 
  "role": "Software Developer"
}

To get request.form as a normal dictionary, use request.form.to_dict(flat=False).

To return JSON data for an API, pass it to jsonify.

This example returns form data as JSON data.

@app.route('/form_to_json', methods=['POST'])
def form_to_json():
    data = request.form.to_dict(flat=False)
    return jsonify(data)

Here’s an example of POSTing form data with curl and getting it back as JSON. Because flat=False is used, each value comes back as a list:

$ curl http://127.0.0.1:5000/form_to_json -d "name=ivanleoncz&role=Software Developer"
{
  "name": [
    "ivanleoncz"
  ], 
  "role": [
    "Software Developer"
  ]
}

回答 7

使用request.get_json()得到张贴JSON数据。

data = request.get_json()
name = data.get('name', '')

使用request.formPOST方法提交表单时用于获取数据。

name = request.form.get('name', '')

使用request.args得到的网址,用GET方法提交表单的时候喜欢的查询字符串传递的数据。

request.args.get("name", "")

request.form等类似dict,get如果未通过默认值,请使用方法获取值。

Use request.get_json() to get posted JSON data.

data = request.get_json()
name = data.get('name', '')

Use request.form to get data when submitting a form with the POST method.

name = request.form.get('name', '')

Use request.args to get data passed in the query string of the URL, like when submitting a form with the GET method.

request.args.get("name", "")

request.form etc. are dict-like; use the get method to read a value, with a default in case it wasn’t passed.


回答 8

要发布不包含application/json内容类型的JSON ,请使用request.get_json(force=True)

@app.route('/process_data', methods=['POST'])
def process_data():
    req_data = request.get_json(force=True)
    language = req_data['language']
    return 'The language value is: {}'.format(language)

To get JSON posted without the application/json content type, use request.get_json(force=True).

@app.route('/process_data', methods=['POST'])
def process_data():
    req_data = request.get_json(force=True)
    language = req_data['language']
    return 'The language value is: {}'.format(language)

回答 9

原始数据从WSGI服务器传递到Flask应用程序request.stream。流的长度在Content-Length标题中。

length = int(request.headers["Content-Length"])  # header values are strings
data = request.stream.read(length)

通常,使用它更安全request.get_data()

The raw data is passed in to the Flask application from the WSGI server as request.stream. The length of the stream is in the Content-Length header.

length = int(request.headers["Content-Length"])  # header values are strings
data = request.stream.read(length)

It is usually safer to use request.get_data() instead.
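To show what happens underneath, here is a framework-free sketch of reading a body from a WSGI environ. It assumes only the standard WSGI keys CONTENT_LENGTH and wsgi.input; the helper read_raw_body and the fake environ are hypothetical stand-ins for what a real server would pass in.

```python
import io


def read_raw_body(environ):
    """Read exactly Content-Length bytes from a WSGI input stream."""
    # CONTENT_LENGTH arrives as a string (and may be empty), so convert it first.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


body = b'{"userId": "1"}'
fake_environ = {
    "CONTENT_LENGTH": str(len(body)),
    # Extra bytes after the declared length must not be consumed.
    "wsgi.input": io.BytesIO(body + b"extra"),
}
raw = read_raw_body(fake_environ)
empty = read_raw_body({"CONTENT_LENGTH": "", "wsgi.input": io.BytesIO(b"ignored")})

print(raw, empty)
```

Handling a missing or empty length like this is one of the details request.get_data() takes care of for you.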


回答 10

要在JavaScript中使用jQuery发布JSON,请使用JSON.stringify转储数据并将内容类型设置为application/json

var value_data = [1, 2, 3, 4];

$.ajax({
    type: 'POST',
    url: '/process',
    data: JSON.stringify(value_data),
    contentType: 'application/json',
    success: function (response_data) {
        alert("success");
    }   
});

使用解析在Flask中request.get_json()

data = request.get_json()

To post JSON with jQuery in JavaScript, use JSON.stringify to dump the data, and set the content type to application/json.

var value_data = [1, 2, 3, 4];

$.ajax({
    type: 'POST',
    url: '/process',
    data: JSON.stringify(value_data),
    contentType: 'application/json',
    success: function (response_data) {
        alert("success");
    }   
});

Parse it in Flask with request.get_json().

data = request.get_json()

回答 11

要解析JSON,请使用request.get_json()

@app.route("/something", methods=["POST"])
def do_something():
    result = handle(request.get_json())
    return jsonify(data=result)

To parse JSON, use request.get_json().

@app.route("/something", methods=["POST"])
def do_something():
    result = handle(request.get_json())
    return jsonify(data=result)

回答 12

这是一个发布表单数据以将用户添加到数据库的示例。检查request.method == "POST"表单是否已提交。使用键request.form来获取表单数据。使用<form>其他方式呈现HTML模板。表单中的字段应具有name与中的键匹配的属性request.form

from flask import Flask, request, render_template, redirect, url_for

app = Flask(__name__)

@app.route("/user/add", methods=["GET", "POST"])
def add_user():
    if request.method == "POST":
        user = User(
            username=request.form["username"],
            email=request.form["email"],
        )
        db.session.add(user)
        db.session.commit()
        return redirect(url_for("index"))

    return render_template("add_user.html")

<form method="post">
    <label for="username">Username</label>
    <input type="text" name="username" id="username">
    <label for="email">Email</label>
    <input type="email" name="email" id="email">
    <input type="submit">
</form>

Here’s an example of posting form data to add a user to a database. Check request.method == "POST" to check if the form was submitted. Use keys from request.form to get the form data. Render an HTML template with a <form> otherwise. The fields in the form should have name attributes that match the keys in request.form.

from flask import Flask, request, render_template, redirect, url_for

app = Flask(__name__)

@app.route("/user/add", methods=["GET", "POST"])
def add_user():
    if request.method == "POST":
        user = User(
            username=request.form["username"],
            email=request.form["email"],
        )
        db.session.add(user)
        db.session.commit()
        return redirect(url_for("index"))

    return render_template("add_user.html")

<form method="post">
    <label for="username">Username</label>
    <input type="text" name="username" id="username">
    <label for="email">Email</label>
    <input type="email" name="email" id="email">
    <input type="submit">
</form>

回答 13

如果内容类型被识别为表单数据,request.data则将其解析为表单request.form并返回一个空字符串。

要获取原始数据,而不管内容类型如何,请调用request.get_data()request.data直接调用get_data(parse_form_data=True),而默认值为False直接调用。

If the content type is recognized as form data, request.data will parse that into request.form and return an empty string.

To get the raw data regardless of content type, call request.get_data(). request.data calls get_data(parse_form_data=True), while the default is False if you call it directly.


回答 14

如果将正文识别为表单数据,它将在中request.form。如果是JSON,它将位于中request.get_json()。否则,原始数据将在中request.data。如果不确定如何提交数据,可以使用or链来获取第一个包含数据的链。

def get_request_data():
    return (
        request.args
        or request.form
        or request.get_json(force=True, silent=True)
        or request.data
    )

request.args包含从查询字符串中解析出的args,无论主体是什么,因此get_request_data()如果它和主体都应同时进行数据处理,则可以将其删除。

If the body is recognized as form data, it will be in request.form. If it’s JSON, it will be in request.get_json(). Otherwise the raw data will be in request.data. If you’re not sure how data will be submitted, you can use an or chain to get the first one with data.

def get_request_data():
    return (
        request.args
        or request.form
        or request.get_json(force=True, silent=True)
        or request.data
    )

request.args contains args parsed from the query string, regardless of what was in the body, so you would remove that from get_request_data() if both it and a body could contain data at the same time.
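The or chain works because empty containers are falsy in Python. The sketch below mimics it with plain dicts and bytes standing in for the request attributes (the sample values are made up), and also shows what comes back when every source is empty.

```python
# Stand-ins for request.args, request.form, request.get_json(...) and request.data
args, form, json_body, raw = {}, {}, {"language": "Python"}, b""

# 'or' returns the first truthy operand, so empty dicts are skipped.
picked = args or form or json_body or raw

# If nothing is truthy, 'or' returns the last operand, here the empty bytes.
all_empty = {} or {} or None or b""

print(picked, all_empty)
```

One caveat of this approach: a valid JSON body that is itself falsy, such as 0, false, or an empty list, would be skipped in the same way.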


回答 15

使用HTML表单发布表单数据时,请确保input标签具有name属性,否则它们将不会出现在中request.form

@app.route('/', methods=['GET', 'POST'])
def index():
    print(request.form)
    return """
<form method="post">
    <input type="text">
    <input type="text" id="txt2">
    <input type="text" name="txt3" id="txt3">  
    <input type="submit">
</form>
"""

ImmutableMultiDict([('txt3', 'text 3')])

只有txt3输入中有一个name,因此它是中唯一的键request.form

When posting form data with an HTML form, be sure the input tags have name attributes, otherwise they won’t be present in request.form.

@app.route('/', methods=['GET', 'POST'])
def index():
    print(request.form)
    return """
<form method="post">
    <input type="text">
    <input type="text" id="txt2">
    <input type="text" name="txt3" id="txt3">  
    <input type="submit">
</form>
"""

ImmutableMultiDict([('txt3', 'text 3')])

Only the txt3 input had a name, so it’s the only key present in request.form.


List comprehension vs. map

Question: List comprehension vs. map

有理由更喜欢使用map()列表理解吗?反之亦然?它们中的一个通常比另一个效率更高,或者通常被认为比另一个更Python化吗?

Is there a reason to prefer using map() over list comprehension or vice versa? Is either of them generally more efficient or considered generally more pythonic than the other?


回答 0

map在某些情况下(如果您不是出于此目的而使用lambda,而是在map和listcomp中使用相同的函数),在微观上可能会更快。在其他情况下,列表理解可能会更快,并且大多数(并非全部)pythonista用户认为列表更直接,更清晰。

使用完全相同的函数时map的微小速度优势的一个示例:

$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

当地图需要使用lambda时,如何完全颠倒性能比较的示例:

$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop

map may be microscopically faster in some cases (when you’re NOT making a lambda for the purpose, but using the same function in map and a listcomp). List comprehensions may be faster in other cases and most (not all) pythonistas consider them more direct and clearer.

An example of the tiny speed advantage of map when using exactly the same function:

$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

An example of how performance comparison gets completely reversed when map needs a lambda:

$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop

回答 1

案例

  • 常见情况:几乎总是,您将要在python中使用列表推导,因为对于新手程序员来说,阅读代码会更加明显。(这不适用于可能适用其他习惯用法的其他语言。)由于列表推导是python中用于迭代的事实上的标准,因此您对python程序员所做的工作甚至会更加明显。他们是预期的
  • 较少见的情况:但是,如果您已经定义了一个函数,则使用通常是合理的map,尽管它被认为是“非pythonic”的。例如,map(sum, myLists)比更加优雅/简洁[sum(x) for x in myLists]。您可以不必编写一个虚拟变量(例如sum(x) for x...or sum(_) for _...sum(readableName) for readableName...),而只需键入两次即可进行迭代,从而获得了优雅。同样的道理也适用于filterreduce从什么itertools模块:如果你已经有一个方便的功能,您可以继续前进,做一些函数式编程。在某些情况下,这会提高可读性,而在其他情况下(例如,新手程序员,多个参数),则会失去可读性。但是,无论如何,代码的可读性在很大程度上取决于注释。
  • 几乎永远不会map在进行函数编程时,您可能希望将函数用作纯抽象函数,在这种情况下您正在映射map,currying map,或者从map以函数的形式进行讨论中受益。例如,在Haskell中,一个称为functor的接口可以fmap概括任何数据结构上的映射。这在python中非常罕见,因为python语法迫使您使用生成器样式来谈论迭代;您不能轻易将其概括。(这有时是好事,有时是坏事。)您可能会想出一些罕见的python例子,这map(f, *lists)是合理的事情。我能想到的最接近的示例是sumEach = partial(map,sum),这是一个单行代码,大致相当于:

def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • 仅使用for-loop:您当然也可以使用for循环。尽管从函数式编程的角度来看并不那么优雅,但有时​​非局部变量使命令式编程语言(例如python)中的代码更清晰,因为人们已经非常习惯于以这种方式读取代码。通常,当您仅执行任何不构建列表的复杂操作(例如列表理解和映射)(例如,求和或制作树等)时,for循环也是最有效的。就内存而言,它是高效的(不必在时间上,我希望在最坏的情况下,它是一个恒定的因素,除非出现一些罕见的病理性垃圾收集问题)。

“ Python主义”

我不喜欢“ pythonic”一词,因为我发现pythonic在我眼中并不总是那么优雅。然而,mapfilter和类似的功能(如非常有用的itertools模块)很可能在风格方面考虑unpythonic。

懒惰

就效率而言,就像大多数函数式编程构造一样,MAP可以是LAZY,实际上在python中是懒惰的。这意味着您可以执行此操作(在python3中),并且计算机不会耗尽内存,并且不会丢失所有未保存的数据:

>>> map(str, range(10**100))
<map object at 0x2201d50>

尝试通过列表理解做到这一点:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

请注意,列表推导本质上也是惰性的,但是python选择将其实现为非惰性的。不过,python确实以生成器表达式的形式支持惰性列表推导,如下所示:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

您基本上可以将[...]语法视为将生成器表达式传递给list构造函数,例如list(x for x in range(5))

简短的人为例子

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

列表推导是非延迟的,因此可能需要更多内存(除非您使用生成器推导)。方括号[...]通常使事情变得显而易见,尤其是在括号中。另一方面,有时您最终会变得像打字一样冗长[x for x in...。只要您使迭代器变量简短,如果不缩进代码,列表解析通常会更加清晰。但是您总是可以缩进代码。

print(
    {x:x**2 for x in (-y for y in range(5))}
)

或分手:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

python3的效率比较

map 现在很懒:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

因此,如果您将不使用所有数据,或者不提前知道需要多少数据,那么map在python3中(以及python2或python3中的生成器表达式)将避免计算它们的值,直到最后一刻。通常,这通常会超过使用带来的任何开销map。不利之处在于,与大多数功能语言相反,这在python中非常有限:只有按“顺序”从左至右访问数据时,您才能获得此好处,因为python生成器表达式只能按order求值x[0], x[1], x[2], ...

但是,假设我们有一个f想要的预制函数map,并且我们忽略了map通过立即强制使用来进行赋值的懒惰list(...)。我们得到一些非常有趣的结果:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

结果为AAA / BBB / CCC格式,其中A在带有python 3。?。?的约2010年英特尔工作站上执行,而B和C则是在python 3.2.1的约2013年AMD工作站上执行,具有截然不同的硬件。结果似乎是,地图和列表理解的性能可比,这受其他随机因素的影响最大。我们可以告诉的唯一的事情似乎是,奇怪的是,虽然我们期待列表解析[...]比生成器表达式更好地发挥(...)map也更高效,生成器表达式(再次假设计算所有的值/使用)。

重要的是要意识到这些测试假设一个非常简单的功能(身份功能)。但是这很好,因为如果功能复杂,那么与程序中的其他因素相比,性能开销可以忽略不计。(测试其他简单的东西可能仍然很有趣,例如f=lambda x:x+x

如果您精通python汇编语言,则可以使用该dis模块来查看这是否是幕后真正发生的事情:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

似乎使用[...]语法比更好list(...)。遗憾的是,map该类在拆卸方面有点不透明,但是我们可以通过速度测试来确定。

Cases

  • Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you’re doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you’re doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected.
  • Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered ‘unpythonic’. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)… but the readability of your code highly depends on your comments anyway.
  • Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you’re mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can’t generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:

def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) — at least efficient in terms of memory (not necessarily in terms of time, where I’d expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).
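The partial(map, sum) one-liner mentioned in the "Almost never" case can be tried as follows. This is a minimal sketch assuming Python 3, where map returns a lazy iterator, so the result is wrapped in list(...) before comparing it with the comprehension version. The names sum_each and sum_each_comp are illustrative and not from the original answer.

```python
from functools import partial

# Currying map with sum: sum_each(lists) behaves like map(sum, lists).
sum_each = partial(map, sum)


def sum_each_comp(my_lists):
    # The list-comprehension counterpart shown in the answer.
    return [sum(row) for row in my_lists]


my_lists = [[1, 2, 3], [4, 5], []]

# map is lazy in Python 3, so force it with list() to see the values.
print(list(sum_each(my_lists)))
print(sum_each_comp(my_lists))
```

The two differ in one respect: the curried version hands back an iterator that can be consumed only once, while the comprehension returns a list.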

“Pythonism”

I dislike the word “pythonic” because I don’t find that pythonic is always elegant in my eyes. Nevertheless, map and filter and similar functions (like the very useful itertools module) are probably considered unpythonic in terms of style.

Laziness

In terms of efficiency, like most functional programming constructs, MAP CAN BE LAZY, and in fact is lazy in python. That means you can do this (in python3) and your computer will not run out of memory and lose all your unsaved data:

>>> map(str, range(10**100))
<map object at 0x2201d50>

Try doing that with a list comprehension:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

Do note that list comprehensions are also inherently lazy, but python has chosen to implement them as non-lazy. Nevertheless, python does support lazy list comprehensions in the form of generator expressions, as follows:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

You can basically think of the [...] syntax as passing in a generator expression to the list constructor, like list(x for x in range(5)).
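To make the laziness concrete, here is a small sketch (assuming Python 3) that takes only the first few items from the huge lazy sequences above using itertools.islice, and then checks that [...] and list(<generator expression>) produce the same list. The variable names are illustrative.

```python
from itertools import islice

# Neither line below computes 10**100 strings; values are produced on demand.
lazy_map = map(str, range(10**100))
lazy_gen = (str(n) for n in range(10**100))

# Pull just three items from each lazy iterator.
first_from_map = list(islice(lazy_map, 3))
first_from_gen = list(islice(lazy_gen, 3))
print(first_from_map, first_from_gen)

# The [...] syntax builds the same list as passing a generator to list().
same = [x for x in range(5)] == list(x for x in range(5))
print(same)
```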

Brief contrived example

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

List comprehensions are non-lazy, so may require more memory (unless you use generator comprehensions). The square brackets [...] often make things obvious, especially when in a mess of parentheses. On the other hand, sometimes you end up being verbose like typing [x for x in.... As long as you keep your iterator variables short, list comprehensions are usually clearer if you don’t indent your code. But you could always indent your code.

print(
    {x:x**2 for x in (-y for y in range(5))}
)

or break things up:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

Efficiency comparison for python3

map is now lazy:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore, if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. This will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right “in order”, because python generator expressions can only be evaluated in the order x[0], x[1], x[2], ....
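The deferred, strictly in-order evaluation described above can be observed by recording when the mapped function actually runs. This is a sketch for Python 3; tracked_square and calls are hypothetical names used only for illustration.

```python
calls = []


def tracked_square(x):
    # Record each argument so we can see when the function really runs.
    calls.append(x)
    return x * x


squares = map(tracked_square, range(1000))
calls_before = len(calls)      # nothing has been computed yet

first = next(squares)          # computes only the value for x[0]
second = next(squares)         # then only the value for x[1]
calls_after = len(calls)

print(calls_before, calls_after, first, second)
```

Only the items actually requested are computed, and they can only be requested from left to right.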

However let’s say that we have a pre-made function f we’d like to map, and we ignore the laziness of map by immediately forcing evaluation with list(...). We get some very interesting results:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

The results are in the form AAA/BBB/CCC, where A was measured on a circa-2010 Intel workstation with python 3.?.?, and B and C were measured on a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient than generator expressions (again assuming that all values are evaluated/used).

It is important to realize that these tests assume a very simple function (the identity function); however this is fine because if the function were complicated, then performance overhead would be negligible compared to other factors in the program. (It may still be interesting to test with other simple things like f=lambda x:x+x)

If you’re skilled at reading python assembly, you can use the dis module to see if that’s actually what’s going on behind the scenes:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

It seems it is better to use [...] syntax than list(...). Sadly the map class is a bit opaque to disassembly, but we can make do with our speed test.
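The disassembly listings above come from an old CPython 3 release. Opcode names and offsets differ between versions, and newer releases may compile comprehensions inline rather than as a separate code object. The sketch below does not rely on exact offsets: it collects opcode names recursively with the dis module, so the comparison can be re-run on whatever interpreter is at hand. The helper name opnames is an assumption made for this example.

```python
import dis
import types


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names.extend(opnames(const))
    return names


list_comp = compile("[f(x) for x in xs]", "listComp", "eval")
gen_in_list = compile("list(f(x) for x in xs)", "listComp2", "eval")

comp_ops = opnames(list_comp)
gen_ops = opnames(gen_in_list)

# The comprehension appends straight into a list; the generator version
# yields each value back to list(), which is one extra hand-off per item.
print("LIST_APPEND" in comp_ops, "YIELD_VALUE" in gen_ops)
```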


回答 2

Python 2:您应该使用mapfilter而不是列表推导。

即使它们不是“ Pythonic”的,您还是还是偏爱它们的一个客观原因是:
它们需要函数/ lambda作为参数,从而引入了新的作用域

我被这个不止一次地咬了:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

但如果相反,我曾说过:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

那一切都会好起来的

您可能会说我在相同范围内使用相同的变量名很愚蠢。

我不是 最初的代码很好-两者x不在同一范围内。
直到我内部块移到代码的不同部分之后,问题才出现(阅读:维护期间的问题,而不是开发过程中的问题),而且我没想到。

是的,如果您从未犯过此错误,则列表理解会更优雅。
但是从个人经验(和看到其他人犯同样的错误)中,我已经看到它发生了很多次,我认为当这些错误潜入您的代码中时,您不应该经历这种痛苦。

结论:

使用mapfilter。它们可以防止与范围相关的细微难以诊断的错误。

边注:

如果适合您的情况,请不要忘记考虑使用imapifilter(中的itertools)!

Python 2: You should use map and filter instead of list comprehensions.

An objective reason why you should prefer them even though they’re not “Pythonic” is this:
They require functions/lambdas as arguments, which introduce a new scope.

I’ve gotten bitten by this more than once:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

but if instead I had said:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

then everything would’ve been fine.

You could say I was being silly for using the same variable name in the same scope.

I wasn’t. The code was fine originally — the two xs weren’t in the same scope.
It was only after I moved the inner block to a different section of the code that the problem came up (read: problem during maintenance, not development), and I didn’t expect it.

Yes, if you never make this mistake then list comprehensions are more elegant.
But from personal experience (and from seeing others make the same mistake) I’ve seen it happen enough times that I think it’s not worth the pain you have to go through when these bugs creep into your code.

Conclusion:

Use map and filter. They prevent subtle hard-to-diagnose scope-related bugs.

Side note:

Don’t forget to consider using imap and ifilter (in itertools) if they are appropriate for your situation!
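Note that the answer above is explicitly about Python 2. In Python 3, a list comprehension has its own scope, so its loop variable no longer overwrites a variable of the same name outside it. A minimal sketch of the same situation, run under Python 3 (the data values are made up for illustration):

```python
some_points = [(10, 20), (30, 40)]
numbers = [1, 2, 3]

seen_x = []
for x, y in some_points:
    # In Python 2 this comprehension would rebind the outer x to 3.
    squared = [x ** 2 for x in numbers]
    # In Python 3 the outer x is untouched.
    seen_x.append(x)

print(seen_x)    # [10, 30] on Python 3
print(squared)   # [1, 4, 9]
```

Plain for-loops still share the enclosing scope in both versions, so the concern remains valid for nested loops that reuse a name.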


回答 3

实际上,map列表理解在Python 3语言中的行为大不相同。看一下下面的Python 3程序:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

您可能希望它打印两次“ [1,4,9]”行,但是打印“ [1,4,9]”后跟“ []”。第一次查看时,squares它似乎表现为三个元素的序列,但是第二次查看时为空元素。

在Python 2语言中,会map返回一个普通的旧列表,就像列表推导在两种语言中一样。症结在于,mapPython 3(和imapPython 2)中的return值不是一个列表-它是一个迭代器!

与遍历列表不同,遍历迭代器时将消耗元素。这就是为什么squares在最后print(list(squares))一行看起来空白。

总结一下:

  • 在处理迭代器时,必须记住它们是有状态的,并且在遍历它们时会发生变化。
  • 列表更容易预测,因为它们仅在您显式对其进行更改时才会更改;他们是容器
  • 还有一个好处:数字,字符串和元组甚至可以更容易预测,因为它们根本无法更改;它们是价值

Actually, map and list comprehensions behave quite differently in the Python 3 language. Take a look at the following Python 3 program:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

You might expect it to print the line “[1, 4, 9]” twice, but instead it prints “[1, 4, 9]” followed by “[]”. The first time you look at squares it seems to behave as a sequence of three elements, but the second time as an empty one.

In the Python 2 language map returns a plain old list, just like list comprehensions do in both languages. The crux is that the return value of map in Python 3 (and imap in Python 2) is not a list – it’s an iterator!

The elements are consumed when you iterate over an iterator unlike when you iterate over a list. This is why squares looks empty in the last print(list(squares)) line.

To summarize:

  • When dealing with iterators you have to remember that they are stateful and that they mutate as you traverse them.
  • Lists are more predictable since they only change when you explicitly mutate them; they are containers.
  • And a bonus: numbers, strings, and tuples are even more predictable since they cannot change at all; they are values.
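If the mapped values are needed more than once, one common workaround is to materialize the iterator into a list a single time. A small sketch contrasting the two behaviours in Python 3 (variable names are illustrative):

```python
def square(x):
    return x * x


# An iterator: the second pass finds nothing left.
squares_iter = map(square, [1, 2, 3])
first_pass = list(squares_iter)
second_pass = list(squares_iter)
print(first_pass, second_pass)

# A list: it can be traversed as many times as needed.
squares_list = list(map(square, [1, 2, 3]))
print(list(squares_list), list(squares_list))
```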

回答 4

我发现列表理解通常比我要表达的要表达的要多map-它们都可以完成,但是前者节省了试图理解什么可能是复杂lambda表达的精神负担。

在某个地方(我无法找到它)也有一次采访,其中Guido列出了lambdas和函数功能,这是他最后悔接受Python的事情,因此您可以凭借这些观点认为它们是非Python的其中。

I find list comprehensions are generally more expressive of what I’m trying to do than map – they both get it done, but the former saves the mental load of trying to understand what could be a complex lambda expression.

There’s also an interview out there somewhere (I can’t find it offhand) where Guido lists lambdas and the functional functions as the thing he most regrets about accepting into Python, so you could make the argument that they’re un-Pythonic by virtue of that.


回答 5

这是一种可能的情况:

map(lambda op1,op2: op1*op2, list1, list2)

与:

[op1*op2 for op1,op2 in zip(list1,list2)]

我猜想zip()是一个不幸的和不必要的开销,如果您坚持使用列表推导而不是地图,则需要沉迷于此。如果有人肯定或否定这一点,那就太好了。

Here is one possible case:

map(lambda op1,op2: op1*op2, list1, list2)

versus:

[op1*op2 for op1,op2 in zip(list1,list2)]

I am guessing that zip() is an unfortunate and unnecessary overhead you need to indulge in if you insist on using list comprehensions instead of map. It would be great if someone could confirm or refute this.
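Whether zip() costs anything noticeable is easy to measure rather than guess. The sketch below, assuming Python 3 and made-up input lists, times the two variants from the answer plus a map call using operator.mul instead of a lambda. No timings are quoted here because they depend on the machine and interpreter version.

```python
import timeit
from operator import mul

list1 = list(range(1000))
list2 = list(range(1000, 2000))


def with_map_lambda():
    return list(map(lambda op1, op2: op1 * op2, list1, list2))


def with_map_mul():
    # Same computation, but with a built-in function instead of a lambda.
    return list(map(mul, list1, list2))


def with_zip_comprehension():
    return [op1 * op2 for op1, op2 in zip(list1, list2)]


for fn in (with_map_lambda, with_map_mul, with_zip_comprehension):
    seconds = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {seconds:.4f} s for 200 runs")
```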


回答 6

如果您打算编写任何异步,并行或分布式代码,则您可能会更喜欢map列表理解-因为大多数异步,并行或分布式程序包都提供了map使python过载的功能map。然后,通过将适当的map函数传递给其余代码,您可能不必修改原始串行代码即可使其并行运行(等)。

If you plan on writing any asynchronous, parallel, or distributed code, you will probably prefer map over a list comprehension — as most asynchronous, parallel, or distributed packages provide a map function to overload python’s map. Then by passing the appropriate map function to the rest of your code, you may not have to modify your original serial code to have it run in parallel (etc).
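As an illustration of passing a different map function into otherwise unchanged code, here is a sketch using the standard library's concurrent.futures.ThreadPoolExecutor, whose map method returns results in input order. The function names process and square and the map_fn parameter are hypothetical; other parallel or distributed packages expose their own map variants with their own details.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def process(items, map_fn=map):
    # The serial code only assumes map_fn(function, iterable) -> iterable.
    return list(map_fn(square, items))


serial = process(range(10))

with ThreadPoolExecutor(max_workers=4) as pool:
    # Swap in the executor's map without touching process() itself.
    threaded = process(range(10), map_fn=pool.map)

print(serial == threaded)
```

A list comprehension inside process() would have to be rewritten to get the same flexibility.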


回答 7

因此,由于Python 3 map()是迭代器,因此您需要牢记所需的东西:迭代器或list对象。

正如@AlexMartelli已经提到的那样map()仅当您不使用lambda函数时,它比列表理解要快。

我将向您介绍一些时间比较。

Python 3.5.2和CPython
我使用了Jupiter笔记本,尤其是%timeit内置的魔术命令
测量:s == 1000 ms == 1000 * 1000 µs = 1000 * 1000 * 1000 ns

设定:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

内置功能:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda 功能:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

还有诸如生成器表达式之类的东西,请参阅PEP-0289。所以我认为将其添加到比较中将很有用

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

您需要list对象:

如果是自定义函数,list(map())则使用列表理解;如果有内置函数,则使用列表理解

您不需要list对象,只需要一个可迭代的对象:

始终使用map()

Since map() returns an iterator in Python 3, you need to keep in mind what you need: an iterator or a list object.

As @AlexMartelli already mentioned, map() is faster than a list comprehension only if you don’t use a lambda function.

I will present some timing comparisons.

Python 3.5.2 and CPython
I’ve used Jupyter notebook, in particular the built-in %timeit magic command
Measurements: 1 s == 1000 ms == 1000 * 1000 µs == 1000 * 1000 * 1000 ns

Setup:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

Built-in function:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda function:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

There is also such a thing as a generator expression, see PEP-0289. So I thought it would be useful to add it to the comparison.

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

You need a list object:

Use a list comprehension if it’s a custom function; use list(map()) if it’s a built-in function.

You don’t need a list object, you just need an iterable one:

Always use map()!


回答 8

我进行了一项快速测试,比较了三种调用对象方法的方法。在这种情况下,时差可以忽略不计,并且与所讨论的功能有关(请参阅@Alex Martelli的回复)。在这里,我查看了以下方法:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

我查看vals了整数(Python int)和浮点数(Python )的列表(存储在变量中),float以增加列表大小。DummyNum考虑以下虚拟类:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

具体来说,add方法。该__slots__属性是Python中的一种简单优化,用于定义类(属性)所需的总内存,从而减小了内存大小。这是结果图。

映射Python对象方法的性能

如前所述,所使用的技术差异很小,您应该以一种对您最易读的方式或在特定情况下进行编码。在这种情况下,列表推导(map_comprehension技巧)对于对象中两种类型的加法最快,尤其是对于较短的列表。

访问此pastebin,获取用于生成图和数据的源。

I ran a quick test comparing three methods for invoking the method of an object. The time difference, in this case, is negligible and is a matter of the function in question (see @Alex Martelli’s response). Here, I looked at the following methods:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

I looked at lists (stored in the variable vals) of both integers (Python int) and floating point numbers (Python float) for increasing list sizes. The following dummy class DummyNum is considered:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

Specifically, the add method. The __slots__ attribute is a simple optimization in Python to define the total memory needed by the class (attributes), reducing memory size. Here are the resulting plots.

Performance of mapping Python object methods

As stated previously, the technique used makes a minimal difference and you should code in a way that is most readable to you, or in the particular circumstance. In this case, the list comprehension (map_comprehension technique) is fastest for both types of additions in an object, especially with shorter lists.

Visit this pastebin for the source used to generate the plot and data.
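For readers who want to try the three invocation styles without the plotting code, here is a compact sketch assuming Python 3. It reuses the DummyNum class from the answer and simply checks that each style calls add() once per object; it makes no claim about relative speed.

```python
from operator import methodcaller


class DummyNum:
    """Dummy class from the answer, holding a single number."""
    __slots__ = ("n",)

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5


vals = [DummyNum(i) for i in range(3)]

# map_lambda: list() forces the lazy map so the calls actually happen.
list(map(lambda x: x.add(), vals))

# map_operator: methodcaller("add") builds a callable that invokes .add().
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

result = [v.n for v in vals]
print(result)   # each object was incremented by 5 three times
```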


回答 9

我认为最Python的方式是使用列表理解而不是mapand filter。原因是列表理解比map和更清晰filter

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

如您所见,理解并不需要额外的lambda表达式map。此外,理解还允许容易地过滤,同时map需要filter允许过滤。

I consider that the most Pythonic way is to use a list comprehension instead of map and filter. The reason is that list comprehensions are clearer than map and filter.

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

As you can see, a comprehension does not require the extra lambda expressions that map needs. Furthermore, a comprehension also allows filtering easily, while map requires filter to allow filtering.


回答 10

我尝试了@ alex-martelli的代码,但发现了一些差异

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop
python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

映射即使在非常大的范围内也要花费相同的时间,而使用列表理解则要花费很多时间,这从我的代码中可以明显看出。因此,除了被视为“ unpythonic”之外,我还没有遇到任何与使用map有关的性能问题。

I tried the code by @alex-martelli but found some discrepancies

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop
python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

map takes the same amount of time even for very large ranges while using list comprehension takes a lot of time as is evident from my code. So apart from being considered “unpythonic”, I have not faced any performance issues relating to usage of map.
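A caveat on the measurement above, assuming it was run on Python 3 (the "best of 5" output suggests so): map(hex, xs) only creates a lazy iterator and never calls hex, which is why its time does not grow with the range. A fairer comparison forces evaluation with list(...). The sketch below uses the timeit module directly; absolute numbers will vary by machine, so none are quoted.

```python
import timeit

xs = range(123456)

# Creating the iterator does no per-element work.
create_only = timeit.timeit(lambda: map(hex, xs), number=5)

# Forcing evaluation makes map do the same work as the comprehension.
forced_map = timeit.timeit(lambda: list(map(hex, xs)), number=5)
comprehension = timeit.timeit(lambda: [hex(x) for x in xs], number=5)

print(f"map (iterator only): {create_only:.6f} s")
print(f"list(map(...)):      {forced_map:.6f} s")
print(f"list comprehension:  {comprehension:.6f} s")
```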


Python应用程序的最佳项目结构是什么?[关闭]

问题:Python应用程序的最佳项目结构是什么?[关闭]

想象一下,您想使用Python开发非平凡的最终用户桌面(非Web)应用程序。构造项目文件夹层次结构的最佳方法是什么?

理想的功能是易于维护,IDE友好,适用于源代码控制分支/合并以及易于生成安装软件包。

特别是:

  1. 您将源放在哪里?
  2. 您将应用程序启动脚本放在哪里?
  3. 您将IDE项目放在哪里?
  4. 您将单元/验收测试放在哪里?
  5. 您将非Python数据(例如配置文件)放在哪里?
  6. 您在哪里将非Python来源(例如C ++)用于pyd / so二进制扩展模块?

Imagine that you want to develop a non-trivial end-user desktop (not web) application in Python. What is the best way to structure the project’s folder hierarchy?

Desirable features are ease of maintenance, IDE-friendliness, suitability for source control branching/merging, and easy generation of install packages.

In particular:

  1. Where do you put the source?
  2. Where do you put application startup scripts?
  3. Where do you put the IDE project cruft?
  4. Where do you put the unit/acceptance tests?
  5. Where do you put non-Python data such as config files?
  6. Where do you put non-Python sources such as C++ for pyd/so binary extension modules?

回答 0

没什么大不了的。令您快乐的一切都会起作用。没有很多愚蠢的规则,因为Python项目可以很简单。

  • /scripts/bin那种命令行界面的东西
  • /tests 为您的测试
  • /lib 用于您的C语言库
  • /doc 对于大多数文档
  • /apidoc 用于Epydoc生成的API文档。

顶级目录可以包含自述文件,配置文件和其他内容。

困难的选择是是否使用/src树。Python没有区别/src/lib/bin如Java或C具有。

由于/src某些人认为顶层目录没有意义,因此顶层目录可以是应用程序的顶层体系结构。

  • /foo
  • /bar
  • /baz

我建议将所有这些都放在“我的产品名称”目录下。因此,如果您正在编写名为的应用程序quux,则包含所有这些内容的目录将命名为 /quux

这样,另一个项目PYTHONPATH可以包括/path/to/quux/foo重用QUUX.foo模块。

就我而言,由于我使用Komodo Edit,所以我的IDE cuft是单个.KPF文件。实际上,我将其放在顶层/quux目录中,并省略了将其添加到SVN中的情况。

It doesn’t matter too much. Whatever makes you happy will work. There aren’t a lot of silly rules because Python projects can be simple.

  • /scripts or /bin for that kind of command-line interface stuff
  • /tests for your tests
  • /lib for your C-language libraries
  • /doc for most documentation
  • /apidoc for the Epydoc-generated API docs.

And the top-level directory can contain READMEs, config files, and whatnot.

The hard choice is whether or not to use a /src tree. Python doesn’t distinguish between /src, /lib, and /bin the way Java or C does.

Since a top-level /src directory is seen by some as meaningless, your top-level directory can be the top-level architecture of your application.

  • /foo
  • /bar
  • /baz

I recommend putting all of this under the “name-of-my-product” directory. So, if you’re writing an application named quux, the directory that contains all this stuff is named /quux.

Another project’s PYTHONPATH, then, can include /path/to/quux/foo to reuse the QUUX.foo module.

In my case, since I use Komodo Edit, my IDE cruft is a single .KPF file. I actually put that in the top-level /quux directory, and leave it out of SVN.


Answer 1

According to Jean-Paul Calderone’s Filesystem structure of a Python project:

Project/
|-- bin/
|   |-- project
|
|-- project/
|   |-- test/
|   |   |-- __init__.py
|   |   |-- test_main.py
|   |   
|   |-- __init__.py
|   |-- main.py
|
|-- setup.py
|-- README

Answer 2

This blog post by Jean-Paul Calderone is commonly given as an answer in #python on Freenode.

Filesystem structure of a Python project

Do:

  • name the directory something related to your project. For example, if your project is named “Twisted”, name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
  • create a directory Twisted/bin and put your executables there, if you have any. Don’t give them a .py extension, even if they are Python source files. Don’t put any code in them except an import of and call to a main function defined somewhere else in your project. (Slight wrinkle: since on Windows, the interpreter is selected by the file extension, your Windows users actually do want the .py extension. So, when you package for Windows, you may want to add it. Unfortunately there’s no easy distutils trick that I know of to automate this process. Considering that on POSIX the .py extension is only a wart, whereas on Windows the lack is an actual bug, if your userbase includes Windows users, you may want to opt to just have the .py extension everywhere.)
  • If your project is expressible as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
  • put your unit tests in a sub-package of your package (note – this means that the single Python source file option above was a trick – you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
  • add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you’re feeling nice.

Don’t:

  • put your source in a directory called src or lib. This makes it hard to run without installing.
  • put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
  • create a package that only has an __init__.py and then put all your code into __init__.py. Just make a module instead of a package; it’s simpler.
  • try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn’t work in their environment.
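The rules above can be sketched as a small script that builds such a tree and runs the launcher. This is only an illustration: the project name Demo, the package demo, and the helper names build_tree and run_launcher are all made up here, and setup.py is left out to keep it short. The launcher contains nothing but an import and a call, and the project root reaches the import path through PYTHONPATH rather than through any trick.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical project "Demo" with package "demo"; every name is illustrative.
FILES = {
    # Launcher: no .py extension, only an import and a call.
    "bin/demo": "from demo.main import main\nmain()\n",
    "demo/__init__.py": "",
    "demo/main.py": "def main():\n    print('hello from demo.main')\n",
    # Tests live in a sub-package of the package itself.
    "demo/test/__init__.py": "",
    "demo/test/test_main.py": (
        "import unittest\n"
        "from demo import main\n\n"
        "class MainTest(unittest.TestCase):\n"
        "    def test_has_main(self):\n"
        "        self.assertTrue(callable(main.main))\n"
    ),
    "README": "Demo: a placeholder project.\n",
}


def build_tree(root: Path) -> None:
    """Write every file in FILES below root, creating directories as needed."""
    for relative, content in FILES.items():
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)


def run_launcher(root: Path) -> str:
    """Run bin/demo with the project root on PYTHONPATH and return its stdout."""
    env = dict(os.environ, PYTHONPATH=str(root))
    result = subprocess.run(
        [sys.executable, str(root / "bin" / "demo")],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp) / "Demo"
    build_tree(project_root)
    launcher_output = run_launcher(project_root)
```

Because the package sits directly under the project root, pointing PYTHONPATH at that root is enough to run the launcher without installing anything.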

Answer 3

Check out Open Sourcing a Python Project the Right Way.

Let me excerpt the project layout part of that excellent article:

When setting up a project, the layout (or directory structure) is important to get right. A sensible layout means that potential contributors don’t have to spend forever hunting for a piece of code; file locations are intuitive. Since we’re dealing with an existing project, it means you’ll probably need to move some stuff around.

Let’s start at the top. Most projects have a number of top-level files (like setup.py, README.md, requirements.txt, etc). There are then three directories that every project should have:

  • A docs directory containing project documentation
  • A directory named with the project’s name which stores the actual Python package
  • A test directory in one of two places
    • Under the package directory containing test code and resources
    • As a stand-alone top level directory

To get a better sense of how your files should be organized, here’s a simplified snapshot of the layout for one of my projects, sandman:

$ pwd
~/code/sandman
$ tree
.
|- LICENSE
|- README.md
|- TODO.md
|- docs
|   |-- conf.py
|   |-- generated
|   |-- index.rst
|   |-- installation.rst
|   |-- modules.rst
|   |-- quickstart.rst
|   |-- sandman.rst
|- requirements.txt
|- sandman
|   |-- __init__.py
|   |-- exception.py
|   |-- model.py
|   |-- sandman.py
|   |-- test
|       |-- models.py
|       |-- test_sandman.py
|- setup.py

As you can see, there are some top level files, a docs directory (generated is an empty directory where sphinx will put the generated documentation), a sandman directory, and a test directory under sandman.


Answer 4

The “Python Packaging Authority” has a sampleproject:

https://github.com/pypa/sampleproject

It is a sample project that exists as an aid to the Python Packaging User Guide’s Tutorial on Packaging and Distributing Projects.


Answer 5

Try starting the project using the python_boilerplate template. It largely follows the best practices (e.g. those here), but is better suited in case you find yourself wanting to split your project into more than one egg at some point (and believe me, with anything but the simplest projects, you will). One common situation is where you have to use a locally-modified version of someone else’s library.

  • Where do you put the source?

    • For decently large projects it makes sense to split the source into several eggs. Each egg would go as a separate setuptools-layout under PROJECT_ROOT/src/<egg_name>.
  • Where do you put application startup scripts?

    • The ideal option is to have application startup script registered as an entry_point in one of the eggs.
  • Where do you put the IDE project cruft?

    • Depends on the IDE. Many of them keep their stuff in PROJECT_ROOT/.<something> in the root of the project, and this is fine.
  • Where do you put the unit/acceptance tests?

    • Each egg has a separate set of tests, kept in its PROJECT_ROOT/src/<egg_name>/tests directory. I personally prefer to use py.test to run them.
  • Where do you put non-Python data such as config files?

    • It depends. There can be different types of non-Python data.
      • “Resources”, i.e. data that must be packaged within an egg. This data goes into the corresponding egg directory, somewhere within the package namespace. It can be used via the pkg_resources package from setuptools, or since Python 3.7 via the importlib.resources module from the standard library.
      • “Config-files”, i.e. non-Python files that are to be regarded as external to the project source files, but have to be initialized with some values when the application starts running. During development I prefer to keep such files in PROJECT_ROOT/config. For deployment there can be various options. On Windows one can use %APPDATA%/<app-name>/config, on Linux, /etc/<app-name> or /opt/<app-name>/config.
      • Generated files, i.e. files that may be created or modified by the application during execution. I would prefer to keep them in PROJECT_ROOT/var during development, and under /var during Linux deployment.
  • Where do you put non-Python sources such as C++ for pyd/so binary extension modules?
    • Into PROJECT_ROOT/src/<egg_name>/native
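For the “Resources” case, here is a minimal sketch of reading a data file that ships inside a package through importlib.resources. The package name demo_pkg and the file defaults.cfg are invented for the example, and the package is created in a temporary directory only so the snippet can run on its own. It uses importlib.resources.files(), which I believe needs Python 3.9 or later; on 3.7 and 3.8 the older read_text() function in the same module plays that role.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

# "demo_pkg" and "defaults.cfg" are made-up names for this sketch.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "demo_pkg"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "defaults.cfg").write_text("[app]\nname = demo\n")

    sys.path.insert(0, tmp)        # make the temporary package importable
    importlib.invalidate_caches()
    try:
        # Locate the file relative to the package, not relative to the cwd.
        resource = importlib.resources.files("demo_pkg").joinpath("defaults.cfg")
        config_text = resource.read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)
```

The point of going through the package instead of building a path by hand is that the lookup keeps working wherever the package ends up installed.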

Documentation would typically go into PROJECT_ROOT/doc or PROJECT_ROOT/src/<egg_name>/doc (this depends on whether you regard some of the eggs to be separate large projects). Some additional configuration will be in files like PROJECT_ROOT/buildout.cfg and PROJECT_ROOT/setup.cfg.


Answer 6

In my experience, it’s just a matter of iteration. Put your data and code wherever you think they go. Chances are, you’ll be wrong anyway. But once you get a better idea of exactly how things are going to shape up, you’re in a much better position to make these kinds of guesses.

As for extension sources, we have a Code directory under trunk that contains a directory for python and a directory for various other languages. Personally, I’m more inclined to try putting any extension code into its own repository next time around.

With that said, I go back to my initial point: don’t make too big a deal out of it. Put it somewhere that seems to work for you. If you find something that doesn’t work, it can (and should) be changed.


Answer 7

Non-Python data is best bundled inside your Python modules using the package_data support in setuptools. One thing I strongly recommend is using namespace packages to create shared namespaces which multiple projects can use — much like the Java convention of putting packages in com.yourcompany.yourproject (and being able to have a shared com.yourcompany.utils namespace).
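As a rough sketch of what the package_data part looks like, here are the keyword arguments a setup.py might pass to setuptools.setup(). The project name quux and the data and templates sub-directories are placeholders, and the actual setup() call is left as a comment so the snippet stays runnable without building anything.

```python
# Keyword arguments a setup.py could pass to setuptools.setup().
# "quux" and the file patterns are placeholders for this sketch.
SETUP_KWARGS = {
    "name": "quux",
    "version": "0.1",
    "packages": ["quux"],
    # Maps a package name to glob patterns, relative to that package's directory.
    "package_data": {"quux": ["data/*.cfg", "templates/*.txt"]},
}

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

Files matched by those patterns are installed next to the package’s modules, which is what lets the code find them again at run time.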

Re branching and merging, if you use a good enough source control system it will handle merges even through renames; Bazaar is particularly good at this.

Contrary to some other answers here, I’m +1 on having a src directory top-level (with doc and test directories alongside). Specific conventions for documentation directory trees will vary depending on what you’re using; Sphinx, for instance, has its own conventions which its quickstart tool supports.

Please, please leverage setuptools and pkg_resources; this makes it much easier for other projects to rely on specific versions of your code (and for multiple versions to be simultaneously installed with different non-code files, if you’re using package_data).


Is arr.__len__() the preferred way to get the length of an array in Python?

Question: Is arr.__len__() the preferred way to get the length of an array in Python?

In Python, is the following the only way to get the number of elements?

arr.__len__()

If so, why the strange syntax?


Answer 0

my_list = [1,2,3,4,5]
len(my_list)
# 5

The same works for tuples:

my_tuple = (1,2,3,4,5)
len(my_tuple)
# 5

And strings, which are sequences of characters:

my_string = 'hello world'
len(my_string)
# 11

It was intentionally done this way so that lists, tuples and other container types or iterables didn’t all need to explicitly implement a public .length() method; instead, you can just check the len() of anything that implements the ‘magic’ __len__() method.

Sure, this may seem redundant, but length checking implementations can vary considerably, even within the same language. It’s not uncommon to see one collection type use a .length() method while another type uses a .length property, while yet another uses .count(). Having a single built-in function unifies the entry point for all these types. So even objects you may not consider to be lists of elements could still be length-checked. This includes strings, queues, trees, etc.

The functional nature of len() also lends itself well to functional styles of programming.

list_of_containers = [[1, 2], 'abc', (1,)]
lengths = list(map(len, list_of_containers))  # [2, 3, 1]

Answer 1

The way to take the length of anything for which that makes sense (a list, dictionary, tuple, string, …) is to call len on it.

l = [1,2,3,4]
s = 'abcde'
len(l) #returns 4
len(s) #returns 5

The reason for the “strange” syntax is that internally Python translates len(object) into a call to the object’s __len__() method. This applies to any object whose type defines that method. So, if you are defining some class and it makes sense for it to have a length, just define a __len__() method on it and then one can call len on those instances.
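A minimal sketch of that last point, using a made-up Playlist class (the class and its attribute names are just for illustration):

```python
class Playlist:
    """Hypothetical container used only to illustrate __len__."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len() expects a non-negative integer from this method.
        return len(self._tracks)


playlist = Playlist(["intro", "verse", "outro"])
track_count = len(playlist)    # goes through Playlist.__len__
# With no __bool__ defined, truth testing also falls back to __len__.
is_nonempty = bool(playlist)
```

Callers never need to know how Playlist stores its tracks; len(playlist) is the only thing they touch.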


Answer 2

Python uses duck typing: it doesn’t care about what an object is, as long as it has the appropriate interface for the situation at hand. When you call the built-in function len() on an object, you are actually calling its internal __len__ method. A custom object can implement this interface and len() will return the answer, even if the object is not conceptually a sequence.

For a complete list of interfaces, have a look here: http://docs.python.org/reference/datamodel.html#basic-customization


Answer 3

The preferred way to get the length of any Python object is to pass it as an argument to the len function. Internally, Python will then try to call the special __len__ method of the object that was passed.
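The “try” matters: when the object’s type has no __len__, len() raises TypeError. A small sketch, with safe_len being a helper name invented for this example:

```python
def safe_len(obj):
    """Return len(obj), or None when obj's type defines no __len__."""
    try:
        return len(obj)
    except TypeError:
        return None


sized = safe_len({"a": 1, "b": 2})            # dict defines __len__
unsized_int = safe_len(42)                    # int does not
unsized_gen = safe_len(x for x in range(3))   # neither do generators
```

Whether returning None is the right fallback depends on the caller; in many cases letting the TypeError propagate is the clearer choice.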


Answer 4

Just use len(arr):

>>> import array
>>> arr = array.array('i')
>>> arr.append(2)
>>> arr.__len__()
1
>>> len(arr)
1

Answer 5

You can use len(arr), as suggested in previous answers, to get the length of the array. If you want the dimensions of a 2D array (assuming a NumPy array here), arr.shape returns its height and width.


Answer 6

The len(list_name) call takes a list as its argument and calls the list’s __len__() method.


Answer 7

Python suggests users use len() instead of __len__() for consistency, as others have said. However, there are some other benefits:

For some built-in types like list, str, bytearray and so on, the CPython implementation of len() takes a shortcut. It directly returns the ob_size in a C structure, which is faster than calling __len__().

If you are interested in such details, you could read the book called “Fluent Python” by Luciano Ramalho. There are many interesting details in it, and it may help you understand Python more deeply.