Tag archive: python-2.7

Specify format for input arguments argparse python

Question: Specify format for input arguments argparse python

I have a python script that requires some command line inputs and I am using argparse for parsing them. I found the documentation a bit confusing and couldn’t find a way to check for a format in the input parameters. What I mean by checking format is explained with this example script:

parser.add_argument('-s', "--startdate", help="The Start Date - format YYYY-MM-DD ", required=True)
parser.add_argument('-e', "--enddate", help="The End Date format YYYY-MM-DD (Inclusive)", required=True)
parser.add_argument('-a', "--accountid", type=int, help='Account ID for the account for which data is required (Default: 570)')
parser.add_argument('-o', "--outputpath", help='Directory where output needs to be stored (Default: ' + os.path.dirname(os.path.abspath(__file__)))

I need to check, for options -s and -e, that the input from the user is in the format YYYY-MM-DD. Is there an option in argparse that I do not know of which accomplishes this?


Answer 0

Per the documentation:

The type keyword argument of add_argument() allows any necessary type-checking and type conversions to be performed … type= can take any callable that takes a single string argument and returns the converted value

You could do something like:

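import argparse
from datetime import datetime

def valid_date(s):
    # argparse reports an ArgumentTypeError as a normal usage error
    try:
        return datetime.strptime(s, "%Y-%m-%d")
    except ValueError:
        msg = "Not a valid date: '{0}'.".format(s)
        raise argparse.ArgumentTypeError(msg)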

Then use that as type:

parser.add_argument("-s", 
                    "--startdate", 
                    help="The Start Date - format YYYY-MM-DD", 
                    required=True, 
                    type=valid_date)
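
To see the type callable at work end to end, here is a minimal sketch. It assumes the two date options from the question and passes an explicit argument list to parse_args() rather than reading sys.argv; the sample dates are made up for illustration.

```python
import argparse
from datetime import datetime


def valid_date(s):
    """Convert a YYYY-MM-DD string to a datetime, or raise an argparse error."""
    try:
        return datetime.strptime(s, "%Y-%m-%d")
    except ValueError:
        raise argparse.ArgumentTypeError("Not a valid date: '{0}'.".format(s))


parser = argparse.ArgumentParser()
parser.add_argument("-s", "--startdate", required=True, type=valid_date,
                    help="The Start Date - format YYYY-MM-DD")
parser.add_argument("-e", "--enddate", required=True, type=valid_date,
                    help="The End Date format YYYY-MM-DD (Inclusive)")

# Well-formed input: the attributes are datetime objects, not strings.
args = parser.parse_args(["-s", "2014-01-31", "-e", "2014-02-28"])
print(args.startdate.date())  # prints 2014-01-31

# Malformed input: argparse writes a usage message to stderr and exits.
try:
    parser.parse_args(["-s", "31/01/2014", "-e", "2014-02-28"])
except SystemExit:
    print("argparse rejected the malformed date")
```

Because the conversion happens inside argparse, the rest of the script never sees an unparsed date string.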

Answer 1

Just to add on to the answer above, you can use a lambda function if you want to keep it to a one-liner. For example:

parser.add_argument('--date', type=lambda d: datetime.strptime(d, '%Y%m%d'))

Old thread but the question was still relevant for me at least!


Answer 2

For others who hit this via search engines: in Python 3.7 and later, you can use the standard .fromisoformat class method instead of reinventing the wheel for ISO-8601 compliant dates, e.g.:

parser.add_argument('-s', "--startdate",
    help="The Start Date - format YYYY-MM-DD",
    required=True,
    type=datetime.date.fromisoformat)
parser.add_argument('-e', "--enddate",
    help="The End Date format YYYY-MM-DD (Inclusive)",
    required=True,
    type=datetime.date.fromisoformat)
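
A small self-contained sketch of the same idea, assuming Python 3.7 or newer. Note that it needs `import datetime` (the module); with `from datetime import datetime`, as in the earlier answer, `datetime.date` would be a method and would have no fromisoformat attribute. The sample date is made up.

```python
import argparse
import datetime

parser = argparse.ArgumentParser()
parser.add_argument("-s", "--startdate", required=True,
                    type=datetime.date.fromisoformat,
                    help="The Start Date - format YYYY-MM-DD")

args = parser.parse_args(["--startdate", "2019-03-04"])
print(args.startdate)  # prints 2019-03-04 (a datetime.date object)

# fromisoformat raises ValueError on bad input; argparse turns that
# into a usage error on stderr followed by SystemExit.
try:
    parser.parse_args(["--startdate", "03/04/2019"])
except SystemExit:
    print("rejected")
```

One thing to keep in mind: from Python 3.11 on, fromisoformat accepts more ISO 8601 spellings than plain YYYY-MM-DD, so a strptime-based checker may still be the better fit if the format must be exact.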

Use curly braces to initialize a Set in Python

Question: Use curly braces to initialize a Set in Python

I’m learning python, and I have a novice question about initializing sets. Through testing, I’ve discovered that a set can be initialized like so:

my_set = {'foo', 'bar', 'baz'}

Are there any disadvantages of doing it this way, as opposed to the standard way of:

my_set = set(['foo', 'bar', 'baz'])

or is it just a question of style?


Answer 0

There are two obvious issues with the set literal syntax:

my_set = {'foo', 'bar', 'baz'}
  1. It’s not available before Python 2.7

  2. There’s no way to express an empty set using that syntax (using {} creates an empty dict)

Those may or may not be important to you.

The section of the docs outlining this syntax is here.


Answer 1

Compare also the difference between {} and set() with a single word argument.

>>> a = set('aardvark')
>>> a
{'d', 'v', 'a', 'r', 'k'} 
>>> b = {'aardvark'}
>>> b
{'aardvark'}

but both a and b are sets of course.
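
The difference comes from set() iterating over its argument while the literal takes each item as written. A quick sketch that makes this checkable (sorted() is used only to get a stable print order):

```python
a = set('aardvark')   # iterates over the string: one element per distinct character
b = {'aardvark'}      # literal: a single element, the whole string

print(sorted(a))      # prints ['a', 'd', 'k', 'r', 'v']
print(len(b))         # prints 1
print(type(a) is type(b))  # prints True

# To get the literal's behaviour from set(), wrap the string in a list.
c = set(['aardvark'])
print(b == c)         # prints True
```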


Answer 2

From Python 3 documentation (the same holds for python 2.7):

Curly braces or the set() function can be used to create sets. Note: to create an empty set you have to use set(), not {}; the latter creates an empty dictionary, a data structure that we discuss in the next section.

in python 2.7:

>>> my_set = {'foo', 'bar', 'baz', 'baz', 'foo'}
>>> my_set
set(['bar', 'foo', 'baz'])

Be aware that {} is also used for map/dict:

>>> m = {'a':2,3:'d'}
>>> m[3]
'd'
>>> m={}
>>> type(m)
<type 'dict'> 

One can also use comprehension syntax to initialize sets:

>>> a = {x for x in """didn't know about {} and sets """ if x not in 'set' }
>>> a
set(['a', ' ', 'b', 'd', "'", 'i', 'k', 'o', 'n', 'u', 'w', '{', '}'])

Answer 3

You need to do empty_set = set() to initialize an empty set. {} is an empty dictionary.
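
A short check of that claim; the variable names other than empty_set are just for illustration:

```python
empty_set = set()
not_a_set = {}

print(type(empty_set).__name__)   # prints set
print(type(not_a_set).__name__)   # prints dict

# With at least one element the brace literal is unambiguous again.
one_item = {'foo'}
print(type(one_item).__name__)    # prints set
```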


Is there any way to properly pretty-print ordered dictionaries?

Question: Is there any way to properly pretty-print ordered dictionaries?

I like the pprint module in Python. I use it a lot for testing and debugging. I frequently use the width option to make sure the output fits nicely within my terminal window.

It has worked fine until they added the new ordered dictionary type in Python 2.7 (another cool feature I really like). If I try to pretty-print an ordered dictionary, it doesn’t show nicely. Instead of having each key-value pair on its own line, the whole thing shows up on one long line, which wraps many times and is hard to read.

Does anyone here have a way to make it print nicely, like the old unordered dictionaries? I could probably figure something out, possibly using the PrettyPrinter.format method, if I spend enough time, but I am wondering if anyone here already knows of a solution.

UPDATE: I filed a bug report for this. You can see it at http://bugs.python.org/issue10592.


Answer 0

As a temporary workaround you can try dumping in JSON format. You lose some type information, but it looks nice and keeps the order.

import json
from pprint import pprint

pprint(data, indent=4)
# ^ugly

print(json.dumps(data, indent=4))
# ^nice
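
Here is a runnable sketch of the JSON workaround with made-up sample data. It also shows the "lost type information" caveat: tuples come out as JSON arrays and non-string keys become strings.

```python
import json
from collections import OrderedDict

data = OrderedDict([("john", 1), ("paul", 2), ("mary", 3)])

# json.dumps walks the mapping in iteration order, so insertion order is kept.
text = json.dumps(data, indent=4)
print(text)

# The caveat: JSON has no tuples and only string keys.
lossy = json.dumps({1: (2, 3)})
print(lossy)  # prints {"1": [2, 3]}
```

If the order has to survive a round trip, json.loads(text, object_pairs_hook=OrderedDict) rebuilds an OrderedDict from the dumped text.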

Answer 1

The following will work if the order of your OrderedDict is alphabetical, since pprint sorts a dict by key before printing.

pprint(dict(o.items()))

Answer 2

Here’s another answer that works by overriding and using the stock pprint() function internally. Unlike my earlier one, it will handle OrderedDicts inside another container such as a list, and it should also be able to handle any optional keyword arguments given. However, it does not have the same degree of control over the output that the other one afforded.

It operates by redirecting the stock function’s output into a temporary buffer and then word-wrapping that before sending it on to the output stream. While the final output produced isn’t exceptionally pretty, it’s decent and may be “good enough” to use as a workaround.

Update 2.0

Simplified by using standard library textwrap module, and modified to work in both Python 2 & 3.

from collections import OrderedDict
try:
    from cStringIO import StringIO
except ImportError:  # Python 3
    from io import StringIO
from pprint import pprint as pp_pprint
import sys
import textwrap

def pprint(object, **kwrds):
    try:
        width = kwrds['width']
    except KeyError: # unlimited, use stock function
        pp_pprint(object, **kwrds)
        return
    buffer = StringIO()
    stream = kwrds.get('stream', sys.stdout)
    kwrds.update({'stream': buffer})
    pp_pprint(object, **kwrds)
    words = buffer.getvalue().split()
    buffer.close()

    # word wrap output onto multiple lines <= width characters
    try:
        print >> stream, textwrap.fill(' '.join(words), width=width)
    except TypeError:  # Python 3
        print(textwrap.fill(' '.join(words), width=width), file=stream)

d = dict((('john',1), ('paul',2), ('mary',3)))
od = OrderedDict((('john',1), ('paul',2), ('mary',3)))
lod = [OrderedDict((('john',1), ('paul',2), ('mary',3))),
       OrderedDict((('moe',1), ('curly',2), ('larry',3))),
       OrderedDict((('weapons',1), ('mass',2), ('destruction',3)))]

Sample output:

pprint(d, width=40)

»   {'john': 1, 'mary': 3, 'paul': 2}

pprint(od, width=40)

» OrderedDict([('john', 1), ('paul', 2),
   ('mary', 3)])

pprint(lod, width=40)

» [OrderedDict([('john', 1), ('paul', 2),
   ('mary', 3)]), OrderedDict([('moe', 1),
   ('curly', 2), ('larry', 3)]),
   OrderedDict([('weapons', 1), ('mass',
   2), ('destruction', 3)])]


Answer 3

To print an ordered dict, e.g.

from collections import OrderedDict

d=OrderedDict([
    ('a', OrderedDict([
        ('a1',1),
        ('a2','sss')
    ])),
    ('b', OrderedDict([
        ('b1', OrderedDict([
            ('bb1',1),
            ('bb2',4.5)])),
        ('b2',4.5)
    ])),
])

I do

def dict_or_OrdDict_to_formatted_str(OD, mode='dict', s="", indent=' '*4, level=0):
    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False
    def fstr(s):
        return s if is_number(s) else '"%s"'%s
    if mode != 'dict':
        kv_tpl = '("%s", %s)'
        ST = 'OrderedDict([\n'; END = '])'
    else:
        kv_tpl = '"%s": %s'
        ST = '{\n'; END = '}'
    for i,k in enumerate(OD.keys()):
        if type(OD[k]) in [dict, OrderedDict]:
            level += 1
            s += (level-1)*indent+kv_tpl%(k,ST+dict_or_OrdDict_to_formatted_str(OD[k], mode=mode, indent=indent, level=level)+(level-1)*indent+END)
            level -= 1
        else:
            s += level*indent+kv_tpl%(k,fstr(OD[k]))
        if i!=len(OD)-1:
            s += ","
        s += "\n"
    return s

print dict_or_OrdDict_to_formatted_str(d)

Which yields

"a": {
    "a1": 1,
    "a2": "sss"
},
"b": {
    "b1": {
        "bb1": 1,
        "bb2": 4.5
    },
    "b2": 4.5
}

or

print dict_or_OrdDict_to_formatted_str(d, mode='OD')

which yields

("a", OrderedDict([
    ("a1", 1),
    ("a2", "sss")
])),
("b", OrderedDict([
    ("b1", OrderedDict([
        ("bb1", 1),
        ("bb2", 4.5)
    ])),
    ("b2", 4.5)
]))

Answer 4

Here’s a way that hacks the implementation of pprint. pprint sorts the keys before printing, so to preserve order, we just have to make the keys sort in the way we want.

Note that this impacts the items() function. So you might want to preserve and restore the overridden functions after doing the pprint.

from collections import OrderedDict
import pprint

class ItemKey(object):
  def __init__(self, name, position):
    self.name = name
    self.position = position
  def __cmp__(self, b):
    assert isinstance(b, ItemKey)
    return cmp(self.position, b.position)
  def __repr__(self):
    return repr(self.name)

OrderedDict.items = lambda self: [
    (ItemKey(name, i), value)
    for i, (name, value) in enumerate(self.iteritems())]
OrderedDict.__repr__ = dict.__repr__

a = OrderedDict()
a[4] = '4'
a[1] = '1'
a[2] = '2'
print pprint.pformat(a) # {4: '4', 1: '1', 2: '2'}

Answer 5

Here is my approach to pretty print an OrderedDict

from collections import OrderedDict
import json
d = OrderedDict()
d['duck'] = 'alive'
d['parrot'] = 'dead'
d['penguin'] = 'exploded'
d['Falcon'] = 'discharged'
print(d)
print(json.dumps(d,indent=4))

Output:

OrderedDict([('duck', 'alive'), ('parrot', 'dead'), ('penguin', 'exploded'), ('Falcon', 'discharged')])

{
    "duck": "alive",
    "parrot": "dead",
    "penguin": "exploded",
    "Falcon": "discharged"
}

If you want to pretty-print the dictionary with its keys in sorted order:

print(json.dumps(d, indent=4, sort_keys=True))
{
    "Falcon": "discharged",
    "duck": "alive",
    "parrot": "dead",
    "penguin": "exploded"
}

Answer 6

This is pretty crude, but I just needed a way to visualize a data structure made up of any arbitrary Mappings and Iterables and this is what I came up with before giving up. It’s recursive, so it will fall through nested structures and lists just fine. I used the Mapping and Iterable abstract base classes from collections to handle just about anything.

I was aiming for almost yaml like output with concise python code, but didn’t quite make it.

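from collections import Iterable, Mapping, OrderedDict  # Python 2.7; on Python 3 the ABCs live in collections.abc

def format_structure(d, level=0):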
    x = ""
    if isinstance(d, Mapping):
        lenk = max(map(lambda x: len(str(x)), d.keys()))
        for k, v in d.items():
            key_text = "\n" + " "*level + " "*(lenk - len(str(k))) + str(k)
            x += key_text + ": " + format_structure(v, level=level+lenk)
    elif isinstance(d, Iterable) and not isinstance(d, basestring):
        for e in d:
            x += "\n" + " "*level + "- " + format_structure(e, level=level+4)
    else:
        x = str(d)
    return x

and some test data using OrderedDict and lists of OrderedDicts… (sheesh Python needs OrderedDict literals sooo badly…)

d = OrderedDict([("main",
                  OrderedDict([("window",
                                OrderedDict([("size", [500, 500]),
                                             ("position", [100, 900])])),
                               ("splash_enabled", True),
                               ("theme", "Dark")])),
                 ("updates",
                  OrderedDict([("automatic", True),
                               ("servers",
                                [OrderedDict([("url", "http://server1.com"),
                                              ("name", "Stable")]),
                                 OrderedDict([("url", "http://server2.com"),
                                              ("name", "Beta")]),
                                 OrderedDict([("url", "http://server3.com"),
                                              ("name", "Dev")])]),
                               ("prompt_restart", True)])),
                 ("logging",
                  OrderedDict([("enabled", True),
                               ("rotate", True)]))])

print format_structure(d)

yields the following output:

   main: 
               window: 
                         size: 
                             - 500
                             - 500
                     position: 
                             - 100
                             - 900
       splash_enabled: True
                theme: Dark
updates: 
            automatic: True
              servers: 
                     - 
                          url: http://server1.com
                         name: Stable
                     - 
                          url: http://server2.com
                         name: Beta
                     - 
                          url: http://server3.com
                         name: Dev
       prompt_restart: True
logging: 
       enabled: True
        rotate: True

I had some thoughts along the way of using str.format() for better alignment, but didn’t feel like digging into it. You’d need to dynamically specify the field widths depending on the type of alignment you want, which would get either tricky or cumbersome.

Anyway, this shows me my data in readable hierarchical fashion, so that works for me!
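
For anyone on Python 3, a sketch of how the function above might be adapted. The changes are my assumptions, not part of the original answer: the abstract base classes are imported from collections.abc, basestring is replaced by (str, bytes), and max() gets default=0 so an empty mapping does not raise.

```python
from collections import OrderedDict
from collections.abc import Iterable, Mapping


def format_structure(d, level=0):
    """Render nested mappings and iterables as indented, YAML-like text."""
    x = ""
    if isinstance(d, Mapping):
        # Right-align the keys at this level to the longest key.
        lenk = max((len(str(k)) for k in d.keys()), default=0)
        for k, v in d.items():
            key_text = "\n" + " " * level + " " * (lenk - len(str(k))) + str(k)
            x += key_text + ": " + format_structure(v, level=level + lenk)
    elif isinstance(d, Iterable) and not isinstance(d, (str, bytes)):
        # Strings are iterable too, so they are excluded explicitly.
        for e in d:
            x += "\n" + " " * level + "- " + format_structure(e, level=level + 4)
    else:
        x = str(d)
    return x


sample = OrderedDict([("ab", 1), ("c", [2])])
print(format_structure(sample))
```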


Answer 7

def pprint_od(od):
    print "{"
    for key in od:
        print "%s:%s,\n" % (key, od[key]) # Fixed syntax
    print "}"

There you go ^^

for item in li:
    pprint_od(item)

or

[pprint_od(item) for item in li]  # a bare generator expression would never call pprint_od

Answer 8

I’ve tested this unholy monkey-patch based hack on python3.5 and it works:

import pprint

pprint.PrettyPrinter._dispatch[pprint._collections.OrderedDict.__repr__] = pprint.PrettyPrinter._pprint_dict


def unsorted_pprint(data):
    def fake_sort(*args, **kwargs):
        return args[0]
    orig_sorted = __builtins__.sorted
    try:
        __builtins__.sorted = fake_sort
        pprint.pprint(data)
    finally:
        __builtins__.sorted = orig_sorted

You make pprint use the usual dict based summary and also disable sorting for the duration of the call so that no keys are actually sorted for printing.


Answer 9

As of Python 3.8, pprint.PrettyPrinter exposes the sort_dicts keyword parameter.

It is True by default; setting it to False leaves the dictionary in insertion order.

>>> from pprint import PrettyPrinter

>>> x = {'John': 1,
...      'Mary': 2,
...      'Paul': 3,
...      'Lisa': 4,
...      }

>>> PrettyPrinter(sort_dicts=False).pprint(x)

This will output:

{'John': 1, 
 'Mary': 2, 
 'Paul': 3,
 'Lisa': 4}

Reference: https://docs.python.org/3/library/pprint.html
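
A sketch that compares both settings side by side, assuming Python 3.8 or newer. It uses pformat (which takes the same sort_dicts keyword) so the result can be inspected as a string, and a narrow width chosen only to force one key per line.

```python
from pprint import pformat

x = {'John': 1, 'Mary': 2, 'Paul': 3, 'Lisa': 4}

# sort_dicts=False keeps insertion order; width=20 forces one pair per line.
unsorted_text = pformat(x, sort_dicts=False, width=20)
sorted_text = pformat(x, width=20)  # default: keys sorted

print(unsorted_text)
print(sorted_text)
```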


Answer 10

The pprint() function is just invoking the __repr__() method of the things in it, and OrderedDict doesn’t appear to do much in its method (or doesn’t have one or something).

Here’s a cheap solution that should work IF YOU DON’T CARE ABOUT THE ORDER BEING VISIBLE IN THE PPRINT OUTPUT, which may be a big if:

class PrintableOrderedDict(OrderedDict):
    def __repr__(self):
        return dict.__repr__(self)

I’m actually surprised that the order isn’t preserved… ah well.


Answer 11

You can also use this simplification of the kzh answer:

pprint(data.items(), indent=4)

It preserves the order and outputs almost the same as the webwurst answer (printing through a JSON dump).


Answer 12

For python < 3.8 (e.g. 3.6):

Monkey patch pprint's sorted in order to prevent it from sorting. This has the benefit that everything works recursively as well, and it is more suitable than the json option for anyone who needs to use, e.g., the width parameter:

import pprint
pprint.sorted = lambda arg, *a, **kw: arg

>>> pprint.pprint({'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}, width=20)
{'z': 1,
 'a': 2,
 'c': {'z': 0,
       'a': 1}}

Edit: cleaning up

To clean up after this dirty business just run: pprint.sorted = sorted

For a really clean solution you can even use a context manager:

import pprint
import contextlib

@contextlib.contextmanager
def pprint_ordered():
    pprint.sorted = lambda arg, *args, **kwargs: arg
    try:
        yield
    finally:
        # restore normal sorting even if the with-body raises
        pprint.sorted = sorted

# usage:

with pprint_ordered():
    pprint.pprint({'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}, width=20)

# without it    
pprint.pprint({'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}, width=20)

# prints: 
#    
# {'z': 1,
#  'a': 2,
#  'c': {'z': 0,
#        'a': 1}}
#
# {'a': 2,
#  'c': {'a': 1,
#        'z': 0},
#  'z': 1}

Answer 13

You could redefine pprint() and intercept calls for OrderedDicts. Here’s a simple illustration. As written, the OrderedDict override code ignores any optional stream, indent, width, or depth keywords that may have been passed, but it could be enhanced to implement them. Unfortunately this technique doesn’t handle them inside another container, such as a list of OrderedDicts.

from collections import OrderedDict
from pprint import pprint as pp_pprint

def pprint(obj, *args, **kwrds):
    if not isinstance(obj, OrderedDict):
        # use stock function
        return pp_pprint(obj, *args, **kwrds)
    else:
        # very simple sample custom implementation...
        print "{"
        for key in obj:
            print "    %r:%r" % (key, obj[key])
        print "}"

l = [10, 2, 4]
d = dict((('john',1), ('paul',2), ('mary',3)))
od = OrderedDict((('john',1), ('paul',2), ('mary',3)))
pprint(l, width=4)
# [10,
#  2,
#  4]
pprint(d)
# {'john': 1, 'mary': 3, 'paul': 2}

pprint(od)
# {
#     'john':1
#     'paul':2
#     'mary':3
# }

Answer 14

If the dictionary items are all of one type, you could use the amazing data-handling library pandas:

>>> import pandas as pd
>>> x = {'foo':1, 'bar':2}
>>> pd.Series(x)
bar    2
foo    1
dtype: int64

or

>>> import pandas as pd
>>> x = {'foo':'bar', 'baz':'bam'}
>>> pd.Series(x)
baz    bam
foo    bar
dtype: object

Python 2.7: Print to file

Question: Python 2.7: Print to file

Why does trying to print directly to a file instead of sys.stdout produce the following syntax error:

Python 2.7.2+ (default, Oct  4 2011, 20:06:09)
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f1=open('./testfile', 'w+')
>>> print('This is a test', file=f1)
  File "<stdin>", line 1
    print('This is a test', file=f1)
                            ^
SyntaxError: invalid syntax

From help(__builtins__) I have the following info:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout)

    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file: a file-like object (stream); defaults to the current sys.stdout.
    sep:  string inserted between values, default a space.
    end:  string appended after the last value, default a newline.

So what would be the right syntax to change the standard stream print writes to?

I know that there are different maybe better ways to write to file but I really don’t get why this should be a syntax error…

A nice explanation would be appreciated!


Answer 0

If you want to use the print function in Python 2, you have to import from __future__:

from __future__ import print_function

But you can have the same effect without using the function, too:

print >>f1, 'This is a test'
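
A small sketch of the function form in use, assuming a throwaway file in a temporary directory (the path is only for illustration). On Python 2.7, the `from __future__ import print_function` line shown above has to be the first statement of the module; on Python 3 the code runs as written.

```python
# On Python 2.7, add as the very first statement of the module:
#     from __future__ import print_function
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "testfile")

with open(path, "w") as f1:
    # file= selects the stream; sep= and end= work as described in help(print)
    print("This is a test", file=f1)
    print("a", "b", sep="-", end="!\n", file=f1)

with open(path) as f1:
    contents = f1.read()
```

Using a with block closes the file, and so flushes the output, when the block ends.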

Answer 1

print is a keyword in Python 2.x, so it cannot be called with a file= argument. You can use the following instead:

f1=open('./testfile', 'w+')
f1.write('This is a test')
f1.close()

Answer 2

print(args, file=f1) is the Python 3.x syntax. For Python 2.x, use print >> f1, args.


Answer 3

You can send the output of print statements to a file without changing any code. Simply open a terminal window and run your code this way:

python yourcode.py >> log.txt

Answer 4

This will redirect your ‘print’ output to a file:

import sys
sys.stdout = open("file.txt", "w+")
print "this line will redirect to file.txt"
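
Reassigning sys.stdout affects every later print in the process, so it may be worth restoring it afterwards. The following is a minimal sketch of that idea, assuming a temporary file path chosen only for the example; it is not part of the original answer.

```python
import os
import sys
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.txt")

original_stdout = sys.stdout      # keep a reference so it can be restored
log_file = open(path, "w")
sys.stdout = log_file
try:
    print("this line goes to the file")
finally:
    sys.stdout = original_stdout  # restore even if printing raises
    log_file.close()

print("this line goes to the console again")

with open(path) as f:
    redirected = f.read()
```

The single-argument print(...) calls here behave the same under the Python 2 print statement and the Python 3 print function.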

Answer 5

In Python 3.0+, print is a function, which you’d call with print(...). In earlier versions, print is a statement, which you’d write as print ....

To print to a file in Python earlier than 3.0, you’d do:

print >> f, 'what ever %d' % i

The >> redirects the output of print to the file f.


Securely storing environment variables in GAE with app.yaml

Question: Securely storing environment variables in GAE with app.yaml

I need to store API keys and other sensitive information in app.yaml as environment variables for deployment on GAE. The issue with this is that if I push app.yaml to GitHub, this information becomes public (not good). I don’t want to store the info in a datastore as it does not suit the project. Rather, I’d like to swap out the values from a file that is listed in .gitignore on each deployment of the app.

Here is my app.yaml file:

application: myapp
version: 3 
runtime: python27
api_version: 1
threadsafe: true

libraries:
- name: webapp2
  version: latest
- name: jinja2
  version: latest

handlers:
- url: /static
  static_dir: static

- url: /.*
  script: main.application  
  login: required
  secure: always
# auth_fail_action: unauthorized

env_variables:
  CLIENT_ID: ${CLIENT_ID}
  CLIENT_SECRET: ${CLIENT_SECRET}
  ORG: ${ORG}
  ACCESS_TOKEN: ${ACCESS_TOKEN}
  SESSION_SECRET: ${SESSION_SECRET}

Any ideas?


Answer 0

If it’s sensitive data, you should not store it in source code as it will be checked into source control. The wrong people (inside or outside your organization) may find it there. Also, your development environment probably uses different config values from your production environment. If these values are stored in code, you will have to run different code in development and production, which is messy and bad practice.

In my projects, I put config data in the datastore using this class:

from google.appengine.ext import ndb

class Settings(ndb.Model):
  name = ndb.StringProperty()
  value = ndb.StringProperty()

  @staticmethod
  def get(name):
    NOT_SET_VALUE = "NOT SET"
    retval = Settings.query(Settings.name == name).get()
    if not retval:
      retval = Settings()
      retval.name = name
      retval.value = NOT_SET_VALUE
      retval.put()
    if retval.value == NOT_SET_VALUE:
      raise Exception(('Setting %s not found in the database. A placeholder ' +
        'record has been created. Go to the Developers Console for your app ' +
        'in App Engine, look up the Settings record with name=%s and enter ' +
        'its value in that record\'s value field.') % (name, name))
    return retval.value

Your application would do this to get a value:

API_KEY = Settings.get('API_KEY')

If there is a value for that key in the datastore, you will get it. If there isn’t, a placeholder record will be created and an exception will be thrown. The exception will remind you to go to the Developers Console and update the placeholder record.

I find this takes the guessing out of setting config values. If you are unsure of what config values to set, just run the code and it will tell you!

The code above uses the ndb library which uses memcache and the datastore under the hood, so it’s fast.


Update:

jelder asked how to find the Datastore values in the App Engine console and set them. Here is how:

  1. Go to https://console.cloud.google.com/datastore/

  2. Select your project at the top of the page if it’s not already selected.

  3. In the Kind dropdown box, select Settings.

  4. If you ran the code above, your keys will show up. They will all have the value NOT SET. Click each one and set its value.

Hope this helps!


Answer 1

This solution is simple but may not suit every team.

First, put the environment variables in an env_variables.yaml, e.g.,

env_variables:
  SECRET: 'my_secret'

Then, include this env_variables.yaml in the app.yaml

includes:
  - env_variables.yaml

Finally, add the env_variables.yaml to .gitignore, so that the secret variables won’t exist in the repository.

In this case, the env_variables.yaml needs to be shared among the deployment managers.
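
On the application side, the variables declared under env_variables can be read from the process environment. A minimal sketch, assuming a hypothetical helper named require_env that fails loudly when a variable is missing (for example because env_variables.yaml was not present at deploy time):

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            "Environment variable %s is not set; check env_variables.yaml" % name
        )
    return value
```

With the example file above, the application would call SECRET = require_env('SECRET') once at start-up.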


Answer 2

My approach is to store client secrets only within the App Engine app itself. The client secrets are neither in source control nor on any local computers. This has the benefit that any App Engine collaborator can deploy code changes without having to worry about the client secrets.

I store client secrets directly in Datastore and use Memcache for improved latency when accessing the secrets. The Datastore entities only need to be created once and will persist across future deploys. Of course, the App Engine console can be used to update these entities at any time.

There are two options to perform the one-time entity creation:

  • Use the App Engine Remote API interactive shell to create the entities.
  • Create an admin-only handler that will initialize the entities with dummy values. Manually invoke this admin handler, then use the App Engine console to update the entities with the production client secrets.
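
The Datastore-plus-Memcache lookup described above follows a cache-aside pattern, which can be sketched without any App Engine dependency. Everything here is a stand-in: SecretStore and fetch_from_store are hypothetical names, the dict plays the role of Memcache, and the callable plays the role of a Datastore read.

```python
class SecretStore(object):
    """Cache-aside lookup: try the fast cache first, then the slow store."""

    def __init__(self, fetch_from_store):
        # fetch_from_store: hypothetical callable standing in for a
        # Datastore read; takes a name, returns the value or None.
        self._fetch = fetch_from_store
        self._cache = {}  # stand-in for Memcache

    def get(self, name):
        if name in self._cache:
            return self._cache[name]
        value = self._fetch(name)
        if value is None:
            raise KeyError("secret %r has not been created yet" % name)
        self._cache[name] = value
        return value

    def invalidate(self, name):
        """Drop a cached value, e.g. after editing the entity in the console."""
        self._cache.pop(name, None)
```

A real implementation would also need some way to expire cached values after a secret is updated in the console; invalidate() only hints at that.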

Answer 3

The best way to do it is to store the keys in a client_secrets.json file and keep that file out of git by listing it in your .gitignore file. If you have different keys for different environments, you can use the app_identity API to determine what the app ID is, and load the appropriate file.

There is a fairly comprehensive example here -> https://developers.google.com/api-client-library/python/guide/aaa_client_secrets.

Here’s some example code:

import json

from google.appengine.api import app_identity
from oauth2client.client import flow_from_clientsecrets

# declare your app ids as globals ...
APPID_LIVE = 'awesomeapp'
APPID_DEV = 'awesomeapp-dev'
APPID_PILOT = 'awesomeapp-pilot'

# create a dictionary mapping the app_ids to the filepaths ...
client_secrets_map = {APPID_LIVE:'client_secrets_live.json',
                      APPID_DEV:'client_secrets_dev.json',
                      APPID_PILOT:'client_secrets_pilot.json'}

# get the filename based on the current app_id ...
client_secrets_filename = client_secrets_map.get(
    app_identity.get_application_id(),
    client_secrets_map[APPID_DEV] # fall back to the dev file
    )

# use the filename to construct the flow ...
# (scope and redirect_uri are defined elsewhere in your app)
flow = flow_from_clientsecrets(filename=client_secrets_filename,
                               scope=scope,
                               redirect_uri=redirect_uri)

# or, you could load up the json file manually if you need more control ...
with open(client_secrets_filename, 'r') as f:
    client_secrets = json.load(f)

Answer 4

This didn’t exist when you posted, but for anyone else who stumbles in here, Google now offers a service called Secret Manager.

It’s a simple REST service (with SDKs wrapping it, of course) for storing your secrets in a secure location on Google Cloud Platform. This is a better approach than Datastore: it requires extra steps to see the stored secrets and has a finer-grained permission model, so you can secure individual secrets differently for different aspects of your project, if you need to.

It offers versioning, so you can handle password changes with relative ease, as well as a robust query and management layer enabling you to discover and create secrets at runtime, if necessary.

Python SDK

Example usage:

from google.cloud import secretmanager_v1beta1 as secretmanager

secret_id = 'my_secret_key'
project_id = 'my_project'
version = 1    # use the management tools to determine version at runtime

client = secretmanager.SecretManagerServiceClient()

secret_path = client.secret_version_path(project_id, secret_id, version)
response = client.access_secret_version(secret_path)
password_string = response.payload.data.decode('UTF-8')

# use password_string -- set up database connection, call third party service, whatever

Answer 5

This solution relies on the deprecated appcfg.py.

You can use the -E command line option of appcfg.py to set up the environment variables when you deploy your app to GAE (appcfg.py update):

$ appcfg.py
...
-E NAME:VALUE, --env_variable=NAME:VALUE
                    Set an environment variable, potentially overriding an
                    env_variable value from app.yaml file (flag may be
                    repeated to set multiple variables).
...

Answer 6

Most answers are outdated. Using Google Cloud Datastore is actually a bit different right now: https://cloud.google.com/python/getting-started/using-cloud-datastore

Here’s an example:

from google.cloud import datastore
client = datastore.Client()
datastore_entity = client.get(client.key('settings', 'TWITTER_APP_KEY'))
connection_string_prod = datastore_entity.get('value')

This assumes the entity name is ‘TWITTER_APP_KEY’, the kind is ‘settings’, and ‘value’ is a property of the TWITTER_APP_KEY entity.


Answer 7

It sounds like there are a few approaches you can take. We have a similar issue and do the following (adapted to your use case):

  • Create a file that stores any dynamic app.yaml values and place it on a secure server in your build environment. If you are really paranoid, you can asymmetrically encrypt the values. You can even keep this in a private repo if you need version control/dynamic pulling, or just use a shell script to copy it/pull it from the appropriate place.
  • Pull from git during the deployment script.
  • After the git pull, modify the app.yaml by reading and writing it in pure Python using a YAML library.

The easiest way to do this is to use a continuous integration server such as Hudson, Bamboo, or Jenkins. Simply add some plug-in, script step, or workflow that does all the above items I mentioned. You can pass in environment variables that are configured in Bamboo itself for example.

In summary, just push in the values during your build process in an environment only you have access to. If you aren’t already automating your builds, you should be.

Another option is what you said: put it in the database. If your reason for not doing that is that things are too slow, simply push the values into memcache as a second-layer cache, and pin the values to the instances as a first-layer cache. If the values can change and you need to update the instances without rebooting them, just keep a hash you can check to know when they change, or trigger the update somehow when something you do changes the values. That should be it.
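
The step of modifying app.yaml during deployment could be sketched roughly as follows. The answer suggests a YAML library; since the app.yaml in the question already uses ${NAME} placeholders, this sketch swaps in the standard library's string.Template instead, which avoids a third-party dependency. The function name render_app_yaml and the sample values are assumptions made for the example.

```python
from string import Template


def render_app_yaml(template_text, secrets):
    """Fill ${NAME} placeholders in template_text from the secrets mapping.

    Raises KeyError if a placeholder has no matching secret, so a
    deployment with a missing value fails instead of shipping '${NAME}'.
    """
    return Template(template_text).substitute(secrets)


template_text = (
    "env_variables:\n"
    "  CLIENT_ID: ${CLIENT_ID}\n"
    "  CLIENT_SECRET: ${CLIENT_SECRET}\n"
)
rendered = render_app_yaml(
    template_text,
    {"CLIENT_ID": "abc", "CLIENT_SECRET": "s3cret"},
)
```

In a deployment script, template_text would be read from the committed app.yaml and the secrets from the file kept outside the repository. One caveat of this approach: any literal $ in the file (for example in a handler regex) has to be written as $$ for Template to accept it.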


Answer 8

You should encrypt the variables with Google KMS and embed the encrypted values in your source code. (https://cloud.google.com/kms/)

echo -n the-twitter-app-key | gcloud kms encrypt \
> --project my-project \
> --location us-central1 \
> --keyring THEKEYRING \
> --key THECRYPTOKEY \
> --plaintext-file - \
> --ciphertext-file - \
> | base64

Put the scrambled (encrypted and base64-encoded) value into your environment variable (in the yaml file).

Some pythonish code to get you started on decrypting.

import base64
import os
from google.cloud import kms_v1

kms_client = kms_v1.KeyManagementServiceClient()
name = kms_client.crypto_key_path_path("project", "global", "THEKEYRING", "THECRYPTOKEY")

twitter_app_key = kms_client.decrypt(name, base64.b64decode(os.environ.get("TWITTER_APP_KEY"))).plaintext

Answer 9

@Jason F’s answer based on using Google Datastore is close, but the code is a bit outdated based on the sample usage on the library docs. Here’s the snippet that worked for me:

from google.cloud import datastore

client = datastore.Client('<your project id>')
key = client.key('<kind e.g settings>', '<entity name>') # note: entity name not property
# get by key for this entity
result = client.get(key)
print(result) # prints all the properties (a dict); index a specific value like result['MY_SECRET_KEY']

Partly inspired by this Medium post


Answer 10

Just wanted to note how I solved this problem in JavaScript/Node.js. For local development I used the ‘dotenv’ npm package, which loads environment variables from a .env file into process.env. When I started using GAE I learned that environment variables need to be set in an ‘app.yaml’ file. I didn’t want to use ‘dotenv’ for local development and ‘app.yaml’ for GAE (and duplicate my environment variables between the two files), so I wrote a little script that loads the app.yaml environment variables into process.env for local development. Hope this helps someone:

yaml_env.js:

(function () {
    const yaml = require('js-yaml');
    const fs = require('fs');
    const isObject = require('lodash.isobject')

    var doc = yaml.safeLoad(
        fs.readFileSync('app.yaml', 'utf8'), 
        { json: true }
    );

    // The .env file will take precedence over the settings in the app.yaml file
    // which allows me to override stuff in app.yaml (the database connection string (DATABASE_URL), for example)
    // This is optional of course. If you don't use dotenv then remove this line:
    require('dotenv/config');

    if(isObject(doc) && isObject(doc.env_variables)) {
        Object.keys(doc.env_variables).forEach(function (key) {
            // Don't set the variable from the yaml file if it's already set in the environment
            process.env[key] = process.env[key] || doc.env_variables[key]
        })
    }
})()

Now include this file as early as possible in your code, and you’re done:

require('../yaml_env')

Answer 11

Extending Martin’s answer

from google.appengine.ext import ndb

class Settings(ndb.Model):
    """
    Get sensitive data setting from DataStore.

    key:String -> value:String
    key:String -> Exception

    Thanks to: Martin Omander @ Stackoverflow
    https://stackoverflow.com/a/35261091/1463812
    """
    name = ndb.StringProperty()
    value = ndb.StringProperty()

    @staticmethod
    def get(name):
        retval = Settings.query(Settings.name == name).get()
        if not retval:
            raise Exception(('Setting %s not found in the database. ' +
                             'Call Settings.set() or go to the Developers Console for your app ' +
                             'in App Engine, create a Settings record with name=%s and enter ' +
                             'its value in that record\'s value field.') % (name, name))
        return retval.value

    @staticmethod
    def set(name, value):
        exists = Settings.query(Settings.name == name).get()
        if not exists:
            s = Settings(name=name, value=value)
            s.put()
        else:
            exists.value = value
            exists.put()

        return True

Answer 12

There is a PyPI package called gae_env that allows you to save App Engine environment variables in Cloud Datastore. Under the hood, it also uses Memcache, so it’s fast.

Usage:

import gae_env

API_KEY = gae_env.get('API_KEY')

If there is a value for that key in the datastore, it will be returned. If there isn’t, a placeholder record __NOT_SET__ will be created and a ValueNotSetError will be thrown. The exception will remind you to go to the Developers Console and update the placeholder record.


Similar to Martin’s answer, here is how to update the value for the key in Datastore:

  1. Go to the Datastore section in the Developers Console.

  2. Select your project at the top of the page if it’s not already selected.

  3. In the Kind dropdown box, select GaeEnvSettings.

  4. Keys for which an exception was raised will have value __NOT_SET__.


Go to the package’s GitHub page for more info on usage/configuration


How to use PyCharm to debug Scrapy projects

Question: How to use PyCharm to debug Scrapy projects

I am working on Scrapy 0.20 with Python 2.7. I found that PyCharm has a good Python debugger, and I want to test my Scrapy spiders using it. Does anyone know how to do that?

What I have tried

Actually I tried to run the spider as a script. As a result, I built that script. Then, I tried to add my Scrapy project to PyCharm as a model like this:
File->Setting->Project structure->Add content root.

But I don’t know what else I have to do


Answer 0

The scrapy command is a python script which means you can start it from inside PyCharm.

When you examine the scrapy binary (which scrapy) you will notice that this is actually a python script:

#!/usr/bin/python

from scrapy.cmdline import execute
execute()

This means that a command like scrapy crawl IcecatCrawler can also be executed like this: python /Library/Python/2.7/site-packages/scrapy/cmdline.py crawl IcecatCrawler

Try to find the scrapy.cmdline module. In my case the location was here: /Library/Python/2.7/site-packages/scrapy/cmdline.py

Create a run/debug configuration inside PyCharm with that file as the script. Fill the script parameters with the scrapy command and spider name, in this case crawl IcecatCrawler.

Put your breakpoints anywhere in your crawling code and it should work™.


Answer 1

You just need to do this.

Create a Python file in the crawler folder of your project. I used main.py.

  • Project
    • Crawler
      • Crawler
        • Spiders
      • main.py
      • scrapy.cfg

Inside your main.py, put the code below.

from scrapy import cmdline    
cmdline.execute("scrapy crawl spider".split())

And you need to create a “Run Configuration” to run your main.py.

With this in place, if you put a breakpoint in your code, execution will stop there.


Answer 2

As of PyCharm 2018.1 this became a lot easier. You can now select Module name in your project’s Run/Debug Configuration. Set this to scrapy.cmdline and set the Working directory to the root dir of the scrapy project (the one with settings.py in it).

Now you can add breakpoints to debug your code.


Answer 3

I am running scrapy in a virtualenv with Python 3.5.0 and setting the “script” parameter to /path_to_project_env/env/bin/scrapy solved the issue for me.


Answer 4

This also works in IntelliJ IDEA.

Create main.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from scrapy import cmdline


def main(name):
    # Run the given scrapy command line, e.g. "scrapy crawl stack"
    if name:
        cmdline.execute(name.split())


if __name__ == '__main__':
    print('[*] beginning main thread')
    name = "scrapy crawl stack"
    #name = "scrapy crawl spa"
    main(name)
    print('[*] main thread exited')
    print('main stop====================================================')


Answer 5

To add a bit to the accepted answer, after almost an hour I found I had to select the correct Run Configuration from the dropdown list (near the center of the icon toolbar), then click the Debug button in order to get it to work. Hope this helps!


Answer 6

I am also using PyCharm, but I am not using its built-in debugging features.

For debugging I am using ipdb. I set up a keyboard shortcut to insert import ipdb; ipdb.set_trace() on any line I want the break point to happen.

Then I can type n to execute the next statement, s to step into a function, type any object name to see its value, alter execution environment, type c to continue execution…

This is very flexible, works in environments other than PyCharm, where you don’t control the execution environment.

Just type in your virtual environment pip install ipdb and place import ipdb; ipdb.set_trace() on a line where you want the execution to pause.

UPDATE

You can also pip install pdbpp and use the standard import pdb; pdb.set_trace() instead of ipdb. PDB++ is nicer in my opinion.


Answer 7

According to the documentation https://doc.scrapy.org/en/latest/topics/practices.html

import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition
    ...

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})

process.crawl(MySpider)
process.start() # the script will block here until the crawling is finished

Answer 8

I use this simple script:

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

process.crawl('your_spider_name')
process.start()

Answer 9

Extending @Rodrigo’s answer, I added this script, and now I can set the spider name from the run configuration instead of changing it in the string.

import sys
from scrapy import cmdline

cmdline.execute(f"scrapy crawl {sys.argv[1]}".split())
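A slightly more defensive variant of the script above, as a rough sketch. It assumes the spider name arrives as the first script parameter of the run configuration and falls back to a default otherwise. The helper name build_crawl_argv and the default spider name "stack" are made up for illustration and are not part of Scrapy. Scrapy is imported inside main() so the helper can be tried without Scrapy installed.

```python
def build_crawl_argv(argv, default_spider="stack"):
    """Build the argument list for `scrapy crawl <spider>`.

    `argv` is expected to look like sys.argv; `default_spider` is a
    hypothetical fallback used when no spider name was passed.
    """
    spider = argv[1] if len(argv) > 1 else default_spider
    # Building a list directly avoids splitting a formatted string.
    return ["scrapy", "crawl", spider]


def main(argv):
    # Imported here so build_crawl_argv works even without Scrapy installed.
    from scrapy import cmdline
    cmdline.execute(build_crawl_argv(argv))
```

In main.py you would then add import sys and finish the file with main(sys.argv), with the spider name entered in the "Parameters" field of the run configuration.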

Python os.path.join() on a list

Question: Python os.path.join() on a list

I can do

>>> os.path.join("c:/","home","foo","bar","some.txt")
'c:/home\\foo\\bar\\some.txt'

But, when I do

>>> s = "c:/,home,foo,bar,some.txt".split(",")
>>> os.path.join(s)
['c:/', 'home', 'foo', 'bar', 'some.txt']

What am I missing here?


Answer 0

The problem is, os.path.join doesn’t take a list as argument, it has to be separate arguments.

This is where *, the ‘splat’ operator comes into play…

I can do

>>> s = "c:/,home,foo,bar,some.txt".split(",")
>>> os.path.join(*s)
'c:/home\\foo\\bar\\some.txt'
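As a small self-contained sketch of the same idea (the variable names are only illustrative): unpacking the list gives the same result as writing the components out by hand. Under Python 3, passing the list itself raises a TypeError rather than returning the list.

```python
import os

parts = "c:/,home,foo,bar,some.txt".split(",")

# Unpacking passes each list element as its own positional argument.
joined = os.path.join(*parts)

# Same result as spelling the components out by hand.
same = joined == os.path.join("c:/", "home", "foo", "bar", "some.txt")
print(same)  # True

# Under Python 3, passing the list itself is rejected with TypeError.
try:
    os.path.join(parts)
except TypeError as exc:
    print("TypeError:", exc)
```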

Answer 1

Assuming join wasn’t designed that way (which it is, as ATOzTOA pointed out), and it only took two parameters, you could still use the built-in reduce:

>>> reduce(os.path.join,["c:/","home","foo","bar","some.txt"])
'c:/home\\foo\\bar\\some.txt'

Same output as:

>>> os.path.join(*["c:/","home","foo","bar","some.txt"])
'c:/home\\foo\\bar\\some.txt' 

Just for completeness and educational reasons (and for other situations where * doesn’t work).

Hint for Python 3

reduce was moved to the functools module.
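For Python 3, a minimal sketch of the reduce variant with the import included might look like this (the variable names are only illustrative):

```python
import os
from functools import reduce  # also importable from functools on Python 2.6+

parts = ["c:/", "home", "foo", "bar", "some.txt"]

# reduce applies os.path.join pairwise, left to right:
# join(join(join(join("c:/", "home"), "foo"), "bar"), "some.txt")
via_reduce = reduce(os.path.join, parts)
via_unpacking = os.path.join(*parts)

print(via_reduce == via_unpacking)  # True
```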


Answer 2

I stumbled over the situation where the list might be empty. In that case:

os.path.join('', *the_list_with_path_components)

Note the first argument, which will not alter the result.
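To illustrate why the leading '' helps, here is a small sketch (join_components is a made-up helper name). os.path.join() needs at least one argument, so unpacking an empty list on its own would fail.

```python
import os


def join_components(components):
    """Join path components; an empty list yields an empty string."""
    # The leading '' keeps the call valid when `components` is empty.
    return os.path.join("", *components)


print(repr(join_components([])))  # ''
print(join_components(["home", "foo"]) == os.path.join("home", "foo"))  # True

# Without the leading '', an empty list leaves join() with no arguments.
try:
    os.path.join(*[])
except TypeError as exc:
    print("TypeError:", exc)
```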


Answer 3

It’s just the method. You’re not missing anything. The official documentation shows that you can use list unpacking to supply several paths:

s = "c:/,home,foo,bar,some.txt".split(",")
os.path.join(*s)

Note the *s instead of just s in os.path.join(*s). Using the asterisk triggers unpacking of the list, which means that each list element is supplied to the function as a separate argument.


Answer 4

This can also be thought of as a simple map-reduce operation, if you would like to approach it from a functional programming perspective.

import os
from functools import reduce  # required in Python 3; also works in Python 2.6+
folders = [("home",".vim"),("home","zathura")]
[reduce(lambda x,y: os.path.join(x,y), each, "") for each in folders]

reduce is a builtin in Python 2.x. In Python 3.x it has been moved to functools. However, the accepted answer is better.

This has already been answered, but I am adding this for the case where you have a list of items that each need to be joined.


Return a boolean if a set is empty

Question: Return a boolean if a set is empty

I am struggling to find a cleaner way of returning a boolean value if my set is empty at the end of my function.

I take the intersection of two sets, and want to return True or False based on if the resulting set is empty.

def myfunc(a,b):
    c = a.intersection(b)
    #...return boolean here

My initial thought was to do

return c is not None

However, in my interpreter I can easily see that statement will return true if c = set([])

>>> c = set([])
>>> c is not None
True

I’ve also tried all of the following:

>>> c == None
False
>>> c == False
False
>>> c is None
False

Now I’ve read from the documentation that I can only use and, or, and not with empty sets to deduce a boolean value. So far, the only thing I can come up with is returning not not c

>>> not not c
False
>>> not c
True

I have a feeling there is a much more Pythonic way to do this, but I am struggling to find it. I don’t want to return the actual set to an if statement because I don’t need the values, I just want to know if they intersect.


Answer 0

def myfunc(a,b):
    c = a.intersection(b)
    return bool(c)

bool() will do something similar to not not, but is more idiomatic and clear.
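A quick runnable sketch of that behaviour. The function name sets_intersect is just an illustrative stand-in for myfunc.

```python
def sets_intersect(a, b):
    """Return True if the two sets share at least one element."""
    # An empty set is falsy, so bool() maps "empty intersection" to False.
    return bool(a.intersection(b))


print(sets_intersect({1, 2, 3}, {3, 4}))  # True
print(sets_intersect({1, 2}, {3, 4}))     # False
```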


Answer 1

Not as Pythonic as the other answers, but mathematical:

return len(c) == 0

Some comments wondered about the impact len(set) could have on complexity. It is O(1), as shown in the source code, since it relies on a field that tracks the number of used entries in the set.

static Py_ssize_t
set_len(PyObject *so)
{
    return ((PySetObject *)so)->used;
}

Answer 2

If you want to return True for an empty set, then I think it would be clearer to do:

return c == set()

i.e. “c is equal to an empty set”.

(Or, for the other way around, return c != set()).

In my opinion, this is more explicit (though less idiomatic) than relying on Python’s interpretation of an empty set as False in a boolean context.


Answer 3

If c is a set then you can check whether it’s empty by doing: return not c.

If c is empty then not c will be True.

Otherwise, if c contains any elements not c will be False.


Answer 4

When you say:

c is not None

You are actually checking whether c and None refer to different objects, that is, whether c is anything other than the None object itself. Identity comparison is what the “is” operator does. In Python, None is a special null value that conventionally means you don’t have a value available, sort of like null in C or Java. Since Python only ever creates one None object, using the “is” operator to check whether something is None (think null) works, and it has become the popular style. However, this has nothing to do with the truth value of the set c; it only checks that c is actually a set rather than a null value.

If you want to check if a set is empty in a conditional statement, it is cast as a boolean in context so you can just say:

c = set()
if c:
    print("it has stuff in it")
else:
    print("it is empty")

But if you want it converted to a boolean to be stored away you can simply say:

c = set()
c_has_stuff_in_it = bool(c)

Answer 5

def set_is_empty(some_set):
    """
    Check whether a set is empty.

    >>> c = set([])
    >>> set_is_empty(c)
    True

    :param some_set: the set to check.
    :return: True if empty, False otherwise.
    """
    return some_set == set()

Answer 6

Not as clean as bool(c) but it was an excuse to use ternary.

def myfunc(a,b):
    return True if a.intersection(b) else False

Also using a bit of the same logic there is no need to assign to c unless you are using it for something else.

def myfunc(a,b):
    return bool(a.intersection(b))

Finally, I would assume you want a True / False value because you are going to perform some sort of boolean test with it. I would recommend skipping the overhead of a function call and definition by simply testing where you need it.

Instead of:

if (myfunc(a,b)):
    # Do something

Maybe this:

if a.intersection(b):
    # Do something
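As an aside not mentioned in the answers here, the standard set method isdisjoint() answers the same question without building the intersection first. A minimal sketch, with sets_intersect as a made-up name:

```python
def sets_intersect(a, b):
    """Return True if the two sets share at least one element."""
    # isdisjoint() is True when the sets have no elements in common,
    # so its negation answers "do they intersect?".
    return not a.isdisjoint(b)


if sets_intersect({"x", "y"}, {"y", "z"}):
    print("the sets overlap")
```

Whether this is clearer than bool(a.intersection(b)) is a matter of taste; it mainly helps when the intersection itself is never needed.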

How to convert int to Enum in Python?

Question: How to convert int to Enum in Python?

Using the new Enum feature (via backport enum34) with python 2.7.6.

Given the following definition, how can I convert an int to the corresponding Enum value?

from enum import Enum

class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6

I know I can hand craft a series of if-statements to do the conversion but is there an easy pythonic way to convert? Basically, I’d like a function ConvertIntToFruit(int) that returns an enum value.

My use case is I have a csv file of records where I’m reading each record into an object. One of the file fields is an integer field that represents an enumeration. As I’m populating the object I’d like to convert that integer field from the file into the corresponding Enum value in the object.


Answer 0

You ‘call’ the Enum class:

Fruit(5)

to turn 5 into Fruit.Orange:

>>> from enum import Enum
>>> class Fruit(Enum):
...     Apple = 4
...     Orange = 5
...     Pear = 6
... 
>>> Fruit(5)
<Fruit.Orange: 5>

From the Programmatic access to enumeration members and their attributes section of the documentation:

Sometimes it’s useful to access members in enumerations programmatically (i.e. situations where Color.red won’t do because the exact color is not known at program-writing time). Enum allows such access:

>>> Color(1)
<Color.red: 1>
>>> Color(3)
<Color.blue: 3>

On a related note: to map a string value containing the name of an enum member, use subscription:

>>> s = 'Apple'
>>> Fruit[s]
<Fruit.Apple: 4>
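Putting both lookups side by side, here is a small sketch that reuses the Fruit class from the question. It also shows what happens for inputs that do not match any member, which matters when the integers come from a file.

```python
from enum import Enum


class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6


# Lookup by value: call the class.
by_value = Fruit(5)

# Lookup by name: subscript the class.
by_name = Fruit["Apple"]

print(by_value.name, by_name.value)  # Orange 4

# An unknown value raises ValueError; an unknown name raises KeyError.
try:
    Fruit(7)
except ValueError as exc:
    print("ValueError:", exc)

try:
    Fruit["Banana"]
except KeyError as exc:
    print("KeyError:", exc)
```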

Answer 1

Put simply: convert the int value into an Enum member by calling EnumType(int_value), then access the name of the resulting Enum member:

my_fruit_from_int = Fruit(5)  # convert the int to an Enum member
fruit_name = my_fruit_from_int.name  # get the name
print(fruit_name)  # Orange will be printed here

Or as a function:

def convert_int_to_fruit(int_value):
    try:
        my_fruit_from_int = Fruit(int_value)
        return my_fruit_from_int.name
    except ValueError:
        # int_value does not match any Fruit member
        return None
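For the CSV use case from the question, where the field is read as text, a variant that returns the member itself might look like the sketch below. The function name int_to_fruit and the default parameter are assumptions made for illustration.

```python
from enum import Enum


class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6


def int_to_fruit(raw, default=None):
    """Convert a CSV field such as '5' to a Fruit member, or `default`."""
    try:
        return Fruit(int(raw))
    except (TypeError, ValueError):
        # int() could not parse the field, or no member has that value.
        return default


print(int_to_fruit("5"))    # Fruit.Orange
print(int_to_fruit("99"))   # None
print(int_to_fruit("abc"))  # None
```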

Answer 2

I wanted something similar so that I could access either part of the value pair from a single reference. The vanilla version:

#!/usr/bin/env python3


from enum import IntEnum


class EnumDemo(IntEnum):
    ENUM_ZERO       = 0
    ENUM_ONE        = 1
    ENUM_TWO        = 2
    ENUM_THREE      = 3
    ENUM_INVALID    = 4


#endclass.


print('Passes')
print('1) %d'%(EnumDemo['ENUM_TWO']))
print('2) %s'%(EnumDemo['ENUM_TWO']))
print('3) %s'%(EnumDemo.ENUM_TWO.name))
print('4) %d'%(EnumDemo.ENUM_TWO))
print()


print('Fails')
print('1) %d'%(EnumDemo.ENUM_TWOa))

The failure throws an exception as would be expected.

A more robust version:

#!/usr/bin/env python3


class EnumDemo():


    enumeration =   (
                        'ENUM_ZERO',    # 0.
                        'ENUM_ONE',     # 1.
                        'ENUM_TWO',     # 2.
                        'ENUM_THREE',   # 3.
                        'ENUM_INVALID'  # 4.
                    )


    def name(self, val):
        try:

            name = self.enumeration[val]
        except IndexError:

            # Always return last tuple.
            name = self.enumeration[len(self.enumeration) - 1]

        return name


    def number(self, val):
        try:

            index = self.enumeration.index(val)
        except (TypeError, ValueError):

            # Always return last tuple.
            index = (len(self.enumeration) - 1)

        return index


#endclass.


print('Passes')
print('1) %d'%(EnumDemo().number('ENUM_TWO')))
print('2) %s'%(EnumDemo().number('ENUM_TWO')))
print('3) %s'%(EnumDemo().name(1)))
print('4) %s'%(EnumDemo().enumeration[1]))
print()
print('Fails')
print('1) %d'%(EnumDemo().number('ENUM_THREEa')))
print('2) %s'%(EnumDemo().number('ENUM_THREEa')))
print('3) %s'%(EnumDemo().name(11)))
print('4) %s'%(EnumDemo().enumeration[-1]))

When not used correctly this avoids creating an exception and, instead, passes back a fault indication. A more Pythonic way to do this would be to pass back “None” but my particular application uses the text directly.


Python read-only property

Question: Python read-only property

I don’t know when an attribute should be private and whether I should use a property.

I read recently that setters and getters are not pythonic and that I should use the property decorator. That’s fine.

But what if I have an attribute that must not be set from outside the class but can be read (a read-only attribute)? Should this attribute be private, and by private I mean with an underscore, like self._x? If yes, then how can I read it without using a getter? The only method I know right now is to write

@property
def x(self):
    return self._x

That way I can read the attribute with obj.x but I can’t set it with obj.x = 1, so it’s fine.

But should I really care about setting an object that must not be set? Maybe I should just leave it. But then again I can’t use an underscore, because reading obj._x is odd for the user, so I should use obj.x, and then again the user doesn’t know that he must not set this attribute.

What’s your opinion and what are your practices?


Answer 0

Generally, Python programs should be written with the assumption that all users are consenting adults, and thus are responsible for using things correctly themselves. However, in the rare instance where it just does not make sense for an attribute to be settable (such as a derived value, or a value read from some static datasource), the getter-only property is generally the preferred pattern.
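As a rough illustration of the derived-value case, here is a getter-only property on a made-up Rectangle class. The class and attribute names are only examples.

```python
class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Derived from width and height, so there is nothing to set.
        return self.width * self.height


r = Rectangle(2, 3)
print(r.area)  # 6

# Assigning to a property without a setter raises AttributeError.
try:
    r.area = 10
except AttributeError as exc:
    print("AttributeError:", exc)
```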


Answer 1

Just my two cents, Silas Ray is on the right track, however I felt like adding an example. ;-)

Python is a type-unsafe language and thus you’ll always have to trust the users of your code to use the code like a reasonable (sensible) person.

Per PEP 8:

Use one leading underscore only for non-public methods and instance variables.

To have a ‘read-only’ property in a class you can use the @property decorator. In Python 2 the class must inherit from object (a new-style class) for this to work.

Example:

>>> class A(object):
...     def __init__(self, a):
...         self._a = a
...
...     @property
...     def a(self):
...         return self._a
... 
>>> a = A('test')
>>> a.a
'test'
>>> a.a = 'pleh'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

Answer 2

Here is a way to avoid the assumption that

all users are consenting adults, and thus are responsible for using things correctly themselves.

please see my update below

Using @property is very verbose, e.g.:

class AClassWithManyAttributes:
    '''refactored to properties'''
    def __init__(self, a, b, c, d, e):  # ... and so on
        self._a = a
        self._b = b
        self._c = c
        self.d = d
        self.e = e

    @property
    def a(self):
        return self._a

    @property
    def b(self):
        return self._b

    @property
    def c(self):
        return self._c
    # you get this ... it's long

Using

No underscore: it’s a public variable.
One underscore: it’s a protected variable.
Two underscores: it’s a private variable.

Except the last one, it’s a convention. You can still, if you really try hard, access variables with double underscore.

So what do we do? Do we give up on having read only properties in Python?

Behold! read_only_properties decorator to the rescue!

@read_only_properties('readonly', 'forbidden')
class MyClass(object):
    def __init__(self, a, b, c):
        self.readonly = a
        self.forbidden = b
        self.ok = c

m = MyClass(1, 2, 3)
m.ok = 4
# we can re-assign a value to m.ok
# read only access to m.readonly is OK 
print(m.ok, m.readonly) 
print("This worked...")
# this will explode, and raise AttributeError
m.forbidden = 4

You ask:

Where is read_only_properties coming from?

Glad you asked, here is the source for read_only_properties:

def read_only_properties(*attrs):

    def class_rebuilder(cls):
        "The class decorator"

        class NewClass(cls):
            "This is the overwritten class"
            def __setattr__(self, name, value):
                if name not in attrs:
                    pass
                elif name not in self.__dict__:
                    pass
                else:
                    raise AttributeError("Can't modify {}".format(name))

                super().__setattr__(name, value)
        return NewClass
    return class_rebuilder

Update

I never expected this answer to get so much attention. Surprisingly, it did. This encouraged me to create a package you can use.

$ pip install read-only-properties

In your Python shell:

In [1]: from rop import read_only_properties

In [2]: @read_only_properties('a')
   ...: class Foo:
   ...:     def __init__(self, a, b):
   ...:         self.a = a
   ...:         self.b = b
   ...:         

In [3]: f=Foo('explodes', 'ok-to-overwrite')

In [4]: f.b = 5

In [5]: f.a = 'boom'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-a5226072b3b4> in <module>()
----> 1 f.a = 'boom'

/home/oznt/.virtualenvs/tracker/lib/python3.5/site-packages/rop.py in __setattr__(self, name, value)
    116                     pass
    117                 else:
--> 118                     raise AttributeError("Can't touch {}".format(name))
    119 
    120                 super().__setattr__(name, value)

AttributeError: Can't touch a

Answer 3

Here is a slightly different approach to read-only properties, which perhaps should be called write-once properties since they do have to get initialized, don’t they? For the paranoid among us who worry about being able to modify properties by accessing the object’s dictionary directly, I’ve introduced “extreme” name mangling:

from uuid import uuid4

class Read_Only_Property:
    def __init__(self, name):
        self.name = name
        # Random key under which the value is stored in the instance dict
        self.dict_name = uuid4().hex

    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return instance.__dict__[self.dict_name]

    def __set__(self, instance, value):
        # Track initialization per instance; the descriptor itself is
        # shared by every instance of the owning class.
        if self.dict_name in instance.__dict__:
            raise AttributeError("Attempt to modify read-only property '%s'." % self.name)
        instance.__dict__[self.dict_name] = value

class Point:
    x = Read_Only_Property('x')
    y = Read_Only_Property('y')
    def __init__(self, x, y):
        self.x = x
        self.y = y

if __name__ == '__main__':
    try:
        p = Point(2, 3)
        print(p.x, p.y)
        p.x = 9
    except Exception as e:
        print(e)

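Answer 4

I am dissatisfied with the previous two answers on creating read-only properties, because the first solution allows the read-only attribute to be deleted and then set again, and it does not block __dict__. The second solution can be worked around by testing: finding the value that equals what you set it to and eventually changing it.

Now, for the code.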

def final(cls):
    clss = cls
    @classmethod
    def __init_subclass__(cls, **kwargs):
        raise TypeError("type '{}' is not an acceptable base type".format(clss.__name__))
    cls.__init_subclass__ = __init_subclass__
    return cls


def methoddefiner(cls, method_name):
    for clss in cls.mro():
        try:
            getattr(clss, method_name)
            return clss
        except(AttributeError):
            pass
    return None


def readonlyattributes(*attrs):
    """Method to create readonly attributes in a class

    Use as a decorator for a class. This function takes in unlimited 
    string arguments for names of readonly attributes and returns a
    function to make the readonly attributes readonly. 

    The original class's __getattribute__, __setattr__, and __delattr__ methods
    are redefined so avoid defining those methods in the decorated class

    You may create setters and deleters for readonly attributes, however
    if they are overwritten by the subclass, they lose access to the readonly
    attributes. 

    Any method which sets or deletes a readonly attribute within
    the class loses access if overwritten by the subclass besides the __new__
    or __init__ constructors.

    This decorator doesn't support subclassing of these classes
    """
    def classrebuilder(cls):
        def __getattribute__(self, name):
            if name == '__dict__':
                    from types import MappingProxyType
                    return MappingProxyType(super(cls, self).__getattribute__('__dict__'))
            return super(cls, self).__getattribute__(name)
        def __setattr__(self, name, value): 
                if name == '__dict__' or name in attrs:
                    import inspect
                    stack = inspect.stack()
                    try:
                        the_class = stack[1][0].f_locals['self'].__class__
                    except(KeyError):
                        the_class = None
                    the_method = stack[1][0].f_code.co_name
                    if the_class != cls: 
                         if methoddefiner(type(self), the_method) != cls:
                            raise AttributeError("Cannot set readonly attribute '{}'".format(name))                        
                return super(cls, self).__setattr__(name, value)
        def __delattr__(self, name):                
                if name == '__dict__' or name in attrs:
                    import inspect
                    stack = inspect.stack()
                    try:
                        the_class = stack[1][0].f_locals['self'].__class__
                    except(KeyError):
                        the_class = None
                    the_method = stack[1][0].f_code.co_name
                    if the_class != cls:
                        if methoddefiner(type(self), the_method) != cls:
                            raise AttributeError("Cannot delete readonly attribute '{}'".format(name))                        
                return super(cls, self).__delattr__(name)
        clss = cls
        cls.__getattribute__ = __getattribute__
        cls.__setattr__ = __setattr__
        cls.__delattr__ = __delattr__
        #This line will be moved when this algorithm will be compatible with inheritance
        cls = final(cls)
        return cls
    return classrebuilder

def setreadonlyattributes(cls, *readonlyattrs):
    return readonlyattributes(*readonlyattrs)(cls)


if __name__ == '__main__':
    #test readonlyattributes only as an indpendent module
    @readonlyattributes('readonlyfield')
    class ReadonlyFieldClass(object):
        def __init__(self, a, b):
            #Prevent initalization of the internal, unmodified PrivateFieldClass
            #External PrivateFieldClass can be initalized
            self.readonlyfield = a
            self.publicfield = b


    attr = None
    def main():
        global attr
        pfi = ReadonlyFieldClass('forbidden', 'changable')
        ###---test publicfield, ensure its mutable---###
        try:
            #get publicfield
            print(pfi.publicfield)
            print('__getattribute__ works')
            #set publicfield
            pfi.publicfield = 'mutable'
            print('__setattr__ seems to work')
            #get previously set publicfield
            print(pfi.publicfield)
            print('__setattr__ definitely works')
            #delete publicfield
            del pfi.publicfield 
            print('__delattr__ seems to work')
            #get publicfield which was supposed to be deleted therefore should raise AttributeError
            print(pfi.publlicfield)
            #publicfield wasn't deleted, raise RuntimeError
            raise RuntimeError('__delattr__ doesn\'t work')
        except(AttributeError):
            print('__delattr__ works')


        try:
            ###---test readonly, make sure its readonly---###
            #get readonlyfield
            print(pfi.readonlyfield)
            print('__getattribute__ works')
            #set readonlyfield, should raise AttributeError
            pfi.readonlyfield = 'readonly'
            #apparently readonlyfield was set, notify user
            raise RuntimeError('__setattr__ doesn\'t work')
        except(AttributeError):
            print('__setattr__ seems to work')
            try:
                #ensure readonlyfield wasn't set
                print(pfi.readonlyfield)
                print('__setattr__ works')
                #delete readonlyfield
                del pfi.readonlyfield
                #readonlyfield was deleted, raise RuntimeError
                raise RuntimeError('__delattr__ doesn\'t work')
            except(AttributeError):
                print('__delattr__ works')
        try:
            print("Dict testing")
            print(pfi.__dict__, type(pfi.__dict__))
            attr = pfi.readonlyfield
            print(attr)
            print("__getattribute__ works")
            if pfi.readonlyfield != 'forbidden':
                print(pfi.readonlyfield)
                raise RuntimeError("__getattr__ doesn't work")
            try:
                pfi.__dict__ = {}
                raise RuntimeError("__setattr__ doesn't work")
            except(AttributeError):
                print("__setattr__ works")
            del pfi.__dict__
            raise RuntimeError("__delattr__ doesn't work")
        except(AttributeError):
            print(pfi.__dict__)
            print("__delattr__ works")
            print("Basic things work")


main()

除非您编写库代码时使用只读属性,否则将这些属性设置为只读代码是为了增强他们的程序而将代码分发给其他人使用,而不是用于其他目的(例如应用程序开发)的代码。解决了__dict__问题,因为__dict__现在是不可变的类型。MappingProxyType,因此无法通过__dict__更改属性。设置或删除__dict__也被阻止。更改只读属性的唯一方法是更改​​类本身的方法。

尽管我认为我的解决方案比前两个解决方案要好,但可以改进。这些是此代码的弱点:

a)不允许在子类中添加设置或删除只读属性的方法。即使调用了超类的方法,也会自动禁止子类中定义的方法访问只读属性。

b)可以更改类的只读方法以克服只读限制。

但是,没有办法不编辑类来设置或删除只读属性。这不依赖于命名约定,这很好,因为Python与命名约定不太一致。这提供了一种方法,使只读属性无法通过隐藏的漏洞进行更改,而无需编辑类本身。只需在将装饰器作为参数调用时列出要只读的属性即可,它们将变为只读。

归功于Brice的回答:如何在python中另一个类的函数中获取调用方类名称?获取调用方的类和方法。

I am dissatisfied with the previous two answers for creating read-only properties. The first solution allows the read-only attribute to be deleted and then set again, and it doesn't block access through __dict__. The second can be worked around by trial: find the value in __dict__ that equals what you set it to, and change it.

Now, for the code.

def final(cls):
    clss = cls
    @classmethod
    def __init_subclass__(cls, **kwargs):
        raise TypeError("type '{}' is not an acceptable base type".format(clss.__name__))
    cls.__init_subclass__ = __init_subclass__
    return cls


def methoddefiner(cls, method_name):
    for clss in cls.mro():
        try:
            getattr(clss, method_name)
            return clss
        except(AttributeError):
            pass
    return None


def readonlyattributes(*attrs):
    """Method to create readonly attributes in a class

    Use as a decorator for a class. This function takes in unlimited 
    string arguments for names of readonly attributes and returns a
    function to make the readonly attributes readonly. 

    The original class's __getattribute__, __setattr__, and __delattr__ methods
    are redefined so avoid defining those methods in the decorated class

    You may create setters and deleters for readonly attributes, however
    if they are overwritten by the subclass, they lose access to the readonly
    attributes. 

    Any method which sets or deletes a readonly attribute within
    the class loses access if overwritten by the subclass besides the __new__
    or __init__ constructors.

    This decorator doesn't support subclassing of these classes
    """
    def classrebuilder(cls):
        def __getattribute__(self, name):
            if name == '__dict__':
                from types import MappingProxyType
                return MappingProxyType(super(cls, self).__getattribute__('__dict__'))
            return super(cls, self).__getattribute__(name)
        def __setattr__(self, name, value):
            if name == '__dict__' or name in attrs:
                import inspect
                stack = inspect.stack()
                try:
                    the_class = stack[1][0].f_locals['self'].__class__
                except(KeyError):
                    the_class = None
                the_method = stack[1][0].f_code.co_name
                if the_class != cls:
                    if methoddefiner(type(self), the_method) != cls:
                        raise AttributeError("Cannot set readonly attribute '{}'".format(name))
            return super(cls, self).__setattr__(name, value)
        def __delattr__(self, name):
            if name == '__dict__' or name in attrs:
                import inspect
                stack = inspect.stack()
                try:
                    the_class = stack[1][0].f_locals['self'].__class__
                except(KeyError):
                    the_class = None
                the_method = stack[1][0].f_code.co_name
                if the_class != cls:
                    if methoddefiner(type(self), the_method) != cls:
                        raise AttributeError("Cannot delete readonly attribute '{}'".format(name))
            return super(cls, self).__delattr__(name)
        clss = cls
        cls.__getattribute__ = __getattribute__
        cls.__setattr__ = __setattr__
        cls.__delattr__ = __delattr__
        #This line will be moved when this algorithm will be compatible with inheritance
        cls = final(cls)
        return cls
    return classrebuilder

def setreadonlyattributes(cls, *readonlyattrs):
    return readonlyattributes(*readonlyattrs)(cls)


if __name__ == '__main__':
    # Test readonlyattributes only when run as an independent module
    @readonlyattributes('readonlyfield')
    class ReadonlyFieldClass(object):
        def __init__(self, a, b):
            # Assignments made inside the class's own methods are allowed,
            # so the read-only field can be initialized here
            self.readonlyfield = a
            self.publicfield = b


    attr = None
    def main():
        global attr
        pfi = ReadonlyFieldClass('forbidden', 'changable')
        ###---test publicfield, ensure its mutable---###
        try:
            #get publicfield
            print(pfi.publicfield)
            print('__getattribute__ works')
            #set publicfield
            pfi.publicfield = 'mutable'
            print('__setattr__ seems to work')
            #get previously set publicfield
            print(pfi.publicfield)
            print('__setattr__ definitely works')
            #delete publicfield
            del pfi.publicfield 
            print('__delattr__ seems to work')
            #get publicfield, which was deleted and should therefore raise AttributeError
            print(pfi.publicfield)
            #publicfield wasn't deleted, raise RuntimeError
            raise RuntimeError('__delattr__ doesn\'t work')
        except(AttributeError):
            print('__delattr__ works')


        try:
            ###---test readonly, make sure its readonly---###
            #get readonlyfield
            print(pfi.readonlyfield)
            print('__getattribute__ works')
            #set readonlyfield, should raise AttributeError
            pfi.readonlyfield = 'readonly'
            #apparently readonlyfield was set, notify user
            raise RuntimeError('__setattr__ doesn\'t work')
        except(AttributeError):
            print('__setattr__ seems to work')
            try:
                #ensure readonlyfield wasn't set
                print(pfi.readonlyfield)
                print('__setattr__ works')
                #delete readonlyfield
                del pfi.readonlyfield
                #readonlyfield was deleted, raise RuntimeError
                raise RuntimeError('__delattr__ doesn\'t work')
            except(AttributeError):
                print('__delattr__ works')
        try:
            print("Dict testing")
            print(pfi.__dict__, type(pfi.__dict__))
            attr = pfi.readonlyfield
            print(attr)
            print("__getattribute__ works")
            if pfi.readonlyfield != 'forbidden':
                print(pfi.readonlyfield)
                raise RuntimeError("__getattribute__ doesn't work")
            try:
                pfi.__dict__ = {}
                raise RuntimeError("__setattr__ doesn't work")
            except(AttributeError):
                print("__setattr__ works")
            del pfi.__dict__
            raise RuntimeError("__delattr__ doesn't work")
        except(AttributeError):
            print(pfi.__dict__)
            print("__delattr__ works")
            print("Basic things work")


    main()

There is little point in making attributes read-only except when you're writing library code, that is, code distributed to others to use in their own programs, as opposed to application code. The __dict__ problem is solved because __dict__ is now returned as an immutable types.MappingProxyType, so attributes cannot be changed through it. Setting or deleting __dict__ is also blocked. The only way to change read-only attributes is by changing the methods of the class itself.
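
For readers who haven't used types.MappingProxyType, here is a small standalone sketch of the behavior the decorator relies on. The dictionary contents are made up for illustration. The proxy rejects writes, but it remains a live view of the underlying dict:

```python
from types import MappingProxyType

backing = {'readonlyfield': 'forbidden'}
view = MappingProxyType(backing)

try:
    view['readonlyfield'] = 'changed'
    write_blocked = False
except TypeError:
    # A mappingproxy does not support item assignment
    write_blocked = True

# The proxy is a live view: changes to the backing dict show through it.
backing['publicfield'] = 'mutable'
print(write_blocked, view['publicfield'])
```

This is why anything that can still reach the underlying dict can change the values that the proxy reports.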

Though I believe my solution is better than the previous two, it could be improved. These are the code's weaknesses:

a) It doesn't allow a subclass to add a method that sets or deletes a read-only attribute. A method defined in a subclass is automatically barred from modifying a read-only attribute, even by calling the superclass's version of the method.

b) The class's read-only methods can be changed to defeat the read-only restrictions.

However, there is no way to set or delete a read-only attribute without editing the class. This doesn't depend on naming conventions, which is good because Python isn't very consistent about naming conventions. It provides a way to make read-only attributes that cannot be changed through hidden loopholes without editing the class itself. Simply list the attributes to make read-only as arguments when calling the decorator, and they will become read-only.

Credit to Brice's answer in "How to get the caller class name inside a function of another class in python?" for the technique of getting the caller's class and method.
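
The caller lookup used in __setattr__ and __delattr__ can be tried in isolation. This is a rough sketch: who_called_me, Account and plain_function are hypothetical names, and it only works when the caller's first parameter is conventionally named self:

```python
import inspect


def who_called_me():
    """Return (class or None, function name) for the direct caller."""
    frame = inspect.stack()[1][0]            # frame object of the caller
    caller_self = frame.f_locals.get('self')
    caller_class = type(caller_self) if caller_self is not None else None
    return caller_class, frame.f_code.co_name


class Account:
    def deposit(self):
        return who_called_me()


def plain_function():
    return who_called_me()


print(Account().deposit())
print(plain_function())
```

A method called from inside the class reports its class and name, while a plain function reports no class. That difference is what the decorator uses to decide whether an assignment is allowed.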


Answer 5

Notice that instance methods are also attributes (of the class) and that you could set them at the class or instance level if you really wanted to be a badass. Or you could set a class variable (which is also an attribute of the class), where handy read-only properties won't work neatly out of the box. What I'm trying to say is that the “readonly attribute” problem is in fact more general than it's usually perceived to be. Fortunately, there are conventional expectations at work that are so strong they blind us to these other cases (after all, almost everything is an attribute of some sort in Python).

Building on these expectations, I think the most general and lightweight approach is to adopt the convention that “public” (no leading underscore) attributes are read-only except when explicitly documented as writable. This subsumes the usual expectations that methods won't be patched and that class variables indicating instance defaults are better left alone. If you feel really paranoid about some special attribute, use a read-only descriptor as a last resort.
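
To make the point concrete, here is a small sketch with a made-up Circle class. A property blocks assignment on the instance, but methods, class variables and even the property itself remain ordinary writable attributes:

```python
class Circle:
    default_color = 'red'          # class variable acting as an instance default

    def __init__(self, radius):
        self._radius = radius

    @property
    def radius(self):              # read-only at the instance level
        return self._radius

    def area(self):
        return 3.14159 * self._radius ** 2


c = Circle(1)
try:
    c.radius = 5
except AttributeError:
    print("instance-level assignment is blocked")

# None of the following is prevented by the property:
Circle.default_color = 'blue'      # rebinding a class variable
c.area = lambda: 0                 # patching a method on the instance
Circle.radius = 10                 # replacing the property itself on the class
print(c.default_color, c.area(), c.radius)  # blue 0 10
```

Nothing stops these writes except the convention that people don't do them, which is the argument for relying on convention in the first place.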


Answer 6

While I like the class decorator from Oz123, you could also do the following, which uses an explicit class wrapper and __new__ with a class factory method that returns an instance of a class defined inside a closure:

class B(object):
    def __new__(cls, val):
        return cls.factory(val)

    @classmethod
    def factory(cls, val):
        # Kept in the closure rather than on the instance, so it never
        # appears in the instance's __dict__
        private = {'var': 'test'}

        class InnerB(object):
            def __init__(self):
                self.variable = val

            @property
            def var(self):
                return private['var']

        return InnerB()

Answer 7

That’s my workaround.

@property
def language(self):
    return self._language

@language.setter
def language(self, value):
    # WORKAROUND to get a "getter-only" behavior:
    # set the value only if the attribute does not exist yet
    try:
        self.language  # raises AttributeError until _language has been set
    except AttributeError:
        self._language = value
    else:
        print("WARNING: Cannot set attribute 'language'.")
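
Here is how the workaround behaves inside a complete class. This is only a sketch, and the Document class is a made-up host for the property. The first assignment (from __init__) succeeds, and later ones only print the warning:

```python
class Document:
    def __init__(self, language):
        self.language = language   # first assignment goes through the setter

    @property
    def language(self):
        return self._language

    @language.setter
    def language(self, value):
        try:
            self.language          # AttributeError until _language exists
        except AttributeError:
            self._language = value
        else:
            print("WARNING: Cannot set attribute 'language'.")


doc = Document('en')
doc.language = 'fr'    # prints the warning; the value is unchanged
print(doc.language)    # en
```

Note that the rejected assignment fails silently apart from the printed warning, and that _language itself can still be assigned directly.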

Answer 8

Someone mentioned using a proxy object. I didn't see an example of that, so I ended up trying it out [poorly].

/!\ Please prefer class definitions and class constructors if possible.

This code is effectively rewriting class.__new__ (the class constructor), except worse in every way. Save yourself the pain and do not use this pattern if you can avoid it.

def attr_proxy(obj):
    """ Use dynamic class definition to bind obj and proxy_attrs.
        If you can extend the target class constructor that is 
        cleaner, but its not always trivial to do so.
    """
    proxy_attrs = dict()

    class MyObjAttrProxy():
        def __getattr__(self, name):
            if name in proxy_attrs:
                return proxy_attrs[name]  # overloaded

            return getattr(obj, name)  # proxy

        def __setattr__(self, name, value):
            """ note, self is not bound when overloading methods
            """
            proxy_attrs[name] = value

    return MyObjAttrProxy()


class Object(object):
    """Placeholder target class; the original snippet does not define Object."""

myobj = attr_proxy(Object())
setattr(myobj, 'foo_str', 'foo')

def func_bind_obj_as_self(func, self):
    def _method(*args, **kwargs):
        return func(self, *args, **kwargs)
    return _method

def mymethod(self, foo_ct):
    """ self is not bound because we aren't using object __new__
        you can write the __setattr__ method to bind a self 
        argument, or declare your functions dynamically to bind in 
        a static object reference.
    """
    return self.foo_str + foo_ct

setattr(myobj, 'foo', func_bind_obj_as_self(mymethod, myobj))

Answer 9

I know I'm bringing this thread back from the dead, but I was looking at how to make a property read-only, and after finding this topic I wasn't satisfied with the solutions already shared.

So, going back to the initial question, if you start with this code:

@property
def x(self):
    return self._x

And you want to make x read-only, you can just add:

@x.setter
def x(self, value):
    raise Exception("Member readonly")

Then, if you run the following on an instance of the class (called obj here):

print(obj.x)  # Will print whatever the value of x is
obj.x = 3  # Will raise Exception("Member readonly")
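
Put together as a runnable sketch (the Point class and the obj instance are made up for illustration):

```python
class Point:
    def __init__(self, x):
        self._x = x

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        raise Exception("Member readonly")


obj = Point(2)
print(obj.x)           # 2
try:
    obj.x = 3
except Exception as e:
    print(e)           # Member readonly
```

For comparison, a property with no setter at all already rejects assignment with an AttributeError. The explicit setter mainly lets you choose the exception and its message.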