Tag archive: dictionary

How do I check that multiple keys are in a dict in a single pass?

Question: How do I check that multiple keys are in a dict in a single pass?

I want to do something like:

foo = {
    'foo': 1,
    'zip': 2,
    'zam': 3,
    'bar': 4
}

if ("foo", "bar") in foo:
    #do stuff

How do I check whether both foo and bar are in dict foo?


Answer 0

Well, you could do this:

>>> if all(k in foo for k in ("foo", "bar")):
...     print("They're there!")
...
They're there!

Answer 1

if {"foo", "bar"} <= myDict.keys(): ...

If you’re still on Python 2, you can do

if {"foo", "bar"} <= myDict.viewkeys(): ...

If you’re still on a really old Python <= 2.6, you can call set on the dict, but it’ll iterate over the whole dict to build the set, and that’s slow:

if set(("foo", "bar")) <= set(myDict): ...
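As a rough illustration of the Python 3 form above, the subset test can be wrapped in a small function. This is only a sketch: has_all_keys is a hypothetical helper name, not something from the answer, and it assumes the keys to look for are hashable.

```python
# Sketch for Python 3; has_all_keys is a hypothetical helper name.
foo = {"foo": 1, "zip": 2, "zam": 3, "bar": 4}


def has_all_keys(mapping, keys):
    """Return True when every key in `keys` is present in `mapping`."""
    # dict.keys() returns a set-like view, so no copy of the dict is made.
    return set(keys) <= mapping.keys()


print(has_all_keys(foo, ("foo", "bar")))   # True
print(has_all_keys(foo, ("foo", "nope")))  # False
```

Only the small query collection is turned into a set here; the dictionary itself is never copied.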

Answer 2

Simple benchmarking rig for 3 of the alternatives.

Put in your own values for D and Q


>>> from timeit import Timer
>>> setup='''from random import randint as R;d=dict((str(R(0,1000000)),R(0,1000000)) for i in range(D));q=dict((str(R(0,1000000)),R(0,1000000)) for i in range(Q));print("looking for %s items in %s"%(len(q),len(d)))'''

>>> Timer('set(q) <= set(d)','D=1000000;Q=100;'+setup).timeit(1)
looking for 100 items in 632499
0.28672504425048828

#This one only works for Python3
>>> Timer('set(q) <= d.keys()','D=1000000;Q=100;'+setup).timeit(1)
looking for 100 items in 632084
2.5987625122070312e-05

>>> Timer('all(k in d for k in q)','D=1000000;Q=100;'+setup).timeit(1)
looking for 100 items in 632219
1.1920928955078125e-05

Answer 3

You don’t have to wrap the left side in a set. You can just do this:

if {'foo', 'bar'} <= set(some_dict):
    pass

This also performs better than the all(k in d...) solution.


Answer 4

Using sets:

if set(("foo", "bar")).issubset(foo):
    #do stuff

Alternatively:

if set(("foo", "bar")) <= set(foo):
    #do stuff

Answer 5

How about this:

if all([key in foo for key in ["foo","bar"]]):
    # do stuff
    pass

Answer 6

I think this is the smartest and most Pythonic option.

{'key1','key2'} <= my_dict.keys()

Answer 7

While I like Alex Martelli’s answer, it doesn’t seem Pythonic to me. That is, I thought an important part of being Pythonic is to be easily understandable. With that goal, <= isn’t easy to understand.

While it’s more characters, using issubset() as suggested by Karl Voigtland’s answer is more understandable. Since that method can use a dictionary as an argument, a short, understandable solution is:

foo = {'foo': 1, 'zip': 2, 'zam': 3, 'bar': 4}

if set(('foo', 'bar')).issubset(foo):
    #do stuff

I’d like to use {'foo', 'bar'} in place of set(('foo', 'bar')), because it’s shorter. However, it’s not that understandable and I think the braces are too easily confused as being a dictionary.


Answer 8

Alex Martelli’s solution set(queries) <= set(my_dict) is the shortest code but may not be the fastest. Assume Q = len(queries) and D = len(my_dict).

This takes O(Q) + O(D) to make the two sets, and then (one hopes!) only O(min(Q,D)) to do the subset test — assuming of course that Python set look-up is O(1) — this is worst case (when the answer is True).

The generator solution of hughdbrown (et al?) all(k in my_dict for k in queries) is worst-case O(Q).

Complicating factors:
(1) The loops in the set-based gadget are all done at C speed, whereas the all-based gadget loops over bytecode.
(2) The caller of the all-based gadget may be able to use any knowledge of the probability of failure to order the query items accordingly, whereas the set-based gadget allows no such control.

As always, if speed is important, benchmarking under operational conditions is a good idea.
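The ordering point in (2) can be made visible by counting membership tests. The following is a sketch only: CountingDict and lookups_needed are hypothetical names invented for this illustration, and the counts describe the all()-based check, not the set-based one.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts membership tests."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __contains__(self, key):
        self.lookups += 1
        return super().__contains__(key)


def lookups_needed(queries):
    """Run the all()-based check and report how many lookups it performed."""
    d = CountingDict(foo=1, bar=2, zip=3)
    found = all(k in d for k in queries)
    return found, d.lookups


print(lookups_needed(["missing", "foo", "bar"]))  # (False, 1)
print(lookups_needed(["foo", "bar", "missing"]))  # (False, 3)
print(lookups_needed(["foo", "bar", "zip"]))      # (True, 3)
```

Putting the key most likely to be absent first lets all() stop after a single lookup, which is the control the set-based form does not offer.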


Answer 9

You can use .issubset() as well

>>> {"key1", "key2"}.issubset({"key1":1, "key2":2, "key3": 3})
True
>>> {"key4", "key2"}.issubset({"key1":1, "key2":2, "key3": 3})
False
>>>

Answer 10

How about using lambda?

from functools import reduce  # Python 3: reduce lives in functools, and dict.has_key() is gone
if reduce(lambda x, y: x and y in foo, ["foo", "bar"], True): # do stuff

Answer 11

In case you want to:

  • also get the values for the keys
  • check more than one dictionary

then:

from operator import itemgetter
foo = {'foo':1,'zip':2,'zam':3,'bar':4}
keys = ("foo","bar") 
getter = itemgetter(*keys) # returns all values
try:
    values = getter(foo)
except KeyError:
    # not both keys exist
    pass
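One detail worth keeping in mind with this approach: itemgetter returns a bare value rather than a tuple when it is given a single key, and it cannot be built with no keys at all. A possible wrapper that smooths this over is sketched below; values_if_all_present is a hypothetical name, not part of the answer.

```python
from operator import itemgetter


def values_if_all_present(mapping, keys):
    """Return a tuple of values if every key exists, otherwise None."""
    keys = tuple(keys)
    if not keys:
        return ()  # itemgetter() needs at least one key
    try:
        found = itemgetter(*keys)(mapping)
    except KeyError:
        return None  # at least one key is missing
    # With a single key, itemgetter returns the bare value, not a tuple.
    return found if len(keys) > 1 else (found,)


foo = {'foo': 1, 'zip': 2, 'zam': 3, 'bar': 4}
print(values_if_all_present(foo, ("foo", "bar")))   # (1, 4)
print(values_if_all_present(foo, ("foo",)))         # (1,)
print(values_if_all_present(foo, ("foo", "nope")))  # None
```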

Answer 12

Not to suggest that this is something you haven’t thought of, but I find that the simplest thing is usually the best:

if ("foo" in foo) and ("bar" in foo):
    # do stuff

Answer 13

>>> if 'foo' in foo and 'bar' in foo:
...     print('yes')
... 
yes

Jason, () aren’t necessary in Python.


Answer 14

Just my take on this: of all the given options, two are easy to understand. My main criterion is very readable code, not exceptionally fast code. To keep the code understandable, I prefer these two possibilities:

  • var <= var2.keys()
  • var.issubset(var2)

Since “var <= var2.keys()” also executes faster in my testing below, I prefer that one.

import timeit

timeit.timeit('var <= var2.keys()', setup='var={"managed_ip", "hostname", "fqdn"}; var2= {"zone": "test-domain1.var23.com", "hostname": "bakje", "api_client_ip": "127.0.0.1", "request_data": "", "request_method": "GET", "request_url": "hvar2p://127.0.0.1:5000/test-domain1.var23.com/bakje", "utc_datetime": "04-Apr-2019 07:01:10", "fqdn": "bakje.test-domain1.var23.com"}; var={"managed_ip", "hostname", "fqdn"}')
0.1745898080000643

timeit.timeit('var.issubset(var2)', setup='var={"managed_ip", "hostname", "fqdn"}; var2= {"zone": "test-domain1.var23.com", "hostname": "bakje", "api_client_ip": "127.0.0.1", "request_data": "", "request_method": "GET", "request_url": "hvar2p://127.0.0.1:5000/test-domain1.var23.com/bakje", "utc_datetime": "04-Apr-2019 07:01:10", "fqdn": "bakje.test-domain1.var23.com"}; var={"managed_ip", "hostname", "fqdn"};')
0.2644960229999924

Answer 15

In the case of determining whether only some keys match, this works:

any_keys_i_seek = ["key1", "key2", "key3"]

if set(my_dict).intersection(any_keys_i_seek):
    # code_here
    pass

Yet another option to find if only some keys match:

any_keys_i_seek = ["key1", "key2", "key3"]

if any_keys_i_seek & my_dict.keys():
    # code_here
    pass
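A third way to express the same "at least one key matches" test, sketched here as an alternative rather than as part of the answer, is the isdisjoint() method that Python 3 key views provide. The contents of my_dict below are made-up example data.

```python
my_dict = {"key1": 1, "other": 2}  # example data, not from the answer
any_keys_i_seek = ["key1", "key2", "key3"]

# isdisjoint() is True when nothing is shared, so negate it.
has_any = not my_dict.keys().isdisjoint(any_keys_i_seek)
print(has_any)  # True

has_any_other = not my_dict.keys().isdisjoint(["key2", "key3"])
print(has_any_other)  # False
```

Unlike the intersection forms, this does not build a result set; it only answers yes or no.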

Answer 16

Another option for detecting whether all keys are in a dict:

dict_to_test = { ... }  # dict
keys_sought = { "key_sought_1", "key_sought_2", "key_sought_3" }  # set

if keys_sought & dict_to_test.keys() == keys_sought: 
    # True -- dict_to_test contains all keys in keys_sought
    # code_here
    pass

Answer 17

>>> ok
{'five': '5', 'two': '2', 'one': '1'}

>>> if ('two' and 'one' and 'five') in ok:
...   print("cool")
... 
cool

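This seems to work, but only by coincidence: ('two' and 'one' and 'five') evaluates to 'five', so only that one key is actually checked.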
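A short check shows what the expression in this answer really tests. The dictionary missing_two is made-up example data; the point is that a chain of non-empty strings joined with `and` evaluates to its last operand.

```python
ok = {'five': '5', 'two': '2', 'one': '1'}

# `and` returns its last operand when every operand is truthy.
probe = ('two' and 'one' and 'five')
print(probe)  # five

# Example data: 'two' and 'one' are absent, yet the test still passes.
missing_two = {'five': '5'}
misleading = ('two' and 'one' and 'five') in missing_two
print(misleading)  # True

# Testing each key separately gives the expected answer.
correct = all(k in missing_two for k in ('two', 'one', 'five'))
print(correct)  # False
```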


How to “perfectly” override a dict?

Question: How to “perfectly” override a dict?

How can I make as “perfect” a subclass of dict as possible? The end goal is to have a simple dict in which the keys are lowercase.

It would seem that there should be some tiny set of primitives I can override to make this work, but according to all my research and attempts it seems this isn’t the case:

  • If I override __getitem__/__setitem__, then get/set don’t work. How can I make them work? Surely I don’t need to implement them individually?

  • Am I preventing pickling from working, and do I need to implement __setstate__ etc?

  • Do I need repr, update and __init__?

  • Should I just use mutablemapping (it seems one shouldn’t use UserDict or DictMixin)? If so, how? The docs aren’t exactly enlightening.

Here is my first go at it, get() doesn’t work and no doubt there are many other minor problems:

class arbitrary_dict(dict):
    """A dictionary that applies an arbitrary key-altering function
       before accessing the keys."""

    def __keytransform__(self, key):
        return key

    # Overridden methods. List from 
    # https://stackoverflow.com/questions/2390827/how-to-properly-subclass-dict

    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    # Note: I'm using dict directly, since super(dict, self) doesn't work.
    # I'm not sure why, perhaps dict is not a new-style class.

    def __getitem__(self, key):
        return dict.__getitem__(self, self.__keytransform__(key))

    def __setitem__(self, key, value):
        return dict.__setitem__(self, self.__keytransform__(key), value)

    def __delitem__(self, key):
        return dict.__delitem__(self, self.__keytransform__(key))

    def __contains__(self, key):
        return dict.__contains__(self, self.__keytransform__(key))


class lcdict(arbitrary_dict):
    def __keytransform__(self, key):
        return str(key).lower()
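The get() problem described in the question can be reproduced in a few lines. This is a stripped-down sketch for Python 3, not the asker's class: LowerKeyDict is a hypothetical name, and it overrides only __getitem__ and __setitem__.

```python
class LowerKeyDict(dict):
    """Hypothetical minimal subclass overriding only item access."""

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)


d = LowerKeyDict()
d["Foo"] = 1
print(d["FOO"])      # 1, the override is used
print(d.get("FOO"))  # None, dict.get() does not call __getitem__

# The constructor bypasses __setitem__ as well.
e = LowerKeyDict({"Foo": 1})
print(list(e))       # ['Foo']
```

The built-in dict methods do not route through the overridden special methods, which is why each of them has to be handled in a direct subclass.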

Answer 0

You can write an object that behaves like a dict quite easily with ABCs (Abstract Base Classes) from the collections.abc module. It even tells you if you missed a method, so below is the minimal version that shuts the ABC up.

from collections.abc import MutableMapping


class TransformedDict(MutableMapping):
    """A dictionary that applies an arbitrary key-altering
       function before accessing the keys"""

    def __init__(self, *args, **kwargs):
        self.store = dict()
        self.update(dict(*args, **kwargs))  # use the free update to set keys

    def __getitem__(self, key):
        return self.store[self._keytransform(key)]

    def __setitem__(self, key, value):
        self.store[self._keytransform(key)] = value

    def __delitem__(self, key):
        del self.store[self._keytransform(key)]

    def __iter__(self):
        return iter(self.store)
    
    def __len__(self):
        return len(self.store)

    def _keytransform(self, key):
        return key

You get a few free methods from the ABC:

class MyTransformedDict(TransformedDict):

    def _keytransform(self, key):
        return key.lower()


s = MyTransformedDict([('Test', 'test')])

assert s.get('TEST') is s['test']   # free get
assert 'TeSt' in s                  # free __contains__
                                    # free setdefault, __eq__, and so on

import pickle
# works too since we just use a normal dict
assert pickle.loads(pickle.dumps(s)) == s

I wouldn’t subclass dict (or other builtins) directly. It often makes no sense, because what you actually want to do is implement the interface of a dict. And that is exactly what ABCs are for.
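The remark that the ABC tells you when a method is missing can be checked with a deliberately incomplete class. This is a sketch: Incomplete is a hypothetical name, and the exact wording of the error message differs between Python versions, so only the listed method names are relied on.

```python
from collections.abc import MutableMapping


class Incomplete(MutableMapping):
    """Hypothetical class that implements only one of the five required methods."""

    def __getitem__(self, key):
        raise KeyError(key)


try:
    Incomplete()
except TypeError as exc:
    # The message names the abstract methods that are still missing.
    message = str(exc)
    print(message)
```

The message lists __delitem__, __iter__, __len__ and __setitem__, which together with __getitem__ are the five methods TransformedDict defines.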


Answer 1

How can I make as “perfect” a subclass of dict as possible?

The end goal is to have a simple dict in which the keys are lowercase.

  • If I override __getitem__/__setitem__, then get/set don’t work. How do I make them work? Surely I don’t need to implement them individually?

  • Am I preventing pickling from working, and do I need to implement __setstate__ etc?

  • Do I need repr, update and __init__?

  • Should I just use mutablemapping (it seems one shouldn’t use UserDict or DictMixin)? If so, how? The docs aren’t exactly enlightening.

The accepted answer would be my first approach, but since it has some issues, and since no one has addressed the alternative, actually subclassing a dict, I’m going to do that here.

What’s wrong with the accepted answer?

This seems like a rather simple request to me:

How can I make as “perfect” a subclass of dict as possible? The end goal is to have a simple dict in which the keys are lowercase.

The accepted answer doesn’t actually subclass dict, and a test for this fails:

>>> isinstance(MyTransformedDict([('Test', 'test')]), dict)
False

Ideally, any type-checking code would be testing for the interface we expect, or an abstract base class, but if our data objects are being passed into functions that are testing for dict – and we can’t “fix” those functions, this code will fail.
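The difference between the two kinds of type check can be seen with the standard library's UserDict, which, like the accepted answer's class, is a MutableMapping but not a dict subclass. This is a sketch; describe is a hypothetical helper name.

```python
from collections import UserDict
from collections.abc import Mapping


def describe(obj):
    """Return (passes a dict check, passes a Mapping check)."""
    return isinstance(obj, dict), isinstance(obj, Mapping)


print(describe({}))          # (True, True)
print(describe(UserDict()))  # (False, True)
```

Code that tests against Mapping accepts both objects, while code that tests against dict rejects the wrapper.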

Other quibbles one might make:

  • The accepted answer is also missing the classmethod: fromkeys.
  • The accepted answer also has a redundant __dict__ – therefore taking up more space in memory:

    >>> s.foo = 'bar'
    >>> s.__dict__
    {'foo': 'bar', 'store': {'test': 'test'}}
    

Actually subclassing dict

We can reuse the dict methods through inheritance. All we need to do is create an interface layer that ensures keys are passed into the dict in lowercase form if they are strings.

If I override __getitem__/__setitem__, then get/set don’t work. How do I make them work? Surely I don’t need to implement them individually?

Well, implementing them each individually is the downside to this approach and the upside to using MutableMapping (see the accepted answer), but it’s really not that much more work.

First, let’s factor out the difference between Python 2 and 3, create a singleton (_RaiseKeyError) to make sure we know if we actually get an argument to dict.pop, and create a function to ensure our string keys are lowercase:

from itertools import chain
try:              # Python 2
    str_base = basestring
    items = 'iteritems'
except NameError: # Python 3
    str_base = str, bytes, bytearray
    items = 'items'

_RaiseKeyError = object() # singleton for no-default behavior

def ensure_lower(maybe_str):
    """dict keys can be any hashable object - only call lower if str"""
    return maybe_str.lower() if isinstance(maybe_str, str_base) else maybe_str

Now we implement – I’m using super with the full arguments so that this code works for Python 2 and 3:

class LowerDict(dict):  # dicts take a mapping or iterable as their optional first argument
    __slots__ = () # no __dict__ - that would be redundant
    @staticmethod # because this doesn't make sense as a global function.
    def _process_args(mapping=(), **kwargs):
        if hasattr(mapping, items):
            mapping = getattr(mapping, items)()
        return ((ensure_lower(k), v) for k, v in chain(mapping, getattr(kwargs, items)()))
    def __init__(self, mapping=(), **kwargs):
        super(LowerDict, self).__init__(self._process_args(mapping, **kwargs))
    def __getitem__(self, k):
        return super(LowerDict, self).__getitem__(ensure_lower(k))
    def __setitem__(self, k, v):
        return super(LowerDict, self).__setitem__(ensure_lower(k), v)
    def __delitem__(self, k):
        return super(LowerDict, self).__delitem__(ensure_lower(k))
    def get(self, k, default=None):
        return super(LowerDict, self).get(ensure_lower(k), default)
    def setdefault(self, k, default=None):
        return super(LowerDict, self).setdefault(ensure_lower(k), default)
    def pop(self, k, v=_RaiseKeyError):
        if v is _RaiseKeyError:
            return super(LowerDict, self).pop(ensure_lower(k))
        return super(LowerDict, self).pop(ensure_lower(k), v)
    def update(self, mapping=(), **kwargs):
        super(LowerDict, self).update(self._process_args(mapping, **kwargs))
    def __contains__(self, k):
        return super(LowerDict, self).__contains__(ensure_lower(k))
    def copy(self): # don't delegate w/ super - dict.copy() -> dict :(
        return type(self)(self)
    @classmethod
    def fromkeys(cls, keys, v=None):
        return super(LowerDict, cls).fromkeys((ensure_lower(k) for k in keys), v)
    def __repr__(self):
        return '{0}({1})'.format(type(self).__name__, super(LowerDict, self).__repr__())

We use an almost boiler-plate approach for any method or special method that references a key, but otherwise, by inheritance, we get methods: len, clear, items, keys, popitem, and values for free. While this required some careful thought to get right, it is trivial to see that this works.

(Note that has_key was deprecated in Python 2 and removed in Python 3.)

Here’s some usage:

>>> ld = LowerDict(dict(foo='bar'))
>>> ld['FOO']
'bar'
>>> ld['foo']
'bar'
>>> ld.pop('FoO')
'bar'
>>> ld.setdefault('Foo')
>>> ld
{'foo': None}
>>> ld.get('Bar')
>>> ld.setdefault('Bar')
>>> ld
{'bar': None, 'foo': None}
>>> ld.popitem()
('bar', None)

Am I preventing pickling from working, and do I need to implement __setstate__ etc?

pickling

And the dict subclass pickles just fine:

>>> import pickle
>>> pickle.dumps(ld)
b'\x80\x03c__main__\nLowerDict\nq\x00)\x81q\x01X\x03\x00\x00\x00fooq\x02Ns.'
>>> pickle.loads(pickle.dumps(ld))
{'foo': None}
>>> type(pickle.loads(pickle.dumps(ld)))
<class '__main__.LowerDict'>

__repr__

Do I need repr, update and __init__?

We defined update and __init__, but you have a beautiful __repr__ by default:

>>> ld # without __repr__ defined for the class, we get this
{'foo': None}

However, it’s good to write a __repr__ to improve the debuggability of your code. The ideal test is eval(repr(obj)) == obj. If it’s easy to do for your code, I strongly recommend it:

>>> ld = LowerDict({})
>>> eval(repr(ld)) == ld
True
>>> ld = LowerDict(dict(a=1, b=2, c=3))
>>> eval(repr(ld)) == ld
True

You see, it’s exactly what we need to recreate an equivalent object – this is something that might show up in our logs or in backtraces:

>>> ld
LowerDict({'a': 1, 'c': 3, 'b': 2})

Conclusion

Should I just use mutablemapping (it seems one shouldn’t use UserDict or DictMixin)? If so, how? The docs aren’t exactly enlightening.

Yeah, these are a few more lines of code, but they’re intended to be comprehensive. My first inclination would be to use the accepted answer, and if there were issues with it, I’d then look at my answer – as it’s a little more complicated, and there’s no ABC to help me get my interface right.

Premature optimization is going for greater complexity in search of performance. MutableMapping is simpler – so it gets an immediate edge, all else being equal. Nevertheless, to lay out all the differences, let’s compare and contrast.

I should add that there was a push to put a similar dictionary into the collections module, but it was rejected. You should probably just do this instead:

my_dict[transform(key)]

It should be far easier to debug.
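A minimal sketch of that explicit style with a plain dict follows. The name transform comes from the line above, but its body here is an assumption: it lowercases strings and leaves other keys untouched.

```python
def transform(key):
    """Assumed behaviour: lowercase string keys, leave other keys alone."""
    return key.lower() if isinstance(key, str) else key


my_dict = {}
my_dict[transform("Foo")] = "bar"

print(my_dict[transform("FOO")])  # bar
print(transform("fOo") in my_dict)  # True
print(my_dict)                    # {'foo': 'bar'}
```

Every access spells out the transformation, so there is no hidden behaviour to trace when something goes wrong.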

Compare and contrast

There are 6 interface functions implemented with the MutableMapping (which is missing fromkeys) and 11 with the dict subclass. I don’t need to implement __iter__ or __len__, but instead I have to implement get, setdefault, pop, update, copy, __contains__, and fromkeys – but these are fairly trivial, since I can use inheritance for most of those implementations.

The MutableMapping implements some things in Python that dict implements in C – so I would expect a dict subclass to be more performant in some cases.

We get a free __eq__ in both approaches – both of which assume equality only if another dict is all lowercase – but again, I think the dict subclass will compare more quickly.

Summary:

  • subclassing MutableMapping is simpler with fewer opportunities for bugs, but slower, takes more memory (see redundant dict), and fails isinstance(x, dict)
  • subclassing dict is faster, uses less memory, and passes isinstance(x, dict), but it has greater complexity to implement.

Which is more perfect? That depends on your definition of perfect.


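Answer 2

My requirements were a bit stricter:

  • I had to retain case information (the strings are paths to files displayed to the user, but it is a Windows application, so internally all operations must be case-insensitive)
  • I needed the keys to be as small as possible (it did make a difference in memory performance, chopping 110 MB off 370). This meant that caching a lowercase version of the keys was not an option.
  • I needed creation of the data structures to be as fast as possible (again this made a difference in performance, speed this time). I had to go with a builtin

My initial thought was to replace our clunky Path class with a case-insensitive unicode subclass, but:

  • it proved hard to get right; see: A case insensitive string class in python
  • it turns out that explicit dict key handling makes the code verbose, messy and error-prone (structures are passed back and forth, and it is not clear whether they have CIstr instances as keys/elements; it is easy to forget, and some_dict[CIstr(path)] is ugly)

So in the end I had to write that case-insensitive dict. Thanks to the code by @AaronHall, that was made 10 times easier.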

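from itertools import chain  # needed by LowerDict._process_args below

class CIstr(unicode):  # Python 2 code: unicode and basestring do not exist in Python 3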
    """See https://stackoverflow.com/a/43122305/281545, especially for inlines"""
    __slots__ = () # does make a difference in memory performance

    #--Hash/Compare
    def __hash__(self):
        return hash(self.lower())
    def __eq__(self, other):
        if isinstance(other, CIstr):
            return self.lower() == other.lower()
        return NotImplemented
    def __ne__(self, other):
        if isinstance(other, CIstr):
            return self.lower() != other.lower()
        return NotImplemented
    def __lt__(self, other):
        if isinstance(other, CIstr):
            return self.lower() < other.lower()
        return NotImplemented
    def __ge__(self, other):
        if isinstance(other, CIstr):
            return self.lower() >= other.lower()
        return NotImplemented
    def __gt__(self, other):
        if isinstance(other, CIstr):
            return self.lower() > other.lower()
        return NotImplemented
    def __le__(self, other):
        if isinstance(other, CIstr):
            return self.lower() <= other.lower()
        return NotImplemented
    #--repr
    def __repr__(self):
        return '{0}({1})'.format(type(self).__name__,
                                 super(CIstr, self).__repr__())

def _ci_str(maybe_str):
    """dict keys can be any hashable object - only call CIstr if str"""
    return CIstr(maybe_str) if isinstance(maybe_str, basestring) else maybe_str

class LowerDict(dict):
    """Dictionary that transforms its keys to CIstr instances.
    Adapted from: https://stackoverflow.com/a/39375731/281545
    """
    __slots__ = () # no __dict__ - that would be redundant

    @staticmethod # because this doesn't make sense as a global function.
    def _process_args(mapping=(), **kwargs):
        if hasattr(mapping, 'iteritems'):
            mapping = getattr(mapping, 'iteritems')()
        return ((_ci_str(k), v) for k, v in
                chain(mapping, getattr(kwargs, 'iteritems')()))
    def __init__(self, mapping=(), **kwargs):
        # dicts take a mapping or iterable as their optional first argument
        super(LowerDict, self).__init__(self._process_args(mapping, **kwargs))
    def __getitem__(self, k):
        return super(LowerDict, self).__getitem__(_ci_str(k))
    def __setitem__(self, k, v):
        return super(LowerDict, self).__setitem__(_ci_str(k), v)
    def __delitem__(self, k):
        return super(LowerDict, self).__delitem__(_ci_str(k))
    def copy(self): # don't delegate w/ super - dict.copy() -> dict :(
        return type(self)(self)
    def get(self, k, default=None):
        return super(LowerDict, self).get(_ci_str(k), default)
    def setdefault(self, k, default=None):
        return super(LowerDict, self).setdefault(_ci_str(k), default)
    __no_default = object()
    def pop(self, k, v=__no_default):
        if v is LowerDict.__no_default:
            # super will raise KeyError if no default and key does not exist
            return super(LowerDict, self).pop(_ci_str(k))
        return super(LowerDict, self).pop(_ci_str(k), v)
    def update(self, mapping=(), **kwargs):
        super(LowerDict, self).update(self._process_args(mapping, **kwargs))
    def __contains__(self, k):
        return super(LowerDict, self).__contains__(_ci_str(k))
    @classmethod
    def fromkeys(cls, keys, v=None):
        return super(LowerDict, cls).fromkeys((_ci_str(k) for k in keys), v)
    def __repr__(self):
        return '{0}({1})'.format(type(self).__name__,
                                 super(LowerDict, self).__repr__())

隐式还是显式仍然是一个问题,但是一旦尘埃落定,就重命名属性/变量以ci开头(以及大量的doc注释说明ci代表不区分大小写),我认为这是一个完美的解决方案-因为代码的读者必须充分意识到我们正在处理不区分大小写的基础数据结构。希望这将修复一些难以重现的错误,我怀疑这些错误归结为区分大小写。

欢迎评论/更正:)

My requirements were a bit stricter:

  • I had to retain case info (the strings are paths to files displayed to the user, but it’s a Windows app, so internally all operations must be case insensitive)
  • I needed keys to be as small as possible (it made a real difference in memory use: 110 MB chopped off 370). This meant that caching a lowercase version of the keys was not an option.
  • I needed creation of the data structures to be as fast as possible (again a measurable difference, in speed this time), so I had to build on a builtin

My initial thought was to replace our clunky Path class with a case-insensitive unicode subclass, but:

  • it proved hard to get that right – see: A case insensitive string class in python
  • it turns out that explicit handling of dict keys makes code verbose, messy and error prone (structures are passed hither and thither, it is not clear whether they have CIstr instances as keys/elements, the wrapping is easy to forget, and some_dict[CIstr(path)] is ugly)

So I finally had to write that case-insensitive dict. Thanks to code by @AaronHall, that was 10 times easier.

from itertools import chain  # needed by LowerDict._process_args below

class CIstr(unicode):
    """See https://stackoverflow.com/a/43122305/281545, especially for inlines"""
    __slots__ = () # does make a difference in memory performance

    #--Hash/Compare
    def __hash__(self):
        return hash(self.lower())
    def __eq__(self, other):
        if isinstance(other, CIstr):
            return self.lower() == other.lower()
        return NotImplemented
    def __ne__(self, other):
        if isinstance(other, CIstr):
            return self.lower() != other.lower()
        return NotImplemented
    def __lt__(self, other):
        if isinstance(other, CIstr):
            return self.lower() < other.lower()
        return NotImplemented
    def __ge__(self, other):
        if isinstance(other, CIstr):
            return self.lower() >= other.lower()
        return NotImplemented
    def __gt__(self, other):
        if isinstance(other, CIstr):
            return self.lower() > other.lower()
        return NotImplemented
    def __le__(self, other):
        if isinstance(other, CIstr):
            return self.lower() <= other.lower()
        return NotImplemented
    #--repr
    def __repr__(self):
        return '{0}({1})'.format(type(self).__name__,
                                 super(CIstr, self).__repr__())

def _ci_str(maybe_str):
    """dict keys can be any hashable object - only call CIstr if str"""
    return CIstr(maybe_str) if isinstance(maybe_str, basestring) else maybe_str

class LowerDict(dict):
    """Dictionary that transforms its keys to CIstr instances.
    Adapted from: https://stackoverflow.com/a/39375731/281545
    """
    __slots__ = () # no __dict__ - that would be redundant

    @staticmethod # because this doesn't make sense as a global function.
    def _process_args(mapping=(), **kwargs):
        if hasattr(mapping, 'iteritems'):
            mapping = getattr(mapping, 'iteritems')()
        return ((_ci_str(k), v) for k, v in
                chain(mapping, getattr(kwargs, 'iteritems')()))
    def __init__(self, mapping=(), **kwargs):
        # dicts take a mapping or iterable as their optional first argument
        super(LowerDict, self).__init__(self._process_args(mapping, **kwargs))
    def __getitem__(self, k):
        return super(LowerDict, self).__getitem__(_ci_str(k))
    def __setitem__(self, k, v):
        return super(LowerDict, self).__setitem__(_ci_str(k), v)
    def __delitem__(self, k):
        return super(LowerDict, self).__delitem__(_ci_str(k))
    def copy(self): # don't delegate w/ super - dict.copy() -> dict :(
        return type(self)(self)
    def get(self, k, default=None):
        return super(LowerDict, self).get(_ci_str(k), default)
    def setdefault(self, k, default=None):
        return super(LowerDict, self).setdefault(_ci_str(k), default)
    __no_default = object()
    def pop(self, k, v=__no_default):
        if v is LowerDict.__no_default:
            # super will raise KeyError if no default and key does not exist
            return super(LowerDict, self).pop(_ci_str(k))
        return super(LowerDict, self).pop(_ci_str(k), v)
    def update(self, mapping=(), **kwargs):
        super(LowerDict, self).update(self._process_args(mapping, **kwargs))
    def __contains__(self, k):
        return super(LowerDict, self).__contains__(_ci_str(k))
    @classmethod
    def fromkeys(cls, keys, v=None):
        return super(LowerDict, cls).fromkeys((_ci_str(k) for k in keys), v)
    def __repr__(self):
        return '{0}({1})'.format(type(self).__name__,
                                 super(LowerDict, self).__repr__())

Implicit vs explicit is still a problem, but once the dust settles, renaming attributes/variables to start with ci (with a big fat doc comment explaining that ci stands for case insensitive) is, I think, a perfect solution, as readers of the code must be fully aware that we are dealing with case-insensitive underlying data structures. This will hopefully fix some hard-to-reproduce bugs, which I suspect boil down to case sensitivity.
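The code above targets Python 2 (unicode, basestring, iteritems). For readers on Python 3, the key class might be sketched roughly as follows. This is an assumption-laden port, not the author's code: it only covers hashing and equality, and it keeps an explicit __ne__ because a str subclass would otherwise inherit the case-sensitive str.__ne__.

```python
class CIstr(str):
    """Case-insensitive string key; the original casing is kept for display."""
    __slots__ = ()

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, CIstr):
            return self.lower() == other.lower()
        return NotImplemented

    def __ne__(self, other):
        # needed explicitly: otherwise str.__ne__ would compare case-sensitively
        if isinstance(other, CIstr):
            return self.lower() != other.lower()
        return NotImplemented


paths = {CIstr("C:/Data/File.TXT"): 1}

print(paths[CIstr("c:/data/file.txt")])  # lookup ignores case
print(str(next(iter(paths))))            # stored key keeps its casing
```

The stored key keeps the casing it was created with, which matches the first requirement in the list above.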

Comments/corrections welcome :)


Answer 3

All you will have to do is

class BatchCollection(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)

OR

class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

A sample from my own use:

### EXAMPLE
from collections.abc import Iterable

class BatchCollection(dict):
    def __init__(self, inpt=()):
        super(BatchCollection, self).__init__(inpt)

    def __setitem__(self, key, item):
        # accept only (database_name, table_name) keys with iterable values
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, Iterable)):
            super(BatchCollection, self).__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")

Note: tested only in python3
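One thing worth knowing with this pattern: dict's own constructor does not call an overridden __setitem__, so data passed to __init__ skips the validation. The sketch below shows that behaviour and one possible way around it. The class names Unchecked and CheckedBatchCollection are hypothetical, and ValueError is swapped in for the bare Exception used above.

```python
from collections.abc import Iterable


class Unchecked(dict):
    def __setitem__(self, key, item):
        raise ValueError("never allowed")


# No error here: dict.__init__ fills the table without calling __setitem__.
bypassed = Unchecked({"anything": 1})


class CheckedBatchCollection(dict):
    """Hypothetical variant that validates constructor input as well."""

    def __init__(self, inpt=()):
        super().__init__()
        for key, item in dict(inpt).items():
            self[key] = item  # goes through the validating __setitem__

    def __setitem__(self, key, item):
        if not (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, Iterable)):
            raise ValueError(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")
        super().__setitem__(key, item)


ok = CheckedBatchCollection({("db", "table"): [1, 2]})

try:
    CheckedBatchCollection({"not-a-tuple": [1, 2]})
    rejected = False
except ValueError:
    rejected = True

print(bypassed, ok, rejected)
```

The same caveat applies to update and setdefault, which would need similar overrides if the validation must hold everywhere.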


Answer 4

After trying out both of the top two suggestions, I’ve settled on a shady-looking middle route for Python 2.7. Maybe 3 is saner, but for me:

class MyDict(MutableMapping):
   # ... the few __methods__ that mutablemapping requires
   # and then this monstrosity
   @property
   def __class__(self):
       return dict

which I really hate, but seems to fit my needs, which are:

  • can override **my_dict
    • if you inherit from dict, this bypasses your code. try it out.
    • this makes #2 unacceptable for me at all times, as this is quite common in python code
  • masquerades as isinstance(my_dict, dict)
    • rules out MutableMapping alone, so #1 is not enough
    • I heartily recommend #1 if you don’t need this, it’s simple and predictable
  • fully controllable behavior
    • so I cannot inherit from dict

If you need to tell yourself apart from others, personally I use something like this (though I’d recommend better names):

def __am_i_me(self):
  return True

@classmethod
def __is_it_me(cls, other):
  try:
    return other.__am_i_me()
  except Exception:
    return False

As long as you only need to recognize yourself internally, this makes it harder to call __am_i_me by accident, thanks to Python’s name mangling (from outside this class, the method is only reachable as _MyDict__am_i_me). It is slightly more private than _methods, both in practice and culturally.

So far I have no complaints, aside from the seriously-shady-looking __class__ override. I’d be thrilled to hear of any problems that others encounter with this, though, since I don’t fully understand the consequences. But so far I’ve had no problems whatsoever, and this allowed me to migrate a lot of middling-quality code in lots of locations without needing any changes.


As evidence: https://repl.it/repls/TraumaticToughCockatoo

Basically: copy the current #2 option, add print 'method_name' lines to every method, and then try this and watch the output:

d = LowerDict()  # prints "init", or whatever your print statement said
print '------'
splatted = dict(**d)  # note that there are no prints here

You’ll see similar behavior for other scenarios. Say your fake-dict is a wrapper around some other datatype, so there’s no reasonable way to store the data in the backing-dict; **your_dict will be empty, regardless of what every other method does.

This works correctly for MutableMapping, but as soon as you inherit from dict it becomes uncontrollable.
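A self-contained Python 3 version of that experiment might look like the sketch below. The class names are invented for illustration, and the result is what CPython does; other implementations may behave differently.

```python
from collections.abc import MutableMapping


class ShoutingDict(dict):
    """dict subclass whose lookups upper-case the stored value."""

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


class ShoutingMapping(MutableMapping):
    """The same behaviour built on MutableMapping with a private store."""

    def __init__(self, **kwargs):
        self._store = dict(kwargs)

    def __getitem__(self, key):
        return self._store[key].upper()

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


from_subclass = dict(**ShoutingDict(a="x"))    # override is bypassed
from_mapping = dict(**ShoutingMapping(a="x"))  # override is honoured

print(from_subclass, from_mapping)
```

For the dict subclass, ** copies the underlying storage directly, so __getitem__ never runs. For the MutableMapping, ** has to go through keys() and __getitem__, so the custom behaviour survives.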


Edit: as an update, this has been running without a single issue for almost two years now, on several hundred thousand (eh, might be a couple million) lines of complicated, legacy-ridden python. So I’m pretty happy with it :)

Edit 2: apparently I mis-copied this or something long ago. @classmethod __class__ does not work for isinstance checks – @property __class__ does: https://repl.it/repls/UnitedScientificSequence


Python creating a dictionary of lists

Question: Python creating a dictionary of lists

I want to create a dictionary whose values are lists. For example:

{
  1: ['1'],
  2: ['1','2'],
  3: ['2']
}

If I do:

d = dict()
a = ['1', '2']
for i in a:
    for j in range(int(i), int(i) + 2): 
        d[j].append(i)

I get a KeyError, because d[…] isn’t a list. In this case, I can add the following code after the assignment of a to initialize the dictionary.

for x in range(1, 4):
    d[x] = list()

Is there a better way to do this? Let’s say I don’t know the keys I am going to need until I am in the second for loop. For example:

class relation:
    scope_list = list()
...
d = dict()
for relation in relation_list:
    for scope_item in relation.scope_list:
        d[scope_item].append(relation)

An alternative would then be replacing

d[scope_item].append(relation)

with

if d.has_key(scope_item):
    d[scope_item].append(relation)
else:
    d[scope_item] = [relation,]

What is the best way to handle this? Ideally, appending would “just work”. Is there some way to express that I want a dictionary of empty lists, even if I don’t know every key when I first create the list?


Answer 0

You can use defaultdict:

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> a = ['1', '2']
>>> for i in a:
...   for j in range(int(i), int(i) + 2):
...     d[j].append(i)
...
>>> d
defaultdict(<type 'list'>, {1: ['1'], 2: ['1', '2'], 3: ['2']})
>>> d.items()
[(1, ['1']), (2, ['1', '2']), (3, ['2'])]
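The same idea applies to the second half of the question, where the keys are only discovered inside the inner loop. A rough sketch, assuming a simplified Relation class with made-up name and scope_list fields standing in for the question's relation class:

```python
from collections import defaultdict


class Relation:
    """Stand-in for the question's relation class (hypothetical fields)."""

    def __init__(self, name, scope_list):
        self.name = name
        self.scope_list = scope_list


relation_list = [
    Relation("r1", ["a", "b"]),
    Relation("r2", ["b"]),
]

d = defaultdict(list)
for relation in relation_list:
    for scope_item in relation.scope_list:
        d[scope_item].append(relation)  # missing keys start as []

names_by_scope = {scope: [r.name for r in rels] for scope, rels in d.items()}
print(names_by_scope)
```

No key needs to be known in advance; the first append to an unseen scope_item creates its list.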

Answer 1

You can build it with a generator expression passed to dict, like this:

>>> dict((i, range(int(i), int(i) + 2)) for i in ['1', '2'])
{'1': [1, 2], '2': [2, 3]}

And for the second part of your question, use defaultdict:

>>> from collections import defaultdict
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
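The session above is from Python 2, where range returns a list. On Python 3 the first snippet would store range objects instead, so a rough equivalent (a sketch, with by_start as a made-up name) is a dict comprehension with an explicit list call:

```python
a = ['1', '2']

# list() is needed on Python 3 because range() no longer returns a list
by_start = {i: list(range(int(i), int(i) + 2)) for i in a}
print(by_start)
```

Note that this keys the result by the strings in a, which is the inverse of the shape shown in the question; for that shape, the defaultdict loop is the more direct route.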

Answer 2

You can use setdefault:

d = dict()
a = ['1', '2']
for i in a:
    for j in range(int(i), int(i) + 2): 
        d.setdefault(j, []).append(i)

print d  # prints {1: ['1'], 2: ['1', '2'], 3: ['2']}

The rather oddly-named setdefault function says “Get the value with this key, or if that key isn’t there, add this value and then return it.”

As others have rightly pointed out, defaultdict is a better and more modern choice. setdefault is still useful in older versions of Python (prior to 2.5).
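One behavioural difference may still matter when choosing between the two: a defaultdict inserts a key as soon as it is read with square brackets, whereas setdefault only inserts where it is called. A small sketch of that difference (variable names are illustrative):

```python
from collections import defaultdict

plain = {}
plain.setdefault("a", []).append(1)  # creates "a" only because we asked
probe_plain = "b" in plain           # a membership test never inserts

auto = defaultdict(list)
auto["a"].append(1)
_ = auto["b"]                        # merely reading creates "b"
missing = auto.get("c")              # .get() does not trigger the factory

print(plain, dict(auto), missing)
```

If stray empty lists would be a problem, for example when the dict is later serialized or its length is checked, setdefault or an explicit membership test keeps the contents tidier.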


Answer 3

Your question has already been answered, but IIRC you can replace lines like:

if d.has_key(scope_item):

with:

if scope_item in d:

That is, in that construction the in operator tests membership among the keys of d, so it does the same job as has_key (which no longer exists in Python 3). Sometimes defaultdict isn’t the best option (for example, if you want to execute multiple lines of code in the else branch of the above if), and I find the in syntax easier to read.
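As an illustration of the case where the else branch needs more than one statement, here is a sketch; add_relation is a hypothetical helper, not something from the question.

```python
def add_relation(d, scope_item, relation):
    """Append to d[scope_item]; return True when the key is new."""
    if scope_item in d:
        d[scope_item].append(relation)
        return False
    # first time we see this key: room for several statements
    d[scope_item] = [relation]
    return True


d = {}
first = add_relation(d, "scope", "r1")
second = add_relation(d, "scope", "r2")
print(first, second, d)
```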


Answer 4

Personally, I just use JSON to convert things to strings and back. Strings I understand.

import json
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
mydict = {}
hash = json.dumps(s)
mydict[hash] = "whatever"
print mydict
#{'[["yellow", 1], ["blue", 2], ["yellow", 3], ["blue", 4], ["red", 1]]': 'whatever'}

Answer 5

An easy way is:

a = [1,2]
d = {}
for i in a:
  d[i]=[i, ]

print(d)
# {1: [1], 2: [2]}

How do I format a string using a dictionary in python-3.x?

Question: How do I format a string using a dictionary in python-3.x?

I am a big fan of using dictionaries to format strings. It helps me read the string format I am using as well as let me take advantage of existing dictionaries. For example:

class MyClass:
    def __init__(self):
        self.title = 'Title'

a = MyClass()
print 'The title is %(title)s' % a.__dict__

path = '/path/to/a/file'
print 'You put your file here: %(path)s' % locals()

However I cannot figure out the python 3.x syntax for doing the same (or if that is even possible). I would like to do the following

# Fails, KeyError 'latitude'
geopoint = {'latitude':41.123,'longitude':71.091}
print '{latitude} {longitude}'.format(geopoint)

# Succeeds
print '{latitude} {longitude}'.format(latitude=41.123,longitude=71.091)

Answer 0

Since the question is specific to Python 3, here’s using the new f-string syntax, available since Python 3.6:

>>> geopoint = {'latitude':41.123,'longitude':71.091}
>>> print(f'{geopoint["latitude"]} {geopoint["longitude"]}')
41.123 71.091

Note the outer single quotes and inner double quotes (you could also do it the other way around).


Answer 1

Is this good for you?

geopoint = {'latitude':41.123,'longitude':71.091}
print('{latitude} {longitude}'.format(**geopoint))

Answer 2

To unpack a dictionary into keyword arguments, use **. Also, new-style formatting supports referring to attributes of objects and items of mappings:

'{0[latitude]} {0[longitude]}'.format(geopoint)
'The title is {0.title}'.format(a) # the a from your first example
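To see why the attempt in the question fails while these two forms work, here is a small sketch. The variable names failed, unpacked and indexed are just for illustration.

```python
geopoint = {'latitude': 41.123, 'longitude': 71.091}

# Passed positionally, the dict is argument 0, so the name {latitude} is unknown.
try:
    '{latitude} {longitude}'.format(geopoint)
    failed = False
except KeyError:
    failed = True

unpacked = '{latitude} {longitude}'.format(**geopoint)     # keys become keywords
indexed = '{0[latitude]} {0[longitude]}'.format(geopoint)  # index into argument 0

print(failed, unpacked, indexed)
```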

Answer 3

As Python 3.0 and 3.1 are EOL’ed and no one uses them, you can and should use str.format_map(mapping) (Python 3.2+):

Similar to str.format(**mapping), except that mapping is used directly and not copied to a dict. This is useful if for example mapping is a dict subclass.

What this means is that you can use for example a defaultdict that would set (and return) a default value for keys that are missing:

>>> from collections import defaultdict
>>> vals = defaultdict(lambda: '<unset>', {'bar': 'baz'})
>>> 'foo is {foo} and bar is {bar}'.format_map(vals)
'foo is <unset> and bar is baz'

Even if the mapping provided is a dict, not a subclass, this would probably still be slightly faster.

The difference is not big though, given

>>> d = dict(foo='x', bar='y', baz='z')

then

>>> 'foo is {foo}, bar is {bar} and baz is {baz}'.format_map(d)

is about 10 ns (2 %) faster than

>>> 'foo is {foo}, bar is {bar} and baz is {baz}'.format(**d)

on my Python 3.4.3. The difference would probably be larger with more keys in the dictionary.


Note that the format language is much more flexible than that, though: replacement fields can contain index expressions, attribute accesses and so on, so you can format a whole object, or two of them:

>>> p1 = {'latitude':41.123,'longitude':71.091}
>>> p2 = {'latitude':56.456,'longitude':23.456}
>>> '{0[latitude]} {0[longitude]} - {1[latitude]} {1[longitude]}'.format(p1, p2)
'41.123 71.091 - 56.456 23.456'

Starting from 3.6 you can use the interpolated strings too:

>>> f'lat:{p1["latitude"]} lng:{p1["longitude"]}'
'lat:41.123 lng:71.091'

You just need to remember to use the other quote characters within the nested quotes. Another upside of this approach is that it is much faster than calling a formatting method.
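Because format_map uses the mapping directly, any dict subclass with a __missing__ hook works too, not only defaultdict. The sketch below uses a made-up KeepMissing class that leaves unknown placeholders untouched, and contrasts it with format(**mapping), which copies the entries into a plain dict first.

```python
class KeepMissing(dict):
    """Hypothetical helper: unknown placeholders are left in the output."""

    def __missing__(self, key):
        return '{' + key + '}'


template = 'foo is {foo} and bar is {bar}'
with_format_map = template.format_map(KeepMissing(bar='baz'))

# format(**mapping) copies the entries into a plain dict first, so
# __missing__ is never consulted and the lookup fails.
try:
    template.format(**KeepMissing(bar='baz'))
    raised = False
except KeyError:
    raised = True

print(with_format_map, raised)
```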


Answer 4

print("{latitude} {longitude}".format(**geopoint))

Answer 5

The Python 2 syntax works in Python 3 as well:

>>> class MyClass:
...     def __init__(self):
...         self.title = 'Title'
... 
>>> a = MyClass()
>>> print('The title is %(title)s' % a.__dict__)
The title is Title
>>> 
>>> path = '/path/to/a/file'
>>> print('You put your file here: %(path)s' % locals())
You put your file here: /path/to/a/file

Answer 6

geopoint = {'latitude':41.123,'longitude':71.091}

# working examples.
print(f'{geopoint["latitude"]} {geopoint["longitude"]}') # from above answer
print('{geopoint[latitude]} {geopoint[longitude]}'.format(geopoint=geopoint)) # alternate for format method  (including dict name in string).
print('%(latitude)s %(longitude)s'%geopoint) # thanks @tcll

Answer 7

Most answers formatted only the values of the dict.

If you want to also format the key into the string you can use dict.items():

geopoint = {'latitude':41.123,'longitude':71.091}
print("{} {}".format(*geopoint.items()))

Output:

('latitude', 41.123) ('longitude', 71.091)

If you want to format in an arbitrary way, that is, without showing the key-value pairs as tuples:

from functools import reduce
print("{} is {} and {} is {}".format(*reduce((lambda x, y: x + y), [list(item) for item in geopoint.items()])))

Output:

latitude is 41.123 and longitude is 71.091
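If the reduce call feels heavy, the same sentence can be sketched with a generator expression and str.join. This is an alternative, not the approach above; it relies on dicts preserving insertion order (Python 3.7+), and it works for any number of keys.

```python
geopoint = {'latitude': 41.123, 'longitude': 71.091}

# One "key is value" fragment per item, joined with " and ".
sentence = ' and '.join(f'{key} is {value}' for key, value in geopoint.items())
print(sentence)
```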


How to check if a variable is a dictionary in Python?

Question: How to check if a variable is a dictionary in Python?

How would you check if a variable is a dictionary in python?

For example, I’d like it to loop through the values in the dictionary until it finds a dictionary. Then, loop through the one it finds:

dict = {'abc': 'abc', 'def': {'ghi': 'ghi', 'jkl': 'jkl'}}
for k, v in dict.iteritems():
    if ###check if v is a dictionary:
        for k, v in v.iteritems():
            print(k, ' ', v)
    else:
        print(k, ' ', v)

Answer 0

You could use if type(ele) is dict or use isinstance(ele, dict) which would work if you had subclassed dict:

d = {'abc':'abc','def':{'ghi':'ghi','jkl':'jkl'}}
for ele in d.values():
    if isinstance(ele,dict):
       for k, v in ele.items():
           print(k,' ',v)

Answer 1

How would you check if a variable is a dictionary in Python?

This is an excellent question, but it is unfortunate that the most upvoted answer leads with a poor recommendation, type(obj) is dict.

(Note that you should also not use dict as a variable name – it’s the name of the builtin object.)

If you are writing code that will be imported and used by others, do not presume that they will use the dict builtin directly. Making that presumption makes your code more inflexible and, in this case, creates easily hidden bugs that would not make the program error out.

I strongly suggest, for the purposes of correctness, maintainability, and flexibility for future users, never having less flexible, unidiomatic expressions in your code when there are more flexible, idiomatic expressions.

is is a test for object identity. It does not support inheritance, it does not support any abstraction, and it does not support the interface.

So I will provide several options that do.

Supporting inheritance:

This is the first recommendation I would make, because it allows users to supply their own subclass of dict, or an OrderedDict, defaultdict, or Counter from the collections module:

if isinstance(any_object, dict):

But there are even more flexible options.

Supporting abstractions:

from collections.abc import Mapping

if isinstance(any_object, Mapping):

This allows the user of your code to use their own custom implementation of an abstract Mapping, which also includes any subclass of dict, and still get the correct behavior.

Use the interface

You commonly hear the OOP advice, “program to an interface”.

This strategy takes advantage of Python’s polymorphism or duck-typing.

So just attempt to access the interface, catching the specific expected errors (AttributeError in case there is no .items and TypeError in case items is not callable) with a reasonable fallback – and now any class that implements that interface will give you its items (note .iteritems() is gone in Python 3):

try:
    items = any_object.items()
except (AttributeError, TypeError):
    non_items_behavior(any_object)
else: # no exception raised
    for item in items: ...

Perhaps you might think using duck-typing like this goes too far in allowing for too many false positives, and it may be, depending on your objectives for this code.

Conclusion

Don’t use is to check types for standard control flow. Use isinstance, consider abstractions like Mapping or MutableMapping, and consider avoiding type-checking altogether, using the interface directly.
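
To make the conclusion concrete, here is a minimal sketch comparing the three checks side by side. The names Settings and describe are made up for this illustration and do not come from the answer above.

```python
from collections import OrderedDict
from collections.abc import Mapping


class Settings(Mapping):
    """Hypothetical read-only mapping that does not inherit from dict."""

    def __init__(self, **values):
        self._values = values

    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def describe(obj):
    """Report which of the three checks accept obj."""
    return {
        "type_is_dict": type(obj) is dict,
        "isinstance_dict": isinstance(obj, dict),
        "isinstance_mapping": isinstance(obj, Mapping),
    }


print(describe({}))               # accepted by all three checks
print(describe(OrderedDict()))    # rejected only by the identity check
print(describe(Settings(a=1)))    # accepted only by the Mapping check
```

Each check is strictly more permissive than the one before it, which is why the choice depends on how much of the dict behavior the calling code really needs.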


Answer 2

The OP did not exclude the starting variable, so for completeness here is how to handle the generic case of processing a supposed dictionary that may include items as dictionaries.

This also follows the recommended pure-Python (3.8) way to test for a dictionary given in the comments above.

from collections.abc import Mapping

# Named data rather than dict, so the builtin is not shadowed
data = {'abc': 'abc', 'def': {'ghi': 'ghi', 'jkl': 'jkl'}}

def parse_dict(in_dict):
    if isinstance(in_dict, Mapping):
        for k_outer, v_outer in in_dict.items():
            if isinstance(v_outer, Mapping):
                for k_inner, v_inner in v_outer.items():
                    print(k_inner, v_inner)
            else:
                print(k_outer, v_outer)

parse_dict(data)
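
The function above handles exactly one level of nesting. If the depth is not known in advance, a recursive generator is one possible generalisation. This is only a sketch; the names walk and nested are invented here.

```python
from collections.abc import Mapping


def walk(mapping, path=()):
    """Yield (path, value) pairs for every non-mapping value, at any depth."""
    for key, value in mapping.items():
        if isinstance(value, Mapping):
            # Descend into the nested mapping, remembering how we got here.
            yield from walk(value, path + (key,))
        else:
            yield path + (key,), value


nested = {'abc': 'abc', 'def': {'ghi': 'ghi', 'jkl': {'mno': 'mno'}}}
for path, value in walk(nested):
    print('.'.join(path), value)
```

Yielding the full path instead of only the innermost key keeps keys from different branches distinguishable.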

Converting a JSON string to a dictionary, not a list

Question: Converting a JSON string to a dictionary, not a list

I am trying to pass in a JSON file and convert the data into a dictionary.

So far, this is what I have done:

import json
json1_file = open('json1')
json1_str = json1_file.read()
json1_data = json.loads(json1_str)

I’m expecting json1_data to be a dict type but it actually comes out as a list type when I check it with type(json1_data).

What am I missing? I need this to be a dictionary so I can access one of the keys.


Answer 0

Your JSON is an array with a single object inside, so when you read it in you get a list with a dictionary inside. You can access your dictionary by accessing item 0 in the list, as shown below:

json1_data = json.loads(json1_str)[0]

Now you can access the data stored in datapoints just as you were expecting:

datapoints = json1_data['datapoints']

I have one more question if anyone can bite: I am trying to take the average of the first elements in these datapoints(i.e. datapoints[0][0]). Just to list them, I tried doing datapoints[0:5][0] but all I get is the first datapoint with both elements as opposed to wanting to get the first 5 datapoints containing only the first element. Is there a way to do this?

datapoints[0:5][0] doesn’t do what you’re expecting. datapoints[0:5] returns a new list slice containing just the first 5 elements, and then adding [0] on the end of it will take just the first element from that resulting list slice. What you need to use to get the result you want is a list comprehension:

[p[0] for p in datapoints[0:5]]

Here’s a simple way to calculate the mean:

sum(p[0] for p in datapoints[0:5])/5. # Result is 35.8

If you’re willing to install NumPy, then it’s even easier:

import json
import numpy
json1_file = open('json1')
json1_str = json1_file.read()
json1_data = json.loads(json1_str)[0]
datapoints = numpy.array(json1_data['datapoints'])
avg = datapoints[0:5,0].mean()
# avg is now 35.8

Using the , operator with the slicing syntax for NumPy’s arrays has the behavior you were originally expecting with the list slices.
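
For readers without the original file, the following sketch reproduces the situation with a small made-up payload (the numbers are not the asker's data). It uses statistics.mean from the standard library so that the divisor does not have to be hard-coded.

```python
import json
from statistics import mean

# Hypothetical payload: a JSON array holding a single object.
json1_str = '[{"datapoints": [[10.0, 1], [20.0, 2], [30.0, 3]]}]'

parsed = json.loads(json1_str)        # top-level JSON array -> Python list
json1_data = parsed[0]                # the single object -> Python dict
datapoints = json1_data["datapoints"]

# First element of each of the first five datapoints (only three exist here).
first_elements = [p[0] for p in datapoints[0:5]]
avg = mean(first_elements)

print(type(parsed).__name__, type(json1_data).__name__, avg)  # list dict 20.0
```

Because mean divides by the number of values actually present, the result stays correct when fewer than five datapoints are available.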


Answer 1

Here is a simple snippet that reads a JSON text file into a dictionary. Note that your JSON file must follow the JSON standard, so it has to use " double quotes rather than ' single quotes.

Your JSON dump.txt File:

{"test":"1", "test2":123}

Python Script:

import json
with open('/your/path/to/a/dict/dump.txt') as handle:
    dictdump = json.loads(handle.read())
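
The quoting rule can be checked without a file. This is a small sketch with two made-up strings, one valid and one using Python-style single quotes:

```python
import json

valid_text = '{"test": "1", "test2": 123}'
invalid_text = "{'test': '1', 'test2': 123}"  # single quotes: not valid JSON

dictdump = json.loads(valid_text)
print(dictdump["test2"])

error_message = None
try:
    json.loads(invalid_text)
except json.JSONDecodeError as exc:
    # json refuses the text instead of guessing what was meant.
    error_message = str(exc)
    print("not valid JSON:", error_message)
```

Text like invalid_text usually comes from writing str(some_dict) to a file; writing it with json.dump instead avoids the problem.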

Answer 2

You can use the following:

import json

with open('<yourFile>.json', 'r') as JSON:
    json_dict = json.load(JSON)

# Now you can use it like a dictionary
# For example:

print(json_dict["username"])

Answer 3

The best way to load JSON data into a dictionary is to use the built-in json loader.

Below is the sample snippet that can be used.

import json
f = open("data.json")
data = json.load(f)
f.close()
print(type(data))
# Replace the placeholder with a key that exists in your JSON file
print(data["<keyFromTheJsonFile>"])

Answer 4

I am working with Python code for a REST API, so this is for those who are working on similar projects.

I extract data from a URL using a POST request and the raw output is JSON. For some reason the output is already a dictionary, not a list, and I’m able to refer to the nested dictionary keys right away, like this:

datapoint_1 = json1_data['datapoints']['datapoint_1']

where datapoint_1 is inside the datapoints dictionary.
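
The likely reason is that the Python type follows the top-level JSON value: an object becomes a dict and an array becomes a list. Here is a short sketch with made-up payloads; top_level_type is a helper invented for this example.

```python
import json


def top_level_type(text):
    """Return the name of the Python type that json.loads produces for text."""
    return type(json.loads(text)).__name__


object_payload = '{"datapoints": {"datapoint_1": 42}}'
array_payload = '[{"datapoints": {"datapoint_1": 42}}]'

print(top_level_type(object_payload))  # dict
print(top_level_type(array_payload))   # list

# With an object at the top level, nested keys are reachable right away.
datapoint_1 = json.loads(object_payload)['datapoints']['datapoint_1']
print(datapoint_1)  # 42
```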


Answer 5

Pass the data using JavaScript AJAX with a GET request

    //javascript function
    function addnewcustomer(){ 
    //This function run when button click
    //get the value from input box using getElementById
            var new_cust_name = document.getElementById("new_customer").value;
            var new_cust_cont = document.getElementById("new_contact_number").value;
            var new_cust_email = document.getElementById("new_email").value;
            var new_cust_gender = document.getElementById("new_gender").value;
            var new_cust_cityname = document.getElementById("new_cityname").value;
            var new_cust_pincode = document.getElementById("new_pincode").value;
            var new_cust_state = document.getElementById("new_state").value;
            var new_cust_contry = document.getElementById("new_contry").value;
    //create a plain object (what Python would call a dictionary)
    var data = {"cust_name":new_cust_name, "cust_cont":new_cust_cont, "cust_email":new_cust_email, "cust_gender":new_cust_gender, "cust_cityname":new_cust_cityname, "cust_pincode":new_cust_pincode, "cust_state":new_cust_state, "cust_contry":new_cust_contry};
    //serialise the object to a JSON string with JSON.stringify
            data = JSON.stringify(data);
    //send the data to the server using JavaScript AJAX
            var send_data = new XMLHttpRequest();
    //percent-encode the JSON so characters such as & or # do not break the query string
            send_data.open("GET", "http://localhost:8000/invoice_system/addnewcustomer/?customerinfo="+encodeURIComponent(data),true);
            send_data.send();

            send_data.onreadystatechange = function(){
              if(send_data.readyState==4 && send_data.status==200){
                alert(send_data.responseText);
              }
            }
          }

Django view

    import json
    from django.http import HttpResponse

    def addNewCustomer(request):
        # Only GET requests are handled here.
        if request.method == "GET":
            # Read the JSON string sent by the JavaScript AJAX call.
            cust_info = request.GET.get("customerinfo")
            # Parse the JSON once; the result is a dictionary whose keys
            # (cust_name, cust_cont, ...) are the ones set in the JavaScript object.
            cust = json.loads(cust_info)
            cust_name = cust['cust_name']
            cust_cont = cust['cust_cont']
            cust_email = cust['cust_email']
            cust_gender = cust['cust_gender']
            cust_cityname = cust['cust_cityname']
            cust_pincode = cust['cust_pincode']
            cust_state = cust['cust_state']
            cust_contry = cust['cust_contry']
            # Print the values on the server.
            print(cust_name)
            print(cust_cont)
            print(cust_email)
            print(cust_gender)
            print(cust_cityname)
            print(cust_pincode)
            print(cust_state)
            print(cust_contry)
            return HttpResponse("Yes I am reach here.")
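
The round trip of a JSON string through a query parameter can be tried without a browser or Django. The sketch below uses only the standard library and a made-up customer record; in the real view, request.GET performs the decoding step.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical record; the "&" would corrupt an unencoded query string.
customer = {"cust_name": "Ann & Co", "cust_pincode": "12345"}

# Client side: serialise to JSON, then percent-encode it as a query value.
query = urlencode({"customerinfo": json.dumps(customer)})
url = "http://localhost:8000/invoice_system/addnewcustomer/?" + query

# Server side: decode the query string, then parse the JSON once.
raw = parse_qs(urlsplit(url).query)["customerinfo"][0]
cust = json.loads(raw)

print(cust["cust_name"], cust["cust_pincode"])
```

Parsing once and indexing the resulting dictionary avoids re-decoding the same string for every field.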

Dictionaries and default values

Question: Dictionaries and default values

Assuming connectionDetails is a Python dictionary, what’s the best, most elegant, most “pythonic” way of refactoring code like this?

if "host" in connectionDetails:
    host = connectionDetails["host"]
else:
    host = someDefaultValue

Answer 0

Like this:

host = connectionDetails.get('host', someDefaultValue)
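
A few details of .get() are easy to miss, so here is a small sketch with made-up connection details. The default is used only when the key is absent, a stored None is returned as is, and the dictionary is never modified.

```python
connectionDetails = {"port": 8080, "user": None}
someDefaultValue = "localhost"

host = connectionDetails.get("host", someDefaultValue)  # key absent -> default
user = connectionDetails.get("user", "anonymous")       # key present -> stored None
timeout = connectionDetails.get("timeout")              # no default given -> None

print(host, user, timeout)
print("host" in connectionDetails)  # False: get() did not add the key
```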

Answer 1

You can also use the defaultdict like so:

from collections import defaultdict
a = defaultdict(lambda: "default", key="some_value")
a["blabla"]  # => "default"
a["key"]  # => "some_value"

You can pass any ordinary function instead of lambda:

from collections import defaultdict
def a():
  return 4

b = defaultdict(a, key="some_value")
b['absent']  # => 4
b['key']  # => "some_value"
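
One side effect worth knowing: looking up a missing key with square brackets stores the default in the defaultdict, whereas .get() leaves it untouched. A minimal sketch:

```python
from collections import defaultdict

a = defaultdict(lambda: "default", key="some_value")

size_before = len(a)
value = a["blabla"]                 # calls the factory and stores the result
size_after_index = len(a)

peek = a.get("other", "fallback")   # plain dict.get: the factory is not used
size_after_get = len(a)

print(size_before, size_after_index, size_after_get)  # 1 2 2
```

If the dictionary must not grow during reads, .get() on a plain dict is the safer choice.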

Answer 2

While .get() is a nice idiom, it’s slower than if/else (and slower than try/except if presence of the key in the dictionary can be expected most of the time):

>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="try:\n a=d[1]\nexcept KeyError:\n a=10")
0.07691968797894333
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="try:\n a=d[2]\nexcept KeyError:\n a=10")
0.4583777282275605
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="a=d.get(1, 10)")
0.17784020746671558
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="a=d.get(2, 10)")
0.17952161730158878
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="if 1 in d:\n a=d[1]\nelse:\n a=10")
0.10071221458065338
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="if 2 in d:\n a=d[2]\nelse:\n a=10")
0.06966537335119938

Answer 3

For multiple different defaults try this:

connectionDetails = { "host": "www.example.com" }
defaults = { "host": "127.0.0.1", "port": 8080 }

completeDetails = {}
completeDetails.update(defaults)
completeDetails.update(connectionDetails)
completeDetails["host"]  # ==> "www.example.com"
completeDetails["port"]  # ==> 8080
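
On Python 3.5 and later, the same layering can be written more compactly. The sketch below shows two alternatives that are not part of the answer above: dictionary unpacking, which makes a copy, and collections.ChainMap, which is a live view over both dictionaries.

```python
from collections import ChainMap

connectionDetails = {"host": "www.example.com"}
defaults = {"host": "127.0.0.1", "port": 8080}

# Copy: later entries win, so connectionDetails overrides defaults.
merged = {**defaults, **connectionDetails}

# View: lookups try connectionDetails first, then defaults.
layered = ChainMap(connectionDetails, defaults)

print(merged["host"], merged["port"])    # www.example.com 8080
print(layered["host"], layered["port"])  # www.example.com 8080

# Later changes to defaults show through the view but not the copy.
defaults["port"] = 9090
print(merged["port"], layered["port"])   # 8080 9090
```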

Answer 4

There is a method in python dictionaries to do this: dict.setdefault

connectionDetails.setdefault('host',someDefaultValue)
host = connectionDetails['host']

However, unlike what the question asked, this method also sets the value of connectionDetails['host'] to someDefaultValue if the key host is not already defined.
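
The difference from .get() is easiest to see side by side. This is a small sketch with made-up dictionaries:

```python
someDefaultValue = "localhost"

details_get = {"port": 8080}
host_a = details_get.get("host", someDefaultValue)

details_setdefault = {"port": 8080}
host_b = details_setdefault.setdefault("host", someDefaultValue)

print(host_a, host_b)       # localhost localhost
print(details_get)          # {'port': 8080}
print(details_setdefault)   # {'port': 8080, 'host': 'localhost'}
```

Both calls return the same value; only setdefault writes the default back into the dictionary.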


Answer 5

(this is a late answer)

An alternative is to subclass the dict class and implement the __missing__() method, like this:

class ConnectionDetails(dict):
    def __missing__(self, key):
        if key == 'host':
            return "localhost"
        raise KeyError(key)

Examples:

>>> connection_details = ConnectionDetails(port=80)

>>> connection_details['host']
'localhost'

>>> connection_details['port']
80

>>> connection_details['password']
Traceback (most recent call last):
  File "python", line 1, in <module>
  File "python", line 6, in __missing__
KeyError: 'password'
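
A caveat with this approach: __missing__ is consulted only by square-bracket lookup. The sketch below repeats the class from above and shows that .get() and the in operator still treat host as absent.

```python
class ConnectionDetails(dict):
    def __missing__(self, key):
        if key == 'host':
            return "localhost"
        raise KeyError(key)


connection_details = ConnectionDetails(port=80)

via_index = connection_details['host']     # goes through __missing__
via_get = connection_details.get('host')   # bypasses __missing__
present = 'host' in connection_details     # the key is never stored

print(via_index, via_get, present)  # localhost None False
```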

Answer 6

Testing @Tim Pietzcker’s suspicion about the situation in PyPy (5.2.0-alpha0) for Python 3.3.5, I find that .get() and the if/else approach do indeed perform similarly. In fact, it seems that in the if/else case there is only a single lookup if the condition and the assignment involve the same key (compare with the last case, where there are two lookups).

>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="try:\n a=d[1]\nexcept KeyError:\n a=10")
0.011889292989508249
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="try:\n a=d[2]\nexcept KeyError:\n a=10")
0.07310474599944428
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="a=d.get(1, 10)")
0.010391917996457778
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="a=d.get(2, 10)")
0.009348208011942916
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 1 in d:\n a=d[1]\nelse:\n a=10")
0.011475925013655797
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 2 in d:\n a=d[2]\nelse:\n a=10")
0.009605801998986863
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 2 in d:\n a=d[2]\nelse:\n a=d[1]")
0.017342638995614834

Answer 7

You can use a lambda function for this as a one-liner. Make a new object connectionDetails2 which is accessed like a function…

connectionDetails2 = lambda k: connectionDetails[k] if k in connectionDetails.keys() else "DEFAULT"

Now use

connectionDetails2(k)

instead of

connectionDetails[k]

which returns the dictionary value if k is in the keys, otherwise it returns "DEFAULT"


How to print a dictionary's key?

Question: How to print a dictionary's key?

I would like to print a specific Python dictionary key:

mydic = {}
mydic['key_name'] = 'value_name'

Now I can check if mydic.has_key('key_name'), but what I would like to do is print the name of the key 'key_name'. Of course I could use mydic.items(), but I don’t want all the keys listed, merely one specific key. For instance I’d expect something like this (in pseudo-code):

print "the key name is", mydic['key_name'].name_the_key(), "and its value is", mydic['key_name']

Is there any name_the_key() method to print a key name?


Edit: OK, thanks a lot guys for your reactions! :) I realise my question is trivial and not well formulated. I just got confused because I realised key_name and mydic['key_name'] are two different things and I thought it would be incorrect to print the key_name out of the dictionary context. But indeed I can simply use the ‘key_name’ to refer to the key! :)


Answer 0

A dictionary has, by definition, an arbitrary number of keys. There is no “the key”. You have the keys() method, which gives you a python list of all the keys, and you have the iteritems() method, which returns key-value pairs, so

for key, value in mydic.iteritems() :
    print key, value

Python 3 version:

for key, value in mydic.items() :
    print (key, value)

So you have a handle on the keys, but they only really make sense if coupled to a value. I hope I have understood your question.


Answer 1

Additionally you can use….

print(dictionary.items()) #prints keys and values
print(dictionary.keys()) #prints keys
print(dictionary.values()) #prints values

Answer 2

Hmm, I think that what you might be wanting to do is print all the keys in the dictionary and their respective values?

If so you want the following:

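for key in mydic:
  print "the key name is " + key + " and its value is " + str(mydic[key])

Note that + joins the strings exactly as written, so the spaces have to be part of the string literals, and a non-string value must be converted with str() first. In a Python 2 print statement, a comma does not start a new line; it prints the items on the same line, separated by a space.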


Answer 3

dic = {"key 1":"value 1","key b":"value b"}

#print the keys:
for key in dic:
    print key

#print the values:
for value in dic.itervalues():
    print value

#print key and values
for key, value in dic.iteritems():
    print key, value

Note: In Python 3, dic.iteritems() was renamed to dic.items() and dic.itervalues() to dic.values(), and print is a function, so it needs parentheses.


Answer 4

The name of the key ‘key_name’ is key_name, therefore print 'key_name' or whatever variable you have representing it.


Answer 5

In Python 3:

# A simple dictionary
x = {'X':"yes", 'Y':"no", 'Z':"ok"}

# To print a specific key (for example key at index 1)
print([key for key in x.keys()][1])

# To print a specific value (for example value at index 1)
print([value for value in x.values()][1])

# To print a pair of a key with its value (for example pair at index 2)
print(([key for key in x.keys()][2], [value for value in x.values()][2]))

# To print a key and a different value (for example key at index 0 and value at index 1)
print(([key for key in x.keys()][0], [value for value in x.values()][1]))

# To print all keys and values concatenated together
print(''.join(str(key) + '' + str(value) for key, value in x.items()))

# To print all keys and values separated by commas
print(', '.join(str(key) + ', ' + str(value) for key, value in x.items()))

# To print all pairs of (key, value) one at a time
for e in range(len(x)):
    print(([key for key in x.keys()][e], [value for value in x.values()][e]))

# To print all pairs (key, value) in a tuple
print(tuple(([key for key in x.keys()][i], [value for value in x.values()][i]) for i in range(len(x))))

Answer 6

Since we’re all trying to guess what “print a key name” might mean, I’ll take a stab at it. Perhaps you want a function that takes a value from the dictionary and finds the corresponding key? A reverse lookup?

def key_for_value(d, value):
    """Return a key in `d` having a value of `value`."""
    for k, v in d.iteritems():
        if v == value:
            return k

Note that many keys could have the same value, so this function will return some key having the value, perhaps not the one you intended.

If you need to do this frequently, it would make sense to construct the reverse dictionary:

d_rev = dict((v, k) for k, v in d.iteritems())
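
For Python 3, the reverse dictionary might be built as in the sketch below. It also shows one way to keep every key when several keys share a value. The sample dictionary is made up, and the values are assumed to be hashable.

```python
from collections import defaultdict

d = {"a": 1, "b": 2, "c": 1}

# One key per value: a later key overwrites an earlier one with the same value.
d_rev = {v: k for k, v in d.items()}

# All keys per value, collected into lists.
keys_by_value = defaultdict(list)
for k, v in d.items():
    keys_by_value[v].append(k)

print(d_rev)                 # {1: 'c', 2: 'b'}
print(dict(keys_by_value))   # {1: ['a', 'c'], 2: ['b']}
```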

Answer 7

Or you can do it this way:

for key in my_dict:
     print key, my_dict[key]

Answer 8

# highlighting how to use a named variable within a string:
mapping = {'a': 1, 'b': 2}

# simple method:
print(f'a: {mapping["a"]}')
print(f'b: {mapping["b"]}')

# programmatic method:
for key, value in mapping.items():
    print(f'{key}: {value}')

# yields:
# a: 1
# b: 2

# using a generator expression with join
print('\n'.join(f'{key}: {value}' for key, value in mapping.items()))


# yields:
# a: 1
# b: 2

Edit: Updated for python 3’s f-strings…


Answer 9

Make sure to do

dictionary.keys()
# rather than
dictionary.keys

Answer 10

What’s wrong with using 'key_name' instead, even if it is a variable?


Answer 11

import pprint
pprint.pprint(mydic.keys())

Answer 12

dict = {'name' : 'Fred', 'age' : 100, 'employed' : True }

# Choose key to print (could be a user input)
x = 'name'

if x in dict.keys():
    print(x)

Answer 13

Probably the quickest way to retrieve only the key name:

mydic = {}
mydic['key_name'] = 'value_name'

print mydic.items()[0][0]

Result:

key_name

This converts the dictionary's items into a list of (key, value) pairs, takes the first pair, and then takes the first element of that pair, which is the key: key_name. Note that this works only in Python 2, where items() returns a list.


Answer 14

I looked up this question, because I wanted to know how to retrieve the name of “the key” if my dictionary only had one entry. In my case, the key was unknown to me and could be any number of things. Here is what I came up with:

dict1 = {'random_word': [1,2,3]}
key_name = str([key for key in dict1]).strip("'[]'")        
print(key_name)  # equal to 'random_word', type: string.
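
Going through str() and strip() changes keys that themselves start or end with a quote or bracket character, and turns non-string keys into strings. Two alternatives that return the key object unchanged are sketched below; they are offered as a suggestion and were not part of the original answer.

```python
dict1 = {'random_word': [1, 2, 3]}

# Take the first key from the dictionary's iterator.
key_name = next(iter(dict1))

# Or unpack; this raises ValueError unless there is exactly one key.
(only_key,) = dict1

print(key_name, only_key)  # random_word random_word
```

The unpacking form doubles as a check that the dictionary really has a single entry.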

Answer 15

Try this:

def name_the_key(dict, key):
    return key, dict[key]

mydict = {'key1':1, 'key2':2, 'key3':3}

key_name, value = name_the_key(mydict, 'key2')
print 'KEY NAME: %s' % key_name
print 'KEY VALUE: %s' % value

Answer 16

key_name = '...'
print "the key name is %s and its value is %s"%(key_name, mydic[key_name])

Answer 17

If you want to get the key of a single value, the following would help:

def get_key(b): # the value is passed to the function
    for k, v in mydic.items():
        if v.lower() == b.lower():
            return k

In a more Pythonic way:

# next() returns the first matching key, or the default message if there is none.
c = next((x for x, y in mydic.items() if y.lower() == b.lower()),
         "Enter a valid 'Value'")
print(c)

Answer 18

I’m adding this answer because one of the other answers here (https://stackoverflow.com/a/5905752/1904943) is dated (Python 2; iteritems), and the code it presents, if updated for Python 3 per the workaround suggested in a comment on that answer, silently fails to return all relevant data.


Background

I have some metabolic data, represented in a graph (nodes, edges, …). In a dictionary representation of those data, keys are of the form (604, 1037, 0) (representing source and target nodes, and the edge type), with values of the form 5.3.1.9 (representing EC enzyme codes).

Find keys for given values

The following code correctly finds my keys, given values:

def k4v_edited(my_dict, value):
    values_list = []
    for k, v in my_dict.items():
        if v == value:
            values_list.append(k)
    return values_list

print(k4v_edited(edge_attributes, '5.3.1.9'))
## [(604, 1037, 0), (604, 3936, 0), (1037, 3936, 0)]

whereas this code returns only the first (of possibly several matching) keys:

def k4v(my_dict, value):
    for k, v in my_dict.items():
        if v == value:
            return k

print(k4v(edge_attributes, '5.3.1.9'))
## (604, 1037, 0)

The latter code, naively updated by replacing iteritems with items, fails to return (604, 3936, 0) and (1037, 3936, 0).
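
If many such lookups are needed, scanning the whole dictionary each time gets expensive. One possible alternative is to build a reverse index once, sketched below. The sample edge_attributes is made up for illustration (the original data is not shown), invert is a hypothetical helper name, and the values are assumed to be hashable.

```python
from collections import defaultdict

# Hypothetical sample data in the shape described above:
# (source node, target node, edge type) -> EC enzyme code.
edge_attributes = {
    (604, 1037, 0): '5.3.1.9',
    (604, 3936, 0): '5.3.1.9',
    (1037, 3936, 0): '5.3.1.9',
    (2000, 2001, 0): '1.1.1.1',
}

def invert(my_dict):
    """Map each value to the list of all keys that carry it."""
    keys_by_value = defaultdict(list)
    for k, v in my_dict.items():
        keys_by_value[v].append(k)
    return keys_by_value

keys_by_value = invert(edge_attributes)
print(keys_by_value['5.3.1.9'])
```

After the one-time pass, each lookup is a single dictionary access and returns every matching key rather than just the first.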


Answer 19

To access the data, iterate over the dictionary; the loop variable takes each key in turn:

foo = {
    "foo0": "bar0",
    "foo1": "bar1",
    "foo2": "bar2",
    "foo3": "bar3"
}
for bar in foo:
  print(bar)

Or, to access the value for a key, index the dictionary with that key: foo[bar]
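
When both the key and the value are needed, a small sketch using items() avoids the second lookup. The shortened foo and the pairs list are only for illustration.

```python
foo = {
    "foo0": "bar0",
    "foo1": "bar1",
}

# items() yields (key, value) pairs, so no separate foo[key] lookup is needed.
pairs = []
for key, value in foo.items():
    pairs.append((key, value))
    print(key, value)
```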


What is the preferred syntax for initializing a dict: curly brace literals {} or the dict() function?

Question: What is the preferred syntax for initializing a dict: curly brace literals {} or the dict() function?

I’m putting in some effort to learn Python, and I am paying close attention to common coding standards. This may seem like a pointlessly nit-picky question, but I am trying to focus on best practices as I learn, so I don’t have to unlearn any ‘bad’ habits.

I see two common methods for initializing a dict:

a = {
    'a': 'value',
    'another': 'value',
}

b = dict( 
    a='value',
    another='value',
)

Which is considered to be “more pythonic”? Which do you use? Why?


Answer 0

Curly braces. Passing keyword arguments into dict(), though it works beautifully in a lot of scenarios, can only initialize a map if the keys are valid Python identifiers.

This works:

a = {'import': 'trade', 1: 7.8}
a = dict({'import': 'trade', 1: 7.8})

This won’t work:

a = dict(import='trade', 1=7.8)
# SyntaxError: invalid syntax
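
If dict() has to be used with such keys anyway, it also accepts an iterable of (key, value) pairs, which does not have the identifier restriction. A short sketch using the same keys as above; the variable b is only illustrative.

```python
# dict() called with an iterable of (key, value) pairs: keys may be any
# hashable object, including reserved words (as strings) and numbers.
a = dict([('import', 'trade'), (1, 7.8)])

# Keyword arguments can still be mixed in for keys that are valid identifiers.
b = dict([('import', 'trade')], another='value')

print(a)
print(b)
```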

Answer 1

The first, curly braces. Otherwise, you run into consistency issues with keys that have odd characters in them, like =.

# Works fine.
a = {
    'a': 'value',
    'b=c': 'value',
}

# Eeep! Breaks if trying to be consistent.
b = dict( 
    a='value',
    b=c='value',
)

Answer 2

The first version is preferable:

  • It works for all kinds of keys, so you can, for example, say {1: 'one', 2: 'two'}. The second variant only works for (some) string keys. Using different kinds of syntax depending on the type of the keys would be an unnecessary inconsistency.
  • It is faster:

    $ python -m timeit "dict(a='value', another='value')"
    1000000 loops, best of 3: 0.79 usec per loop
    $ python -m timeit "{'a': 'value','another': 'value'}"
    1000000 loops, best of 3: 0.305 usec per loop
    
  • If the special syntax for dictionary literals wasn’t intended to be used, it probably wouldn’t exist.
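
The timings above depend on the machine and the Python version. A hedged sketch for repeating the measurement from within Python using the timeit module; no output is shown because the numbers differ from run to run, and the names t_call and t_literal are only illustrative.

```python
import timeit

# Time each construction; number controls how many times the statement runs.
t_call = timeit.timeit("dict(a='value', another='value')", number=100000)
t_literal = timeit.timeit("{'a': 'value', 'another': 'value'}", number=100000)

print("dict() call: %.4f s per 100000 runs" % t_call)
print("literal:     %.4f s per 100000 runs" % t_literal)
```

The usual explanation for the gap is that the literal compiles to dedicated bytecode, while dict(...) needs a global name lookup and a function call.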

Answer 3

I think the first option is better because you are going to access the values as a['a'] or a['another']. The keys in your dictionary are strings, and there is no reason to pretend they are not. To me the keyword syntax looks clever at first, but obscure at a second look. This only makes sense to me if you are working with __dict__, and the keywords are going to become attributes later, something like that.


Answer 4

FYI, in case you need to add attributes to your dictionary (things that are attached to the dictionary, but are not one of the keys), then you’ll need the second form. In that case, you can initialize your dictionary with keys having arbitrary characters, one at a time, like so:

    class mydict(dict): pass
    a = mydict()        
    a["b=c"] = 'value'
    a.test = False

Answer 5

Sometimes dict() is a good choice:

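import random

# Pair weekday names with the numbers 1 to 5.
a = dict(zip(['Mon','Tue','Wed','Thu','Fri'], range(1, 6)))

# Pair weekday names with random integers between 0 and 100 (inclusive).
mydict = dict(zip(['mon','tue','wed','thu','fri','sat','sun'],
                  [random.randint(0, 100) for x in range(0, 7)]))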


Answer 6

I almost always use curly-braces; however, in some cases where I’m writing tests, I do keyword packing/unpacking, and in these cases dict() is much more maintainable, as I don’t need to change:

a=1,
b=2,

to:

'a': 1,
'b': 2,

It also helps in some circumstances where I think I might want to turn it into a namedtuple or class instance at a later time.

In the implementation itself, because of my obsession with optimisation, and when I don’t see a particularly huge maintainability benefit, I’ll always favour curly-braces.

In tests and the implementation, I would never use dict() if there is a chance that the keys added then, or in the future, would either:

  • Not always be a string
  • Not only contain digits, ASCII letters and underscores
  • Start with an integer (dict(1foo=2) raises a SyntaxError)
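
The three conditions in the list can be checked mechanically. A minimal sketch, where usable_as_dict_kwarg is a hypothetical helper name; note that str.isidentifier() also accepts non-ASCII letters, so it is slightly more permissive than the ASCII-only rule stated above.

```python
import keyword

def usable_as_dict_kwarg(key):
    """Return True if key could be written as dict(key=...)."""
    return (
        isinstance(key, str)
        and key.isidentifier()
        and not keyword.iskeyword(key)
    )

for candidate in ['a', 'another', 'b=c', '1foo', 'import', 1]:
    print(repr(candidate), usable_as_dict_kwarg(candidate))
```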

How to filter a dictionary according to an arbitrary condition function?

Question: How to filter a dictionary according to an arbitrary condition function?

I have a dictionary of points, say:

>>> points={'a':(3,4), 'b':(1,2), 'c':(5,5), 'd':(3,3)}

I want to create a new dictionary with all the points whose x and y value is smaller than 5, i.e. points ‘a’, ‘b’ and ‘d’.

According to the book, each dictionary has the items() function, which returns a list of (key, pair) tuples:

>>> points.items()
[('a', (3, 4)), ('c', (5, 5)), ('b', (1, 2)), ('d', (3, 3))]

So I have written this:

>>> for item in [i for i in points.items() if i[1][0]<5 and i[1][1]<5]:
...     points_small[item[0]]=item[1]
...
>>> points_small
{'a': (3, 4), 'b': (1, 2), 'd': (3, 3)}

Is there a more elegant way? I was expecting Python to have some super-awesome dictionary.filter(f) function…


Answer 0

Nowadays, in Python 2.7 and up, you can use a dict comprehension:

{k: v for k, v in points.iteritems() if v[0] < 5 and v[1] < 5}

And in Python 3:

{k: v for k, v in points.items() if v[0] < 5 and v[1] < 5}
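
To get closer to the dictionary.filter(f) the question asks for, the comprehension can be wrapped in a small function. This is only a sketch: filter_dict is a hypothetical helper name, not something built into Python.

```python
def filter_dict(d, predicate):
    """Return a new dict with the entries for which predicate(key, value) is true."""
    return {k: v for k, v in d.items() if predicate(k, v)}

points = {'a': (3, 4), 'b': (1, 2), 'c': (5, 5), 'd': (3, 3)}
points_small = filter_dict(points, lambda k, v: v[0] < 5 and v[1] < 5)
print(points_small)
```

The original dictionary is left untouched, so the same points can be filtered again with a different predicate.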

Answer 1

dict((k, v) for k, v in points.items() if all(x < 5 for x in v))

You could choose to call .iteritems() instead of .items() if you’re in Python 2 and points may have a lot of entries.

all(x < 5 for x in v) may be overkill if you know for sure each point will always be 2D only (in that case you might express the same constraint with an and) but it will work fine;-).


Answer 2

points_small = dict(filter(lambda (a,(b,c)): b<5 and c < 5, points.items()))
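
The lambda above unpacks its tuple argument in the parameter list, which Python 3 no longer allows (tuple parameters were removed by PEP 3113). A sketch of an equivalent for Python 3 that indexes into each (key, value) pair instead; the parameter name item is only illustrative.

```python
points = {'a': (3, 4), 'b': (1, 2), 'c': (5, 5), 'd': (3, 3)}

# Each item is a (key, (x, y)) pair; item[1] is the point itself.
points_small = dict(
    filter(lambda item: item[1][0] < 5 and item[1][1] < 5, points.items())
)
print(points_small)
```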

Answer 3

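>>> points = {'a': (3, 4), 'c': (5, 5), 'b': (1, 2), 'd': (3, 3)}
>>> dict(filter(lambda x: x[1][0] < 5 and x[1][1] < 5, points.items()))

{'a': (3, 4), 'b': (1, 2), 'd': (3, 3)}

Note that the two coordinates are compared separately: a tuple comparison such as (x, y) < (5, 5) is lexicographic and would also accept a point like (3, 7).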

Answer 4

dict((k, v) for (k, v) in points.iteritems() if v[0] < 5 and v[1] < 5)

Answer 5

I think that Alex Martelli’s answer is definitely the most elegant way to do this, but just wanted to add a way to satisfy your want for a super awesome dictionary.filter(f) method in a Pythonic sort of way:

class FilterDict(dict):
    def __init__(self, input_dict):
        for key, value in input_dict.items():
            self[key] = value
    def filter(self, criteria):
        # Iterate over a copy of the items, because entries are removed in the loop.
        for key, value in list(self.items()):
            # Drop every entry that does not satisfy the criteria.
            if not criteria(value):
                self.pop(key)

my_dict = FilterDict( {'a':(3,4), 'b':(1,2), 'c':(5,5), 'd':(3,3)} )
my_dict.filter(lambda x: x[0] < 5 and x[1] < 5)

Basically we create a class that inherits from dict, but adds the filter method. The filtering has to loop over a copy of the items (list(self.items()) here; in Python 2, .items() already returns such a list), since removing entries while iterating over the dictionary itself, for example with .iteritems(), raises an exception.


Answer 6

dict((k, v) for (k, v) in points.iteritems() if v[0] < 5 and v[1] < 5)