Should I use a class or a dictionary?

Question: Should I use a class or a dictionary?

I have a class that contains only fields and no methods, like this:

import wsgiref.util

class Request(object):

    def __init__(self, environ):
        self.environ = environ
        self.request_method = environ.get('REQUEST_METHOD', None)
        self.url_scheme = environ.get('wsgi.url_scheme', None)
        self.request_uri = wsgiref.util.request_uri(environ)
        self.path = environ.get('PATH_INFO', None)
        # ...

This could easily be translated to a dict. The class is more flexible for future additions and could be fast with __slots__. So would there be a benefit of using a dict instead? Would a dict be faster than a class? And faster than a class with slots?


Answer 0

Why would you make this a dictionary? What’s the advantage? What happens if you later want to add some code? Where would your __init__ code go?

Classes are for bundling related data (and usually code).

Dictionaries are for storing key-value relationships, where usually the keys are all of the same type, and all the values are also of one type. Occasionally they can be useful for bundling data when the key/attribute names are not all known up front, but often this is a sign that something’s wrong with your design.

Keep this a class.


Answer 1

Use a dictionary unless you need the extra mechanism of a class. You could also use a namedtuple for a hybrid approach:

>>> from collections import namedtuple
>>> Request = namedtuple("Request", "environ request_method url_scheme")
>>> Request
<class '__main__.Request'>
>>> request = Request(environ="foo", request_method="GET", url_scheme="http")
>>> request.environ
'foo'

Performance differences here will be minimal, although I would be surprised if the dictionary wasn’t faster.
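
A small sketch of how a namedtuple behaves once it is instantiated, assuming Python 3. The field values are made up for illustration. Instances are immutable, so "changing" a field means building a new tuple with _replace:

```python
from collections import namedtuple

# Field names follow the question; the values are illustrative only.
Request = namedtuple("Request", "environ request_method url_scheme")

req = Request(environ={}, request_method="GET", url_scheme="http")

try:
    req.request_method = "POST"   # fields are read-only on instances
except AttributeError:
    mutated = False
else:
    mutated = True

# _replace returns a new instance; the original is untouched.
post_req = req._replace(request_method="POST")

# _asdict gives a plain mapping when dict-style access is needed.
as_mapping = req._asdict()
```

If the fields need to change after creation, a regular class or a dict is probably the better fit.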


Answer 2

A class in Python is a dict underneath: by default, each instance stores its attributes in its own __dict__. You do get some overhead with the class behavior, but you won’t be able to notice it without a profiler. In this case, I believe you benefit from the class because:

  • All your logic lives in a single function
  • It is easy to update and stays encapsulated
  • If you change anything later, you can easily keep the interface the same
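
A rough sketch of both points, assuming Python 3: the instance attributes sit in an ordinary dict, and behavior can be added later without changing how callers read the object. The is_post property is a hypothetical addition, not something from the question:

```python
class Request:
    def __init__(self, environ):
        self.environ = environ
        self.request_method = environ.get("REQUEST_METHOD")

    @property
    def is_post(self):
        # Hypothetical helper added later; callers still use attribute access.
        return self.request_method == "POST"


req = Request({"REQUEST_METHOD": "POST"})

# Without __slots__, instance attributes are stored in a per-instance dict.
attrs = vars(req)
```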

Answer 3

I think that the usage of each one is way too subjective for me to get in on that, so I’ll just stick to numbers.

I compared the time it takes to create and to change a variable in a dict, a new-style class and a new-style class with slots.

Here’s the code I used to test it (it’s a bit messy but it does the job).

import timeit

class Foo(object):

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

def create_dict():

    foo_dict = {}
    foo_dict['foo1'] = 'test'
    foo_dict['foo2'] = 'test'
    foo_dict['foo3'] = 'test'

    return foo_dict

class Bar(object):
    __slots__ = ['foo1', 'foo2', 'foo3']

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

tmit = timeit.timeit

# print(...) with a single argument runs on both Python 2 and Python 3.
print('Creating...\n')
print('Dict: ' + str(tmit('create_dict()', 'from __main__ import create_dict')))
print('Class: ' + str(tmit('Foo()', 'from __main__ import Foo')))
print('Class with slots: ' + str(tmit('Bar()', 'from __main__ import Bar')))

print('\nChanging a variable...\n')

# Each figure is (create + assign) minus (create), taken from two separate runs.
print('Dict: ' + str((tmit('create_dict()[\'foo3\'] = "Changed"', 'from __main__ import create_dict') - tmit('create_dict()', 'from __main__ import create_dict'))))
print('Class: ' + str((tmit('Foo().foo3 = "Changed"', 'from __main__ import Foo') - tmit('Foo()', 'from __main__ import Foo'))))
print('Class with slots: ' + str((tmit('Bar().foo3 = "Changed"', 'from __main__ import Bar') - tmit('Bar()', 'from __main__ import Bar'))))

And here is the output…

Creating...

Dict: 0.817466186345
Class: 1.60829183597
Class with slots: 1.28776730003

Changing a variable...

Dict: 0.0735140918748
Class: 0.111714198313
Class with slots: 0.10618612142

So, if you’re just storing variables, you need speed, and you won’t be doing many calculations, I recommend using a dict (you could always just make a function that looks like a method). But if you really need classes, remember: always use __slots__.

Note:

I tested the ‘Class’ with both new-style and old-style classes (old-style classes exist only in Python 2). It turns out that old-style classes are faster to create but slower to modify (not by much, but it becomes significant if you’re creating lots of instances in a tight loop; tip: you’re doing it wrong).

Also the times for creating and changing variables may differ on your computer since mine is old and slow. Make sure you test it yourself to see the ‘real’ results.
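
The script above gets each "changing a variable" figure by subtracting two separate timing runs, which can be noisy. A possible alternative, sketched here under the assumption of Python 3.6 or later, is to build the object once in the timeit setup and time only the assignment. The function name time_attribute_set and the smaller iteration count are my own choices, and the absolute numbers will differ between machines and interpreter versions:

```python
import timeit


class Foo:
    def __init__(self):
        self.foo1 = self.foo2 = self.foo3 = "test"


class Bar:
    __slots__ = ("foo1", "foo2", "foo3")

    def __init__(self):
        self.foo1 = self.foo2 = self.foo3 = "test"


def create_dict():
    return {"foo1": "test", "foo2": "test", "foo3": "test"}


def time_attribute_set(number=100_000):
    """Time only the assignment; the object is built once in setup."""
    namespace = {"Foo": Foo, "Bar": Bar, "create_dict": create_dict}
    cases = {
        "dict": ("d = create_dict()", "d['foo3'] = 'Changed'"),
        "class": ("o = Foo()", "o.foo3 = 'Changed'"),
        "class with slots": ("o = Bar()", "o.foo3 = 'Changed'"),
    }
    return {
        label: timeit.timeit(stmt, setup=setup, number=number, globals=namespace)
        for label, (setup, stmt) in cases.items()
    }


results = time_attribute_set()
```

Printing results gives one number per container; repeating the measurement a few times and taking the minimum is a common way to reduce noise further.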

Edit:

I later tested the namedtuple: I can’t modify it, but creating the 10000 samples (or something like that) took 1.4 seconds, so the dictionary is indeed the fastest.

If I change create_dict to build the dict with its keys and values in a single literal and return it directly, rather than assigning it to a variable first, it takes 0.65 instead of 0.8 seconds.

class Foo(dict):
    pass

With a dict subclass like this, creation takes about as long as a class with slots, and changing a variable is the slowest of all (0.17 seconds), so do not use these classes. Go for a dict (speed) or for a class derived from object (‘syntax candy’).


Answer 4

I agree with @adw. I would never represent an “object” (in an OO sense) with a dictionary. Dictionaries aggregate name/value pairs. Classes represent objects. I’ve seen code where the objects are represented with dictionaries and it’s unclear what the actual shape of the thing is. What happens when certain name/values aren’t there? What restricts the client from putting anything at all in, or from trying to get anything at all out? The shape of the thing should always be clearly defined.

When using Python it is important to build with discipline as the language allows many ways for the author to shoot him/herself in the foot.
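
To make the "shape" argument concrete, here is a minimal sketch assuming Python 3.7 or later. It uses a dataclass, which is my choice of tool and not something this answer names. RequestInfo and its fields are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class RequestInfo:
    # Hypothetical fields, named after the question's attributes.
    request_method: str
    path: str


# A dict accepts any keys, so a typo or a missing entry goes unnoticed
# until something tries to read it.
as_dict = {"request_method": "GET", "pth": "/index"}   # note the typo "pth"
dict_has_path = "path" in as_dict

# The class rejects an unknown or missing field at construction time.
try:
    RequestInfo(request_method="GET", pth="/index")
except TypeError:
    rejected = True
else:
    rejected = False

ok = RequestInfo(request_method="GET", path="/index")
```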


Answer 5

I would recommend a class, as it is all sorts of information involved with a request. Were one to use a dictionary, I’d expect the data stored to be far more similar in nature. A guideline I tend to follow myself is that if I may want to loop over the entire set of key->value pairs and do something, I use a dictionary. Otherwise, the data apparently has far more structure than a basic key->value mapping, meaning a class would likely be a better alternative.

Hence, stick with the class.
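
A short illustration of that guideline, with invented data: values of one kind that get looped over sit naturally in a dict, while differently-typed fields that are read by name sit naturally in a class.

```python
# Homogeneous data that is naturally iterated over: a dict fits.
hits_per_path = {"/": 120, "/about": 15, "/contact": 4}   # made-up numbers
total_hits = sum(hits_per_path.values())
busiest = max(hits_per_path, key=hits_per_path.get)


# Heterogeneous fields that are accessed by name: a class fits.
class Request:
    def __init__(self, method, path, scheme):
        self.method = method
        self.path = path
        self.scheme = scheme


req = Request("GET", "/about", "https")
hits_for_request = hits_per_path[req.path]
```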


Answer 6

If all that you want to achieve is syntax candy like obj.bla = 5 instead of obj['bla'] = 5, especially if you have to repeat that a lot, you may want to use some plain container class as in martineau’s suggestion. Nevertheless, the code there is quite bloated and unnecessarily slow. You can keep it as simple as this:

class AttrDict(dict):
    """ Syntax candy """
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
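
One caveat with the three-line AttrDict above: a missing attribute raises KeyError rather than AttributeError, so in Python 3 hasattr and getattr with a default let the error propagate instead of handling it. A sketch of a variant that converts the exception follows. The name SafeAttrDict is mine, and it is slightly slower because of the extra Python-level methods:

```python
class SafeAttrDict(dict):
    """dict with attribute-style access that raises AttributeError."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


d = SafeAttrDict(bla=5)
d.blub = 6
```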

Another reason to switch to namedtuples or a class with __slots__ could be memory usage. Dicts require significantly more memory than list types, so this could be a point to think about.
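
The memory saving with __slots__ comes from instances not carrying a per-instance dict. A minimal sketch, assuming Python 3, with a hypothetical SlottedRequest class. Exact sizes depend on the interpreter version, so none are shown:

```python
class SlottedRequest:
    # Hypothetical class; attribute names follow the question.
    __slots__ = ("request_method", "path")

    def __init__(self, request_method, path):
        self.request_method = request_method
        self.path = path


req = SlottedRequest("GET", "/index")

# No per-instance dict is allocated for a slotted class.
has_instance_dict = hasattr(req, "__dict__")

try:
    req.url_scheme = "http"      # not declared in __slots__
except AttributeError:
    can_add_attribute = False
else:
    can_add_attribute = True
```

The trade-off is visible in the last step: attributes that are not declared in __slots__ cannot be added later.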

Anyway, in your specific case, there doesn’t seem to be any motivation to switch away from your current implementation. You don’t seem to maintain millions of these objects, so no list-derived types are required. And it actually contains some functional logic within the __init__, so you also shouldn’t go with AttrDict.


Answer 7

It may be possible to have your cake and eat it, too. In other words you can create something that provides the functionality of both a class and dictionary instance. See ActiveState’s "Dictionary with attribute-style access" recipe and the comments on ways of doing that.

If you decide to use a regular class rather than a subclass, I’ve found the "The simple but handy 'collector of a bunch of named stuff' class" recipe (by Alex Martelli) to be very flexible and useful for the sort of thing it looks like you’re doing (i.e. creating a relatively simple aggregator of information). Since it’s a class you can easily extend its functionality further by adding methods.

Lastly, it should be noted that the names of class members must be legal Python identifiers, but dictionary keys need not be, so a dictionary would provide greater freedom in that regard because keys can be anything hashable (even something that’s not a string).

Update

A subclass of object named SimpleNamespace was added to the types module in Python 3.3, and is yet another alternative. Unlike plain object instances, which don’t have a __dict__, SimpleNamespace instances do have one.
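
A brief sketch of SimpleNamespace in use, assuming Python 3.3 or later, together with the earlier point about dictionary keys. The field names follow the question and the values are illustrative:

```python
from types import SimpleNamespace

# Field names follow the question; values are illustrative.
req = SimpleNamespace(request_method="GET", url_scheme="http")
req.path = "/index"              # attributes can be added freely

# The attributes live in an ordinary instance dict.
fields = vars(req)

# Dictionary keys, unlike attribute names, may be any hashable object.
by_key = {("GET", "/index"): "handler_a", 404: "not found"}
```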


Answer 8

class ClassWithSlotBase:
    __slots__ = ('a', 'b',)

    def __init__(self):
        self.a: str = "test"
        self.b: float = 0.0


def test_type_hint(_b: float) -> None:
    print(_b)


class_tmp = ClassWithSlotBase()

# A type checker or IDE flags this call: `a` is a str, but a float is expected.
test_type_hint(class_tmp.a)

I recommend a class. With a class you get type hints, as shown above, and editors can offer autocompletion when an instance is passed as a function argument.