Tag archive: class-design

How to design a class in Python?

Question: How to design a class in Python?

I’ve had some really awesome help on my previous questions for detecting paws and toes within a paw, but all these solutions only work for one measurement at a time.

Now I have data that consists of:

  • about 30 dogs;
  • each has 24 measurements (divided into several subgroups);
  • each measurement has at least 4 contacts (one for each paw) and
    • each contact is divided into 5 parts and
    • has several parameters, like contact time, location, total force etc.

Obviously sticking everything into one big object isn’t going to cut it, so I figured I needed to use classes instead of the current slew of functions. But even though I’ve read Learning Python’s chapter about classes, I fail to apply it to my own code (GitHub link)

I also feel like it’s rather strange to process all the data every time I want to get out some information. Once I know the locations of each paw, there’s no reason for me to calculate this again. Furthermore, I want to compare all the paws of the same dog to determine which contact belongs to which paw (front/hind, left/right). This would become a mess if I continue using only functions.

So now I’m looking for advice on how to create classes that will let me process my data (link to the zipped data of one dog) in a sensible fashion.


Answer 0

How to design a class.

  1. Write down the words. You started to do this. Some people don’t and wonder why they have problems.

  2. Expand your set of words into simple statements about what these objects will be doing. That is to say, write down the various calculations you’ll be doing on these things. Your short list of 30 dogs, 24 measurements, 4 contacts, and several “parameters” per contact is interesting, but only part of the story. Your “locations of each paw” and “compare all the paws of the same dog to determine which contact belongs to which paw” are the next step in object design.

  3. Underline the nouns. Seriously. Some folks debate the value of this, but I find that for first-time OO developers it helps. Underline the nouns.

  4. Review the nouns. Generic nouns like “parameter” and “measurement” need to be replaced with specific, concrete nouns that apply to your problem in your problem domain. Specifics help clarify the problem. Generics simply elide details.

  5. For each noun (“contact”, “paw”, “dog”, etc.) write down the attributes of that noun and the actions in which that object engages. Don’t short-cut this. Every attribute. “Data Set contains 30 Dogs” for example is important.

  6. For each attribute, identify if this is a relationship to a defined noun, or some other kind of “primitive” or “atomic” data like a string or a float or something irreducible.

  7. For each action or operation, you have to identify which noun has the responsibility, and which nouns merely participate. It’s a question of “mutability”. Some objects get updated, others don’t. Mutable objects must own total responsibility for their mutations.

  8. At this point, you can start to transform nouns into class definitions. Some collective nouns are lists, dictionaries, tuples, sets or namedtuples, and you don’t need to do very much work. Other classes are more complex, either because of complex derived data or because of some update/mutation which is performed.

Don’t forget to test each class in isolation using unittest.

Also, there’s no law that says classes must be mutable. In your case, for example, you have almost no mutable data. What you have is derived data, created by transformation functions from the source dataset.
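
As a minimal sketch of that last point, the nouns from the question could become small immutable classes whose derived data is computed from the source values. The class and field names below (Contact, Measurement, total_force and so on) are assumptions made for illustration; they are not taken from the asker's code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Contact:
    """One paw contact; the field names are hypothetical."""
    contact_time: float          # seconds the paw touched the plate
    location: Tuple[int, int]    # (row, column) on the pressure plate
    total_force: float           # summed force over the contact


@dataclass(frozen=True)
class Measurement:
    """One trial: an immutable collection of contacts."""
    contacts: Tuple[Contact, ...]

    def total_force(self) -> float:
        # Derived data: computed from the source values, never stored or mutated.
        return sum(c.total_force for c in self.contacts)


m = Measurement(contacts=(
    Contact(0.25, (10, 12), 30.0),
    Contact(0.30, (40, 15), 45.0),
))
print(m.total_force())  # 75.0
```

Because both classes are frozen, any attempt to assign to an attribute after construction raises an error, which matches the "almost no mutable data" observation.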


Answer 1

The following advice (similar to @S.Lott’s) comes from the book Beginning Python: From Novice to Professional:

  1. Write down a description of your problem (what should the program do?). Underline all the nouns, verbs, and adjectives.

  2. Go through the nouns, looking for potential classes.

  3. Go through the verbs, looking for potential methods.

  4. Go through the adjectives, looking for potential attributes.

  5. Allocate methods and attributes to your classes.

To refine the class, the book also advises we can do the following:

  1. Write down (or dream up) a set of use cases, that is, scenarios of how your program may be used. Try to cover all the functionality.

  2. Think through every use case step by step, making sure that everything we need is covered.
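
As a rough illustration of that mapping, take the made-up sentence "a dog places a left front paw on the plate": the nouns suggest classes, the verb suggests a method, and the adjectives suggest attributes. All names here are invented for the example.

```python
class Paw:
    def __init__(self, side, position):
        self.side = side            # adjective: "left" or "right"
        self.position = position    # adjective: "front" or "hind"


class Dog:
    def __init__(self, name):
        self.name = name
        self.placed_paws = []

    def place(self, paw):
        # The verb "places" becomes a method on the noun that performs it.
        self.placed_paws.append(paw)


dog = Dog("Charly")
dog.place(Paw("left", "front"))
print(dog.placed_paws[0].side, dog.placed_paws[0].position)  # left front
```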


Answer 2

I like the TDD approach… So start by writing tests for what you want the behaviour to be. And write code that passes. At this point, don’t worry too much about design, just get a test suite and software that passes. Don’t worry if you end up with a single big ugly class, with complex methods.

Sometimes, during this initial process, you’ll find a behaviour that is hard to test and needs to be decomposed, just for testability. This may be a hint that a separate class is warranted.

Then the fun part… refactoring. After you have working software you can see the complex pieces. Often little pockets of behaviour will become apparent, suggesting a new class, but if not, just look for ways to simplify the code. Extract service objects and value objects. Simplify your methods.

If you’re using git properly (you are using git, aren’t you?), you can very quickly experiment with some particular decomposition during refactoring, and then abandon it and revert back if it doesn’t simplify things.

By writing tested working code first you should gain an intimate insight into the problem domain that you couldn’t easily get with the design-first approach. Writing tests and code push you past that “where do I begin” paralysis.
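
To make the test-first step concrete, here is a small sketch using unittest. The stride_length helper and its behaviour are assumptions chosen for illustration; the point is only that the tests state the wanted behaviour before any design work happens.

```python
import unittest


def stride_length(first_x, second_x):
    """Distance between two successive contacts of one paw (hypothetical helper)."""
    return abs(second_x - first_x)


class StrideLengthTest(unittest.TestCase):
    # Written first: these tests describe the behaviour we want.
    def test_forward_step(self):
        self.assertEqual(stride_length(10, 35), 25)

    def test_direction_does_not_matter(self):
        self.assertEqual(stride_length(35, 10), 25)


# Run the suite programmatically so the sketch works in any context.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StrideLengthTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once tests like these pass, the function can be moved into whatever class the refactoring step suggests, with the tests guarding against regressions.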


Answer 3

The whole idea of OO design is to make your code map to your problem, so when, for example, you want the first footstep of a dog, you do something like:

dog.footstep(0)

Now, it may be that for your case you need to read in your raw data file and compute the footstep locations. All this could be hidden in the footstep() function so that it only happens once. Something like:

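class Dog:
    def __init__(self):
        self._footsteps = None  # not loaded yet

    def footstep(self, n):
        # Load on first access only. Testing "is None" (rather than
        # truthiness) keeps an empty result from triggering a reload.
        if self._footsteps is None:
            self.readInFootsteps(...)  # expected to fill self._footsteps
        return self._footsteps[n]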

[This is now a sort of caching pattern. The first time it goes and reads the footstep data, subsequent times it just gets it from self._footsteps.]

But yes, getting OO design right can be tricky. Think more about the things you want to do to your data, and that will inform what methods you’ll need to apply to what classes.
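
On Python 3.8 or later, the same caching idea can be sketched with functools.cached_property. The file name and the hard-coded footstep list are placeholders standing in for the real file parsing, and load_count exists only to show that the expensive step runs once.

```python
from functools import cached_property


class Dog:
    def __init__(self, raw_path):
        self.raw_path = raw_path
        self.load_count = 0  # only here to show that loading happens once

    @cached_property
    def footsteps(self):
        # Stand-in for reading the raw file and computing footstep locations.
        self.load_count += 1
        return [(0, 0), (12, 3), (25, 5)]

    def footstep(self, n):
        return self.footsteps[n]


dog = Dog("dog1.dat")   # hypothetical file name; nothing is read yet
dog.footstep(0)
dog.footstep(2)
print(dog.load_count)   # 1
```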


Answer 4

Writing out your nouns, verbs, and adjectives is a great approach, but I prefer to think of class design as asking the question: what data should be hidden?

Imagine you had a Query object and a Database object:

The Query object will help you create and store a query. Store is the key word here, as a function could help you create one just as easily. Maybe you could say: Query().select('Country').from_table('User').where('Country == "Brazil"'). The exact syntax doesn’t matter (that is your job!); the key is that the object is helping you hide something, in this case the data necessary to store and output a query. The power of the object comes from the syntax of using it (in this case some clever chaining) and from not needing to know what it stores to make it work. If done right, the Query object could output queries for more than one database. Internally it would store a specific format, but it could easily convert to other formats when outputting (Postgres, MySQL, MongoDB).
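
A minimal sketch of such a chaining Query class might look like this. The internal attribute names and the to_sql method are assumptions for illustration, and the string-built SQL is only for demonstration (a real implementation would use parameterized queries).

```python
class Query:
    """Stores the pieces of a query; callers never touch the internals."""

    def __init__(self):
        self._columns = []
        self._table = None
        self._conditions = []

    def select(self, *columns):
        self._columns.extend(columns)
        return self  # returning self is what makes chaining possible

    def from_table(self, table):
        self._table = table
        return self

    def where(self, condition):
        self._conditions.append(condition)
        return self

    def to_sql(self):
        # One possible output format; other backends could get their own method.
        sql = "SELECT {} FROM {}".format(", ".join(self._columns), self._table)
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        return sql


q = Query().select("Country").from_table("User").where("Country = 'Brazil'")
print(q.to_sql())  # SELECT Country FROM User WHERE Country = 'Brazil'
```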

Now let’s think through the Database object. What does this hide and store? Well clearly it can’t store the full contents of the database, since that is why we have a database! So what is the point? The goal is to hide how the database works from people who use the Database object. Good classes will simplify reasoning when manipulating internal state. For this Database object you could hide how the networking calls work, or batch queries or updates, or provide a caching layer.

The problem is this Database object is HUGE. It represents how to access a database, so under the covers it could do anything and everything. Clearly networking, caching, and batching are quite hard to deal with depending on your system, so hiding them away would be very helpful. But, as many people will note, a database is insanely complex, and the further from the raw DB calls you get, the harder it is to tune for performance and understand how things work.

This is the fundamental tradeoff of OOP. If you pick the right abstraction, it makes coding simpler (String, Array, Dictionary); if you pick an abstraction that is too big (Database, EmailManager, NetworkingManager), it may become too complex to really understand how it works or what to expect. The goal is to hide complexity, but some complexity is necessary. A good rule of thumb is to start out avoiding Manager objects and instead create classes that are like structs: all they do is hold data, with some helper methods to create and manipulate that data to make your life easier. For example, instead of an EmailManager, start with a function called sendEmail that takes an Email object. This is a simple starting point, and the code is very easy to understand.

As for your example, think about what data needs to be together to calculate what you are looking for. If you wanted to know how far an animal was walking, for example, you could have AnimalStep and AnimalTrip (collection of AnimalSteps) classes. Now that each Trip has all the Step data, then it should be able to figure stuff out about it, perhaps AnimalTrip.calculateDistance() makes sense.
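
Here is one way that idea could look, as a sketch only. The x/y fields are assumed (the answer does not say what a step stores), and the method is spelled calculate_distance to follow Python naming conventions.

```python
import math
from typing import List, NamedTuple


class AnimalStep(NamedTuple):
    x: float  # assumed coordinates of one step
    y: float


class AnimalTrip:
    def __init__(self, steps: List[AnimalStep]):
        self.steps = list(steps)

    def calculate_distance(self) -> float:
        # Sum the straight-line distances between consecutive steps.
        return sum(
            math.hypot(b.x - a.x, b.y - a.y)
            for a, b in zip(self.steps, self.steps[1:])
        )


trip = AnimalTrip([AnimalStep(0, 0), AnimalStep(3, 4), AnimalStep(3, 10)])
print(trip.calculate_distance())  # 11.0
```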


Answer 5

After skimming your linked code, it seems to me that you are better off not designing a Dog class at this point. Rather, you should use Pandas and dataframes. A dataframe is a table with columns. Your dataframe would have columns such as dog_id, contact_part, contact_time, contact_location, etc. Pandas uses NumPy arrays behind the scenes, and it has many convenience methods for you:

  • Select a dog with a boolean mask, e.g.: my_measurements[my_measurements['dog_id'] == 'Charly']
  • Save the data: my_measurements.to_pickle('filename.pickle')
  • Consider using pandas.read_csv() instead of manually reading the text files.

Why is __init__() always called after __new__()?

Question: Why is __init__() always called after __new__()?

I’m just trying to streamline one of my classes and have introduced some functionality in the same style as the flyweight design pattern.

However, I’m a bit confused as to why __init__ is always called after __new__. I wasn’t expecting this. Can anyone tell me why this is happening and how I can implement this functionality otherwise? (Apart from putting the implementation into the __new__ which feels quite hacky.)

Here’s an example:

class A(object):
    _dict = dict()

    def __new__(cls):
        if 'key' in A._dict:
            print "EXISTS"
            return A._dict['key']
        else:
            print "NEW"
            return super(A, cls).__new__(cls)

    def __init__(self):
        print "INIT"
        A._dict['key'] = self
        print ""

a1 = A()
a2 = A()
a3 = A()

Outputs:

NEW
INIT

EXISTS
INIT

EXISTS
INIT

Why?


Answer 0

Use __new__ when you need to control the creation of a new instance.

Use __init__ when you need to control initialization of a new instance.

__new__ is the first step of instance creation. It’s called first, and is responsible for returning a new instance of your class.

In contrast, __init__ doesn’t return anything; it’s only responsible for initializing the instance after it’s been created.

In general, you shouldn’t need to override __new__ unless you’re subclassing an immutable type like str, int, unicode or tuple.

From April 2008 post: When to use __new__ vs. __init__? on mail.python.org.

You should consider that what you are trying to do is usually done with a Factory and that’s the best way to do it. Using __new__ is not a good clean solution so please consider the usage of a factory. Here you have a good factory example.
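
For illustration, a factory for the situation in the question could be as small as the sketch below. The get_a function and the module-level cache are hypothetical names, not something from the linked example.

```python
class A:
    def __init__(self, key):
        self.key = key


_instances = {}  # cache of already-created objects, keyed by `key`


def get_a(key):
    """Factory: return the cached A for `key`, creating it on first use."""
    if key not in _instances:
        _instances[key] = A(key)  # __new__ and __init__ run once, here
    return _instances[key]


a1 = get_a("x")
a2 = get_a("x")
print(a1 is a2)  # True
```

The class itself stays completely ordinary; all of the reuse logic lives in the factory, so __init__ is never run twice on the same object.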


Answer 1

__new__ is a static method that receives the class as its first argument, while __init__ is an instance method. __new__ has to create the instance first, so that __init__ can initialize it. Note that __init__ takes self as a parameter. Until you create an instance, there is no self.

Now, I gather that you’re trying to implement the singleton pattern in Python. There are a few ways to do that.

Also, as of Python 2.6, you can use class decorators.

def singleton(cls):
    instances = {}
    def getinstance():
        if cls not in instances:
            instances[cls] = cls()
        return instances[cls]
    return getinstance

@singleton
class MyClass:
  ...

Answer 2

In most well-known OO languages, an expression like SomeClass(arg1, arg2) will allocate a new instance, initialise the instance’s attributes, and then return it.

In most well-known OO languages, the “initialise the instance’s attributes” part can be customised for each class by defining a constructor, which is basically just a block of code that operates on the new instance (using the arguments provided to the constructor expression) to set up whatever initial conditions are desired. In Python, this corresponds to the class’ __init__ method.

Python’s __new__ is nothing more and nothing less than similar per-class customisation of the “allocate a new instance” part. This of course allows you to do unusual things such as returning an existing instance rather than allocating a new one. So in Python, we shouldn’t really think of this part as necessarily involving allocation; all that we require is that __new__ comes up with a suitable instance from somewhere.

But it’s still only half of the job, and there’s no way for the Python system to know that sometimes you want to run the other half of the job (__init__) afterwards and sometimes you don’t. If you want that behavior, you have to say so explicitly.

Often, you can refactor so you only need __new__, or so you don’t need __new__, or so that __init__ behaves differently on an already-initialised object. But if you really want to, Python does actually allow you to redefine “the job”, so that SomeClass(arg1, arg2) doesn’t necessarily call __new__ followed by __init__. To do this, you need to create a metaclass, and define its __call__ method.

A metaclass is just the class of a class. And a class’ __call__ method controls what happens when you call instances of the class. So a metaclass’ __call__ method controls what happens when you call a class; i.e. it allows you to redefine the instance-creation mechanism from start to finish. This is the level at which you can most elegantly implement a completely non-standard instance creation process such as the singleton pattern. In fact, with fewer than 10 lines of code you can implement a Singleton metaclass that doesn’t even require you to futz with __new__ at all, and can turn any otherwise-normal class into a singleton by simply adding __metaclass__ = Singleton!

class Singleton(type):
    def __init__(self, *args, **kwargs):
        super(Singleton, self).__init__(*args, **kwargs)
        self.__instance = None
    def __call__(self, *args, **kwargs):
        if self.__instance is None:
            self.__instance = super(Singleton, self).__call__(*args, **kwargs)
        return self.__instance

However this is probably deeper magic than is really warranted for this situation!
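
The code above uses the Python 2 __metaclass__ attribute. As a sketch of the same idea in Python 3 syntax, with a hypothetical Config class added only to show that __init__ runs a single time:

```python
class Singleton(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        cls._instance = None

    def __call__(cls, *args, **kwargs):
        # type.__call__ is what normally runs __new__ and then __init__.
        if cls._instance is None:
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Config(metaclass=Singleton):   # hypothetical example class
    init_calls = 0

    def __init__(self):
        type(self).init_calls += 1


first = Config()
second = Config()
print(first is second, Config.init_calls)  # True 1
```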


Answer 3

To quote the documentation:

Typical implementations create a new instance of the class by invoking the superclass’s __new__() method using “super(currentclass, cls).__new__(cls[, …])” with appropriate arguments and then modifying the newly-created instance as necessary before returning it.

If __new__() does not return an instance of cls, then the new instance’s __init__() method will not be invoked.

__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation.
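
A small sketch of that main use case: subclassing float and adjusting the value in __new__, because an immutable object's value is already fixed by the time __init__ runs. The Celsius class and its rounding rule are made up for the example.

```python
class Celsius(float):
    """A float subclass; the class name and rules are hypothetical."""

    def __new__(cls, degrees):
        # The value of an immutable type is fixed when the object is created,
        # so validation and conversion must happen here, not in __init__.
        if degrees < -273.15:
            raise ValueError("below absolute zero")
        return super().__new__(cls, round(degrees, 1))


t = Celsius(21.4567)
print(t, isinstance(t, float))  # 21.5 True
```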


Answer 4

I realize that this question is quite old but I had a similar issue. The following did what I wanted:

class Agent(object):
    _agents = dict()

    def __new__(cls, *p):
        number = p[0]
        if not number in cls._agents:
            cls._agents[number] = object.__new__(cls)
        return cls._agents[number]

    def __init__(self, number):
        self.number = number

    def __eq__(self, rhs):
        return self.number == rhs.number

Agent("a") is Agent("a")  # True

I used this page as a resource http://infohost.nmt.edu/tcc/help/pubs/python/web/new-new-method.html


Answer 5

I think the simple answer to this question is that if __new__ returns an instance of the class, the __init__ function executes; otherwise it doesn’t. In this case your code returns A._dict['key'], which is an instance of cls, so __init__ will be executed.
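
This rule is easy to check with a short experiment (Python 3 syntax; the class names are invented for the demonstration):

```python
calls = []


class ReturnsInstance:
    def __new__(cls):
        return super().__new__(cls)

    def __init__(self):
        calls.append("ReturnsInstance.__init__")


class ReturnsOther:
    def __new__(cls):
        return 42  # not an instance of cls, so __init__ is skipped

    def __init__(self):
        calls.append("ReturnsOther.__init__")


ReturnsInstance()
x = ReturnsOther()
print(x, calls)  # 42 ['ReturnsInstance.__init__']
```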


Answer 6

When __new__ returns an instance of the same class, __init__ is run afterwards on the returned object. I.e. you can NOT use __new__ to prevent __init__ from being run. Even if you return a previously created object from __new__, it will be initialized twice (three times, etc.) by __init__, again and again.

Here is a generic approach to the Singleton pattern, which extends vartec’s answer above and fixes it:

def SingletonClass(cls):
    class Single(cls):
        __doc__ = cls.__doc__
        _initialized = False
        _instance = None

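        def __new__(cls, *args, **kwargs):
            if cls._instance is None:
                # object.__new__ rejects extra arguments, so pass only cls
                # (assumes the wrapped class does not define its own __new__)
                cls._instance = super(Single, cls).__new__(cls)
            return cls._instance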

        def __init__(self, *args, **kwargs):
            if self._initialized:
                return
            super(Single, self).__init__(*args, **kwargs)
            self.__class__._initialized = True  # It's crucial to set this variable on the class!
    return Single

Full story is here.

Another approach, which in fact involves __new__, is to use classmethods:

class Singleton(object):
    __initialized = False

    def __new__(cls, *args, **kwargs):
        if not cls.__initialized:
            cls.__init__(*args, **kwargs)
            cls.__initialized = True
        return cls


class MyClass(Singleton):
    @classmethod
    def __init__(cls, x, y):
        print "init is here"

    @classmethod
    def do(cls):
        print "doing stuff"

Note that with this approach you need to decorate ALL of your methods with @classmethod, because you’ll never use any real instance of MyClass.


Answer 7

Referring to this doc:

When subclassing immutable built-in types like numbers and strings, and occasionally in other situations, the static method __new__ comes in handy. __new__ is the first step in instance construction, invoked before __init__.

The __new__ method is called with the class as its first argument; its responsibility is to return a new instance of that class.

Compare this to __init__: __init__ is called with an instance as its first argument, and it doesn’t return anything; its responsibility is to initialize the instance.

There are situations where a new instance is created without calling __init__ (for example when the instance is loaded from a pickle). There is no way to create a new instance without calling __new__ (although in some cases you can get away with calling a base class’s __new__).

Regarding what you wish to achieve, the same doc also has information about the Singleton pattern:

class Singleton(object):
        def __new__(cls, *args, **kwds):
            it = cls.__dict__.get("__it__")
            if it is not None:
                return it
            cls.__it__ = it = object.__new__(cls)
            it.init(*args, **kwds)
            return it
        def init(self, *args, **kwds):
            pass

You may also use this implementation from PEP 318, using a decorator:

def singleton(cls):
    instances = {}
    def getinstance():
        if cls not in instances:
            instances[cls] = cls()
        return instances[cls]
    return getinstance

@singleton
class MyClass:
    ...

Answer 8

class M(type):
    _dict = {}

    def __call__(cls, key):
        if key in cls._dict:
            print 'EXISTS'
            return cls._dict[key]
        else:
            print 'NEW'
            instance = super(M, cls).__call__(key)
            cls._dict[key] = instance
            return instance

class A(object):
    __metaclass__ = M

    def __init__(self, key):
        print 'INIT'
        self.key = key
        print

a1 = A('aaa')
a2 = A('bbb')
a3 = A('aaa')

Outputs:

NEW
INIT

NEW
INIT

EXISTS

NB: As a side effect, the M._dict attribute automatically becomes accessible from A as A._dict, so take care not to overwrite it accidentally.


Answer 9

__new__ should return a new, blank instance of a class. __init__ is then called to initialise that instance. You’re not calling __init__ in the “NEW” case of __new__, so it’s being called for you. The code that is calling __new__ doesn’t keep track of whether __init__ has been called on a particular instance, nor should it, because you’re doing something very unusual here.

You could add an attribute to the object in the __init__ function to indicate that it’s been initialised. Check for the existence of that attribute as the first thing in __init__ and don’t proceed any further if it has been.
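
A sketch of that guard, written in Python 3 syntax and loosely based on the class from the question. The _initialised flag, the _cache dictionary, and the init_count attribute are names chosen for this example.

```python
class A:
    _cache = {}

    def __new__(cls, key):
        if key not in cls._cache:
            cls._cache[key] = super().__new__(cls)
        return cls._cache[key]

    def __init__(self, key):
        # __init__ runs on every A(key) call, so bail out if already set up.
        if getattr(self, "_initialised", False):
            return
        self.key = key
        self.init_count = 1   # would be reset on each call without the guard
        self._initialised = True


a1 = A("x")
a1.init_count += 10        # mutate state after construction
a2 = A("x")                # returns the cached object; __init__ exits early
print(a1 is a2, a2.init_count)  # True 11
```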


Answer 10

As an update to @AntonyHatchkins’ answer: you probably want a separate dictionary of instances for each class of the metatype, meaning that you should have an __init__ method in the metaclass to initialize your class object with that dictionary instead of making it global across all the classes.

class MetaQuasiSingleton(type):
    def __init__(cls, name, bases, attributes):
        super().__init__(name, bases, attributes)
        # Each class created by this metaclass gets its own instance cache
        cls._dict = {}

    def __call__(cls, key):
        if key in cls._dict:
            print('EXISTS')
            instance = cls._dict[key]
        else:
            print('NEW')
            # Runs __new__ and __init__ only for keys not seen before
            instance = super().__call__(key)
            cls._dict[key] = instance
        return instance

class A(metaclass=MetaQuasiSingleton):
    def __init__(self, key):
        print('INIT')
        self.key = key
        print()

I have gone ahead and updated the original code with an __init__ method and changed the syntax to Python 3 notation (no-arg call to super, and metaclass in the class arguments instead of as an attribute).

Either way, the important point here is that the metaclass's __call__ method, which is what runs when you call the class, will not execute either __new__ or __init__ if the key is found. This is much cleaner than using __new__, which requires you to mark the object if you want to skip the default __init__ step.
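
To check the per-class behaviour, here is a quieter sketch of the same idea. The names PerClassCache, init_calls and the classes A and B are illustrative assumptions, and the print calls are replaced by a counter:

```python
class PerClassCache(type):
    def __init__(cls, name, bases, attributes):
        super().__init__(name, bases, attributes)
        cls._dict = {}  # a fresh cache for every class using this metaclass

    def __call__(cls, key):
        # Only build (and initialise) an instance for keys not seen before.
        if key not in cls._dict:
            cls._dict[key] = super().__call__(key)
        return cls._dict[key]


class A(metaclass=PerClassCache):
    init_calls = 0

    def __init__(self, key):
        type(self).init_calls += 1
        self.key = key


class B(metaclass=PerClassCache):
    def __init__(self, key):
        self.key = key


a1, a2, b1 = A(1), A(1), B(1)
```

A(1) returns the same object both times and A.__init__ runs once, while B(1) is a different object because B has its own _dict.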


Answer 11

Digging a little deeper into that!

The type of a generic class in CPython is type and its base class is object (unless you explicitly specify another base class or a metaclass). The sequence of low-level calls can be found here. The first function called is type_call, which calls tp_new and then tp_init.

The interesting part here is that tp_new will call object's (the base class's) new method, object_new, which calls tp_alloc (PyType_GenericAlloc) to allocate the memory for the object :)

At that point the object exists in memory, and the __init__ method gets called. If __init__ is not implemented in your class, then object_init gets called and it does nothing :)

Then type_call just returns the object, which is bound to your variable.
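
The call sequence above can be approximated in pure Python. This is only a rough sketch of what type_call does, not the CPython implementation; the function name type_call_sketch and the Point class are made up for illustration, and special cases such as the one-argument type(x) call are ignored:

```python
def type_call_sketch(cls, *args, **kwargs):
    # Step 1: create the object (roughly what tp_new does).
    obj = cls.__new__(cls, *args, **kwargs)
    # Step 2: initialise it (roughly tp_init), but only when __new__
    # actually returned an instance of cls.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    # Step 3: hand the object back to the caller.
    return obj


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


p = type_call_sketch(Point, 1, 2)  # behaves like Point(1, 2)
```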


Answer 12

One should look at __init__ as a simple constructor in traditional OO languages. For example, if you are familiar with Java or C++, the constructor is implicitly passed a pointer to its own instance. In the case of Java, it is the this variable. If one were to inspect the byte code generated for Java, one would notice two calls. The first call is to a "new" method, and the next call is to the init method (which is the actual call to the user-defined constructor). This two-step process enables the actual instance to be created before the constructor of the class is called; the constructor is then just another method of that instance.

Now, in the case of Python, __new__ is an added facility that is accessible to the user. Java does not provide that flexibility, due to its typed nature. If a language provides that facility, then the implementor of __new__ can do many things in that method before returning the instance, including, in some cases, creating a totally new instance of an unrelated object. In Python, this approach also works out especially well for immutable types.
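
As an illustration of the immutable-type case, here is a small sketch assuming a hypothetical Meters class that subclasses float. The value of a float is fixed once the object exists, so the unit conversion has to happen in __new__ rather than in __init__:

```python
class Meters(float):
    """Hypothetical immutable length type, stored in metres."""

    def __new__(cls, value, unit="m"):
        # Convert to metres before the float object is created;
        # the value could not be changed afterwards in __init__.
        factor = {"m": 1.0, "km": 1000.0}[unit]
        return super().__new__(cls, value * factor)


d = Meters(2, "km")
```

d is a Meters instance whose float value is 2000.0.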


Answer 13

However, I’m a bit confused as to why __init__ is always called after __new__.

I think the C++ analogy would be useful here:

  1. __new__ simply allocates memory for the object. The instance variables of an object need memory to hold them, and this is what the __new__ step provides.

  2. __init__ initializes the internal variables of the object to specific values (which could be defaults).
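
The two steps can be observed directly. In this small sketch the class Traced and its log list are made up for illustration; each method records when it runs:

```python
class Traced:
    log = []  # records the order in which the two steps run

    def __new__(cls, value):
        Traced.log.append("__new__")
        # Step 1: create the bare object; it has no 'value' attribute yet.
        return super().__new__(cls)

    def __init__(self, value):
        Traced.log.append("__init__")
        # Step 2: fill in the instance variables.
        self.value = value


t = Traced(5)
```

After the call, Traced.log holds "__new__" followed by "__init__".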


Answer 14

__init__ is called after __new__ so that when you override it in a subclass, your added code will still get called.

If you are trying to subclass a class that already has a __new__, someone unaware of this might start by adapting the __init__ and forwarding the call down to the subclass __init__. This convention of calling __init__ after __new__ helps that work as expected.

__init__ still needs to allow for any parameters the superclass __new__ needs, but failing to do so will usually produce a clear runtime error. And __new__ should probably explicitly accept *args and **kw, to make it clear that extension is OK.

It is generally bad form to have both __new__ and __init__ in the same class at the same level of inheritance, because of the behavior the original poster described.
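
A brief sketch of that convention, with hypothetical Base and Child classes: the base class's __new__ accepts *args and **kwargs, so a subclass is free to define its own __init__ signature.

```python
class Base:
    created = 0

    def __new__(cls, *args, **kwargs):
        # Accept and ignore extra arguments so that subclasses can
        # extend __init__ with whatever parameters they need.
        Base.created += 1
        return super().__new__(cls)


class Child(Base):
    def __init__(self, name, size=0):
        # Runs after Base.__new__, with the same arguments.
        self.name = name
        self.size = size


c = Child("rex", size=3)
```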


Answer 15

However, I’m a bit confused as to why __init__ is always called after __new__.

Not much of a reason other than that it is just done that way. __new__ doesn't have the responsibility of initializing the class; some other method does (__call__, possibly; I don't know for sure).

I wasn’t expecting this. Can anyone tell me why this is happening and how I implement this functionality otherwise? (apart from putting the implementation into the __new__ which feels quite hacky).

You could have __init__ do nothing if it’s already been initialized, or you could write a new metaclass with a new __call__ that only calls __init__ on new instances, and otherwise just returns __new__(...).
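
The metaclass variant could look roughly like the sketch below. The names InitOnce, Registry, _by_name and the _init_done marker are assumptions made for illustration; the marker is simply one way for __call__ to tell new instances from reused ones:

```python
class InitOnce(type):
    def __call__(cls, *args, **kwargs):
        # Always ask __new__ for an object; it may be a cached one.
        obj = cls.__new__(cls, *args, **kwargs)
        # Only initialise objects that have not been initialised before.
        if not getattr(obj, "_init_done", False):
            obj.__init__(*args, **kwargs)
            obj._init_done = True
        return obj


class Registry(metaclass=InitOnce):
    _by_name = {}
    init_calls = 0

    def __new__(cls, name):
        # Hand back the cached instance when the name is already known.
        if name not in cls._by_name:
            cls._by_name[name] = super().__new__(cls)
        return cls._by_name[name]

    def __init__(self, name):
        Registry.init_calls += 1
        self.name = name


r1 = Registry("a")
r2 = Registry("a")  # cached: __init__ is not called again
r3 = Registry("b")
```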


Answer 16

The simple reason is that __new__ is used for creating an instance, while __init__ is used for initializing it. An instance has to exist before it can be initialized, which is why __new__ is called before __init__.


Answer 17

I ran into the same problem and, for various reasons, decided to avoid decorators, factories and metaclasses. I did it like this:

Main file

import functools


def _alt(func):
    # Wrap an __init__ so that it runs at most once per instance
    @functools.wraps(func)
    def init(self, *p, **k):
        if hasattr(self, "parent_initialized"):
            return
        self.parent_initialized = True
        func(self, *p, **k)

    return init


class Parent:
    # Empty dictionary, shouldn't ever be filled with anything else
    parent_cache = {}

    def __new__(cls, n, *args, **kwargs):

        # Checks if object with this ID (n) has been created
        if n in cls.parent_cache:

            # It was, return it
            return cls.parent_cache[n]

        else:

            # Check if it was modified by this function
            if not hasattr(cls, "parent_modified"):
                # Add the attribute
                cls.parent_modified = True
                cls.parent_cache = {}

                # Apply it
                cls.__init__ = _alt(cls.__init__)

            # Get the instance
            obj = super().__new__(cls)

            # Push it to cache
            cls.parent_cache[n] = obj

            # Return it
            return obj

Example classes

class A(Parent):

    def __init__(self, n):
        print("A.__init__", n)


class B(Parent):

    def __init__(self, n):
        print("B.__init__", n)

In use

>>> A(1)
A.__init__ 1  # First A(1) initialized 
<__main__.A object at 0x000001A73A4A2E48>
>>> A(1)      # Returned previous A(1)
<__main__.A object at 0x000001A73A4A2E48>
>>> A(2)
A.__init__ 2  # First A(2) initialized
<__main__.A object at 0x000001A7395D9C88>
>>> B(2)
B.__init__ 2  # B class doesn't collide with A, thanks to separate cache
<__main__.B object at 0x000001A73951B080>
  • Warning: You shouldn't instantiate Parent itself. Doing so makes it collide with the other classes, unless you define a separate cache in each of the children, which is not what we want.
  • Warning: A class with Parent as its grandparent seems to behave strangely. [Unverified]