When should I use classes in Python?

Question: When should I use classes in Python?

I have been programming in Python for about two years; mostly data stuff (pandas, mpl, numpy), but also automation scripts and small web apps. I’m trying to become a better programmer and increase my Python knowledge, and one of the things that bothers me is that I have never used a class (outside of copying random Flask code for small web apps). I generally understand what they are, but I can’t seem to wrap my head around why I would need them over a simple function.

To add specificity to my question: I write tons of automated reports which always involve pulling data from multiple data sources (mongo, sql, postgres, apis), performing a lot or a little data munging and formatting, writing the data to csv/excel/html, and sending it out in an email. The scripts range from ~250 lines to ~600 lines. Would there be any reason for me to use classes to do this, and why?


Answer 0

Classes are the pillar of Object Oriented Programming. OOP is highly concerned with code organization, reusability, and encapsulation.

First, a disclaimer: OOP is partially in contrast to Functional Programming, which is a different paradigm used a lot in Python. Not everyone who programs in Python (or surely most languages) uses OOP. You can do a lot in Java 8 that isn’t very Object Oriented. If you don’t want to use OOP, then don’t. If you’re just writing one-off scripts to process data that you’ll never use again, then keep writing the way you are.

However, there are a lot of reasons to use OOP.

Some reasons:

  • Organization: OOP defines well-known and standard ways of describing and defining both data and procedure in code. Both data and procedure can be stored at varying levels of definition (in different classes), and there are standard ways of talking about these definitions. That is, if you use OOP in a standard way, it will help your later self and others understand, edit, and use your code. Also, instead of using a complex, arbitrary data storage mechanism (dicts of dicts or lists or dicts or lists of dicts of sets, or whatever), you can name pieces of data structures and conveniently refer to them.

  • State: OOP helps you define and keep track of state. For instance, in a classic example, if you’re creating a program that processes students (for instance, a grade program), you can keep all the info you need about them in one spot (name, age, gender, grade level, courses, grades, teachers, peers, diet, special needs, etc.), and this data is persisted as long as the object is alive, and is easily accessible.

  • Encapsulation: With encapsulation, procedure and data are stored together. Methods (an OOP term for functions) are defined right alongside the data that they operate on and produce. In a language like Java that allows for access control, or in Python, depending upon how you describe your public API, this means that methods and data can be hidden from the user. What this means is that if you need or want to change code, you can do whatever you want to the implementation of the code, but keep the public APIs the same.

  • Inheritance: Inheritance allows you to define data and procedure in one place (in one class), and then override or extend that functionality later. For instance, in Python, I often see people creating subclasses of the dict class in order to add additional functionality. A common change is overriding the method that handles a request for a key that doesn’t exist in the dictionary, so that it gives a default value based on the unknown key instead of throwing an exception. This allows you to extend your own code now or later, allows others to extend your code, and allows you to extend other people’s code.

  • Reusability: All of these reasons and others allow for greater reusability of code. Object oriented code allows you to write solid (tested) code once, and then reuse it over and over. If you need to tweak something for your specific use case, you can inherit from an existing class and override the existing behavior. If you need to change something, you can change it all while maintaining the existing public method signatures, and no one is the wiser (hopefully).
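To make the inheritance point about dict subclasses concrete, here is a minimal sketch. The class name `DefaultingDict` and the default it returns are assumptions made up for illustration; the only Python feature relied on is the `__missing__` hook, which `dict` calls on subclasses when a key lookup fails.

```python
class DefaultingDict(dict):
    """Hypothetical dict subclass: returns a default instead of raising KeyError."""

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ when the key is absent.
        # Here the default is derived from the unknown key itself.
        return f"<no value for {key!r}>"


grades = DefaultingDict(math=3.3)
print(grades["math"])     # existing key: normal lookup
print(grades["history"])  # missing key: handled by __missing__
```

Note that the missing key is not stored by this sketch; if a fixed default factory is enough, `collections.defaultdict` from the standard library may already do the job.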

Again, there are several reasons not to use OOP, and you don’t need to. But luckily, with a language like Python, you can use just a little bit or a lot; it’s up to you.

An example of the student use case (no guarantee on code quality, just an example):

Object Oriented

class Student(object):
    def __init__(self, name, age, gender, level, grades=None):
        self.name = name
        self.age = age
        self.gender = gender
        self.level = level
        self.grades = grades or {}

    def setGrade(self, course, grade):
        self.grades[course] = grade

    def getGrade(self, course):
        return self.grades[course]

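    def getGPA(self):
        # A student with no grades yet has no average; return 0.0 instead of dividing by zero
        if not self.grades:
            return 0.0
        return sum(self.grades.values())/len(self.grades)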

# Define some students
john = Student("John", 12, "male", 6, {"math":3.3})
jane = Student("Jane", 12, "female", 6, {"math":3.5})

# Now we can get to the grades easily
print(john.getGPA())
print(jane.getGPA())

Standard Dict

def calculateGPA(gradeDict):
    return sum(gradeDict.values())/len(gradeDict)

students = {}
# We can set the keys to variables so we might minimize typos
name, age, gender, level, grades = "name", "age", "gender", "level", "grades"
john, jane = "john", "jane"
math = "math"
students[john] = {}
students[john][age] = 12
students[john][gender] = "male"
students[john][level] = 6
students[john][grades] = {math:3.3}

students[jane] = {}
students[jane][age] = 12
students[jane][gender] = "female"
students[jane][level] = 6
students[jane][grades] = {math:3.5}

# At this point, we need to remember who the students are and where the grades are stored. Not a huge deal, but avoided by OOP.
print(calculateGPA(students[john][grades]))
print(calculateGPA(students[jane][grades]))

Answer 1

Use a class whenever you need to maintain state between function calls and it cannot be accomplished with generators (functions which yield rather than return). Generators maintain their own state.
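As a rough illustration of a generator keeping its own state, the sketch below accumulates a running total without any class. The function name `running_total` is hypothetical; `next()` and `send()` are the standard generator protocol.

```python
def running_total():
    """Generator that keeps a running sum between calls."""
    total = 0
    while True:
        # send() delivers the next value; the local `total` survives between yields.
        value = yield total
        if value is not None:
            total += value


acc = running_total()
next(acc)             # prime the generator so it is paused at the first yield
first = acc.send(10)  # total is now 10
second = acc.send(5)  # total is now 15
print(first, second)
```

Once the state grows beyond a couple of local variables, or several different operations need to touch it, a class tends to read more clearly than a generator driven by `send()`.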

If you want to override any of the standard operators, you need a class.
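A minimal sketch of what overriding standard operators looks like. The `Money` class is a made-up example type, not something from the question; `__add__`, `__eq__` and `__repr__` are the standard special methods behind `+`, `==` and `repr()`.

```python
class Money:
    """Hypothetical value type used only to illustrate operator overloading."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Returning NotImplemented lets Python try the other operand or raise TypeError.
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return self.cents == other.cents

    def __repr__(self):
        return f"Money(cents={self.cents})"


total = Money(150) + Money(275)
print(total)
```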

Whenever you have a use for a Visitor pattern, you’ll need classes. Every other design pattern can be accomplished more effectively and cleanly with generators, context managers (which are also better implemented as generators than as classes) and POD types (dictionaries, lists and tuples, etc.).

If you want to write “pythonic” code, you should prefer context managers and generators over classes. It will be cleaner.

If you want to extend functionality, you will almost always be able to accomplish it with containment rather than inheritance.
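A small sketch of containment, under the assumption that we want grade-book behavior without exposing the whole dict interface. `GradeBook` and its method names are hypothetical; the point is only that the dict is held as an attribute rather than inherited from.

```python
class GradeBook:
    """Hypothetical class that holds a dict instead of subclassing dict."""

    def __init__(self):
        self._grades = {}  # contained object; not part of the public API

    def set_grade(self, course, grade):
        self._grades[course] = grade

    def gpa(self):
        # An empty grade book has no average; 0.0 is an arbitrary choice here.
        if not self._grades:
            return 0.0
        return sum(self._grades.values()) / len(self._grades)


book = GradeBook()
book.set_grade("math", 3.0)
book.set_grade("art", 4.0)
print(book.gpa())
```

Because `GradeBook` does not inherit from `dict`, callers cannot depend on `keys()`, `update()` and the rest, so the internal storage can change later without breaking them.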

As with every rule, this has an exception. If you want to encapsulate functionality quickly (i.e., write test code rather than library-level reusable code), you can encapsulate the state in a class. It will be simple and won’t need to be reusable.

If you need a C++ style destructor (RAII), you definitely do NOT want to use classes. You want context managers.
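For reference, a context manager written as a generator could look like the sketch below. It uses `contextlib.contextmanager` from the standard library; the `managed_resource` function and the `events` list are illustrative stand-ins for a real resource such as a file or connection.

```python
from contextlib import contextmanager

events = []  # stands in for a real resource log; purely illustrative


@contextmanager
def managed_resource(name):
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs when the with-block exits, even if an exception was raised inside it.
        events.append(f"close {name}")


with managed_resource("report.csv") as res:
    events.append(f"use {res}")

print(events)
```

The cleanup runs at a well-defined point (the end of the `with` block), which is the guarantee a C++ destructor gives and a Python `__del__` method does not.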


Answer 2

I think you are doing it right. Classes are reasonable when you need to model some business logic or complicated real-life processes with complicated relationships. For example:

  • Several functions with shared state
  • More than one copy of the same state variables
  • Extending the behavior of existing functionality
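The first two cases can be sketched with a report-like object, loosely modeled on the scripts described in the question. The `Report` class, its method names and the sample rows are all assumptions for illustration; only the standard `csv` and `io` modules are used.

```python
import csv
import io


class Report:
    """Hypothetical report object: several methods share the same rows."""

    def __init__(self, title, rows):
        self.title = title
        self.rows = list(rows)  # state shared by the methods below

    def keep_rows(self, predicate):
        # Filter the shared rows in place and return self so calls can be chained.
        self.rows = [row for row in self.rows if predicate(row)]
        return self

    def to_csv(self):
        if not self.rows:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(self.rows[0]))
        writer.writeheader()
        writer.writerows(self.rows)
        return buffer.getvalue()


data = [{"region": "east", "sales": 10}, {"region": "west", "sales": 3}]
weekly = Report("Weekly", data).keep_rows(lambda r: r["sales"] > 5)
monthly = Report("Monthly", data)  # a second, independent copy of the state
print(weekly.to_csv())
```

With plain functions, `rows` would have to be passed into and returned from every step; with two reports in flight at once, each instance simply carries its own copy.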

I also suggest you watch this classic video


Answer 3

A class defines a real world entity. If you are working on something that exists individually and has its own logic that is separate from others, you should create a class for it. For example, a class that encapsulates database connectivity.
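A minimal sketch of such a class, assuming SQLite via the standard `sqlite3` module so that it runs without a server. The `Database` wrapper, its method names and the `students` table are hypothetical.

```python
import sqlite3


class Database:
    """Hypothetical wrapper that owns one SQLite connection."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        # Using the connection as a context manager commits on success
        # and rolls back if the statement raises.
        with self._conn:
            return self._conn.execute(sql, params).fetchall()

    def close(self):
        self._conn.close()


db = Database()
db.execute("CREATE TABLE students (name TEXT, age INTEGER)")
db.execute("INSERT INTO students VALUES (?, ?)", ("John", 12))
rows = db.execute("SELECT name, age FROM students")
print(rows)
db.close()
```

The rest of the script only sees `execute()` and `close()`, so switching drivers or adding retry logic later would stay inside this one class.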

If this is not the case, there is no need to create a class.


Answer 4

It depends on your idea and design. If you are a good designer, then OOP will come out naturally in the form of various design patterns. For simple script-level processing, OOP can be overhead. Simply consider the basic benefits of OOP, such as reusability and extensibility, and decide whether you need them or not. OOP makes complex things simpler and simpler things complex. Keep things simple either way, with or without OOP; use whichever is simpler.