Tag archive: side-effects

Why are global variables evil? [closed]

Question: Why are global variables evil? [closed]

I’m trying to find out why the use of global is considered bad practice in Python (and in programming in general). Can somebody explain? Links with more info would also be appreciated.


Answer 0

This has nothing to do with Python; global variables are bad in any programming language.

However, global constants are not conceptually the same as global variables; global constants are perfectly harmless. In Python the distinction between the two is purely by convention: CONSTANTS_ARE_CAPITALIZED and globals_are_not.

The reason global variables are bad is that they enable functions to have hidden (non-obvious, surprising, hard to detect, hard to diagnose) side effects, which increases complexity and can lead to spaghetti code.
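To make the "hidden side effect" concrete, here is a small sketch. The `discount` global and the function names are assumptions for illustration only:

```python
discount = 0.0  # hypothetical module-level state


def apply_promo():
    """Looks unrelated to pricing, but silently changes what price() returns."""
    global discount
    discount = 0.5


def price(amount):
    """Result depends on hidden state, not just on the argument."""
    return amount * (1 - discount)


def price_explicit(amount, discount_rate):
    """Same calculation with the dependency visible in the signature."""
    return amount * (1 - discount_rate)


before = price(100)  # 100.0
apply_promo()
after = price(100)   # 50.0: same argument, different result
```

Nothing at the call site of `price(100)` hints that `apply_promo()` can change its result. With `price_explicit`, the dependency shows up in every call.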

However, sane use of global state is acceptable (as are local state and mutability) even in functional programming, whether for algorithm optimization, reduced complexity, caching and memoization, or the practicality of porting structures that originate in a predominantly imperative codebase.
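Memoization is probably the easiest of these to show. The sketch below uses the standard library's `functools.lru_cache`; the `fib` function is just a stand-in example. The cache is global state, but it lives behind the function and cannot change what the function returns:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache makes repeated calls cheap."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

From the caller's point of view `fib` still behaves like a pure function: the same argument always gives the same result, and the cached state only affects speed.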

All in all, your question can be answered in many ways, so your best bet is to just google “why are global variables bad”. Some examples:

If you want to go deeper and find out what side effects are all about, and many other enlightening things, you should learn Functional Programming:


Answer 1

Yes, in theory, globals (and “state” in general) are evil. In practice, if you look into your Python’s packages directory you’ll find that most modules there start with a bunch of global declarations. Obviously, people have no problem with them.

Specifically in Python, a global’s visibility is limited to its module, so there are no “true” globals that affect the whole program, which makes them far less harmful. Another point: there is no const, so when you need a constant you have to use a global.

In my practice, if I happen to modify a global in a function, I always declare it with global, even if there is technically no need for that, as in:

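cache = {}

def foo(args):
    # Not strictly required here: cache is mutated, not rebound.
    # Declared anyway so the use of a global is explicit.
    global cache

    cache[args] = ...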

This makes manipulations of globals easier to track down.


Answer 2

My personal opinion on the topic is that using global variables in a function’s logic means that other code can alter the logic and the expected output of that function, which makes debugging very hard (especially in big projects) and makes testing harder as well.

Furthermore, if you consider other people reading your code (open-source community, colleagues etc.), they will have a hard time understanding where the global variable is set, where it is changed, and what to expect from it. By contrast, the behavior of an isolated function can be determined by reading the function definition itself.

(Probably) Violating Pure Function definition

I believe that clean and (nearly) bug-free code should have functions that are as pure as possible (see pure functions). A pure function is one that satisfies the following conditions:

  1. The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices (usually—see below).
  2. Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.

Using global variables violates at least one of the above, if not both, because external code can cause unexpected results.

Another clear definition of pure functions: “Pure function is a function that takes all of its inputs as explicit arguments and produces all of its outputs as explicit results.” [1]. Global variables violate the idea of pure functions, since an input, and maybe one of the outputs (the global variable), is not explicitly given or returned.

(Probably) Violating Unit testing F.I.R.S.T principle

Furthermore, if you consider unit testing and the F.I.R.S.T principle (Fast, Independent, Repeatable, Self-Validating and Timely tests), using global variables will probably violate the Independent principle (which means that tests don’t depend on each other).

In most cases (though not always, and at least from what I have seen so far), a global variable is used to prepare results and pass them to other functions. This violates the principle as well. If the global variable is used that way (i.e. the global variable used in function X has to be set in function Y first), then to unit test function X you have to run function Y, or its test, first.
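The X/Y dependency can be sketched like this. All names (`config`, `load_config`, `greet`, `greet_explicit`) are hypothetical:

```python
config = None  # hypothetical global, set by one function and read by another


def load_config():
    """Function Y: prepares the global."""
    global config
    config = {"greeting": "hello"}


def greet(name):
    """Function X: only works after load_config() has run."""
    return f"{config['greeting']}, {name}"


def greet_explicit(cfg, name):
    """Independent alternative: the caller supplies the configuration."""
    return f"{cfg['greeting']}, {name}"
```

A test for `greet` fails with a TypeError unless something has called `load_config()` first, so its outcome depends on test order. A test for `greet_explicit` can build its own `cfg` and run in isolation.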

Globals as constants

On the other hand, as other people have already mentioned, using a global variable as a “constant” is slightly better, since the language does not support constants. However, I always prefer working with classes and keeping the “constants” as class members rather than using a global variable at all. If you have code in which two different classes need to share a global variable, you probably need to refactor your solution and make your classes independent.

I don’t believe that globals should never be used. But if they are used, the authors should consider some principles (perhaps the ones mentioned above, along with other software engineering principles and good practices) for cleaner and nearly bug-free code.
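One way this could look, as a sketch only. The `Limits` class and its values are invented for illustration:

```python
class Limits:
    """Hypothetical namespace for values treated as constants."""

    MAX_CONNECTIONS = 10
    TIMEOUT_SECONDS = 30


def can_accept(open_connections):
    """Reads the constant through the class instead of a bare global name."""
    return open_connections < Limits.MAX_CONNECTIONS
```

The class attributes can still be reassigned, so this is the same convention as upper-case globals, just grouped under one name.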


Answer 3

They are essential, the screen being a good example. However, in a multithreaded environment or with many developers involved, in practice the question often arises: who (erroneously) set or cleared it? Depending on the architecture, the analysis can be costly and may be required often. While reading a global var can be OK, writing to it must be controlled, for example by a single thread or a thread-safe class. Hence, global vars raise the fear of high development costs caused by these consequences, which is why they are themselves considered evil. Therefore, in general, it’s good practice to keep the number of global vars low.
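A rough sketch of the "thread-safe class" idea, using `threading.Lock` from the standard library. The `SharedCounter` class and the `worker` function are assumptions for illustration, not part of the original answer:

```python
import threading


class SharedCounter:
    """Hypothetical thread-safe holder for a shared value."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # All writes go through this method and are serialized by the lock.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


counter = SharedCounter()  # the single global; writes go through increment()


def worker(n):
    for _ in range(n):
        counter.increment()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

There is still one global, but every write passes through one method, so the question "who set it?" has one place to look.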