Tag archive: Python

How do I find numeric columns in Pandas?

Question: How do I find numeric columns in Pandas?

Let’s say df is a pandas DataFrame. I would like to find all columns of numeric type. Something like:

isNumeric = is_numeric(df)

Answer 0

You can use the select_dtypes method of DataFrame. It takes two parameters, include and exclude. So isNumeric would look like:

numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']

newdf = df.select_dtypes(include=numerics)
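
One thing to watch with an explicit list of dtype names is that any numeric dtype missing from the list is silently dropped. The sketch below compares the list above with the generic 'number' selector; the DataFrame and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a few numeric dtypes that the explicit list does not name.
df = pd.DataFrame({
    "small": np.array([1, 2, 3], dtype="int8"),
    "unsigned": np.array([1, 2, 3], dtype="uint16"),
    "big": np.array([1, 2, 3], dtype="int64"),
    "ratio": np.array([0.1, 0.2, 0.3], dtype="float64"),
    "label": ["a", "b", "c"],
})

numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']

# Only the dtypes named in the list are kept.
explicit_cols = df.select_dtypes(include=numerics).columns.tolist()

# 'number' matches every NumPy numeric dtype, including int8 and unsigned ints.
generic_cols = df.select_dtypes(include="number").columns.tolist()

print(explicit_cols)  # ['big', 'ratio']
print(generic_cols)   # ['small', 'unsigned', 'big', 'ratio']
```

If the intent is "anything numeric", the generic selector is probably the safer choice.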

Answer 1

You can use the undocumented function _get_numeric_data() to filter only numeric columns:

df._get_numeric_data()

Example:

In [32]: data
Out[32]:
   A  B
0  1  s
1  2  s
2  3  s
3  4  s

In [33]: data._get_numeric_data()
Out[33]:
   A
0  1
1  2
2  3
3  4

Note that this is a “private method” (i.e., an implementation detail) and is subject to change or total removal in the future. Use with caution.


Answer 2

Simple one-line answer to create a new dataframe with only numeric columns:

df.select_dtypes(include=np.number)

If you want the names of numeric columns:

df.select_dtypes(include=np.number).columns.tolist()

Complete code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': range(7, 10),
                   'B': np.random.rand(3),
                   'C': ['foo','bar','baz'],
                   'D': ['who','what','when']})
df
#    A         B    C     D
# 0  7  0.704021  foo   who
# 1  8  0.264025  bar  what
# 2  9  0.230671  baz  when

df_numerics_only = df.select_dtypes(include=np.number)
df_numerics_only
#    A         B
# 0  7  0.704021
# 1  8  0.264025
# 2  9  0.230671

colnames_numerics_only = df.select_dtypes(include=np.number).columns.tolist()
colnames_numerics_only
# ['A', 'B']

Answer 3

df.select_dtypes(exclude=['object'])

Update

df.select_dtypes(include=np.number)
# or, with newer versions of pandas
df.select_dtypes('number')

Answer 4

Simple one-liner:

df.select_dtypes('number').columns

Answer 5

The following code returns a list of the names of the numeric columns of a data set.

cnames=list(marketing_train.select_dtypes(exclude=['object']).columns)

Here marketing_train is my data set, select_dtypes() is the method that selects columns by data type using the exclude and include arguments, and columns fetches the column names. The output of the above code is the following:

['custAge',
     'campaign',
     'pdays',
     'previous',
     'emp.var.rate',
     'cons.price.idx',
     'cons.conf.idx',
     'euribor3m',
     'nr.employed',
     'pmonths',
     'pastEmail']

Thanks


Answer 6

Here is another simple way to find the numeric columns in a pandas DataFrame:

numeric_clmns = df.dtypes[df.dtypes != "object"].index 
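
Note that "not object" is not quite the same thing as "numeric": boolean and datetime columns also pass that filter. A small sketch with a made-up DataFrame shows the difference from selecting on 'number':

```python
import pandas as pd

# Hypothetical frame mixing numeric, boolean, datetime and text columns.
df = pd.DataFrame({
    "count": [1, 2, 3],
    "flag": [True, False, True],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "name": ["a", "b", "c"],
})

# Filtering out object columns keeps the bool and datetime columns as well.
not_object = df.dtypes[df.dtypes != "object"].index.tolist()

# Selecting on 'number' keeps only the truly numeric column.
numeric_only = df.select_dtypes(include="number").columns.tolist()

print(not_object)
print(numeric_only)  # ['count']
```

Whether text columns count as object also depends on the pandas version and string dtype settings, so the first result can vary between installations.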

Answer 7

import numpy as np
import pandas as pd

def is_type(df, baseType):
    # One boolean per column: True when the column dtype is a subclass of baseType.
    test = [issubclass(np.dtype(d).type, baseType) for d in df.dtypes]
    return pd.DataFrame(data=test, index=df.columns, columns=["test"])

def is_float(df):
    # np.floating is used because the np.float alias was removed in NumPy 1.24.
    return is_type(df, np.floating)

def is_number(df):
    return is_type(df, np.number)

def is_integer(df):
    return is_type(df, np.integer)

Answer 8

Adapting this answer, you could do

df.loc[:, df.applymap(np.isreal).all(axis=0)]

Here, df.applymap(np.isreal) shows whether each cell in the data frame is numeric, and .all(axis=0) checks whether all values in a column are True, returning a Series of Booleans that can be used to index the desired columns.
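
A per-cell check touches every value in the frame. If looking at the column dtypes is enough, a per-column test is a possible alternative. This sketch assumes pandas.api.types.is_numeric_dtype; the helper name numeric_column_names and the sample data are made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def numeric_column_names(df):
    # Check each column's dtype once instead of inspecting every cell.
    return [col for col in df.columns if is_numeric_dtype(df[col])]


df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [0.5, 1.5, 2.5],
    "C": ["foo", "bar", "baz"],
})

numeric_cols = numeric_column_names(df)
print(numeric_cols)  # ['A', 'B']
```

Be aware that is_numeric_dtype also reports True for boolean columns, so this is not identical to select_dtypes('number').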


Answer 9

Please see the code below:

if dataset.select_dtypes(include=[np.number]).shape[1] > 0:
    display(dataset.select_dtypes(include=[np.number]).describe())
if dataset.select_dtypes(include=['object']).shape[1] > 0:
    display(dataset.select_dtypes(include=['object']).describe())

This way you can check whether the values are numeric, such as float and int, or string values. The second if statement checks for string values, which are stored with the object dtype.


Answer 10

We can include and exclude data types as required, as shown below:

train.select_dtypes(include=None, exclude=None)
train.select_dtypes(include='number') #will include all the numeric types

Quoted from the select_dtypes help text as shown in Jupyter Notebook:

  • To select all numeric types, use np.number or 'number'

  • To select strings you must use the object dtype, but note that this will return all object dtype columns

  • See the NumPy dtype hierarchy <http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html>

  • To select datetimes, use np.datetime64, 'datetime' or 'datetime64'

  • To select timedeltas, use np.timedelta64, 'timedelta' or 'timedelta64'

  • To select Pandas categorical dtypes, use 'category'

  • To select Pandas datetimetz dtypes, use 'datetimetz' (new in 0.20.0) or 'datetime64[ns, tz]'


Flask-SQLAlchemy: update a row's information

Question: Flask-SQLAlchemy: update a row's information

How can I update a row’s information?

For example I’d like to alter the name column of the row that has the id 5.


Answer 0

Retrieve an object using the tutorial shown in the Flask-SQLAlchemy documentation. Once you have the entity that you want to change, change the entity itself. Then, db.session.commit().

For example:

admin = User.query.filter_by(username='admin').first()
admin.email = 'my_new_email@example.com'
db.session.commit()

user = User.query.get(5)
user.name = 'New Name'
db.session.commit()

Flask-SQLAlchemy is based on SQLAlchemy, so be sure to check out the SQLAlchemy Docs as well.


Answer 1

The BaseQuery object in SQLAlchemy, which is what filter_by returns, has an update method.

rows_changed = User.query.filter_by(username='admin').update(dict(email='my_new_email@example.com'))
db.session.commit()

The advantage of using update over changing the entity comes when there are many objects to be updated.

If you want to give add_user permission to all the admins,

rows_changed = User.query.filter_by(role='admin').update(dict(permission='add_user'))
db.session.commit()

Notice that filter_by takes keyword arguments (use only one =) as opposed to filter which takes an expression.
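
To see what the bulk update returns, here is a self-contained sketch that uses plain SQLAlchemy (1.4 or later is assumed) with an in-memory SQLite database instead of a Flask app. The model, the sample users and the session setup are assumptions made for the demo; with Flask-SQLAlchemy the same query would go through User.query and db.session.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical model with just the columns the example needs.
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    username = Column(String(80))
    role = Column(String(20))
    permission = Column(String(20))


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(username="alice", role="admin"),
        User(username="bob", role="admin"),
        User(username="carol", role="guest"),
    ])
    session.commit()

    # One UPDATE statement for all matching rows; the return value is the
    # number of rows matched by the filter.
    rows_changed = (
        session.query(User)
        .filter_by(role="admin")
        .update(dict(permission="add_user"))
    )
    session.commit()

    permissions = {u.username: u.permission for u in session.query(User)}

print(rows_changed)  # 2
print(permissions)
```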


Answer 2

This does not work if you modify a pickled attribute of the model. Pickled attributes should be replaced in order to trigger updates:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from pprint import pprint

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/users.db'
db = SQLAlchemy(app)


class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), unique=True)
    data = db.Column(db.PickleType())

    def __init__(self, name, data):
        self.name = name
        self.data = data

    def __repr__(self):
        return '<User %r>' % self.name

db.create_all()

# Create a user.
bob = User('Bob', {})
db.session.add(bob)
db.session.commit()

# Retrieve the row by its name.
bob = User.query.filter_by(name='Bob').first()
pprint(bob.data)  # {}

# Modifying data is ignored.
bob.data['foo'] = 123
db.session.commit()
bob = User.query.filter_by(name='Bob').first()
pprint(bob.data)  # {}

# Replacing data is respected.
bob.data = {'bar': 321}
db.session.commit()
bob = User.query.filter_by(name='Bob').first()
pprint(bob.data)  # {'bar': 321}

# Modifying data is ignored.
bob.data['moo'] = 789
db.session.commit()
bob = User.query.filter_by(name='Bob').first()
pprint(bob.data)  # {'bar': 321}
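
Besides replacing the whole value, SQLAlchemy's mutable extension can track in-place changes to a dict column. The sketch below wraps the pickled column in MutableDict; it uses plain SQLAlchemy (1.4 or later assumed) with an in-memory SQLite database rather than a Flask app, and the model is a simplified stand-in for the one above.

```python
from sqlalchemy import Column, Integer, PickleType, String, create_engine
from sqlalchemy.ext.mutable import MutableDict
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(80), unique=True)
    # MutableDict notifies the session when items are set or deleted in place.
    data = Column(MutableDict.as_mutable(PickleType))


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="Bob", data={}))
    session.commit()

    bob = session.query(User).filter_by(name="Bob").first()
    bob.data["foo"] = 123  # in-place change, tracked by MutableDict
    session.commit()

# A fresh session reads back what was actually written to the database.
with Session(engine) as session:
    stored = dict(session.query(User).filter_by(name="Bob").first().data)

print(stored)  # {'foo': 123}
```

Only changes to the top-level dict are tracked this way; mutating a nested list or dict inside it would still go unnoticed.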

Answer 3

Just assigning the value and committing works for all data types except JSON and pickled attributes. Since the pickled type is explained above, I'll note down a slightly different but easy way to update JSON columns.

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), unique=True)
    data = db.Column(db.JSON)

    def __init__(self, name, data):
        self.name = name
        self.data = data

Let’s say the model is like above.

user = User("Jon Dove", {"country":"Sri Lanka"})
db.session.add(user)
db.session.flush()
db.session.commit()

This will add the user into the MySQL database with data {"country": "Sri Lanka"}.

Modifying data will be ignored. My code that didn't work is as follows.

user = User.query.filter(User.name == 'Jon Dove').first()
data = user.data
data["province"] = "south"
user.data = data
db.session.merge(user)
db.session.flush()
db.session.commit()

Instead of going through the painful work of copying the JSON to a new dict (rather than just assigning it to a new variable as above), which should have worked, I found a simpler way: you can flag to the system that the JSON has changed.

The working code follows.

from sqlalchemy.orm.attributes import flag_modified
user = User.query.filter(User.name == 'Jon Dove').first()
data = user.data
data["province"] = "south"
user.data = data
flag_modified(user, "data")
db.session.merge(user)
db.session.flush()
db.session.commit()

This worked like a charm. There is another method proposed along with this method here. I hope this helps someone.


Why are global variables evil? [closed]

Question: Why are global variables evil? [closed]

I’m trying to find out why the use of global is considered to be bad practice in python (and in programming in general). Can somebody explain? Links with more info would also be appreciated.


Answer 0

This has nothing to do with Python; global variables are bad in any programming language.

However, global constants are not conceptually the same as global variables; global constants are perfectly harmless. In Python the distinction between the two is purely by convention: CONSTANTS_ARE_CAPITALIZED and globals_are_not.

The reason global variables are bad is that they enable functions to have hidden (non-obvious, surprising, hard to detect, hard to diagnose) side effects, leading to an increase in complexity, potentially leading to Spaghetti code.

However, sane use of global state is acceptable (as is local state and mutability) even in functional programming, either for algorithm optimization, reduced complexity, caching and memoization, or the practicality of porting structures originating in a predominantly imperative codebase.

All in all, your question can be answered in many ways, so your best bet is to just google “why are global variables bad”.

If you want to go deeper and find out what side effects are all about, and many other enlightening things, you should learn Functional Programming.


Answer 1

Yes, in theory, globals (and “state” in general) are evil. In practice, if you look into your Python’s packages directory you’ll find that most modules there start with a bunch of global declarations. Obviously, people have no problem with them.

Specifically for Python, globals’ visibility is limited to a module, so there are no “true” globals that affect the whole program, which makes them far less harmful. Another point: there is no const, so when you need a constant you have to use a global.

In my practice, if I happen to modify a global in a function, I always declare it with global, even if there is technically no need for that, as in:

cache = {}

def foo(args):
    global cache

    cache[args] = ...

This makes globals’ manipulations easier to track down.
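
To make the "technically no need" part concrete: mutating a global object in place works without the global statement, while rebinding the name requires it. A minimal sketch, with made-up names:

```python
cache = {}
counter = 0


def remember(key, value):
    # Mutating the dict in place needs no `global` statement.
    cache[key] = value


def increment():
    # Rebinding the name does need it; without this line the assignment
    # below would raise UnboundLocalError.
    global counter
    counter += 1


remember("a", 1)
increment()
print(cache, counter)  # {'a': 1} 1
```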


Answer 2

My personal opinion on the topic is that using global variables in a function’s logic means that some other code can alter the logic and the expected output of that function, which makes debugging very hard (especially in big projects) and makes testing harder as well.

Furthermore, if you consider other people reading your code (the open-source community, colleagues, etc.), they will have a hard time understanding where the global variable is set, where it is changed, and what to expect from it, as opposed to an isolated function whose behavior can be determined by reading the function definition itself.

(Probably) Violating Pure Function definition

I believe that clean and (nearly) bug-free code should have functions that are as pure as possible (see pure functions). A pure function is one that satisfies the following conditions:

  1. The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices (usually—see below).
  2. Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.

Using global variables violates at least one of the above, if not both, since external code can cause unexpected results.

Another clear definition of pure functions: “Pure function is a function that takes all of its inputs as explicit arguments and produces all of its outputs as explicit results.” [1]. Having global variables violates the idea of pure functions, since an input, and maybe one of the outputs (the global variable), is not explicitly given or returned.
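
A small sketch of the difference, with made-up names: the first function reads a global and so can return different results for the same argument, while the second takes every input explicitly.

```python
shipping_fee = 5  # module-level state that any other code may change


def total_impure(price):
    # Reads hidden state: the result depends on more than the argument.
    return price + shipping_fee


def total_pure(price, fee):
    # Every input is an explicit argument, so the result is reproducible.
    return price + fee


first = total_impure(100)   # 105
shipping_fee = 20           # some unrelated code changes the global
second = total_impure(100)  # 120: same argument, different result

print(first, second, total_pure(100, 5))
```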

(Probably) Violating Unit testing F.I.R.S.T principle

Further to that, if you consider unit testing and the F.I.R.S.T principle (Fast tests, Independent tests, Repeatable, Self-Validating and Timely), global variables will probably violate the Independent tests principle (which means that tests don’t depend on each other).

A global variable is not always, but in most cases (at least in what I have seen so far), used to prepare and pass results to other functions. This violates that principle as well. If the global variable is used in that way (i.e. the global variable used in function X has to be set in function Y first), then to unit test function X you have to run function Y or its test first.
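
The X-depends-on-Y situation can be sketched like this. The function names and the config dict are hypothetical; the point is only that greet cannot be exercised before load_config has run, while greet_explicit can be tested in isolation.

```python
_config = None  # set by load_config(), read by greet()


def load_config():
    # "Function Y": prepares the global that others rely on.
    global _config
    _config = {"greeting": "hello"}


def greet(name):
    # "Function X": only works after load_config() has been called.
    return f"{_config['greeting']}, {name}"


def greet_explicit(name, config):
    # Same logic with the dependency passed in.
    return f"{config['greeting']}, {name}"


try:
    greet("Ann")
    failed_without_setup = False
except TypeError:
    # _config is still None, so subscripting it fails.
    failed_without_setup = True

load_config()
after_setup = greet("Ann")
isolated = greet_explicit("Ann", {"greeting": "hi"})
print(failed_without_setup, after_setup, isolated)
```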

Globals as constants

On the other hand, as other people have already mentioned, using a global variable as a “constant” is slightly better, since the language does not support constants. However, I always prefer working with classes and having the “constants” as class members, not using a global variable at all. If you have code in which two different classes need to share a global variable, then you probably need to refactor your solution and make your classes independent.

I don’t believe that globals should never be used. But if they are used, the authors should consider some principles (the ones mentioned above, perhaps, and other software engineering principles and good practices) for cleaner and nearly bug-free code.


Answer 3

They are essential, the screen being a good example. However, in a multithreaded environment or with many developers involved, in practice the question often arises: who (erroneously) set or cleared it? Depending on the architecture, the analysis can be costly and may be required often. While reading a global var can be OK, writing to it must be controlled, for example by a single thread or a thread-safe class. Hence, global vars raise the fear of high development costs arising from these consequences, which is why they are considered evil. Therefore, in general, it’s good practice to keep the number of global vars low.


Enable access control on simple HTTP server

Question: Enable access control on simple HTTP server

I have the following shell script for a very simple HTTP server:

#!/bin/sh

echo "Serving at http://localhost:3000"
python -m SimpleHTTPServer 3000

I was wondering how I can enable or add a CORS header like Access-Control-Allow-Origin: * to this server?


Answer 0

Unfortunately, the simple HTTP server is so simple that it does not allow any customization, especially not of the headers it sends. You can, however, create a simple HTTP server yourself, using most of SimpleHTTPRequestHandler, and just add the desired header.

For that, simply create a file simple-cors-http-server.py (or whatever) and, depending on the Python version you are using, put one of the following code snippets inside.

Then you can do python simple-cors-http-server.py and it will launch your modified server which will set the CORS header for every response.

With the shebang at the top, make the file executable and put it into your PATH, and you can just run it using simple-cors-http-server.py too.

Python 3 solution

Python 3 uses SimpleHTTPRequestHandler and HTTPServer from the http.server module to run the server:

#!/usr/bin/env python3
from http.server import HTTPServer, SimpleHTTPRequestHandler, test
import sys

class CORSRequestHandler (SimpleHTTPRequestHandler):
    def end_headers (self):
        self.send_header('Access-Control-Allow-Origin', '*')
        SimpleHTTPRequestHandler.end_headers(self)

if __name__ == '__main__':
    test(CORSRequestHandler, HTTPServer, port=int(sys.argv[1]) if len(sys.argv) > 1 else 8000)
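
To check that the header really is sent, the handler can be started on a throwaway port and queried from the same process. This is only a sketch for Python 3: the helper name fetch_cors_header, the use of port 0 (let the OS pick a free port) and the silenced logging are assumptions made for the demo, not part of the answer above.

```python
import http.client
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


class CORSRequestHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Add the CORS header just before the header section is closed.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()

    def log_message(self, format, *args):
        # Keep the demo quiet; the default implementation logs to stderr.
        pass


def fetch_cors_header():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), CORSRequestHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        header = response.getheader("Access-Control-Allow-Origin")
        response.read()
        conn.close()
        return header
    finally:
        server.shutdown()
        server.server_close()


cors_header = fetch_cors_header()
print(cors_header)  # *
```

Because the header is added in end_headers, it is sent on error responses as well as on successful ones.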

Python 2 solution

Python 2 uses SimpleHTTPServer.SimpleHTTPRequestHandler and the BaseHTTPServer module to run the server.

#!/usr/bin/env python2
from SimpleHTTPServer import SimpleHTTPRequestHandler
import BaseHTTPServer

class CORSRequestHandler (SimpleHTTPRequestHandler):
    def end_headers (self):
        self.send_header('Access-Control-Allow-Origin', '*')
        SimpleHTTPRequestHandler.end_headers(self)

if __name__ == '__main__':
    BaseHTTPServer.test(CORSRequestHandler, BaseHTTPServer.HTTPServer)

Python 2 & 3 solution

If you need compatibility for both Python 3 and Python 2, you could use this polyglot script that works in both versions. It first tries to import from the Python 3 locations, and otherwise falls back to Python 2:

#!/usr/bin/env python
try:
    # Python 3
    from http.server import HTTPServer, SimpleHTTPRequestHandler, test as test_orig
    import sys
    def test (*args):
        test_orig(*args, port=int(sys.argv[1]) if len(sys.argv) > 1 else 8000)
except ImportError: # Python 2
    from BaseHTTPServer import HTTPServer, test
    from SimpleHTTPServer import SimpleHTTPRequestHandler

class CORSRequestHandler (SimpleHTTPRequestHandler):
    def end_headers (self):
        self.send_header('Access-Control-Allow-Origin', '*')
        SimpleHTTPRequestHandler.end_headers(self)

if __name__ == '__main__':
    test(CORSRequestHandler, HTTPServer)

Answer 1

Try an alternative like http-server

As SimpleHTTPServer is not really the kind of server you deploy to production, I’m assuming here that you don’t care that much about which tool you use, as long as it does the job of exposing your files at http://localhost:3000 with CORS headers from a simple command line:

# install (it requires nodejs/npm)
npm install http-server -g

#run
http-server -p 3000 --cors

Need HTTPS?

If you need HTTPS locally, you can also try caddy or certbot.


Some related tools you might find useful

  • ngrok: when running ngrok http 3000, it creates a URL https://$random.ngrok.com that permits anyone to access your http://localhost:3000 server. It can expose to the world what runs locally on your computer (including local backends/APIs)

  • localtunnel: almost the same as ngrok

  • now: when running now, it uploads your static assets online and deploys them to https://$random.now.sh. They remain online forever unless you decide otherwise. Deployment is fast (except the first one) thanks to diffing. Now is suitable for production frontend/SPA code deployment. It can also deploy Docker and NodeJS apps. It is not really free, but they have a free plan.


Answer 2

I had the same problem and came to this solution:

class Handler(SimpleHTTPRequestHandler):
    def send_response(self, *args, **kwargs):
        SimpleHTTPRequestHandler.send_response(self, *args, **kwargs)
        self.send_header('Access-Control-Allow-Origin', '*')

I simply created a new class inheriting from SimpleHTTPRequestHandler that only changes the send_response method.


Answer 3

You’ll need to provide your own implementations of do_GET() (and do_HEAD() if you choose to support HEAD operations). Something like this:

# Python 3 import; on Python 2 use: from SimpleHTTPServer import SimpleHTTPRequestHandler
from http.server import SimpleHTTPRequestHandler

class MyHTTPRequestHandler(SimpleHTTPRequestHandler):

    # Client addresses allowed to connect. Client ports are ephemeral, so only the IP is compared.
    allowed_hosts = ('127.0.0.1',)

    def do_GET(self):
        if self.client_address[0] not in self.allowed_hosts:
            # send_error writes a complete response, including the headers
            self.send_error(403, 'request not allowed')
        else:
            SimpleHTTPRequestHandler.do_GET(self)


How can I check if a character in a string is a letter? (Python)

Question: How can I check if a character in a string is a letter? (Python)

I know about islower and isupper, but can you check whether or not that character is a letter? For Example:

>>> s = 'abcdefg'
>>> s2 = '123abcd'
>>> s3 = 'abcDEFG'
>>> s[0].islower()
True

>>> s2[0].islower()
False

>>> s3[0].islower()
True

Is there any way to just ask if it is a character besides doing .islower() or .isupper()?


回答 0

您可以使用str.isalpha()

例如:

s = 'a123b'

for char in s:
    print(char, char.isalpha())

输出:

a True
1 False
2 False
3 False
b True

You can use str.isalpha().

For example:

s = 'a123b'

for char in s:
    print(char, char.isalpha())

Output:

a True
1 False
2 False
3 False
b True

回答 1

str.isalpha()

如果字符串中的所有字符都是字母并且至少包含一个字符,则返回true,否则返回false。字母字符是在Unicode字符数据库中定义为“字母”的那些字符,即,具有一般类别属性为“ Lm”,“ Lt”,“ Lu”,“ Ll”或“ Lo”之一的那些字符。请注意,这与Unicode标准中定义的“字母”属性不同。

在python2.x中:

>>> s = u'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
中 True
文 True
>>> s = 'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
� False
� False
� False
� False
� False
� False
>>>

在python3.x中:

>>> s = 'a1中文'
>>> for char in s: print(char, char.isalpha())
...
a True
1 False
中 True
文 True
>>>

此代码的工作原理:

>>> def is_alpha(word):
...     try:
...         return word.encode('ascii').isalpha()
...     except:
...         return False
...
>>> is_alpha('中国')
False
>>> is_alpha(u'中国')
False
>>>

>>> a = 'ａ'
>>> b = 'a'
>>> ord(a), ord(b)
(65345, 97)
>>> a.isalpha(), b.isalpha()
(True, True)
>>> is_alpha(a), is_alpha(b)
(False, True)
>>>
str.isalpha()

Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard.
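
To see the category rule in action, here is a small sketch that compares str.isalpha() with a check against the five general categories listed above, using the standard unicodedata module. The helper name is_letter and the sample characters are chosen only for illustration.

```python
import unicodedata

# The general categories that the documentation above counts as "Letter".
LETTER_CATEGORIES = {"Lm", "Lt", "Lu", "Ll", "Lo"}


def is_letter(ch):
    """Return True if the single character ch falls in a Letter category."""
    return unicodedata.category(ch) in LETTER_CATEGORIES


samples = ["a", "Z", "中", "1", " ", "_"]
for ch in samples:
    # Show the character, its category, and both answers side by side.
    print(repr(ch), unicodedata.category(ch), ch.isalpha(), is_letter(ch))
```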

In python2.x:

>>> s = u'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
中 True
文 True
>>> s = 'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
� False
� False
� False
� False
� False
� False
>>>

In python3.x:

>>> s = 'a1中文'
>>> for char in s: print(char, char.isalpha())
...
a True
1 False
中 True
文 True
>>>

This code works (it accepts ASCII letters only):

>>> def is_alpha(word):
...     try:
...         return word.encode('ascii').isalpha()
...     except:
...         return False
...
>>> is_alpha('中国')
False
>>> is_alpha(u'中国')
False
>>>

>>> a = 'ａ'
>>> b = 'a'
>>> ord(a), ord(b)
(65345, 97)
>>> a.isalpha(), b.isalpha()
(True, True)
>>> is_alpha(a), is_alpha(b)
(False, True)
>>>

回答 2

我发现使用函数和基本代码可以实现此目的。这是一个接受字符串并计算大写字母,小写字母以及“其他”数量的代码。其他分类为空格,标点符号,甚至日语和中文字符。

def check(count):

    lowercase = 0
    uppercase = 0
    other = 0

    low = 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'
    upper = 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'



    for n in count:
        if n in low:
            lowercase += 1
        elif n in upper:
            uppercase += 1
        else:
            other += 1

    print("There are " + str(lowercase) + " lowercase letters.")
    print("There are " + str(uppercase) + " uppercase letters.")
    print("There are " + str(other) + " other elements to this sentence.")

I found a good way to do this using a function and basic code. It accepts a string and counts the number of uppercase letters, lowercase letters, and ‘other’ characters. Anything else, such as a space, a punctuation mark, or even a Japanese or Chinese character, counts as ‘other’.

def check(count):

    lowercase = 0
    uppercase = 0
    other = 0

    low = 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'
    upper = 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'



    for n in count:
        if n in low:
            lowercase += 1
        elif n in upper:
            uppercase += 1
        else:
            other += 1

    print("There are " + str(lowercase) + " lowercase letters.")
    print("There are " + str(uppercase) + " uppercase letters.")
    print("There are " + str(other) + " other elements to this sentence.")
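
The same counting can be written more compactly with the constants in the standard string module. This is just a sketch of an alternative, not the answer's code; the function name count_cases is made up, and it returns the three counts instead of printing them.

```python
import string


def count_cases(text):
    """Return (lowercase, uppercase, other) counts, treating only ASCII as letters."""
    lower = sum(ch in string.ascii_lowercase for ch in text)
    upper = sum(ch in string.ascii_uppercase for ch in text)
    # Everything that is not an ASCII letter lands in "other".
    return lower, upper, len(text) - lower - upper


lower, upper, other = count_cases("Hello, World!")
print("lowercase:", lower, "uppercase:", upper, "other:", other)
```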

回答 3

data = "abcdefg hi j 12345"

digits_count = 0
letters_count = 0
others_count = 0

for i in data:

    if i.isdigit():
        digits_count += 1 
    elif i.isalpha():
        letters_count += 1
    else:
        others_count += 1

print("Result:")        
print("Letters=", letters_count)
print("Digits=", digits_count)

输出:

Result:
Letters= 10
Digits= 5

通过使用str.isalpha()您可以检查它是否是字母。

data = "abcdefg hi j 12345"

digits_count = 0
letters_count = 0
others_count = 0

for i in data:

    if i.isdigit():
        digits_count += 1 
    elif i.isalpha():
        letters_count += 1
    else:
        others_count += 1

print("Result:")        
print("Letters=", letters_count)
print("Digits=", digits_count)

Output:

Result:
Letters= 10
Digits= 5

By using str.isalpha() you can check if it is a letter.


回答 4

这有效:

any(c.isalpha() for c in 'string')

This works:

any(c.isalpha() for c in 'string')
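
Note that any() answers "does the string contain at least one letter?", while calling isalpha() on the whole string answers "is every character a letter?". A short sketch of the difference, with helper names invented for illustration:

```python
def has_letter(text):
    """True if at least one character in text is a letter."""
    return any(ch.isalpha() for ch in text)


def only_letters(text):
    """True if text is non-empty and every character is a letter."""
    return text.isalpha()


for sample in ["abc", "ab1", "123", ""]:
    print(repr(sample), has_letter(sample), only_letters(sample))
```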

回答 5

这有效:

word = str(input("Enter string:"))
notChar = 0
isChar = 0
for char in word:
    if not char.isalpha():
        notChar += 1
    else:
        isChar += 1
print(isChar, " were letters; ", notChar, " were not letters.")

This works:

word = str(input("Enter string:"))
notChar = 0
isChar = 0
for char in word:
    if not char.isalpha():
        notChar += 1
    else:
        isChar += 1
print(isChar, " were letters; ", notChar, " were not letters.")

有效地对numpy数组进行降序排序?

问题:有效地对numpy数组进行降序排序?

令我惊讶的是,之前没有提出过这个具体问题,但是我真的没有在SO或文档中找到它np.sort

假设我有一个包含整数的随机numpy数组,例如:

> temp = np.random.randint(1,10, 10)    
> temp
array([2, 4, 7, 4, 2, 2, 7, 6, 4, 4])

如果对它进行排序,则默认情况下我将获得升序:

> np.sort(temp)
array([2, 2, 2, 4, 4, 4, 4, 6, 7, 7])

但我希望解决方案按降序排序。

现在,我知道我可以永远做:

reverse_order = np.sort(temp)[::-1]

但这最后的陈述有效吗?它不是按升序创建副本,然后反转此副本以反转顺序获得结果吗?如果确实如此,是否有有效的选择?看起来好像不np.sort接受参数来更改排序操作中的比较符号以使事情相反。

I am surprised this specific question hasn’t been asked before, but I really didn’t find it on SO nor on the documentation of np.sort.

Say I have a random numpy array holding integers, e.g:

> temp = np.random.randint(1,10, 10)    
> temp
array([2, 4, 7, 4, 2, 2, 7, 6, 4, 4])

If I sort it, I get ascending order by default:

> np.sort(temp)
array([2, 2, 2, 4, 4, 4, 4, 6, 7, 7])

but I want the solution to be sorted in descending order.

Now, I know I can always do:

reverse_order = np.sort(temp)[::-1]

but is this last statement efficient? Doesn’t it create a copy in ascending order, and then reverses this copy to get the result in reversed order? If this is indeed the case, is there an efficient alternative? It doesn’t look like np.sort accepts parameters to change the sign of the comparisons in the sort operation to get things in reverse order.


回答 0

temp[::-1].sort()对数组进行排序,然后np.sort(temp)[::-1]创建一个新数组。

In [25]: temp = np.random.randint(1,10, 10)

In [26]: temp
Out[26]: array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

In [27]: id(temp)
Out[27]: 139962713524944

In [28]: temp[::-1].sort()

In [29]: temp
Out[29]: array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])

In [30]: id(temp)
Out[30]: 139962713524944

temp[::-1].sort() sorts the array in place, whereas np.sort(temp)[::-1] creates a new array.

In [25]: temp = np.random.randint(1,10, 10)

In [26]: temp
Out[26]: array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

In [27]: id(temp)
Out[27]: 139962713524944

In [28]: temp[::-1].sort()

In [29]: temp
Out[29]: array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])

In [30]: id(temp)
Out[30]: 139962713524944
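
The difference between the two expressions can also be checked with np.shares_memory instead of id(). The following is a small sketch with arbitrary variable names, not part of the original answer:

```python
import numpy as np

temp = np.array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

# np.sort returns a new sorted array; [::-1] is a reversed view of that copy.
desc_copy = np.sort(temp)[::-1]
copy_shares_memory = np.shares_memory(desc_copy, temp)

# temp[::-1] is a reversed view of temp itself, so sorting the view
# in ascending order leaves temp in descending order.
reversed_view = temp[::-1]
view_shares_memory = np.shares_memory(reversed_view, temp)
reversed_view.sort()

print(temp)
print(copy_shares_memory, view_shares_memory)
```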

回答 1

>>> a=np.array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

>>> np.sort(a)
array([2, 2, 4, 4, 4, 4, 5, 6, 7, 8])

>>> -np.sort(-a)
array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])
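
One caveat worth keeping in mind: the negation trick relies on -a being meaningful for the dtype. As far as I know, negating an unsigned integer array wraps around instead of producing negative numbers, so a 0 in the data ends up in the wrong place. A sketch of that assumption, with example values chosen for illustration:

```python
import numpy as np

signed = np.array([5, 2, 0, 7], dtype=np.int64)
unsigned = signed.astype(np.uint8)

# Works as expected for signed integers.
ok = -np.sort(-signed)

# For uint8 the negation wraps modulo 256, so 0 stays 0 and sorts first.
wrapped = -np.sort(-unsigned)

# Reversing the ascending sort does not depend on negation.
safe = np.sort(unsigned)[::-1]

print(ok, wrapped, safe)
```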

回答 2

对于短数组,我建议np.argsort()通过查找已排序的否定数组的索引来使用,这比反转已排序的数组要快一些:

In [37]: temp = np.random.randint(1,10, 10)

In [38]: %timeit np.sort(temp)[::-1]
100000 loops, best of 3: 4.65 µs per loop

In [39]: %timeit temp[np.argsort(-temp)]
100000 loops, best of 3: 3.91 µs per loop

For short arrays I suggest using np.argsort() to find the indices that sort the negated array, which is slightly faster than reversing the sorted array:

In [37]: temp = np.random.randint(1,10, 10)

In [38]: %timeit np.sort(temp)[::-1]
100000 loops, best of 3: 4.65 µs per loop

In [39]: %timeit temp[np.argsort(-temp)]
100000 loops, best of 3: 3.91 µs per loop

回答 3

不幸的是,当您有一个复杂的数组时,只能np.sort(temp)[::-1]正常工作。这里提到的其他两种方法无效。

Unfortunately when you have a complex array, only np.sort(temp)[::-1] works properly. The two other methods mentioned here are not effective.


回答 4

注意尺寸。

x  # initial numpy array
I = np.argsort(x)   # or I = x.argsort()
y = np.sort(x)      # x.sort() sorts x in place and returns None
z  # reverse sorted array

全反转

z = x[I[::-1]]
z = -np.sort(-x)
z = np.flip(y)
  • np.flip changed in 1.15; earlier versions (1.14 and below) require the axis argument. Solution: pip install --upgrade numpy.

第一维反转

z = y[::-1]
z = np.flipud(y)
z = np.flip(y, axis=0)

逆向二维

z = y[:, ::-1]
z = np.fliplr(y)
z = np.flip(y, axis=1)

测试中

在100×10×10阵列上测试1000次。

Method       | Time (ms)
-------------+----------
y[::-1]      | 0.126659  # only in first dimension
-np.sort(-x) | 0.133152
np.flip(y)   | 0.121711
x[I[::-1]]   | 4.611778

x.sort()     | 0.024961
x.argsort()  | 0.041830
np.flip(x)   | 0.002026

这主要是由于重新索引而不是argsort

# Timing code
import time
import numpy as np


def timeit(fun, xs):
    t = time.time()
    for i in range(len(xs)):  # inline and map gave much worse results for x[-I], 5*t
        fun(xs[i])
    t = time.time() - t
    print(np.round(t,6))

I, N = 1000, (100, 10, 10)
xs = np.random.rand(I,*N)
timeit(lambda x: np.sort(x)[::-1], xs)
timeit(lambda x: -np.sort(-x), xs)
timeit(lambda x: np.flip(x.sort()), xs)
timeit(lambda x: x[x.argsort()[::-1]], xs)
timeit(lambda x: x.sort(), xs)
timeit(lambda x: x.argsort(), xs)
timeit(lambda x: np.flip(x), xs)

Be careful with dimensions.

Let

x  # initial numpy array
I = np.argsort(x)   # or I = x.argsort()
y = np.sort(x)      # x.sort() sorts x in place and returns None
z  # reverse sorted array

Full Reverse

z = x[I[::-1]]
z = -np.sort(-x)
z = np.flip(y)
  • np.flip accepts a missing axis (flip all axes) only since NumPy 1.15; versions up to 1.14 require an explicit axis. Solution: pip install --upgrade numpy.

First Dimension Reversed

z = y[::-1]
z = np.flipud(y)
z = np.flip(y, axis=0)

Second Dimension Reversed

z = y[:, ::-1]
z = np.fliplr(y)
z = np.flip(y, axis=1)

Testing

Testing on a 100×10×10 array 1000 times.

Method       | Time (ms)
-------------+----------
y[::-1]      | 0.126659  # only in first dimension
-np.sort(-x) | 0.133152
np.flip(y)   | 0.121711
x[I[::-1]]   | 4.611778

x.sort()     | 0.024961
x.argsort()  | 0.041830
np.flip(x)   | 0.002026

The slowness of x[I[::-1]] is mainly due to reindexing rather than argsort.

# Timing code
import time
import numpy as np


def timeit(fun, xs):
    t = time.time()
    for i in range(len(xs)):  # inline and map gave much worse results for x[-I], 5*t
        fun(xs[i])
    t = time.time() - t
    print(np.round(t,6))

I, N = 1000, (100, 10, 10)
xs = np.random.rand(I,*N)
timeit(lambda x: np.sort(x)[::-1], xs)
timeit(lambda x: -np.sort(-x), xs)
timeit(lambda x: np.flip(x.sort()), xs)
timeit(lambda x: x[x.argsort()[::-1]], xs)
timeit(lambda x: x.sort(), xs)
timeit(lambda x: x.argsort(), xs)
timeit(lambda x: np.flip(x), xs)

回答 5

您好,我在寻找一种对二维numpy数组进行反向排序的解决方案,但找不到任何有效的方法,但是我想我偶然发现了一个我上载的解决方案,以防万一有人在同一条船上。

x=np.sort(array)
y=np.fliplr(x)

np.sort对升序进行排序,这不是您想要的,但是命令fliplr将行从左向右翻转!似乎可以工作!

希望它可以帮助您!

我猜这与上面关于-np.sort(-a)的建议相似,但是由于评论它并不总是有效而推迟了我的建议。也许我的解决方案也不总是可行,但是我已经用几个阵列对其进行了测试,似乎还可以。

Hello I was searching for a solution to reverse sorting a two dimensional numpy array, and I couldn’t find anything that worked, but I think I have stumbled on a solution which I am uploading just in case anyone is in the same boat.

x=np.sort(array)
y=np.fliplr(x)

np.sort sorts ascending which is not what you want, but the command fliplr flips the rows left to right! Seems to work!

Hope it helps you out!

I guess it’s similar to the suggestion about -np.sort(-a) above, but I was put off that approach by a comment saying it doesn’t always work. Perhaps my solution won’t always work either; however, I have tested it with a few arrays and it seems to be OK.
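
For a two-dimensional array it helps to be explicit about which axis is sorted and which is reversed. A minimal sketch with a made-up example array, showing that np.fliplr(np.sort(a)) matches sorting each row and reversing its columns:

```python
import numpy as np

a = np.array([[3, 1, 2],
              [9, 7, 8]])

# Sort each row (axis=1), then reverse the column order: rows descending.
rows_desc = np.sort(a, axis=1)[:, ::-1]

# np.sort sorts along the last axis by default, so this is the same thing.
rows_desc_fliplr = np.fliplr(np.sort(a))

# Sort each column (axis=0), then reverse the row order: columns descending.
cols_desc = np.sort(a, axis=0)[::-1, :]

print(rows_desc)
print(cols_desc)
```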


回答 6

您可以先对数组进行排序(默认为升序),然后应用np.flip()https://docs.scipy.org/doc/numpy/reference/generated/numpy.flip.html

仅供参考,它也适用于日期时间对象。

例:

    x = np.array([2,3,1,0]) 
    x_sort_asc=np.sort(x) 
    print(x_sort_asc)

    >>> array([0, 1, 2, 3])

    x_sort_desc=np.flip(x_sort_asc) 
    print(x_sort_desc)

    >>> array([3,2,1,0])

You could sort the array first (Ascending by default) and then apply np.flip() (https://docs.scipy.org/doc/numpy/reference/generated/numpy.flip.html)

FYI It works with datetime objects as well.

Example:

    x = np.array([2, 3, 1, 0])
    x_sort_asc = np.sort(x)
    print(x_sort_asc)
    # [0 1 2 3]

    x_sort_desc = np.flip(x_sort_asc)
    print(x_sort_desc)
    # [3 2 1 0]

回答 7

这是一个快速窍门

In[3]: import numpy as np
In[4]: temp = np.random.randint(1,10, 10)
In[5]: temp
Out[5]: array([5, 4, 2, 9, 2, 3, 4, 7, 5, 8])

In[6]: sorted = np.sort(temp)
In[7]: rsorted = list(reversed(sorted))
In[8]: sorted
Out[8]: array([2, 2, 3, 4, 4, 5, 5, 7, 8, 9])

In[9]: rsorted
Out[9]: [9, 8, 7, 5, 5, 4, 4, 3, 2, 2]

Here is a quick trick

In[3]: import numpy as np
In[4]: temp = np.random.randint(1,10, 10)
In[5]: temp
Out[5]: array([5, 4, 2, 9, 2, 3, 4, 7, 5, 8])

In[6]: sorted = np.sort(temp)
In[7]: rsorted = list(reversed(sorted))
In[8]: sorted
Out[8]: array([2, 2, 3, 4, 4, 5, 5, 7, 8, 9])

In[9]: rsorted
Out[9]: [9, 8, 7, 5, 5, 4, 4, 3, 2, 2]

回答 8

我建议使用这个…

np.arange(start_index, end_index, intervals)[::-1]

例如:

np.arange(10, 20, 0.5)
np.arange(10, 20, 0.5)[::-1]

然后您的恢复:

[ 19.5,  19. ,  18.5,  18. ,  17.5,  17. ,  16.5,  16. ,  15.5,
    15. ,  14.5,  14. ,  13.5,  13. ,  12.5,  12. ,  11.5,  11. ,
    10.5,  10. ]

I suggest using this:

np.arange(start_index, end_index, intervals)[::-1]

for example:

np.arange(10, 20, 0.5)
np.arange(10, 20, 0.5)[::-1]

The second call gives this result:

[ 19.5,  19. ,  18.5,  18. ,  17.5,  17. ,  16.5,  16. ,  15.5,
    15. ,  14.5,  14. ,  13.5,  13. ,  12.5,  12. ,  11.5,  11. ,
    10.5,  10. ]

Windows上的RuntimeError尝试Python多处理

问题:Windows上的RuntimeError尝试Python多处理

我正在尝试在Windows机器上使用Threading and Multiprocessing的第一个正式python程序。我无法启动进程,但是python给出了以下消息。问题是,我没有在模块中启动线程。线程在类内的单独模块中处理。

编辑:顺便说一句,此代码在ubuntu上运行良好。在窗户上不太

RuntimeError: 
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.
            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:
                if __name__ == '__main__':
                    freeze_support()
                    ...
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

我的原始代码很长,但是我能够以节略的版本重现该错误。它分为两个文件,第一个是主模块,除了导入处理进程/线程和调用方法的模块外,几乎没有其他作用。第二个模块是代码的关键所在。


testMain.py:

import parallelTestModule

extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)

parallelTestModule.py:

import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):
        print self.name,'\n'

class ProcessRunner:
    """ This class represents a single instance of a running process """
    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-"+str(pid)+"-Thread-"+str(tid)
            th = ThreadRunner(name)
            mythreads.append(th) 
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:    
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads)) 
            myprocs.append(pr) 
#        if __name__ == 'parallelTestModule':    #This didnt work
#        if __name__ == '__main__':              #This obviously doesnt work
#        multiprocessing.freeze_support()        #added after seeing error to no avail
        for i in myprocs:
            i.start()

        for i in myprocs:
            i.join()

I am trying my very first formal python program using Threading and Multiprocessing on a windows machine. I am unable to launch the processes though, with python giving the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module inside a class.

EDIT: By the way, this code runs fine on Ubuntu, but not on Windows.

RuntimeError: 
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.
            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:
                if __name__ == '__main__':
                    freeze_support()
                    ...
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

My original code is pretty long, but I was able to reproduce the error in an abridged version of the code. It is split in two files, the first is the main module and does very little other than import the module which handles processes/threads and calls a method. The second module is where the meat of the code is.


testMain.py:

import parallelTestModule

extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)

parallelTestModule.py:

import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):
        print self.name,'\n'

class ProcessRunner:
    """ This class represents a single instance of a running process """
    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-"+str(pid)+"-Thread-"+str(tid)
            th = ThreadRunner(name)
            mythreads.append(th) 
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:    
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads)) 
            myprocs.append(pr) 
#        if __name__ == 'parallelTestModule':    #This didnt work
#        if __name__ == '__main__':              #This obviously doesnt work
#        multiprocessing.freeze_support()        #added after seeing error to no avail
        for i in myprocs:
            i.start()

        for i in myprocs:
            i.join()

回答 0

在Windows上,子进程将在启动时导入(即执行)主模块。您需要if __name__ == '__main__':在主模块中插入防护,以避免递归创建子流程。

已修改testMain.py

import parallelTestModule

if __name__ == '__main__':    
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)

On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.

Modified testMain.py:

import parallelTestModule

if __name__ == '__main__':    
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)
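
For reference, here is a self-contained Python 3 sketch of the same pattern in a single file. The function names report and run_in_parallel are made up for illustration and are not from the question's code. Everything that starts processes sits behind the guard, so a child that re-imports the module only defines the functions.

```python
import multiprocessing


def report(worker_id):
    """Work done inside each child process."""
    name = multiprocessing.current_process().name
    print("worker", worker_id, "running in", name)


def run_in_parallel(num_processes):
    """Start num_processes children, wait for them, and return their exit codes."""
    procs = [multiprocessing.Process(target=report, args=(i,))
             for i in range(num_processes)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == "__main__":
    # Only the process that was launched as the main script gets here;
    # children importing this module skip the block and start nothing.
    run_in_parallel(2)
```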

回答 1

尝试将代码放入testMain.py的主函数中

import parallelTestModule

if __name__ ==  '__main__':
  extractor = parallelTestModule.ParallelExtractor()
  extractor.runInParallel(numProcesses=2, numThreads=4)

文档

"For an explanation of why (on Windows) the if __name__ == '__main__' 
part is necessary, see Programming guidelines."

哪说

“确保主模块可以由新的Python解释器安全地导入,而不会引起意外的副作用(例如,启动新进程)。”

… 通过使用 if __name__ == '__main__'

Try putting your code inside a main function in testMain.py

import parallelTestModule

if __name__ ==  '__main__':
  extractor = parallelTestModule.ParallelExtractor()
  extractor.runInParallel(numProcesses=2, numThreads=4)

See the docs:

"For an explanation of why (on Windows) the if __name__ == '__main__' 
part is necessary, see Programming guidelines."

which say

“Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such a starting a new process).”

… by using if __name__ == '__main__'


回答 2

尽管较早的答案是正确的,但有一点复杂之处将有助于进一步说明。

如果您的主模块导入了另一个模块,在该模块中定义了全局变量或类成员变量并将其初始化为(或使用)一些新对象,则可能必须以相同的方式来限制导入:

if __name__ ==  '__main__':
  import my_module

Though the earlier answers are correct, there’s a small complication it would help to remark on.

In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:

if __name__ ==  '__main__':
  import my_module

回答 3

正如@Ofer所说,当您使用其他库或模块时,应将所有库或模块导入其中。 if __name__ == '__main__':

因此,就我而言,这样结束:

if __name__ == '__main__':       
    import librosa
    import os
    import pandas as pd
    run_my_program()

As @Ofer said, when you use other libraries or modules, you should import all of them inside the if __name__ == '__main__': block.

So, in my case, it ended up like this:

if __name__ == '__main__':       
    import librosa
    import os
    import pandas as pd
    run_my_program()

回答 4

就我而言,这是代码中的一个简单错误,在创建变量之前就使用了变量。在尝试上述解决方案之前,值得检查一下。上帝知道为什么我得到这个特殊的错误信息。

In my case it was a simple bug in the code, using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.


在Python中,如何使用urllib查看网站是404还是200?

问题:在Python中,如何使用urllib查看网站是404还是200?

如何通过urllib获取标头的代码?

How can I get the HTTP status code of a response through urllib?


回答 0

getcode()方法(在python2.6中添加)返回与响应一起发送的HTTP状态代码;如果URL不是HTTP URL,则返回None。

>>> a=urllib.urlopen('http://www.google.com/asdfsf')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200

The getcode() method (added in Python 2.6) returns the HTTP status code that was sent with the response, or None if the URL is not an HTTP URL.

>>> a=urllib.urlopen('http://www.google.com/asdfsf')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200

回答 1

您也可以使用urllib2

import urllib2

req = urllib2.Request('http://www.python.org/fish.html')
try:
    resp = urllib2.urlopen(req)
except urllib2.HTTPError as e:
    if e.code == 404:
        pass  # do something...
    else:
        pass  # ...
except urllib2.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    pass  # ...
else:
    # 200
    body = resp.read()

请注意,它HTTPError是一个子类,URLError用于存储HTTP状态代码。

You can use urllib2 as well:

import urllib2

req = urllib2.Request('http://www.python.org/fish.html')
try:
    resp = urllib2.urlopen(req)
except urllib2.HTTPError as e:
    if e.code == 404:
        pass  # do something...
    else:
        pass  # ...
except urllib2.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    pass  # ...
else:
    # 200
    body = resp.read()

Note that HTTPError is a subclass of URLError which stores the HTTP status code.


回答 2

对于Python 3:

import urllib.request, urllib.error

url = 'http://www.google.com/asdfsf'
try:
    conn = urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    # Return code error (e.g. 404, 501, ...)
    # ...
    print('HTTPError: {}'.format(e.code))
except urllib.error.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    # ...
    print('URLError: {}'.format(e.reason))
else:
    # 200
    # ...
    print('good')

For Python 3:

import urllib.request, urllib.error

url = 'http://www.google.com/asdfsf'
try:
    conn = urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    # Return code error (e.g. 404, 501, ...)
    # ...
    print('HTTPError: {}'.format(e.code))
except urllib.error.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    # ...
    print('URLError: {}'.format(e.reason))
else:
    # 200
    # ...
    print('good')
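
If all you need is the status code, the try/except above can be wrapped in a small helper. This is a sketch only: the name http_status and the opener parameter are made up for illustration (the parameter exists so the function can be exercised without network access), and the helper returns None when no HTTP response was received at all.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if no HTTP response arrived."""
    try:
        with opener(url) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as e:
        # The server answered with an error status (e.g. 404, 501).
        return e.code
    except urllib.error.URLError:
        # Not an HTTP-specific error (e.g. connection refused).
        return None
```

HTTPError has to be caught before URLError because it is a subclass of it.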

回答 3

import urllib2

try:
    fileHandle = urllib2.urlopen('http://www.python.org/fish.html')
    data = fileHandle.read()
    fileHandle.close()
except urllib2.URLError as e:
    # HTTPError is a subclass of URLError, so 404 and friends end up here too
    print 'you got an error with the code', e

Python 3.0、3.1、3.2中的“ ValueError:格式为零长度的字段名称”错误

问题:Python 3.0、3.1、3.2中的“ ValueError:格式为零长度的字段名称”错误

我正在尝试学习Python(具体来说是3),并且遇到了以下错误:

ValueError: zero length field name in format

我用谷歌搜索,发现需要指定数字:

a, b = 0, 1
if a < b:
     print('a ({0}) is less than b ({1})'.format(a, b))
else:
     print('a ({0}) is not less than b ({1})'.format(a, b))

而且不像本教程(来自lynda.com)实际所说的那样:

a, b = 0, 1
if a < b:
     print('a ({}) is less than b ({})'.format(a, b))
else:
     print('a ({}) is not less than b ({})'.format(a, b))

以下教程即时消息具有Python 3.1,即时消息使用3.2,而我读到的有关此错误的信息是,此错误仅发生在<3.1(3.0)中。他们在3.2中撤消了此操作,还是我做错了什么?

另外,说慢点;)从字面上看,这是我第一次学习Python,只是我用Python编写的第二个“脚本”。

I’m trying learn Python (3 to be more specific) and I’m getting this error:

ValueError: zero length field name in format

I googled it and I found out you need to specify the numbers:

a, b = 0, 1
if a < b:
     print('a ({0}) is less than b ({1})'.format(a, b))
else:
     print('a ({0}) is not less than b ({1})'.format(a, b))

And not like the tutorial (from lynda.com) actually says to do:

a, b = 0, 1
if a < b:
     print('a ({}) is less than b ({})'.format(a, b))
else:
     print('a ({}) is not less than b ({})'.format(a, b))

The tutorial I’m following uses Python 3.1 and I’m using 3.2. From what I read, this error only happens in versions before 3.1 (i.e. 3.0). Did they undo this in 3.2, or am I doing something wrong?

Also, speak slowly ;) this is literally my first night learning Python and only the second “script” I’ve written in Python.


回答 0

我会猜测您偶然以某种方式运行了python 2.6。

如果您使用的是python 3,则此功能至少适用于3.1;如果您使用的是python 2,则此功能仅适用于2.7。

My guess is that you are somehow running Python 2.6 by accident.

Omitting the field numbers is supported only from 3.1 onward if you are using Python 3, or from 2.7 onward if you are using Python 2.


回答 1

Python 2.6和3.0需要字段编号。在Python 2.7和更高版本以及3.1和更高版本中,可以忽略它们。

在2.7版中进行了更改:可以省略位置参数说明符,因此'{} {}’等同于'{0} {1}’。

python2.6.4>>> print '|{0:^12}|{1:^12}|'.format(3,4)
|     3      |     4      |

Python 2.6 and 3.0 require the field numbers. In Python 2.7 and later and 3.1 and later, they can be omitted.

Changed in version 2.7: The positional argument specifiers can be omitted, so ‘{} {}’ is equivalent to ‘{0} {1}’.

python2.6.4>>> print '|{0:^12}|{1:^12}|'.format(3,4)
|     3      |     4      |
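
On an interpreter that supports automatic numbering (2.7+, 3.1+), both spellings give the same string. A quick sketch to check which interpreter is actually running and that the two forms agree; the variable names are arbitrary:

```python
import sys

a, b = 0, 1

# Automatic numbering: raises ValueError on Python 2.6 and 3.0.
auto = 'a ({}) is less than b ({})'.format(a, b)

# Explicit numbering: works on every version that has str.format.
explicit = 'a ({0}) is less than b ({1})'.format(a, b)

print(sys.version_info[:3])
print(auto == explicit)
```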

回答 2

如果您使用的是Eclipse,则应查看Window-> Preferences-> PyDev-> Interpreter-Python。那里有口译员的名单(包括姓名和位置)。如果对于当前项目,您正在使用例如位于/ usr / bin / python中的解释器,则可能执行/ usr / bin / python -V whill会给您类似“ Python 2.6.6”的信息。就像Winston Ewert所写的那样,您的答案是正确的。

(您可以通过单击“新建…”按钮并在/ usr / bin / python3中将其添加为“位置”来添加新的交互程序。然后,您可能必须更改项目设置(首选项-> PyDev-解释器/语法)。

If you’re using Eclipse, look under Window -> Preferences -> PyDev -> Interpreter – Python. There you will find a list of interpreters (with name and location). If your current project uses an interpreter located at, for example, /usr/bin/python, then running /usr/bin/python -V will probably print something like “Python 2.6.6”. That confirms the explanation Winston Ewert gave.

(You can add a new interpreter by clicking the “New…” button and giving /usr/bin/python3 as the “location”. You will then probably have to change your project settings under Preferences -> PyDev – Interpreter/Grammar.)


Python中的实例变量与类变量

问题:Python中的实例变量与类变量

我有Python类,在运行时我只需要一个实例,因此每个类仅一个属性,而每个实例仅具有一个属性就足够了。如果将有多个实例(不会发生),则所有实例都应具有相同的配置。我不知道以下哪个选项会更好或更“惯用” Python。

类变量:

class MyController(Controller):

  path = "something/"
  children = [AController, BController]

  def action(self, request):
    pass

实例变量:

class MyController(Controller):

  def __init__(self):
    self.path = "something/"
    self.children = [AController, BController]

  def action(self, request):
    pass

I have Python classes, of which I need only one instance at runtime, so it would be sufficient to have the attributes only once per class and not per instance. If there would be more than one instance (which won’t happen), all instance should have the same configuration. I wonder which of the following options would be better or more “idiomatic” Python.

Class variables:

class MyController(Controller):

  path = "something/"
  children = [AController, BController]

  def action(self, request):
    pass

Instance variables:

class MyController(Controller):

  def __init__(self):
    self.path = "something/"
    self.children = [AController, BController]

  def action(self, request):
    pass

回答 0

如果您仍然只有一个实例,那么最好每个实例都设置所有变量,这仅仅是因为它们的访问速度(稍微快一点)(由于类与实例之间的“继承性”,因此“查找”的级别降低了),而且没有不利的一面来抵消这一小优势。

If you have only one instance anyway, it’s best to make all variables per-instance, simply because they will be accessed (a little bit) faster (one less level of “lookup” due to the “inheritance” from class to instance), and there are no downsides to weigh against this small advantage.


回答 1

进一步呼应MikeAlex的建议,并添加我自己的颜色…

使用实例属性是典型的……更加惯用的Python。由于类属性的用例是特定的,因此未使用过多的类属性。静态方法和类方法与“普通”方法一样。它们是解决特定用例的特殊结构,否则它是由异常的程序员创建的代码,目的是炫耀他们知道Python编程的一些晦涩之处。

Alex在他的答复中提到,由于查找级别降低了,访问将(稍微快一些)……让我进一步澄清那些还不知道如何工作的人。它与变量访问非常相似-搜索顺序为:

  1. 当地人
  2. 非本地人
  3. 全球
  4. 内建

对于属性访问,顺序为:

  1. 实例
  2. MRO确定的基本类(方法解析顺序)

两种技术都以“由内而外”的方式工作,这意味着首先检查大多数局部对象,然后依次检查外层。

在上面的示例中,假设您正在查找path属性。当遇到“ self.path”之类的引用时,Python将首先查看实例属性以进行匹配。如果失败,它将检查实例化对象的类。最后,它将搜索基类。如Alex所述,如果在实例中找到您的属性,则无需在其他地方查找,因此节省了一点时间。

但是,如果您坚持使用类属性,则需要进行额外的查找。或者,您的另一种选择是通过类而不是实例来引用对象,例如,MyController.path代替self.path。这是一个直接查找,可以绕开延迟查找,但是正如alex在下面提到的那样,它是一个全局变量,因此您丢失了原本想保存的那一部分(除非您创建对[global]类名的本地引用) )。

最重要的是,您应该在大多数时间使用实例属性。但是,在某些情况下,类属性是适合该工作的工具。同时使用这两个代码将需要最大的努力,因为使用self只会使您获得实例属性对象,并且可以通过影子访问相同名称的class属性。在这种情况下,必须使用通过类名访问属性,以便对其进行引用。

Further echoing Mike’s and Alex’s advice and adding my own color…

Using instance attributes is the typical, more idiomatic Python. Class attributes are not used as much, since their use cases are specific. The same is true for static and class methods vs. “normal” methods. They’re special constructs addressing specific use cases; otherwise it’s code created by an aberrant programmer wanting to show off that they know some obscure corner of Python programming.

Alex mentions in his reply that access will be (a little bit) faster due to one less level of lookup… let me further clarify for those who don’t know about how this works yet. It is very similar to variable access — the search order of which is:

  1. locals
  2. nonlocals
  3. globals
  4. built-ins

For attribute access, the order is:

  1. instance
  2. class
  3. base classes as determined by the MRO (method resolution order)

Both techniques work in an “inside-out” manner, meaning the most local objects are checked first, then outer layers are checked in succession.

In your example above, let’s say you’re looking up the path attribute. When it encounters a reference like “self.path”, Python will look at the instance attributes first for a match. When that fails, it checks the class from which the object was instantiated. Finally, it will search the base classes. As Alex stated, if your attribute is found in the instance, it doesn’t need to look elsewhere, hence your little bit of time savings.

However, if you insist on class attributes, you need that extra lookup. Or, your other alternative is to refer to the object via the class instead of the instance, e.g., MyController.path instead of self.path. That’s a direct lookup which will get around the deferred lookup, but as alex mentions below, it’s a global variable, so you lose that bit that you thought you were going to save (unless you create a local reference to the [global] class name).
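As a rough sketch of the “local reference to the class name” idea: the method names and the path value below are invented for illustration, and whether the alias saves any measurable time depends on the interpreter.

```python
class MyController:
    path = "/srv/app"  # hypothetical class attribute value

    def total_via_class(self, count):
        total = 0
        for _ in range(count):
            # The name MyController is looked up as a global on every
            # iteration, then .path is looked up on the class.
            total += len(MyController.path)
        return total

    def total_via_alias(self, count):
        cls = MyController  # one global lookup; cls is a local name from here on
        total = 0
        for _ in range(count):
            total += len(cls.path)
        return total


controller = MyController()
same_result = controller.total_via_class(1000) == controller.total_via_alias(1000)
print(same_result)  # True
```

Both methods return the same value; the only difference is how the class object is reached inside the loop.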

The bottom line is that you should use instance attributes most of the time. However, there will be occasions where a class attribute is the right tool for the job. Code using both at the same time will require the most diligence, because using self will only get you the instance attribute object, which shadows the class attribute of the same name. In this case, you must access the attribute by the class name in order to reference it.
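A small sketch of that shadowing, with made-up path values: assigning through the instance creates an instance attribute and leaves the class attribute untouched.

```python
class MyController:
    path = "/class/default"  # class attribute


c = MyController()
before = c.path                 # no instance attribute yet, so the class value is found

c.path = "/instance/override"   # creates an instance attribute; the class is unchanged
via_self = c.path               # the instance attribute shadows the class attribute
via_class = MyController.path   # the class attribute is still reachable by class name
via_type = type(c).path         # same thing, without hard-coding the class name

del c.path                      # removing the instance attribute ends the shadowing
after_del = c.path

print(before, via_self, via_class, via_type, after_del)
```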


Answer 2

When in doubt, you probably want an instance attribute.

Class attributes are best reserved for special cases where they make sense. The only very common use case is methods. It isn’t uncommon to use class attributes for read-only constants that instances need to know (though this only pays off if you also want to access them from outside the class), but you should certainly be cautious about storing any state in them, which is seldom what you want. Even if you will only have one instance, you should write the class like you would any other, which usually means using instance attributes.
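To illustrate the difference between a read-only constant and state stored on the class, here is a sketch with a hypothetical `Connection` class (all names are invented for the example):

```python
class Connection:
    DEFAULT_PORT = 8080  # read-only constant: a reasonable class attribute
    open_hosts = []      # mutable state on the class: shared by every instance

    def __init__(self, host):
        self.host = host
        self.history = []  # per-instance state belongs on the instance

    def describe(self):
        return f"{self.host}:{self.DEFAULT_PORT}"


first = Connection("alpha")
second = Connection("beta")

first.open_hosts.append("alpha")   # mutates the single list stored on the class
first.history.append("connected")  # mutates only first's own list

print(Connection.DEFAULT_PORT)  # 8080, readable from outside the class
print(second.describe())        # beta:8080
print(second.open_hosts)        # ['alpha'], the change leaked to the other instance
print(second.history)           # []
```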


Answer 3

The same question was asked at “Performance of accessing class variables in Python”; the code here is adapted from @Edward Loper.

Local Variables are the fastest to access, pretty much tied with Module Variables, followed by Class Variables, followed by Instance Variables.

There are 4 scopes you can access variables from:

  1. Instance Variables (self.varname)
  2. Class Variables (Classname.varname)
  3. Module Variables (VARNAME)
  4. Local Variables (varname)

The test:

import timeit

# Setup code: executed once per timing run and not included in the measurement.
setup = '''
XGLOBAL = 5
class A:
    xclass = 5
    def __init__(self):
        self.xinstance = 5
    # Every method assigns xlocal so that all four do the same extra work.
    def f1(self):
        xlocal = 5
        x = self.xinstance  # instance variable
    def f2(self):
        xlocal = 5
        x = A.xclass        # class variable
    def f3(self):
        xlocal = 5
        x = XGLOBAL         # module variable
    def f4(self):
        xlocal = 5
        x = xlocal          # local variable
a = A()
'''
# Each statement runs 300,000,000 times; timeit.timeit returns the total time in seconds.
print('access via instance variable: %.3f' % timeit.timeit('a.f1()', setup=setup, number=300000000))
print('access via class variable: %.3f' % timeit.timeit('a.f2()', setup=setup, number=300000000))
print('access via module variable: %.3f' % timeit.timeit('a.f3()', setup=setup, number=300000000))
print('access via local variable: %.3f' % timeit.timeit('a.f4()', setup=setup, number=300000000))

The result (total seconds for 300,000,000 calls each):

access via instance variable: 93.456
access via class variable: 82.169
access via module variable: 72.634
access via local variable: 72.199
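For a quicker check on your own machine, here is a scaled-down sketch of the same comparison. The method names, the call count and the use of `timeit.repeat` with the minimum of several runs are my own choices, not part of the original test. Absolute numbers, and possibly the ordering, will vary with the Python version and hardware, so no expected output is shown.

```python
import timeit

XGLOBAL = 5


class A:
    xclass = 5

    def __init__(self):
        self.xinstance = 5

    def via_instance(self):
        return self.xinstance

    def via_class(self):
        return A.xclass

    def via_module(self):
        return XGLOBAL

    def via_local(self):
        xlocal = 5
        return xlocal


a = A()
NUMBER = 200_000  # far fewer calls than the original test, so this finishes quickly

timings = {}
for name in ("via_instance", "via_class", "via_module", "via_local"):
    method = getattr(a, name)
    # Take the best of several runs to reduce noise from other processes.
    timings[name] = min(timeit.repeat(method, number=NUMBER, repeat=5))

# Print the results from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f} s")
```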