Tag archive: Python

Python garbage collector documentation

Question: Python garbage collector documentation

I’m looking for documents that describe in detail how Python garbage collection works.

I’m interested in what is done in each step. Which objects are in these 3 collections (generations)? What kinds of objects are deleted in each step? What algorithm is used to find reference cycles?

Background: I’m implementing some searches that have to finish in a small amount of time. When the garbage collector starts collecting the oldest generation, it is much slower than in other cases, and it takes more time than the searches are budgeted for. I’m looking for a way to predict when it will collect the oldest generation and how long that will take.

It is easy to predict when it will collect the oldest generation with get_count() and get_threshold(), and that can be manipulated with set_threshold(). But I don’t see an easy way to decide whether it is better to force a collection with collect() or to wait for the scheduled one.


Answer 0

There’s no definitive resource on how Python does its garbage collection (other than the source code itself), but those 3 links should give you a pretty good idea.

Update

The source is actually pretty helpful. How much you get out of it depends on how well you read C, but the comments are actually very helpful. Skip down to the collect() function and the comments explain the process well (albeit in very technical terms).
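
For the timing side of the question, a minimal sketch (not taken from the sources above) is to read the counters with the standard gc module, time a forced full collection, and optionally pause automatic collection around a latency-sensitive search. The helper names timed_full_collect and run_without_gc are made up for illustration.

```python
import gc
import time

# Per-generation counters and the thresholds that trigger a collection.
counts = gc.get_count()          # (gen0, gen1, gen2)
thresholds = gc.get_threshold()  # (threshold0, threshold1, threshold2)
print("counts:", counts, "thresholds:", thresholds)


def timed_full_collect():
    """Force a collection of all generations; return (seconds, unreachable objects found)."""
    start = time.perf_counter()
    unreachable = gc.collect(2)  # generation 2 is the oldest
    return time.perf_counter() - start, unreachable


def run_without_gc(func, *args, **kwargs):
    """Call func with automatic collection paused, restoring the previous state afterwards."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return func(*args, **kwargs)
    finally:
        if was_enabled:
            gc.enable()


elapsed, found = timed_full_collect()
print("full collection took %.6f s, found %d unreachable objects" % (elapsed, found))
print("sum computed with gc paused:", run_without_gc(sum, range(10)))
```

Collecting just before a time-critical search and pausing the collector for its duration trades memory for predictability. In CPython, reference counting still frees non-cyclic garbage while the cycle collector is disabled.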


Should __init__() call the parent class’s __init__()?

Question: Should __init__() call the parent class’s __init__()?

In Objective-C I’m used to this construct:

- (id)init {
    if (self = [super init]) {
        // init class
    }
    return self;
}

Should Python also call the parent class’s implementation for __init__?

class NewClass(SomeOtherClass):
    def __init__(self):
        SomeOtherClass.__init__(self)
        # init class

Is this also true/false for __new__() and __del__()?

Edit: There’s a very similar question: Inheritance and Overriding __init__ in Python


Answer 0

In Python, calling the super-class’ __init__ is optional. If you call it, it is then also optional whether to use the super identifier, or whether to explicitly name the super class:

object.__init__(self)

In the case of object, calling the super method is not strictly necessary, since the super method is empty. The same goes for __del__.

On the other hand, for __new__, you should indeed call the super method and use its return value as the newly created object, unless you explicitly want to return something different.
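
As a small sketch of the __new__ case (the Tracked class and its created_by_new attribute are invented for illustration), the override delegates allocation to the parent and returns the result:

```python
class Tracked:
    def __new__(cls, *args, **kwargs):
        # Let the parent allocate the object; object.__new__ is given only cls here.
        instance = super().__new__(cls)
        instance.created_by_new = True
        return instance  # whatever __new__ returns becomes the new object

    def __init__(self, value):
        self.value = value


t = Tracked(42)
print(t.created_by_new, t.value)
```

Because the returned object is an instance of the class, Python goes on to call __init__ on it with the same arguments.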


Answer 1

If you need something from super’s __init__ to be done in addition to what is being done in the current class’s __init__, you must call it yourself, since that will not happen automatically. But if you don’t need anything from super’s __init__, no need to call it. Example:

>>> class C(object):
        def __init__(self):
            self.b = 1


>>> class D(C):
        def __init__(self):
            super().__init__() # in Python 2 use super(D, self).__init__()
            self.a = 1


>>> class E(C):
        def __init__(self):
            self.a = 1


>>> d = D()
>>> d.a
1
>>> d.b  # This works because of the call to super's init
1
>>> e = E()
>>> e.a
1
>>> e.b  # This is going to fail since nothing in E initializes b...
Traceback (most recent call last):
  File "<pyshell#70>", line 1, in <module>
    e.b  # This is going to fail since nothing in E initializes b...
AttributeError: 'E' object has no attribute 'b'

__del__ works the same way (but be wary of relying on __del__ for finalization; consider doing it via the with statement instead).

I rarely use __new__. I do all the initialization in __init__.
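
To illustrate the with suggestion, here is a minimal sketch of a class that releases its resource in __exit__ rather than in __del__. The Connection class is a stand-in written for this example, not a real library type.

```python
class Connection:
    """Hypothetical resource holder used only for this example."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block exits, even if the block raised.
        self.close()
        return False  # do not swallow exceptions


with Connection() as conn:
    inside = conn.is_open

print(inside, conn.is_open)
```

The cleanup happens at a well-defined point (the end of the with block) instead of whenever the object happens to be finalized.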


Answer 2

In Anon’s answer:
“If you need something from super’s __init__ to be done in addition to what is being done in the current class’s __init__ , you must call it yourself, since that will not happen automatically”

It’s incredible: he states exactly the opposite of the principle of inheritance.


It is not that “something from super’s __init__ (…) will not happen automatically”; it WOULD happen automatically, but it doesn’t because the base class’s __init__ is overridden by the definition of the derived class’s __init__.

So then, WHY define a derived class’s __init__, since it overrides what one is aiming for when resorting to inheritance?

It’s because one needs to define something that is NOT done in the base class’s __init__, and the only way to obtain that is to put its execution in the derived class’s __init__ function.
In other words, one needs something in addition to what would be done automatically by the base class’s __init__ if the latter weren’t overridden.
NOT the contrary.


Then the problem is that the desired instructions in the base class’s __init__ are no longer executed at instantiation. To offset this, something special is required: explicitly calling the base class’s __init__, in order to KEEP, NOT TO ADD, the initialization performed by the base class’s __init__. That’s exactly what the official doc says:

An overriding method in a derived class may in fact want to extend rather than simply replace the base class method of the same name. There is a simple way to call the base class method directly: just call BaseClassName.methodname(self, arguments).
http://docs.python.org/tutorial/classes.html#inheritance

That’s the whole story:

  • when the aim is to KEEP the initialization performed by the base class (pure inheritance), nothing special is needed; just avoid defining an __init__ function in the derived class

  • when the aim is to REPLACE the initialization performed by the base class, __init__ must be defined in the derived class

  • when the aim is to ADD processing to the initialization performed by the base class, the derived class’s __init__ must be defined and must include an explicit call to the base class’s __init__


What astonishes me about Anon’s post is not only that it states the opposite of inheritance theory, but that 5 people upvoted it without batting an eye, and that nobody reacted in 2 years in a thread whose interesting subject must be read relatively often.


Answer 3

Edit: (after the code change)
There is no way for us to tell you whether or not you need to call your parent’s __init__ (or any other function). Inheritance obviously works without such a call. It all depends on the logic of your code: for example, if all your initialization is done in the parent class, you can skip the child class’s __init__ altogether.

Consider the following example:

>>> class A:
    def __init__(self, val):
        self.a = val


>>> class B(A):
    pass

>>> class C(A):
    def __init__(self, val):
        A.__init__(self, val)
        self.a += val


>>> A(4).a
4
>>> B(5).a
5
>>> C(6).a
12

Answer 4

There’s no hard and fast rule. The documentation for a class should indicate whether subclasses should call the superclass method. Sometimes you want to completely replace superclass behaviour, and at other times augment it – i.e. call your own code before and/or after a superclass call.

Update: The same basic logic applies to any method call. Constructors sometimes need special consideration (as they often set up state which determines behaviour) and destructors because they parallel constructors (e.g. in the allocation of resources, e.g. database connections). But the same might apply, say, to the render() method of a widget.

Further update: What’s the OPP? Do you mean OOP? No – a subclass often needs to know something about the design of the superclass. Not the internal implementation details – but the basic contract that the superclass has with its clients (using classes). This does not violate OOP principles in any way. That’s why protected is a valid concept in OOP in general (though not, of course, in Python).
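
The render() case might look like the following sketch, where Widget and FramedWidget are hypothetical classes and the subclass runs its own code both before and after the superclass call:

```python
class Widget:
    """Hypothetical base widget that renders to a list of lines."""

    def render(self):
        return ["<widget>"]


class FramedWidget(Widget):
    def render(self):
        lines = ["<frame>"]             # own code before the superclass call
        lines.extend(super().render())  # augment rather than replace
        lines.append("</frame>")        # own code after the superclass call
        return lines


print(FramedWidget().render())
```

Whether a subclass should augment like this or replace the method outright is exactly what the superclass documentation ought to say.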


Answer 5

IMO, you should call it. If your superclass is object, you should not, but in other cases I think it is exceptional not to call it. As already answered by others, it is very convenient if your class doesn’t even have to override __init__ itself, for example when it has no (additional) internal state to initialize.


Answer 6

Yes, you should always call the base class __init__ explicitly as a good coding practice. Forgetting to do this can cause subtle issues or runtime errors. This is true even if __init__ doesn’t take any parameters. This is unlike other languages, where the compiler implicitly calls the base class constructor for you. Python doesn’t do that!

The main reason for always calling the base class __init__ is that the base class typically creates member variables and initializes them to defaults. So if you don’t call the base class __init__, none of that code is executed and you end up with an instance that lacks the base class’s member variables.

Example:

class Base:
  def __init__(self):
    print('base init')

class Derived1(Base):
  def __init__(self):
    print('derived1 init')

class Derived2(Base):
  def __init__(self):
    super(Derived2, self).__init__()
    print('derived2 init')

print('Creating Derived1...')
d1 = Derived1()
print('Creating Derived2...')
d2 = Derived2()

This prints:

Creating Derived1...
derived1 init
Creating Derived2...
base init
derived2 init



Iterating over dictionary key values corresponding to a list in Python

Question: Iterating over dictionary key values corresponding to a list in Python

Working in Python 2.7. I have a dictionary with team names as the keys and, as each value, a list with the number of runs scored and allowed by that team:

NL_East = {'Phillies': [645, 469], 'Braves': [599, 548], 'Mets': [653, 672]}

I would like to be able to feed the dictionary into a function and iterate over each team (the keys).

Here’s the code I’m using. Right now, I can only go team by team. How would I iterate over each team and print the expected win_percentage for each team?

def Pythag(league):
    runs_scored = float(league['Phillies'][0])
    runs_allowed = float(league['Phillies'][1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print win_percentage

Thanks for any help.


Answer 0

You have several options for iterating over a dictionary.

If you iterate over the dictionary itself (for team in league), you will be iterating over the keys of the dictionary. When looping with a for loop, the behavior will be the same whether you loop over the dict (league) itself, or league.keys():

for team in league.keys():
    runs_scored, runs_allowed = map(float, league[team])

You can also iterate over both the keys and the values at once by iterating over league.items():

for team, runs in league.items():
    runs_scored, runs_allowed = map(float, runs)

You can even perform your tuple unpacking while iterating:

for team, (runs_scored, runs_allowed) in league.items():
    runs_scored = float(runs_scored)
    runs_allowed = float(runs_allowed)
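
Putting the tuple-unpacking loop back into the function from the question gives something like the sketch below. It is written for Python 3 (the question uses 2.7, where print is a statement), and it returns a dict instead of printing, which is a choice made here for illustration. The lowercase name pythag is also just this example’s spelling.

```python
def pythag(league):
    """Return {team: expected wins per 1000 games} using the formula from the question."""
    result = {}
    for team, (runs_scored, runs_allowed) in league.items():
        scored_sq = float(runs_scored) ** 2
        allowed_sq = float(runs_allowed) ** 2
        result[team] = round(scored_sq / (scored_sq + allowed_sq) * 1000)
    return result


NL_East = {'Phillies': [645, 469], 'Braves': [599, 548], 'Mets': [653, 672]}

for team, win_percentage in pythag(NL_East).items():
    print(team, win_percentage)
```

Separating the calculation from the printing keeps the function easy to test and reuse.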

Answer 1

You can very easily iterate over dictionaries, too:

for team, scores in NL_East.iteritems():
    runs_scored = float(scores[0])
    runs_allowed = float(scores[1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print '%s: %.1f%%' % (team, win_percentage)

Answer 2

Dictionaries have a built-in method called iterkeys() (Python 2 only; it was removed in Python 3).

Try:

for team in league.iterkeys():
    runs_scored = float(league[team][0])
    runs_allowed = float(league[team][1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print win_percentage

Answer 3

Dictionary objects allow you to iterate over their items. Also, with tuple unpacking and the division import from __future__ you can simplify things a bit.

Finally, you can separate your logic from your printing to make things a bit easier to refactor/debug later.

from __future__ import division

def Pythag(league):
    def win_percentages():
        for team, (runs_scored, runs_allowed) in league.iteritems():
            win_percentage = round((runs_scored**2) / ((runs_scored**2)+(runs_allowed**2))*1000)
            yield win_percentage

    for win_percentage in win_percentages():
        print win_percentage

Answer 4

List comprehension can shorten things…

win_percentages = [m**2.0 / (m**2.0 + n**2.0) * 100 for m, n in [NL_East[i] for i in NL_East]]

Bulk indentation for Python in Emacs

Question: Bulk indentation for Python in Emacs

When working with Python in Emacs, if I want to add a try/except around a block of code, I often find that I have to indent the whole block line by line. In Emacs, how do you indent the whole block at once?

I am not an experienced Emacs user, but I find it is the best tool for working over ssh. I am using Emacs on the command line (Ubuntu), not as a GUI, if that makes any difference.


Answer 0

If you are programming Python using Emacs, then you should probably be using python-mode. With python-mode, after marking the block of code,

C-c > or C-c C-l shifts the region 4 spaces to the right

C-c < or C-c C-r shifts the region 4 spaces to the left

If you need to shift code by two levels of indentation, or by some arbitrary amount, you can prefix the command with an argument:

C-u 8 C-c > shifts the region 8 spaces to the right

C-u 8 C-c < shifts the region 8 spaces to the left

Another alternative is to use M-x indent-rigidly which is bound to C-x TAB:

C-u 8 C-x TAB shifts the region 8 spaces to the right

C-u -8 C-x TAB shifts the region 8 spaces to the left

Also useful are the rectangle commands that operate on rectangles of text instead of lines of text.

For example, after marking a rectangular region,

C-x r o inserts blank space to fill the rectangular region (effectively shifting code to the right)

C-x r k kills the rectangular region (effectively shifting code to the left)

C-x r t prompts for a string to replace the rectangle with. Entering C-u 8 <space> will then enter 8 spaces.

PS. With Ubuntu, to make python-mode the default mode for all .py files, simply install the python-mode package.


Answer 1

In addition to indent-region, which is mapped to C-M-\ by default, the rectangle edit commands are very useful for Python. Mark a region as normal, then:

  • C-x r t (string-rectangle): will prompt you for characters you’d like to insert into each line; great for inserting a certain number of spaces
  • C-x r k (kill-rectangle): remove a rectangle region; great for removing indentation

You can also C-x r y (yank-rectangle), but that’s only rarely useful.


Answer 2

indent-region mapped to C-M-\ should do the trick.


Answer 3

I’ve been using this function to handle my indenting and unindenting:

(defun unindent-dwim (&optional count-arg)
  "Keeps relative spacing in the region.  Unindents to the next multiple of the current tab-width"
  (interactive)
  (let ((deactivate-mark nil)
        (beg (or (and mark-active (region-beginning)) (line-beginning-position)))
        (end (or (and mark-active (region-end)) (line-end-position)))
        (min-indentation)
        (count (or count-arg 1)))
    (save-excursion
      (goto-char beg)
      (while (< (point) end)
        (add-to-list 'min-indentation (current-indentation))
        (forward-line)))
    (if (< 0 count)
        (if (not (< 0 (apply 'min min-indentation)))
            (error "Can't indent any more.  Try `indent-rigidly` with a negative arg.")))
    (if (> 0 count)
        (indent-rigidly beg end (* (- 0 tab-width) count))
      (let (
            (indent-amount
             (apply 'min (mapcar (lambda (x) (- 0 (mod x tab-width))) min-indentation))))
        (indent-rigidly beg end (or
                                 (and (< indent-amount 0) indent-amount)
                                 (* (or count 1) (- 0 tab-width))))))))

And then I assign it to a keyboard shortcut:

(global-set-key (kbd "s-[") 'unindent-dwim)
(global-set-key (kbd "s-]") (lambda () (interactive) (unindent-dwim -1)))

Answer 4

I’m an Emacs newb, so this answer is probably bordering on useless.

None of the answers mentioned so far cover re-indentation of literals like dict or list. E.g. M-x indent-region or M-x python-indent-shift-right and company aren’t going to help if you’ve cut-and-pasted the following literal and need it to be re-indented sensibly:

    foo = {
  'bar' : [
     1,
    2,
        3 ],
      'baz' : {
     'asdf' : {
        'banana' : 1,
        'apple' : 2 } } }

It feels like M-x indent-region should do something sensible in python-mode, but that’s not (yet) the case.

For the specific case where your literals are bracketed, using TAB on the lines in question gets what you want (because whitespace doesn’t play a role).

So what I’ve been doing in such cases is quickly recording a keyboard macro like <f3> C-n TAB <f4> as in F3, Ctrl-n (or down arrow), TAB, F4, and then using F4 repeatedly to apply the macro can save a couple of keystrokes. Or you can do C-u 10 C-x e to apply it 10 times.

(I know it doesn’t sound like much, but try re-indenting 100 lines of garbage literal without missing down-arrow, and then having to go up 5 lines and repeat things ;) ).


Answer 5

I use the following snippet. On TAB, when the selection is inactive, it indents the current line (as it normally does); when the selection is active, it shifts the whole region to the right.

(defun my-python-tab-command (&optional _)
  "If the region is active, shift to the right; otherwise, indent current line."
  (interactive)
  (if (not (region-active-p))
      (indent-for-tab-command)
    (let ((lo (min (region-beginning) (region-end)))
          (hi (max (region-beginning) (region-end))))
      (goto-char lo)
      (beginning-of-line)
      (set-mark (point))
      (goto-char hi)
      (end-of-line)
      (python-indent-shift-right (mark) (point)))))
(define-key python-mode-map [remap indent-for-tab-command] 'my-python-tab-command)

Answer 6

Do indentation interactively.

  1. Select the region to be indented.
  2. C-x TAB.
  3. Use arrows (<- and ->) to indent interactively.
  4. Press Esc three times when you are done with the required indentation.

Copied from my post in: Indent several lines in Emacs


Answer 7

I do something like this universally

;; indent whole buffer
(defun iwb ()
  "indent whole buffer"
  (interactive)
  ;;(delete-trailing-whitespace)
  (indent-region (point-min) (point-max) nil)
  (untabify (point-min) (point-max)))

Confusion between numpy, scipy, matplotlib and pylab

Question: Confusion between numpy, scipy, matplotlib and pylab

Numpy, scipy, matplotlib, and pylab are common terms among those who use Python for scientific computation.

I just learned a bit about pylab, and I got confused. Whenever I want to import numpy, I can always do:

import numpy as np

I assumed that once I do

from pylab import *

numpy will be imported as well (with the np alias). So basically the second one does more than the first.

There are a few things I want to ask:

  1. Is it right that pylab is just a wrapper for numpy, scipy and matplotlib?
  2. As np is the numpy alias in pylab, what are the scipy and matplotlib aliases in pylab? (As far as I know, plt is the alias of matplotlib.pyplot, but I don’t know the alias for matplotlib itself.)

Answer 0

  1. No, pylab is part of matplotlib (in matplotlib.pylab) and tries to give you a MATLAB-like environment. matplotlib has a number of dependencies, among them numpy, which it imports under the common alias np. scipy is not a dependency of matplotlib.

  2. If you run ipython --pylab, an automatic import will put all symbols from matplotlib.pylab into the global scope. As you wrote, numpy gets imported under the np alias. Symbols from matplotlib are available under the mpl alias.


Answer 1

Scipy and numpy are scientific projects whose aim is to bring efficient and fast numeric computing to python.

Matplotlib is the name of the python plotting library.

Pyplot is an interactive api for matplotlib, mostly for use in notebooks like jupyter. You generally use it like this: import matplotlib.pyplot as plt.

Pylab is the same thing as pyplot, but with extra features (its use is currently discouraged).

  • pylab = pyplot + numpy

See more information here: Matplotlib, Pylab, Pyplot, etc: What’s the difference between these and when to use each?


Answer 2

Since some people (like me) may still be confused about usage of pylab since examples using pylab are out there on the internet, here is a quote from the official matplotlib FAQ:

pylab is a convenience module that bulk imports matplotlib.pyplot (for plotting) and numpy (for mathematics and working with arrays) in a single name space. Although many examples use pylab, it is no longer recommended.

So, TL;DR; is do not use pylab, period. Use pyplot and import numpy separately as needed.

Here is the link for further reading and other useful examples.


What is the fastest way to check if a class has a function defined?

Question: What is the fastest way to check if a class has a function defined?

I’m writing an AI state space search algorithm, and I have a generic class which can be used to quickly implement a search algorithm. A subclass would define the necessary operations, and the algorithm does the rest.

Here is where I get stuck: I want to avoid regenerating the parent state over and over again, so I have the following function, which returns the operations that can be legally applied to any state:

def get_operations(self, include_parent=True):
    ops = self._get_operations()
    if not include_parent and self.path.parent_op:
        try:
            parent_inverse = self.invert_op(self.path.parent_op)
            ops.remove(parent_inverse)
        except NotImplementedError:
            pass
    return ops

And the invert_op function throws by default.

Is there a faster way to check to see if the function is not defined than catching an exception?

I was thinking something along the lines of checking for presence in dir, but that doesn’t seem right. hasattr is implemented by calling getattr and checking if it raises, which is not what I want.


Answer 0

Yes, use getattr() to get the attribute, and callable() to verify it is a method:

invert_op = getattr(self, "invert_op", None)
if callable(invert_op):
    invert_op(self.path.parent_op)

Note that getattr() normally raises an exception when the attribute doesn’t exist. However, if you specify a default value (None, in this case), it will return that instead.
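
A minimal sketch of how this check might slot into the question's get_operations. The State and ReversibleState classes and the integer operations are hypothetical, made up for illustration; the base class here simply does not define invert_op instead of raising NotImplementedError.

```python
class State:
    """Hypothetical base class: defines no invert_op at all."""

    def __init__(self, ops, parent_op=None):
        self._ops = list(ops)
        self.parent_op = parent_op

    def get_operations(self, include_parent=True):
        ops = list(self._ops)
        if not include_parent and self.parent_op is not None:
            # Look the method up once; None means "not defined".
            invert_op = getattr(self, "invert_op", None)
            if callable(invert_op):
                inverse = invert_op(self.parent_op)
                if inverse in ops:
                    ops.remove(inverse)
        return ops


class ReversibleState(State):
    """Hypothetical subclass whose operations are signed integers."""

    def invert_op(self, op):
        return -op


plain = State([1, -1, 2], parent_op=1)
reversible = ReversibleState([1, -1, 2], parent_op=1)
print(plain.get_operations(include_parent=False))       # [1, -1, 2]
print(reversible.get_operations(include_parent=False))  # [1, 2]
```

Only the subclass that defines invert_op has the parent's inverse removed; no exception is raised or caught along the way.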


Answer 1

It works in both Python 2 and Python 3

hasattr(self, 'invert_op')

hasattr returns True if the object has an attribute named invert_op, whether or not that attribute is callable. Here is the documentation for you to graze

https://docs.python.org/2/library/functions.html#hasattr https://docs.python.org/3/library/functions.html#hasattr


Answer 2

Is there a faster way to check to see if the function is not defined than catching an exception?

Why are you against that? In most Pythonic cases, it’s better to ask forgiveness than permission. ;-)

hasattr is implemented by calling getattr and checking if it raises, which is not what I want.

Again, why is that? The following is quite Pythonic:

    try:
        invert_op = self.invert_op
    except AttributeError:
        pass
    else:
        parent_inverse = invert_op(self.path.parent_op)
        ops.remove(parent_inverse)

Or,

    # if you supply the optional `default` parameter, no exception is thrown
    invert_op = getattr(self, 'invert_op', None)  
    if invert_op is not None:
        parent_inverse = invert_op(self.path.parent_op)
        ops.remove(parent_inverse)

Note, however, that getattr(obj, attr, default) is basically implemented by catching an exception, too. There is nothing wrong with that in Python land!


Answer 3

The responses herein check if a string is the name of an attribute of the object. An extra step (using callable) is needed to check if the attribute is a method.

So it boils down to: what is the fastest way to check if an object obj has an attribute attrib. The answer is

'attrib' in obj.__dict__

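This is so because a dict hashes its keys, so checking for the key’s existence is fast. Note that obj.__dict__ holds only attributes stored on the instance itself; methods defined on the class (or inherited from a base class) are not found this way.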

See timing comparisons below.

>>> class SomeClass():
...         pass
...
>>> obj = SomeClass()
>>>
>>> getattr(obj, "invert_op", None)
>>>
>>> %timeit getattr(obj, "invert_op", None)
1000000 loops, best of 3: 723 ns per loop
>>> %timeit hasattr(obj, "invert_op")
The slowest run took 4.60 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 674 ns per loop
>>> %timeit "invert_op" in obj.__dict__
The slowest run took 12.19 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 176 ns per loop
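
The timings above compare speed, but the three checks do not answer quite the same question. The sketch below uses hypothetical Base and Child classes (not from the original answer) to show what each check sees, followed by a rough timing harness built on the standard timeit module. No timing figures are quoted, since they depend on the machine and Python version.

```python
import timeit


class Base:
    def invert_op(self, op):  # defined on the class, not on instances
        return op


class Child(Base):
    pass


obj = Child()
obj.cached = 42  # a true instance attribute

print("cached" in obj.__dict__)                   # True: stored on the instance
print("invert_op" in obj.__dict__)                # False: the method lives on Base
print(hasattr(obj, "invert_op"))                  # True: normal lookup walks the MRO
print(callable(getattr(obj, "invert_op", None)))  # True

# Rough timing harness; absolute numbers vary between machines and versions.
for stmt in (
    'getattr(obj, "invert_op", None)',
    'hasattr(obj, "invert_op")',
    '"invert_op" in obj.__dict__',
):
    seconds = timeit.timeit(stmt, globals={"obj": obj}, number=100_000)
    print(f"{stmt:40s} {seconds:.4f} s per 100,000 calls")
```

The dictionary test is the cheapest because it skips the class hierarchy entirely, which is also why it cannot see methods.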

Answer 4

I like Nathan Ostgard’s answer and I up-voted it. But another way you could solve your problem would be to use a memoizing decorator, which would cache the result of the function call. So you can go ahead and have an expensive function that figures something out, but then when you call it over and over the subsequent calls are fast; the memoized version of the function looks up the arguments in a dict, finds the result in the dict from when the actual function computed the result, and returns the result right away.

Here is a recipe for a memoizing decorator called “lru_cache” by Raymond Hettinger. A version of this is now standard in the functools module in Python 3.2.

http://code.activestate.com/recipes/498245-lru-and-lfu-cache-decorators/

http://docs.python.org/release/3.2/library/functools.html
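
As a small illustration of the memoizing idea, here is a sketch using functools.lru_cache from the standard library. The function expensive_inverse and the call counter are hypothetical stand-ins for whatever costly computation is being cached; the arguments must be hashable for the cache to work.

```python
from functools import lru_cache

calls = 0  # counts how often the real function body runs


@lru_cache(maxsize=None)
def expensive_inverse(op):
    """Hypothetical stand-in for a costly computation on a hashable argument."""
    global calls
    calls += 1
    return -op


first = expensive_inverse(5)   # computed
second = expensive_inverse(5)  # served from the cache
print(first, second, calls)                 # -5 -5 1
print(expensive_inverse.cache_info().hits)  # 1
```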


Answer 5

Like anything in Python, if you try hard enough, you can get at the guts and do something really nasty. Now, here’s the nasty part:

def invert_op(self, op):
    raise NotImplementedError

def is_invert_op_implemented(self):
    # Only works in CPython 2.x of course
    return self.invert_op.__code__.co_code == 't\x00\x00\x82\x01\x00d\x00\x00S'

Please do us a favor, just keep doing what you have in your question and DON’T ever use this unless you are on the PyPy team hacking into the Python interpreter. What you have up there is Pythonic, what I have here is pure EVIL.


Answer 6

You can also go over the class:

import inspect


def get_methods(cls_):
    methods = inspect.getmembers(cls_, inspect.isfunction)
    return dict(methods)

# Example
class A(object):
    pass

class B(object):
    def foo(self):
        print('B')


# If you only have an object, you can use `cls_ = obj.__class__`
if 'foo' in get_methods(A):
    print('A has foo')

if 'foo' in get_methods(B):
    print('B has foo')

Answer 7

While checking for attributes in an instance’s __dict__ is really fast, you cannot use this for methods: they are stored in the class’s __dict__, not the instance’s. If performance is that critical, you could resort to a hackish workaround in your class and store the bound method on the instance:

class Test():
    def __init__(self):
        # redefine your method as attribute
        self.custom_method = self.custom_method

    def custom_method(self):
        pass

Then check for method as:

t = Test()
'custom_method' in t.__dict__

Time comparison with getattr:

>>%timeit 'custom_method' in t.__dict__
55.9 ns ± 0.626 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>%timeit getattr(t, 'custom_method', None)
116 ns ± 0.765 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Not that I’m encouraging this approach, but it seems to work.

[EDIT] Performance boost is even higher when method name is not in given class:

>>%timeit 'rubbish' in t.__dict__
65.5 ns ± 11 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>%timeit getattr(t, 'rubbish', None)
385 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Python, creating objects

Question: Python, creating objects

I’m trying to learn Python, and I am now trying to get the hang of classes and how to manipulate them with instances.

I can’t seem to understand this practice problem:

Create and return a student object whose name, age, and major are the same as those given as input

def make_student(name, age, major)

I just don’t get what it means by object. Do they mean I should create an array inside the function that holds these values? Or create a class, put this function inside it, and assign instances? (Before this question I was asked to set up a student class with name, age, and major inside.)

class Student:
    name = "Unknown name"
    age = 0
    major = "Unknown major"

Answer 0

class Student(object):
    name = ""
    age = 0
    major = ""

    # The class "constructor" - It's actually an initializer 
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

def make_student(name, age, major):
    student = Student(name, age, major)
    return student

Note that even though one of the principles in Python’s philosophy is “there should be one—and preferably only one—obvious way to do it”, there are still multiple ways to do this. You can also use the two following snippets of code to take advantage of Python’s dynamic capabilities:

class Student(object):
    name = ""
    age = 0
    major = ""

def make_student(name, age, major):
    student = Student()
    student.name = name
    student.age = age
    student.major = major
    # Note: I didn't need to create a variable in the class definition before doing this.
    student.gpa = float(4.0)
    return student

I prefer the former, but there are instances where the latter can be useful – one being when working with document databases like MongoDB.
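
To make the second variant concrete, the sketch below shows what happens to the class-level defaults when attributes are assigned on an instance. The sample values ("Alice", 20, "Physics") are made up for illustration.

```python
class Student(object):
    name = ""   # class attributes: shared defaults
    age = 0
    major = ""


def make_student(name, age, major):
    student = Student()
    student.name = name    # creates instance attributes that shadow the defaults
    student.age = age
    student.major = major
    return student


alice = make_student("Alice", 20, "Physics")
blank = Student()

print(alice.name, alice.age, alice.major)  # Alice 20 Physics
print(repr(blank.name), blank.age)         # '' 0
print(vars(alice))  # {'name': 'Alice', 'age': 20, 'major': 'Physics'}
print(vars(blank))  # {}
```

The instance created by make_student carries its own values, while a bare Student() has an empty instance dictionary and falls back to the class attributes.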


Answer 1

Create a class and give it an __init__ method:

class Student:
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        return self.age > 100

Now, you can initialize an instance of the Student class:

>>> s = Student('John', 88, None)
>>> s.name
    'John'
>>> s.age
    88

Although I’m not sure why you need a make_student function if it does the same thing as Student.__init__.


Answer 2

Objects are instances of classes. Classes are just the blueprints for objects. So given your class definition –

# Note the added (object) - this is the preferred way of creating new classes
class Student(object):
    name = "Unknown name"
    age = 0
    major = "Unknown major"

You can create a make_student function by explicitly assigning the attributes to a new instance of Student

def make_student(name, age, major):
    student = Student()
    student.name = name
    student.age = age
    student.major = major
    return student

But it probably makes more sense to do this in a constructor (__init__) –

class Student(object):
    def __init__(self, name="Unknown name", age=0, major="Unknown major"):
        self.name = name
        self.age = age
        self.major = major

The constructor is called when you use Student(). It will take the arguments defined in the __init__ method. The constructor signature would now essentially be Student(name, age, major).

If you use that, then a make_student function is trivial (and superfluous) –

def make_student(name, age, major):
    return Student(name, age, major)

For fun, here is an example of how to create a make_student function without defining a class. Please do not try this at home.

def make_student(name, age, major):
    return type('Student', (object,),
                {'name': name, 'age': age, 'major': major})()

Answer 3

When you create an object from a predefined class, you usually create a variable to hold it. You then create the object and store it in that variable.

class Student:
    def __init__(self):
        pass

# creating an object
student1 = Student()

The __init__ method acts as the constructor of the class. You can have it initialize some attributes; in that case, you have to pass values for those attributes when you create the object.

class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# creating an object
student2 = Student("smith", 25)

NumPy where with multiple conditions

Question: NumPy where with multiple conditions

I have an array of distances called dists. I want to select dists which are between two values. I wrote the following line of code to do that:

 dists[(np.where(dists >= r)) and (np.where(dists <= r + dr))]

However this selects only for the condition

 (np.where(dists <= r + dr))

If I do the commands sequentially by using a temporary variable it works fine. Why does the above code not work, and how do I get it to work?

Cheers


Answer 0

The best way in your particular case would just be to change your two criteria to one criterion:

dists[abs(dists - r - dr/2.) <= dr/2.]

It only creates one boolean array, and in my opinion is easier to read because it says, is dist within a dr of r? (Though I’d redefine r to be the center of your region of interest instead of the beginning, so r = r + dr/2.) But that doesn’t answer your question.


The answer to your question:
You don’t actually need where if you’re just trying to filter out the elements of dists that don’t fit your criteria:

dists[(dists >= r) & (dists <= r+dr)]

Because the & will give you an elementwise and (the parentheses are necessary).

Or, if you do want to use where for some reason, you can do:

 dists[(np.where((dists >= r) & (dists <= r + dr)))]

Why:
The reason it doesn’t work is that np.where returns a tuple of index arrays, not a boolean array. You’re applying and to two collections of numbers, which don’t carry the element-wise True/False values that you expect. If a and b are both truthy, then a and b returns b. So something like [0,1,2] and [2,3,4] will just give you [2,3,4]. Here it is in action:

In [230]: dists = np.arange(0,10,.5)
In [231]: r = 5
In [232]: dr = 1

In [233]: np.where(dists >= r)
Out[233]: (array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),)

In [234]: np.where(dists <= r+dr)
Out[234]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

In [235]: np.where(dists >= r) and np.where(dists <= r+dr)
Out[235]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

What you were expecting to compare was simply the boolean array, for example

In [236]: dists >= r
Out[236]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True], dtype=bool)

In [237]: dists <= r + dr
Out[237]: 
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

In [238]: (dists >= r) & (dists <= r + dr)
Out[238]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

Now you can call np.where on the combined boolean array:

In [239]: np.where((dists >= r) & (dists <= r + dr))
Out[239]: (array([10, 11, 12]),)

In [240]: dists[np.where((dists >= r) & (dists <= r + dr))]
Out[240]: array([ 5. ,  5.5,  6. ])

Or simply index the original array with the boolean array using fancy indexing

In [241]: dists[(dists >= r) & (dists <= r + dr)]
Out[241]: array([ 5. ,  5.5,  6. ])
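
The behaviour of and described above is plain Python and can be reproduced without NumPy. The sketch below uses ordinary lists as stand-ins for the index arrays and boolean masks (an analogy only, with made-up values), and spells out the element-wise AND with zip.

```python
# Stand-ins for the index arrays that np.where returns (plain lists here).
idx_ge = [10, 11, 12, 13]        # positions where dists >= r
idx_le = [0, 1, 2, 10, 11, 12]   # positions where dists <= r + dr

# `and` never combines element by element: it returns one of its operands.
print(idx_ge and idx_le)  # [0, 1, 2, 10, 11, 12]  (the second operand)
print([] and idx_le)      # []  (an empty list is falsy, so it is returned)

# What was intended is an element-wise AND of two boolean masks.
dists = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5]
r, dr = 5, 1
mask_ge = [d >= r for d in dists]
mask_le = [d <= r + dr for d in dists]
mask = [a and b for a, b in zip(mask_ge, mask_le)]
selected = [d for d, keep in zip(dists, mask) if keep]
print(selected)  # [5.0, 5.5, 6.0]
```

With NumPy arrays, the & operator performs the element-wise step that the zip loop does here.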

Answer 1

The accepted answer explained the problem well enough. However, the more Numpythonic approach for applying multiple conditions is to use numpy logical functions. In this case you can use np.logical_and:

np.where(np.logical_and(np.greater_equal(dists, r), np.less_equal(dists, r + dr)))

Answer 2

One interesting thing to point out here: the usual way of combining conditions with OR and AND also works in this case, with a small change. Instead of “and” and “or”, use the ampersand (&) and the pipe operator (|).

When we use ‘and’:

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) and (ar<6), 'yo', ar)

Output:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

When we use Ampersand(&):

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) & (ar<6), 'yo', ar)

Output:
array(['3', 'yo', 'yo', '14', '2', 'yo', '3', '7'], dtype='<U11')

The same applies when combining multiple filters on a pandas DataFrame. The reasoning has to do with the difference between logical operators and bitwise operators; for more detail, I’d suggest going through this answer or similar Q/A on Stack Overflow.

UPDATE

A user asked why (ar>3) and (ar<6) need to be wrapped in parentheses. Before discussing what happens here, one needs to know about operator precedence in Python.

Much like BODMAS in arithmetic, Python has rules for which operation is performed first. Expressions inside parentheses are evaluated first, and only then is the bitwise operator applied. Below I show what happens in both cases, with and without “(” and “)”.

Case1:

np.where( ar>3 & ar<6, 'yo', ar)
np.where( np.array([3,4,5,14,2,4,3,7])>3 & np.array([3,4,5,14,2,4,3,7])<6, 'yo', ar)

Since there are no parentheses here, the bitwise operator (&) is applied first, because in the operator precedence table (listed from lowest to highest precedence) & binds more tightly than the < and > operators. The expression is therefore read as ar > (3 & ar) < 6, a chained comparison.

Python evaluates a chained comparison as (ar > (3 & ar)) and ((3 & ar) < 6), and that implicit and needs a single truth value from an array. That is why it gives the error.

One can check out the following link to learn more about: operator precedence

Now to Case 2:

If you do use the bracket, you clearly see what happens.

np.where( (ar>3) & (ar<6), 'yo', ar)
np.where( (array([False,  True,  True,  True, False,  True, False,  True])) & (array([ True,  True,  True, False,  True,  True,  True, False])), 'yo', ar)

Two arrays of True and False. And you can easily perform logical AND operation on them. Which gives you:

np.where( array([False,  True,  True, False, False,  True, False, False]),  'yo', ar)

The rest you know: np.where assigns the first value (here, ‘yo’) wherever the condition is True and the other value (here, the original element) wherever it is False.

That’s all. I hope I explained the query well.
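
The precedence point can be checked without NumPy. The sketch below parses the unparenthesised expression with the standard ast module and then evaluates both spellings on a plain integer; the variable name a and the value 7 are chosen only for illustration.

```python
import ast

# Parse the unparenthesised expression and inspect the resulting tree.
tree = ast.parse("a > 3 & a < 6", mode="eval").body
print(type(tree).__name__)                 # Compare  (one chained comparison)
print(len(tree.ops))                       # 2        (the > and the <)
print(type(tree.comparators[0]).__name__)  # BinOp    (3 & a is evaluated first)

# With a plain integer the two spellings can disagree.
a = 7
without_parens = a > 3 & a < 6      # a > (3 & a) < 6  ->  7 > 3 < 6
with_parens = (a > 3) & (a < 6)     # True & False
print(without_parens, with_parens)  # True False
```

With an array in place of the integer, the chained comparison needs a single truth value for its implicit and, which is where the ValueError comes from.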


Answer 3

I like to use np.vectorize for such tasks. Consider the following:

>>> # function which returns True when constraints are satisfied.
>>> func = lambda d: d >= r and d<= (r+dr) 
>>>
>>> # Apply constraints element-wise to the dists array.
>>> result = np.vectorize(func)(dists) 
>>>
>>> result = np.where(result) # Get output.

You can also use np.argwhere instead of np.where for clear output. But that is your call :)

Hope it helps.


Answer 4

Try:

np.intersect1d(np.where(dists >= r)[0],np.where(dists <= r + dr)[0])

Answer 5

This should work:

dists[((dists >= r) & (dists <= r+dr))]

The most elegant way~~


Answer 6

Try:

import numpy as np
dist = np.array([1,2,3,4,5])
r = 2
dr = 3
np.where(np.logical_and(dist> r, dist<=r+dr))

Output: (array([2, 3]),)

You can see Logic functions for more details.


Answer 7

I have worked out this simple example

import numpy as np

ar = np.array([3,4,5,14,2,4,3,7])

print([x for x in ar.tolist() if 3 <= x <= 6])

>>> 
[3, 4, 5, 4, 3]

Generate random numbers with a given (numerical) distribution

Question: Generate random numbers with a given (numerical) distribution

I have a file with some probabilities for different values e.g.:

1 0.1
2 0.05
3 0.05
4 0.2
5 0.4
6 0.2

I would like to generate random numbers using this distribution. Does an existing module that handles this exist? It’s fairly simple to code on your own (build the cumulative distribution function, generate a random value in [0,1] and pick the corresponding value) but it seems like this should be a common problem and probably someone has created a function/module for it.

I need this because I want to generate a list of birthdays (which do not follow any distribution in the standard random module).


Answer 0

scipy.stats.rv_discrete might be what you want. You can supply your probabilities via the values parameter. You can then use the rvs() method of the distribution object to generate random numbers.

As pointed out by Eugene Pakhomov in the comments, you can also pass a p keyword parameter to numpy.random.choice(), e.g.

numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])

If you are using Python 3.6 or above, you can use random.choices() from the standard library – see the answer by Mark Dickinson.


Answer 1

Since Python 3.6, there’s a solution for this in Python’s standard library, namely random.choices.

Example usage: let’s set up a population and weights matching those in the OP’s question:

>>> from random import choices
>>> population = [1, 2, 3, 4, 5, 6]
>>> weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

Now choices(population, weights) generates a single sample, returned as a one-element list:

>>> choices(population, weights)
[4]

The optional keyword-only argument k allows one to request more than one sample at once. This is valuable because there’s some preparatory work that random.choices has to do every time it’s called, prior to generating any samples; by generating many samples at once, we only have to do that preparatory work once. Here we generate a million samples, and use collections.Counter to check that the distribution we get roughly matches the weights we gave.

>>> million_samples = choices(population, weights, k=10**6)
>>> from collections import Counter
>>> Counter(million_samples)
Counter({5: 399616, 6: 200387, 4: 200117, 1: 99636, 3: 50219, 2: 50025})
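
For the birthday use case mentioned in the question, a minimal sketch along the same lines might look like the following. The day weights are invented purely for illustration (a made-up bump in September), and random_birthdays is a hypothetical helper name, not part of the standard library; real birth-frequency data would have to replace the weights.

```python
import random
from datetime import date, timedelta

def random_birthdays(day_weights, k, year=2001):
    """Draw k birthdays; day_weights[i] is the relative weight of day i of `year`."""
    days = range(len(day_weights))
    offsets = random.choices(days, weights=day_weights, k=k)
    first = date(year, 1, 1)
    return [first + timedelta(days=d) for d in offsets]

# Hypothetical weights: all days equally likely, except a made-up bump in September.
weights = [1.0] * 365          # 2001 is not a leap year
for d in range(243, 273):      # offsets 243..272 are 1 to 30 September 2001
    weights[d] = 1.2

print(random_birthdays(weights, k=5))
```

The weights do not need to sum to 1; random.choices treats them as relative weights.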

回答 2

使用CDF生成列表的一个优点是可以使用二进制搜索。当您需要O(n)的时间和空间进行预处理时,您可以在O(k log n)中获得k个数字。由于普通的Python列表效率低下,因此可以使用array模块。

如果您坚持使用恒定的空间,则可以执行以下操作;O(n)时间,O(1)空间。

def random_distr(l):
    r = random.uniform(0, 1)
    s = 0
    for item, prob in l:
        s += prob
        if s >= r:
            return item
    return item  # Might occur because of floating point inaccuracies

An advantage of generating the list using the CDF is that you can use binary search. Although preprocessing needs O(n) time and space, you can then draw k numbers in O(k log n). Since normal Python lists are inefficient, you can use the array module.

If you insist on constant space, you can do the following; O(n) time, O(1) space.

import random

def random_distr(l):
    # l is a sequence of (item, probability) pairs
    r = random.uniform(0, 1)
    s = 0
    for item, prob in l:
        s += prob  # running cumulative probability
        if s >= r:
            return item
    return item  # Might occur because of floating point inaccuracies
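
The binary-search variant described in the first paragraph is not spelled out in the answer. One way it could be sketched, using only the standard library, is shown here; make_sampler and the optional r parameter are names made up for this example.

```python
import bisect
import itertools
import random

def make_sampler(items, probs):
    """Precompute the cumulative sums once: O(n) time and space."""
    cumulative = list(itertools.accumulate(probs))
    total = cumulative[-1]

    def sample(r=None):
        # r is a uniform draw in [0, 1); it can be passed in explicitly for testing.
        if r is None:
            r = random.random()
        # Index of the first cumulative sum greater than r * total: O(log n).
        i = bisect.bisect_right(cumulative, r * total)
        # Guard against floating-point round-off at the upper end.
        return items[min(i, len(items) - 1)]

    return sample

sample = make_sampler([1, 2, 3, 4, 5, 6], [0.1, 0.05, 0.05, 0.2, 0.4, 0.2])
print([sample() for _ in range(10)])
```

Scaling r by the final cumulative sum means the probabilities do not have to sum to exactly 1.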

回答 3

也许有点晚了。但是您可以使用numpy.random.choice()传递p参数:

val = numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])

Maybe it is kind of late. But you can use numpy.random.choice(), passing the p parameter:

val = numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])

回答 4

(好吧,我知道您正在要求收缩包装,但是也许这些自制的解决方案还不够简洁,无法满足您的喜好。:-)

pdf = [(1, 0.1), (2, 0.05), (3, 0.05), (4, 0.2), (5, 0.4), (6, 0.2)]
cdf = [(i, sum(p for j,p in pdf if j < i)) for i,_ in pdf]
R = max(i for r in [random.random()] for i,c in cdf if c <= r)

我通过确认此表达式的输出来伪确认此方法有效:

sorted(max(i for r in [random.random()] for i,c in cdf if c <= r)
       for _ in range(1000))

(OK, I know you are asking for shrink-wrap, but maybe those home-grown solutions just weren’t succinct enough for your liking. :-)

import random

pdf = [(1, 0.1), (2, 0.05), (3, 0.05), (4, 0.2), (5, 0.4), (6, 0.2)]
cdf = [(i, sum(p for j,p in pdf if j < i)) for i,_ in pdf]
R = max(i for r in [random.random()] for i,c in cdf if c <= r)

I pseudo-confirmed that this works by eyeballing the output of this expression:

sorted(max(i for r in [random.random()] for i,c in cdf if c <= r)
       for _ in range(1000))

回答 5

我写了一个从自定义连续分布中抽取随机样本的解决方案。

我需要一个与您的用例类似的用例(即生成具有给定概率分布的随机日期)。

您只需要功能random_custDist和功能samples=random_custDist(x0,x1,custDist=custDist,size=1000)。剩下的就是装饰^^。

import numpy as np

#funtion
def random_custDist(x0,x1,custDist,size=None, nControl=10**6):
    #genearte a list of size random samples, obeying the distribution custDist
    #suggests random samples between x0 and x1 and accepts the suggestion with probability custDist(x)
    #custDist noes not need to be normalized. Add this condition to increase performance. 
    #Best performance for max_{x in [x0,x1]} custDist(x) = 1
    samples=[]
    nLoop=0
    while len(samples)<size and nLoop<nControl:
        x=np.random.uniform(low=x0,high=x1)
        prop=custDist(x)
        assert prop>=0 and prop<=1
        if np.random.uniform(low=0,high=1) <=prop:
            samples += [x]
        nLoop+=1
    return samples

#call
x0=2007
x1=2019
def custDist(x):
    if x<2010:
        return .3
    else:
        return (np.exp(x-2008)-1)/(np.exp(2019-2007)-1)
samples=random_custDist(x0,x1,custDist=custDist,size=1000)
print(samples)

#plot
import matplotlib.pyplot as plt
#hist
bins=np.linspace(x0,x1,int(x1-x0+1))
hist=np.histogram(samples, bins )[0]
hist=hist/np.sum(hist)
plt.bar( (bins[:-1]+bins[1:])/2, hist, width=.96, label='sample distribution')
#dist
grid=np.linspace(x0,x1,100)
discCustDist=np.array([custDist(x) for x in grid]) #distrete version
discCustDist*=1/(grid[1]-grid[0])/np.sum(discCustDist)
plt.plot(grid,discCustDist,label='custom distribustion (custDist)', color='C1', linewidth=4)
#decoration
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()

该解决方案的性能肯定可以提高,但是我更喜欢可读性。

I wrote a solution for drawing random samples from a custom continuous distribution.

I needed this for a similar use-case to yours (i.e. generating random dates with a given probability distribution).

You just need the function random_custDist and the line samples=random_custDist(x0,x1,custDist=custDist,size=1000). The rest is decoration ^^.

import numpy as np

#function
def random_custDist(x0,x1,custDist,size=1, nControl=10**6):
    #generate a list of `size` random samples obeying the distribution custDist
    #proposes random samples between x0 and x1 and accepts each proposal with probability custDist(x)
    #custDist does not need to be normalized, but scaling it so that its maximum is 1 improves performance.
    #Best performance for max_{x in [x0,x1]} custDist(x) = 1
    samples=[]
    nLoop=0
    while len(samples)<size and nLoop<nControl:
        x=np.random.uniform(low=x0,high=x1)
        prop=custDist(x)
        assert prop>=0 and prop<=1
        if np.random.uniform(low=0,high=1) <=prop:
            samples += [x]
        nLoop+=1
    return samples

#call
x0=2007
x1=2019
def custDist(x):
    if x<2010:
        return .3
    else:
        return (np.exp(x-2008)-1)/(np.exp(2019-2007)-1)
samples=random_custDist(x0,x1,custDist=custDist,size=1000)
print(samples)

#plot
import matplotlib.pyplot as plt
#hist
bins=np.linspace(x0,x1,int(x1-x0+1))
hist=np.histogram(samples, bins )[0]
hist=hist/np.sum(hist)
plt.bar( (bins[:-1]+bins[1:])/2, hist, width=.96, label='sample distribution')
#dist
grid=np.linspace(x0,x1,100)
discCustDist=np.array([custDist(x) for x in grid]) #discrete version
discCustDist*=1/(grid[1]-grid[0])/np.sum(discCustDist)
plt.plot(grid,discCustDist,label='custom distribution (custDist)', color='C1', linewidth=4)
#decoration
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()

The performance of this solution could certainly be improved, but I prefer readability.


回答 6

根据以下内容列出项目weights

items = [1, 2, 3, 4, 5, 6]
probabilities= [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
# if the list of probs is normalized (sum(probs) == 1), omit this part
prob = sum(probabilities) # find sum of probs, to normalize them
c = (1.0)/prob # a multiplier to make a list of normalized probs
probabilities = map(lambda x: c*x, probabilities)
print probabilities

ml = max(probabilities, key=lambda x: len(str(x)) - str(x).find('.'))
ml = len(str(ml)) - str(ml).find('.') -1
amounts = [ int(x*(10**ml)) for x in probabilities]
itemsList = list()
for i in range(0, len(items)): # iterate through original items
  itemsList += items[i:i+1]*amounts[i]

# choose from itemsList randomly
print itemsList

一种优化可能是通过最大公约数对数量进行归一化,以使目标列表更小。

另外,可能很有趣。

Make a list of items, based on their weights:

items = [1, 2, 3, 4, 5, 6]
probabilities= [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
# if the list of probs is normalized (sum(probs) == 1), omit this part
prob = sum(probabilities) # find sum of probs, to normalize them
c = (1.0)/prob # a multiplier to make a list of normalized probs
probabilities = [c*x for x in probabilities] # a list, so it can be reused below
print(probabilities)

# number of decimal places needed by the most precise probability
ml = max(probabilities, key=lambda x: len(str(x)) - str(x).find('.'))
ml = len(str(ml)) - str(ml).find('.') -1
amounts = [ int(round(x*(10**ml))) for x in probabilities] # round, so float error does not truncate a count
itemsList = list()
for i in range(0, len(items)): # iterate through original items
  itemsList += items[i:i+1]*amounts[i]

# choose from itemsList randomly
print(itemsList)

An optimization may be to normalize amounts by the greatest common divisor, to make the target list smaller.

Also, this might be interesting.
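
The greatest-common-divisor optimization mentioned above is not shown in the answer. A minimal sketch of it, assuming the amounts are positive integers like those the code above produces for the example weights (reduce_amounts is a name invented here), could be:

```python
from functools import reduce
from math import gcd

def reduce_amounts(amounts):
    """Divide integer repeat counts by their greatest common divisor."""
    divisor = reduce(gcd, amounts)
    return [a // divisor for a in amounts]

items = [1, 2, 3, 4, 5, 6]
amounts = [10, 5, 5, 20, 40, 20]    # counts for 0.1, 0.05, 0.05, 0.2, 0.4, 0.2
small = reduce_amounts(amounts)      # [2, 1, 1, 4, 8, 4]
items_list = [item for item, n in zip(items, small) for _ in range(n)]
print(len(items_list))               # 20 entries instead of 100
```

The relative frequencies are unchanged, so choosing uniformly from the shorter list samples the same distribution.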


回答 7

另一个答案,可能更快:)

distribution = [(1, 0.2), (2, 0.3), (3, 0.5)]  
# init distribution  
dlist = []  
sumchance = 0  
for value, chance in distribution:  
    sumchance += chance  
    dlist.append((value, sumchance))  
assert sumchance == 1.0 # not good assert because of float equality  

# get random value  
r = random.random()  
# for small distributions use lineair search  
if len(distribution) < 64: # don't know exact speed limit  
    for value, sumchance in dlist:  
        if r < sumchance:  
            return value  
else:  
    # else (not implemented) binary search algorithm  

Another answer, probably faster :)

import random

distribution = [(1, 0.2), (2, 0.3), (3, 0.5)]
# init distribution
dlist = []
sumchance = 0
for value, chance in distribution:
    sumchance += chance
    dlist.append((value, sumchance))
assert abs(sumchance - 1.0) < 1e-9 # compare with a tolerance because of float rounding

def get_random_value():
    # get random value
    r = random.random()
    # for small distributions use linear search
    if len(dlist) < 64: # don't know exact speed limit
        for value, cumulative in dlist:
            if r < cumulative:
                return value
        return dlist[-1][0] # r can exceed the last sum through rounding
    # else (not implemented) binary search algorithm
    raise NotImplementedError("binary search is not implemented")

回答 8

from __future__ import division
import random
from collections import Counter


def num_gen(num_probs):
    # calculate minimum probability to normalize
    min_prob = min(prob for num, prob in num_probs)
    lst = []
    for num, prob in num_probs:
        # keep appending num to lst, proportional to its probability in the distribution
        for _ in range(int(prob/min_prob)):
            lst.append(num)
    # all elems in lst occur proportional to their distribution probablities
    while True:
        # pick a random index from lst
        ind = random.randint(0, len(lst)-1)
        yield lst[ind]

验证:

gen = num_gen([(1, 0.1),
               (2, 0.05),
               (3, 0.05),
               (4, 0.2),
               (5, 0.4),
               (6, 0.2)])
lst = []
times = 10000
for _ in range(times):
    lst.append(next(gen))
# Verify the created distribution:
for item, count in Counter(lst).iteritems():
    print '%d has %f probability' % (item, count/times)

1 has 0.099737 probability
2 has 0.050022 probability
3 has 0.049996 probability 
4 has 0.200154 probability
5 has 0.399791 probability
6 has 0.200300 probability
from __future__ import division
import random
from collections import Counter


def num_gen(num_probs):
    # calculate minimum probability to normalize
    min_prob = min(prob for num, prob in num_probs)
    lst = []
    for num, prob in num_probs:
        # keep appending num to lst, proportional to its probability in the distribution
        for _ in range(int(prob/min_prob)):
            lst.append(num)
    # all elems in lst occur proportional to their distribution probabilities
    while True:
        # pick a random index from lst
        ind = random.randint(0, len(lst)-1)
        yield lst[ind]

Verification:

gen = num_gen([(1, 0.1),
               (2, 0.05),
               (3, 0.05),
               (4, 0.2),
               (5, 0.4),
               (6, 0.2)])
lst = []
times = 10000
for _ in range(times):
    lst.append(next(gen))
# Verify the created distribution:
for item, count in Counter(lst).items():
    print('%d has %f probability' % (item, count/times))

1 has 0.099737 probability
2 has 0.050022 probability
3 has 0.049996 probability 
4 has 0.200154 probability
5 has 0.399791 probability
6 has 0.200300 probability

回答 9

根据其他解决方案,您可以生成累积分布(任意形式为整数或浮点数),然后可以使用二等分来使其快速

这是一个简单的示例(我在这里使用了整数)

l=[(20, 'foo'), (60, 'banana'), (10, 'monkey'), (10, 'monkey2')]
def get_cdf(l):
    ret=[]
    c=0
    for i in l: c+=i[0]; ret.append((c, i[1]))
    return ret

def get_random_item(cdf):
    return cdf[bisect.bisect_left(cdf, (random.randint(0, cdf[-1][0]),))][1]

cdf=get_cdf(l)
for i in range(100): print get_random_item(cdf),

get_cdf函数会将其从20、60、10、10转换为20、20 + 60、20 + 60 + 10、20 + 60 + 10 + 10

现在我们使用选取最大为20 + 60 + 10 + 10的随机数,random.randint然后使用bisect快速获取实际值

Based on other solutions, you generate the cumulative distribution (as integers or floats, whichever you like), then you can use bisect to make it fast.

This is a simple example (I used integers here):

import bisect
import random

l=[(20, 'foo'), (60, 'banana'), (10, 'monkey'), (10, 'monkey2')]
def get_cdf(l):
    ret=[]
    c=0
    for i in l: c+=i[0]; ret.append((c, i[1]))
    return ret

def get_random_item(cdf):
    # draw from 1..total so that each item is hit exactly as often as its weight
    return cdf[bisect.bisect_left(cdf, (random.randint(1, cdf[-1][0]),))][1]

cdf=get_cdf(l)
for i in range(100): print(get_random_item(cdf), end=' ')

The get_cdf function converts the weights 20, 60, 10, 10 into 20, 20+60, 20+60+10, 20+60+10+10.

Now we pick a random number up to 20+60+10+10 using random.randint, then use bisect to find the actual value quickly.
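
On Python 3.6 or later, the same cumulative-weights idea can also be handed to the standard library: random.choices accepts a cum_weights argument. A short sketch using the example weights above (the variable names are chosen here for illustration):

```python
import random

items = ['foo', 'banana', 'monkey', 'monkey2']
cum_weights = [20, 80, 90, 100]   # running totals of 20, 60, 10, 10

picks = random.choices(items, cum_weights=cum_weights, k=100)
print(picks[:10])
```

Passing cum_weights rather than weights spares random.choices from building the running totals itself on every call.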


回答 10

您可能想看看NumPy 随机抽样分布

You might want to have a look at NumPy’s random sampling distributions.


回答 11

这些答案都不是特别清楚或简单的。

这是保证可以正常工作的一种清晰,简单的方法。

accumulate_normalize_probabilities采用字典p将符号映射到概率频率。它输出要选择的元组的可用列表。

def accumulate_normalize_values(p):
        pi = p.items() if isinstance(p,dict) else p
        accum_pi = []
        accum = 0
        for i in pi:
                accum_pi.append((i[0],i[1]+accum))
                accum += i[1]
        if accum == 0:
                raise Exception( "You are about to explode the universe. Continue ? Y/N " )
        normed_a = []
        for a in accum_pi:
                normed_a.append((a[0],a[1]*1.0/accum))
        return normed_a

Yield:

>>> accumulate_normalize_values( { 'a': 100, 'b' : 300, 'c' : 400, 'd' : 200  } )
[('a', 0.1), ('c', 0.5), ('b', 0.8), ('d', 1.0)]

为什么运作

所述积累步骤变成每个符号到(在第一符号的情况下或0)本身和先前符号概率或频率之间的间隔。通过简单地逐步遍历列表,直到间隔0.0-> 1.0中的随机数(之前已准备好)小于或等于当前符号的间隔端点,可以使用这些间隔进行选择(从而对提供的分布进行采样)。

规范化释放我们从需求,以确保一切资金以一定的价值。归一化后,概率的“向量”总计为1.0。

下面用于从分布中选择并生成任意长样本的其余代码

def select(symbol_intervals,random):
        print symbol_intervals,random
        i = 0
        while random > symbol_intervals[i][1]:
                i += 1
                if i >= len(symbol_intervals):
                        raise Exception( "What did you DO to that poor list?" )
        return symbol_intervals[i][0]


def gen_random(alphabet,length,probabilities=None):
        from random import random
        from itertools import repeat
        if probabilities is None:
                probabilities = dict(zip(alphabet,repeat(1.0)))
        elif len(probabilities) > 0 and isinstance(probabilities[0],(int,long,float)):
                probabilities = dict(zip(alphabet,probabilities)) #ordered
        usable_probabilities = accumulate_normalize_values(probabilities)
        gen = []
        while len(gen) < length:
                gen.append(select(usable_probabilities,random()))
        return gen

用法:

>>> gen_random (['a','b','c','d'],10,[100,300,400,200])
['d', 'b', 'b', 'a', 'c', 'c', 'b', 'c', 'c', 'c']   #<--- some of the time

None of these answers is particularly clear or simple.

Here is a clear, simple method that is guaranteed to work.

accumulate_normalize_values takes a dictionary p that maps symbols to probabilities OR frequencies. It outputs a usable list of tuples from which to do the selection.

def accumulate_normalize_values(p):
        pi = p.items() if isinstance(p,dict) else p
        accum_pi = []
        accum = 0
        for i in pi:
                accum_pi.append((i[0],i[1]+accum))
                accum += i[1]
        if accum == 0:
                raise Exception( "You are about to explode the universe. Continue ? Y/N " )
        normed_a = []
        for a in accum_pi:
                normed_a.append((a[0],a[1]*1.0/accum))
        return normed_a

Yields:

>>> accumulate_normalize_values( { 'a': 100, 'b' : 300, 'c' : 400, 'd' : 200  } )
[('a', 0.1), ('c', 0.5), ('b', 0.8), ('d', 1.0)]

Why it works

The accumulation step turns each symbol into an interval between the previous symbol’s cumulative probability or frequency (or 0 in the case of the first symbol) and its own. These intervals can be used to select from (and thus sample the provided distribution) by simply stepping through the list until the random number in the interval 0.0 -> 1.0 (prepared earlier) is less than or equal to the current symbol’s interval end-point.

The normalization releases us from the need to make sure everything sums to some value. After normalization the “vector” of probabilities sums to 1.0.

The rest of the code, for selecting and for generating an arbitrarily long sample from the distribution, is below:

def select(symbol_intervals,random):
        print(symbol_intervals, random)
        i = 0
        while random > symbol_intervals[i][1]:
                i += 1
                if i >= len(symbol_intervals):
                        raise Exception( "What did you DO to that poor list?" )
        return symbol_intervals[i][0]


def gen_random(alphabet,length,probabilities=None):
        from random import random
        from itertools import repeat
        if probabilities is None:
                probabilities = dict(zip(alphabet,repeat(1.0)))
        elif len(probabilities) > 0 and isinstance(probabilities[0],(int,float)):
                probabilities = dict(zip(alphabet,probabilities)) #ordered
        usable_probabilities = accumulate_normalize_values(probabilities)
        gen = []
        while len(gen) < length:
                gen.append(select(usable_probabilities,random()))
        return gen

Usage :

>>> gen_random (['a','b','c','d'],10,[100,300,400,200])
['d', 'b', 'b', 'a', 'c', 'c', 'b', 'c', 'c', 'c']   #<--- some of the time

回答 12

这是一种更有效的方法

只需使用您的“权重”数组(假定索引为对应项)和否调用以下函数。需要的样本数。可以轻松修改此功能以处理有序对。

使用各自的概率返回采样/挑选(替换)的索引(或项目):

def resample(weights, n):
    beta = 0

    # Caveat: Assign max weight to max*2 for best results
    max_w = max(weights)*2

    # Pick an item uniformly at random, to start with
    current_item = random.randint(0,n-1)
    result = []

    for i in range(n):
        beta += random.uniform(0,max_w)

        while weights[current_item] < beta:
            beta -= weights[current_item]
            current_item = (current_item + 1) % n   # cyclic
        else:
            result.append(current_item)
    return result

关于while循环中使用的概念的简短说明。我们从累积beta减少当前项目的权重,该累积值是随机统一构造的累积值,并增加当前索引以找到其权重与beta值匹配的项目。

Here is a more effective way of doing this:

Just call the following function with your ‘weights’ array (the indices are taken to be the corresponding items) and the number of samples needed. This function can easily be modified to handle ordered pairs.

Returns indexes (or items) sampled/picked (with replacement) using their respective probabilities:

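import random

def resample(weights, n):
    """Return n indexes into weights, sampled with replacement."""
    beta = 0
    size = len(weights)  # number of items, which may differ from n

    # Caveat: Assign max weight to max*2 for best results
    max_w = max(weights)*2

    # Pick an item uniformly at random, to start with
    current_item = random.randint(0, size-1)
    result = []

    for i in range(n):
        beta += random.uniform(0,max_w)

        # step around the cycle until beta falls within the current item's weight
        while weights[current_item] < beta:
            beta -= weights[current_item]
            current_item = (current_item + 1) % size   # cyclic
        result.append(current_item)
    return result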

A short note on the concept used in the while loop: beta is a running value built up from uniform random increments. We subtract the current item’s weight from beta and advance the current index until we reach the item whose weight covers the remaining value of beta.


在__init__内调用类函数

问题:在__init__内调用类函数

我正在编写一些使用文件名,打开文件并解析出一些数据的代码。我想在课堂上做到这一点。以下代码有效:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None

        def parse_file():
            #do some parsing
            self.stat1 = result_from_parse1
            self.stat2 = result_from_parse2
            self.stat3 = result_from_parse3
            self.stat4 = result_from_parse4
            self.stat5 = result_from_parse5

        parse_file()

但是,这涉及到我将所有解析机制置于__init__类的功能范围之内。现在,对于此简化的代码来说,这看起来还不错,但是该函数parse_file还具有许多缩进级别。我更喜欢将函数定义parse_file()为类函数,如下所示:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None
        parse_file()

    def parse_file():
        #do some parsing
        self.stat1 = result_from_parse1
        self.stat2 = result_from_parse2
        self.stat3 = result_from_parse3
        self.stat4 = result_from_parse4
        self.stat5 = result_from_parse5

当然,此代码不起作用,因为该函数parse_file()不在函数范围内__init__。有没有办法从该类内部调用类函数__init__?还是我想这是错误的方式?

I’m writing some code that takes a filename, opens the file, and parses out some data. I’d like to do this in a class. The following code works:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None

        def parse_file():
            #do some parsing
            self.stat1 = result_from_parse1
            self.stat2 = result_from_parse2
            self.stat3 = result_from_parse3
            self.stat4 = result_from_parse4
            self.stat5 = result_from_parse5

        parse_file()

But it involves me putting all of the parsing machinery in the scope of the __init__ function for my class. That looks fine now for this simplified code, but the function parse_file has quite a few levels of indentation as well. I’d prefer to define the function parse_file() as a class function like below:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None
        parse_file()

    def parse_file():
        #do some parsing
        self.stat1 = result_from_parse1
        self.stat2 = result_from_parse2
        self.stat3 = result_from_parse3
        self.stat4 = result_from_parse4
        self.stat5 = result_from_parse5

Of course this code doesn’t work because the function parse_file() is not within the scope of the __init__ function. Is there a way to call a class function from within __init__ of that class? Or am I thinking about this the wrong way?


回答 0

以这种方式调用该函数:

self.parse_file()

您还需要像这样定义parse_file()函数:

def parse_file(self):

parse_file方法必须在调用时绑定到对象(因为它不是静态方法)。这是通过在对象的实例上调用函数来完成的(在您的情况下,实例是)self

Call the function in this way:

self.parse_file()

You also need to define your parse_file() function like this:

def parse_file(self):

The parse_file method has to be bound to an object upon calling it (because it’s not a static method). This is done by calling the function on an instance of the object, in your case the instance is self.
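
To make the binding concrete, here is a small self-contained sketch. The parsing is replaced by a placeholder that only records the length of the file name, which is an assumption made for illustration and not the asker's real parsing logic.

```python
class MyClass:
    def __init__(self, filename):
        self.filename = filename
        self.stat1 = None
        # Calling through self binds the method to this instance.
        self.parse_file()

    def parse_file(self):
        # Placeholder for the real parsing: just record the name length.
        self.stat1 = len(self.filename)

obj = MyClass("data.txt")
print(obj.stat1)             # 8

# The bound call above is equivalent to passing the instance explicitly:
MyClass.parse_file(obj)
```

Calling plain parse_file() inside __init__ would instead raise a NameError, because method names are not looked up in the class scope from inside another method.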


回答 1

如果我没记错的话,这两个函数都是您的类的一部分,则应像这样使用它:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None
        self.parse_file()

    def parse_file(self):
        #do some parsing
        self.stat1 = result_from_parse1
        self.stat2 = result_from_parse2
        self.stat3 = result_from_parse3
        self.stat4 = result_from_parse4
        self.stat5 = result_from_parse5

替换行:

parse_file() 

与:

self.parse_file()

If I’m not wrong, both functions are part of your class, so you should use it like this:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None
        self.parse_file()

    def parse_file(self):
        #do some parsing
        self.stat1 = result_from_parse1
        self.stat2 = result_from_parse2
        self.stat3 = result_from_parse3
        self.stat4 = result_from_parse4
        self.stat5 = result_from_parse5

Replace your line:

parse_file() 

with:

self.parse_file()

回答 2

怎么样:

class MyClass(object):
    def __init__(self, filename):
        self.filename = filename 
        self.stats = parse_file(filename)

def parse_file(filename):
    #do some parsing
    return results_from_parse

顺便说一句,如果有一个名为变量stat1stat2等等,情况正在乞求一个元组: stats = (...)

因此,让我们parse_file返回一个元组,并将其存储在中 self.stats

然后,例如,您可以访问曾经使用调用stat3的内容self.stats[2]

How about:

class MyClass(object):
    def __init__(self, filename):
        self.filename = filename 
        self.stats = parse_file(filename)

def parse_file(filename):
    #do some parsing
    return results_from_parse

By the way, if you have variables named stat1, stat2, etc., the situation is begging for a tuple: stats = (...).

So let parse_file return a tuple, and store the tuple in self.stats.

Then, for example, you can access what used to be called stat3 with self.stats[2].
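
A runnable sketch of this layout follows. The file format (one number per line) and the Stats field names are assumptions invented for the example; a namedtuple is used so the results can be read either by index, as suggested above, or by name.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical result type; the field names are invented for this sketch.
Stats = namedtuple("Stats", ["count", "minimum", "maximum"])

def parse_file(filename):
    """Assumes one number per line; returns the results as a single tuple."""
    with open(filename) as f:
        values = [float(line) for line in f if line.strip()]
    return Stats(len(values), min(values), max(values))

class MyClass(object):
    def __init__(self, filename):
        self.filename = filename
        self.stats = parse_file(filename)

# Usage with a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("3\n1\n2\n")
obj = MyClass(tmp.name)
os.remove(tmp.name)
print(obj.stats[2], obj.stats.maximum)   # 3.0 3.0
```

Keeping parse_file as a module-level function also makes it easy to test without constructing the class.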


回答 3

在中parse_file,接受self参数(与中一样__init__)。如果您需要任何其他上下文,则只需照常将其作为附加参数传递。

In parse_file, take the self argument (just like in __init__). If there’s any other context you need then just pass it as additional arguments as usual.


回答 4

您必须像这样声明parse_file; def parse_file(self)。在大多数语言中,“ self”参数是一个隐藏参数,但在python中则不是。您必须将其添加到属于一个类的所有方法的定义中。然后您可以使用以下方法从类中的任何方法调用该函数self.parse_file

您的最终程序将如下所示:

class MyClass():
  def __init__(self, filename):
      self.filename = filename 

      self.stat1 = None
      self.stat2 = None
      self.stat3 = None
      self.stat4 = None
      self.stat5 = None
      self.parse_file()

  def parse_file(self):
      #do some parsing
      self.stat1 = result_from_parse1
      self.stat2 = result_from_parse2
      self.stat3 = result_from_parse3
      self.stat4 = result_from_parse4
      self.stat5 = result_from_parse5

You must declare parse_file like this: def parse_file(self). The “self” parameter is a hidden parameter in most languages, but not in Python. You must add it to the definition of all the methods that belong to a class. Then you can call the function from any method inside the class using self.parse_file().

Your final program is going to look like this:

class MyClass():
  def __init__(self, filename):
      self.filename = filename 

      self.stat1 = None
      self.stat2 = None
      self.stat3 = None
      self.stat4 = None
      self.stat5 = None
      self.parse_file()

  def parse_file(self):
      #do some parsing
      self.stat1 = result_from_parse1
      self.stat2 = result_from_parse2
      self.stat3 = result_from_parse3
      self.stat4 = result_from_parse4
      self.stat5 = result_from_parse5

回答 5

我认为您的问题实际上是没有正确缩进init函数,应该是这样的

class MyClass():
     def __init__(self, filename):
          pass

     def parse_file():
          pass

I think that your problem is actually that the init function is not indented correctly. It should be like this:

class MyClass():
     def __init__(self, filename):
          pass

     def parse_file():
          pass