Tag Archives: coding-style

How do you break a long chain of method calls in Python?

Question: How do you break a long chain of method calls in Python?


I have a line of the following code (don’t blame for naming conventions, they are not mine):

subkeyword = Session.query(
    Subkeyword.subkeyword_id, Subkeyword.subkeyword_word
).filter_by(
    subkeyword_company_id=self.e_company_id
).filter_by(
    subkeyword_word=subkeyword_word
).filter_by(
    subkeyword_active=True
).one()

I don’t like how it looks (it’s not very readable), but I don’t have any better idea for limiting lines to 79 characters in this situation. Is there a better way of breaking it up (preferably without backslashes)?


Answer 0


You could use additional parentheses:

subkeyword = (
        Session.query(Subkeyword.subkeyword_id, Subkeyword.subkeyword_word)
        .filter_by(subkeyword_company_id=self.e_company_id)
        .filter_by(subkeyword_word=subkeyword_word)
        .filter_by(subkeyword_active=True)
        .one()
    )

Answer 1


This is a case where a line continuation character is preferred to open parentheses. The need for this style becomes more obvious as method names get longer and as methods start taking arguments:

subkeyword = Session.query(Subkeyword.subkeyword_id, Subkeyword.subkeyword_word) \
                    .filter_by(subkeyword_company_id=self.e_company_id)          \
                    .filter_by(subkeyword_word=subkeyword_word)                  \
                    .filter_by(subkeyword_active=True)                           \
                    .one()

PEP 8 is intended to be interpreted with a measure of common sense and an eye for both the practical and the beautiful. Happily violate any PEP 8 guideline that results in ugly or hard-to-read code.

That being said, if you frequently find yourself at odds with PEP 8, it may be a sign that there are readability issues that transcend your choice of whitespace :-)


Answer 2


My personal choice would be:

subkeyword = Session.query(
    Subkeyword.subkeyword_id,
    Subkeyword.subkeyword_word,
).filter_by(
    subkeyword_company_id=self.e_company_id,
    subkeyword_word=subkeyword_word,
    subkeyword_active=True,
).one()

Answer 3


Just store the intermediate result/object and invoke the next method on it, e.g.

q = Session.query(Subkeyword.subkeyword_id, Subkeyword.subkeyword_word)
q = q.filter_by(subkeyword_company_id=self.e_company_id)
q = q.filter_by(subkeyword_word=subkeyword_word)
q = q.filter_by(subkeyword_active=True)
subkeyword = q.one()

Answer 4


According to the Python Language Reference,
you can use a backslash.
Or simply break the line inside an open bracket: as long as a bracket is left unclosed, Python will not treat what follows as a new logical line, and in that case the indentation of the continuation lines doesn’t matter.
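
For illustration, here is a minimal sketch of both styles (the object and method names are made up):

value = obj.first_method() \
           .second_method() \
           .third_method()

value = (
    obj.first_method()
    .second_method()
    .third_method()
)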


Answer 5


It’s a bit of a different solution than those provided by others, but a favorite of mine since it sometimes leads to nifty metaprogramming.

base = [Subkeyword.subkeyword_id, Subkeyword.subkeyword_word]
search = {
    'subkeyword_company_id':self.e_company_id,
    'subkeyword_word':subkeyword_word,
    'subkeyword_active':True,
    }
subkeyword = Session.query(*base).filter_by(**search).one()

This is a nice technique for building searches. Go through a list of conditionals to mine from your complex query form (or string-based deductions about what the user is looking for), then just explode the dictionary into the filter.
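
As a sketch of that idea (the form_data dict and its keys here are hypothetical):

form_data = {'word': 'spam', 'active_only': True}   # e.g. mined from a query form

search = {'subkeyword_company_id': self.e_company_id}
if form_data.get('word'):
    search['subkeyword_word'] = form_data['word']
if form_data.get('active_only'):
    search['subkeyword_active'] = True

subkeyword = Session.query(*base).filter_by(**search).one()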


Answer 6


You seem to be using SQLAlchemy; if so, the sqlalchemy.orm.query.Query.filter_by() method takes multiple keyword arguments, so you could write:

subkeyword = Session.query(Subkeyword.subkeyword_id,
                           Subkeyword.subkeyword_word) \
                    .filter_by(subkeyword_company_id=self.e_company_id,
                               subkeyword_word=subkeyword_word,
                               subkeyword_active=True) \
                    .one()

But it would be better:

subkeyword = Session.query(Subkeyword.subkeyword_id,
                           Subkeyword.subkeyword_word)
subkeyword = subkeyword.filter_by(subkeyword_company_id=self.e_company_id,
                                  subkeyword_word=subkeyword_word,
                                  subkeyword_active=True)
subkeyword = subkeyword.one()

Answer 7


I like to indent the arguments by two blocks, and the statement by one block, like these:

for image_pathname in image_directory.iterdir():
    image = cv2.imread(str(image_pathname))
    input_image = np.resize(
            image, (height, width, 3)
        ).transpose((2,0,1)).reshape(1, 3, height, width)
    net.forward_all(data=input_image)
    segmentation_index = net.blobs[
            'argmax'
        ].data.squeeze().transpose(1,2,0).astype(np.uint8)
    segmentation = np.empty(segmentation_index.shape, dtype=np.uint8)
    cv2.LUT(segmentation_index, label_colours, segmentation)
    prediction_pathname = prediction_directory / image_pathname.name
    cv2.imwrite(str(prediction_pathname), segmentation)

In Python, when should you use a function instead of a method?

Question: In Python, when should you use a function instead of a method?


The Zen of Python states that there should only be one way to do things, yet frequently I run into the problem of deciding when to use a function versus when to use a method.

Let’s take a trivial example- a ChessBoard object. Let’s say we need some way to get all the legal King moves available on the board. Do we write ChessBoard.get_king_moves() or get_king_moves(chess_board)?

Here are some related questions I looked at:

The answers I got were largely inconclusive:

Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?

The major reason is history. Functions were used for those operations that were generic for a group of types and which were intended to work even for objects that didn’t have methods at all (e.g. tuples). It is also convenient to have a function that can readily be applied to an amorphous collection of objects when you use the functional features of Python (map(), apply() et al).

In fact, implementing len(), max(), min() as a built-in function is actually less code than implementing them as methods for each type. One can quibble about individual cases but it’s a part of Python, and it’s too late to make such fundamental changes now. The functions have to remain to avoid massive code breakage.

While interesting, the above doesn’t really say much as to what strategy to adopt.

This is one of the reasons – with custom methods, developers would be free to choose a different method name, like getLength(), length(), getlength() or whatsoever. Python enforces strict naming so that the common function len() can be used.

Slightly more interesting. My take is that functions are in a sense, the Pythonic version of interfaces.

Lastly, from Guido himself:

Talking about the Abilities/Interfaces made me think about some of our “rogue” special method names. In the Language Reference, it says, “A class can implement certain operations that are invoked by special syntax (such as arithmetic operations or subscripting and slicing) by defining methods with special names.” But there are all these methods with special names like __len__ or __unicode__ which seem to be provided for the benefit of built-in functions, rather than for support of syntax. Presumably in an interface-based Python, these methods would turn into regularly-named methods on an ABC, so that __len__ would become

class container:
  ...
  def len(self):
    raise NotImplemented

Though, thinking about it some more, I don’t see why all syntactic operations wouldn’t just invoke the appropriate normally-named method on a specific ABC. “<“, for instance, would presumably invoke “object.lessthan” (or perhaps “comparable.lessthan“). So another benefit would be the ability to wean Python away from this mangled-name oddness, which seems to me an HCI improvement.

Hm. I’m not sure I agree (figure that :-).

There are two bits of “Python rationale” that I’d like to explain first.

First of all, I chose len(x) over x.len() for HCI reasons (def __len__() came much later). There are two intertwined reasons actually, both HCI:

(a) For some operations, prefix notation just reads better than postfix — prefix (and infix!) operations have a long tradition in mathematics which likes notations where the visuals help the mathematician thinking about a problem. Compare the ease with which we rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of doing the same thing using a raw OO notation.

(b) When I read code that says len(x) I know that it is asking for the length of something. This tells me two things: the result is an integer, and the argument is some kind of container. To the contrary, when I read x.len(), I have to already know that x is some kind of container implementing an interface or inheriting from a class that has a standard len(). Witness the confusion we occasionally have when a class that is not implementing a mapping has a get() or keys() method, or something that isn’t a file has a write() method.

Saying the same thing in another way, I see ‘len’ as a built-in operation. I’d hate to lose that. I can’t say for sure whether you meant that or not, but ‘def len(self): …’ certainly sounds like you want to demote it to an ordinary method. I’m strongly -1 on that.

The second bit of Python rationale I promised to explain is the reason why I chose special methods to look __special__ and not merely special. I was anticipating lots of operations that classes might want to override, some standard (e.g. __add__ or __getitem__), some not so standard (e.g. pickle’s __reduce__ for a long time had no support in C code at all). I didn’t want these special operations to use ordinary method names, because then pre-existing classes, or classes written by users without an encyclopedic memory for all the special methods, would be liable to accidentally define operations they didn’t mean to implement, with possibly disastrous consequences. Ivan Krstić explained this more concise in his message, which arrived after I’d written all this up.

–Guido van Rossum (home page: http://www.python.org/~guido/)

My understanding of this is that in certain cases, prefix notation just makes more sense (ie, Duck.quack makes more sense than quack(Duck) from a linguistic standpoint.) and again, the functions allow for “interfaces”.

In such a case, my guess would be to implement get_king_moves based solely on Guido’s first point. But that still leaves a lot of open questions regarding, say, implementing a stack and queue class with similar push and pop methods: should they be functions or methods? (Here I would guess functions, because I really want to signal a push-pop interface.)

TLDR: Can someone explain what the strategy for deciding when to use functions vs. methods should be?


Answer 0


My general rule is this – is the operation performed on the object or by the object?

If it is done by the object, it should be a member operation. If it could apply to other things too, or is done by something else to the object, then it should be a function (or perhaps a member of something else).

When introducing programming, it is traditional (albeit not quite accurate from an implementation standpoint) to describe objects in terms of real-world objects such as cars. You mention a duck, so let’s go with that.

class duck:
    def __init__(self): pass
    def eat(self, o): pass
    def crap(self): pass
    def die(self): pass
    # ...

In the context of the “objects are real things” analogy, it is “correct” to add a class method for anything which the object can do. So say I want to kill off a duck, do I add a .kill() to the duck? No… as far as I know animals do not commit suicide. Therefore if I want to kill a duck I should do this:

def kill(o):
    if isinstance(o, duck):
        o.die()
    elif isinstance(o, dog):
        print "WHY????"
        o.die()
    elif isinstance(o, nyancat):
        raise Exception("NYAN "*9001)
    else:
       print "can't kill it."

Moving away from this analogy, why do we use methods and classes? Because we want to contain data and hopefully structure our code in a manner such that it will be reusable and extensible in the future. This brings us to the notion of encapsulation which is so dear to OO design.

The encapsulation principle is really what this comes down to: as a designer you should hide everything about the implementation and class internals that it is not absolutely necessary for any user or other developer to access. Because we deal with instances of classes, this reduces to “what operations are crucial on this instance“. If an operation is not instance specific, then it should not be a member function.

TL;DR: what @Bryan said. If it operates on an instance and needs to access data which is internal to the class instance, it should be a member function.


Answer 1


Use a class when you want to:

1) Isolate calling code from implementation details — taking advantage of abstraction and encapsulation.

2) When you want to be substitutable for other objects — taking advantage of polymorphism.

3) When you want to reuse code for similar objects — taking advantage of inheritance.

Use a function for calls that make sense across many different object types — for example, the builtin len and repr functions apply to many kinds of objects.
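
As a small illustration (a made-up class), implementing __len__ is what lets the generic built-in len() work on your own objects as well:

class Deck:
    def __init__(self, cards):
        self.cards = list(cards)

    def __len__(self):          # hook used by the built-in len()
        return len(self.cards)

deck = Deck(['AS', 'KD', '2C'])
print(len(deck))                # 3, the same function that works on lists, strings, ...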

That being said, the choice sometimes comes down to a matter of taste. Think in terms of what is most convenient and readable for typical calls. For example, which would be better (x.sin()**2 + y.cos()**2).sqrt() or sqrt(sin(x)**2 + cos(y)**2)?


Answer 2


Here’s a simple rule of thumb: if the code acts upon a single instance of an object, use a method. Even better: use a method unless there is a compelling reason to write it as a function.

In your specific example, you want it to look like this:

chessboard = Chessboard()
...
chessboard.get_king_moves()

Don’t over think it. Always use methods until the point comes where you say to yourself “it doesn’t make sense to make this a method”, in which case you can make a function.


Answer 3


I usually think of an object like a person.

Attributes are the person’s name, height, shoe size, etc.

Methods and functions are operations that the person can perform.

If the operation could be done by just any ol’ person, without requiring anything unique to this one specific person (and without changing anything on this one specific person), then it’s a function and should be written as such.

If an operation is acting upon the person (e.g. eating, walking, …) or requires something unique to this person to get involved (like dancing, writing a book, …), then it should be a method.
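
A minimal sketch of that rule (the class and function here are made up for illustration):

class Person:
    def __init__(self, name, shoe_size):
        self.name = name                    # attributes of this specific person
        self.shoe_size = shoe_size

    def eat(self, food):                    # acts upon this person -> method
        print("%s eats %s" % (self.name, food))

def average_shoe_size(people):              # works on just any group of people -> function
    return sum(p.shoe_size for p in people) / float(len(people))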

Of course, it is not always trivial to translate this into the specific object you’re working with, but I find it is a good way to think of it.


Answer 4


Generally I use classes to implement a logical set of capabilities for some thing, so that in the rest of my program I can reason about the thing, not having to worry about all the little concerns that make up its implementation.

Anything that’s part of that core abstraction of “what you can do with a thing” should usually be a method. This generally includes everything that can alter a thing, as the internal data state is usually considered private and not part of the logical idea of “what you can do with a thing“.

When you come to higher level operations, especially if they involve multiple things, I find they are usually most naturally expressed as functions, if they can be built out of the public abstraction of a thing without needing special access to the internals (unless they’re methods of some other object). This has the big advantage that when I decide to completely rewrite the internals of how my thing works (without changing the interface), I just have a small core set of methods to rewrite, and then all the external functions written in terms of those methods will Just Work. I find that insisting that all operations to do with class X are methods on class X leads to over-complicated classes.

It depends on the code I’m writing though. For some programs I model them as a collection of objects whose interactions give rise to the behavior of the program; here most important functionality is closely coupled to a single object, and so is implemented in methods, with a scattering of utility functions. For other programs the most important stuff is a set of functions that manipulate data, and classes are in use only to implement the natural “duck types” that are manipulated by the functions.


Answer 5


You may say that, “in the face of ambiguity, refuse the temptation to guess”.

However, it’s not even a guess. You’re absolutely sure that the outcomes of both approaches are the same in that they solve your problem.

I believe it is only a good thing to have multiple ways to accomplishing goals. I’d humbly tell you, as other users did already, to employ whichever “tastes better” / feels more intuitive, in terms of language.


Python coding standards/best practices

Question: Python coding standards/best practices


In python do you generally use PEP 8 — Style Guide for Python Code as your coding standards/guidelines? Are there any other formalized standards that you prefer?


Answer 0


“In python do you generally use PEP 8 — Style Guide for Python Code as your coding standards/guidelines? Are there any other formalized standards that you prefer?”

As you mentioned, follow PEP 8 for the main text, and PEP 257 for docstring conventions.

Along with the Python style guides, I suggest that you refer to the following:

  1. Code Like a Pythonista: Idiomatic Python
  2. Common mistakes and Warts
  3. How not to write Python code
  4. Python gotcha

Answer 1


I follow the Python Idioms and Efficiency guidelines, by Rob Knight. I think they are exactly the same as PEP 8, but are more synthetic and based on examples.

If you are using wxPython you might also want to check Style Guide for wxPython code, by Chris Barker, as well.


Answer 2


I stick to PEP-8 very closely.

There are three specific things that I can’t be bothered to change to PEP-8.

  • Avoid extraneous whitespace immediately inside parentheses, brackets or braces.

    Suggested: spam(ham[1], {eggs: 2})

    I do this anyway: spam( ham[ 1 ], { eggs: 2 } )

    Why? 30+ years of ingrained habit of snuggling ()’s up against function names or (in C) statement keywords, starting with Fortran IV in the 70’s.

  • Use spaces around arithmetic operators:

    Suggested: x = x * 2 - 1

    I do this anyway: x= x * 2 - 1

    Why? Gries’ The Science of Programming suggested this as a way to emphasize the connection between assignment and the variable whose state is being changed.

    It doesn’t work well for multiple assignment or augmented assignment; for those I use lots of spaces.

  • For function names, method names and instance variable names

    Suggested: lowercase, with words separated by underscores as necessary to improve readability.

    I do this anyway: camelCase

    Why? 20+ years of ingrained habit of camelCase, starting with Pascal in the 80’s.


Answer 3


PEP 8 is good; the only thing that I wish it came down harder on is the tabs-vs-spaces holy war.

Basically if you are starting a project in python, you need to choose Tabs or Spaces and then shoot all offenders on sight.


Answer 4


To add to bhadra’s list of idiomatic guides:

Check out Anthony Baxter’s presentation on Effective Python Programming (from OSCON 2005).

An excerpt:

# dict's setdefault method turns this:
if key in dictobj:
    dictobj[key].append(val)
else:
    dictobj[key] = [val]
# into this:
dictobj.setdefault(key,[]).append(val)
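
A related idiom (not from the presentation, just a common alternative) achieves the same with collections.defaultdict:

from collections import defaultdict

dictobj = defaultdict(list)
dictobj[key].append(val)   # missing keys automatically start out as empty lists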

Answer 5


I follow it extremely rigorously. The only god before PEP-8 is existing code bases.


Answer 6


Yes, I try to follow it as closely as possible.

I don’t follow any other coding standards.


Answer 7


I follow PEP 8; it is a great coding style.


Tool to convert Python code to be PEP8 compliant

Question: Tool to convert Python code to be PEP8 compliant


I know there are tools which validate whether your Python code is compliant with PEP8, for example there is both an online service and a python module.

However, I cannot find a service or module which can convert my Python file to a self-contained, PEP8 valid Python file. Does anyone know if there are any?
I assume it’s feasible since PEP8 is all about the appearance of the code, right?


Answer 0


Unfortunately “pep8 storming” (the entire project) has several negative side-effects:

  • lots of merge-conflicts
  • break git blame
  • make code review difficult

As an alternative (and thanks to @y-p for the idea), I wrote a small package which autopep8s only those lines which you have been working on since the last commit/branch:

Basically leaving the project a little better than you found it:

pip install pep8radius

Suppose you’ve done your work off of master and are ready to commit:

# be somewhere in your project directory
# see the diff with pep, see the changes you've made since master
pep8radius master --diff
# make those changes
pep8radius master --diff --in-place

Or to clean the new lines you’ve commited since the last commit:

pep8radius --diff
pep8radius --diff --in-place

# the lines which changed since a specific commit `git diff 98f51f`
pep8radius 98f51f --diff

Basically pep8radius is applying autopep8 to lines in the output of git/hg diff (from the last shared commit).

This script currently works with git and hg; if you’re using something else and want this to work, please post a comment/issue/PR!


Answer 1


You can use autopep8! Whilst you make yourself a cup of coffee this tool happily removes all those pesky PEP8 violations which don’t change the meaning of the code.

Install it via pip:

pip install autopep8

Apply this to a specific file:

autopep8 py_file --in-place

or to your project (recursively), the verbose option gives you some feedback of how it’s going:

autopep8 project_dir --recursive --in-place --pep8-passes 2000 --verbose

Note: sometimes the default of 100 passes isn’t enough; I set it to 2000 as it’s reasonably high and will catch all but the most troublesome files (it stops making passes once it finds no resolvable pep8 infractions)…

At this point I suggest retesting and doing a commit!

If you want “full” PEP8 compliance: one tactic I’ve used is to run autopep8 as above, then run PEP8, which prints the remaining violations (file, line number, and what):

pep8 project_dir --ignore=E501

and manually change these individually (e.g. E712s – comparison with boolean).
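
For instance, a typical E712 fix looks like this (illustrative snippet):

# flagged by pep8 (E712): comparison to True should not use ==
if flag == True:
    do_something()

# preferred
if flag:
    do_something()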

Note: autopep8 offers an --aggressive argument (to ruthlessly “fix” these meaning-changing violations), but beware if you do use aggressive you may have to debug… (e.g. in numpy/pandas True == np.bool_(True) but not True is np.bool_(True)!)

You can check how many violations of each type (before and after):

pep8 --quiet --statistics .

Note: I consider E501s (line too long) to be a special case, as there will probably be a lot of these in your code and sometimes they are not corrected by autopep8.

As an example, I applied this technique to the pandas code base.


Answer 2


@Andy Hayden gave a good overview of autopep8. In addition to that, there is one more package called pep8ify which does the same thing.

However, both packages can only remove lint errors; they cannot reformat code.

little = more[3:   5]

The code above remains the same even after pep8ifying, but it still doesn’t look good. You can use formatters like yapf, which will format the code even if it is already PEP8 compliant. The code above will be formatted to

little = more[3:5]

Sometimes this even destroys your manual formatting. For example

BAZ = {
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
}

will be converted to

BAZ = {[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]}

But you can tell it to ignore some parts.

BAZ = {
   [1, 2, 3, 4],
   [5, 6, 7, 8],
   [9, 10, 11, 12]
}  # yapf: disable

Taken from my old blog post: Automatically PEP8 & Format Your Python Code!


Answer 3


If you’re using Eclipse + PyDev, you can simply activate autopep8 from PyDev’s settings: Windows -> Preferences -> type ‘autopep8’ in the search filter.

Check the ‘use autopep8.py for code formatting?’ -> OK

Now eclipse’s CTRL-SHIFT-F code formatting should format your code using autopep8 :)


Answer 4


I did extensive research on the different tools for Python and code style. There are two types of tools: linters, which analyze your code, warn about badly used code style and show advice on how to fix it, and code formatters, which re-format your document using the PEP style when you save your file.

Because re-formatting has to be more accurate – if it reformats something you didn’t want changed, it becomes useless – formatters cover a smaller part of PEP 8, while linters flag much more.

They also differ in how configurable they are – for example, pylint is configurable in all of its rules (you can turn every type of warning on or off), while black is not configurable at all.

Here are some useful links and tutorials:

Documentation:

Linters (in order of popularity):

Code formatters (in order of popularity):


Answer 5


There are many.

IDEs usually have some formatting capability built in. IntelliJ Idea / PyCharm does, same goes for the Python plugin for Eclipse, and so on.

There are formatters/linters that can target multiple languages. https://coala.io is a good example of those.

Then there are the single purpose tools, of which many are mentioned in other answers.

One specific method of automatic reformatting is to parse the file into an AST (without dropping comments) and then dump it back to text (meaning none of the original formatting is preserved). An example of that would be https://github.com/python/black.


Simple way to create a matrix of random numbers

Question: Simple way to create a matrix of random numbers


I am trying to create a matrix of random numbers, but my solution is too long and looks ugly

random_matrix = [[random.random() for e in range(2)] for e in range(3)]

this looks ok, but in my implementation it is

weights_h = [[random.random() for e in range(len(inputs[0]))] for e in range(hiden_neurons)]

which is extremely unreadable and does not fit on one line.


Answer 0


Take a look at numpy.random.rand:

Docstring: rand(d0, d1, …, dn)

Random values in a given shape.

Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).


>>> import numpy as np
>>> np.random.rand(2,3)
array([[ 0.22568268,  0.0053246 ,  0.41282024],
       [ 0.68824936,  0.68086462,  0.6854153 ]])

Answer 1


You can drop the range(len()):

weights_h = [[random.random() for e in inputs[0]] for e in range(hiden_neurons)]

But really, you should probably use numpy.

In [9]: numpy.random.random((3, 3))
Out[9]:
array([[ 0.37052381,  0.03463207,  0.10669077],
       [ 0.05862909,  0.8515325 ,  0.79809676],
       [ 0.43203632,  0.54633635,  0.09076408]])

Answer 2


Use np.random.randint(), as numpy.random.random_integers() is deprecated:

random_matrix = numpy.random.randint(min_val,max_val,(<num_rows>,<num_cols>))

Answer 3


Looks like you are doing a Python implementation of the Coursera Machine Learning Neural Network exercise. Here’s what I did for randInitializeWeights(L_in, L_out)

#get a random array of floats between 0 and 1 as Pavel mentioned 
W = numpy.random.random((L_out, L_in +1))

#normalize so that it spans a range of twice epsilon
W = W * 2 * epsilon

#shift so that mean is at zero
W = W - epsilon

Answer 4


First, create a numpy array, then convert it into a matrix. See the code below:

import numpy

B = numpy.random.random((3, 4)) #its ndArray
C = numpy.matrix(B)# it is matrix
print(type(B))
print(type(C)) 
print(C)

Answer 5


x = np.int_(np.random.rand(10) * 10)

This gives random integers out of 10 (i.e. from 0 to 9). For numbers out of 20, multiply by 20 instead.


Answer 6


When you say “a matrix of random numbers”, you can use numpy as Pavel mentioned above (https://stackoverflow.com/a/15451997/6169225); in this case I’m assuming it is irrelevant to you what distribution these (pseudo) random numbers adhere to.

However, if you require a particular distribution (I imagine you are interested in the uniform distribution), numpy.random has very useful methods for you. For example, let’s say you want a 3×2 matrix with a pseudo random uniform distribution bounded by [low,high]. You can do this like so:

numpy.random.uniform(low,high,(3,2))

Note that you can replace uniform with any of the distributions supported by this library.

Further reading: https://docs.scipy.org/doc/numpy/reference/routines.random.html


Answer 7


A simple way of creating an array of random integers is:

matrix = np.random.randint(maxVal, size=(rows, columns))

The following outputs a 2 by 3 matrix of random integers from 0 to 9 (the upper bound of 10 is exclusive):

a = np.random.randint(10, size=(2,3))

Answer 8


For creating an array of random numbers, NumPy provides array creation using:

  1. Real numbers

  2. Integers

For creating an array using random real numbers, there are 2 options:

  1. random.rand (for uniform distribution of the generated random numbers )
  2. random.randn (for normal distribution of the generated random numbers )

random.rand

import numpy as np 
arr = np.random.rand(row_size, column_size) 

random.randn

import numpy as np 
arr = np.random.randn(row_size, column_size) 

For creating an array using random integers:

import numpy as np
numpy.random.randint(low, high=None, size=None, dtype='l')

where

  • low = Lowest (signed) integer to be drawn from the distribution
  • high(optional)= If provided, one above the largest (signed) integer to be drawn from the distribution
  • size(optional) = Output shape i.e. if the given shape is, e.g., (m, n, k), then m * n * k samples are drawn
  • dtype(optional) = Desired dtype of the result.

e.g.:

The given example will produce an array of random integers between 0 and 4; its size will be 5*5 and it will contain 25 integers:

arr2 = np.random.randint(0,5,size = (5,5))

In order to create a 5 by 5 matrix, the size must be written as the tuple (5,5): note the comma between the dimensions, not a multiplication symbol *.

A sample output:

[[2 1 1 0 1]
 [3 2 1 4 3]
 [2 3 0 3 3]
 [1 3 1 0 0]
 [4 1 2 0 1]]

e.g. 2:

The given example will produce an array of random integers between 0 and 1; its size will be 1*10 and it will contain 10 integers:

arr3= np.random.randint(2, size = 10)

[0 0 0 0 1 1 0 0 1 1]


Answer 9

import random

random_matrix = [[random.random() for j in range(columns)] for i in range(rows)]
for i in range(rows):
    print(random_matrix[i])

Answer 10


An answer using map:

map(lambda x: map(lambda y: random.random(), range(len(inputs[0]))), range(hiden_neurons))

Answer 11

#this is a function for a square matrix, so in the while loop rows does not have to be less than cols.
#you can make your own condition, but if you want a square matrix, use this code.

import random
import numpy as np

def random_matrix(R, cols):
    matrix = []
    rows = 0
    while rows < cols:
        N = random.sample(R, cols)
        matrix.append(N)
        rows = rows + 1
    return np.array(matrix)

print(random_matrix(range(10), 5))
#make sure you understand the function random.sample

Answer 12


numpy.random.rand(row, column) generates random numbers between 0 and 1, according to the specified (m,n) parameters given. So use it to create an (m,n) matrix, multiply the matrix by the size of the desired range (high - low), and add the low limit.

Analysis: if zero is generated, just the low limit will be held, and if one were generated, just the high limit would be held. In other words, by scaling the output of rand this way you can generate numbers spanning the desired extremes.

import numpy as np

high = 10
low = 5
m,n = 2,2

a = (high - low)*np.random.rand(m,n) + low

Output:

a = array([[5.91580065, 8.1117106 ],
          [6.30986984, 5.720437  ]])

What kind of patterns can I enforce on my code to make it easier to translate to another programming language? [closed]

Question: What kind of patterns can I enforce on my code to make it easier to translate to another programming language? [closed]


I am setting out to do a side project that has the goal of translating code from one programming language to another. The languages I am starting with are PHP and Python (Python to PHP should be easier to start with), but ideally I would be able to add other languages with (relative) ease. The plan is:

  • This is geared towards web development. The original and target code will be sitting on top of frameworks (which I will also have to write). These frameworks will embrace an MVC design pattern and follow strict coding conventions. This should make translation somewhat easier.

  • I am also looking at IOC and dependency injection, as they might make the translation process easier and less error prone.

  • I’ll make use of Python’s parser module, which lets me fiddle with the Abstract Syntax Tree. Apparently the closest I can get with PHP is token_get_all(), which is a start.

  • From then on I can build the AST, symbol tables and control flow.

Then I believe I can start outputting code. I don’t need a perfect translation. I’ll still have to review the generated code and fix problems. Ideally the translator should flag problematic translations.

Before you ask “What the hell is the point of this?” The answer is… It’ll be an interesting learning experience. If you have any insights on how to make this less daunting, please let me know.


EDIT:

I am more interested in knowing what kinds of patterns I could enforce on the code to make it easier to translate (i.e.: IoC, SOA?) than in how to do the translation.


Answer 0


I’ve been building tools (DMS Software Reengineering Toolkit) to do general purpose program manipulation (with language translation being a special case) since 1995, supported by a strong team of computer scientists. DMS provides generic parsing, AST building, symbol tables, control and data flow analysis, application of translation rules, regeneration of source text with comments, etc., all parameterized by explicit definitions of computer languages.

The amount of machinery you need to do this well is vast (especially if you want to be able to do this for multiple languages in a general way), and then you need reliable parsers for languages with unreliable definitions (PHP is a perfect example of this).

There’s nothing wrong with you thinking about building a language-to-language translator or attempting it, but I think you’ll find this a much bigger task for real languages than you expect. We have some 100 man-years invested in just DMS, and another 6-12 months in each “reliable” language definition (including the one we painfully built for PHP), much more for nasty languages such as C++. It will be a “hell of a learning experience”; it has been for us. (You might find the technical Papers section at the above website interesting to jump start that learning).

People often attempt to build some kind of generalized machinery by starting with some piece of technology with which they are familiar, that does a part of the job. (Python ASTs are great example). The good news, is that part of the job is done. The bad news is that machinery has a zillion assumptions built into it, most of which you won’t discover until you try to wrestle it into doing something else. At that point you find out the machinery is wired to do what it originally does, and will really, really resist your attempt to make it do something else. (I suspect trying to get the Python AST to model PHP is going to be a lot of fun).

The reason I started to build DMS originally was to build foundations that had very few such assumptions built in. It has some that give us headaches. So far, no black holes. (The hardest part of my job over the last 15 years is to try to prevent such assumptions from creeping in).

Lots of folks also make the mistake of assuming that if they can parse (and perhaps get an AST), they are well on the way to doing something complicated. One of the hard lessons is that you need symbol tables and flow analysis to do good program analysis or transformation. ASTs are necessary but not sufficient. This is the reason that Aho&Ullman’s compiler book doesn’t stop at chapter 2. (The OP has this right in that he is planning to build additional machinery beyond the AST). For more on this topic, see Life After Parsing.

The remark about “I don’t need a perfect translation” is troublesome. What weak translators do is convert the “easy” 80% of the code, leaving the hard 20% to do by hand. If the application you intend to convert is pretty small, and you only intend to convert it once, then that 20% is OK. If you want to convert many applications (or even the same one with minor changes over time), this is not nice. If you attempt to convert 100K SLOC then 20% is 20,000 original lines of code that are hard to translate, understand and modify in the context of another 80,000 lines of translated program you already don’t understand. That takes a huge amount of effort. At the million line level, this is simply impossible in practice. (Amazingly there are people that distrust automated tools and insist on translating million line systems by hand; that’s even harder and they normally find out painfully with long time delays, high costs and often outright failure.)

What you have to shoot for to translate large-scale systems is high nineties percentage conversion rates, or it is likely that you can’t complete the manual part of the translation activity.

Another key consideration is size of code to be translated. It takes a lot of energy to build a working, robust translator, even with good tools. While it seems sexy and cool to build a translator instead of simply doing a manual conversion, for small code bases (e.g., up to about 100K SLOC in our experience) the economics simply don’t justify it. Nobody likes this answer, but if you really have to translate just 10K SLOC of code, you are probably better off just biting the bullet and doing it. And yes, that’s painful.

I consider our tools to be extremely good (but then, I’m pretty biased). And it is still very hard to build a good translator; it takes us about 1.5-2 man-years and we know how to use our tools. The difference is that with this much machinery, we succeed considerably more often than we fail.


Answer 1


My answer will address the specific task of parsing Python in order to translate it to another language, and not the higher-level aspects which Ira addressed well in his answer.

In short: do not use the parser module, there’s an easier way.

The ast module, available since Python 2.6, is much more suitable for your needs, since it gives you a ready-made AST to work with. I wrote an article on this last year, but in short: use the parse method of ast to parse Python source code into an AST. The parser module will give you a parse tree, not an AST. Be wary of the difference.
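
For instance, a minimal sketch of what that front end might start from (the sample source string is made up):

import ast

tree = ast.parse("x = [i * i for i in range(10)]")
print(ast.dump(tree))             # inspect the structure of the AST
for node in ast.walk(tree):       # visit every node, e.g. to drive code generation
    print(type(node).__name__)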

Now, since Python’s ASTs are quite detailed, given an AST the front-end job isn’t terribly hard. I suppose you can have a simple prototype for some parts of the functionality ready quite quickly. However, getting to a complete solution will take more time, mainly because the semantics of the languages are different. A simple subset of the language (functions, basic types and so on) can be readily translated, but once you get into the more complex layers, you’ll need heavy machinery to emulate one language’s core in another. For example consider Python’s generators and list comprehensions which don’t exist in PHP (to my best knowledge, which is admittedly poor when PHP is involved).

To give you one final tip, consider the 2to3 tool created by the Python devs to translate Python 2 code to Python 3 code. Front-end-wise, it has most of the elements you need to translate Python to something. However, since the cores of Python 2 and 3 are similar, no emulation machinery is required there.


回答 2

编写翻译不是没有可能,尤其是考虑到乔尔的实习生是在夏天完成的。

如果您想讲一种语言,这很容易。如果您想做更多的事情,那会有些困难,但不要太多。最难的部分是,尽管任何图灵完备的语言都可以完成另一种图灵完备的语言所能做的事情,但是内置数据类型却可以显着改变一种语言所要做的事情。

例如:

word = 'This is not a word'
print word[::-2]

需要大量C++代码才能复现(好吧,您可以用一些循环结构写得相当短,但仍然如此)。

我想这有点离题了。

您是否曾经根据语言语法编写过分词器/解析器?如果没有,您可能想学习如何做,因为这是该项目的主要部分。我要做的是提供基本的Turing完整语法-与Python 字节码相当相似 。然后创建一个采用语言语法的词法分析器/解析器(也许使用BNF),并基于该语法将语言编译为中间语言。然后,您需要做的是相反的操作-根据语法将您的语言创建为目标语言的解析器。

我看到的最明显的问题是,一开始您可能会创建极其低效的代码,尤其是在Python等功能更强大的语言中。

但是,如果以这种方式进行操作,那么您可能会一直想出优化输出的方法。总结一下:

  • 阅读提供的语法
  • 将程序编译成中间(也包括图灵完整)语法
  • 将中间程序编译成最终语言(基于提供的语法)
  • …?
  • 利润!(?)

*功能强大,我的意思是这需要4行:

myinput = raw_input("Enter something: ")
print myinput.replace('a', 'A')
print sum(ord(c) for c in myinput)
print myinput[::-1]

向我展示另一种可以在4行中完成类似工作的语言,并且我将向您展示一种与Python一样强大的语言。

Writing a translator isn’t impossible, especially considering that Joel’s Intern did it over a summer.

If you want to do one language, it’s easy. If you want to do more, it’s a little more difficult, but not too much. The hardest part is that, while any turing complete language can do what another turing complete language does, built-in data types can change what a language does phenomenally.

For instance:

word = 'This is not a word'
print word[::-2]

takes a lot of C++ code to duplicate (ok, well you can do it fairly short with some looping constructs, but still).

That’s a bit of an aside, I guess.

Have you ever written a tokenizer/parser based on a language grammar? You’ll probably want to learn how to do that if you haven’t, because that’s the main part of this project. What I would do is come up with a basic Turing complete syntax – something fairly similar to Python bytecode. Then you create a lexer/parser that takes a language grammar (perhaps using BNF), and based on the grammar, compiles the language into your intermediate language. Then what you’ll want to do is do the reverse – create a parser from your language into target languages based on the grammar.

The most obvious problem I see is that at first you’ll probably create horribly inefficient code, especially in more powerful* languages like Python.

But if you do it this way then you’ll probably be able to figure out ways to optimize the output as you go along. To summarize:

  • read provided grammar
  • compile program into intermediate (but also Turing complete) syntax
  • compile intermediate program into final language (based on provided grammar)
  • …?
  • Profit!(?)

*by powerful I mean that this takes 4 lines:

myinput = raw_input("Enter something: ")
print myinput.replace('a', 'A')
print sum(ord(c) for c in myinput)
print myinput[::-1]

Show me another language that can do something like that in 4 lines, and I’ll show you a language that’s as powerful as Python.


回答 3

有几个答案告诉您不要打扰。好吧,那有什么帮助?你想学习吗?你可以学习。这是编译。碰巧您的目标语言不是机器代码,而是另一种高级语言。这一直都在做。

有一种相对简单的入门方法。首先,进入http://sourceforge.net/projects/lime-php/(如果您要使用PHP)或类似的代码,并查看示例代码。接下来,您可以使用一系列正则表达式编写词法分析器,并将令牌提供给生成的解析器。您的语义动作既可以直接使用另一种语言输出代码,也可以构建一些数据结构(例如对象,人),您可以对其进行按摩和遍历以生成输出代码。

您对PHP和Python很幸运,因为在很多方面,它们是彼此相同的语言,但是语法不同。困难的部分是克服语法形式和数据结构之间的语义差异。例如,Python具有列表和字典,而PHP仅具有assoc数组。

“学习者”方法是为语言的受限子集(例如仅打印语句,简单的数学和变量赋值)构建可以正常运行的内容,然后逐步消除限制。这基本上就是该领域的“大人物”所做的。

哦,由于您在Python中没有静态类型,因此最好编写并依赖PHP函数,例如“ python_add”,该函数根据Python的执行方式添加数字,字符串或对象。

显然,如果您允许它会变得更大。

There are a couple answers telling you not to bother. Well, how helpful is that? You want to learn? You can learn. This is compilation. It just so happens that your target language isn’t machine code, but another high-level language. This is done all the time.

There’s a relatively easy way to get started. First, go get http://sourceforge.net/projects/lime-php/ (if you want to work in PHP) or some such and go through the example code. Next, you can write a lexical analyzer using a sequence of regular expressions and feed tokens to the parser you generate. Your semantic actions can either output code directly in another language or build up some data structure (think objects, man) that you can massage and traverse to generate output code.
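
What such a regex-based lexical analyzer can look like, as a hedged sketch in Python (the token names and the tiny grammar here are made up purely for illustration):

import re

TOKEN_SPEC = [
    ('NUMBER', r'\d+'),
    ('NAME',   r'[A-Za-z_]\w*'),
    ('OP',     r'[=+\-*/()]'),
    ('SKIP',   r'[ \t]+'),
]
TOKEN_RE = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))

def tokenize(line):
    # yield (kind, text) pairs for one line of source, dropping whitespace
    for match in TOKEN_RE.finditer(line):
        if match.lastgroup != 'SKIP':
            yield match.lastgroup, match.group()

print(list(tokenize('x = 40 + 2')))
# [('NAME', 'x'), ('OP', '='), ('NUMBER', '40'), ('OP', '+'), ('NUMBER', '2')]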

You’re lucky with PHP and Python because in many respects they are the same language as each other, but with different syntax. The hard part is getting over the semantic differences between the grammar forms and data structures. For example, Python has lists and dictionaries, while PHP only has assoc arrays.

The “learner” approach is to build something that works OK for a restricted subset of the language (such as only print statements, simple math, and variable assignment), and then progressively remove limitations. That’s basically what the “big” guys in the field all did.

Oh, and since you don’t have static types in Python, it might be best to write and rely on PHP functions like “python_add” which adds numbers, strings, or objects according to the way Python does it.

Obviously, this can get much bigger if you let it.
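
To make the "learner" restricted-subset idea above concrete, here is a hedged toy sketch in Python: it only understands constant assignments and print-of-a-variable and emits PHP-flavoured text. It is nowhere near a real translator, just the shape of such a front end:

import ast

def translate(source):
    # toy front end: only constant assignments and print-of-a-name are supported
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            value = ast.literal_eval(node.value)
            lines.append('$%s = %r;' % (node.targets[0].id, value))
        elif (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)
              and getattr(node.value.func, 'id', None) == 'print'):
            lines.append('echo $%s;' % node.value.args[0].id)
        else:
            raise NotImplementedError(ast.dump(node))
    return '\n'.join(lines)

print(translate("x = 41\nprint(x)"))
# $x = 41;
# echo $x;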


回答 4

对于使用ast.parse而不是parser模块(我之前并不知道)这一点,我赞同@EliBendersky的观点,也热烈建议您去看看他的博客。我使用ast.parse编写了一个Python->JavaScript转换器(https://bitbucket.org/amirouche/pythonium)。Pythonium的设计是我在审视其他实现并亲自尝试之后得出的。我是从 https://github.com/PythonJS/PythonJS(也是我发起的项目)fork出Pythonium的,它实际上是一次完全重写。整体设计灵感来自PyPy和 http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-89-1.pdf 这篇论文。

我尝试过的所有事情,从开始到最佳解决方案,即使看起来像是Pythonium营销,实际上也不是(不要犹豫告诉我,网络礼仪是否看起来不正确):

  • 使用原型继承在Plain Old JavaScript中实现Python语义:AFAIK无法使用JS原型对象系统实现Python多重继承。后来我确实尝试使用其他技巧来做到这一点(参见getattribute)。据我所知,JavaScript中没有实现Python多重继承,最好的是单一继承+ mixins,但我不确定它们是否可以处理钻石继承。类似于Skulpt,但没有Google Clojure。

  • 我尝试过使用Google clojure,就像Skulpt(编译器)一样,而不是实际阅读Skulpt代码#fail。无论如何因为基于JS原型的对象系统仍然是不可能的。创建绑定非常困难,您需要编写JavaScript和大量样板代码(请参阅https://github.com/skulpt/skulpt/issues/50,其中我是幽灵)。那时,还没有明确的方法将绑定集成到构建系统中。我认为Skulpt是一个库,您只需要在html中包含.py文件即可执行,开发人员无需进行任何编译阶段。

  • 尝试过pyjaco(编译器),但是创建绑定(从Python代码调用Javascript代码)非常困难,每次创建的样板代码太多。现在,我认为pyjaco更接近Pythonium。pyjaco是用Python编写的(也是ast.parse),但是很多是用JavaScript编写的,并且使用原型继承。

我从未真正成功运行过Pyjamas #fail,也没有再去尝试读它的代码 #fail。但在我的印象里,Pyjamas做的是API->API(或者说框架到框架)的转换,而不是Python到JavaScript的转换:JavaScript框架消费页面中已有的数据或来自服务器的数据,Python代码只是"管道"。后来我才发现Pyjamas实际上是一个真正的python->js转换器。

我仍然认为API->API(或框架->框架)的转换是可行的,这基本上就是我在Pythonium中做的事情,只是层次更低。Pyjamas可能使用了与Pythonium相同的算法…

然后,我发现brython完全用Javascript编写,例如Skulpt,不需要编译和大量的绒毛…而是用JavaScript编写。

自从在该项目的过程中编写了第一行代码以来,我就知道PyPy,甚至包括PyPy的JavaScript后端。是的,如果找到它,您可以直接从PyPy用JavaScript生成一个Python解释器。人们说那是一场灾难,但我没有读到为什么。我认为原因是它们用于实现解释器的中间语言RPython,是为转换为C(也许还有asm)而定制的Python子集。Ira Baxter说,在构建某些东西时,您总是会做一些假设,并且会针对它的目标用途进行微调,对PyPy来说就是Python->C转换。这些假设在其他情况下可能并不适用,更糟糕的是,它们会引入开销;换句话说,直接翻译很可能总是更好。

用Python编写解释器听起来是一个(非常)好主意。但是出于性能原因,我对编译器更感兴趣,实际上将Python编译为JavaScript比解释它更容易。

我以将可以轻松转换为JavaScript的Python子集组合在一起的想法开始了PythonJS。起初,由于过去的经验,我什至没有去实施OO系统。我实现的翻译成JavaScript的Python子集是:

  • 在定义和调用中具有全参数语义的函数。这是我最引以为傲的部分。
  • while / if / elif / else
  • Python类型已转换为JavaScript类型(没有任何类型的python类型)
  • for只能迭代Javascript数组(对于in数组)
  • 透明访问JavaScript:如果您使用Python代码编写Array,它将被转换为JavaScript中的Array。就可用性而言,这是其竞争对手的最大成就。
  • 您可以将Python源代码中定义的函数传递给javascript函数。默认参数将被考虑在内。
  • 它添加了一个名为new的特殊功能,该功能被转换为JavaScript new,例如:new(Python)(1,2,spam,“ egg”)被转换为“ new Python(1,2,spam,” egg“)。
  • 翻译人员会自动处理“ var”。(来自Brett(PythonJS贡献者)的发现非常好。
  • 全局关键字
  • 关闭
  • Lambdas
  • 清单理解
  • 通过requirejs支持导入
  • 单类继承+通过classyjs的mixin

与Python的完整语义相比,这看起来很多,但实际上非常狭窄。它实际上是带有Python语法的JavaScript。

生成的JS是完美的,即没有任何开销,无法通过进一步编辑来提升性能。如果生成的代码还能改进,那么同样的改进也可以直接在Python源文件里完成。此外,编译器不依赖任何您能在 http://superherojs.com/ 那类手写.js中见到的JS技巧,因此生成的代码非常易读。

PythonJS这部分的直接后代是Pythonium Veloce模式。完整的实现可以在 https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/veloce/veloce.py?at=master 中找到,共793 SLOC,外加大约100 SLOC与其他翻译器共享的代码。

可以在Veloce模式下翻译pystones.py的改编版本,参见 https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pystone/?at=master

设置好基本的Python->JavaScript转换后,我选择了另一条路径来转换完整的Python:采用glib那种基于类的面向对象代码的做法,只不过目标语言是JS,因此您可以使用数组、类似map的对象和许多其他技巧,而这一部分全部是用Python编写的。IIRC,Pythonium转换器中没有手写的JavaScript代码。实现单一继承并不困难,以下才是使Pythonium完全兼容Python的困难部分:

  • spam.egg 在Python中总是翻译为 getattribute(spam, "egg")。我没有专门对此做性能分析,但我认为大量时间就是耗在这里,而且我不确定能否用asm.js或其他方式对其进行改进。
  • 方法解析顺序:即使算法本身是用Python编写的,把它翻译成Python Veloce兼容的代码也是一项巨大的工作。
  • getattribute:实际的getattribute解析算法有点棘手,而且它仍然不支持数据描述符。
  • 基于元类的类:我知道该在哪里插入代码,但仍然…
  • 最后但并非最不重要的一点:some_callable(…)始终被转换为 call(some_callable)。AFAIK转换器完全不做类型推断,因此每次调用时,都需要检查被调用对象是哪种对象,才能以正确的方式调用它。

这部分代码被提取到 https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/compliant/runtime.py?at=master 中。它是用Python编写的,并且与Python Veloce兼容。

实际的compliant翻译器 https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/compliant/compliant.py?at=master 不会直接生成JavaScript代码,最重要的是不会进行ast->ast转换。我试过ast->ast的做法,但ast即使比cst好一些,配合ast.NodeTransformer用起来也并不顺手;更重要的是,我根本不需要做ast->ast。

就我而言,至少对python ast做python ast可能会提高性能,因为我有时会在生成与块相关的代码之前检查块的内容,例如:

  • var / global:要能对某个东西加var,我必须知道哪些需要var、哪些不需要。与其先生成代码块、跟踪该块中创建了哪些变量、再把声明插到生成的函数块顶部,我只是在进入该块、真正访问子节点生成相关代码之前,先查找其中相关的变量赋值。
  • 到目前为止,生成器在JS中具有特殊的语法,因此当我要编写“ var my_generator = function”时,我需要知道哪个Python函数是生成器

因此,对于翻译的每个阶段,我都不会真正访问每个节点。

整个过程可以描述为:

Python source code -> Python ast -> Python source code compatible with Veloce mode -> Python ast -> JavaScript source code

Python内置函数是用Python代码(!)编写的,IIRC有一些与引导(bootstrap)类型相关的限制,但您可以使用Pythonium在compliant模式下能够转换的一切。可以看看 https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/compliant/builtins/?at=master

可以理解从pythonium兼容生成的JS代码的阅读,但是源映射将有很大帮助。

根据这种经验,我可以给您的宝贵建议是老屁:

  • 无论是在文献上还是在现有项目中,都对该主题进行了广泛的审查,这些项目是封闭的或免费的。当我回顾现有的不同项目时,我应该给它更多的时间和动力。
  • 问问题!如果我事先知道PyPy后端因为C/JavaScript语义不匹配带来的开销而没有用处,我可能早在6个月前、甚至3年前就想到Pythonium这个主意了。
  • 知道你想做什么,有一个目标。对于这个项目,我有不同的目标:使用一点点javascript,学习更多Python知识,并能够编写将在浏览器中运行的Python代码(更多内容以及下面的内容)。
  • 失败就是经验
  • 一小步就是一步
  • 从小开始
  • 远大的梦想
  • 做演示
  • 重复

仅使用Python Veloce模式,我感到非常高兴!但是一直以来,我发现我真正想要的是将我和其他人从Javascript中解放出来,但更重要的是能够以舒适的方式进行创建。这使我了解了Scheme,DSL,模型以及最终特定于域的模型(请参阅http://dsmforum.org/)。

关于Ira Baxter的回应:

这种估算完全没有帮助。PythonJS和Pythonium加起来花了我大约6个月的业余时间,所以6个月的全职工作可以期望做到更多。我想我们都知道在企业环境中100人年意味着什么,又不意味着什么…

当有人说某件事很难、或者更常见地说不可能时,我会回答"为看似不可能的问题找到解决方案只是时间问题";换句话说,没有什么是不可能的,除非它被证明是不可能的,而那需要一个数学证明…

如果没有证明不可能的话,那么它就有想象力的余地:

  • 寻找证明是不可能的

  • 如果这是不可能的,则可能存在可以解决的“劣等”问题。

要么

  • 如果不是不可能,那就找到解决办法

不只是乐观的想法。当我启动Python-> Javascript时,每个人都说这是不可能的。PyPy不可能。元类太难了。等…我认为,唯一使PyPy超过Scheme-> C纸(已有25年历史)的革命是一些自动JIT生成(基于我认为是用RPython解释器编写的提示)。

大多数说某事"困难"或"不可能"的人没有给出原因。C++很难解析?我知道,但仍然有(免费的)C++解析器。魔鬼藏在细节里?我知道。仅仅说不可能是没有帮助的,它比"没有帮助"更糟糕:它令人沮丧,而且有些人就是想打击别人。我是通过 https://stackoverflow.com/questions/22621164/how-to-automatically-generate-a-parser-code-to-code-translator-from-a-corpus 听说这个问题的。

什么对您来说是完美?这样便可以定义下一个目标,甚至可以达到整体目标。

我更想知道我可以对代码强制执行哪种类型的模式,而不是如何进行翻译,从而使代码的翻译(即:IoC,SOA?)更容易。

我看不出有什么模式是不能从一种语言翻译到另一种语言的,至少以一种不那么完美的方式也可以做到。既然语言到语言的翻译是可行的,您最好先以此为目标。我认为,按照 http://en.wikipedia.org/wiki/Graph_isomorphism_problem 的说法,两种计算机语言之间的翻译是一个树或DAG同构问题。何况我们已经知道它们都是图灵完备的,所以…

Framework->Framework的转换(我更愿意把它理解成API->API的转换)仍然是您可以记在心里的,用来改进生成的代码。例如:Prolog的语法非常特殊,但您仍然可以通过在Python中描述同样的图来做类似Prolog的计算…如果要我实现一个从Prolog到Python的转换器,我不会用Python实现合一(unification),而是放在一个C库里,并配上一套对Python使用者来说非常易读的"Python语法"。归根结底,语法只是我们赋予含义的"外衣"(这也是我开始研究scheme的原因)。语言的魔鬼藏在细节里,我说的不是语法。语言中用到的概念,比如getattribute钩子(没有它也能活),问题不大;但所需的VM特性(如尾递归优化)可能很难处理。如果初始程序不使用尾递归,您就不用在意;即使目标语言中没有尾递归,也可以用greenlets/事件循环来模拟它。

对于目标语言和源语言,请查找:

  • 大而具体的想法
  • 微小且共同的想法

由此将出现:

  • 容易翻译的东西
  • 难以翻译的事物

您也许还可以知道将翻译成快速和慢速代码的内容。

还有stdlib或任何库的问题,但没有明确的答案,这取决于您的目标。

惯用(idiomatic)代码或可读的生成代码也都有相应的解决方案…

因为可以提供慢速和/或关键路径的C实现,所以针对PHP之类的平台比针对浏览器要容易得多。

鉴于您的第一个项目是将Python转换为PHP,至少对于我所知道的PHP3子集,自定义veloce.py是最好的选择。如果您可以为PHP实现veloce.py,则可能可以运行兼容模式…同样,如果您可以将PHP转换为可以用php_veloce.py生成的PHP子集,则意味着您可以将PHP转换为veloce.py可以使用的Python子集,这意味着您可以将PHP转换为Javascript。只是说…

您还可以查看这些库:

另外,您可能对此博客文章(和评论)感兴趣:https : //www.rfk.id.au/blog/entry/pypy-js-poc-jit/

I will second @EliBendersky's point of view regarding using ast.parse instead of parser (which I did not know about before). I also warmly recommend reviewing his blog. I used ast.parse to write a Python->JavaScript translator (https://bitbucket.org/amirouche/pythonium). I came up with the Pythonium design by reviewing other implementations and trying things on my own. I forked Pythonium from https://github.com/PythonJS/PythonJS, which I also started; it's actually a complete rewrite. The overall design is inspired by PyPy and the http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-89-1.pdf paper.

Everything I tried, from beginning to the best solution, even if it looks like Pythonium marketing it really isn’t (don’t hesitate to tell me if something doesn’t seem correct to the netiquette):

  • Implement Python semantic in Plain Old JavaScript using prototype inheritance: AFAIK it’s impossible to implement Python multiple inheritance using JS prototype object system. I did try to do it using other tricks later (cf. getattribute). As far as I know there is no implementation of Python multiple inheritance in JavaScript, the best that exists is Single inhertance + mixins and I’m not sure they handle diamond inheritance. Kind of similar to Skulpt but without google clojure.

  • I tried with Google clojure, just like Skulpt (compiler) instead of actually reading Skulpt code #fail. Anyway because of JS prototype based object system still impossible. Creating binding was very very difficult, you need to write JavaScript and a lot of boilerplate code (cf. https://github.com/skulpt/skulpt/issues/50 where I am the ghost). At that time there was no clear way to integrate the binding in the build system. I think that Skulpt is a library and you just have to include your .py files in the html to be executed, no compilation phase required to be done by the developer.

  • Tried pyjaco (compiler) but creating bindings (calling JavaScript code from Python code) was very difficult; there was too much boilerplate code to create every time. Now I think pyjaco is the one nearest to Pythonium. pyjaco is written in Python (ast.parse too) but a lot of it is written in JavaScript and it uses prototype inheritance.

I never actually succeeded at running Pyjamas #fail and never tried to read the code again #fail. But in my mind Pyjamas was doing API->API translation (or framework to framework) and not Python to JavaScript translation. The JavaScript framework consumes data that is already in the page or data from the server; the Python code is only "plumbing". After that I discovered that Pyjamas was actually a real python->js translator.

Still, I think it's possible to do API->API (or framework->framework) translation, and that's basically what I do in Pythonium but at a lower level. Probably Pyjamas uses the same algorithm as Pythonium…

Then I discovered brython fully written in Javascript like Skulpt, no need for compilation and lot of fluff… but written in JavaScript.

Since the first line written in the course of this project, I knew about PyPy, even the JavaScript backend for PyPy. Yep, you can, if you find it, directly generate a Python interpreter in JavaScript from PyPy. People say it was a disaster. I read nowhere why. But I think the reason is that the intermediate language they use to implement the interpreter, RPython, is a subset of Python tailored to be translated to C (and maybe asm). Ira Baxter says you always make assumptions when you build something, and probably you fine-tune it to be the best at what it's meant to do, which in the case of PyPy is Python->C translation. Those assumptions might not be relevant in another context; worse, they can introduce overhead. In other words, a direct translation will most likely always be better.

Having the interpreter written in Python sounded like a (very) good idea. But I was more interested in a compiler for performance reasons also it’s actually more easy to compile Python to JavaScript than interpret it.

I started PythonJS with the idea of putting together a subset of Python that I could easily translate to JavaScript. At first I didn't even bother to implement an OO system because of past experience. The subset of Python that I managed to translate to JavaScript is:

  • function with full parameters semantic both in definition and calling. This is the part I am most proud of.
  • while/if/elif/else
  • Python types were converted to JavaScript types (there is no python types of any kind)
  • for could iterate over Javascript arrays only (for a in array)
  • Transparent access to JavaScript: if you write Array in the Python code it will be translated to Array in javascript. This is the biggest achievement in terms of usability over its competitors.
  • You can pass function defined in Python source to javascript functions. Default arguments will be taken into account.
  • It adds a special function called new which is translated to JavaScript new, e.g.: new(Python)(1, 2, spam, "egg") is translated to "new Python(1, 2, spam, "egg")".
  • "var" declarations are automatically handled by the translator (a very nice finding from Brett, a PythonJS contributor).
  • global keyword
  • closures
  • lambdas
  • list comprehensions
  • imports are supported via requirejs
  • single class inheritance + mixin via classyjs

This seems like a lot but actually very narrow compared to full blown semantic of Python. It’s really JavaScript with a Python syntax.

The generated JS is perfect ie. there is no overhead, it can not be improved in terms of performance by further editing it. If you can improve the generated code, you can do it from the Python source file too. Also, the compiler did not rely on any JS tricks that you can find in .js written by http://superherojs.com/, so it’s very readable.

The direct descendant of this part of PythonJS is the Pythonium Veloce mode. The full implementation can be found @ https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/veloce/veloce.py?at=master 793 SLOC + around 100 SLOC of shared code with the other translator.

An adapted version of pystones.py can be translated in Veloce mode cf. https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pystone/?at=master

After having set up basic Python->JavaScript translation, I chose another path to translate full Python to JavaScript: the glib way of doing object-oriented, class-based code, except the target language is JS, so you have access to arrays, map-like objects and many other tricks, and all that part was written in Python. IIRC there is no hand-written JavaScript code in the Pythonium translator. Getting single inheritance is not difficult; here are the difficult parts of making Pythonium fully compliant with Python:

  • spam.egg in Python is always translated to getattribute(spam, "egg"). I did not profile this in particular, but I think that's where it loses a lot of time, and I'm not sure I can improve upon it with asm.js or anything else.
  • method resolution order: even with the algorithm written in Python, translating it to Python Veloce compatible code was a big endeavour.
  • getattribute: the actual getattribute resolution algorithm is kind of tricky and it still doesn't support data descriptors
  • metaclass class based: I know where to plug the code, but still…
  • last but not least: some_callable(…) is always translated to "call(some_callable)". AFAIK the translator doesn't use inference at all, so every time you do a call you need to check which kind of object it is, to call it the way it's meant to be called.

This part is factored into https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/compliant/runtime.py?at=master. It's written in Python compatible with Python Veloce.
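
To make the getattribute(spam, "egg") idea above concrete, here is a rough Python model of such a lookup helper; this is only an illustrative assumption, not Pythonium's actual runtime code:

def getattribute(obj, name):
    # illustrative lookup: instance dict first, then the class MRO,
    # binding functions found on a class so they behave like methods
    instance_dict = getattr(obj, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            value = klass.__dict__[name]
            if hasattr(value, '__get__'):
                return value.__get__(obj, type(obj))
            return value
    raise AttributeError(name)

class Spam(object):
    def egg(self):
        return 42

print(getattribute(Spam(), 'egg')())   # 42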

The actual compliant translator https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/compliant/compliant.py?at=master doesn't generate JavaScript code directly and, most importantly, doesn't do ast->ast transformation. I tried the ast->ast approach, and an ast, even if nicer than a cst, is not nice to work with even with ast.NodeTransformer; more importantly, I don't need to do ast->ast.

Doing Python ast to Python ast, in my case at least, would maybe be a performance improvement, since I sometimes inspect the content of a block before generating the code associated with it, for instance:

  • var/global: to be able to var something I must know what I need to var and what I don't. Instead of generating a block, tracking which variables are created in that block, and inserting the declarations on top of the generated function block, I just look for the relevant variable assignments when I enter the block, before actually visiting the child nodes to generate the associated code.
  • yield: generators have, as of yet, a special syntax in JS, so I need to know which Python function is a generator when I want to write the "var my_generator = function" part

So I don’t really visit each node once for each phase of the translation.
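
As a hedged illustration of that pre-scan idea (again, not the actual Pythonium code), the assigned names and the generator-ness of a function can be discovered with one quick walk over its AST before any code is emitted; a real translator would also stop at nested function boundaries, which this sketch does not:

import ast

def names_and_is_generator(func_node):
    # one pass over an ast.FunctionDef: which names are assigned, and is it a generator?
    names = set()
    is_generator = False
    for node in ast.walk(func_node):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, (ast.Yield, ast.YieldFrom)):
            is_generator = True
    return names, is_generator

tree = ast.parse("def gen():\n    total = 0\n    yield total\n")
print(names_and_is_generator(tree.body[0]))   # ({'total'}, True)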

The overall process can be described as:

Python source code -> Python ast -> Python source code compatible with Veloce mode -> Python ast -> JavaScript source code

Python builtins are written in Python code (!). IIRC there are a few restrictions related to bootstrapping types, but you have access to everything that Pythonium can translate in compliant mode. Have a look at https://bitbucket.org/amirouche/pythonium/src/33898da731ee2d768ced392f1c369afd746c25d7/pythonium/compliant/builtins/?at=master

The JS code generated by Pythonium compliant mode can be read and understood, but source maps would greatly help.

The valuable advice I can give you in the light of this experience is the kind old farts give:

  • extensively review the subject both in literature and existing projects closed source or free. When I reviewed the different existing projects I should have given it way more time and motivation.
  • ask questions! If I had known beforehand that the PyPy backend was useless because of the overhead due to the C/JavaScript semantic mismatch, I might have had the Pythonium idea well before 6 months ago, maybe 3 years ago.
  • know what you want to do, have a target. For this project I had different objectives: practice a bit of JavaScript, learn more Python, and be able to write Python code that would run in the browser (more on that below).
  • failure is experience
  • a small step is a step
  • start small
  • dream big
  • do demos
  • iterate

With Python Veloce mode only, I’m very happy! But along the way I discovered that what I was really looking for was liberating me and others from Javascript but more importantly being able to create in a comfortable way. This lead me to Scheme, DSL, Models and eventually domain specific models (cf. http://dsmforum.org/).

About what Ira Baxter response:

The estimations are not helpful at all. It took me more or less 6 months of free time for both PythonJS and Pythonium, so I can expect more from 6 months of full time. I think we all know what 100 man-years in an enterprise context can mean, and not mean at all…

When someone says something is hard, or more often impossible, I answer that "it only takes time to find a solution for a problem that seems impossible"; in other words, nothing is impossible unless it's proven impossible, in which case there is a math proof…

If it’s not proven impossible then it leaves room for imagination:

  • finding a proof proving it’s impossible

and

  • If it is impossible there may be an “inferior” problem that can have a solution.

or

  • if it’s not impossible, finding a solution

It's not just optimistic thinking. When I started Python->JavaScript everybody was saying it was impossible. PyPy: impossible. Metaclasses: too hard. Etc… I think that the only revolution that PyPy brings over the Scheme->C paper (which is 25 years old) is some automatic JIT generation (based on hints written in the RPython interpreter, I think).

Most people that say that a thing is "hard" or "impossible" don't provide the reasons. C++ is hard to parse? I know that; still, there are (free) C++ parsers. Evil is in the details? I know that. Saying it's impossible alone is not helpful; it's even worse than "not helpful", it's discouraging, and some people mean to discourage others. I heard about this question via https://stackoverflow.com/questions/22621164/how-to-automatically-generate-a-parser-code-to-code-translator-from-a-corpus.

What would be perfection for you? That’s how you define next goal and maybe reach the overall goal.

I am more interested in knowing what kinds of patterns I could enforce on the code to make it easier to translate (ie: IoC, SOA ?) the code than how to do the translation.

I see no patterns that cannot be translated from one language to another language, at least in a less-than-perfect way. Since language-to-language translation is possible, you'd better aim for this first. I think that, according to http://en.wikipedia.org/wiki/Graph_isomorphism_problem, translation between two computer languages is a tree or DAG isomorphism, even if we already know that they are both Turing complete, so…

Framework->Framework, which I better visualize as API->API translation, might still be something to keep in mind as a way to improve the generated code. E.g.: Prolog has a very specific syntax, but you can still do Prolog-like computation by describing the same graph in Python… If I were to implement a Prolog-to-Python translator, I wouldn't implement unification in Python but in a C library, and come up with a "Python syntax" that is very readable for a Pythonist. In the end, syntax is only "painting" to which we give a meaning (that's why I started Scheme). Evil is in the details of the language, and I'm not talking about the syntax. Concepts used in the language, like the getattribute hook, are manageable (you can live without it), but required VM features like tail-recursion optimisation can be difficult to deal with. You don't care if the initial program doesn't use tail recursion, and even if there is no tail recursion in the target language, you can emulate it using greenlets/an event loop.
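
One hedged way to emulate missing tail-call optimisation, without reaching for greenlets, is a plain trampoline; the sketch below is only meant to show the idea:

def trampoline(func, *args):
    # keep calling as long as the "tail call" comes back as a thunk
    result = func(*args)
    while callable(result):
        result = result()
    return result

def countdown(n):
    if n == 0:
        return 'done'
    # return a thunk instead of recursing, so the Python stack never grows
    return lambda: countdown(n - 1)

print(trampoline(countdown, 100000))   # 'done', without hitting the recursion limit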

For target and source languages, look for:

  • Big and specific ideas
  • Tiny and common shared ideas

From this will emerge:

  • Things that are easy to translate
  • Things that are difficult to translate

You will also probably be able to know what will be translated to fast and slow code.

There is also the question of the stdlib or any library but there is no clear answer, it depends of your goals.

Idiomatic code or readable generated code have also solutions…

Targeting a platform like PHP is much more easy than targeting browsers since you can provide C-implementation of slow and/or critical path.

Given you first project is translating Python to PHP, at least for the PHP3 subset I know of, customising veloce.py is your best bet. If you can implement veloce.py for PHP then probably you will be able to run the compliant mode… Also if you can translate PHP to the subset of PHP you can generate with php_veloce.py it means that you can translate PHP to the subset of Python that veloce.py can consume which would mean that you can translate PHP to Javascript. Just saying…

You can also have a look at those libraries:

Also you might be interested by this blog post (and comments): https://www.rfk.id.au/blog/entry/pypy-js-poc-jit/


回答 5

您可以看一下Vala编译器,该编译器将Vala(一种类似于C#的语言)转换为C。

You could take a look at the Vala compiler, which translates Vala (a C#-like language) into C.


在Python中强制命名参数

问题:在Python中强制命名参数

在Python中,您可能有一个函数定义:

def info(object, spacing=10, collapse=1)

可以通过以下任何一种方式调用:

info(odbchelper)                    
info(odbchelper, 12)                
info(odbchelper, collapse=0)        
info(spacing=15, object=odbchelper)

多亏了Python允许任意顺序的参数(只要它们被命名)。

我们遇到的问题是,随着一些较大的函数不断增长,人们可能会在spacing和collapse之间添加参数,这意味着错误的值可能会传给没有按名称指定的参数。此外,有时并不总是清楚需要传入什么。我们想找到一种方法来强迫人们命名某些参数:不只是编码规范,最好是一个标志或pydev插件?

因此,在上述4个示例中,由于所有参数均已命名,因此只有最后一个示例可以通过检查。

很可能我们只会对某些函数启用它,但是对于如何实现这一点(甚至这是否可行)的任何建议,我们都将不胜感激。

In Python you may have a function definition:

def info(object, spacing=10, collapse=1)

which could be called in any of the following ways:

info(odbchelper)                    
info(odbchelper, 12)                
info(odbchelper, collapse=0)        
info(spacing=15, object=odbchelper)

thanks to Python’s allowing of any-order arguments, so long as they’re named.

The problem we’re having is as some of our larger functions grow, people might be adding parameters between spacing and collapse, meaning that the wrong values may be going to parameters that aren’t named. In addition sometimes it’s not always clear as to what needs to go in. We’re after a way to force people to name certain parameters – not just a coding standard, but ideally a flag or pydev plugin?

so that in the above 4 examples, only the last would pass the check as all the parameters are named.

Odds are we’ll only turn it on for certain functions, but any suggestions as to how to implement this – or if it’s even possible would be appreciated.


回答 0

在Python 3中-是,您可以*在参数列表中指定。

文档

“ *”或“ * identifier”之后的参数仅是关键字参数,并且只能使用关键字参数传递。

>>> def foo(pos, *, forcenamed):
...   print(pos, forcenamed)
... 
>>> foo(pos=10, forcenamed=20)
10 20
>>> foo(10, forcenamed=20)
10 20
>>> foo(10, 20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() takes exactly 1 positional argument (2 given)

也可以结合使用**kwargs

def foo(pos, *, forcenamed, **kwargs):

In Python 3 – Yes, you can specify * in the argument list.

From docs:

Parameters after “*” or “*identifier” are keyword-only parameters and may only be passed using keyword arguments.

>>> def foo(pos, *, forcenamed):
...   print(pos, forcenamed)
... 
>>> foo(pos=10, forcenamed=20)
10 20
>>> foo(10, forcenamed=20)
10 20
>>> foo(10, 20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() takes exactly 1 positional argument (2 given)

This can also be combined with **kwargs:

def foo(pos, *, forcenamed, **kwargs):
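
For example, a call against such a signature might look like this (the argument values are just an illustration; the extra keyword simply lands in kwargs):

>>> def foo(pos, *, forcenamed, **kwargs):
...     print(pos, forcenamed, kwargs)
... 
>>> foo(10, forcenamed=20, extra='spam')
10 20 {'extra': 'spam'}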

回答 1

您可以通过以下方式定义函数来强制人们在Python3中使用关键字参数。

def foo(*, arg0="default0", arg1="default1", arg2="default2"):
    pass

通过将第一个参数设置为不带名称的位置参数,您可以强制每个调用该函数的人都使用关键字参数,这正是我想问的。在Python2中,唯一的方法是定义一个这样的函数

def foo(**kwargs):
    pass

这将迫使调用者使用kwargs,但这并不是一个很好的解决方案,因为您随后必须进行检查以仅接受所需的参数。

You can force people to use keyword arguments in Python3 by defining a function in the following way.

def foo(*, arg0="default0", arg1="default1", arg2="default2"):
    pass

By making the first argument a positional argument with no name you force everyone who calls the function to use the keyword arguments which is what I think you were asking about. In Python2 the only way to do this is to define a function like this

def foo(**kwargs):
    pass

That’ll force the caller to use kwargs but this isn’t that great of a solution as you’d then have to put a check to only accept the argument that you need.
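
A hedged sketch of what that extra check might look like (the parameter names mirror the example above and are otherwise arbitrary):

def foo(**kwargs):
    # reject anything we don't know about, then fall back to defaults
    allowed = {'arg0', 'arg1', 'arg2'}
    unexpected = set(kwargs) - allowed
    if unexpected:
        raise TypeError('unexpected keyword argument(s): %s' % ', '.join(sorted(unexpected)))
    arg0 = kwargs.get('arg0', 'default0')
    arg1 = kwargs.get('arg1', 'default1')
    arg2 = kwargs.get('arg2', 'default2')
    return arg0, arg1, arg2

print(foo(arg1='x'))   # ('default0', 'x', 'default2')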


回答 2

的确,大多数编程语言都将参数顺序作为函数调用协定的一部分,但这不是必须的。为什么会这样?我对这个问题的理解是,Python在这方面是否与其他编程语言有所不同。除了适用于Python 2的其他良好答案外,请考虑以下因素:

__named_only_start = object()

def info(param1,param2,param3,_p=__named_only_start,spacing=10,collapse=1):
    if _p is not __named_only_start:
        raise TypeError("info() takes at most 3 positional arguments")
    return str(param1+param2+param3) +"-"+ str(spacing) +"-"+ str(collapse)

调用方要想按位置(而不触发异常)提供spacing和collapse参数,唯一的办法是:

info(arg1, arg2, arg3, module.__named_only_start, 11, 2)

在Python中,不使用属于其他模块的私有元素这一约定本来就是非常基本的。与Python本身的风格一样,这种参数约定也只能算是半强制执行。

否则,调用将采用以下形式:

info(arg1, arg2, arg3, spacing=11, collapse=2)

而这样的一个调用

info(arg1, arg2, arg3, 11, 2)

会将值11赋给参数_p,并由该函数的第一条指令引发异常。

特点:

  • _p=__named_only_start之前的参数可以按位置(或按名称)传入。
  • _p=__named_only_start之后的参数必须仅通过名称提供(除非获得并使用了关于特殊哨兵对象__named_only_start的知识)。

优点:

  • 参数在数量和含义上都是明确的(当然,后者还要求选了好名字)。
  • 如果将哨兵指定为第一个参数,则所有参数都需要按名称指定。
  • 调用该函数时,可以通过在相应位置使用哨兵对象__named_only_start来切换到位置模式。
  • 可以预期比其他替代方案更好的性能。

缺点:

  • 检查发生在运行时,而不是编译时。
  • 使用了额外的形参(尽管不是实参)和额外的检查。相对于常规函数而言,性能略有下降。
  • 功能是没有该语言直接支持的黑客(请参阅下面的注释)。
  • 调用该函数时,可以通过在正确的位置使用哨兵对象__named_only_start来切换到位置模式。是的,这也可以被看作一个优点。

请记住,该答案仅对Python 2有效。Python3实现了类似的,但非常优雅的,语言支持的机制,在其他答案中也有描述。

我发现,当我敞开思路去思考时,没有哪个问题或别人的决定会显得愚蠢、傻气或可笑。恰恰相反:我通常会学到很多东西。

True, most programming languages make parameter order part of the function call contract, but this doesn’t need to be so. Why would it? My understanding of the question is, then, if Python is any different to other programming languages in this respect. In addition to other good answers for Python 2, please consider the following:

__named_only_start = object()

def info(param1,param2,param3,_p=__named_only_start,spacing=10,collapse=1):
    if _p is not __named_only_start:
        raise TypeError("info() takes at most 3 positional arguments")
    return str(param1+param2+param3) +"-"+ str(spacing) +"-"+ str(collapse)

The only way a caller would be able to provide arguments spacing and collapse positionally (without an exception) would be:

info(arg1, arg2, arg3, module.__named_only_start, 11, 2)

The convention of not using private elements belonging to other modules already is very basic in Python. As with Python itself, this convention for parameters would only be semi-enforced.

Otherwise, calls would need to be of the form:

info(arg1, arg2, arg3, spacing=11, collapse=2)

A call

info(arg1, arg2, arg3, 11, 2)

would assign value 11 to parameter _p and an exception risen by the function’s first instruction.

Characteristics:

  • Parameters before _p=__named_only_start are admitted positionally (or by name).
  • Parameters after _p=__named_only_start must be provided by name only (unless knowledge about the special sentinel object __named_only_start is obtained and used).

Pros:

  • Parameters are explicit in number and meaning (the later if good names are also chosen, of course).
  • If the sentinel is specified as first parameter, then all arguments need to be specified by name.
  • When calling the function, it’s possible to switch to positional mode by using the sentinel object __named_only_start in the corresponding position.
  • A better performance than other alternatives can be anticipated.

Cons:

  • Checking occurs during run-time, not compile-time.
  • Use of an extra parameter (though not argument) and an additional check. Small performance degradation respect to regular functions.
  • Functionality is a hack without direct support by the language (see note below).
  • When calling the function, it’s possible to switch to positional mode by using the sentinel object __named_only_start in the right position. Yes, this can also be seen as a pro.

Please do keep in mind that this answer is only valid for Python 2. Python 3 implements the similar, but very elegant, language-supported mechanism described in other answers.

I’ve found that when I open my mind and think about it, no question or other’s decision seems stupid, dumb, or just silly. Quite on the contrary: I typically learn a lot.


回答 3

您可以通过使“伪造的”第一个关键字参数具有默认值而不会“自然地”出现,从而以在Python 2和Python 3中都可以使用的方式 来实现。该关键字参数前面可以有一个或多个没有值的参数:

_dummy = object()

def info(object, _kw=_dummy, spacing=10, collapse=1):
    if _kw is not _dummy:
        raise TypeError("info() takes 1 positional argument but at least 2 were given")

这将允许:

info(odbchelper)        
info(odbchelper, collapse=0)        
info(spacing=15, object=odbchelper)

但不是:

info(odbchelper, 12)                

如果将功能更改为:

def info(_kw=_dummy, spacing=10, collapse=1):

那么所有参数都必须具有关键字,并且info(odbchelper)将不再起作用。

这样,您便可以把其他关键字参数放在_kw之后的任何位置,而不必把它们放在最后一个条目之后。这通常是有意义的,例如,按逻辑分组或按字母顺序排列关键字有助于维护和开发。

因此,无需退回到def(**kwargs)的写法而在智能编辑器中丢失签名信息。您的"社会契约"是提供某些信息;通过强制(其中一部分)必须用关键字传入,它们出现的顺序就变得无关紧要了。

You can do that in a way that works in both Python 2 and Python 3, by making a “bogus” first keyword argument with a default value that will not occur “naturally”. That keyword argument can be preceded by one or more arguments without default values:

_dummy = object()

def info(object, _kw=_dummy, spacing=10, collapse=1):
    if _kw is not _dummy:
        raise TypeError("info() takes 1 positional argument but at least 2 were given")

This will allow:

info(odbchelper)        
info(odbchelper, collapse=0)        
info(spacing=15, object=odbchelper)

but not:

info(odbchelper, 12)                

If you change the function to:

def info(_kw=_dummy, spacing=10, collapse=1):

then all arguments must have keywords and info(odbchelper) will no longer work.

This will allow you to position additional keyword arguments any place after _kw, without forcing you to put them after the last entry. This often makes sense, e.g. grouping thing logically or arranging keywords alphabetically can help with maintenance and development.

So there is no need to revert to using def(**kwargs) and losing the signature information in your smart editor. Your social contract is to provide certain information; by forcing (some of) it to be passed as keywords, the order in which it is presented becomes irrelevant.


回答 4

更新:

我意识到使用**kwargs并不能解决问题。如果您的程序员根据需要更改函数参数,则可以例如将函数更改为:

def info(foo, **kwargs):

并且旧代码将再次中断(因为现在每个函数调用都必须包含第一个参数)。

确实归结为布莱恩所说的话。


(…)人们可能在spacingcollapse(…)之间添加了参数

通常,在更改函数时,新参数应始终结尾。否则,它将破坏代码。应该很明显。
如果有人更改了功能使代码中断,则必须拒绝此更改。
(正如布莱恩所说,这就像是一份合同)

(…)有时不清楚需要输入什么。

通过查看函数的签名(即def info(object, spacing=10, collapse=1)),应该立即看到每个没有默认值的参数都是强制性的。参数的用途是
什么,应该放在文档字符串中。


旧答案(保持完整性)

这可能不是一个好的解决方案:

您可以通过以下方式定义函数:

def info(**kwargs):
    ''' Some docstring here describing possible and mandatory arguments. '''
    spacing = kwargs.get('spacing', 15)
    obj = kwargs.get('object', None)
    if not obj:
       raise ValueError('object is needed')

kwargs是包含任何关键字参数的字典。您可以检查是否存在强制性参数,如果不存在,则引发异常。

不利的一面是,可能不再是显而易见的,哪些参数是可能的,但是使用适当的文档字符串,应该没问题。

Update:

I realized that using **kwargs would not solve the problem. If your programmers change function arguments as they wish, one could, for example, change the function to this:

def info(foo, **kwargs):

and the old code would break again (because now every function call has to include the first argument).

It really comes down to what Bryan says.


(…) people might be adding parameters between spacing and collapse (…)

In general, when changing functions, new arguments should always go to the end. Otherwise it breaks the code. Should be obvious.
If someone changes the function so that the code breaks, this change has to be rejected.
(As Bryan says, it is like a contract)

(…) sometimes it’s not always clear as to what needs to go in.

By looking at the signature of the function (i.e def info(object, spacing=10, collapse=1) ) one should immediately see that every argument that has not a default value, is mandatory.
What the argument is for, should go into the docstring.


Old answer (kept for completeness):

This is probably not a good solution:

You can define functions this way:

def info(**kwargs):
    ''' Some docstring here describing possible and mandatory arguments. '''
    spacing = kwargs.get('spacing', 15)
    obj = kwargs.get('object', None)
    if not obj:
       raise ValueError('object is needed')

kwargs is a dictionary that contains any keyword argument. You can check whether a mandatory argument is present and if not, raise an exception.

The downside is, that it might not be that obvious anymore, which arguments are possible, but with a proper docstring, it should be fine.


回答 5

Python 3的仅关键字参数(*)可以在Python 2.x中用**kwargs来模拟

考虑以下python3代码:

def f(pos_arg, *, no_default, has_default='default'):
    print(pos_arg, no_default, has_default)

及其行为:

>>> f(1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() takes 1 positional argument but 3 were given
>>> f(1, no_default='hi')
1 hi default
>>> f(1, no_default='hi', has_default='hello')
1 hi hello
>>> f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required keyword-only argument: 'no_default'
>>> f(1, no_default=1, wat='wat')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() got an unexpected keyword argument 'wat'

可以用下面的方法来模拟这一点;请注意,在"必需的命名参数"情况下,我擅自把TypeError换成了KeyError,要让它抛出同样的异常类型其实也不需要太多额外工作

def f(pos_arg, **kwargs):
    no_default = kwargs.pop('no_default')
    has_default = kwargs.pop('has_default', 'default')
    if kwargs:
        raise TypeError('unexpected keyword argument(s) {}'.format(', '.join(sorted(kwargs))))

    print(pos_arg, no_default, has_default)

行为:

>>> f(1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() takes exactly 1 argument (3 given)
>>> f(1, no_default='hi')
(1, 'hi', 'default')
>>> f(1, no_default='hi', has_default='hello')
(1, 'hi', 'hello')
>>> f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
KeyError: 'no_default'
>>> f(1, no_default=1, wat='wat')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in f
TypeError: unexpected keyword argument(s) wat

这一方法在python3.x中同样有效,但如果您只面向python3.x,则应避免这样写

The python3 keyword-only arguments (*) can be simulated in python2.x with **kwargs

Consider the following python3 code:

def f(pos_arg, *, no_default, has_default='default'):
    print(pos_arg, no_default, has_default)

and its behaviour:

>>> f(1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() takes 1 positional argument but 3 were given
>>> f(1, no_default='hi')
1 hi default
>>> f(1, no_default='hi', has_default='hello')
1 hi hello
>>> f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required keyword-only argument: 'no_default'
>>> f(1, no_default=1, wat='wat')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() got an unexpected keyword argument 'wat'

This can be simulated using the following, note I’ve taken the liberty of switching TypeError to KeyError in the “required named argument” case, it wouldn’t be too much work to make that the same exception type as well

def f(pos_arg, **kwargs):
    no_default = kwargs.pop('no_default')
    has_default = kwargs.pop('has_default', 'default')
    if kwargs:
        raise TypeError('unexpected keyword argument(s) {}'.format(', '.join(sorted(kwargs))))

    print(pos_arg, no_default, has_default)

And behaviour:

>>> f(1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() takes exactly 1 argument (3 given)
>>> f(1, no_default='hi')
(1, 'hi', 'default')
>>> f(1, no_default='hi', has_default='hello')
(1, 'hi', 'hello')
>>> f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
KeyError: 'no_default'
>>> f(1, no_default=1, wat='wat')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in f
TypeError: unexpected keyword argument(s) wat

The recipe works equally as well in python3.x, but should be avoided if you are python3.x only


回答 6

您可以将函数声明为仅接收**args。这会强制使用关键字参数,但您需要做一些额外的工作来确保只传入有效的名称。

def foo(**args):
   print args

foo(1,2) # Raises TypeError: foo() takes exactly 0 arguments (2 given)
foo(hello = 1, goodbye = 2) # Works fine.

You could declare your functions as receiving **args only. That would mandate keyword arguments but you’d have some extra work to make sure only valid names are passed in.

def foo(**args):
   print args

foo(1,2) # Raises TypeError: foo() takes exactly 0 arguments (2 given)
foo(hello = 1, goodbye = 2) # Works fine.

回答 7

正如其他答案所说,更改功能签名是一个坏主意。在末尾添加新参数,或者在插入参数的情况下修复每个调用方。

如果仍然想这样做,请使用函数装饰器和inspect.getargspec函数。用法大致如下:

@require_named_args
def info(object, spacing=10, collapse=1):
    ....

require_named_args的实现留给读者作为练习。

我不会打扰。每次调用该函数的速度都会很慢,通过更仔细地编写代码,您将获得更好的结果。

As other answers say, changing function signatures is a bad idea. Either add new parameters to the end, or fix every caller if arguments are inserted.

If you still want to do it, use a function decorator and the inspect.getargspec function. It would be used something like this:

@require_named_args
def info(object, spacing=10, collapse=1):
    ....

Implementation of require_named_args is left as an exercise for the reader.

I would not bother. It will be slow every time the function is called, and you will get better results from writing code more carefully.
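
For completeness, here is one hedged way such a decorator could be written; it is only a sketch that rejects all positional arguments and uses the argspec for a friendlier error message:

import functools
import inspect

def require_named_args(func):
    # the answer suggests inspect.getargspec; on Python 3 the equivalent is getfullargspec
    arg_names = inspect.getfullargspec(func).args

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if args:
            raise TypeError('%s() only accepts keyword arguments: %s'
                            % (func.__name__, ', '.join(arg_names)))
        return func(**kwargs)
    return wrapper

@require_named_args
def info(object, spacing=10, collapse=1):
    return object, spacing, collapse

print(info(object='odbchelper', spacing=12))   # ('odbchelper', 12, 1)
info('odbchelper')                              # raises TypeError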


回答 8

您可以使用**运算符:

def info(**kwargs):

这样,人们被迫使用命名参数。

You could use the ** operator:

def info(**kwargs):

this way people are forced to use named parameters.


回答 9

def cheeseshop(kind, *arguments, **keywords):

在python中,如果使用*args,则意味着您可以为该参数传递任意数量的位置参数,它们在函数内部将作为一个元组来访问

如果使用** kw表示其关键字参数,则可以按dict的方式进行访问-您可以传递n个数量的kw args,并且如果要限制该用户必须按顺序输入序列和参数,则不要使用*和**-(它为大型架构提供通用解决方案的pythonic方法…)

如果要使用默认值限制功能,则可以在其中检查

def info(object, spacing, collapse):
  spacing = spacing or 10
  collapse = collapse or 1
def cheeseshop(kind, *arguments, **keywords):

in python if use *args that means you can pass n-number of positional arguments for this parameter – which will be accessed as a tuple inside the function.

And if use **kw that means its keyword arguments, that can be access as dict – you can pass n-number of kw args, and if you want to restrict that user must enter the sequence and arguments in order then don’t use * and ** – (its pythonic way to provide generic solutions for big architectures…)

if you want to restrict your function with default values then you can check inside it

def info(object, spacing, collapse):
  spacing = 10 if spacing is None else spacing
  collapse = 1 if collapse is None else collapse

回答 10

我不明白为什么程序员会首先在其他两个之间添加参数。

如果您希望函数参数与名称一起使用(例如, info(spacing=15, object=odbchelper)),则定义它们的顺序无关紧要,因此您最好将新参数放在最后。

如果您确实希望订单很重要,那么更改后就别指望了!

I don’t get why a programmer will add a parameter in between two others in the first place.

If you want the function parameters to be used with names (e.g. info(spacing=15, object=odbchelper) ) then it shouldn’t matter what order they are defined in, so you might as well put the new parameters at the end.

If you do want the order to matter then can’t expect anything to work if you change it!


PEP 8,为什么关键字参数或默认参数值的’=’周围没有空格?

问题:PEP 8,为什么关键字参数或默认参数值的’=’周围没有空格?

为什么PEP 8建议在关键字参数或默认参数值的=周围不要使用空格?

这是否与"在Python代码中=的其他所有出现位置周围都建议加空格"相矛盾?

怎么:

func(1, 2, very_long_variable_name=another_very_long_variable_name)

优于:

func(1, 2, very_long_variable_name = another_very_long_variable_name)

Python的BDFL与讨论/解释的任何链接将不胜感激。

请注意,这个问题更多的是关于kwargs而不是默认值,我只是使用了PEP 8中的措词。

我不是在征求意见,而是在问这个决定背后的原因。这更像是在问为什么我会在C程序中把{和if语句放在同一行,而不是问我是否应该这样做。

Why does PEP 8 recommend not having spaces around = in a keyword argument or a default parameter value?

Is this inconsistent with recommending spaces around every other occurrence of = in Python code?

How is:

func(1, 2, very_long_variable_name=another_very_long_variable_name)

better than:

func(1, 2, very_long_variable_name = another_very_long_variable_name)

Any links to discussion/explanation by Python’s BDFL will be appreciated.

Mind, this question is more about kwargs than default values, i just used the phrasing from PEP 8.

I’m not soliciting opinions. I’m asking for reasons behind this decision. It’s more like asking why would I use { on the same line as if statement in a C program, not whether I should use it or not.


回答 0

我猜这是因为关键字参数与变量赋值本质上是不同的。

例如,有很多这样的代码:

kw1 = some_value
kw2 = some_value
kw3 = some_value
some_func(
    1,
    2,
    kw1=kw1,
    kw2=kw2,
    kw3=kw3)

如您所见,把变量赋给同名的关键字参数是很常见的,因此不带空格的写法反而提高了可读性:更容易看出我们在使用关键字参数,而不是在把一个变量赋值给它自己。

同样,参数往往在同一行中,而赋值通常每个都在各自的行中,因此节省空间可能是一个重要问题。

I guess that it is because a keyword argument is essentially different than a variable assignment.

For example, there is plenty of code like this:

kw1 = some_value
kw2 = some_value
kw3 = some_value
some_func(
    1,
    2,
    kw1=kw1,
    kw2=kw2,
    kw3=kw3)

As you see, it makes complete sense to assign a variable to a keyword argument named exactly the same, so it improves readability to see them without spaces. It is easier to recognize that we are using keyword arguments and not assigning a variable to itself.

Also, parameters tend to go in the same line whereas assignments usually are each one in their own line, so saving space is likely to be an important matter there.


回答 1

我不会使用very_long_variable_name作为默认参数。所以考虑一下:

func(1, 2, axis='x', angle=90, size=450, name='foo bar')

在此:

func(1, 2, axis = 'x', angle = 90, size = 450, name = 'foo bar')

同样,将变量用作默认值也没有多大意义。也许某些常量变量(实际上不是常量),在这种情况下,我将使用全为大写的名称,以描述性的方式表示,但尽可能简短。所以没有another_very _…

I wouldn’t use very_long_variable_name as a default argument. So consider this:

func(1, 2, axis='x', angle=90, size=450, name='foo bar')

over this:

func(1, 2, axis = 'x', angle = 90, size = 450, name = 'foo bar')

Also, it doesn’t make much sense to use variables as default values. Perhaps some constant variables (which aren’t really constants) and in that case I would use names that are all caps, descriptive yet short as possible. So no another_very_…


回答 2

有优点也有缺点。

我非常不喜欢符合PEP8的代码读起来的样子。我不接受"very_long_variable_name=another_very_long_variable_name比very_long_variable_name = another_very_long_variable_name对人类更易读"这种论点。人们不是这样阅读的。这是额外的认知负担,尤其是在没有语法高亮的情况下。

但是,这也有很大的好处。如果遵守这条空格规则,用工具专门搜索关键字参数就会有效得多。

There are pros and cons.

I very much dislike how PEP8 compliant code reads. I don’t buy into the argument that very_long_variable_name=another_very_long_variable_name can ever be more human readable than very_long_variable_name = another_very_long_variable_name. This is not how people read. It’s an additional cognitive load, particularly in the absence of syntax highlighting.

There is a significant benefit, however. If the spacing rules are adhered to, it makes searching for parameters exclusively using tools much more effective.


回答 3

IMO省略了用于args的空间,从而提供了更清晰的arg / value对可视分组。看起来不那么混乱。

IMO leaving out the spaces for args provides cleaner visual grouping of the arg/value pairs; it looks less cluttered.


回答 4

我认为有几个原因,尽管我可能只是在进行合理化:

  1. 它节省了空间,允许更多的函数定义和调用适合一行,并为参数名称本身节省了更多空间。
  2. 通过将每个关键字和值连接在一起,可以更轻松地用逗号后的空格分隔不同的参数。这意味着您可以快速查看提供的参数数量。
  3. 这样,语法便与可能具有相同名称的变量分配不同。
  4. 此外,该语法(甚至更多)不同于相等检查a == b,相等检查也可以是调用内的有效表达式。

I think there are several reasons for this, although I might just be rationalizing:

  1. It saves space, allowing more function definitions and calls to fit on one line and saving more space for the argument names themselves.
  2. By joining each keyword and value, you can more easily separate the different arguments by the space after the comma. This means you can quickly eyeball how many arguments you’ve supplied.
  3. The syntax is then distinct from variable assignments, which may have the same name.
  4. Additionally, the syntax is (even more) distinct from equality checks a == b which can also be valid expressions inside a call.

回答 5

对我来说,它使代码更具可读性,因此是一个很好的约定。

我认为变量赋值和函数关键字赋值在风格上的关键区别在于:前者一行上通常只有一个=,而后者一行上通常有多个=。

如果没有其他考虑,我们会更喜欢foo = 42而不是foo=42,因为后者不是等号通常的排版方式,而且前者用空格在视觉上很好地分隔了变量和值。

但当一行中有多个赋值时,我们会更喜欢f(foo=42, bar=43, baz=44)而不是f(foo = 42, bar = 43, baz = 44),因为前者用空格在视觉上分隔了几个赋值,而后者没有,使得关键字/值对在哪里变得有点难以分辨。

换一种说法:这个约定背后有一种一致性。这种一致性是:"最高层级的分隔"通过空格在视觉上变得更清晰,而更低层级的分隔则不加空格(否则会与分隔更高层级的空白混淆)。对于变量赋值,最高层级的分隔是在变量和值之间;对于函数关键字赋值,最高层级的分隔是在各个赋值彼此之间。

For me it makes code more readable and is thus a good convention.

I think the key difference in terms of style between variable assignments and function keyword assignments is that there should only be a single = on a line for the former, whereas generally there are multiple =s on a line for the latter.

If there were no other considerations, we would prefer foo = 42 to foo=42, because the latter is not how equals signs are typically formatted, and because the former nicely visually separates the variable and value with whitespace.

But when there are multiple assignments on one line, we prefer f(foo=42, bar=43, baz=44) to f(foo = 42, bar = 43, baz = 44), because the former visually separates the several assignments with whitespace, whereas the latter does not, making it a bit harder to see where the keyword/value pairs are.

Here’s another way of putting it: there is a consistency behind the convention. That consistency is this: the “highest level of separation” is made visually clearer via spaces. Any lower levels of separation are not (because it would be confused with the whitespace separating the higher level). For variable assignment, the highest level of separation is between variable and value. For function keyword assignment, the highest level of separation is between the individual assignments themselves.


python中列表推导或生成器表达式的行连续

问题:python中列表推导或生成器表达式的行连续

您应该如何拆分很长的列表推导式?

[something_that_is_pretty_long for something_that_is_pretty_long in somethings_that_are_pretty_long]

我还看到某个地方的人不喜欢使用’\’来分隔行,但从不理解为什么。这是什么原因呢?

How are you supposed to break up a very long list comprehension?

[something_that_is_pretty_long for something_that_is_pretty_long in somethings_that_are_pretty_long]

I have also seen somewhere that people that dislike using ‘\’ to break up lines, but never understood why. What is the reason behind this?


回答 0

[x
 for
 x
 in
 (1,2,3)
]

效果很好,因此您几乎可以随心所欲。我个人更喜欢

 [something_that_is_pretty_long
  for something_that_is_pretty_long
  in somethings_that_are_pretty_long]

\之所以不受欢迎,是因为它出现在行尾,要么不显眼,要么需要额外的填充,而当行长改变时还必须重新调整:

x = very_long_term                     \
  + even_longer_term_than_the_previous \
  + a_third_term

在这种情况下,请使用括号:

x = (very_long_term
     + even_longer_term_than_the_previous
     + a_third_term)
[x
 for
 x
 in
 (1,2,3)
]

works fine, so you can pretty much do as you please. I’d personally prefer

 [something_that_is_pretty_long
  for something_that_is_pretty_long
  in somethings_that_are_pretty_long]

The reason why \ isn’t appreciated very much is that it appears at the end of a line, where it either doesn’t stand out or needs extra padding, which has to be fixed when line lengths change:

x = very_long_term                     \
  + even_longer_term_than_the_previous \
  + a_third_term

In such cases, use parens:

x = (very_long_term
     + even_longer_term_than_the_previous
     + a_third_term)

回答 1

我不反对:

variable = [something_that_is_pretty_long
            for something_that_is_pretty_long
            in somethings_that_are_pretty_long]

在这种情况下,您不需要\。总的来说,我认为人们避免使用\是因为它有点丑,而且如果它不是一行上最后一个字符,还会带来问题(要确保它后面没有空格)。不过为了控制行长,我认为用它总比不用好得多。

由于\在上述情况下或对于带括号的表达式不是必需的,所以我实际上很少需要使用它。

I’m not opposed to:

variable = [something_that_is_pretty_long
            for something_that_is_pretty_long
            in somethings_that_are_pretty_long]

You don’t need \ in this case. In general, I think people avoid \ because it’s slightly ugly, but also can give problems if it’s not the very last thing on the line (make sure no whitespace follows it). I think it’s much better to use it than not, though, in order to keep your line lengths down.

Since \ isn’t necessary in the above case, or for parenthesized expressions, I actually find it fairly rare that I even need to use it.


回答 2

在处理多个数据结构的列表时,也可以使用多个缩进。

new_list = [
    {
        'attribute 1': a_very_long_item.attribute1,
        'attribute 2': a_very_long_item.attribute2,
        'list_attribute': [
            {
                'dict_key_1': attribute_item.attribute2,
                'dict_key_2': attribute_item.attribute2
            }
            for attribute_item
            in a_very_long_item.list_of_items
         ]
    }
    for a_very_long_item
    in a_very_long_list
    if a_very_long_item not in [some_other_long_item
        for some_other_long_item 
        in some_other_long_list
    ]
]

请注意,它还如何使用if语句过滤到另一个列表。将if语句放到自己的行中也是有用的。

You can also make use of multiple indentations in cases where you’re dealing with a list of several data structures.

new_list = [
    {
        'attribute 1': a_very_long_item.attribute1,
        'attribute 2': a_very_long_item.attribute2,
        'list_attribute': [
            {
                'dict_key_1': attribute_item.attribute2,
                'dict_key_2': attribute_item.attribute2
            }
            for attribute_item
            in a_very_long_item.list_of_items
         ]
    }
    for a_very_long_item
    in a_very_long_list
    if a_very_long_item not in [some_other_long_item
        for some_other_long_item 
        in some_other_long_list
    ]
]

Notice how it also filters against another list using an if statement. Dropping the if statement onto its own line is useful as well.
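
A tiny runnable version of the same pattern, with made-up data, may make the layout easier to follow:

# Two input lists: the orders to summarize and the ids to exclude.
orders = [
    {"id": 1, "items": ["apple", "banana"]},
    {"id": 2, "items": ["cherry"]},
]
cancelled_ids = [2]

summaries = [
    {
        "order_id": order["id"],
        "item_list": [
            {"name": item, "length": len(item)}
            for item
            in order["items"]
        ],
    }
    for order
    in orders
    if order["id"] not in [
        cancelled_id
        for cancelled_id
        in cancelled_ids
    ]
]
# summaries now describes only order 1, because order 2 was filtered out.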


排序Python`import x`和`from x import y`语句的正确方法是什么?

问题:排序Python`import x`和`from x import y`语句的正确方法是什么?

Python 风格指南建议像这样对导入进行分组：

导入应按以下顺序分组:

  1. 标准库导入
  2. 相关的第三方导入
  3. 本地应用程序/特定于库的导入

但是，它没有提及这两种不同的导入写法应如何排列：

from foo import bar
import foo

对它们进行排序有多种方法(假设所有这些导入都属于同一组):

  • 首先from..import,然后import

    from g import gg
    from x import xx
    import abc
    import def
    import x
    
  • 首先import,然后from..import

    import abc
    import def
    import x
    from g import gg
    from x import xx
    
  • 按模块名称的字母顺序,忽略导入的类型

    import abc
    import def
    from g import gg
    import x
    from xx import xx
    

PEP8 没有提及这种情况下的首选顺序，某些 IDE 的“清理导入”功能可能只是按该功能开发者自己的喜好来处理。

我在寻找另一个对此做出澄清的 PEP，或者来自 BDFL（或其他 Python 核心开发人员）的相关评论/邮件。请不要发布只陈述个人偏好的主观答案。

The Python style guide suggests grouping imports like this:

Imports should be grouped in the following order:

  1. standard library imports
  2. related third party imports
  3. local application/library specific imports

However, it does not mention anything about how the two different kinds of imports should be laid out:

from foo import bar
import foo

There are multiple ways to sort them (let’s assume all these imports belong to the same group):

  • first from..import, then import

    from g import gg
    from x import xx
    import abc
    import def
    import x
    
  • first import, then from..import

    import abc
    import def
    import x
    from g import gg
    from x import xx
    
  • alphabetic order by module name, ignoring the kind of import

    import abc
    import def
    from g import gg
    import x
    from xx import xx
    

PEP8 does not mention a preferred order for this, and the “cleanup imports” features some IDEs have probably just do whatever the developer of that feature preferred.

I’m looking for another PEP clarifying this or a relevant comment/email from the BDFL (or another Python core developer). Please don’t post subjective answers stating your own preference.


回答 0

导入通常按字母顺序排序，这在 PEP 8 之外的很多地方都有说明。

按字母顺序排序的模块读起来更快，也更容易搜索。毕竟 Python 最讲究可读性。此外，这样也更容易确认某个东西是否已经导入，并能避免重复导入。

PEP 8 中没有关于这种排序的任何规定，所以具体使用哪种方式完全取决于你的选择。

根据一些知名站点和代码库中的参考资料以及流行程度来看，按字母顺序排序是通行的做法。

例如:

import httplib
import logging
import random
import StringIO
import time
import unittest
from nova.api import openstack
from nova.auth import users
from nova.endpoint import cloud

要么

import a_standard
import b_standard

import a_third_party
import b_third_party

from a_soc import f
from a_soc import g
from b_soc import d

Reddit 的官方代码库也指出，一般应使用 PEP-8 的导入顺序，不过有以下几点补充：

for each imported group the order of imports should be:
import <package>.<module> style lines in alphabetical order
from <package>.<module> import <symbol> style in alphabetical order


PS: isort 工具可以自动对你的导入进行排序。

Imports are generally sorted alphabetically, and this is described in various places besides PEP 8.

Alphabetically sorted modules are quicker to read and search through. After all, Python is all about readability. It also makes it easier to verify that something is imported, and it helps avoid duplicated imports.

There is nothing in PEP 8 about this particular ordering, so it is entirely a matter of which style you choose.

According to a few references from reputable sites and repositories, as well as general popularity, alphabetical ordering is the way to go.

For example, like this:

import httplib
import logging
import random
import StringIO
import time
import unittest
from nova.api import openstack
from nova.auth import users
from nova.endpoint import cloud

OR

import a_standard
import b_standard

import a_third_party
import b_third_party

from a_soc import f
from a_soc import g
from b_soc import d

Reddit’s official repository also states that, in general, PEP-8 import ordering should be used; however, there are a few additions:

for each imported group the order of imports should be:
import <package>.<module> style lines in alphabetical order
from <package>.<module> import <symbol> style in alphabetical order


PS: the isort utility automatically sorts your imports.
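
A small sketch of using isort programmatically (this assumes the third-party isort package, version 5 or later, is installed; isort.code() returns the source text with its imports sorted):

import isort

messy = (
    "import sys\n"
    "from collections import OrderedDict\n"
    "import json\n"
    "import os\n"
)
# Print the same snippet with its imports grouped and alphabetized.
print(isort.code(messy))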


回答 1

根据CIA的内部编码约定(WikiLeaks Vault 7泄漏的一部分),python导入应分为三类:

  1. 标准库导入
  2. 第三方导入
  3. 特定于应用程序的导入

在这些组中,应按字典顺序对导入进行排序,而忽略大小写:

import foo
from foo import bar
from foo.bar import baz
from foo.bar import Quux
from Foob import ar

According to the CIA’s internal coding conventions (part of the WikiLeaks Vault 7 leak), python imports should be grouped into three groups:

  1. Standard library imports
  2. Third-party imports
  3. Application-specific imports

Imports should be ordered lexicographically within these groups, ignoring case:

import foo
from foo import bar
from foo.bar import baz
from foo.bar import Quux
from Foob import ar

回答 2

PEP 8 确实对此没有任何说明。在这一点上并没有既定约定，这也不意味着 Python 社区就必须定义一个。某种选择可能对一个项目更好，对另一个项目却最糟……这只是个偏好问题，因为每种方案都各有利弊。但如果你想遵循约定，就必须遵守你引用的那个主要顺序：

  1. 标准库导入
  2. 相关的第三方导入
  3. 本地应用程序/特定于库的导入

例如，Google 在此页面中建议在每个类别（标准库/第三方/你自己的）内按字典顺序对导入进行排序。但在 Facebook、Yahoo 等其他地方，可能又是另一种约定……

Indeed, PEP 8 says nothing about it. There is no convention on this point, and that doesn’t mean the Python community absolutely needs to define one. A choice can be better for one project but worse for another… It is a question of preference, since each solution has pros and cons. But if you want to follow conventions, you have to respect the principal order you quoted:

  1. standard library imports
  2. related third party imports
  3. local application/library specific imports

For example, Google recommends on this page that imports be sorted lexicographically within each category (standard/third party/yours). But at Facebook, Yahoo and elsewhere, it may be another convention…


回答 3

我强烈推荐 reorder-python-imports。它遵循已接受答案中的第二个选项，并且还能集成到 pre-commit 中，非常有帮助。

I highly recommend reorder-python-imports. It follows the 2nd option of the accepted answer and also integrates into pre-commit, which is super helpful.


回答 4

所有 import x 语句应按 x 的值排序，所有 from x import y 语句也应按 x 的值按字母顺序排序，并且排序后的 from x import y 语句组必须排在排序后的 import x 语句组之后。

import abc
import def
import x
from g import gg
from x import xx
from z import a

All import x statements should be sorted alphabetically by the value of x, all from x import y statements should likewise be sorted by the value of x, and the sorted group of from x import y statements must follow the sorted group of import x statements.

import abc
import def
import x
from g import gg
from x import xx
from z import a

回答 5

我觉得已接受的答案有点太冗长。这是TLDR:

在每个分组中，应按照每个模块的完整包路径，按字典顺序对导入进行排序，并忽略大小写

Google代码样式指南

因此,第三个选项是正确的:

import abc
import def
from g import yy  # changed gg->yy for illustrative purposes
import x
from xx import xx

I feel like the accepted answer is a bit too verbose. Here is the TL;DR:

Within each grouping, imports should be sorted lexicographically, ignoring case, according to each module’s full package path

Google code style guide

So, the third option is correct:

import abc
import def
from g import yy  # changed gg->yy for illustrative purposes
import x
from xx import xx