What is a mixin, and why are they useful?

Question:

In “Programming Python”, Mark Lutz mentions “mixins”. I’m from a C/C++/C# background and I have not heard the term before. What is a mixin?

Reading between the lines of this example (which I’ve linked to because it’s quite long), I’m presuming it’s a case of using multiple inheritance to extend a class as opposed to ‘proper’ subclassing. Is this right?

Why would I want to do that rather than put the new functionality into a subclass? For that matter, why would a mixin/multiple inheritance approach be better than using composition?

What separates a mixin from multiple inheritance? Is it just a matter of semantics?


Answer 0

A mixin is a special kind of multiple inheritance. There are two main situations where mixins are used:

  1. You want to provide a lot of optional features for a class.
  2. You want to use one particular feature in a lot of different classes.

For an example of number one, consider werkzeug’s request and response system. I can make a plain old request object by saying:

from werkzeug import BaseRequest

class Request(BaseRequest):
    pass

If I want to add accept header support, I would make it:

from werkzeug import BaseRequest, AcceptMixin

class Request(AcceptMixin, BaseRequest):
    pass

If I wanted to make a request object that supports accept headers, etags, authentication, and user agents, I could do this:

from werkzeug import BaseRequest, AcceptMixin, ETagRequestMixin, UserAgentMixin, AuthenticationMixin

class Request(AcceptMixin, ETagRequestMixin, UserAgentMixin, AuthenticationMixin, BaseRequest):
    pass

The difference is subtle, but in the above examples, the mixin classes weren’t made to stand on their own. In more traditional multiple inheritance, the AuthenticationMixin (for example) would probably be something more like Authenticator. That is, the class would probably be designed to stand on its own.
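
To see the same idea without depending on Werkzeug (recent Werkzeug releases, as far as I know, no longer ship these separate mixin classes), here is a minimal self-contained sketch. The class names echo the ones above, but the bodies are made up for illustration and are not Werkzeug's implementation:

```python
class BaseRequest:
    """Hypothetical stand-in for a framework request; it only stores headers."""

    def __init__(self, headers):
        self.headers = headers


class AcceptMixin:
    """Optional feature: relies on `self.headers`, which it does not define."""

    @property
    def accept_mimetypes(self):
        raw = self.headers.get("Accept", "")
        return [part.strip() for part in raw.split(",") if part.strip()]


class UserAgentMixin:
    """Another optional feature with the same dependency on `self.headers`."""

    @property
    def user_agent(self):
        return self.headers.get("User-Agent", "unknown")


# Pick only the features you want; the mixins come before the real base class.
class Request(AcceptMixin, UserAgentMixin, BaseRequest):
    pass


req = Request({"Accept": "text/html, application/json", "User-Agent": "curl/8.0"})
print(req.accept_mimetypes)
print(req.user_agent)
```

Neither mixin is useful alone: `AcceptMixin()` has no `headers` attribute, so reading `accept_mimetypes` on it fails. That is the "not made to stand on their own" part.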


Answer 1

First, you should note that mixins only exist in multiple-inheritance languages. You can’t do a mixin in Java or C#.

Basically, a mixin is a stand-alone base type that provides limited functionality and polymorphic resonance for a child class. If you’re thinking in C#, think of an interface that you don’t have to actually implement because it’s already implemented; you just inherit from it and benefit from its functionality.

Mixins are typically narrow in scope and not meant to be extended.

[edit — as to why:]

I suppose I should address why, since you asked. The big benefit is that you don’t have to do it yourself over and over again. In C#, the place where a mixin would probably help most is the Dispose pattern. Whenever you implement IDisposable, you almost always want to follow the same pattern, but you end up writing and rewriting the same basic code with minor variations. If there were an extendable Dispose mixin, you could save yourself a lot of extra typing.

[edit 2 — to answer your other questions]

What separates a mixin from multiple inheritance? Is it just a matter of semantics?

Yes. The difference between a mixin and standard multiple inheritance is just a matter of semantics; a class that has multiple inheritance might utilize a mixin as part of that multiple inheritance.

The point of a mixin is to create a type that can be “mixed in” to any other type via inheritance without affecting the inheriting type while still offering some beneficial functionality for that type.

Again, think of an interface that is already implemented.

I personally don’t use mixins since I develop primarily in a language that doesn’t support them, so I’m having a really difficult time coming up with a decent example that will supply that “aha!” moment for you. But I’ll try again. I’m going to use an example that’s contrived (most languages already provide the feature in some way or another) but that will, hopefully, explain how mixins are supposed to be created and used. Here goes:

Suppose you have a type that you want to be able to serialize to and from XML. You want the type to provide a “ToXML” method that returns a string containing an XML fragment with the data values of the type, and a “FromXML” that allows the type to reconstruct its data values from an XML fragment in a string. Again, this is a contrived example, so perhaps you use a file stream, or an XML Writer class from your language’s runtime library… whatever. The point is that you want to serialize your object to XML and get a new object back from XML.

The other important point in this example is that you want to do this in a generic way. You don’t want to have to implement a “ToXML” and “FromXML” method for every type that you want to serialize; you want some generic means of ensuring that your type will do this and it just works. You want code reuse.

If your language supported it, you could create the XmlSerializable mixin to do your work for you. This type would implement the ToXML and the FromXML methods. It would, using some mechanism that’s not important to the example, be capable of gathering all the necessary data from any type that it’s mixed in with to build the XML fragment returned by ToXML and it would be equally capable of restoring that data when FromXML is called.

And... that’s it. To use it, you would have any type that needs to be serialized to XML inherit from XmlSerializable. Whenever you needed to serialize or deserialize that type, you would simply call ToXML or FromXML. In fact, since XmlSerializable is a fully-fledged type and polymorphic, you could conceivably build a document serializer that doesn’t know anything about your original type, accepting only, say, an array of XmlSerializable types.
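
Since this is a Python thread, here is a rough sketch of what such an XmlSerializable mixin could look like. Everything in it is an assumption made for illustration: the names `XmlSerializableMixin`, `to_xml`, `from_xml`, `Point`, and `serialize_all` are invented, and the "mechanism that's not important" is simply taken to be the instance's attribute dictionary:

```python
import xml.etree.ElementTree as ET


class XmlSerializableMixin:
    """Hypothetical mixin: serializes the instance attributes of its host class."""

    def to_xml(self):
        root = ET.Element(type(self).__name__)
        for name, value in vars(self).items():
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        root = ET.fromstring(text)
        obj = cls.__new__(cls)  # bypass __init__; values come back as strings
        for child in root:
            setattr(obj, child.tag, child.text)
        return obj


class Point(XmlSerializableMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y


def serialize_all(items):
    """Knows nothing about Point; it only relies on the mixin's interface."""
    return [item.to_xml() for item in items]


print(Point(1, 2).to_xml())
```

The sketch deliberately ignores type conversion on the way back (everything is restored as a string); a real serializer would need a schema or type hints for that.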

Now imagine using this scenario for other things, like creating a mixin that ensures that every class that mixes it in logs every method call, or a mixin that provides transactionality to the type that mixes it in. The list can go on and on.
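
The "logs every method call" idea can be sketched in Python as well. This is only one possible approach, under the assumption that wrapping the public functions defined directly on each subclass is good enough; `CallLoggingMixin`, `call_log`, and `Account` are made-up names:

```python
import functools
import inspect


def _logged(func):
    """Wrap func so that each call is recorded before it runs."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        self.call_log.append(func.__name__)
        return func(self, *args, **kwargs)
    return wrapper


class CallLoggingMixin:
    """Hypothetical mixin: records calls to public methods of classes that mix it in."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name, attr in list(vars(cls).items()):
            # Only plain functions defined directly on the subclass are wrapped.
            if inspect.isfunction(attr) and not name.startswith("_"):
                setattr(cls, name, _logged(attr))

    @property
    def call_log(self):
        # Created lazily so the mixin needs no __init__ of its own.
        return self.__dict__.setdefault("_call_log", [])


class Account(CallLoggingMixin):
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount


acct = Account()
acct.deposit(10)
acct.withdraw(3)
print(acct.call_log)
```

`Account` itself contains no logging code; the behavior arrives purely by inheriting from the mixin.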

If you just think of a mixin as a small base type designed to add a small amount of functionality to a type without otherwise affecting that type, then you’re golden.

Hopefully. :)


Answer 2

This answer aims to explain mixins with examples that are:

  • self-contained: short, with no need to know any libraries to understand the example.

  • in Python, not in other languages.

    It is understandable that there were examples from other languages such as Ruby since the term is much more common in those languages, but this is a Python thread.

It shall also consider the controversial question:

Is multiple inheritance necessary or not to characterize a mixin?

Definitions

I have yet to see a citation from an “authoritative” source clearly saying what a mixin is in Python.

I have seen 2 possible definitions of a mixin (if they are to be considered as different from other similar concepts such as abstract base classes), and people don’t entirely agree on which one is correct.

The consensus may vary between different languages.

Definition 1: no multiple inheritance

A mixin is a class such that some method of the class uses a method which is not defined in the class.

Therefore the class is not meant to be instantiated, but rather serve as a base class. Otherwise the instance would have methods that cannot be called without raising an exception.

A constraint which some sources add is that the class may not contain data, only methods, but I don’t see why this is necessary. In practice however, many useful mixins don’t have any data, and base classes without data are simpler to use.

A classic example is the implementation of all comparison operators from only <= and ==:

class ComparableMixin(object):
    """This class has methods which use `<=` and `==`,
    but this class does NOT implement those methods."""
    def __ne__(self, other):
        return not (self == other)
    def __lt__(self, other):
        return self <= other and (self != other)
    def __gt__(self, other):
        return not self <= other
    def __ge__(self, other):
        return self == other or self > other

class Integer(ComparableMixin):
    def __init__(self, i):
        self.i = i
    def __le__(self, other):
        return self.i <= other.i
    def __eq__(self, other):
        return self.i == other.i

assert Integer(0) <  Integer(1)
assert Integer(0) != Integer(1)
assert Integer(1) >  Integer(0)
assert Integer(1) >= Integer(1)

# It is possible to instantiate a mixin:
o = ComparableMixin()
# but its comparisons are not meaningful on a bare instance,
# because `<=` and `==` are not implemented by this class:
#o != o

This particular example could have been achieved via the functools.total_ordering() decorator, but the game here was to reinvent the wheel:

import functools

@functools.total_ordering
class Integer(object):
    def __init__(self, i):
        self.i = i
    def __le__(self, other):
        return self.i <= other.i
    def __eq__(self, other):
        return self.i == other.i

assert Integer(0) < Integer(1)
assert Integer(0) != Integer(1)
assert Integer(1) > Integer(0)
assert Integer(1) >= Integer(1)

Definition 2: multiple inheritance

A mixin is a design pattern in which some method of a base class uses a method it does not define, and that method is meant to be implemented by another base class, not by the derived class as in Definition 1.

The term mixin class refers to base classes which are intended to be used in that design pattern (TODO those that use the method, or those that implement it?)

It is not easy to decide if a given class is a mixin or not: the method could be just implemented on the derived class, in which case we’re back to Definition 1. You have to consider the author’s intentions.

This pattern is interesting because it is possible to recombine functionalities with different choices of base classes:

class HasMethod1(object):
    def method(self):
        return 1

class HasMethod2(object):
    def method(self):
        return 2

class UsesMethod10(object):
    def usesMethod(self):
        return self.method() + 10

class UsesMethod20(object):
    def usesMethod(self):
        return self.method() + 20

class C1_10(HasMethod1, UsesMethod10): pass
class C1_20(HasMethod1, UsesMethod20): pass
class C2_10(HasMethod2, UsesMethod10): pass
class C2_20(HasMethod2, UsesMethod20): pass

assert C1_10().usesMethod() == 11
assert C1_20().usesMethod() == 21
assert C2_10().usesMethod() == 12
assert C2_20().usesMethod() == 22

# Nothing prevents implementing the method
# on the derived class, as in Definition 1:

class C3_10(UsesMethod10):
    def method(self):
        return 3

assert C3_10().usesMethod() == 13

Authoritative Python occurrences

The official documentation for collections.abc explicitly uses the term Mixin Methods.

It states that if a class:

  • implements __next__
  • inherits from a single class Iterator

then the class gets an __iter__ mixin method for free.

Therefore, at least at this point in the documentation, a mixin does not require multiple inheritance, and this usage is coherent with Definition 1.

The documentation could of course be contradictory at different points, and other important Python libraries might be using the other definition in their documentation.

This page also uses the term Set mixin, which clearly suggests that classes like Set and Iterator can be called Mixin classes.
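
The `Iterator` case described above is easy to check directly. A minimal sketch (the `Countdown` class is invented for the example):

```python
from collections.abc import Iterator


class Countdown(Iterator):
    """Implements only __next__; __iter__ is inherited as a mixin method."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


print(list(Countdown(3)))               # [3, 2, 1]
print("__iter__" in vars(Countdown))    # False: Countdown never defines it
```

Only single inheritance is involved here, which is the point being made about Definition 1.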

In other languages

  • Ruby: Clearly does not require multiple inheritance for mixins, as mentioned in major reference books such as Programming Ruby and The Ruby Programming Language

  • C++: A method that is not implemented is a pure virtual method.

    Definition 1 coincides with the definition of an abstract class (a class that has a pure virtual method). That class cannot be instantiated.

    Definition 2 is possible with virtual inheritance: Multiple Inheritance from two derived classes


Answer 3

I think of them as a disciplined way of using multiple inheritance, because ultimately a mixin is just another Python class that (might) follow the conventions about classes that are called mixins.

My understanding of the conventions that govern something you would call a mixin is that a mixin:

  • adds methods but not instance variables (class constants are OK)
  • only inherits from object (in Python)

That way it limits the potential complexity of multiple inheritance, and makes it reasonably easy to track the flow of your program by limiting where you have to look (compared to full multiple inheritance). They are similar to Ruby modules.

If I want to add instance variables (with more flexibility than allowed for by single inheritance) then I tend to go for composition.

Having said that, I have seen classes called XYZMixin that do have instance variables.
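
To make the convention concrete, here is a small sketch of how I read it: behavior without state goes into a mixin, and anything that needs its own instance variables is used by composition instead. All names (`JsonReprMixin`, `Cache`, `User`, `Squarer`) are invented for the example:

```python
import json


class JsonReprMixin:
    """Adds a method only; it keeps no instance variables of its own."""

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)


class Cache:
    """Stateful helper: held as an attribute (composition), not mixed in."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, key, compute):
        if key not in self._store:
            self._store[key] = compute(key)
        return self._store[key]


class User(JsonReprMixin):
    def __init__(self, name, age):
        self.name = name
        self.age = age


class Squarer:
    def __init__(self):
        self.cache = Cache()  # composition: the state lives in the helper object

    def square(self, n):
        return self.cache.get_or_compute(n, lambda k: k * k)


print(User("ann", 30).to_json())
print(Squarer().square(4))
```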


Answer 4

A mixin is a programming concept in which a class provides functionality but is not meant to be instantiated. The main purpose of mixins is to provide standalone functionality, and ideally a mixin does not inherit from other mixins and avoids state. Languages such as Ruby have direct language support for mixins; Python does not. However, you can use multiple inheritance to get the same functionality in Python.

I watched this video http://www.youtube.com/watch?v=v_uKI2NOLEM to understand the basics of mixins. It is quite useful for a beginner to understand the basics of mixins and how they work and the problems you might face in implementing them.

Wikipedia is still the best: http://en.wikipedia.org/wiki/Mixin


Answer 5

What separates a mixin from multiple inheritance? Is it just a matter of semantics?

A mixin is a limited form of multiple inheritance. In some languages the mechanism for adding a mixin to a class is slightly different (in terms of syntax) from that of inheritance.

In the context of Python especially, a mixin is a parent class that provides functionality to subclasses but is not intended to be instantiated itself.

What might lead you to say “that’s just multiple inheritance, not really a mixin” is a class that could be mistaken for a mixin but can actually be instantiated and used on its own. So the difference is indeed semantic, and very real.

Example of Multiple Inheritance

This example, from the documentation, is an OrderedCounter:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

It subclasses both the Counter and the OrderedDict from the collections module.

Both Counter and OrderedDict are intended to be instantiated and used on their own. However, by subclassing them both, we can have a counter that is ordered and reuses the code in each object.

This is a powerful way to reuse code, but it can also be problematic. If it turns out there’s a bug in one of the objects, fixing it without care could create a bug in the subclass.

Example of a Mixin

Mixins are usually promoted as the way to get code reuse without potential coupling issues that cooperative multiple inheritance, like the OrderedCounter, could have. When you use mixins, you use functionality that isn’t as tightly coupled to the data.

Unlike the example above, a mixin is not intended to be used on its own. It provides new or different functionality.

For example, the standard library has a couple of mixins in the socketserver module.

Forking and threading versions of each type of server can be created using these mix-in classes. For instance, ThreadingUDPServer is created as follows:

class ThreadingUDPServer(ThreadingMixIn, UDPServer):
    pass

The mix-in class comes first, since it overrides a method defined in UDPServer. Setting the various attributes also changes the behavior of the underlying server mechanism.

In this case, the mixin methods override the methods in the UDPServer object definition to allow for concurrency.

The overridden method appears to be process_request and it also provides another method, process_request_thread. Here it is from the source code:

import threading

class ThreadingMixIn:
    """Mix-in class to handle each request in a new thread."""

    # Decides how threads will act upon termination of the
    # main process
    daemon_threads = False

    def process_request_thread(self, request, client_address):
        """Same as in BaseServer but as a thread.
        In addition, exception handling is done here.
        """
        try:
            self.finish_request(request, client_address)
        except Exception:
            self.handle_error(request, client_address)
        finally:
            self.shutdown_request(request)

    def process_request(self, request, client_address):
        """Start a new thread to process the request."""
        t = threading.Thread(target=self.process_request_thread,
                             args=(request, client_address))
        t.daemon = self.daemon_threads
        t.start()
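
Why the mix-in has to come first can be checked against the method resolution order. A small sketch, assuming only the standard `socketserver` module (the `MixinLast` class is invented to show the wrong order):

```python
import socketserver

# The order in which Python searches for attributes on the threaded server.
print([cls.__name__ for cls in socketserver.ThreadingUDPServer.__mro__])

# With the mix-in listed first, its process_request wins the lookup.
threaded = (socketserver.ThreadingUDPServer.process_request
            is socketserver.ThreadingMixIn.process_request)


class MixinLast(socketserver.UDPServer, socketserver.ThreadingMixIn):
    """Same bases, wrong order: the mix-in's override is never reached."""


# Here the plain, non-threaded BaseServer.process_request is found first.
shadowed = MixinLast.process_request is socketserver.BaseServer.process_request

print(threaded, shadowed)
```

No server is started here; only the class objects are inspected.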

A Contrived Example

This is a mixin that is mostly for demonstration purposes – most objects will evolve beyond the usefulness of this repr:

class SimpleInitReprMixin(object):
    """mixin, don't instantiate - useful for classes instantiable
    by keyword arguments to their __init__ method.
    """
    __slots__ = () # allow subclasses to use __slots__ to prevent __dict__
    def __repr__(self):
        kwarg_strings = []
        d = getattr(self, '__dict__', None)
        if d is not None:
            for k, v in d.items():
                kwarg_strings.append('{k}={v}'.format(k=k, v=repr(v)))
        slots = getattr(self, '__slots__', None)
        if slots is not None:
            for k in slots:
                v = getattr(self, k, None)
                kwarg_strings.append('{k}={v}'.format(k=k, v=repr(v)))
        return '{name}({kwargs})'.format(
          name=type(self).__name__,
          kwargs=', '.join(kwarg_strings)
          )

A class would use it like this:

class Foo(SimpleInitReprMixin): # add other mixins and/or extend another class here
    __slots__ = 'foo',
    def __init__(self, foo=None):
        self.foo = foo
        super(Foo, self).__init__()

And in an interactive session:

>>> f1 = Foo('bar')
>>> f2 = Foo()
>>> f1
Foo(foo='bar')
>>> f2
Foo(foo=None)

Answer 6

I think there have been some good explanations here but I wanted to provide another perspective.

In Scala, you can do mixins as has been described here but what is very interesting is that the mixins are actually ‘fused’ together to create a new kind of class to inherit from. In essence, you do not inherit from multiple classes/mixins, but rather, generate a new kind of class with all the properties of the mixin to inherit from. This makes sense since Scala is based on the JVM where multiple-inheritance is not currently supported (as of Java 8). This mixin class type, by the way, is a special type called a Trait in Scala.

It’s hinted at in the way a class is defined: class NewClass extends FirstMixin with SecondMixin with ThirdMixin …

I’m not sure if the CPython interpreter does the same (mixin class-composition) but I wouldn’t be surprised. Also, coming from a C++ background, I would not call an ABC or ‘interface’ equivalent to a mixin — it’s a similar concept but divergent in use and implementation.
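
For comparison, here is a small sketch of what Python does with several bases. As far as the language semantics go, nothing is fused or copied: Python computes a linear method resolution order (MRO) and walks it at lookup time. The class names are borrowed from the Scala line above and are purely illustrative:

```python
class FirstMixin:
    def hello(self):
        return "first"


class SecondMixin:
    def hello(self):
        return "second"


class Base:
    def hello(self):
        return "base"


class NewClass(FirstMixin, SecondMixin, Base):
    pass


# The linearized lookup order for NewClass.
print([cls.__name__ for cls in NewClass.__mro__])
# ['NewClass', 'FirstMixin', 'SecondMixin', 'Base', 'object']

print(NewClass().hello())            # first
print("hello" in vars(NewClass))     # False: the method is found via the MRO, not copied
```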


Answer 7

I’d advise against mix-ins in new Python code, if you can find any other way around it (such as composition-instead-of-inheritance, or just monkey-patching methods into your own classes) that isn’t much more effort.

In old-style classes you could use mix-ins as a way of grabbing a few methods from another class. But in the new-style world everything, even the mix-in, inherits from object. That means that any use of multiple inheritance naturally introduces MRO issues.

There are ways to make multiple-inheritance MRO work in Python, most notably the super() function, but it means you have to do your whole class hierarchy using super(), and it’s considerably more difficult to understand the flow of control.
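
To show what "doing your whole class hierarchy using super()" means in practice, here is a minimal sketch with invented class names. Every `__init__` in the chain has to cooperate; if one of them forgets to call `super().__init__()`, the classes after it in the MRO are silently skipped:

```python
class Base:
    def __init__(self):
        self.trace = ["Base"]


class AuditMixin:
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # hands control to the next class in the MRO
        self.trace.append("AuditMixin")


class TimingMixin:
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.trace.append("TimingMixin")


class Service(AuditMixin, TimingMixin, Base):
    def __init__(self):
        super().__init__()
        self.trace.append("Service")


print(Service().trace)
# ['Base', 'TimingMixin', 'AuditMixin', 'Service']
```

Note that `super()` inside `AuditMixin` calls `TimingMixin.__init__`, a class it knows nothing about. That is the control-flow difficulty mentioned above.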


Answer 8

Perhaps a couple of examples will help.

If you’re building a class and you want it to act like a dictionary, you can define all the various __xxx__ methods necessary. But that’s a bit of a pain. As an alternative, you can just define a few, and inherit (in addition to any other inheritance) from UserDict.DictMixin (in Python 3, collections.abc.MutableMapping plays this role). This will have the effect of automatically defining all the rest of the dictionary API.
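
In Python 3 the role of DictMixin is played by `collections.abc.MutableMapping`. A minimal sketch of the idea (the `LowerCaseDict` class is invented for the example): define five methods, and the rest of the dictionary API comes along as mixin methods.

```python
from collections.abc import MutableMapping


class LowerCaseDict(MutableMapping):
    """Dictionary-like class whose keys are case-insensitive strings."""

    def __init__(self):
        self._data = {}

    # The five methods that must be written by hand:
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


d = LowerCaseDict()
d["Content-Type"] = "text/html"
d.update({"ACCEPT": "*/*"})        # update() is inherited
print("content-type" in d)         # __contains__ is inherited
print(d.get("accept"))             # get() is inherited
print(sorted(d.keys()))            # keys() is inherited
```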

A second example: the GUI toolkit wxPython allows you to make list controls with multiple columns (like, say, the file display in Windows Explorer). By default, these lists are fairly basic. You can add additional functionality, such as the ability to sort the list by a particular column by clicking on the column header, by inheriting from ListCtrl and adding appropriate mixins.


Answer 9

It’s not a Python example, but in the D programming language the term mixin refers to a construct used in much the same way: adding a pile of stuff to a class.

In D (which, by the way, doesn’t do multiple inheritance) this is done by inserting a template (think syntactically aware and safe macros and you will be close) into a scope. This allows for a single line of code in a class, struct, function, module or whatever to expand to any number of declarations.


Answer 10

The OP mentioned never having heard of mixins in C++; perhaps that is because in C++ they are usually implemented with the Curiously Recurring Template Pattern (CRTP). Also, @Ciro Santilli mentioned that a mixin is implemented via an abstract base class in C++. While an abstract base class can be used to implement a mixin, it is overkill: what virtual functions do at run time can be achieved with templates at compile time, without the overhead of a virtual table lookup.

The CRTP pattern is described in detail here

I have converted the python example in @Ciro Santilli’s answer into C++ using template class below:

#include <cassert>

// CRTP mixin: derives !=, <, > and >= from the == and <= that T defines.
template <class T>
class ComparableMixin {
public:
    bool operator!=(const ComparableMixin &other) const {
        // Logical NOT is required here: bitwise ~ applied to a bool is always non-zero.
        return !(derived() == static_cast<const T&>(other));
    }
    bool operator<(const ComparableMixin &other) const {
        return (*this != other) && (derived() <= static_cast<const T&>(other));
    }
    bool operator>(const ComparableMixin &other) const {
        return !(derived() <= static_cast<const T&>(other));
    }
    bool operator>=(const ComparableMixin &other) const {
        return (derived() == static_cast<const T&>(other)) || (*this > other);
    }
protected:
    // Protected constructor: the mixin can be inherited from but not instantiated.
    ComparableMixin() {}
private:
    // View this object as the derived type T.
    const T& derived() const { return static_cast<const T&>(*this); }
};

class Integer : public ComparableMixin<Integer> {
public:
    Integer(int i) : i(i) {}
    int i;
    bool operator<=(const Integer &other) const { return i <= other.i; }
    bool operator==(const Integer &other) const { return i == other.i; }
};

int main() {
    Integer i(0);
    Integer j(1);
    // ComparableMixin<Integer> c; // compilation error: the constructor is protected.
    assert(i < j);
    assert(i != j);
    assert(j > i);
    assert(j >= i);

    return 0;
}

EDIT: Added protected constructor in ComparableMixin so that it can only be inherited and not instantiated. Updated the example to show how protected constructor will cause compilation error when an object of ComparableMixin is created.


Answer 11

Maybe an example from ruby can help:

You can include the mixin Comparable and define one method, "<=>(other)"; the mixin then provides all of these methods:

<(other)
>(other)
==(other)
<=(other)
>=(other)
between?(min, max)

It does this by invoking <=>(other) and translating the result.

"instance <=> other" returns 0 if both objects are equal, less than 0 if instance is smaller than other, and more than 0 if instance is bigger.
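For readers coming from Python, the same idea can be sketched as a mixin class. This is only an analogue under stated assumptions: `ComparableMixin`, its `compare` hook (standing in for Ruby's `<=>`) and the `Version` class are all hypothetical names, not part of Ruby or of the Python standard library.

```python
class ComparableMixin:
    """Derives the comparison operators from a single compare() method.

    compare(other) is a hypothetical hook mirroring Ruby's <=>: it returns
    a negative number, zero, or a positive number.
    """

    def __lt__(self, other):
        return self.compare(other) < 0

    def __gt__(self, other):
        return self.compare(other) > 0

    def __le__(self, other):
        return self.compare(other) <= 0

    def __ge__(self, other):
        return self.compare(other) >= 0

    def __eq__(self, other):
        return self.compare(other) == 0

    def between(self, low, high):
        return low <= self <= high


class Version(ComparableMixin):
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def compare(self, other):
        mine, theirs = (self.major, self.minor), (other.major, other.minor)
        # -1 if smaller, 0 if equal, 1 if bigger.
        return (mine > theirs) - (mine < theirs)


print(Version(1, 2) < Version(1, 10))                           # True
print(Version(1, 5).between(Version(1, 0), Version(2, 0)))      # True
```

The class that mixes this in writes one method and receives six; that trade is the whole point of the Comparable mixin.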


Answer 12

A mixin gives a way to add functionality to a class, i.e. you can use methods defined in a module by including the module in the desired class. Ruby doesn’t support multiple inheritance, but it provides mixins as an alternative way to achieve much the same effect.

Here is an example that shows how multiple inheritance is approximated using mixins.

module A    # you create a module
    def a1  # lets have a method 'a1' in it
    end
    def a2  # Another method 'a2'
    end
end

module B    # let's say we have another module
    def b1  # A method 'b1'
    end
    def b2  #another method b2
    end
end

class Sample    # we create a class 'Sample'
    include A   # including module 'A' in the class 'Sample' (mixin)
    include B   # including module B as well

    def s1      # class 'Sample' contains a method 's1'
    end
end

samp = Sample.new    # creating an instance object 'samp'

# we can access methods from module A and B in our class(power of mixin)

samp.a1     # accessing method 'a1' from module A
samp.a2     # accessing method 'a2' from module A
samp.b1     # accessing method 'b1' from module B
samp.b2     # accessing method 'b2' from module B
samp.s1     # accessing method 's1' inside the class Sample

Answer 13

I just used a Python mixin to implement unit testing for Python milters. Normally, a milter talks to an MTA, making unit testing difficult. The test mixin overrides the methods that talk to the MTA and creates a simulated environment driven by test cases instead.

So, you take an unmodified milter application, like spfmilter, and mix in TestBase, like this:

class TestMilter(TestBase,spfmilter.spfMilter):
  def __init__(self):
    TestBase.__init__(self)
    spfmilter.config = spfmilter.Config()
    spfmilter.config.access_file = 'test/access.db'
    spfmilter.spfMilter.__init__(self)

Then, use TestMilter in the test cases for the milter application:

def testPass(self):
  milter = TestMilter()
  rc = milter.connect('mail.example.com',ip='192.0.2.1')
  self.assertEqual(rc,Milter.CONTINUE)
  rc = milter.feedMsg('test1',sender='good@example.com')
  self.assertEqual(rc,Milter.CONTINUE)
  milter.close()

http://pymilter.cvs.sourceforge.net/viewvc/pymilter/pymilter/Milter/test.py?revision=1.6&view=markup
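The same technique can be sketched without pymilter at all. Everything below is hypothetical (`MailFilter`, `send_to_mta`, `RecordingMixin`); it is not the pymilter API, only an illustration of a test mixin replacing the one method that reaches the outside world.

```python
class MailFilter:
    """Hypothetical stand-in for a real milter; not the pymilter API."""

    def send_to_mta(self, command):
        raise RuntimeError("no MTA available in this sketch")

    def on_connect(self, hostname):
        self.send_to_mta("log connect from " + hostname)
        return "CONTINUE"


class RecordingMixin:
    """Test mixin: replaces the method that talks to the MTA."""

    def send_to_mta(self, command):
        # Record the command instead of sending it anywhere.
        self.__dict__.setdefault("sent", []).append(command)


# The mixin is listed first so that its send_to_mta wins in the MRO.
class TestableMailFilter(RecordingMixin, MailFilter):
    pass


f = TestableMailFilter()
print(f.on_connect("mail.example.com"))  # CONTINUE
print(f.sent)                            # ['log connect from mail.example.com']
```

The application class is left unmodified; only the order of the base classes decides which `send_to_mta` is used.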


Answer 14

I think previous responses defined very well what MixIns are. However, in order to better understand them, it might be useful to compare MixIns with Abstract Classes and Interfaces from the code/implementation perspective:

1. Abstract Class

  • Class that needs to contain one or more abstract methods

  • Abstract Class can contain state (instance variables) and non-abstract methods

2. Interface

  • Interface contains abstract methods only (no non-abstract methods and no internal state)

3. MixIns

  • MixIns (like Interfaces) do not contain internal state (instance variables)
  • MixIns contain one or more non-abstract methods (they can contain non-abstract methods unlike interfaces)

In Python, for example, these are just conventions, because all of the above are defined as classes. However, the common feature of Abstract Classes, Interfaces and MixIns is that they should not exist on their own, i.e. should not be instantiated.
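A minimal sketch of the three conventions side by side in Python, using the standard `abc` module. The class names (`Shape`, `Drawable`, `JsonMixin`, `Square`) are made up for illustration.

```python
import json
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: state plus a mix of abstract and concrete methods."""

    def __init__(self, name):
        self.name = name  # internal state is allowed

    @abstractmethod
    def area(self):
        ...

    def describe(self):
        return f"{self.name} with area {self.area()}"


class Drawable(ABC):
    """Interface by convention: abstract methods only, no state."""

    @abstractmethod
    def draw(self):
        ...


class JsonMixin:
    """Mixin: concrete behaviour only, no state of its own."""

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)


class Square(JsonMixin, Shape, Drawable):
    def __init__(self, side):
        super().__init__("square")
        self.side = side

    def area(self):
        return self.side ** 2

    def draw(self):
        return "#" * self.side


sq = Square(3)
print(sq.describe())  # square with area 9
print(sq.to_json())   # {"name": "square", "side": 3}
```

Python enforces "do not instantiate" only for the two ABCs, and only because they declare abstract methods; nothing stops `JsonMixin()` from being created, which is why the mixin rule is purely a convention.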


Answer 15

I read that you have a C# background, so a good starting point might be a mixin implementation for .NET.

You might want to check out the CodePlex project at http://remix.codeplex.com/

Watch the lang.net Symposium link to get an overview. More documentation is still to come on the CodePlex page.

regards Stefan


Creating a singleton in Python

Question: Creating a singleton in Python

This question is not for the discussion of whether or not the singleton design pattern is desirable, is an anti-pattern, or for any religious wars, but to discuss how this pattern is best implemented in Python in such a way that is most pythonic. In this instance I define ‘most pythonic’ to mean that it follows the ‘principle of least astonishment’.

I have multiple classes which would become singletons (my use-case is for a logger, but this is not important). I do not wish to clutter several classes with added gumph when I can simply inherit or decorate.

Best methods:


Method 1: A decorator

def singleton(class_):
    instances = {}
    def getinstance(*args, **kwargs):
        if class_ not in instances:
            instances[class_] = class_(*args, **kwargs)
        return instances[class_]
    return getinstance

@singleton
class MyClass(BaseClass):
    pass

Pros

  • Decorators are additive in a way that is often more intuitive than multiple inheritance.

Cons

  • While objects created using MyClass() would be true singleton objects, MyClass itself is a function, not a class, so you cannot call class methods from it. Also, for m = MyClass(); n = MyClass(); o = type(n)(); we get m == n but m != o and n != o.
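The drawback described for Method 1 can be checked directly. This sketch reuses the decorator from the question on a made-up `Config` class (with a hypothetical `default_name` class method) and uses identity (`is`) in place of `==`.

```python
import inspect


def singleton(class_):
    instances = {}

    def getinstance(*args, **kwargs):
        if class_ not in instances:
            instances[class_] = class_(*args, **kwargs)
        return instances[class_]

    return getinstance


@singleton
class Config:
    @classmethod
    def default_name(cls):
        return "config"


m = Config()
n = Config()
o = type(n)()  # bypasses the decorator and builds a second object

print(m is n)                           # True
print(o is m)                           # False
print(inspect.isfunction(Config))       # True: the name now refers to getinstance
print(hasattr(Config, "default_name"))  # False: the class method is out of reach
```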

Method 2: A base class

class Singleton(object):
    _instance = None
    def __new__(class_, *args, **kwargs):
        if not isinstance(class_._instance, class_):
            class_._instance = object.__new__(class_, *args, **kwargs)
        return class_._instance

class MyClass(Singleton, BaseClass):
    pass

Pros

  • It’s a true class

Cons

  • Multiple inheritance – eugh! __new__ could be overwritten during inheritance from a second base class? One has to think more than is necessary.

Method 3: A metaclass

class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

#Python2
class MyClass(BaseClass):
    __metaclass__ = Singleton

#Python3
class MyClass(BaseClass, metaclass=Singleton):
    pass

Pros

  • It’s a true class
  • Auto-magically covers inheritance
  • Uses __metaclass__ for its proper purpose (and made me aware of it)

Cons

  • Are there any?

Method 4: decorator returning a class with the same name

def singleton(class_):
    class class_w(class_):
        _instance = None
        def __new__(class_, *args, **kwargs):
            if class_w._instance is None:
                class_w._instance = super(class_w,
                                    class_).__new__(class_,
                                                    *args,
                                                    **kwargs)
                class_w._instance._sealed = False
            return class_w._instance
        def __init__(self, *args, **kwargs):
            if self._sealed:
                return
            super(class_w, self).__init__(*args, **kwargs)
            self._sealed = True
    class_w.__name__ = class_.__name__
    return class_w

@singleton
class MyClass(BaseClass):
    pass

Pros

  • It’s a true class
  • Auto-magically covers inheritance

Cons

  • Is there not an overhead for creating each new class? Here we are creating two classes for each class we wish to make a singleton. While this is fine in my case, I worry that this might not scale. Of course there is a matter of debate as to whether it ought to be too easy to scale this pattern…
  • What is the point of the _sealed attribute?
  • Can’t call methods of the same name on base classes using super() because they will recurse. This means you can’t customize __new__ and can’t subclass a class that needs you to call up to __init__.

Method 5: a module

a module file singleton.py

Pros

  • Simple is better than complex

Cons


Answer 0

Use a Metaclass

I would recommend Method #2, but you’re better off using a metaclass than a base class. Here is a sample implementation:

class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class Logger(object):
    __metaclass__ = Singleton

Or in Python3

class Logger(metaclass=Singleton):
    pass

If you want to run __init__ every time the class is called, add

        else:
            cls._instances[cls].__init__(*args, **kwargs)

to the if statement in Singleton.__call__.
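Put together, the variant that re-runs `__init__` on every call might look like the following Python 3 sketch. The `level` parameter on `Logger` is a made-up attribute, added only to make the re-initialisation visible.

```python
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            # First call: build the instance (runs __new__ and __init__).
            cls._instances[cls] = super().__call__(*args, **kwargs)
        else:
            # Later calls: reuse the instance but re-run __init__ on it.
            cls._instances[cls].__init__(*args, **kwargs)
        return cls._instances[cls]


class Logger(metaclass=Singleton):
    def __init__(self, level="INFO"):
        self.level = level


a = Logger()
b = Logger(level="DEBUG")
print(a is b)   # True
print(a.level)  # DEBUG: the second call re-initialised the shared instance
```

Whether this is desirable depends on the class: every caller can now silently reset state that other callers rely on.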

A few words about metaclasses. A metaclass is the class of a class; that is, a class is an instance of its metaclass. You find the class of any object in Python with type(obj), so for a class object this gives its metaclass. Normal new-style classes are of type type. Logger in the code above will be of type class 'your_module.Singleton', just as the (only) instance of Logger will be of type class 'your_module.Logger'. When you call Logger(), Python first asks the metaclass of Logger, Singleton, what to do, allowing instance creation to be pre-empted. This is similar to Python asking a class what to do by calling __getattr__ when you reference one of its attributes with myclass.attribute.
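These relationships are easy to verify interactively. A short Python 3 sketch, where `Plain` is a made-up ordinary class added for comparison:

```python
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Plain:
    pass


class Logger(metaclass=Singleton):
    pass


log = Logger()
print(type(Plain) is type)        # True: ordinary classes are instances of type
print(type(Logger) is Singleton)  # True: Logger is an instance of its metaclass
print(type(log) is Logger)        # True
print(isinstance(Logger, type))   # True: Singleton is a subclass of type
```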

A metaclass essentially decides what the definition of a class means and how to implement that definition. See for example http://code.activestate.com/recipes/498149/, which essentially recreates C-style structs in Python using metaclasses. The thread What are some (concrete) use-cases for metaclasses? also provides some examples, they generally seem to be related to declarative programming, especially as used in ORMs.

In this situation, if you use your Method #2 and a subclass defines a __new__ method, it will be executed every time you call SubClassOfSingleton(), because it is responsible for calling the method that returns the stored instance. With a metaclass, it will be called only once, when the only instance is created. You want to customize what it means to call the class, which is decided by its type.
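The difference in how often a subclass's `__new__` runs can be counted. In this Python 3 sketch the names `SingletonBase`, `SingletonMeta`, `ViaBase`, `ViaMeta` and the `calls` counter are all hypothetical.

```python
calls = {"base": 0, "meta": 0}


class SingletonBase:
    _instances = {}

    def __new__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__new__(cls)
        return cls._instances[cls]


class ViaBase(SingletonBase):
    def __new__(cls):
        calls["base"] += 1  # runs on every ViaBase() call
        return super().__new__(cls)


class SingletonMeta(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class ViaMeta(metaclass=SingletonMeta):
    def __new__(cls):
        calls["meta"] += 1  # runs only when the instance is first built
        return super().__new__(cls)


for _ in range(3):
    ViaBase()
    ViaMeta()

print(calls)  # {'base': 3, 'meta': 1}
```

Both classes hand back a single shared instance; the metaclass version simply never reaches `__new__` (or `__init__`) after the first call.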

In general, it makes sense to use a metaclass to implement a singleton. A singleton is special because it is created only once, and a metaclass is the way you customize the creation of a class. Using a metaclass gives you more control in case you need to customize the singleton class definitions in other ways.

Your singletons won’t need multiple inheritance (because the metaclass is not a base class), but for subclasses of the created class that use multiple inheritance, you need to make sure the singleton class is the first / leftmost one with a metaclass that redefines __call__. This is very unlikely to be an issue. The instance dict is not in the instance’s namespace so it won’t accidentally overwrite it.

You will also hear that the singleton pattern violates the “Single Responsibility Principle”: each class should do only one thing. That way you don’t have to worry about messing up one thing the code does if you need to change another, because they are separate and encapsulated. The metaclass implementation passes this test. The metaclass is responsible for enforcing the pattern, and the created class and subclasses need not be aware that they are singletons. Method #1 fails this test, as you noted with “MyClass itself is a function, not a class, so you cannot call class methods from it.”

Python 2 and 3 Compatible Version

Writing something that works in both Python2 and 3 requires using a slightly more complicated scheme. Since metaclasses are usually subclasses of type type, it’s possible to use one to dynamically create an intermediary base class at run time with it as its metaclass and then use that as the baseclass of the public Singleton base class. It’s harder to explain than to do, as illustrated next:

# works in Python 2 & 3
class _Singleton(type):
    """ A metaclass that creates a Singleton base class when called. """
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(_Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class Singleton(_Singleton('SingletonMeta', (object,), {})): pass

class Logger(Singleton):
    pass

An ironic aspect of this approach is that it’s using subclassing to implement a metaclass. One possible advantage is that, unlike with a pure metaclass, isinstance(inst, Singleton) will return True.

Corrections

On another topic, you’ve probably already noticed this, but the base class implementation in your original post is wrong. _instances needs to be referenced on the class, you need to use super() or you’re recursing, and __new__ is actually a static method that you have to pass the class to, not a class method, as the actual class hasn’t been created yet when it is called. All of these things will be true for a metaclass implementation as well.

class Singleton(object):
    _instances = {}

    def __new__(class_, *args, **kwargs):
        if class_ not in class_._instances:
            # object.__new__ accepts no extra arguments on Python 3, so only the class is passed on.
            class_._instances[class_] = super(Singleton, class_).__new__(class_)
        return class_._instances[class_]

class MyClass(Singleton):
    pass

c = MyClass()

Decorator Returning A Class

I originally was writing a comment but it was too long, so I’ll add this here. Method #4 is better than the other decorator version, but it’s more code than needed for a singleton, and it’s not as clear what it does.

The main problems stem from the class being its own base class. First, isn’t it weird to have a class be a subclass of a nearly identical class with the same name that exists only in its __class__ attribute? This also means that you can’t define any methods that call the method of the same name on their base class with super() because they will recurse. This means your class can’t customize __new__, and can’t derive from any classes that need __init__ called on them.

When to use the singleton pattern

Your use case is one of the better examples of wanting to use a singleton. You say in one of the comments “To me logging has always seemed a natural candidate for Singletons.” You’re absolutely right.

When people say singletons are bad, the most common reason is that they are implicit shared state. While global variables and top-level module imports are explicit shared state, other objects that are passed around are generally instantiated. This is a good point, with two exceptions.

The first, and one that gets mentioned in various places, is when the singletons are constant. Use of global constants, especially enums, is widely accepted, and considered sane because no matter what, none of the users can mess them up for any other user. This is equally true for a constant singleton.

The second exception, which gets mentioned less, is the opposite: when the singleton is only a data sink, not a data source (directly or indirectly). This is why loggers feel like a “natural” use for singletons. As the various users are not changing the loggers in ways other users will care about, there is not really shared state. This negates the primary argument against the singleton pattern, and makes them a reasonable choice because of their ease of use for the task.

Here is a quote from http://googletesting.blogspot.com/2008/08/root-cause-of-singletons.html:

Now, there is one kind of Singleton which is OK. That is a singleton where all of the reachable objects are immutable. If all objects are immutable than Singleton has no global state, as everything is constant. But it is so easy to turn this kind of singleton into mutable one, it is very slippery slope. Therefore, I am against these Singletons too, not because they are bad, but because it is very easy for them to go bad. (As a side note Java enumeration are just these kind of singletons. As long as you don’t put state into your enumeration you are OK, so please don’t.)

The other kind of Singletons, which are semi-acceptable are those which don’t effect the execution of your code, They have no “side effects”. Logging is perfect example. It is loaded with Singletons and global state. It is acceptable (as in it will not hurt you) because your application does not behave any different whether or not a given logger is enabled. The information here flows one way: From your application into the logger. Even thought loggers are global state since no information flows from loggers into your application, loggers are acceptable. You should still inject your logger if you want your test to assert that something is getting logged, but in general Loggers are not harmful despite being full of state.


Answer 1

class Foo(object):
     pass

some_global_variable = Foo()

Modules are imported only once, everything else is overthinking. Don’t use singletons and try not to use globals.


Answer 2

Use a module. It is imported only once. Define some global variables in it; they will be the singleton’s ‘attributes’. Add some functions, which become the singleton’s ‘methods’.


Answer 3

You probably never need a singleton in Python. Just define all your data and functions in a module and you have a de-facto singleton.

If you really absolutely have to have a singleton class then I’d go with:

class My_Singleton(object):
    def foo(self):
        pass

my_singleton = My_Singleton()

To use:

from mysingleton import my_singleton
my_singleton.foo()

where mysingleton.py is your filename that My_Singleton is defined in. This works because after the first time a file is imported, Python doesn’t re-execute the code.
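The import-once behaviour can be observed in a self-contained way by writing a throwaway module to a temporary directory. The module name `mysingleton_demo` is hypothetical and chosen only to avoid clashing with a real file.

```python
import importlib
import os
import sys
import tempfile

# Source of the throwaway module; the print shows when its body executes.
source = (
    "print('module body executed')\n"
    "class My_Singleton(object):\n"
    "    def foo(self):\n"
    "        return 'foo'\n"
    "my_singleton = My_Singleton()\n"
)
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "mysingleton_demo.py"), "w") as fh:
    fh.write(source)
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

first = importlib.import_module("mysingleton_demo")   # runs the module body
second = importlib.import_module("mysingleton_demo")  # served from sys.modules

print(first is second)                            # True
print(first.my_singleton is second.my_singleton)  # True
```

The message from the module body is printed a single time: the second import is a dictionary lookup in `sys.modules`, not a re-execution of the file.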


Answer 4

Here’s a one-liner for you:

singleton = lambda c: c()

Here’s how you use it:

@singleton
class wat(object):
    def __init__(self): self.x = 1
    def get_x(self): return self.x

assert wat.get_x() == 1

Your object gets instantiated eagerly. This may or may not be what you want.


Answer 5

Check out the Stack Overflow question Is there a simple, elegant way to define singletons in Python?, which has several solutions.

I’d strongly recommend watching Alex Martelli’s talks on design patterns in Python: part 1 and part 2. In particular, in part 1 he talks about singletons/shared state objects.


Answer 6

Here’s my own implementation of singletons. All you have to do is decorate the class; to get the singleton, you then have to use the Instance method. Here’s an example:

   @Singleton
   class Foo:
       def __init__(self):
           print('Foo created')

   f = Foo() # Error, this isn't how you get the instance of a singleton

   f = Foo.Instance() # Good. Being explicit is in line with the Python Zen
   g = Foo.Instance() # Returns already created instance

   print(f is g) # True

And here’s the code:

class Singleton:
    """
    A non-thread-safe helper class to ease implementing singletons.
    This should be used as a decorator -- not a metaclass -- to the
    class that should be a singleton.

    The decorated class can define one `__init__` function that
    takes only the `self` argument. Other than that, there are
    no restrictions that apply to the decorated class.

    To get the singleton instance, use the `Instance` method. Trying
    to use `__call__` will result in a `TypeError` being raised.

    Limitations: The decorated class cannot be inherited from.

    """

    def __init__(self, decorated):
        self._decorated = decorated

    def Instance(self):
        """
        Returns the singleton instance. Upon its first call, it creates a
        new instance of the decorated class and calls its `__init__` method.
        On all subsequent calls, the already created instance is returned.

        """
        try:
            return self._instance
        except AttributeError:
            self._instance = self._decorated()
            return self._instance

    def __call__(self):
        raise TypeError('Singletons must be accessed through `Instance()`.')

    def __instancecheck__(self, inst):
        return isinstance(inst, self._decorated)

Answer 7

Method 3 seems to be very neat, but if you want your program to run in both Python 2 and Python 3, it doesn’t work. Even protecting the separate variants with tests for the Python version fails, because the Python 3 version gives a syntax error in Python 2.

Thanks to Mike Watkins: http://mikewatkins.ca/2008/11/29/python-2-and-3-metaclasses/. If you want the program to work in both Python 2 and Python 3, you need to do something like:

class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

MC = Singleton('MC', (object,), {})  # bases must be a tuple, hence the trailing comma

class MyClass(MC):
    pass    # Code for the class implementation

I presume that ‘object’ in the assignment needs to be replaced with the ‘BaseClass’, but I haven’t tried that (I have tried code as illustrated).
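
For illustration, here is a hedged, self-contained sketch of the same idea in use. The class names MyClass and Other and the value attribute are only examples; the bases argument is written as the one-element tuple (object,), which is what type() expects. Calling the metaclass directly avoids the class-statement metaclass syntax that differs between Python 2 and Python 3.

```python
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Create the instance on first use, then keep returning it.
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

# Build a base class whose metaclass is Singleton without version-specific syntax.
MC = Singleton('MC', (object,), {})

class MyClass(MC):
    def __init__(self):
        self.value = 0

class Other(MC):
    pass

a = MyClass()
b = MyClass()
c = Other()
print(a is b, c is a)
```

Each subclass of MC gets its own single instance, because the cache is keyed by class.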


Answer 8

Well, other than agreeing with the general Pythonic suggestion on having module-level global, how about this:

def singleton(class_):
    class class_w(class_):
        _instance = None
        def __new__(class2, *args, **kwargs):
            if class_w._instance is None:
                # object.__new__ rejects extra arguments in Python 3, so pass only the class
                class_w._instance = super(class_w, class2).__new__(class2)
                class_w._instance._sealed = False
            return class_w._instance
        def __init__(self, *args, **kwargs):
            if self._sealed:
                return
            super(class_w, self).__init__(*args, **kwargs)
            self._sealed = True
    class_w.__name__ = class_.__name__
    return class_w

@singleton
class MyClass(object):
    def __init__(self, text):
        print(text)
    @classmethod
    def name(class_):
        print(class_.__name__)

x = MyClass(111)
x.name()
y = MyClass(222)
print(id(x) == id(y))

Output is:

111     # the __init__ is called only on the 1st time
MyClass # the __name__ is preserved
True    # this is actually the same instance

Answer 9

How about this:

def singleton(cls):
    instance=cls()
    cls.__new__ = cls.__call__= lambda cls: instance
    cls.__init__ = lambda self: None
    return instance

Use it as a decorator on a class that should be a singleton. Like this:

@singleton
class MySingleton:
    pass  # ....

This is similar to the singleton = lambda c: c() decorator in another answer. As with that solution, the only instance takes the name of the class (MySingleton). With this solution, however, you can still “create” instances (actually get the only instance) from the class by calling MySingleton(). It also prevents you from creating additional instances via type(MySingleton)(), which returns the same instance as well.
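
The two claims can be checked with a short sketch. The counter attribute is hypothetical, added only so there is some state to inspect.

```python
def singleton(cls):
    instance = cls()
    # Both "calling the instance" and "calling the class" hand back the one instance.
    cls.__new__ = cls.__call__ = lambda cls: instance
    cls.__init__ = lambda self: None
    return instance

@singleton
class MySingleton:
    def __init__(self):
        self.counter = 0  # hypothetical state, set once by the original __init__

same_via_call = MySingleton() is MySingleton
same_via_type = type(MySingleton)() is MySingleton
print(same_via_call, same_via_type)
```

Because __init__ is replaced with a no-op after the first instantiation, later calls do not reset the instance’s state.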


Answer 10

I’ll toss mine into the ring. It’s a simple decorator.

from abc import ABC

def singleton(real_cls):

    class SingletonFactory(ABC):

        instance = None

        def __new__(cls, *args, **kwargs):
            if cls.instance is None:  # identity test, so a falsy instance is not recreated
                cls.instance = real_cls(*args, **kwargs)
            return cls.instance

    SingletonFactory.register(real_cls)
    return SingletonFactory

# Usage
@singleton
class YourClass:
    ...  # Your normal implementation, no special requirements.

Benefits I think it has over some of the other solutions:

  • It’s clear and concise (to my eye ;D).
  • Its action is completely encapsulated. You don’t need to change a single thing about the implementation of YourClass. This includes not needing to use a metaclass for your class (note that the metaclass above is on the factory, not the “real” class).
  • It doesn’t rely on monkey-patching anything.
  • It’s transparent to callers:
    • Callers still simply import YourClass, it looks like a class (because it is), and they use it normally. No need to adapt callers to a factory function.
    • What YourClass() instantiates is still a true instance of the YourClass you implemented, not a proxy of any kind, so no chance of side effects resulting from that.
    • isinstance(instance, YourClass) and similar operations still work as expected (though this bit does require abc, and the abc.ABC helper class used above exists only in Python 3.4+).

One downside does occur to me: classmethods and staticmethods of the real class are not transparently callable via the factory class hiding it. I’ve used this rarely enough that I’ve never happened to run into that need, but it would be easily rectified by using a custom metaclass on the factory that implements __getattr__() to delegate all-ish attribute access to the real class.

A related pattern I’ve actually found more useful (not that I’m saying these kinds of things are required very often at all) is a “Unique” pattern where instantiating the class with the same arguments results in getting back the same instance. I.e. a “singleton per arguments”. The above adapts to this well and becomes even more concise:

import functools
from abc import ABC

def unique(real_cls):

    class UniqueFactory(ABC):

        # The cache key is built from the arguments, so they must be hashable.
        @functools.lru_cache(None)  # Handy for 3.2+, but use any memoization decorator you like
        def __new__(cls, *args, **kwargs):
            return real_cls(*args, **kwargs)

    UniqueFactory.register(real_cls)
    return UniqueFactory

All that said, I do agree with the general advice that if you think you need one of these things, you really should probably stop for a moment and ask yourself if you really do. 99% of the time, YAGNI.
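
The __getattr__() delegation mentioned as a fix for the classmethod/staticmethod downside could look roughly like the sketch below. This is one possible reading of that suggestion, not the author’s code: FactoryMeta, describe and helper are hypothetical names, and the metaclass derives from ABCMeta so that register() and isinstance() keep working.

```python
from abc import ABCMeta

def singleton(real_cls):

    class FactoryMeta(ABCMeta):
        def __getattr__(cls, name):
            # Only reached when normal lookup on the factory fails;
            # delegate to the real class (classmethods, staticmethods, constants).
            return getattr(real_cls, name)

    class SingletonFactory(metaclass=FactoryMeta):
        instance = None

        def __new__(cls, *args, **kwargs):
            if cls.instance is None:
                cls.instance = real_cls(*args, **kwargs)
            return cls.instance

    SingletonFactory.register(real_cls)
    return SingletonFactory

@singleton
class YourClass:
    @classmethod
    def describe(cls):
        return cls.__name__

    @staticmethod
    def helper():
        return 42

obj = YourClass()
print(YourClass.describe(), YourClass.helper(), obj is YourClass())
```

Attributes defined on the factory itself (such as instance) still win over the real class, which is why the delegation is only "all-ish".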


Answer 11

Code based on Tolli’s answer.

#decorator, modifies new_cls
def _singleton(new_cls):
    instance = new_cls()                                              #2
    def new(cls):
        if isinstance(instance, cls):                                 #4
            return instance
        else:
            raise TypeError("I can only return instance of {}, caller wanted {}".format(new_cls, cls))
    new_cls.__new__  = new                                            #3
    new_cls.__init__ = lambda self: None                              #5
    return new_cls


#decorator, creates new class
def singleton(cls):
    new_cls = type('singleton({})'.format(cls.__name__), (cls,), {} ) #1
    return _singleton(new_cls)


#metaclass
def meta_singleton(name, bases, attrs):
    new_cls = type(name, bases, attrs)                                #1
    return _singleton(new_cls)

Explanation:

  1. Create a new class that inherits from the given cls
    (cls itself is not modified, in case someone wants, for example, singleton(list)).

  2. Create the instance. This is easy to do before __new__ is overridden.

  3. Now that the instance exists, override __new__ with the function defined a moment ago.
  4. The function returns instance only when that is what the caller expects; otherwise it raises TypeError.
    The condition is not met when someone attempts to inherit from the decorated class.

  5. If __new__() returns an instance of cls, then the new instance’s __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to __new__().

    instance is already initialized, so the decorator replaces __init__ with a function that does nothing.

See it working online


Answer 12

It is slightly similar to the answer by fab but not exactly the same.

The singleton contract does not require that we be able to call the constructor multiple times. As a singleton should be created once and once only, shouldn’t it be seen to be created just once? “Spoofing” the constructor arguably impairs legibility.

So my suggestion is just this:

class Elvis():
    def __init__(self):
        if hasattr(self.__class__, 'instance'):
            raise Exception()
        self.__class__.instance = self
        # initialisation code...

    @staticmethod
    def the():
        if hasattr(Elvis, 'instance'):
            return Elvis.instance
        return Elvis()

This does not rule out the use of the constructor or the field instance by user code:

if Elvis() is King.instance:

… if you know for sure that Elvis has not yet been created, and that King has.

But it encourages users to use the the() method universally:

Elvis.the().leave(Building.the())

To make this complete you could also override __delattr__() to raise an Exception if an attempt is made to delete instance, and override __del__() so that it raises an Exception (unless we know the program is ending…)

Further improvements


My thanks to those who have helped with comments and edits, of which more are welcome. While I use Jython, this should work more generally, and be thread-safe.

try:
    # This is jython-specific
    from synchronize import make_synchronized
except ImportError:
    # This should work across different python implementations
    def make_synchronized(func):
        import threading
        func.__lock__ = threading.Lock()

        def synced_func(*args, **kws):
            with func.__lock__:
                return func(*args, **kws)

        return synced_func

class Elvis(object): # NB must be subclass of object to use __new__
    instance = None

    @classmethod
    @make_synchronized
    def __new__(cls, *args, **kwargs):
        if cls.instance is not None:
            raise Exception()
        cls.instance = object.__new__(cls, *args, **kwargs)
        return cls.instance

    def __init__(self):
        pass
        # initialisation code...

    @classmethod
    @make_synchronized
    def the(cls):
        if cls.instance is not None:
            return cls.instance
        return cls()

Points of note:

  1. If you don’t subclass from object in Python 2.x you will get an old-style class, which does not use __new__.
  2. When decorating __new__ you must decorate with @classmethod, or __new__ will be an unbound instance method.
  3. This could possibly be improved by using a metaclass, as this would allow you to turn the method the into a class-level property, possibly renaming it to instance.

Answer 13

One liner (I am not proud, but it does the job):

class Myclass:
  def __init__(self):
      # do your stuff
      globals()[type(self).__name__] = lambda: self # singletonify

Answer 14

If you don’t need lazy initialization of the instance of the Singleton, then the following should be easy and thread-safe:

class A:
    instance = None
    # Methods and variables of the class/object A follow
A.instance = A()

This way A is a singleton initialized at module import.


Answer 15

  • If you want multiple instances of the same class, but only when the args or kwargs differ, you can use this.
  • Example:
    1. Suppose you have a class handling serial communication, and to create an instance you pass the serial port as an argument. The traditional singleton approach won’t work here.
    2. Using the above-mentioned decorators, you can create multiple instances of the class if the args are different.
    3. For the same args, the decorator returns the instance that has already been created.

>>> from decorators import singleton
>>>
>>> @singleton
... class A:
...     def __init__(self, *args, **kwargs):
...         pass
...
>>>
>>> a = A(name='Siddhesh')
>>> b = A(name='Siddhesh', lname='Sathe')
>>> c = A(name='Siddhesh', lname='Sathe')
>>> a is b  # has to be different
False
>>> b is c  # has to be same
True
>>>
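
The decorators module imported above is not shown in the answer, so its real implementation is unknown. As a rough sketch of how such a per-arguments singleton decorator might work, assuming all arguments are hashable, one instance can be cached per distinct combination of args and kwargs:

```python
def singleton(cls):
    """Hypothetical per-arguments singleton decorator (not the 'decorators' package)."""
    instances = {}

    def get_instance(*args, **kwargs):
        # Sort kwargs so that keyword order does not produce different keys.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in instances:
            instances[key] = cls(*args, **kwargs)
        return instances[key]

    return get_instance

@singleton
class A:
    def __init__(self, *args, **kwargs):
        pass

a = A(name='Siddhesh')
b = A(name='Siddhesh', lname='Sathe')
c = A(lname='Sathe', name='Siddhesh')
print(a is b, b is c)
```

One design consequence of this sketch: A is now a function rather than a class, so isinstance(x, A) no longer works.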

Answer 16

Maybe I misunderstand the singleton pattern, but my solution is this simple and pragmatic (Pythonic?) one. This code fulfills two goals:

  1. Make the instance of Foo accessible everywhere (global).
  2. Only one instance of Foo can exist.

This is the code.

#!/usr/bin/env python3

class Foo:
    me = None

    def __init__(self):
        if Foo.me is not None:
            raise Exception('Instance of Foo still exists!')

        Foo.me = self


if __name__ == '__main__':
    Foo()
    Foo()

Output

Traceback (most recent call last):
  File "./x.py", line 15, in <module>
    Foo()
  File "./x.py", line 8, in __init__
    raise Exception('Instance of Foo still exists!')
Exception: Instance of Foo still exists!

Answer 17

After struggling with this for some time I eventually came up with the following, so that the config object would only be loaded once, when called up from separate modules. The metaclass allows a global class instance to be stored in the builtins dict, which at present appears to be the neatest way of storing a proper program global.

import builtins

# -----------------------------------------------------------------------------
# So..... you would expect that a class would be "global" in scope, however
#   when different modules use this,
#   EACH ONE effectively has its own class namespace.  
#   In order to get around this, we use a metaclass to intercept
#   "new" and provide the "truly global metaclass instance" if it already exists

class MetaConfig(type):
    def __new__(cls, name, bases, dct):
        try:
            class_inst = builtins.CONFIG_singleton

        except AttributeError:
            class_inst = super().__new__(cls, name, bases, dct)
            builtins.CONFIG_singleton = class_inst
            class_inst.do_load()

        return class_inst

# -----------------------------------------------------------------------------

class Config(metaclass=MetaConfig):

    config_attr = None

    @classmethod
    def do_load(cls):
        ...<load-cfg-from-file>...

Answer 18

I can’t remember where I found this solution, but I find it to be the most ‘elegant’ from my non-Python-expert point of view:

class SomeSingleton(dict):
    __instance__ = None
    def __new__(cls, *args,**kwargs):
        if SomeSingleton.__instance__ is None:
            SomeSingleton.__instance__ = dict.__new__(cls)
        return SomeSingleton.__instance__

    def __init__(self):
        pass

    def some_func(self,arg):
        pass

Why do I like this? No decorators, no meta classes, no multiple inheritance…and if you decide you don’t want it to be a Singleton anymore, just delete the __new__ method. As I am new to Python (and OOP in general) I expect someone will set me straight about why this is a terrible approach?


Answer 19

This is my preferred way of implementing singletons:

class Test(object):
    obj = None

    def __init__(self):
        if Test.obj is not None:
            raise Exception('A Test Singleton instance already exists')
        # Initialization code here

    @classmethod
    def get_instance(cls):
        if cls.obj is None:
            cls.obj = Test()
        return cls.obj

    @classmethod
    def custom_method(cls):
        obj = cls.get_instance()
        # Custom Code here

Answer 20

This answer is likely not what you’re looking for. I wanted a singleton in the sense that only that one object has its identity, so other values can be compared against it. In my case it was being used as a sentinel value. The answer to that is very simple: create any object, e.g. mything = object(), and by Python’s nature only that object will have its identity.

#!python
MyNone = object()  # The singleton

for item in my_list:
    if item is MyNone:  # An Example identity comparison
        raise StopIteration

Answer 21

This solution causes some namespace pollution at the module level (three definitions rather than just one), but I find it easy to follow.

I’d like to be able to write something like this (lazy initialization), but unfortunately classes are not available in the body of their own definitions.

# wouldn't it be nice if we could do this?
class Foo(object):
    instance = None

    def __new__(cls):
        if cls.instance is None:
            cls.instance = object()
            cls.instance.__class__ = Foo
        return cls.instance

Since that isn’t possible, we can break out the initialization and the static instance in

Eager Initialization:

import random


class FooMaker(object):
    def __init__(self, *args):
        self._count = random.random()
        self._args = args


class Foo(object):
    def __new__(self):
        return foo_instance


foo_instance = FooMaker()
foo_instance.__class__ = Foo

Lazy initialization:

import random


class FooMaker(object):
    def __init__(self, *args):
        self._count = random.random()
        self._args = args


class Foo(object):
    def __new__(self):
        global foo_instance
        if foo_instance is None:
            foo_instance = FooMaker()
        return foo_instance


foo_instance = None

Is there a simple way to delete a list element by value?

Question: Is there a simple way to delete a list element by value?

a = [1, 2, 3, 4]
b = a.index(6)

del a[b]
print(a)

The above shows the following error:

Traceback (most recent call last):
  File "D:\zjm_code\a.py", line 6, in <module>
    b = a.index(6)
ValueError: list.index(x): x not in list

So I have to do this:

a = [1, 2, 3, 4]

try:
    b = a.index(6)
    del a[b]
except:
    pass

print(a)

But is there not a simpler way to do this?


Answer 0

To remove an element’s first occurrence in a list, simply use list.remove:

>>> a = ['a', 'b', 'c', 'd']
>>> a.remove('b')
>>> print(a)
['a', 'c', 'd']

Mind that it does not remove all occurrences of your element. Use a list comprehension for that.

>>> a = [10, 20, 30, 40, 20, 30, 40, 20, 70, 20]
>>> a = [x for x in a if x != 20]
>>> print(a)
[10, 30, 40, 30, 40, 70]

Answer 1

Python will usually raise an exception if you tell it to do something it can’t, so you’ll have to do either:

if c in a:
    a.remove(c)

or:

try:
    a.remove(c)
except ValueError:
    pass

An Exception isn’t necessarily a bad thing as long as it’s one you’re expecting and handle properly.
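
If the try/except form feels verbose, the standard library’s contextlib.suppress (Python 3.4+) expresses the same "ignore this one expected exception" intent. This is offered as an alternative sketch rather than something from the original answer.

```python
from contextlib import suppress

a = [1, 2, 3, 4]

with suppress(ValueError):
    a.remove(6)  # 6 is absent: the ValueError is swallowed

with suppress(ValueError):
    a.remove(3)  # 3 is present: removed as usual

print(a)
```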


Answer 2

You can do

a = [1, 2, 3, 4]
if 6 in a:
    a.remove(6)

but the above searches the list for 6 twice (once for in, once for remove), so using try/except would be faster:

try:
    a.remove(6)
except ValueError:
    pass

Answer 3

Consider:

a = [1,2,2,3,4,5]

To take out all occurrences, you could use the filter function in python. For example, it would look like:

a = list(filter(lambda x: x!= 2, a))

So, it would keep all elements of a that are != 2.

To just take out one of the items use

a.remove(2)

Answer 4

Here’s how to do it in place (without a list comprehension):

def remove_all(seq, value):
    pos = 0
    for item in seq:
        if item != value:
            seq[pos] = item
            pos += 1
    del seq[pos:]
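
As a quick usage sketch (the sample data is made up for illustration), the function compacts the kept items toward the front and then truncates the tail. It modifies the list it is given and returns nothing:

```python
def remove_all(seq, value):
    """Remove every occurrence of value from seq in place."""
    pos = 0
    for item in seq:
        if item != value:
            seq[pos] = item  # move kept items toward the front
            pos += 1
    del seq[pos:]            # drop the leftover tail

data = [1, 2, 1, 3, 1]
result = remove_all(data, 1)
print(data)    # [2, 3]
print(result)  # None: the list is modified, nothing is returned
```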

Answer 5

If you know what value to delete, here’s a simple way (as simple as I can think of, anyway):

a = [0, 1, 1, 0, 1, 2, 1, 3, 1, 4]
while a.count(1) > 0:
    a.remove(1)

You’ll get [0, 0, 2, 3, 4]


Answer 6

Another possibility is to use a set instead of a list, if a set is applicable in your application.

I.e., if your data is not ordered and does not have duplicates, then

my_set=set([3,4,2])
my_set.discard(1)

is error-free.

Often a list is just a handy container for items that are actually unordered. There are questions asking how to remove all occurrences of an element from a list. If you don’t want duplicates in the first place, once again a set is handy.

my_set.add(3)

doesn’t change my_set from above.


Answer 7

As stated by numerous other answers, list.remove() will work, but it raises a ValueError if the item isn’t in the list. With Python 3.4+, there’s an interesting approach to handling this, using the suppress context manager:

from contextlib import suppress
with suppress(ValueError):
    a.remove('b')
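
If this pattern comes up often, it could be wrapped in a small helper. The sketch below is only an illustration; the name discard is hypothetical (borrowed from set.discard) and is not part of the list API:

```python
from contextlib import suppress

def discard(items, value):
    """Hypothetical helper: remove value from items if present, else do nothing."""
    with suppress(ValueError):
        items.remove(value)

a = ['a', 'b', 'c']
discard(a, 'b')   # present: removed
discard(a, 'z')   # absent: the ValueError is swallowed
print(a)          # ['a', 'c']
```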

Answer 8

Finding a value in a list and then deleting that index (if it exists) is easier done by just using list’s remove method:

>>> a = [1, 2, 3, 4]
>>> try:
...   a.remove(6)
... except ValueError:
...   pass
... 
>>> print(a)
[1, 2, 3, 4]
>>> try:
...   a.remove(3)
... except ValueError:
...   pass
... 
>>> print(a)
[1, 2, 4]

If you do this often, you can wrap it up in a function:

def remove_if_exists(L, value):
  try:
    L.remove(value)
  except ValueError:
    pass

Answer 9

This example is short and will delete all instances of a value from the list (each remove call rescans the list, so it is quadratic in the worst case):

a = [1, 2, 3, 1, 2, 3, 4]
while True:
    try:
        a.remove(3)
    except ValueError:
        break
print(a)
# [1, 2, 1, 2, 4]

Answer 10

In one line:

a.remove('b') if 'b' in a else None

Sometimes this is useful.

Even easier:

if 'b' in a: a.remove('b')

Answer 11

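If the elements you want to keep are distinct and their order does not matter, then a simple set difference will do (a set does not preserve the original order, so the order of the result is not guaranteed):

c = [1, 2, 3, 4, 'x', 8, 6, 7, 'x', 9, 'x']
z = list(set(c) - set(['x']))
print(z)
[1, 2, 3, 4, 6, 7, 8, 9]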

Answer 12

We can also use .pop:

>>> lst = [23,34,54,45]
>>> remove_element = 23
>>> if remove_element in lst:
...     lst.pop(lst.index(remove_element))
... 
23
>>> lst
[34, 54, 45]
>>> 

Answer 13

Overwrite the list with slices that cover everything except the element you wish to remove (here, the element at index 2):

>>> s = [5, 4, 3, 2, 1]
>>> s = s[0:2] + s[3:]
>>> s
[5, 4, 2, 1]

Answer 14

With a for loop and a condition:

def cleaner(seq, value):    
    temp = []                      
    for number in seq:
        if number != value:
            temp.append(number)
    return temp

And if you want to remove some, but not all:

def cleaner(seq, value, occ):
    temp = []
    for number in seq:
        if number == value and occ:
            occ -= 1
            continue
        else:
            temp.append(number)
    return temp

Answer 15

list1 = [1, 2, 3, 3, 4, 5, 6, 1, 3, 4, 5]
n = int(input('enter number: '))
while n in list1:
    list1.remove(n)
print(list1)

Answer 16

Say for example, we want to remove all 1’s from x. This is how I would go about it:

x = [1, 2, 3, 1, 2, 3]

Now, this is a practical use of my method:

def Function(List, Unwanted):
    [List.remove(Unwanted) for Item in range(List.count(Unwanted))]
    return List
x = Function(x, 1)
print(x)

And this is my method in a single line:

[x.remove(1) for Item in range(x.count(1))]
print(x)

Both yield this as an output:

[2, 3, 2, 3]

Hope this helps. PS: please note that this was written for Python 3.6.2, so you might need to adjust it for older versions.


Answer 17

Maybe your solutions work with ints, but they did not work for me with dictionaries.

On the one hand, remove() did not work for me, though maybe it works with basic types. I guess the code below is also the way to remove items from a list of objects.

On the other hand, ‘del’ did not work properly either. In my case, using Python 3.6: when I try to delete an element from a list inside a ‘for’ loop with ‘del’, Python shifts the indices during the process and the loop stops prematurely. It only works if you delete element by element in reversed order. This way you don’t change the indices of the pending elements while you are going through the list.

So I used:

c = len(items) - 1
for element in reversed(items):
    if condition(element):
        del items[c]
    c -= 1
print(items)

where ‘items’ is like [{‘key1’: value1}, {‘key2’: value2}, {‘key3’: value3}, …]

You can also iterate over the indices in reverse, which avoids the manual counter:

for i in reversed(range(len(items))):
    if condition(items[i]):
        del items[i]
print(items)
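
For comparison, filtering a list of dicts by a predicate can also be sketched without any index bookkeeping. The sample data and the condition function below are hypothetical stand-ins. Slice assignment keeps the same list object, so other references to it see the result:

```python
records = [{'key1': 1}, {'key2': 2}, {'key3': 3}]

def condition(element):
    # Hypothetical predicate: drop dicts that contain an even value.
    return any(v % 2 == 0 for v in element.values())

# Keep only the elements that do not match, replacing the contents in place.
records[:] = [element for element in records if not condition(element)]
print(records)  # [{'key1': 1}, {'key3': 3}]
```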

Answer 18

arr = [1, 1, 3, 4, 5, 2, 4, 3]

# to remove the first occurrence of an element (3 in this example)
arr.remove(3)

# to remove all occurrences of that element (again 3),
# use a list comprehension
new_arr = [element for element in arr if element != 3]

# to delete by position, use the "pop" method (position 4 here);
# pop also returns the removed value
removed_element = arr.pop(4)

# you can also use "del" to delete a position
del arr[4]

Answer 19

This removes all instances of "-v" from the array sys.argv, and does not complain if no instances were found:

while "-v" in sys.argv:
    sys.argv.remove('-v')

You can see the code in action, in a file called speechToText.py:

$ python speechToText.py -v
['speechToText.py']

$ python speechToText.py -x
['speechToText.py', '-x']

$ python speechToText.py -v -v
['speechToText.py']

$ python speechToText.py -v -v -x
['speechToText.py', '-x']

Answer 20

This is my answer; it just uses while and for:

def remove_all(data, value):
    i = j = 0
    while j < len(data):
        if data[j] == value:
            j += 1
            continue
        data[i] = data[j]
        i += 1
        j += 1
    for x in range(j - i):
        data.pop()

How do I find the location of my Python site-packages directory?

Question: How do I find the location of my Python site-packages directory?

How do I find the location of my site-packages directory?


Answer 0

There are two types of site-packages directories, global and per user.

  1. Global site-packages (“dist-packages“) directories are listed in sys.path when you run:

    python -m site
    

    For a more concise list run getsitepackages from the site module in Python code:

    python -c 'import site; print(site.getsitepackages())'
    

    Note: With older virtualenv versions getsitepackages may not be available; sys.path from above will still list the virtualenv’s site-packages directory correctly. In Python 3, you may use the sysconfig module instead:

    python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])'
    
  2. The per user site-packages directory (PEP 370) is where Python installs your local packages:

    python -m site --user-site
    

    If this points to a non-existing directory check the exit status of Python and see python -m site --help for explanations.

    Hint: Running pip list --user or pip freeze --user gives you a list of all installed per user site-packages.


Practical Tips

  • <package>.__path__ lets you identify the location(s) of a specific package: (details)

    $ python -c "import setuptools as _; print(_.__path__)"
    ['/usr/lib/python2.7/dist-packages/setuptools']
    
  • <module>.__file__ lets you identify the location of a specific module: (difference)

    $ python3 -c "import os as _; print(_.__file__)"
    /usr/lib/python3.6/os.py
    
  • Run pip show <package> to show Debian-style package information:

    $ pip show pytest
    Name: pytest
    Version: 3.8.2
    Summary: pytest: simple powerful testing with Python
    Home-page: https://docs.pytest.org/en/latest/
    Author: Holger Krekel, Bruno Oliveira, Ronny Pfannschmidt, Floris Bruynooghe, Brianna Laugher, Florian Bruhin and others
    Author-email: None
    License: MIT license
    Location: /home/peter/.local/lib/python3.4/site-packages
    Requires: more-itertools, atomicwrites, setuptools, attrs, pathlib2, six, py, pluggy
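
The commands above can also be combined into one small script. This is only a sketch using the standard site and sysconfig modules; the hasattr guard is an assumption for environments where getsitepackages is missing:

```python
import site
import sysconfig

# Site-specific, platform-independent packages for the running interpreter.
purelib = sysconfig.get_paths()["purelib"]
print("purelib:", purelib)

# Per-user site-packages directory (PEP 370); it may not exist on disk.
user_site = site.getusersitepackages()
print("user site:", user_site)

# Guarded because some environments lack getsitepackages().
if hasattr(site, "getsitepackages"):
    print("global:", site.getsitepackages())
```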
    

Answer 1

>>> import site; site.getsitepackages()
['/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']

(or just first item with site.getsitepackages()[0])


Answer 2

A solution that:

  • outside of virtualenv – provides the path of global site-packages,
  • inside a virtualenv – provides the virtualenv’s site-packages

…is this one-liner:

python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"

Formatted for readability (rather than use as a one-liner), that looks like the following:

from distutils.sysconfig import get_python_lib
print(get_python_lib())


Source: a very old version of the “How to Install Django” documentation (though this is useful for more than just Django installation)


Answer 3

For Ubuntu,

python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()"

…is not correct.

It will point you to /usr/lib/pythonX.X/dist-packages

This folder only contains packages your operating system has automatically installed for programs to run.

On Ubuntu, the site-packages folder that contains packages installed via setuptools, easy_install, or pip will be in /usr/local/lib/pythonX.X/dist-packages

The second folder is probably the more useful one if the use case is related to installation or reading source code.

If you do not use Ubuntu, you are probably safe copy-pasting the first code box into the terminal.


Answer 4

This is what worked for me:

python -m site --user-site

Answer 5

Let’s say you have installed the package ‘django’. Import it and type dir(django). It will show you all the functions and attributes of that module. Type in the Python interpreter:

>>> import django
>>> dir(django)
['VERSION', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', 'get_version']
>>> print django.__path__
['/Library/Python/2.6/site-packages/django']

You can do the same thing if you have installed mercurial.

This is for Snow Leopard. But I think it should work in general as well.
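
On Python 3, a similar lookup can be sketched with importlib.util.find_spec, which locates a top-level package without importing it. The package name json below is only a stand-in for whatever package you are interested in:

```python
import importlib.util

# 'json' is only a stand-in; any importable top-level package name works.
spec = importlib.util.find_spec("json")

# For a regular package, origin is the path of its __init__.py.
print(spec.origin)

# The package directories, comparable to <package>.__path__.
print(list(spec.submodule_search_locations))
```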


Answer 6

As others have noted, distutils.sysconfig has the relevant settings:

import distutils.sysconfig
print(distutils.sysconfig.get_python_lib())

…though the default site.py does something a bit more crude, paraphrased below:

import sys, os
print(os.sep.join([sys.prefix, 'lib', 'python%d.%d' % sys.version_info[:2], 'site-packages']))

(it also adds ${sys.prefix}/lib/site-python and adds both paths for sys.exec_prefix as well, should that constant be different).

That said, what’s the context? You shouldn’t be messing with your site-packages directly; setuptools/distutils will work for installation, and your program may be running in a virtualenv where your pythonpath is completely user-local, so it shouldn’t assume use of the system site-packages directly either.


Answer 7

A modern stdlib way is using sysconfig module, available in version 2.7 and 3.2+.

Note: sysconfig (source) is not to be confused with the distutils.sysconfig submodule (source) mentioned in several other answers here. The latter is an entirely different module and it’s lacking the get_paths function discussed below.

Python currently uses eight paths (docs):

  • stdlib: directory containing the standard Python library files that are not platform-specific.
  • platstdlib: directory containing the standard Python library files that are platform-specific.
  • platlib: directory for site-specific, platform-specific files.
  • purelib: directory for site-specific, non-platform-specific files.
  • include: directory for non-platform-specific header files.
  • platinclude: directory for platform-specific header files.
  • scripts: directory for script files.
  • data: directory for data files.

In most cases, users finding this question would be interested in the ‘purelib’ path (in some cases, you might be interested in ‘platlib’ too). Unlike the current accepted answer, this method still works regardless of whether or not you have a virtualenv activated.

At system level (this is Python 3.7.0 on mac OS):

>>> import sysconfig
>>> sysconfig.get_paths()['purelib']
'/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages'

With a venv, you’ll get something like this:

>>> import sysconfig
>>> sysconfig.get_paths()['purelib']
'/private/tmp/.venv/lib/python3.7/site-packages'

A shell script is also available to display these details, which you can invoke by executing sysconfig as a module:

python -m sysconfig

Answer 8

The native system packages installed with python installation in Debian based systems can be found at :

/usr/lib/python2.7/dist-packages/

In OSX – /Library/Python/2.7/site-packages

by using this small code :

from distutils.sysconfig import get_python_lib
print get_python_lib()

However, the list of packages installed via pip can be found at :

/usr/local/bin/

Or one can simply write the following command to list all paths where python packages are.

>>> import site; site.getsitepackages()
['/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']

Note: the location might vary based on your OS, like in OSX

>>> import site; site.getsitepackages()
['/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages', '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/site-python', '/Library/Python/2.7/site-packages']

Answer 9

All the answers (or: the same answer repeated over and over) are inadequate. What you want to do is this:

from setuptools.command.easy_install import easy_install
class easy_install_default(easy_install):
  """ class easy_install had problems with the first parameter not being
      an instance of Distribution, even though it was. This is due to
      some import-related mess.
      """

  def __init__(self):
    from distutils.dist import Distribution
    dist = Distribution()
    self.distribution = dist
    self.initialize_options()
    self._dry_run = None
    self.verbose = dist.verbose
    self.force = None
    self.help = 0
    self.finalized = 0

e = easy_install_default()
import distutils.errors
try:
  e.finalize_options()
except distutils.errors.DistutilsError:
  pass

print e.install_dir

The final line shows you the installation dir. Works on Ubuntu, whereas the above ones don’t. Don’t ask me about windows or other dists, but since it’s the exact same dir that easy_install uses by default, it’s probably correct everywhere where easy_install works (so, everywhere, even macs). Have fun. Note: original code has many swearwords in it.


Answer 10

A side-note: The proposed solution (distutils.sysconfig.get_python_lib()) does not work when there is more than one site-packages directory (as recommended by this article). It will only return the main site-packages directory.

Alas, I have no better solution either. Python doesn’t seem to keep track of site-packages directories, just the packages within them.


Answer 11

This works for me. It will get you both dist-packages and site-packages folders. If the folder is not on Python’s path, it won’t be doing you much good anyway.

import sys
print([f for f in sys.path if f.endswith('packages')])

Output (Ubuntu installation):

['/home/username/.local/lib/python2.7/site-packages',
 '/usr/local/lib/python2.7/dist-packages',
 '/usr/lib/python2.7/dist-packages']

Answer 12

This should work on all distributions, in and out of a virtual environment, due to its “low-tech” nature. The os module always resides in the parent directory of ‘site-packages’:

import os; print(os.path.dirname(os.__file__) + '/site-packages')

To change dir to the site-packages dir I use the following alias (on *nix systems):

alias cdsp='cd $(python -c "import os; print(os.path.dirname(os.__file__))"); cd site-packages'

Answer 13

An additional note to the get_python_lib function mentioned already: on some platforms different directories are used for platform specific modules (eg: modules that require compilation). If you pass plat_specific=True to the function you get the site packages for platform specific packages.
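
The same distinction can be sketched with the standard sysconfig module, used here as a substitute for distutils (this swap is an assumption, not what the answer itself uses). Its purelib and platlib paths correspond to the pure-Python and platform-specific directories:

```python
import sysconfig

purelib = sysconfig.get_path("purelib")  # pure-Python packages
platlib = sysconfig.get_path("platlib")  # packages with compiled extensions

print(purelib)
print(platlib)
# On many installations the two are identical; some platforms use lib64 for platlib.
print("same directory:", purelib == platlib)
```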


Answer 14

from distutils.sysconfig import get_python_lib
print(get_python_lib())

Answer 15

pip show will give all the details about a package: https://pip.pypa.io/en/stable/reference/pip_show/

To get the location:

pip show <package_name>| grep Location

Answer 16

Answer to old question. But use ipython for this.

pip install ipython
ipython 
import imaplib
imaplib?

This will give the following output about imaplib package –

Type:        module
String form: <module 'imaplib' from '/usr/lib/python2.7/imaplib.py'>
File:        /usr/lib/python2.7/imaplib.py
Docstring:  
IMAP4 client.

Based on RFC 2060.

Public class:           IMAP4
Public variable:        Debug
Public functions:       Internaldate2tuple
                        Int2AP
                        ParseFlags
                        Time2Internaldate

Answer 17

You should try this command to determine pip’s install location

Python 2

pip show six | grep "Location:" | cut -d " " -f2

Python 3

pip3 show six | grep "Location:" | cut -d " " -f2

Answer 18

I had to do something slightly different for a project I was working on: find the site-packages directory relative to the base install prefix. If the site-packages folder was in /usr/lib/python2.7/site-packages, I wanted the /lib/python2.7/site-packages part. I have, in fact, encountered systems where site-packages was in /usr/lib64, and the accepted answer did NOT work on those systems.

Similar to cheater’s answer, my solution peeks deep into the guts of Distutils, to find the path that actually gets passed around inside setup.py. It was such a pain to figure out that I don’t want anyone to ever have to figure this out again.

import sys
import os
from distutils.command.install import INSTALL_SCHEMES

if os.name == 'nt':
    scheme_key = 'nt'
else:
    scheme_key = 'unix_prefix'

print(INSTALL_SCHEMES[scheme_key]['purelib'].replace('$py_version_short', (str.split(sys.version))[0][0:3]).replace('$base', ''))

That should print something like /Lib/site-packages or /lib/python3.6/site-packages.


Answer 19

If it is already added to the PYTHONPATH you can also do something like

import sys
print('\n'.join(sys.path))

How can I sort a dictionary by key?

Question: How can I sort a dictionary by key?

What would be a nice way to go from {2:3, 1:89, 4:5, 3:0} to {1:89, 2:3, 3:0, 4:5}?
I checked some posts but they all use the “sorted” operator that returns tuples.


Answer 0

Standard Python dictionaries are unordered (until Python 3.7). Even if you sorted the (key,value) pairs, you wouldn’t be able to store them in a dict in a way that would preserve the ordering.

The easiest way is to use OrderedDict, which remembers the order in which the elements have been inserted:

In [1]: import collections

In [2]: d = {2:3, 1:89, 4:5, 3:0}

In [3]: od = collections.OrderedDict(sorted(d.items()))

In [4]: od
Out[4]: OrderedDict([(1, 89), (2, 3), (3, 0), (4, 5)])

Never mind the way od is printed out; it’ll work as expected:

In [11]: od[1]
Out[11]: 89

In [12]: od[3]
Out[12]: 0

In [13]: for k, v in od.iteritems(): print k, v
   ....: 
1 89
2 3
3 0
4 5

Python 3

For Python 3 users, one needs to use the .items() instead of .iteritems():

In [13]: for k, v in od.items(): print(k, v)
   ....: 
1 89
2 3
3 0
4 5

回答 1

字典本身没有这样的有序项目,如果您想按某种顺序将它们打印等,下面是一些示例:

在Python 2.4及更高版本中:

mydict = {'carl':40,
          'alan':2,
          'bob':1,
          'danny':3}

for key in sorted(mydict):
    print "%s: %s" % (key, mydict[key])

给出:

alan: 2
bob: 1
carl: 40
danny: 3

(低于2.4的Python :)

keylist = mydict.keys()
keylist.sort()
for key in keylist:
    print "%s: %s" % (key, mydict[key])

资料来源:http : //www.saltycrane.com/blog/2007/09/how-to-sort-python-dictionary-by-keys/

Dictionaries themselves do not have ordered items as such. If you want to print them (or otherwise process them) in some order, here are some examples:

In Python 2.4 and above:

mydict = {'carl':40,
          'alan':2,
          'bob':1,
          'danny':3}

for key in sorted(mydict):
    print "%s: %s" % (key, mydict[key])

gives:

alan: 2
bob: 1
carl: 40
danny: 3

(Python below 2.4:)

keylist = mydict.keys()
keylist.sort()
for key in keylist:
    print "%s: %s" % (key, mydict[key])

Source: http://www.saltycrane.com/blog/2007/09/how-to-sort-python-dictionary-by-keys/
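
For Python 3, where print is a function, the same loop can be sketched as follows. The helper name sorted_lines is an illustrative assumption and not part of the original answer.

```python
mydict = {'carl': 40, 'alan': 2, 'bob': 1, 'danny': 3}


def sorted_lines(d):
    """Build one 'key: value' line per entry, ordered by key."""
    return ["%s: %s" % (key, d[key]) for key in sorted(d)]


for line in sorted_lines(mydict):
    print(line)
```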


回答 2

Python的collections库文档中

>>> from collections import OrderedDict

>>> # regular unsorted dictionary
>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}

>>> # dictionary sorted by key -- OrderedDict(sorted(d.items()) also works
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])

>>> # dictionary sorted by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])

>>> # dictionary sorted by length of the key string
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])

From Python’s collections library documentation:

>>> from collections import OrderedDict

>>> # regular unsorted dictionary
>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}

>>> # dictionary sorted by key -- OrderedDict(sorted(d.items()) also works
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])

>>> # dictionary sorted by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])

>>> # dictionary sorted by length of the key string
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])
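
The key function can also return a tuple when more than one criterion matters. This is a small sketch along the same lines as the documentation examples above; the extra 'kiwi' entry is an assumption added only to create a tie on the value.

```python
from collections import OrderedDict

d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2, 'kiwi': 3}

# Sort by value descending; ties are broken by key in ascending order.
by_value_then_key = OrderedDict(sorted(d.items(), key=lambda t: (-t[1], t[0])))
print(list(by_value_then_key.items()))
```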

回答 3

对于CPython / PyPy 3.6和任何Python 3.7或更高版本,可以使用以下方法轻松完成此操作:

>>> d = {2:3, 1:89, 4:5, 3:0}
>>> dict(sorted(d.items()))
{1: 89, 2: 3, 3: 0, 4: 5}

For CPython/PyPy 3.6, and any Python 3.7 or higher, this is easily done with:

>>> d = {2:3, 1:89, 4:5, 3:0}
>>> dict(sorted(d.items()))
{1: 89, 2: 3, 3: 0, 4: 5}
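
One caveat worth illustrating: a dict built this way remembers insertion order, but it does not stay sorted when keys are added later. A short sketch, assuming Python 3.7 or later:

```python
d = {2: 3, 1: 89, 4: 5, 3: 0}
sorted_d = dict(sorted(d.items()))

# A later insertion goes to the end; the dict is not re-sorted.
sorted_d[0] = 7
keys_after_insert = list(sorted_d)
print(keys_after_insert)

# Re-sorting means building yet another dict.
resorted = dict(sorted(sorted_d.items()))
print(list(resorted))
```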

回答 4

有许多Python模块提供字典实现,这些实现将按顺序自动维护键。考虑sortedcontainers模块,它是纯Python和快速C实现。还与其他基准测试的流行选项进行性能比较

如果您需要在迭代过程中不断添加和删除键/值对,则使用有序dict是不适当的解决方案。

>>> from sortedcontainers import SortedDict
>>> d = {2:3, 1:89, 4:5, 3:0}
>>> s = SortedDict(d)
>>> s.items()
[(1, 89), (2, 3), (3, 0), (4, 5)]

SortedDict类型还支持索引位置查找和删除,这是内置dict类型无法实现的。

>>> s.iloc[-1]
4
>>> del s.iloc[2]
>>> s.keys()
SortedSet([1, 2, 4])

There are a number of Python modules that provide dictionary implementations which automatically maintain the keys in sorted order. Consider the sortedcontainers module, which provides pure-Python implementations that are as fast as C implementations. There is also a performance comparison with other popular options benchmarked against one another.

Using an ordered dict is an inadequate solution if you need to constantly add and remove key/value pairs while also iterating.

>>> from sortedcontainers import SortedDict
>>> d = {2:3, 1:89, 4:5, 3:0}
>>> s = SortedDict(d)
>>> s.items()
[(1, 89), (2, 3), (3, 0), (4, 5)]

The SortedDict type also supports indexed location lookups and deletion which isn’t possible with the built-in dict type.

>>> s.iloc[-1]
4
>>> del s.iloc[2]
>>> s.keys()
SortedSet([1, 2, 4])

回答 5

只是:

d = {2:3, 1:89, 4:5, 3:0}
sd = sorted(d.items())

for k,v in sd:
    print k, v

输出:

1 89
2 3
3 0
4 5

Simply:

d = {2:3, 1:89, 4:5, 3:0}
sd = sorted(d.items())

for k,v in sd:
    print k, v

Output:

1 89
2 3
3 0
4 5

回答 6

正如其他人所提到的,字典本质上是无序的。但是,如果问题仅在于按顺序显示字典,则可以__str__在字典子类中重写该方法,并使用此字典类而不是Builtin dict。例如。

class SortedDisplayDict(dict):
   def __str__(self):
       return "{" + ", ".join("%r: %r" % (key, self[key]) for key in sorted(self)) + "}"


>>> d = SortedDisplayDict({2:3, 1:89, 4:5, 3:0})
>>> print(d)
{1: 89, 2: 3, 3: 0, 4: 5}

请注意,这不会改变密钥的存储方式,迭代时它们返回的顺序等,也不会改变它们print在python控制台中的显示方式。

As others have mentioned, dictionaries are inherently unordered. However, if the issue is merely displaying dictionaries in an ordered fashion, you can override the __str__ method in a dictionary subclass, and use this dictionary class rather than the builtin dict. For example:

class SortedDisplayDict(dict):
   def __str__(self):
       return "{" + ", ".join("%r: %r" % (key, self[key]) for key in sorted(self)) + "}"


>>> d = SortedDisplayDict({2:3, 1:89, 4:5, 3:0})
>>> print(d)
{1: 89, 2: 3, 3: 0, 4: 5}

Note that this changes nothing about how the keys are stored or the order in which they come back when you iterate over them; it only changes how they're displayed with print. The interactive console echoes objects using __repr__, so that method needs the same override to get sorted output there.
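
The interactive console shows objects through __repr__ rather than __str__, so a variant that overrides __repr__ covers both cases. This is a sketch under that assumption; the class name SortedReprDict is hypothetical.

```python
class SortedReprDict(dict):
    """dict subclass whose str() and repr() list keys in sorted order."""

    def __repr__(self):
        body = ", ".join("%r: %r" % (key, self[key]) for key in sorted(self))
        return "{" + body + "}"

    # dict does not define its own __str__, so str() already falls back to
    # __repr__; the alias just makes that explicit.
    __str__ = __repr__


d = SortedReprDict({2: 3, 1: 89, 4: 5, 3: 0})
print(repr(d))
```

Iteration order is still whatever the underlying dict provides; only the displayed text is sorted.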


回答 7

找到了另一种方法:

import json
print json.dumps(d, sort_keys = True)

upd:
1.这也会对嵌套对象进行排序(感谢@DanielF)。
2. python字典是无序的,因此可用于打印或仅分配给str。

Found another way:

import json
print json.dumps(d, sort_keys = True)

upd:
1. this also sorts nested objects (thanks @DanielF).
2. Python dictionaries are unordered, therefore this is suitable only for printing or for assigning to a str.
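
A Python 3 sketch of the same idea, with sample data chosen here for illustration. It also shows one side effect to keep in mind: JSON object keys are always strings, so integer keys do not survive a round trip.

```python
import json

d = {"b": {"z": 1, "a": 2}, "a": 0}
text = json.dumps(d, sort_keys=True)
print(text)

# Integer keys are written out as JSON strings, so a round trip through
# json.loads does not give back the original keys.
numeric = json.dumps({2: 3, 1: 89}, sort_keys=True)
print(numeric)
```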


回答 8

在Python 3中。

>>> D1 = {2:3, 1:89, 4:5, 3:0}
>>> for key in sorted(D1):
    print (key, D1[key])

1 89
2 3
3 0
4 5

In Python 3.

>>> D1 = {2:3, 1:89, 4:5, 3:0}
>>> for key in sorted(D1):
    print (key, D1[key])

gives

1 89
2 3
3 0
4 5

回答 9

Python字典在Python 3.6之前是无序的。在Python 3.6的CPython实现中,字典保留插入顺序。从Python 3.7开始,这将成为一种语言功能。

在Python 3.6的更新日志中(https://docs.python.org/3.6/whatsnew/3.6.html#whatsnew36-compactdict):

此新实现的顺序保留方面被认为是实现细节,因此不应依赖(将来可能会更改,但是希望在更改语言规范之前,先在几个发行版中使用该新dict实现该语言,为所有当前和将来的Python实现强制要求保留顺序的语义;这还有助于保留与仍旧有效的随机迭代顺序的旧版本语言(例如Python 3.5)的向后兼容性。

在Python 3.7的文档中(https://docs.python.org/3.7/tutorial/datastructures.html#dictionaries):

在字典上执行list(d)会以插入顺序返回字典中使用的所有键的列表(如果要对其进行排序,请改用sorted(d))。

因此,与以前的版本不同,您可以在Python 3.6 / 3.7之后对字典进行排序。如果要对嵌套的字典(包括其中的子字典)进行排序,则可以执行以下操作:

test_dict = {'a': 1, 'c': 3, 'b': {'b2': 2, 'b1': 1}}

def dict_reorder(item):
    return {k: dict_reorder(v) if isinstance(v, dict) else v for k, v in sorted(item.items())}

reordered_dict = dict_reorder(test_dict)

https://gist.github.com/ligyxy/f60f0374defc383aa098d44cfbd318eb

Python dictionaries were unordered before Python 3.6. In the CPython implementation of Python 3.6, dictionaries keep insertion order. From Python 3.7 onward, this is a language feature.

In the changelog of Python 3.6 (https://docs.python.org/3.6/whatsnew/3.6.html#whatsnew36-compactdict):

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5).

In the documentation of Python 3.7 (https://docs.python.org/3.7/tutorial/datastructures.html#dictionaries):

Performing list(d) on a dictionary returns a list of all the keys used in the dictionary, in insertion order (if you want it sorted, just use sorted(d) instead).

So, unlike in previous versions, a dict can hold its items in sorted order from Python 3.6/3.7 onward. If you want to sort a nested dict, including the sub-dicts inside it, you can do:

test_dict = {'a': 1, 'c': 3, 'b': {'b2': 2, 'b1': 1}}

def dict_reorder(item):
    # recurse with dict_reorder itself so that nested dicts are sorted too
    return {k: dict_reorder(v) if isinstance(v, dict) else v for k, v in sorted(item.items())}

reordered_dict = dict_reorder(test_dict)

https://gist.github.com/ligyxy/f60f0374defc383aa098d44cfbd318eb


回答 10

在这里,我找到了一些最简单的解决方案,以使用键对python字典进行排序pprint。例如。

>>> x = {'a': 10, 'cd': 20, 'b': 30, 'az': 99} 
>>> print x
{'a': 10, 'b': 30, 'az': 99, 'cd': 20}

但是在使用pprint时,它将返回排序的字典

>>> import pprint 
>>> pprint.pprint(x)
{'a': 10, 'az': 99, 'b': 30, 'cd': 20}

Here is one of the simplest ways I found to display a Python dict sorted by key: use pprint. For example:

>>> x = {'a': 10, 'cd': 20, 'b': 30, 'az': 99} 
>>> print x
{'a': 10, 'b': 30, 'az': 99, 'cd': 20}

But pprint displays the dict sorted by key:

>>> import pprint 
>>> pprint.pprint(x)
{'a': 10, 'az': 99, 'b': 30, 'cd': 20}
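
Note that pprint only affects what is displayed. A short sketch, assuming Python 3.7 or later, that captures the formatted text with pprint.pformat and shows that the dictionary itself is unchanged:

```python
import pprint

x = {'a': 10, 'cd': 20, 'b': 30, 'az': 99}

# pformat returns the same text that pprint would print.
text = pprint.pformat(x)
print(text)

# The dictionary itself still iterates in insertion order.
print(list(x))
```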

回答 11

有一种简单的方法可以对字典进行排序。

根据您的问题,

解决方案是:

c={2:3, 1:89, 4:5, 3:0}
y=sorted(c.items())
print y

(其中c是您的字典的名称。)

该程序提供以下输出:

[(1, 89), (2, 3), (3, 0), (4, 5)]

就像你想要的。

另一个示例是:

d={"John":36,"Lucy":24,"Albert":32,"Peter":18,"Bill":41}
x=sorted(d.keys())
print x

给出输出:['Albert', 'Bill', 'John', 'Lucy', 'Peter']

y=sorted(d.values())
print y

给出输出:[18, 24, 32, 36, 41]

z=sorted(d.items())
print z

给出输出:

[('Albert', 32), ('Bill', 41), ('John', 36), ('Lucy', 24), ('Peter', 18)]

因此,通过将其更改为键,值和项,您可以按照自己的需要进行打印。希望这会有所帮助!

There is an easy way to sort a dictionary.

According to your question,

The solution is :

c={2:3, 1:89, 4:5, 3:0}
y=sorted(c.items())
print y

(Where c is the name of your dictionary.)

This program gives the following output:

[(1, 89), (2, 3), (3, 0), (4, 5)]

as you wanted.

Another example is:

d={"John":36,"Lucy":24,"Albert":32,"Peter":18,"Bill":41}
x=sorted(d.keys())
print x

Gives the output:['Albert', 'Bill', 'John', 'Lucy', 'Peter']

y=sorted(d.values())
print y

Gives the output:[18, 24, 32, 36, 41]

z=sorted(d.items())
print z

Gives the output:

[('Albert', 32), ('Bill', 41), ('John', 36), ('Lucy', 24), ('Peter', 18)]

Hence, by switching between keys, values and items, you can print whichever form you want. Hope this helps!


回答 12

将会生成您想要的东西:

 D1 = {2:3, 1:89, 4:5, 3:0}

 sort_dic = {}

 for i in sorted(D1):
     sort_dic.update({i:D1[i]})
 print sort_dic


{1: 89, 2: 3, 3: 0, 4: 5}

但这不是执行此操作的正确方法,因为它可能会显示不同词典的不同行为,这是我最近学到的。因此,Tim在我在这里分享的Query的响应中提出了一种完美的方法。

from collections import OrderedDict
sorted_dict = OrderedDict(sorted(D1.items(), key=lambda t: t[0]))

This will generate exactly what you want:

 D1 = {2:3, 1:89, 4:5, 3:0}

 sort_dic = {}

 for i in sorted(D1):
     sort_dic.update({i:D1[i]})
 print sort_dic


{1: 89, 2: 3, 3: 0, 4: 5}

But this is not the correct way to do this, because it could behave differently with different dictionaries, as I learned recently. A better way was suggested by Tim in response to my query, which I am sharing here.

from collections import OrderedDict
sorted_dict = OrderedDict(sorted(D1.items(), key=lambda t: t[0]))

回答 13

我认为最简单的方法是按键对字典进行排序,然后将排序后的键:值对保存在新字典中。

dict1 = {'renault': 3, 'ford':4, 'volvo': 1, 'toyota': 2} 
dict2 = {}                  # create an empty dict to store the sorted values
for key in sorted(dict1.keys()):
    if key not in dict2:    # Depending on the goal, this line may not be necessary
        dict2[key] = dict1[key]

为了更清楚一点:

dict1 = {'renault': 3, 'ford':4, 'volvo': 1, 'toyota': 2} 
dict2 = {}                  # create an empty dict to store the sorted values
for key in sorted(dict1.keys()):
    if key not in dict2:    # Depending on the goal, this line may not be necessary
        value = dict1[key]
        dict2[key] = value

I think the easiest thing is to sort the dict by key and save the sorted key:value pair in a new dict.

dict1 = {'renault': 3, 'ford':4, 'volvo': 1, 'toyota': 2} 
dict2 = {}                  # create an empty dict to store the sorted values
for key in sorted(dict1.keys()):
    if key not in dict2:    # Depending on the goal, this line may not be necessary
        dict2[key] = dict1[key]

To make it clearer:

dict1 = {'renault': 3, 'ford':4, 'volvo': 1, 'toyota': 2} 
dict2 = {}                  # create an empty dict to store the sorted values
for key in sorted(dict1.keys()):
    if key not in dict2:    # Depending on the goal, this line may not be necessary
        value = dict1[key]
        dict2[key] = value

回答 14

您可以根据问题按关键字对当前词典进行排序,从而创建新词典。

这是你的字典

d = {2:3, 1:89, 4:5, 3:0}

通过使用lambda函数对d排序来创建新字典d1

d1 = dict(sorted(d.items(), key = lambda x:x[0]))

d1应该为{1:89,2:3,3:0,4:5},根据d中的键排序。

You can create a new dictionary by sorting the current dictionary by key as per your question.

This is your dictionary

d = {2:3, 1:89, 4:5, 3:0}

Create a new dictionary d1 by sorting d, using a lambda function as the sort key:

d1 = dict(sorted(d.items(), key = lambda x:x[0]))

d1 should be {1: 89, 2: 3, 3: 0, 4: 5}, sorted based on keys in d.


回答 15

Python字典是无序的。通常,这不是问题,因为最常见的用例是进行查找。

执行所需操作的最简单方法是创建collections.OrderedDict按排序顺序插入元素。

ordered_dict = collections.OrderedDict([(k, d[k]) for k in sorted(d.keys())])

如上面其他建议那样,如果需要迭代,则最简单的方法是迭代已排序的键。例子-

打印按键排序的值:

# create the dict
d = {k1:v1, k2:v2,...}
# iterate by keys in sorted order
for k in sorted(d.keys()):
    value = d[k]
    # do something with k, value like print
    print k, value

获取按键排序的值列表:

values = [d[k] for k in sorted(d.keys())]

Python dicts are un-ordered. Usually, this is not a problem since the most common use case is to do a lookup.

The simplest way to do what you want would be to create a collections.OrderedDict, inserting the elements in sorted order.

ordered_dict = collections.OrderedDict([(k, d[k]) for k in sorted(d.keys())])

If you need to iterate, as others above have suggested, the simplest way is to iterate over the sorted keys. Examples:

Print values sorted by keys:

# create the dict
d = {k1:v1, k2:v2,...}
# iterate by keys in sorted order
for k in sorted(d.keys()):
    value = d[k]
    # do something with k, value like print
    print k, value

Get list of values sorted by keys:

values = [d[k] for k in sorted(d.keys())]

回答 16

我提出单行字典排序。

>> a = {2:3, 1:89, 4:5, 3:0}
>> c = {i:a[i] for i in sorted(a.keys())}
>> print(c)
{1: 89, 2: 3, 3: 0, 4: 5}
[Finished in 0.4s]

希望这会有所帮助。

I came up with a single-line dict sort.

>> a = {2:3, 1:89, 4:5, 3:0}
>> c = {i:a[i] for i in sorted(a.keys())}
>> print(c)
{1: 89, 2: 3, 3: 0, 4: 5}
[Finished in 0.4s]

Hope this will be helpful.


回答 17

此函数将按其键对任何字典进行递归排序。也就是说,如果字典中的任何值也是字典,则也将通过其键对它进行排序。如果您在CPython 3.6或更高版本上运行,则可以简单地更改为使用a dict而不是an OrderedDict

from collections import OrderedDict

def sort_dict(d):
    items = [[k, v] for k, v in sorted(d.items(), key=lambda x: x[0])]
    for item in items:
        if isinstance(item[1], dict):
            item[1] = sort_dict(item[1])
    return OrderedDict(items)
    #return dict(items)

This function will sort any dictionary recursively by its key. That is, if any value in the dictionary is also a dictionary, it too will be sorted by its key. If you are running on CPython 3.6 or greater, then a simple change can be made to use a dict rather than an OrderedDict.

from collections import OrderedDict

def sort_dict(d):
    items = [[k, v] for k, v in sorted(d.items(), key=lambda x: x[0])]
    for item in items:
        if isinstance(item[1], dict):
            item[1] = sort_dict(item[1])
    return OrderedDict(items)
    #return dict(items)
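
The function above only descends into values that are themselves dictionaries. If dictionaries can also appear inside lists, a variant along these lines may help. This is a sketch, assuming Python 3.7 or later so that a plain dict keeps its order; the name sort_nested is hypothetical.

```python
def sort_nested(obj):
    """Return a copy of obj with every dict, at any depth, ordered by key."""
    if isinstance(obj, dict):
        return {k: sort_nested(v) for k, v in sorted(obj.items())}
    if isinstance(obj, list):
        # List order is kept; only dicts found inside the list are sorted.
        return [sort_nested(item) for item in obj]
    return obj


data = {'b': [{'y': 1, 'x': 2}], 'a': {'d': 4, 'c': 3}}
result = sort_nested(data)
print(result)
```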

回答 18

伙计们,你让事情变得复杂了……这很简单

from pprint import pprint
Dict={'B':1,'A':2,'C':3}
pprint(Dict)

输出为:

{'A': 2, 'B': 1, 'C': 3}

Guys you are making things complicated … it’s really simple

from pprint import pprint
Dict={'B':1,'A':2,'C':3}
pprint(Dict)

The output is:

{'A': 2, 'B': 1, 'C': 3}

回答 19

最简单的解决方案是,您应该获得一个dict键的列表,该键是排序顺序,然后遍历dict。例如

a1 = {'a':1, 'b':13, 'd':4, 'c':2, 'e':30}
a1_sorted_keys = sorted(a1, key=a1.get, reverse=True)
for r in a1_sorted_keys:
    print r, a1[r]

以下是输出(降序)

e 30
b 13
d 4
c 2
a 1

The simplest solution is to get a list of the dict's keys in sorted order and then iterate over that list. For example:

a1 = {'a':1, 'b':13, 'd':4, 'c':2, 'e':30}
a1_sorted_keys = sorted(a1, key=a1.get, reverse=True)
for r in a1_sorted_keys:
    print r, a1[r]

The output will be as follows (descending order by value):

e 30
b 13
d 4
c 2
a 1

回答 20

一种简单的方法:

d = {2:3, 1:89, 4:5, 3:0}

s = {k : d[k] for k in sorted(d)}

s
Out[1]: {1: 89, 2: 3, 3: 0, 4: 5} 

An easy way to do this:

d = {2:3, 1:89, 4:5, 3:0}

s = {k : d[k] for k in sorted(d)}

s
Out[1]: {1: 89, 2: 3, 3: 0, 4: 5} 

回答 21

2.7中这两种方法的时序比较表明它们实际上是相同的:

>>> setup_string = "a = sorted(dict({2:3, 1:89, 4:5, 3:0}).items())"
>>> timeit.timeit(stmt="[(k, val) for k, val in a]", setup=setup_string, number=10000)
0.003599141953657181

>>> setup_string = "from collections import OrderedDict\n"
>>> setup_string += "a = OrderedDict({1:89, 2:3, 3:0, 4:5})\n"
>>> setup_string += "b = a.items()"
>>> timeit.timeit(stmt="[(k, val) for k, val in b]", setup=setup_string, number=10000)
0.003581275490432745 

A timing comparison of the two methods in 2.7 shows them to be virtually identical:

>>> setup_string = "a = sorted(dict({2:3, 1:89, 4:5, 3:0}).items())"
>>> timeit.timeit(stmt="[(k, val) for k, val in a]", setup=setup_string, number=10000)
0.003599141953657181

>>> setup_string = "from collections import OrderedDict\n"
>>> setup_string += "a = OrderedDict({1:89, 2:3, 3:0, 4:5})\n"
>>> setup_string += "b = a.items()"
>>> timeit.timeit(stmt="[(k, val) for k, val in b]", setup=setup_string, number=10000)
0.003581275490432745 

回答 22

from operator import itemgetter
# if you would like to play with multiple dictionaries then here you go:
# Three dictionaries that are composed of first name and last name.
user = [
    {'fname': 'Mo', 'lname': 'Mahjoub'},
    {'fname': 'Abdo', 'lname': 'Al-hebashi'},
    {'fname': 'Ali', 'lname': 'Muhammad'}
]
#  This loop will sort by the first and the last names.
# notice that in a dictionary order doesn't matter. So it could put the first name first or the last name first. 
for k in sorted (user, key=itemgetter ('fname', 'lname')):
    print (k)

# This one will sort by the first name only.
for x in sorted (user, key=itemgetter ('fname')):
    print (x)

回答 23

dictionary = {1:[2],2:[],5:[4,5],4:[5],3:[1]}

temp = sorted(dictionary)  # sorted list of the keys
sorted_dict = dict((k, dictionary[k]) for k in temp)

print(sorted_dict)
# {1: [2], 2: [], 3: [1], 4: [5], 5: [4, 5]}

回答 24

或使用pandas

演示:

>>> import pandas as pd
>>> d={'B':1,'A':2,'C':3}
>>> df=pd.DataFrame(d,index=[0]).sort_index(axis=1)
>>> df
   A  B  C
0  2  1  3
>>> df.to_dict('index')[0]
{'A': 2, 'B': 1, 'C': 3}

看到:

这个文档

大熊猫的文献资料

Or use pandas,

Demo:

>>> import pandas as pd
>>> d={'B':1,'A':2,'C':3}
>>> df=pd.DataFrame(d,index=[0]).sort_index(axis=1)
>>> df
   A  B  C
0  2  1  3
>>> df.to_dict('index')[0]
{'A': 2, 'B': 1, 'C': 3}

See:

Docs of this

Documentation of whole pandas


回答 25

我的建议是这样,因为它允许您在添加项目时对字典进行排序或使字典保持排序,并且将来可能需要添加项目:

dict从头开始构建。有第二个数据结构,一个列表,以及您的键列表。bisect软件包具有insort函数,该函数允许插入排序列表中,或者在完全填充字典后对列表进行排序。现在,当您遍历字典时,您将遍历列表以按顺序访问每个键,而不必担心dict结构的表示(不是为排序而设计的)。

My suggestion is this as it allows you to sort a dict or keep a dict sorted as you are adding items and might need to add items in the future:

Build a dict from scratch as you go along. Keep a second data structure, a list, holding the dict's keys. The bisect module has an insort function that inserts into a sorted list; alternatively, sort the list once after completely populating the dict. Now, when you want to iterate over your dict, iterate over the list instead to access each key in order, without worrying about the representation of the dict structure (which was not made for sorting).
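
The idea described above can be sketched as a small wrapper class. This is one possible reading of the suggestion rather than a definitive implementation; the class name SortedKeyDict and its methods are assumptions made for illustration.

```python
import bisect


class SortedKeyDict:
    """A dict paired with a list of its keys that is kept in sorted order."""

    def __init__(self):
        self._data = {}
        self._keys = []

    def __setitem__(self, key, value):
        if key not in self._data:
            # insort places the new key at its sorted position.
            bisect.insort(self._keys, key)
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]
        # bisect_left finds the key's position without a linear search.
        del self._keys[bisect.bisect_left(self._keys, key)]

    def items(self):
        """Yield (key, value) pairs in ascending key order."""
        for key in self._keys:
            yield key, self._data[key]


sd = SortedKeyDict()
for k, v in [(2, 3), (1, 89), (4, 5), (3, 0)]:
    sd[k] = v
print(list(sd.items()))
```

Lookups stay as fast as a plain dict, while each insertion or deletion pays for shifting elements in the key list.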


回答 26

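d = {2: 3, 1: 89, 4: 5, 3: 0}  # example dictionary to sort by key

remaining = list(d.keys())     # a copy, so the dictionary itself is not touched
sorted_keys = []
while remaining:
    # find the smallest key that has not been placed yet
    smallest = remaining[0]
    for item in remaining:
        if item < smallest:
            smallest = item
    # move it from the unsorted list to the sorted one
    remaining.remove(smallest)
    sorted_keys.append(smallest)

for key in sorted_keys:
    print(key)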

为什么要在easy_install上使用pip?[关闭]

问题:为什么要在easy_install上使用pip?[关闭]

一条推文中写道:

不要使用easy_install,除非您喜欢对自己的脸部进行刺伤。使用点子。

为什么要在easy_install上使用pip?难道不是PyPI和程序包作者最主要的原因吗?如果作者将废话源tarball(例如:缺少文件,没有setup.py)上传到PyPI,则pip和easy_install都将失败。除了化妆品的差异,为什么Python的人(如上面的鸣叫)似乎强烈地倾向于在点子的easy_install?

(假设我们正在谈论由社区维护的Distribute软件包中的easy_install)

A tweet reads:

Don’t use easy_install, unless you like stabbing yourself in the face. Use pip.

Why use pip over easy_install? Doesn’t the fault lie with PyPI and package authors mostly? If an author uploads crap source tarball (eg: missing files, no setup.py) to PyPI, then both pip and easy_install will fail. Other than cosmetic differences, why do Python people (like in the above tweet) seem to strongly favor pip over easy_install?

(Let’s assume that we’re talking about easy_install from the Distribute package, that is maintained by the community)


回答 0

此处的许多答案在2015年已经过时了(尽管最初由Daniel Roseman接受的答案不是)。这是当前的状态:

  • 现在,二进制程序包以轮子(.whl文件)的形式分发-不仅在PyPI上,而且在第三方存储库中,例如Christoph Gohlke的Windows Extension Packagespip可以处理轮子;easy_install不能。
  • 虚拟环境(由3.4内置,或者可以通过2.6添加到2.6 + / 3.1 + virtualenv)已经成为一个非常重要和突出的工具(并在官方文档中推荐);它们pip是开箱即用的,但是甚至无法正常使用easy_install
  • distribute包含的软件包easy_install不再维护。它的改进已setuptools合并回setuptools。尝试安装distribute只会安装setuptools
  • easy_install 本身只是准维护的。
  • 所有的其中箱子pip用于不如easy_install从解包源树-installing,从DVCS回购等-是早已过去的; 你可以pip install .pip install git+https://
  • pip带有来自python.org的官方Python 2.7和3.4+软件包,pip如果您从源代码构建,则默认情况下会包含引导程序。
  • Python打包用户指南》已取代了有关安装,使用和构建软件包的各种文档的不完整之处。现在,Python自己的有关安装Python模块的文档符合该用户指南的要求,并明确地pip称为“首选安装程序”。
  • pip这些年来,还添加了其他新功能,这些功能将永远不会存在easy_install。例如,pip通过构建需求文件,然后在每一侧使用单个命令安装它,可以轻松克隆站点程序包。或将您的需求文件转换为本地回购以用于内部开发。等等。

我知道easy_install在2015年使用的唯一好的理由是在OS X 10.5-10.8中使用Apple预先安装的Python版本的特殊情况。从10.5开始,Apple已包含easy_install,但从10.10开始,它们仍然不包含pip。使用10.9+时,您仍然应该只使用get-pip.py,但是对于10.5-10.8,这存在一些问题,因此更容易实现sudo easy_install pip。(通常,这easy_install pip是一个坏主意;您只想在OS X 10.5-10.8上才能做到这一点。)此外,10.5-10.8包含readline以一种easy_install知道如何纠缠而pip不会纠缠的方式,因此您也想sudo easy_install readline如果要升级。

Many of the answers here are out of date for 2015 (although the initially accepted one from Daniel Roseman is not). Here’s the current state of things:

  • Binary packages are now distributed as wheels (.whl files)—not just on PyPI, but in third-party repositories like Christoph Gohlke’s Extension Packages for Windows. pip can handle wheels; easy_install cannot.
  • Virtual environments (which come built-in with 3.4, or can be added to 2.6+/3.1+ with virtualenv) have become a very important and prominent tool (and recommended in the official docs); they include pip out of the box, but don’t even work properly with easy_install.
  • The distribute package that included easy_install is no longer maintained. Its improvements over setuptools got merged back into setuptools. Trying to install distribute will just install setuptools instead.
  • easy_install itself is only quasi-maintained.
  • All of the cases where pip used to be inferior to easy_install—installing from an unpacked source tree, from a DVCS repo, etc.—are long-gone; you can pip install ., pip install git+https://.
  • pip comes with the official Python 2.7 and 3.4+ packages from python.org, and a pip bootstrap is included by default if you build from source.
  • The various incomplete bits of documentation on installing, using, and building packages have been replaced by the Python Packaging User Guide. Python’s own documentation on Installing Python Modules now defers to this user guide, and explicitly calls out pip as “the preferred installer program”.
  • Other new features have been added to pip over the years that will never be in easy_install. For example, pip makes it easy to clone your site-packages by building a requirements file and then installing it with a single command on each side. Or to convert your requirements file to a local repo to use for in-house development. And so on.

The only good reason that I know of to use easy_install in 2015 is the special case of using Apple’s pre-installed Python versions with OS X 10.5-10.8. Since 10.5, Apple has included easy_install, but as of 10.10 they still don’t include pip. With 10.9+, you should still just use get-pip.py, but for 10.5-10.8, this has some problems, so it’s easier to sudo easy_install pip. (In general, easy_install pip is a bad idea; it’s only for OS X 10.5-10.8 that you want to do this.) Also, 10.5-10.8 include readline in a way that easy_install knows how to kludge around but pip doesn’t, so you also want to sudo easy_install readline if you want to upgrade that.


回答 1

从伊恩·比金(Ian Bicking)自己对pip介绍

pip最初旨在通过以下方式对easy_install进行改进

  • 所有软件包均在安装前已下载。结果不会发生部分完成的安装。
  • 注意在控制台上显示有用的输出。
  • 采取行动的原因已被跟踪。例如,如果正在安装软件包,则pip会跟踪为什么需要该软件包。
  • 错误消息应该很有用。
  • 该代码相对简洁明了,具有内聚性,可以更轻松地以编程方式使用。
  • 软件包不必作为Egg存档安装,可以将它们平放安装(同时保留Egg元数据)。
  • 对其他版本控制系统(Git,Mercurial和Bazaar)的本地支持
  • 卸载软件包。
  • 简单定义固定的需求集并可靠地复制一组包。

From Ian Bicking’s own introduction to pip:

pip was originally written to improve on easy_install in the following ways

  • All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
  • Care is taken to present useful output on the console.
  • The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
  • Error messages should be useful.
  • The code is relatively concise and cohesive, making it easier to use programmatically.
  • Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
  • Native support for other version control systems (Git, Mercurial and Bazaar)
  • Uninstallation of packages.
  • Simple to define fixed sets of requirements and reliably reproduce a set of packages.

回答 2

另一个(至今尚未提及)之所以喜欢点子,是因为它是新的热点,并将在未来继续使用。

以下信息图表(来自《The Hitchhiker’s Guide to Packaging v1.0》中的包装当前状态”部分)表明setuptools / easy_install将来会消失。

这是Distribution的文档中的另一个信息图,显示Setuptools和easy_install将被新的热点— distributionpip取代。虽然PIP仍然是新的辣味,分发与合并的setuptools在2013年发布的setuptools V0.7。

Another—as of yet unmentioned—reason for favoring pip is because it is the new hotness and will continue to be used in the future.

The infographic below, from the Current State of Packaging section in The Hitchhiker’s Guide to Packaging v1.0, shows that setuptools/easy_install will go away in the future.

Here’s another infographic from distribute’s documentation showing that Setuptools and easy_install will be replaced by the new hotness—distribute and pip. While pip is still the new hotness, Distribute merged with Setuptools in 2013 with the release of Setuptools v0.7.


回答 3

有两个原因,可能还有更多:

  1. pip提供uninstall命令

  2. 如果中间安装失败,则pip将使您保持干净状态。

Two reasons, there may be more:

  1. pip provides an uninstall command

  2. if an installation fails in the middle, pip will leave you in a clean state.


回答 4

需求文件。

认真地说,我每天都将它与virtualenv结合使用。


快速依赖管理教程,民谣

需求文件使您可以创建已通过pip安装的所有软件包的快照。通过将这些程序包封装在虚拟环境中,可以使代码库在一组非常特定的程序包中工作,并与其他人共享该代码库。

从Heroku的文档中 https://devcenter.heroku.com/articles/python

您创建一个虚拟环境,并设置您的外壳以使用它。(bash / * nix指令)

virtualenv env
source env/bin/activate

现在,与此外壳一起运行的所有python脚本都将使用该环境的软件包和配置。现在,您可以在此环境中本地安装软件包,而无需在计算机上全局安装。

pip install flask

现在,您可以转储有关安装哪些软件包的信息

pip freeze > requirements.txt

如果您将该文件签入版本控制中,那么当其他人获取您的代码时,他们可以设置自己的虚拟环境并使用以下命令安装所有依赖项:

pip install -r requirements.txt

任何时候您都可以像这样自动执行乏味的操作。

REQUIREMENTS files.

Seriously, I use this in conjunction with virtualenv every day.


QUICK DEPENDENCY MANAGEMENT TUTORIAL, FOLKS

Requirements files allow you to create a snapshot of all packages that have been installed through pip. By encapsulating those packages in a virtual environment, you can have your codebase work off a very specific set of packages and share that codebase with others.

From Heroku’s documentation https://devcenter.heroku.com/articles/python

You create a virtual environment, and set your shell to use it. (bash/*nix instructions)

virtualenv env
source env/bin/activate

Now all python scripts run with this shell will use this environment’s packages and configuration. Now you can install a package locally to this environment without needing to install it globally on your machine.

pip install flask

Now you can dump the info about which packages are installed with

pip freeze > requirements.txt

If you check that file into version control, then when someone else gets your code, they can set up their own virtual environment and install all the dependencies with:

pip install -r requirements.txt

Any time you can automate tedium like this, it's awesome.
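
To make the file format concrete: pip freeze writes one name==version line per installed package. The sketch below parses such text into a dict. The helper name parse_requirements and the sample package versions are made up for illustration, and it only handles exact == pins and comment lines, not the full requirements syntax.

```python
def parse_requirements(text):
    """Map package names to pinned versions from 'pip freeze' style text."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        # Lines without '==' carry no exact pin; record them with None.
        pins[name] = version if sep else None
    return pins


# Hypothetical file contents, not taken from a real project.
sample = """\
# generated by pip freeze
Flask==0.10.1
Jinja2==2.7.3
requests
"""
print(parse_requirements(sample))
```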


回答 5

pip不会安装二进制软件包,并且未在Windows上经过良好测试。

由于Windows默认没有附带编译器,因此通常无法在其中使用pip 。easy_install 可以为Windows安装二进制软件包。

pip won’t install binary packages and isn’t well tested on Windows.

As Windows doesn’t come with a compiler by default pip often can’t be used there. easy_install can install binary packages for Windows.


回答 6

更新:正如某些人所想,setuptools已经吸收distribute了相反的东西。setuptools是最新的最新distutils更改和滚轮格式。因此,easy_installpip或多或少平等现在。

来源:http : //pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name

UPDATE: setuptools has absorbed distribute as opposed to the other way around, as some thought. setuptools is up-to-date with the latest distutils changes and the wheel format. Hence, easy_install and pip are more or less on equal footing now.

Source: http://pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name


回答 7

除了模糊人的答复:

pip不会安装二进制软件包,并且未在Windows上经过良好测试。

由于Windows默认不带编译器,因此通常无法在其中使用pip。easy_install可以为Windows安装二进制软件包。

这是Windows上的一个技巧:

  • 您可以使用easy_install <package>安装二进制软件包来避免生成二进制文件

  • pip uninstall <package>即使您使用过easy_install,也可以使用 。

这只是在Windows上对我有效的解决方法。实际上,如果不涉及二进制文件,我总是使用pip。

请参阅当前的pip doku:http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install

我将在邮件列表中询问为此计划的内容。

这是最新的更新:

新的受支持的安装二进制文件的方式将是wheel!它尚未在标准中,但几乎已经存在。当前版本仍为Alpha:1.0.0a1

https://pypi.python.org/pypi/wheel

http://wheel.readthedocs.org/en/latest/

我将wheel通过创建要PySide使用的OS X安装程序进行测试wheel,而不是蛋。会回来并报告此情况。

欢呼声-克里斯

快速更新:

到的过渡wheel即将结束。大多数软件包都支持wheel

我答应为制作车轮PySide,去年夏天我做了。很棒!

提示:一些开发商至今未能支撑轮格式,仅仅是因为他们忘记更换distutilssetuptools。通常,通过替换中的单个单词很容易转换此类软件包setup.py

As an addition to fuzzyman’s reply:

pip won’t install binary packages and isn’t well tested on Windows.

As Windows doesn’t come with a compiler by default, pip often can’t be used there. easy_install can install binary packages for Windows.

Here is a trick on Windows:

  • you can use easy_install <package> to install binary packages to avoid building a binary

  • you can use pip uninstall <package> even if you used easy_install.

This is just a work-around that works for me on windows. Actually I always use pip if no binaries are involved.

See the current pip docs: http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install

I will ask on the mailing list what is planned for that.

Here is the latest update:

The new supported way to install binaries is going to be wheel! It is not yet in the standard, but almost. Current version is still an alpha: 1.0.0a1

https://pypi.python.org/pypi/wheel

http://wheel.readthedocs.org/en/latest/

I will test wheel by creating an OS X installer for PySide using wheel instead of eggs. Will get back and report about this.

cheers – Chris

A quick update:

The transition to wheel is almost over. Most packages are supporting wheel.

I promised to build wheels for PySide, and I did that last summer. Works great!

HINT: A few developers have so far failed to support the wheel format simply because they forgot to replace distutils with setuptools. Often, it is easy to convert such packages by replacing this single word in setup.py.


回答 8

刚遇到一个我不得不easy_install代替的特殊情况pip,否则我必须直接提取源代码。

对于该软件包GitPython,in中的版本pip太旧,即0.1.7,而from中的版本easy_install是最新的,即0.3.2.rc1

我正在使用Python 2.7.8。我不知道有关的底层机制easy_installpip,但至少有一些包的版本可能是彼此不同的,有时easy_install是一个较新的版本。

easy_install GitPython

I just ran into one special case where I had to use easy_install instead of pip, or else pull the source code directly.

For the package GitPython, the version available through pip is too old (0.1.7), while the one from easy_install is the latest (0.3.2.rc1).

I’m using Python 2.7.8. I’m not sure about the underlying mechanisms of easy_install and pip, but at least for some packages the two can deliver different versions, and sometimes easy_install is the one with the newer version.

easy_install GitPython

如何在迭代时从列表中删除项目?

问题:如何在迭代时从列表中删除项目?

我正在遍历Python中的元组列表,并尝试在满足特定条件的情况下将其删除。

for tup in somelist:
    if determine(tup):
         code_to_remove_tup

我应该用什么代替code_to_remove_tup?我不知道如何以这种方式删除项目。

I’m iterating over a list of tuples in Python, and am attempting to remove them if they meet certain criteria.

for tup in somelist:
    if determine(tup):
         code_to_remove_tup

What should I use in place of code_to_remove_tup? I can’t figure out how to remove the item in this fashion.


回答 0

您可以使用列表推导来创建一个仅包含您不想删除的元素的新列表:

somelist = [x for x in somelist if not determine(x)]

或者,通过分配给slice somelist[:],您可以将现有列表突变为仅包含所需的项目:

somelist[:] = [x for x in somelist if not determine(x)]

如果还有其他引用somelist需要反映更改,则此方法可能很有用。

除了理解之外,您还可以使用itertools。在Python 2中:

from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)

或在Python 3中:

from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)

为了清楚起见,以及对于那些发现使用[:]黑变或模糊表示法的人,这里有一个更明确的选择。从理论上讲,它在空间和时间上的表现应该与上面的单层表现相同。

temp = []
while somelist:
    x = somelist.pop()
    if not determine(x):
        temp.append(x)
while temp:
    somelist.append(temp.pop())

它也可以在其他语言中工作,而这些语言可能不具有Python列表的替换项功能,并且只需进行很少的修改即可。例如,并非所有语言都False像Python一样将空列表转换为。您可以替换while somelist:更明确的内容,例如while len(somelist) > 0:

You can use a list comprehension to create a new list containing only the elements you don’t want to remove:

somelist = [x for x in somelist if not determine(x)]

Or, by assigning to the slice somelist[:], you can mutate the existing list to contain only the items you want:

somelist[:] = [x for x in somelist if not determine(x)]

This approach could be useful if there are other references to somelist that need to reflect the changes.

Instead of a comprehension, you could also use itertools. In Python 2:

from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)

Or in Python 3:

from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)

For the sake of clarity, and for those who find the [:] notation hackish or fuzzy, here’s a more explicit alternative. Theoretically, it should perform the same with regard to space and time as the one-liners above.

temp = []
while somelist:
    x = somelist.pop()
    if not determine(x):
        temp.append(x)
while temp:
    somelist.append(temp.pop())

With minimal modifications, it also works in other languages that may not have the slice-replacement ability of Python lists. For instance, not all languages treat an empty list as False the way Python does. You can replace while somelist: with something more explicit, like while len(somelist) > 0:.
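To make the explicit version easier to try out, here is a minimal sketch that wraps it in a function. The names filter_in_place and should_remove are hypothetical stand-ins for the loop above and for determine. It shows why the original order survives: the elements are reversed once when drained into the temporary list and reversed again when moved back.

```python
def filter_in_place(items, should_remove):
    """Drop matching elements from `items` in place using only pop/append."""
    temp = []
    # Drain the list from the end; survivors land in `temp` in reverse order.
    while items:
        x = items.pop()
        if not should_remove(x):
            temp.append(x)
    # Popping from `temp` reverses the order again, restoring the original one.
    while temp:
        items.append(temp.pop())


numbers = [1, 2, 3, 4, 5, 6]
alias = numbers  # a second reference to the same list object
filter_in_place(numbers, lambda n: n % 2 == 0)
print(numbers)           # [1, 3, 5]
print(alias is numbers)  # True: the same object was modified
```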


回答 1

暗示列表理解的答案几乎是正确的-除了它们会建立一个全新的列表,然后为其命名与旧列表相同外,它们不会修改旧列表。这与@Lennart的建议中的选择性删除操作不同-速度更快,但是如果通过多个引用访问列表,则说明您只是在重新放置其中一个引用而不更改列表对象本身可能会导致微妙的灾难性错误。

幸运的是,获得列表理解的速度和就地变更所需的语义非常容易,只需编写代码即可:

somelist[:] = [tup for tup in somelist if determine(tup)]

请注意与其他答案的细微差别:这不是分配给裸名-而是分配给恰好是整个列表的列表切片,从而替换同一Python列表对象中的列表内容 ,而不仅仅是重新放置一个引用(从先前的列表对象到新的列表对象),就像其他答案一样。

The answers suggesting list comprehensions are ALMOST correct, except that they build a completely new list and then give it the same name as the old list; they do NOT modify the old list in place. That’s different from what you’d be doing by selective removal, as in @Lennart’s suggestion. It’s faster, but if your list is accessed via multiple references, the fact that you’re just reseating one of the references and NOT altering the list object itself can lead to subtle, disastrous bugs.

Fortunately, it’s extremely easy to get both the speed of list comprehensions AND the required semantics of in-place alteration — just code:

somelist[:] = [tup for tup in somelist if determine(tup)]

Note the subtle difference with other answers: this one is NOT assigning to a barename – it’s assigning to a list slice that just happens to be the entire list, thereby replacing the list contents within the same Python list object, rather than just reseating one reference (from previous list object to new list object) like the other answers.
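The difference between rebinding a name and assigning to a slice can be seen in a small sketch. The list contents and the is_odd predicate are made up for illustration, and this version filters with "not", so matching items are the ones removed.

```python
def is_odd(n):
    return n % 2 == 1


# Rebinding: `data` ends up pointing at a new list; `view` still sees the old one.
data = [1, 2, 3, 4]
view = data
data = [n for n in data if not is_odd(n)]
print(data)  # [2, 4]
print(view)  # [1, 2, 3, 4]

# Slice assignment: the contents of the one shared list object are replaced.
data2 = [1, 2, 3, 4]
view2 = data2
data2[:] = [n for n in data2 if not is_odd(n)]
print(data2)           # [2, 4]
print(view2)           # [2, 4]
print(view2 is data2)  # True
```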


回答 2

您需要获取列表的副本并首先对其进行迭代,否则迭代将失败,并可能导致意外结果。

例如(取决于列表的类型):

for tup in somelist[:]:
    etc....

一个例子:

>>> somelist = list(range(10))
>>> for x in somelist:
...     somelist.remove(x)
>>> somelist
[1, 3, 5, 7, 9]

>>> somelist = list(range(10))
>>> for x in somelist[:]:
...     somelist.remove(x)
>>> somelist
[]

You need to take a copy of the list and iterate over it first, or the iteration will fail with what may be unexpected results.

For example (depends on what type of list):

for tup in somelist[:]:
    etc....

An example:

>>> somelist = list(range(10))
>>> for x in somelist:
...     somelist.remove(x)
>>> somelist
[1, 3, 5, 7, 9]

>>> somelist = list(range(10))
>>> for x in somelist[:]:
...     somelist.remove(x)
>>> somelist
[]
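To see which elements the first loop actually visits, the same experiment can be run as a script that records every value handed to the loop body. The visited list is an addition for illustration, and list(range(10)) is used so that it runs on Python 3.

```python
somelist = list(range(10))
visited = []
for x in somelist:
    visited.append(x)
    somelist.remove(x)  # shifts later items left, so the next one is skipped

print(visited)   # [0, 2, 4, 6, 8]
print(somelist)  # [1, 3, 5, 7, 9]

# Iterating over a slice copy visits every element of the original.
somelist = list(range(10))
visited_copy = []
for x in somelist[:]:
    visited_copy.append(x)
    somelist.remove(x)

print(visited_copy)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(somelist)      # []
```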

回答 3

for i in range(len(somelist) - 1, -1, -1):
    if some_condition(somelist, i):
        del somelist[i]

您需要向后走,否则就像将您坐在的树枝锯掉一样:-)

Python 2用户:替换rangexrange以避免创建硬编码列表

for i in range(len(somelist) - 1, -1, -1):
    if some_condition(somelist, i):
        del somelist[i]

You need to go backwards otherwise it’s a bit like sawing off the tree-branch that you are sitting on :-)

Python 2 users: replace range with xrange to avoid building the whole list of indices in memory.
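A runnable sketch of the backwards loop, assuming a position-dependent condition (dropping any element equal to its predecessor). The function name remove_by_index and the sample data are hypothetical. Deleting at index i only moves elements at higher indices, which the loop has already handled.

```python
def remove_by_index(items, should_remove):
    """Delete items[i] wherever should_remove(items, i) is true, back to front."""
    for i in range(len(items) - 1, -1, -1):
        if should_remove(items, i):
            del items[i]


values = [3, 3, 1, 4, 4, 4, 2]
# Hypothetical condition: the element repeats the one just before it.
remove_by_index(values, lambda seq, i: i > 0 and seq[i] == seq[i - 1])
print(values)  # [3, 1, 4, 2]
```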


回答 4

官方Python 2教程4.2。“用于声明”

https://docs.python.org/2/tutorial/controlflow.html#for-statements

这部分文档清楚地表明:

  • 您需要复制迭代列表以对其进行修改
  • 一种方法是使用切片符号 [:]

如果需要修改循环中要迭代的序列(例如,复制选定的项目),建议您首先进行复制。遍历序列不会隐式地创建副本。切片符号使这特别方便:

>>> words = ['cat', 'window', 'defenestrate']
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']

Python 2文档7.3。“ for声明”

https://docs.python.org/2/reference/compound_stmts.html#for

这部分文档再次说明您必须进行复制,并提供了一个实际的删除示例:

注意:循环修改序列时有一个微妙之处(这仅适用于可变序列,即列表)。内部计数器用于跟踪下一个要使用的项目,并且在每次迭代时都会递增。当该计数器达到序列的长度时,循环终止。这意味着,如果套件从序列中删除当前(或上一个)项目,则下一个项目将被跳过(因为它获取已被处理的当前项目的索引)。同样,如果套件在当前项目之前按顺序插入一个项目,则下次通过循环再次处理当前项目。这可能会导致讨厌的错误,可以通过使用整个序列的一部分进行临时复制来避免这些错误,例如,

for x in a[:]:
    if x < 0: a.remove(x)

但是,我不同意此实现,因为.remove()必须迭代整个列表以找到值。

最佳解决方法

要么:

通常.append(),除非内存是一个大问题,否则默认情况下,您只想使用默认选项。

Python可以做得更好吗?

似乎可以改进此特定的Python API。例如,将其与:

两者都清楚地表明,除了使用迭代器本身之外,您无法修改要迭代的列表,并为您提供了无需复制列表即可进行修改的有效方法。

可能的基本原理是,假定Python列表是由动态数组支持的,因此,任何类型的删除都将在时间上效率低下,而Java具有更好的接口层次结构,同时包含ArrayListLinkedList实现ListIterator

Python stdlib中似乎也没有明确的链接列表类型:Python链接列表

Official Python 2 tutorial 4.2. “for Statements”

https://docs.python.org/2/tutorial/controlflow.html#for-statements

This part of the docs makes it clear that:

  • you need to make a copy of the iterated list to modify it
  • one way to do it is with the slice notation [:]

If you need to modify the sequence you are iterating over while inside the loop (for example to duplicate selected items), it is recommended that you first make a copy. Iterating over a sequence does not implicitly make a copy. The slice notation makes this especially convenient:

>>> words = ['cat', 'window', 'defenestrate']
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']

Python 2 documentation 7.3. “The for statement”

https://docs.python.org/2/reference/compound_stmts.html#for

This part of the docs says once again that you have to make a copy, and gives an actual removal example:

Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,

for x in a[:]:
    if x < 0: a.remove(x)

However, I disagree with this implementation, since .remove() has to iterate the entire list to find the value.
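As a rough illustration of that concern, the sketch below puts the documented loop next to a single-pass rewrite. Both function names are made up here; the second one is just one possible alternative, not something taken from the Python docs. Each call to remove() searches the list from the front, whereas the comprehension looks at every element once.

```python
def drop_negatives_with_remove(a):
    """The pattern from the docs: iterate over a copy, remove from the original."""
    for x in a[:]:
        if x < 0:
            a.remove(x)  # linear search for x on every removal


def drop_negatives_single_pass(a):
    """Keep the non-negative items in one pass, writing back into the same list."""
    a[:] = [x for x in a if x >= 0]


first = [3, -1, 4, -1, -5, 9]
second = list(first)
drop_negatives_with_remove(first)
drop_negatives_single_pass(second)
print(first)   # [3, 4, 9]
print(second)  # [3, 4, 9]
```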

Best workarounds

Either:

Generally you just want to go for the faster .append() option by default unless memory is a big concern.

Could Python do this better?

It seems like this particular Python API could be improved. Compare it, for instance, with:

  • Java ListIterator::remove which documents “This call can only be made once per call to next or previous”
  • C++ std::vector::erase which returns a valid iterator to the element after the one removed

both of which make it crystal clear that you cannot modify a list being iterated except with the iterator itself, and give you efficient ways to do so without copying the list.

Perhaps the underlying rationale is that Python lists are assumed to be dynamic array backed, and therefore any type of removal will be time inefficient anyways, while Java has a nicer interface hierarchy with both ArrayList and LinkedList implementations of ListIterator.

There doesn’t seem to be an explicit linked list type in the Python stdlib either: Python Linked List


回答 5

此类示例的最佳方法是列表理解

somelist = [tup for tup in somelist if determine(tup)]

如果您要做的事情比调用determine函数更复杂,我更喜欢构造一个新列表,然后随便添加一个新列表。例如

newlist = []
for tup in somelist:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        newlist.append(tup)
somelist = newlist

使用列表复制列表remove可能会使您的代码看起来更简洁,如以下答案之一所述。您绝对不应该对非常大的列表执行此操作,因为这涉及到首先复制整个列表,然后O(n) remove对要删除的每个元素执行操作,从而使其成为一种O(n^2)算法。

for tup in somelist[:]:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        somelist.remove(tup)

Your best approach for such an example would be a list comprehension

somelist = [tup for tup in somelist if determine(tup)]

In cases where you’re doing something more complex than calling a determine function, I prefer constructing a new list and simply appending to it as I go. For example

newlist = []
for tup in somelist:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        newlist.append(tup)
somelist = newlist

Iterating over a copy of the list and calling remove on the original might make your code look a little cleaner, as described in one of the other answers. You should definitely not do this for extremely large lists, since it involves first copying the entire list and then performing an O(n) remove operation for each element being removed, making this an O(n^2) algorithm.

for tup in somelist[:]:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        somelist.remove(tup)

回答 6

对于那些喜欢函数式编程的人:

somelist[:] = filter(lambda tup: not determine(tup), somelist)

要么

from itertools import ifilterfalse
somelist[:] = list(ifilterfalse(determine, somelist))

For those that like functional programming:

somelist[:] = filter(lambda tup: not determine(tup), somelist)

or

from itertools import ifilterfalse
somelist[:] = list(ifilterfalse(determine, somelist))

回答 7

我需要使用大量列表来完成此操作,并且复制列表似乎很昂贵,尤其是因为在我的情况下,与保留的项目相比,删除的数量很少。我采用了这种低级方法。

array = [lots of stuff]
arraySize = len(array)
i = 0
while i < arraySize:
    if someTest(array[i]):
        del array[i]
        arraySize -= 1
    else:
        i += 1

我不知道相对于复制大型列表而言,几次删除的效率如何。如果您有任何见解,请发表评论。

I needed to do this with a huge list, and duplicating the list seemed expensive, especially since in my case the number of deletions would be few compared to the items that remain. I took this low-level approach.

array = [lots of stuff]
arraySize = len(array)
i = 0
while i < arraySize:
    if someTest(array[i]):
        del array[i]
        arraySize -= 1
    else:
        i += 1

What I don’t know is how efficient a couple of deletes are compared to copying a large list. Please comment if you have any insight.
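One way to look into that question is to time both strategies on your own data. The sketch below uses timeit from the standard library; the list size, the is_rare test and the function names are assumptions chosen for illustration, and the numbers it prints will vary by machine, so no particular result is claimed here.

```python
import timeit


def delete_in_place(items, test):
    """The low-level approach from this answer: del by index, no copy."""
    i = 0
    while i < len(items):
        if test(items[i]):
            del items[i]
        else:
            i += 1


def rebuild(items, test):
    """Build the filtered list, then write it back into the same object."""
    items[:] = [x for x in items if not test(x)]


def is_rare(x):
    # Hypothetical test that matches only a handful of elements.
    return x % 5000 == 0


def best_time(strategy, size=20000, repeats=3):
    # Each run gets a fresh list; building that list is included in the timing.
    runs = timeit.repeat(lambda: strategy(list(range(size)), is_rare),
                         number=1, repeat=repeats)
    return min(runs)


print('del in place:', best_time(delete_in_place))
print('rebuild     :', best_time(rebuild))
```

Changing size and the share of matching elements shows how the balance shifts, since every del has to move all the elements that come after it.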


回答 8

如果当前列表项符合期望的条件,则仅创建一个新列表也可能很聪明。

所以:

newList = []
for item in originalList:
    if item != badValue:
        newList.append(item)

并且避免必须使用新的列表名称重新编码整个项目:

originalList[:] = newList

注意,来自Python文档:

copy.copy(x)返回x的浅表副本。

copy.deepcopy(x)返回x的深层副本。

It might also be smart to just build a new list, appending each item only if it meets the desired criteria.

so:

newList = []
for item in originalList:
    if item != badValue:
        newList.append(item)

and to avoid having to re-code the entire project with the new list’s name:

originalList[:] = newList

note, from Python documentation:

copy.copy(x) Return a shallow copy of x.

copy.deepcopy(x) Return a deep copy of x.
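A short sketch of what the two kinds of copy mean for a list of lists (the sample data is made up): a shallow copy gets its own outer list but shares the inner objects, while a deep copy shares nothing.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)    # new outer list, same inner lists
deep = copy.deepcopy(original)   # new outer list and new inner lists

shallow.remove([1, 2])   # only the shallow copy's outer list shrinks
original[1].append(99)   # the inner list is shared with `shallow`

print(original)  # [[1, 2], [3, 4, 99]]
print(shallow)   # [[3, 4, 99]]
print(deep)      # [[1, 2], [3, 4]]
```

For removing items from the outer list, a shallow copy is enough, because only the outer list is being changed.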


回答 9

此答案最初是为回答一个问题而编写的,此问题已被标记为重复: 从python列表中删除坐标

您的代码中有两个问题:

1)使用remove()时,您尝试删除整数,而您需要删除元组。

2)for循环将跳过列表中的项目。

让我们看一下执行代码时发生的情况:

>>> L1 = [(1,2), (5,6), (-1,-2), (1,-2)]
>>> for (a,b) in L1:
...   if a < 0 or b < 0:
...     L1.remove(a,b)
... 
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
TypeError: remove() takes exactly one argument (2 given)

第一个问题是您要将’a’和’b’都传递给remove(),但是remove()仅接受单个参数。那么,如何才能使remove()与您的列表一起正常工作?我们需要弄清楚列表中的每个元素是什么。在这种情况下,每个都是一个元组。要看到这一点,让我们访问列表的一个元素(索引从0开始):

>>> L1[1]
(5, 6)
>>> type(L1[1])
<type 'tuple'>

啊哈!L1的每个元素实际上都是一个元组。这就是我们需要传递给remove()的东西。python中的元组非常简单,只需将值括在括号中即可。“ a,b”不是元组,但是“(a,b)”是元组。因此,我们修改您的代码并再次运行:

# The remove line now includes an extra "()" to make a tuple out of "a,b"
L1.remove((a,b))

这段代码运行无误,但让我们看一下它输出的列表:

L1 is now: [(1, 2), (5, 6), (1, -2)]

为什么(1,-2)仍在您的列表中?事实证明,在使用循环迭代列表时修改列表是一个非常糟糕的主意,无需特别注意。(1,-2)保留在列表中的原因是列表中每个项目的位置在for循环的迭代之间更改。让我们看看如果将上面的代码提供给更长的列表会发生什么:

L1 = [(1,2),(5,6),(-1,-2),(1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
### Outputs:
L1 is now: [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]

从该结果可以推断,每次条件语句的值为true且列表项被删除时,循环的下一次迭代将跳过对列表中下一项的评估,因为其值现在位于不同的索引处。

最直观的解决方案是复制列表,然后遍历原始列表并仅修改副本。您可以尝试这样做:

L2 = L1
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
print L2 is L1
del L1
L1 = L2; del L2
print ("L1 is now: ", L1)

但是,输出将与之前相同:

'L1 is now: ', [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]

这是因为当我们创建L2时,python实际上并未创建新对象。相反,它仅将L2引用为与L1相同的对象。我们可以使用“ is”来验证这一点,而“ is”不同于“等于”(==)。

>>> L2=L1
>>> L1 is L2
True

我们可以使用copy.copy()制作一个真实的副本。然后一切都按预期工作:

import copy
L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
L2 = copy.copy(L1)
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
del L1
L1 = L2; del L2
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]

最后,有一个更清洁的解决方案,而不是必须制作全新的L1副本。reversed()函数:

L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
for (a,b) in reversed(L1):
    if a < 0 or b < 0 :
        L1.remove((a,b))
print ("L1 is now: ", L1)
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]

不幸的是,我无法充分描述reversed()的工作方式。当列表传递给它时,它返回一个“ listreverseiterator”对象。出于实际目的,您可以将其视为创建其参数的反向副本。这是我推荐的解决方案。

This answer was originally written in response to a question which has since been marked as duplicate: Removing coordinates from list on python

There are two problems in your code:

1) When using remove(), you attempt to remove integers whereas you need to remove a tuple.

2) The for loop will skip items in your list.

Let’s run through what happens when we execute your code:

>>> L1 = [(1,2), (5,6), (-1,-2), (1,-2)]
>>> for (a,b) in L1:
...   if a < 0 or b < 0:
...     L1.remove(a,b)
... 
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
TypeError: remove() takes exactly one argument (2 given)

The first problem is that you are passing both ‘a’ and ‘b’ to remove(), but remove() only accepts a single argument. So how can we get remove() to work properly with your list? We need to figure out what each element of your list is. In this case, each one is a tuple. To see this, let’s access one element of the list (indexing starts at 0):

>>> L1[1]
(5, 6)
>>> type(L1[1])
<type 'tuple'>

Aha! Each element of L1 is actually a tuple, so that’s what we need to pass to remove(). Tuples in Python are easy to write: inside a function call, “a, b” is read as two separate arguments, whereas “(a, b)” is a single tuple. So we modify your code and run it again:

# The remove line now includes an extra "()" to make a tuple out of "a,b"
L1.remove((a,b))

This code runs without any error, but let’s look at the list it outputs:

L1 is now: [(1, 2), (5, 6), (1, -2)]

Why is (1,-2) still in your list? It turns out modifying the list while using a loop to iterate over it is a very bad idea without special care. The reason that (1, -2) remains in the list is that the locations of each item within the list changed between iterations of the for loop. Let’s look at what happens if we feed the above code a longer list:

L1 = [(1,2),(5,6),(-1,-2),(1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
### Outputs:
L1 is now: [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]

As you can infer from that result, every time that the conditional statement evaluates to true and a list item is removed, the next iteration of the loop will skip evaluation of the next item in the list because its values are now located at different indices.

The most intuitive solution is to copy the list, then iterate over the original list and only modify the copy. You can try doing so like this:

L2 = L1
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
print L2 is L1
del L1
L1 = L2; del L2
print ("L1 is now: ", L1)

However, the output will be identical to before:

'L1 is now: ', [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]

This is because when we created L2, python did not actually create a new object. Instead, it merely referenced L2 to the same object as L1. We can verify this with ‘is’ which is different from merely “equals” (==).

>>> L2=L1
>>> L1 is L2
True

We can make a true copy using copy.copy(). Then everything works as expected:

import copy
L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
L2 = copy.copy(L1)
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
del L1
L1 = L2; del L2
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]

Finally, there is one cleaner solution than having to make an entirely new copy of L1. The reversed() function:

L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
for (a,b) in reversed(L1):
    if a < 0 or b < 0 :
        L1.remove((a,b))
print ("L1 is now: ", L1)
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]

reversed() does not actually copy anything: when a list is passed to it, it returns a ‘listreverseiterator’ object (named list_reverseiterator in Python 3) that walks the same list from the last index down to the first. Deleting the item at the current position only shifts elements that have already been visited, so nothing is skipped. This is the solution I recommend.
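A small Python 3 sketch of the reversed() variant, using the same data as above. It also checks that reversed() hands back an iterator rather than a list, so no second copy of the data is built.

```python
L1 = [(1, 2), (5, 6), (-1, -2), (1, -2), (3, 4), (5, 7),
      (-4, 4), (2, 1), (-3, -3), (5, -1), (0, 6)]

rev = reversed(L1)
print(isinstance(rev, list))  # False: it is an iterator over L1 itself

# Walk from the last element to the first, removing as we go.
for (a, b) in rev:
    if a < 0 or b < 0:
        L1.remove((a, b))

print(L1)  # [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]
```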


回答 10

如果要在迭代过程中执行其他任何操作,最好同时获得索引(这可以保证您能够引用它,例如,如果您有字典列表)和实际的列表项内容。

inlist = [{'field1':10, 'field2':20}, {'field1':30, 'field2':15}]
xlist = []  # indices of the items to delete
for idx, i in enumerate(inlist):
    # do some stuff with i['field1']
    if somecondition:  # placeholder for your own test
        xlist.append(idx)
for i in reversed(xlist): del inlist[i]

enumerate使您可以立即访问项目和索引。reversed这样一来,您以后要删除的索引就不会改变。

If you want to do anything else during the iteration, it may be nice to get both the index (which guarantees you being able to reference it, for example if you have a list of dicts) and the actual list item contents.

inlist = [{'field1':10, 'field2':20}, {'field1':30, 'field2':15}]
xlist = []  # indices of the items to delete
for idx, i in enumerate(inlist):
    # do some stuff with i['field1']
    if somecondition:  # placeholder for your own test
        xlist.append(idx)
for i in reversed(xlist): del inlist[i]

enumerate gives you access to the item and the index at once. reversed is so that the indices that you’re going to later delete don’t change on you.
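Here is a fully runnable sketch of the same idea. The extra dictionaries, the running total and the condition (field2 smaller than field1) are invented for illustration. Deleting the collected indices from the highest to the lowest keeps the remaining indices valid.

```python
inlist = [
    {'field1': 10, 'field2': 20},
    {'field1': 30, 'field2': 15},
    {'field1': 5, 'field2': 50},
    {'field1': 40, 'field2': 1},
]

to_delete = []  # indices collected during the first pass
total = 0
for idx, item in enumerate(inlist):
    total += item['field1']              # "some stuff" done with every item
    if item['field2'] < item['field1']:  # hypothetical removal condition
        to_delete.append(idx)

# Highest index first, so earlier deletions do not shift later targets.
for idx in reversed(to_delete):
    del inlist[idx]

print(total)      # 85
print(to_delete)  # [1, 3]
print(inlist)     # [{'field1': 10, 'field2': 20}, {'field1': 5, 'field2': 50}]
```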


回答 11

您可能要使用filter()available作为内置函数。

欲了解更多详情,请点击这里

You might want to use the built-in filter() function.

For more details check here


回答 12

这里的大多数答案都希望您创建列表的副本。我有一个用例,其中的列表很长(110K个项),而继续缩小列表会更明智。

首先,您需要将while循环替换为foreach循环

i = 0
while i < len(somelist):
    if determine(somelist[i]):
         del somelist[i]
    else:
        i += 1

i在if块中,不会更改的值,因为一旦删除了旧项,您将希望从相同的索引中获取新项的值。

Most of the answers here want you to create a copy of the list. I had a use case where the list was quite long (110K items) and it was smarter to keep reducing the list instead.

First of all, you’ll need to replace the foreach loop with a while loop:

i = 0
while i < len(somelist):
    if determine(somelist[i]):
         del somelist[i]
    else:
        i += 1

The value of i is not changed in the if block because you’ll want to get value of the new item FROM THE SAME INDEX, once the old item is deleted.


回答 13

您可以尝试反向进行循环,因此对于some_list,您将执行以下操作:

list_len = len(some_list)
for i in range(list_len):
    reverse_i = list_len - 1 - i
    cur = some_list[reverse_i]

    # some logic with cur element

    if some_condition:
        some_list.pop(reverse_i)

这样,索引是对齐的,并且不会受到列表更新的影响(无论是否弹出cur元素)。

You can try for-looping in reverse so for some_list you’ll do something like:

list_len = len(some_list)
for i in range(list_len):
    reverse_i = list_len - 1 - i
    cur = some_list[reverse_i]

    # some logic with cur element

    if some_condition:
        some_list.pop(reverse_i)

This way the index is aligned and doesn’t suffer from the list updates (regardless whether you pop cur element or not).


回答 14

一种可能的解决方案,如果您不仅要删除某些内容,而且还希望在单个循环中对所有元素执行某些操作,则很有用:

alist = ['good', 'bad', 'good', 'bad', 'good']
i = 0
for x in alist[:]:
    if x == 'bad':
        alist.pop(i)
        i -= 1
    # do something cool with x or just print x
    print(x)
    i += 1

One possible solution, useful if you want not only to remove some things but also to do something with all elements in a single loop:

alist = ['good', 'bad', 'good', 'bad', 'good']
i = 0
for x in alist[:]:
    if x == 'bad':
        alist.pop(i)
        i -= 1
    # do something cool with x or just print x
    print(x)
    i += 1

回答 15

我需要做类似的事情,在我的情况下,问题是内存-我需要将列表中的多个数据集对象合并,然后再对它们做一些事情,作为一个新对象,并且需要摆脱要合并的每个条目避免重复所有操作并浪费内存。就我而言,将对象放在字典中而不是列表中可以很好地工作:


k = range(5)
v = ['a','b','c','d','e']
d = {key:val for key,val in zip(k, v)}

print d
for i in range(5):
    print d[i]
    d.pop(i)
print d


I needed to do something similar and in my case the problem was memory – I needed to merge multiple dataset objects within a list, after doing some stuff with them, as a new object, and needed to get rid of each entry I was merging to avoid duplicating all of them and blowing up memory. In my case having the objects in a dictionary instead of a list worked fine:


k = range(5)
v = ['a','b','c','d','e']
d = {key:val for key,val in zip(k, v)}

print d
for i in range(5):
    print d[i]
    d.pop(i)
print d
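The snippet above is Python 2 and loops over a fixed range of keys. A Python 3 sketch of the same idea follows, with made-up values. It iterates over a snapshot of the keys, because popping from a dictionary while iterating over it directly raises RuntimeError.

```python
d = dict(zip(range(5), ['a', 'b', 'c', 'd', 'e']))

merged = []
for key in list(d):            # list(d) is a snapshot of the keys
    merged.append(d.pop(key))  # each entry is released as soon as it is used

print(merged)  # ['a', 'b', 'c', 'd', 'e']
print(d)       # {}

# Iterating over the dict itself while popping fails.
d2 = {0: 'a', 1: 'b'}
raised = False
try:
    for key in d2:
        d2.pop(key)
except RuntimeError:
    raised = True
print(raised)  # True
```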



回答 16

TLDR:

我写了一个库,使您可以执行此操作:

from fluidIter import FluidIterable
fSomeList = FluidIterable(someList)  
for tup in fSomeList:
    if determine(tup):
        # remove 'tup' without "breaking" the iteration
        fSomeList.remove(tup)
        # tup has also been removed from 'someList'
        # as well as 'fSomeList'

如果可能的话,最好使用另一种方法,该方法不需要在迭代时修改可迭代对象,但是对于某些算法,可能不是那么简单。因此,如果您确定确实希望原始问题中描述的代码模式,则可以。

应该适用于所有可变序列,而不仅仅是列表。


完整答案:

编辑:此答案中的最后一个代码示例给出了一个用例,说明了为什么有时您可能想在适当位置修改列表而不是使用列表理解。答案的第一部分用作如何在适当位置修改数组的教程。

从senderle的答案(针对相关问题)可以得出解决方案。这说明了在遍历已修改的列表时如何更新数组索引。即使解决方案列表被修改,下面的解决方案旨在正确跟踪数组索引。

fluidIter.py这里 下载https://github.com/alanbacon/FluidIterator,它只是一个文件,因此无需安装git。没有安装程序,因此您需要确保该文件位于您自己的python路径中。该代码是为python 3编写的,未经python 2的测试。

from fluidIter import FluidIterable
l = [0,1,2,3,4,5,6,7,8]  
fluidL = FluidIterable(l)                       
for i in fluidL:
    print('initial state of list on this iteration: ' + str(fluidL)) 
    print('current iteration value: ' + str(i))
    print('popped value: ' + str(fluidL.pop(2)))
    print(' ')

print('Final List Value: ' + str(l))

这将产生以下输出:

initial state of list on this iteration: [0, 1, 2, 3, 4, 5, 6, 7, 8]
current iteration value: 0
popped value: 2

initial state of list on this iteration: [0, 1, 3, 4, 5, 6, 7, 8]
current iteration value: 1
popped value: 3

initial state of list on this iteration: [0, 1, 4, 5, 6, 7, 8]
current iteration value: 4
popped value: 4

initial state of list on this iteration: [0, 1, 5, 6, 7, 8]
current iteration value: 5
popped value: 5

initial state of list on this iteration: [0, 1, 6, 7, 8]
current iteration value: 6
popped value: 6

initial state of list on this iteration: [0, 1, 7, 8]
current iteration value: 7
popped value: 7

initial state of list on this iteration: [0, 1, 8]
current iteration value: 8
popped value: 8

Final List Value: [0, 1]

上面我们pop在流体列表对象上使用了该方法。其他常见的迭代方法也实现,诸如del fluidL[i].remove.insert.append.extend。还可以使用切片修改列表(sort并且reverse未实现方法)。

唯一的条件是,您必须仅在适当位置修改列表,如果在任何时候fluidL或将l其重新分配给其他列表对象,代码将无法工作。原始fluidL对象仍将由for循环使用,但超出范围供我们修改。

fluidL[2] = 'a'   # is OK
fluidL = [0, 1, 'a', 3, 4, 5, 6, 7, 8]  # is not OK

如果要访问列表的当前索引值,则不能使用枚举,因为它仅计算for循环已运行了多少次。相反,我们将直接使用迭代器对象。

fluidArr = FluidIterable([0,1,2,3])
# get iterator first so can query the current index
fluidArrIter = fluidArr.__iter__()
for i, v in enumerate(fluidArrIter):
    print('enum: ', i)
    print('current val: ', v)
    print('current ind: ', fluidArrIter.currentIndex)
    print(fluidArr)
    fluidArr.insert(0,'a')
    print(' ')

print('Final List Value: ' + str(fluidArr))

这将输出以下内容:

enum:  0
current val:  0
current ind:  0
[0, 1, 2, 3]

enum:  1
current val:  1
current ind:  2
['a', 0, 1, 2, 3]

enum:  2
current val:  2
current ind:  4
['a', 'a', 0, 1, 2, 3]

enum:  3
current val:  3
current ind:  6
['a', 'a', 'a', 0, 1, 2, 3]

Final List Value: ['a', 'a', 'a', 'a', 0, 1, 2, 3]

FluidIterable类只是提供了原始列表对象的包装。可以将原始对象作为流体对象的属性进行访问,如下所示:

originalList = fluidArr.fixedIterable

可以if __name__ is "__main__":在底部的部分找到更多示例/测试fluidIter.py。这些值得一看,因为它们解释了在各种情况下会发生什么。例如:使用切片替换列表的大部分。或在嵌套for循环中使用(并修改)相同的可迭代对象。

就像我刚开始所说的那样:这是一个复杂的解决方案,将损害代码的可读性,并使调试更加困难。因此,应该首先考虑其他解决方案,例如David Raznick的答案中提到的列表理解。话虽如此,但我发现此类对我有用并且比跟踪需要删除的元素的索引更容易使用的时代。


编辑:如评论中所述,此答案并未真正提出此方法可提供解决方案的问题。我将在这里尝试解决:

列表理解提供了一种生成新列表的方法,但是这些方法倾向于孤立地查看每个元素,而不是整个列表的当前状态。

newList = [i for i in oldList if testFunc(i)]

但是,如果的结果testFunc取决于已添加的元素newList怎么办?还是仍然oldList可以添加其中的元素?也许仍然可以使用列表理解的方法,但是它会开始失去它的优雅感,对我来说,修改列表很容易。

下面的代码是遭受上述问题的一种算法示例。该算法将减少列表,以使任何元素都不是其他任何元素的倍数。

randInts = [70, 20, 61, 80, 54, 18, 7, 18, 55, 9]
fRandInts = FluidIterable(randInts)
fRandIntsIter = fRandInts.__iter__()
# for each value in the list (outer loop)
# test against every other value in the list (inner loop)
for i in fRandIntsIter:
    print(' ')
    print('outer val: ', i)
    innerIntsIter = fRandInts.__iter__()
    for j in innerIntsIter:
        innerIndex = innerIntsIter.currentIndex
        # skip the element that the outloop is currently on
        # because we don't want to test a value against itself
        if not innerIndex == fRandIntsIter.currentIndex:
            # if the test element, j, is a multiple 
            # of the reference element, i, then remove 'j'
            if j%i == 0:
                print('remove val: ', j)
                # remove element in place, without breaking the
                # iteration of either loop
                del fRandInts[innerIndex]
            # end if multiple, then remove
        # end if not the same value as outer loop
    # end inner loop
# end outerloop

print('')
print('final list: ', randInts)

输出和最终的简化列表如下所示

outer val:  70

outer val:  20
remove val:  80

outer val:  61

outer val:  54

outer val:  18
remove val:  54
remove val:  18

outer val:  7
remove val:  70

outer val:  55

outer val:  9
remove val:  18

final list:  [20, 61, 7, 55, 9]

TLDR:

I wrote a library that allows you to do this:

from fluidIter import FluidIterable
fSomeList = FluidIterable(someList)  
for tup in fSomeList:
    if determine(tup):
        # remove 'tup' without "breaking" the iteration
        fSomeList.remove(tup)
        # tup has also been removed from 'someList'
        # as well as 'fSomeList'

If possible, it’s best to use another method that doesn’t require modifying your iterable while iterating over it, but for some algorithms it might not be that straightforward. So if you are sure that you really do want the code pattern described in the original question, it is possible.

Should work on all mutable sequences not just lists.


Full answer:

Edit: The last code example in this answer gives a use case for why you might sometimes want to modify a list in place rather than use a list comprehension. The first part of the answer serves as a tutorial on how an array can be modified in place.

The solution follows on from this answer (for a related question) from senderle, which explains how the array index is updated while iterating through a list that has been modified. The solution below is designed to correctly track the array index even if the list is modified.
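The index behaviour being referred to can be reproduced with a plain list iterator. This sketch uses only built-ins and does not involve fluidIter; it shows that the iterator just remembers a position, so a deletion in front of that position skips an element and an insertion makes one appear twice.

```python
# Deleting before the iterator's position skips an element.
items = ['a', 'b', 'c', 'd']
it = iter(items)
first = next(it)   # 'a' (the iterator now points at index 1)
del items[0]       # list becomes ['b', 'c', 'd']
second = next(it)  # index 1 is now 'c', so 'b' is never seen
print(first, second)  # a c

# Inserting before the iterator's position repeats an element.
items2 = ['a', 'b', 'c']
it2 = iter(items2)
x = next(it2)          # 'a'
items2.insert(0, 'z')  # list becomes ['z', 'a', 'b', 'c']
y = next(it2)          # index 1 is 'a' again
print(x, y)  # a a
```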

Download fluidIter.py from https://github.com/alanbacon/FluidIterator. It is just a single file, so there is no need to install git. There is no installer, so you will need to make sure yourself that the file is on the Python path. The code has been written for Python 3 and is untested on Python 2.

from fluidIter import FluidIterable
l = [0,1,2,3,4,5,6,7,8]  
fluidL = FluidIterable(l)                       
for i in fluidL:
    print('initial state of list on this iteration: ' + str(fluidL)) 
    print('current iteration value: ' + str(i))
    print('popped value: ' + str(fluidL.pop(2)))
    print(' ')

print('Final List Value: ' + str(l))

This will produce the following output:

initial state of list on this iteration: [0, 1, 2, 3, 4, 5, 6, 7, 8]
current iteration value: 0
popped value: 2

initial state of list on this iteration: [0, 1, 3, 4, 5, 6, 7, 8]
current iteration value: 1
popped value: 3

initial state of list on this iteration: [0, 1, 4, 5, 6, 7, 8]
current iteration value: 4
popped value: 4

initial state of list on this iteration: [0, 1, 5, 6, 7, 8]
current iteration value: 5
popped value: 5

initial state of list on this iteration: [0, 1, 6, 7, 8]
current iteration value: 6
popped value: 6

initial state of list on this iteration: [0, 1, 7, 8]
current iteration value: 7
popped value: 7

initial state of list on this iteration: [0, 1, 8]
current iteration value: 8
popped value: 8

Final List Value: [0, 1]

Above we have used the pop method on the fluid list object. Other common iterable methods are also implemented such as del fluidL[i], .remove, .insert, .append, .extend. The list can also be modified using slices (sort and reverse methods are not implemented).

The only condition is that you must only modify the list in place. If at any point fluidL or l were reassigned to a different list object, the code would not work: the original fluidL object would still be used by the for loop, but we would no longer have a name through which to modify it.

i.e.

fluidL[2] = 'a'   # is OK
fluidL = [0, 1, 'a', 3, 4, 5, 6, 7, 8]  # is not OK

If we want to access the current index value of the list we cannot use enumerate, as this only counts how many times the for loop has run. Instead we will use the iterator object directly.

fluidArr = FluidIterable([0,1,2,3])
# get iterator first so can query the current index
fluidArrIter = fluidArr.__iter__()
for i, v in enumerate(fluidArrIter):
    print('enum: ', i)
    print('current val: ', v)
    print('current ind: ', fluidArrIter.currentIndex)
    print(fluidArr)
    fluidArr.insert(0,'a')
    print(' ')

print('Final List Value: ' + str(fluidArr))

This will output the following:

enum:  0
current val:  0
current ind:  0
[0, 1, 2, 3]

enum:  1
current val:  1
current ind:  2
['a', 0, 1, 2, 3]

enum:  2
current val:  2
current ind:  4
['a', 'a', 0, 1, 2, 3]

enum:  3
current val:  3
current ind:  6
['a', 'a', 'a', 0, 1, 2, 3]

Final List Value: ['a', 'a', 'a', 'a', 0, 1, 2, 3]

The FluidIterable class just provides a wrapper for the original list object. The original object can be accessed as a property of the fluid object like so:

originalList = fluidArr.fixedIterable

More examples and tests can be found in the if __name__ is "__main__": section at the bottom of fluidIter.py. These are worth looking at because they explain what happens in various situations, such as replacing a large section of the list using a slice, or using (and modifying) the same iterable in nested for loops.

As I stated to start with: this is a complicated solution that will hurt the readability of your code and make it more difficult to debug. Therefore other solutions such as the list comprehensions mentioned in David Raznick’s answer should be considered first. That being said, I have found times where this class has been useful to me and has been easier to use than keeping track of the indices of elements that need deleting.


Edit: As mentioned in the comments, this answer does not really present a problem for which this approach provides a solution. I will try to address that here:

List comprehensions provide a way to generate a new list but these approaches tend to look at each element in isolation rather than the current state of the list as a whole.

i.e.

newList = [i for i in oldList if testFunc(i)]

But what if the result of testFunc depends on the elements that have already been added to newList? Or on the elements still in oldList that might be added next? There might still be a way to use a list comprehension, but it will begin to lose its elegance, and for me it feels easier to modify a list in place.

The code below is one example of an algorithm that suffers from the above problem. The algorithm will reduce a list so that no element is a multiple of any other element.

randInts = [70, 20, 61, 80, 54, 18, 7, 18, 55, 9]
fRandInts = FluidIterable(randInts)
fRandIntsIter = fRandInts.__iter__()
# for each value in the list (outer loop)
# test against every other value in the list (inner loop)
for i in fRandIntsIter:
    print(' ')
    print('outer val: ', i)
    innerIntsIter = fRandInts.__iter__()
    for j in innerIntsIter:
        innerIndex = innerIntsIter.currentIndex
        # skip the element that the outloop is currently on
        # because we don't want to test a value against itself
        if not innerIndex == fRandIntsIter.currentIndex:
            # if the test element, j, is a multiple 
            # of the reference element, i, then remove 'j'
            if j%i == 0:
                print('remove val: ', j)
                # remove element in place, without breaking the
                # iteration of either loop
                del fRandInts[innerIndex]
            # end if multiple, then remove
        # end if not the same value as outer loop
    # end inner loop
# end outerloop

print('')
print('final list: ', randInts)

The output and the final reduced list are shown below

outer val:  70

outer val:  20
remove val:  80

outer val:  61

outer val:  54

outer val:  18
remove val:  54
remove val:  18

outer val:  7
remove val:  70

outer val:  55

outer val:  9
remove val:  18

final list:  [20, 61, 7, 55, 9]

Answer 17

List comprehensions are the most effective method, as many other answers show. Another good option is to get an iterator through filter.

filter receives a function and a sequence. It applies the function to each element in turn and keeps or discards the element depending on whether the function returns True or False.

Here is an example (getting the odd numbers from a tuple):

list(filter(lambda x:x%2==1, (1, 2, 4, 5, 6, 9, 10, 15)))  
# result: [1, 5, 9, 15]

Note: you do not have to convert the result to a list. Iterators are sometimes better than sequences.

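To make the point about iterators concrete, here is a minimal sketch assuming Python 3, where filter returns a lazy iterator. The variable names are made up for illustration.

```python
numbers = (1, 2, 4, 5, 6, 9, 10, 15)

# In Python 3, filter() returns a lazy iterator, not a list.
odds = filter(lambda x: x % 2 == 1, numbers)

first_odd = next(odds)        # pulls only the first matching element
remaining_odds = list(odds)   # consumes whatever is left

print(first_odd)        # 1
print(remaining_odds)   # [5, 9, 15]
```

Once the iterator is exhausted it yields nothing more, so convert it to a list if you need to go over the result more than once.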

Answer 18

A for loop over a list iterates by index.

Suppose you have a list stored in a variable called lis, and you remove items from that same list while looping over it:

lis = [5, 7, 13, 29, 35, 65, 91]
       0  1   2   3   4   5   6

During the 5th iteration (index 4), the number 35 is not prime, so you remove it from the list:

lis.remove(y)

The next value (65) then moves down to the previous index:

lis = [5, 7, 13, 29, 65, 91]
       0  1   2   3   4   5

The iteration at index 4 is already done, so the pointer moves on to index 5.

That is why your loop never visits 65: it has moved into the index that was just processed.

Also, you should not assign the list to another variable and expect a copy; the new name still references the original list:

ite = lis  # don't do this: it creates a reference, not a copy

Instead, make a copy of the list using lis[:] and iterate over the copy.

Now the result will be:

[5, 7, 13, 29]

The problem is that removing a value from a list during iteration shifts the indices of the elements after it.

You can try a comprehension instead, which works with any iterable, such as a list, tuple, dict, or string.

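The skipped element can be reproduced with a short sketch. The is_prime helper and the names visited, skipping, and copied are assumptions introduced for this illustration; they are not from the original answer.

```python
def is_prime(n):
    """Trial-division primality test (helper assumed for this sketch)."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# Removing from the list being iterated: 65 is never visited.
skipping = [5, 7, 13, 29, 35, 65, 91]
visited = []
for y in skipping:
    visited.append(y)
    if not is_prime(y):
        skipping.remove(y)

print(visited)   # [5, 7, 13, 29, 35, 91]
print(skipping)  # [5, 7, 13, 29, 65]

# Iterating over a shallow copy while removing from the original.
copied = [5, 7, 13, 29, 35, 65, 91]
for y in copied[:]:
    if not is_prime(y):
        copied.remove(y)

print(copied)    # [5, 7, 13, 29]
```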

Answer 19

If you want to delete elements from a list while iterating, use a while-loop so you can alter the current index and end index after each deletion.

Example:

i = 0
length = len(list1)

while i < length:
    if condition:  # placeholder: some test on list1[i]
        del list1[i]  # delete by index; remove() would delete the first equal value
        i -= 1
        length -= 1

    i += 1
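
As a runnable variant of the same idea, here is a small sketch. The function name remove_in_place and its should_remove parameter are hypothetical, and the index only advances when nothing was deleted, which avoids the decrement-then-increment step.

```python
def remove_in_place(items, should_remove):
    """Delete every element for which should_remove(element) is true."""
    i = 0
    while i < len(items):
        if should_remove(items[i]):
            del items[i]      # the next element slides into position i
        else:
            i += 1            # advance only when nothing was deleted
    return items

list1 = [4, 8, 3, 8, 12, 5]
remove_in_place(list1, lambda x: x % 4 == 0)
print(list1)  # [3, 5]
```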

Answer 20

The other answers are correct that it is usually a bad idea to delete from a list that you’re iterating. Reverse iterating avoids the pitfalls, but it is much more difficult to follow code that does that, so usually you’re better off using a list comprehension or filter.

There is, however, one case where it is safe to remove elements from a sequence that you are iterating: if you’re only removing one item while you’re iterating. This can be ensured using a return or a break. For example:

for i, item in enumerate(lst):
    if item % 4 == 0:
        foo(item)
        del lst[i]
        break

This is often easier to understand than a list comprehension when you’re doing some operations with side effects on the first item in a list that meets some condition and then removing that item from the list immediately after.


Answer 21

I can think of three approaches to solve your problem. As an example, I will use the list of tuples somelist = [(1,2,3), (4,5,6), (3,6,6), (7,8,9), (15,0,0), (10,11,12)]. The condition I chose is that the sum of the elements of a tuple equals 15. The final list should contain only those tuples whose sum is not equal to 15.

This example is arbitrary. Feel free to change the list of tuples and the condition.

Method 1. Use the framework that you suggested (where one fills in code inside a for loop). I use del to delete each tuple that meets the condition. However, this method will miss a tuple that satisfies the condition if two consecutive tuples meet it, as (4,5,6) and (3,6,6) do here.

for tup in somelist:
    if sum(tup) == 15:
        del somelist[somelist.index(tup)]

print(somelist)
# [(1, 2, 3), (3, 6, 6), (7, 8, 9), (10, 11, 12)]

Method 2. Construct a new list containing the elements (tuples) for which the given condition is not met (this is the same as removing the elements for which it is met). The code is:

newlist1 = [somelist[tup] for tup in range(len(somelist)) if(sum(somelist[tup])!=15)]

print(newlist1)
# [(1, 2, 3), (7, 8, 9), (10, 11, 12)]

Method 3. Find the indices where the given condition is met, and then remove the elements (tuples) at those indices. The code is:

indices = [i for i in range(len(somelist)) if(sum(somelist[i])==15)]
newlist2 = [tup for j, tup in enumerate(somelist) if j not in indices]

print(newlist2)
# [(1, 2, 3), (7, 8, 9), (10, 11, 12)]

Methods 1 and 2 are faster than method 3, and methods 2 and 3 are more reliable than method 1 because they do not skip elements. I prefer method 2. For the example above, time(method1) : time(method2) : time(method3) = 1 : 1 : 1.7.

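The weakness of deleting inside the loop shows up clearly when two matching tuples sit next to each other. The following is a small sketch with a shortened list chosen for illustration; the names buggy and filtered are not from the original answer.

```python
somelist = [(4, 5, 6), (3, 6, 6), (1, 2, 3)]

# Deleting while iterating: (3, 6, 6) slides into index 0 and is skipped.
buggy = list(somelist)
for tup in buggy:
    if sum(tup) == 15:
        del buggy[buggy.index(tup)]
print(buggy)     # [(3, 6, 6), (1, 2, 3)]

# Building a new list from the elements to keep does not skip anything.
filtered = [tup for tup in somelist if sum(tup) != 15]
print(filtered)  # [(1, 2, 3)]
```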

Answer 22

For anything that has the potential to be really big, I use the following.

import numpy as np

orig_list = np.array([1, 2, 3, 4, 5, 100, 8, 13])

remove_me = [100, 1]

# np.delete expects indices, not values, so select by value with a boolean mask
cleaned = orig_list[~np.isin(orig_list, remove_me)]
print(cleaned)

That should be significantly faster than anything else.


Answer 23

In some situations, where you’re doing more than simply filtering a list one item at a time, you want the sequence you are iterating over to change during the iteration.

Here is an example where copying the list beforehand is incorrect, reverse iteration is impossible and a list comprehension is also not an option.

""" Sieve of Eratosthenes """

def generate_primes(n):
    """ Generates all primes less than n. """
    primes = list(range(2,n))
    idx = 0
    while idx < len(primes):
        p = primes[idx]
        for multiple in range(p+p, n, p):
            try:
                primes.remove(multiple)
            except ValueError:
                pass #EAFP
        idx += 1
        yield p

Answer 24

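If you will use the list again later, you can set the unwanted elements to None and skip them in the later loop, like this:

for idx, elem in enumerate(li):
    if condition:  # placeholder: whatever test marks elem for removal
        li[idx] = None  # assign through the index; rebinding the loop variable would not change the list

for elem in li:
    if elem is None:
        continue

This way you don’t need to copy the list, and it’s easier to understand.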


Answer 25

Suppose you have a list of numbers and want to remove all those that are divisible by 3:

list_number =[i for i in range(100)]

Using a list comprehension, which creates a new list and allocates new memory:

new_list =[i for i in list_number if i%3!=0]

Using filter with a lambda, which also creates a new list and consumes memory:

new_list = list(filter(lambda x:x%3!=0, list_number))

Modifying the existing list without allocating a new one. Note that this only works here because no two multiples of 3 are adjacent; in general, removing during iteration skips the element after each removal:

for index, value in enumerate(list_number):
    if list_number[index]%3==0:
        list_number.remove(value)
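
If the goal is to keep the same list object rather than to save memory, slice assignment is a safer alternative. This is a sketch, not part of the original answer; it still builds a temporary list, and the name same_object is made up to show that other references see the change.

```python
list_number = list(range(20))
same_object = list_number  # second name for the same list

# Slice assignment replaces the contents of the existing list object.
list_number[:] = [n for n in list_number if n % 3 != 0]

print(list_number)
print(same_object is list_number)  # True
```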

How do I get the home directory in Python?

Question: How do I get the home directory in Python?

I need to get the location of the home directory of the current logged-on user. Currently, I’ve been using the following on Linux:

os.getenv("HOME")

However, this does not work on Windows. What is the correct cross-platform way to do this?


Answer 0

You want to use os.path.expanduser.
This will ensure it works on all platforms:

from os.path import expanduser
home = expanduser("~")

If you’re on Python 3.5+ you can use pathlib.Path.home():

from pathlib import Path
home = str(Path.home())
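
A short sketch of how the two results are typically used. The ".config" name is only an example, not something the original answer mentions.

```python
import os
from pathlib import Path

home_str = os.path.expanduser("~")   # plain str
home_path = Path.home()              # pathlib.Path object

# Build a path below the home directory (".config" is just an example name).
config_dir = home_path / ".config"

print(home_str)
print(config_dir)
```

The pathlib form is convenient when you go on to join or inspect paths; the string form is enough when you only need to pass the directory to another API.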

Answer 1

Here is a Linux way, cd .., if you need to use that instead. Note: if you are in a subdirectory, it will take you up to its parent directory.


How do I get file creation and modification date/times in Python?

Question: How do I get file creation and modification date/times in Python?

I have a script that needs to do some stuff based on file creation & modification dates but has to run on Linux & Windows.

What’s the best cross-platform way to get file creation & modification date/times in Python?


Answer 0

Getting some sort of modification date in a cross-platform way is easy – just call os.path.getmtime(path) and you’ll get the Unix timestamp of when the file at path was last modified.

Getting file creation dates, on the other hand, is fiddly and platform-dependent, differing even between the three big OSes:

Putting this all together, cross-platform code should look something like this…

import os
import platform

def creation_date(path_to_file):
    """
    Try to get the date that a file was created, falling back to when it was
    last modified if that isn't possible.
    See http://stackoverflow.com/a/39501288/1709587 for explanation.
    """
    if platform.system() == 'Windows':
        return os.path.getctime(path_to_file)
    else:
        stat = os.stat(path_to_file)
        try:
            return stat.st_birthtime
        except AttributeError:
            # We're probably on Linux. No easy way to get creation dates here,
            # so we'll settle for when its content was last modified.
            return stat.st_mtime
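
The timestamp returned by a function like this is a float of seconds since the epoch. The sketch below uses a simplified stand-in, creation_or_mtime (a hypothetical name), that only does the st_birthtime fallback and skips the Windows branch, to show how the value converts to a datetime.

```python
import os
import tempfile
from datetime import datetime

def creation_or_mtime(path):
    """Return st_birthtime where the platform exposes it, else st_mtime."""
    st = os.stat(path)
    return getattr(st, "st_birthtime", st.st_mtime)

# Create a throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name

timestamp = creation_or_mtime(tmp_path)
created = datetime.fromtimestamp(timestamp)  # naive local time
print(created)

os.remove(tmp_path)
```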

Answer 1

You have a couple of choices. For one, you can use the os.path.getmtime and os.path.getctime functions:

import os.path, time
print("last modified: %s" % time.ctime(os.path.getmtime(file)))
print("created: %s" % time.ctime(os.path.getctime(file)))

Your other option is to use os.stat:

import os, time
(mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime) = os.stat(file)
print("last modified: %s" % time.ctime(mtime))

Note: ctime() does not refer to creation time on *nix systems, but rather the last time the inode data changed. (thanks to kojiro for making that fact more clear in the comments by providing a link to an interesting blog post)

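One way to see that st_ctime is not tied to the modification time is to backdate a file's mtime with os.utime and compare. This is a sketch using a temporary file; the value 1_000_000_000 is an arbitrary old timestamp chosen for illustration.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp_path = tmp.name

# Backdate access and modification times to an arbitrary moment in 2001.
old = 1_000_000_000
os.utime(tmp_path, (old, old))

st = os.stat(tmp_path)
print(st.st_mtime)                # 1000000000.0
print(st.st_ctime > st.st_mtime)  # True: st_ctime was not backdated

os.remove(tmp_path)
```

On *nix the call to os.utime itself counts as a metadata change and refreshes st_ctime; on Windows st_ctime reports the creation time. Either way it stays near the present while st_mtime moves to 2001.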

Answer 2

The best function to use for this is os.path.getmtime(). Internally, this just uses os.stat(filename).st_mtime.

The datetime module is the best choice for manipulating timestamps, so you can get the modification date as a datetime object like this:

import os
import datetime
def modification_date(filename):
    t = os.path.getmtime(filename)
    return datetime.datetime.fromtimestamp(t)

Usage example:

>>> d = modification_date('/var/log/syslog')
>>> print(d)
2009-10-06 10:50:01
>>> print(repr(d))
datetime.datetime(2009, 10, 6, 10, 50, 1)
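
The function above returns a naive datetime in local time. If you need an unambiguous value, a timezone-aware variant can be sketched as follows; the name modification_date_utc is hypothetical, and the file's mtime is set explicitly so the result is predictable.

```python
import os
import tempfile
from datetime import datetime, timezone

def modification_date_utc(filename):
    """Return the file's mtime as a timezone-aware datetime in UTC."""
    t = os.path.getmtime(filename)
    return datetime.fromtimestamp(t, tz=timezone.utc)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp_path = tmp.name

os.utime(tmp_path, (1_000_000_000, 1_000_000_000))  # fixed atime and mtime
d = modification_date_utc(tmp_path)
print(d)  # 2001-09-09 01:46:40+00:00

os.remove(tmp_path)
```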

Answer 3

os.stat https://docs.python.org/2/library/stat.html#module-stat

Edit: In newer code you should probably use os.path.getmtime() (thanks, Christian Oudard),
but note that it returns a floating-point time_t value with fractional seconds (if your OS supports it).


Answer 4

There are two methods to get the mod time, os.path.getmtime() or os.stat(), but the ctime is not reliable cross-platform (see below).

os.path.getmtime()

getmtime(path)
Return the time of last modification of path. The return value is a number giving the number of seconds since the epoch (see the time module). Raise os.error if the file does not exist or is inaccessible. New in version 1.5.2. Changed in version 2.3: If os.stat_float_times() returns True, the result is a floating point number.

os.stat()

stat(path)
Perform a stat() system call on the given path. The return value is an object whose attributes correspond to the members of the stat structure, namely: st_mode (protection bits), st_ino (inode number), st_dev (device), st_nlink (number of hard links), st_uid (user ID of owner), st_gid (group ID of owner), st_size (size of file, in bytes), st_atime (time of most recent access), st_mtime (time of most recent content modification), st_ctime (platform dependent; time of most recent metadata change on Unix, or the time of creation on Windows):

>>> import os
>>> statinfo = os.stat('somefile.txt')
>>> statinfo
(33188, 422511L, 769L, 1, 1032, 100, 926L, 1105022698,1105022732, 1105022732)
>>> statinfo.st_size
926L
>>> 

In the above example you would use statinfo.st_mtime or statinfo.st_ctime to get the mtime and ctime, respectively.


Answer 5

In Python 3.4 and above, you can use the object-oriented pathlib module interface, which includes wrappers for much of the os module. Here is an example of getting the file stats (the f-string in the assert message requires Python 3.6 or later).

>>> import pathlib
>>> fname = pathlib.Path('test.py')
>>> assert fname.exists(), f'No such file: {fname}'  # check that the file exists
>>> print(fname.stat())
os.stat_result(st_mode=33206, st_ino=5066549581564298, st_dev=573948050, st_nlink=1, st_uid=0, st_gid=0, st_size=413, st_atime=1523480272, st_mtime=1539787740, st_ctime=1523480272)

For more information about what os.stat_result contains, refer to the documentation. For the modification time you want fname.stat().st_mtime:

>>> import datetime
>>> mtime = datetime.datetime.fromtimestamp(fname.stat().st_mtime)
>>> print(mtime)
2018-10-17 10:49:00.249980

If you want the creation time on Windows, or the most recent metadata change on Unix, you would use fname.stat().st_ctime:

>>> ctime = datetime.datetime.fromtimestamp(fname.stat().st_ctime)
>>> print(ctime)
2018-04-11 16:57:52.151953

This article has more helpful info and examples for the pathlib module.


Answer 6

os.stat returns a named tuple with st_mtime and st_ctime attributes. The modification time is st_mtime on both platforms; unfortunately, on Windows, ctime means “creation time”, whereas on POSIX it means “change time”. I’m not aware of any way to get the creation time on POSIX platforms.


Answer 7

import os, time, datetime

file = "somefile.txt"
print(file)

print("Modified")
print(os.stat(file)[-2])
print(os.stat(file).st_mtime)
print(os.path.getmtime(file))

print()

print("Created")
print(os.stat(file)[-1])
print(os.stat(file).st_ctime)
print(os.path.getctime(file))

print()

modified = os.path.getmtime(file)
print("Date modified: "+time.ctime(modified))
print("Date modified:",datetime.datetime.fromtimestamp(modified))
year,month,day,hour,minute,second=time.localtime(modified)[:-3]
print("Date modified: %02d/%02d/%d %02d:%02d:%02d"%(day,month,year,hour,minute,second))

print()

created = os.path.getctime(file)
print("Date created: "+time.ctime(created))
print("Date created:",datetime.datetime.fromtimestamp(created))
year,month,day,hour,minute,second=time.localtime(created)[:-3]
print("Date created: %02d/%02d/%d %02d:%02d:%02d"%(day,month,year,hour,minute,second))

prints

somefile.txt
Modified
1429613446
1429613446.0
1429613446.0

Created
1517491049
1517491049.28306
1517491049.28306

Date modified: Tue Apr 21 11:50:46 2015
Date modified: 2015-04-21 11:50:46
Date modified: 21/04/2015 11:50:46

Date created: Thu Feb  1 13:17:29 2018
Date created: 2018-02-01 13:17:29.283060
Date created: 01/02/2018 13:17:29

Answer 8

>>> import os
>>> os.stat('feedparser.py').st_mtime
1136961142.0
>>> os.stat('feedparser.py').st_ctime
1222664012.233
>>> 

Answer 9

If following symbolic links is not important, you can also use the os.lstat builtin.

>>> os.lstat("2048.py")
posix.stat_result(st_mode=33188, st_ino=4172202, st_dev=16777218L, st_nlink=1, st_uid=501, st_gid=20, st_size=2078, st_atime=1423378041, st_mtime=1423377552, st_ctime=1423377553)
>>> os.lstat("2048.py").st_atime
1423378041.0
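
The difference between os.stat and os.lstat only matters for symbolic links. Here is a sketch using a temporary directory; the file names are made up, and os.symlink may need extra privileges on Windows.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "target.txt")
    link = os.path.join(tmp_dir, "link.txt")
    with open(target, "w") as fh:
        fh.write("data")
    os.symlink(target, link)  # may need extra privileges on Windows

    followed = os.stat(link)       # follows the link, describes target.txt
    not_followed = os.lstat(link)  # describes the link itself

    link_is_symlink = stat.S_ISLNK(not_followed.st_mode)
    followed_is_regular = stat.S_ISREG(followed.st_mode)
    same_file = os.path.samestat(followed, os.stat(target))

print(link_is_symlink, followed_is_regular, same_file)  # True True True
```

So the timestamps from os.lstat belong to the link, while those from os.stat belong to the file it points to.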

Answer 10

It may be worth taking a look at the crtime library, which implements cross-platform access to the file creation time.

from crtime import get_crtimes_in_dir

for fname, date in get_crtimes_in_dir(".", raise_on_error=True, as_epoch=False):
    print(fname, date)
    # file_a.py Mon Mar 18 20:51:18 CET 2019

Answer 11

os.stat does include the creation time. There’s just no definition of st_anything for the element of os.stat() that contains the time.

So try this:

os.stat('feedparser.py')[8]

Compare that with your create date on the file in ls -lah

They should be the same.


Answer 12

I was able to get creation time on posix by running the system’s stat command and parsing the output.

commands.getoutput('stat FILENAME').split('\"')[7]

Running stat outside of python from Terminal (OS X) returned:

805306374 3382786932 -rwx------ 1 km staff 0 1098083 "Aug 29 12:02:05 2013" "Aug 29 12:02:05 2013" "Aug 29 12:02:20 2013" "Aug 27 12:35:28 2013" 61440 2150 0 testfile.txt

… where the fourth datetime is the file creation (rather than ctime change time as other comments noted).


How do I get the row count of a pandas DataFrame?

Question: How do I get the row count of a pandas DataFrame?

I’m trying to get the number of rows of dataframe df with Pandas, and here is my code.

Method 1:

total_rows = df.count
print total_rows +1

Method 2:

total_rows = df['First_columnn_label'].count
print total_rows +1

Both the code snippets give me this error:

TypeError: unsupported operand type(s) for +: ‘instancemethod’ and ‘int’

What am I doing wrong?


Answer 0

You can use the .shape property or just len(DataFrame.index). However, there are notable performance differences (len(DataFrame.index) is fastest):

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(np.arange(12).reshape(4,3))

In [4]: df
Out[4]: 
   0   1   2
0  0   1   2
1  3   4   5
2  6   7   8
3  9  10  11

In [5]: df.shape
Out[5]: (4, 3)

In [6]: timeit df.shape
2.77 µs ± 644 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: timeit df[0].count()
348 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [8]: len(df.index)
Out[8]: 4

In [9]: timeit len(df.index)
990 ns ± 4.97 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

EDIT: As @Dan Allen noted in the comments, len(df.index) and df[0].count() are not interchangeable, because count excludes NaNs.
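A minimal sketch of the difference noted in the edit, using a made-up one-column frame with two missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical data: four rows, two of them NaN.
df = pd.DataFrame({0: [1.0, np.nan, 3.0, np.nan]})

n_rows = len(df.index)       # every row, NaN or not
n_non_null = df[0].count()   # only the non-NaN entries of column 0
```

Here n_rows is 4 while n_non_null is 2, so the two only agree when the column has no missing values.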


Answer 1

Suppose df is your DataFrame; then:

count_row = df.shape[0]  # number of rows
count_col = df.shape[1]  # number of columns

Or, more succinctly,

r, c = df.shape
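For illustration, here is the unpacking form on a concrete frame, plus an empty frame to show that shape still reports the column count when there are no rows. The frames are invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3))
count_row, count_col = df.shape   # unpack (rows, columns) in one step

# An empty frame still has a well-defined shape.
empty = pd.DataFrame(columns=['a', 'b'])
empty_shape = empty.shape
```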

Answer 2

Use len(df). This works as of pandas 0.11, and possibly earlier.

__len__() is currently (0.12) documented as "Returns length of index". Timing info, set up the same way as in root's answer:

In [7]: timeit len(df.index)
1000000 loops, best of 3: 248 ns per loop

In [8]: timeit len(df)
1000000 loops, best of 3: 573 ns per loop

Because of the one additional function call, it is a bit slower than calling len(df.index) directly, but this should not matter in most use cases.
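If you want to reproduce the comparison outside IPython, something like the following sketch with the standard timeit module should do. The absolute numbers depend on the machine and the pandas version, so none are claimed here.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3))
N = 100_000

# Time each expression over N calls.
t_index = timeit.timeit(lambda: len(df.index), number=N)
t_frame = timeit.timeit(lambda: len(df), number=N)

print(f"len(df.index): {t_index / N * 1e9:.0f} ns per call")
print(f"len(df):       {t_frame / N * 1e9:.0f} ns per call")
```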


Answer 3

How do I get the row count of a pandas DataFrame?

This table summarises the different situations in which you’d want to count something in a DataFrame (or Series, for completeness), along with the recommended method(s).

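[Image not reproduced: table mapping each counting situation to its recommended method(s); the sections below cover the same cases.]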

Footnotes

  1. DataFrame.count returns counts for each column as a Series since the non-null count varies by column.
  2. DataFrameGroupBy.size returns a Series, since all columns in the same group share the same row-count.
  3. DataFrameGroupBy.count returns a DataFrame, since the non-null count could differ across columns in the same group. To get the group-wise non-null count for a specific column, use df.groupby(...)['x'].count() where “x” is the column to count.

Minimal Code Examples

Below, I show examples of each of the methods described in the table above. First, the setup:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbc'), 'B': ['x', 'x', np.nan, 'x', np.nan]})
s = df['B'].copy()

df

   A    B
0  a    x
1  a    x
2  b  NaN
3  b    x
4  c  NaN

s

0      x
1      x
2    NaN
3      x
4    NaN
Name: B, dtype: object

Row Count of a DataFrame: len(df), df.shape[0], or len(df.index)

len(df)
# 5

df.shape[0]
# 5

len(df.index)
# 5

It seems silly to compare the performance of constant time operations, especially when the difference is on the level of “seriously, don’t worry about it”. But this seems to be a trend with other answers, so I’m doing the same for completeness.

Of the 3 methods above, len(df.index) (as mentioned in other answers) is the fastest.

Note

  • All the methods above are constant time operations as they are simple attribute lookups.
  • df.shape (similar to ndarray.shape) is an attribute that returns a tuple of (# Rows, # Cols). For example, df.shape returns (5, 2) for the example here.

Column Count of a DataFrame: df.shape[1], len(df.columns)

df.shape[1]
# 2

len(df.columns)
# 2

Analogous to len(df.index), len(df.columns) is the faster of the two methods (but takes more characters to type).

Row Count of a Series: len(s), s.size, len(s.index)

len(s)
# 5

s.size
# 5

len(s.index)
# 5

s.size and len(s.index) are about the same in terms of speed. But I recommend len(s).

Note
size is an attribute, and it returns the number of elements (=count of rows for any Series). DataFrames also define a size attribute which returns the same result as df.shape[0] * df.shape[1].
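A quick sketch of the size attribute on both objects, reusing the setup from this answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbc'), 'B': ['x', 'x', np.nan, 'x', np.nan]})
s = df['B'].copy()

series_size = s.size   # number of elements, NaNs included
frame_size = df.size   # rows * columns, NaNs included
```

Note that neither value skips missing entries; for that, see the count methods in the next section.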

Non-Null Row Count: DataFrame.count and Series.count

The methods described here only count non-null values (meaning NaNs are ignored).

Calling DataFrame.count will return non-NaN counts for each column:

df.count()

A    5
B    3
dtype: int64

For Series, use Series.count to similar effect:

s.count()
# 3
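As a small extension of the above (a sketch, not part of the original table), count also accepts axis=1 to count non-null values per row, and dropna can be combined with len to count the rows that have no missing value at all:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbc'), 'B': ['x', 'x', np.nan, 'x', np.nan]})

per_column = df.count()            # non-null values in each column
per_row = df.count(axis=1)         # non-null values in each row
complete_rows = len(df.dropna())   # rows with no missing value
```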

Group-wise Row Count: GroupBy.size

For DataFrames, use DataFrameGroupBy.size to count the number of rows per group.

df.groupby('A').size()

A
a    2
b    2
c    1
dtype: int64

Similarly, for Series, you’ll use SeriesGroupBy.size.

s.groupby(df.A).size()

A
a    2
b    2
c    1
Name: B, dtype: int64

In both cases, a Series is returned. This makes sense for DataFrames as well since all groups share the same row-count.

Group-wise Non-Null Row Count: GroupBy.count

Similar to above, but use GroupBy.count, not GroupBy.size. Note that size always returns a Series, while count returns a Series if called on a specific column, or else a DataFrame.

The following methods return the same counts (only the first carries the column name B shown in the output):

df.groupby('A')['B'].size()
df.groupby('A').size()

A
a    2
b    2
c    1
Name: B, dtype: int64

Meanwhile, for count, we have

df.groupby('A').count()

   B
A   
a  2
b  1
c  0

…called on the entire GroupBy object, versus

df.groupby('A')['B'].count()

A
a    2
b    1
c    0
Name: B, dtype: int64

…called on a specific column.
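Putting size and count side by side on the same setup may make the contrast easier to see. This is only a sketch; the variable names are mine.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('aabbc'), 'B': ['x', 'x', np.nan, 'x', np.nan]})

g = df.groupby('A')
sizes = g.size()             # rows per group, NaNs included
counts = g['B'].count()      # non-null B values per group
counts_frame = g.count()     # same counts, but as a DataFrame
```

Group c has one row but no non-null B value, which is why its size is 1 and its count is 0.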


Answer 4

TL;DR

Use len(df).


len() is your friend; len(df) gives the row count.

Alternatively, df.index holds all the row labels and df.columns all the column labels. Just as len(anyList) gives the length of a list, len(df.index) gives the number of rows and len(df.columns) the number of columns.

Or you can use df.shape, which returns the number of rows and columns together. To get only the number of rows use df.shape[0], and for only the number of columns use df.shape[1].


Answer 5

In addition to the answers above, you can use df.axes to get a list holding the row and column indexes, and then call len() on each:

total_rows=len(df.axes[0])
total_cols=len(df.axes[1])
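Here is the same idea on a small made-up frame, unpacking the two axes first so it is clear which is which:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})

# df.axes is [row index, column index].
row_axis, col_axis = df.axes
total_rows = len(row_axis)
total_cols = len(col_axis)
```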

Answer 6

…building on Jan-Philip Gehrcke’s answer.

To see why len(df) or len(df.index) is faster than df.shape[0], look at the code. df.shape is a @property whose getter calls len twice.

df.shape??
Type:        property
String form: <property object at 0x1127b33c0>
Source:     
# df.shape.fget
@property
def shape(self):
    """
    Return a tuple representing the dimensionality of the DataFrame.
    """
    return len(self.index), len(self.columns)

And under the hood of len(df):

df.__len__??
Signature: df.__len__()
Source:   
    def __len__(self):
        """Returns length of info axis, but here we use the index """
        return len(self.index)
File:      ~/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py
Type:      instancemethod

len(df.index) will be slightly faster than len(df) since it has one less function call, but both are always faster than df.shape[0].
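To illustrate the structure without depending on a particular pandas version, here is a toy stand-in. TinyFrame is hypothetical and is not pandas code; it only mirrors the shape of the two methods quoted above.

```python
class TinyFrame:
    """Hypothetical stand-in mirroring the structure quoted above."""

    def __init__(self, index, columns):
        self.index = list(index)
        self.columns = list(columns)

    @property
    def shape(self):
        # Two len() calls plus a tuple allocation on every access.
        return len(self.index), len(self.columns)

    def __len__(self):
        # A single len() call.
        return len(self.index)


tf = TinyFrame(range(4), "abc")
n_via_len = len(tf)          # one method call, one len()
n_via_shape = tf.shape[0]    # property call, two len(), then indexing
```

Both routes give the same number; shape simply does more work to get there.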


Answer 7

I came to pandas from an R background, and I find pandas more complicated when it comes to selecting rows or columns. I had to wrestle with it for a while before I found some ways to deal with it:

Getting the number of columns:

len(df.columns)
# Here:
# df is your DataFrame (R's data.frame).
# df.columns returns an Index holding the column labels of df.
# len() then gets its length.

Getting the number of rows:

len(df.index)  # Works the same way on the row index.

Answer 8

In case you want to get the row count in the middle of a chained operation, you can use:

df.pipe(len)

Example:

import numpy as np
import pandas as pd

row_count = (
    pd.DataFrame(np.random.rand(3, 4))
    .reset_index()
    .pipe(len)
)

This can be useful if you don’t want to put a long statement inside a len() function.

You could use __len__() instead but __len__() looks a bit weird.
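A slightly longer chain shows where this pays off: counting the rows that survive a couple of filters without assigning an intermediate frame. The data and column names here are invented for the sketch.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['u', 'v', 'u', 'v']})

# Count the rows left after two filters, all in one chain.
n_matching = (
    df
    .loc[lambda d: d['A'] > 1]
    .loc[lambda d: d['B'] == 'v']
    .pipe(len)
)
```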


Answer 9

You can also do this:

Let's say df is your DataFrame. Then df.shape gives you the shape of your DataFrame, i.e. (row, col).

So assign the following to get the values you need:

row, col = df.shape[0], df.shape[1]

Answer 10

For a DataFrame df, here is a helper that prints the row count with comma separators, handy while exploring data:

def nrow(df):
    print("{:,}".format(df.shape[0]))

Example:

nrow(my_df)
12,456,789
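A variant that returns the formatted string instead of printing it can be easier to reuse in log messages. This is just a sketch; format_nrow is a name I made up, and the frame is generated only to have something to count.

```python
import pandas as pd


def format_nrow(df):
    """Return the row count as a string with thousands separators."""
    return f"{len(df):,}"


my_df = pd.DataFrame({'v': range(1234567)})
label = format_nrow(my_df)
```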

Answer 11

Another way to find the number of rows in a DataFrame, and the one I find most readable, is pandas.Index.size, i.e. df.index.size.

Note that, as I commented on the accepted answer:

I suspected pandas.Index.size would actually be faster than len(df.index), but timeit on my computer tells me otherwise (~150 ns slower per loop).
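For reference, this is what it looks like in use (the frame is the same small example used in the accepted answer):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3))

n_rows = df.index.size      # size of the row index
n_cols = df.columns.size    # the column index has a size too
```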


Answer 12

I'm not sure if this would work (data COULD be omitted), but it may:

*dataframe name*.tail(1)

Running this shows the last row, and you can read the number of rows off the row label it prints. That holds only when the DataFrame has the default 0-based integer index, in which case the count is the label plus one.
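A short sketch of when reading the last row label works and when it does not. Both frames are invented for the example.

```python
import pandas as pd

# Default RangeIndex: the last label is n - 1.
df = pd.DataFrame({'v': [10, 20, 30]})
last_label = df.tail(1).index[0]
n_rows = last_label + 1

# Custom index: the last label says nothing about the row count.
custom = pd.DataFrame({'v': [10, 20, 30]}, index=[5, 7, 9])
custom_last_label = custom.tail(1).index[0]
```

With the custom index the last label is 9 although there are only 3 rows, so len(df) is the safer choice.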


Answer 13

Either of these works (df is the name of the DataFrame):

Method 1: using the len function:

len(df) gives the number of rows in a DataFrame named df.

Method 2: using the count function:

df[col].count() counts the non-null values in a given column col.

df.count() gives the non-null count for every column.
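For completeness, here is a sketch of what goes wrong in the question's code and how the two methods above differ. The column name and data are invented; the key point is that df.count without parentheses is a method object, not a number.

```python
import pandas as pd

df = pd.DataFrame({'First_column_label': [1.0, None, 3.0]})

bound_method = df.count   # no parentheses: a method, not a count
try:
    bound_method + 1
except TypeError as exc:
    error_message = str(exc)

total_rows = len(df)                           # all rows, NaN included
non_null = df['First_column_label'].count()    # NaN excluded
```

On Python 3 the message names 'method' rather than 'instancemethod', but the cause is the same.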