Asterisk in a function call

Question: Asterisk in a function call

I’m using itertools.chain to “flatten” a list of lists in this fashion:

uniqueCrossTabs = list(itertools.chain(*uniqueCrossTabs))

How is this different from saying:

uniqueCrossTabs = list(itertools.chain(uniqueCrossTabs))

Answer 0

* is the “splat” operator: It takes a list as input, and expands it into actual positional arguments in the function call.

So if uniqueCrossTabs was [ [ 1, 2 ], [ 3, 4 ] ], then itertools.chain(*uniqueCrossTabs) is the same as saying itertools.chain([ 1, 2 ], [ 3, 4 ])

This is obviously different from passing in just uniqueCrossTabs. In your case, you have a list of lists that you wish to flatten; what itertools.chain() does is return an iterator over the concatenation of all the positional arguments you pass to it, where each positional argument is iterable in its own right.

In other words, you want to pass each list in uniqueCrossTabs as an argument to chain(), which will chain them together, but you don’t have the lists in separate variables, so you use the * operator to expand the list of lists into several list arguments.

As Jochen Ritzel has pointed out in the comments, chain.from_iterable() is better-suited for this operation, as it assumes a single iterable of iterables to begin with. Your code then becomes simply:

uniqueCrossTabs = list(itertools.chain.from_iterable(uniqueCrossTabs))
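
To see the difference concretely, here is a minimal sketch using an example list called nested:

import itertools

nested = [[1, 2], [3, 4]]

list(itertools.chain(nested))                # [[1, 2], [3, 4]]  (the sublists themselves)
list(itertools.chain(*nested))               # [1, 2, 3, 4]      (their elements)
list(itertools.chain.from_iterable(nested))  # [1, 2, 3, 4]      (same, without unpacking up front)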

Answer 1

It splits the sequence into separate arguments for the function call.

>>> def foo(a, b=None, c=None):
...   print a, b, c
... 
>>> foo([1, 2, 3])
[1, 2, 3] None None
>>> foo(*[1, 2, 3])
1 2 3
>>> def bar(*a):
...   print a
... 
>>> bar([1, 2, 3])
([1, 2, 3],)
>>> bar(*[1, 2, 3])
(1, 2, 3)

Answer 2

Just an alternative way of explaining the concept/using it.

import random

def arbitrary():
    return [x for x in range(1, random.randint(3,10))]

a, b, *rest = arbitrary()

# One possible outcome, here assuming random.randint(3, 10) returned 6,
# so arbitrary() produced [1, 2, 3, 4, 5]:
# a = 1
# b = 2
# rest = [3, 4, 5]
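
For reference, the star in the assignment above is Python 3's extended iterable unpacking (PEP 3132), which is related to, but distinct from, argument unpacking in a function call:

first, *middle, last = [0, 1, 2, 3, 4]
# first = 0, middle = [1, 2, 3], last = 4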

SQLAlchemy versioning cares about class import order

Question: SQLAlchemy versioning cares about class import order

I was following the guide here:

http://www.sqlalchemy.org/docs/orm/examples.html?highlight=versioning#versioned-objects

and have come across an issue. I have defined my relationships like:

generic_ticker = relation('MyClass', backref=backref("stuffs"))

with strings so it doesn’t care about the import order of my model modules. This all works fine normally, but when I use the versioning meta I get the following error:

sqlalchemy.exc.InvalidRequestError: When initializing mapper Mapper|MyClass|stuffs, expression ‘Trader’ failed to locate a name (“name ‘MyClass’ is not defined”). If this is a class name, consider adding this relationship() to the class after both dependent classes have been defined.

I tracked down the error to:

  File "/home/nick/workspace/gm3/gm3/lib/history_meta.py", line 90, in __init__
    mapper = class_mapper(cls)
  File "/home/nick/venv/tg2env/lib/python2.6/site-packages/sqlalchemy/orm/util.py", line 622, in class_mapper
    mapper = mapper.compile()

class VersionedMeta(DeclarativeMeta):
    def __init__(cls, classname, bases, dict_):
        DeclarativeMeta.__init__(cls, classname, bases, dict_)

        try:
            mapper = class_mapper(cls)
            _history_mapper(mapper)
        except UnmappedClassError:
            pass

I fixed the problem by putting the try: except stuff in a lambda and running them all after all the imports have happened. This works but seems a bit rubbish, any ideas of how to fix this is a better way?

Thanks!

Update

The problem is not actually about import order. The versioning example is designed such that the mapper requires compilation in the constructor of each versioned class, and compilation fails when related classes are not yet defined. In the case of circular relations, there is no way to make it work by changing the definition order of the mapped classes.

Update 2

As the above update states (I didn't know you could edit other people's posts on here :)), this is likely due to circular references. In that case maybe someone will find my hack useful (I'm using it with turbogears). (Replace VersionedMeta and add the create_mappers global in history_meta.)

create_mappers = []
class VersionedMeta(DeclarativeMeta):
    def __init__(cls, classname, bases, dict_):
        DeclarativeMeta.__init__(cls, classname, bases, dict_)
        #I added this code in as it was crashing otherwise
        def make_mapper():
            try:
                mapper = class_mapper(cls)
                _history_mapper(mapper)
            except UnmappedClassError:
                pass

        create_mappers.append(lambda: make_mapper())

Then you can do something like the following in your models' __init__.py:

# Import your model modules here.
from myproj.lib.history_meta import create_mappers

from myproj.model.misc import *
from myproj.model.actor import *
from myproj.model.stuff1 import *
from myproj.model.instrument import *
from myproj.model.stuff import *

#setup the history
[func() for func in create_mappers]

That way it creates the mappers only after all the classes have been defined.

Update 3

Slightly unrelated, but in some circumstances I came across a duplicate primary key error (committing 2 changes to the same object in one go). My workaround has been to add a new auto-incrementing primary key. Of course you can't have more than 1 with mysql, so I had to de-primary-key the existing columns used to create the history table. Check out my overall code (including a hist_id and getting rid of the foreign key constraint):

"""Stolen from the offical sqlalchemy recpies
"""
from sqlalchemy.ext.declarative import DeclarativeMeta
from sqlalchemy.orm import mapper, class_mapper, attributes, object_mapper
from sqlalchemy.orm.exc import UnmappedClassError, UnmappedColumnError
from sqlalchemy import Table, Column, ForeignKeyConstraint, Integer
from sqlalchemy.orm.interfaces import SessionExtension
from sqlalchemy.orm.properties import RelationshipProperty
from sqlalchemy.types import DateTime
import datetime
from sqlalchemy.orm.session import Session

def col_references_table(col, table):
    for fk in col.foreign_keys:
        if fk.references(table):
            return True
    return False

def _history_mapper(local_mapper):
    cls = local_mapper.class_

    # set the "active_history" flag
    # on column-mapped attributes so that the old version
    # of the info is always loaded (currently sets it on all attributes)
    for prop in local_mapper.iterate_properties:
        getattr(local_mapper.class_, prop.key).impl.active_history = True

    super_mapper = local_mapper.inherits
    super_history_mapper = getattr(cls, '__history_mapper__', None)

    polymorphic_on = None
    super_fks = []
    if not super_mapper or local_mapper.local_table is not super_mapper.local_table:
        cols = []
        for column in local_mapper.local_table.c:
            if column.name == 'version':
                continue

            col = column.copy()
            col.unique = False

            #don't auto increment stuff from the normal db
            if col.autoincrement:
                col.autoincrement = False
            #sqlite falls over with auto-incrementing keys if we have a composite key
            if col.primary_key:
                col.primary_key = False

            if super_mapper and col_references_table(column, super_mapper.local_table):
                super_fks.append((col.key, list(super_history_mapper.base_mapper.local_table.primary_key)[0]))

            cols.append(col)

            if column is local_mapper.polymorphic_on:
                polymorphic_on = col

        #if super_mapper:
        #    super_fks.append(('version', super_history_mapper.base_mapper.local_table.c.version))

        cols.append(Column('hist_id', Integer, primary_key=True, autoincrement=True))
        cols.append(Column('version', Integer))
        cols.append(Column('changed', DateTime, default=datetime.datetime.now))

        if super_fks:
            cols.append(ForeignKeyConstraint(*zip(*super_fks)))

        table = Table(local_mapper.local_table.name + '_history', local_mapper.local_table.metadata,
                      *cols, mysql_engine='InnoDB')
    else:
        # single table inheritance.  take any additional columns that may have
        # been added and add them to the history table.
        for column in local_mapper.local_table.c:
            if column.key not in super_history_mapper.local_table.c:
                col = column.copy()
                super_history_mapper.local_table.append_column(col)
        table = None

    if super_history_mapper:
        bases = (super_history_mapper.class_,)
    else:
        bases = local_mapper.base_mapper.class_.__bases__
    versioned_cls = type.__new__(type, "%sHistory" % cls.__name__, bases, {})

    m = mapper(
            versioned_cls, 
            table, 
            inherits=super_history_mapper, 
            polymorphic_on=polymorphic_on,
            polymorphic_identity=local_mapper.polymorphic_identity
            )
    cls.__history_mapper__ = m

    if not super_history_mapper:
        cls.version = Column('version', Integer, default=1, nullable=False)

create_mappers = []

class VersionedMeta(DeclarativeMeta):
    def __init__(cls, classname, bases, dict_):
        DeclarativeMeta.__init__(cls, classname, bases, dict_)
        #I added this code in as it was crashing otherwise
        def make_mapper():
            try:
                mapper = class_mapper(cls)
                _history_mapper(mapper)
            except UnmappedClassError:
                pass

        create_mappers.append(lambda: make_mapper())

def versioned_objects(iter):
    for obj in iter:
        if hasattr(obj, '__history_mapper__'):
            yield obj

def create_version(obj, session, deleted = False):
    obj_mapper = object_mapper(obj)
    history_mapper = obj.__history_mapper__
    history_cls = history_mapper.class_

    obj_state = attributes.instance_state(obj)

    attr = {}

    obj_changed = False

    for om, hm in zip(obj_mapper.iterate_to_root(), history_mapper.iterate_to_root()):
        if hm.single:
            continue

        for hist_col in hm.local_table.c:
            if hist_col.key == 'version' or hist_col.key == 'changed' or hist_col.key == 'hist_id':
                continue

            obj_col = om.local_table.c[hist_col.key]

            # get the value of the
            # attribute based on the MapperProperty related to the
            # mapped column.  this will allow usage of MapperProperties
            # that have a different keyname than that of the mapped column.
            try:
                prop = obj_mapper.get_property_by_column(obj_col)
            except UnmappedColumnError:
                # in the case of single table inheritance, there may be 
                # columns on the mapped table intended for the subclass only.
                # the "unmapped" status of the subclass column on the 
                # base class is a feature of the declarative module as of sqla 0.5.2.
                continue

            # expired object attributes and also deferred cols might not be in the
            # dict.  force it to load no matter what by using getattr().
            if prop.key not in obj_state.dict:
                getattr(obj, prop.key)

            a, u, d = attributes.get_history(obj, prop.key)

            if d:
                attr[hist_col.key] = d[0]
                obj_changed = True
            elif u:
                attr[hist_col.key] = u[0]
            else:
                # if the attribute had no value.
                attr[hist_col.key] = a[0]
                obj_changed = True

    if not obj_changed:
        # not changed, but we have relationships.  OK
        # check those too
        for prop in obj_mapper.iterate_properties:
            if isinstance(prop, RelationshipProperty) and \
                attributes.get_history(obj, prop.key).has_changes():
                obj_changed = True
                break

    if not obj_changed and not deleted:
        return

    attr['version'] = obj.version
    hist = history_cls()
    for key, value in attr.iteritems():
        setattr(hist, key, value)

    obj.version += 1
    session.add(hist)

class VersionedListener(SessionExtension):
    def before_flush(self, session, flush_context, instances):
        for obj in versioned_objects(session.dirty):
            create_version(obj, session)
        for obj in versioned_objects(session.deleted):
            create_version(obj, session, deleted = True)
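
For reference, a hedged sketch of how the recipe's pieces were typically wired together with the SQLAlchemy 0.5/0.6-era API used here; the engine URL is illustrative, and the import path matches the __init__.py example above:

from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

from myproj.lib.history_meta import VersionedMeta, VersionedListener, create_mappers

engine = create_engine('sqlite://')  # illustrative URL
Base = declarative_base(metaclass=VersionedMeta)
Session = sessionmaker(bind=engine, extension=VersionedListener())

# After all model modules have been imported (see Update 2):
for func in create_mappers:
    func()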

Answer 0

I fixed the problem by putting the try: except stuff in a lambda and running them all after all the imports have happened.

Great!


Django URLs TypeError: view must be a callable or a list/tuple in the case of include()

Question: Django URLs TypeError: view must be a callable or a list/tuple in the case of include()

After upgrading to Django 1.10, I get the error:

TypeError: view must be a callable or a list/tuple in the case of include().

My urls.py is as follows:

from django.conf.urls import include, url

urlpatterns = [
    url(r'^$', 'myapp.views.home'),
    url(r'^contact/$', 'myapp.views.contact'),
    url(r'^login/$', 'django.contrib.auth.views.login'),
]

The full traceback is:

Traceback (most recent call last):
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/utils/autoreload.py", line 226, in wrapper
    fn(*args, **kwargs)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/core/management/commands/runserver.py", line 121, in inner_run
    self.check(display_num_errors=True)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/core/management/base.py", line 385, in check
    include_deployment_checks=include_deployment_checks,
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/core/management/base.py", line 372, in _run_checks
    return checks.run_checks(**kwargs)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/core/checks/registry.py", line 81, in run_checks
    new_errors = check(app_configs=app_configs)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/core/checks/urls.py", line 14, in check_url_config
    return check_resolver(resolver)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/core/checks/urls.py", line 24, in check_resolver
    for pattern in resolver.url_patterns:
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/utils/functional.py", line 35, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/urls/resolvers.py", line 310, in url_patterns
    patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/utils/functional.py", line 35, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/urls/resolvers.py", line 303, in urlconf_module
    return import_module(self.urlconf_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Users/alasdair/dev/urlproject/urlproject/urls.py", line 28, in <module>
    url(r'^$', 'myapp.views.home'),
  File "/Users/alasdair/.virtualenvs/django110/lib/python2.7/site-packages/django/conf/urls/__init__.py", line 85, in url
    raise TypeError('view must be a callable or a list/tuple in the case of include().')
TypeError: view must be a callable or a list/tuple in the case of include().

Answer 0

Django 1.10 no longer allows you to specify views as a string (e.g. 'myapp.views.home') in your URL patterns.

The solution is to update your urls.py to include the view callable. This means that you have to import the view in your urls.py. If your URL patterns don’t have names, then now is a good time to add one, because reversing with the dotted python path no longer works.

from django.conf.urls import include, url

from django.contrib.auth.views import login
from myapp.views import home, contact

urlpatterns = [
    url(r'^$', home, name='home'),
    url(r'^contact/$', contact, name='contact'),
    url(r'^login/$', login, name='login'),
]

If there are many views, then importing them individually can be inconvenient. An alternative is to import the views module from your app.

from django.conf.urls import include, url

from django.contrib.auth import views as auth_views
from myapp import views as myapp_views

urlpatterns = [
    url(r'^$', myapp_views.home, name='home'),
    url(r'^contact/$', myapp_views.contact, name='contact'),
    url(r'^login/$', auth_views.login, name='login'),
]

Note that we have used as myapp_views and as auth_views, which allows us to import the views.py from multiple apps without them clashing.

See the Django URL dispatcher docs for more information about urlpatterns.


Answer 1

This error just means that myapp.views.home is not something that can be called, like a function; it is in fact a string. While string views worked in Django 1.9, they raised a deprecation warning saying that support would be removed from version 1.10 onwards, which is exactly what has happened. The previous solution by @Alasdair imports the necessary view functions into the script through either from myapp import views as myapp_views or from myapp.views import home, contact.


Answer 2

You may also get this error if you have a name clash between a view and a module. I got the error when I distributed my view files under a views folder, /views/view1.py, /views/view2.py, and imported a module named table.py in view2.py, which happened to be the name of a view in view1.py. So naming the view function v_table(request, id) helped.


Answer 3

Your code is

urlpatterns = [
    url(r'^$', 'myapp.views.home'),
    url(r'^contact/$', 'myapp.views.contact'),
    url(r'^login/$', 'django.contrib.auth.views.login'),
]

change it to the following, importing the views so that each pattern references a callable instead of a string:

from django.conf.urls import url
from django.contrib.auth import views as auth_views
from myapp import views

urlpatterns = [
    url(r'^$', views.home),
    url(r'^contact/$', views.contact),
    url(r'^login/$', auth_views.login),
]

What does the caret (^) do in Python?

Question: What does the caret (^) do in Python?

I ran across the caret operator in python today and trying it out, I got the following output:

>>> 8^3
11
>>> 8^4
12
>>> 8^1
9
>>> 8^0
8
>>> 7^1
6
>>> 7^2
5
>>> 7^7
0
>>> 7^8
15
>>> 9^1
8
>>> 16^1
17
>>> 15^1
14
>>>

It seems to be based on 8, so I'm guessing some sort of byte operation? I can't seem to find much about it by searching, other than that it behaves oddly for floats. Does anybody have a link to what this operator does, or can you explain it here?


Answer 0

It’s a bitwise XOR (exclusive OR).

It results in true if one (and only one) of the operands evaluates to true.

To demonstrate:

>>> 0^0
0
>>> 1^1
0
>>> 1^0
1
>>> 0^1
1

To explain one of your own examples:

>>> 8^3
11

Think about it this way:

1000  # 8 (binary)
0011  # 3 (binary)
----  # APPLY XOR ('vertically')
1011  # result = 11 (binary)
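
You can check this yourself with bin(), which shows the bit pattern:

>>> bin(8 ^ 3)
'0b1011'
>>> 0b1011
11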

Answer 1

It invokes the __xor__() or __rxor__() method of the object as needed, which for integer types does a bitwise exclusive-or.
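
You can see that equivalence by calling the special method directly:

>>> (8).__xor__(3)
11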


Answer 2

It’s a bit-by-bit exclusive-or. Binary bitwise operators are documented in chapter 5 of the Python Language Reference.


Answer 3

Generally speaking, the symbol ^ is an infix version of the __xor__ or __rxor__ methods. Whatever data types are placed to the right and left of the symbol must implement this function in a compatible way. For integers, it is the common XOR operation, but for example there is not a built-in definition of the function for type float with type int:

In [12]: 3 ^ 4
Out[12]: 7

In [13]: 3.3 ^ 4
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-858cc886783d> in <module>()
----> 1 3.3 ^ 4

TypeError: unsupported operand type(s) for ^: 'float' and 'int'

One neat thing about Python is that you can override this behavior in a class of your own. For example, in some languages the ^ symbol means exponentiation. You could do that this way, just as one example:

class Foo(float):
    def __xor__(self, other):
        return self ** other

Then something like this will work, and now, for instances of Foo only, the ^ symbol will mean exponentiation.

In [16]: x = Foo(3)

In [17]: x
Out[17]: 3.0

In [18]: x ^ 4
Out[18]: 81.0

Answer 4

When you use the ^ operator, behind the curtains the method __xor__ is called.

a^b is equivalent to a.__xor__(b).

Also, a ^= b is equivalent to a = a.__ixor__(b) (with __xor__ used as a fallback when ^= is used but __ixor__ is not defined).

In principle, what __xor__ does is completely up to its implementation. Common use cases in Python are:

  • Symmetric Difference of sets (all elements present in exactly one of two sets)

Demo:

>>> a = {1, 2, 3}
>>> b = {1, 4, 5}
>>> a^b
{2, 3, 4, 5}
>>> a.symmetric_difference(b)
{2, 3, 4, 5}
  • Bitwise Non-Equal for the bits of two integers

Demo:

>>> a = 5
>>> b = 6
>>> a^b
3

Explanation:

    101 (5 decimal)
XOR 110 (6 decimal)
-------------------
    011 (3 decimal)
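
The in-place form behaves the same way; for sets, ^= performs a symmetric-difference update via __ixor__:

>>> a = {1, 2, 3}
>>> a ^= {1, 4, 5}
>>> a
{2, 3, 4, 5}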

Can a dictionary be passed to Django models on create?

Question: Can a dictionary be passed to Django models on create?

Is it possible to do something similar to this with a list, dictionary or something else?

data_dict = {
    'title' : 'awesome title',
    'body' : 'great body of text',
}

Model.objects.create(data_dict)

Even better if I can extend it:

Model.objects.create(data_dict, extra='hello', extra2='world')

Answer 0

If title and body are fields in your model, then you can deliver the keyword arguments in your dictionary using the ** operator.

Assuming your model is called MyModel:

# create instance of model
m = MyModel(**data_dict)
# don't forget to save to database!
m.save()

As for your second question, the dictionary has to be the final argument. Again, extra and extra2 should be fields in the model.

m2 = MyModel(extra='hello', extra2='world', **data_dict)
m2.save()
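
The same keyword expansion also works with the manager's create() shortcut, which builds and saves in one step (assuming the same MyModel):

m3 = MyModel.objects.create(extra='hello', extra2='world', **data_dict)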

Answer 1

Not directly an answer to the question, but I find this code helped me create dicts that save nicely via the accepted answer's approach. The type conversions made here are required if this data will be exported to json.

I hope this helps:

  #mod is a django database model instance
def toDict( mod ):
  import calendar
  import datetime
  from decimal import Decimal
  import re

    #Go through the object, load in the objects we want
  obj = {}
  for key in mod.__dict__:
    if re.search('^_', key):
      continue

      #Copy my data (datetimes become unix timestamps, Decimals become floats)
    if isinstance( mod.__dict__[key], datetime.datetime ):
      obj[key] = int(calendar.timegm( mod.__dict__[key].utctimetuple() ))
    elif isinstance( mod.__dict__[key], Decimal ):
      obj[key] = float( mod.__dict__[key] )
    else:
      obj[key] = mod.__dict__[key]

  return obj 

def toCsv( mod, fields, delim=',' ):
  import calendar
  import datetime
  from decimal import Decimal

    #Dump the items
  raw = []
  for key in fields:
    if key not in mod.__dict__:
      continue

      #Copy my data
    if isinstance( mod.__dict__[key], datetime.datetime ):
      raw.append( str(calendar.timegm( mod.__dict__[key].utctimetuple() )) )
    elif isinstance( mod.__dict__[key], Decimal ):
      raw.append( str(float( mod.__dict__[key] )))
    else:
      raw.append( str(mod.__dict__[key]) )

  return delim.join( raw )

Insert a row into a pandas DataFrame

Question: Insert a row into a pandas DataFrame

I have a dataframe:

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

   A  B  C
0  5  6  7
1  7  8  9

[2 rows x 3 columns]

and I need to add a first row [2, 3, 4] to get:

   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

I've tried the append() and concat() functions but can't find the right way to do it.

How do I add/insert a series into a dataframe?


Answer 0

Just assign the row to a particular index, using loc (note that loc indexes by label, so df.loc[-1] creates a new row labeled -1 rather than one before the end):

 df.loc[-1] = [2, 3, 4]  # adding a row
 df.index = df.index + 1  # shifting index
 df = df.sort_index()  # sorting by index

And you get, as desired:

    A  B  C
 0  2  3  4
 1  5  6  7
 2  7  8  9

See in Pandas documentation Indexing: Setting with enlargement.


Answer 1

Not sure how you were calling concat() but it should work as long as both objects are of the same type. Maybe the issue is that you need to cast your second vector to a dataframe? Using the df that you defined the following works for me:

df2 = pd.DataFrame([[2,3,4]], columns=['A','B','C'])
pd.concat([df2, df])
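
Note that the concatenated frame keeps the original index labels (0, 0, 1); if you want a clean 0..n-1 index, reset it:

pd.concat([df2, df]).reset_index(drop=True)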

Answer 2

One way to achieve this is

>>> pd.DataFrame(np.array([[2, 3, 4]]), columns=['A', 'B', 'C']).append(df, ignore_index=True)
Out[330]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Generally, it’s easiest to append dataframes, not series. In your case, since you want the new row to be “on top” (with starting id), and there is no function pd.prepend(), I first create the new dataframe and then append your old one.

ignore_index discards the old ongoing index in your dataframe and renumbers the result consecutively, so the rows of the old dataframe continue at index 1 and 2 instead of keeping their original labels 0 and 1.

Typical disclaimer: Ceterum censeo … appending rows is a quite inefficient operation. If you care about performance and can somehow ensure to first create a dataframe with the correct (longer) index and then just insert the additional row into the dataframe, you should definitely do that. See:

>>> index = np.array([0, 1, 2])
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[0:1] = [list(s1), list(s2)]
>>> df2
Out[336]: 
     A    B    C
0    5    6    7
1    7    8    9
2  NaN  NaN  NaN
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[1:] = [list(s1), list(s2)]

So far, we have what you had as df:

>>> df2
Out[339]: 
     A    B    C
0  NaN  NaN  NaN
1    5    6    7
2    7    8    9

But now you can easily insert the row as follows. Since the space was preallocated, this is more efficient.

>>> df2.loc[0] = np.array([2, 3, 4])
>>> df2
Out[341]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Answer 3

I put together a short function that allows for a little more flexibility when inserting a row:

def insert_row(idx, df, df_insert):
    dfA = df.iloc[:idx, ]
    dfB = df.iloc[idx:, ]

    df = dfA.append(df_insert).append(dfB).reset_index(drop = True)

    return df

which could be further shortened to:

def insert_row(idx, df, df_insert):
    return df.iloc[:idx, ].append(df_insert).append(df.iloc[idx:, ]).reset_index(drop = True)

Then you could use something like:

df = insert_row(2, df, df_new)

where 2 is the index position in df where you want to insert df_new.
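
df_new is not defined in the answer; assuming the row from the question, a usage sketch would be:

df_new = pd.DataFrame([[2, 3, 4]], columns=['A', 'B', 'C'])
df = insert_row(2, df, df_new)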


Answer 4

We can use numpy.insert. This has the advantage of flexibility. You only need to specify the index you want to insert to.

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

pd.DataFrame(np.insert(df.values, 0, values=[2, 3, 4], axis=0))

    0   1   2
0   2   3   4
1   5   6   7
2   7   8   9

For np.insert(df.values, 0, values=[2, 3, 4], axis=0), 0 tells the function the place/index you want to place the new values.
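
Note that np.insert returns a plain NumPy array, so the rebuilt frame loses the original column names (they become 0, 1, 2 above); continuing the snippet, pass them back explicitly if you need them:

import numpy as np
import pandas as pd

pd.DataFrame(np.insert(df.values, 0, values=[2, 3, 4], axis=0), columns=df.columns)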


Answer 5

This might seem overly simple, but it's incredible that a simple insert-new-row function isn't built in. I've read a lot about appending a new df to the original, but I'm wondering if this would be faster:

df.loc[0] = [row1data, blah...]
i = len(df)  # next integer label; len(df) + 1 would leave a gap in the index
df.loc[i] = [row2data, blah...]

Answer 6

Below would be the best way to insert a row into a pandas dataframe without sorting and resetting the index:

import pandas as pd

df = pd.DataFrame(columns=['a','b','c'])

def insert(df, row):
    insert_loc = df.index.max()

    if pd.isna(insert_loc):
        df.loc[0] = row
    else:
        df.loc[insert_loc + 1] = row

insert(df,[2,3,4])
insert(df,[8,9,0])
print(df)

Answer 7

concat() seems to be a bit faster than last-row insertion and reindexing, in case someone wonders about the speed of the two top approaches:

In [x]: %%timeit
     ...: df = pd.DataFrame(columns=['a','b'])
     ...: for i in range(10000):
     ...:     df.loc[-1] = [1,2]
     ...:     df.index = df.index + 1
     ...:     df = df.sort_index()

17.1 s ± 705 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [y]: %%timeit
     ...: df = pd.DataFrame(columns=['a', 'b'])
     ...: for i in range(10000):
     ...:     df = pd.concat([pd.DataFrame([[1,2]], columns=df.columns), df])

6.53 s ± 127 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Answer 8

It is pretty simple to add a row into a pandas DataFrame (see the sketch after this list):

  1. Create a regular Python dictionary with the same columns names as your Dataframe;

  2. Use the DataFrame.append() method and pass in your dictionary, where .append() is a method on DataFrame instances;

  3. Add ignore_index=True right after your dictionary name.
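
A minimal sketch of those three steps (note that DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0, where pd.concat([df, pd.DataFrame([row])], ignore_index=True) is the replacement):

import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

row = {"A": 2, "B": 3, "C": 4}  # keys match the column names
df = df.append(row, ignore_index=True)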


Answer 9

You can simply append the row to the end of the DataFrame, and then adjust the index.

For instance:

df = df.append(pd.DataFrame([[2,3,4]],columns=df.columns),ignore_index=True)
df.index = (df.index + 1) % len(df)
df = df.sort_index()

Or use concat as:

df = pd.concat([pd.DataFrame([[1,2,3,4,5,6]],columns=df.columns),df],ignore_index=True)

Answer 10

The simplest way add a row in a pandas data frame is:

DataFrame.loc[ location of insertion ]= list( )

Example :

DF.loc[ 9 ] = [ 'Pepe' , 33, 'Japan' ]

NB: the length of your list should match the number of columns in the data frame.


Where is Python's sys.path initialized from?

Question: Where is Python's sys.path initialized from?

Where is Python’s sys.path initialized from?

UPD: Python is adding some paths before refering to PYTHONPATH:

    >>> import sys
    >>> from pprint import pprint as p
    >>> p(sys.path)
    ['',
     'C:\\Python25\\lib\\site-packages\\setuptools-0.6c9-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\orbited-0.7.8-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\morbid-0.8.6.1-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\demjson-1.4-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\stomper-0.2.2-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\uuid-1.30-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\stompservice-0.1.0-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\cherrypy-3.0.1-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\pyorbited-0.2.2-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\flup-1.0.1-py2.5.egg',
     'C:\\Python25\\lib\\site-packages\\wsgilog-0.1-py2.5.egg',
     'c:\\testdir',
     'C:\\Windows\\system32\\python25.zip',
     'C:\\Python25\\DLLs',
     'C:\\Python25\\lib',
     'C:\\Python25\\lib\\plat-win',
     'C:\\Python25\\lib\\lib-tk',
     'C:\\Python25',
     'C:\\Python25\\lib\\site-packages',
     'C:\\Python25\\lib\\site-packages\\PIL',
     'C:\\Python25\\lib\\site-packages\\win32',
     'C:\\Python25\\lib\\site-packages\\win32\\lib',
     'C:\\Python25\\lib\\site-packages\\Pythonwin']

My PYTHONPATH is:

    PYTHONPATH=c:\testdir

I wonder where those paths before PYTHONPATH’s ones come from?


Answer 0

“Initialized from the environment variable PYTHONPATH, plus an installation-dependent default”

http://docs.python.org/library/sys.html#sys.path
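
A minimal way to compare the two sources yourself:

import os
import sys

print(os.environ.get('PYTHONPATH'))  # the part you control
print(sys.path)                      # PYTHONPATH entries plus the installation-dependent defaults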


Answer 1

Python really tries hard to intelligently set sys.path. How it is set can get really complicated. The following guide is a watered-down, somewhat-incomplete, somewhat-wrong, but hopefully-useful guide for the rank-and-file python programmer of what happens when python figures out what to use as the initial values of sys.path, sys.executable, sys.exec_prefix, and sys.prefix on a normal python installation.

First, python does its level best to figure out its actual physical location on the filesystem based on what the operating system tells it. If the OS just says “python” is running, it finds itself in $PATH. It resolves any symbolic links. Once it has done this, the path of the executable that it finds is used as the value for sys.executable, no ifs, ands, or buts.

Next, it determines the initial values for sys.exec_prefix and sys.prefix.

If there is a file called pyvenv.cfg in the same directory as sys.executable or one directory up, python looks at it. Different OSes do different things with this file.

One of the values in this config file that python looks for is the configuration option home = <DIRECTORY>. Python will use this directory instead of the directory containing sys.executable when it dynamically sets the initial value of sys.prefix later. If the applocal = true setting appears in the pyvenv.cfg file on Windows, but not the home = <DIRECTORY> setting, then sys.prefix will be set to the directory containing sys.executable.

Next, the PYTHONHOME environment variable is examined. On Linux and Mac, sys.prefix and sys.exec_prefix are set to the PYTHONHOME environment variable, if it exists, superseding any home = <DIRECTORY> setting in pyvenv.cfg. On Windows, sys.prefix and sys.exec_prefix are set to the PYTHONHOME environment variable, if it exists, unless a home = <DIRECTORY> setting is present in pyvenv.cfg, which is used instead.

Otherwise, sys.prefix and sys.exec_prefix are found by walking backwards from the location of sys.executable, or from the home directory given by pyvenv.cfg, if any.

If the file lib/python<version>/dyn-load is found in that directory or any of its parent directories, that directory is set to be sys.exec_prefix on Linux or Mac. If the file lib/python<version>/os.py is found in the directory or any of its subdirectories, that directory is set to be sys.prefix on Linux, Mac, and Windows, with sys.exec_prefix set to the same value as sys.prefix on Windows. This entire step is skipped on Windows if applocal = true is set. Either the directory of sys.executable is used or, if home is set in pyvenv.cfg, that is used instead for the initial value of sys.prefix.

If it can’t find these “landmark” files or sys.prefix hasn’t been found yet, then python sets sys.prefix to a “fallback” value. Linux and Mac, for example, use pre-compiled defaults as the values of sys.prefix and sys.exec_prefix. Windows waits until sys.path is fully figured out to set a fallback value for sys.prefix.

Then, (what you’ve all been waiting for,) python determines the initial values that are to be contained in sys.path.

  1. The directory of the script which python is executing is added to sys.path. On Windows, this is always the empty string, which tells python to use the full path where the script is located instead.
  2. The contents of PYTHONPATH environment variable, if set, is added to sys.path, unless you’re on Windows and applocal is set to true in pyvenv.cfg.
  3. The zip file path, which is <prefix>/lib/python35.zip on Linux/Mac and os.path.join(os.path.dirname(sys.executable), "python.zip") on Windows, is added to sys.path.
  4. If on Windows and no applocal = true was set in pyvenv.cfg, then the contents of the subkeys of the registry key HK_CURRENT_USER\Software\Python\PythonCore\<DLLVersion>\PythonPath\ are added, if any.
  5. If on Windows and no applocal = true was set in pyvenv.cfg, and sys.prefix could not be found, then the core contents of the registry key HK_CURRENT_USER\Software\Python\PythonCore\<DLLVersion>\PythonPath\ is added, if it exists;
  6. If on Windows and no applocal = true was set in pyvenv.cfg, then the contents of the subkeys of the registry key HK_LOCAL_MACHINE\Software\Python\PythonCore\<DLLVersion>\PythonPath\ are added, if any.
  7. If on Windows and no applocal = true was set in pyvenv.cfg, and sys.prefix could not be found, then the core contents of the registry key HK_LOCAL_MACHINE\Software\Python\PythonCore\<DLLVersion>\PythonPath\ is added, if it exists;
  8. If on Windows, and PYTHONPATH was not set, the prefix was not found, and no registry keys were present, then the relative compile-time value of PYTHONPATH is added; otherwise, this step is ignored.
  9. Paths in the compile-time macro PYTHONPATH are added relative to the dynamically-found sys.prefix.
  10. On Mac and Linux, the value of sys.exec_prefix is added. On Windows, the directory which was used (or would have been used) to search dynamically for sys.prefix is added.

At this stage on Windows, if no prefix was found, then python will try to determine it by searching all the directories in sys.path for the landmark files, as it tried to do with the directory of sys.executable previously, until it finds something. If it doesn’t, sys.prefix is left blank.

Finally, after all this, Python loads the site module, which adds stuff yet further to sys.path:

It starts by constructing up to four directories from a head and a tail part. For the head part, it uses sys.prefix and sys.exec_prefix; empty heads are skipped. For the tail part, it uses the empty string and then lib/site-packages (on Windows) or lib/pythonX.Y/site-packages and then lib/site-python (on Unix and Macintosh). For each of the distinct head-tail combinations, it sees if it refers to an existing directory, and if so, adds it to sys.path and also inspects the newly added path for configuration files.
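
A quick way to see what this whole process produced on your own installation is to print the values directly (standard library only):

import sys

# The values whose initialization is described above:
print(sys.executable)    # resolved path of the running interpreter
print(sys.prefix)        # root of the Python installation (or virtual environment)
print(sys.exec_prefix)   # root for platform-specific files
for entry in sys.path:   # the final module search path, site additions included
    print(entry)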


How is Anaconda related to Python?

Question: How is Anaconda related to Python?

I am a beginner and I want to learn computer programming. So, for now, I have started learning Python by myself with some knowledge about programming in C and Fortran.

Now, I have installed Python version 3.6.0 and I have struggled finding a suitable text for learning Python in this version. Even the online lecture series ask for versions 2.7 and 2.5 .

Now that I have got a book which, however, makes codes in version 2 and tries to make it as close as possible in version 3 (according to the author); the author recommends “downloading Anaconda for Windows” for installing Python.

So, my question is: What is this ‘Anaconda’? I saw that it was some open data science platform. What does it mean? Is it some editor or something like Pycharm, IDLE or something?

Also, I downloaded my Python (the one that I am using right now) for Windows from Python.org and I didn’t need to install any “open data science platform”. So what is this happening?

Please explain in easy language. I don’t have too much knowledge about these.


Answer 0

Anaconda is a python and R distribution. It aims to provide everything you need (Python-wise) for data science “out of the box”.

It includes:

  • The core Python language
  • 100+ Python “packages” (libraries)
  • Spyder (IDE/editor – like PyCharm) and Jupyter
  • conda, Anaconda’s own package manager, used for updating Anaconda and packages

Your course may have recommended it as it comes with these extras but if you don’t need them and are getting on fine with vanilla Python that’s OK too.

Learn more: https://www.anaconda.com/distribution/


Answer 1

Anaconda is a Python distribution that makes it easy to install Python plus a number of its most often used 3rd party libraries in a flexible way on a Windows or Linux machine.

My experiences with it are very positive, both on Windows and Linux. It is quite complete and avoids problems in building libraries that you need from source code, that frequently plague one by one installations of those libraries by tools like pip.

By the way: It’s very wise to start with 3.5 or 3.6 since 2.7 is approaching the end of its lifecycle, though many applications still depend on it.

As for tutorials: Pythons own docs are quite suitable for learning the language.

https://docs.python.org/3/tutorial/


Answer 2

Anaconda is a Python-based data processing and scientific computing platform. It comes with many very useful third-party libraries built in. Installing Anaconda is equivalent to automatically installing Python plus commonly used libraries such as NumPy, Pandas, SciPy, and Matplotlib, which makes installation much easier than a regular Python install. If you don’t install Anaconda, but instead only install Python from python.org, you also need to use pip to install the various libraries one by one. That is painful, and you need to think about compatibility, so installing Anaconda directly is highly recommended.


Getting the MAC address

Question: Getting the MAC address

I need a cross-platform method of determining the MAC address of a computer at run time. For Windows the ‘wmi’ module can be used, and the only method under Linux I could find was to run ifconfig and run a regex across its output. I don’t like using a package that only works on one OS, and parsing the output of another program doesn’t seem very elegant, not to mention error-prone.

Does anyone know a cross-platform method (Windows and Linux) to get the MAC address? If not, does anyone know any more elegant methods than those I listed above?


Answer 0

Python 2.5 includes a uuid implementation which (in at least one version) needs the MAC address. You can import the MAC-finding function into your own code easily:

from uuid import getnode as get_mac
mac = get_mac()

The return value is the MAC address as a 48-bit integer.
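
Since the value is a plain integer, a common follow-up is formatting it in the familiar colon-separated form; a minimal sketch:

from uuid import getnode as get_mac

mac = get_mac()
# Extract the six octets, most significant first, and hex-format each one.
print(':'.join('%02x' % ((mac >> i) & 0xff) for i in range(40, -1, -8)))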


Answer 1

The pure Python solution for this problem under Linux, to get the MAC for a specific local interface, was originally posted as a comment by vishnubob and improved on by Ben Mackey in this activestate recipe:

#!/usr/bin/python

import fcntl, socket, struct

def getHwAddr(ifname):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    info = fcntl.ioctl(s.fileno(), 0x8927,  struct.pack('256s', ifname[:15]))
    return ':'.join(['%02x' % ord(char) for char in info[18:24]])

print getHwAddr('eth0')

This is the Python 3 compatible code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import fcntl
import socket
import struct


def getHwAddr(ifname):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # 0x8927 is the Linux SIOCGIFHWADDR ioctl; the hardware address is
    # returned in bytes 18-23 of the packed ifreq structure.
    info = fcntl.ioctl(s.fileno(), 0x8927,  struct.pack('256s', bytes(ifname, 'utf-8')[:15]))
    return ':'.join('%02x' % b for b in info[18:24])


def main():
    print(getHwAddr('enp0s8'))


if __name__ == "__main__":
    main()

Answer 2

netifaces is a good module to use for getting the MAC address (and other addresses). It’s cross-platform and makes a bit more sense than using socket or uuid.

>>> import netifaces
>>> netifaces.interfaces()
['lo', 'eth0', 'tun2']

>>> netifaces.ifaddresses('eth0')[netifaces.AF_LINK]
[{'addr': '08:00:27:50:f2:51', 'broadcast': 'ff:ff:ff:ff:ff:ff'}]
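
Building on that, a small sketch (the helper name is my own) that returns the MAC of the first non-loopback interface that has one:

import netifaces

def first_mac():
    # Walk the interfaces and return the first link-layer address found,
    # skipping the loopback interface.
    for iface in netifaces.interfaces():
        if iface == 'lo':
            continue
        link = netifaces.ifaddresses(iface).get(netifaces.AF_LINK)
        if link:
            return link[0]['addr']
    return None

print(first_mac())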


Answer 3

One other thing that you should note is that uuid.getnode() can fake the MAC addr by returning a random 48-bit number which may not be what you are expecting. Also, there’s no explicit indication that the MAC address has been faked, but you could detect it by calling getnode() twice and seeing if the result varies. If the same value is returned by both calls, you have the MAC address, otherwise you are getting a faked address.

>>> print uuid.getnode.__doc__
Get the hardware address as a 48-bit positive integer.

    The first time this runs, it may launch a separate program, which could
    be quite slow.  If all attempts to obtain the hardware address fail, we
    choose a random 48-bit number with its eighth bit set to 1 as recommended
    in RFC 4122.
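
Given the docstring above, a more robust check than calling getnode() twice (some uuid implementations cache the first result) is to test the bit that RFC 4122 says is set on the random fallback; a sketch:

import uuid

mac = uuid.getnode()
# The multicast bit of the first octet (bit 40 of the 48-bit integer) is 1
# for the random fallback value, and 0 for real, vendor-assigned MAC addresses.
if (mac >> 40) & 1:
    print('uuid.getnode() fell back to a random value: %012x' % mac)
else:
    print('Hardware MAC address: %012x' % mac)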

Answer 4

Sometimes we have more than one net interface.

A simple method to find out the MAC address of a specific interface is:

def getmac(interface):
    # Linux exposes each interface's hardware address under sysfs.
    try:
        mac = open('/sys/class/net/' + interface + '/address').readline()
    except IOError:
        mac = "00:00:00:00:00:00"
    return mac[0:17]

Calling the method is simple:

myMAC = getmac("wlan0")

Answer 5

Using my answer from here: https://stackoverflow.com/a/18031868/2362361

It is important to know which iface you want the MAC for, since many can exist (Bluetooth, several NICs, etc.).

This does the job when you know the IP of the iface you need the MAC for, using netifaces (available on PyPI):

import netifaces as nif
def mac_for_ip(ip):
    'Returns the MAC of the interface that has the given IP, or None if not found'
    for i in nif.interfaces():
        addrs = nif.ifaddresses(i)
        try:
            if_mac = addrs[nif.AF_LINK][0]['addr']
            if_ip = addrs[nif.AF_INET][0]['addr']
        except (IndexError, KeyError):  # ignore ifaces that don't have MAC or IP
            if_mac = if_ip = None
        if if_ip == ip:
            return if_mac
    return None

Testing:

>>> mac_for_ip('169.254.90.191')
'2c:41:38:0a:94:8b'

Answer 6

You can do this with psutil which is cross-platform:

import psutil

nics = psutil.net_if_addrs()
# Family 17 is AF_PACKET on Linux, i.e. the link-layer (MAC) entry.
print([j.address for i in nics if i != "lo" for j in nics[i] if j.family == 17])

Answer 7

Note that you can build your own cross-platform library in Python using conditional imports, e.g.:

import platform
if platform.system() == 'Linux':
  import LinuxMac
  mac_address = LinuxMac.get_mac_address()
elif platform.system() == 'Windows':
  # etc

This will allow you to use os.system calls or platform-specific libraries.


Answer 8

The cross-platform getmac package will work for this, if you don’t mind taking on a dependency. It works with Python 2.7+ and 3.4+. It will try many different methods until either getting an address or returning None.

from getmac import get_mac_address
eth_mac = get_mac_address(interface="eth0")
win_mac = get_mac_address(interface="Ethernet 3")
ip_mac = get_mac_address(ip="192.168.0.1")
ip6_mac = get_mac_address(ip6="::1")
host_mac = get_mac_address(hostname="localhost")
updated_mac = get_mac_address(ip="10.0.0.1", network_request=True)

Disclaimer: I am the author of the package.

Update (Jan 14 2019): the package now only supports Python 2.7+ and 3.4+. You can still use an older version of the package if you need to work with an older Python (2.5, 2.6, 3.2, 3.3).


Answer 9

To get the eth0 interface MAC address,

import psutil

nics = psutil.net_if_addrs()['eth0']

for interface in nics:
   if interface.family == 17:  # AF_PACKET: the link-layer (MAC) entry on Linux
      print(interface.address)

Answer 10

I don’t know of a unified way, but here’s something that you might find useful:

http://www.codeguru.com/Cpp/I-N/network/networkinformation/article.php/c5451

What I would do in this case would be to wrap these up into a function, and based on the OS it would run the proper command, parse as required, and return only the MAC address formatted as you want. It’s of course all the same, except that you only have to do it once, and it looks cleaner from the main code.


Answer 11

For Linux, let me introduce a shell script that shows the MAC address and allows changing it (MAC spoofing).

 ifconfig eth0 | grep HWaddr |cut -dH -f2|cut -d\  -f2
 00:26:6c:df:c3:95

The cut arguments may differ (I am not an expert); try:

ifconfig eth0 | grep HWaddr
eth0      Link encap:Ethernet  HWaddr 00:26:6c:df:c3:95  

To change MAC we may do:

ifconfig eth0 down
ifconfig eth0 hw ether 00:80:48:BA:d1:30
ifconfig eth0 up

This will change the MAC address to 00:80:48:BA:d1:30 (temporarily; it will revert to the actual one upon reboot).


Answer 12

Alternatively,

import uuid
mac_id = ':'.join(['{:02x}'.format((uuid.getnode() >> ele) & 0xff)
                   for ele in range(0, 8 * 6, 8)][::-1])

Answer 13

For Linux you can retrieve the MAC address using a SIOCGIFHWADDR ioctl.

struct ifreq    ifr;
uint8_t         macaddr[6];

if ((s = socket(AF_INET, SOCK_DGRAM, IPPROTO_IP)) < 0)
    return -1;

strcpy(ifr.ifr_name, "eth0");

if (ioctl(s, SIOCGIFHWADDR, (void *)&ifr) == 0) {
    if (ifr.ifr_hwaddr.sa_family == ARPHRD_ETHER) {
        memcpy(macaddr, ifr.ifr_hwaddr.sa_data, 6);
        return 0;
... etc ...

You’ve tagged the question “python”. I don’t know of an existing Python module to get this information. You could use ctypes to call the ioctl directly.


How does Pony (ORM) do its tricks?

Question: How does Pony (ORM) do its tricks?

Pony ORM does the nice trick of converting a generator expression into SQL. Example:

>>> select(p for p in Person if p.name.startswith('Paul'))
        .order_by(Person.name)[:2]

SELECT "p"."id", "p"."name", "p"."age"
FROM "Person" "p"
WHERE "p"."name" LIKE "Paul%"
ORDER BY "p"."name"
LIMIT 2

[Person[3], Person[1]]
>>>

I know Python has wonderful introspection and metaprogramming built in, but how is this library able to translate the generator expression without preprocessing? It looks like magic.

[update]

Blender wrote:

Here is the file that you’re after. It seems to reconstruct the generator using some introspection wizardry. I’m not sure if it supports 100% of Python’s syntax, but this is pretty cool. – Blender

I was thinking they were exploiting some feature of the generator expression protocol, but looking at this file, and seeing the ast module involved… No, they are not inspecting the program source on the fly, are they? Mind-blowing…

@BrenBarn: If I try to call the generator outside the select function call, the result is:

>>> x = (p for p in Person if p.age > 20)
>>> x.next()
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<interactive input>", line 1, in <genexpr>
  File "C:\Python27\lib\site-packages\pony\orm\core.py", line 1822, in next
    % self.entity.__name__)
  File "C:\Python27\lib\site-packages\pony\utils.py", line 92, in throw
    raise exc
TypeError: Use select(...) function or Person.select(...) method for iteration
>>>

Seems like they are doing more arcane incantations, like inspecting the select function call and processing the Python abstract syntax tree on the fly.

I would still like to see someone explain it; the source is way beyond my wizardry level.


Answer 0

Pony ORM author is here.

Pony translates Python generator into SQL query in three steps:

  1. Decompiling of generator bytecode and rebuilding generator AST (abstract syntax tree)
  2. Translation of Python AST into “abstract SQL” — universal list-based representation of a SQL query
  3. Converting abstract SQL representation into specific database-dependent SQL dialect

The most complex part is the second step, where Pony must understand the “meaning” of Python expressions. Seems you are most interested in the first step, so let me explain how decompiling works.

Let’s consider this query:

>>> from pony.orm.examples.estore import *
>>> select(c for c in Customer if c.country == 'USA').show()

Which will be translated into the following SQL:

SELECT "c"."id", "c"."email", "c"."password", "c"."name", "c"."country", "c"."address"
FROM "Customer" "c"
WHERE "c"."country" = 'USA'

And below is the result of this query which will be printed out:

id|email              |password|name          |country|address  
--+-------------------+--------+--------------+-------+---------
1 |john@example.com   |***     |John Smith    |USA    |address 1
2 |matthew@example.com|***     |Matthew Reed  |USA    |address 2
4 |rebecca@example.com|***     |Rebecca Lawson|USA    |address 4

The select() function accepts a Python generator as an argument, and then analyzes its bytecode. We can get the bytecode instructions of this generator using the standard Python dis module:

>>> gen = (c for c in Customer if c.country == 'USA')
>>> import dis
>>> dis.dis(gen.gi_frame.f_code)
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                26 (to 32)
              6 STORE_FAST               1 (c)
              9 LOAD_FAST                1 (c)
             12 LOAD_ATTR                0 (country)
             15 LOAD_CONST               0 ('USA')
             18 COMPARE_OP               2 (==)
             21 POP_JUMP_IF_FALSE        3
             24 LOAD_FAST                1 (c)
             27 YIELD_VALUE         
             28 POP_TOP             
             29 JUMP_ABSOLUTE            3
        >>   32 LOAD_CONST               1 (None)
             35 RETURN_VALUE

Pony ORM has the function decompile() within module pony.orm.decompiling which can restore an AST from the bytecode:

>>> from pony.orm.decompiling import decompile
>>> ast, external_names = decompile(gen)

Here, we can see the textual representation of the AST nodes:

>>> ast
GenExpr(GenExprInner(Name('c'), [GenExprFor(AssName('c', 'OP_ASSIGN'), Name('.0'),
[GenExprIf(Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]))])]))

Let’s now see how the decompile() function works.

The decompile() function creates a Decompiler object, which implements the Visitor pattern. The decompiler instance gets bytecode instructions one by one. For each instruction, the decompiler object calls its own method whose name is equal to the name of the current bytecode instruction.

When Python calculates an expression, it uses a stack, which stores the intermediate results of the calculation. The decompiler object also has its own stack, but this stack stores not the result of expression calculation, but the AST node for the expression.

When the decompiler method for the next bytecode instruction is called, it takes AST nodes from the stack, combines them into a new AST node, and then puts this node on the top of the stack.

For example, let’s see how the subexpression c.country == 'USA' is calculated. The corresponding bytecode fragment is:

              9 LOAD_FAST                1 (c)
             12 LOAD_ATTR                0 (country)
             15 LOAD_CONST               0 ('USA')
             18 COMPARE_OP               2 (==)

So, the decompiler object does the following (a toy sketch of these four steps appears after the list):

  1. Calls decompiler.LOAD_FAST('c'). This method puts the Name('c') node on the top of the decompiler stack.
  2. Calls decompiler.LOAD_ATTR('country'). This method takes the Name('c') node from the stack, creates the Getattr(Name('c'), 'country') node and puts it on the top of the stack.
  3. Calls decompiler.LOAD_CONST('USA'). This method puts the Const('USA') node on top of the stack.
  4. Calls decompiler.COMPARE_OP('=='). This method takes two nodes (Getattr and Const) from the stack, and then puts Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]) on the top of the stack.

After all bytecode instructions are processed, the decompiler stack contains a single AST node which corresponds to the whole generator expression.

Since Pony ORM needs to decompile generators and lambdas only, this is not that complex, because the instruction flow for a generator is relatively straightforward – it is just a bunch of nested loops.

Currently Pony ORM covers the whole generator instructions set except two things:

  1. Inline if expressions: a if b else c
  2. Compound comparisons: a < b < c

If Pony encounters such an expression, it raises a NotImplementedError exception. But even in this case you can make it work by passing the generator expression as a string. When you pass a generator as a string, Pony doesn’t use the decompiler module. Instead it gets the AST using the standard Python compiler.parse function.
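
For example, a compound comparison from the unsupported list, which the decompiler rejects as a generator, could be passed as a string instead (the age bounds here are just illustrative, using the Person entity from the question):

select('p for p in Person if 20 < p.age < 30')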

Hope this answers your question.

