Tag archive: dynamic

Django dynamic model fields

Question: Django dynamic model fields

I’m working on a multi-tenanted application in which some users can define their own data fields (via the admin) to collect additional data in forms and report on the data. The latter bit makes JSONField not a great option, so instead I have the following solution:

from django.conf import settings
from django.contrib.sites.models import Site
from django.db import models

class CustomDataField(models.Model):
    """
    Abstract specification for arbitrary data fields.
    Not used for holding data itself, but metadata about the fields.
    """
    # on_delete is mandatory from Django 2.0 onwards
    site = models.ForeignKey(Site, default=settings.SITE_ID, on_delete=models.CASCADE)
    name = models.CharField(max_length=64)

    class Meta:
        abstract = True

class CustomDataValue(models.Model):
    """
    Abstract specification for arbitrary data.
    """
    value = models.CharField(max_length=1024)

    class Meta:
        abstract = True

Note how CustomDataField has a ForeignKey to Site – each Site will have a different set of custom data fields, but use the same database. Then the various concrete data fields can be defined as:

from django.contrib.auth.models import User

class UserCustomDataField(CustomDataField):
    pass

class UserCustomDataValue(CustomDataValue):
    custom_field = models.ForeignKey(UserCustomDataField, on_delete=models.CASCADE)
    user = models.ForeignKey(User, related_name='custom_data', on_delete=models.CASCADE)

    class Meta:
        unique_together = (('user', 'custom_field'),)

This leads to the following use:

custom_field = UserCustomDataField.objects.create(name='zodiac', site=my_site)  # probably created in the admin
user = User.objects.create(username='foo')
# the model field is called "value", so the keyword must be value=, not data=
user_sign = UserCustomDataValue(custom_field=custom_field, user=user, value='Libra')
user.custom_data.add(user_sign)  # actually, what does this even do?

But this feels very clunky, particularly with the need to manually create the related data and associate it with the concrete model. Is there a better approach?

Options that have been pre-emptively discarded:

  • Custom SQL to modify tables on-the-fly. Partly because this won’t scale and partly because it’s too much of a hack.
  • Schema-less solutions like NoSQL. I have nothing against them, but they’re still not a good fit. Ultimately this data is typed, and the possibility exists of using a third-party reporting application.
  • JSONField, as listed above, as it’s not going to work well with queries.

Answer 0

As of today, there are four available approaches, two of them requiring a certain storage backend:

  1. Django-eav (the original package is no longer maintained but has some thriving forks)

    This solution is based on the Entity Attribute Value data model; essentially, it uses several tables to store the dynamic attributes of objects. The great parts about this solution are that it:

    • uses several pure and simple Django models to represent dynamic fields, which makes it simple to understand and database-agnostic;
    • allows you to effectively attach/detach dynamic attribute storage to a Django model with simple commands like:

      eav.unregister(Encounter)
      eav.register(Patient)

    • integrates nicely with the Django admin;
    • is at the same time really powerful.

    Downsides:

    • Not very efficient. This is more of a criticism of the EAV pattern itself, which requires manually merging the data from a column format to a set of key-value pairs in the model.
    • Harder to maintain. Maintaining data integrity requires a multi-column unique key constraint, which may be inefficient on some databases.
    • You will need to select one of the forks, since the official package is no longer maintained and there is no clear leader.

    The usage is pretty straightforward:

    import eav
    from django.db.models import Q
    # import path as in the original django-eav package; forks may differ
    from eav.models import Attribute, EnumGroup, EnumValue
    from app.models import Patient, Encounter

    eav.register(Encounter)
    eav.register(Patient)
    Attribute.objects.create(name='age', datatype=Attribute.TYPE_INT)
    Attribute.objects.create(name='height', datatype=Attribute.TYPE_FLOAT)
    Attribute.objects.create(name='weight', datatype=Attribute.TYPE_FLOAT)
    Attribute.objects.create(name='city', datatype=Attribute.TYPE_TEXT)
    Attribute.objects.create(name='country', datatype=Attribute.TYPE_TEXT)

    # Enumerated choices for the 'fever' attribute
    yes = EnumValue.objects.create(value='yes')
    no = EnumValue.objects.create(value='no')
    unknown = EnumValue.objects.create(value='unknown')
    ynu = EnumGroup.objects.create(name='Yes / No / Unknown')
    ynu.enums.add(yes)
    ynu.enums.add(no)
    ynu.enums.add(unknown)

    Attribute.objects.create(name='fever', datatype=Attribute.TYPE_ENUM,
                             enum_group=ynu)

    # Once a model is registered with EAV,
    # you can set its EAV attributes through the eav__ prefix:

    Patient.objects.create(name='Bob', eav__age=12,
                           eav__fever=no, eav__city='New York',
                           eav__country='USA')

    # You can filter queries based on their EAV fields:

    query1 = Patient.objects.filter(Q(eav__city__contains='Y'))
    query2 = Q(eav__city__contains='Y') | Q(eav__fever=no)
    Patient.objects.filter(query2)
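
The "merging" cost mentioned among the downsides can be seen without Django at all. The following is only a minimal sketch of the EAV pattern using the standard-library sqlite3 module; the table names (patient, attribute, attr_value) and the helpers set_attr and get_attrs are hypothetical and are not how django-eav lays out its tables. Every value is stored as text here, which is a simplification.

```python
import sqlite3


def build_eav_db():
    """Create an in-memory database with a hypothetical, minimal EAV layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE attr_value (
            entity_id INTEGER NOT NULL REFERENCES patient(id),
            attribute_id INTEGER NOT NULL REFERENCES attribute(id),
            value TEXT NOT NULL,
            -- the multi-column unique key that keeps one value per attribute
            UNIQUE (entity_id, attribute_id)
        );
        """
    )
    return conn


def set_attr(conn, entity_id, name, value):
    """Store one attribute value as a row; the attribute is created on demand."""
    conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (name,))
    (attr_id,) = conn.execute(
        "SELECT id FROM attribute WHERE name = ?", (name,)
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO attr_value (entity_id, attribute_id, value) "
        "VALUES (?, ?, ?)",
        (entity_id, attr_id, str(value)),
    )


def get_attrs(conn, entity_id):
    """Merge the value rows of one entity back into a key/value dict."""
    rows = conn.execute(
        "SELECT a.name, v.value FROM attr_value v "
        "JOIN attribute a ON a.id = v.attribute_id WHERE v.entity_id = ?",
        (entity_id,),
    )
    return dict(rows)
```

Reading a single entity already needs a join plus a pivot in application code, and filtering on several attributes at once needs one join per attribute, which is where the inefficiency comes from.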
    
  2. Hstore, JSON or JSONB fields in PostgreSQL

    PostgreSQL supports several more complex data types. Most are supported via third-party packages, but in recent years Django has adopted them into django.contrib.postgres.fields.

    HStoreField:

    Django-hstore was originally a third-party package, but Django 1.8 added HStoreField as a built-in, along with several other PostgreSQL-supported field types.

    This approach is good in a sense that it lets you have the best of both worlds: dynamic fields and relational database. However, hstore is not ideal performance-wise, especially if you are going to end up storing thousands of items in one field. It also only supports strings for values.

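    # app/models.py
    from django.contrib.postgres.fields import HStoreField
    from django.db import models

    class Something(models.Model):
        name = models.CharField(max_length=32)
        # HStoreField is imported directly; it is not an attribute of django.db.models.
        # default=dict lets rows created without data start out as {}.
        data = HStoreField(db_index=True, default=dict)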
    

    In Django’s shell you can use it like this:

    >>> instance = Something.objects.create(
                     name='something',
                     data={'a': '1', 'b': '2'}
               )
    >>> instance.data['a']
    '1'        
    >>> empty = Something.objects.create(name='empty')
    >>> empty.data
    {}
    >>> empty.data['a'] = '1'
    >>> empty.save()
    >>> Something.objects.get(name='something').data['a']
    '1'
    

    You can issue indexed queries against hstore fields:

    # equivalence
    Something.objects.filter(data={'a': '1', 'b': '2'})
    
    # subset by key/value mapping
    Something.objects.filter(data__a='1')
    
    # subset by list of keys
    Something.objects.filter(data__has_keys=['a', 'b'])
    
    # subset by single key
    Something.objects.filter(data__has_key='a')    
    

    JSONField:

    JSON/JSONB fields support any JSON-encodable data type, not just key/value pairs, and they also tend to be faster and (for JSONB) more compact than Hstore. Several packages implement JSON/JSONB fields, including django-pgfields, but as of Django 1.9, JSONField is a built-in that uses JSONB for storage. JSONField is similar to HStoreField, and may perform better with large dictionaries. It also supports types other than strings, such as integers, booleans and nested dictionaries. (Since Django 3.1 the field is also available as django.db.models.JSONField, and the django.contrib.postgres version is deprecated.)

    # app/models.py
    from django.contrib.postgres.fields import JSONField
    from django.db import models

    class Something(models.Model):
        name = models.CharField(max_length=32)
        data = JSONField(db_index=True)
    

    Creating in the shell:

    >>> instance = Something.objects.create(
                     name='something',
                     data={'a': 1, 'b': 2, 'nested': {'c':3}}
               )
    

    Indexed queries are nearly identical to HStoreField, except that nesting is possible. Complex indexes may require manual creation (or a scripted migration).

    >>> Something.objects.filter(data__a=1)
    >>> Something.objects.filter(data__nested__c=3)
    >>> Something.objects.filter(data__has_key='a')
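
The practical difference between the two storage types, strings only versus typed and nested values, can be shown in plain Python. This is only an illustrative sketch: as_hstore and as_json_roundtrip are hypothetical helpers that mimic the constraints described above, they are not part of Django or PostgreSQL.

```python
import json


def as_hstore(mapping):
    """Mimic hstore's constraint: a flat mapping of string keys to string values."""
    flat = {}
    for key, value in mapping.items():
        if isinstance(value, (dict, list)):
            raise TypeError("hstore cannot hold a nested value for key %r" % (key,))
        # NULL values are allowed; everything else is coerced to a string
        flat[str(key)] = None if value is None else str(value)
    return flat


def as_json_roundtrip(mapping):
    """Mimic JSON storage: encode and decode, keeping JSON types and nesting."""
    return json.loads(json.dumps(mapping))
```

With these helpers, as_hstore({'a': 1}) gives {'a': '1'}, so a later filter has to compare against the string '1', whereas as_json_roundtrip({'a': 1, 'nested': {'c': 3}}) comes back unchanged.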
    
  3. Django MongoDB

    Or other NoSQL Django adaptations — with them you can have fully dynamic models.

    NoSQL Django libraries are great, but keep in mind that they are not 100% Django-compatible; for example, to migrate to Django-nonrel from standard Django you will need to replace ManyToMany with ListField, among other things.

    Check out this Django MongoDB example:

    from django.db import models
    from djangotoolbox.fields import DictField

    class Image(models.Model):
        exif = DictField()
    ...

    >>> image = Image.objects.create(exif=get_exif_data(...))
    >>> image.exif
    {u'camera_model' : 'Spamcams 4242', 'exposure_time' : 0.3, ...}
    

    You can even create embedded lists of any Django models:

    from django.db import models
    from djangotoolbox.fields import EmbeddedModelField, ListField

    class Container(models.Model):
        stuff = ListField(EmbeddedModelField())

    class FooModel(models.Model):
        foo = models.IntegerField()

    class BarModel(models.Model):
        bar = models.CharField()
    ...

    >>> Container.objects.create(
        stuff=[FooModel(foo=42), BarModel(bar='spam')]
    )
    
  4. Django-mutant: Dynamic models based on syncdb and South-hooks

    Django-mutant implements fully dynamic Foreign Key and m2m fields. It is inspired by the incredible but somewhat hackish solutions by Will Hardy and Michael Hall.

    All of these are based on Django South hooks, which, according to Will Hardy’s talk at DjangoCon 2011 (watch it!), are nevertheless robust and tested in production (relevant source code).

    The first to implement this was Michael Hall.

    Yes, this is magic: with these approaches you can achieve fully dynamic Django apps, models and fields with any relational database backend. But at what cost? Will the stability of the application suffer under heavy use? These are the questions to consider. You need to be sure to maintain a proper lock in order to allow simultaneous database-altering requests.

    If you are using Michael Hall’s lib, your code will look like this:

    from dynamo import models
    
    test_app, created = models.DynamicApp.objects.get_or_create(
                          name='dynamo'
                        )
    test, created = models.DynamicModel.objects.get_or_create(
                      name='Test',
                      verbose_name='Test Model',
                      app=test_app
                   )
    foo, created = models.DynamicModelField.objects.get_or_create(
                      name = 'foo',
                      verbose_name = 'Foo Field',
                      model = test,
                      field_type = 'dynamiccharfield',
                      null = True,
                      blank = True,
                      unique = False,
                      help_text = 'Test field for Foo',
                   )
    bar, created = models.DynamicModelField.objects.get_or_create(
                      name = 'bar',
                      verbose_name = 'Bar Field',
                      model = test,
                      field_type = 'dynamicintegerfield',
                      null = True,
                      blank = True,
                      unique = False,
                      help_text = 'Test field for Bar',
                   )
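
The locking concern raised for the schema-altering approach can be sketched with the standard library alone. This is only an illustration under stated assumptions: it uses sqlite3 and a threading.Lock, which protects threads inside a single process only, and the names schema_lock and add_column are hypothetical. A multi-process deployment would need a shared lock instead (for example one held in the database or in a cache), and neither django-mutant nor dynamo is claimed to work this way.

```python
import sqlite3
import threading

# In-process lock only; a stand-in for whatever shared lock a real deployment uses.
schema_lock = threading.Lock()


def add_column(conn, table, column, sql_type="TEXT"):
    """Add a column if it is missing, serialising concurrent schema changes.

    Returns True if the column was added, False if it already existed.
    """
    # Identifiers cannot be bound as SQL parameters, so validate them before
    # interpolating them into the statement.
    for identifier in (table, column, sql_type):
        if not identifier.isidentifier():
            raise ValueError("unsafe identifier: %r" % (identifier,))
    with schema_lock:
        # Check and alter under the same lock so two requests cannot both
        # decide that the column is missing.
        existing = {row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)}
        if column in existing:
            return False
        conn.execute("ALTER TABLE %s ADD COLUMN %s %s" % (table, column, sql_type))
        conn.commit()
        return True
```

The important design point is that the existence check and the ALTER TABLE happen inside one critical section; checking first and locking only around the alteration would still allow two requests to race.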
    

Answer 1

I’ve been working on pushing the django-dynamo idea further. The project is still undocumented but you can read the code at https://github.com/charettes/django-mutant.

Actually, FK and M2M fields (see contrib.related) also work, and it’s even possible to define a wrapper for your own custom fields.

There’s also support for model options such as unique_together and ordering, plus Model bases, so you can subclass proxy models, abstract models or mixins.

I’m actually working on a lock mechanism that is not in-memory, to make sure model definitions can be shared across multiple running Django instances while preventing them from using an obsolete definition.

The project is still very alpha, but it’s a cornerstone technology for one of my projects, so I’ll have to take it to production-ready. The big plan is to support django-nonrel as well, so we can leverage the mongodb driver.


Answer 2

Further research reveals that this is a somewhat special case of the Entity Attribute Value design pattern, which has been implemented for Django by a couple of packages.

First, there’s the original eav-django project, which is on PyPI.

Second, there’s a more recent fork of the first project, django-eav, which is primarily a refactor to allow use of EAV with Django’s own models or models in third-party apps.


What’s the difference between eval, exec, and compile?

Question: What’s the difference between eval, exec, and compile?

I’ve been looking at dynamic evaluation of Python code, and have come across the eval() and compile() functions, and the exec statement.

Can someone please explain the difference between eval and exec, and how the different modes of compile() fit in?


Answer 0

简短答案,即TL; DR

基本上,eval用于EVAL审视你们单个动态生成的Python表达式,并exec用于EXEC动态生成的Python代码仅针对其副作用尤特。

evalexec具有以下两个区别:

  1. eval仅接受一个表达式exec可以采用具有Python语句的代码块:循环try: except:class和函数/方法def初始化等。

    Python中的表达式就是变量赋值中的值:

    a_variable = (anything you can put within these parentheses is an expression)
  2. eval 返回给定表达式的值,而exec忽略其代码中的返回值,并始终返回None(在Python 2中,它是一条语句,不能用作表达式,因此它实际上不返回任何内容)。

在1.0-2.7版本中,exec有一条声明是因为CPython需要为函数生成另一种类型的代码对象,这些代码对象用于在函数exec内部产生副作用。

在Python 3中,exec是一个函数;它的使用对使用它的函数的已编译字节码没有影响。


因此基本上:

>>> a = 5
>>> eval('37 + a')   # it is an expression
42
>>> exec('37 + a')   # it is an expression statement; value is ignored (None is returned)
>>> exec('a = 47')   # modify a global variable as a side effect
>>> a
47
>>> eval('a = 47')  # you cannot evaluate a statement
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    a = 47
      ^
SyntaxError: invalid syntax

compile'exec'模式编译任何数目的语句编译成字节码隐含总是返回None,而在'eval'模式它编译一个单一表达式成字节码即返回该表达式的值。

>>> eval(compile('42', '<string>', 'exec'))  # code returns None
>>> eval(compile('42', '<string>', 'eval'))  # code returns 42
42
>>> exec(compile('42', '<string>', 'eval'))  # code returns 42,
>>>                                          # but ignored by exec

在这种'eval'模式下(eval如果传递了一个字符串,则在函数中),compile如果源代码包含语句或除单个表达式之外的任何其他内容,则会引发异常:

>>> compile('for i in range(3): print(i)', '<string>', 'eval')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print(i)
      ^
SyntaxError: invalid syntax

实际上,“ eval仅接受单个表达式”语句仅在将字符串(包含Python 源代码)传递给时适用eval。然后将其内部使用编译为字节码。compile(source, '<string>', 'eval')这才是真正的区别。

如果将一个code对象(包含Python 字节码)传递给execeval,则它们的行为相同,除了exec忽略返回值的事实外,它None始终会始终返回。因此eval,如果您只是将compile它先转换为字节码而不是将其作为字符串传递,则可以执行具有语句的内容:

>>> eval(compile('if 1: print("Hello")', '<string>', 'exec'))
Hello
>>>

即使已编译的代码包含语句,也可以正常工作。它仍然会返回None,因为那是从中返回的代码对象的返回值。compile

在这种'eval'模式下(eval如果传递了一个字符串,则在函数中),compile如果源代码包含语句或除单个表达式之外的任何其他内容,则会引发异常:

>>> compile('for i in range(3): print(i)', '<string>'. 'eval')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print(i)
      ^
SyntaxError: invalid syntax

答案越长,又称血腥细节

execeval

exec函数(在Python 2中为语句)用于执行动态创建的语句或程序:

>>> program = '''
for i in range(3):
    print("Python is cool")
'''
>>> exec(program)
Python is cool
Python is cool
Python is cool
>>> 

eval函数对单个表达式执行相同的操作,返回表达式的值:

>>> a = 2
>>> my_calculation = '42 * a'
>>> result = eval(my_calculation)
>>> result
84

execeval均接受该程序/表达到无论是作为一个运行strunicodebytes对象包含源代码,或者作为一个code对象包含的Python字节码。

如果str/ unicode/ bytes包含源代码传递给exec,它等效行为与:

exec(compile(source, '<string>', 'exec'))

并且eval类似地等效于:

eval(compile(source, '<string>', 'eval'))

由于所有表达式都可以用作Python中的语句(Expr在Python 抽象语法中被称为节点;反之则不成立),exec如果不需要返回值,则可以始终使用。也就是说,您可以使用eval('my_func(42)')exec('my_func(42)'),区别在于eval返回的返回值是my_func,并将其exec丢弃:

>>> def my_func(arg):
...     print("Called with %d" % arg)
...     return arg * 2
... 
>>> exec('my_func(42)')
Called with 42
>>> eval('my_func(42)')
Called with 42
84
>>> 

2,只有exec接受包含语句,源代码一样defforwhileimport,或者class,赋值语句(又名a = 42),或整个程序:

>>> exec('for i in range(3): print(i)')
0
1
2
>>> eval('for i in range(3): print(i)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print(i)
      ^
SyntaxError: invalid syntax

双方execeval接受2个额外的位置参数- globalslocals-这是全局和局部变量的作用域,该代码看到。它们默认为globals()和,它们locals()在称为exec或的范围内eval,但任何字典都可以用于globals和,mapping用于localsdict当然包括)。这些不仅可以用于限制/修改代码中看到的变量,而且还经常用于捕获被引用exec代码创建的变量:

>>> g = dict()
>>> l = dict()
>>> exec('global a; a, b = 123, 42', g, l)
>>> g['a']
123
>>> l
{'b': 42}

(如果您显示整个的价值g,这将是更长的时间,因为execeval添加内置插件模块__builtins__来自动如果缺少它的全局变量)。

在Python 2中,该exec语句的正式语法实际上是exec code in globals, locals,如

>>> exec 'global a; a, b = 123, 42' in g, l

但是,替代语法exec(code, globals, locals)也一直被接受(见下文)。

compile

所述compile(source, filename, mode, flags=0, dont_inherit=False, optimize=-1)内置的可用于加快与相同的码的重复调用execeval通过编译源到code对象预先。所述mode参数控制的那种代码片段的compile函数接受和种字节码它产生。选择是'eval''exec''single'

  • 'eval'模式需要一个表达式,并将生成字节码,运行时将返回该表达式的值:

    >>> dis.dis(compile('a + b', '<string>', 'eval'))
      1           0 LOAD_NAME                0 (a)
                  3 LOAD_NAME                1 (b)
                  6 BINARY_ADD
                  7 RETURN_VALUE
  • 'exec'接受从单个表达式到整个代码模块的任何类型的python构造,并像将其作为模块顶级语句一样执行它们。代码对象返回None

    >>> dis.dis(compile('a + b', '<string>', 'exec'))
      1           0 LOAD_NAME                0 (a)
                  3 LOAD_NAME                1 (b)
                  6 BINARY_ADD
                  7 POP_TOP                             <- discard result
                  8 LOAD_CONST               0 (None)   <- load None on stack
                 11 RETURN_VALUE                        <- return top of stack
  • 'single'是一种有限形式,如果最后一条语句是表达式语句,则该格式'exec'接受包含单个语句(或多个由分隔的语句;)的源代码,生成的字节码还将该表达式的值打印repr到标准output(!)上

    一个ifelifelse链,有一个循环else,并try用它exceptelsefinally块被视为一个单独的语句。

    包含2个顶级语句的源代码片段是的错误'single',但在Python 2中存在一个错误,有时会在代码中允许多个顶级语句。只有第一个被编译;其余的将被忽略:

    在Python 2.7.8中:

    >>> exec(compile('a = 5\na = 6', '<string>', 'single'))
    >>> a
    5

    在Python 3.4.2中:

    >>> exec(compile('a = 5\na = 6', '<string>', 'single'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1
        a = 5
            ^
    SyntaxError: multiple statements found while compiling a single statement

    这对于制作交互式Python Shell非常有用。但是,即使返回eval结果代码,也不返回表达式的值。

这样的最大区别execeval实际上来自compile函数及其模式。


除了将源代码编译为字节码之外,还compile支持将抽象语法树(Python代码的解析树)编译为code对象;并将源代码转换成抽象语法树(ast.parse用Python编写,仅调用compile(source, filename, mode, PyCF_ONLY_AST));这些代码用于动态修改源代码,以及动态代码创建,因为在复杂情况下,将代码作为节点树而不是文本行来处理通常会更容易。


虽然eval只允许您评估包含单个表达式的字符串,但是您可以eval使用整个语句,甚至可以是已被compile打包为字节码的整个模块。也就是说,对于Python 2,这print是一条语句,不能直接eval导致:

>>> eval('for i in range(3): print("Python is cool")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print("Python is cool")
      ^
SyntaxError: invalid syntax

compile'exec'模式将它变成一个code对象,你就能eval 做到 ; 该eval函数将返回None

>>> code = compile('for i in range(3): print("Python is cool")',
                   'foo.py', 'exec')
>>> eval(code)
Python is cool
Python is cool
Python is cool

如果一个长相到evalexec源代码CPython的3,这是很明显的; 它们都PyEval_EvalCode使用相同的参数调用,唯一的区别是exec显式返回None

execPython 2和Python 3之间的语法差异

其中一个在Python的主要区别2exec一个声明,eval是一个内置的功能(两者都内置函数在Python 3)。众所周知exec,Python 2 中的正式语法为exec code [in globals[, locals]]

与大多数Python 2到3 移植 指南 似乎并不像 建议的那样execCPython 2中的语句也可以与看起来 完全execPython 3中的函数调用的语法一起使用。原因是Python 0.9.9具有exec(code, globals, locals)内置的在功能上!并且该内置函数在Python 1.0发布之前的某处exec语句替换。

由于这是可取的不破与Python 0.9.9向后兼容性,吉多·范罗苏姆在1993年增加了兼容性劈:如果code是长度为2或3的元组,并globalslocals未传递到exec声明,否则,code将被解释就像元组的第二个元素和第三个元素分别是globals和一样locals。即使在Python 1.4文档(在线最早可用的版本)中也没有提到兼容性hack ;因此对于移植指南和工具的许多作者并不了解,直到2012年11月再次对其进行了记录

第一个表达式也可以是长度为2或3的元组。在这种情况下,必须省略可选部分。形式exec(expr, globals)等同于exec expr in globals,而形式exec(expr, globals, locals)等同于exec expr in globals, locals。元组形式exec提供了与Python 3的兼容性,Python 3 exec是函数而不是语句。

是的,在CPython 2.7中它被方便地称为前向兼容选项(为什么使人们感到困惑,因为根本没有向后兼容选项),实际上它已经存在了二十年了

因此,虽然exec在Python 1和Python 2中是一个语句,而在Python 3和Python 0.9.9中是一个内置函数,

>>> exec("print(a)", globals(), {'a': 42})
42

在可能的每个广泛发行的Python版本中都具有相同的行为;并且也可以在Jython 2.5.2,PyPy 2.3.1(Python 2.7.6)和IronPython 2.6.1中使用(对它们的严格遵循CPython的未记录的行为表示敬意)。

在Pythons 1.0-2.7中,通过其兼容性技巧,您不能做的是将返回值存储exec到变量中:

Python 2.7.11+ (default, Apr 17 2016, 14:00:29) 
[GCC 5.3.1 20160413] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = exec('print(42)')
  File "<stdin>", line 1
    a = exec('print(42)')
           ^
SyntaxError: invalid syntax

(这在Python 3中也没有用,因为它exec总是返回None),或将引用传递给exec

>>> call_later(exec, 'print(42)', delay=1000)
  File "<stdin>", line 1
    call_later(exec, 'print(42)', delay=1000)
                  ^
SyntaxError: invalid syntax

某人可能实际使用过的一种模式,尽管可能性不大;

或在列表理解中使用它:

>>> [exec(i) for i in ['print(42)', 'print(foo)']
  File "<stdin>", line 1
    [exec(i) for i in ['print(42)', 'print(foo)']
        ^
SyntaxError: invalid syntax

这是对列表理解的滥用(请for改为使用循环!)。

The short answer, or TL;DR

Basically, eval is used to evaluate a single dynamically generated Python expression, and exec is used to execute dynamically generated Python code only for its side effects.

eval and exec have these two differences:

  1. eval accepts only a single expression, exec can take a code block that has Python statements: loops, try: except:, class and function/method definitions and so on.

    An expression in Python is whatever you can have as the value in a variable assignment:

    a_variable = (anything you can put within these parentheses is an expression)
    
  2. eval returns the value of the given expression, whereas exec ignores the return value from its code, and always returns None (in Python 2 it is a statement and cannot be used as an expression, so it really does not return anything).

In versions 1.0 – 2.7, exec was a statement, because CPython needed to produce a different kind of code object for functions that used exec for its side effects inside the function.

In Python 3, exec is a function; its use has no effect on the compiled bytecode of the function where it is used.


Thus basically:

>>> a = 5
>>> eval('37 + a')   # it is an expression
42
>>> exec('37 + a')   # it is an expression statement; value is ignored (None is returned)
>>> exec('a = 47')   # modify a global variable as a side effect
>>> a
47
>>> eval('a = 47')  # you cannot evaluate a statement
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    a = 47
      ^
SyntaxError: invalid syntax

The compile in 'exec' mode compiles any number of statements into a bytecode that implicitly always returns None, whereas in 'eval' mode it compiles a single expression into bytecode that returns the value of that expression.

>>> eval(compile('42', '<string>', 'exec'))  # code returns None
>>> eval(compile('42', '<string>', 'eval'))  # code returns 42
42
>>> exec(compile('42', '<string>', 'eval'))  # code returns 42,
>>>                                          # but ignored by exec

In the 'eval' mode (and thus with the eval function if a string is passed in), the compile raises an exception if the source code contains statements or anything else beyond a single expression:

>>> compile('for i in range(3): print(i)', '<string>', 'eval')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print(i)
      ^
SyntaxError: invalid syntax

Actually, the statement “eval accepts only a single expression” applies only when a string (which contains Python source code) is passed to eval. The string is then internally compiled to bytecode using compile(source, '<string>', 'eval'). This is where the difference really comes from.

If a code object (which contains Python bytecode) is passed to exec or eval, they behave identically, except that exec ignores the return value and still always returns None. So it is possible to use eval to execute something that has statements, if you compile it into bytecode beforehand instead of passing it as a string:

>>> eval(compile('if 1: print("Hello")', '<string>', 'exec'))
Hello
>>>

works without problems, even though the compiled code contains statements. It still returns None, because that is the return value of the code object returned from compile.

The longer answer, a.k.a the gory details

exec and eval

The exec function (which was a statement in Python 2) is used for executing a dynamically created statement or program:

>>> program = '''
for i in range(3):
    print("Python is cool")
'''
>>> exec(program)
Python is cool
Python is cool
Python is cool
>>> 

The eval function does the same for a single expression, and returns the value of the expression:

>>> a = 2
>>> my_calculation = '42 * a'
>>> result = eval(my_calculation)
>>> result
84

exec and eval both accept the program/expression to be run either as a str, unicode or bytes object containing source code, or as a code object which contains Python bytecode.

If a str/unicode/bytes containing source code was passed to exec, it behaves equivalently to:

exec(compile(source, '<string>', 'exec'))

and eval similarly behaves equivalently to:

eval(compile(source, '<string>', 'eval'))

Since all expressions can be used as statements in Python (these are called the Expr nodes in the Python abstract grammar; the opposite is not true), you can always use exec if you do not need the return value. That is to say, you can use either eval('my_func(42)') or exec('my_func(42)'), the difference being that eval returns the value returned by my_func, and exec discards it:

>>> def my_func(arg):
...     print("Called with %d" % arg)
...     return arg * 2
... 
>>> exec('my_func(42)')
Called with 42
>>> eval('my_func(42)')
Called with 42
84
>>> 

Of the two, only exec accepts source code that contains statements, like def, for, while, import, or class, the assignment statement (e.g. a = 42), or entire programs:

>>> exec('for i in range(3): print(i)')
0
1
2
>>> eval('for i in range(3): print(i)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print(i)
      ^
SyntaxError: invalid syntax

Both exec and eval accept 2 additional positional arguments – globals and locals – which are the global and local variable scopes that the code sees. These default to the globals() and locals() within the scope that called exec or eval, but any dictionary can be used for globals and any mapping for locals (including dict of course). These can be used not only to restrict/modify the variables that the code sees, but are often also used for capturing the variables that the executed code creates:

>>> g = dict()
>>> l = dict()
>>> exec('global a; a, b = 123, 42', g, l)
>>> g['a']
123
>>> l
{'b': 42}

(If you display the value of the entire g, it would be much longer, because exec and eval add the built-ins module as __builtins__ to the globals automatically if it is missing).
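The capturing pattern described above can be sketched as follows for Python 3. The names `source`, `namespace` and `captured` are only illustrative; passing a single dict makes it serve as both globals and locals.

```python
# Run a snippet in a fresh namespace and inspect what it defined.
source = """
x = 6 * 7
def double(n):
    return 2 * n
"""

namespace = {}            # used as both globals and locals
exec(source, namespace)

# Names created by the snippet end up in the dict, not in our own scope.
# exec also inserted a __builtins__ entry, which we filter out here.
captured = {k: v for k, v in namespace.items() if k != "__builtins__"}
print(sorted(captured))
print(namespace["double"](namespace["x"]))
```

Filtering out `__builtins__` is just a convenience for display; the executed code still needs that entry to reach the built-in functions.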

In Python 2, the official syntax for the exec statement is actually exec code in globals, locals, as in

>>> exec 'global a; a, b = 123, 42' in g, l

However the alternate syntax exec(code, globals, locals) has always been accepted too (see below).

compile

The compile(source, filename, mode, flags=0, dont_inherit=False, optimize=-1) built-in can be used to speed up repeated invocations of the same code with exec or eval by compiling the source into a code object beforehand. The mode parameter controls the kind of code fragment the compile function accepts and the kind of bytecode it produces. The choices are 'eval', 'exec' and 'single':

  • 'eval' mode expects a single expression, and will produce bytecode that when run will return the value of that expression:

    >>> import dis
    >>> dis.dis(compile('a + b', '<string>', 'eval'))
      1           0 LOAD_NAME                0 (a)
                  3 LOAD_NAME                1 (b)
                  6 BINARY_ADD
                  7 RETURN_VALUE
    
  • 'exec' accepts any kinds of python constructs from single expressions to whole modules of code, and executes them as if they were module top-level statements. The code object returns None:

    >>> dis.dis(compile('a + b', '<string>', 'exec'))
      1           0 LOAD_NAME                0 (a)
                  3 LOAD_NAME                1 (b)
                  6 BINARY_ADD
                  7 POP_TOP                             <- discard result
                  8 LOAD_CONST               0 (None)   <- load None on stack
                 11 RETURN_VALUE                        <- return top of stack
    
  • 'single' is a limited form of 'exec' that accepts source code containing a single statement (or multiple statements separated by ;). If the last statement is an expression statement, the resulting bytecode also prints the repr of the value of that expression to the standard output(!).

    An if-elif-else chain, a loop with else, and try with its except, else and finally blocks are each considered a single statement.

    A source fragment containing 2 top-level statements is an error in 'single' mode, except that Python 2 has a bug that sometimes allows multiple top-level statements in the code; only the first is compiled and the rest are ignored:

    In Python 2.7.8:

    >>> exec(compile('a = 5\na = 6', '<string>', 'single'))
    >>> a
    5
    

    And in Python 3.4.2:

    >>> exec(compile('a = 5\na = 6', '<string>', 'single'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1
        a = 5
            ^
    SyntaxError: multiple statements found while compiling a single statement
    

    This is very useful for making interactive Python shells. However, the value of the expression is not returned, even if you eval the resulting code.
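A minimal Python 3 sketch of the 'single' behaviour just described: the value of the expression statement is printed rather than returned. Standard output is redirected here only so that the printed text can be inspected; the variable names are illustrative.

```python
import contextlib
import io

code = compile("6 * 7", "<string>", "single")

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # The bytecode prints the repr of the value; eval still returns None.
    result = eval(code, {})

printed = buffer.getvalue()
```

After this runs, `printed` holds the text an interactive shell would have shown, while `result` is None.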

Thus the greatest distinction between exec and eval actually comes from the compile function and its modes.
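As a rough illustration of the earlier point about repeated invocations, an expression can be compiled once and then evaluated against many different local namespaces. This is a sketch only; the names `expression`, `rows` and `shared_globals`, and the `<pricing>` filename, are made up for the example.

```python
# Compile once; the code object can then be evaluated many times.
expression = compile("price * quantity", "<pricing>", "eval")

rows = [
    {"price": 2.5, "quantity": 4},
    {"price": 10, "quantity": 3},
]

shared_globals = {}  # reused so __builtins__ is only inserted once

# Each row dict is passed as the locals mapping for one evaluation.
totals = [eval(expression, shared_globals, row) for row in rows]
print(totals)
```

Whether this is noticeably faster than passing the string each time depends on how large the source is and how often it runs, so it is worth measuring before relying on it.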


In addition to compiling source code to bytecode, compile supports compiling abstract syntax trees (parse trees of Python code) into code objects, and source code into abstract syntax trees (ast.parse is written in Python and just calls compile(source, filename, mode, PyCF_ONLY_AST)). These are used, for example, for modifying source code on the fly and for dynamic code creation, as in complex cases it is often easier to handle the code as a tree of nodes than as lines of text.
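A small sketch of that round trip in Python 3: parse source into a tree, modify the tree, then compile and evaluate it. The `AddToMul` transformer is a hypothetical example written for this illustration, not something from the standard library.

```python
import ast

# Source code -> abstract syntax tree (an ast.Expression in 'eval' mode).
tree = ast.parse("a + b", mode="eval")


class AddToMul(ast.NodeTransformer):
    """Hypothetical transformer: rewrite every `x + y` into `x * y`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)          # transform nested operands first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


new_tree = ast.fix_missing_locations(AddToMul().visit(tree))

# Abstract syntax tree -> code object -> value.
code = compile(new_tree, "<ast>", "eval")
result = eval(code, {"a": 6, "b": 7})
print(result)
```

The mode passed to compile has to match the kind of tree: an 'eval' tree here, or an ast.Module for 'exec'.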


While eval only allows you to evaluate a string that contains a single expression, you can eval a whole statement, or even a whole module, once it has been compiled into bytecode. A for loop is a statement (as is print in Python 2), so it cannot be evalled directly as a string:

>>> eval('for i in range(3): print("Python is cool")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    for i in range(3): print("Python is cool")
      ^
SyntaxError: invalid syntax

Compile it with 'exec' mode into a code object and you can eval it; the eval function will return None:

>>> code = compile('for i in range(3): print("Python is cool")',
                   'foo.py', 'exec')
>>> eval(code)
Python is cool
Python is cool
Python is cool

If one looks into the eval and exec source code in CPython 3, this is very evident; they both call PyEval_EvalCode with the same arguments, the only difference being that exec explicitly returns None.

Syntax differences of exec between Python 2 and Python 3

One of the major differences in Python 2 is that exec is a statement and eval is a built-in function (both are built-in functions in Python 3). It is a well-known fact that the official syntax of exec in Python 2 is exec code [in globals[, locals]].

Unlike what the majority of the Python 2-to-3 porting guides seem to suggest, the exec statement in CPython 2 can also be used with syntax that looks exactly like the exec function invocation in Python 3. The reason is that Python 0.9.9 had the exec(code, globals, locals) built-in function! That built-in function was replaced with the exec statement somewhere before the Python 1.0 release.

Since it was desirable to not break backwards compatibility with Python 0.9.9, Guido van Rossum added a compatibility hack in 1993: if the code was a tuple of length 2 or 3, and globals and locals were not passed into the exec statement otherwise, the code would be interpreted as if the 2nd and 3rd element of the tuple were the globals and locals respectively. The compatibility hack was not mentioned even in Python 1.4 documentation (the earliest available version online); and thus was not known to many writers of the porting guides and tools, until it was documented again in November 2012:

The first expression may also be a tuple of length 2 or 3. In this case, the optional parts must be omitted. The form exec(expr, globals) is equivalent to exec expr in globals, while the form exec(expr, globals, locals) is equivalent to exec expr in globals, locals. The tuple form of exec provides compatibility with Python 3, where exec is a function rather than a statement.

Yes, in CPython 2.7 it is handily referred to as a forward-compatibility option (why confuse people by mentioning that a backward-compatibility option exists at all), when it had actually been there for backward compatibility for two decades.

Thus while exec is a statement in Python 1 and Python 2, and a built-in function in Python 3 and Python 0.9.9,

>>> exec("print(a)", globals(), {'a': 42})
42

has had identical behaviour in possibly every widely released Python version ever; and works in Jython 2.5.2, PyPy 2.3.1 (Python 2.7.6) and IronPython 2.6.1 too (kudos to them following the undocumented behaviour of CPython closely).

What you cannot do in Pythons 1.0 – 2.7 with its compatibility hack, is to store the return value of exec into a variable:

Python 2.7.11+ (default, Apr 17 2016, 14:00:29) 
[GCC 5.3.1 20160413] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = exec('print(42)')
  File "<stdin>", line 1
    a = exec('print(42)')
           ^
SyntaxError: invalid syntax

(which wouldn’t be useful in Python 3 either, as exec always returns None), or pass a reference to exec:

>>> call_later(exec, 'print(42)', delay=1000)
  File "<stdin>", line 1
    call_later(exec, 'print(42)', delay=1000)
                  ^
SyntaxError: invalid syntax

which is a pattern that someone might actually have used, though it is unlikely;

Or use it in a list comprehension:

>>> [exec(i) for i in ['print(42)', 'print(foo)']]
  File "<stdin>", line 1
    [exec(i) for i in ['print(42)', 'print(foo)']]
        ^
SyntaxError: invalid syntax

which is abuse of list comprehensions (use a for loop instead!).


Answer 1

  1. exec is not an expression: it is a statement in Python 2.x and a function in Python 3.x. It compiles and immediately evaluates a statement or set of statements contained in a string. Example:

    exec('print(5)')           # prints 5.
    # exec 'print 5'     in Python 2.x, where neither exec nor print is a function
    exec('print(5)\nprint(6)')  # prints 5{newline}6.
    exec('if True: print(6)')  # prints 6.
    exec('5')                 # does nothing and returns nothing.
    
  2. eval is a built-in function (not a statement), which evaluates an expression and returns the value that expression produces. Example:

    x = eval('5')              # x <- 5
    x = eval('%d + 6' % x)     # x <- 11
    x = eval('abs(%d)' % -100) # x <- 100
    x = eval('x = 5')          # INVALID; assignment is not an expression.
    x = eval('if 1: x = 4')    # INVALID; if is a statement, not an expression.
    
  3. compile is a lower level version of exec and eval. It does not execute or evaluate your statements or expressions, but returns a code object that can do it. The modes are as follows:

    1. compile(string, '', 'eval') returns the code object that would have been executed had you done eval(string). Note that you cannot use statements in this mode; only a (single) expression is valid.
    2. compile(string, '', 'exec') returns the code object that would have been executed had you done exec(string). You can use any number of statements here.
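    3. compile(string, '', 'single') is like the exec mode, but it expects a single statement: Python 2 could silently ignore everything after the first statement, whereas Python 3 raises a SyntaxError if there is more than one. Note that an if/else statement together with its bodies is considered a single statement.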
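One way to put the 'eval' and 'exec' modes together is a helper that tries to treat the input as an expression first and falls back to executing it as statements. This is only a sketch for Python 3; `run_source` and the `<input>` filename are hypothetical names chosen for the example.

```python
def run_source(source, namespace):
    """Evaluate `source` as an expression if possible, else execute it."""
    try:
        code = compile(source, "<input>", "eval")
    except SyntaxError:
        # Not a single expression: compile it as statements instead.
        # A genuine syntax error will be raised again by this call.
        code = compile(source, "<input>", "exec")
        exec(code, namespace)
        return None
    return eval(code, namespace)


ns = {}
first = run_source("x = 5", ns)    # a statement: executed, nothing returned
second = run_source("x * 2", ns)   # an expression: its value is returned
```

Here `first` is None and `second` is 10, since the assignment ran in the same namespace that the expression later read from.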

Answer 2

exec is for statements and does not return anything. eval is for expressions and returns the value of the expression.

An expression means “something”, while a statement means “do something”.