Tag Archives: database

How to move a model between two Django apps (Django 1.7)

Question: How to move a model between two Django apps (Django 1.7)

So about a year ago I started a project and, like all new developers, I didn’t really focus too much on the structure. However, now that I am further along with Django, it has started to become apparent that my project layout, mainly my models, is horrible in structure.

I have models mainly held in a single app, when really most of these models should be in their own individual apps. I did try to resolve this and move them with South, but I found it tricky and really difficult due to foreign keys etc.

However, with Django 1.7 and its built-in support for migrations, is there a better way to do this now?


Answer 0

I am removing the old answer as it may result in data loss. As ozan mentioned, we can create two migrations, one in each app. The comments below this post refer to my old answer.

The first migration removes the model from the first app.

$ python manage.py makemigrations old_app --empty

Edit the migration file to include these operations:

class Migration(migrations.Migration):

    database_operations = [migrations.AlterModelTable('TheModel', 'newapp_themodel')]

    state_operations = [migrations.DeleteModel('TheModel')]

    operations = [
      migrations.SeparateDatabaseAndState(
        database_operations=database_operations,
        state_operations=state_operations)
    ]

The second migration depends on the first and creates the new table in the second app. After moving the model code to the second app, run:

$ python manage.py makemigrations new_app 

and edit the migration file to something like this:

class Migration(migrations.Migration):

    dependencies = [
        ('old_app', 'above_migration')
    ]

    state_operations = [
        migrations.CreateModel(
            name='TheModel',
            fields=[
                ('id', models.AutoField(verbose_name='ID', serialize=False, auto_created=True, primary_key=True)),
            ],
            options={
                'db_table': 'newapp_themodel',
            },
            bases=(models.Model,),
        )
    ]

    operations = [
        migrations.SeparateDatabaseAndState(state_operations=state_operations)
    ]

Answer 1

This can be done fairly easily using migrations.SeparateDatabaseAndState. Basically, we use a database operation to rename the table concurrently with two state operations to remove the model from one app’s history and create it in another’s.

Remove from old app

python manage.py makemigrations old_app --empty

In the migration:

class Migration(migrations.Migration):

    dependencies = []

    database_operations = [
        migrations.AlterModelTable('TheModel', 'newapp_themodel')
    ]

    state_operations = [
        migrations.DeleteModel('TheModel')
    ]

    operations = [
        migrations.SeparateDatabaseAndState(
            database_operations=database_operations,
            state_operations=state_operations)
    ]

Add to new app

First, copy the model to the new app’s models.py, then:

python manage.py makemigrations new_app

This will generate a migration with a naive CreateModel operation as the sole operation. Wrap that in a SeparateDatabaseAndState operation such that we don’t try to recreate the table. Also include the prior migration as a dependency:

class Migration(migrations.Migration):

    dependencies = [
        ('old_app', 'above_migration')
    ]

    state_operations = [
        migrations.CreateModel(
            name='TheModel',
            fields=[
                ('id', models.AutoField(verbose_name='ID', serialize=False, auto_created=True, primary_key=True)),
            ],
            options={
                'db_table': 'newapp_themodel',
            },
            bases=(models.Model,),
        )
    ]

    operations = [
        migrations.SeparateDatabaseAndState(state_operations=state_operations)
    ]

Answer 2

I encountered the same problem. Ozan’s answer helped me a lot but unfortunately was not enough: I had several ForeignKeys linking to the model I wanted to move. After some headache I found the solution, so I decided to post it to save people time.

You need 2 more steps:

  1. Before doing anything, change all of the ForeignKey fields linking to TheModel into IntegerField. Then run python manage.py makemigrations.
  2. After doing Ozan’s steps, re-convert the foreign keys: put back ForeignKey(TheModel) instead of IntegerField(), then make the migrations again (python manage.py makemigrations). You can then migrate and it should work (python manage.py migrate). A sketch of the swap follows.
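
A minimal sketch of the temporary swap, assuming a hypothetical RelatedThing model with a foreign key to TheModel (all names here are illustrative, not from the original answer):

from django.db import models

from new_app.models import TheModel  # hypothetical import path after the move


class RelatedThing(models.Model):
    # Step 1 (before Ozan's steps): temporarily hold the raw ids instead
    # of the relation, then run makemigrations.
    # the_model = models.IntegerField()

    # Step 2 (after Ozan's steps): restore the real relation, then run
    # makemigrations and migrate again.
    the_model = models.ForeignKey(TheModel)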

Hope it helps. Of course, test it locally before trying it in production, to avoid bad surprises :)


Answer 3

How I did it (tested on Django==1.8 with Postgres, so probably 1.7 as well):

Situation

app1.YourModel

but you want it to go to: app2.YourModel

  1. Copy YourModel (the code) from app1 to app2.
  2. Add this to app2.YourModel:

    class Meta:
        db_table = 'app1_yourmodel'
    
  3. $ python manage.py makemigrations app2

  4. A new migration (e.g. 0009_auto_something.py) is made in app2 with a migrations.CreateModel() statement. Move this statement into the initial migration of app2 (e.g. 0001_initial.py), so it looks as if it has always been there, and then remove the newly created migration, 0009_auto_something.py.

  5. Acting as if app2.YourModel has always been there, now remove every trace of app1.YourModel from your migrations. Meaning: comment out the CreateModel statement for it, and every adjustment or data migration you used after that.

  6. And of course, every reference to app1.YourModel has to be changed to app2.YourModel throughout your project. Also, don’t forget that all possible foreign keys to app1.YourModel in migrations have to be changed to app2.YourModel.

  7. Now if you do $ python manage.py migrate, nothing has changed; likewise, $ python manage.py makemigrations detects nothing new.

  8. Now the finishing touch: remove the class Meta from app2.YourModel and do $ python manage.py makemigrations app2 && python manage.py migrate app2. If you look into this migration you’ll see something like this:

        migrations.AlterModelTable(
            name='yourmodel',
            table=None,
        ),
    

table=None means it will take the default table name, which in this case will be app2_yourmodel.

  9. Done, with the data preserved.

P.S. During the migration it will notice that the content type app1.yourmodel has been removed and can be deleted. You can say yes to that, but only if you don’t use it. In case you depend heavily on FKs to that content type staying intact, don’t answer yes or no yet; instead, go into the db manually, remove the content type app2.yourmodel, rename the content type app1.yourmodel to app2.yourmodel, and then continue by answering no.


Answer 4

I get nervous hand-coding migrations (as is required by Ozan’s answer), so the following combines Ozan’s and Michael’s strategies to minimize the amount of hand-coding required:

  1. Before moving any models, make sure you’re working with a clean baseline by running makemigrations.
  2. Move the code for the Model from app1 to app2
  3. As recommended by @Michael, we point the new model to the old database table using the db_table Meta option on the “new” model:

    class Meta:
        db_table = 'app1_yourmodel'
    
  4. Run makemigrations. This will generate a CreateModel in app2 and a DeleteModel in app1. Technically, these migrations refer to the exact same table and would remove it (including all of its data) and re-create it.

  5. In reality, we don’t want (or need) to do anything to the table itself. We just need Django to believe that the change has been made. Per @Ozan’s answer, the state_operations flag on SeparateDatabaseAndState does this. So we wrap all of the migration entries IN BOTH MIGRATION FILES with SeparateDatabaseAndState(state_operations=[...]). For example,

    operations = [
        ...
        migrations.DeleteModel(
            name='YourModel',
        ),
        ...
    ]
    

    becomes

    operations = [
        migrations.SeparateDatabaseAndState(state_operations=[
            ...
            migrations.DeleteModel(
                name='YourModel',
            ),
            ...
        ])
    ]
    
  6. You also need to make sure the new “virtual” CreateModel migration depends on any migration that actually created or altered the original table. For example, if your new migrations are app2.migrations.0004_auto_<date> (for the Create) and app1.migrations.0007_auto_<date> (for the Delete), the simplest thing to do is:

    • Open app1.migrations.0007_auto_<date> and copy its app1 dependency (e.g. ('app1', '0006...'),). This is the “immediately prior” migration in app1 and should include dependencies on all of the actual model building logic.
    • Open app2.migrations.0004_auto_<date> and add the dependency you just copied to its dependencies list.

If you have ForeignKey relationship(s) to the model you’re moving, the above may not work. This happens because:

  • Dependencies are not automatically created for the ForeignKey changes
  • We do not want to wrap the ForeignKey changes in state_operations so we need to ensure they are separate from the table operations.

NOTE: Django 2.2 added a warning (models.E028) that breaks this method. You may be able to work around it with managed=False but I have not tested it.

The “minimum” set of operations differs depending on the situation, but the following procedure should work for most/all ForeignKey migrations:

  1. COPY the model from app1 to app2, set db_table, but DON’T change any FK references.
  2. Run makemigrations and wrap the app2 migration in state_operations (see above)
    • As above, add a dependency in the app2 CreateTable to the latest app1 migration
  3. Point all of the FK references to the new model. If you aren’t using string references, move the old model to the bottom of models.py (DON’T remove it) so it doesn’t compete with the imported class.
  4. Run makemigrations but DON’T wrap anything in state_operations (the FK changes should actually happen). Add a dependency in all the ForeignKey migrations (i.e. AlterField) to the CreateTable migration in app2 (you’ll need this list for the next step so keep track of them). For example:

    • Find the migration that includes the CreateModel e.g. app2.migrations.0002_auto_<date> and copy the name of that migration.
    • Find all migrations that have a ForeignKey to that model (e.g. by searching for app2.YourModel) to find migrations like:

      class Migration(migrations.Migration):
      
          dependencies = [
              ('otherapp', '0001_initial'),
          ]
      
          operations = [
              migrations.AlterField(
                  model_name='relatedmodel',
                  name='fieldname',
                  field=models.ForeignKey(... to='app2.YourModel'),
              ),
          ]
      
    • Add the CreateModel migration as a dependency:

      class Migration(migrations.Migration):
      
          dependencies = [
              ('otherapp', '0001_initial'),
              ('app2', '0002_auto_<date>'),
          ]  
      
  5. Remove the models from app1

  6. Run makemigrations and wrap the app1 migration in state_operations.
    • Add a dependency to all of the ForeignKey migrations (i.e. AlterField) from the previous step (may include migrations in app1 and app2).
    • When I built these migrations, the DeleteTable already depended on the AlterField migrations so I didn’t need to manually enforce it (i.e. Alter before Delete).

At this point, Django is good to go. The new model points to the old table and Django’s migrations have convinced it that everything has been relocated appropriately. The big caveat (from @Michael’s answer) is that a new ContentType is created for the new model. If you link (e.g. by ForeignKey) to content types, you’ll need to create a migration to update the ContentType table.
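
A minimal sketch of such a data migration, assuming the model moved from app1 to app2 (the app, model, and migration names are hypothetical):

from django.db import migrations


def repoint_contenttype(apps, schema_editor):
    # Point the stale app1 content type at the new app label so that
    # existing FK references to it keep working. If Django has already
    # auto-created an app2 content type, delete that row first to avoid
    # a unique-constraint clash.
    ContentType = apps.get_model('contenttypes', 'ContentType')
    ContentType.objects.filter(app_label='app2', model='yourmodel').delete()
    ContentType.objects.filter(app_label='app1', model='yourmodel').update(app_label='app2')


class Migration(migrations.Migration):

    dependencies = [
        ('contenttypes', '0001_initial'),
        ('app2', '0004_auto_<date>'),  # hypothetical: the CreateModel migration above
    ]

    operations = [
        migrations.RunPython(repoint_contenttype, migrations.RunPython.noop),
    ]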

I wanted to cleanup after myself (Meta options and table names) so I used the following procedure (from @Michael):

  1. Remove the db_table Meta entry
  2. Run makemigrations again to generate the database rename
  3. Edit this last migration and make sure it depends on the DeleteTable migration. It doesn’t seem like it should be necessary as the Delete should be purely logical, but I’ve run into errors (e.g. app1_yourmodel doesn’t exist) if I don’t.

Answer 5

Another hacky alternative, if the data is not big or too complicated but is still important to maintain, is to:

  • Get data fixtures using manage.py dumpdata
  • Proceed with the model changes and migrations properly, without trying to carry the data across
  • Globally replace the old model and app names in the fixtures with the new ones
  • Load the data using manage.py loaddata (the whole round trip is sketched below)
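
Sketched as commands, assuming the model is old_app.TheModel moving to new_app (the fixture file name is made up):

python manage.py dumpdata old_app.TheModel > themodel.json
# ...move the model, run makemigrations and migrate...
# then globally replace "old_app.themodel" with "new_app.themodel" in themodel.json
python manage.py loaddata themodel.json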

Answer 6

Copied from my answer at https://stackoverflow.com/a/47392970/8971048

In case you need to move the model and you no longer have access to the app (or don’t want access), you can create a new Operation that only creates the new model if the migrated model does not already exist.

In this example I am moving ‘MyModel’ from old_app to myapp.

class MigrateOrCreateTable(migrations.CreateModel):
    def __init__(self, source_table, dst_table, *args, **kwargs):
        super(MigrateOrCreateTable, self).__init__(*args, **kwargs)
        self.source_table = source_table
        self.dst_table = dst_table

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        table_exists = self.source_table in schema_editor.connection.introspection.table_names()
        if table_exists:
            with schema_editor.connection.cursor() as cursor:
                cursor.execute("RENAME TABLE {} TO {};".format(self.source_table, self.dst_table))
        else:
            return super(MigrateOrCreateTable, self).database_forwards(app_label, schema_editor, from_state, to_state)


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0002_some_migration'),
    ]

    operations = [
        MigrateOrCreateTable(
            source_table='old_app_mymodel',
            dst_table='myapp_mymodel',
            name='MyModel',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('name', models.CharField(max_length=18))
            ],
        ),
    ]

Answer 7

This is only roughly tested, so do not forget to back up your DB!!!

For example, there are two apps: src_app and dst_app, we want to move model MoveMe from src_app to dst_app.

Create empty migrations for both apps:

python manage.py makemigrations --empty src_app
python manage.py makemigrations --empty dst_app

Let’s assume that the new migrations are XXX1_src_app_new and XXX1_dst_app_new, and the previous top migrations are XXX0_src_app_old and XXX0_dst_app_old.

Add an operation to XXX1_dst_app_new that renames the table for the MoveMe model and renames its app_label in the ProjectState. Do not forget to add a dependency on the XXX0_src_app_old migration. The resulting XXX1_dst_app_new migration is:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations

# this operation is almost the same as RenameModel
# https://github.com/django/django/blob/1.7/django/db/migrations/operations/models.py#L104
class MoveModelFromOtherApp(migrations.operations.base.Operation):

    def __init__(self, name, old_app_label):
        self.name = name
        self.old_app_label = old_app_label

    def state_forwards(self, app_label, state):

        # Get all of the related objects we need to repoint
        apps = state.render(skip_cache=True)
        model = apps.get_model(self.old_app_label, self.name)
        related_objects = model._meta.get_all_related_objects()
        related_m2m_objects = model._meta.get_all_related_many_to_many_objects()
        # Rename the model
        state.models[app_label, self.name.lower()] = state.models.pop(
            (self.old_app_label, self.name.lower())
        )
        state.models[app_label, self.name.lower()].app_label = app_label
        for model_state in state.models.values():
            try:
                i = model_state.bases.index("%s.%s" % (self.old_app_label, self.name.lower()))
                model_state.bases = model_state.bases[:i] + ("%s.%s" % (app_label, self.name.lower()),) + model_state.bases[i+1:]
            except ValueError:
                pass
        # Repoint the FKs and M2Ms pointing to us
        for related_object in (related_objects + related_m2m_objects):
            # Use the new related key for self referential related objects.
            if related_object.model == model:
                related_key = (app_label, self.name.lower())
            else:
                related_key = (
                    related_object.model._meta.app_label,
                    related_object.model._meta.object_name.lower(),
                )
            new_fields = []
            for name, field in state.models[related_key].fields:
                if name == related_object.field.name:
                    field = field.clone()
                    field.rel.to = "%s.%s" % (app_label, self.name)
                new_fields.append((name, field))
            state.models[related_key].fields = new_fields

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        old_apps = from_state.render()
        new_apps = to_state.render()
        old_model = old_apps.get_model(self.old_app_label, self.name)
        new_model = new_apps.get_model(app_label, self.name)
        if self.allowed_to_migrate(schema_editor.connection.alias, new_model):
            # Move the main table
            schema_editor.alter_db_table(
                new_model,
                old_model._meta.db_table,
                new_model._meta.db_table,
            )
            # Alter the fields pointing to us
            related_objects = old_model._meta.get_all_related_objects()
            related_m2m_objects = old_model._meta.get_all_related_many_to_many_objects()
            for related_object in (related_objects + related_m2m_objects):
                if related_object.model == old_model:
                    model = new_model
                    related_key = (app_label, self.name.lower())
                else:
                    model = related_object.model
                    related_key = (
                        related_object.model._meta.app_label,
                        related_object.model._meta.object_name.lower(),
                    )
                to_field = new_apps.get_model(
                    *related_key
                )._meta.get_field_by_name(related_object.field.name)[0]
                schema_editor.alter_field(
                    model,
                    related_object.field,
                    to_field,
                )

    def database_backwards(self, app_label, schema_editor, from_state, to_state):
        self.old_app_label, app_label = app_label, self.old_app_label
        self.database_forwards(app_label, schema_editor, from_state, to_state)
        app_label, self.old_app_label = self.old_app_label, app_label

    def describe(self):
        return "Move %s from %s" % (self.name, self.old_app_label)


class Migration(migrations.Migration):

    dependencies = [
       ('dst_app', 'XXX0_dst_app_old'),
       ('src_app', 'XXX0_src_app_old'),
    ]

    operations = [
        MoveModelFromOtherApp('MoveMe', 'src_app'),
    ]

Add a dependency on XXX1_dst_app_new to XXX1_src_app_new. XXX1_src_app_new is a no-op migration that is only needed to make sure that future src_app migrations are executed after XXX1_dst_app_new.

Move MoveMe from src_app/models.py to dst_app/models.py. Then run:

python manage.py migrate

That’s all!


Answer 8

You can try the following (untested):

  1. Move the model from src_app to dest_app
  2. Migrate dest_app; make sure the schema migration depends on the latest src_app migration (https://docs.djangoproject.com/en/dev/topics/migrations/#migration-files)
  3. Add a data migration to dest_app that copies all data from src_app (a sketch follows this list)
  4. Migrate src_app; make sure the schema migration depends on the latest (data) migration of dest_app, that is, the migration from step 3
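
A minimal sketch of the data migration from step 3, assuming a model named TheModel and hypothetical migration names:

from django.db import migrations


def copy_rows(apps, schema_editor):
    # Copy every row, column by column, from the old table to the new one.
    # Using attname copies raw FK ids rather than related instances.
    SrcModel = apps.get_model('src_app', 'TheModel')
    DestModel = apps.get_model('dest_app', 'TheModel')
    column_names = [f.attname for f in SrcModel._meta.fields]
    for obj in SrcModel.objects.all():
        DestModel.objects.create(**{name: getattr(obj, name) for name in column_names})


class Migration(migrations.Migration):

    dependencies = [
        ('dest_app', '0001_initial'),  # hypothetical: the schema migration from step 2
    ]

    operations = [
        migrations.RunPython(copy_rows),
    ]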

Note that you will be copying the whole table instead of moving it, but that way neither app has to touch a table that belongs to the other one, which I think is more important.


Answer 9

Let’s say you are moving the model TheModel from app_a to app_b.

An alternate solution is to alter the existing migrations by hand. The idea is that each time you see an operation altering TheModel in app_a’s migrations, you copy that operation to the end of app_b’s initial migration. And each time you see a reference ‘app_a.TheModel’ in app_a’s migrations, you change it to ‘app_b.TheModel’.
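
For example, an operation copied from app_a’s history to the end of app_b’s initial migration, and a rewritten reference, might look like this (the field and model names are hypothetical):

# An operation altering TheModel, copied verbatim into app_b's initial migration:
migrations.AddField(
    model_name='themodel',
    name='title',
    field=models.CharField(max_length=100, default=''),
),

# A reference to the moved model, rewritten in place wherever it appears:
migrations.AlterField(
    model_name='relatedmodel',
    name='parent',
    field=models.ForeignKey(to='app_b.TheModel'),  # was: to='app_a.TheModel'
),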

I just did this for an existing project, where I wanted to extract a certain model into a reusable app. The procedure went smoothly. I guess things would be much harder if there were references from app_b to app_a. Also, I had a manually defined Meta.db_table for my model, which might have helped.

Notably, you will end up with an altered migration history. This doesn’t matter, even if you have a database with the original migrations applied: if both the original and the rewritten migrations end up with the same database schema, such a rewrite should be OK.


Answer 10

  1. Change the names of the old models to ‘model_name_old’
  2. makemigrations
  3. Make new models named ‘model_name_new’ with identical relationships on the related models (e.g. the user model now has user.blog_old and user.blog_new; see the sketch after this list)
  4. makemigrations
  5. Write a custom migration that migrates all the data to the new model tables
  6. Test the hell out of these migrations by comparing backups with new db copies before and after running them
  7. When all is satisfactory, delete the old models
  8. makemigrations
  9. Change the new models to the correct name: ‘model_name_new’ -> ‘model_name’
  10. Test the whole slew of migrations on a staging server
  11. Take your production site down for a few minutes in order to run all migrations without users interfering
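
A minimal sketch of step 3, with a hypothetical Blog model being replaced: the related model temporarily carries both relations side by side until the data migration has run:

from django.db import models


class User(models.Model):
    # Old relation, kept alive until the custom data migration (step 5) has run:
    blog_old = models.ForeignKey('blogs.BlogOld', null=True, blank=True)
    # New relation, populated by that data migration:
    blog_new = models.ForeignKey('blogs.BlogNew', null=True, blank=True)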

Do this individually for each model that needs to be moved. I wouldn’t suggest doing what the other answer says, changing to integers and back to foreign keys: there is a chance that the new foreign keys will be different, and rows may have different IDs after the migrations; I didn’t want to run any risk of mismatched ids when switching back to foreign keys.


Bulk insert with the SQLAlchemy ORM

Question: Bulk insert with the SQLAlchemy ORM

Is there any way to get SQLAlchemy to do a bulk insert rather than inserting each individual object? I.e.,

doing:

INSERT INTO `foo` (`bar`) VALUES (1), (2), (3)

rather than:

INSERT INTO `foo` (`bar`) VALUES (1)
INSERT INTO `foo` (`bar`) VALUES (2)
INSERT INTO `foo` (`bar`) VALUES (3)

I’ve just converted some code to use sqlalchemy rather than raw sql and, although it is now much nicer to work with, it seems to be slower now (up to a factor of 10). I’m wondering if this is the reason.

Maybe I could improve the situation by using sessions more efficiently. At the moment I have autoCommit=False and do a session.commit() after I’ve added some stuff. Although this seems to cause the data to go stale if the DB is changed elsewhere: even if I do a new query, I still get old results back?

Thanks for your help!


Answer 0

SQLAlchemy introduced that in version 1.0.0:

Bulk operations – SQLAlchemy docs

With these operations, you can now do bulk inserts or updates!

For instance, you can do:

s = Session()
objects = [
    User(name="u1"),
    User(name="u2"),
    User(name="u3")
]
s.bulk_save_objects(objects)
s.commit()

Here, a bulk insert will be made.


Answer 1

The sqlalchemy docs have a writeup on the performance of various techniques that can be used for bulk inserts:

ORMs are basically not intended for high-performance bulk inserts – this is the whole reason SQLAlchemy offers the Core in addition to the ORM as a first-class component.

For the use case of fast bulk inserts, the SQL generation and execution system that the ORM builds on top of is part of the Core. Using this system directly, we can produce an INSERT that is competitive with using the raw database API directly.

Alternatively, the SQLAlchemy ORM offers the Bulk Operations suite of methods, which provide hooks into subsections of the unit of work process in order to emit Core-level INSERT and UPDATE constructs with a small degree of ORM-based automation.

The example below illustrates time-based tests for several different methods of inserting rows, going from the most automated to the least. With cPython 2.7, the runtimes observed were:

classics-MacBook-Pro:sqlalchemy classic$ python test.py
SQLAlchemy ORM: Total time for 100000 records 12.0471920967 secs
SQLAlchemy ORM pk given: Total time for 100000 records 7.06283402443 secs
SQLAlchemy ORM bulk_save_objects(): Total time for 100000 records 0.856323003769 secs
SQLAlchemy Core: Total time for 100000 records 0.485800027847 secs
sqlite3: Total time for 100000 records 0.487842082977 sec

Script:

import time
import sqlite3

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String,  create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
DBSession = scoped_session(sessionmaker())
engine = None


class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


def init_sqlalchemy(dbname='sqlite:///sqlalchemy.db'):
    global engine
    engine = create_engine(dbname, echo=False)
    DBSession.remove()
    DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)


def test_sqlalchemy_orm(n=100000):
    init_sqlalchemy()
    t0 = time.time()
    for i in xrange(n):
        customer = Customer()
        customer.name = 'NAME ' + str(i)
        DBSession.add(customer)
        if i % 1000 == 0:
            DBSession.flush()
    DBSession.commit()
    print(
        "SQLAlchemy ORM: Total time for " + str(n) +
        " records " + str(time.time() - t0) + " secs")


def test_sqlalchemy_orm_pk_given(n=100000):
    init_sqlalchemy()
    t0 = time.time()
    for i in xrange(n):
        customer = Customer(id=i+1, name="NAME " + str(i))
        DBSession.add(customer)
        if i % 1000 == 0:
            DBSession.flush()
    DBSession.commit()
    print(
        "SQLAlchemy ORM pk given: Total time for " + str(n) +
        " records " + str(time.time() - t0) + " secs")


def test_sqlalchemy_orm_bulk_insert(n=100000):
    init_sqlalchemy()
    t0 = time.time()
    n1 = n
    while n1 > 0:
        n1 = n1 - 10000
        DBSession.bulk_insert_mappings(
            Customer,
            [
                dict(name="NAME " + str(i))
                for i in xrange(min(10000, n1))
            ]
        )
    DBSession.commit()
    print(
        "SQLAlchemy ORM bulk_save_objects(): Total time for " + str(n) +
        " records " + str(time.time() - t0) + " secs")


def test_sqlalchemy_core(n=100000):
    init_sqlalchemy()
    t0 = time.time()
    engine.execute(
        Customer.__table__.insert(),
        [{"name": 'NAME ' + str(i)} for i in xrange(n)]
    )
    print(
        "SQLAlchemy Core: Total time for " + str(n) +
        " records " + str(time.time() - t0) + " secs")


def init_sqlite3(dbname):
    conn = sqlite3.connect(dbname)
    c = conn.cursor()
    c.execute("DROP TABLE IF EXISTS customer")
    c.execute(
        "CREATE TABLE customer (id INTEGER NOT NULL, "
        "name VARCHAR(255), PRIMARY KEY(id))")
    conn.commit()
    return conn


def test_sqlite3(n=100000, dbname='sqlite3.db'):
    conn = init_sqlite3(dbname)
    c = conn.cursor()
    t0 = time.time()
    for i in xrange(n):
        row = ('NAME ' + str(i),)
        c.execute("INSERT INTO customer (name) VALUES (?)", row)
    conn.commit()
    print(
        "sqlite3: Total time for " + str(n) +
        " records " + str(time.time() - t0) + " sec")

if __name__ == '__main__':
    test_sqlalchemy_orm(100000)
    test_sqlalchemy_orm_pk_given(100000)
    test_sqlalchemy_orm_bulk_insert(100000)
    test_sqlalchemy_core(100000)
    test_sqlite3(100000)

Answer 2

As far as I know, there is no way to get the ORM to issue bulk inserts. I believe the underlying reason is that SQLAlchemy needs to keep track of each object’s identity (i.e., new primary keys), and bulk inserts interfere with that. For example, assuming your foo table contains an id column and is mapped to a Foo class:

x = Foo(bar=1)
print x.id
# None
session.add(x)
session.flush()
# BEGIN
# INSERT INTO foo (bar) VALUES(1)
# COMMIT
print x.id
# 1

Since SQLAlchemy picked up the value for x.id without issuing another query, we can infer that it got the value directly from the INSERT statement. If you don’t need subsequent access to the created objects via the same instances, you can skip the ORM layer for your insert:

Foo.__table__.insert().execute([{'bar': 1}, {'bar': 2}, {'bar': 3}])
# INSERT INTO foo (bar) VALUES ((1,), (2,), (3,))

SQLAlchemy can’t match these new rows with any existing objects, so you’ll have to query them anew for any subsequent operations.

As far as stale data is concerned, it’s helpful to remember that the session has no built-in way to know when the database is changed outside of the session. In order to access externally modified data through existing instances, the instances must be marked as expired. This happens by default on session.commit(), but can be done manually by calling session.expire_all() or session.expire(instance). An example (SQL omitted):

x = Foo(bar=1)
session.add(x)
session.commit()
print x.bar
# 1
foo.update().execute(bar=42)
print x.bar
# 1
session.expire(x)
print x.bar
# 42

session.commit() expires x, so the first print statement implicitly opens a new transaction and re-queries x‘s attributes. If you comment out the first print statement, you’ll notice that the second one now picks up the correct value, because the new query isn’t emitted until after the update.

This makes sense from the point of view of transactional isolation – you should only pick up external modifications between transactions. If this is causing you trouble, I’d suggest clarifying or re-thinking your application’s transaction boundaries instead of immediately reaching for session.expire_all().


Answer 3

I usually do it using add_all:

from app import session
from models import User

objects = [User(name="u1"), User(name="u2"), User(name="u3")]
session.add_all(objects)
session.commit()

Answer 4

Direct support was added to SQLAlchemy as of version 0.8.

As per the docs, connection.execute(table.insert().values(data)) should do the trick. (Note that this is not the same as connection.execute(table.insert(), data), which results in many individual row inserts via a call to executemany.) On anything but a local connection, the difference in performance can be enormous.
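
A small sketch of the distinction, assuming SQLAlchemy 1.x and a bare foo table with a single bar column:

from sqlalchemy import Column, Integer, MetaData, Table, create_engine

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
foo = Table("foo", metadata, Column("bar", Integer))
metadata.create_all(engine)

data = [{"bar": 1}, {"bar": 2}, {"bar": 3}]

with engine.connect() as connection:
    # One multi-row statement: INSERT INTO foo (bar) VALUES (1), (2), (3)
    connection.execute(foo.insert().values(data))
    # By contrast, this goes through executemany (one parameter set per row):
    connection.execute(foo.insert(), data)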


Answer 5

SQLAlchemy introduced that in version 1.0.0:

Bulk operations – SQLAlchemy docs

With these operations, you can now do bulk inserts or updates!

For instance (if you want the lowest overhead for simple table INSERTs), you can use Session.bulk_insert_mappings():

loadme = [(1, 'a'),
          (2, 'b'),
          (3, 'c')]
dicts = [dict(bar=t[0], fly=t[1]) for t in loadme]

s = Session()
s.bulk_insert_mappings(Foo, dicts)
s.commit()

Or, if you want, skip the loadme tuples and write the dictionaries directly into dicts (but I find it easier to leave all the wordiness out of the data and load up a list of dictionaries in a loop).


Answer 6

Pierre’s answer is correct, but one issue is that bulk_save_objects by default does not return the primary keys of the objects, if that is of concern to you. Set return_defaults to True to get this behavior.

The documentation is here.

foos = [Foo(bar='a',), Foo(bar='b'), Foo(bar='c')]
session.bulk_save_objects(foos, return_defaults=True)
for foo in foos:
    assert foo.id is not None
session.commit()

Answer 7

All roads lead to Rome, but some of them cross mountains and require ferries. If you want to get there quickly, just take the motorway.


In this case the motorway is the execute_batch() feature of psycopg2. The documentation says it best:

The current implementation of executemany() is (using an extremely charitable understatement) not particularly performing. These functions can be used to speed up the repeated execution of a statement against a set of parameters. By reducing the number of server roundtrips the performance can be orders of magnitude better than using executemany().

In my own tests execute_batch() is approximately twice as fast as executemany(), and it gives the option to configure page_size for further tweaking (if you want to squeeze the last 2-3% of performance out of the driver).

The same feature can easily be enabled if you are using SQLAlchemy, by setting use_batch_mode=True as a parameter when you instantiate the engine with create_engine(). Both routes are sketched below.
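
A hedged sketch of both routes; the DSN and connection URL are placeholders, and use_batch_mode dates from SQLAlchemy 1.2 (it was later superseded by executemany_mode):

import psycopg2
from psycopg2.extras import execute_batch
from sqlalchemy import create_engine

# Raw psycopg2: batch the repeated statement instead of calling executemany().
conn = psycopg2.connect("dbname=test user=postgres")  # hypothetical DSN
with conn.cursor() as cur:
    execute_batch(
        cur,
        "INSERT INTO foo (bar) VALUES (%s)",
        [(1,), (2,), (3,)],
        page_size=100,  # tunable, for the last few percent of throughput
    )
conn.commit()

# SQLAlchemy: ask the psycopg2 dialect to use the same batching internally.
engine = create_engine(
    "postgresql+psycopg2://user:password@localhost/test",  # hypothetical URL
    use_batch_mode=True,
)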


Answer 8

This is a way:

values = [1, 2, 3]
Foo.__table__.insert().execute([{'bar': x} for x in values])

This will insert like this:

INSERT INTO `foo` (`bar`) VALUES (1), (2), (3)

Reference: The SQLAlchemy FAQ includes benchmarks for various commit methods.


Answer 9

The best answer I found so far was in the sqlalchemy documentation:

http://docs.sqlalchemy.org/en/latest/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow

There is a complete example of a benchmark of possible solutions.

As shown in the documentation:

bulk_save_objects is not the best solution, but its performance is acceptable.

The second best implementation in terms of readability, I think, was with the SQLAlchemy Core:

def test_sqlalchemy_core(n=100000):
    init_sqlalchemy()
    t0 = time.time()
    engine.execute(
        Customer.__table__.insert(),
        [{"name": 'NAME ' + str(i)} for i in xrange(n)]
    )

The context of this function is given in the documentation article.


SQLAlchemy:级联删除

问题:SQLAlchemy:级联删除

我必须缺少SQLAlchemy的级联选项的琐碎内容,因为我无法获得简单的级联删除来正确操作-如果删除了父元素,则子级将保留并带有null外键。

我在这里放了一个简洁的测试用例:

from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key = True)

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key = True)
    parentid = Column(Integer, ForeignKey(Parent.id))
    parent = relationship(Parent, cascade = "all,delete", backref = "children")

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()

parent = Parent()
parent.children.append(Child())
parent.children.append(Child())
parent.children.append(Child())

session.add(parent)
session.commit()

print "Before delete, children = {0}".format(session.query(Child).count())
print "Before delete, parent = {0}".format(session.query(Parent).count())

session.delete(parent)
session.commit()

print "After delete, children = {0}".format(session.query(Child).count())
print "After delete parent = {0}".format(session.query(Parent).count())

session.close()

输出:

Before delete, children = 3
Before delete, parent = 1
After delete, children = 3
After delete parent = 0

父母与子女之间存在简单的一对多关系。该脚本创建一个父级,添加3个子级,然后提交。接下来,它删除父级,但子级仍然存在。为什么?如何使孩子级联删除?

I must be missing something trivial with SQLAlchemy's cascade options because I cannot get a simple cascade delete to operate correctly: if a parent element is deleted, the children persist, with null foreign keys.

I’ve put a concise test case here:

from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key = True)

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key = True)
    parentid = Column(Integer, ForeignKey(Parent.id))
    parent = relationship(Parent, cascade = "all,delete", backref = "children")

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()

parent = Parent()
parent.children.append(Child())
parent.children.append(Child())
parent.children.append(Child())

session.add(parent)
session.commit()

print "Before delete, children = {0}".format(session.query(Child).count())
print "Before delete, parent = {0}".format(session.query(Parent).count())

session.delete(parent)
session.commit()

print "After delete, children = {0}".format(session.query(Child).count())
print "After delete parent = {0}".format(session.query(Parent).count())

session.close()

Output:

Before delete, children = 3
Before delete, parent = 1
After delete, children = 3
After delete parent = 0

There is a simple, one-to-many relationship between Parent and Child. The script creates a parent, adds 3 children, then commits. Next, it deletes the parent, but the children persist. Why? How do I make the children cascade delete?


回答 0

问题是 sqlalchemy 把 Child 当成了父级,因为您是在 Child 上定义的关系(当然,它并不在乎您把这个类叫作“Child”)。

如果您在Parent类上定义关系,它将起作用:

children = relationship("Child", cascade="all,delete", backref="parent")

(请注意"Child"为字符串:使用声明式样式时允许这样做,以便您可以引用尚未定义的类)

您可能还想添加 delete-orphan(delete 会在父级被删除时一并删除子级;delete-orphan 还会删除所有从父级中“移除”的子级,即使父级本身没有被删除)

编辑:刚刚发现:如果您确实想在Child类上定义关系,则可以这样做,但是您将必须在backref上定义级联(通过显式创建backref),如下所示:

parent = relationship(Parent, backref=backref("children", cascade="all,delete"))

(暗示from sqlalchemy.orm import backref

The problem is that sqlalchemy considers Child as the parent, because that is where you defined your relationship (it doesn’t care that you called it “Child” of course).

If you define the relationship on the Parent class instead, it will work:

children = relationship("Child", cascade="all,delete", backref="parent")

(note "Child" as a string: this is allowed when using the declarative style, so that you are able to refer to a class that is not yet defined)

You might want to add delete-orphan as well (delete causes children to be deleted when the parent gets deleted, delete-orphan also deletes any children that were “removed” from the parent, even if the parent is not deleted)

EDIT: just found out: if you really want to define the relationship on the Child class, you can do so, but you will have to define the cascade on the backref (by creating the backref explicitly), like this:

parent = relationship(Parent, backref=backref("children", cascade="all,delete"))

(implying from sqlalchemy.orm import backref)
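
Applied to the question's test case, the corrected models would look roughly like this (a sketch; only the placement of the relationship changes, plus the optional delete-orphan):

from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # the cascade now lives on the parent-side relationship
    children = relationship("Child", cascade="all, delete-orphan", backref="parent")

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parentid = Column(Integer, ForeignKey(Parent.id))

With this change, the question's driver script should report 0 children after the delete.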


回答 1

@Steven 的答案在您通过 session.delete() 删除时是适用的,但这种情况在我这里从未发生。我注意到我大部分时间是通过 session.query().filter().delete() 删除的(它不会把元素加载进内存,而是直接从数据库删除)。使用这种方法时,sqlalchemy 的 cascade='all, delete' 不起作用。不过有一个解决方案:由数据库实现 ON DELETE CASCADE(注意:并非所有数据库都支持)。

class Child(Base):
    __tablename__ = "children"

    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parents.id", ondelete='CASCADE'))

class Parent(Base):
    __tablename__ = "parents"

    id = Column(Integer, primary_key=True)
    child = relationship(Child, backref="parent", passive_deletes=True)

@Steven’s asnwer is good when you are deleting through session.delete() which never happens in my case. I noticed that most of the time I delete through session.query().filter().delete() (which doesn’t put elements in the memory and deletes directly from db). Using this method sqlalchemy’s cascade='all, delete' doesn’t work. There is a solution though: ON DELETE CASCADE through db (note: not all databases support it).

class Child(Base):
    __tablename__ = "children"

    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parents.id", ondelete='CASCADE'))

class Parent(Base):
    __tablename__ = "parents"

    id = Column(Integer, primary_key=True)
    child = relationship(Child, backref="parent", passive_deletes=True)
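
A hedged usage sketch of the bulk-delete path this answer is about (Session and the filter value are assumed; the models are the ones above):

# Bulk delete goes straight to the database and bypasses the ORM cascade;
# it is the DB-level ON DELETE CASCADE (with passive_deletes=True) that
# removes the child rows here.
session = Session()
session.query(Parent).filter(Parent.id == 1).delete()
session.commit()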

回答 2

很老的帖子,但是我只是花了一两个小时,所以我想分享我的发现,特别是因为列出的其他一些评论不太正确。

TL; DR

给子表添加一个外键,或修改现有的外键,加上 ondelete='CASCADE':

parent_id = db.Column(db.Integer, db.ForeignKey('parent.id', ondelete='CASCADE'))

以及下列关系之一:

a)在父表上:

children = db.relationship('Child', backref='parent', passive_deletes=True)

b)在子表上:

parent = db.relationship('Parent', backref=backref('children', passive_deletes=True))

细节

首先,尽管被采纳的答案那样说,父/子关系并不是通过 relationship 建立的,而是通过 ForeignKey 建立的。您可以把 relationship 放在父表或子表上,都能正常工作。不过显然,放在子表上时,除了关键字参数之外,您还必须使用 backref 函数。

选项1(首选)

其次,SqlAlchemy支持两种不同的级联。我建议的第一个和第一个建议是内置于数据库中的,通常采取对外键声明的约束形式。在PostgreSQL中,它看起来像这样:

CONSTRAINT child_parent_id_fkey FOREIGN KEY (parent_id)
REFERENCES parent_table(id) MATCH SIMPLE
ON DELETE CASCADE

这意味着当您从 parent_table 删除一条记录时,child_table 中所有对应的行都会由数据库替您删除。这种方式快速可靠,可能是您的最佳选择。您可以在 SqlAlchemy 中通过 ForeignKey 这样设置(作为子表定义的一部分):

parent_id = db.Column(db.Integer, db.ForeignKey('parent.id', ondelete='CASCADE'))
parent = db.relationship('Parent', backref=backref('children', passive_deletes=True))

ondelete='CASCADE' 就是在表上创建 ON DELETE CASCADE 的那部分。

注意陷阱!

这里有一个重要的警告。请注意我是如何在 relationship 上指定 passive_deletes=True 的吗?如果没有它,整个机制将无法工作。这是因为默认情况下,当您删除父记录时,SqlAlchemy 会做一件很奇怪的事:它把所有子行的外键设置为 NULL。因此,如果您从 parent_table 中删除 id = 5 的那一行,它基本上会执行

UPDATE child_table SET parent_id = NULL WHERE parent_id = 5

为什么会有人想要这种行为,我不知道。要是很多数据库引擎真的允许您把一个有效外键设置为 NULL 从而制造孤儿,我会很惊讶。这看起来是个坏主意,但也许有它的用例。无论如何,如果您让 SqlAlchemy 这么做,数据库就无法使用您设置的 ON DELETE CASCADE 来清理子级,因为它依赖这些外键来知道要删除哪些子行;一旦 SqlAlchemy 把它们都设成 NULL,数据库就删不掉它们了。设置 passive_deletes=True 可以阻止 SqlAlchemy 把外键置为 NULL。

您可以在SqlAlchemy文档中阅读有关被动删除的更多信息。

选项2

您可以采用的另一种方法是让 SqlAlchemy 替您完成。这是用 relationship 的 cascade 参数设置的。如果您把关系定义在父表上,它看起来像这样:

children = relationship('Child', cascade='all,delete', backref='parent')

如果该关系与孩子有关,则可以这样进行:

parent = relationship('Parent', backref=backref('children', cascade='all,delete'))

同样,因为这是在子表上,您必须调用名为 backref 的函数,并把级联设置放进去。

这样,当您删除父行时,SqlAlchemy 会实际替您运行 delete 语句来清理子行。这可能不如让数据库替您处理来得高效,所以我不推荐这种方式。

这是 SqlAlchemy 关于其所支持的级联功能的文档。

Pretty old post, but I just spent an hour or two on this, so I wanted to share my findings, especially since some of the other comments listed aren't quite right.

TL;DR

Give the child table a foreign key, or modify the existing one, adding ondelete='CASCADE':

parent_id = db.Column(db.Integer, db.ForeignKey('parent.id', ondelete='CASCADE'))

And one of the following relationships:

a) This on the parent table:

children = db.relationship('Child', backref='parent', passive_deletes=True)

b) Or this on the child table:

parent = db.relationship('Parent', backref=backref('children', passive_deletes=True))

Details

First off, despite what the accepted answer says, the parent/child relationship is not established by using relationship; it's established by using ForeignKey. You can put the relationship on either the parent or child table and it will work fine. Although, apparently, on the child table you have to use the backref function in addition to the keyword argument.

Option 1 (preferred)

Second, SqlAlchemy supports two different kinds of cascading. The first, and the one I recommend, is built into your database and usually takes the form of a constraint on the foreign key declaration. In PostgreSQL it looks like this:

CONSTRAINT child_parent_id_fkey FOREIGN KEY (parent_id)
REFERENCES parent_table(id) MATCH SIMPLE
ON DELETE CASCADE

This means that when you delete a record from parent_table, then all the corresponding rows in child_table will be deleted for you by the database. It’s fast and reliable and probably your best bet. You set this up in SqlAlchemy through ForeignKey like this (part of the child table definition):

parent_id = db.Column(db.Integer, db.ForeignKey('parent.id', ondelete='CASCADE'))
parent = db.relationship('Parent', backref=backref('children', passive_deletes=True))

The ondelete='CASCADE' is the part that creates the ON DELETE CASCADE on the table.

Gotcha!

There’s an important caveat here. Notice how I have a relationship specified with passive_deletes=True? If you don’t have that, the entire thing will not work. This is because by default when you delete a parent record SqlAlchemy does something really weird. It sets the foreign keys of all child rows to NULL. So if you delete a row from parent_table where id = 5, then it will basically execute

UPDATE child_table SET parent_id = NULL WHERE parent_id = 5

Why you would want this I have no idea. I’d be surprised if many database engines even allowed you to set a valid foreign key to NULL, creating an orphan. Seems like a bad idea, but maybe there’s a use case. Anyway, if you let SqlAlchemy do this, you will prevent the database from being able to clean up the children using the ON DELETE CASCADE that you set up. This is because it relies on those foreign keys to know which child rows to delete. Once SqlAlchemy has set them all to NULL, the database can’t delete them. Setting the passive_deletes=True prevents SqlAlchemy from NULLing out the foreign keys.

You can read more about passive deletes in the SqlAlchemy docs.

Option 2

The other way you can do it is to let SqlAlchemy do it for you. This is set up using the cascade argument of the relationship. If you have the relationship defined on the parent table, it looks like this:

children = relationship('Child', cascade='all,delete', backref='parent')

If the relationship is on the child, you do it like this:

parent = relationship('Parent', backref=backref('children', cascade='all,delete'))

Again, this is the child, so you have to call a method called backref and put the cascade data in there.

With this in place, when you delete a parent row, SqlAlchemy will actually run delete statements for you to clean up the child rows. This will likely not be as efficient as letting the database handle it for you, so I don't recommend it.

Here are the SqlAlchemy docs on the cascading features it supports.


回答 3

Steven是正确的,因为您需要显式创建backref,这将导致级联被应用到父级(而不是像在测试场景中那样被应用于子级)。

但是,在Child上定义关系不会使sqlalchemy将Child视为父级。定义关系的位置(子级或父级)都无关紧要,它的外键链接两个确定父级和子级的表。

不过,遵循一个惯例是有意义的,并且根据史蒂文的回应,我正在定义我所有与父母的孩子关系。

Steven is correct in that you need to explicitly create the backref, this results in the cascade being applied on the parent (as opposed to it being applied to the child like in the test scenario).

However, defining the relationship on the Child does NOT make sqlalchemy consider Child the parent. It doesn’t matter where the relationship is defined (child or parent), its the foreign key that links the two tables that determines which is the parent and which is the child.

It makes sense to stick to one convention though, and based on Steven’s response, I’m defining all my child relationships on the parent.


回答 4

我也曾为文档苦苦挣扎,但发现文档字符串本身往往比手册更容易理解。例如,如果您从 sqlalchemy.orm 导入 relationship 并执行 help(relationship),它会列出 cascade 可以指定的所有选项。关于 delete-orphan 的条目写道:

如果检测到子类型的项目没有父级,则将其标记为删除。
请注意,此选项会阻止子类的待处理项目在没有父级存在的情况下被持久化。

我知道您的问题更多在于定义父子关系的文档表述方式。但看起来您可能也对级联选项有疑问,因为 "all" 已经包含 "delete";"delete-orphan" 是唯一没有包含在 "all" 中的选项。

I struggled with the documentation as well, but found that the docstrings themselves tend to be easier than the manual. For example, if you import relationship from sqlalchemy.orm and do help(relationship), it will give you all the options you can specify for cascade. The bullet for delete-orphan says:

if an item of the child’s type with no parent is detected, mark it for deletion.
Note that this option prevents a pending item of the child’s class from being persisted without a parent present.

I realize your issue was more with the way the documentation for defining parent-child relationships. But it seemed that you might also be having a problem with the cascade options, because "all" includes "delete". "delete-orphan" is the only option that’s not included in "all".
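
The introspection trick itself is just the following (a sketch; the printed docstring is where the cascade options are listed):

from sqlalchemy.orm import relationship

help(relationship)  # the docstring enumerates every cascade option, including delete-orphan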


回答 5

史蒂文的答案很坚定。我想指出另外一个含义。

通过使用relationship,您将使应用层(Flask)负责引用完整性。这意味着其他不通过Flask访问数据库的进程(例如数据库实用程序或直接连接到数据库的人)将不会遇到这些约束,并且可能以破坏您如此努力设计的逻辑数据模型的方式更改数据。

尽可能使用 d512 和 Alex 描述的 ForeignKey 方法。数据库引擎非常擅长真正地(以无法绕过的方式)强制执行约束,因此这是维护数据完整性的最佳策略。只有当数据库无法处理数据完整性时(例如不支持外键的 SQLite 版本),您才需要依赖应用程序来处理它。

如果您需要在实体之间建立进一步的关联,以支持诸如在父子对象关系之间导航之类的应用行为,请将 backref 与 ForeignKey 结合使用。

Steven’s answer is solid. I’d like to point out an additional implication.

By using relationship, you’re making the app layer (Flask) responsible for referential integrity. That means other processes that access the database not through Flask, like a database utility or a person connecting to the database directly, will not experience those constraints and could change your data in a way that breaks the logical data model you worked so hard to design.

Whenever possible, use the ForeignKey approach described by d512 and Alex. The DB engine is very good at truly enforcing constraints (in an unavoidable way), so this is by far the best strategy for maintaining data integrity. The only time you need to rely on an app to handle data integrity is when the database can’t handle them, e.g. versions of SQLite that don’t support foreign keys.

If you need to create further linkage among entities to enable app behaviors like navigating parent-child object relationships, use backref in conjunction with ForeignKey.


回答 6

Stevan 的回答很完美。但如果您仍然遇到错误,可以在此基础上再尝试下面的方法:

http://vincentaudebert.github.io/python/sql/2015/10/09/cascade-delete-sqlalchemy/

从链接复制-

快速提示:如果您即使在模型中指定了级联删除,仍然在外键依赖上遇到麻烦,可以这样处理。

使用 SQLAlchemy 时,要指定级联删除,您应该在父表上写 cascade='all, delete'。好的,但当您执行类似下面的操作时:

session.query(models.yourmodule.YourParentTable).filter(conditions).delete()

实际上,它会触发有关您的子表中使用的外键的错误。

我采用的解决方案是先查询出对象,然后再删除它:

session = models.DBSession()
your_db_object = session.query(models.yourmodule.YourParentTable).filter(conditions).first()
if your_db_object is not None:
    session.delete(your_db_object)

这将删除您的父记录以及与其关联的所有子记录。

Answer by Stevan is perfect. But if you are still getting the error. Other possible try on top of that would be –

http://vincentaudebert.github.io/python/sql/2015/10/09/cascade-delete-sqlalchemy/

Copied from the link-

Quick tip if you get in trouble with a foreign key dependency even if you have specified a cascade delete in your models.

Using SQLAlchemy, to specify a cascade delete you should have cascade='all, delete' on your parent table. Ok but then when you execute something like:

session.query(models.yourmodule.YourParentTable).filter(conditions).delete()

It actually triggers an error about a foreign key used in your children tables.

The solution I used was to query the object and then delete it:

session = models.DBSession()
your_db_object = session.query(models.yourmodule.YourParentTable).filter(conditions).first()
if your_db_object is not None:
    session.delete(your_db_object)

This should delete your parent record AND all the children associated with it.


回答 7

Alex Okrushko的回答对我来说几乎是最好的。结合使用ondelete =’CASCADE’和passive_deletes = True。但是我必须做些额外的事情才能使其在sqlite中起作用。

Base = declarative_base()
ROOM_TABLE = "roomdata"
FURNITURE_TABLE = "furnituredata"

class DBFurniture(Base):
    __tablename__ = FURNITURE_TABLE
    id = Column(Integer, primary_key=True)
    room_id = Column(Integer, ForeignKey('roomdata.id', ondelete='CASCADE'))


class DBRoom(Base):
    __tablename__ = ROOM_TABLE
    id = Column(Integer, primary_key=True)
    furniture = relationship("DBFurniture", backref="room", passive_deletes=True)

确保添加此代码以确保其适用于sqlite。

from sqlalchemy import event
from sqlalchemy.engine import Engine
from sqlite3 import Connection as SQLite3Connection

@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    if isinstance(dbapi_connection, SQLite3Connection):
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA foreign_keys=ON;")
        cursor.close()

从这里偷来的:SQLAlchemy表达式语言和SQLite的删除级联

Alex Okrushko's answer almost worked best for me. I used ondelete='CASCADE' and passive_deletes=True combined. But I had to do something extra to make it work for sqlite.

Base = declarative_base()
ROOM_TABLE = "roomdata"
FURNITURE_TABLE = "furnituredata"

class DBFurniture(Base):
    __tablename__ = FURNITURE_TABLE
    id = Column(Integer, primary_key=True)
    room_id = Column(Integer, ForeignKey('roomdata.id', ondelete='CASCADE'))


class DBRoom(Base):
    __tablename__ = ROOM_TABLE
    id = Column(Integer, primary_key=True)
    furniture = relationship("DBFurniture", backref="room", passive_deletes=True)

Make sure to add this code to ensure it works for sqlite.

from sqlalchemy import event
from sqlalchemy.engine import Engine
from sqlite3 import Connection as SQLite3Connection

@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    if isinstance(dbapi_connection, SQLite3Connection):
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA foreign_keys=ON;")
        cursor.close()

Stolen from here: SQLAlchemy expression language and SQLite’s on delete cascade


回答 8

TLDR:如果上述解决方案不起作用,请尝试将nullable = False添加到您的列中。

我想在这里为一些可能无法使层叠功能与现有解决方案配合使用的人提供一个小技巧(很棒)。我的工作和示例之间的主要区别是我使用了自动映射。我不确切知道这可能如何影响级联的设置,但是我想指出我使用了它。我也在使用SQLite数据库。

我尝试了这里描述的所有解决方案,但当父行被删除时,子表中行的外键仍然被设置为 null,都无济于事。不过,一旦我把带外键的子列设置为 nullable=False,级联就生效了。

在子表上,我添加了:

Column('parent_id', Integer(), ForeignKey('parent.id', ondelete="CASCADE"), nullable=False)
Child.parent = relationship("parent", backref=backref("children", passive_deletes=True))

通过此设置,级联可以按预期运行。

TLDR: If the above solutions don’t work, try adding nullable=False to your column.

I’d like to add a small point here for some people who may not get the cascade function to work with the existing solutions (which are great). The main difference between my work and the example was that I used automap. I do not know exactly how that might interfere with the setup of cascades, but I want to note that I used it. I am also working with a SQLite database.

I tried every solution described here, but rows in my child table continued to have their foreign key set to null when the parent row was deleted. However, the cascade worked once I set the child column with the foreign key to nullable=False.

On the child table, I added:

Column('parent_id', Integer(), ForeignKey('parent.id', ondelete="CASCADE"), nullable=False)
Child.parent = relationship("parent", backref=backref("children", passive_deletes=True))

With this setup, the cascade functioned as expected.


如何在Django中重置数据库?我收到命令“重置”未找到错误

问题:如何在Django中重置数据库?我收到命令“重置”未找到错误

按照这里的 Django by Example 教程学习:http://lightbird.net/dbe/todo_list.html

该教程说:

“这改变了我们的表布局,我们必须要求Django重置并重新创建表:

manage.py reset todo; manage.py syncdb

但是,当我运行时manage.py reset todo,出现错误:

$ python manage.py reset todo                                       
- Unknown command: 'reset'

这是因为我使用的是sqlite3而不是Postgresql吗?

有人可以告诉我重置数据库的命令是什么吗?

命令:python manage.py sqlclear todo返回错误:

$ python manage.py sqlclear todo    
CommandError: App with label todo could not be found.    
Are you sure your INSTALLED_APPS setting is correct?

因此,我在settings.py的INSTALLED_APPS中添加了“ todo”,然后python manage.py sqlclear todo再次运行,导致此错误:

$ python manage.py sqlclear todo                                      
- NameError: name 'admin' is not defined

Following this Django by Example tutorial here: http://lightbird.net/dbe/todo_list.html

The tutorial says:

“This changes our table layout and we’ll have to ask Django to reset and recreate tables:

manage.py reset todo; manage.py syncdb

though, when I run manage.py reset todo, I get the error:

$ python manage.py reset todo                                       
- Unknown command: 'reset'

Is this because I am using sqlite3 and not postgresql?

Can somebody tell me what the command is to reset the database?

The command: python manage.py sqlclear todo returns the error:

$ python manage.py sqlclear todo    
CommandError: App with label todo could not be found.    
Are you sure your INSTALLED_APPS setting is correct?

So I added ‘todo’ to my INSTALLED_APPS in settings.py, and ran python manage.py sqlclear todo again, resulting in this error:

$ python manage.py sqlclear todo                                      
- NameError: name 'admin' is not defined

回答 0

在 Django 1.5 中,reset 已被 flush 取代,请参阅:

python manage.py help flush

reset has been replaced by flush with Django 1.5, see:

python manage.py help flush

回答 1

看起来“冲洗”答案适用于部分情况,但不是全部情况。我不仅需要刷新数据库中的值,还需要正确地重新创建表。我还没有使用迁移(早期),所以我确实需要删除所有表。

我发现删除所有表的两种方法都需要核心django以外的东西。

如果您使用的是Heroku,请使用pg:reset删除所有表:

heroku pg:reset DATABASE_URL
heroku run python manage.py syncdb

如果您可以安装Django扩展,则可以通过以下方式完全重置:

python ./manage.py reset_db --router=default

It looks like the ‘flush’ answer will work for some, but not all cases. I needed not just to flush the values in the database, but to recreate the tables properly. I’m not using migrations yet (early days) so I really needed to drop all the tables.

Two ways I’ve found to drop all tables, both require something other than core django.

If you’re on Heroku, drop all the tables with pg:reset:

heroku pg:reset DATABASE_URL
heroku run python manage.py syncdb

If you can install Django Extensions, it has a way to do a complete reset:

python ./manage.py reset_db --router=default

回答 2

与LisaD的答案类似,Django扩展具有出色的reset_db命令,该命令可以完全删除所有内容,而不仅仅是像“ flush”一样截断表。

python ./manage.py reset_db

仅仅刷新表并不能解决在删除对象时发生的永久性错误。进行reset_db解决了该问题。

Similar to LisaD’s answer, Django Extensions has a great reset_db command that totally drops everything, instead of just truncating the tables like “flush” does.

python ./manage.py reset_db

Merely flushing the tables wasn’t fixing a persistent error that occurred when I was deleting objects. Doing a reset_db fixed the problem.


回答 3

如果您使用的是Django 2.0,那么

python manage.py flush 

将工作

if you are using Django 2.0 Then

python manage.py flush 

will work


回答 4

如果要清空整个数据库,可以使用:python manage.py flush。如果要清空某个 Django 应用的数据库表,可以使用:python manage.py migrate appname zero

If you want to clean the whole database, you can use: python manage.py flush If you want to clean database table of a Django app, you can use: python manage.py migrate appname zero


回答 5

使用 django 1.11 时,只需从每个应用程序的 migrations 文件夹中删除所有迁移文件(除 __init__.py 之外的所有文件)。然后

  1. 手动删除数据库。
  2. 手动创建数据库。
  3. 运行python3 manage.py makemigrations
  4. 运行python3 manage.py migrate

瞧,您的数据库已完全重置。

With django 1.11, simply delete all migration files from the migrations folder of each application (all files except __init__.py). Then

  1. Manually drop database.
  2. Manually create database.
  3. Run python3 manage.py makemigrations.
  4. Run python3 manage.py migrate.

And voilà, your database has been completely reset.


回答 6

对我来说,这解决了问题。

heroku pg:reset DATABASE_URL

heroku run bash
>> Inside heroku bash
cd app_name && rm -rf migrations && cd ..
./manage.py makemigrations app_name
./manage.py migrate

For me this solved the problem.

heroku pg:reset DATABASE_URL

heroku run bash
>> Inside heroku bash
cd app_name && rm -rf migrations && cd ..
./manage.py makemigrations app_name
./manage.py migrate

回答 7

只是跟进@LisaD的答案。
截至2016年(Django 1.9),您需要输入:

heroku pg:reset DATABASE_URL
heroku run python manage.py makemigrations
heroku run python manage.py migrate

这将为您提供Heroku中的全新数据库。

Just a follow up to @LisaD’s answer.
As of 2016 (Django 1.9), you need to type:

heroku pg:reset DATABASE_URL
heroku run python manage.py makemigrations
heroku run python manage.py migrate

This will give you a fresh new database within Heroku.


回答 8

  1. 只需手动删除数据库即可。确保首先创建备份(在我的情况下db.sqlite3是我的数据库)

  2. 运行此命令 manage.py migrate

  1. Just manually delete your database. Ensure you create a backup first (in my case db.sqlite3 is my database)

  2. Run this command manage.py migrate


回答 9

python manage.py flush

删除旧的数据库内容,

不要忘记创建新的超级用户:

python manage.py createsuperuser
python manage.py flush

deleted old db contents,

Don’t forget to create new superuser:

python manage.py createsuperuser

使用Python将CSV文件导入sqlite3数据库表

问题:使用Python将CSV文件导入sqlite3数据库表

我有一个CSV文件,我想使用Python将该文件批量导入到我的sqlite3数据库中。该命令是“ .import …..”。但似乎不能这样工作。谁能给我一个在sqlite3中如何做的例子?我正在使用Windows,以防万一。谢谢

I have a CSV file and I want to bulk-import this file into my sqlite3 database using Python. the command is “.import …..”. but it seems that it cannot work like this. Can anyone give me an example of how to do it in sqlite3? I am using windows just in case. Thanks


回答 0

import csv, sqlite3

con = sqlite3.connect(":memory:") # change to 'sqlite:///your_filename.db'
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);") # use your column names here

with open('data.csv','r') as fin: # `with` statement available in 2.5+
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(fin) # comma is default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
import csv, sqlite3

con = sqlite3.connect(":memory:") # change to 'sqlite:///your_filename.db'
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);") # use your column names here

with open('data.csv','r') as fin: # `with` statement available in 2.5+
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(fin) # comma is default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()

回答 1

留给读者一个创建与磁盘上文件的sqlite连接的练习…但是熊猫库现在提供了两种方法

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)

Creating an sqlite connection to a file on disk is left as an exercise for the reader … but there is now a two-liner made possible by the pandas library

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)
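
For completeness, a sketch of that "exercise" with an on-disk database (the file names and table name are placeholders):

import sqlite3
import pandas

conn = sqlite3.connect("data.db")  # a file on disk rather than :memory:
df = pandas.read_csv("data.csv")
df.to_sql("mytable", conn, if_exists="append", index=False)
conn.close()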

回答 2

我的2美分(更通用):

import csv, sqlite3
import logging

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite

    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, outputToFile = False):
    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)

        con = sqlite3.connect(":memory:")
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()

    return con

My 2 cents (more generic):

import csv, sqlite3
import logging

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite

    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, outputToFile = False):
    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)

        con = sqlite3.connect(":memory:")
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()

    return con

回答 3

.import命令是sqlite3命令行工具的功能。要在Python中完成此操作,您应该简单地使用Python具备的任何功能(例如csv模块)加载数据,然后照常插入数据。

这样,您还可以控制要插入的类型,而不必依赖sqlite3似乎没有记录的行为。

The .import command is a feature of the sqlite3 command-line tool. To do it in Python, you should simply load the data using whatever facilities Python has, such as the csv module, and inserting the data as per usual.

This way, you also have control over what types are inserted, rather than relying on sqlite3’s seemingly undocumented behaviour.


回答 4

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys, csv, sqlite3

def main():
    con = sqlite3.connect(sys.argv[1]) # database file input
    cur = con.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (COL1 TEXT, COL2 TEXT);
        """) # checks to see if table exists and makes a fresh table.

    with open(sys.argv[2], "rb") as f: # CSV file input
        reader = csv.reader(f, delimiter=',') # no header information with delimiter
        for row in reader:
            to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8")] # decode the CSV text fields as UTF-8 (Python 2)
            cur.execute("INSERT INTO t (COL1, COL2) VALUES(?, ?);", to_db)
            con.commit()
    con.close() # closes connection to database

if __name__=='__main__':
    main()
#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys, csv, sqlite3

def main():
    con = sqlite3.connect(sys.argv[1]) # database file input
    cur = con.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (COL1 TEXT, COL2 TEXT);
        """) # checks to see if table exists and makes a fresh table.

    with open(sys.argv[2], "rb") as f: # CSV file input
        reader = csv.reader(f, delimiter=',') # no header information with delimiter
        for row in reader:
            to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8")] # decode the CSV text fields as UTF-8 (Python 2)
            cur.execute("INSERT INTO t (COL1, COL2) VALUES(?, ?);", to_db)
            con.commit()
    con.close() # closes connection to database

if __name__=='__main__':
    main()

回答 5

非常感谢bernie的回答!必须进行一些调整-这对我有用:

import csv, sqlite3
conn = sqlite3.connect("pcfc.sl3")
curs = conn.cursor()
curs.execute("CREATE TABLE PCFC (id INTEGER PRIMARY KEY, type INTEGER, term TEXT, definition TEXT);")
reader = csv.reader(open('PC.txt', 'r'), delimiter='|')
for row in reader:
    to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8"), unicode(row[2], "utf8")]
    curs.execute("INSERT INTO PCFC (type, term, definition) VALUES (?, ?, ?);", to_db)
conn.commit()

我的文本文件(PC.txt)如下所示:

1 | Term 1 | Definition 1
2 | Term 2 | Definition 2
3 | Term 3 | Definition 3

Many thanks for bernie’s answer! Had to tweak it a bit – here’s what worked for me:

import csv, sqlite3
conn = sqlite3.connect("pcfc.sl3")
curs = conn.cursor()
curs.execute("CREATE TABLE PCFC (id INTEGER PRIMARY KEY, type INTEGER, term TEXT, definition TEXT);")
reader = csv.reader(open('PC.txt', 'r'), delimiter='|')
for row in reader:
    to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8"), unicode(row[2], "utf8")]
    curs.execute("INSERT INTO PCFC (type, term, definition) VALUES (?, ?, ?);", to_db)
conn.commit()

My text file (PC.txt) looks like this:

1 | Term 1 | Definition 1
2 | Term 2 | Definition 2
3 | Term 3 | Definition 3

回答 6

没错,.import 是正确的方法,但那是 SQLite3.exe shell 的命令。这个问题的很多高票答案都用到原生 python 循环,但如果文件很大(我的是 10^6 到 10^7 条记录),您会想避免把所有内容读进 pandas,也想避免原生 python 列表推导/循环(尽管我没有对它们计时比较)。

对于大文件,我认为最好的选择是先用 sqlite3.execute("CREATE TABLE...") 创建空表,从 CSV 文件中去掉表头,然后用 subprocess.run() 执行 sqlite 的 import 语句。我认为最后一部分最为关键,所以从它开始讲。

subprocess.run()

import subprocess
from pathlib import Path
db_name = Path('my.db').resolve()
csv_file = Path('file.csv').resolve()
result = subprocess.run(['sqlite3',
                         str(db_name),
                         '-cmd',
                         '.mode csv',
                         '.import '+str(csv_file).replace('\\','\\\\')
                                 +' <table_name>'],
                        capture_output=True)

解释
在命令行中,您要找的命令是 sqlite3 my.db -cmd ".mode csv" ".import file.csv table"。subprocess.run() 会运行一个命令行进程。subprocess.run() 的参数是一个字符串序列,被解释为一条命令及其所有参数。

  • sqlite3 my.db 打开数据库
  • 在数据库之后加上 -cmd 标志,可以向 sqlite 程序传递多条后续命令。在 shell 中,每条命令都必须用引号括起来,但在这里,它们只需要是序列中各自独立的元素即可
  • '.mode csv' 符合您的期望
  • '.import '+str(csv_file).replace('\\','\\\\')+' <table_name>' 是导入命令。
    不幸的是,由于 subprocess 会把 -cmd 之后的所有命令作为带引号的字符串传递,如果您使用 Windows 目录路径,就需要把反斜杠写成双反斜杠。

剥离标题

这并不是问题的重点,但这是我所使用的。同样,我不想在任何时候将整个文件读入内存:

with open(csv, "r") as source:
    source.readline()
    with open(str(csv)+"_nohead", "w") as target:
        shutil.copyfileobj(source, target)

You’re right that .import is the way to go, but that’s a command from the SQLite3.exe shell. A lot of the top answers to this question involve native python loops, but if your files are large (mine are 10^6 to 10^7 records), you want to avoid reading everything into pandas or using a native python list comprehension/loop (though I did not time them for comparison). Since the last part is, I believe, the most pertinent, I will start with that.

For large files, I believe the best option is to create the empty table in advance using sqlite3.execute("CREATE TABLE..."), strip the headers from your CSV files, and then use subprocess.run() to execute sqlite’s import statement. Since the last part is I believe the most pertinent, I will start with that.

subprocess.run()

import subprocess
from pathlib import Path
db_name = Path('my.db').resolve()
csv_file = Path('file.csv').resolve()
result = subprocess.run(['sqlite3',
                         str(db_name),
                         '-cmd',
                         '.mode csv',
                         '.import '+str(csv_file).replace('\\','\\\\')
                                 +' <table_name>'],
                        capture_output=True)

Explanation
From the command line, the command you’re looking for is sqlite3 my.db -cmd ".mode csv" ".import file.csv table". subprocess.run() runs a command-line process. The argument to subprocess.run() is a sequence of strings which are interpreted as a command followed by all of its arguments.

  • sqlite3 my.db opens the database
  • -cmd flag after the database allows you to pass multiple follow on commands to the sqlite program. In the shell, each command has to be in quotes, but here, they just need to be their own element of the sequence
  • '.mode csv' does what you’d expect
  • '.import '+str(csv_file).replace('\\','\\\\')+' <table_name>' is the import command.
    Unfortunately, since subprocess passes all follow-ons to -cmd as quoted strings, you need to double up your backslashes if you have a Windows directory path.

Stripping Headers

Not really the main point of the question, but here’s what I used. Again, I didn’t want to read the whole files into memory at any point:

with open(csv, "r") as source:
    source.readline()
    with open(str(csv)+"_nohead", "w") as target:
        shutil.copyfileobj(source, target)


回答 7

基于Guy L解决方案(喜欢它),但可以处理转义的字段。

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]        
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite

    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile,dbFile,tablename, outputToFile = False):

    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()
        con.close()

Based on Guy L solution (Love it) but can handle escaped fields.

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]        
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite

    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile,dbFile,tablename, outputToFile = False):

    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()
        con.close()

回答 8

您可以使用 blaze 和 odo 高效地完成此操作

import blaze as bz
csv_path = 'data.csv'
bz.odo(csv_path, 'sqlite:///data.db::data')

Odo 会把该 csv 文件存入 data.db(sqlite 数据库)的 data 表下

或者您也可以直接使用 odo,而不经过 blaze。两种方式都可以。请阅读这份文档

You can do this using blaze & odo efficiently

import blaze as bz
csv_path = 'data.csv'
bz.odo(csv_path, 'sqlite:///data.db::data')

Odo will store the csv file to data.db (sqlite database) under the schema data

Or you use odo directly, without blaze. Either ways is fine. Read this documentation


回答 9

如果必须将 CSV 文件作为 python 程序的一部分导入,为了简便和效率,可以按照下面的方式使用 os.system:

import os

cmd = """sqlite3 database.db <<< ".import input.csv mytable" """

rc = os.system(cmd)

print(rc)

关键是,通过指定数据库的文件名,假设读取数据没有错误,数据将自动保存。

If the CSV file must be imported as part of a python program, then for simplicity and efficiency, you could use os.system along the lines suggested by the following:

import os

cmd = """sqlite3 database.db <<< ".import input.csv mytable" """

rc = os.system(cmd)

print(rc)

The point is that by specifying the filename of the database, the data will automatically be saved, assuming there are no errors reading it.


回答 10

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]        
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite

    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile,dbFile,tablename, outputToFile = False):

    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()
        con.close()
import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]        
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite

    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile,dbFile,tablename, outputToFile = False):

    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()
        con.close()

回答 11

为了简单起见,您可以使用项目的Makefile中的sqlite3命令行工具。

%.sql3: %.csv
    rm -f $@
    sqlite3 $@ -echo -cmd ".mode csv" ".import $< $*"
%.dump: %.sql3
    sqlite3 $< "select * from $*"

make test.sql3然后从现有的test.csv文件使用单个表“ test”创建sqlite数据库。然后make test.dump,您可以验证内容。

In the interest of simplicity, you could use the sqlite3 command-line tool from the Makefile of your project.

%.sql3: %.csv
    rm -f $@
    sqlite3 $@ -echo -cmd ".mode csv" ".import $< $*"
%.dump: %.sql3
    sqlite3 $< "select * from $*"

make test.sql3 then creates the sqlite database from an existing test.csv file, with a single table “test”. you can then make test.dump to verify the contents.


回答 12

我发现可能有必要把从 csv 到数据库的数据传输分批进行,以免耗尽内存。可以这样完成:

import csv
import sqlite3
from operator import itemgetter

# Establish connection
conn = sqlite3.connect("mydb.db")

# Create the table 
conn.execute(
    """
    CREATE TABLE persons(
        person_id INTEGER,
        last_name TEXT, 
        first_name TEXT, 
        address TEXT
    )
    """
)

# These are the columns from the csv that we want
cols = ["person_id", "last_name", "first_name", "address"]

# If the csv file is huge, we instead add the data in chunks
chunksize = 10000

# Parse csv file and populate db in chunks
with conn, open("persons.csv") as f:
    reader = csv.DictReader(f)

    chunk = []
    for i, row in enumerate(reader):

        if i % chunksize == 0 and i > 0:
            conn.executemany(
                """
                INSERT INTO persons
                    VALUES(?, ?, ?, ?)
                """, chunk
            )
            chunk = []

        items = itemgetter(*cols)(row)
        chunk.append(items)

    # Flush the final, partial chunk
    if chunk:
        conn.executemany(
            """
            INSERT INTO persons
                VALUES(?, ?, ?, ?)
            """, chunk
        )

I’ve found that it can be necessary to break up the transfer of data from the csv to the database in chunks so as not to run out of memory. This can be done like this:

import csv
import sqlite3
from operator import itemgetter

# Establish connection
conn = sqlite3.connect("mydb.db")

# Create the table 
conn.execute(
    """
    CREATE TABLE persons(
        person_id INTEGER,
        last_name TEXT, 
        first_name TEXT, 
        address TEXT
    )
    """
)

# These are the columns from the csv that we want
cols = ["person_id", "last_name", "first_name", "address"]

# If the csv file is huge, we instead add the data in chunks
chunksize = 10000

# Parse csv file and populate db in chunks
with conn, open("persons.csv") as f:
    reader = csv.DictReader(f)

    chunk = []
    for i, row in enumerate(reader):

        if i % chunksize == 0 and i > 0:
            conn.executemany(
                """
                INSERT INTO persons
                    VALUES(?, ?, ?, ?)
                """, chunk
            )
            chunk = []

        items = itemgetter(*cols)(row)
        chunk.append(items)

    # Flush the final, partial chunk
    if chunk:
        conn.executemany(
            """
            INSERT INTO persons
                VALUES(?, ?, ?, ?)
            """, chunk
        )


SQLAlchemy:如何过滤日期字段?

问题:SQLAlchemy:如何过滤日期字段?

这是模型:

class User(Base):
    ...
    birthday = Column(Date, index=True)   #in database it's like '1987-01-17'
    ...

我想在两个日期之间进行过滤,例如选出年龄在 18-30 岁之间的所有用户。

如何用SQLAlchemy实现它?

我想:

query = DBSession.query(User).filter(
    and_(User.birthday >= '1988-01-17', User.birthday <= '1985-01-17')
) 

# means age >= 24 and age <= 27

我知道这是不正确的,但是该怎么做正确呢?

Here is model:

class User(Base):
    ...
    birthday = Column(Date, index=True)   #in database it's like '1987-01-17'
    ...

I want to filter between two dates, for example to choose all users in interval 18-30 years.

How to implement it with SQLAlchemy?

I think of:

query = DBSession.query(User).filter(
    and_(User.birthday >= '1988-01-17', User.birthday <= '1985-01-17')
) 

# means age >= 24 and age <= 27

I know this is not correct, but how to do correct?


回答 0

实际上,除了笔误之外,您的查询是对的:您的过滤器会排除所有记录;您应该把 <= 换成 >=,反之亦然:

qry = DBSession.query(User).filter(
        and_(User.birthday <= '1988-01-17', User.birthday >= '1985-01-17'))
# or same:
qry = DBSession.query(User).filter(User.birthday <= '1988-01-17').\
        filter(User.birthday >= '1985-01-17')

您也可以使用between

qry = DBSession.query(User).filter(User.birthday.between('1985-01-17', '1988-01-17'))

In fact, your query is right except for the typo: your filter is excluding all records: you should change the <= for >= and vice versa:

qry = DBSession.query(User).filter(
        and_(User.birthday <= '1988-01-17', User.birthday >= '1985-01-17'))
# or same:
qry = DBSession.query(User).filter(User.birthday <= '1988-01-17').\
        filter(User.birthday >= '1985-01-17')

Also you can use between:

qry = DBSession.query(User).filter(User.birthday.between('1985-01-17', '1988-01-17'))
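
To filter by the age range the question actually asks about (18-30 years), the bounds can be computed rather than hard-coded. A sketch, assuming the question's User model and DBSession, with deliberately approximate year arithmetic:

from datetime import date, timedelta

today = date.today()
youngest_birthday = today - timedelta(days=18 * 365)  # born roughly 18 years ago
oldest_birthday = today - timedelta(days=30 * 365)    # born roughly 30 years ago
qry = DBSession.query(User).filter(
    User.birthday.between(oldest_birthday, youngest_birthday))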

回答 1

from app import SQLAlchemyDB as db

Chance.query.filter(Chance.repo_id==repo_id, 
                    Chance.status=="1", 
                    db.func.date(Chance.apply_time)<=end, 
                    db.func.date(Chance.apply_time)>=start).count()

它等于:

select
   count(id)
from
   Chance
where
   repo_id=:repo_id 
   and status='1'
   and date(apple_time) <= end
   and date(apple_time) >= start

希望可以帮助你。

from app import SQLAlchemyDB as db

Chance.query.filter(Chance.repo_id==repo_id, 
                    Chance.status=="1", 
                    db.func.date(Chance.apply_time)<=end, 
                    db.func.date(Chance.apply_time)>=start).count()

it is equal to:

select
   count(id)
from
   Chance
where
   repo_id=:repo_id 
   and status='1'
   and date(apple_time) <= end
   and date(apple_time) >= start

wish can help you.


回答 2

如果要获得整个期间:

    from sqlalchemy import and_, func

    query = DBSession.query(User).filter(and_(func.date(User.birthday) >= '1985-01-17',
                                              func.date(User.birthday) <= '1988-01-17'))

这表示范围:1985-01-17 00:00 到 1988-01-17 23:59

if you want to get the whole period:

    from sqlalchemy import and_, func

    query = DBSession.query(User).filter(and_(func.date(User.birthday) >= '1985-01-17',
                                              func.date(User.birthday) <= '1988-01-17'))

That means the range: 1985-01-17 00:00 to 1988-01-17 23:59


使用Python插入MySQL数据库后,如何获得“ id”?

问题:使用Python插入MySQL数据库后,如何获得“ id”?

我执行INSERT INTO语句

cursor.execute("INSERT INTO mytable(height) VALUES(%s)",(height))

我想获取主键。

我的表有2列:

id      primary, auto increment
height  this is the other column.

我刚插入这个后,如何获得“ id”?

I execute an INSERT INTO statement

cursor.execute("INSERT INTO mytable(height) VALUES(%s)",(height))

and I want to get the primary key.

My table has 2 columns:

id      primary, auto increment
height  this is the other column.

How do I get the “id”, after I just inserted this?


回答 0

使用 cursor.lastrowid 获取该游标对象上最后插入行的 ID,或使用 connection.insert_id() 获取该连接上最后一次插入所产生的 ID。

Use cursor.lastrowid to get the last row ID inserted on the cursor object, or connection.insert_id() to get the ID from the last insert on that connection.
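
A sketch in context (connection parameters are placeholders; note that the query parameter must be a 1-tuple (height,), which the question's snippet gets slightly wrong):

import MySQLdb

conn = MySQLdb.connect(user="root", db="test")  # placeholder credentials
cursor = conn.cursor()
height = 180
cursor.execute("INSERT INTO mytable (height) VALUES (%s)", (height,))
print(cursor.lastrowid)  # auto-increment id of the row just inserted
conn.commit()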


回答 1

另外,还可以用 cursor.lastrowid(MySQLdb 支持的 dbapi/PEP249 扩展):

>>> import MySQLdb
>>> connection = MySQLdb.connect(user='root')
>>> cursor = connection.cursor()
>>> cursor.execute('INSERT INTO sometable VALUES (...)')
1L
>>> connection.insert_id()
3L
>>> cursor.lastrowid
3L
>>> cursor.execute('SELECT last_insert_id()')
1L
>>> cursor.fetchone()
(3L,)
>>> cursor.execute('select @@identity')
1L
>>> cursor.fetchone()
(3L,)

cursor.lastrowid 比 connection.insert_id() 略便宜一些,而且比再往返一次 MySQL 便宜得多。

Also, cursor.lastrowid (a dbapi/PEP249 extension supported by MySQLdb):

>>> import MySQLdb
>>> connection = MySQLdb.connect(user='root')
>>> cursor = connection.cursor()
>>> cursor.execute('INSERT INTO sometable VALUES (...)')
1L
>>> connection.insert_id()
3L
>>> cursor.lastrowid
3L
>>> cursor.execute('SELECT last_insert_id()')
1L
>>> cursor.fetchone()
(3L,)
>>> cursor.execute('select @@identity')
1L
>>> cursor.fetchone()
(3L,)

cursor.lastrowid is somewhat cheaper than connection.insert_id() and much cheaper than another round trip to MySQL.


回答 2

Python DBAPI 规范还为游标对象定义了 'lastrowid' 属性,因此……

id = cursor.lastrowid

…也应该起作用,并且显然基于每个连接。

Python DBAPI spec also define ‘lastrowid’ attribute for cursor object, so…

id = cursor.lastrowid

…should work too, and it’s per-connection based obviously.


回答 3

SELECT @@IDENTITY AS 'Identity';

要么

SELECT last_insert_id();
SELECT @@IDENTITY AS 'Identity';

or

SELECT last_insert_id();

Django Model()与Model.objects.create()

问题:Django Model()与Model.objects.create()

运行两个命令有什么区别:

foo = FooModel()

bar = BarModel.objects.create()

第二种写法是否会立即在数据库中创建一个 BarModel,而对于 FooModel,是否必须显式调用 save() 方法才能把它加入数据库?

What is the difference between running these two commands:

foo = FooModel()

and

bar = BarModel.objects.create()

Does the second one immediately create a BarModel in the database, while for FooModel, the save() method has to be called explicitly to add it to the database?


回答 0

https://docs.djangoproject.com/zh-CN/stable/topics/db/queries/#creating-objects

要在一个步骤中创建和保存对象,请使用create()方法。

https://docs.djangoproject.com/en/stable/topics/db/queries/#creating-objects

To create and save an object in a single step, use the create() method.
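
In other words, using the question's model names, these two are roughly equivalent (a sketch; create() additionally forces an INSERT rather than allowing an UPDATE):

# one step
bar = BarModel.objects.create()

# two steps
foo = FooModel()
foo.save()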


回答 1

两种语法不等效,并且可能导致意外错误。这是一个显示差异的简单示例。如果您有模型:

from django.db import models

class Test(models.Model):

    added = models.DateTimeField(auto_now_add=True)

然后创建第一个对象:

foo = Test.objects.create(pk=1)

然后尝试使用相同的主键创建一个对象:

foo_duplicate = Test.objects.create(pk=1)
# returns the error:
# django.db.utils.IntegrityError: (1062, "Duplicate entry '1' for key 'PRIMARY'")

foo_duplicate = Test(pk=1).save()
# returns the error:
# django.db.utils.IntegrityError: (1048, "Column 'added' cannot be null")
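# 注:设置了 pk 时,save() 会先尝试 UPDATE;而 auto_now_add 只在 INSERT 时
# 填充 added 字段,因此这里报的是列不能为空的错误。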

The two syntaxes are not equivalent, and mixing them can lead to unexpected errors. Here is a simple example showing the difference. If you have this model:

from django.db import models

class Test(models.Model):

    added = models.DateTimeField(auto_now_add=True)

And you create a first object:

foo = Test.objects.create(pk=1)

Then you try to create an object with the same primary key:

foo_duplicate = Test.objects.create(pk=1)
# returns the error:
# django.db.utils.IntegrityError: (1062, "Duplicate entry '1' for key 'PRIMARY'")

foo_duplicate = Test(pk=1).save()
# returns the error:
# django.db.utils.IntegrityError: (1048, "Column 'added' cannot be null")
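# Note: with pk set, save() attempts an UPDATE first, and auto_now_add only
# populates `added` on INSERT, hence the NULL-column error here.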

回答 2

更新15.3.2017:

我已经就此开了一个 Django issue,目前似乎已被初步接受:https://code.djangoproject.com/ticket/27825

我的经验是,在 Django 1.10.5 中通过引用使用构造器(ORM)类时,数据可能出现一些不一致(即,所创建对象的属性可能保留输入数据的类型,而不是 ORM 对象属性应强制转换成的类型)。示例:

models

class Payment(models.Model):
    # DecimalField 必须指定 max_digits 和 decimal_places(此处取值仅作示例)
    amount_cash = models.DecimalField(max_digits=12, decimal_places=2)

some_test.py(使用 objects.create)

from copy import deepcopy

class SomeTestCase:
    def generate_orm_obj(self, _constructor, base_data=None, modifiers=({},)):
        objs = []
        if not base_data:
            base_data = {'amount_cash': 123.00}
        for modifier in modifiers:
            actual_data = deepcopy(base_data)
            actual_data.update(modifier)
            # Hacky fix,
            _obj = _constructor.objects.create(**actual_data)
            print(type(_obj.amount_cash))  # Decimal
            objs.append(_obj)
        return objs

some_test.py(使用 Constructor())

from copy import deepcopy

class SomeTestCase:
    def generate_orm_obj(self, _constructor, base_data=None, modifiers=({},)):
        objs = []
        if not base_data:
            base_data = {'amount_cash': 123.00}
        for modifier in modifiers:
            actual_data = deepcopy(base_data)
            actual_data.update(modifier)
            # Hacky fix,
            _obj = _constructor(**actual_data)
            print(type(_obj.amount_cash))  # Float
            objs.append(_obj)
        return objs

UPDATE 15.3.2017:

I have opened a Django issue on this, and it seems to have been preliminarily accepted here: https://code.djangoproject.com/ticket/27825

My experience is that when using the Constructor (ORM) class by reference with Django 1.10.5, there may be some inconsistencies in the data (i.e. the attributes of the created object may keep the type of the input data instead of the cast type of the ORM object property). Example:

models

class Payment(models.Model):
    # DecimalField requires max_digits and decimal_places (values here are illustrative)
    amount_cash = models.DecimalField(max_digits=12, decimal_places=2)

some_test.py with objects.create

from copy import deepcopy

class SomeTestCase:
    def generate_orm_obj(self, _constructor, base_data=None, modifiers=({},)):
        objs = []
        if not base_data:
            base_data = {'amount_cash': 123.00}
        for modifier in modifiers:
            actual_data = deepcopy(base_data)
            actual_data.update(modifier)
            # Hacky fix,
            _obj = _constructor.objects.create(**actual_data)
            print(type(_obj.amount_cash))  # Decimal
            objs.append(_obj)
        return objs

some_test.py with Constructor()

from copy import deepcopy

class SomeTestCase:
    def generate_orm_obj(self, _constructor, base_data=None, modifiers=({},)):
        objs = []
        if not base_data:
            base_data = {'amount_cash': 123.00}
        for modifier in modifiers:
            actual_data = deepcopy(base_data)
            actual_data.update(modifier)
            # Hacky fix,
            _obj = _constructor(**actual_data)
            print(type(_obj.amount_cash))  # Float
            objs.append(_obj)
        return objs
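
A minimal sketch of the same inconsistency outside a test case, assuming the Payment model above (behavior as described in the ticket):

p = Payment(amount_cash=123.00)
print(type(p.amount_cash))   # float: the in-memory attribute keeps the input type
p.save()
p.refresh_from_db()
print(type(p.amount_cash))   # Decimal: reloading applies the field's cast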

Dataset-面向SQL数据存储的易用数据处理工具,支持隐式建表、批量加载和事务

DataSet:面向懒人的数据库

简而言之,dataset 让读写数据库中的数据像读写 JSON 文件一样简单。

Read the docs

要安装 dataset,请使用 pip 获取:

$ pip install dataset

注意:从 1.0 版本起,dataset 被拆分为两个包,数据导出功能现已提取到独立的 datafreeze 包中,请参阅相关存储库 here。
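
下面是一个最小用法草图(表名 user 及字段均为假设),演示隐式建表与读写:

import dataset

db = dataset.connect('sqlite:///:memory:')   # 也可换成 MySQL/PostgreSQL 连接串
table = db['user']                           # 表不存在时会被隐式创建
table.insert({'name': 'John', 'age': 46})    # 新列同样会按需创建
for row in table:
    print(row['name'], row['age'])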

Sqlmap-自动SQL注入和数据库接管工具

sqlmap 是一款开源渗透测试工具,可以自动检测和利用 SQL 注入漏洞并接管数据库服务器。它拥有强大的检测引擎、为资深渗透测试人员准备的众多特色功能,以及范围广泛的开关选项,涵盖数据库指纹识别、从数据库提取数据、访问底层文件系统,以及通过带外连接在操作系统上执行命令。

屏幕截图

您可以访问 collection of screenshots,其中演示了 wiki 上介绍的一些功能。

安装

您可以单击 here 下载最新的 tarball,或单击 here 下载最新的 zipball。

更推荐的做法是通过克隆官方 Git 存储库来下载 sqlmap:

git clone --depth 1 https://github.com/sqlmapproject/sqlmap.git sqlmap-dev

sqlmap 可以在任何平台上使用 Python 2.6、2.7 和 3.x 开箱即用。

用法

要获取基本选项和开关的列表,请使用:

python sqlmap.py -h

要获取所有选项和开关的列表,请使用:

python sqlmap.py -hh

您可以在 here 找到一个示例运行。要了解 sqlmap 的功能概览、支持的特性列表,以及所有选项和开关的说明与示例,建议您查阅 user’s manual。
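
例如,下面是一条针对假设目标的基本枚举命令(URL 为虚构;-u 指定目标,--dbs 枚举数据库,--batch 对所有提问采用默认应答):

python sqlmap.py -u "http://example.com/page.php?id=1" --dbs --batch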

链接

译本