Tag archive: json

Serialising an Enum member to JSON

Question: Serialising an Enum member to JSON

How do I serialise a Python Enum member to JSON, so that I can deserialise the resulting JSON back into a Python object?

For example, this code:

from enum import Enum    
import json

class Status(Enum):
    success = 0

json.dumps(Status.success)

results in the error:

TypeError: <Status.success: 0> is not JSON serializable

How can I avoid that?


Answer 0

If you want to encode an arbitrary enum.Enum member to JSON and then decode it as the same enum member (rather than simply the enum member’s value attribute), you can do so by writing a custom JSONEncoder class, and a decoding function to pass as the object_hook argument to json.load() or json.loads():

PUBLIC_ENUMS = {
    'Status': Status,
    # ...
}

class EnumEncoder(json.JSONEncoder):
    def default(self, obj):
        if type(obj) in PUBLIC_ENUMS.values():
            return {"__enum__": str(obj)}
        return json.JSONEncoder.default(self, obj)

def as_enum(d):
    if "__enum__" in d:
        name, member = d["__enum__"].split(".")
        return getattr(PUBLIC_ENUMS[name], member)
    else:
        return d

The as_enum function relies on the JSON having been encoded using EnumEncoder, or something which behaves identically to it.

The restriction to members of PUBLIC_ENUMS is necessary to avoid a maliciously crafted text being used to, for example, trick calling code into saving private information (e.g. a secret key used by the application) to an unrelated database field, from where it could then be exposed (see http://chat.stackoverflow.com/transcript/message/35999686#35999686).

Example usage:

>>> data = {
...     "action": "frobnicate",
...     "status": Status.success
... }
>>> text = json.dumps(data, cls=EnumEncoder)
>>> text
'{"status": {"__enum__": "Status.success"}, "action": "frobnicate"}'
>>> json.loads(text, object_hook=as_enum)
{'status': <Status.success: 0>, 'action': 'frobnicate'}
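Put together as one script, the round trip above might look like the following minimal sketch. It assumes Python 3 and re-declares the Status enum from the question so that it runs on its own; the helper name round_trip is hypothetical and not part of the answer.

```python
import json
from enum import Enum


class Status(Enum):
    success = 0


# Only enums listed here may be decoded (the same whitelist idea as above).
PUBLIC_ENUMS = {"Status": Status}


class EnumEncoder(json.JSONEncoder):
    def default(self, obj):
        if type(obj) in PUBLIC_ENUMS.values():
            return {"__enum__": str(obj)}  # e.g. "Status.success"
        return super().default(obj)


def as_enum(d):
    if "__enum__" in d:
        name, member = d["__enum__"].split(".")
        return getattr(PUBLIC_ENUMS[name], member)
    return d


def round_trip(data):
    """Hypothetical helper: encode with EnumEncoder, decode with as_enum."""
    text = json.dumps(data, cls=EnumEncoder)
    return json.loads(text, object_hook=as_enum)


restored = round_trip({"action": "frobnicate", "status": Status.success})
print(restored["status"] is Status.success)
```

Because as_enum looks names up in PUBLIC_ENUMS, a payload naming an enum outside that dictionary raises KeyError instead of resolving to an arbitrary class.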

Answer 1

I know this is old but I feel this will help people. I just went through this exact problem and discovered if you’re using string enums, declaring your enums as a subclass of str works well for almost all situations:

import json
from enum import Enum

class LogLevel(str, Enum):
    DEBUG = 'DEBUG'
    INFO = 'INFO'

print(LogLevel.DEBUG)
print(json.dumps(LogLevel.DEBUG))
print(json.loads('"DEBUG"'))
print(LogLevel('DEBUG'))

Will output:

LogLevel.DEBUG
"DEBUG"
DEBUG
LogLevel.DEBUG

As you can see, loading the JSON outputs the string DEBUG but it is easily castable back into a LogLevel object. A good option if you don’t want to create a custom JSONEncoder.
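The same idea applies when the enum member sits inside a larger structure. A small sketch, assuming the LogLevel class from this answer; the dictionary keys are made up for illustration.

```python
import json
from enum import Enum


class LogLevel(str, Enum):
    DEBUG = "DEBUG"
    INFO = "INFO"


# Members are str instances, so json.dumps writes their values directly.
text = json.dumps({"level": LogLevel.INFO, "msg": "started"})

# Decoding yields plain strings; call the enum class to get the member back.
decoded = json.loads(text)
level = LogLevel(decoded["level"])
```

Calling LogLevel with a string that matches no member raises ValueError, which gives a natural place to reject unexpected input.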


Answer 2

The correct answer depends on what you intend to do with the serialized version.

If you are going to unserialize back into Python, see Zero’s answer.

If your serialized version is going to another language then you probably want to use an IntEnum instead, which is automatically serialized as the corresponding integer:

from enum import IntEnum
import json

class Status(IntEnum):
    success = 0
    failure = 1

json.dumps(Status.success)

and this returns:

'0'
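If the integer later needs to become a Status member again on the Python side, looking it up by value is enough. A minimal sketch, reusing the Status class from this answer; the "status" key is an assumption for illustration.

```python
import json
from enum import IntEnum


class Status(IntEnum):
    success = 0
    failure = 1


payload = json.dumps({"status": Status.failure})  # a plain integer in the JSON
status = Status(json.loads(payload)["status"])    # look the member up by value
```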

Answer 3

In Python 3.7, you can simply use json.dumps(enum_obj, default=str).
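A short sketch of what this produces, assuming a plain Enum like the one in the question. The default argument is called for objects the encoder cannot serialise itself, so the member is written as the result of str(), and getting the member back means parsing that string; the variable names are illustrative.

```python
import json
from enum import Enum


class Status(Enum):
    success = 0


text = json.dumps({"status": Status.success}, default=str)

# The JSON now holds the string "Status.success"; split off the member name
# and look it up on the enum class to reverse the conversion.
member_name = json.loads(text)["status"].split(".", 1)[1]
status = Status[member_name]
```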


Answer 4

I liked Zero Piraeus’ answer, but modified it slightly for working with the API for Amazon Web Services (AWS) known as Boto.

class EnumEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Enum):
            return obj.name
        return json.JSONEncoder.default(self, obj)

I then added this method to my data model:

    def ToJson(self) -> str:
        return json.dumps(self.__dict__, cls=EnumEncoder, indent=1, sort_keys=True)

I hope this helps someone.
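For illustration, here is how the encoder and method might fit together in one runnable piece. The Job class and its attributes are hypothetical stand-ins for the data model, and the method is spelled to_json here; the last line shows that a name written by the encoder can be turned back into a member with a lookup by name.

```python
import json
from enum import Enum


class Status(Enum):
    success = 0
    failure = 1


class EnumEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Enum):
            return obj.name
        return super().default(obj)


class Job:
    """Hypothetical data model with an enum attribute."""

    def __init__(self, job_id, status):
        self.job_id = job_id
        self.status = status

    def to_json(self) -> str:
        return json.dumps(self.__dict__, cls=EnumEncoder, indent=1, sort_keys=True)


text = Job(7, Status.failure).to_json()
status = Status[json.loads(text)["status"]]  # name lookup reverses obj.name
```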


Answer 5

If you are using jsonpickle the easiest way should look as below.

from enum import Enum
import jsonpickle


@jsonpickle.handlers.register(Enum, base=True)
class EnumHandler(jsonpickle.handlers.BaseHandler):

    def flatten(self, obj, data):
        return obj.value  # Convert to json friendly format


if __name__ == '__main__':
    class Status(Enum):
        success = 0
        error = 1

    class SimpleClass:
        pass

    simple_class = SimpleClass()
    simple_class.status = Status.success

    json = jsonpickle.encode(simple_class, unpicklable=False)
    print(json)

After Json serialization you will have as expected {"status": 0} instead of

{"status": {"__objclass__": {"py/type": "__main__.Status"}, "_name_": "success", "_value_": 0}}

Answer 6

This worked for me:

class Status(Enum):
    success = 0

    def __json__(self):
        return self.value

Didn’t have to change anything else. Obviously, you’ll only get the value out of this and will need to do some other work if you want to convert the serialized value back into the enum later.
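Note that the standard library's json module does not look for a __json__ method by itself, so this approach presumably relies on a framework or encoder that does. A minimal sketch of wiring it up by hand with only the standard library; the hook name call_json_hook is hypothetical.

```python
import json
from enum import Enum


class Status(Enum):
    success = 0

    def __json__(self):
        return self.value


def call_json_hook(obj):
    """Hypothetical `default` hook: use obj.__json__() when it exists."""
    hook = getattr(obj, "__json__", None)
    if callable(hook):
        return hook()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


text = json.dumps({"status": Status.success}, default=call_json_hook)
```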


Loading initial data with Django 1.7 and data migrations

Question: Loading initial data with Django 1.7 and data migrations

I recently switched from Django 1.6 to 1.7, and I began using migrations (I never used South).

Before 1.7, I used to load initial data with a fixture/initial_data.json file, which was loaded with the python manage.py syncdb command (when creating the database).

Now that I have started using migrations, this behavior is deprecated:

If an application uses migrations, there is no automatic loading of fixtures. Since migrations will be required for applications in Django 2.0, this behavior is considered deprecated. If you want to load initial data for an app, consider doing it in a data migration. (https://docs.djangoproject.com/en/1.7/howto/initial-data/#automatically-loading-initial-data-fixtures)

The official documentation does not have a clear example of how to do it, so my question is:

What is the best way to import such initial data using data migrations:

  1. Write Python code with multiple calls to mymodel.create(...),
  2. Use or write a Django function (like calling loaddata) to load data from a JSON fixture file.

I prefer the second option.

I don’t want to use South, as Django seems to be able to do it natively now.


Answer 0

Update: See @GwynBleidD’s comment below for the problems this solution can cause, and see @Rockallite’s answer below for an approach that’s more durable to future model changes.


Assuming you have a fixture file in <yourapp>/fixtures/initial_data.json

  1. Create your empty migration:

    In Django 1.7:

    python manage.py makemigrations --empty <yourapp>
    

    In Django 1.8+, you can provide a name:

    python manage.py makemigrations --empty <yourapp> --name load_initial_data
    
  2. Edit your migration file <yourapp>/migrations/0002_auto_xxx.py

    2.1. Custom implementation, inspired by Django’s loaddata (initial answer):

    import os
    from django.core import serializers
    from django.db import migrations
    
    fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
    fixture_filename = 'initial_data.json'
    
    def load_fixture(apps, schema_editor):
        fixture_file = os.path.join(fixture_dir, fixture_filename)
    
        # The with block closes the file even if deserialization fails.
        with open(fixture_file, 'rb') as fixture:
            objects = serializers.deserialize('json', fixture, ignorenonexistent=True)
            for obj in objects:
                obj.save()
    
    def unload_fixture(apps, schema_editor):
        "Brutally deleting all entries for this model..."
    
        MyModel = apps.get_model("yourapp", "ModelName")
        MyModel.objects.all().delete()
    
    class Migration(migrations.Migration):  
    
        dependencies = [
            ('yourapp', '0001_initial'),
        ]
    
        operations = [
            migrations.RunPython(load_fixture, reverse_code=unload_fixture),
        ]
    

    2.2. A simpler solution for load_fixture (per @juliocesar’s suggestion):

    import os
    from django.core.management import call_command
    
    fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
    fixture_filename = 'initial_data.json'
    
    def load_fixture(apps, schema_editor):
        fixture_file = os.path.join(fixture_dir, fixture_filename)
        call_command('loaddata', fixture_file) 
    

    Useful if you want to use a custom directory.

    2.3. Simplest: calling loaddata with app_label will load fixtures from <yourapp>’s fixtures directory automatically:

    from django.core.management import call_command
    
    fixture = 'initial_data'
    
    def load_fixture(apps, schema_editor):
        call_command('loaddata', fixture, app_label='yourapp') 
    

    If you don’t specify app_label, loaddata will try to load the fixture filename from the fixtures directories of all apps (which you probably don’t want).

  3. Run it

    python manage.py migrate <yourapp>
    

Answer 1

Short version

You should NOT use the loaddata management command directly in a data migration.

# Bad example for a data migration
from django.db import migrations
from django.core.management import call_command


def load_fixture(apps, schema_editor):
    # No, it's wrong. DON'T DO THIS!
    call_command('loaddata', 'your_data.json', app_label='yourapp')


class Migration(migrations.Migration):
    dependencies = [
        # Dependencies to other migrations
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]

Long version

loaddata utilizes django.core.serializers.python.Deserializer, which uses the most up-to-date models to deserialize historical data in a migration. That’s incorrect behavior.

For example, suppose that there is a data migration which uses the loaddata management command to load data from a fixture, and it has already been applied in your development environment.

Later, you decide to add a new required field to the corresponding model, so you do it and make a new migration against your updated model (and possibly provide a one-off value for the new field when ./manage.py makemigrations prompts you).

You run the next migration, and all is well.

Finally, you’re done developing your Django application, and you deploy it on the production server. Now it’s time to run all the migrations from scratch in the production environment.

However, the data migration fails. That’s because the deserialized model from the loaddata command, which represents the current code, can’t be saved with empty data for the new required field you added. The original fixture lacks the necessary data for it!

But even if you update the fixture with the required data for the new field, the data migration still fails. When the data migration is running, the next migration, which adds the corresponding column to the database, has not been applied yet. You can’t save data to a column which does not exist!

Conclusion: in a data migration, the loaddata command introduces a potential inconsistency between the model and the database. You should definitely NOT use it directly in a data migration.

The Solution

The loaddata command relies on the django.core.serializers.python._get_model function to get the corresponding model from a fixture, and that function returns the most up-to-date version of a model. We need to monkey-patch it so that it returns the historical model.

(The following code works for Django 1.8.x)

# Good example for a data migration
from django.db import migrations
from django.core.serializers import base, python
from django.core.management import call_command


def load_fixture(apps, schema_editor):
    # Save the old _get_model() function
    old_get_model = python._get_model

    # Define new _get_model() function here, which utilizes the apps argument to
    # get the historical version of a model. This piece of code is directly stolen
    # from django.core.serializers.python._get_model, unchanged. However, here it
    # has a different context, specifically, the apps variable.
    def _get_model(model_identifier):
        try:
            return apps.get_model(model_identifier)
        except (LookupError, TypeError):
            raise base.DeserializationError("Invalid model identifier: '%s'" % model_identifier)

    # Replace the _get_model() function on the module, so loaddata can utilize it.
    python._get_model = _get_model

    try:
        # Call loaddata command
        call_command('loaddata', 'your_data.json', app_label='yourapp')
    finally:
        # Restore old _get_model() function
        python._get_model = old_get_model


class Migration(migrations.Migration):
    dependencies = [
        # Dependencies to other migrations
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]

Answer 2

Inspired by some of the comments (namely n__o’s) and the fact that I have a lot of initial_data.* files spread out over multiple apps I decided to create a Django app that would facilitate the creation of these data migrations.

Using django-migration-fixture you can simply run the following management command and it will search through all your INSTALLED_APPS for initial_data.* files and turn them into data migrations.

./manage.py create_initial_data_fixtures
Migrations for 'eggs':
  0002_auto_20150107_0817.py:
Migrations for 'sausage':
  Ignoring 'initial_data.yaml' - migration already exists.
Migrations for 'foo':
  Ignoring 'initial_data.yaml' - not migrated.

See django-migration-fixture for install/usage instructions.


Answer 3

In order to give your database some initial data, write a data migration. In the data migration, use the RunPython operation to load your data.

Don’t write any loaddata command, as that approach is deprecated.

Your data migration will be run only once. Migrations form an ordered sequence. When the 003_xxxx.py migration is run, Django records in the database that this app has been migrated up to that point (003), and afterwards runs only the migrations that follow it.


Answer 4

Unfortunately, the solutions presented above didn’t work for me. I found that every time I changed my models I had to update my fixtures. Ideally, I would instead write data migrations that modify created data and fixture-loaded data alike.

To facilitate this, I wrote a quick function that looks in the fixtures directory of the current app and loads a fixture. Put this function into a migration at the point in the model history whose fields match those in the fixture.
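The function itself is not shown in this answer. A rough sketch of the idea follows, under several assumptions: the fixture is a JSON file in Django's usual serialized layout (a list of objects with "model", "pk" and "fields" keys), its fields hold plain values (no foreign keys or many-to-many data), and the names load_fixture_into and fixture_path_for are hypothetical. It works through the apps registry that RunPython passes to its callable, so rows are created with the historical models rather than the current ones.

```python
import json
import os


def load_fixture_into(apps, fixture_path):
    """Create one row per fixture entry using historical models.

    `apps` is the app registry that migrations.RunPython hands to its
    callable; only apps.get_model() and the model manager are used here.
    Assumes plain-valued fields (no foreign keys or many-to-many data).
    """
    with open(fixture_path, encoding="utf-8") as handle:
        entries = json.load(handle)

    created = 0
    for entry in entries:
        app_label, model_name = entry["model"].split(".")
        model = apps.get_model(app_label, model_name)
        model.objects.create(pk=entry.get("pk"), **entry["fields"])
        created += 1
    return created


def fixture_path_for(migration_file, filename):
    """Resolve <app>/fixtures/<filename> relative to a migration module."""
    app_dir = os.path.dirname(os.path.dirname(os.path.abspath(migration_file)))
    return os.path.join(app_dir, "fixtures", filename)


# Inside a migration module this might be wired up as (not run here):
#
#     def forwards(apps, schema_editor):
#         load_fixture_into(apps, fixture_path_for(__file__, "initial_data.json"))
#
#     operations = [migrations.RunPython(forwards)]
```

Since the function never imports the current model classes, a field added in a later migration cannot break it; the trade-off is that it skips the relation handling that Django's own deserializer provides.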


Answer 5

In my opinion, fixtures are a bit bad. If your database changes frequently, keeping them up to date will soon become a nightmare. Actually, it’s not only my opinion; the book “Two Scoops of Django” explains it much better.

Instead, I’d write a Python file to provide the initial setup. If you need something more, I suggest you look at Factory Boy.

If you need to migrate some data you should use data migrations.

There’s also “Burn Your Fixtures, Use Model Factories” about using fixtures.


Answer 6

On Django 2.1, I wanted to load some models (Like country names for example) with initial data.

But I wanted this to happen automatically right after the execution of initial migrations.

So I thought that it would be great to have an sql/ folder inside each application that required initial data to be loaded.

Then within that sql/ folder I would have .sql files with the required DMLs to load the initial data into the corresponding models, for example:

INSERT INTO appName_modelName(fieldName)
VALUES
    ('country 1'),
    ('country 2'),
    ('country 3'),
    ('country 4');

To be more descriptive, this is how an app containing an sql/ folder would look:

Also I found some cases where I needed the sql scripts to be executed in a specific order. So I decided to prefix the file names with a consecutive number as seen in the image above.

Then I needed a way to load any SQLs available inside any application folder automatically by doing python manage.py migrate.

So I created another application named initial_data_migrations and then I added this app to the list of INSTALLED_APPS in settings.py file. Then I created a migrations folder inside and added a file called run_sql_scripts.py (Which actually is a custom migration). As seen in the image below:

I created run_sql_scripts.py so that it takes care of running all sql scripts available within each application. This one is then fired when someone runs python manage.py migrate. This custom migration also adds the involved applications as dependencies, that way it attempts to run the sql statements only after the required applications have executed their 0001_initial.py migrations (We don’t want to attempt running a SQL statement against a non-existent table).

Here is the source of that script:

import os
import itertools

from django.db import migrations
from YourDjangoProjectName.settings import BASE_DIR, INSTALLED_APPS

SQL_FOLDER = "/sql/"

APP_SQL_FOLDERS = [
    (os.path.join(BASE_DIR, app + SQL_FOLDER), app) for app in INSTALLED_APPS
    if os.path.isdir(os.path.join(BASE_DIR, app + SQL_FOLDER))
]

SQL_FILES = [
    sorted([path + file for file in os.listdir(path) if file.lower().endswith('.sql')])
    for path, app in APP_SQL_FOLDERS
]


def load_file(path):
    with open(path, 'r') as f:
        return f.read()


class Migration(migrations.Migration):

    dependencies = [
        (app, '__first__') for path, app in APP_SQL_FOLDERS
    ]

    operations = [
        migrations.RunSQL(load_file(f)) for f in list(itertools.chain.from_iterable(SQL_FILES))
    ]

I hope someone finds this helpful; it worked just fine for me! If you have any questions, please let me know.

NOTE: This might not be the best solution, since I’m just getting started with Django; however, I still wanted to share this “how-to” with you all, since I didn’t find much information while googling this.


Pretty-print JSON data to a file using Python

Question: Pretty-print JSON data to a file using Python

A project for class involves parsing Twitter JSON data. I’m getting the data and setting it to the file without much trouble, but it’s all in one line. This is fine for the data manipulation I’m trying to do, but the file is ridiculously hard to read and I can’t examine it very well, making the code writing for the data manipulation part very difficult.

Does anyone know how to do that from within Python (i.e. not using the command line tool, which I can’t get to work)? Here’s my code so far:

header, output = client.request(twitterRequest, method="GET", body=None,
                            headers=None, force_auth_header=True)

# now write output to a file
twitterDataFile = open("twitterData.json", "wb")
# magic happens here to make it pretty-printed
twitterDataFile.write(output)
twitterDataFile.close()

Note I appreciate people pointing me to simplejson documentation and such, but as I have stated, I have already looked at that and continue to need assistance. A truly helpful reply will be more detailed and explanatory than the examples found there. Thanks

Also: Trying this in the windows command line:

more twitterData.json | python -mjson.tool > twitterData-pretty.json

results in this:

Invalid control character at: line 1 column 65535 (char 65535)

I’d give you the data I’m using, but it’s very large and you’ve already seen the code I used to make the file.


Answer 0

You should use the optional argument indent.

header, output = client.request(twitterRequest, method="GET", body=None,
                            headers=None, force_auth_header=True)

# now write output to a file
twitterDataFile = open("twitterData.json", "w")
# magic happens here to make it pretty-printed
twitterDataFile.write(simplejson.dumps(simplejson.loads(output), indent=4, sort_keys=True))
twitterDataFile.close()

Answer 1

You can parse the JSON, then output it again with indents like this:

import json
mydata = json.loads(output)
print(json.dumps(mydata, indent=4))

See http://docs.python.org/library/json.html for more info.


Answer 2

import json

with open("twitterdata.json", "w") as twitter_data_file:
    # output is still raw JSON text here, so parse it before dumping
    json.dump(json.loads(output), twitter_data_file, indent=4, sort_keys=True)

You don’t need json.dumps() if you don’t want to parse the string later, just simply use json.dump(). It’s faster too.
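
To make the difference between the two functions concrete, here is a small sketch. The `data` dictionary is a hypothetical parsed response, and `io.StringIO` stands in for an open file so the example needs no disk access.

```python
import io
import json

# Hypothetical parsed response; in the question this would come from
# json.loads(output).
data = {"id": 1, "text": "hello", "tags": ["a", "b"]}

# json.dump streams straight into any object with a write() method.
buffer = io.StringIO()
json.dump(data, buffer, indent=4, sort_keys=True)

# json.dumps builds the same text as a string in memory.
as_string = json.dumps(data, indent=4, sort_keys=True)

print(buffer.getvalue() == as_string)
```

Both calls produce the same text; the choice is mainly about whether you need the string itself or only the file.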


Answer 3

You can use Python's json module to pretty-print.

>>> import json
>>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
{
    "4": 5,
    "6": 7
}

So, in your case

>>> print(json.dumps(json_output, indent=4))

Answer 4

If you already have existing JSON files which you want to pretty format you could use this:

    with open('twitterdata.json', 'r+') as f:
        data = json.load(f)
        f.seek(0)
        json.dump(data, f, indent=4)
        f.truncate()
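
The snippet above can be tried end to end with the following sketch. The file is created in a temporary directory and its contents are invented for the example; only the seek/dump/truncate sequence comes from the answer.

```python
import json
import os
import tempfile

# Create a throwaway file holding compact JSON (stand-in for twitterdata.json).
path = os.path.join(tempfile.mkdtemp(), "twitterdata.json")
with open(path, "w") as f:
    f.write('{"b": [1, 2, 3], "a": {"nested": true}}')

with open(path, "r+") as f:
    data = json.load(f)           # read and parse the existing content
    f.seek(0)                     # rewind to the start before rewriting
    json.dump(data, f, indent=4)  # write the indented version over the old text
    f.truncate()                  # drop leftover bytes if the new text is shorter

with open(path) as f:
    reformatted = f.read()
print(reformatted)
```

The `truncate()` call only has a visible effect when the rewritten text is shorter than the original, but leaving it in keeps the pattern safe in both directions.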

Answer 5

If you are generating a new *.json file or modifying an existing JSON file, use the “indent” parameter to get pretty-printed JSON.

import json
responseData = json.loads(output)
with open('twitterData.json','w') as twitterDataFile:    
    json.dump(responseData, twitterDataFile, indent=4)

Answer 6

import json
def writeToFile(logData, fileName, openOption="w"):
  file = open(fileName, openOption)
  file.write(json.dumps(json.loads(logData), indent=4)) 
  file.close()  

Answer 7

You can pipe the file through Python's json.tool module and page through the result with more.

For example:

cat filename.json | python -m json.tool | more

TypeError: ObjectId('') is not JSON serializable

Question: TypeError: ObjectId('') is not JSON serializable

After querying an aggregation function on a document in MongoDB using Python, I get back a valid response. I can print it, but I cannot return it.

Error:

TypeError: ObjectId('51948e86c25f4b1d1c0d303c') is not JSON serializable

Print:

{'result': [{'_id': ObjectId('51948e86c25f4b1d1c0d303c'), 'api_calls_with_key': 4, 'api_calls_per_day': 0.375, 'api_calls_total': 6, 'api_calls_without_key': 2}], 'ok': 1.0}

But when I try to return it:

TypeError: ObjectId('51948e86c25f4b1d1c0d303c') is not JSON serializable

This is the RESTful call:

@appv1.route('/v1/analytics')
def get_api_analytics():
    # get handle to collections in MongoDB
    statistics = sldb.statistics

    objectid = ObjectId("51948e86c25f4b1d1c0d303c")

    analytics = statistics.aggregate([
    {'$match': {'owner': objectid}},
    {'$project': {'owner': "$owner",
    'api_calls_with_key': {'$cond': [{'$eq': ["$apikey", None]}, 0, 1]},
    'api_calls_without_key': {'$cond': [{'$ne': ["$apikey", None]}, 0, 1]}
    }},
    {'$group': {'_id': "$owner",
    'api_calls_with_key': {'$sum': "$api_calls_with_key"},
    'api_calls_without_key': {'$sum': "$api_calls_without_key"}
    }},
    {'$project': {'api_calls_with_key': "$api_calls_with_key",
    'api_calls_without_key': "$api_calls_without_key",
    'api_calls_total': {'$add': ["$api_calls_with_key", "$api_calls_without_key"]},
    'api_calls_per_day': {'$divide': [{'$add': ["$api_calls_with_key", "$api_calls_without_key"]}, {'$dayOfMonth': datetime.now()}]},
    }}
    ])


    print(analytics)

    return analytics

The DB is connected, the collection is there, and I get back the expected result, but when I try to return it I get the JSON error. Any idea how to convert the response back into JSON? Thanks


Answer 0

You should define your own JSONEncoder and use it:

import json
from bson import ObjectId

class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, ObjectId):
            return str(o)
        return json.JSONEncoder.default(self, o)

JSONEncoder().encode(analytics)

It’s also possible to use it in the following way.

json.dumps(analytics, cls=JSONEncoder)
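
A self-contained sketch of the encoder pattern above follows. `FakeObjectId` is a hypothetical stand-in for `bson.ObjectId` so that the example runs without pymongo installed, and `ObjectIdEncoder` is just an illustrative name; with real data you would test for `ObjectId` instead.

```python
import json


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId, used only for this sketch."""

    def __init__(self, hex_string):
        self.hex_string = hex_string

    def __str__(self):
        return self.hex_string


class ObjectIdEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot serialise itself.
        if isinstance(o, FakeObjectId):
            return str(o)
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


analytics = {"result": [{"_id": FakeObjectId("51948e86c25f4b1d1c0d303c"),
                         "api_calls_total": 6}], "ok": 1.0}

encoded = json.dumps(analytics, cls=ObjectIdEncoder)
print(encoded)
```

Passing the class through `cls=` keeps the conversion in one place, so every nested ObjectId-like value is handled without touching the data beforehand.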

Answer 1

Pymongo provides json_util – you can use that one instead to handle BSON types

import json
from bson import json_util
def parse_json(data):
    return json.loads(json_util.dumps(data))

Answer 2

>>> from bson import Binary, Code
>>> from bson.json_util import dumps
>>> dumps([{'foo': [1, 2]},
...        {'bar': {'hello': 'world'}},
...        {'code': Code("function x() { return 1; }")},
...        {'bin': Binary("\x01\x02\x03\x04")}])
'[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'

Actual example from json_util.

Unlike Flask’s jsonify, “dumps” will return a string, so it cannot be used as a 1:1 replacement of Flask’s jsonify.

But this question shows that we can serialize using json_util.dumps(), convert back to dict using json.loads() and finally call Flask’s jsonify on it.

Example (derived from previous question’s answer):

from bson import json_util, ObjectId
import json

#Lets create some dummy document to prove it will work
page = {'foo': ObjectId(), 'bar': [ObjectId(), ObjectId()]}

#Dump loaded BSON to valid JSON string and reload it as dict
page_sanitized = json.loads(json_util.dumps(page))
return page_sanitized

This solution will convert ObjectId and others (ie Binary, Code, etc) to a string equivalent such as “$oid.”

JSON output would look like this:

{
  "_id": {
    "$oid": "abc123"
  }
}

Answer 3

Most users who receive the “not JSON serializable” error simply need to specify default=str when using json.dumps. For example:

json.dumps(my_obj, default=str)

This will force a conversion to str, preventing the error. Of course then look at the generated output to confirm that it is what you need.
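
As a quick illustration of what `default=str` does, here is a sketch using only standard-library types; the `my_obj` contents are invented for the example.

```python
import datetime
import decimal
import json
import uuid

# Values that the json module cannot serialise on its own.
my_obj = {
    "when": datetime.date(2018, 8, 8),
    "price": decimal.Decimal("9.99"),
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
}

# default=str is called for each unsupported value and its result is emitted.
encoded = json.dumps(my_obj, default=str, sort_keys=True)
print(encoded)

# The round trip yields plain strings, not the original types.
decoded = json.loads(encoded)
print(type(decoded["when"]).__name__)
```

The conversion is one-way: after loading, every converted value is a plain string, which is why checking the generated output is worthwhile.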


Answer 4

from bson import json_util
import json

@app.route('/')
def index():
    # "collection" stands for your pymongo collection object
    for doc in collection.find():
        return json.dumps(doc, indent=4, default=json_util.default)

This is a sample for converting BSON into a JSON object. You can try this.


Answer 5

As a quick replacement, you can change {'owner': objectid} to {'owner': str(objectid)}.

Defining your own JSONEncoder is the more general solution, though; which one fits depends on your requirements.


Answer 6

Posting here as I think it may be useful for people using Flask with pymongo. This is my current “best practice” setup for allowing flask to marshall pymongo bson data types.

mongoflask.py

from datetime import datetime, date

import isodate as iso
from bson import ObjectId
from flask.json import JSONEncoder
from werkzeug.routing import BaseConverter


class MongoJSONEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, (datetime, date)):
            return iso.datetime_isoformat(o)
        if isinstance(o, ObjectId):
            return str(o)
        else:
            return super().default(o)


class ObjectIdConverter(BaseConverter):
    def to_python(self, value):
        return ObjectId(value)

    def to_url(self, value):
        return str(value)

app.py

from flask import Flask, jsonify
from .mongoflask import MongoJSONEncoder, ObjectIdConverter

def create_app():
    app = Flask(__name__)
    app.json_encoder = MongoJSONEncoder
    app.url_map.converters['objectid'] = ObjectIdConverter

    # Client sends their string, we interpret it as an ObjectId
    @app.route('/users/<objectid:user_id>')
    def show_user(user_id):
        # setup not shown, pretend this gets us a pymongo db object
        db = get_db()

        # user_id is a bson.ObjectId ready to use with pymongo!
        result = db.users.find_one({'_id': user_id})

        # And jsonify returns normal looking json!
        # {"_id": "5b6b6959828619572d48a9da",
        #  "name": "Will",
        #  "birthday": "1990-03-17T00:00:00Z"}
        return jsonify(result)


    return app

Why do this instead of serving BSON or mongod extended JSON?

I think serving mongo special JSON puts a burden on client applications. Most client apps will not care about using mongo objects in any complex way. If I serve extended JSON, I now have to use it on both the server side and the client side. ObjectId and Timestamp are easier to work with as strings, and this keeps all the mongo marshalling madness quarantined to the server.

{
  "_id": "5b6b6959828619572d48a9da",
  "created_at": "2018-08-08T22:06:17Z"
}

I think this is less onerous to work with for most applications than:

{
  "_id": {"$oid": "5b6b6959828619572d48a9da"},
  "created_at": {"$date": 1533837843000}
}

Answer 7

This is how I’ve recently fixed the error

    @app.route('/')
    def home():
        docs = []
        for doc in db.person.find():
            doc.pop('_id') 
            docs.append(doc)
        return jsonify(docs)

Answer 8

I know I’m posting late but thought it would help at least a few folks!

Both the examples mentioned by tim and defuz (which are top voted) work perfectly fine. However, there is a minute difference which could be significant at times.

  1. The following method adds one extra field, which is redundant and may not be ideal in all cases:

Pymongo provides json_util – you can use that one instead to handle BSON types

Output: { "_id": { "$oid": "abc123" } }

  2. The JsonEncoder class, by contrast, gives the output in the string format we need, although we also have to call json.loads(output) on it. It leads to

Output: { "_id": "abc123" }

Even though the first method looks simpler, both methods need very little effort.


Answer 9

In my case I needed something like this:

class JsonEncoder():
    def encode(self, o):
        if '_id' in o:
            o['_id'] = str(o['_id'])
        return o

Answer 10

Flask’s jsonify provides security enhancements as described in JSON Security. If a custom encoder is used with Flask, it’s better to consider the points discussed in JSON Security.


Answer 11

I would like to provide an additional solution that improves the accepted answer. I have previously provided the answers in another thread here.

from flask import Flask
from flask.json import JSONEncoder

from bson import json_util

from . import resources

# define a custom encoder point to the json_util provided by pymongo (or its dependency bson)
class CustomJSONEncoder(JSONEncoder):
    def default(self, obj): return json_util.default(obj)

application = Flask(__name__)
application.json_encoder = CustomJSONEncoder

if __name__ == "__main__":
    application.run()

Answer 12

If you do not need the _id of the records, I recommend excluding it when querying the DB, which lets you print the returned records directly.

To exclude the _id when querying and then print the data in a loop, you can write something like this:

records = mycollection.find(query, {'_id': 0}) #second argument {'_id':0} unsets the id from the query
for record in records:
    print(record)

Answer 13

SOLUTION for: mongoengine + marshmallow

If you use mongoengine and marshmallow then this solution might be applicable to you.

Basically, I imported the String field from marshmallow and overrode the default Schema id to be String encoded.

from marshmallow import Schema
from marshmallow.fields import String

class FrontendUserSchema(Schema):

    id = String()

    class Meta:
        fields = ("id", "email")

Answer 14

from bson.objectid import ObjectId
from core.services.db_connection import DbConnectionService

class DbExecutionService:
     def __init__(self):
        self.db = DbConnectionService()

     def list(self, collection, search):
        session = self.db.create_connection(collection)
        return list(map(lambda row: {i: str(row[i]) if isinstance(row[i], ObjectId) else row[i] for i in row}, session.find(search)))

Answer 15

If you don’t want _id in response, you can refactor your code something like this:

jsonResponse = getResponse(mock_data)
del jsonResponse['_id'] # removes '_id' from the final response
return jsonResponse

This will remove the TypeError: ObjectId('') is not JSON serializable error.


Python/JSON: Expecting property name enclosed in double quotes

Question: Python/JSON: Expecting property name enclosed in double quotes

I’ve been trying to figure out a good way to load JSON objects in Python. I send this json data:

{'http://example.org/about': {'http://purl.org/dc/terms/title': [{'type': 'literal', 'value': "Anna's Homepage"}]}}

to the backend where it will be received as a string then I used json.loads(data) to parse it.

But each time I got the same exception :

ValueError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

I googled it, but nothing seems to work besides the solution json.loads(json.dumps(data)), which does not seem right to me, since it accepts any kind of data, even data that is not in JSON format.

Any suggestions will be much appreciated.


Answer 0

This:

{'http://example.org/about': {'http://purl.org/dc/terms/title': [{'type': 'literal', 'value': "Anna's Homepage"}]}}

is not JSON.
This:

{"http://example.org/about": {"http://purl.org/dc/terms/title": [{"type": "literal", "value": "Anna's Homepage"}]}}

is JSON.

EDIT:
Some commenters suggested that the above is not enough.
The JSON specification (RFC 7159) states that a string begins and ends with a quotation mark, that is, ".
A single quote ' has no semantic meaning in JSON and is allowed only inside a string.
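
The rule can be checked directly with the standard-library parser. This is a minimal sketch; the two strings are shortened versions of the data in the question.

```python
import json

# Single-quoted keys and values: Python syntax, not JSON.
single_quoted = "{'type': 'literal', 'value': \"Anna's Homepage\"}"
# Double-quoted keys and values: valid JSON (the apostrophe is fine inside a string).
double_quoted = '{"type": "literal", "value": "Anna\'s Homepage"}'

try:
    json.loads(single_quoted)
    single_ok = True
except json.JSONDecodeError as exc:
    single_ok = False
    print("rejected:", exc.msg)

parsed = json.loads(double_quoted)
print(parsed["value"])
```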


Answer 1

Because JSON only allows strings to be enclosed in double quotes, you can manipulate the string like this:

s = s.replace("'", '"')

If your JSON holds escaped single quotes (\'), you should use the following, more precise code:

import re
p = re.compile(r"(?<!\\)'")
s = p.sub('"', s)

This replaces every single quote in the JSON string s with a double quote; in the latter case, escaped single quotes are not replaced. Note that both versions also rewrite apostrophes inside values (such as Anna's), which can break the result.
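
The behaviour of the lookbehind pattern, including its limits, can be seen in this sketch. The sample strings and variable names are made up for illustration.

```python
import json
import re

# Matches a single quote that is not preceded by a backslash.
unescaped_single_quote = re.compile(r"(?<!\\)'")

# Simple case: every single quote is a string delimiter.
simple = "{'key': 'value'}"
fixed = unescaped_single_quote.sub('"', simple)
parsed = json.loads(fixed)
print(parsed)

# An apostrophe inside a value is rewritten too, which breaks the document.
tricky = "{'value': \"Anna's Homepage\"}"
broken = unescaped_single_quote.sub('"', tricky)
try:
    json.loads(broken)
    tricky_ok = True
except json.JSONDecodeError:
    tricky_ok = False
print(tricky_ok)
```

Quote replacement is therefore only safe when you know the values themselves contain no quote characters.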

You can also use js-beautify which is less strict:

$ pip install jsbeautifier
$ js-beautify file.js

Answer 2

In my case, double quotes were not the problem.

A trailing comma gave me the same error message.

{'a':{'b':c,}}
           ^

To remove this comma, I wrote some simple code.

import json

with open('a.json','r') as f:
    s = f.read()
    s = s.replace('\t','')
    s = s.replace('\n','')
    s = s.replace(',}','}')
    s = s.replace(',]',']')
    data = json.loads(s)

And this worked for me.
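
A slightly more tolerant variant of the same idea is sketched below, using a regular expression so that whitespace between the comma and the closing bracket does not matter. This is an assumption-laden shortcut rather than a real parser: it does not look inside string values, so a string that happens to contain ",}" would be altered as well.

```python
import json
import re

# Matches a comma followed only by whitespace and a closing brace or bracket.
trailing_comma = re.compile(r",\s*([}\]])")

# Invented sample with two trailing commas.
text = """
{
    "a": {"b": 1, },
    "c": [1, 2, 3,
    ]
}
"""

# Keep the closing bracket (group 1) and drop the comma before it.
cleaned = trailing_comma.sub(r"\1", text)
data = json.loads(cleaned)
print(data)
```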


Answer 3

Quite simply, that string is not valid JSON. As the error says, JSON documents need to use double quotes.

You need to fix the source of the data.


Answer 4

I’ve checked your JSON data

{'http://example.org/about': {'http://purl.org/dc/terms/title': [{'type': 'literal', 'value': "Anna's Homepage"}]}}

in http://jsonlint.com/ and the results were:

Error: Parse error on line 1:
{   'http://example.org/
--^
Expecting 'STRING', '}', got 'undefined'

Modifying it to the following string solves the JSON error:

{
    "http://example.org/about": {
        "http://purl.org/dc/terms/title": [{
            "type": "literal",
            "value": "Anna's Homepage"
        }]
    }
}

Answer 5

JSON strings must use double quotes. The JSON python library enforces this so you are unable to load your string. Your data needs to look like this:

{"http://example.org/about": {"http://purl.org/dc/terms/title": [{"type": "literal", "value": "Anna's Homepage"}]}}

If that’s not something you can do, you could use ast.literal_eval() instead of json.loads()
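
A minimal sketch of the `ast.literal_eval()` route, using the string from the question, might look like this:

```python
import ast
import json

# A Python-literal string (single quotes), as in the question; not valid JSON.
data = "{'http://example.org/about': {'http://purl.org/dc/terms/title': [{'type': 'literal', 'value': \"Anna's Homepage\"}]}}"

# literal_eval only evaluates literals (dicts, lists, strings, numbers, ...),
# so unlike eval() it will not run arbitrary code.
parsed = ast.literal_eval(data)

# Re-serialising produces real JSON with double-quoted strings.
as_json = json.dumps(parsed)
print(as_json)
```

This works because the input is a valid Python literal; it is not a general JSON parser and will reject JSON-only tokens such as true, false and null.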


Answer 6

import ast
import json

inpt = {'http://example.org/about': {'http://purl.org/dc/terms/title':
                                     [{'type': 'literal', 'value': "Anna's Homepage"}]}}

json_data = ast.literal_eval(json.dumps(inpt))

print(json_data)

this will solve the problem.


Answer 7

As the error clearly says, names should be enclosed in double quotes instead of single quotes. The string you pass is not valid JSON. It should look like

{"http://example.org/about": {"http://purl.org/dc/terms/title": [{"type": "literal", "value": "Anna's Homepage"}]}}

Answer 8

I used this method and managed to get the desired output. My script:

import json
x = "{'inner-temperature': 31.73, 'outer-temperature': 28.38, 'keys-value': 0}"

x = x.replace("'", '"')
j = json.loads(x)
print(j['keys-value'])

output

>>> 0

Answer 9

with open('input.json','r') as f:
    s = f.read()
    s = s.replace('\'','\"')
    data = json.loads(s)

This worked perfectly well for me. Thanks.


Answer 10

x = x.replace("'", '"')
j = json.loads(x)

Although this is a workable solution, it may lead to quite a headache if the data looks like this:

{'status': 'success', 'data': {'equity': {'enabled': True, 'net': 66706.14510000008, 'available': {'adhoc_margin': 0, 'cash': 1277252.56, 'opening_balance': 1277252.56, 'live_balance': 66706.14510000008, 'collateral': 249823.93, 'intraday_payin': 15000}, 'utilised': {'debits': 1475370.3449, 'exposure': 607729.3129, 'm2m_realised': 0, 'm2m_unrealised': -9033, 'option_premium': 0, 'payout': 0, 'span': 858608.032, 'holding_sales': 0, 'turnover': 0, 'liquid_collateral': 0, 'stock_collateral': 249823.93}}, 'commodity': {'enabled': True, 'net': 0, 'available': {'adhoc_margin': 0, 'cash': 0, 'opening_balance': 0, 'live_balance': 0, 'collateral': 0, 'intraday_payin': 0}, 'utilised': {'debits': 0, 'exposure': 0, 'm2m_realised': 0, 'm2m_unrealised': 0, 'option_premium': 0, 'payout': 0, 'span': 0, 'holding_sales': 0, 'turnover': 0, 'liquid_collateral': 0, 'stock_collateral': 0}}}}

Notice the True values? Use the following to handle Booleans as well. Note that it turns them into the strings "True" and "False" rather than JSON booleans:

x = x.replace("'", '"').replace("True", '"True"').replace("False", '"False"').replace("null", '"null"')
j = json.loads(x)

Also, make sure you do not make

x = json.loads(x)

It has to be another variable.
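
When the input is really a Python dict repr, as the True values suggest, an alternative to chained replace calls is to let Python parse it. The sketch below uses `ast.literal_eval` followed by `json.dumps`; this is a different technique from the one in this answer, and the shortened sample string is invented.

```python
import ast
import json

# A Python dict repr containing True and None, which plain quote
# replacement would leave as invalid JSON tokens.
x = "{'status': 'success', 'data': {'enabled': True, 'net': 66706.14, 'note': None}}"

parsed = ast.literal_eval(x)   # understands True, False and None
as_json = json.dumps(parsed)   # emits true, false and null
print(as_json)
```

This keeps Booleans as Booleans instead of converting them to strings.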


Answer 11

I had a similar problem. Two components communicating with each other were using a queue.

The first component was not calling json.dumps before putting the message on the queue, so the string that reached the receiving component used single quotes. This was causing the error

 Expecting property name enclosed in double quotes

Adding json.dumps produced correctly formatted JSON and solved the issue.
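
The difference between sending a dict's string form and sending real JSON can be sketched as follows; the `message` contents are invented.

```python
import json

message = {"inner-temperature": 31.73, "keys-value": 0}

as_repr = str(message)         # Python repr: single-quoted keys
as_json = json.dumps(message)  # JSON: double-quoted keys

print(as_repr)
print(as_json)

# Only the json.dumps output can be read back by json.loads.
try:
    json.loads(as_repr)
    repr_ok = True
except json.JSONDecodeError:
    repr_ok = False
round_trip = json.loads(as_json)
```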


Answer 12

Use the eval function.

It takes care of the discrepancy between single and double quotes.


Answer 13

As the other answers explain well the error occurs because of invalid quote characters passed to the json module.

In my case I continued to get the ValueError even after replacing ' with " in my string. What I finally realized was that some quote-like unicode symbols had found their way into my string:

 “  ”  ‛  ’  ‘  `  ´  ″  ′ 

To clean all of these you can just pass your string through a regular expression:

import json
import re

raw_string = '{“key”:“value”}'

# Inside a character class no "|" separators are needed
parsed_string = re.sub(r"[“”‛’‘`´″′']", '"', raw_string)

json_object = json.loads(parsed_string)


Answer 14

I have run into this problem multiple times when the JSON has been edited by hand. If someone deletes something from the file without noticing, it can throw the same error.

For instance, if the last “}” of your JSON is missing, it will throw the same error.

So if you edit your file by hand, make sure you format it as the JSON decoder expects, otherwise you will run into the same problem.

Hope this helps!
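
When hunting for a hand-editing mistake, the exception itself says where the decoder gave up. The sketch below relies on the `msg`, `lineno`, `colno` and `pos` attributes of `json.JSONDecodeError`; the damaged text is invented for the example.

```python
import json

# Hand-edited JSON with a stray trailing comma and the final "}" deleted.
damaged = '{\n    "name": "example",\n    "enabled": true,\n'

try:
    json.loads(damaged)
    error = None
except json.JSONDecodeError as exc:
    error = exc
    # lineno, colno and pos say where the decoder stopped.
    print(f"{exc.msg} at line {exc.lineno}, column {exc.colno} (char {exc.pos})")
```

The reported position is where parsing failed, which is often just after the actual mistake, so it is worth looking at the preceding line too.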


Answer 15

It is always ideal to use the json.dumps() method. To get rid of this error, I used the following code

json.dumps(YOUR_DICT_STRING).replace("'", '"')

JSON中的单引号与双引号

问题:JSON中的单引号与双引号

我的代码:

import simplejson as json

s = "{'username':'dfdsfdsf'}" #1
#s = '{"username":"dfdsfdsf"}' #2
j = json.loads(s)

#1 定义是错误的

#2 定义是正确的

我听说在Python中引号和引号可以互换。谁能向我解释一下?

My code:

import simplejson as json

s = "{'username':'dfdsfdsf'}" #1
#s = '{"username":"dfdsfdsf"}' #2
j = json.loads(s)

#1 definition is wrong

#2 definition is right

I heard that in Python single and double quotes are interchangeable. Can anyone explain this to me?


回答 0

JSON语法不是Python语法。JSON的字符串需要双引号。

JSON syntax is not Python syntax. JSON requires double quotes for its strings.
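
To make the difference concrete, here is a small sketch using the standard json module (the question uses simplejson, which is assumed to behave the same way here). The variable names are only for illustration:

```python
import json

double_quoted = '{"username": "dfdsfdsf"}'   # valid JSON
single_quoted = "{'username': 'dfdsfdsf'}"   # valid Python literal, not valid JSON

parsed = json.loads(double_quoted)

try:
    json.loads(single_quoted)
    single_quoted_ok = True
except ValueError:   # json.JSONDecodeError is a subclass of ValueError
    single_quoted_ok = False
```

Which quotes delimit the Python string literal does not matter; only the quotes inside the JSON text do.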


回答 1

您可以使用 ast.literal_eval()

>>> import ast
>>> s = "{'username':'dfdsfdsf'}"
>>> ast.literal_eval(s)
{'username': 'dfdsfdsf'}

You can use ast.literal_eval():

>>> import ast
>>> s = "{'username':'dfdsfdsf'}"
>>> ast.literal_eval(s)
{'username': 'dfdsfdsf'}

回答 2

您可以通过以下方式转储带有双引号的JSON:

import json

# mixing single and double quotes
data = {'jsonKey': 'jsonValue',"title": "hello world"}

# get string with all double quotes
json_string = json.dumps(data) 

You can dump JSON with double quotes like this:

import json

# mixing single and double quotes
data = {'jsonKey': 'jsonValue',"title": "hello world"}

# get string with all double quotes
json_string = json.dumps(data) 

回答 3

demjson也是解决json语法错误的好软件包:

pip install demjson

用法:

from demjson import decode
bad_json = "{'username':'dfdsfdsf'}"
python_dict = decode(bad_json)

编辑:

demjson.decode是处理损坏的json的好工具,但是当您处理json数据时,这ast.literal_eval是一个更好的匹配,而且速度更快。

demjson is also a good package to solve the problem of bad json syntax:

pip install demjson

Usage:

from demjson import decode
bad_json = "{'username':'dfdsfdsf'}"
python_dict = decode(bad_json)

Edit:

demjson.decode is a great tool for damaged JSON, but when you are dealing with a large amount of JSON data, ast.literal_eval is a better match and much faster.


回答 4

到目前为止,给出了两个问题的答案,例如,如果一个流这样的非标准JSON。因为这样一来,可能不得不解释传入的字符串(而不是python字典)。

问题1- demjson:使用Python 3.7。+并使用conda时,我无法安装demjson,因为它显然不支持Python> 3.5。因此,我需要一个具有更简单方法的解决方案,例如astand / or json.dumps

问题2- astjson.dumps:如果JSON既是单引号又包含至少一个值的字符串,而该字符串又包含单引号,那么我发现的唯一简单而实用的解决方案就是同时应用这两种方法:

在以下示例中,我们假设line是传入的JSON字符串对象:

>>> line = str({'abc':'008565','name':'xyz','description':'can control TV\'s and more'})

步骤1:使用ast.literal_eval()
步骤2 将输入的字符串转换为字典:将json.dumps其应用于键和值的可靠转换,但不影响值的内容

>>> import ast
>>> import json
>>> print(json.dumps(ast.literal_eval(line)))
{"abc": "008565", "name": "xyz", "description": "can control TV's and more"}

json.dumps一个人不能完成这项工作,因为它不能解释JSON,而只能看到字符串。与相似ast.literal_eval():尽管它可以正确解释JSON(字典),但它不会转换我们所需的内容。

There are two issues with the answers given so far if, for instance, one streams such non-standard JSON, because then one might have to interpret an incoming string (not a Python dictionary).

Issue 1 – demjson: With Python 3.7+ and conda I wasn’t able to install demjson, since it apparently does not support Python > 3.5 currently. So I needed a solution with simpler means, for instance ast and/or json.dumps.

Issue 2 – ast & json.dumps: If the JSON is single-quoted and at least one value is a string that itself contains single quotes, the only simple yet practical solution I have found is to apply both:

In the following example we assume line is the incoming JSON string object:

>>> line = str({'abc':'008565','name':'xyz','description':'can control TV\'s and more'})

Step 1: convert the incoming string into a dictionary using ast.literal_eval()
Step 2: apply json.dumps to it for the reliable conversion of keys and values, but without touching the contents of values:

>>> import ast
>>> import json
>>> print(json.dumps(ast.literal_eval(line)))
{"abc": "008565", "name": "xyz", "description": "can control TV's and more"}

json.dumps alone would not do the job because it does not interpret the JSON; it only sees a string. The same goes for ast.literal_eval(): although it correctly interprets the data as a dictionary, it does not produce the double-quoted output we need.
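
The sketch below contrasts the two-step approach with a plain quote replacement, to show why the latter is not enough when a value contains an apostrophe. It is only an illustration; the sample data is shortened from the example above:

```python
import ast
import json

line = str({'abc': '008565', 'description': "can control TV's and more"})

# Naive approach: swap every single quote for a double quote.
naive = line.replace("'", '"')
try:
    json.loads(naive)
    naive_ok = True
except ValueError:
    naive_ok = False   # the apostrophe in "TV's" became a stray double quote

# Two-step approach from the answer: parse as a Python literal, then dump as JSON.
converted = json.dumps(ast.literal_eval(line))
round_tripped = json.loads(converted)
```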


回答 5

您可以通过以下方式解决它:

s = "{'username':'dfdsfdsf'}"
j = eval(s)

You can fix it that way:

s = "{'username':'dfdsfdsf'}"
j = eval(s)

回答 6

如前所述,JSON不是Python语法。您需要在JSON中使用双引号。它的创建者以使用允许语法的严格子集来减轻程序员的认知负担而闻名。


如果一个JSON字符串本身包含一个单引号(如@Jiaaro所指出的),则以下内容可能会失败。不使用。此处以不起作用的示例为例。

这是非常有用的知道有一个JSON字符串没有单引号。说,您从浏览器控制台/任何地方复制并粘贴了它。然后,您可以输入

a = json.loads('very_long_json_string_pasted_here')

如果它也使用单引号,则可能会中断。

As said, JSON is not Python syntax. You need to use double quotes in JSON. Its creator is (in-)famous for using strict subsets of allowable syntax to ease programmer cognitive overload.


Below can fail if one of the JSON strings itself contains a single quote as pointed out by @Jiaaro. DO NOT USE. Left here as an example of what does not work.

It is really useful to know that there are no single quotes in a JSON string. Say, you copied and pasted it from a browser console/whatever. Then, you can just type

a = json.loads('very_long_json_string_pasted_here')

This might otherwise break if it used single quotes, too.


回答 7

使用eval函数确实解决了我的问题。

single_quoted_dict_in_string = "{'key':'value', 'key2': 'value2'}"
desired_double_quoted_dict = eval(single_quoted_dict_in_string)
# Go ahead, now you can convert it into json easily
print(desired_double_quoted_dict)

Using the eval function truly solved my problem.

single_quoted_dict_in_string = "{'key':'value', 'key2': 'value2'}"
desired_double_quoted_dict = eval(single_quoted_dict_in_string)
# Go ahead, now you can convert it into json easily
print(desired_double_quoted_dict)

回答 8

我最近遇到了一个非常类似的问题,并且相信我的解决方案也将对您有用。我有一个文本文件,其中包含以下形式的项目列表:

["first item", 'the "Second" item', "thi'rd", 'some \\"hellish\\" \'quoted" item']

我想将上面的内容解析为python列表,但由于我不信任输入内容,因此对eval()并不热衷。我首先尝试使用JSON,但它仅接受双引号项目,因此我针对此特定情况编写了自己的非常简单的词法分析器(只需插入您自己的“ stringtoparse”,您将获得输出列表:“ items”)

#This lexer takes a JSON-like 'array' string and converts single-quoted array items into escaped double-quoted items,
#then puts the 'array' into a python list
#Issues such as  ["item 1", '","item 2 including those double quotes":"', "item 3"] are resolved with this lexer
items = []      #List of lexed items
item = ""       #Current item container
dq = True       #Double-quotes active (False->single quotes active)
bs = 0          #backslash counter
in_item = False #True if currently lexing an item within the quotes (False if outside the quotes; ie comma and whitespace)
for c in stringtoparse[1:-1]:   #Assuming encasement by brackets
    if c=="\\": #if there are backslashes, count them! Odd numbers escape the quotes...
        bs = bs + 1
        continue                    
    if (dq and c=='"') or (not dq and c=="'"):  #quote matched at start/end of an item
        if bs & 1==1:   #if escaped quote, ignore as it must be part of the item
            continue
        else:   #not escaped quote - toggle in_item
            in_item = not in_item
            if item!="":            #if item not empty, we must be at the end
                items += [item]     #so add it to the list of items
                item = ""           #and reset for the next item
            continue                
    if not in_item: #toggle of single/double quotes to enclose items
        if dq and c=="'":
            dq = False
            in_item = True
        elif not dq and c=='"':
            dq = True
            in_item = True
        continue
    if in_item: #character is part of an item, append it to the item
        if not dq and c=='"':           #if we are using single quotes
            item += bs * "\\" + "\""    #escape double quotes for JSON
        else:
            item += bs * "\\" + c
        bs = 0
        continue

希望它对某人有用。请享用!

I recently came up against a very similar problem, and believe my solution would work for you too. I had a text file which contained a list of items in the form:

["first item", 'the "Second" item', "thi'rd", 'some \\"hellish\\" \'quoted" item']

I wanted to parse the above into a python list but was not keen on eval() as I couldn’t trust the input. I tried first using JSON but it only accepts double quoted items, so I wrote my own very simple lexer for this specific case (just plug in your own “stringtoparse” and you will get as output list: ‘items’)

#This lexer takes a JSON-like 'array' string and converts single-quoted array items into escaped double-quoted items,
#then puts the 'array' into a python list
#Issues such as  ["item 1", '","item 2 including those double quotes":"', "item 3"] are resolved with this lexer
items = []      #List of lexed items
item = ""       #Current item container
dq = True       #Double-quotes active (False->single quotes active)
bs = 0          #backslash counter
in_item = False #True if currently lexing an item within the quotes (False if outside the quotes; ie comma and whitespace)
for c in stringtoparse[1:-1]:   #Assuming encasement by brackets
    if c=="\\": #if there are backslashes, count them! Odd numbers escape the quotes...
        bs = bs + 1
        continue                    
    if (dq and c=='"') or (not dq and c=="'"):  #quote matched at start/end of an item
        if bs & 1==1:   #if escaped quote, ignore as it must be part of the item
            continue
        else:   #not escaped quote - toggle in_item
            in_item = not in_item
            if item!="":            #if item not empty, we must be at the end
                items += [item]     #so add it to the list of items
                item = ""           #and reset for the next item
            continue                
    if not in_item: #toggle of single/double quotes to enclose items
        if dq and c=="'":
            dq = False
            in_item = True
        elif not dq and c=='"':
            dq = True
            in_item = True
        continue
    if in_item: #character is part of an item, append it to the item
        if not dq and c=='"':           #if we are using single quotes
            item += bs * "\\" + "\""    #escape double quotes for JSON
        else:
            item += bs * "\\" + c
        bs = 0
        continue

Hopefully it is useful to somebody. Enjoy!


回答 9

import ast
import subprocess

# PYTHON_, command and UTF_ are defined elsewhere in the author's script
answer = subprocess.check_output(PYTHON_ + command, shell=True).strip()
print(ast.literal_eval(answer.decode(UTF_)))

为我工作

import ast
import subprocess

# PYTHON_, command and UTF_ are defined elsewhere in the author's script
answer = subprocess.check_output(PYTHON_ + command, shell=True).strip()
print(ast.literal_eval(answer.decode(UTF_)))

Works for me


回答 10

import json
data = json.dumps(list)  # `list` here is your own list object, not the built-in type
print(data)

上面的代码段应该可以正常工作。

import json
data = json.dumps(list)  # `list` here is your own list object, not the built-in type
print(data)

The above code snippet should work.


如何以JSON格式发送POST请求?

问题:如何以JSON格式发送POST请求?

data = {
        'ids': [12, 3, 4, 5, 6 , ...]
    }
    urllib2.urlopen("http://abc.com/api/posts/create",urllib.urlencode(data))

我想发送POST请求,但是其中一个字段应该是数字列表。我怎样才能做到这一点 ?(JSON?)

data = {
        'ids': [12, 3, 4, 5, 6 , ...]
    }
    urllib2.urlopen("http://abc.com/api/posts/create",urllib.urlencode(data))

I want to send a POST request, but one of the fields should be a list of numbers. How can I do that ? (JSON?)


回答 0

如果您的服务器期望POST请求为json,那么您将需要添加标头,并为请求序列化数据…

Python 2.x

import json
import urllib2

data = {
        'ids': [12, 3, 4, 5, 6]
}

req = urllib2.Request('http://example.com/api/posts/create')
req.add_header('Content-Type', 'application/json')

response = urllib2.urlopen(req, json.dumps(data))

Python 3.x

https://stackoverflow.com/a/26876308/496445


如果不指定标题,它将是默认application/x-www-form-urlencoded类型。

If your server is expecting the POST request to be json, then you would need to add a header, and also serialize the data for your request…

Python 2.x

import json
import urllib2

data = {
        'ids': [12, 3, 4, 5, 6]
}

req = urllib2.Request('http://example.com/api/posts/create')
req.add_header('Content-Type', 'application/json')

response = urllib2.urlopen(req, json.dumps(data))

Python 3.x

https://stackoverflow.com/a/26876308/496445


If you don’t specify the header, it will be the default application/x-www-form-urlencoded type.
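
For Python 3, the same two steps (set the header, serialize the body) can be sketched with urllib.request. This is an illustration rather than the linked answer's code; the helper name build_json_post is made up, and the request is only built, not sent:

```python
import json
import urllib.request

def build_json_post(url, payload):
    """Build (but do not send) a POST request whose body is payload encoded as JSON."""
    body = json.dumps(payload).encode("utf-8")   # urllib needs bytes in Python 3
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_json_post("http://example.com/api/posts/create", {"ids": [12, 3, 4, 5, 6]})
# urllib.request.urlopen(req) would actually send it
```

Because data is supplied, urllib treats the request as a POST without any further configuration.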


回答 1

我建议使用令人难以置信的requests模块。

http://docs.python-requests.org/zh-CN/v0.10.7/user/quickstart/#custom-headers

import json
import requests

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}

response = requests.post(url, data=json.dumps(payload), headers=headers)

I recommend using the incredible requests module.

http://docs.python-requests.org/en/v0.10.7/user/quickstart/#custom-headers

import json
import requests

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}

response = requests.post(url, data=json.dumps(payload), headers=headers)

回答 2

对于python 3.4.2,我发现以下将起作用:

import urllib.request
import json      

body = {'ids': [12, 14, 50]}  

myurl = "http://www.testmycode.com"
req = urllib.request.Request(myurl)
req.add_header('Content-Type', 'application/json; charset=utf-8')
jsondata = json.dumps(body)
jsondataasbytes = jsondata.encode('utf-8')   # needs to be bytes
req.add_header('Content-Length', len(jsondataasbytes))
print (jsondataasbytes)
response = urllib.request.urlopen(req, jsondataasbytes)

For Python 3.4.2, I found that the following works:

import urllib.request
import json      

body = {'ids': [12, 14, 50]}  
myurl = "http://www.testmycode.com"

req = urllib.request.Request(myurl)
req.add_header('Content-Type', 'application/json; charset=utf-8')
jsondata = json.dumps(body)
jsondataasbytes = jsondata.encode('utf-8')   # needs to be bytes
req.add_header('Content-Length', len(jsondataasbytes))
response = urllib.request.urlopen(req, jsondataasbytes)

回答 3

这非常适合 Python 3.5,如果URL中包含查询字符串/参数值,

请求网址= https://bah2.com/ws/rest/v1/concept/
参数值= 21f6bb43-98a1-419d-8f0c-8133669e40ca

import requests

url = 'https://bahbah2.com/ws/rest/v1/concept/21f6bb43-98a1-419d-8f0c-8133669e40ca'
data = {"name": "Value"}
r = requests.post(url, auth=('username', 'password'), verify=False, json=data)
print(r.status_code)

This works perfectly for Python 3.5 if the URL contains a query string / parameter value:

Request URL = https://bah2.com/ws/rest/v1/concept/
Parameter value = 21f6bb43-98a1-419d-8f0c-8133669e40ca

import requests

url = 'https://bahbah2.com/ws/rest/v1/concept/21f6bb43-98a1-419d-8f0c-8133669e40ca'
data = {"name": "Value"}
r = requests.post(url, auth=('username', 'password'), verify=False, json=data)
print(r.status_code)

回答 4

您必须添加标题,否则会出现http 400错误。该代码在python2.6,centos5.4上运行良好

码:

    import urllib2,json

    url = 'http://www.google.com/someservice'
    postdata = {'key':'value'}

    req = urllib2.Request(url)
    req.add_header('Content-Type','application/json')
    data = json.dumps(postdata)

    response = urllib2.urlopen(req,data)

You have to add the header, or you will get an HTTP 400 error. The code works well on Python 2.6, CentOS 5.4.

Code:

    import urllib2,json

    url = 'http://www.google.com/someservice'
    postdata = {'key':'value'}

    req = urllib2.Request(url)
    req.add_header('Content-Type','application/json')
    data = json.dumps(postdata)

    response = urllib2.urlopen(req,data)

回答 5

这是一个如何使用Python标准库中的urllib.request对象的示例。

import urllib.request
import json
from pprint import pprint

url = "https://app.close.com/hackwithus/3d63efa04a08a9e0/"

values = {
    "first_name": "Vlad",
    "last_name": "Bezden",
    "urls": [
        "https://twitter.com/VladBezden",
        "https://github.com/vlad-bezden",
    ],
}


headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

data = json.dumps(values).encode("utf-8")
pprint(data)

try:
    req = urllib.request.Request(url, data, headers)
    with urllib.request.urlopen(req) as f:
        res = f.read()
    pprint(res.decode())
except Exception as e:
    pprint(e)

Here is an example of how to use urllib.request from the Python standard library.

import urllib.request
import json
from pprint import pprint

url = "https://app.close.com/hackwithus/3d63efa04a08a9e0/"

values = {
    "first_name": "Vlad",
    "last_name": "Bezden",
    "urls": [
        "https://twitter.com/VladBezden",
        "https://github.com/vlad-bezden",
    ],
}


headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

data = json.dumps(values).encode("utf-8")
pprint(data)

try:
    req = urllib.request.Request(url, data, headers)
    with urllib.request.urlopen(req) as f:
        res = f.read()
    pprint(res.decode())
except Exception as e:
    pprint(e)

回答 6

在最新的请求包中,您可以使用jsonin requests.post()方法来发送json dict,并且Content-Typein标头将设置为application/json。无需显式指定标头。

import requests

payload = {'key': 'value'}

requests.post(url, json=payload)

In the latest requests package, you can use the json parameter of the requests.post() method to send a JSON dict, and the Content-Type header will be set to application/json. There is no need to specify the header explicitly.

import requests

payload = {'key': 'value'}

requests.post(url, json=payload)

回答 7

这对我使用api来说效果很好

import requests

data={'Id':id ,'name': name}
r = requests.post( url = 'https://apiurllink', data = data)

This one works fine for me with APIs:

import requests

data={'Id':id ,'name': name}
r = requests.post( url = 'https://apiurllink', data = data)

如何将CSV文件转换为多行JSON?

问题:如何将CSV文件转换为多行JSON?

这是我的代码,非常简单的东西…

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
out = json.dumps( [ row for row in reader ] )
jsonfile.write(out)

声明一些字段名称,阅读器使用CSV读取文件,并使用字段名称将文件转储为JSON格式。这是问题所在…

CSV文件中的每个记录都在不同的行上。我希望JSON输出采用相同的方式。问题是它把所有的东西都丢在一条长长的长线上。

我试过使用类似的for line in csvfile:代码,然后在该代码下面运行我的代码,reader = csv.DictReader( line, fieldnames)该代码循环遍历每一行,但它在一行上执行整个文件,然后在另一行上遍历整个文件…继续直到行数结束。

有任何纠正建议吗?

编辑:澄清一下,目前我有:(第1行的每条记录)

[{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}]

我正在寻找的是:(2条记录中的2条记录)

{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

不是每个单独的字段缩进/在单独的行上缩进,而是每个记录都在其自己的行上。

一些样本输入。

"John","Doe","001","Message1"
"George","Washington","002","Message2"

Here’s my code, really simple stuff…

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
out = json.dumps( [ row for row in reader ] )
jsonfile.write(out)

Declare some field names; the reader uses CSV to read the file and the field names to dump the file to a JSON format. Here’s the problem…

Each record in the CSV file is on a different row. I want the JSON output to be the same way. The problem is it dumps it all on one giant, long line.

I’ve tried using something like for line in csvfile: and then running my code below that with reader = csv.DictReader( line, fieldnames) which loops through each line, but it does the entire file on one line, then loops through the entire file on another line… continues until it runs out of lines.

Any suggestions for correcting this?

Edit: To clarify, currently I have: (every record on line 1)

[{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}]

What I’m looking for: (2 records on 2 lines)

{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

Not each individual field indented/on a separate line, but each record on its own line.

Some sample input.

"John","Doe","001","Message1"
"George","Washington","002","Message2"

回答 0

您所需输出的问题是它不是有效的json文档;这是json文档流

没关系,如果您需要的话,但这意味着对于输出中想要的每个文档,您都必须调用json.dumps

由于您要分隔文档的换行符不包含在这些文档中,因此您需要自己提供它。因此,我们只需要从对json.dump的调用中拉出循环,并为每个编写的文档插入换行符即可。

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

The problem with your desired output is that it is not a valid JSON document; it’s a stream of JSON documents!

That’s okay, if it’s what you need, but it means that for each document you want in your output, you’ll have to call json.dumps.

Since the newline you want separating your documents is not contained in those documents, you’re on the hook for supplying it yourself. So we just need to pull the loop out of the call to json.dump and interpose newlines for each document written.

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')
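
The same idea can be tried without touching the filesystem. The sketch below works on in-memory text and uses the sample input from the question; the function name csv_to_json_lines is made up for illustration:

```python
import csv
import io
import json

def csv_to_json_lines(csv_text, fieldnames):
    """Convert CSV text to newline-delimited JSON, one object per record."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames)
    return "".join(json.dumps(row) + "\n" for row in reader)

sample = '"John","Doe","001","Message1"\n"George","Washington","002","Message2"\n'
fields = ("FirstName", "LastName", "IDNumber", "Message")
output = csv_to_json_lines(sample, fields)
```

Each line of output is a complete JSON document on its own, so a consumer can parse the result line by line with json.loads.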

回答 1

您可以通过以下示例使用Pandas DataFrame实现此目的:

import pandas as pd
csv_file = pd.DataFrame(pd.read_csv("path/to/file.csv", sep = ",", header = 0, index_col = False))
csv_file.to_json("/path/to/new/file.json", orient = "records", date_format = "epoch", double_precision = 10, force_ascii = True, date_unit = "ms", default_handler = None)

You can use a Pandas DataFrame to achieve this, as in the following example:

import pandas as pd
csv_file = pd.DataFrame(pd.read_csv("path/to/file.csv", sep = ",", header = 0, index_col = False))
csv_file.to_json("/path/to/new/file.json", orient = "records", date_format = "epoch", double_precision = 10, force_ascii = True, date_unit = "ms", default_handler = None)

回答 2

我接受了@SingleNegationElimination的响应,并将其简化为可以在管道中使用的三层:

import csv
import json
import sys

for row in csv.DictReader(sys.stdin):
    json.dump(row, sys.stdout)
    sys.stdout.write('\n')

I took @SingleNegationElimination’s response and simplified it into a three-liner that can be used in a pipeline:

import csv
import json
import sys

for row in csv.DictReader(sys.stdin):
    json.dump(row, sys.stdout)
    sys.stdout.write('\n')

回答 3

import csv
import json

file = 'csv_file_name.csv'
json_file = 'output_file_name.json'

#Read CSV File
def read_CSV(file, json_file):
    csv_rows = []
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        field = reader.fieldnames
        for row in reader:
            csv_rows.extend([{field[i]:row[field[i]] for i in range(len(field))}])
        convert_write_json(csv_rows, json_file)

#Convert csv data into json
def convert_write_json(data, json_file):
    with open(json_file, "w") as f:
        f.write(json.dumps(data, sort_keys=False, indent=4, separators=(',', ': '))) #for pretty
        # f.write(json.dumps(data)) #for compact output, use this line instead of the one above


read_CSV(file,json_file)

json.dumps()的文档

import csv
import json

file = 'csv_file_name.csv'
json_file = 'output_file_name.json'

#Read CSV File
def read_CSV(file, json_file):
    csv_rows = []
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        field = reader.fieldnames
        for row in reader:
            csv_rows.extend([{field[i]:row[field[i]] for i in range(len(field))}])
        convert_write_json(csv_rows, json_file)

#Convert csv data into json
def convert_write_json(data, json_file):
    with open(json_file, "w") as f:
        f.write(json.dumps(data, sort_keys=False, indent=4, separators=(',', ': '))) #for pretty
        # f.write(json.dumps(data)) #for compact output, use this line instead of the one above


read_CSV(file,json_file)

Documentation of json.dumps()


回答 4

你可以试试这个

import csvmapper

# how does the object look
mapper = csvmapper.DictMapper([ 
  [ 
     { 'name' : 'FirstName'},
     { 'name' : 'LastName' },
     { 'name' : 'IDNumber', 'type':'int' },
     { 'name' : 'Messages' }
  ]
 ])

# parser instance
parser = csvmapper.CSVParser('sample.csv', mapper)
# conversion service
converter = csvmapper.JSONConverter(parser)

print converter.doConvert(pretty=True)

编辑:

更简单的方法

import csvmapper

fields = ('FirstName', 'LastName', 'IDNumber', 'Messages')
parser = csvmapper.CSVParser('sample.csv', csvmapper.FieldMapper(fields))

converter = csvmapper.JSONConverter(parser)

print converter.doConvert(pretty=True)

You can try this

import csvmapper

# how does the object look
mapper = csvmapper.DictMapper([ 
  [ 
     { 'name' : 'FirstName'},
     { 'name' : 'LastName' },
     { 'name' : 'IDNumber', 'type':'int' },
     { 'name' : 'Messages' }
  ]
 ])

# parser instance
parser = csvmapper.CSVParser('sample.csv', mapper)
# conversion service
converter = csvmapper.JSONConverter(parser)

print converter.doConvert(pretty=True)

Edit:

Simpler approach

import csvmapper

fields = ('FirstName', 'LastName', 'IDNumber', 'Messages')
parser = csvmapper.CSVParser('sample.csv', csvmapper.FieldMapper(fields))

converter = csvmapper.JSONConverter(parser)

print converter.doConvert(pretty=True)

回答 5

indent参数添加到json.dumps

 data = {'this': ['has', 'some', 'things'],
         'in': {'it': 'with', 'some': 'more'}}
 print(json.dumps(data, indent=4))

另请注意,您可以简单地使用json.dumpopen jsonfile

json.dump(data, jsonfile)

Add the indent parameter to json.dumps

 data = {'this': ['has', 'some', 'things'],
         'in': {'it': 'with', 'some': 'more'}}
 print(json.dumps(data, indent=4))

Also note that, you can simply use json.dump with the open jsonfile:

json.dump(data, jsonfile)

回答 6

我看到这很旧,但是我需要来自SingleNegationElimination的代码,但是包含非utf-8字符的数据存在问题。这些出现在我不太关心的领域中,因此我选择忽略它们。但是,这需要一些努力。我是python的新手,因此经过反复试验后,我开始使用它。该代码是SingleNegationElimination的副本,带有utf-8的额外处理。我试图用https://docs.python.org/2.7/library/csv.html做到这一点,但最终放弃了。下面的代码工作。

import csv, json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("Scope","Comment","OOS Code","In RMF","Code","Status","Name","Sub Code","CAT","LOB","Description","Owner","Manager","Platform Owner")
reader = csv.DictReader(csvfile , fieldnames)

code = ''
for row in reader:
    try:
        print('+' + row['Code'])
        for key in row:
            row[key] = row[key].decode('utf-8', 'ignore').encode('utf-8')      
        json.dump(row, jsonfile)
        jsonfile.write('\n')
    except:
        print('-' + row['Code'])
        raise

I see this is old, but I needed the code from SingleNegationElimination and had an issue with data containing non-UTF-8 characters. These appeared in fields I was not overly concerned with, so I chose to ignore them. That took some effort, however: I am new to Python, so I got it to work through trial and error. The code is a copy of SingleNegationElimination’s with extra handling of UTF-8. I tried to do it with https://docs.python.org/2.7/library/csv.html but in the end gave up. The code below worked.

import csv, json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("Scope","Comment","OOS Code","In RMF","Code","Status","Name","Sub Code","CAT","LOB","Description","Owner","Manager","Platform Owner")
reader = csv.DictReader(csvfile , fieldnames)

code = ''
for row in reader:
    try:
        print('+' + row['Code'])
        for key in row:
            row[key] = row[key].decode('utf-8', 'ignore').encode('utf-8')      
        json.dump(row, jsonfile)
        jsonfile.write('\n')
    except:
        print('-' + row['Code'])
        raise

回答 7

如何使用Pandas将csv文件读入DataFrame(pd.read_csv),然后根据需要操纵列(删除它们或更新值),最后将DataFrame转换回JSON(pd.DataFrame.to_json)。

注意:我还没有检查过效率如何,但这绝对是处理大型csv并将其转换为json的最简单方法之一。

How about using Pandas to read the csv file into a DataFrame (pd.read_csv), then manipulating the columns if you want (dropping them or updating values) and finally converting the DataFrame back to JSON (pd.DataFrame.to_json).

Note: I haven’t checked how efficient this will be but this is definitely one of the easiest ways to manipulate and convert a large csv to json.


回答 8

作为@MONTYHS答案的略微改进,通过一堆字段名进行迭代:

import csv
import json

csvfilename = 'filename.csv'
jsonfilename = csvfilename.split('.')[0] + '.json'
csvfile = open(csvfilename, 'r')
jsonfile = open(jsonfilename, 'w')
reader = csv.DictReader(csvfile)

fieldnames = ('FirstName', 'LastName', 'IDNumber', 'Message')

output = []

for each in reader:
  row = {}
  for field in fieldnames:
    row[field] = each[field]
  output.append(row)

json.dump(output, jsonfile, indent=2, sort_keys=True)

As a slight improvement to @MONTYHS’s answer, iterating through a tuple of field names:

import csv
import json

csvfilename = 'filename.csv'
jsonfilename = csvfilename.split('.')[0] + '.json'
csvfile = open(csvfilename, 'r')
jsonfile = open(jsonfilename, 'w')
reader = csv.DictReader(csvfile)

fieldnames = ('FirstName', 'LastName', 'IDNumber', 'Message')

output = []

for each in reader:
  row = {}
  for field in fieldnames:
    row[field] = each[field]
  output.append(row)

json.dump(output, jsonfile, indent=2, sort_keys=True)

回答 9

import csv
import json
csvfile = csv.DictReader(open('filename.csv', 'r'))
output =[]
for each in csvfile:
    row ={}
    row['FirstName'] = each['FirstName']
    row['LastName']  = each['LastName']
    row['IDNumber']  = each ['IDNumber']
    row['Message']   = each['Message']
    output.append(row)
json.dump(output,open('filename.json','w'),indent=4,sort_keys=False)

import csv
import json
csvfile = csv.DictReader(open('filename.csv', 'r'))
output =[]
for each in csvfile:
    row ={}
    row['FirstName'] = each['FirstName']
    row['LastName']  = each['LastName']
    row['IDNumber']  = each ['IDNumber']
    row['Message']   = each['Message']
    output.append(row)
json.dump(output,open('filename.json','w'),indent=4,sort_keys=False)

不可JSON序列化

问题:不可JSON序列化

我有以下代码序列化查询集;

def render_to_response(self, context, **response_kwargs):

    return HttpResponse(json.simplejson.dumps(list(self.get_queryset())),
                        mimetype="application/json")

以下是我的 get_querset()

[{'product': <Product: hederello ()>, u'_id': u'9802', u'_source': {u'code': u'23981', u'facilities': [{u'facility': {u'name': {u'fr': u'G\xe9n\xe9ral', u'en': u'General'}, u'value': {u'fr': [u'bar', u'r\xe9ception ouverte 24h/24', u'chambres non-fumeurs', u'chambres familiales',.........]}]

我需要序列化。但是它说无法序列化<Product: hederello ()>。因为列表由Django对象和字典组成。有任何想法吗 ?

I have the following code for serializing the queryset;

def render_to_response(self, context, **response_kwargs):

    return HttpResponse(json.simplejson.dumps(list(self.get_queryset())),
                        mimetype="application/json")

And the following is the result of my get_queryset():

[{'product': <Product: hederello ()>, u'_id': u'9802', u'_source': {u'code': u'23981', u'facilities': [{u'facility': {u'name': {u'fr': u'G\xe9n\xe9ral', u'en': u'General'}, u'value': {u'fr': [u'bar', u'r\xe9ception ouverte 24h/24', u'chambres non-fumeurs', u'chambres familiales',.........]}]

This is what I need to serialize, but it says it is not able to serialize <Product: hederello ()>, because the list is composed of both Django objects and dicts. Any ideas?


回答 0

simplejson并且json不能很好地与Django对象配合使用。

Django的内置序列化器只能序列化由django对象填充的查询集:

data = serializers.serialize('json', self.get_queryset())
return HttpResponse(data, content_type="application/json")

就您而言,self.get_queryset()其中包含django对象和dict的混合。

一种选择是摆脱中的模型实例,self.get_queryset()并使用dict将其替换为model_to_dict

from django.forms.models import model_to_dict

data = self.get_queryset()

for item in data:
   item['product'] = model_to_dict(item['product'])

return HttpResponse(json.simplejson.dumps(data), mimetype="application/json")

希望能有所帮助。

simplejson and json don’t work well with Django objects.

Django’s built-in serializers can only serialize querysets filled with django objects:

data = serializers.serialize('json', self.get_queryset())
return HttpResponse(data, content_type="application/json")

In your case, self.get_queryset() contains a mix of django objects and dicts inside.

One option is to get rid of model instances in the self.get_queryset() and replace them with dicts using model_to_dict:

from django.forms.models import model_to_dict

data = self.get_queryset()

for item in data:
   item['product'] = model_to_dict(item['product'])

return HttpResponse(json.simplejson.dumps(data), mimetype="application/json")

Hope that helps.


Answer 1

The easiest way is to use a JsonResponse.

For a queryset, you should pass a list of the values for that queryset, like so:

from django.http import JsonResponse

queryset = YourModel.objects.filter(some__filter="some value").values()
return JsonResponse({"models_to_return": list(queryset)})

Answer 2

I found that this can be done rather simply using the “.values” method, which also lets you name the fields to include:

result_list = list(my_queryset.values('first_named_field', 'second_named_field'))
return HttpResponse(json.dumps(result_list))

list() must be used because values() returns a lazy queryset, which json.dumps cannot serialize directly; it only yields dicts when it is iterated.

Documentation: https://docs.djangoproject.com/en/1.7/ref/models/querysets/#values
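
One caveat that may be worth sketching: the rows produced by values() are plain dicts, but their values can include types such as dates or Decimal, which json.dumps rejects by default. The example below uses a hand-written list of dicts as a stand-in for list(my_queryset.values(...)); the field names and values are made up for illustration. Passing default=str is one simple fallback, and Django's own DjangoJSONEncoder is another option.

```python
import json
from datetime import date
from decimal import Decimal

# Stand-in for list(my_queryset.values('name', 'price', 'created'));
# the field names and values are hypothetical.
result_list = [
    {"name": "bar", "price": Decimal("9.50"), "created": date(2015, 1, 31)},
]

# default=str is called only for objects json cannot encode natively,
# so the Decimal and date values are written out as strings.
text = json.dumps(result_list, default=str)
decoded = json.loads(text)
print(decoded)
```

Note that the round trip is lossy: the client receives strings and has to convert them back itself if it needs numbers or dates.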


Answer 3

From version 1.9, there is an easier, official way of getting JSON:

from django.http import JsonResponse
from django.forms.models import model_to_dict


return JsonResponse(model_to_dict(modelinstance))

Answer 4

Our JavaScript programmer asked me to return actual JSON data (an array of objects) rather than a JSON-encoded string.

Below is the solution. (It returns an object that can be used or viewed directly in the browser.)

import json
from xxx.models import alert
from django.core import serializers
from django.http import HttpResponse

def test(request):
    alert_list = alert.objects.all()

    # serialize() returns a JSON string; parse it back into Python objects.
    tmpJson = serializers.serialize("json", alert_list)
    tmpObj = json.loads(tmpJson)

    return HttpResponse(json.dumps(tmpObj), content_type="application/json")
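
The difference between "real JSON" and a "JSON-encoded string" can be illustrated without Django. In this sketch, tmp_json is a hand-written stand-in for what serializers.serialize("json", ...) returns (the record content is hypothetical). Dumping that string a second time produces a quoted JSON string, which is probably what the front end was receiving; parsing it first, as the answer does, produces an actual array.

```python
import json

# Stand-in for the string returned by serializers.serialize("json", alert_list);
# the record content is hypothetical.
tmp_json = '[{"model": "xxx.alert", "pk": 1, "fields": {"level": "high"}}]'

# Dumping the string again wraps it in quotes: the client would receive
# a JSON string, not a JSON array.
double_encoded = json.dumps(tmp_json)

# Parsing first and then dumping yields a real JSON array.
re_encoded = json.dumps(json.loads(tmp_json))

print(type(json.loads(double_encoded)).__name__)
print(type(json.loads(re_encoded)).__name__)
```

Since the serializer output is already JSON text, returning it unchanged should also give the browser an array; the loads/dumps round trip mainly helps when the data needs to be modified before it is sent.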

Answer 5

First, I added a to_dict method to my model:

def to_dict(self):
    return {"name": self.woo, "title": self.foo}

Then I have this:

from functools import partial
from json import JSONEncoder, dumps

from django.db import models


class DjangoJSONEncoder(JSONEncoder):

    def default(self, obj):
        # Model instances are serialized through their own to_dict().
        if isinstance(obj, models.Model):
            return obj.to_dict()
        return JSONEncoder.default(self, obj)


# functools.partial stands in for Django's curry helper here.
dumps = partial(dumps, cls=DjangoJSONEncoder)

Finally, I use this dumps function to serialize my queryset:

def render_to_response(self, context, **response_kwargs):
    return HttpResponse(dumps(self.get_queryset()))

This works quite well.
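
The same pattern can be tried out without a Django project. The following is only a sketch: Product is a plain class standing in for a model, ToDictEncoder is a hypothetical name, and the encoder checks for a to_dict method instead of using isinstance(obj, models.Model).

```python
import json
from functools import partial


class Product:
    """Plain class standing in for a Django model (hypothetical)."""

    def __init__(self, name, title):
        self.name = name
        self.title = title

    def to_dict(self):
        return {"name": self.name, "title": self.title}


class ToDictEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects the encoder cannot serialize natively.
        if hasattr(obj, "to_dict"):
            return obj.to_dict()
        return super().default(obj)


# Bind the encoder once so callers can use dumps(...) as usual.
dumps = partial(json.dumps, cls=ToDictEncoder)

# A list mixing dicts and objects, like the one in the question.
rows = [{"product": Product("hederello", "Hotel"), "_id": "9802"}]
text = dumps(rows)
print(text)
```

Because default() is consulted for every object the encoder does not understand, the mixed list of dicts and objects needs no preprocessing.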


Loading and parsing a JSON file with multiple JSON objects

Question: Loading and parsing a JSON file with multiple JSON objects

I am trying to load and parse a JSON file in Python. But I’m stuck trying to load the file:

import json
json_data = open('file')
data = json.load(json_data)

Yields:

ValueError: Extra data: line 2 column 1 - line 225116 column 1 (char 232 - 160128774)

I looked at 18.2. json — JSON encoder and decoder in the Python documentation, but it’s pretty discouraging to read through this horrible-looking documentation.

First few lines (anonymized with randomized entries):

{"votes": {"funny": 2, "useful": 5, "cool": 1}, "user_id": "harveydennis", "name": "Jasmine Graham", "url": "http://example.org/user_details?userid=harveydennis", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 2, "cool": 4}, "user_id": "njohnson", "name": "Zachary Ballard", "url": "https://www.example.com/user_details?userid=njohnson", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 0, "cool": 4}, "user_id": "david06", "name": "Jonathan George", "url": "https://example.com/user_details?userid=david06", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 6, "useful": 5, "cool": 0}, "user_id": "santiagoerika", "name": "Amanda Taylor", "url": "https://www.example.com/user_details?userid=santiagoerika", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 8, "cool": 2}, "user_id": "rodriguezdennis", "name": "Jennifer Roach", "url": "http://www.example.com/user_details?userid=rodriguezdennis", "average_stars": 3.5, "review_count": 12, "type": "user"}

Answer 0

You have a JSON Lines format text file. You need to parse your file line by line:

import json

data = []
with open('file') as f:
    for line in f:
        data.append(json.loads(line))

Each line contains valid JSON, but as a whole, it is not a valid JSON value as there is no top-level list or object definition.

Note that because the file contains JSON per line, you are saved the headaches of trying to parse it all in one go or to figure out a streaming JSON parser. You can now opt to process each line separately before moving on to the next, saving memory in the process. You probably don’t want to append each result to one list and then process everything if your file is really big.
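
A minimal sketch of that line-at-a-time processing, assuming all you need is an aggregate over the records: iter_json_lines is a hypothetical helper, and io.StringIO stands in for open('file') so the example is self-contained.

```python
import io
import json


def iter_json_lines(lines):
    """Yield one decoded object per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. an empty line at the end of the file
            yield json.loads(line)


# io.StringIO stands in for open('file'); the records are shortened samples.
sample = io.StringIO(
    '{"user_id": "a", "review_count": 12}\n'
    '{"user_id": "b", "review_count": 3}\n'
    '\n'
)

# Aggregate without ever holding all records in memory at once.
total_reviews = sum(record["review_count"] for record in iter_json_lines(sample))
print(total_reviews)
```

Because the helper is a generator, only one decoded record is alive at a time, which is what keeps memory use flat for very large files.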

If you have a file containing individual JSON objects with delimiters in between, see the question “How do I use the ‘json’ module to read in one JSON object at a time?” for a buffered method that parses out individual objects.
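
For the simpler case where the objects are separated only by whitespace and the whole text fits in memory, json.JSONDecoder.raw_decode can walk through them one by one. This is a sketch under those assumptions, not the buffered method from the linked question; iter_concatenated is a hypothetical helper name.

```python
import json


def iter_concatenated(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode does not do this.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


objects = list(iter_concatenated('{"a": 1} {"b": 2}\n[3, 4]'))
print(objects)
```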


Answer 1

For those stumbling upon this question: the Python jsonlines library (much younger than this question) elegantly handles files with one JSON document per line. See https://jsonlines.readthedocs.io/


Answer 2

That file is ill-formatted as a single JSON document. You have one JSON object per line, but they are not contained in a larger data structure (i.e. an array). You’ll either need to reformat it so that it begins with [ and ends with ], with a comma at the end of every line except the last, or parse it line by line as separate dictionaries.
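
Both options can be sketched in a few lines. The raw string below is a shortened, made-up sample of the one-object-per-line input; for a real file the same steps would apply to its contents.

```python
import json

# Shortened sample of the one-object-per-line input.
raw = '{"user_id": "a"}\n{"user_id": "b"}\n'

# Keep only non-blank lines so a trailing newline does not produce an empty entry.
lines = [line for line in raw.splitlines() if line.strip()]

# Option 1: turn the text into a single JSON array, then parse it once.
as_array = json.loads("[" + ",".join(lines) + "]")

# Option 2: parse each line as its own document.
per_line = [json.loads(line) for line in lines]

print(as_array == per_line)
```

Option 1 builds a second copy of the whole text in memory, so for very large files the line-by-line route is likely the better choice.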