如何使可序列化的JSON类

问题:如何使可序列化的JSON类

如何使Python类可序列化?

一个简单的类:

class FileItem:
    def __init__(self, fname):
        self.fname = fname

我应该怎么做才能获得输出:

>>> import json

>>> my_file = FileItem('/foo/bar')
>>> json.dumps(my_file)
TypeError: Object of type 'FileItem' is not JSON serializable

没有错误

How to make a Python class serializable?

A simple class:

class FileItem:
    def __init__(self, fname):
        self.fname = fname

What should I do to be able to get output of:

>>> import json

>>> my_file = FileItem('/foo/bar')
>>> json.dumps(my_file)
TypeError: Object of type 'FileItem' is not JSON serializable

Without the error


回答 0

您对预期的输出有想法吗?例如这样做吗?

>>> f  = FileItem("/foo/bar")
>>> magic(f)
'{"fname": "/foo/bar"}'

在这种情况下,您只能调用json.dumps(f.__dict__)

如果您想要更多的自定义输出,则必须继承JSONEncoder并实现自己的自定义序列化。

有关一个简单的示例,请参见下文。

>>> from json import JSONEncoder
>>> class MyEncoder(JSONEncoder):
        def default(self, o):
            return o.__dict__    

>>> MyEncoder().encode(f)
'{"fname": "/foo/bar"}'

然后,将该类json.dumps()作为clskwarg 传递给方法:

json.dumps(cls=MyEncoder)

如果你也想解码,那么你将有一个自定义供应object_hookJSONDecoder类。例如

>>> def from_json(json_object):
        if 'fname' in json_object:
            return FileItem(json_object['fname'])
>>> f = JSONDecoder(object_hook = from_json).decode('{"fname": "/foo/bar"}')
>>> f
<__main__.FileItem object at 0x9337fac>
>>> 

Do you have an idea about the expected output? For e.g. will this do?

>>> f  = FileItem("/foo/bar")
>>> magic(f)
'{"fname": "/foo/bar"}'

In that case you can merely call json.dumps(f.__dict__).

If you want more customized output then you will have to subclass JSONEncoder and implement your own custom serialization.

For a trivial example, see below.

>>> from json import JSONEncoder
>>> class MyEncoder(JSONEncoder):
        def default(self, o):
            return o.__dict__    

>>> MyEncoder().encode(f)
'{"fname": "/foo/bar"}'

Then you pass this class into the json.dumps() method as cls kwarg:

json.dumps(cls=MyEncoder)

If you also want to decode then you’ll have to supply a custom object_hook to the JSONDecoder class. For e.g.

>>> def from_json(json_object):
        if 'fname' in json_object:
            return FileItem(json_object['fname'])
>>> f = JSONDecoder(object_hook = from_json).decode('{"fname": "/foo/bar"}')
>>> f
<__main__.FileItem object at 0x9337fac>
>>> 

回答 1

这是一个简单功能的简单解决方案:

.toJSON() 方法

代替JSON可序列化的类,实现一个序列化器方法:

import json

class Object:
    def toJSON(self):
        return json.dumps(self, default=lambda o: o.__dict__, 
            sort_keys=True, indent=4)

因此,您只需调用它即可序列化:

me = Object()
me.name = "Onur"
me.age = 35
me.dog = Object()
me.dog.name = "Apollo"

print(me.toJSON())

将输出:

{
    "age": 35,
    "dog": {
        "name": "Apollo"
    },
    "name": "Onur"
}

Here is a simple solution for a simple feature:

.toJSON() Method

Instead of a JSON serializable class, implement a serializer method:

import json

class Object:
    def toJSON(self):
        return json.dumps(self, default=lambda o: o.__dict__, 
            sort_keys=True, indent=4)

So you just call it to serialize:

me = Object()
me.name = "Onur"
me.age = 35
me.dog = Object()
me.dog.name = "Apollo"

print(me.toJSON())

will output:

{
    "age": 35,
    "dog": {
        "name": "Apollo"
    },
    "name": "Onur"
}

回答 2

对于更复杂的类,您可以考虑使用jsonpickle工具:

jsonpickle是一个Python库,用于将复杂的Python对象与JSON进行序列化和反序列化。

用于将Python编码为JSON的标准Python库(例如stdlib的json,simplejson和demjson)只能处理具有直接JSON等效项的Python原语(例如,字典,列表,字符串,整数等)。jsonpickle建立在这些库之上,并允许将更复杂的数据结构序列化为JSON。jsonpickle具有高度的可配置性和可扩展性,允许用户选择JSON后端并添加其他后端。

(链接到PyPi上的jsonpickle)

For more complex classes you could consider the tool jsonpickle:

jsonpickle is a Python library for serialization and deserialization of complex Python objects to and from JSON.

The standard Python libraries for encoding Python into JSON, such as the stdlib’s json, simplejson, and demjson, can only handle Python primitives that have a direct JSON equivalent (e.g. dicts, lists, strings, ints, etc.). jsonpickle builds on top of these libraries and allows more complex data structures to be serialized to JSON. jsonpickle is highly configurable and extendable–allowing the user to choose the JSON backend and add additional backends.

(link to jsonpickle on PyPi)


回答 3

大多数答案都涉及将对json.dumps()的调用更改为并非总是可能或不希望的(例如,它可能发生在框架组件内部)。

如果您希望能够原样调用 json.dumps(obj),那么一个简单的解决方案就是从dict继承:

class FileItem(dict):
    def __init__(self, fname):
        dict.__init__(self, fname=fname)

f = FileItem('tasks.txt')
json.dumps(f)  #No need to change anything here

如果您的类只是基本数据表示形式,则此方法有效,对于棘手的事情,您始终可以显式设置键。

Most of the answers involve changing the call to json.dumps(), which is not always possible or desirable (it may happen inside a framework component for example).

If you want to be able to call json.dumps(obj) as is, then a simple solution is inheriting from dict:

class FileItem(dict):
    def __init__(self, fname):
        dict.__init__(self, fname=fname)

f = FileItem('tasks.txt')
json.dumps(f)  #No need to change anything here

This works if your class is just basic data representation, for trickier things you can always set keys explicitly.


回答 4

我喜欢Onur的答案,但会扩展为包括一个可选toJSON()方法,以使对象自行序列化:

def dumper(obj):
    try:
        return obj.toJSON()
    except:
        return obj.__dict__
print json.dumps(some_big_object, default=dumper, indent=2)

I like Onur’s answer but would expand to include an optional toJSON() method for objects to serialize themselves:

def dumper(obj):
    try:
        return obj.toJSON()
    except:
        return obj.__dict__
print json.dumps(some_big_object, default=dumper, indent=2)

回答 5

另一个选择是将JSON转储包装在其自己的类中:

import json

class FileItem:
    def __init__(self, fname):
        self.fname = fname

    def __repr__(self):
        return json.dumps(self.__dict__)

或者,甚至更好的是,从类中继承FileItem JsonSerializable类:

import json

class JsonSerializable(object):
    def toJson(self):
        return json.dumps(self.__dict__)

    def __repr__(self):
        return self.toJson()


class FileItem(JsonSerializable):
    def __init__(self, fname):
        self.fname = fname

测试:

>>> f = FileItem('/foo/bar')
>>> f.toJson()
'{"fname": "/foo/bar"}'
>>> f
'{"fname": "/foo/bar"}'
>>> str(f) # string coercion
'{"fname": "/foo/bar"}'

Another option is to wrap JSON dumping in its own class:

import json

class FileItem:
    def __init__(self, fname):
        self.fname = fname

    def __repr__(self):
        return json.dumps(self.__dict__)

Or, even better, subclassing FileItem class from a JsonSerializable class:

import json

class JsonSerializable(object):
    def toJson(self):
        return json.dumps(self.__dict__)

    def __repr__(self):
        return self.toJson()


class FileItem(JsonSerializable):
    def __init__(self, fname):
        self.fname = fname

Testing:

>>> f = FileItem('/foo/bar')
>>> f.toJson()
'{"fname": "/foo/bar"}'
>>> f
'{"fname": "/foo/bar"}'
>>> str(f) # string coercion
'{"fname": "/foo/bar"}'

回答 6

只需将to_json方法添加到您的类中,如下所示:

def to_json(self):
  return self.message # or how you want it to be serialized

并将此代码(来自此答案添加到所有内容的顶部:

from json import JSONEncoder

def _default(self, obj):
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder().default
JSONEncoder.default = _default

导入时,它将对Monkeyjson模块进行Monkey补丁处理,因此JSONEncoder.default()自动检查特殊的“ to_json()”方法,并在找到后使用该方法对对象进行编码。

就像Onur所说的一样,但是这次您不必更新json.dumps()项目中的每个项目。

Just add to_json method to your class like this:

def to_json(self):
  return self.message # or how you want it to be serialized

And add this code (from this answer), to somewhere at the top of everything:

from json import JSONEncoder

def _default(self, obj):
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder().default
JSONEncoder.default = _default

This will monkey-patch json module when it’s imported so JSONEncoder.default() automatically checks for a special “to_json()” method and uses it to encode the object if found.

Just like Onur said, but this time you don’t have to update every json.dumps() in your project.


回答 7

前几天,我遇到了这个问题,并为Python对象实现了一个更通用的Encoder版本,可以处理嵌套对象继承的字段

import json
import inspect

class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        if hasattr(obj, "to_json"):
            return self.default(obj.to_json())
        elif hasattr(obj, "__dict__"):
            d = dict(
                (key, value)
                for key, value in inspect.getmembers(obj)
                if not key.startswith("__")
                and not inspect.isabstract(value)
                and not inspect.isbuiltin(value)
                and not inspect.isfunction(value)
                and not inspect.isgenerator(value)
                and not inspect.isgeneratorfunction(value)
                and not inspect.ismethod(value)
                and not inspect.ismethoddescriptor(value)
                and not inspect.isroutine(value)
            )
            return self.default(d)
        return obj

例:

class C(object):
    c = "NO"
    def to_json(self):
        return {"c": "YES"}

class B(object):
    b = "B"
    i = "I"
    def __init__(self, y):
        self.y = y

    def f(self):
        print "f"

class A(B):
    a = "A"
    def __init__(self):
        self.b = [{"ab": B("y")}]
        self.c = C()

print json.dumps(A(), cls=ObjectEncoder, indent=2, sort_keys=True)

结果:

{
  "a": "A", 
  "b": [
    {
      "ab": {
        "b": "B", 
        "i": "I", 
        "y": "y"
      }
    }
  ], 
  "c": {
    "c": "YES"
  }, 
  "i": "I"
}

I came across this problem the other day and implemented a more general version of an Encoder for Python objects that can handle nested objects and inherited fields:

import json
import inspect

class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        if hasattr(obj, "to_json"):
            return self.default(obj.to_json())
        elif hasattr(obj, "__dict__"):
            d = dict(
                (key, value)
                for key, value in inspect.getmembers(obj)
                if not key.startswith("__")
                and not inspect.isabstract(value)
                and not inspect.isbuiltin(value)
                and not inspect.isfunction(value)
                and not inspect.isgenerator(value)
                and not inspect.isgeneratorfunction(value)
                and not inspect.ismethod(value)
                and not inspect.ismethoddescriptor(value)
                and not inspect.isroutine(value)
            )
            return self.default(d)
        return obj

Example:

class C(object):
    c = "NO"
    def to_json(self):
        return {"c": "YES"}

class B(object):
    b = "B"
    i = "I"
    def __init__(self, y):
        self.y = y

    def f(self):
        print "f"

class A(B):
    a = "A"
    def __init__(self):
        self.b = [{"ab": B("y")}]
        self.c = C()

print json.dumps(A(), cls=ObjectEncoder, indent=2, sort_keys=True)

Result:

{
  "a": "A", 
  "b": [
    {
      "ab": {
        "b": "B", 
        "i": "I", 
        "y": "y"
      }
    }
  ], 
  "c": {
    "c": "YES"
  }, 
  "i": "I"
}

回答 8

如果您使用的是Python3.5 +,则可以使用jsons。它将把您的对象(及其所有属性递归地)转换成字典。

import jsons

a_dict = jsons.dump(your_object)

或者,如果您想要一个字符串:

a_str = jsons.dumps(your_object)

或者如果您的Class实施了jsons.JsonSerializable

a_dict = your_object.json

If you’re using Python3.5+, you could use jsons. It will convert your object (and all its attributes recursively) to a dict.

import jsons

a_dict = jsons.dump(your_object)

Or if you wanted a string:

a_str = jsons.dumps(your_object)

Or if your class implemented jsons.JsonSerializable:

a_dict = your_object.json

回答 9

import simplejson

class User(object):
    def __init__(self, name, mail):
        self.name = name
        self.mail = mail

    def _asdict(self):
        return self.__dict__

print(simplejson.dumps(User('alice', 'alice@mail.com')))

如果使用标准json,则需要定义一个default函数

import json
def default(o):
    return o._asdict()

print(json.dumps(User('alice', 'alice@mail.com'), default=default))
import simplejson

class User(object):
    def __init__(self, name, mail):
        self.name = name
        self.mail = mail

    def _asdict(self):
        return self.__dict__

print(simplejson.dumps(User('alice', 'alice@mail.com')))

if use standard json, u need to define a default function

import json
def default(o):
    return o._asdict()

print(json.dumps(User('alice', 'alice@mail.com'), default=default))

回答 10

json在可以打印的对象方面受到限制,并且jsonpickle(您可能需要pip install jsonpickle)在不能缩进文本方面受到限制。如果您想检查无法更改其类的对象的内容,我仍然找不到比以下方法更直接的方法:

 import json
 import jsonpickle
 ...
 print  json.dumps(json.loads(jsonpickle.encode(object)), indent=2)

注意:他们仍然无法打印对象方法。

json is limited in terms of objects it can print, and jsonpickle (you may need a pip install jsonpickle) is limited in terms it can’t indent text. If you would like to inspect the contents of an object whose class you can’t change, I still couldn’t find a straighter way than:

 import json
 import jsonpickle
 ...
 print  json.dumps(json.loads(jsonpickle.encode(object)), indent=2)

Note: that still they can’t print the object methods.


回答 11

此类可以解决问题,它将对象转换为标准json。

import json


class Serializer(object):
    @staticmethod
    def serialize(object):
        return json.dumps(object, default=lambda o: o.__dict__.values()[0])

用法:

Serializer.serialize(my_object)

python2.7和工作python3

This class can do the trick, it converts object to standard json .

import json


class Serializer(object):
    @staticmethod
    def serialize(object):
        return json.dumps(object, default=lambda o: o.__dict__.values()[0])

usage:

Serializer.serialize(my_object)

working in python2.7 and python3.


回答 12

import json

class Foo(object):
    def __init__(self):
        self.bar = 'baz'
        self._qux = 'flub'

    def somemethod(self):
        pass

def default(instance):
    return {k: v
            for k, v in vars(instance).items()
            if not str(k).startswith('_')}

json_foo = json.dumps(Foo(), default=default)
assert '{"bar": "baz"}' == json_foo

print(json_foo)
import json

class Foo(object):
    def __init__(self):
        self.bar = 'baz'
        self._qux = 'flub'

    def somemethod(self):
        pass

def default(instance):
    return {k: v
            for k, v in vars(instance).items()
            if not str(k).startswith('_')}

json_foo = json.dumps(Foo(), default=default)
assert '{"bar": "baz"}' == json_foo

print(json_foo)

回答 13

jaraco给出了一个非常简洁的答案。我需要修复一些小问题,但这可行:

# Your custom class
class MyCustom(object):
    def __json__(self):
        return {
            'a': self.a,
            'b': self.b,
            '__python__': 'mymodule.submodule:MyCustom.from_json',
        }

    to_json = __json__  # supported by simplejson

    @classmethod
    def from_json(cls, json):
        obj = cls()
        obj.a = json['a']
        obj.b = json['b']
        return obj

# Dumping and loading
import simplejson

obj = MyCustom()
obj.a = 3
obj.b = 4

json = simplejson.dumps(obj, for_json=True)

# Two-step loading
obj2_dict = simplejson.loads(json)
obj2 = MyCustom.from_json(obj2_dict)

# Make sure we have the correct thing
assert isinstance(obj2, MyCustom)
assert obj2.__dict__ == obj.__dict__

请注意,我们需要两个步骤进行加载。目前,该__python__属性尚未使用。

这有多普遍?

使用AlJohri的方法,我检查了方法的普及程度:

序列化(Python-> JSON):

反序列化(JSON-> Python):

jaraco gave a pretty neat answer. I needed to fix some minor things, but this works:

Code

# Your custom class
class MyCustom(object):
    def __json__(self):
        return {
            'a': self.a,
            'b': self.b,
            '__python__': 'mymodule.submodule:MyCustom.from_json',
        }

    to_json = __json__  # supported by simplejson

    @classmethod
    def from_json(cls, json):
        obj = cls()
        obj.a = json['a']
        obj.b = json['b']
        return obj

# Dumping and loading
import simplejson

obj = MyCustom()
obj.a = 3
obj.b = 4

json = simplejson.dumps(obj, for_json=True)

# Two-step loading
obj2_dict = simplejson.loads(json)
obj2 = MyCustom.from_json(obj2_dict)

# Make sure we have the correct thing
assert isinstance(obj2, MyCustom)
assert obj2.__dict__ == obj.__dict__

Note that we need two steps for loading. For now, the __python__ property is not used.

How common is this?

Using the method of AlJohri, I check popularity of approaches:

Serialization (Python -> JSON):

Deserialization (JSON -> Python):


回答 14

这对我来说效果很好:

class JsonSerializable(object):

    def serialize(self):
        return json.dumps(self.__dict__)

    def __repr__(self):
        return self.serialize()

    @staticmethod
    def dumper(obj):
        if "serialize" in dir(obj):
            return obj.serialize()

        return obj.__dict__

然后

class FileItem(JsonSerializable):
    ...

log.debug(json.dumps(<my object>, default=JsonSerializable.dumper, indent=2))

This has worked well for me:

class JsonSerializable(object):

    def serialize(self):
        return json.dumps(self.__dict__)

    def __repr__(self):
        return self.serialize()

    @staticmethod
    def dumper(obj):
        if "serialize" in dir(obj):
            return obj.serialize()

        return obj.__dict__

and then

class FileItem(JsonSerializable):
    ...

and

log.debug(json.dumps(<my object>, default=JsonSerializable.dumper, indent=2))

回答 15

如果您不介意为其安装软件包,则可以使用json-tricks

pip install json-tricks

在此之后,你只需要导入dump(s)json_tricks替代JSON,它通常会工作:

from json_tricks import dumps
json_str = dumps(cls_instance, indent=4)

这会给

{
        "__instance_type__": [
                "module_name.test_class",
                "MyTestCls"
        ],
        "attributes": {
                "attr": "val",
                "dct_attr": {
                        "hello": 42
                }
        }
}

基本上就是这样!


总的来说,这会很好。有一些exceptions,例如,如果发生了特殊情况__new__,或者发生了更多的元类魔术。

显然,加载也可以(否则有什么意义):

from json_tricks import loads
json_str = loads(json_str)

这确实假定module_name.test_class.MyTestCls可以导入并且没有以不兼容的方式进行更改。您将获得一个实例,而不是字典或其他内容,它应该与您转储的副本相同。

如果要自定义某些东西的序列化方法,可以向类中添加特殊方法,如下所示:

class CustomEncodeCls:
        def __init__(self):
                self.relevant = 42
                self.irrelevant = 37

        def __json_encode__(self):
                # should return primitive, serializable types like dict, list, int, string, float...
                return {'relevant': self.relevant}

        def __json_decode__(self, **attrs):
                # should initialize all properties; note that __init__ is not called implicitly
                self.relevant = attrs['relevant']
                self.irrelevant = 12

例如,它仅序列化部分属性参数。

作为免费赠品,您可以获得numpy数组(日期)的反序列化,日期和时间,有序映射以及在json中包含注释的功能。

免责声明:我创建了json_tricks,因为我和您有同样的问题。

If you don’t mind installing a package for it, you can use json-tricks:

pip install json-tricks

After that you just need to import dump(s) from json_tricks instead of json, and it’ll usually work:

from json_tricks import dumps
json_str = dumps(cls_instance, indent=4)

which’ll give

{
        "__instance_type__": [
                "module_name.test_class",
                "MyTestCls"
        ],
        "attributes": {
                "attr": "val",
                "dct_attr": {
                        "hello": 42
                }
        }
}

And that’s basically it!


This will work great in general. There are some exceptions, e.g. if special things happen in __new__, or more metaclass magic is going on.

Obviously loading also works (otherwise what’s the point):

from json_tricks import loads
json_str = loads(json_str)

This does assume that module_name.test_class.MyTestCls can be imported and hasn’t changed in non-compatible ways. You’ll get back an instance, not some dictionary or something, and it should be an identical copy to the one you dumped.

If you want to customize how something gets (de)serialized, you can add special methods to your class, like so:

class CustomEncodeCls:
        def __init__(self):
                self.relevant = 42
                self.irrelevant = 37

        def __json_encode__(self):
                # should return primitive, serializable types like dict, list, int, string, float...
                return {'relevant': self.relevant}

        def __json_decode__(self, **attrs):
                # should initialize all properties; note that __init__ is not called implicitly
                self.relevant = attrs['relevant']
                self.irrelevant = 12

which serializes only part of the attributes parameters, as an example.

And as a free bonus, you get (de)serialization of numpy arrays, date & times, ordered maps, as well as the ability to include comments in json.

Disclaimer: I created json_tricks, because I had the same problem as you.


回答 16

jsonweb似乎是对我最好的解决方案。参见http://www.jsonweb.info/en/latest/

from jsonweb.encode import to_object, dumper

@to_object()
class DataModel(object):
  def __init__(self, id, value):
   self.id = id
   self.value = value

>>> data = DataModel(5, "foo")
>>> dumper(data)
'{"__type__": "DataModel", "id": 5, "value": "foo"}'

jsonweb seems to be the best solution for me. See http://www.jsonweb.info/en/latest/

from jsonweb.encode import to_object, dumper

@to_object()
class DataModel(object):
  def __init__(self, id, value):
   self.id = id
   self.value = value

>>> data = DataModel(5, "foo")
>>> dumper(data)
'{"__type__": "DataModel", "id": 5, "value": "foo"}'

回答 17

这是我的3美分…
这演示了一个类似树的python对象的显式json序列化。
注意:如果您实际上想要这样的代码,则可以使用扭曲的FilePath类。

import json, sys, os

class File:
    def __init__(self, path):
        self.path = path

    def isdir(self):
        return os.path.isdir(self.path)

    def isfile(self):
        return os.path.isfile(self.path)

    def children(self):        
        return [File(os.path.join(self.path, f)) 
                for f in os.listdir(self.path)]

    def getsize(self):        
        return os.path.getsize(self.path)

    def getModificationTime(self):
        return os.path.getmtime(self.path)

def _default(o):
    d = {}
    d['path'] = o.path
    d['isFile'] = o.isfile()
    d['isDir'] = o.isdir()
    d['mtime'] = int(o.getModificationTime())
    d['size'] = o.getsize() if o.isfile() else 0
    if o.isdir(): d['children'] = o.children()
    return d

folder = os.path.abspath('.')
json.dump(File(folder), sys.stdout, default=_default)

Here is my 3 cents …
This demonstrates explicit json serialization for a tree-like python object.
Note: If you actually wanted some code like this you could use the twisted FilePath class.

import json, sys, os

class File:
    def __init__(self, path):
        self.path = path

    def isdir(self):
        return os.path.isdir(self.path)

    def isfile(self):
        return os.path.isfile(self.path)

    def children(self):        
        return [File(os.path.join(self.path, f)) 
                for f in os.listdir(self.path)]

    def getsize(self):        
        return os.path.getsize(self.path)

    def getModificationTime(self):
        return os.path.getmtime(self.path)

def _default(o):
    d = {}
    d['path'] = o.path
    d['isFile'] = o.isfile()
    d['isDir'] = o.isdir()
    d['mtime'] = int(o.getModificationTime())
    d['size'] = o.getsize() if o.isfile() else 0
    if o.isdir(): d['children'] = o.children()
    return d

folder = os.path.abspath('.')
json.dump(File(folder), sys.stdout, default=_default)

回答 18

当我尝试将Peewee的模型存储到PostgreSQL中时遇到了这个问题 JSONField

经过一段时间的努力,这是一般的解决方案。

我的解决方案的关键是浏览Python的源代码,并意识到代码文档(在此描述)已经解释了如何扩展现有的json.dumps以支持其他数据类型。

假设您当前有一个模型,其中包含一些无法序列化为JSON的字段,并且包含JSON字段的模型最初看起来像这样:

class SomeClass(Model):
    json_field = JSONField()

只需这样定义一个自定义JSONEncoder

class CustomJsonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, SomeTypeUnsupportedByJsonDumps):
            return < whatever value you want >
        return json.JSONEncoder.default(self, obj)

    @staticmethod
    def json_dumper(obj):
        return json.dumps(obj, cls=CustomJsonEncoder)

然后JSONField像下面这样使用它:

class SomeClass(Model):
    json_field = JSONField(dumps=CustomJsonEncoder.json_dumper)

关键是default(self, obj)上面的方法。对于... is not JSON serializable您从Python收到的每一个投诉,只需添加代码来处理从unserializable-to-JSON类型(例如Enumdatetime

例如,这是我如何支持从继承的类Enum

class TransactionType(Enum):
   CURRENT = 1
   STACKED = 2

   def default(self, obj):
       if isinstance(obj, TransactionType):
           return obj.value
       return json.JSONEncoder.default(self, obj)

最后,使用上述实现的代码,您可以将任何Peewee模型转换为JSON可序列化的对象,如下所示:

peewee_model = WhateverPeeweeModel()
new_model = SomeClass()
new_model.json_field = model_to_dict(peewee_model)

尽管上面的代码(某种程度上)特定于Peewee,但是我认为:

  1. 通常适用于其他ORM(例如Django等)
  2. 另外,如果您了解其json.dumps工作原理,那么该解决方案通常也适用于Python(无ORM)

如有任何疑问,请发表在评论部分。谢谢!

I ran into this problem when I tried to store Peewee’s model into PostgreSQL JSONField.

After struggling for a while, here’s the general solution.

The key to my solution is going through Python’s source code and realizing that the code documentation (described here) already explains how to extend the existing json.dumps to support other data types.

Suppose you current have a model that contains some fields that are not serializable to JSON and the model that contains the JSON field originally looks like this:

class SomeClass(Model):
    json_field = JSONField()

Just define a custom JSONEncoder like this:

class CustomJsonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, SomeTypeUnsupportedByJsonDumps):
            return < whatever value you want >
        return json.JSONEncoder.default(self, obj)

    @staticmethod
    def json_dumper(obj):
        return json.dumps(obj, cls=CustomJsonEncoder)

And then just use it in your JSONField like below:

class SomeClass(Model):
    json_field = JSONField(dumps=CustomJsonEncoder.json_dumper)

The key is the default(self, obj) method above. For every single ... is not JSON serializable complaint you receive from Python, just add code to handle the unserializable-to-JSON type (such as Enum or datetime)

For example, here’s how I support a class inheriting from Enum:

class TransactionType(Enum):
   CURRENT = 1
   STACKED = 2

   def default(self, obj):
       if isinstance(obj, TransactionType):
           return obj.value
       return json.JSONEncoder.default(self, obj)

Finally, with the code implemented like above, you can just convert any Peewee models to be a JSON-seriazable object like below:

peewee_model = WhateverPeeweeModel()
new_model = SomeClass()
new_model.json_field = model_to_dict(peewee_model)

Though the code above was (somewhat) specific to Peewee, but I think:

  1. It’s applicable to other ORMs (Django, etc) in general
  2. Also, if you understood how json.dumps works, this solution also works with Python (sans ORM) in general too

Any questions, please post in the comments section. Thanks!


回答 19

此函数使用递归遍历字典的每个部分,然后调用非内置类型的类的repr()方法。

def sterilize(obj):
    object_type = type(obj)
    if isinstance(obj, dict):
        return {k: sterilize(v) for k, v in obj.items()}
    elif object_type in (list, tuple):
        return [sterilize(v) for v in obj]
    elif object_type in (str, int, bool):
        return obj
    else:
        return obj.__repr__()

This function uses recursion to iterate over every part of the dictionary and then calls the repr() methods of classes that are not build-in types.

def sterilize(obj):
    object_type = type(obj)
    if isinstance(obj, dict):
        return {k: sterilize(v) for k, v in obj.items()}
    elif object_type in (list, tuple):
        return [sterilize(v) for v in obj]
    elif object_type in (str, int, bool):
        return obj
    else:
        return obj.__repr__()

回答 20

这是一个小型库,它将带有其所有子级的对象序列化为JSON并将其解析回:

https://github.com/Toubs/PyJSONSerialization/

This is a small library that serializes an object with all its children to JSON and also parses it back:

https://github.com/Toubs/PyJSONSerialization/


回答 21

我想出了自己的解决方案。使用此方法,传递任何文档(dictlistObjectId等)进行序列化。

def getSerializable(doc):
    # check if it's a list
    if isinstance(doc, list):
        for i, val in enumerate(doc):
            doc[i] = getSerializable(doc[i])
        return doc

    # check if it's a dict
    if isinstance(doc, dict):
        for key in doc.keys():
            doc[key] = getSerializable(doc[key])
        return doc

    # Process ObjectId
    if isinstance(doc, ObjectId):
        doc = str(doc)
        return doc

    # Use any other custom serializting stuff here...

    # For the rest of stuff
    return doc

I came up with my own solution. Use this method, pass any document (dict,list, ObjectId etc) to serialize.

def getSerializable(doc):
    # check if it's a list
    if isinstance(doc, list):
        for i, val in enumerate(doc):
            doc[i] = getSerializable(doc[i])
        return doc

    # check if it's a dict
    if isinstance(doc, dict):
        for key in doc.keys():
            doc[key] = getSerializable(doc[key])
        return doc

    # Process ObjectId
    if isinstance(doc, ObjectId):
        doc = str(doc)
        return doc

    # Use any other custom serializting stuff here...

    # For the rest of stuff
    return doc

回答 22

我选择使用装饰器解决datetime对象序列化问题。这是我的代码:

#myjson.py
#Author: jmooremcc 7/16/2017

import json
from datetime import datetime, date, time, timedelta
"""
This module uses decorators to serialize date objects using json
The filename is myjson.py
In another module you simply add the following import statement:
    from myjson import json

json.dumps and json.dump will then correctly serialize datetime and date 
objects
"""

def json_serial(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, (datetime, date)):
        serial = str(obj)
        return serial
    raise TypeError ("Type %s not serializable" % type(obj))


def FixDumps(fn):
    def hook(obj):
        return fn(obj, default=json_serial)

    return hook

def FixDump(fn):
    def hook(obj, fp):
        return fn(obj,fp, default=json_serial)

    return hook


json.dumps=FixDumps(json.dumps)
json.dump=FixDump(json.dump)


if __name__=="__main__":
    today=datetime.now()
    data={'atime':today, 'greet':'Hello'}
    str=json.dumps(data)
    print str

通过导入上述模块,我的其他模块以常规方式(不指定默认关键字)使用json来序列化包含日期时间对象的数据。datetime序列化程序代码会被json.dumps和json.dump自动调用。

I chose to use decorators to solve the datetime object serialization problem. Here is my code:

#myjson.py
#Author: jmooremcc 7/16/2017

import json
from datetime import datetime, date, time, timedelta
"""
This module uses decorators to serialize date objects using json
The filename is myjson.py
In another module you simply add the following import statement:
    from myjson import json

json.dumps and json.dump will then correctly serialize datetime and date 
objects
"""

def json_serial(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, (datetime, date)):
        serial = str(obj)
        return serial
    raise TypeError ("Type %s not serializable" % type(obj))


def FixDumps(fn):
    def hook(obj):
        return fn(obj, default=json_serial)

    return hook

def FixDump(fn):
    def hook(obj, fp):
        return fn(obj,fp, default=json_serial)

    return hook


json.dumps=FixDumps(json.dumps)
json.dump=FixDump(json.dump)


if __name__=="__main__":
    today=datetime.now()
    data={'atime':today, 'greet':'Hello'}
    str=json.dumps(data)
    print str

By importing the above module, my other modules use json in a normal way (without specifying the default keyword) to serialize data that contains date time objects. The datetime serializer code is automatically called for json.dumps and json.dump.


回答 23

我最喜欢Lost Koder的方法。尝试序列化其成员/方法无法序列化的更复杂对象时,我遇到了问题。这是适用于更多对象的实现:

class Serializer(object):
    @staticmethod
    def serialize(obj):
        def check(o):
            for k, v in o.__dict__.items():
                try:
                    _ = json.dumps(v)
                    o.__dict__[k] = v
                except TypeError:
                    o.__dict__[k] = str(v)
            return o
        return json.dumps(check(obj).__dict__, indent=2)

I liked Lost Koder’s method the most. I ran into issues when trying to serialize more complex objects whos members/methods aren’t serializable. Here’s my implementation that works on more objects:

class Serializer(object):
    @staticmethod
    def serialize(obj):
        def check(o):
            for k, v in o.__dict__.items():
                try:
                    _ = json.dumps(v)
                    o.__dict__[k] = v
                except TypeError:
                    o.__dict__[k] = str(v)
            return o
        return json.dumps(check(obj).__dict__, indent=2)

回答 24

如果您能够安装软件包,我建议您尝试dill,这对于我的项目来说效果很好。这个套件的优点是它具有与相同的介面pickle,因此,如果您已经pickle在专案中使用过,可以简单地以dill为该脚本并查看脚本是否在运行,而无需更改任何代码。因此,这是一个非常便宜的解决方案!

(完全反公开:我与Dill项目毫无关系,也从未参与过该项目。)

安装软件包:

pip install dill

然后,编辑要导入的代码,dill而不是pickle

# import pickle
import dill as pickle

运行您的脚本,看看它是否有效。(如果这样做,您可能需要清理代码,以使您不再隐藏pickle模块名称!)

dill项目页面可以和不能序列化的数据类型的一些细节:

dill 可以腌制以下标准类型:

无,类型,布尔值,整数,长整型,浮点型,复杂,str,unicode,元组,列表,字典,文件,缓冲区,内置,新旧样式类,新旧样式类实例,集合,frozenset,数组,功能,异常

dill 也可以腌制更多“异国”标准类型:

具有yields的函数,嵌套函数,lambdas,单元格,方法,unboundmethod,模块,代码,methodwrapper,dictproxy,methoddescriptor,getsetdescriptor,memberdescriptor,wrapperperscriptor,xrange,slice,未实现,省略号,退出

dill 尚不能腌制这些标准类型:

框架,发生器,回溯

If you are able to install a package, I’d recommend trying dill, which worked just fine for my project. A nice thing about this package is that it has the same interface as pickle, so if you have already been using pickle in your project you can simply substitute in dill and see if the script runs, without changing any code. So it is a very cheap solution to try!

(Full anti-disclosure: I am in no way affiliated with and have never contributed to the dill project.)

Install the package:

pip install dill

Then edit your code to import dill instead of pickle:

# import pickle
import dill as pickle

Run your script and see if it works. (If it does you may want to clean up your code so that you are no longer shadowing the pickle module name!)

Some specifics on datatypes that dill can and cannot serialize, from the project page:

dill can pickle the following standard types:

none, type, bool, int, long, float, complex, str, unicode, tuple, list, dict, file, buffer, builtin, both old and new style classes, instances of old and new style classes, set, frozenset, array, functions, exceptions

dill can also pickle more ‘exotic’ standard types:

functions with yields, nested functions, lambdas, cell, method, unboundmethod, module, code, methodwrapper, dictproxy, methoddescriptor, getsetdescriptor, memberdescriptor, wrapperdescriptor, xrange, slice, notimplemented, ellipsis, quit

dill cannot yet pickle these standard types:

frame, generator, traceback


回答 25

我在这里没有提到串行版本控制或反向兼容,因此我将发布已经使用了一段时间的解决方案。我可能有很多东西要学,特别是Java和Javascript在这里可能比我更成熟,但是这里

https://gist.github.com/andy-d/b7878d0044a4242c0498ed6d67fd50fe

I see no mention here of serial versioning or backcompat, so I will post my solution which I’ve been using for a bit. I probably have a lot more to learn from, specifically Java and Javascript are probably more mature than me here but here goes

https://gist.github.com/andy-d/b7878d0044a4242c0498ed6d67fd50fe


回答 26

要添加另一个选项:您可以使用attrs包和asdict方法。

class ObjectEncoder(JSONEncoder):
    def default(self, o):
        return attr.asdict(o)

json.dumps(objects, cls=ObjectEncoder)

并转换回来

def from_json(o):
    if '_obj_name' in o:
        type_ = o['_obj_name']
        del o['_obj_name']
        return globals()[type_](**o)
    else:
        return o

data = JSONDecoder(object_hook=from_json).decode(data)

类看起来像这样

@attr.s
class Foo(object):
    x = attr.ib()
    _obj_name = attr.ib(init=False, default='Foo')

To add another option: You can use the attrs package and the asdict method.

class ObjectEncoder(JSONEncoder):
    def default(self, o):
        return attr.asdict(o)

json.dumps(objects, cls=ObjectEncoder)

and to convert back

def from_json(o):
    if '_obj_name' in o:
        type_ = o['_obj_name']
        del o['_obj_name']
        return globals()[type_](**o)
    else:
        return o

data = JSONDecoder(object_hook=from_json).decode(data)

class looks like this

@attr.s
class Foo(object):
    x = attr.ib()
    _obj_name = attr.ib(init=False, default='Foo')

回答 27

除了Onur的答案外,您可能还想处理类似以下的datetime类型。
(为了处理:“ datetime.datetime”对象没有属性“ dict ”异常。)

def datetime_option(value):
    if isinstance(value, datetime.date):
        return value.timestamp()
    else:
        return value.__dict__

用法:

def toJSON(self):
    return json.dumps(self, default=datetime_option, sort_keys=True, indent=4)

In addition to the Onur’s answer, You possibly want to deal with datetime type like below.
(in order to handle: ‘datetime.datetime’ object has no attribute ‘dict‘ exception.)

def datetime_option(value):
    if isinstance(value, datetime.date):
        return value.timestamp()
    else:
        return value.__dict__

Usage:

def toJSON(self):
    return json.dumps(self, default=datetime_option, sort_keys=True, indent=4)

回答 28

首先,我们需要使对象符合JSON,因此可以使用标准JSON模块将其转储。我这样做是这样的:

def serialize(o):
    if isinstance(o, dict):
        return {k:serialize(v) for k,v in o.items()}
    if isinstance(o, list):
        return [serialize(e) for e in o]
    if isinstance(o, bytes):
        return o.decode("utf-8")
    return o

First we need to make our object JSON-compliant, so we can dump it using the standard JSON module. I did it this way:

def serialize(o):
    if isinstance(o, dict):
        return {k:serialize(v) for k,v in o.items()}
    if isinstance(o, list):
        return [serialize(e) for e in o]
    if isinstance(o, bytes):
        return o.decode("utf-8")
    return o

回答 29

基于Quinten Cabo答案

def sterilize(obj):
    if type(obj) in (str, float, int, bool, type(None)):
        return obj
    elif isinstance(obj, dict):
        return {k: sterilize(v) for k, v in obj.items()}
    elif hasattr(obj, '__iter__') and callable(obj.__iter__):
        return [sterilize(v) for v in obj]
    elif hasattr(obj, '__dict__'):
        return {k: sterilize(v) for k, v in obj.__dict__.items() if k not in ['__module__', '__dict__', '__weakref__', '__doc__']}
    else:
        return repr(obj)

区别是

  1. 适用于任何迭代而不是just listtuple(适用于NumPy数组等)
  2. 适用于动态类型(包含 __dict__)。
  3. 包括本机类型floatNone因此它们不会转换为字符串。

留给读者的练习是处理__slots__,这些类既可迭代且具有成员,这些类既是字典又具有成员,等等。

Building on Quinten Cabo‘s answer:

def sterilize(obj):
    if type(obj) in (str, float, int, bool, type(None)):
        return obj
    elif isinstance(obj, dict):
        return {k: sterilize(v) for k, v in obj.items()}
    elif hasattr(obj, '__iter__') and callable(obj.__iter__):
        return [sterilize(v) for v in obj]
    elif hasattr(obj, '__dict__'):
        return {k: sterilize(v) for k, v in obj.__dict__.items() if k not in ['__module__', '__dict__', '__weakref__', '__doc__']}
    else:
        return repr(obj)

The differences are

  1. Works for any iterable instead of just list and tuple (it works for NumPy arrays, etc.)
  2. Works for dynamic types (ones that contain a __dict__).
  3. Includes native types float and None so they don’t get converted to string.

Left as an exercise to the reader is to handle __slots__, classes that are both iterable and have members, classes that are dictionaries and also have members, etc.