标签归档:json

存储Python字典

问题:存储Python字典

我习惯于使用.csv文件将数据导入和导出Python,但这存在明显的挑战。关于将字典(或字典集)存储在json或pck文件中的简单方法的任何建议?例如:

data = {}
data ['key1'] = "keyinfo"
data ['key2'] = "keyinfo2"

我想知道如何保存它,然后再将其加载回去。

I’m used to bringing data in and out of Python using CSV files, but there are obvious challenges to this. Are there simple ways to store a dictionary (or sets of dictionaries) in a JSON or pickle file?

For example:

data = {}
data ['key1'] = "keyinfo"
data ['key2'] = "keyinfo2"

I would like to know both how to save this, and then how to load it back in.


回答 0

泡菜保存:

try:
    import cPickle as pickle
except ImportError:  # python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

有关该参数的其他信息,请参见pickle模块文档protocol

酸洗负荷:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)

JSON保存:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

提供额外的参数,例如sort_keysindent以获得漂亮的结果。参数sort_keys将按字母顺序对键进行排序,而indent将使用indent=N空格缩进您的数据结构。

json.dump(data, fp, sort_keys=True, indent=4)

JSON加载:

with open('data.json', 'r') as fp:
    data = json.load(fp)

Pickle save:

try:
    import cPickle as pickle
except ImportError:  # Python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

See the pickle module documentation for additional information regarding the protocol argument.

Pickle load:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)

JSON save:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

Supply extra arguments, like sort_keys or indent, to get a pretty result. The argument sort_keys will sort the keys alphabetically and indent will indent your data structure with indent=N spaces.

json.dump(data, fp, sort_keys=True, indent=4)

JSON load:

with open('data.json', 'r') as fp:
    data = json.load(fp)

回答 1

最小的示例,直接写入文件:

import json
json.dump(data, open(filename, 'wb'))
data = json.load(open(filename))

或安全地打开/关闭:

import json
with open(filename, 'wb') as outfile:
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

如果要将其保存为字符串而不是文件:

import json
json_str = json.dumps(data)
data = json.loads(json_str)

Minimal example, writing directly to a file:

import json
json.dump(data, open(filename, 'wb'))
data = json.load(open(filename))

or safely opening / closing:

import json
with open(filename, 'wb') as outfile:
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

If you want to save it in a string instead of a file:

import json
json_str = json.dumps(data)
data = json.loads(json_str)

回答 2

另请参阅加速包ujson。 https://pypi.python.org/pypi/ujson

import ujson
with open('data.json', 'wb') as fp:
    ujson.dump(data, fp)

Also see the speeded-up package ujson:

import ujson

with open('data.json', 'wb') as fp:
    ujson.dump(data, fp)

回答 3

要写入文件:

import json
myfile.write(json.dumps(mydict))

要读取文件:

import json
mydict = json.loads(myfile.read())

myfile 是存储字典的文件的文件对象。

To write to a file:

import json
myfile.write(json.dumps(mydict))

To read from a file:

import json
mydict = json.loads(myfile.read())

myfile is the file object for the file that you stored the dict in.


回答 4

如果您正在序列化之后但不需要其他程序中的数据,则强烈建议您使用该shelve模块。将其视为持久性字典。

myData = shelve.open('/path/to/file')

# check for values.
keyVar in myData

# set values
myData[anotherKey] = someValue

# save the data for future use.
myData.close()

If you’re after serialization, but won’t need the data in other programs, I strongly recommend the shelve module. Think of it as a persistent dictionary.

myData = shelve.open('/path/to/file')

# Check for values.
keyVar in myData

# Set values
myData[anotherKey] = someValue

# Save the data for future use.
myData.close()

回答 5

如果您想要替代picklejson,则可以使用klepto

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print memo
{'y': 2, 'x': 1, 'z': 3}

使用klepto,如果使用过serialized=True,则该字典将被memo.pkl作为腌制的字典写入,而不是使用明文。

你可以在klepto这里找到:https : //github.com/uqfoundation/klepto

dill酸洗可能比酸洗更好pickle,因为dill可以在python中序列化几乎所有内容。 klepto也可以使用dill

你可以在dill这里找到:https : //github.com/uqfoundation/dill

前几行中额外的mumbo-jumbo是因为klepto可以配置为将字典存储到文件,目录上下文或SQL数据库中。无论选择什么作为后端存档,API都是相同的。它为您提供了一个“可存档”字典,您可以使用该字典loaddump与档案进行交互。

If you want an alternative to pickle or json, you can use klepto.

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print memo
{'y': 2, 'x': 1, 'z': 3}

With klepto, if you had used serialized=True, the dictionary would have been written to memo.pkl as a pickled dictionary instead of with clear text.

You can get klepto here: https://github.com/uqfoundation/klepto

dill is probably a better choice for pickling then pickle itself, as dill can serialize almost anything in python. klepto also can use dill.

You can get dill here: https://github.com/uqfoundation/dill

The additional mumbo-jumbo on the first few lines are because klepto can be configured to store dictionaries to a file, to a directory context, or to a SQL database. The API is the same for whatever you choose as the backend archive. It gives you an “archivable” dictionary with which you can use load and dump to interact with the archive.


回答 6

这是一个老话题,但是为了完整起见,我们应该包括ConfigParser和configparser,它们分别是Python 2和3中的标准库的一部分。该模块读取和写入config / ini文件,并且(至少在Python 3中)其行为类似于字典。它的另一个好处是,您可以将多个词典存储到config / ini文件的不同部分中,并对其进行调用。甜!

Python 2.7.x示例。

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# make each dictionary a separate section in config
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])

config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# save config to file
f = open('config.ini', 'w')
config.write(f)
f.close()

# read config from file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.X示例。

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# make each dictionary a separate section in config
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# save config to file
f = open('config.ini', 'w')
config.write(f)
f.close()

# read config from file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'] )
dictB = dict(config2['dict2'] )
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

控制台输出

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

config.ini的内容

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1

For completeness, we should include ConfigParser and configparser which are part of the standard library in Python 2 and 3, respectively. This module reads and writes to a config/ini file and (at least in Python 3) behaves in a lot of ways like a dictionary. It has the added benefit that you can store multiple dictionaries into separate sections of your config/ini file and recall them. Sweet!

Python 2.7.x example.

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])
   
config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# Save the configuration to a file
f = open('config.ini', 'w')
config.write(f)
f.close()

# Read the configuration from a file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.X example.

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# Save the configuration to a file
f = open('config.ini', 'w')
config.write(f)
f.close()

# Read the configuration from a file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'] )
dictB = dict(config2['dict2'] )
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

Console output

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

Contents of config.ini

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1

回答 7

如果保存到json文件,最好的和最简单的方法是:

import json
with open("file.json", "wb") as f:
    f.write(json.dumps(dict).encode("utf-8"))

If save to a JSON file, the best and easiest way of doing this is:

import json
with open("file.json", "wb") as f:
    f.write(json.dumps(dict).encode("utf-8"))

回答 8

我的用例是将多个json对象保存到文件中,而marty的回答对我有所帮助。但是要满足我的用例,答案并不完整,因为每次保存新条目时,它都会覆盖旧数据。

为了将多个条目保存在一个文件中,必须检查旧内容(即在写入之前先读取)。存放json数据的典型文件将具有a listobjectas根。因此,我认为我的json文件始终具有a,list of objects并且每次向其添加数据时,我只会首先加载列表,在其中添加新数据,然后将其转储回文件(w)的仅可写实例:

def saveJson(url,sc): #this function writes the 2 values to file
    newdata = {'url':url,'sc':sc}
    json_path = "db/file.json"

    old_list= []
    with open(json_path) as myfile:  #read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path,"w") as myfile:  #overwrite the whole content
        json.dump(old_list,myfile,sort_keys=True,indent=4)

    return "sucess"

新的json文件将如下所示:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

注意:必须file.json使用[]以初始数据命名的文件,此方法才能正常工作

PS:与原始问题无关,但是通过首先检查我们的条目是否已经存在(基于1 /多个键),然后仅追加并保存数据,也可以进一步改进此方法。让我知道是否有人需要该支票,我将添加到答案中

My use case was to save multiple JSON objects to a file and marty’s answer helped me somewhat. But to serve my use case, the answer was not complete as it would overwrite the old data every time a new entry was saved.

To save multiple entries in a file, one must check for the old content (i.e., read before write). A typical file holding JSON data will either have a list or an object as root. So I considered that my JSON file always has a list of objects and every time I add data to it, I simply load the list first, append my new data in it, and dump it back to a writable-only instance of file (w):

def saveJson(url,sc): # This function writes the two values to the file
    newdata = {'url':url,'sc':sc}
    json_path = "db/file.json"

    old_list= []
    with open(json_path) as myfile:  # Read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path,"w") as myfile:  # Overwrite the whole content
        json.dump(old_list, myfile, sort_keys=True, indent=4)

    return "success"

The new JSON file will look something like this:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

NOTE: It is essential to have a file named file.json with [] as initial data for this approach to work

PS: not related to original question, but this approach could also be further improved by first checking if our entry already exists (based on one or multiple keys) and only then append and save the data.


如何从网页获取JSON到Python脚本

问题:如何从网页获取JSON到Python脚本

在我的一个脚本中获得了以下代码:

#
# url is defined above.
#
jsonurl = urlopen(url)

#
# While trying to debug, I put this in:
#
print jsonurl

#
# Was hoping text would contain the actual json crap from the URL, but seems not...
#
text = json.loads(jsonurl)
print text

我想要做的是获取{{.....etc.....}}在Firefox中将其加载到脚本中时在URL上看到的内容,以便我可以解析出一个值。我已经用Google搜索了很多,但是关于如何{{...}}从URL 实际获取内容.json到Python脚本中的对象中,我还没有找到一个好的答案。

Got the following code in one of my scripts:

#
# url is defined above.
#
jsonurl = urlopen(url)

#
# While trying to debug, I put this in:
#
print jsonurl

#
# Was hoping text would contain the actual json crap from the URL, but seems not...
#
text = json.loads(jsonurl)
print text

What I want to do is get the {{.....etc.....}} stuff that I see on the URL when I load it in Firefox into my script so I can parse a value out of it. I’ve Googled a ton but I haven’t found a good answer as to how to actually get the {{...}} stuff from a URL ending in .json into an object in a Python script.


回答 0

从URL获取数据,然后调用json.loads例如

Python3示例

import urllib.request, json 
with urllib.request.urlopen("http://maps.googleapis.com/maps/api/geocode/json?address=google") as url:
    data = json.loads(url.read().decode())
    print(data)

Python2示例

import urllib, json
url = "http://maps.googleapis.com/maps/api/geocode/json?address=google"
response = urllib.urlopen(url)
data = json.loads(response.read())
print data

输出结果将是这样的:

{
"results" : [
    {
    "address_components" : [
        {
            "long_name" : "Charleston and Huff",
            "short_name" : "Charleston and Huff",
            "types" : [ "establishment", "point_of_interest" ]
        },
        {
            "long_name" : "Mountain View",
            "short_name" : "Mountain View",
            "types" : [ "locality", "political" ]
        },
        {
...

Get data from the URL and then call json.loads e.g.

Python3 example:

import urllib.request, json 
with urllib.request.urlopen("http://maps.googleapis.com/maps/api/geocode/json?address=google") as url:
    data = json.loads(url.read().decode())
    print(data)

Python2 example:

import urllib, json
url = "http://maps.googleapis.com/maps/api/geocode/json?address=google"
response = urllib.urlopen(url)
data = json.loads(response.read())
print data

The output would result in something like this:

{
"results" : [
    {
    "address_components" : [
        {
            "long_name" : "Charleston and Huff",
            "short_name" : "Charleston and Huff",
            "types" : [ "establishment", "point_of_interest" ]
        },
        {
            "long_name" : "Mountain View",
            "short_name" : "Mountain View",
            "types" : [ "locality", "political" ]
        },
        {
...

回答 1

我猜您实际上是想从URL中获取数据:

jsonurl = urlopen(url)
text = json.loads(jsonurl.read()) # <-- read from it

或者,在请求库中检出JSON解码器

import requests
r = requests.get('someurl')
print r.json() # if response type was set to JSON, then you'll automatically have a JSON response here...

I’ll take a guess that you actually want to get data from the URL:

jsonurl = urlopen(url)
text = json.loads(jsonurl.read()) # <-- read from it

Or, check out JSON decoder in the requests library.

import requests
r = requests.get('someurl')
print r.json() # if response type was set to JSON, then you'll automatically have a JSON response here...

回答 2

这会从使用Python 2.X和Python 3.X的网页获取JSON格式的字典:

#!/usr/bin/env python

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

import json


def get_jsonparsed_data(url):
    """
    Receive the content of ``url``, parse it as JSON and return the object.

    Parameters
    ----------
    url : str

    Returns
    -------
    dict
    """
    response = urlopen(url)
    data = response.read().decode("utf-8")
    return json.loads(data)


url = ("http://maps.googleapis.com/maps/api/geocode/json?"
       "address=googleplex&sensor=false")
print(get_jsonparsed_data(url))

另请参阅:JSON读写示例

This gets a dictionary in JSON format from a webpage with Python 2.X and Python 3.X:

#!/usr/bin/env python

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

import json


def get_jsonparsed_data(url):
    """
    Receive the content of ``url``, parse it as JSON and return the object.

    Parameters
    ----------
    url : str

    Returns
    -------
    dict
    """
    response = urlopen(url)
    data = response.read().decode("utf-8")
    return json.loads(data)


url = ("http://maps.googleapis.com/maps/api/geocode/json?"
       "address=googleplex&sensor=false")
print(get_jsonparsed_data(url))

See also: Read and write example for JSON


回答 3

我发现这是使用Python 3时从网页获取JSON的最简单,最有效的方法:

import json,urllib.request
data = urllib.request.urlopen("https://api.github.com/users?since=100").read()
output = json.loads(data)
print (output)

I have found this to be the easiest and most efficient way to get JSON from a webpage when using Python 3:

import json,urllib.request
data = urllib.request.urlopen("https://api.github.com/users?since=100").read()
output = json.loads(data)
print (output)

回答 4

调用的所有urlopen()操作(根据docs)都返回一个类似文件的对象。一旦有了它,就需要调用其read()方法来在网络上实际提取JSON数据。

就像是:

jsonurl = urlopen(url)

text = json.loads(jsonurl.read())
print text

All that the call to urlopen() does (according to the docs) is return a file-like object. Once you have that, you need to call its read() method to actually pull the JSON data across the network.

Something like:

jsonurl = urlopen(url)

text = json.loads(jsonurl.read())
print text

回答 5

在Python 2中,可以使用json.load()代替json.loads()

import json
import urllib

url = 'https://api.github.com/users?since=100'
output = json.load(urllib.urlopen(url))
print(output)

不幸的是,这在Python 3中不起作用。json.load只是json.loads的包装,它为类似文件的对象调用read()。json.loads需要一个字符串对象,而urllib.urlopen(url).read()的输出是一个字节对象。因此,必须获取文件编码才能使其在Python 3中工作。

在此示例中,我们查询标头以获取编码,如果没有得到编码,则回退到utf-8。标头对象在Python 2和3之间是不同的,因此必须以不同的方式完成。使用请求可以避免所有这些情况,但是有时您需要坚持使用标准库。

import json
from six.moves.urllib.request import urlopen

DEFAULT_ENCODING = 'utf-8'
url = 'https://api.github.com/users?since=100'
urlResponse = urlopen(url)

if hasattr(urlResponse.headers, 'get_content_charset'):
    encoding = urlResponse.headers.get_content_charset(DEFAULT_ENCODING)
else:
    encoding = urlResponse.headers.getparam('charset') or DEFAULT_ENCODING

output = json.loads(urlResponse.read().decode(encoding))
print(output)

In Python 2, json.load() will work instead of json.loads()

import json
import urllib

url = 'https://api.github.com/users?since=100'
output = json.load(urllib.urlopen(url))
print(output)

Unfortunately, that doesn’t work in Python 3. json.load is just a wrapper around json.loads that calls read() for a file-like object. json.loads requires a string object and the output of urllib.urlopen(url).read() is a bytes object. So one has to get the file encoding in order to make it work in Python 3.

In this example we query the headers for the encoding and fall back to utf-8 if we don’t get one. The headers object is different between Python 2 and 3 so it has to be done different ways. Using requests would avoid all this, but sometimes you need to stick to the standard library.

import json
from six.moves.urllib.request import urlopen

DEFAULT_ENCODING = 'utf-8'
url = 'https://api.github.com/users?since=100'
urlResponse = urlopen(url)

if hasattr(urlResponse.headers, 'get_content_charset'):
    encoding = urlResponse.headers.get_content_charset(DEFAULT_ENCODING)
else:
    encoding = urlResponse.headers.getparam('charset') or DEFAULT_ENCODING

output = json.loads(urlResponse.read().decode(encoding))
print(output)

回答 6

无需使用额外的库来解析json …

json.loads()返回字典

所以就你而言 text["someValueKey"]

There’s no need to use an extra library to parse the json…

json.loads() returns a dictionary.

So in your case, just do text["someValueKey"]


回答 7

答案较晚,但python>=3.6您可以使用:

import dload
j = dload.json(url)

安装dload方式:

pip3 install dload

Late answer, but for python>=3.6 you can use:

import dload
j = dload.json(url)

Install dload with:

pip3 install dload

回答 8

您可以使用json.dumps

import json

# Hier comes you received data

data = json.dumps(response)

print(data)

为了加载json并将其写入文件,以下代码很有用:

data = json.loads(json.dumps(Response, sort_keys=False, indent=4))
with open('data.json', 'w') as outfile:
json.dump(data, outfile, sort_keys=False, indent=4)

you can use json.dumps:

import json

# Hier comes you received data

data = json.dumps(response)

print(data)

for loading json and write it on file the following code is useful:

data = json.loads(json.dumps(Response, sort_keys=False, indent=4))
with open('data.json', 'w') as outfile:
json.dump(data, outfile, sort_keys=False, indent=4)

如何将SqlAlchemy结果序列化为JSON?

问题:如何将SqlAlchemy结果序列化为JSON?

Django有一些很好的自动ORM模型从数据库返回到JSON格式的自动序列化。

如何将SQLAlchemy查询结果序列化为JSON格式?

我试过了,jsonpickle.encode但是它编码查询对象本身。我试过了json.dumps(items)但是回来了

TypeError: <Product('3', 'some name', 'some desc')> is not JSON serializable

将SQLAlchemy ORM对象序列化为JSON / XML真的很难吗?没有默认的序列化程序吗?如今,序列化ORM查询结果是非常常见的任务。

我需要的只是返回SQLAlchemy查询结果的JSON或XML数据表示形式。

javascript datagird(JQGrid http://www.trirand.com/blog/)中需要使用JSON / XML格式的SQLAlchemy对象查询结果

Django has some good automatic serialization of ORM models returned from DB to JSON format.

How to serialize SQLAlchemy query result to JSON format?

I tried jsonpickle.encode but it encodes query object itself. I tried json.dumps(items) but it returns

TypeError: <Product('3', 'some name', 'some desc')> is not JSON serializable

Is it really so hard to serialize SQLAlchemy ORM objects to JSON /XML? Isn’t there any default serializer for it? It’s very common task to serialize ORM query results nowadays.

What I need is just to return JSON or XML data representation of SQLAlchemy query result.

SQLAlchemy objects query result in JSON/XML format is needed to be used in javascript datagird (JQGrid http://www.trirand.com/blog/)


回答 0

平面实施

您可以使用如下形式:

from sqlalchemy.ext.declarative import DeclarativeMeta

class AlchemyEncoder(json.JSONEncoder):

    def default(self, obj):
        if isinstance(obj.__class__, DeclarativeMeta):
            # an SQLAlchemy class
            fields = {}
            for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
                data = obj.__getattribute__(field)
                try:
                    json.dumps(data) # this will fail on non-encodable values, like other classes
                    fields[field] = data
                except TypeError:
                    fields[field] = None
            # a json-encodable dict
            return fields

        return json.JSONEncoder.default(self, obj)

然后使用以下命令转换为JSON:

c = YourAlchemyClass()
print json.dumps(c, cls=AlchemyEncoder)

它将忽略不可编码的字段(将它们设置为“无”)。

它不会自动扩展关系(因为这可能导致自我引用,并永远循环)。

递归的非循环实现

但是,如果您希望永远循环,则可以使用:

from sqlalchemy.ext.declarative import DeclarativeMeta

def new_alchemy_encoder():
    _visited_objs = []

    class AlchemyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj.__class__, DeclarativeMeta):
                # don't re-visit self
                if obj in _visited_objs:
                    return None
                _visited_objs.append(obj)

                # an SQLAlchemy class
                fields = {}
                for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
                    fields[field] = obj.__getattribute__(field)
                # a json-encodable dict
                return fields

            return json.JSONEncoder.default(self, obj)

    return AlchemyEncoder

然后使用以下代码编码对象:

print json.dumps(e, cls=new_alchemy_encoder(), check_circular=False)

这将对所有子项,所有子项以及所有子项进行编码。基本上,可能对整个数据库进行编码。当它到达之前已编码的内容时,会将其编码为“无”。

递归(可能是循环的)选择性实现

另一种可能更好的选择是能够指定要扩展的字段:

def new_alchemy_encoder(revisit_self = False, fields_to_expand = []):
    _visited_objs = []

    class AlchemyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj.__class__, DeclarativeMeta):
                # don't re-visit self
                if revisit_self:
                    if obj in _visited_objs:
                        return None
                    _visited_objs.append(obj)

                # go through each field in this SQLalchemy class
                fields = {}
                for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
                    val = obj.__getattribute__(field)

                    # is this field another SQLalchemy object, or a list of SQLalchemy objects?
                    if isinstance(val.__class__, DeclarativeMeta) or (isinstance(val, list) and len(val) > 0 and isinstance(val[0].__class__, DeclarativeMeta)):
                        # unless we're expanding this field, stop here
                        if field not in fields_to_expand:
                            # not expanding this field: set it to None and continue
                            fields[field] = None
                            continue

                    fields[field] = val
                # a json-encodable dict
                return fields

            return json.JSONEncoder.default(self, obj)

    return AlchemyEncoder

您现在可以通过以下方式调用它:

print json.dumps(e, cls=new_alchemy_encoder(False, ['parents']), check_circular=False)

例如,仅扩展名为“父母”的SQLAlchemy字段。

A flat implementation

You could use something like this:

from sqlalchemy.ext.declarative import DeclarativeMeta

class AlchemyEncoder(json.JSONEncoder):

    def default(self, obj):
        if isinstance(obj.__class__, DeclarativeMeta):
            # an SQLAlchemy class
            fields = {}
            for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
                data = obj.__getattribute__(field)
                try:
                    json.dumps(data) # this will fail on non-encodable values, like other classes
                    fields[field] = data
                except TypeError:
                    fields[field] = None
            # a json-encodable dict
            return fields

        return json.JSONEncoder.default(self, obj)

and then convert to JSON using:

c = YourAlchemyClass()
print json.dumps(c, cls=AlchemyEncoder)

It will ignore fields that are not encodable (set them to ‘None’).

It doesn’t auto-expand relations (since this could lead to self-references, and loop forever).

A recursive, non-circular implementation

If, however, you’d rather loop forever, you could use:

from sqlalchemy.ext.declarative import DeclarativeMeta

def new_alchemy_encoder():
    _visited_objs = []

    class AlchemyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj.__class__, DeclarativeMeta):
                # don't re-visit self
                if obj in _visited_objs:
                    return None
                _visited_objs.append(obj)

                # an SQLAlchemy class
                fields = {}
                for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
                    fields[field] = obj.__getattribute__(field)
                # a json-encodable dict
                return fields

            return json.JSONEncoder.default(self, obj)

    return AlchemyEncoder

And then encode objects using:

print json.dumps(e, cls=new_alchemy_encoder(), check_circular=False)

This would encode all children, and all their children, and all their children… Potentially encode your entire database, basically. When it reaches something its encoded before, it will encode it as ‘None’.

A recursive, possibly-circular, selective implementation

Another alternative, probably better, is to be able to specify the fields you want to expand:

def new_alchemy_encoder(revisit_self = False, fields_to_expand = []):
    _visited_objs = []

    class AlchemyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj.__class__, DeclarativeMeta):
                # don't re-visit self
                if revisit_self:
                    if obj in _visited_objs:
                        return None
                    _visited_objs.append(obj)

                # go through each field in this SQLalchemy class
                fields = {}
                for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
                    val = obj.__getattribute__(field)

                    # is this field another SQLalchemy object, or a list of SQLalchemy objects?
                    if isinstance(val.__class__, DeclarativeMeta) or (isinstance(val, list) and len(val) > 0 and isinstance(val[0].__class__, DeclarativeMeta)):
                        # unless we're expanding this field, stop here
                        if field not in fields_to_expand:
                            # not expanding this field: set it to None and continue
                            fields[field] = None
                            continue

                    fields[field] = val
                # a json-encodable dict
                return fields

            return json.JSONEncoder.default(self, obj)

    return AlchemyEncoder

You can now call it with:

print json.dumps(e, cls=new_alchemy_encoder(False, ['parents']), check_circular=False)

To only expand SQLAlchemy fields called ‘parents’, for example.


回答 1

您可以将对象输出为字典:

class User:
   def as_dict(self):
       return {c.name: getattr(self, c.name) for c in self.__table__.columns}

然后你用 User.as_dict()序列化对象。

将sqlalchemy行对象转换为python dict中所述

You could just output your object as a dictionary:

class User:
   def as_dict(self):
       return {c.name: getattr(self, c.name) for c in self.__table__.columns}

And then you use User.as_dict() to serialize your object.

As explained in Convert sqlalchemy row object to python dict


回答 2

您可以将RowProxy转换为这样的字典:

 d = dict(row.items())

然后将其序列化为JSON(您将必须为datetime值之类的东西指定编码器),如果您只想要一个记录(而不是相关记录的完整层次结构),这并不难。

json.dumps([(dict(row.items())) for row in rs])

You can convert a RowProxy to a dict like this:

 d = dict(row.items())

Then serialize that to JSON ( you will have to specify an encoder for things like datetime values ) It’s not that hard if you just want one record ( and not a full hierarchy of related records ).

json.dumps([(dict(row.items())) for row in rs])

回答 3

我建议使用棉花糖。它允许您创建序列化器来表示模型实例,并支持关系和嵌套对象。

这是他们文档中的截断示例。采取ORM模型Author

class Author(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    first = db.Column(db.String(80))
    last = db.Column(db.String(80))

该类的棉花糖架构如下所示:

class AuthorSchema(Schema):
    id = fields.Int(dump_only=True)
    first = fields.Str()
    last = fields.Str()
    formatted_name = fields.Method("format_name", dump_only=True)

    def format_name(self, author):
        return "{}, {}".format(author.last, author.first)

…并像这样使用:

author_schema = AuthorSchema()
author_schema.dump(Author.query.first())

…将产生如下输出:

{
        "first": "Tim",
        "formatted_name": "Peters, Tim",
        "id": 1,
        "last": "Peters"
}

看看他们完整的Flask-SQLAlchemy示例

一个名为的库marshmallow-sqlalchemy专门集成了SQLAlchemy和棉花糖。在该库中,上述Author模型的架构如下所示:

class AuthorSchema(ModelSchema):
    class Meta:
        model = Author

集成允许从SQLAlchemy Column类型推断字段类型。

棉花糖-sqlalchemy在这里。

I recommend using marshmallow. It allows you to create serializers to represent your model instances with support to relations and nested objects.

Here is a truncated example from their docs. Take the ORM model, Author:

class Author(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    first = db.Column(db.String(80))
    last = db.Column(db.String(80))

A marshmallow schema for that class is constructed like this:

class AuthorSchema(Schema):
    id = fields.Int(dump_only=True)
    first = fields.Str()
    last = fields.Str()
    formatted_name = fields.Method("format_name", dump_only=True)

    def format_name(self, author):
        return "{}, {}".format(author.last, author.first)

…and used like this:

author_schema = AuthorSchema()
author_schema.dump(Author.query.first())

…would produce an output like this:

{
        "first": "Tim",
        "formatted_name": "Peters, Tim",
        "id": 1,
        "last": "Peters"
}

Have a look at their full Flask-SQLAlchemy Example.

A library called marshmallow-sqlalchemy specifically integrates SQLAlchemy and marshmallow. In that library, the schema for the Author model described above looks like this:

class AuthorSchema(ModelSchema):
    class Meta:
        model = Author

The integration allows the field types to be inferred from the SQLAlchemy Column types.

marshmallow-sqlalchemy here.


回答 4

Python的3.7+和瓶1.1+可以使用内置的数据类

from dataclasses import dataclass
from datetime import datetime
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy


app = Flask(__name__)
db = SQLAlchemy(app)


@dataclass
class User(db.Model):
  id: int
  email: str

  id = db.Column(db.Integer, primary_key=True, auto_increment=True)
  email = db.Column(db.String(200), unique=True)


@app.route('/users/')
def users():
  users = User.query.all()
  return jsonify(users)  


if __name__ == "__main__":
  users = User(email="user1@gmail.com"), User(email="user2@gmail.com")
  db.create_all()
  db.session.add_all(users)
  db.session.commit()
  app.run()

/users/现在,该路线将返回用户列表。

[
  {"email": "user1@gmail.com", "id": 1},
  {"email": "user2@gmail.com", "id": 2}
]

自动序列化相关模型

@dataclass
class Account(db.Model):
  id: int
  users: User

  id = db.Column(db.Integer)
  users = db.relationship(User)  # User model would need a db.ForeignKey field

来自的回应jsonify(account)就是这样。

{  
   "id":1,
   "users":[  
      {  
         "email":"user1@gmail.com",
         "id":1
      },
      {  
         "email":"user2@gmail.com",
         "id":2
      }
   ]
}

覆盖默认的JSON编码器

from flask.json import JSONEncoder


class CustomJSONEncoder(JSONEncoder):
  "Add support for serializing timedeltas"

  def default(o):
    if type(o) == datetime.timedelta:
      return str(o)
    elif type(o) == datetime.datetime:
      return o.isoformat()
    else:
      return super().default(o)

app.json_encoder = CustomJSONEncoder      

Python 3.7+ and Flask 1.1+ can use the built-in dataclasses package

from dataclasses import dataclass
from datetime import datetime
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy


app = Flask(__name__)
db = SQLAlchemy(app)


@dataclass
class User(db.Model):
  id: int
  email: str

  id = db.Column(db.Integer, primary_key=True, auto_increment=True)
  email = db.Column(db.String(200), unique=True)


@app.route('/users/')
def users():
  users = User.query.all()
  return jsonify(users)  


if __name__ == "__main__":
  users = User(email="user1@gmail.com"), User(email="user2@gmail.com")
  db.create_all()
  db.session.add_all(users)
  db.session.commit()
  app.run()

The /users/ route will now return a list of users.

[
  {"email": "user1@gmail.com", "id": 1},
  {"email": "user2@gmail.com", "id": 2}
]

Auto-serialize related models

@dataclass
class Account(db.Model):
  id: int
  users: User

  id = db.Column(db.Integer)
  users = db.relationship(User)  # User model would need a db.ForeignKey field

The response from jsonify(account) would be this.

{  
   "id":1,
   "users":[  
      {  
         "email":"user1@gmail.com",
         "id":1
      },
      {  
         "email":"user2@gmail.com",
         "id":2
      }
   ]
}

Overwrite the default JSON Encoder

from flask.json import JSONEncoder


class CustomJSONEncoder(JSONEncoder):
  "Add support for serializing timedeltas"

  def default(o):
    if type(o) == datetime.timedelta:
      return str(o)
    elif type(o) == datetime.datetime:
      return o.isoformat()
    else:
      return super().default(o)

app.json_encoder = CustomJSONEncoder      

回答 5

烧瓶JsonTools包装具有实现JsonSerializableBase为您的模型提供 Base类的实现。

用法:

from sqlalchemy.ext.declarative import declarative_base
from flask.ext.jsontools import JsonSerializableBase

Base = declarative_base(cls=(JsonSerializableBase,))

class User(Base):
    #...

现在 User模型可以神奇地序列化了。

如果您的框架不是Flask,则只需获取代码

Flask-JsonTools package has an implementation of JsonSerializableBase Base class for your models.

Usage:

from sqlalchemy.ext.declarative import declarative_base
from flask.ext.jsontools import JsonSerializableBase

Base = declarative_base(cls=(JsonSerializableBase,))

class User(Base):
    #...

Now the User model is magically serializable.

If your framework is not Flask, you can just grab the code


回答 6

出于安全原因,您永远不要返回模型的所有字段。我喜欢有选择地选择它们。

现在编码瓶的JSON支持UUID,日期时间和关系数据(以及添加queryquery_class对flask_sqlalchemy db.Model类)。我已更新编码器,如下所示:

app / json_encoder.py

    from sqlalchemy.ext.declarative import DeclarativeMeta
    from flask import json


    class AlchemyEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o.__class__, DeclarativeMeta):
                data = {}
                fields = o.__json__() if hasattr(o, '__json__') else dir(o)
                for field in [f for f in fields if not f.startswith('_') and f not in ['metadata', 'query', 'query_class']]:
                    value = o.__getattribute__(field)
                    try:
                        json.dumps(value)
                        data[field] = value
                    except TypeError:
                        data[field] = None
                return data
            return json.JSONEncoder.default(self, o)

app/__init__.py

# json encoding
from app.json_encoder import AlchemyEncoder
app.json_encoder = AlchemyEncoder

这样,我可以选择添加一个__json__属性,该属性返回我想编码的字段列表:

app/models.py

class Queue(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    song_id = db.Column(db.Integer, db.ForeignKey('song.id'), unique=True, nullable=False)
    song = db.relationship('Song', lazy='joined')
    type = db.Column(db.String(20), server_default=u'audio/mpeg')
    src = db.Column(db.String(255), nullable=False)
    created_at = db.Column(db.DateTime, server_default=db.func.now())
    updated_at = db.Column(db.DateTime, server_default=db.func.now(), onupdate=db.func.now())

    def __init__(self, song):
        self.song = song
        self.src = song.full_path

    def __json__(self):
        return ['song', 'src', 'type', 'created_at']

我将@jsonapi添加到视图中,返回结果列表,然后输出如下:

[

{

    "created_at": "Thu, 23 Jul 2015 11:36:53 GMT",
    "song": 

        {
            "full_path": "/static/music/Audioslave/Audioslave [2002]/1 Cochise.mp3",
            "id": 2,
            "path_name": "Audioslave/Audioslave [2002]/1 Cochise.mp3"
        },
    "src": "/static/music/Audioslave/Audioslave [2002]/1 Cochise.mp3",
    "type": "audio/mpeg"
}

]

For security reasons you should never return all the model’s fields. I prefer to selectively choose them.

Flask’s json encoding now supports UUID, datetime and relationships (and added query and query_class for flask_sqlalchemy db.Model class). I’ve updated the encoder as follows:

app/json_encoder.py

    from sqlalchemy.ext.declarative import DeclarativeMeta
    from flask import json


    class AlchemyEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o.__class__, DeclarativeMeta):
                data = {}
                fields = o.__json__() if hasattr(o, '__json__') else dir(o)
                for field in [f for f in fields if not f.startswith('_') and f not in ['metadata', 'query', 'query_class']]:
                    value = o.__getattribute__(field)
                    try:
                        json.dumps(value)
                        data[field] = value
                    except TypeError:
                        data[field] = None
                return data
            return json.JSONEncoder.default(self, o)

app/__init__.py

# json encoding
from app.json_encoder import AlchemyEncoder
app.json_encoder = AlchemyEncoder

With this I can optionally add a __json__ property that returns the list of fields I wish to encode:

app/models.py

class Queue(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    song_id = db.Column(db.Integer, db.ForeignKey('song.id'), unique=True, nullable=False)
    song = db.relationship('Song', lazy='joined')
    type = db.Column(db.String(20), server_default=u'audio/mpeg')
    src = db.Column(db.String(255), nullable=False)
    created_at = db.Column(db.DateTime, server_default=db.func.now())
    updated_at = db.Column(db.DateTime, server_default=db.func.now(), onupdate=db.func.now())

    def __init__(self, song):
        self.song = song
        self.src = song.full_path

    def __json__(self):
        return ['song', 'src', 'type', 'created_at']

I add @jsonapi to my view, return the resultlist and then my output is as follows:

[

{

    "created_at": "Thu, 23 Jul 2015 11:36:53 GMT",
    "song": 

        {
            "full_path": "/static/music/Audioslave/Audioslave [2002]/1 Cochise.mp3",
            "id": 2,
            "path_name": "Audioslave/Audioslave [2002]/1 Cochise.mp3"
        },
    "src": "/static/music/Audioslave/Audioslave [2002]/1 Cochise.mp3",
    "type": "audio/mpeg"
}

]

回答 7

您可以这样使用SqlAlchemy的自省:

mysql = SQLAlchemy()
from sqlalchemy import inspect

class Contacts(mysql.Model):  
    __tablename__ = 'CONTACTS'
    id = mysql.Column(mysql.Integer, primary_key=True)
    first_name = mysql.Column(mysql.String(128), nullable=False)
    last_name = mysql.Column(mysql.String(128), nullable=False)
    phone = mysql.Column(mysql.String(128), nullable=False)
    email = mysql.Column(mysql.String(128), nullable=False)
    street = mysql.Column(mysql.String(128), nullable=False)
    zip_code = mysql.Column(mysql.String(128), nullable=False)
    city = mysql.Column(mysql.String(128), nullable=False)
    def toDict(self):
        return { c.key: getattr(self, c.key) for c in inspect(self).mapper.column_attrs }

@app.route('/contacts',methods=['GET'])
def getContacts():
    contacts = Contacts.query.all()
    contactsArr = []
    for contact in contacts:
        contactsArr.append(contact.toDict()) 
    return jsonify(contactsArr)

@app.route('/contacts/<int:id>',methods=['GET'])
def getContact(id):
    contact = Contacts.query.get(id)
    return jsonify(contact.toDict())

从这里的答案中获得启发: 将sqlalchemy行对象转换为python dict

You can use introspection of SqlAlchemy as this :

mysql = SQLAlchemy()
from sqlalchemy import inspect

class Contacts(mysql.Model):  
    __tablename__ = 'CONTACTS'
    id = mysql.Column(mysql.Integer, primary_key=True)
    first_name = mysql.Column(mysql.String(128), nullable=False)
    last_name = mysql.Column(mysql.String(128), nullable=False)
    phone = mysql.Column(mysql.String(128), nullable=False)
    email = mysql.Column(mysql.String(128), nullable=False)
    street = mysql.Column(mysql.String(128), nullable=False)
    zip_code = mysql.Column(mysql.String(128), nullable=False)
    city = mysql.Column(mysql.String(128), nullable=False)
    def toDict(self):
        return { c.key: getattr(self, c.key) for c in inspect(self).mapper.column_attrs }

@app.route('/contacts',methods=['GET'])
def getContacts():
    contacts = Contacts.query.all()
    contactsArr = []
    for contact in contacts:
        contactsArr.append(contact.toDict()) 
    return jsonify(contactsArr)

@app.route('/contacts/<int:id>',methods=['GET'])
def getContact(id):
    contact = Contacts.query.get(id)
    return jsonify(contact.toDict())

Get inspired from an answer here : Convert sqlalchemy row object to python dict


回答 8

更详细的解释。在模型中,添加:

def as_dict(self):
       return {c.name: str(getattr(self, c.name)) for c in self.__table__.columns}

str()是针对python 3的,因此如果使用python 2则使用unicode()。它应该有助于反序列化日期。如果不处理这些问题,则可以将其删除。

您现在可以像这样查询数据库

some_result = User.query.filter_by(id=current_user.id).first().as_dict()

First()需要避免奇怪的错误。as_dict()现在将反序列化结果。反序列化后,可以将其转换为json

jsonify(some_result)

A more detailed explanation. In your model, add:

def as_dict(self):
       return {c.name: str(getattr(self, c.name)) for c in self.__table__.columns}

The str() is for python 3 so if using python 2 use unicode(). It should help deserialize dates. You can remove it if not dealing with those.

You can now query the database like this

some_result = User.query.filter_by(id=current_user.id).first().as_dict()

First() is needed to avoid weird errors. as_dict() will now deserialize the result. After deserialization, it is ready to be turned to json

jsonify(some_result)

回答 9

它不是那么简单。我写了一些代码来做到这一点。我仍在努力,它使用MochiKit框架。它基本上使用代理和已注册的JSON转换器在Python和Javascript之间转换复合对象。

数据库对象的浏览器端是db.js 它需要在基本的Python代理源的proxy.js

在Python方面,有基本的代理模块。最后是webserver.py中的SqlAlchemy对象编码器。它还取决于在models.py文件中找到的元数据提取器。

It is not so straighforward. I wrote some code to do this. I’m still working on it, and it uses the MochiKit framework. It basically translates compound objects between Python and Javascript using a proxy and registered JSON converters.

Browser side for database objects is db.js It needs the basic Python proxy source in proxy.js.

On the Python side there is the base proxy module. Then finally the SqlAlchemy object encoder in webserver.py. It also depends on metadata extractors found in the models.py file.


回答 10

虽然最初的问题可以回溯一会儿,但这里的答案(以及我的经验)表明,这是一个不平凡的问题,它具有许多不同的方法,并具有不同的权衡取舍。

这就是为什么我构建SQLAthanor库,库扩展了SQLAlchemy的声明性ORM,并提供了您可能想看的可配置序列化/反序列化支持。

该库支持:

  • Python 2.7、3.4、3.5和3.6。
  • SQLAlchemy 0.9版或更高版本
  • 到JSON,CSV,YAML和Python的序列化/反序列化 dict
  • 列/属性,关系,混合属性和关联代理的序列化/反序列化
  • 为特定格式和列/关系/属性启用和禁用序列化(例如,您要支持入站 password值,但不要包括值)
  • 序列化前和序列化后的值处理(用于验证或强制类型转换)
  • 非常简单的语法,既Python风格又与SQLAlchemy自己的方法无缝地保持一致

您可以在此处查看(我希望!)全面的文档: https //sqlathanor.readthedocs.io/en/latest

希望这可以帮助!

While the original question goes back awhile, the number of answers here (and my own experiences) suggest it’s a non-trivial question with a lot of different approaches of varying complexity with different trade-offs.

That’s why I built the SQLAthanor library that extends SQLAlchemy’s declarative ORM with configurable serialization/de-serialization support that you might want to take a look at.

The library supports:

  • Python 2.7, 3.4, 3.5, and 3.6.
  • SQLAlchemy versions 0.9 and higher
  • serialization/de-serialization to/from JSON, CSV, YAML, and Python dict
  • serialization/de-serialization of columns/attributes, relationships, hybrid properties, and association proxies
  • enabling and disabling of serialization for particular formats and columns/relationships/attributes (e.g. you want to support an inbound password value, but never include an outbound one)
  • pre-serialization and post-deserialization value processing (for validation or type coercion)
  • a pretty straightforward syntax that is both Pythonic and seamlessly consistent with SQLAlchemy’s own approach

You can check out the (I hope!) comprehensive docs here: https://sqlathanor.readthedocs.io/en/latest

Hope this helps!


回答 11

自定义序列化和反序列化。

“ from_json”(类方法)基于json数据构建Model对象。

“反序列化”只能在实例上调用,并将json中的所有数据合并到Model实例中。

“序列化” -递归序列化

需要__write_only__属性来定义只写属性(例如“ password_hash”)。

class Serializable(object):
    __exclude__ = ('id',)
    __include__ = ()
    __write_only__ = ()

    @classmethod
    def from_json(cls, json, selfObj=None):
        if selfObj is None:
            self = cls()
        else:
            self = selfObj
        exclude = (cls.__exclude__ or ()) + Serializable.__exclude__
        include = cls.__include__ or ()
        if json:
            for prop, value in json.iteritems():
                # ignore all non user data, e.g. only
                if (not (prop in exclude) | (prop in include)) and isinstance(
                        getattr(cls, prop, None), QueryableAttribute):
                    setattr(self, prop, value)
        return self

    def deserialize(self, json):
        if not json:
            return None
        return self.__class__.from_json(json, selfObj=self)

    @classmethod
    def serialize_list(cls, object_list=[]):
        output = []
        for li in object_list:
            if isinstance(li, Serializable):
                output.append(li.serialize())
            else:
                output.append(li)
        return output

    def serialize(self, **kwargs):

        # init write only props
        if len(getattr(self.__class__, '__write_only__', ())) == 0:
            self.__class__.__write_only__ = ()
        dictionary = {}
        expand = kwargs.get('expand', ()) or ()
        prop = 'props'
        if expand:
            # expand all the fields
            for key in expand:
                getattr(self, key)
        iterable = self.__dict__.items()
        is_custom_property_set = False
        # include only properties passed as parameter
        if (prop in kwargs) and (kwargs.get(prop, None) is not None):
            is_custom_property_set = True
            iterable = kwargs.get(prop, None)
        # loop trough all accessible properties
        for key in iterable:
            accessor = key
            if isinstance(key, tuple):
                accessor = key[0]
            if not (accessor in self.__class__.__write_only__) and not accessor.startswith('_'):
                # force select from db to be able get relationships
                if is_custom_property_set:
                    getattr(self, accessor, None)
                if isinstance(self.__dict__.get(accessor), list):
                    dictionary[accessor] = self.__class__.serialize_list(object_list=self.__dict__.get(accessor))
                # check if those properties are read only
                elif isinstance(self.__dict__.get(accessor), Serializable):
                    dictionary[accessor] = self.__dict__.get(accessor).serialize()
                else:
                    dictionary[accessor] = self.__dict__.get(accessor)
        return dictionary

Custom serialization and deserialization.

“from_json” (class method) builds a Model object based on json data.

“deserialize” could be called only on instance, and merge all data from json into Model instance.

“serialize” – recursive serialization

__write_only__ property is needed to define write only properties (“password_hash” for example).

class Serializable(object):
    __exclude__ = ('id',)
    __include__ = ()
    __write_only__ = ()

    @classmethod
    def from_json(cls, json, selfObj=None):
        if selfObj is None:
            self = cls()
        else:
            self = selfObj
        exclude = (cls.__exclude__ or ()) + Serializable.__exclude__
        include = cls.__include__ or ()
        if json:
            for prop, value in json.iteritems():
                # ignore all non user data, e.g. only
                if (not (prop in exclude) | (prop in include)) and isinstance(
                        getattr(cls, prop, None), QueryableAttribute):
                    setattr(self, prop, value)
        return self

    def deserialize(self, json):
        if not json:
            return None
        return self.__class__.from_json(json, selfObj=self)

    @classmethod
    def serialize_list(cls, object_list=[]):
        output = []
        for li in object_list:
            if isinstance(li, Serializable):
                output.append(li.serialize())
            else:
                output.append(li)
        return output

    def serialize(self, **kwargs):

        # init write only props
        if len(getattr(self.__class__, '__write_only__', ())) == 0:
            self.__class__.__write_only__ = ()
        dictionary = {}
        expand = kwargs.get('expand', ()) or ()
        prop = 'props'
        if expand:
            # expand all the fields
            for key in expand:
                getattr(self, key)
        iterable = self.__dict__.items()
        is_custom_property_set = False
        # include only properties passed as parameter
        if (prop in kwargs) and (kwargs.get(prop, None) is not None):
            is_custom_property_set = True
            iterable = kwargs.get(prop, None)
        # loop trough all accessible properties
        for key in iterable:
            accessor = key
            if isinstance(key, tuple):
                accessor = key[0]
            if not (accessor in self.__class__.__write_only__) and not accessor.startswith('_'):
                # force select from db to be able get relationships
                if is_custom_property_set:
                    getattr(self, accessor, None)
                if isinstance(self.__dict__.get(accessor), list):
                    dictionary[accessor] = self.__class__.serialize_list(object_list=self.__dict__.get(accessor))
                # check if those properties are read only
                elif isinstance(self.__dict__.get(accessor), Serializable):
                    dictionary[accessor] = self.__dict__.get(accessor).serialize()
                else:
                    dictionary[accessor] = self.__dict__.get(accessor)
        return dictionary

回答 12

这是一个解决方案,可让您选择想要包含在输出中的关系。注意:这是一个完整的重写,将dict / str作为arg而不是列表。修复一些东西。

def deep_dict(self, relations={}):
    """Output a dict of an SA object recursing as deep as you want.

    Takes one argument, relations which is a dictionary of relations we'd
    like to pull out. The relations dict items can be a single relation
    name or deeper relation names connected by sub dicts

    Example:
        Say we have a Person object with a family relationship
            person.deep_dict(relations={'family':None})
        Say the family object has homes as a relation then we can do
            person.deep_dict(relations={'family':{'homes':None}})
            OR
            person.deep_dict(relations={'family':'homes'})
        Say homes has a relation like rooms you can do
            person.deep_dict(relations={'family':{'homes':'rooms'}})
            and so on...
    """
    mydict =  dict((c, str(a)) for c, a in
                    self.__dict__.items() if c != '_sa_instance_state')
    if not relations:
        # just return ourselves
        return mydict

    # otherwise we need to go deeper
    if not isinstance(relations, dict) and not isinstance(relations, str):
        raise Exception("relations should be a dict, it is of type {}".format(type(relations)))

    # got here so check and handle if we were passed a dict
    if isinstance(relations, dict):
        # we were passed deeper info
        for left, right in relations.items():
            myrel = getattr(self, left)
            if isinstance(myrel, list):
                mydict[left] = [rel.deep_dict(relations=right) for rel in myrel]
            else:
                mydict[left] = myrel.deep_dict(relations=right)
    # if we get here check and handle if we were passed a string
    elif isinstance(relations, str):
        # passed a single item
        myrel = getattr(self, relations)
        left = relations
        if isinstance(myrel, list):
            mydict[left] = [rel.deep_dict(relations=None)
                                 for rel in myrel]
        else:
            mydict[left] = myrel.deep_dict(relations=None)

    return mydict

因此,例如使用人员/家庭/房屋/房间的示例…将其转换为json,您所需要做的就是

json.dumps(person.deep_dict(relations={'family':{'homes':'rooms'}}))

Here is a solution that lets you select the relations you want to include in your output as deep as you would like to go. NOTE: This is a complete re-write taking a dict/str as an arg rather than a list. fixes some stuff..

def deep_dict(self, relations={}):
    """Output a dict of an SA object recursing as deep as you want.

    Takes one argument, relations which is a dictionary of relations we'd
    like to pull out. The relations dict items can be a single relation
    name or deeper relation names connected by sub dicts

    Example:
        Say we have a Person object with a family relationship
            person.deep_dict(relations={'family':None})
        Say the family object has homes as a relation then we can do
            person.deep_dict(relations={'family':{'homes':None}})
            OR
            person.deep_dict(relations={'family':'homes'})
        Say homes has a relation like rooms you can do
            person.deep_dict(relations={'family':{'homes':'rooms'}})
            and so on...
    """
    mydict =  dict((c, str(a)) for c, a in
                    self.__dict__.items() if c != '_sa_instance_state')
    if not relations:
        # just return ourselves
        return mydict

    # otherwise we need to go deeper
    if not isinstance(relations, dict) and not isinstance(relations, str):
        raise Exception("relations should be a dict, it is of type {}".format(type(relations)))

    # got here so check and handle if we were passed a dict
    if isinstance(relations, dict):
        # we were passed deeper info
        for left, right in relations.items():
            myrel = getattr(self, left)
            if isinstance(myrel, list):
                mydict[left] = [rel.deep_dict(relations=right) for rel in myrel]
            else:
                mydict[left] = myrel.deep_dict(relations=right)
    # if we get here check and handle if we were passed a string
    elif isinstance(relations, str):
        # passed a single item
        myrel = getattr(self, relations)
        left = relations
        if isinstance(myrel, list):
            mydict[left] = [rel.deep_dict(relations=None)
                                 for rel in myrel]
        else:
            mydict[left] = myrel.deep_dict(relations=None)

    return mydict

so for an example using person/family/homes/rooms… turning it into json all you need is

json.dumps(person.deep_dict(relations={'family':{'homes':'rooms'}}))

回答 13

def alc2json(row):
    return dict([(col, str(getattr(row,col))) for col in row.__table__.columns.keys()])

我以为我会和这个打一点代码高尔夫。

仅供参考:我正在使用automap_base因为我们有根据业务需求单独设计的架构。我今天才刚开始使用SQLAlchemy,但是文档指出automap_base是对clarativeative_base的扩展,这似乎是SQLAlchemy ORM中的典型范例,因此我认为这应该可行。

按照Tjorriemorrie的解决方案,使用后跟的外键并不太花哨,但是它只是将列与值匹配,并通过str()-列的值来处理Python类型。我们的值包括Python datetime.time和decimal.Decimal类类型的结果,因此可以完成工作。

希望这对任何路人都有帮助!

def alc2json(row):
    return dict([(col, str(getattr(row,col))) for col in row.__table__.columns.keys()])

I thought I’d play a little code golf with this one.

FYI: I am using automap_base since we have a separately designed schema according to business requirements. I just started using SQLAlchemy today but the documentation states that automap_base is an extension to declarative_base which seems to be the typical paradigm in the SQLAlchemy ORM so I believe this should work.

It does not get fancy with following foreign keys per Tjorriemorrie‘s solution, but it simply matches columns to values and handles Python types by str()-ing the column values. Our values consist Python datetime.time and decimal.Decimal class type results so it gets the job done.

Hope this helps any passers-by!


回答 14

我知道这是一个比较老的帖子。我接受了@SashaB提供的解决方案,并根据需要进行了修改。

我添加了以下内容:

  1. 字段忽略列表:序列化时要忽略的字段列表
  2. 字段替换列表:字典,其中包含要在序列化时用值替换的字段名称。
  3. 删除了方法并使BaseQuery序列化

我的代码如下:

def alchemy_json_encoder(revisit_self = False, fields_to_expand = [], fields_to_ignore = [], fields_to_replace = {}):
   """
   Serialize SQLAlchemy result into JSon
   :param revisit_self: True / False
   :param fields_to_expand: Fields which are to be expanded for including their children and all
   :param fields_to_ignore: Fields to be ignored while encoding
   :param fields_to_replace: Field keys to be replaced by values assigned in dictionary
   :return: Json serialized SQLAlchemy object
   """
   _visited_objs = []
   class AlchemyEncoder(json.JSONEncoder):
      def default(self, obj):
        if isinstance(obj.__class__, DeclarativeMeta):
            # don't re-visit self
            if revisit_self:
                if obj in _visited_objs:
                    return None
                _visited_objs.append(obj)

            # go through each field in this SQLalchemy class
            fields = {}
            for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata' and x not in fields_to_ignore]:
                val = obj.__getattribute__(field)
                # is this field method defination, or an SQLalchemy object
                if not hasattr(val, "__call__") and not isinstance(val, BaseQuery):
                    field_name = fields_to_replace[field] if field in fields_to_replace else field
                    # is this field another SQLalchemy object, or a list of SQLalchemy objects?
                    if isinstance(val.__class__, DeclarativeMeta) or \
                            (isinstance(val, list) and len(val) > 0 and isinstance(val[0].__class__, DeclarativeMeta)):
                        # unless we're expanding this field, stop here
                        if field not in fields_to_expand:
                            # not expanding this field: set it to None and continue
                            fields[field_name] = None
                            continue

                    fields[field_name] = val
            # a json-encodable dict
            return fields

        return json.JSONEncoder.default(self, obj)
   return AlchemyEncoder

希望它能对某人有所帮助!

I know this is quite an older post. I took solution given by @SashaB and modified as per my need.

I added following things to it:

  1. Field ignore list: A list of fields to be ignored while serializing
  2. Field replace list: A dictionary containing field names to be replaced by values while serializing.
  3. Removed methods and BaseQuery getting serialized

My code is as follows:

def alchemy_json_encoder(revisit_self = False, fields_to_expand = [], fields_to_ignore = [], fields_to_replace = {}):
   """
   Serialize SQLAlchemy result into JSon
   :param revisit_self: True / False
   :param fields_to_expand: Fields which are to be expanded for including their children and all
   :param fields_to_ignore: Fields to be ignored while encoding
   :param fields_to_replace: Field keys to be replaced by values assigned in dictionary
   :return: Json serialized SQLAlchemy object
   """
   _visited_objs = []
   class AlchemyEncoder(json.JSONEncoder):
      def default(self, obj):
        if isinstance(obj.__class__, DeclarativeMeta):
            # don't re-visit self
            if revisit_self:
                if obj in _visited_objs:
                    return None
                _visited_objs.append(obj)

            # go through each field in this SQLalchemy class
            fields = {}
            for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata' and x not in fields_to_ignore]:
                val = obj.__getattribute__(field)
                # is this field method defination, or an SQLalchemy object
                if not hasattr(val, "__call__") and not isinstance(val, BaseQuery):
                    field_name = fields_to_replace[field] if field in fields_to_replace else field
                    # is this field another SQLalchemy object, or a list of SQLalchemy objects?
                    if isinstance(val.__class__, DeclarativeMeta) or \
                            (isinstance(val, list) and len(val) > 0 and isinstance(val[0].__class__, DeclarativeMeta)):
                        # unless we're expanding this field, stop here
                        if field not in fields_to_expand:
                            # not expanding this field: set it to None and continue
                            fields[field_name] = None
                            continue

                    fields[field_name] = val
            # a json-encodable dict
            return fields

        return json.JSONEncoder.default(self, obj)
   return AlchemyEncoder

Hope it helps someone!


回答 15

使用SQLAlchemy中的内置序列化器

from sqlalchemy.ext.serializer import loads, dumps
obj = MyAlchemyObject()
# serialize object
serialized_obj = dumps(obj)

# deserialize object
obj = loads(serialized_obj)

如果要在会话之间转移对象,请记住使用将该对象与当前会话分离session.expunge(obj)。要再次附加它,只需执行即可session.add(obj)

Use the built-in serializer in SQLAlchemy:

from sqlalchemy.ext.serializer import loads, dumps
obj = MyAlchemyObject()
# serialize object
serialized_obj = dumps(obj)

# deserialize object
obj = loads(serialized_obj)

If you’re transferring the object between sessions, remember to detach the object from the current session using session.expunge(obj). To attach it again, just do session.add(obj).


回答 16

以下代码会将sqlalchemy结果序列化为json。

import json
from collections import OrderedDict


def asdict(self):
    result = OrderedDict()
    for key in self.__mapper__.c.keys():
        if getattr(self, key) is not None:
            result[key] = str(getattr(self, key))
        else:
            result[key] = getattr(self, key)
    return result


def to_array(all_vendors):
    v = [ ven.asdict() for ven in all_vendors ]
    return json.dumps(v) 

叫乐,

def all_products():
    all_products = Products.query.all()
    return to_array(all_products)

following code will serialize sqlalchemy result to json.

import json
from collections import OrderedDict


def asdict(self):
    result = OrderedDict()
    for key in self.__mapper__.c.keys():
        if getattr(self, key) is not None:
            result[key] = str(getattr(self, key))
        else:
            result[key] = getattr(self, key)
    return result


def to_array(all_vendors):
    v = [ ven.asdict() for ven in all_vendors ]
    return json.dumps(v) 

Calling fun,

def all_products():
    all_products = Products.query.all()
    return to_array(all_products)

回答 17

AlchemyEncoder很棒,但有时会失败,并使用十进制值。这是解决小数点问题的改进编码器-

class AlchemyEncoder(json.JSONEncoder):
# To serialize SQLalchemy objects 
def default(self, obj):
    if isinstance(obj.__class__, DeclarativeMeta):
        model_fields = {}
        for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
            data = obj.__getattribute__(field)
            print data
            try:
                json.dumps(data)  # this will fail on non-encodable values, like other classes
                model_fields[field] = data
            except TypeError:
                model_fields[field] = None
        return model_fields
    if isinstance(obj, Decimal):
        return float(obj)
    return json.JSONEncoder.default(self, obj)

The AlchemyEncoder is wonderful but sometimes fails with Decimal values. Here is an improved encoder that solves the decimal problem –

class AlchemyEncoder(json.JSONEncoder):
# To serialize SQLalchemy objects 
def default(self, obj):
    if isinstance(obj.__class__, DeclarativeMeta):
        model_fields = {}
        for field in [x for x in dir(obj) if not x.startswith('_') and x != 'metadata']:
            data = obj.__getattribute__(field)
            print data
            try:
                json.dumps(data)  # this will fail on non-encodable values, like other classes
                model_fields[field] = data
            except TypeError:
                model_fields[field] = None
        return model_fields
    if isinstance(obj, Decimal):
        return float(obj)
    return json.JSONEncoder.default(self, obj)

回答 18

当使用sqlalchemy连接到数据库时,这是一个高度可配置的简单解决方案。使用大熊猫。

import pandas as pd
import sqlalchemy

#sqlalchemy engine configuration
engine = sqlalchemy.create_engine....

def my_function():
  #read in from sql directly into a pandas dataframe
  #check the pandas documentation for additional config options
  sql_DF = pd.read_sql_table("table_name", con=engine)

  # "orient" is optional here but allows you to specify the json formatting you require
  sql_json = sql_DF.to_json(orient="index")

  return sql_json

When using sqlalchemy to connect to a db I this is a simple solution which is highly configurable. Use pandas.

import pandas as pd
import sqlalchemy

#sqlalchemy engine configuration
engine = sqlalchemy.create_engine....

def my_function():
  #read in from sql directly into a pandas dataframe
  #check the pandas documentation for additional config options
  sql_DF = pd.read_sql_table("table_name", con=engine)

  # "orient" is optional here but allows you to specify the json formatting you require
  sql_json = sql_DF.to_json(orient="index")

  return sql_json


回答 19

在Flask下,这可以工作并处理datatime字段,将类型的字段转换
'time': datetime.datetime(2018, 3, 22, 15, 40)
"time": "2018-03-22 15:40:00"

obj = {c.name: str(getattr(self, c.name)) for c in self.__table__.columns}

# This to get the JSON body
return json.dumps(obj)

# Or this to get a response object
return jsonify(obj)

Under Flask, this works and handles datatime fields, transforming a field of type
'time': datetime.datetime(2018, 3, 22, 15, 40) into
"time": "2018-03-22 15:40:00":

obj = {c.name: str(getattr(self, c.name)) for c in self.__table__.columns}

# This to get the JSON body
return json.dumps(obj)

# Or this to get a response object
return jsonify(obj)

回答 20

带有utf-8的内置串行器扼流圈无法解码某些输入的无效起始字节。相反,我去了:

def row_to_dict(row):
    temp = row.__dict__
    temp.pop('_sa_instance_state', None)
    return temp


def rows_to_list(rows):
    ret_rows = []
    for row in rows:
        ret_rows.append(row_to_dict(row))
    return ret_rows


@website_blueprint.route('/api/v1/some/endpoint', methods=['GET'])
def some_api():
    '''
    /some_endpoint
    '''
    rows = rows_to_list(SomeModel.query.all())
    response = app.response_class(
        response=jsonplus.dumps(rows),
        status=200,
        mimetype='application/json'
    )
    return response

The built in serializer chokes with utf-8 cannot decode invalid start byte for some inputs. Instead, I went with:

def row_to_dict(row):
    temp = row.__dict__
    temp.pop('_sa_instance_state', None)
    return temp


def rows_to_list(rows):
    ret_rows = []
    for row in rows:
        ret_rows.append(row_to_dict(row))
    return ret_rows


@website_blueprint.route('/api/v1/some/endpoint', methods=['GET'])
def some_api():
    '''
    /some_endpoint
    '''
    rows = rows_to_list(SomeModel.query.all())
    response = app.response_class(
        response=jsonplus.dumps(rows),
        status=200,
        mimetype='application/json'
    )
    return response

回答 21

也许您可以使用这样的类

from sqlalchemy.ext.declarative import declared_attr
from sqlalchemy import Table


class Custom:
    """Some custom logic here!"""

    __table__: Table  # def for mypy

    @declared_attr
    def __tablename__(cls):  # pylint: disable=no-self-argument
        return cls.__name__  # pylint: disable= no-member

    def to_dict(self) -> Dict[str, Any]:
        """Serializes only column data."""
        return {c.name: getattr(self, c.name) for c in self.__table__.columns}

Base = declarative_base(cls=Custom)

class MyOwnTable(Base):
    #COLUMNS!

这样所有对象都有to_dict方法

Maybe you can use a class like this

from sqlalchemy.ext.declarative import declared_attr
from sqlalchemy import Table


class Custom:
    """Some custom logic here!"""

    __table__: Table  # def for mypy

    @declared_attr
    def __tablename__(cls):  # pylint: disable=no-self-argument
        return cls.__name__  # pylint: disable= no-member

    def to_dict(self) -> Dict[str, Any]:
        """Serializes only column data."""
        return {c.name: getattr(self, c.name) for c in self.__table__.columns}

Base = declarative_base(cls=Custom)

class MyOwnTable(Base):
    #COLUMNS!

With that all objects have the to_dict method


回答 22

在使用一些原始sql和未定义的对象时,使用cursor.description似乎可以得到我想要的东西:

with connection.cursor() as cur:
    print(query)
    cur.execute(query)
    for item in cur.fetchall():
        row = {column.name: item[i] for i, column in enumerate(cur.description)}
        print(row)

While using some raw sql and undefined objects, using cursor.description appeared to get what I was looking for:

with connection.cursor() as cur:
    print(query)
    cur.execute(query)
    for item in cur.fetchall():
        row = {column.name: item[i] for i, column in enumerate(cur.description)}
        print(row)

回答 23

step1:
class CNAME:
   ...
   def as_dict(self):
       return {item.name: getattr(self, item.name) for item in self.__table__.columns}

step2:
list = []
for data in session.query(CNAME).all():
    list.append(data.as_dict())

step3:
return jsonify(list)
step1:
class CNAME:
   ...
   def as_dict(self):
       return {item.name: getattr(self, item.name) for item in self.__table__.columns}

step2:
list = []
for data in session.query(CNAME).all():
    list.append(data.as_dict())

step3:
return jsonify(list)

回答 24

我使用(太多?)词典的观点:

def serialize(_query):
    #d = dictionary written to per row
    #D = dictionary d is written to each time, then reset
    #Master = dictionary of dictionaries; the id Key (int, unique from database) 
    from D is used as the Key for the dictionary D entry in Master
    Master = {}
    D = {}
    x = 0
    for u in _query:
        d = u.__dict__
        D = {}
        for n in d.keys():
           if n != '_sa_instance_state':
                    D[n] = d[n]
        x = d['id']
        Master[x] = D
    return Master

与flask(包括jsonify)和flask_sqlalchemy一起运行,以将输出打印为JSON。

使用jsonify(serialize())调用该函数。

适用于到目前为止我尝试过的所有SQLAlchemy查询(运行SQLite3)

My take utilizing (too many?) dictionaries:

def serialize(_query):
    #d = dictionary written to per row
    #D = dictionary d is written to each time, then reset
    #Master = dictionary of dictionaries; the id Key (int, unique from database) 
    from D is used as the Key for the dictionary D entry in Master
    Master = {}
    D = {}
    x = 0
    for u in _query:
        d = u.__dict__
        D = {}
        for n in d.keys():
           if n != '_sa_instance_state':
                    D[n] = d[n]
        x = d['id']
        Master[x] = D
    return Master

Running with flask (including jsonify) and flask_sqlalchemy to print outputs as JSON.

Call the function with jsonify(serialize()).

Works with all SQLAlchemy queries I’ve tried so far (running SQLite3)


将类实例序列化为JSON

问题:将类实例序列化为JSON

我正在尝试创建类实例的JSON字符串表示形式并且遇到困难。假设该类的构建如下:

class testclass:
    value1 = "a"
    value2 = "b"

像这样对json.dumps进行调用:

t = testclass()
json.dumps(t)

失败了,并告诉我testclass不可JSON序列化。

TypeError: <__main__.testclass object at 0x000000000227A400> is not JSON serializable

我也尝试过使用pickle模块:

t = testclass()
print(pickle.dumps(t, pickle.HIGHEST_PROTOCOL))

它提供类实例信息,但不提供类实例的序列化内容。

b'\x80\x03c__main__\ntestclass\nq\x00)\x81q\x01}q\x02b.'

我究竟做错了什么?

I am trying to create a JSON string representation of a class instance and having difficulty. Let’s say the class is built like this:

class testclass:
    value1 = "a"
    value2 = "b"

A call to the json.dumps is made like this:

t = testclass()
json.dumps(t)

It is failing and telling me that the testclass is not JSON serializable.

TypeError: <__main__.testclass object at 0x000000000227A400> is not JSON serializable

I have also tried using the pickle module :

t = testclass()
print(pickle.dumps(t, pickle.HIGHEST_PROTOCOL))

And it gives class instance information but not a serialized content of the class instance.

b'\x80\x03c__main__\ntestclass\nq\x00)\x81q\x01}q\x02b.'

What am I doing wrong?


回答 0

基本问题是,JSON编码器json.dumps()默认仅知道如何序列化一组有限的对象类型(所有内置类型)。在此处列出:https//docs.python.org/3.3/library/json.html#encoders-and-decoders

一个好的解决方案是使您的类继承自该类JSONEncoder,然后实现该JSONEncoder.default()函数,并使该函数为您的类发出正确的JSON。

一个简单的解决方案是调用该实例json.dumps().__dict__成员。那是一个标准的Python dict,如果您的类很简单,它将是JSON可序列化的。

class Foo(object):
    def __init__(self):
        self.x = 1
        self.y = 2

foo = Foo()
s = json.dumps(foo) # raises TypeError with "is not JSON serializable"

s = json.dumps(foo.__dict__) # s set to: {"x":1, "y":2}

在此博客文章中讨论了上述方法:

    使用__dict__将任意Python对象序列化为JSON

The basic problem is that the JSON encoder json.dumps() only knows how to serialize a limited set of object types by default, all built-in types. List here: https://docs.python.org/3.3/library/json.html#encoders-and-decoders

One good solution would be to make your class inherit from JSONEncoder and then implement the JSONEncoder.default() function, and make that function emit the correct JSON for your class.

A simple solution would be to call json.dumps() on the .__dict__ member of that instance. That is a standard Python dict and if your class is simple it will be JSON serializable.

class Foo(object):
    def __init__(self):
        self.x = 1
        self.y = 2

foo = Foo()
s = json.dumps(foo) # raises TypeError with "is not JSON serializable"

s = json.dumps(foo.__dict__) # s set to: {"x":1, "y":2}

The above approach is discussed in this blog posting:

    Serializing arbitrary Python objects to JSON using __dict__


回答 1

您可以尝试一种对我有用的方法:

json.dumps()可以采用默认的可选参数,您可以在其中为未知类型指定自定义序列化函数,在我看来,

def serialize(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, date):
        serial = obj.isoformat()
        return serial

    if isinstance(obj, time):
        serial = obj.isoformat()
        return serial

    return obj.__dict__

前两个if用于日期和时间序列化,然后obj.__dict__返回任何其他对象。

最后的通话如下:

json.dumps(myObj, default=serialize)

当序列化一个集合并且不想__dict__显式地为每个对象调用时,这特别好。在这里,它会自动为您完成。

到目前为止,对我来说非常好,期待您的想法。

There’s one way that works great for me that you can try out:

json.dumps() can take an optional parameter default where you can specify a custom serializer function for unknown types, which in my case looks like

def serialize(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, date):
        serial = obj.isoformat()
        return serial

    if isinstance(obj, time):
        serial = obj.isoformat()
        return serial

    return obj.__dict__

First two ifs are for date and time serialization and then there is a obj.__dict__ returned for any other object.

the final call looks like:

json.dumps(myObj, default=serialize)

It’s especially good when you are serializing a collection and you don’t want to call __dict__ explicitly for every object. Here it’s done for you automatically.

So far worked so good for me, looking forward for your thoughts.


回答 2

您可以defaultjson.dumps()函数中指定命名参数:

json.dumps(obj, default=lambda x: x.__dict__)

说明:

形成文档(2.73.6):

``default(obj)`` is a function that should return a serializable version
of obj or raise TypeError. The default simply raises TypeError.

(适用于Python 2.7和Python 3.x)

注意:在这种情况下,您需要instance变量而不是class变量,如问题中的示例所示。(我假设询问者是class instance一个类的对象)

我首先从@phihag的答案中学到了这一点。发现它是最简单,最干净的方法。

You can specify the default named parameter in the json.dumps() function:

json.dumps(obj, default=lambda x: x.__dict__)

Explanation:

Form the docs (2.7, 3.6):

``default(obj)`` is a function that should return a serializable version
of obj or raise TypeError. The default simply raises TypeError.

(Works on Python 2.7 and Python 3.x)

Note: In this case you need instance variables and not class variables, as the example in the question tries to do. (I am assuming the asker meant class instance to be an object of a class)

I learned this first from @phihag’s answer here. Found it to be the simplest and cleanest way to do the job.


回答 3

我只是做:

data=json.dumps(myobject.__dict__)

这不是完整的答案,如果您有某种复杂的对象类,那么您肯定不会得到所有。但是,我将其用于一些简单的对象。

在OptionParser模块中获得的“ options”类确实非常有效。这里是JSON请求本身。

  def executeJson(self, url, options):
        data=json.dumps(options.__dict__)
        if options.verbose:
            print data
        headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
        return requests.post(url, data, headers=headers)

I just do:

data=json.dumps(myobject.__dict__)

This is not the full answer, and if you have some sort of complicated object class you certainly will not get everything. However I use this for some of my simple objects.

One that it works really well on is the “options” class that you get from the OptionParser module. Here it is along with the JSON request itself.

  def executeJson(self, url, options):
        data=json.dumps(options.__dict__)
        if options.verbose:
            print data
        headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
        return requests.post(url, data, headers=headers)

回答 4

使用jsonpickle

import jsonpickle

object = YourClass()
json_object = jsonpickle.encode(object)

Using jsonpickle

import jsonpickle

object = YourClass()
json_object = jsonpickle.encode(object)

回答 5

JSON并非真正意在序列化任意Python对象。这对于序列化dict对象pickle非常有用,但是实际上通常应该使用该模块。的输出pickle不是真正的人类可读的,但是应该可以使它顺畅运行。如果您坚持使用JSON,则可以签出jsonpickle模块,这是一种有趣的混合方法。

https://github.com/jsonpickle/jsonpickle

JSON is not really meant for serializing arbitrary Python objects. It’s great for serializing dict objects, but the pickle module is really what you should be using in general. Output from pickle is not really human-readable, but it should unpickle just fine. If you insist on using JSON, you could check out the jsonpickle module, which is an interesting hybrid approach.

https://github.com/jsonpickle/jsonpickle


回答 6

这是两个简单的函数,用于对任何不复杂的类进行序列化,没有任何花哨的地方。

我将其用于配置类型的东西,因为我可以在不进行代码调整的情况下将新成员添加到类中。

import json

class SimpleClass:
    def __init__(self, a=None, b=None, c=None):
        self.a = a
        self.b = b
        self.c = c

def serialize_json(instance=None, path=None):
    dt = {}
    dt.update(vars(instance))

    with open(path, "w") as file:
        json.dump(dt, file)

def deserialize_json(cls=None, path=None):
    def read_json(_path):
        with open(_path, "r") as file:
            return json.load(file)

    data = read_json(path)

    instance = object.__new__(cls)

    for key, value in data.items():
        setattr(instance, key, value)

    return instance

# Usage: Create class and serialize under Windows file system.
write_settings = SimpleClass(a=1, b=2, c=3)
serialize_json(write_settings, r"c:\temp\test.json")

# Read back and rehydrate.
read_settings = deserialize_json(SimpleClass, r"c:\temp\test.json")

# results are the same.
print(vars(write_settings))
print(vars(read_settings))

# output:
# {'c': 3, 'b': 2, 'a': 1}
# {'c': 3, 'b': 2, 'a': 1}

Here are two simple functions for serialization of any non-sophisticated classes, nothing fancy as explained before.

I use this for configuration type stuff because I can add new members to the classes with no code adjustments.

import json

class SimpleClass:
    def __init__(self, a=None, b=None, c=None):
        self.a = a
        self.b = b
        self.c = c

def serialize_json(instance=None, path=None):
    dt = {}
    dt.update(vars(instance))

    with open(path, "w") as file:
        json.dump(dt, file)

def deserialize_json(cls=None, path=None):
    def read_json(_path):
        with open(_path, "r") as file:
            return json.load(file)

    data = read_json(path)

    instance = object.__new__(cls)

    for key, value in data.items():
        setattr(instance, key, value)

    return instance

# Usage: Create class and serialize under Windows file system.
write_settings = SimpleClass(a=1, b=2, c=3)
serialize_json(write_settings, r"c:\temp\test.json")

# Read back and rehydrate.
read_settings = deserialize_json(SimpleClass, r"c:\temp\test.json")

# results are the same.
print(vars(write_settings))
print(vars(read_settings))

# output:
# {'c': 3, 'b': 2, 'a': 1}
# {'c': 3, 'b': 2, 'a': 1}

回答 7

关于如何开始执行此操作,有一些很好的答案。但是要记住一些事情:

  • 如果实例嵌套在大型数据结构中怎么办?
  • 如果还想要类的名称怎么办?
  • 如果要反序列化实例怎么办?
  • 如果您正在使用该怎么办 __slots__而不是__dict__呢?
  • 如果您只是不想自己做,该怎么办?

json-tricks是一个库(由我创建,其他人对此做出了贡献)已经有一段时间了。例如:

class MyTestCls:
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

cls_instance = MyTestCls(s='ub', dct={'7': 7})

json = dumps(cls_instance, indent=4)
instance = loads(json)

您将恢复实例。这里的json看起来像这样:

{
    "__instance_type__": [
        "json_tricks.test_class",
        "MyTestCls"
    ],
    "attributes": {
        "s": "ub",
        "dct": {
            "7": 7
        }
    }
}

如果您想提出自己的解决方案,请查看以下内容的来源 json-tricks以免忘记某些特殊情况(例如__slots__)。

它还执行其他类型,例如numpy数组,日期时间,复数;它还允许发表评论。

There are some good answers on how to get started on doing this. But there are some things to keep in mind:

  • What if the instance is nested inside a large data structure?
  • What if also want the class name?
  • What if you want to deserialize the instance?
  • What if you’re using __slots__ instead of __dict__?
  • What if you just don’t want to do it yourself?

json-tricks is a library (that I made and others contributed to) which has been able to do this for quite a while. For example:

class MyTestCls:
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

cls_instance = MyTestCls(s='ub', dct={'7': 7})

json = dumps(cls_instance, indent=4)
instance = loads(json)

You’ll get your instance back. Here the json looks like this:

{
    "__instance_type__": [
        "json_tricks.test_class",
        "MyTestCls"
    ],
    "attributes": {
        "s": "ub",
        "dct": {
            "7": 7
        }
    }
}

If you like to make your own solution, you might look at the source of json-tricks so as not to forget some special cases (like __slots__).

It also does other types like numpy arrays, datetimes, complex numbers; it also allows for comments.


回答 8

Python3.x

我所能达到的最好的方法就是这个。
注意,此代码也对待set()。
这种方法是通用的,只需要扩展类(在第二个示例中)。
请注意,我只是在处理文件,但是很容易根据自己的喜好修改行为。

但是,这是CoDec。

通过更多的工作,您可以用其他方式构造您的类。我假定使用默认的构造函数来实例化它,然后更新类dict。

import json
import collections


class JsonClassSerializable(json.JSONEncoder):

    REGISTERED_CLASS = {}

    def register(ctype):
        JsonClassSerializable.REGISTERED_CLASS[ctype.__name__] = ctype

    def default(self, obj):
        if isinstance(obj, collections.Set):
            return dict(_set_object=list(obj))
        if isinstance(obj, JsonClassSerializable):
            jclass = {}
            jclass["name"] = type(obj).__name__
            jclass["dict"] = obj.__dict__
            return dict(_class_object=jclass)
        else:
            return json.JSONEncoder.default(self, obj)

    def json_to_class(self, dct):
        if '_set_object' in dct:
            return set(dct['_set_object'])
        elif '_class_object' in dct:
            cclass = dct['_class_object']
            cclass_name = cclass["name"]
            if cclass_name not in self.REGISTERED_CLASS:
                raise RuntimeError(
                    "Class {} not registered in JSON Parser"
                    .format(cclass["name"])
                )
            instance = self.REGISTERED_CLASS[cclass_name]()
            instance.__dict__ = cclass["dict"]
            return instance
        return dct

    def encode_(self, file):
        with open(file, 'w') as outfile:
            json.dump(
                self.__dict__, outfile,
                cls=JsonClassSerializable,
                indent=4,
                sort_keys=True
            )

    def decode_(self, file):
        try:
            with open(file, 'r') as infile:
                self.__dict__ = json.load(
                    infile,
                    object_hook=self.json_to_class
                )
        except FileNotFoundError:
            print("Persistence load failed "
                  "'{}' do not exists".format(file)
                  )


class C(JsonClassSerializable):

    def __init__(self):
        self.mill = "s"


JsonClassSerializable.register(C)


class B(JsonClassSerializable):

    def __init__(self):
        self.a = 1230
        self.c = C()


JsonClassSerializable.register(B)


class A(JsonClassSerializable):

    def __init__(self):
        self.a = 1
        self.b = {1, 2}
        self.c = B()

JsonClassSerializable.register(A)

A().encode_("test")
b = A()
b.decode_("test")
print(b.a)
print(b.b)
print(b.c.a)

编辑

通过更多的研究,我找到了一种使用元类进行泛化而无需SUPERCLASS寄存器方法调用的方法。

import json
import collections

REGISTERED_CLASS = {}

class MetaSerializable(type):

    def __call__(cls, *args, **kwargs):
        if cls.__name__ not in REGISTERED_CLASS:
            REGISTERED_CLASS[cls.__name__] = cls
        return super(MetaSerializable, cls).__call__(*args, **kwargs)


class JsonClassSerializable(json.JSONEncoder, metaclass=MetaSerializable):

    def default(self, obj):
        if isinstance(obj, collections.Set):
            return dict(_set_object=list(obj))
        if isinstance(obj, JsonClassSerializable):
            jclass = {}
            jclass["name"] = type(obj).__name__
            jclass["dict"] = obj.__dict__
            return dict(_class_object=jclass)
        else:
            return json.JSONEncoder.default(self, obj)

    def json_to_class(self, dct):
        if '_set_object' in dct:
            return set(dct['_set_object'])
        elif '_class_object' in dct:
            cclass = dct['_class_object']
            cclass_name = cclass["name"]
            if cclass_name not in REGISTERED_CLASS:
                raise RuntimeError(
                    "Class {} not registered in JSON Parser"
                    .format(cclass["name"])
                )
            instance = REGISTERED_CLASS[cclass_name]()
            instance.__dict__ = cclass["dict"]
            return instance
        return dct

    def encode_(self, file):
        with open(file, 'w') as outfile:
            json.dump(
                self.__dict__, outfile,
                cls=JsonClassSerializable,
                indent=4,
                sort_keys=True
            )

    def decode_(self, file):
        try:
            with open(file, 'r') as infile:
                self.__dict__ = json.load(
                    infile,
                    object_hook=self.json_to_class
                )
        except FileNotFoundError:
            print("Persistence load failed "
                  "'{}' do not exists".format(file)
                  )


class C(JsonClassSerializable):

    def __init__(self):
        self.mill = "s"


class B(JsonClassSerializable):

    def __init__(self):
        self.a = 1230
        self.c = C()


class A(JsonClassSerializable):

    def __init__(self):
        self.a = 1
        self.b = {1, 2}
        self.c = B()


A().encode_("test")
b = A()
b.decode_("test")
print(b.a)
# 1
print(b.b)
# {1, 2}
print(b.c.a)
# 1230
print(b.c.c.mill)
# s

Python3.x

The best aproach I could reach with my knowledge was this.
Note that this code treat set() too.
This approach is generic just needing the extension of class (in the second example).
Note that I’m just doing it to files, but it’s easy to modify the behavior to your taste.

However this is a CoDec.

With a little more work you can construct your class in other ways. I assume a default constructor to instance it, then I update the class dict.

import json
import collections


class JsonClassSerializable(json.JSONEncoder):

    REGISTERED_CLASS = {}

    def register(ctype):
        JsonClassSerializable.REGISTERED_CLASS[ctype.__name__] = ctype

    def default(self, obj):
        if isinstance(obj, collections.Set):
            return dict(_set_object=list(obj))
        if isinstance(obj, JsonClassSerializable):
            jclass = {}
            jclass["name"] = type(obj).__name__
            jclass["dict"] = obj.__dict__
            return dict(_class_object=jclass)
        else:
            return json.JSONEncoder.default(self, obj)

    def json_to_class(self, dct):
        if '_set_object' in dct:
            return set(dct['_set_object'])
        elif '_class_object' in dct:
            cclass = dct['_class_object']
            cclass_name = cclass["name"]
            if cclass_name not in self.REGISTERED_CLASS:
                raise RuntimeError(
                    "Class {} not registered in JSON Parser"
                    .format(cclass["name"])
                )
            instance = self.REGISTERED_CLASS[cclass_name]()
            instance.__dict__ = cclass["dict"]
            return instance
        return dct

    def encode_(self, file):
        with open(file, 'w') as outfile:
            json.dump(
                self.__dict__, outfile,
                cls=JsonClassSerializable,
                indent=4,
                sort_keys=True
            )

    def decode_(self, file):
        try:
            with open(file, 'r') as infile:
                self.__dict__ = json.load(
                    infile,
                    object_hook=self.json_to_class
                )
        except FileNotFoundError:
            print("Persistence load failed "
                  "'{}' do not exists".format(file)
                  )


class C(JsonClassSerializable):

    def __init__(self):
        self.mill = "s"


JsonClassSerializable.register(C)


class B(JsonClassSerializable):

    def __init__(self):
        self.a = 1230
        self.c = C()


JsonClassSerializable.register(B)


class A(JsonClassSerializable):

    def __init__(self):
        self.a = 1
        self.b = {1, 2}
        self.c = B()

JsonClassSerializable.register(A)

A().encode_("test")
b = A()
b.decode_("test")
print(b.a)
print(b.b)
print(b.c.a)

Edit

With some more of research I found a way to generalize without the need of the SUPERCLASS register method call, using a metaclass

import json
import collections

REGISTERED_CLASS = {}

class MetaSerializable(type):

    def __call__(cls, *args, **kwargs):
        if cls.__name__ not in REGISTERED_CLASS:
            REGISTERED_CLASS[cls.__name__] = cls
        return super(MetaSerializable, cls).__call__(*args, **kwargs)


class JsonClassSerializable(json.JSONEncoder, metaclass=MetaSerializable):

    def default(self, obj):
        if isinstance(obj, collections.Set):
            return dict(_set_object=list(obj))
        if isinstance(obj, JsonClassSerializable):
            jclass = {}
            jclass["name"] = type(obj).__name__
            jclass["dict"] = obj.__dict__
            return dict(_class_object=jclass)
        else:
            return json.JSONEncoder.default(self, obj)

    def json_to_class(self, dct):
        if '_set_object' in dct:
            return set(dct['_set_object'])
        elif '_class_object' in dct:
            cclass = dct['_class_object']
            cclass_name = cclass["name"]
            if cclass_name not in REGISTERED_CLASS:
                raise RuntimeError(
                    "Class {} not registered in JSON Parser"
                    .format(cclass["name"])
                )
            instance = REGISTERED_CLASS[cclass_name]()
            instance.__dict__ = cclass["dict"]
            return instance
        return dct

    def encode_(self, file):
        with open(file, 'w') as outfile:
            json.dump(
                self.__dict__, outfile,
                cls=JsonClassSerializable,
                indent=4,
                sort_keys=True
            )

    def decode_(self, file):
        try:
            with open(file, 'r') as infile:
                self.__dict__ = json.load(
                    infile,
                    object_hook=self.json_to_class
                )
        except FileNotFoundError:
            print("Persistence load failed "
                  "'{}' do not exists".format(file)
                  )


class C(JsonClassSerializable):

    def __init__(self):
        self.mill = "s"


class B(JsonClassSerializable):

    def __init__(self):
        self.a = 1230
        self.c = C()


class A(JsonClassSerializable):

    def __init__(self):
        self.a = 1
        self.b = {1, 2}
        self.c = B()


A().encode_("test")
b = A()
b.decode_("test")
print(b.a)
# 1
print(b.b)
# {1, 2}
print(b.c.a)
# 1230
print(b.c.c.mill)
# s

回答 9

我相信与其采用公认的答案所建议的继​​承,不如使用多态性更好。否则,必须有一个很大的if else语句才能自定义每个对象的编码。这意味着为JSON创建通用的默认编码器,如下所示:

def jsonDefEncoder(obj):
   if hasattr(obj, 'jsonEnc'):
      return obj.jsonEnc()
   else: #some default behavior
      return obj.__dict__

然后jsonEnc()在要序列化的每个类中都有一个函数。例如

class A(object):
   def __init__(self,lengthInFeet):
      self.lengthInFeet=lengthInFeet
   def jsonEnc(self):
      return {'lengthInMeters': lengthInFeet * 0.3 } # each foot is 0.3 meter

然后你打电话 json.dumps(classInstance,default=jsonDefEncoder)

I believe instead of inheritance as suggested in accepted answer, it’s better to use polymorphism. Otherwise you have to have a big if else statement to customize encoding of every object. That means create a generic default encoder for JSON as:

def jsonDefEncoder(obj):
   if hasattr(obj, 'jsonEnc'):
      return obj.jsonEnc()
   else: #some default behavior
      return obj.__dict__

and then have a jsonEnc() function in each class you want to serialize. e.g.

class A(object):
   def __init__(self,lengthInFeet):
      self.lengthInFeet=lengthInFeet
   def jsonEnc(self):
      return {'lengthInMeters': lengthInFeet * 0.3 } # each foot is 0.3 meter

Then you call json.dumps(classInstance,default=jsonDefEncoder)


如何在Python中检查字符串是否为有效的JSON?

问题:如何在Python中检查字符串是否为有效的JSON?

在Python中,有没有办法在尝试解析字符串之前检查字符串是否为有效JSON?

例如,使用Facebook Graph API之类的东西时,有时返回JSON,有时可能返回图像文件。

In Python, is there a way to check if a string is valid JSON before trying to parse it?

For example working with things like the Facebook Graph API, sometimes it returns JSON, sometimes it could return an image file.


回答 0

您可以尝试这样做json.loads(),这将引发ValueError如果传递的字符串不能被解码为JSON。

通常,针对这种情况的“ Pythonic ”哲学被称为EAFP,因为它比许可更容易寻求宽恕

You can try to do json.loads(), which will throw a ValueError if the string you pass can’t be decoded as JSON.

In general, the “Pythonic” philosophy for this kind of situation is called EAFP, for Easier to Ask for Forgiveness than Permission.


回答 1

如果字符串是有效的json,示例Python脚本将返回一个布尔值:

import json

def is_json(myjson):
  try:
    json_object = json.loads(myjson)
  except ValueError as e:
    return False
  return True

哪些打印:

print is_json("{}")                          #prints True
print is_json("{asdf}")                      #prints False
print is_json('{ "age":100}')                #prints True
print is_json("{'age':100 }")                #prints False
print is_json("{\"age\":100 }")              #prints True
print is_json('{"age":100 }')                #prints True
print is_json('{"foo":[5,6.8],"foo":"bar"}') #prints True

将JSON字符串转换为Python字典:

import json
mydict = json.loads('{"foo":"bar"}')
print(mydict['foo'])    #prints bar

mylist = json.loads("[5,6,7]")
print(mylist)
[5, 6, 7]

将python对象转换为JSON字符串:

foo = {}
foo['gummy'] = 'bear'
print(json.dumps(foo))           #prints {"gummy": "bear"}

如果要访问低级解析,请不要自己滚动,请使用现有的库:http : //www.json.org/

关于python JSON模块的出色教程:https : //pymotw.com/2/json/

是String JSON并显示语法错误和错误消息:

sudo cpan JSON::XS
echo '{"foo":[5,6.8],"foo":"bar" bar}' > myjson.json
json_xs -t none < myjson.json

印刷品:

, or } expected while parsing object/hash, at character offset 28 (before "bar}
at /usr/local/bin/json_xs line 183, <STDIN> line 1.

json_xs 能够进行语法检查,解析,验证,编码,解码等操作:

https://metacpan.org/pod/json_xs

Example Python script returns a boolean if a string is valid json:

import json

def is_json(myjson):
  try:
    json_object = json.loads(myjson)
  except ValueError as e:
    return False
  return True

Which prints:

print is_json("{}")                          #prints True
print is_json("{asdf}")                      #prints False
print is_json('{ "age":100}')                #prints True
print is_json("{'age':100 }")                #prints False
print is_json("{\"age\":100 }")              #prints True
print is_json('{"age":100 }')                #prints True
print is_json('{"foo":[5,6.8],"foo":"bar"}') #prints True

Convert a JSON string to a Python dictionary:

import json
mydict = json.loads('{"foo":"bar"}')
print(mydict['foo'])    #prints bar

mylist = json.loads("[5,6,7]")
print(mylist)
[5, 6, 7]

Convert a python object to JSON string:

foo = {}
foo['gummy'] = 'bear'
print(json.dumps(foo))           #prints {"gummy": "bear"}

If you want access to low-level parsing, don’t roll your own, use an existing library: http://www.json.org/

Great tutorial on python JSON module: https://pymotw.com/2/json/

Is String JSON and show syntax errors and error messages:

sudo cpan JSON::XS
echo '{"foo":[5,6.8],"foo":"bar" bar}' > myjson.json
json_xs -t none < myjson.json

Prints:

, or } expected while parsing object/hash, at character offset 28 (before "bar}
at /usr/local/bin/json_xs line 183, <STDIN> line 1.

json_xs is capable of syntax checking, parsing, prittifying, encoding, decoding and more:

https://metacpan.org/pod/json_xs


回答 2

我要说的是,解析是您真正可以完全分辨的唯一方法。json.loads()如果格式不正确,则python 函数(几乎可以肯定)会引发异常。但是,出于示例目的,您可能只需要检查前两个非空白字符即可。

我不熟悉Facebook发送回的JSON,但是来自Web应用程序的大多数JSON字符串都将以方括号[或大{括号开头。我知道没有图像格式以这些字符开头。

相反,如果您知道可能会显示哪种图像格式,则可以检查字符串的开头以查找其签名以识别图像,如果不是图像,则假定您具有JSON。

在您要查找图形的情况下,另一个用于识别图形而不是文本字符串的简单技巧只是测试字符串的前几十个字符中的非ASCII字符(假设JSON为ASCII) )。

I would say parsing it is the only way you can really entirely tell. Exception will be raised by python’s json.loads() function (almost certainly) if not the correct format. However, the the purposes of your example you can probably just check the first couple of non-whitespace characters…

I’m not familiar with the JSON that facebook sends back, but most JSON strings from web apps will start with a open square [ or curly { bracket. No images formats I know of start with those characters.

Conversely if you know what image formats might show up, you can check the start of the string for their signatures to identify images, and assume you have JSON if it’s not an image.

Another simple hack to identify a graphic, rather than a text string, in the case you’re looking for a graphic, is just to test for non-ASCII characters in the first couple of dozen characters of the string (assuming the JSON is ASCII).


回答 3

我想出了一个通用的,有趣的解决方案:

class SafeInvocator(object):
    def __init__(self, module):
        self._module = module

    def _safe(self, func):
        def inner(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except:
                return None

        return inner

    def __getattr__(self, item):
        obj = getattr(self.module, item)
        return self._safe(obj) if hasattr(obj, '__call__') else obj

您可以像这样使用它:

safe_json = SafeInvocator(json)
text = "{'foo':'bar'}"
item = safe_json.loads(text)
if item:
    # do something

I came up with an generic, interesting solution to this problem:

class SafeInvocator(object):
    def __init__(self, module):
        self._module = module

    def _safe(self, func):
        def inner(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except:
                return None

        return inner

    def __getattr__(self, item):
        obj = getattr(self.module, item)
        return self._safe(obj) if hasattr(obj, '__call__') else obj

and you can use it like so:

safe_json = SafeInvocator(json)
text = "{'foo':'bar'}"
item = safe_json.loads(text)
if item:
    # do something

如何将JSON转换为CSV?

问题:如何将JSON转换为CSV?

我有一个要转换为CSV文件的JSON文件。如何使用Python执行此操作?

我试过了:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)

f.close()

但是,它没有用。我正在使用Django,收到的错误是:

file' object has no attribute 'writerow'

然后,我尝试了以下方法:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)  # ← changed

f.close()

然后我得到错误:

sequence expected

样本json文件:

[{
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    }, {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    }, {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }, {
        "pk": 4,
        "model": "auth.permission",
        "fields": {
            "codename": "add_group",
            "name": "Can add group",
            "content_type": 2
        }
    }, {
        "pk": 10,
        "model": "auth.permission",
        "fields": {
            "codename": "add_message",
            "name": "Can add message",
            "content_type": 4
        }
    }
]

I have a JSON file I want to convert to a CSV file. How can I do this with Python?

I tried:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)

f.close()

However, it did not work. I am using Django and the error I received is:

`file' object has no attribute 'writerow'`

I then tried the following:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)  # ← changed

f.close()

I then get the error:

`sequence expected`

Sample json file:

[{
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    }, {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    }, {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }, {
        "pk": 4,
        "model": "auth.permission",
        "fields": {
            "codename": "add_group",
            "name": "Can add group",
            "content_type": 2
        }
    }, {
        "pk": 10,
        "model": "auth.permission",
        "fields": {
            "codename": "add_message",
            "name": "Can add message",
            "content_type": 4
        }
    }
]

回答 0

首先,您的JSON具有嵌套对象,因此通常无法直接将其转换为CSV。您需要将其更改为以下内容:

{
    "pk": 22,
    "model": "auth.permission",
    "codename": "add_logentry",
    "content_type": 8,
    "name": "Can add log entry"
},
......]

这是从中生成CSV的代码:

import csv
import json

x = """[
    {
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    },
    {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    },
    {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }
]"""

x = json.loads(x)

f = csv.writer(open("test.csv", "wb+"))

# Write CSV Header, If you dont need that, remove this line
f.writerow(["pk", "model", "codename", "name", "content_type"])

for x in x:
    f.writerow([x["pk"],
                x["model"],
                x["fields"]["codename"],
                x["fields"]["name"],
                x["fields"]["content_type"]])

您将获得以下输出:

pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8

First, your JSON has nested objects, so it normally cannot be directly converted to CSV. You need to change that to something like this:

{
    "pk": 22,
    "model": "auth.permission",
    "codename": "add_logentry",
    "content_type": 8,
    "name": "Can add log entry"
},
......]

Here is my code to generate CSV from that:

import csv
import json

x = """[
    {
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    },
    {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    },
    {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }
]"""

x = json.loads(x)

f = csv.writer(open("test.csv", "wb+"))

# Write CSV Header, If you dont need that, remove this line
f.writerow(["pk", "model", "codename", "name", "content_type"])

for x in x:
    f.writerow([x["pk"],
                x["model"],
                x["fields"]["codename"],
                x["fields"]["name"],
                x["fields"]["content_type"]])

You will get output as:

pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8

回答 1

使用pandas 这就像使用两个命令一样简单!

pandas.read_json()

要将JSON字符串转换为pandas对象(序列或数据框)。然后,假设结果存储为df

df.to_csv()

它可以返回字符串,也可以直接写入csv文件。

基于先前答案的冗长性,我们都应该感谢熊猫的捷径。

With the pandas library, this is as easy as using two commands!

pandas.read_json()

To convert a JSON string to a pandas object (either a series or dataframe). Then, assuming the results were stored as df:

df.to_csv()

Which can either return a string or write directly to a csv-file.

Based on the verbosity of previous answers, we should all thank pandas for the shortcut.


回答 2

我假设您的JSON文件将解码为词典列表。首先,我们需要一个将JSON对象展平的函数:

def flattenjson( b, delim ):
    val = {}
    for i in b.keys():
        if isinstance( b[i], dict ):
            get = flattenjson( b[i], delim )
            for j in get.keys():
                val[ i + delim + j ] = get[j]
        else:
            val[i] = b[i]

    return val

在JSON对象上运行此代码段的结果:

flattenjson( {
    "pk": 22, 
    "model": "auth.permission", 
    "fields": {
      "codename": "add_message", 
      "name": "Can add message", 
      "content_type": 8
    }
  }, "__" )

{
    "pk": 22, 
    "model": "auth.permission', 
    "fields__codename": "add_message", 
    "fields__name": "Can add message", 
    "fields__content_type": 8
}

在将此函数应用于JSON对象输入数组中的每个dict之后:

input = map( lambda x: flattenjson( x, "__" ), input )

并找到相关的列名:

columns = [ x for row in input for x in row.keys() ]
columns = list( set( columns ) )

通过csv模块运行它并不难:

with open( fname, 'wb' ) as out_file:
    csv_w = csv.writer( out_file )
    csv_w.writerow( columns )

    for i_r in input:
        csv_w.writerow( map( lambda x: i_r.get( x, "" ), columns ) )

我希望这有帮助!

I am assuming that your JSON file will decode into a list of dictionaries. First we need a function which will flatten the JSON objects:

def flattenjson( b, delim ):
    val = {}
    for i in b.keys():
        if isinstance( b[i], dict ):
            get = flattenjson( b[i], delim )
            for j in get.keys():
                val[ i + delim + j ] = get[j]
        else:
            val[i] = b[i]

    return val

The result of running this snippet on your JSON object:

flattenjson( {
    "pk": 22, 
    "model": "auth.permission", 
    "fields": {
      "codename": "add_message", 
      "name": "Can add message", 
      "content_type": 8
    }
  }, "__" )

is

{
    "pk": 22, 
    "model": "auth.permission', 
    "fields__codename": "add_message", 
    "fields__name": "Can add message", 
    "fields__content_type": 8
}

After applying this function to each dict in the input array of JSON objects:

input = map( lambda x: flattenjson( x, "__" ), input )

and finding the relevant column names:

columns = [ x for row in input for x in row.keys() ]
columns = list( set( columns ) )

it’s not hard to run this through the csv module:

with open( fname, 'wb' ) as out_file:
    csv_w = csv.writer( out_file )
    csv_w.writerow( columns )

    for i_r in input:
        csv_w.writerow( map( lambda x: i_r.get( x, "" ), columns ) )

I hope this helps!


回答 3

JSON可以代表各种各样的数据结构-JS“对象”大致类似于Python字典(带有字符串键),JS“数组”大致类似于Python列表,并且您可以嵌套它们,只要最后一个“叶”元素是数字或字符串。

CSV本质上只能表示一个二维表-可选地带有“标题”的第一行,即“列名”,这可以使该表可解释为字典列表,而不是通常的解释,而是列表列表(同样,“叶”元素可以是数字或字符串)。

因此,在一般情况下,您无法将任意JSON结构转换为CSV。在某些特殊情况下,您可以(没有进一步嵌套的数组的阵列;都具有完全相同的键的对象的阵列)。哪种特殊情况(如果有)适用于您的问题?解决方案的详细信息取决于您的特殊情况。考虑到您甚至没有提到哪个适用的惊人事实,我怀疑您可能没有考虑过约束,实际上没有可用的案例适用,并且您的问题无法解决。但是请澄清一下!

JSON can represent a wide variety of data structures — a JS “object” is roughly like a Python dict (with string keys), a JS “array” roughly like a Python list, and you can nest them as long as the final “leaf” elements are numbers or strings.

CSV can essentially represent only a 2-D table — optionally with a first row of “headers”, i.e., “column names”, which can make the table interpretable as a list of dicts, instead of the normal interpretation, a list of lists (again, “leaf” elements can be numbers or strings).

So, in the general case, you can’t translate an arbitrary JSON structure to a CSV. In a few special cases you can (array of arrays with no further nesting; arrays of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you do have. Given the astonishing fact that you don’t even mention which one applies, I suspect you may not have considered the constraint, neither usable case in fact applies, and your problem is impossible to solve. But please do clarify!


回答 4

通用解决方案,可将平面对象的任何json列表转换为csv。

将input.json文件作为第一个参数传递给命令行。

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
    output.writerow(row.values())

A generic solution which translates any json list of flat objects to csv.

Pass the input.json file as first argument on command line.

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
    output.writerow(row.values())

回答 5

假设您的JSON数据位于名为的文件中,那么这段代码应该对您有用data.json

import json
import csv

with open("data.json") as file:
    data = json.load(file)

with open("data.csv", "w") as file:
    csv_file = csv.writer(file)
    for item in data:
        fields = list(item['fields'].values())
        csv_file.writerow([item['pk'], item['model']] + fields)

This code should work for you, assuming that your JSON data is in a file called data.json.

import json
import csv

with open("data.json") as file:
    data = json.load(file)

with open("data.csv", "w") as file:
    csv_file = csv.writer(file)
    for item in data:
        fields = list(item['fields'].values())
        csv_file.writerow([item['pk'], item['model']] + fields)

回答 6

它易于使用csv.DictWriter(),详细的实现可以像这样:

def read_json(filename):
    return json.loads(open(filename).read())
def write_csv(data,filename):
    with open(filename, 'w+') as outf:
        writer = csv.DictWriter(outf, data[0].keys())
        writer.writeheader()
        for row in data:
            writer.writerow(row)
# implement
write_csv(read_json('test.json'), 'output.csv')

请注意,这假设您的所有JSON对象都具有相同的字段。

这是可以帮助您的参考

It’ll be easy to use csv.DictWriter(),the detailed implementation can be like this:

def read_json(filename):
    return json.loads(open(filename).read())
def write_csv(data,filename):
    with open(filename, 'w+') as outf:
        writer = csv.DictWriter(outf, data[0].keys())
        writer.writeheader()
        for row in data:
            writer.writerow(row)
# implement
write_csv(read_json('test.json'), 'output.csv')

Note that this assumes that all of your JSON objects have the same fields.

Here is the reference which may help you.


回答 7

我在Dan提出的解决方案上遇到了麻烦,但这对我有用:

import json
import csv 

f = open('test.json')
data = json.load(f)
f.close()

f=csv.writer(open('test.csv','wb+'))

for item in data:
  f.writerow([item['pk'], item['model']] + item['fields'].values())

其中“ test.json”包含以下内容:

[ 
{"pk": 22, "model": "auth.permission", "fields": 
  {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8 } }, 
{"pk": 23, "model": "auth.permission", "fields": 
  {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8 } }, {"pk": 24, "model": "auth.permission", "fields": 
  {"codename": "delete_logentry", "name": "Can delete log entry", "content_type": 8 } }
]

I was having trouble with Dan’s proposed solution, but this worked for me:

import json
import csv 

f = open('test.json')
data = json.load(f)
f.close()

f=csv.writer(open('test.csv','wb+'))

for item in data:
  f.writerow([item['pk'], item['model']] + item['fields'].values())

Where “test.json” contained the following:

[ 
{"pk": 22, "model": "auth.permission", "fields": 
  {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8 } }, 
{"pk": 23, "model": "auth.permission", "fields": 
  {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8 } }, {"pk": 24, "model": "auth.permission", "fields": 
  {"codename": "delete_logentry", "name": "Can delete log entry", "content_type": 8 } }
]

回答 8

json_normalize从使用pandas

  • 根据提供的数据,将其命名为 test.json
  • encoding='utf-8' 可能没有必要。
  • 以下代码利用了该pathlib
    • .open 是一种方法 pathlib
    • 也适用于非Windows路径
import pandas as pd
# As of Pandas 1.01, json_normalize as pandas.io.json.json_normalize is deprecated and is now exposed in the top-level namespace.
# from pandas.io.json import json_normalize
from pathlib import Path
import json

# set path to file
p = Path(r'c:\some_path_to_file\test.json')

# read json
with p.open('r', encoding='utf-8') as f:
    data = json.loads(f.read())

# create dataframe
df = pd.json_normalize(data)

# dataframe view
 pk            model  fields.codename           fields.name  fields.content_type
 22  auth.permission     add_logentry     Can add log entry                    8
 23  auth.permission  change_logentry  Can change log entry                    8
 24  auth.permission  delete_logentry  Can delete log entry                    8
  4  auth.permission        add_group         Can add group                    2
 10  auth.permission      add_message       Can add message                    4

# save to csv
df.to_csv('test.csv', index=False, encoding='utf-8')

CSV输出:

pk,model,fields.codename,fields.name,fields.content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8
4,auth.permission,add_group,Can add group,2
10,auth.permission,add_message,Can add message,4

有关更多嵌套JSON对象的其他资源:

Use json_normalize from pandas:

  • Given the data provided, in a file named test.json.
  • encoding='utf-8' may not be necessary.
  • The following code takes advantage of the pathlib library.
  • .open is a method of pathlib.
  • Works with non-Windows paths too.
import pandas as pd
# As of Pandas 1.01, json_normalize as pandas.io.json.json_normalize is deprecated and is now exposed in the top-level namespace.
# from pandas.io.json import json_normalize
from pathlib import Path
import json

# set path to file
p = Path(r'c:\some_path_to_file\test.json')

# read json
with p.open('r', encoding='utf-8') as f:
    data = json.loads(f.read())

# create dataframe
df = pd.json_normalize(data)

# dataframe view
 pk            model  fields.codename           fields.name  fields.content_type
 22  auth.permission     add_logentry     Can add log entry                    8
 23  auth.permission  change_logentry  Can change log entry                    8
 24  auth.permission  delete_logentry  Can delete log entry                    8
  4  auth.permission        add_group         Can add group                    2
 10  auth.permission      add_message       Can add message                    4

# save to csv
df.to_csv('test.csv', index=False, encoding='utf-8')

CSV Output:

pk,model,fields.codename,fields.name,fields.content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8
4,auth.permission,add_group,Can add group,2
10,auth.permission,add_message,Can add message,4

Other Resources for more heavily nested JSON objects:


回答 9

正如前面的答案中提到的,将json转换为csv的困难是因为json文件可以包含嵌套的字典,因此是多维数据结构,而csv是2D数据结构。但是,将多维结构转换为csv的一种好方法是让多个csv与主键绑定在一起。

在您的示例中,第一个csv输出具有“ pk”,“ model”,“ fields”列作为您的列。“ pk”和“ model”的值很容易获得,但是由于“字段”列包含一个字典,因此它应该是其自己的csv,并且因为“代号”似乎是主键,因此可以用作输入为“字段”完成第一个csv。第二个csv包含“字段”列中的词典,其代号为主键,可用于将2个csv绑在一起。

这是为您的json文件提供的解决方案,它将嵌套词典转换为2个csvs。

import csv
import json

def readAndWrite(inputFileName, primaryKey=""):
    input = open(inputFileName+".json")
    data = json.load(input)
    input.close()

    header = set()

    if primaryKey != "":
        outputFileName = inputFileName+"-"+primaryKey
        if inputFileName == "data":
            for i in data:
                for j in i["fields"].keys():
                    if j not in header:
                        header.add(j)
    else:
        outputFileName = inputFileName
        for i in data:
            for j in i.keys():
                if j not in header:
                    header.add(j)

    with open(outputFileName+".csv", 'wb') as output_file:
        fieldnames = list(header)
        writer = csv.DictWriter(output_file, fieldnames, delimiter=',', quotechar='"')
        writer.writeheader()
        for x in data:
            row_value = {}
            if primaryKey == "":
                for y in x.keys():
                    yValue = x.get(y)
                    if type(yValue) == int or type(yValue) == bool or type(yValue) == float or type(yValue) == list:
                        row_value[y] = str(yValue).encode('utf8')
                    elif type(yValue) != dict:
                        row_value[y] = yValue.encode('utf8')
                    else:
                        if inputFileName == "data":
                            row_value[y] = yValue["codename"].encode('utf8')
                            readAndWrite(inputFileName, primaryKey="codename")
                writer.writerow(row_value)
            elif primaryKey == "codename":
                for y in x["fields"].keys():
                    yValue = x["fields"].get(y)
                    if type(yValue) == int or type(yValue) == bool or type(yValue) == float or type(yValue) == list:
                        row_value[y] = str(yValue).encode('utf8')
                    elif type(yValue) != dict:
                        row_value[y] = yValue.encode('utf8')
                writer.writerow(row_value)

readAndWrite("data")

As mentioned in the previous answers the difficulty in converting json to csv is because a json file can contain nested dictionaries and therefore be a multidimensional data structure verses a csv which is a 2D data structure. However, a good way to turn a multidimensional structure to a csv is to have multiple csvs that tie together with primary keys.

In your example, the first csv output has the columns “pk”,”model”,”fields” as your columns. Values for “pk”, and “model” are easy to get but because the “fields” column contains a dictionary, it should be its own csv and because “codename” appears to the be the primary key, you can use as the input for “fields” to complete the first csv. The second csv contains the dictionary from the “fields” column with codename as the the primary key that can be used to tie the 2 csvs together.

Here is a solution for your json file which converts a nested dictionaries to 2 csvs.

import csv
import json

def readAndWrite(inputFileName, primaryKey=""):
    input = open(inputFileName+".json")
    data = json.load(input)
    input.close()

    header = set()

    if primaryKey != "":
        outputFileName = inputFileName+"-"+primaryKey
        if inputFileName == "data":
            for i in data:
                for j in i["fields"].keys():
                    if j not in header:
                        header.add(j)
    else:
        outputFileName = inputFileName
        for i in data:
            for j in i.keys():
                if j not in header:
                    header.add(j)

    with open(outputFileName+".csv", 'wb') as output_file:
        fieldnames = list(header)
        writer = csv.DictWriter(output_file, fieldnames, delimiter=',', quotechar='"')
        writer.writeheader()
        for x in data:
            row_value = {}
            if primaryKey == "":
                for y in x.keys():
                    yValue = x.get(y)
                    if type(yValue) == int or type(yValue) == bool or type(yValue) == float or type(yValue) == list:
                        row_value[y] = str(yValue).encode('utf8')
                    elif type(yValue) != dict:
                        row_value[y] = yValue.encode('utf8')
                    else:
                        if inputFileName == "data":
                            row_value[y] = yValue["codename"].encode('utf8')
                            readAndWrite(inputFileName, primaryKey="codename")
                writer.writerow(row_value)
            elif primaryKey == "codename":
                for y in x["fields"].keys():
                    yValue = x["fields"].get(y)
                    if type(yValue) == int or type(yValue) == bool or type(yValue) == float or type(yValue) == list:
                        row_value[y] = str(yValue).encode('utf8')
                    elif type(yValue) != dict:
                        row_value[y] = yValue.encode('utf8')
                writer.writerow(row_value)

readAndWrite("data")

回答 10

我知道问这个问题已经有很长时间了,但是我想我可以添加到其他所有人的答案中,并分享一篇博客文章,我认为它可以非常简洁地说明解决方案。

这是链接

打开文件进行写入

employ_data = open('/tmp/EmployData.csv', 'w')

创建csv writer对象

csvwriter = csv.writer(employ_data)
count = 0
for emp in emp_data:
      if count == 0:
             header = emp.keys()
             csvwriter.writerow(header)
             count += 1
      csvwriter.writerow(emp.values())

确保关闭文件以保存内容

employ_data.close()

I know it has been a long time since this question has been asked but I thought I might add to everyone else’s answer and share a blog post that I think explain the solution in a very concise way.

Here is the link

Open a file for writing

employ_data = open('/tmp/EmployData.csv', 'w')

Create the csv writer object

csvwriter = csv.writer(employ_data)
count = 0
for emp in emp_data:
      if count == 0:
             header = emp.keys()
             csvwriter.writerow(header)
             count += 1
      csvwriter.writerow(emp.values())

Make sure to close the file in order to save the contents

employ_data.close()

回答 11

这不是一个很聪明的方法,但是我遇到了同样的问题,这对我有用:

import csv

f = open('data.json')
data = json.load(f)
f.close()

new_data = []

for i in data:
   flat = {}
   names = i.keys()
   for n in names:
      try:
         if len(i[n].keys()) > 0:
            for ii in i[n].keys():
               flat[n+"_"+ii] = i[n][ii]
      except:
         flat[n] = i[n]
   new_data.append(flat)  

f = open(filename, "r")
writer = csv.DictWriter(f, new_data[0].keys())
writer.writeheader()
for row in new_data:
   writer.writerow(row)
f.close()

It is not a very smart way to do it, but I have had the same problem and this worked for me:

import csv

f = open('data.json')
data = json.load(f)
f.close()

new_data = []

for i in data:
   flat = {}
   names = i.keys()
   for n in names:
      try:
         if len(i[n].keys()) > 0:
            for ii in i[n].keys():
               flat[n+"_"+ii] = i[n][ii]
      except:
         flat[n] = i[n]
   new_data.append(flat)  

f = open(filename, "r")
writer = csv.DictWriter(f, new_data[0].keys())
writer.writeheader()
for row in new_data:
   writer.writerow(row)
f.close()

回答 12

Alec的答案很好,但是在多层嵌套的情况下,它是行不通的。这是修改后的版本,支持多层嵌套。如果嵌套对象已经指定了自己的键(例如,Firebase Analytics / BigTable / BigQuery数据),它还可以使标头名称更好:

"""Converts JSON with nested fields into a flattened CSV file.
"""

import sys
import json
import csv
import os

import jsonlines

from orderedset import OrderedSet

# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
  if val == None:
    val = {}

  if isinstance( b, dict ):
    for j in b.keys():
      flattenjson(b[j], prefix + delim + j, delim, val)
  elif isinstance( b, list ):
    get = b
    for j in range(len(get)):
      key = str(j)

      # If the nested data contains its own key, use that as the header instead.
      if isinstance( get[j], dict ):
        if 'key' in get[j]:
          key = get[j]['key']

      flattenjson(get[j], prefix + delim + key, delim, val)
  else:
    val[prefix] = b

  return val

def main(argv):
  if len(argv) < 2:
    raise Error('Please specify a JSON file to parse')

  filename = argv[1]
  allRows = []
  fieldnames = OrderedSet()
  with jsonlines.open(filename) as reader:
    for obj in reader:
      #print obj
      flattened = flattenjson(obj)
      #print 'keys: %s' % flattened.keys()
      fieldnames.update(flattened.keys())
      allRows.append(flattened)

  outfilename = filename + '.csv'
  with open(outfilename, 'w') as file:
    csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
    csvwriter.writeheader()
    for obj in allRows:
      csvwriter.writerow(obj)



if __name__ == '__main__':
  main(sys.argv)

Alec’s answer is great, but it doesn’t work in the case where there are multiple levels of nesting. Here’s a modified version that supports multiple levels of nesting. It also makes the header names a bit nicer if the nested object already specifies its own key (e.g. Firebase Analytics / BigTable / BigQuery data):

"""Converts JSON with nested fields into a flattened CSV file.
"""

import sys
import json
import csv
import os

import jsonlines

from orderedset import OrderedSet

# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
  if val is None:
    val = {}

  if isinstance( b, dict ):
    for j in b.keys():
      flattenjson(b[j], prefix + delim + j, delim, val)
  elif isinstance( b, list ):
    get = b
    for j in range(len(get)):
      key = str(j)

      # If the nested data contains its own key, use that as the header instead.
      if isinstance( get[j], dict ):
        if 'key' in get[j]:
          key = get[j]['key']

      flattenjson(get[j], prefix + delim + key, delim, val)
  else:
    val[prefix] = b

  return val

def main(argv):
  if len(argv) < 2:
    raise Error('Please specify a JSON file to parse')

  print "Loading and Flattening..."
  filename = argv[1]
  allRows = []
  fieldnames = OrderedSet()
  with jsonlines.open(filename) as reader:
    for obj in reader:
      # print 'orig:\n'
      # print obj
      flattened = flattenjson(obj)
      #print 'keys: %s' % flattened.keys()
      # print 'flattened:\n'
      # print flattened
      fieldnames.update(flattened.keys())
      allRows.append(flattened)

  print "Exporting to CSV..."
  outfilename = filename + '.csv'
  count = 0
  with open(outfilename, 'w') as file:
    csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
    csvwriter.writeheader()
    for obj in allRows:
      # print 'allRows:\n'
      # print obj
      csvwriter.writerow(obj)
      count += 1

  print "Wrote %d rows" % count



if __name__ == '__main__':
  main(sys.argv)

回答 13

这相对较好。它将json展平以将其写入csv文件。嵌套元素被管理:)

那是为了python 3

import json

o = json.loads('your json string') # Be careful, o must be a list, each of its objects will make a line of the csv.

def flatten(o, k='/'):
    global l, c_line
    if isinstance(o, dict):
        for key, value in o.items():
            flatten(value, k + '/' + key)
    elif isinstance(o, list):
        for ov in o:
            flatten(ov, '')
    elif isinstance(o, str):
        o = o.replace('\r',' ').replace('\n',' ').replace(';', ',')
        if not k in l:
            l[k]={}
        l[k][c_line]=o

def render_csv(l):
    ftime = True

    for i in range(100): #len(l[list(l.keys())[0]])
        for k in l:
            if ftime :
                print('%s;' % k, end='')
                continue
            v = l[k]
            try:
                print('%s;' % v[i], end='')
            except:
                print(';', end='')
        print()
        ftime = False
        i = 0

def json_to_csv(object_list):
    global l, c_line
    l = {}
    c_line = 0
    for ov in object_list : # Assumes json is a list of objects
        flatten(ov)
        c_line += 1
    render_csv(l)

json_to_csv(o)

请享用。

This works relatively well. It flattens the json to write it to a csv file. Nested elements are managed :)

That’s for python 3

import json

o = json.loads('your json string') # Be careful, o must be a list, each of its objects will make a line of the csv.

def flatten(o, k='/'):
    global l, c_line
    if isinstance(o, dict):
        for key, value in o.items():
            flatten(value, k + '/' + key)
    elif isinstance(o, list):
        for ov in o:
            flatten(ov, '')
    elif isinstance(o, str):
        o = o.replace('\r',' ').replace('\n',' ').replace(';', ',')
        if not k in l:
            l[k]={}
        l[k][c_line]=o

def render_csv(l):
    ftime = True

    for i in range(100): #len(l[list(l.keys())[0]])
        for k in l:
            if ftime :
                print('%s;' % k, end='')
                continue
            v = l[k]
            try:
                print('%s;' % v[i], end='')
            except:
                print(';', end='')
        print()
        ftime = False
        i = 0

def json_to_csv(object_list):
    global l, c_line
    l = {}
    c_line = 0
    for ov in object_list : # Assumes json is a list of objects
        flatten(ov)
        c_line += 1
    render_csv(l)

json_to_csv(o)

enjoy.


回答 14

我解决这个问题的简单方法:

创建一个新的Python文件,例如:json_to_csv.py

添加此代码:

import csv, json, sys
#if you are not using utf-8 files, remove the next line
sys.setdefaultencoding("UTF-8")
#check if you pass the input file and output file
if sys.argv[1] is not None and sys.argv[2] is not None:

    fileInput = sys.argv[1]
    fileOutput = sys.argv[2]

    inputFile = open(fileInput)
    outputFile = open(fileOutput, 'w')
    data = json.load(inputFile)
    inputFile.close()

    output = csv.writer(outputFile)

    output.writerow(data[0].keys())  # header row

    for row in data:
        output.writerow(row.values())

添加此代码后,保存文件并在终端上运行:

python json_to_csv.py input.txt output.csv

希望对您有所帮助。

拜拜!

My simple way to solve this:

Create a new Python file like: json_to_csv.py

Add this code:

import csv, json, sys
#if you are not using utf-8 files, remove the next line
sys.setdefaultencoding("UTF-8")
#check if you pass the input file and output file
if sys.argv[1] is not None and sys.argv[2] is not None:

    fileInput = sys.argv[1]
    fileOutput = sys.argv[2]

    inputFile = open(fileInput)
    outputFile = open(fileOutput, 'w')
    data = json.load(inputFile)
    inputFile.close()

    output = csv.writer(outputFile)

    output.writerow(data[0].keys())  # header row

    for row in data:
        output.writerow(row.values())

After add this code, save the file and run at the terminal:

python json_to_csv.py input.txt output.csv

I hope this help you.

SEEYA!


回答 15

令人惊讶的是,我发现到目前为止,这里发布的所有答案都无法正确处理所有可能的情况(例如,嵌套的字典,嵌套的列表,无值等)。

该解决方案应适用于所有情况:

def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys + [str(idx)], v, flattened)
        else:
            flattened['__'.join(keys)] = value

    flattened = {}
    for key in json.keys():
        process_value([key], json[key], flattened)
    return flattened

Surprisingly, I found that none of the answers posted here so far correctly deal with all possible scenarios (e.g., nested dicts, nested lists, None values, etc).

This solution should work across all scenarios:

def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys + [str(idx)], v, flattened)
        else:
            flattened['__'.join(keys)] = value

    flattened = {}
    for key in json.keys():
        process_value([key], json[key], flattened)
    return flattened

回答 16

试试这个

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for item in data:
    output.writerow(item.values())

Try this

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for item in data:
    output.writerow(item.values())

回答 17

此代码适用于任何给定的json文件

# -*- coding: utf-8 -*-
"""
Created on Mon Jun 17 20:35:35 2019
author: Ram
"""

import json
import csv

with open("file1.json") as file:
    data = json.load(file)



# create the csv writer object
pt_data1 = open('pt_data1.csv', 'w')
csvwriter = csv.writer(pt_data1)

count = 0

for pt in data:

      if count == 0:

             header = pt.keys()

             csvwriter.writerow(header)

             count += 1

      csvwriter.writerow(pt.values())

pt_data1.close()

This code works for any given json file

# -*- coding: utf-8 -*-
"""
Created on Mon Jun 17 20:35:35 2019
author: Ram
"""

import json
import csv

with open("file1.json") as file:
    data = json.load(file)



# create the csv writer object
pt_data1 = open('pt_data1.csv', 'w')
csvwriter = csv.writer(pt_data1)

count = 0

for pt in data:

      if count == 0:

             header = pt.keys()

             csvwriter.writerow(header)

             count += 1

      csvwriter.writerow(pt.values())

pt_data1.close()

回答 18

修改了Alec McGail的答案以支持内部带有列表的JSON

    def flattenjson(self, mp, delim="|"):
            ret = []
            if isinstance(mp, dict):
                    for k in mp.keys():
                            csvs = self.flattenjson(mp[k], delim)
                            for csv in csvs:
                                    ret.append(k + delim + csv)
            elif isinstance(mp, list):
                    for k in mp:
                            csvs = self.flattenjson(k, delim)
                            for csv in csvs:
                                    ret.append(csv)
            else:
                    ret.append(mp)

            return ret

谢谢!

Modified Alec McGail’s answer to support JSON with lists inside

    def flattenjson(self, mp, delim="|"):
            ret = []
            if isinstance(mp, dict):
                    for k in mp.keys():
                            csvs = self.flattenjson(mp[k], delim)
                            for csv in csvs:
                                    ret.append(k + delim + csv)
            elif isinstance(mp, list):
                    for k in mp:
                            csvs = self.flattenjson(k, delim)
                            for csv in csvs:
                                    ret.append(csv)
            else:
                    ret.append(mp)

            return ret

Thanks!


回答 19

import json,csv
t=''
t=(type('a'))
json_data = []
data = None
write_header = True
item_keys = []
try:
with open('kk.json') as json_file:
    json_data = json_file.read()

    data = json.loads(json_data)
except Exception as e:
    print( e)

with open('bar.csv', 'at') as csv_file:
    writer = csv.writer(csv_file)#, quoting=csv.QUOTE_MINIMAL)
    for item in data:
        item_values = []
        for key in item:
            if write_header:
                item_keys.append(key)
            value = item.get(key, '')
            if (type(value)==t):
                item_values.append(value.encode('utf-8'))
            else:
                item_values.append(value)
        if write_header:
            writer.writerow(item_keys)
            write_header = False
        writer.writerow(item_values)
import json,csv
t=''
t=(type('a'))
json_data = []
data = None
write_header = True
item_keys = []
try:
with open('kk.json') as json_file:
    json_data = json_file.read()

    data = json.loads(json_data)
except Exception as e:
    print( e)

with open('bar.csv', 'at') as csv_file:
    writer = csv.writer(csv_file)#, quoting=csv.QUOTE_MINIMAL)
    for item in data:
        item_values = []
        for key in item:
            if write_header:
                item_keys.append(key)
            value = item.get(key, '')
            if (type(value)==t):
                item_values.append(value.encode('utf-8'))
            else:
                item_values.append(value)
        if write_header:
            writer.writerow(item_keys)
            write_header = False
        writer.writerow(item_values)

回答 20

如果我们考虑以下示例,将json格式的文件转换为csv格式的文件。

{
 "item_data" : [
      {
        "item": "10023456",
        "class": "100",
        "subclass": "123"
      }
      ]
}

以下代码将json文件(data3.json)转换为csv文件(data3.csv)。

import json
import csv
with open("/Users/Desktop/json/data3.json") as file:
    data = json.load(file)
    file.close()
    print(data)

fname = "/Users/Desktop/json/data3.csv"

with open(fname, "w", newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(['dept',
                       'class',
                       'subclass'])
    for item in data["item_data"]:
         csv_file.writerow([item.get('item_data').get('dept'),
                            item.get('item_data').get('class'),
                            item.get('item_data').get('subclass')])

上面提到的代码已在本地安装的pycharm中执行,并且已成功将json文件转换为csv文件。希望此帮助转换文件。

If we consider the below example for converting the json format file to csv formatted file.

{
 "item_data" : [
      {
        "item": "10023456",
        "class": "100",
        "subclass": "123"
      }
      ]
}

The below code will convert the json file ( data3.json ) to csv file ( data3.csv ).

import json
import csv
with open("/Users/Desktop/json/data3.json") as file:
    data = json.load(file)
    file.close()
    print(data)

fname = "/Users/Desktop/json/data3.csv"

with open(fname, "w", newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(['dept',
                       'class',
                       'subclass'])
    for item in data["item_data"]:
         csv_file.writerow([item.get('item_data').get('dept'),
                            item.get('item_data').get('class'),
                            item.get('item_data').get('subclass')])

The above mentioned code has been executed in the locally installed pycharm and it has successfully converted the json file to the csv file. Hope this help to convert the files.


回答 21

由于数据似乎是字典格式的,因此您似乎应该实际使用csv.DictWriter()来实际输出带有适当标题信息的行。这样可以使转换处理起来更加容易。然后,fieldnames参数将正确设置顺序,而第一行的输出作为标头将允许它稍后由csv.DictReader()读取和处理。

例如,Mike Repass使用

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
  output.writerow(row.values())

但是,只需将初始设置更改为output = csv.DictWriter(filesetting,fieldnames = data [0] .keys())

请注意,由于未定义字典中元素的顺序,因此可能必须显式创建字段名称条目。一旦执行此操作,写行将起作用。然后,写入将按最初显示的方式工作。

Since the data appears to be in a dictionary format, it would appear that you should actually use csv.DictWriter() to actually output the lines with the appropriate header information. This should allow the conversion to be handled somewhat easier. The fieldnames parameter would then set up the order properly while the output of the first line as the headers would allow it to be read and processed later by csv.DictReader().

For example, Mike Repass used

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
  output.writerow(row.values())

However just change the initial setup to output = csv.DictWriter(filesetting, fieldnames=data[0].keys())

Note that since the order of elements in a dictionary is not defined, you might have to create fieldnames entries explicitly. Once you do that, the writerow will work. The writes then work as originally shown.


回答 22

不幸的是,我对获得惊人的@Alec McGail答案贡献不大。我正在使用Python3,需要将地图转换为@Alexis R注释后的列表。

另外,我发现csv编写器正在向文件添加一个额外的CR(我在csv文件中的每一行都有一行空行)。按照@Jason R. Coombs对这个线程的回答,解决方案非常简单: Python中的CSV添加了额外的回车符

您只需将lineterminator =’\ n’参数添加到csv.writer。这将是:csv_w = csv.writer( out_file, lineterminator='\n' )

Unfortunately I have not enouthg reputation to make a small contribution to the amazing @Alec McGail answer. I was using Python3 and I have needed to convert the map to a list following the @Alexis R comment.

Additionaly I have found the csv writer was adding a extra CR to the file (I have a empty line for each line with data inside the csv file). The solution was very easy following the @Jason R. Coombs answer to this thread: CSV in Python adding an extra carriage return

You need to simply add the lineterminator=’\n’ parameter to the csv.writer. It will be: csv_w = csv.writer( out_file, lineterminator='\n' )


回答 23

您可以使用此代码将json文件转换为csv文件读取文件后,我将对象转换为pandas数据框,然后将其保存到CSV文件

import os
import pandas as pd
import json
import numpy as np

data = []
os.chdir('D:\\Your_directory\\folder')
with open('file_name.json', encoding="utf8") as data_file:    
     for line in data_file:
        data.append(json.loads(line))

dataframe = pd.DataFrame(data)        
## Saving the dataframe to a csv file
dataframe.to_csv("filename.csv", encoding='utf-8',index= False)

You can use this code to convert a json file to csv file After reading the file, I am converting the object to pandas dataframe and then saving this to a CSV file

import os
import pandas as pd
import json
import numpy as np

data = []
os.chdir('D:\\Your_directory\\folder')
with open('file_name.json', encoding="utf8") as data_file:    
     for line in data_file:
        data.append(json.loads(line))

dataframe = pd.DataFrame(data)        
## Saving the dataframe to a csv file
dataframe.to_csv("filename.csv", encoding='utf-8',index= False)

回答 24

我可能参加聚会晚了,但我认为,我已经解决了类似的问题。我有一个看起来像这样的json文件

JSON文件结构

我只想从这些json文件中提取一些键/值。因此,我编写了以下代码以提取相同的代码。

    """json_to_csv.py
    This script reads n numbers of json files present in a folder and then extract certain data from each file and write in a csv file.
    The folder contains the python script i.e. json_to_csv.py, output.csv and another folder descriptions containing all the json files.
"""

import os
import json
import csv


def get_list_of_json_files():
    """Returns the list of filenames of all the Json files present in the folder
    Parameter
    ---------
    directory : str
        'descriptions' in this case
    Returns
    -------
    list_of_files: list
        List of the filenames of all the json files
    """

    list_of_files = os.listdir('descriptions')  # creates list of all the files in the folder

    return list_of_files


def create_list_from_json(jsonfile):
    """Returns a list of the extracted items from json file in the same order we need it.
    Parameter
    _________
    jsonfile : json
        The json file containing the data
    Returns
    -------
    one_sample_list : list
        The list of the extracted items needed for the final csv
    """

    with open(jsonfile) as f:
        data = json.load(f)

    data_list = []  # create an empty list

    # append the items to the list in the same order.
    data_list.append(data['_id'])
    data_list.append(data['_modelType'])
    data_list.append(data['creator']['_id'])
    data_list.append(data['creator']['name'])
    data_list.append(data['dataset']['_accessLevel'])
    data_list.append(data['dataset']['_id'])
    data_list.append(data['dataset']['description'])
    data_list.append(data['dataset']['name'])
    data_list.append(data['meta']['acquisition']['image_type'])
    data_list.append(data['meta']['acquisition']['pixelsX'])
    data_list.append(data['meta']['acquisition']['pixelsY'])
    data_list.append(data['meta']['clinical']['age_approx'])
    data_list.append(data['meta']['clinical']['benign_malignant'])
    data_list.append(data['meta']['clinical']['diagnosis'])
    data_list.append(data['meta']['clinical']['diagnosis_confirm_type'])
    data_list.append(data['meta']['clinical']['melanocytic'])
    data_list.append(data['meta']['clinical']['sex'])
    data_list.append(data['meta']['unstructured']['diagnosis'])
    # In few json files, the race was not there so using KeyError exception to add '' at the place
    try:
        data_list.append(data['meta']['unstructured']['race'])
    except KeyError:
        data_list.append("")  # will add an empty string in case race is not there.
    data_list.append(data['name'])

    return data_list


def write_csv():
    """Creates the desired csv file
    Parameters
    __________
    list_of_files : file
        The list created by get_list_of_json_files() method
    result.csv : csv
        The csv file containing the header only
    Returns
    _______
    result.csv : csv
        The desired csv file
    """

    list_of_files = get_list_of_json_files()
    for file in list_of_files:
        row = create_list_from_json(f'descriptions/{file}')  # create the row to be added to csv for each file (json-file)
        with open('output.csv', 'a') as c:
            writer = csv.writer(c)
            writer.writerow(row)
        c.close()


if __name__ == '__main__':
    write_csv()

我希望这将有所帮助。有关此代码如何工作的详细信息,请单击此处

I might be late to the party, but I think, I have dealt with the similar problem. I had a json file which looked like this

JSON File Structure

I only wanted to extract few keys/values from these json file. So, I wrote the following code to extract the same.

    """json_to_csv.py
    This script reads n numbers of json files present in a folder and then extract certain data from each file and write in a csv file.
    The folder contains the python script i.e. json_to_csv.py, output.csv and another folder descriptions containing all the json files.
"""

import os
import json
import csv


def get_list_of_json_files():
    """Returns the list of filenames of all the Json files present in the folder
    Parameter
    ---------
    directory : str
        'descriptions' in this case
    Returns
    -------
    list_of_files: list
        List of the filenames of all the json files
    """

    list_of_files = os.listdir('descriptions')  # creates list of all the files in the folder

    return list_of_files


def create_list_from_json(jsonfile):
    """Returns a list of the extracted items from json file in the same order we need it.
    Parameter
    _________
    jsonfile : json
        The json file containing the data
    Returns
    -------
    one_sample_list : list
        The list of the extracted items needed for the final csv
    """

    with open(jsonfile) as f:
        data = json.load(f)

    data_list = []  # create an empty list

    # append the items to the list in the same order.
    data_list.append(data['_id'])
    data_list.append(data['_modelType'])
    data_list.append(data['creator']['_id'])
    data_list.append(data['creator']['name'])
    data_list.append(data['dataset']['_accessLevel'])
    data_list.append(data['dataset']['_id'])
    data_list.append(data['dataset']['description'])
    data_list.append(data['dataset']['name'])
    data_list.append(data['meta']['acquisition']['image_type'])
    data_list.append(data['meta']['acquisition']['pixelsX'])
    data_list.append(data['meta']['acquisition']['pixelsY'])
    data_list.append(data['meta']['clinical']['age_approx'])
    data_list.append(data['meta']['clinical']['benign_malignant'])
    data_list.append(data['meta']['clinical']['diagnosis'])
    data_list.append(data['meta']['clinical']['diagnosis_confirm_type'])
    data_list.append(data['meta']['clinical']['melanocytic'])
    data_list.append(data['meta']['clinical']['sex'])
    data_list.append(data['meta']['unstructured']['diagnosis'])
    # In few json files, the race was not there so using KeyError exception to add '' at the place
    try:
        data_list.append(data['meta']['unstructured']['race'])
    except KeyError:
        data_list.append("")  # will add an empty string in case race is not there.
    data_list.append(data['name'])

    return data_list


def write_csv():
    """Creates the desired csv file
    Parameters
    __________
    list_of_files : file
        The list created by get_list_of_json_files() method
    result.csv : csv
        The csv file containing the header only
    Returns
    _______
    result.csv : csv
        The desired csv file
    """

    list_of_files = get_list_of_json_files()
    for file in list_of_files:
        row = create_list_from_json(f'descriptions/{file}')  # create the row to be added to csv for each file (json-file)
        with open('output.csv', 'a') as c:
            writer = csv.writer(c)
            writer.writerow(row)
        c.close()


if __name__ == '__main__':
    write_csv()

I hope this will help. For details on how this code work you can check here


回答 25

这是@MikeRepass答案的修改。此版本将CSV写入文件,并且适用于Python 2和Python 3。

import csv,json
input_file="data.json"
output_file="data.csv"
with open(input_file) as f:
    content=json.load(f)
try:
    context=open(output_file,'w',newline='') # Python 3
except TypeError:
    context=open(output_file,'wb') # Python 2
with context as file:
    writer=csv.writer(file)
    writer.writerow(content[0].keys()) # header row
    for row in content:
        writer.writerow(row.values())

This is a modification of @MikeRepass’s answer. This version writes the CSV to a file, and works for both Python 2 and Python 3.

import csv,json
input_file="data.json"
output_file="data.csv"
with open(input_file) as f:
    content=json.load(f)
try:
    context=open(output_file,'w',newline='') # Python 3
except TypeError:
    context=open(output_file,'wb') # Python 2
with context as file:
    writer=csv.writer(file)
    writer.writerow(content[0].keys()) # header row
    for row in content:
        writer.writerow(row.values())

让JSON对象接受字节或让urlopen输出字符串

问题:让JSON对象接受字节或让urlopen输出字符串

使用Python 3,我需要从URL请求json文档。

response = urllib.request.urlopen(request)

response对象是带有readreadline方法的类似文件的对象。通常,可以使用在文本模式下打开的文件来创建JSON对象。

obj = json.load(fp)

我想做的是:

obj = json.load(response)

但是,此方法不起作用,因为urlopen以二进制模式返回文件对象。

解决方法当然是:

str_response = response.read().decode('utf-8')
obj = json.loads(str_response)

但这感觉不好…

有没有更好的方法可以将字节文件对象转换为字符串文件对象?还是我缺少任何一个参数urlopenjson.load给出编码?

With Python 3 I am requesting a json document from a URL.

response = urllib.request.urlopen(request)

The response object is a file-like object with read and readline methods. Normally a JSON object can be created with a file opened in text mode.

obj = json.load(fp)

What I would like to do is:

obj = json.load(response)

This however does not work as urlopen returns a file object in binary mode.

A work around is of course:

str_response = response.read().decode('utf-8')
obj = json.loads(str_response)

but this feels bad…

Is there a better way that I can transform a bytes file object to a string file object? Or am I missing any parameters for either urlopen or json.load to give an encoding?


回答 0

HTTP发送字节。如果所讨论的资源是文本,则通常通过Content-Type HTTP标头或其他机制(RFC,HTML meta http-equiv等)指定字符编码。

urllib 应该知道如何将字节编码为字符串,但这太幼稚了-这是一个功能强大且功能强大的非Pythonic库。

深入Python 3提供了有关情况的概述。

您的“变通方法”很好-尽管感觉不对,但这是正确的方法。

HTTP sends bytes. If the resource in question is text, the character encoding is normally specified, either by the Content-Type HTTP header or by another mechanism (an RFC, HTML meta http-equiv,…).

urllib should know how to encode the bytes to a string, but it’s too naïve—it’s a horribly underpowered and un-Pythonic library.

Dive Into Python 3 provides an overview about the situation.

Your “work-around” is fine—although it feels wrong, it’s the correct way to do it.


Http-prompt 构建在HTTPie之上的交互式命令行HTTP和API测试客户端,具有自动完成、语法突出显示等功能

NumPy数组不可JSON序列化

问题:NumPy数组不可JSON序列化

创建NumPy数组并将其另存为Django上下文变量后,加载网页时出现以下错误:

array([   0,  239,  479,  717,  952, 1192, 1432, 1667], dtype=int64) is not JSON serializable

这是什么意思?

After creating a NumPy array, and saving it as a Django context variable, I receive the following error when loading the webpage:

array([   0,  239,  479,  717,  952, 1192, 1432, 1667], dtype=int64) is not JSON serializable

What does this mean?


回答 0

我经常“ jsonify” np.arrays。尝试首先在数组上使用“ .tolist()”方法,如下所示:

import numpy as np
import codecs, json 

a = np.arange(10).reshape(2,5) # a 2 by 5 array
b = a.tolist() # nested lists with same data, indices
file_path = "/path.json" ## your path variable
json.dump(b, codecs.open(file_path, 'w', encoding='utf-8'), separators=(',', ':'), sort_keys=True, indent=4) ### this saves the array in .json format

为了“ unjsonify”数组使用:

obj_text = codecs.open(file_path, 'r', encoding='utf-8').read()
b_new = json.loads(obj_text)
a_new = np.array(b_new)

I regularly “jsonify” np.arrays. Try using the “.tolist()” method on the arrays first, like this:

import numpy as np
import codecs, json 

a = np.arange(10).reshape(2,5) # a 2 by 5 array
b = a.tolist() # nested lists with same data, indices
file_path = "/path.json" ## your path variable
json.dump(b, codecs.open(file_path, 'w', encoding='utf-8'), separators=(',', ':'), sort_keys=True, indent=4) ### this saves the array in .json format

In order to “unjsonify” the array use:

obj_text = codecs.open(file_path, 'r', encoding='utf-8').read()
b_new = json.loads(obj_text)
a_new = np.array(b_new)

回答 1

将numpy.ndarray或任何嵌套列表组合作为JSON存储。

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape)
json_dump = json.dumps({'a': a, 'aa': [2, (2, 3, 4), a], 'bb': [2]}, cls=NumpyEncoder)
print(json_dump)

将输出:

(2, 3)
{"a": [[1, 2, 3], [4, 5, 6]], "aa": [2, [2, 3, 4], [[1, 2, 3], [4, 5, 6]]], "bb": [2]}

要从JSON还原:

json_load = json.loads(json_dump)
a_restored = np.asarray(json_load["a"])
print(a_restored)
print(a_restored.shape)

将输出:

[[1 2 3]
 [4 5 6]]
(2, 3)

Store as JSON a numpy.ndarray or any nested-list composition.

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape)
json_dump = json.dumps({'a': a, 'aa': [2, (2, 3, 4), a], 'bb': [2]}, cls=NumpyEncoder)
print(json_dump)

Will output:

(2, 3)
{"a": [[1, 2, 3], [4, 5, 6]], "aa": [2, [2, 3, 4], [[1, 2, 3], [4, 5, 6]]], "bb": [2]}

To restore from JSON:

json_load = json.loads(json_dump)
a_restored = np.asarray(json_load["a"])
print(a_restored)
print(a_restored.shape)

Will output:

[[1 2 3]
 [4 5 6]]
(2, 3)

回答 2

您可以使用Pandas

import pandas as pd
pd.Series(your_array).to_json(orient='values')

You can use Pandas:

import pandas as pd
pd.Series(your_array).to_json(orient='values')

回答 3

如果您在字典中嵌套了numpy数组,我找到了最佳解决方案:

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """ Special json encoder for numpy types """
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

dumped = json.dumps(data, cls=NumpyEncoder)

with open(path, 'w') as f:
    json.dump(dumped, f)

感谢这个家伙

I found the best solution if you have nested numpy arrays in a dictionary:

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """ Special json encoder for numpy types """
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

dumped = json.dumps(data, cls=NumpyEncoder)

with open(path, 'w') as f:
    json.dump(dumped, f)

Thanks to this guy.


回答 4

使用json.dumps defaultkwarg:

default应该是一个为无法序列化的对象调用的函数。

default函数中,检查对象是否来自numpy模块,如果是,则将其ndarray.tolist用于ndarray或将其.item用于任何其他特定于numpy的类型。

import numpy as np

def default(obj):
    if type(obj).__module__ == np.__name__:
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return obj.item()
    raise TypeError('Unknown type:', type(obj))

dumped = json.dumps(data, default=default)

Use the json.dumps default kwarg:

default should be a function that gets called for objects that can’t otherwise be serialized. … or raise a TypeError

In the default function check if the object is from the module numpy, if so either use ndarray.tolist for a ndarray or use .item for any other numpy specific type.

import numpy as np

def default(obj):
    if type(obj).__module__ == np.__name__:
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return obj.item()
    raise TypeError('Unknown type:', type(obj))

dumped = json.dumps(data, default=default)

回答 5

默认情况下不支持此功能,但是您可以使其轻松工作!如果您想返回完全相同的数据,则需要对几件事进行编码:

  • 数据本身,您可以获得 obj.tolist() @travelingbones。有时这可能已经足够了。
  • 数据类型。我觉得在某些情况下这很重要。
  • 如果您假设输入确实始终是“矩形”网格,则可以从上面得出尺寸(不一定是2D)。
  • 内存顺序(行或列为主)。这通常并不重要,但有时却很重要(例如性能),那么为什么不保存所有内容呢?

此外,您的numpy数组可能是数据结构的一部分,例如,您有一个包含一些矩阵的列表。为此,您可以使用基本上完成上述操作的自定义编码器。

这应该足以实施解决方案。或者,您可以使用json-tricks来做到这一点(并支持其他各种类型)(免责声明:我做到了)。

pip install json-tricks

然后

data = [
    arange(0, 10, 1, dtype=int).reshape((2, 5)),
    datetime(year=2017, month=1, day=19, hour=23, minute=00, second=00),
    1 + 2j,
    Decimal(42),
    Fraction(1, 3),
    MyTestCls(s='ub', dct={'7': 7}),  # see later
    set(range(7)),
]
# Encode with metadata to preserve types when decoding
print(dumps(data))

This is not supported by default, but you can make it work quite easily! There are several things you’ll want to encode if you want the exact same data back:

  • The data itself, which you can get with obj.tolist() as @travelingbones mentioned. Sometimes this may be good enough.
  • The data type. I feel this is important in quite some cases.
  • The dimension (not necessarily 2D), which could be derived from the above if you assume the input is indeed always a ‘rectangular’ grid.
  • The memory order (row- or column-major). This doesn’t often matter, but sometimes it does (e.g. performance), so why not save everything?

Furthermore, your numpy array could part of your data structure, e.g. you have a list with some matrices inside. For that you could use a custom encoder which basically does the above.

This should be enough to implement a solution. Or you could use json-tricks which does just this (and supports various other types) (disclaimer: I made it).

pip install json-tricks

Then

data = [
    arange(0, 10, 1, dtype=int).reshape((2, 5)),
    datetime(year=2017, month=1, day=19, hour=23, minute=00, second=00),
    1 + 2j,
    Decimal(42),
    Fraction(1, 3),
    MyTestCls(s='ub', dct={'7': 7}),  # see later
    set(range(7)),
]
# Encode with metadata to preserve types when decoding
print(dumps(data))

回答 6

嵌套字典中有一些numpy.ndarrays,我也遇到类似的问题。

def jsonify(data):
    json_data = dict()
    for key, value in data.iteritems():
        if isinstance(value, list): # for lists
            value = [ jsonify(item) if isinstance(item, dict) else item for item in value ]
        if isinstance(value, dict): # for nested lists
            value = jsonify(value)
        if isinstance(key, int): # if key is integer: > to string
            key = str(key)
        if type(value).__module__=='numpy': # if value is numpy.*: > to python list
            value = value.tolist()
        json_data[key] = value
    return json_data

I had a similar problem with a nested dictionary with some numpy.ndarrays in it.

def jsonify(data):
    json_data = dict()
    for key, value in data.iteritems():
        if isinstance(value, list): # for lists
            value = [ jsonify(item) if isinstance(item, dict) else item for item in value ]
        if isinstance(value, dict): # for nested lists
            value = jsonify(value)
        if isinstance(key, int): # if key is integer: > to string
            key = str(key)
        if type(value).__module__=='numpy': # if value is numpy.*: > to python list
            value = value.tolist()
        json_data[key] = value
    return json_data

回答 7

您还可以使用default参数例如:

def myconverter(o):
    if isinstance(o, np.float32):
        return float(o)

json.dump(data, default=myconverter)

You could also use default argument for example:

def myconverter(o):
    if isinstance(o, np.float32):
        return float(o)

json.dump(data, default=myconverter)

回答 8

另外,关于Python中的列表和数组,还有一些非常有趣的信息〜> Python中的列表与数组的 Python列表与数组-何时使用?

可以注意到,在将数组保存到JSON文件之前将其转换为列表之后,无论如何,现在无论如何在我的部署中,一旦读取该JSON文件以备后用,我就可以继续以列表形式使用它(如而不是将其转换回数组)。

这样,与屏幕上的列表(逗号分隔)和数组(非逗号分隔)相比,实际上它看起来更好(在我看来)。

使用上面的@travelingbones的.tolist()方法,我已经这样使用了(也发现了一些我发现的错误):

保存词典

def writeDict(values, name):
    writeName = DIR+name+'.json'
    with open(writeName, "w") as outfile:
        json.dump(values, outfile)

阅读词典

def readDict(name):
    readName = DIR+name+'.json'
    try:
        with open(readName, "r") as infile:
            dictValues = json.load(infile)
            return(dictValues)
    except IOError as e:
        print(e)
        return('None')
    except ValueError as e:
        print(e)
        return('None')

希望这可以帮助!

Also, some very interesting information further on lists vs. arrays in Python ~> Python List vs. Array – when to use?

It could be noted that once I convert my arrays into a list before saving it in a JSON file, in my deployment right now anyways, once I read that JSON file for use later, I can continue to use it in a list form (as opposed to converting it back to an array).

AND actually looks nicer (in my opinion) on the screen as a list (comma seperated) vs. an array (not-comma seperated) this way.

Using @travelingbones’s .tolist() method above, I’ve been using as such (catching a few errors I’ve found too):

SAVE DICTIONARY

def writeDict(values, name):
    writeName = DIR+name+'.json'
    with open(writeName, "w") as outfile:
        json.dump(values, outfile)

READ DICTIONARY

def readDict(name):
    readName = DIR+name+'.json'
    try:
        with open(readName, "r") as infile:
            dictValues = json.load(infile)
            return(dictValues)
    except IOError as e:
        print(e)
        return('None')
    except ValueError as e:
        print(e)
        return('None')

Hope this helps!


回答 9

这是一个对我有用的实现,并删除了所有nan(假设它们是简单的对象(列表或字典)):

from numpy import isnan

def remove_nans(my_obj, val=None):
    if isinstance(my_obj, list):
        for i, item in enumerate(my_obj):
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[i] = remove_nans(my_obj[i], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[i] = val
                except Exception:
                    pass

    elif isinstance(my_obj, dict):
        for key, item in my_obj.iteritems():
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[key] = remove_nans(my_obj[key], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[key] = val
                except Exception:
                    pass

    return my_obj

Here is an implementation that work for me and removed all nans (assuming these are simple object (list or dict)):

from numpy import isnan

def remove_nans(my_obj, val=None):
    if isinstance(my_obj, list):
        for i, item in enumerate(my_obj):
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[i] = remove_nans(my_obj[i], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[i] = val
                except Exception:
                    pass

    elif isinstance(my_obj, dict):
        for key, item in my_obj.iteritems():
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[key] = remove_nans(my_obj[key], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[key] = val
                except Exception:
                    pass

    return my_obj

回答 10

这是一个不同的答案,但这可能有助于帮助试图保存数据然后再次读取的人们。
有一个比泡菜快和容易的hi。
我试图保存并在泡菜转储中阅读它,但是阅读时有很多问题,浪费了一个小时,尽管我正在处理自己的数据以创建聊天机器人,但仍然找不到解决方案。

vec_x并且vec_y是numpy数组:

data=[vec_x,vec_y]
hkl.dump( data, 'new_data_file.hkl' )

然后,您只需阅读并执行以下操作:

data2 = hkl.load( 'new_data_file.hkl' )

This is a different answer, but this might help to help people who are trying to save data and then read it again.
There is hickle which is faster than pickle and easier.
I tried to save and read it in pickle dump but while reading there were lot of problems and wasted an hour and still didn’t find solution though I was working on my own data to create a chat bot.

vec_x and vec_y are numpy arrays:

data=[vec_x,vec_y]
hkl.dump( data, 'new_data_file.hkl' )

Then you just read it and perform the operations:

data2 = hkl.load( 'new_data_file.hkl' )

回答 11

可以使用检查类型来简化循环:

with open("jsondontdoit.json", 'w') as fp:
    for key in bests.keys():
        if type(bests[key]) == np.ndarray:
            bests[key] = bests[key].tolist()
            continue
        for idx in bests[key]:
            if type(bests[key][idx]) == np.ndarray:
                bests[key][idx] = bests[key][idx].tolist()
    json.dump(bests, fp)
    fp.close()

May do simple for loop with checking types:

with open("jsondontdoit.json", 'w') as fp:
    for key in bests.keys():
        if type(bests[key]) == np.ndarray:
            bests[key] = bests[key].tolist()
            continue
        for idx in bests[key]:
            if type(bests[key][idx]) == np.ndarray:
                bests[key][idx] = bests[key][idx].tolist()
    json.dump(bests, fp)
    fp.close()

回答 12

使用NumpyEncoder它将成功处理json转储。不抛出-NumPy数组不是JSON可序列化的

import numpy as np
import json
from numpyencoder import NumpyEncoder
arr = array([   0,  239,  479,  717,  952, 1192, 1432, 1667], dtype=int64) 
json.dumps(arr,cls=NumpyEncoder)

use NumpyEncoder it will process json dump successfully.without throwing – NumPy array is not JSON serializable

import numpy as np
import json
from numpyencoder import NumpyEncoder
arr = array([   0,  239,  479,  717,  952, 1192, 1432, 1667], dtype=int64) 
json.dumps(arr,cls=NumpyEncoder)

回答 13

TypeError:array([[0.46872085,0.67374235,1.0218339,0.13210179,0.5440686,0.9140083,0.58720225,0.2199381]],dtype = float32)不是JSON可序列化的

当我期望以json格式响应时,尝试将数据列表传递给model.predict()时,抛出了上述错误。

> 1        json_file = open('model.json','r')
> 2        loaded_model_json = json_file.read()
> 3        json_file.close()
> 4        loaded_model = model_from_json(loaded_model_json)
> 5        #load weights into new model
> 6        loaded_model.load_weights("model.h5")
> 7        loaded_model.compile(optimizer='adam', loss='mean_squared_error')
> 8        X =  [[874,12450,678,0.922500,0.113569]]
> 9        d = pd.DataFrame(X)
> 10       prediction = loaded_model.predict(d)
> 11       return jsonify(prediction)

但幸运的是找到了解决抛出错误的提示对象的序列化仅适用于以下转换映射应采用以下方式object-dict array-list string-string integer-integer

如果您向上滚动以查看第10行的代码,则这行代码将生成array数据类型的输出,当您尝试将array转换为json格式时,这是不可能的。

最终我找到了解决方案,只需通过遵循以下几行代码将获得的输出转换为类型列表即可

预测=加载模型。预测(d)
列表类型=预测。列表()返回jsonify(列表类型)

hoo!终于得到了预期的输出, 在此处输入图片说明

TypeError: array([[0.46872085, 0.67374235, 1.0218339 , 0.13210179, 0.5440686 , 0.9140083 , 0.58720225, 0.2199381 ]], dtype=float32) is not JSON serializable

The above-mentioned error was thrown when i tried to pass of list of data to model.predict() when i was expecting the response in json format.

> 1        json_file = open('model.json','r')
> 2        loaded_model_json = json_file.read()
> 3        json_file.close()
> 4        loaded_model = model_from_json(loaded_model_json)
> 5        #load weights into new model
> 6        loaded_model.load_weights("model.h5")
> 7        loaded_model.compile(optimizer='adam', loss='mean_squared_error')
> 8        X =  [[874,12450,678,0.922500,0.113569]]
> 9        d = pd.DataFrame(X)
> 10       prediction = loaded_model.predict(d)
> 11       return jsonify(prediction)

But luckily found the hint to resolve the error that was throwing The serializing of the objects is applicable only for the following conversion Mapping should be in following way object – dict array – list string – string integer – integer

If you scroll up to see the line number 10 prediction = loaded_model.predict(d) where this line of code was generating the output of type array datatype , when you try to convert array to json format its not possible

Finally i found the solution just by converting obtained output to the type list by following lines of code

prediction = loaded_model.predict(d)
listtype = prediction.tolist() return jsonify(listtype)

Bhoom! finally got the expected output, enter image description here


Python JSON序列化Decimal对象

问题:Python JSON序列化Decimal对象

我有一个Decimal('3.9')对象的一部分,希望将其编码为JSON字符串,看起来像{'x': 3.9}。我不在乎客户端的精度,因此浮点数很好。

是否有序列化此序列的好方法?JSONDecoder不接受Decimal对象,并且事先转换为float会产生{'x': 3.8999999999999999}错误,这将浪费大量带宽。

I have a Decimal('3.9') as part of an object, and wish to encode this to a JSON string which should look like {'x': 3.9}. I don’t care about precision on the client side, so a float is fine.

Is there a good way to serialize this? JSONDecoder doesn’t accept Decimal objects, and converting to a float beforehand yields {'x': 3.8999999999999999} which is wrong, and will be a big waste of bandwidth.


回答 0

子类化如何json.JSONEncoder

class DecimalEncoder(json.JSONEncoder):
    def _iterencode(self, o, markers=None):
        if isinstance(o, decimal.Decimal):
            # wanted a simple yield str(o) in the next line,
            # but that would mean a yield on the line with super(...),
            # which wouldn't work (see my comment below), so...
            return (str(o) for o in [o])
        return super(DecimalEncoder, self)._iterencode(o, markers)

然后像这样使用它:

json.dumps({'x': decimal.Decimal('5.5')}, cls=DecimalEncoder)

How about subclassing json.JSONEncoder?

class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            # wanted a simple yield str(o) in the next line,
            # but that would mean a yield on the line with super(...),
            # which wouldn't work (see my comment below), so...
            return (str(o) for o in [o])
        return super(DecimalEncoder, self).default(o)

Then use it like so:

json.dumps({'x': decimal.Decimal('5.5')}, cls=DecimalEncoder)

回答 1

Simplejson 2.1及更高版本具有对Decimal类型的本地支持:

>>> json.dumps(Decimal('3.9'), use_decimal=True)
'3.9'

请注意,use_decimalTrue在默认情况下:

def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
    allow_nan=True, cls=None, indent=None, separators=None,
    encoding='utf-8', default=None, use_decimal=True,
    namedtuple_as_object=True, tuple_as_array=True,
    bigint_as_string=False, sort_keys=False, item_sort_key=None,
    for_json=False, ignore_nan=False, **kw):

所以:

>>> json.dumps(Decimal('3.9'))
'3.9'

希望此功能将包含在标准库中。

Simplejson 2.1 and higher has native support for Decimal type:

>>> json.dumps(Decimal('3.9'), use_decimal=True)
'3.9'

Note that use_decimal is True by default:

def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
    allow_nan=True, cls=None, indent=None, separators=None,
    encoding='utf-8', default=None, use_decimal=True,
    namedtuple_as_object=True, tuple_as_array=True,
    bigint_as_string=False, sort_keys=False, item_sort_key=None,
    for_json=False, ignore_nan=False, **kw):

So:

>>> json.dumps(Decimal('3.9'))
'3.9'

Hopefully, this feature will be included in standard library.


回答 2

我想让大家知道,我在运行Python 2.6.5的Web服务器上尝试了MichałMarczyk的答案,并且运行良好。但是,我升级到了Python 2.7,它停止工作了。我试图想到某种编码Decimal对象的方法,这就是我想出的:

import decimal

class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return float(o)
        return super(DecimalEncoder, self).default(o)

希望这应该对任何遇到Python 2.7问题的人有所帮助。我测试了它,看来效果很好。如果有人注意到我的解决方案中的任何错误或提出了更好的方法,请告诉我。

I would like to let everyone know that I tried Michał Marczyk’s answer on my web server that was running Python 2.6.5 and it worked fine. However, I upgraded to Python 2.7 and it stopped working. I tried to think of some sort of way to encode Decimal objects and this is what I came up with:

import decimal

class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return float(o)
        return super(DecimalEncoder, self).default(o)

This should hopefully help anyone who is having problems with Python 2.7. I tested it and it seems to work fine. If anyone notices any bugs in my solution or comes up with a better way, please let me know.


回答 3

在我的Flask应用程序中,它使用python 2.7.11,flask alchemy(具有“ db.decimal”类型)和Flask Marshmallow(用于“即时”序列化器和反序列化器),每次执行GET或POST时,我都会遇到此错误。序列化器和反序列化器无法将Decimal类型转换为任何JSON可识别格式。

我做了一个“ pip install simplejson”,然后只需添加

import simplejson as json

序列化器和解串器再次开始发出嗡嗡声。我什么也没做… DEciamls显示为’234.00’浮点格式。

In my Flask app, Which uses python 2.7.11, flask alchemy(with ‘db.decimal’ types), and Flask Marshmallow ( for ‘instant’ serializer and deserializer), i had this error, every time i did a GET or POST. The serializer and deserializer, failed to convert Decimal types into any JSON identifiable format.

I did a “pip install simplejson”, then Just by adding

import simplejson as json

the serializer and deserializer starts to purr again. I did nothing else… DEciamls are displayed as ‘234.00’ float format.


回答 4

我尝试从GAE 2.7的simplejson切换到内置的json,但小数位数有问题。如果default返回的str(o)有引号(因为_iterencode在default的结果上调用_iterencode),并且float(o)会删除结尾的0。

如果default返回一个从float继承的类的对象(或任何不带其他格式调用repr的类)并具有自定义__repr__方法,则它看起来像我希望的那样工作。

import json
from decimal import Decimal

class fakefloat(float):
    def __init__(self, value):
        self._value = value
    def __repr__(self):
        return str(self._value)

def defaultencode(o):
    if isinstance(o, Decimal):
        # Subclass float with custom repr?
        return fakefloat(o)
    raise TypeError(repr(o) + " is not JSON serializable")

json.dumps([10.20, "10.20", Decimal('10.20')], default=defaultencode)
'[10.2, "10.20", 10.20]'

I tried switching from simplejson to builtin json for GAE 2.7, and had issues with the decimal. If default returned str(o) there were quotes (because _iterencode calls _iterencode on the results of default), and float(o) would remove trailing 0.

If default returns an object of a class that inherits from float (or anything that calls repr without additional formatting) and has a custom __repr__ method, it seems to work like I want it to.

import json
from decimal import Decimal

class fakefloat(float):
    def __init__(self, value):
        self._value = value
    def __repr__(self):
        return str(self._value)

def defaultencode(o):
    if isinstance(o, Decimal):
        # Subclass float with custom repr?
        return fakefloat(o)
    raise TypeError(repr(o) + " is not JSON serializable")

json.dumps([10.20, "10.20", Decimal('10.20')], default=defaultencode)
'[10.2, "10.20", 10.20]'

回答 5

缺少本机选项,因此我将其添加到寻找它的下一个家伙/画廊。

从Django 1.7.x开始,有一个内置组件DjangoJSONEncoder,您可以从中获取它django.core.serializers.json

import json
from django.core.serializers.json import DjangoJSONEncoder
from django.forms.models import model_to_dict

model_instance = YourModel.object.first()
model_dict = model_to_dict(model_instance)

json.dumps(model_dict, cls=DjangoJSONEncoder)

快点!

The native option is missing so I’ll add it for the next guy/gall that looks for it.

Starting on Django 1.7.x there is a built-in DjangoJSONEncoder that you can get it from django.core.serializers.json.

import json
from django.core.serializers.json import DjangoJSONEncoder
from django.forms.models import model_to_dict

model_instance = YourModel.object.first()
model_dict = model_to_dict(model_instance)

json.dumps(model_dict, cls=DjangoJSONEncoder)

Presto!


回答 6

我的$ .02!

因为要为Web服务器序列化大量数据,所以我扩展了许多JSON编码器。这是一些不错的代码。请注意,它很容易扩展为几乎任何您想要的数据格式,并且可以将3.9复制为"thing": 3.9

JSONEncoder_olddefault = json.JSONEncoder.default
def JSONEncoder_newdefault(self, o):
    if isinstance(o, UUID): return str(o)
    if isinstance(o, datetime): return str(o)
    if isinstance(o, time.struct_time): return datetime.fromtimestamp(time.mktime(o))
    if isinstance(o, decimal.Decimal): return str(o)
    return JSONEncoder_olddefault(self, o)
json.JSONEncoder.default = JSONEncoder_newdefault

让我的生活变得更加轻松…

My $.02!

I extend a bunch of the JSON encoder since I am serializing tons of data for my web server. Here’s some nice code. Note that it’s easily extendable to pretty much any data format you feel like and will reproduce 3.9 as "thing": 3.9

JSONEncoder_olddefault = json.JSONEncoder.default
def JSONEncoder_newdefault(self, o):
    if isinstance(o, UUID): return str(o)
    if isinstance(o, datetime): return str(o)
    if isinstance(o, time.struct_time): return datetime.fromtimestamp(time.mktime(o))
    if isinstance(o, decimal.Decimal): return str(o)
    return JSONEncoder_olddefault(self, o)
json.JSONEncoder.default = JSONEncoder_newdefault

Makes my life so much easier…


回答 7

3.9不能在IEEE浮点数中精确表示,它总是以表示3.8999999999999999,例如try print repr(3.9),您可以在此处了解更多信息:

http://en.wikipedia.org/wiki/Floating_point
http://docs.sun.com/source/806-3568/ncg_goldberg.html

因此,如果您不希望浮动,则只能将其作为字符串发送,并且要允许将十进制对象自动转换为JSON,请执行以下操作:

import decimal
from django.utils import simplejson

def json_encode_decimal(obj):
    if isinstance(obj, decimal.Decimal):
        return str(obj)
    raise TypeError(repr(obj) + " is not JSON serializable")

d = decimal.Decimal('3.5')
print simplejson.dumps([d], default=json_encode_decimal)

3.9 can not be exactly represented in IEEE floats, it will always come as 3.8999999999999999, e.g. try print repr(3.9), you can read more about it here:

http://en.wikipedia.org/wiki/Floating_point
http://docs.sun.com/source/806-3568/ncg_goldberg.html

So if you don’t want float, only option you have to send it as string, and to allow automatic conversion of decimal objects to JSON, do something like this:

import decimal
from django.utils import simplejson

def json_encode_decimal(obj):
    if isinstance(obj, decimal.Decimal):
        return str(obj)
    raise TypeError(repr(obj) + " is not JSON serializable")

d = decimal.Decimal('3.5')
print simplejson.dumps([d], default=json_encode_decimal)

回答 8

对于Django用户

最近遇到了TypeError: Decimal('2337.00') is not JSON serializable JSON编码,即json.dumps(data)

解决方案

# converts Decimal, Datetime, UUIDs to str for Encoding
from django.core.serializers.json import DjangoJSONEncoder  

json.dumps(response.data, cls=DjangoJSONEncoder)

但是,现在Decimal值将是一个字符串,现在我们可以在解码数据时使用parse_floatin中的option 显式设置十进制/浮点值解析器json.loads

import decimal 

data = json.loads(data, parse_float=decimal.Decimal) # default is float(num_str)

For Django users:

Recently came across TypeError: Decimal('2337.00') is not JSON serializable while JSON encoding i.e. json.dumps(data)

Solution:

# converts Decimal, Datetime, UUIDs to str for Encoding
from django.core.serializers.json import DjangoJSONEncoder  

json.dumps(response.data, cls=DjangoJSONEncoder)

But, now the Decimal value will be a string, now we can explicitly set the decimal/float value parser when decoding data, using parse_float option in json.loads:

import decimal 

data = json.loads(data, parse_float=decimal.Decimal) # default is float(num_str)

回答 9

JSON标准文档(如json.org中链接):

JSON与数字的语义无关。在任何编程语言中,可以有各种容量和补码形式的数字类型,固定或浮动,二进制或十进制。这可能使不同编程语言之间的交换变得困难。JSON而是仅提供人类使用的数字表示形式:数字序列。所有编程语言都知道如何理解数字序列,即使它们在内部表示形式上存在分歧。这足以允许互换。

因此,在JSON中将小数表示为数字(而不是字符串)实际上是准确的。贝娄为该问题提供了可能的解决方案。

定义自定义JSON编码器:

import json


class CustomJsonEncoder(json.JSONEncoder):

    def default(self, obj):
        if isinstance(obj, Decimal):
            return float(obj)
        return super(CustomJsonEncoder, self).default(obj)

然后在序列化数据时使用它:

json.dumps(data, cls=CustomJsonEncoder)

从其他答案的注释中可以看出,较旧版本的python在转换为float时可能会弄乱表示形式,但现在不再如此。

要在Python中返回小数:

Decimal(str(value))

Python 3.0文档中的小数点提示了该解决方案:

要从浮点数创建小数,请首先将其转换为字符串。

From the JSON Standard Document, as linked in json.org:

JSON is agnostic about the semantics of numbers. In any programming language, there can be a variety of number types of various capacities and complements, fixed or floating, binary or decimal. That can make interchange between different programming languages difficult. JSON instead offers only the representation of numbers that humans use: a sequence of digits. All programming languages know how to make sense of digit sequences even if they disagree on internal representations. That is enough to allow interchange.

So it’s actually accurate to represent Decimals as numbers (rather than strings) in JSON. Bellow lies a possible solution to the problem.

Define a custom JSON encoder:

import json


class CustomJsonEncoder(json.JSONEncoder):

    def default(self, obj):
        if isinstance(obj, Decimal):
            return float(obj)
        return super(CustomJsonEncoder, self).default(obj)

Then use it when serializing your data:

json.dumps(data, cls=CustomJsonEncoder)

As noted from comments on the other answers, older versions of python might mess up the representation when converting to float, but that’s not the case anymore.

To get the decimal back in Python:

Decimal(str(value))

This solution is hinted in Python 3.0 documentation on decimals:

To create a Decimal from a float, first convert it to a string.


回答 10

这就是我从课堂上摘录的

class CommonJSONEncoder(json.JSONEncoder):

    """
    Common JSON Encoder
    json.dumps(myString, cls=CommonJSONEncoder)
    """

    def default(self, obj):

        if isinstance(obj, decimal.Decimal):
            return {'type{decimal}': str(obj)}

class CommonJSONDecoder(json.JSONDecoder):

    """
    Common JSON Encoder
    json.loads(myString, cls=CommonJSONEncoder)
    """

    @classmethod
    def object_hook(cls, obj):
        for key in obj:
            if isinstance(key, six.string_types):
                if 'type{decimal}' == key:
                    try:
                        return decimal.Decimal(obj[key])
                    except:
                        pass

    def __init__(self, **kwargs):
        kwargs['object_hook'] = self.object_hook
        super(CommonJSONDecoder, self).__init__(**kwargs)

通过单元测试:

def test_encode_and_decode_decimal(self):
    obj = Decimal('1.11')
    result = json.dumps(obj, cls=CommonJSONEncoder)
    self.assertTrue('type{decimal}' in result)
    new_obj = json.loads(result, cls=CommonJSONDecoder)
    self.assertEqual(new_obj, obj)

    obj = {'test': Decimal('1.11')}
    result = json.dumps(obj, cls=CommonJSONEncoder)
    self.assertTrue('type{decimal}' in result)
    new_obj = json.loads(result, cls=CommonJSONDecoder)
    self.assertEqual(new_obj, obj)

    obj = {'test': {'abc': Decimal('1.11')}}
    result = json.dumps(obj, cls=CommonJSONEncoder)
    self.assertTrue('type{decimal}' in result)
    new_obj = json.loads(result, cls=CommonJSONDecoder)
    self.assertEqual(new_obj, obj)

This is what I have, extracted from our class

class CommonJSONEncoder(json.JSONEncoder):

    """
    Common JSON Encoder
    json.dumps(myString, cls=CommonJSONEncoder)
    """

    def default(self, obj):

        if isinstance(obj, decimal.Decimal):
            return {'type{decimal}': str(obj)}

class CommonJSONDecoder(json.JSONDecoder):

    """
    Common JSON Encoder
    json.loads(myString, cls=CommonJSONEncoder)
    """

    @classmethod
    def object_hook(cls, obj):
        for key in obj:
            if isinstance(key, six.string_types):
                if 'type{decimal}' == key:
                    try:
                        return decimal.Decimal(obj[key])
                    except:
                        pass

    def __init__(self, **kwargs):
        kwargs['object_hook'] = self.object_hook
        super(CommonJSONDecoder, self).__init__(**kwargs)

Which passes unittest:

def test_encode_and_decode_decimal(self):
    obj = Decimal('1.11')
    result = json.dumps(obj, cls=CommonJSONEncoder)
    self.assertTrue('type{decimal}' in result)
    new_obj = json.loads(result, cls=CommonJSONDecoder)
    self.assertEqual(new_obj, obj)

    obj = {'test': Decimal('1.11')}
    result = json.dumps(obj, cls=CommonJSONEncoder)
    self.assertTrue('type{decimal}' in result)
    new_obj = json.loads(result, cls=CommonJSONDecoder)
    self.assertEqual(new_obj, obj)

    obj = {'test': {'abc': Decimal('1.11')}}
    result = json.dumps(obj, cls=CommonJSONEncoder)
    self.assertTrue('type{decimal}' in result)
    new_obj = json.loads(result, cls=CommonJSONDecoder)
    self.assertEqual(new_obj, obj)

回答 11

您可以根据需要创建自定义JSON编码器。

import json
from datetime import datetime, date
from time import time, struct_time, mktime
import decimal

class CustomJSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, datetime):
            return str(o)
        if isinstance(o, date):
            return str(o)
        if isinstance(o, decimal.Decimal):
            return float(o)
        if isinstance(o, struct_time):
            return datetime.fromtimestamp(mktime(o))
        # Any other serializer if needed
        return super(CustomJSONEncoder, self).default(o)

解码器可以这样称呼,

import json
from decimal import Decimal
json.dumps({'x': Decimal('3.9')}, cls=CustomJSONEncoder)

输出将是:

>>'{"x": 3.9}'

You can create a custom JSON encoder as per your requirement.

import json
from datetime import datetime, date
from time import time, struct_time, mktime
import decimal

class CustomJSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, datetime):
            return str(o)
        if isinstance(o, date):
            return str(o)
        if isinstance(o, decimal.Decimal):
            return float(o)
        if isinstance(o, struct_time):
            return datetime.fromtimestamp(mktime(o))
        # Any other serializer if needed
        return super(CustomJSONEncoder, self).default(o)

The Decoder can be called like this,

import json
from decimal import Decimal
json.dumps({'x': Decimal('3.9')}, cls=CustomJSONEncoder)

and the output will be:

>>'{"x": 3.9}'

回答 12

对于那些不想使用第三方库的人来说……Elias Zamaria的问题是它会转换为float,这会遇到问题。例如:

>>> json.dumps({'x': Decimal('0.0000001')}, cls=DecimalEncoder)
'{"x": 1e-07}'
>>> json.dumps({'x': Decimal('100000000000.01734')}, cls=DecimalEncoder)
'{"x": 100000000000.01733}'

JSONEncoder.encode()方法使您可以返回原义的json内容,与不同的是JSONEncoder.default(),该方法返回的是json兼容类型(例如float),然后以常规方式对其进行编码。问题encode()在于它(通常)仅在最高级别上起作用。但是它仍然可以使用,但需要做一些额外的工作(python 3.x):

import json
from collections.abc import Mapping, Iterable
from decimal import Decimal

class DecimalEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, Mapping):
            return '{' + ', '.join(f'{self.encode(k)}: {self.encode(v)}' for (k, v) in obj.items()) + '}'
        if isinstance(obj, Iterable) and (not isinstance(obj, str)):
            return '[' + ', '.join(map(self.encode, obj)) + ']'
        if isinstance(obj, Decimal):
            return f'{obj.normalize():f}'  # using normalize() gets rid of trailing 0s, using ':f' prevents scientific notation
        return super().encode(obj)

这给你:

>>> json.dumps({'x': Decimal('0.0000001')}, cls=DecimalEncoder)
'{"x": 0.0000001}'
>>> json.dumps({'x': Decimal('100000000000.01734')}, cls=DecimalEncoder)
'{"x": 100000000000.01734}'

For those who don’t want to use a third-party library… An issue with Elias Zamaria’s answer is that it converts to float, which can run into problems. For example:

>>> json.dumps({'x': Decimal('0.0000001')}, cls=DecimalEncoder)
'{"x": 1e-07}'
>>> json.dumps({'x': Decimal('100000000000.01734')}, cls=DecimalEncoder)
'{"x": 100000000000.01733}'

The JSONEncoder.encode() method lets you return the literal json content, unlike JSONEncoder.default(), which has you return a json compatible type (like float) that then gets encoded in the normal way. The problem with encode() is that it (normally) only works at the top level. But it’s still usable, with a little extra work (python 3.x):

import json
from collections.abc import Mapping, Iterable
from decimal import Decimal

class DecimalEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, Mapping):
            return '{' + ', '.join(f'{self.encode(k)}: {self.encode(v)}' for (k, v) in obj.items()) + '}'
        if isinstance(obj, Iterable) and (not isinstance(obj, str)):
            return '[' + ', '.join(map(self.encode, obj)) + ']'
        if isinstance(obj, Decimal):
            return f'{obj.normalize():f}'  # using normalize() gets rid of trailing 0s, using ':f' prevents scientific notation
        return super().encode(obj)

Which gives you:

>>> json.dumps({'x': Decimal('0.0000001')}, cls=DecimalEncoder)
'{"x": 0.0000001}'
>>> json.dumps({'x': Decimal('100000000000.01734')}, cls=DecimalEncoder)
'{"x": 100000000000.01734}'

回答 13

根据stdOrgnlDave的答案,我定义了此包装器,可以使用可选的种类调用该包装器,因此编码器仅适用于您项目中的某些种类。我认为应该在代码内完成工作,而不要使用此“默认”编码器,因为“它比隐式更好”是显式的,但是我知道使用它可以节省您的时间。:-)

import time
import json
import decimal
from uuid import UUID
from datetime import datetime

def JSONEncoder_newdefault(kind=['uuid', 'datetime', 'time', 'decimal']):
    '''
    JSON Encoder newdfeault is a wrapper capable of encoding several kinds
    Use it anywhere on your code to make the full system to work with this defaults:
        JSONEncoder_newdefault()  # for everything
        JSONEncoder_newdefault(['decimal'])  # only for Decimal
    '''
    JSONEncoder_olddefault = json.JSONEncoder.default

    def JSONEncoder_wrapped(self, o):
        '''
        json.JSONEncoder.default = JSONEncoder_newdefault
        '''
        if ('uuid' in kind) and isinstance(o, uuid.UUID):
            return str(o)
        if ('datetime' in kind) and isinstance(o, datetime):
            return str(o)
        if ('time' in kind) and isinstance(o, time.struct_time):
            return datetime.fromtimestamp(time.mktime(o))
        if ('decimal' in kind) and isinstance(o, decimal.Decimal):
            return str(o)
        return JSONEncoder_olddefault(self, o)
    json.JSONEncoder.default = JSONEncoder_wrapped

# Example
if __name__ == '__main__':
    JSONEncoder_newdefault()

Based on stdOrgnlDave answer I have defined this wrapper that it can be called with optional kinds so the encoder will work only for certain kinds inside your projects. I believe the work should be done inside your code and not to use this “default” encoder since “it is better explicit than implicit”, but I understand using this will save some of your time. :-)

import time
import json
import decimal
from uuid import UUID
from datetime import datetime

def JSONEncoder_newdefault(kind=['uuid', 'datetime', 'time', 'decimal']):
    '''
    JSON Encoder newdfeault is a wrapper capable of encoding several kinds
    Use it anywhere on your code to make the full system to work with this defaults:
        JSONEncoder_newdefault()  # for everything
        JSONEncoder_newdefault(['decimal'])  # only for Decimal
    '''
    JSONEncoder_olddefault = json.JSONEncoder.default

    def JSONEncoder_wrapped(self, o):
        '''
        json.JSONEncoder.default = JSONEncoder_newdefault
        '''
        if ('uuid' in kind) and isinstance(o, uuid.UUID):
            return str(o)
        if ('datetime' in kind) and isinstance(o, datetime):
            return str(o)
        if ('time' in kind) and isinstance(o, time.struct_time):
            return datetime.fromtimestamp(time.mktime(o))
        if ('decimal' in kind) and isinstance(o, decimal.Decimal):
            return str(o)
        return JSONEncoder_olddefault(self, o)
    json.JSONEncoder.default = JSONEncoder_wrapped

# Example
if __name__ == '__main__':
    JSONEncoder_newdefault()

回答 14

如果要将包含小数的字典传递给requests库(使用json关键字参数),则只需安装simplejson

$ pip3 install simplejson    
$ python3
>>> import requests
>>> from decimal import Decimal
>>> # This won't error out:
>>> requests.post('https://www.google.com', json={'foo': Decimal('1.23')})

问题的原因是仅在存在时才requests使用simplejson,而在json未安装时退回到内置。

If you want to pass a dictionary containing decimals to the requests library (using the json keyword argument), you simply need to install simplejson:

$ pip3 install simplejson    
$ python3
>>> import requests
>>> from decimal import Decimal
>>> # This won't error out:
>>> requests.post('https://www.google.com', json={'foo': Decimal('1.23')})

The reason of the problem is that requests uses simplejson only if it is present, and falls back to the built-in json if it is not installed.


回答 15

这可以通过添加

    elif isinstance(o, decimal.Decimal):
        yield str(o)

\Lib\json\encoder.py:JSONEncoder._iterencode,但我希望有一个更好的解决方案

this can be done by adding

    elif isinstance(o, decimal.Decimal):
        yield str(o)

in \Lib\json\encoder.py:JSONEncoder._iterencode, but I was hoping for a better solution