How can I lazily read multiple JSON values from a file/stream in Python?

Question

I’d like to read multiple JSON objects from a file/stream in Python, one at a time. Unfortunately json.load() just .read()s until end-of-file; there doesn’t seem to be any way to use it to read a single object or to lazily iterate over the objects.

Is there any way to do this? Using the standard library would be ideal, but if there’s a third-party library I’d use that instead.

At the moment I’m putting each object on a separate line and using json.loads(f.readline()), but I would really prefer not to need to do this.

Example Use

example.py

import my_json as json
import sys

for o in json.iterload(sys.stdin):
    print("Working on a", type(o))

in.txt

{"foo": ["bar", "baz"]} 1 2 [] 4 5 6

Example session

$ python3.2 example.py < in.txt
Working on a dict
Working on a int
Working on a int
Working on a list
Working on a int
Working on a int
Working on a int

Answer 0

Here’s a much, much simpler solution. The secret is to try parsing the rest of the file, let it fail on the extra data, and use the position stored in the exception to parse just the first value. The only limitation is that the file must be seekable.

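def stream_read_json(fn):
    import json
    start_pos = 0
    with open(fn, 'r') as f:
        while True:
            try:
                # Succeeds only when a single value is left in the file.
                obj = json.load(f)
                yield obj
                return
            except json.JSONDecodeError as e:
                # e.pos is where the extra data starts: re-read only the
                # first value, then carry on from the position after it.
                f.seek(start_pos)
                json_str = f.read(e.pos)
                obj = json.loads(json_str)
                # tell() is a valid seek target even for non-ASCII text,
                # where character counts and file offsets differ.
                start_pos = f.tell()
                yield obj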

Edit: I just noticed that this only works on Python >= 3.5. On earlier versions a failure raises a plain ValueError, and you have to parse the position out of the message string, e.g.

def stream_read_json(fn):
    import json
    import re
    start_pos = 0
    with open(fn, 'r') as f:
        while True:
            try:
                obj = json.load(f)
                yield obj
                return
            except ValueError as e:
                f.seek(start_pos)
                end_pos = int(re.match(r'Extra data: line \d+ column \d+ .*\(char (\d+).*\)',
                                    e.args[0]).groups()[0])
                json_str = f.read(end_pos)
                obj = json.loads(json_str)
                start_pos += end_pos
                yield obj
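
As a usage sketch, the Python 3.5+ variant can be adapted to take an already-open, seekable text stream instead of a file name. The function name `iter_json_values` and the check on the `"Extra data"` message are assumptions of this sketch rather than part of the answer above; the check re-raises genuine syntax errors instead of treating them as a boundary between values. Note that every value re-reads the remainder of the stream, so this is likely only practical for small inputs.

```python
import io
import json


def iter_json_values(f):
    """Yield JSON values one at a time from a seekable text stream."""
    start_pos = f.tell()
    while True:
        try:
            # Succeeds only when exactly one value is left in the stream.
            obj = json.load(f)
        except json.JSONDecodeError as e:
            if e.msg != "Extra data":
                raise  # malformed JSON, not just a second value
            # Re-read only the first value, which ends at character e.pos.
            f.seek(start_pos)
            obj = json.loads(f.read(e.pos))
            start_pos = f.tell()
            yield obj
        else:
            yield obj
            return


stream = io.StringIO('{"foo": ["bar", "baz"]} 1 2 [] 4 5 6')
for value in iter_json_values(stream):
    print("Working on a", type(value).__name__)
```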

Answer 1

JSON generally isn’t very good for this sort of incremental use; there’s no standard way to serialise multiple objects so that they can easily be loaded one at a time, without parsing the whole lot.

The object-per-line solution that you’re using is seen elsewhere too. Scrapy calls it ‘JSON lines’.

You can do it slightly more Pythonically:

for jsonline in f:
    yield json.loads(jsonline)   # or do the processing in this loop

I think this is about the best way – it doesn’t rely on any third party libraries, and it’s easy to understand what’s going on. I’ve used it in some of my own code as well.
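
A slightly fuller sketch of the same idea, wrapped in a generator so it can be run as-is. The name `iter_json_lines`, the skipping of blank lines, and the line number added to the error message are choices made for this example, not part of any JSON lines specification quoted here.

```python
import io
import json


def iter_json_lines(lines):
    """Yield one decoded JSON value per non-empty line."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Point at the offending line to make bad input easier to find.
            raise ValueError("line %d: %s" % (lineno, exc)) from exc


sample = io.StringIO('{"foo": ["bar", "baz"]}\n1\n\n[]\n')
for obj in iter_json_lines(sample):
    print("Working on a", type(obj).__name__)
```

Any iterable of text lines works as input, so an open file or `sys.stdin` can be passed directly.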


Answer 2

A little late maybe, but I had this exact problem (well, more or less). My standard solution for these problems is usually to just do a regex split on some well-known root object, but in my case it was impossible. The only feasible way to do this generically is to implement a proper tokenizer.

After not finding a generic-enough and reasonably well-performing solution, I ended up doing this myself, writing the splitstream module. It is a pre-tokenizer that understands JSON and XML and splits a continuous stream into multiple chunks for parsing (it leaves the actual parsing up to you, though). To get some kind of performance out of it, it is written as a C module.

Example:

import json
import sys

from splitstream import splitfile

def iter_json(stream=sys.stdin):
    for jsonstr in splitfile(stream, format="json"):
        yield json.loads(jsonstr)

Answer 3

Sure you can do this. You just have to use raw_decode directly. This implementation loads the whole file into memory and operates on that string (much as json.load does); if you have large files, you can modify it to read from the file only as necessary without much difficulty.

import json
from json.decoder import WHITESPACE

def iterload(string_or_fp, cls=json.JSONDecoder, **kwargs):
    # `file` exists only in Python 2; checking for read() also works on Python 3.
    if hasattr(string_or_fp, 'read'):
        string = string_or_fp.read()
    else:
        string = str(string_or_fp)

    decoder = cls(**kwargs)
    # Skip leading whitespace, then decode one value at a time.
    idx = WHITESPACE.match(string, 0).end()
    while idx < len(string):
        obj, end = decoder.raw_decode(string, idx)
        yield obj
        idx = WHITESPACE.match(string, end).end()

Usage: just as you requested, it’s a generator.
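
The modification mentioned above, reading from the file only as necessary, might look roughly like the following. This is a minimal sketch under stated assumptions, not the answer's own code: the name `iterload_chunked`, the `chunk_size` parameter and the buffering strategy are all choices made for this example. It keeps calling `raw_decode` on a growing buffer and only accepts a value once it cannot be a prefix of a longer number. Malformed input is only reported after the rest of the stream has been read.

```python
import io
import json

JSON_WHITESPACE = " \t\n\r"


def iterload_chunked(fp, chunk_size=4096):
    """Lazily yield JSON values from a text stream, chunk_size characters at a time."""
    decoder = json.JSONDecoder()
    buf = ""
    eof = False
    while True:
        buf = buf.lstrip(JSON_WHITESPACE)
        if buf:
            try:
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                if eof:
                    raise  # truncated or malformed input
            else:
                # A value that touches the end of the buffer, or is followed
                # by '.', 'e' or 'E', may be a number cut off mid-way
                # (e.g. "12" of "12.5"), so wait for more data.
                cut_off = end == len(buf) or buf[end] in ".eE"
                if eof or not cut_off:
                    yield obj
                    buf = buf[end:]
                    continue
        elif eof:
            return
        chunk = fp.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True


stream = io.StringIO('{"foo": ["bar", "baz"]} 1 2 [] 4 5 6')
for o in iterload_chunked(stream, chunk_size=8):
    print("Working on a", type(o).__name__)
```

The trade-off is that a value larger than one chunk is re-parsed each time more data arrives, so `chunk_size` should comfortably exceed the typical value size.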


Answer 4

This is a pretty nasty problem actually, because you have to stream in lines, but pattern match across multiple lines against braces, and also pattern match JSON. It’s a sort of JSON pre-parse followed by a JSON parse. JSON is, in comparison to other formats, easy to parse, so it’s not always necessary to go for a parsing library. Nevertheless, how should we solve these conflicting issues?

Generators to the rescue!

The beauty of generators for a problem like this is you can stack them on top of each other gradually abstracting away the difficulty of the problem whilst maintaining laziness. I also considered using the mechanism for passing back values into a generator (send()) but fortunately found I didn’t need to use that.

To solve the first of the problems you need some sort of streamingfinditer, as a streaming version of re.finditer. My attempt at this below pulls in lines as needed (uncomment the debug statement to see) whilst still returning matches. I actually then modified it slightly to yield non-matched lines as well as matches (marked as 0 or 1 in the first part of the yielded tuple).

import re

def streamingfinditer(pat,stream):
  for s in stream:
#    print "Read next line: " + s
    while 1:
      m = re.search(pat,s)
      if not m:
        yield (0,s)
        break
      yield (1,m.group())
      s = re.split(pat,s,1)[1]

With that, it’s then possible to match up until braces, account each time for whether the braces are balanced, and then return either simple or compound objects as appropriate.

braces='{}[]'
whitespaceesc=' \t'
bracesesc='\\'+'\\'.join(braces)
balancemap=dict(zip(braces,[1,-1,1,-1]))
bracespat='['+bracesesc+']'
nobracespat='[^'+bracesesc+']*'
untilbracespat=nobracespat+bracespat

def simpleorcompoundobjects(stream):
  obj = ""
  unbalanced = 0
  for (c,m) in streamingfinditer(re.compile(untilbracespat),stream):
    if (c == 0): # remainder of line returned, nothing interesting
      if (unbalanced == 0):
        yield (0,m)
      else:
        obj += m
    if (c == 1): # match returned
      if (unbalanced == 0):
        yield (0,m[:-1])
        obj += m[-1]
      else:
        obj += m
      unbalanced += balancemap[m[-1]]
      if (unbalanced == 0):
        yield (1,obj)
        obj="" 

This returns tuples as follows:

(0,"String of simple non-braced objects easy to parse")
(1,"{ 'Compound' : 'objects' }")

Basically that’s the nasty part done. We now just have to do the final level of parsing as we see fit. For example we can use Jeremy Roman’s iterload function (Thanks!) to do parsing for a single line:

def streamingiterload(stream):
  for c,o in simpleorcompoundobjects(stream):
    for x in iterload(o):
      yield x 

Test it:

of = open("test.json","w") 
of.write("""[ "hello" ] { "goodbye" : 1 } 1 2 {
} 2
9 78
 4 5 { "animals" : [ "dog" , "lots of mice" ,
 "cat" ] }
""")
of.close()
# open & stream the json
f = open("test.json","r")
for o in streamingiterload(f.readlines()):
  print(o)
f.close()

I get these results (and if you turn on that debug line, you’ll see it pulls in the lines as needed):

[u'hello']
{u'goodbye': 1}
1
2
{}
2
9
78
4
5
{u'animals': [u'dog', u'lots of mice', u'cat']}

This won’t work for all situations. Due to the implementation of the json library, it is impossible to work entirely correctly without reimplementing the parser yourself.
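
One situation where counting brackets with a regular expression goes wrong is a bracket inside a string literal. The sketch below illustrates the difference; `naive_depth` and `string_aware_depth` are hypothetical helpers written for this example and are not part of the code above.

```python
BALANCE = {"{": 1, "}": -1, "[": 1, "]": -1}


def naive_depth(text):
    """Net bracket depth when every bracket character is counted."""
    return sum(BALANCE.get(ch, 0) for ch in text)


def string_aware_depth(text):
    """Net bracket depth, ignoring brackets inside JSON string literals."""
    depth = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        else:
            depth += BALANCE.get(ch, 0)
    return depth


doc = '{"smiley": ":-}"}'
print(naive_depth(doc))         # the "}" inside the string is counted too
print(string_aware_depth(doc))  # balanced once string contents are skipped
```

A splitter that relies on the naive count would consider the object closed at the brace inside the string, which is why a tokenizer that understands string literals is needed for full correctness.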


Answer 5

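I believe a better way of doing this is to use a state machine. Below is sample code that I worked out by converting the NodeJS code at the link below to Python 3 (it uses the nonlocal keyword, which is only available in Python 3, so that version won’t work on Python 2).

Edit 1: Updated the code and made it compatible with Python 2

Edit 2: Updated and added a Python 3 only version as well

https://gist.github.com/creationix/5992451

Python 3 only version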

# A streaming byte oriented JSON parser.  Feed it a single byte at a time and
# it will emit complete objects as it comes across them.  Whitespace within and
# between objects is ignored.  This means it can parse newline delimited JSON.
import math


def json_machine(emit, next_func=None):
    def _value(byte_data):
        if not byte_data:
            return

        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _value  # Ignore whitespace

        if byte_data == 0x22:  # "
            return string_machine(on_value)

        if byte_data == 0x2d or (0x30 <= byte_data <= 0x39):  # - or 0-9
            return number_machine(byte_data, on_number)

        if byte_data == 0x7b:  # {
            return object_machine(on_value)

        if byte_data == 0x5b:  # [
            return array_machine(on_value)

        if byte_data == 0x74:  # t
            return constant_machine(TRUE, True, on_value)

        if byte_data == 0x66:  # f
            return constant_machine(FALSE, False, on_value)

        if byte_data == 0x6e:  # n
            return constant_machine(NULL, None, on_value)

        if next_func == _value:
            raise Exception("Unexpected 0x" + str(byte_data))

        return next_func(byte_data)

    def on_value(value):
        emit(value)
        return next_func

    def on_number(number, byte):
        emit(number)
        return _value(byte)

    next_func = next_func or _value
    return _value


TRUE = [0x72, 0x75, 0x65]
FALSE = [0x61, 0x6c, 0x73, 0x65]
NULL = [0x75, 0x6c, 0x6c]


def constant_machine(bytes_data, value, emit):
    i = 0
    length = len(bytes_data)

    def _constant(byte_data):
        nonlocal i
        if byte_data != bytes_data[i]:
            i += 1
            raise Exception("Unexpected 0x" + str(byte_data))

        i += 1
        if i < length:
            return _constant
        return emit(value)

    return _constant


def string_machine(emit):
    string = ""

    def _string(byte_data):
        nonlocal string

        if byte_data == 0x22:  # "
            return emit(string)

        if byte_data == 0x5c:  # \
            return _escaped_string

        if byte_data & 0x80:  # UTF-8 handling
            return utf8_machine(byte_data, on_char_code)

        if byte_data < 0x20:  # ASCII control character
            raise Exception("Unexpected control character: 0x" + str(byte_data))

        string += chr(byte_data)
        return _string

    def _escaped_string(byte_data):
        nonlocal string

        if byte_data == 0x22 or byte_data == 0x5c or byte_data == 0x2f:  # " \ /
            string += chr(byte_data)
            return _string

        if byte_data == 0x62:  # b
            string += "\b"
            return _string

        if byte_data == 0x66:  # f
            string += "\f"
            return _string

        if byte_data == 0x6e:  # n
            string += "\n"
            return _string

        if byte_data == 0x72:  # r
            string += "\r"
            return _string

        if byte_data == 0x74:  # t
            string += "\t"
            return _string

        if byte_data == 0x75:  # u
            return hex_machine(on_char_code)

    def on_char_code(char_code):
        nonlocal string
        string += chr(char_code)
        return _string

    return _string


# Nestable state machine for UTF-8 Decoding.
def utf8_machine(byte_data, emit):
    left = 0
    num = 0

    def _utf8(byte_data):
        nonlocal num, left
        if (byte_data & 0xc0) != 0x80:
            raise Exception("Invalid byte in UTF-8 character: 0x" + format(byte_data, "02x"))

        left = left - 1

        num |= (byte_data & 0x3f) << (left * 6)
        if left:
            return _utf8
        return emit(num)

    if 0xc0 <= byte_data < 0xe0:  # 2-byte UTF-8 Character
        left = 1
        num = (byte_data & 0x1f) << 6
        return _utf8

    if 0xe0 <= byte_data < 0xf0:  # 3-byte UTF-8 Character
        left = 2
        num = (byte_data & 0xf) << 12
        return _utf8

    if 0xf0 <= byte_data < 0xf8:  # 4-byte UTF-8 Character
        left = 3
        num = (byte_data & 0x07) << 18
        return _utf8

    raise Exception("Invalid byte in UTF-8 string: 0x" + str(byte_data))


# Nestable state machine for hex escaped characters
def hex_machine(emit):
    left = 4
    num = 0

    def _hex(byte_data):
        nonlocal num, left

        if 0x30 <= byte_data <= 0x39:  # 0-9
            i = byte_data - 0x30
        elif 0x61 <= byte_data <= 0x66:
            i = byte_data - 0x57
        elif 0x41 <= byte_data <= 0x46:
            i = byte_data - 0x37
        else:
            raise Exception("Expected hex char in string hex escape")

        left -= 1
        num |= i << (left * 4)

        if left:
            return _hex
        return emit(num)

    return _hex


def number_machine(byte_data, emit):
    sign = 1
    number = 0
    decimal = 0
    esign = 1
    exponent = 0

    def _mid(byte_data):
        if byte_data == 0x2e:  # .
            return _decimal

        return _later(byte_data)

    def _number(byte_data):
        nonlocal number
        if 0x30 <= byte_data <= 0x39:  # 0-9
            number = number * 10 + (byte_data - 0x30)
            return _number

        return _mid(byte_data)

    def _start(byte_data):
        if byte_data == 0x30:
            return _mid

        if 0x30 < byte_data <= 0x39:  # 1-9
            return _number(byte_data)

        raise Exception("Invalid number: 0x" + str(byte_data))

    scale = 1.0  # place value of the most recent fractional digit

    def _decimal(byte_data):
        nonlocal decimal, scale
        if 0x30 <= byte_data <= 0x39:  # 0-9
            # Each fractional digit is worth a tenth of the previous one.
            scale /= 10
            decimal += (byte_data - 0x30) * scale
            return _decimal

        return _later(byte_data)

    def _later(byte_data):
        if byte_data == 0x45 or byte_data == 0x65:  # E e
            return _esign

        return _done(byte_data)

    def _esign(byte_data):
        nonlocal esign
        if byte_data == 0x2b:  # +
            return _exponent

        if byte_data == 0x2d:  # -
            esign = -1
            return _exponent

        return _exponent(byte_data)

    def _exponent(byte_data):
        nonlocal exponent
        if 0x30 <= byte_data <= 0x39:  # 0-9
            exponent = exponent * 10 + (byte_data - 0x30)
            return _exponent

        return _done(byte_data)

    def _done(byte_data):
        value = sign * (number + decimal)
        if exponent:
            value *= math.pow(10, esign * exponent)

        return emit(value, byte_data)

    # Handle the sign only after every state function above is defined;
    # returning before the defs would leave _decimal, _later, etc. unbound.
    if byte_data == 0x2d:  # -
        sign = -1
        return _start

    return _start(byte_data)


def array_machine(emit):
    array_data = []

    def _array(byte_data):
        if byte_data == 0x5d:  # ]
            return emit(array_data)

        return json_machine(on_value, _comma)(byte_data)

    def on_value(value):
        array_data.append(value)

    def _comma(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return json_machine(on_value, _comma)

        if byte_data == 0x5d:  # ]
            return emit(array_data)

        raise Exception("Unexpected byte: 0x" + str(byte_data) + " in array body")

    return _array


def object_machine(emit):
    object_data = {}
    key = None

    def _object(byte_data):
        if byte_data == 0x7d:  # }
            return emit(object_data)

        return _key(byte_data)

    def _key(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _object  # Ignore whitespace

        if byte_data == 0x22:
            return string_machine(on_key)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    def on_key(result):
        nonlocal key
        key = result
        return _colon

    def _colon(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _colon  # Ignore whitespace

        if byte_data == 0x3a:  # :
            return json_machine(on_value, _comma)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    def on_value(value):
        object_data[key] = value

    def _comma(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return _key

        if byte_data == 0x7d:  # }
            return emit(object_data)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    return _object

Python 2 compatible version

# A streaming byte oriented JSON parser.  Feed it a single byte at a time and
# it will emit complete objects as it comes across them.  Whitespace within and
# between objects is ignored.  This means it can parse newline delimited JSON.
import math


def json_machine(emit, next_func=None):
    def _value(byte_data):
        if not byte_data:
            return

        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _value  # Ignore whitespace

        if byte_data == 0x22:  # "
            return string_machine(on_value)

        if byte_data == 0x2d or (0x30 <= byte_data <= 0x39):  # - or 0-9
            return number_machine(byte_data, on_number)

        if byte_data == 0x7b:  # {
            return object_machine(on_value)

        if byte_data == 0x5b:  # [
            return array_machine(on_value)

        if byte_data == 0x74:  # t
            return constant_machine(TRUE, True, on_value)

        if byte_data == 0x66:  # f
            return constant_machine(FALSE, False, on_value)

        if byte_data == 0x6e:  # n
            return constant_machine(NULL, None, on_value)

        if next_func == _value:
            raise Exception("Unexpected 0x" + str(byte_data))

        return next_func(byte_data)

    def on_value(value):
        emit(value)
        return next_func

    def on_number(number, byte):
        emit(number)
        return _value(byte)

    next_func = next_func or _value
    return _value


TRUE = [0x72, 0x75, 0x65]
FALSE = [0x61, 0x6c, 0x73, 0x65]
NULL = [0x75, 0x6c, 0x6c]


def constant_machine(bytes_data, value, emit):
    local_data = {"i": 0, "length": len(bytes_data)}

    def _constant(byte_data):
        # nonlocal i, length
        if byte_data != bytes_data[local_data["i"]]:
            local_data["i"] += 1
            raise Exception("Unexpected 0x" + format(byte_data, "02x"))

        local_data["i"] += 1

        if local_data["i"] < local_data["length"]:
            return _constant
        return emit(value)

    return _constant


def string_machine(emit):
    local_data = {"string": ""}

    def _string(byte_data):
        # nonlocal string

        if byte_data == 0x22:  # "
            return emit(local_data["string"])

        if byte_data == 0x5c:  # \
            return _escaped_string

        if byte_data & 0x80:  # UTF-8 handling
            return utf8_machine(byte_data, on_char_code)

        if byte_data < 0x20:  # ASCII control character
            raise Exception("Unexpected control character: 0x" + format(byte_data, "02x"))

        local_data["string"] += chr(byte_data)
        return _string

    def _escaped_string(byte_data):
        # nonlocal string

        if byte_data == 0x22 or byte_data == 0x5c or byte_data == 0x2f:  # " \ /
            local_data["string"] += chr(byte_data)
            return _string

        if byte_data == 0x62:  # b
            local_data["string"] += "\b"
            return _string

        if byte_data == 0x66:  # f
            local_data["string"] += "\f"
            return _string

        if byte_data == 0x6e:  # n
            local_data["string"] += "\n"
            return _string

        if byte_data == 0x72:  # r
            local_data["string"] += "\r"
            return _string

        if byte_data == 0x74:  # t
            local_data["string"] += "\t"
            return _string

        if byte_data == 0x75:  # u
            return hex_machine(on_char_code)

    def on_char_code(char_code):
        # nonlocal string
        local_data["string"] += chr(char_code)
        return _string

    return _string


# Nestable state machine for UTF-8 Decoding.
def utf8_machine(byte_data, emit):
    local_data = {"left": 0, "num": 0}

    def _utf8(byte_data):
        # nonlocal num, left
        if (byte_data & 0xc0) != 0x80:
            raise Exception("Invalid byte in UTF-8 character: 0x" + format(byte_data, "02x"))

        local_data["left"] -= 1

        local_data["num"] |= (byte_data & 0x3f) << (local_data["left"] * 6)
        if local_data["left"]:
            return _utf8
        return emit(local_data["num"])

    if 0xc0 <= byte_data < 0xe0:  # 2-byte UTF-8 Character
        local_data["left"] = 1
        local_data["num"] = (byte_data & 0x1f) << 6
        return _utf8

    if 0xe0 <= byte_data < 0xf0:  # 3-byte UTF-8 Character
        local_data["left"] = 2
        local_data["num"] = (byte_data & 0xf) << 12
        return _utf8

    if 0xf0 <= byte_data < 0xf8:  # 4-byte UTF-8 Character
        local_data["left"] = 3
        local_data["num"] = (byte_data & 0x07) << 18
        return _utf8

    raise Exception("Invalid byte in UTF-8 string: 0x" + str(byte_data))


# Nestable state machine for hex escaped characters
def hex_machine(emit):
    local_data = {"left": 4, "num": 0}

    def _hex(byte_data):
        # nonlocal num, left
        i = 0  # Parse the hex byte
        if 0x30 <= byte_data <= 0x39:  # 0-9
            i = byte_data - 0x30
        elif 0x61 <= byte_data <= 0x66:
            i = byte_data - 0x57
        elif 0x41 <= byte_data <= 0x46:
            i = byte_data - 0x37
        else:
            raise Exception("Expected hex char in string hex escape")

        local_data["left"] -= 1
        local_data["num"] |= i << (local_data["left"] * 4)

        if local_data["left"]:
            return _hex
        return emit(local_data["num"])

    return _hex


def number_machine(byte_data, emit):
    local_data = {"sign": 1, "number": 0, "decimal": 0, "esign": 1, "exponent": 0}

    def _mid(byte_data):
        if byte_data == 0x2e:  # .
            return _decimal

        return _later(byte_data)

    def _number(byte_data):
        # nonlocal number
        if 0x30 <= byte_data <= 0x39:  # 0-9
            local_data["number"] = local_data["number"] * 10 + (byte_data - 0x30)
            return _number

        return _mid(byte_data)

    def _start(byte_data):
        if byte_data == 0x30:
            return _mid

        if 0x30 < byte_data <= 0x39:  # 1-9
            return _number(byte_data)

        raise Exception("Invalid number: 0x" + format(byte_data, "02x"))

    local_data["scale"] = 1.0  # place value of the most recent fractional digit

    def _decimal(byte_data):
        # nonlocal decimal, scale
        if 0x30 <= byte_data <= 0x39:  # 0-9
            # Each fractional digit is worth a tenth of the previous one.
            local_data["scale"] /= 10
            local_data["decimal"] += (byte_data - 0x30) * local_data["scale"]
            return _decimal

        return _later(byte_data)

    def _later(byte_data):
        if byte_data == 0x45 or byte_data == 0x65:  # E e
            return _esign

        return _done(byte_data)

    def _esign(byte_data):
        # nonlocal esign
        if byte_data == 0x2b:  # +
            return _exponent

        if byte_data == 0x2d:  # -
            local_data["esign"] = -1
            return _exponent

        return _exponent(byte_data)

    def _exponent(byte_data):
        # nonlocal exponent
        if 0x30 <= byte_data <= 0x39:  # 0-9
            local_data["exponent"] = local_data["exponent"] * 10 + (byte_data - 0x30)
            return _exponent

        return _done(byte_data)

    def _done(byte_data):
        value = local_data["sign"] * (local_data["number"] + local_data["decimal"])
        if local_data["exponent"]:
            value *= math.pow(10, local_data["esign"] * local_data["exponent"])

        return emit(value, byte_data)

    # Handle the sign only after every state function above is defined;
    # returning before the defs would leave _decimal, _later, etc. unbound.
    if byte_data == 0x2d:  # -
        local_data["sign"] = -1
        return _start

    return _start(byte_data)


def array_machine(emit):
    local_data = {"array_data": []}

    def _array(byte_data):
        if byte_data == 0x5d:  # ]
            return emit(local_data["array_data"])

        return json_machine(on_value, _comma)(byte_data)

    def on_value(value):
        # nonlocal array_data
        local_data["array_data"].append(value)

    def _comma(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return json_machine(on_value, _comma)

        if byte_data == 0x5d:  # ]
            return emit(local_data["array_data"])

        raise Exception("Unexpected byte: 0x" + str(byte_data) + " in array body")

    return _array


def object_machine(emit):
    local_data = {"object_data": {}, "key": ""}

    def _object(byte_data):
        # nonlocal object_data, key
        if byte_data == 0x7d:  # }
            return emit(local_data["object_data"])

        return _key(byte_data)

    def _key(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _object  # Ignore whitespace

        if byte_data == 0x22:
            return string_machine(on_key)

        raise Exception("Unexpected byte: 0x" + format(byte_data, "02x"))

    def on_key(result):
        # nonlocal object_data, key
        local_data["key"] = result
        return _colon

    def _colon(byte_data):
        # nonlocal object_data, key
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _colon  # Ignore whitespace

        if byte_data == 0x3a:  # :
            return json_machine(on_value, _comma)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    def on_value(value):
        # nonlocal object_data, key
        local_data["object_data"][local_data["key"]] = value

    def _comma(byte_data):
        # nonlocal object_data
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return _key

        if byte_data == 0x7d:  # }
            return emit(local_data["object_data"])

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    return _object

Test it

if __name__ == "__main__":
    test_json = """[1,2,"3"] {"name": 
    "tarun"} 1 2 
    3 [{"name":"a", 
    "data": [1,
    null,2]}]
"""
    def found_json(data):
        print(data)

    state = json_machine(found_json)

    for char in test_json:
        state = state(ord(char))

The output is:

[1, 2, '3']
{'name': 'tarun'}
1
2
3
[{'name': 'a', 'data': [1, None, 2]}]
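
The test above feeds characters from a string. To drive a machine of this shape from a file opened in binary mode, a small driver along the following lines could be used. This is only a sketch: `feed` and the toy `digit_run_machine` are hypothetical names introduced here so the example is self-contained, and `json_machine(found_json)` could be passed as `state` in place of the toy machine. Because a number is only emitted once the byte after it arrives, the driver sends one trailing space to flush a number at the very end of the input.

```python
import io


def digit_run_machine(emit):
    """Toy machine with the same shape as json_machine: emits each run of ASCII digits as an int."""
    def _idle(byte):
        if 0x30 <= byte <= 0x39:  # 0-9 starts a new run
            return _digits(byte - 0x30)
        return _idle

    def _digits(total):
        def _state(byte):
            if 0x30 <= byte <= 0x39:
                return _digits(total * 10 + byte - 0x30)
            emit(total)          # any other byte ends the run
            return _idle(byte)
        return _state

    return _idle


def feed(stream, state, chunk_size=4096):
    """Feed a binary stream to a state machine one byte at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for byte in chunk:       # iterating bytes yields ints on Python 3
            state = state(byte)
    return state(0x20)           # trailing space flushes a pending number


found = []
feed(io.BytesIO(b"12 7\n345"), digit_run_machine(found.append), chunk_size=2)
print(found)
```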

I believe a better way of doing this is to use a state machine. Below is sample code that I worked out by converting the NodeJS code at the link below to Python 3 (it uses the nonlocal keyword, which is only available in Python 3, so that version won’t work on Python 2).

Edit-1: Updated and made code compatible with Python 2

Edit-2: Updated and added a Python3 only version as well

https://gist.github.com/creationix/5992451
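
As a rough illustration of the pattern the code below relies on (every state is a function that takes one byte and returns the next state), here is a toy machine that only understands whitespace-separated non-negative integers. The names int_machine and parse_ints are made up for this sketch; they are not part of the gist or of the code in this answer.

```python
def int_machine(emit):
    """Return the start state of a toy machine that emits integers."""

    def _between(byte):
        # Between numbers: skip whitespace, start a number on a digit.
        if byte in (0x09, 0x0A, 0x0D, 0x20):
            return _between
        if 0x30 <= byte <= 0x39:
            return _digits(byte - 0x30)
        raise ValueError("Unexpected byte: 0x%02x" % byte)

    def _digits(value):
        # Inside a number: `value` holds the digits accumulated so far.
        def _state(byte):
            if 0x30 <= byte <= 0x39:
                return _digits(value * 10 + (byte - 0x30))
            emit(value)            # the number has ended, hand it to the callback
            return _between(byte)  # re-dispatch the byte that ended the number
        return _state

    return _between


def parse_ints(data):
    found = []
    state = int_machine(found.append)
    for byte in bytearray(data):  # bytearray yields ints on Python 2 and 3
        state = state(byte)
    state(0x20)  # feed one trailing space so a number at end of input is flushed
    return found
```

The re-dispatch in _state is the same trick the JSON number state uses further down: a number only knows it has ended once it sees the byte after it, so that byte has to be handed on to the next state instead of being dropped.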

Python 3 only version

# A streaming byte oriented JSON parser.  Feed it a single byte at a time and
# it will emit complete objects as it comes across them.  Whitespace within and
# between objects is ignored.  This means it can parse newline delimited JSON.
import math


def json_machine(emit, next_func=None):
    def _value(byte_data):
        if not byte_data:
            return

        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _value  # Ignore whitespace

        if byte_data == 0x22:  # "
            return string_machine(on_value)

        if byte_data == 0x2d or (0x30 <= byte_data <= 0x39):  # - or 0-9
            return number_machine(byte_data, on_number)

        if byte_data == 0x7b:  # {
            return object_machine(on_value)

        if byte_data == 0x5b:  # [
            return array_machine(on_value)

        if byte_data == 0x74:  # t
            return constant_machine(TRUE, True, on_value)

        if byte_data == 0x66:  # f
            return constant_machine(FALSE, False, on_value)

        if byte_data == 0x6e:  # n
            return constant_machine(NULL, None, on_value)

        if next_func == _value:
            raise Exception("Unexpected 0x" + str(byte_data))

        return next_func(byte_data)

    def on_value(value):
        emit(value)
        return next_func

    def on_number(number, byte):
        emit(number)
        return _value(byte)

    next_func = next_func or _value
    return _value


TRUE = [0x72, 0x75, 0x65]
FALSE = [0x61, 0x6c, 0x73, 0x65]
NULL = [0x75, 0x6c, 0x6c]


def constant_machine(bytes_data, value, emit):
    i = 0
    length = len(bytes_data)

    def _constant(byte_data):
        nonlocal i
        if byte_data != bytes_data[i]:
            i += 1
            raise Exception("Unexpected 0x" + str(byte_data))

        i += 1
        if i < length:
            return _constant
        return emit(value)

    return _constant


def string_machine(emit):
    string = ""

    def _string(byte_data):
        nonlocal string

        if byte_data == 0x22:  # "
            return emit(string)

        if byte_data == 0x5c:  # \
            return _escaped_string

        if byte_data & 0x80:  # UTF-8 handling
            return utf8_machine(byte_data, on_char_code)

        if byte_data < 0x20:  # ASCII control character
            raise Exception("Unexpected control character: 0x" + str(byte_data))

        string += chr(byte_data)
        return _string

    def _escaped_string(byte_data):
        nonlocal string

        if byte_data == 0x22 or byte_data == 0x5c or byte_data == 0x2f:  # " \ /
            string += chr(byte_data)
            return _string

        if byte_data == 0x62:  # b
            string += "\b"
            return _string

        if byte_data == 0x66:  # f
            string += "\f"
            return _string

        if byte_data == 0x6e:  # n
            string += "\n"
            return _string

        if byte_data == 0x72:  # r
            string += "\r"
            return _string

        if byte_data == 0x74:  # t
            string += "\t"
            return _string

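        if byte_data == 0x75:  # u
            return hex_machine(on_char_code)

        # Any other escape is invalid; without this the state would become None.
        raise Exception("Invalid escape character: 0x" + format(byte_data, "x"))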

    def on_char_code(char_code):
        nonlocal string
        string += chr(char_code)
        return _string

    return _string


# Nestable state machine for UTF-8 Decoding.
def utf8_machine(byte_data, emit):
    left = 0
    num = 0

    def _utf8(byte_data):
        nonlocal num, left
        if (byte_data & 0xc0) != 0x80:
            raise Exception("Invalid byte in UTF-8 character: 0x" + format(byte_data, "x"))

        left = left - 1

        num |= (byte_data & 0x3f) << (left * 6)
        if left:
            return _utf8
        return emit(num)

    if 0xc0 <= byte_data < 0xe0:  # 2-byte UTF-8 Character
        left = 1
        num = (byte_data & 0x1f) << 6
        return _utf8

    if 0xe0 <= byte_data < 0xf0:  # 3-byte UTF-8 Character
        left = 2
        num = (byte_data & 0xf) << 12
        return _utf8

    if 0xf0 <= byte_data < 0xf8:  # 4-byte UTF-8 Character
        left = 3
        num = (byte_data & 0x07) << 18
        return _utf8

    raise Exception("Invalid byte in UTF-8 string: 0x" + str(byte_data))


# Nestable state machine for hex escaped characters
def hex_machine(emit):
    left = 4
    num = 0

    def _hex(byte_data):
        nonlocal num, left

        if 0x30 <= byte_data <= 0x39:  # 0-9
            i = byte_data - 0x30
        elif 0x61 <= byte_data <= 0x66:
            i = byte_data - 0x57
        elif 0x41 <= byte_data <= 0x46:
            i = byte_data - 0x37
        else:
            raise Exception("Expected hex char in string hex escape")

        left -= 1
        num |= i << (left * 4)

        if left:
            return _hex
        return emit(num)

    return _hex


def number_machine(byte_data, emit):
    sign = 1
    number = 0
    decimal = 0         # fractional digits, accumulated as an integer
    decimal_digits = 0  # number of fractional digits read so far
    esign = 1
    exponent = 0

    def _mid(byte_data):
        if byte_data == 0x2e:  # .
            return _decimal

        return _later(byte_data)

    def _number(byte_data):
        nonlocal number
        if 0x30 <= byte_data <= 0x39:  # 0-9
            number = number * 10 + (byte_data - 0x30)
            return _number

        return _mid(byte_data)

    def _start(byte_data):
        if byte_data == 0x30:
            return _mid

        if 0x30 < byte_data <= 0x39:  # 1-9
            return _number(byte_data)

        raise Exception("Invalid number: 0x" + format(byte_data, "x"))

    def _decimal(byte_data):
        nonlocal decimal, decimal_digits
        if 0x30 <= byte_data <= 0x39:  # 0-9
            decimal = decimal * 10 + (byte_data - 0x30)
            decimal_digits += 1
            return _decimal

        return _later(byte_data)

    def _later(byte_data):
        if byte_data == 0x45 or byte_data == 0x65:  # E e
            return _esign

        return _done(byte_data)

    def _esign(byte_data):
        nonlocal esign
        if byte_data == 0x2b:  # +
            return _exponent

        if byte_data == 0x2d:  # -
            esign = -1
            return _exponent

        return _exponent(byte_data)

    def _exponent(byte_data):
        nonlocal exponent
        if 0x30 <= byte_data <= 0x39:  # 0-9
            exponent = exponent * 10 + (byte_data - 0x30)
            return _exponent

        return _done(byte_data)

    def _done(byte_data):
        value = number
        if decimal_digits:
            # Scale the fractional digits once, so "1.25" gives 1.25 and not 1.52
            value += decimal / 10 ** decimal_digits
        value *= sign
        if exponent:
            value *= math.pow(10, esign * exponent)

        return emit(value, byte_data)

    # Handle the sign only after every state function has been defined;
    # returning earlier would leave the closures above referring to names
    # that were never bound.
    if byte_data == 0x2d:  # -
        sign = -1
        return _start

    return _start(byte_data)


def array_machine(emit):
    array_data = []

    def _array(byte_data):
        if byte_data == 0x5d:  # ]
            return emit(array_data)

        return json_machine(on_value, _comma)(byte_data)

    def on_value(value):
        array_data.append(value)

    def _comma(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return json_machine(on_value, _comma)

        if byte_data == 0x5d:  # ]
            return emit(array_data)

        raise Exception("Unexpected byte: 0x" + str(byte_data) + " in array body")

    return _array


def object_machine(emit):
    object_data = {}
    key = None

    def _object(byte_data):
        if byte_data == 0x7d:  # }
            return emit(object_data)

        return _key(byte_data)

    def _key(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _object  # Ignore whitespace

        if byte_data == 0x22:
            return string_machine(on_key)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    def on_key(result):
        nonlocal key
        key = result
        return _colon

    def _colon(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _colon  # Ignore whitespace

        if byte_data == 0x3a:  # :
            return json_machine(on_value, _comma)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    def on_value(value):
        object_data[key] = value

    def _comma(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return _key

        if byte_data == 0x7d:  # }
            return emit(object_data)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    return _object

Python 2 compatible version

# A streaming byte oriented JSON parser.  Feed it a single byte at a time and
# it will emit complete objects as it comes across them.  Whitespace within and
# between objects is ignored.  This means it can parse newline delimited JSON.
import math


def json_machine(emit, next_func=None):
    def _value(byte_data):
        if not byte_data:
            return

        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _value  # Ignore whitespace

        if byte_data == 0x22:  # "
            return string_machine(on_value)

        if byte_data == 0x2d or (0x30 <= byte_data <= 0x39):  # - or 0-9
            return number_machine(byte_data, on_number)

        if byte_data == 0x7b:  # {
            return object_machine(on_value)

        if byte_data == 0x5b:  # [
            return array_machine(on_value)

        if byte_data == 0x74:  # t
            return constant_machine(TRUE, True, on_value)

        if byte_data == 0x66:  # f
            return constant_machine(FALSE, False, on_value)

        if byte_data == 0x6e:  # n
            return constant_machine(NULL, None, on_value)

        if next_func == _value:
            raise Exception("Unexpected 0x" + str(byte_data))

        return next_func(byte_data)

    def on_value(value):
        emit(value)
        return next_func

    def on_number(number, byte):
        emit(number)
        return _value(byte)

    next_func = next_func or _value
    return _value


TRUE = [0x72, 0x75, 0x65]
FALSE = [0x61, 0x6c, 0x73, 0x65]
NULL = [0x75, 0x6c, 0x6c]


def constant_machine(bytes_data, value, emit):
    local_data = {"i": 0, "length": len(bytes_data)}

    def _constant(byte_data):
        # nonlocal i, length
        if byte_data != bytes_data[local_data["i"]]:
            local_data["i"] += 1
            raise Exception("Unexpected 0x" + format(byte_data, "x"))

        local_data["i"] += 1

        if local_data["i"] < local_data["length"]:
            return _constant
        return emit(value)

    return _constant


def string_machine(emit):
    local_data = {"string": ""}

    def _string(byte_data):
        # nonlocal string

        if byte_data == 0x22:  # "
            return emit(local_data["string"])

        if byte_data == 0x5c:  # \
            return _escaped_string

        if byte_data & 0x80:  # UTF-8 handling
            return utf8_machine(byte_data, on_char_code)

        if byte_data < 0x20:  # ASCII control character
            raise Exception("Unexpected control character: 0x" + format(byte_data, "x"))

        local_data["string"] += chr(byte_data)
        return _string

    def _escaped_string(byte_data):
        # nonlocal string

        if byte_data == 0x22 or byte_data == 0x5c or byte_data == 0x2f:  # " \ /
            local_data["string"] += chr(byte_data)
            return _string

        if byte_data == 0x62:  # b
            local_data["string"] += "\b"
            return _string

        if byte_data == 0x66:  # f
            local_data["string"] += "\f"
            return _string

        if byte_data == 0x6e:  # n
            local_data["string"] += "\n"
            return _string

        if byte_data == 0x72:  # r
            local_data["string"] += "\r"
            return _string

        if byte_data == 0x74:  # t
            local_data["string"] += "\t"
            return _string

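        if byte_data == 0x75:  # u
            return hex_machine(on_char_code)

        # Any other escape is invalid; without this the state would become None.
        raise Exception("Invalid escape character: 0x" + format(byte_data, "x"))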

    def on_char_code(char_code):
        # nonlocal string
        local_data["string"] += chr(char_code)
        return _string

    return _string


# Nestable state machine for UTF-8 Decoding.
def utf8_machine(byte_data, emit):
    local_data = {"left": 0, "num": 0}

    def _utf8(byte_data):
        # nonlocal num, left
        if (byte_data & 0xc0) != 0x80:
            raise Exception("Invalid byte in UTF-8 character: 0x" + format(byte_data, "x"))

        local_data["left"] -= 1

        local_data["num"] |= (byte_data & 0x3f) << (local_data["left"] * 6)
        if local_data["left"]:
            return _utf8
        return emit(local_data["num"])

    if 0xc0 <= byte_data < 0xe0:  # 2-byte UTF-8 Character
        local_data["left"] = 1
        local_data["num"] = (byte_data & 0x1f) << 6
        return _utf8

    if 0xe0 <= byte_data < 0xf0:  # 3-byte UTF-8 Character
        local_data["left"] = 2
        local_data["num"] = (byte_data & 0xf) << 12
        return _utf8

    if 0xf0 <= byte_data < 0xf8:  # 4-byte UTF-8 Character
        local_data["left"] = 3
        local_data["num"] = (byte_data & 0x07) << 18
        return _utf8

    raise Exception("Invalid byte in UTF-8 string: 0x" + str(byte_data))


# Nestable state machine for hex escaped characters
def hex_machine(emit):
    local_data = {"left": 4, "num": 0}

    def _hex(byte_data):
        # nonlocal num, left
        i = 0  # Parse the hex byte
        if 0x30 <= byte_data <= 0x39:  # 0-9
            i = byte_data - 0x30
        elif 0x61 <= byte_data <= 0x66:
            i = byte_data - 0x57
        elif 0x41 <= byte_data <= 0x46:
            i = byte_data - 0x37
        else:
            raise Exception("Expected hex char in string hex escape")

        local_data["left"] -= 1
        local_data["num"] |= i << (local_data["left"] * 4)

        if local_data["left"]:
            return _hex
        return emit(local_data["num"])

    return _hex


def number_machine(byte_data, emit):
    local_data = {
        "sign": 1,
        "number": 0,
        "decimal": 0,         # fractional digits, accumulated as an integer
        "decimal_digits": 0,  # number of fractional digits read so far
        "esign": 1,
        "exponent": 0,
    }

    def _mid(byte_data):
        if byte_data == 0x2e:  # .
            return _decimal

        return _later(byte_data)

    def _number(byte_data):
        # nonlocal number
        if 0x30 <= byte_data <= 0x39:  # 0-9
            local_data["number"] = local_data["number"] * 10 + (byte_data - 0x30)
            return _number

        return _mid(byte_data)

    def _start(byte_data):
        if byte_data == 0x30:
            return _mid

        if 0x30 < byte_data <= 0x39:  # 1-9
            return _number(byte_data)

        raise Exception("Invalid number: 0x" + format(byte_data, "x"))

    def _decimal(byte_data):
        # nonlocal decimal, decimal_digits
        if 0x30 <= byte_data <= 0x39:  # 0-9
            local_data["decimal"] = local_data["decimal"] * 10 + (byte_data - 0x30)
            local_data["decimal_digits"] += 1
            return _decimal

        return _later(byte_data)

    def _later(byte_data):
        if byte_data == 0x45 or byte_data == 0x65:  # E e
            return _esign

        return _done(byte_data)

    def _esign(byte_data):
        # nonlocal esign
        if byte_data == 0x2b:  # +
            return _exponent

        if byte_data == 0x2d:  # -
            local_data["esign"] = -1
            return _exponent

        return _exponent(byte_data)

    def _exponent(byte_data):
        # nonlocal exponent
        if 0x30 <= byte_data <= 0x39:  # 0-9
            local_data["exponent"] = local_data["exponent"] * 10 + (byte_data - 0x30)
            return _exponent

        return _done(byte_data)

    def _done(byte_data):
        value = local_data["number"]
        if local_data["decimal_digits"]:
            # float() keeps this a true division on Python 2 as well
            value += float(local_data["decimal"]) / 10 ** local_data["decimal_digits"]
        value *= local_data["sign"]
        if local_data["exponent"]:
            value *= math.pow(10, local_data["esign"] * local_data["exponent"])

        return emit(value, byte_data)

    # Handle the sign only after every state function has been defined;
    # returning earlier would leave the closures above referring to names
    # that were never bound.
    if byte_data == 0x2d:  # -
        local_data["sign"] = -1
        return _start

    return _start(byte_data)


def array_machine(emit):
    local_data = {"array_data": []}

    def _array(byte_data):
        if byte_data == 0x5d:  # ]
            return emit(local_data["array_data"])

        return json_machine(on_value, _comma)(byte_data)

    def on_value(value):
        # nonlocal array_data
        local_data["array_data"].append(value)

    def _comma(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return json_machine(on_value, _comma)

        if byte_data == 0x5d:  # ]
            return emit(local_data["array_data"])

        raise Exception("Unexpected byte: 0x" + str(byte_data) + " in array body")

    return _array


def object_machine(emit):
    local_data = {"object_data": {}, "key": ""}

    def _object(byte_data):
        # nonlocal object_data, key
        if byte_data == 0x7d:  # }
            return emit(local_data["object_data"])

        return _key(byte_data)

    def _key(byte_data):
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _object  # Ignore whitespace

        if byte_data == 0x22:
            return string_machine(on_key)

        raise Exception("Unexpected byte: 0x" + format(byte_data, "x"))

    def on_key(result):
        # nonlocal object_data, key
        local_data["key"] = result
        return _colon

    def _colon(byte_data):
        # nonlocal object_data, key
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _colon  # Ignore whitespace

        if byte_data == 0x3a:  # :
            return json_machine(on_value, _comma)

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    def on_value(value):
        # nonlocal object_data, key
        local_data["object_data"][local_data["key"]] = value

    def _comma(byte_data):
        # nonlocal object_data
        if byte_data == 0x09 or byte_data == 0x0a or byte_data == 0x0d or byte_data == 0x20:
            return _comma  # Ignore whitespace

        if byte_data == 0x2c:  # ,
            return _key

        if byte_data == 0x7d:  # }
            return emit(local_data["object_data"])

        raise Exception("Unexpected byte: 0x" + str(byte_data))

    return _object

Testing it

if __name__ == "__main__":
    test_json = """[1,2,"3"] {"name": 
    "tarun"} 1 2 
    3 [{"name":"a", 
    "data": [1,
    null,2]}]
"""
    def found_json(data):
        print(data)

    state = json_machine(found_json)

    for char in test_json:
        state = state(ord(char))

The output is:

[1, 2, '3']
{'name': 'tarun'}
1
2
3
[{'name': 'a', 'data': [1, None, 2]}]

Answer 6

I’d like to provide a solution. The key idea is to “try” to decode: if decoding fails, feed the decoder more input; otherwise use the offset it returns to prepare the next decode.

However, raw_decode in the current json module does not tolerate whitespace at the start of the string to be decoded, so I have to strip it off.

import sys
import json

def iterload(file):
    buffer = ""
    dec = json.JSONDecoder()
    for line in file:
        buffer = buffer.strip(" \n\r\t") + line.strip(" \n\r\t")
        while True:
            try:
                r = dec.raw_decode(buffer)
            except ValueError:
                # No complete JSON value in the buffer yet: read another line.
                break
            yield r[0]
            buffer = buffer[r[1]:].strip(" \n\r\t")


for o in iterload(sys.stdin):
    print("Working on a", type(o),  o)

I have tested this with several text files, and it works fine.

(in1.txt)

{"foo": ["bar", "baz"]
}
 1 2 [
  ]  4
{"foo1": ["bar1", {"foo2":{"A":1, "B":3}, "DDD":4}]
}
 5   6

(in2.txt)

{"foo"
: ["bar",
  "baz"]
  } 
1 2 [
] 4 5 6

(in.txt, your original input)

{"foo": ["bar", "baz"]} 1 2 [] 4 5 6

(output for Benedict’s test case)

python test.py < in.txt
('Working on a', <type 'list'>, [u'hello'])
('Working on a', <type 'dict'>, {u'goodbye': 1})
('Working on a', <type 'int'>, 1)
('Working on a', <type 'int'>, 2)
('Working on a', <type 'dict'>, {})
('Working on a', <type 'int'>, 2)
('Working on a', <type 'int'>, 9)
('Working on a', <type 'int'>, 78)
('Working on a', <type 'int'>, 4)
('Working on a', <type 'int'>, 5)
('Working on a', <type 'dict'>, {u'animals': [u'dog', u'lots of mice', u'cat']})
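
The same try-then-read-more idea can be applied to fixed-size chunks instead of lines, which also covers input that has no newlines at all. The following is only a sketch under my own assumptions: the name iterload_chunks and the chunk_size parameter are hypothetical, and the stream is assumed to be a text stream. It uses the documented idx argument of raw_decode instead of re-slicing the buffer after every value.

```python
import io
import json
import re

_WS = re.compile(r"[ \t\n\r]*")


def iterload_chunks(stream, chunk_size=4096):
    decoder = json.JSONDecoder()
    buf = ""
    eof = False
    while not eof:
        chunk = stream.read(chunk_size)
        if not chunk:
            eof = True
        buf += chunk
        pos = _WS.match(buf).end()  # raw_decode does not skip leading whitespace
        while pos < len(buf):
            try:
                obj, end = decoder.raw_decode(buf, pos)
            except ValueError:
                if eof:
                    raise  # leftover text that is not valid JSON
                break      # probably an incomplete value: read more
            # A value touching the end of the buffer may be cut off ("12" of
            # "123"), and so may a number followed by ".", "e", "+" or "-".
            is_number = isinstance(obj, (int, float)) and not isinstance(obj, bool)
            cut_off = end == len(buf) or (is_number and buf[end] in ".eE+-")
            if cut_off and not eof:
                break
            yield obj
            pos = _WS.match(buf, end).end()
        buf = buf[pos:]  # keep only the part that has not been decoded yet


if __name__ == "__main__":
    demo = io.StringIO('{"foo": ["bar", "baz"]} 1 2 [] 4 5 6')
    for o in iterload_chunks(demo, chunk_size=5):
        print("Working on a", type(o), o)
```

The extra check before the yield is the price of not relying on line breaks: a number is the only JSON value whose prefix can itself be a valid value, so it is held back until something that cannot belong to it has been seen.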

Answer 7

Here’s mine:

import simplejson as json
from simplejson import JSONDecodeError
class StreamJsonListLoader():
    """
    When you have a big JSON file containing a list, such as

    [{
        ...
    },
    {
        ...
    },
    {
        ...
    },
    ...
    ]

    And it's too big to be practically loaded into memory and parsed by json.load,
    This class comes to the rescue. It lets you lazy-load the large json list.
    """

    def __init__(self, filename_or_stream):
        if isinstance(filename_or_stream, str):
            self.stream = open(filename_or_stream)
        else:
            self.stream = filename_or_stream

        if not self.stream.read(1) == '[':
            raise NotImplementedError('Only JSON-streams of lists (that start with a [) are supported.')

    def __iter__(self):
        return self

    def next(self):
        read_buffer = self.stream.read(1)
        while True:
            try:
                json_obj = json.loads(read_buffer)

                if not self.stream.read(1) in [',',']']:
                    raise Exception('JSON seems to be malformed: object is not followed by comma (,) or end of list (]).')
                return json_obj
            except JSONDecodeError:
                next_char = self.stream.read(1)
                read_buffer += next_char
                while next_char != '}':
                    next_char = self.stream.read(1)
                    if next_char == '':
                        raise StopIteration
                    read_buffer += next_char

    __next__ = next  # Python 3 looks up __next__ rather than next

Answer 8

I used @wuilang’s elegant solution. The simple approach — read a byte, try to decode, read a byte, try to decode, … — worked, but unfortunately it was very slow.

In my case, I was trying to read “pretty-printed” JSON objects of the same object type from a file. This allowed me to optimize the approach; I could read the file line-by-line, only decoding when I found a line that contained exactly “}”:

import json

def iterload(stream):
    buf = ""
    dec = json.JSONDecoder()
    for line in stream:
        line = line.rstrip()
        buf = buf + line
        if line == "}":
            # raw_decode returns (object, end_index); yield only the object
            yield dec.raw_decode(buf)[0]
            buf = ""

If you happen to be working with one-per-line compact JSON that escapes newlines in string literals, then you can safely simplify this approach even more:

def iterload(stream):
    dec = json.JSONDecoder()
    for line in stream:
        # raw_decode returns (object, end_index); yield only the object
        yield dec.raw_decode(line)[0]

Obviously, these simple approaches only work for very specific kinds of JSON. However, if these assumptions hold, these solutions work correctly and quickly.
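
For the one-value-per-line case, a slightly more forgiving variant is sketched below. It assumes the input really is one JSON value per line; the function name iter_json_lines is mine, not something from the json module. It uses json.loads, which skips surrounding whitespace by itself, ignores blank lines, and reports the line number when a line cannot be parsed.

```python
import io
import json


def iter_json_lines(stream):
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between values
        try:
            yield json.loads(line)
        except ValueError as exc:
            raise ValueError("line %d is not valid JSON: %s" % (line_number, exc))


if __name__ == "__main__":
    demo = io.StringIO('{"a": 1}\n\n[2, 3]\n"x\\ny"\n')
    for value in iter_json_lines(demo):
        print(repr(value))
```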


Answer 9

If you use a json.JSONDecoder instance, you can use its raw_decode member function. It returns a tuple of the Python representation of the JSON value and the index at which parsing stopped. This makes it easy to slice off (or seek past, in a stream object) the part already read and continue with the remaining JSON values. I’m not so happy about the extra while loop that skips over the whitespace between the JSON values in the input, but in my opinion it gets the job done.

import json

def yield_multiple_value(f):
    '''
    parses multiple JSON values from a file.
    '''
    vals_str = f.read()
    decoder = json.JSONDecoder()
    try:
        nread = 0
        while nread < len(vals_str):
            val, n = decoder.raw_decode(vals_str[nread:])
            nread += n
            # Skip whitespace between values: raw_decode() does not skip it itself.
            while nread < len(vals_str) and vals_str[nread].isspace():
                nread += 1
            yield val
    except json.JSONDecodeError as e:
        pass
    return

The next version is much shorter and consumes the part of the string that has already been parsed. A second call to json.JSONDecoder.raw_decode() fails when the first character of the string is whitespace, because raw_decode(), unlike decode(), does not skip leading whitespace. That is also why I skip over the whitespace in the while loop above.

def yield_multiple_value(f):
    '''
    parses multiple JSON values from a file.
    '''
    # strip leading whitespace up front so the first raw_decode() call works too
    vals_str = f.read().lstrip()
    decoder = json.JSONDecoder()
    while vals_str:
        val, n = decoder.raw_decode(vals_str)
        # remove the characters that were just read from the start
        vals_str = vals_str[n:]
        # strip leading whitespace: raw_decode() expects the string to begin
        # with a JSON value and, unlike decode(), does not skip whitespace
        vals_str = vals_str.lstrip()
        yield val
    return

The documentation for the raw_decode method of the json.JSONDecoder class (https://docs.python.org/3/library/json.html#encoders-and-decoders) says:

This can be used to decode a JSON document from a string that may have extraneous data at the end.

And this extraneous data can easily be another JSON value. In other words, the method may well have been written with this purpose in mind.

Running the first function above on input.txt, I obtain the example output presented in the original question.
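
For what it’s worth, both the re-slicing and the manual whitespace loop can be avoided by passing a start index to raw_decode (its second parameter, idx) and skipping whitespace with a small regular expression. This is only a sketch; iter_values and _WHITESPACE are names I made up, and like the functions above it still needs the whole input in memory as a string.

```python
import json
import re

_WHITESPACE = re.compile(r"[ \t\n\r]*")


def iter_values(text):
    decoder = json.JSONDecoder()
    # raw_decode must be pointed at the first character of a value
    pos = _WHITESPACE.match(text).end()
    while pos < len(text):
        value, pos = decoder.raw_decode(text, pos)
        pos = _WHITESPACE.match(text, pos).end()  # move to the next value
        yield value


if __name__ == "__main__":
    for v in iter_values(' {"foo": ["bar", "baz"]} 1 2 [] 4 5 6\n'):
        print("Working on a", type(v))
```

Unlike the first version above, this does not swallow decode errors: malformed input raises json.JSONDecodeError at the offending position.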


Answer 10

You can use https://pypi.org/project/json-stream-parser/ for exactly that purpose.

import sys
from json_stream_parser import load_iter
for obj in load_iter(sys.stdin):
    print(obj)

output

{'foo': ['bar', 'baz']}
1
2
[]
4
5
6

Python: How would you save a simple settings/config file?

Question: Python: How would you save a simple settings/config file?

I don’t care if it’s JSON, pickle, YAML, or whatever.

All other implementations I have seen are not forwards compatible, so if I have a config file, add a new key in the code, then load that config file, it’ll just crash.

Is there any simple way to do this?


Answer 0

python中的配置文件

有多种方法可以执行此操作,具体取决于所需的文件格式。

ConfigParser [.ini格式]

除非有令人信服的理由使用其他格式,否则我将使用标准的configparser方法。

像这样写一个文件:

# python 2.x
# from ConfigParser import SafeConfigParser
# config = SafeConfigParser()

# python 3.x
from configparser import ConfigParser
config = ConfigParser()

config.read('config.ini')
config.add_section('main')
config.set('main', 'key1', 'value1')
config.set('main', 'key2', 'value2')
config.set('main', 'key3', 'value3')

with open('config.ini', 'w') as f:
    config.write(f)

文件格式非常简单,其中的部分用方括号标记:

[main]
key1 = value1
key2 = value2
key3 = value3

可以从文件中提取值,如下所示:

# python 2.x
# from ConfigParser import SafeConfigParser
# config = SafeConfigParser()

# python 3.x
from configparser import ConfigParser
config = ConfigParser()

config.read('config.ini')

print(config.get('main', 'key1'))  # -> "value1"
print(config.get('main', 'key2'))  # -> "value2"
print(config.get('main', 'key3'))  # -> "value3"

# getfloat() raises an exception if the value is not a float
a_float = config.getfloat('main', 'a_float')

# getint() and getboolean() also do this for their respective types
an_int = config.getint('main', 'an_int')

JSON [.json格式]

JSON数据可能非常复杂,并且具有高度可移植的优势。

将数据写入文件:

import json

config = {'key1': 'value1', 'key2': 'value2'}

with open('config.json', 'w') as f:
    json.dump(config, f)

从文件读取数据:

import json

with open('config.json', 'r') as f:
    config = json.load(f)

#edit the data
config['key3'] = 'value3'

#write it back to the file
with open('config.json', 'w') as f:
    json.dump(config, f)

YAML

这个答案提供一个基本的YAML示例。可以在pyYAML网站上找到更多详细信息。

Configuration files in python

There are several ways to do this depending on the file format required.

ConfigParser [.ini format]

I would use the standard configparser approach unless there were compelling reasons to use a different format.

Write a file like so:

# python 2.x
# from ConfigParser import SafeConfigParser
# config = SafeConfigParser()

# python 3.x
from configparser import ConfigParser
config = ConfigParser()

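config.read('config.ini')
# add_section() raises DuplicateSectionError if the section was already loaded
if not config.has_section('main'):
    config.add_section('main')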
config.set('main', 'key1', 'value1')
config.set('main', 'key2', 'value2')
config.set('main', 'key3', 'value3')

with open('config.ini', 'w') as f:
    config.write(f)

The file format is very simple with sections marked out in square brackets:

[main]
key1 = value1
key2 = value2
key3 = value3

Values can be extracted from the file like so:

# python 2.x
# from ConfigParser import SafeConfigParser
# config = SafeConfigParser()

# python 3.x
from configparser import ConfigParser
config = ConfigParser()

config.read('config.ini')

print(config.get('main', 'key1'))  # -> "value1"
print(config.get('main', 'key2'))  # -> "value2"
print(config.get('main', 'key3'))  # -> "value3"

# getfloat() raises an exception if the value is not a float
a_float = config.getfloat('main', 'a_float')

# getint() and getboolean() also do this for their respective types
an_int = config.getint('main', 'an_int')

JSON [.json format]

JSON data can be very complex and has the advantage of being highly portable.

Write data to a file:

import json

config = {'key1': 'value1', 'key2': 'value2'}

with open('config.json', 'w') as f:
    json.dump(config, f)

Read data from a file:

import json

with open('config.json', 'r') as f:
    config = json.load(f)

#edit the data
config['key3'] = 'value3'

#write it back to the file
with open('config.json', 'w') as f:
    json.dump(config, f)

YAML

A basic YAML example is provided in this answer. More details can be found on the pyYAML website.


回答 1

ConfigParser Basic示例

该文件可以像这样加载和使用:

#!/usr/bin/env python

import ConfigParser
import io

# Load the configuration file
with open("config.yml") as f:
    sample_config = f.read()
config = ConfigParser.RawConfigParser(allow_no_value=True)
config.readfp(io.BytesIO(sample_config))

# List all contents
print("List all contents")
for section in config.sections():
    print("Section: %s" % section)
    for options in config.options(section):
        print("x %s:::%s:::%s" % (options,
                                  config.get(section, options),
                                  str(type(options))))

# Print some contents
print("\nPrint some contents")
print(config.get('other', 'use_anonymous'))  # Just get the value
print(config.getboolean('other', 'use_anonymous'))  # You know the datatype?

哪个输出

List all contents
Section: mysql
x host:::localhost:::<type 'str'>
x user:::root:::<type 'str'>
x passwd:::my secret password:::<type 'str'>
x db:::write-math:::<type 'str'>
Section: other
x preprocessing_queue:::["preprocessing.scale_and_center",
"preprocessing.dot_reduction",
"preprocessing.connect_lines"]:::<type 'str'>
x use_anonymous:::yes:::<type 'str'>

Print some contents
yes
True

如您所见,您可以使用易于读写的标准数据格式。诸如getboolean和getint之类的方法允许您获取数据类型,而不是简单的字符串。

编写配置

import os
configfile_name = "config.yaml"

# Check if there is already a configuration file
if not os.path.isfile(configfile_name):
    # Create the configuration file as it doesn't exist yet
    cfgfile = open(configfile_name, 'w')

    # Add content to the file
    Config = ConfigParser.ConfigParser()
    Config.add_section('mysql')
    Config.set('mysql', 'host', 'localhost')
    Config.set('mysql', 'user', 'root')
    Config.set('mysql', 'passwd', 'my secret password')
    Config.set('mysql', 'db', 'write-math')
    Config.add_section('other')
    Config.set('other',
               'preprocessing_queue',
               ['preprocessing.scale_and_center',
                'preprocessing.dot_reduction',
                'preprocessing.connect_lines'])
    Config.set('other', 'use_anonymous', True)
    Config.write(cfgfile)
    cfgfile.close()

结果是

[mysql]
host = localhost
user = root
passwd = my secret password
db = write-math

[other]
preprocessing_queue = ['preprocessing.scale_and_center', 'preprocessing.dot_reduction', 'preprocessing.connect_lines']
use_anonymous = True

XML基本示例

Python社区似乎根本不使用XML作为配置文件。但是,解析/编写XML很容易,并且使用Python可以有很多可能性。一个是BeautifulSoup:

from BeautifulSoup import BeautifulSoup

with open("config.xml") as f:
    content = f.read()

y = BeautifulSoup(content)
print(y.mysql.host.contents[0])
for tag in y.other.preprocessing_queue:
    print(tag)

config.xml可能看起来像这样

<config>
    <mysql>
        <host>localhost</host>
        <user>root</user>
        <passwd>my secret password</passwd>
        <db>write-math</db>
    </mysql>
    <other>
        <preprocessing_queue>
            <li>preprocessing.scale_and_center</li>
            <li>preprocessing.dot_reduction</li>
            <li>preprocessing.connect_lines</li>
        </preprocessing_queue>
        <use_anonymous value="true" />
    </other>
</config>

ConfigParser Basic example

The file can be loaded and used like this:

#!/usr/bin/env python

import ConfigParser
import io

# Load the configuration file
with open("config.yml") as f:
    sample_config = f.read()
config = ConfigParser.RawConfigParser(allow_no_value=True)
config.readfp(io.BytesIO(sample_config))

# List all contents
print("List all contents")
for section in config.sections():
    print("Section: %s" % section)
    for options in config.options(section):
        print("x %s:::%s:::%s" % (options,
                                  config.get(section, options),
                                  str(type(options))))

# Print some contents
print("\nPrint some contents")
print(config.get('other', 'use_anonymous'))  # Just get the value
print(config.getboolean('other', 'use_anonymous'))  # You know the datatype?

which outputs

List all contents
Section: mysql
x host:::localhost:::<type 'str'>
x user:::root:::<type 'str'>
x passwd:::my secret password:::<type 'str'>
x db:::write-math:::<type 'str'>
Section: other
x preprocessing_queue:::["preprocessing.scale_and_center",
"preprocessing.dot_reduction",
"preprocessing.connect_lines"]:::<type 'str'>
x use_anonymous:::yes:::<type 'str'>

Print some contents
yes
True

As you can see, you can use a standard data format that is easy to read and write. Methods like getboolean and getint allow you to get the datatype instead of a simple string.
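The listing above targets Python 2. As a rough Python 3 illustration of the typed getters, here is a small sketch using the configparser module. The option names retries and timeout are invented for this example and are not part of the original config file.

```python
from configparser import ConfigParser

# Sample INI text; 'retries' and 'timeout' are hypothetical options.
SAMPLE = """
[other]
use_anonymous = yes
retries = 3
timeout = 2.5
"""

config = ConfigParser()
config.read_string(SAMPLE)

raw = config.get("other", "use_anonymous")          # plain string
flag = config.getboolean("other", "use_anonymous")  # parsed as a bool
retries = config.getint("other", "retries")         # parsed as an int
timeout = config.getfloat("other", "timeout")       # parsed as a float
print(repr(raw), flag, retries, timeout)
```

The typed getters raise ValueError when the stored text cannot be converted, so a malformed value surfaces at load time rather than later.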

Writing configuration

import os
configfile_name = "config.yaml"

# Check if there is already a configuration file
if not os.path.isfile(configfile_name):
    # Create the configuration file as it doesn't exist yet
    cfgfile = open(configfile_name, 'w')

    # Add content to the file
    Config = ConfigParser.ConfigParser()
    Config.add_section('mysql')
    Config.set('mysql', 'host', 'localhost')
    Config.set('mysql', 'user', 'root')
    Config.set('mysql', 'passwd', 'my secret password')
    Config.set('mysql', 'db', 'write-math')
    Config.add_section('other')
    Config.set('other',
               'preprocessing_queue',
               ['preprocessing.scale_and_center',
                'preprocessing.dot_reduction',
                'preprocessing.connect_lines'])
    Config.set('other', 'use_anonymous', True)
    Config.write(cfgfile)
    cfgfile.close()

results in

[mysql]
host = localhost
user = root
passwd = my secret password
db = write-math

[other]
preprocessing_queue = ['preprocessing.scale_and_center', 'preprocessing.dot_reduction', 'preprocessing.connect_lines']
use_anonymous = True

XML Basic example

XML seems not to be used at all for configuration files by the Python community. However, parsing and writing XML is easy, and there are plenty of ways to do so with Python. One is BeautifulSoup:

from BeautifulSoup import BeautifulSoup

with open("config.xml") as f:
    content = f.read()

y = BeautifulSoup(content)
print(y.mysql.host.contents[0])
for tag in y.other.preprocessing_queue:
    print(tag)

where the config.xml might look like this

<config>
    <mysql>
        <host>localhost</host>
        <user>root</user>
        <passwd>my secret password</passwd>
        <db>write-math</db>
    </mysql>
    <other>
        <preprocessing_queue>
            <li>preprocessing.scale_and_center</li>
            <li>preprocessing.dot_reduction</li>
            <li>preprocessing.connect_lines</li>
        </preprocessing_queue>
        <use_anonymous value="true" />
    </other>
</config>

回答 2

如果要使用INI文件之类的东西来保存设置,请考虑使用configparser,它可以从文本文件加载键值对,并可以轻松地写回该文件。

INI文件的格式为:

[Section]
key = value
key with spaces = somevalue

If you want to use something like an INI file to hold settings, consider using configparser, which loads key-value pairs from a text file and can easily write them back to the file.

An INI file has the format:

[Section]
key = value
key with spaces = somevalue
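As a sketch of the round trip described above (Python 3, with a temporary file path chosen only for this illustration), writing and re-reading such a file could look like this. The fallback argument is one way to tolerate a key that was added to the code after the file was written; the key name "added later" is hypothetical.

```python
import os
import tempfile
from configparser import ConfigParser

# Temporary location used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "settings.ini")

# Write the settings out.
config = ConfigParser()
config["Section"] = {"key": "value", "key with spaces": "somevalue"}
with open(path, "w") as f:
    config.write(f)

# Read them back; a missing option falls back to a default instead of raising.
loaded = ConfigParser()
loaded.read(path)
value = loaded.get("Section", "key with spaces")
missing = loaded.get("Section", "added later", fallback="default")
print(value, missing)
```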

回答 3

保存并加载字典。您将拥有任意键,值和任意数量的键,值对。

Save and load a dictionary. You can store arbitrary keys and values, and any number of key-value pairs.
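One possible way to do that with the standard json module is sketched below. The DEFAULTS dictionary, the function names, and the file path are all assumptions made for this example. Merging the loaded data over a dictionary of defaults is what keeps an old file working after a new key is added in code.

```python
import json
import os
import tempfile

# Hypothetical defaults; every key the code knows about lives here.
DEFAULTS = {"theme": "light", "font_size": 12}

def load_settings(path):
    """Return the defaults overlaid with whatever the file contains."""
    settings = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path) as f:
            settings.update(json.load(f))
    return settings

def save_settings(path, settings):
    """Write the settings dictionary as indented JSON."""
    with open(path, "w") as f:
        json.dump(settings, f, indent=2)

path = os.path.join(tempfile.mkdtemp(), "settings.json")
save_settings(path, {"theme": "dark"})  # an "old" file without font_size
settings = load_settings(path)
print(settings)
```

Keys missing from the file simply keep their default value, so loading never fails on a key the file has not seen yet.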


回答 4

尝试使用ReadSettings

from readsettings import ReadSettings
data = ReadSettings("settings.json") # Load or create any json, yml, yaml or toml file
data["name"] = "value" # Set "name" to "value"
data["name"] # Returns: "value"

Try using ReadSettings:

from readsettings import ReadSettings
data = ReadSettings("settings.json") # Load or create any json, yml, yaml or toml file
data["name"] = "value" # Set "name" to "value"
data["name"] # Returns: "value"

回答 5

尝试使用cfg4py

  1. 分层设计,支持多种环境,因此切勿将开发人员设置与生产站点设置混淆。
  2. 代码完成。Cfg4py会将您的Yaml转换为python类,然后在您键入代码时可以完成代码。
  3. 还有很多..

免责声明:我是这个模块的作者

Try using cfg4py:

  1. Hierarchical design with support for multiple environments, so dev settings never get mixed up with production site settings.
  2. Code completion. Cfg4py converts your YAML into a Python class, so code completion is available while you type.
  3. Many more.

DISCLAIMER: I’m the author of this module


Format floats with standard json module

Question: Format floats with standard json module

我正在使用python 2.6中的标准json模块来序列化float列表。但是,我得到这样的结果:

>>> import json
>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

我希望浮点数仅使用两位十进制数字进行格式化。输出应如下所示:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

我尝试定义自己的JSON Encoder类:

class MyEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, float):
            return format(obj, '.2f')
        return json.JSONEncoder.encode(self, obj)

这适用于唯一的float对象:

>>> json.dumps(23.67, cls=MyEncoder)
'23.67'

但是对于嵌套对象失败:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

我不想有外部依赖性,所以我更喜欢使用标准的json模块。

我该如何实现?

I am using the standard json module in python 2.6 to serialize a list of floats. However, I’m getting results like this:

>>> import json
>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I want the floats to be formatted with only two decimal digits. The output should look like this:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

I have tried defining my own JSON Encoder class:

class MyEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, float):
            return format(obj, '.2f')
        return json.JSONEncoder.encode(self, obj)

This works for a sole float object:

>>> json.dumps(23.67, cls=MyEncoder)
'23.67'

But fails for nested objects:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I don’t want to have external dependencies, so I prefer to stick with the standard json module.

How can I achieve this?


回答 0

注意:这在任何较新版本的Python中都不起作用。

不幸的是,我相信您必须通过Monkey补丁来做到这一点(我认为这表明标准库json软件包中存在设计缺陷)。例如,此代码:

import json
from json import encoder
encoder.FLOAT_REPR = lambda o: format(o, '.2f')
    
print(json.dumps(23.67))
print(json.dumps([23.67, 23.97, 23.87]))

发出:

23.67
[23.67, 23.97, 23.87]

如您所愿。显然,应该有一种覆盖的结构化方法,FLOAT_REPR以便您可以控制浮点数的每个表示形式;但不幸的是,这不是json包装的设计方式:-(。

Note: This does not work in any recent version of Python.

Unfortunately, I believe you have to do this by monkey-patching (which, in my opinion, indicates a design defect in the standard library json package). E.g., this code:

import json
from json import encoder
encoder.FLOAT_REPR = lambda o: format(o, '.2f')
    
print(json.dumps(23.67))
print(json.dumps([23.67, 23.97, 23.87]))

emits:

23.67
[23.67, 23.97, 23.87]

as you desire. Obviously, there should be an architected way to override FLOAT_REPR so that EVERY representation of a float is under your control if you wish it to be; but unfortunately that’s not how the json package was designed:-(.


回答 1

import simplejson
    
class PrettyFloat(float):
    def __repr__(self):
        return '%.15g' % self
    
def pretty_floats(obj):
    if isinstance(obj, float):
        return PrettyFloat(obj)
    elif isinstance(obj, dict):
        return dict((k, pretty_floats(v)) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        return list(map(pretty_floats, obj))
    return obj
    
print(simplejson.dumps(pretty_floats([23.67, 23.97, 23.87])))

发出

[23.67, 23.97, 23.87]

无需进行Monkey修补。

import simplejson
    
class PrettyFloat(float):
    def __repr__(self):
        return '%.15g' % self
    
def pretty_floats(obj):
    if isinstance(obj, float):
        return PrettyFloat(obj)
    elif isinstance(obj, dict):
        return dict((k, pretty_floats(v)) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        return list(map(pretty_floats, obj))
    return obj
    
print(simplejson.dumps(pretty_floats([23.67, 23.97, 23.87])))

emits

[23.67, 23.97, 23.87]

No monkeypatching necessary.


回答 2

如果您使用的是Python 2.7,一个简单的解决方案是将浮点数显式舍入到所需的精度。

>>> sys.version
'2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]'
>>> json.dumps(1.0/3.0)
'0.3333333333333333'
>>> json.dumps(round(1.0/3.0, 2))
'0.33'

之所以有效,是因为Python 2.7使浮点舍入更加一致。不幸的是,这在Python 2.6中不起作用:

>>> sys.version
'2.6.6 (r266:84292, Dec 27 2010, 00:02:40) \n[GCC 4.4.5]'
>>> json.dumps(round(1.0/3.0, 2))
'0.33000000000000002'

上面提到的解决方案是2.6的解决方法,但没有一个是完全足够的。如果您的Python运行时使用JSON模块的C版本,则Monkey修补json.encoder.FLOAT_REPR不起作用。Tom Wuttke的答案中的PrettyFloat类起作用,但是仅当%g编码对于您的应用程序全局起作用时。%.15g有点魔术,它可以工作,因为浮点精度是17个有效数字,%g不打印尾随零。

我花了一些时间尝试制作一个PrettyFloat,它允许为每个数字自定义精度。即,像这样的语法

>>> json.dumps(PrettyFloat(1.0 / 3.0, 4))
'0.3333'

要做到这一点并不容易。从float继承很尴尬。从Object继承并使用带有自己的default()方法的JSONEncoder子类应该可以工作,除了json模块似乎假定所有自定义类型都应序列化为字符串。即:您最终在输出中使用Javascript字符串“ 0.33”,而不是数字0.33。也许还有一种方法可以使这项工作完成,但是比看起来要难。

If you’re using Python 2.7, a simple solution is to round your floats explicitly to the desired precision.

>>> sys.version
'2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]'
>>> json.dumps(1.0/3.0)
'0.3333333333333333'
>>> json.dumps(round(1.0/3.0, 2))
'0.33'

This works because Python 2.7 made float rounding more consistent. Unfortunately this does not work in Python 2.6:

>>> sys.version
'2.6.6 (r266:84292, Dec 27 2010, 00:02:40) \n[GCC 4.4.5]'
>>> json.dumps(round(1.0/3.0, 2))
'0.33000000000000002'

The solutions mentioned above are workarounds for 2.6, but none are entirely adequate. Monkey patching json.encoder.FLOAT_REPR does not work if your Python runtime uses a C version of the JSON module. The PrettyFloat class in Tom Wuttke’s answer works, but only if %g encoding works globally for your application. The %.15g is a bit magic: it works because float precision is 17 significant digits and %g does not print trailing zeroes.
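To make the %.15g remark concrete, here is a tiny sketch comparing 17 and 15 significant digits for one of the values from the question. The variable names are just for illustration.

```python
value = 23.67

# 17 significant digits expose the binary representation error.
full = "%.17g" % value
# 15 significant digits hide it, and %g drops trailing zeroes.
short = "%.15g" % value

print(full)
print(short)
# The short form still parses back to the same float.
print(float(short) == value)
```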

I spent some time trying to make a PrettyFloat that allowed customization of precision for each number, i.e. a syntax like

>>> json.dumps(PrettyFloat(1.0 / 3.0, 4))
'0.3333'

It’s not easy to get this right. Inheriting from float is awkward. Inheriting from object and using a JSONEncoder subclass with its own default() method should work, except the json module seems to assume all custom types should be serialized as strings. That is, you end up with the JavaScript string “0.33” in the output, not the number 0.33. There may yet be a way to make this work, but it’s harder than it looks.


回答 3

真不幸,dumps这使您无法做任何漂浮的事情。但是loads确实如此。因此,如果您不介意额外的CPU负载,则可以将其扔到编码器/解码器/编码器中,并得到正确的结果:

>>> json.dumps(json.loads(json.dumps([.333333333333, .432432]), parse_float=lambda x: round(float(x), 3)))
'[0.333, 0.432]'

Really unfortunate that dumps doesn’t allow you to do anything to floats. However loads does. So if you don’t mind the extra CPU load, you could throw it through the encoder/decoder/encoder and get the right result:

>>> json.dumps(json.loads(json.dumps([.333333333333, .432432]), parse_float=lambda x: round(float(x), 3)))
'[0.333, 0.432]'

回答 4

这是在Python 3中对我有用的解决方案,不需要Monkey补丁:

import json

def round_floats(o):
    if isinstance(o, float): return round(o, 2)
    if isinstance(o, dict): return {k: round_floats(v) for k, v in o.items()}
    if isinstance(o, (list, tuple)): return [round_floats(x) for x in o]
    return o


json.dumps(round_floats([23.63437, 23.93437, 23.842347]))

输出为:

[23.63, 23.93, 23.84]

它复制数据,但具有四舍五入的浮点数。

Here’s a solution that worked for me in Python 3 and does not require monkey patching:

import json

def round_floats(o):
    if isinstance(o, float): return round(o, 2)
    if isinstance(o, dict): return {k: round_floats(v) for k, v in o.items()}
    if isinstance(o, (list, tuple)): return [round_floats(x) for x in o]
    return o


json.dumps(round_floats([23.63437, 23.93437, 23.842347]))

Output is:

[23.63, 23.93, 23.84]

It copies the data but with rounded floats.


回答 5

如果您坚持使用Python 2.5或更早版本:如果安装了C加速,则Monkey-patch技巧似乎不适用于原始的simplejson模块:

$ python
Python 2.5.4 (r254:67916, Jan 20 2009, 11:06:13) 
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import simplejson
>>> simplejson.__version__
'2.0.9'
>>> simplejson._speedups
<module 'simplejson._speedups' from '/home/carlos/.python-eggs/simplejson-2.0.9-py2.5-linux-i686.egg-tmp/simplejson/_speedups.so'>
>>> simplejson.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'
>>> simplejson.encoder.c_make_encoder = None
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'
>>> 

If you’re stuck with Python 2.5 or earlier versions: The monkey-patch trick does not seem to work with the original simplejson module if the C speedups are installed:

$ python
Python 2.5.4 (r254:67916, Jan 20 2009, 11:06:13) 
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import simplejson
>>> simplejson.__version__
'2.0.9'
>>> simplejson._speedups
<module 'simplejson._speedups' from '/home/carlos/.python-eggs/simplejson-2.0.9-py2.5-linux-i686.egg-tmp/simplejson/_speedups.so'>
>>> simplejson.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'
>>> simplejson.encoder.c_make_encoder = None
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'
>>> 

回答 6

您可以做您需要做的事情,但是没有记录:

>>> import json
>>> json.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

You can do what you need to do, but it isn’t documented:

>>> import json
>>> json.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

回答 7

Alex Martelli的解决方案将适用于单线程应用程序,但不适用于需要控制每个线程的小数位数的多线程应用程序。这是一种应在多线程应用程序中使用的解决方案:

import threading
from json import encoder

def FLOAT_REPR(f):
    """
    Serialize a float to a string, with a given number of digits
    """
    decimal_places = getattr(encoder.thread_local, 'decimal_places', 0)
    format_str = '%%.%df' % decimal_places
    return format_str % f

encoder.thread_local = threading.local()
encoder.FLOAT_REPR = FLOAT_REPR     

#As an example, call like this:
import json

encoder.thread_local.decimal_places = 1
json.dumps([1.56, 1.54]) #Should result in '[1.6, 1.5]'

您仅可以将encoder.thread_local.decimal_places设置为所需的小数位数,而该线程中对json.dumps()的下一次调用将使用该小数位数

Alex Martelli’s solution will work for single threaded apps, but may not work for multi-threaded apps that need to control the number of decimal places per thread. Here is a solution that should work in multi threaded apps:

import threading
from json import encoder

def FLOAT_REPR(f):
    """
    Serialize a float to a string, with a given number of digits
    """
    decimal_places = getattr(encoder.thread_local, 'decimal_places', 0)
    format_str = '%%.%df' % decimal_places
    return format_str % f

encoder.thread_local = threading.local()
encoder.FLOAT_REPR = FLOAT_REPR     

#As an example, call like this:
import json

encoder.thread_local.decimal_places = 1
json.dumps([1.56, 1.54]) #Should result in '[1.6, 1.5]'

You only need to set encoder.thread_local.decimal_places to the number of decimal places you want, and the next call to json.dumps() in that thread will use that number of decimal places.


回答 8

如果您需要在python 2.7中执行此操作而不覆盖全局json.encoder.FLOAT_REPR,这是一种方法。

import json
import math

class MyEncoder(json.JSONEncoder):
    "JSON encoder that renders floats to two decimal places"

    FLOAT_FRMT = '{0:.2f}'

    def floatstr(self, obj):
        return self.FLOAT_FRMT.format(obj)

    def _iterencode(self, obj, markers=None):
        # stl JSON lame override #1
        new_obj = obj
        if isinstance(obj, float):
            if not math.isnan(obj) and not math.isinf(obj):
                new_obj = self.floatstr(obj)
        return super(MyEncoder, self)._iterencode(new_obj, markers=markers)

    def _iterencode_dict(self, dct, markers=None):
        # stl JSON lame override #2
        new_dct = {}
        for key, value in dct.iteritems():
            if isinstance(key, float):
                if not math.isnan(key) and not math.isinf(key):
                    key = self.floatstr(key)
            new_dct[key] = value
        return super(MyEncoder, self)._iterencode_dict(new_dct, markers=markers)

然后,在python 2.7中:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()
>>> enc.encode([23.67, 23.98, 23.87])
'[23.67, 23.98, 23.87]'

在python 2.6中,它无法正常工作,正如Matthew Schinckel指出的那样:

>>> import MyEncoder
>>> enc = MyEncoder()  
>>> enc.encode([23.67, 23.97, 23.87])
'["23.67", "23.97", "23.87"]'

If you need to do this in python 2.7 without overriding the global json.encoder.FLOAT_REPR, here’s one way.

import json
import math

class MyEncoder(json.JSONEncoder):
    "JSON encoder that renders floats to two decimal places"

    FLOAT_FRMT = '{0:.2f}'

    def floatstr(self, obj):
        return self.FLOAT_FRMT.format(obj)

    def _iterencode(self, obj, markers=None):
        # stl JSON lame override #1
        new_obj = obj
        if isinstance(obj, float):
            if not math.isnan(obj) and not math.isinf(obj):
                new_obj = self.floatstr(obj)
        return super(MyEncoder, self)._iterencode(new_obj, markers=markers)

    def _iterencode_dict(self, dct, markers=None):
        # stl JSON lame override #2
        new_dct = {}
        for key, value in dct.iteritems():
            if isinstance(key, float):
                if not math.isnan(key) and not math.isinf(key):
                    key = self.floatstr(key)
            new_dct[key] = value
        return super(MyEncoder, self)._iterencode_dict(new_dct, markers=markers)

Then, in python 2.7:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()
>>> enc.encode([23.67, 23.98, 23.87])
'[23.67, 23.98, 23.87]'

In python 2.6, it doesn’t quite work as Matthew Schinckel points out below:

>>> import MyEncoder
>>> enc = MyEncoder()  
>>> enc.encode([23.67, 23.97, 23.87])
'["23.67", "23.97", "23.87"]'

回答 9

优点:

  • 适用于任何JSON编码器,甚至python的repr。
  • 短(ish),似乎起作用。

缺点:

  • 丑陋的regexp hack,未经测试。
  • 二次复杂度。

    import re

    def fix_floats(json, decimals=2, quote='"'):
        pattern = r'^((?:(?:"(?:\\.|[^\\"])*?")|[^"])*?)(-?\d+\.\d{' + str(decimals) + r'}\d+)'
        pattern = re.sub('"', quote, pattern)
        fmt = "%%.%df" % decimals
        n = 1
        while n:
            # Format first, then strip trailing zeroes from the resulting string
            json, n = re.subn(pattern, lambda m: m.group(1) + (fmt % float(m.group(2))).rstrip('0'), json)
        return json

Pros:

  • Works with any JSON encoder, or even python’s repr.
  • Short(ish), seems to work.

Cons:

  • Ugly regexp hack, barely tested.
  • Quadratic complexity.

    import re

    def fix_floats(json, decimals=2, quote='"'):
        pattern = r'^((?:(?:"(?:\\.|[^\\"])*?")|[^"])*?)(-?\d+\.\d{' + str(decimals) + r'}\d+)'
        pattern = re.sub('"', quote, pattern)
        fmt = "%%.%df" % decimals
        n = 1
        while n:
            # Format first, then strip trailing zeroes from the resulting string
            json, n = re.subn(pattern, lambda m: m.group(1) + (fmt % float(m.group(2))).rstrip('0'), json)
        return json
    

回答 10

导入标准json模块时,只需更改默认编码器FLOAT_REPR。确实不需要导入或创建Encoder实例。

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.2f')

json.dumps([23.67, 23.97, 23.87]) #returns  '[23.67, 23.97, 23.87]'

有时,将python可以用str猜出的最佳表示形式作为json输出也非常有用。这将确保重要数字不会被忽略。

import json
json.dumps([23.67, 23.9779, 23.87489])
# output is'[23.670000000000002, 23.977900000000002, 23.874890000000001]'

json.encoder.FLOAT_REPR = str
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.67, 23.9779, 23.87489]'

When importing the standard json module, it is enough to change the default encoder FLOAT_REPR. There isn’t really the need to import or create Encoder instances.

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.2f')

json.dumps([23.67, 23.97, 23.87]) #returns  '[23.67, 23.97, 23.87]'

Sometimes it is also very useful to output as JSON the best representation Python can guess with str. This makes sure significant digits are not ignored.

import json
json.dumps([23.67, 23.9779, 23.87489])
# output is'[23.670000000000002, 23.977900000000002, 23.874890000000001]'

json.encoder.FLOAT_REPR = str
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.67, 23.9779, 23.87489]'

回答 11

我同意@Nelson的观点,从float继承是很尴尬的,但是也许只涉及__repr__函数的解决方案是可以原谅的。我最终使用该decimal软件包在需要时重新格式化浮点数。好处是,这在所有repr()被调用的上下文中都有效,例如在简单地将列表打印到stdout时也是如此。同样,创建数据后,精度可以在运行时配置。缺点当然是您的数据需要转换为特殊的float类(不幸的是,您似乎无法获得Monkey补丁float.__repr__)。为此,我提供了一个简短的转换功能。

代码:

import decimal
C = decimal.getcontext()

class decimal_formatted_float(float):
   def __repr__(self):
       s = str(C.create_decimal_from_float(self))
       if '.' in s: s = s.rstrip('0')
       return s

def convert_to_dff(elem):
    try:
        return elem.__class__(map(convert_to_dff, elem))
    except:
        if isinstance(elem, float):
            return decimal_formatted_float(elem)
        else:
            return elem

用法示例:

>>> import json
>>> li = [(1.2345,),(7.890123,4.567,890,890.)]
>>>
>>> decimal.getcontext().prec = 15
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.2345,), (7.890123, 4.567, 890, 890)]
>>> json.dumps(dff_li)
'[[1.2345], [7.890123, 4.567, 890, 890]]'
>>>
>>> decimal.getcontext().prec = 3
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.23,), (7.89, 4.57, 890, 890)]
>>> json.dumps(dff_li)
'[[1.23], [7.89, 4.57, 890, 890]]'

I agree with @Nelson that inheriting from float is awkward, but perhaps a solution that only touches the __repr__ function might be forgivable. I ended up using the decimal package to reformat floats when needed. The upside is that this works in all contexts where repr() is called, for example when simply printing lists to stdout. The precision is also configurable at runtime, after the data has been created. The downside, of course, is that your data needs to be converted to this special float class (unfortunately, you cannot seem to monkey patch float.__repr__). For that I provide a brief conversion function.

The code:

import decimal
C = decimal.getcontext()

class decimal_formatted_float(float):
   def __repr__(self):
       s = str(C.create_decimal_from_float(self))
       if '.' in s: s = s.rstrip('0')
       return s

def convert_to_dff(elem):
    try:
        return elem.__class__(map(convert_to_dff, elem))
    except:
        if isinstance(elem, float):
            return decimal_formatted_float(elem)
        else:
            return elem

Usage example:

>>> import json
>>> li = [(1.2345,),(7.890123,4.567,890,890.)]
>>>
>>> decimal.getcontext().prec = 15
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.2345,), (7.890123, 4.567, 890, 890)]
>>> json.dumps(dff_li)
'[[1.2345], [7.890123, 4.567, 890, 890]]'
>>>
>>> decimal.getcontext().prec = 3
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.23,), (7.89, 4.57, 890, 890)]
>>> json.dumps(dff_li)
'[[1.23], [7.89, 4.57, 890, 890]]'

回答 12

使用numpy

如果您实际上有很长的浮动,则可以使用numpy将其正确向上/向下取整:

import json 

import numpy as np

data = np.array([23.671234, 23.97432, 23.870123])

json.dumps(np.around(data, decimals=2).tolist())

'[23.67, 23.97, 23.87]'

Using numpy

If you actually have really long floats you can round them up/down correctly with numpy:

import json 

import numpy as np

data = np.array([23.671234, 23.97432, 23.870123])

json.dumps(np.around(data, decimals=2).tolist())

'[23.67, 23.97, 23.87]'


回答 13

我刚刚发布了fjson(一个小的Python库)来解决此问题。与安装

pip install fjson

并使用like json,并添加float_format参数:

import math
import fjson


data = {"a": 1, "b": math.pi}
print(fjson.dumps(data, float_format=".6e", indent=2))
{
  "a": 1,
  "b": 3.141593e+00
}

I just released fjson, a small Python library to fix this issue. Install with

pip install fjson

and use just like json, with the addition of the float_format parameter:

import math
import fjson


data = {"a": 1, "b": math.pi}
print(fjson.dumps(data, float_format=".6e", indent=2))
{
  "a": 1,
  "b": 3.141593e+00
}

How to beautify JSON in Python?

Question: How to beautify JSON in Python?

有人可以建议我如何使用Python或通过命令行美化JSON吗?

唯一可以做到的基于在线的JSON美化器是:http://jsonviewer.stack.hu/

但是,我需要在Python中使用它。

这是我的数据集:

{ "head": {"vars": [ "address" , "description" ,"listprice" ]} , "results": { "bindings": [ 
    {
        "address" : { "type":"string", "value" : " Dyne Road, London NW6"},
            "description" :{ "type":"string", "value" : "6 bed semi detached house"},
            "listprice" : { "type":"string", "value" : "1,150,000"}
    }
    ,
        {
            "address" : { "type":"string", "value" : " Tweedy Road, Bromley BR1"},
            "description" :{ "type":"string", "value" : "5 bed terraced house"},
            "listprice" : { "type":"string", "value" : "550,000"}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Vera Avenue, London N21"},
            "description" :{ "type":"string", "value" : "4 bed detached house"},
            "listprice" : { "type":"string", "value" : "

                995,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Wimbledon Park Side, London SW19"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Westbere Road, West Hampstead, London NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " The Avenue, Hatch End, Pinner HA5"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Princes Park Avenue, London NW11"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Canons Drive, Edgware HA8"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Westbere Road, West Hampstead NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Haymills Estate, Ealing, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Dene Terrace Woodclyffe Drive, Chislehurst, Kent BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  terraced house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Dene Terrace Woodclyffe Drive, Chislehurst, Kent BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Northwick Close, St John's Wood NW8"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Claremont Gardens, Surbiton KT6"},
            "description" :{ "type":"string", "value" : "13 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Dene Terrace Woodclyffe Drive, Chislehurst, Kent BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  end terrace house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stamford Road, London N1"},
            "description" :{ "type":"string", "value" : "4 bedroom  terraced house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stanhope Avenue, London N3"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Haymills Estate, Ealing, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Elms Crescent, London SW4"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Princes Park Avenue, London NW11"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Abbeville Road, London SW4"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Canons Drive, Edgware HA8"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Henson Avenue, Willesdon Green NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Woodstock Road, London NW11"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Tamworth Street, London SW6"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stanhope Avenue, Finchley, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " The Old Burlington, Church Street, London W4"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ebury Close, Northwood HA6"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Middleton Road, London NW11"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Henson Avenue, Willesden Green NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Huron Road, London SW17"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Corringway, Ealing W5"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Woodlands Avenue, New Malden KT3"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Gunnersbury Park Area, Ealing, London"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Blenheim Gardens, London, Brent NW2"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Creighton Road, London NW6"},
            "description" :{ "type":"string", "value" : "4 bedroom  terraced house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Plaistow Lane, Bromley BR1"},
            "description" :{ "type":"string", "value" : "7 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Greenfield Gardens, London NW2"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Hendon Avenue, London N3"},
            "description" :{ "type":"string", "value" : "3 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Peckham Park Road, London SE15"},
            "description" :{ "type":"string", "value" : "6 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Woodclyffe Drive, Chislehurst BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  house for sale"},
            "listprice" : { "type":"string", "value" : "

                From 1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Highwood Hill, Mill Hill, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stanhope Avenue, London N3"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Kersley Mews, London SW11"},
            "description" :{ "type":"string", "value" : "3 bedroom  mews for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ebury Close, Northwood HA6"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ellesmere Road, Chiswick W4"},
            "description" :{ "type":"string", "value" : "6 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " The Avenue, Hatch End, Pinner, Middlesex"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Wandsworth, London SW18"},
            "description" :{ "type":"string", "value" : "6 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Carlton Road, New Malden KT3"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " St Mary's Mews, Ealing W5"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ritherdon Road, Balham, London SW17"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Goldsmith Avenue, London W3"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Plaistow Lane, Bromley, Kent BR1"},
            "description" :{ "type":"string", "value" : "7 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ] } }

Can someone suggest how I can beautify JSON in Python or through the command line?

The only online JSON beautifier that could do it was: http://jsonviewer.stack.hu/.

I need to use it from within Python, however.

This is my dataset:

{ "head": {"vars": [ "address" , "description" ,"listprice" ]} , "results": { "bindings": [ 
    {
        "address" : { "type":"string", "value" : " Dyne Road, London NW6"},
            "description" :{ "type":"string", "value" : "6 bed semi detached house"},
            "listprice" : { "type":"string", "value" : "1,150,000"}
    }
    ,
        {
            "address" : { "type":"string", "value" : " Tweedy Road, Bromley BR1"},
            "description" :{ "type":"string", "value" : "5 bed terraced house"},
            "listprice" : { "type":"string", "value" : "550,000"}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Vera Avenue, London N21"},
            "description" :{ "type":"string", "value" : "4 bed detached house"},
            "listprice" : { "type":"string", "value" : "

                995,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Wimbledon Park Side, London SW19"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Westbere Road, West Hampstead, London NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " The Avenue, Hatch End, Pinner HA5"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Princes Park Avenue, London NW11"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Canons Drive, Edgware HA8"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Westbere Road, West Hampstead NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Haymills Estate, Ealing, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Dene Terrace Woodclyffe Drive, Chislehurst, Kent BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  terraced house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Dene Terrace Woodclyffe Drive, Chislehurst, Kent BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Northwick Close, St John's Wood NW8"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Claremont Gardens, Surbiton KT6"},
            "description" :{ "type":"string", "value" : "13 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Dene Terrace Woodclyffe Drive, Chislehurst, Kent BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  end terrace house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stamford Road, London N1"},
            "description" :{ "type":"string", "value" : "4 bedroom  terraced house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stanhope Avenue, London N3"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Haymills Estate, Ealing, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Elms Crescent, London SW4"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Princes Park Avenue, London NW11"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Abbeville Road, London SW4"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Canons Drive, Edgware HA8"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Henson Avenue, Willesdon Green NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Woodstock Road, London NW11"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Tamworth Street, London SW6"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stanhope Avenue, Finchley, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " The Old Burlington, Church Street, London W4"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ebury Close, Northwood HA6"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Middleton Road, London NW11"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Henson Avenue, Willesden Green NW2"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Huron Road, London SW17"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Corringway, Ealing W5"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Woodlands Avenue, New Malden KT3"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Gunnersbury Park Area, Ealing, London"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Blenheim Gardens, London, Brent NW2"},
            "description" :{ "type":"string", "value" : "6 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Creighton Road, London NW6"},
            "description" :{ "type":"string", "value" : "4 bedroom  terraced house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Plaistow Lane, Bromley BR1"},
            "description" :{ "type":"string", "value" : "7 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Greenfield Gardens, London NW2"},
            "description" :{ "type":"string", "value" : "4 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Hendon Avenue, London N3"},
            "description" :{ "type":"string", "value" : "3 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Peckham Park Road, London SE15"},
            "description" :{ "type":"string", "value" : "6 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Woodclyffe Drive, Chislehurst BR7"},
            "description" :{ "type":"string", "value" : "5 bedroom  house for sale"},
            "listprice" : { "type":"string", "value" : "

                From 1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Highwood Hill, Mill Hill, London"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Stanhope Avenue, London N3"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Kersley Mews, London SW11"},
            "description" :{ "type":"string", "value" : "3 bedroom  mews for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ebury Close, Northwood HA6"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ellesmere Road, Chiswick W4"},
            "description" :{ "type":"string", "value" : "6 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " The Avenue, Hatch End, Pinner, Middlesex"},
            "description" :{ "type":"string", "value" : "5 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Wandsworth, London SW18"},
            "description" :{ "type":"string", "value" : "6 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Carlton Road, New Malden KT3"},
            "description" :{ "type":"string", "value" : "4 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " St Mary's Mews, Ealing W5"},
            "description" :{ "type":"string", "value" : "3 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Ritherdon Road, Balham, London SW17"},
            "description" :{ "type":"string", "value" : "5 bedroom  semi detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Goldsmith Avenue, London W3"},
            "description" :{ "type":"string", "value" : "5 bedroom  property for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ,
        {
            "address" : { "type":"string", "value" : " Plaistow Lane, Bromley, Kent BR1"},
            "description" :{ "type":"string", "value" : "7 bedroom  detached house for sale"},
            "listprice" : { "type":"string", "value" : "

                1,250,000


                    "}
        }
    ] } }

Answer 0

From the command-line:

echo '{"one":1,"two":2}' | python -mjson.tool

which outputs:

{
    "one": 1, 
    "two": 2
}

Programmatically, the Python manual describes pretty-printing JSON:

>>> import json
>>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
{
    "4": 5,
    "6": 7
}
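
The same idea can be wrapped in a small helper that takes JSON text rather than a Python dict. This is only a minimal sketch for Python 3; the helper name pretty is an assumption made for illustration and is not part of the json module.

```python
import json

def pretty(text):
    """Parse a JSON string and re-serialize it with sorted keys and 4-space indentation."""
    return json.dumps(json.loads(text), sort_keys=True, indent=4)

print(pretty('{"two": 2, "one": 1}'))
```

Round-tripping through json.loads also validates the input: malformed JSON raises an exception instead of being printed as-is.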

Answer 1

Use the indent argument of the dumps function in the json module.

From the docs:

>>> import json
>>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
{
    "4": 5,
    "6": 7
}

Answer 2

A minimal in-python solution that colors json data supplied via the command line:

import sys
import json
from pygments import highlight, lexers, formatters

formatted_json = json.dumps(json.loads(sys.argv[1]), indent=4)
colorful_json = highlight(formatted_json, lexers.JsonLexer(), formatters.TerminalFormatter())
print(colorful_json)

Inspired by pjson mentioned above. This code needs pygments to be installed.


Answer 3

Try underscore-cli:

cat myfile.json | underscore print --color

It’s a pretty nifty tool that can elegantly do a lot of manipulation of structured data, execute js snippets, fill templates, etc. It’s ridiculously well documented, polished, and ready for serious use. And I wrote it. :)


Answer 4

The cli command I’ve used with python for this is:

cat myfile.json | python -mjson.tool

You should be able to find more info here:

http://docs.python.org/library/json.html


Answer 5

It looks like jsbeautifier open sourced their tools and packaged them as Python and JS libs, and as CLI tools. It doesn’t look like they call out to a web service, but I didn’t check too closely. See the github repo with install instructions.


From their docs for Python CLI and library usage:

To beautify using python:

$ pip install jsbeautifier
$ js-beautify file.js

Beautified output goes to stdout.

To use jsbeautifier as a library is simple:

import jsbeautifier
res = jsbeautifier.beautify('your javascript string')
res = jsbeautifier.beautify_file('some_file.js')

…or, to specify some options:

opts = jsbeautifier.default_options()
opts.indent_size = 2
res = jsbeautifier.beautify('some javascript', opts)

If you want to pass a string instead of a filename, and you are using bash, then you can use process substitution like so:

$ js-beautify <(echo '{"some": "json"}')

Answer 6

I didn’t like the output of json.dumps(…): for my taste it has far too many newlines. I also didn’t want to use a command-line tool or install anything. I finally found Python’s pprint (pretty print). Unfortunately it doesn’t generate proper JSON, but I think it is useful for getting a user-friendly glimpse of the stored data.

Output of json.dumps(json_dict, indent=4):

{
    "hyperspace": {
        "constraints": [],
        "design": [
            [
                "windFarm.windparkSize.k",
                "continuous",
                [
                    0,
                    0,
                    5
                ]
            ],
            [
                "hydroPlant.primaryControlMax",
                "continuous",
                [
                    100,
                    300
                ]
            ]
        ],
        "kpis": [
            "frequency.y",
            "city.load.p[2]"
        ]
    },
    "lhc_size": 10,
    "number_of_runs": 10
}

Usage of pprint:

import pprint

json_dict = {"hyperspace": {"constraints": [], "design": [["windFarm.windparkSize.k", "continuous", [0, 0, 5]], ["hydroPlant.primaryControlMax", "continuous", [100, 300]]], "kpis": ["frequency.y", "city.load.p[2]"]}, "lhc_size": 10, "number_of_runs": 10}

formatted_json_str = pprint.pformat(json_dict)
print(formatted_json_str)
pprint.pprint(json_dict)

Result of pprint.pformat(...) or pprint.pprint(...):

{'hyperspace': {'constraints': [],
                'design': [['windFarm.windparkSize.k', 'continuous', [0, 0, 5]],
                           ['hydroPlant.primaryControlMax',
                            'continuous',
                            [100, 300]]],
                'kpis': ['frequency.y', 'city.load.p[2]']},
 'lhc_size': 10,
 'number_of_runs': 10}
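
To make the "doesn't generate proper JSON" caveat concrete, here is a minimal sketch that checks whether each form can be parsed back by json.loads. The helper is_valid_json and the sample data are assumptions chosen for illustration.

```python
import json
import pprint

data = {"kpis": ["frequency.y", "city.load.p[2]"], "lhc_size": 10}

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

# pprint emits Python reprs (single-quoted strings), which JSON does not allow.
print(is_valid_json(pprint.pformat(data)))
print(is_valid_json(json.dumps(data, indent=4)))
```

So pprint is fine for a quick look at the data, but anything that has to be read back as JSON should still go through json.dumps.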

Answer 7

alias jsonp='pbpaste | python -m json.tool'

This will pretty-print JSON that’s on the clipboard in OS X (macOS). Just copy it, then call the alias from a Bash prompt.


Answer 8

You could pipe the output to jq. If your Python script contains something like

print(json.dumps(data))

then you can run:

python foo.py | jq '.'

Answer 9

Use the python tool library

Command line: python -mjson.tool

In code: http://docs.python.org/library/json.html


Answer 10

First install pygments

then

echo '<some json>' | python -m json.tool | pygmentize -l json


Answer 11

Your data is poorly formed. The value fields in particular have numerous spaces and new lines. Automated formatters won’t work on this, as they will not modify the actual data. As you generate the data for output, filter it as needed to avoid the spaces.
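
One way to do that filtering, sketched under the assumption that stripping leading and trailing whitespace from every string value is acceptable for your data. The helper name strip_strings is hypothetical. Note that literal newlines inside string values, as in the question's data, are rejected by json.loads unless strict=False is passed.

```python
import json

def strip_strings(obj):
    """Recursively strip leading/trailing whitespace from every string value."""
    if isinstance(obj, dict):
        return {k: strip_strings(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [strip_strings(x) for x in obj]
    if isinstance(obj, str):
        return obj.strip()
    return obj

# The value contains real newline characters, like the data in the question.
raw = '{"listprice": {"type": "string", "value": "\n\n    1,250,000\n\n"}}'

# strict=False lets json.loads accept control characters inside strings.
cleaned = strip_strings(json.loads(raw, strict=False))
print(json.dumps(cleaned, indent=4))
```

After cleaning, any of the formatters mentioned in the other answers produces compact, readable output.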


Answer 12

With jsonlint (like xmllint):

aptitude install python-demjson
jsonlint -f foo.json

How to compare two JSON objects with the same elements in a different order as equal?

Question: How to compare two JSON objects with the same elements in a different order as equal?

How can I test whether two JSON objects are equal in python, disregarding the order of lists?

For example …

JSON document a:

{
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": false
}

JSON document b:

{
    "success": false,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}

a and b should compare equal, even though the order of the "errors" lists is different.


Answer 0

If you want two objects with the same elements but in a different order to compare equal, then the obvious thing to do is compare sorted copies of them – for instance, for the dictionaries represented by your JSON strings a and b:

import json

a = json.loads("""
{
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": false
}
""")

b = json.loads("""
{
    "success": false,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}
""")
>>> sorted(a.items()) == sorted(b.items())
False

… but that doesn’t work, because in each case, the "errors" item of the top-level dict is a list with the same elements in a different order, and sorted() doesn’t try to sort anything except the “top” level of an iterable.

To fix that, we can define an ordered function which will recursively sort any lists it finds (and convert dictionaries to lists of (key, value) pairs so that they’re orderable):

def ordered(obj):
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    else:
        return obj

If we apply this function to a and b, the results compare equal:

>>> ordered(a) == ordered(b)
True
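
On Python 3, sorted() raises TypeError when a list mixes values that cannot be ordered against each other (for example a number, a string and None). One possible workaround, sketched here as an assumption rather than as part of the answer above, is to sort each list by the canonical JSON text of its elements. The function name canonical is hypothetical.

```python
import json

def canonical(obj):
    """Return a copy of obj in which every list is sorted by its elements' JSON text."""
    if isinstance(obj, dict):
        return {k: canonical(v) for k, v in obj.items()}
    if isinstance(obj, list):
        items = [canonical(x) for x in obj]
        # json.dumps gives every JSON value a string, so mixed types can be ordered.
        return sorted(items, key=lambda x: json.dumps(x, sort_keys=True))
    return obj

a = {"errors": [{"error": "invalid", "field": "email"}, 1, "x", None], "success": False}
b = {"success": False, "errors": [None, "x", {"field": "email", "error": "invalid"}, 1]}
print(canonical(a) == canonical(b))
```

Dictionaries are left as dictionaries here because dict equality already ignores key order.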

Answer 1

Another way could be to use the json.dumps(X, sort_keys=True) option:

import json
a, b = json.dumps(a, sort_keys=True), json.dumps(b, sort_keys=True)
a == b # a normal string comparison

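This works for nested dictionaries and lists, but note that sort_keys only sorts dictionary keys: lists are still compared in their original order.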
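
A small sketch of what sort_keys=True does and does not normalize. The sample values x, y, z and the helper dump are assumptions for illustration.

```python
import json

x = {"success": False, "errors": ["name", "email"]}
y = {"errors": ["name", "email"], "success": False}  # same data, different key order
z = {"errors": ["email", "name"], "success": False}  # same data, different list order

def dump(obj):
    """Serialize with sorted dictionary keys so key order no longer matters."""
    return json.dumps(obj, sort_keys=True)

print(dump(x) == dump(y))  # key order is normalized
print(dump(x) == dump(z))  # list order is not
```

If list order should be ignored as well, the lists have to be sorted (or otherwise normalized) before serializing.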


Answer 2

Decode them and compare them, as mgilson commented.

Order does not matter for dictionaries as long as the keys and values match. (Dictionary equality ignores key order in Python.)

>>> {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
True

But order matters in lists; sorting solves the problem for the lists.

>>> [1, 2] == [2, 1]
False
>>> [1, 2] == sorted([2, 1])
True

>>> import json
>>> a = '{"errors": [{"error": "invalid", "field": "email"}, {"error": "required", "field": "name"}], "success": false}'
>>> b = '{"errors": [{"error": "required", "field": "name"}, {"error": "invalid", "field": "email"}], "success": false}'
>>> a, b = json.loads(a), json.loads(b)
>>> a['errors'].sort(key=lambda e: sorted(e.items()))  # dicts are not orderable in Python 3
>>> b['errors'].sort(key=lambda e: sorted(e.items()))
>>> a == b
True

The above example works for the JSON in the question. For a general solution, see Zero Piraeus’s answer.


Answer 3

For the following two dicts ‘dictWithListsInValue’ and ‘reorderedDictWithReorderedListsInValue’ which are simply reordered versions of each other

dictObj = {"foo": "bar", "john": "doe"}
reorderedDictObj = {"john": "doe", "foo": "bar"}
dictObj2 = {"abc": "def"}
dictWithListsInValue = {'A': [{'X': [dictObj2, dictObj]}, {'Y': 2}], 'B': dictObj2}
reorderedDictWithReorderedListsInValue = {'B': dictObj2, 'A': [{'Y': 2}, {'X': [reorderedDictObj, dictObj2]}]}
a = {"L": "M", "N": dictWithListsInValue}
b = {"L": "M", "N": reorderedDictWithReorderedListsInValue}

print(sorted(a.items()) == sorted(b.items()))  # gives false

gave me the wrong result, i.e. False.

So I created my own custom object comparator like this:

def my_list_cmp(list1, list2):
    if (list1.__len__() != list2.__len__()):
        return False

    for l in list1:
        found = False
        for m in list2:
            res = my_obj_cmp(l, m)
            if (res):
                found = True
                break

        if (not found):
            return False

    return True


def my_obj_cmp(obj1, obj2):
    if isinstance(obj1, list):
        if (not isinstance(obj2, list)):
            return False
        return my_list_cmp(obj1, obj2)
    elif (isinstance(obj1, dict)):
        if (not isinstance(obj2, dict)):
            return False
        exp = set(obj2.keys()) == set(obj1.keys())
        if (not exp):
            # print(obj1.keys(), obj2.keys())
            return False
        for k in obj1.keys():
            val1 = obj1.get(k)
            val2 = obj2.get(k)
            if isinstance(val1, list):
                if (not my_list_cmp(val1, val2)):
                    return False
            elif isinstance(val1, dict):
                if (not my_obj_cmp(val1, val2)):
                    return False
            else:
                if val2 != val1:
                    return False
    else:
        return obj1 == obj2

    return True


dictObj = {"foo": "bar", "john": "doe"}
reorderedDictObj = {"john": "doe", "foo": "bar"}
dictObj2 = {"abc": "def"}
dictWithListsInValue = {'A': [{'X': [dictObj2, dictObj]}, {'Y': 2}], 'B': dictObj2}
reorderedDictWithReorderedListsInValue = {'B': dictObj2, 'A': [{'Y': 2}, {'X': [reorderedDictObj, dictObj2]}]}
a = {"L": "M", "N": dictWithListsInValue}
b = {"L": "M", "N": reorderedDictWithReorderedListsInValue}

print(my_obj_cmp(a, b))  # gives true

which gave me the correct, expected output!

The logic is pretty simple:

If the objects are of type ‘list’, compare each item of the first list with the items of the second list until a match is found. If no match is found after going through the whole second list, ‘found’ stays False and the comparison returns False.

Else, if the objects to be compared are of type ‘dict’, check that both have the same keys and compare the values for each key (the comparison is performed recursively).

Otherwise, simply evaluate obj1 == obj2. By default this works fine for strings and numbers, and for any objects that define __eq__() appropriately.

(Note that the algorithm can be improved further by removing each matched item from the second list, so that the next item of the first list is not compared against items that have already been matched.)
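
The improvement mentioned in the note could look roughly like the sketch below. It is a separate, compact rewrite rather than the code from this answer, and the name unordered_equal is hypothetical. Removing each matched item also makes duplicates count: [1, 1, 2] and [1, 2, 2] no longer compare equal.

```python
def unordered_equal(a, b):
    """Compare JSON-like values, ignoring list order but respecting duplicates."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(unordered_equal(a[k], b[k]) for k in a)
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return False
        remaining = list(b)  # copy, so the caller's list is left untouched
        for item in a:
            for i, candidate in enumerate(remaining):
                if unordered_equal(item, candidate):
                    del remaining[i]  # each element of b can be matched only once
                    break
            else:
                return False  # no unmatched partner left for this item
        return True
    return a == b

print(unordered_equal({"A": [{"Y": 2}, {"X": [1, 2]}]}, {"A": [{"X": [2, 1]}, {"Y": 2}]}))
```

The nested loops are still quadratic in the list length, which is usually acceptable for small JSON documents.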


Answer 4

You can write your own equals function:

  • dicts are equal if: 1) all keys are equal, 2) all values are equal
  • lists are equal if: all items are equal and in the same order
  • primitives are equal if a == b

Because you’re dealing with JSON, you’ll have standard Python types: dict, list, etc., so you can do strict type checking with if type(obj) == dict:, etc.

Rough example (not tested):

def json_equals(jsonA, jsonB):
    if type(jsonA) != type(jsonB):
        # different types are never equal
        return False
    if type(jsonA) == dict:
        if len(jsonA) != len(jsonB):
            return False
        for keyA in jsonA:
            if keyA not in jsonB or not json_equals(jsonA[keyA], jsonB[keyA]):
                return False
        return True
    elif type(jsonA) == list:
        if len(jsonA) != len(jsonB):
            return False
        for itemA, itemB in zip(jsonA, jsonB):
            if not json_equals(itemA, itemB):
                return False
        return True
    else:
        return jsonA == jsonB

Answer 5

For others who’d like to debug two JSON objects (usually there is a reference and a target), here is a solution you may use. It lists the “paths” of the values that differ in, or are missing from, the target relative to the reference.

The level option selects how deep you would like to look.

The show_variables option can be turned on to show the relevant values.

def compareJson(example_json, target_json, level=-1, show_variables=False):
  _different_variables = _parseJSON(example_json, target_json, level=level, show_variables=show_variables)
  return len(_different_variables) == 0, _different_variables

def _parseJSON(reference, target, path=[], level=-1, show_variables=False):  
  if level > 0 and len(path) == level:
    return []
  
  _different_variables = list()
  # the case that the inputs is a dict (i.e. json dict)  
  if isinstance(reference, dict):
    for _key in reference:      
      _path = path+[_key]
      try:
        _different_variables += _parseJSON(reference[_key], target[_key], _path, level, show_variables)
      except KeyError:
        _record = ''.join(['[%s]'%str(p) for p in _path])
        if show_variables:
          _record += ': %s <--> MISSING!!'%str(reference[_key])
        _different_variables.append(_record)
  # the case that the inputs is a list/tuple
  elif isinstance(reference, list) or isinstance(reference, tuple):
    for index, v in enumerate(reference):
      _path = path+[index]
      try:
        _target_v = target[index]
        _different_variables += _parseJSON(v, _target_v, _path, level, show_variables)
      except IndexError:
        _record = ''.join(['[%s]'%str(p) for p in _path])
        if show_variables:
          _record += ': %s <--> MISSING!!'%str(v)
        _different_variables.append(_record)
  # the actual comparison about the value, if they are not the same, record it
  elif reference != target:
    _record = ''.join(['[%s]'%str(p) for p in path])
    if show_variables:
      _record += ': %s <--> %s'%(str(reference), str(target))
    _different_variables.append(_record)

  return _different_variables

Where's my JSON data in my incoming Django request?

Question: Where's my JSON data in my incoming Django request?

I’m trying to process incoming JSON/Ajax requests with Django/Python.

request.is_ajax() is True on the request, but I have no idea where the payload is with the JSON data.

request.POST.dir contains this:

['__class__', '__cmp__', '__contains__', '__copy__', '__deepcopy__', '__delattr__',
 '__delitem__', '__dict__', '__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__',
 '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', 
'__setattr__', '__setitem__', '__str__', '__weakref__', '_assert_mutable', '_encoding', 
'_get_encoding', '_mutable', '_set_encoding', 'appendlist', 'clear', 'copy', 'encoding', 
'fromkeys', 'get', 'getlist', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 
'keys', 'lists', 'pop', 'popitem', 'setdefault', 'setlist', 'setlistdefault', 'update', 
'urlencode', 'values']

There are apparently no keys in the request post keys.

When I look at the POST in Firebug, there is JSON data being sent up in the request.


Answer 0

If you are posting JSON to Django, I think you want request.body (request.raw_post_data on Django < 1.4). This will give you the raw JSON data sent via the post. From there you can process it further.

Here is an example using JavaScript, jQuery, jquery-json and Django.

JavaScript:

var myEvent = {id: calEvent.id, start: calEvent.start, end: calEvent.end,
               allDay: calEvent.allDay };
$.ajax({
    url: '/event/save-json/',
    type: 'POST',
    contentType: 'application/json; charset=utf-8',
    data: $.toJSON(myEvent),
    dataType: 'text',
    success: function(result) {
        alert(result.Result);
    }
});

Django:

from django.http import HttpResponse

def save_events_json(request):
    if request.is_ajax():
        if request.method == 'POST':
            print('Raw Data: "%s"' % request.body)
    return HttpResponse("OK")

Django < 1.4:

def save_events_json(request):
    if request.is_ajax():
        if request.method == 'POST':
            print('Raw Data: "%s"' % request.raw_post_data)
    return HttpResponse("OK")
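
To "process it further", the raw bytes still have to be decoded and parsed. The sketch below shows that step in isolation, without depending on Django. The helper name parse_json_body is an assumption, and in a view it would be called with request.body.

```python
import json

def parse_json_body(body, encoding="utf-8"):
    """Decode raw request bytes and parse them as JSON; return None if invalid."""
    try:
        return json.loads(body.decode(encoding))
    except ValueError:  # covers both UnicodeDecodeError and json.JSONDecodeError
        return None

payload = parse_json_body(b'{"id": 7, "allDay": false}')
print(payload)
```

Returning None for bad input keeps the view code simple: it can answer with an error response whenever the helper yields None.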

Answer 1

I had the same problem. I had been posting a complex JSON response, and I couldn’t read my data using the request.POST dictionary.

My JSON POST data was:

//JavaScript code:
//Requires json2.js and jQuery.
var response = {data:[{"a":1, "b":2},{"a":2, "b":2}]}
json_response = JSON.stringify(response); // proper serialization method, read 
                                          // http://ejohn.org/blog/ecmascript-5-strict-mode-json-and-more/
$.post('url',json_response);

In this case you need to use the method provided by aurealus: read request.body and deserialize it with the json module from the standard library.

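# Django code:
import json

from django.http import HttpResponse, HttpResponseServerError

def save_data(request):
  if request.method == 'POST':
    json_data = json.loads(request.body) # request.raw_post_data w/ Django < 1.4
    try:
      data = json_data['data']
    except KeyError:
      # the response must be returned, otherwise it is silently discarded
      return HttpResponseServerError("Malformed data!")
    return HttpResponse("Got json data")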

Answer 2

Method 1

Client : Send as JSON

$.ajax({
    url: 'example.com/ajax/',
    type: 'POST',
    contentType: 'application/json; charset=utf-8',
    processData: false,
    data: JSON.stringify({'name':'John', 'age': 42}),
    ...
});

//Sent as a JSON object {'name':'John', 'age': 42}

Server :

data = json.loads(request.body) # {'name':'John', 'age': 42}

Method 2

Client : Send as x-www-form-urlencoded
(Note: contentType & processData have changed, JSON.stringify is not needed)

$.ajax({
    url: 'example.com/ajax/',
    type: 'POST',    
    data: {'name':'John', 'age': 42},
    contentType: 'application/x-www-form-urlencoded; charset=utf-8',  //Default
    processData: true,       
});

//Sent as a query string name=John&age=42

Server :

data = request.POST # will be <QueryDict: {u'name': [u'John'], u'age': [u'42']}> (form values arrive as strings)

Changed in 1.5+ : https://docs.djangoproject.com/en/dev/releases/1.5/#non-form-data-in-http-requests

Non-form data in HTTP requests :
request.POST will no longer include data posted via HTTP requests with non form-specific content-types in the header. In prior versions, data posted with content-types other than multipart/form-data or application/x-www-form-urlencoded would still end up represented in the request.POST attribute. Developers wishing to access the raw POST data for these cases, should use the request.body attribute instead.

Probably related
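
The two methods differ only in how the body is encoded, so the server side can choose a parser from the Content-Type header. Here is a minimal sketch using only the standard library; `parse_body` is a hypothetical helper (not a Django API), and it assumes the body is UTF-8:

```python
import json
from urllib.parse import parse_qs


def parse_body(content_type, raw_body):
    """Decode a POST body according to its Content-Type (hypothetical helper)."""
    text = raw_body.decode("utf-8")
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(text)
    if media_type == "application/x-www-form-urlencoded":
        # parse_qs returns a list per key; keep the first value of each.
        return {key: values[0] for key, values in parse_qs(text).items()}
    raise ValueError("unsupported content type: %s" % media_type)


print(parse_body("application/json; charset=utf-8", b'{"name": "John", "age": 42}'))
print(parse_body("application/x-www-form-urlencoded", b"name=John&age=42"))
```

Note that the form-encoded variant yields the string "42", while the JSON variant keeps the integer 42.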


Answer 3

It's important to remember that Python 3 distinguishes text from bytes: request.body is a bytes object, not a str.

Using Django 1.9 and Python 2.7 and sending the JSON data in the main body (not a header) you would use something like:

mydata = json.loads(request.body)

But for Django 1.9 and Python 3.4 you have to decode the bytes first (json.loads only accepts bytes directly from Python 3.6 onward):

mydata = json.loads(request.body.decode("utf-8"))

I just went through this learning curve making my first Py3 Django app!
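
If the same code has to cope with both bytes and text, a small wrapper can hide the difference. This is only a sketch: `load_json_body` is a hypothetical name, and UTF-8 is an assumption about the client.

```python
import json


def load_json_body(raw_body, encoding="utf-8"):
    """Parse a JSON request body given as bytes or str (hypothetical helper)."""
    if isinstance(raw_body, (bytes, bytearray)):
        # Decode explicitly so this also works before Python 3.6.
        raw_body = raw_body.decode(encoding)
    return json.loads(raw_body)


print(load_json_body(b'{"name": "John", "age": 42}'))
```

In a view you would call it as load_json_body(request.body).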


Answer 4

request.raw_post_data is now deprecated. Use request.body instead to process non-conventional form data such as XML payloads, binary images, etc.

Django documentation on the issue.


Answer 5

On Django 1.6 with Python 3.3:

client

$.ajax({
    url: '/urll/',
    type: 'POST',
    contentType: 'application/json; charset=utf-8',
    data: JSON.stringify(json_object),
    dataType: 'json',
    success: function(result) {
        alert(result.Result);
    }
});

server

def urll(request):
    if request.is_ajax():
        if request.method == 'POST':
            print('Raw Data:', request.body)
            print('type(request.body):', type(request.body))  # this type is bytes
            print(json.loads(request.body.decode("utf-8")))

Answer 6

The HTTP POST payload is just a flat bunch of bytes. Django (like most frameworks) decodes it into a dictionary from either URL encoded parameters, or MIME-multipart encoding. If you just dump the JSON data in the POST content, Django won’t decode it. Either do the JSON decoding from the full POST content (not the dictionary); or put the JSON data into a MIME-multipart wrapper.

In short, show the JavaScript code. The problem seems to be there.


Answer 7

request.raw_post_data has been deprecated. Use request.body instead


Answer 8

Something like this worked for me. Request the data from the client:

registerData = {
{% for field in userFields%}
  {{ field.name }}: {{ field.name }},
{% endfor %}
}


var request = $.ajax({
   url: "{% url 'MainApp:rq-create-account-json' %}",
   method: "POST",
   async: false,
   contentType: "application/json; charset=utf-8",
   data: JSON.stringify(registerData),
   dataType: "json"
});

request.done(function (msg) {
   [alert(msg);]
   alert(msg.name);
});

request.fail(function (jqXHR, status) {
  alert(status);
});

Process the request on the server:

@csrf_exempt
def rq_create_account_json(request):
   if request.is_ajax():
       if request.method == 'POST':
           json_data = json.loads(request.body)
           print(json_data)
           return JsonResponse(json_data)
   return HttpResponse("Error")

Answer 9

html code 

file name  : view.html


    <!DOCTYPE html>
    <html>
    <head>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
    <script>
    $(document).ready(function(){
        $("#mySelect").change(function(){
            selected = $("#mySelect option:selected").text()
            $.ajax({
                type: 'POST',
                dataType: 'json',
                contentType: 'application/json; charset=utf-8',
                url: '/view/',
                data: JSON.stringify({'fruit': selected}),
                success: function(result) {
                        document.write(result)
                        }
        });
      });
    });
    </script>
    </head>
    <body>

    <form>
        <br>
    Select your favorite fruit:
    <select id="mySelect">
      <option value="apple" selected >Select fruit</option>
      <option value="apple">Apple</option>
      <option value="orange">Orange</option>
      <option value="pineapple">Pineapple</option>
      <option value="banana">Banana</option>
    </select>
    </form>
    </body>
    </html>

Django code:


Inside views.py


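import json
from django.http import HttpResponse

def view(request):
    if request.method == 'POST':
        # request.body is bytes on Python 3; decode it before printing or dumping
        data = request.body.decode('utf-8')
        print(data)
        return HttpResponse(json.dumps(data))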

Answer 10

Using Angular, you should add a header to the request or add it to the module config headers: {'Content-Type': 'application/x-www-form-urlencoded'}

$http({
    url: url,
    method: method,
    timeout: timeout,
    data: data,
    headers: {'Content-Type': 'application/x-www-form-urlencoded'}
})

Answer 11

request.POST is just a dictionary-like object, so just index into it with dict syntax.

Assuming your form field is fred, you could do something like this:

if 'fred' in request.POST:
    mydata = request.POST['fred']

Alternately, use a form object to deal with the POST data.


Python: json.loads returns items prefixed with 'u'

Question: Python: json.loads returns items prefixed with 'u'

I’ll be receiving a JSON-encoded string from Obj-C, and I am decoding a dummy string (for now) as in the code below. My output comes out with the character ‘u’ prefixing each item:

[{u'i': u'imap.gmail.com', u'p': u'aaaa'}, {u'i': u'333imap.com', u'p': u'bbbb'}...

How is JSON adding this unicode char? What’s the best way to remove it?

mail_accounts = []
da = {}
try:
    s = '[{"i":"imap.gmail.com","p":"aaaa"},{"i":"imap.aol.com","p":"bbbb"},{"i":"333imap.com","p":"ccccc"},{"i":"444ap.gmail.com","p":"ddddd"},{"i":"555imap.gmail.com","p":"eee"}]'
    jdata = json.loads(s)
    for d in jdata:
        for key, value in d.iteritems():
            if key not in da:
                da[key] = value
            else:
                da = {}
                da[key] = value
        mail_accounts.append(da)
except Exception, err:
    sys.stderr.write('Exception Error: %s' % str(err))

print mail_accounts

Answer 0

The u- prefix just means that you have a Unicode string. When you really use the string, it won’t appear in your data. Don’t be thrown by the printed output.

For example, try this:

print mail_accounts[0]["i"]

You won’t see a u.


Answer 1

Everything is cool, man. The ‘u’ is a good thing, it indicates that the string is of type Unicode in python 2.x.

http://docs.python.org/2/howto/unicode.html#the-unicode-type


Answer 2

The d3 print below is the one you are looking for (which is the combination of dumps and loads) :)

Having:

import json

d = """{"Aa": 1, "BB": "blabla", "cc": "False"}"""

d1 = json.loads(d)              # Produces a dictionary out of the given string
d2 = json.dumps(d)              # Produces a string out of a given dict or string
d3 = json.dumps(json.loads(d))  # 'dumps' gets the dict from 'loads' this time

print "d1:  " + str(d1)
print "d2:  " + d2
print "d3:  " + d3

Prints:

d1:  {u'Aa': 1, u'cc': u'False', u'BB': u'blabla'}
d2:  "{\"Aa\": 1, \"BB\": \"blabla\", \"cc\": \"False\"}"
d3:  {"Aa": 1, "cc": "False", "BB": "blabla"}
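
For comparison, here is the same round trip as a Python 3 sketch. There is no u prefix on Python 3, but the distinction between the parsed dict and its JSON text is the same. The variable names are my own:

```python
import json

d = '{"Aa": 1, "BB": "blabla", "cc": "False"}'

parsed = json.loads(d)        # a dict; printing it shows Python's repr, not JSON
as_json = json.dumps(parsed)  # JSON text again, always with double quotes

print(type(parsed).__name__)
print(as_json)
```

Printing parsed shows Python's own notation (single quotes), while as_json is valid JSON that another program can consume.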

Answer 3

The u prefix means that those strings are unicode rather than 8-bit strings. The best way to not show the u prefix is to switch to Python 3, where strings are unicode by default. If that’s not an option, the str constructor will convert from unicode to 8-bit, so simply loop recursively over the result and convert unicode to str. However, it is probably best just to leave the strings as unicode.
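
The recursive loop mentioned above could look roughly like this. It is a sketch written for Python 3, where every string is already Unicode; on Python 2 the check would be against unicode and the converter would be something like lambda s: s.encode("utf-8"). The name `map_strings` is made up for this example.

```python
def map_strings(value, convert):
    """Return a copy of a decoded JSON value with convert() applied to every string."""
    if isinstance(value, str):
        return convert(value)
    if isinstance(value, list):
        return [map_strings(item, convert) for item in value]
    if isinstance(value, dict):
        return {map_strings(k, convert): map_strings(v, convert)
                for k, v in value.items()}
    return value  # numbers, booleans and None pass through unchanged


data = [{"i": "imap.gmail.com", "p": "aaaa"}, {"n": 3, "ok": True}]
print(map_strings(data, str.upper))
```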


Answer 4

Unicode is an appropriate type here. The JSONDecoder docs describe the conversion table and state that json string objects are decoded into Unicode objects

https://docs.python.org/2/library/json.html#encoders-and-decoders

JSON                    Python
==================================
object                  dict
array                   list
string                  unicode
number (int)            int, long
number (real)           float
true                    True
false                   False
null                    None

“encoding determines the encoding used to interpret any str objects decoded by this instance (UTF-8 by default).”


Answer 5

Those ‘u’ characters prefixed to the strings signify that they are Unicode strings.

If you want to remove those ‘u’ chars from your object you can do this:

import json, ast
jdata = ast.literal_eval(json.dumps(jdata)) # Removing uni-code chars

Let’s check it in the Python shell:

>>> import json, ast
>>> jdata = [{u'i': u'imap.gmail.com', u'p': u'aaaa'}, {u'i': u'333imap.com', u'p': u'bbbb'}]
>>> jdata = ast.literal_eval(json.dumps(jdata))
>>> jdata
[{'i': 'imap.gmail.com', 'p': 'aaaa'}, {'i': '333imap.com', 'p': 'bbbb'}]

Answer 6

I kept running into this problem when trying to capture JSON data in the log with the Python logging library, for debugging and troubleshooting purposes. Getting the u character is a real nuisance when you want to copy the text and paste it into your code somewhere.

As everyone will tell you, this is because it is a Unicode representation, and it could come from the fact that you’ve used json.loads() to load in the data from a string in the first place.

If you want the JSON representation in the log, without the u prefix, the trick is to use json.dumps() before logging it out. For example:

import json
import logging

# Prepare the data
json_data = json.loads('{"key": "value"}')

# Log normally and get the Unicode indicator
logging.warning('data: {}'.format(json_data))
>>> WARNING:root:data: {u'key': u'value'}

# Dump to a string before logging and get clean output!
logging.warning('data: {}'.format(json.dumps(json_data)))
>>> WARNING:root:data: {"key": "value"}

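The same idea works unchanged on Python 3. A small sketch follows; passing the string as a logging argument instead of pre-formatting it, and using sort_keys, are my own choices and not part of the answer above:

```python
import json
import logging

logging.basicConfig(level=logging.WARNING)

json_data = json.loads('{"key": "value"}')

# json.dumps yields valid JSON text; sort_keys keeps log lines stable.
message = json.dumps(json_data, sort_keys=True)
logging.warning("data: %s", message)
```

The logged text can be pasted straight into another JSON-consuming tool.
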
Answer 7

Try this:

mail_accounts[0]["i"].encode("ascii")


Answer 8

Just replace the u’ with a single quote…

print(str(mail_accounts).replace("u'", "'"))

“TypeError: (Integer) is not JSON serializable” when serializing JSON in Python?

Question: “TypeError: (Integer) is not JSON serializable” when serializing JSON in Python?

I am trying to send a simple dictionary to a json file from python, but I keep getting the “TypeError: 1425 is not JSON serializable” message.

import json
alerts = {'upper':[1425],'lower':[576],'level':[2],'datetime':['2012-08-08 15:30']}
afile = open('test.json','w')
afile.write(json.dumps(alerts,encoding='UTF-8'))
afile.close()

If I add the default argument, then it writes, but the integer values are written to the json file as strings, which is undesirable.

afile.write(json.dumps(alerts,encoding='UTF-8',default=str))

Answer 0

I found my problem. The issue was that my integers were actually type numpy.int64.


Answer 1

There seems to be an issue with dumping numpy.int64 into a JSON string in Python 3, and the Python team has already discussed it. More details can be found here.

Serhiy Storchaka provided a workaround. It works very well, so I am pasting it here:

import json
import numpy

def convert(o):
    # json.dumps calls this only for objects it cannot serialize itself
    if isinstance(o, numpy.int64): return int(o)  
    raise TypeError

json.dumps({'value': numpy.int64(42)}, default=convert)

Answer 2

This solved the problem for me:

def serialize(self):
    return {
        'my_int': int(self.my_int), 
        'my_float': float(self.my_float)
    }

Answer 3

Just convert numbers from int64 (from numpy) to int.

For example, if the variable x is an int64:

int(x)

If it is an array of int64:

list(map(int, x))  # map() alone returns a lazy iterator on Python 3

Answer 4

As @JAC pointed out in the comments of the highest-rated answer, the generic solution (for all numpy types) can be found in the thread Converting numpy dtypes to native python types.

Nevertheless, I'll add my version of the solution below, as in my case I needed a generic solution that combines these answers with the answers of the other thread. This should work with almost all numpy types.

import json
import numpy as np

def convert(o):
    if isinstance(o, np.generic): return o.item()  
    raise TypeError

json.dumps({'value': np.int64(42)}, default=convert)

Answer 5

This might be a late response, but I recently got the same error. After a lot of searching, this solution helped me.

import datetime
import numpy as np

alerts = {'upper':[1425],'lower':[576],'level':[2],'datetime':['2012-08-08 15:30']}
def myconverter(obj):
    if isinstance(obj, np.integer):
        return int(obj)
    elif isinstance(obj, np.floating):
        return float(obj)
    elif isinstance(obj, np.ndarray):
        return obj.tolist()
    elif isinstance(obj, datetime.datetime):
        return str(obj)
    # without this, unsupported objects would silently be written as null
    raise TypeError("%r is not JSON serializable" % (obj,))

Pass myconverter to json.dumps() as the default argument: json.dumps(alerts, default=myconverter).
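
A default hook should raise TypeError for anything it does not recognise; a hook that just returns None makes json.dumps write null. The sketch below shows that shape without needing NumPy installed: it checks against the abstract classes in the standard numbers module (as far as I know NumPy registers its scalar types with them, but treat that as an assumption) and uses fractions.Fraction as a stand-in for a non-builtin number. The name `to_builtin` is made up.

```python
import json
import numbers
from fractions import Fraction


def to_builtin(obj):
    """Fallback for json.dumps: map number-like objects to int or float."""
    if isinstance(obj, numbers.Integral):
        return int(obj)
    if isinstance(obj, numbers.Real):
        return float(obj)
    # Anything else is a real error and should not be hidden.
    raise TypeError("%r is not JSON serializable" % (obj,))


print(json.dumps({"ratio": Fraction(1, 2)}, default=to_builtin))
```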


Answer 6

Alternatively you can convert your object into a dataframe first:

df = pd.DataFrame(obj)

and then save this dataframe in a json file:

df.to_json(path_or_buf='df.json')

Hope this helps


Answer 7

You have NumPy data types; just convert them to the normal int() or float() types and it will work fine.


Answer 8

Same problem. The list contained numbers of type numpy.int64, which throws a TypeError. A quick workaround for me was:

mylist = eval(str(mylist_of_integers))
json.dumps({'mylist': mylist})

This converts the list to a string with str(), and eval() then evaluates that string as a Python expression, which in my case returns a list of plain integers. Only do this with data you trust, since eval() executes arbitrary code.


Answer 9

from numpyencoder import NumpyEncoder

Use the code below to resolve the issue:

import json
from numpyencoder import NumpyEncoder

alerts = {'upper':[1425],'lower':[576],'level':[2],'datetime':['2012-08-08 15:30']}
# json.dumps() has no encoding argument on Python 3
with open('test.json','w') as afile:
    afile.write(json.dumps(alerts,cls=NumpyEncoder))

How do you serialize a model instance in Django?

Question: How do you serialize a model instance in Django?

There is a lot of documentation on how to serialize a Model QuerySet but how do you just serialize to JSON the fields of a Model Instance?


Answer 0

You can simply wrap the required object in a list; that is all Django's serializers need to serialize it correctly, e.g.:

from django.core import serializers

# assuming obj is a model instance
serialized_obj = serializers.serialize('json', [ obj, ])

Answer 1

If you’re dealing with a list of model instances, the best you can do is use serializers.serialize(); it will fit your needs perfectly.

However, you will run into an issue when trying to serialize a single object rather than a list of objects. To avoid resorting to hacks, just use Django’s model_to_dict (if I’m not mistaken, serializers.serialize() relies on it, too):

from django.forms.models import model_to_dict

# assuming obj is your model instance
dict_obj = model_to_dict( obj )

You now just need one straight json.dumps call to serialize it to json:

import json
serialized = json.dumps(dict_obj)

That’s it! :)


Answer 2

To avoid the array wrapper, remove it before you return the response:

import json
from django.core import serializers
from django.http import HttpResponse

def getObject(request, id):
    obj = MyModel.objects.get(pk=id)
    data = serializers.serialize('json', [obj,])
    struct = json.loads(data)
    data = json.dumps(struct[0])
    # content_type replaced the mimetype argument, which was removed in Django 1.7
    return HttpResponse(data, content_type='application/json')

I found this interesting post on the subject too:

http://timsaylor.com/convert-django-model-instances-to-dictionaries

It uses django.forms.models.model_to_dict, which looks like the perfect tool for the job.


Answer 3

There is a good answer for this and I’m surprised it hasn’t been mentioned. With a few lines you can handle dates, models, and everything else.

Make a custom encoder that can handle models:

from django.forms import model_to_dict
from django.core.serializers.json import DjangoJSONEncoder
from django.db.models import Model

class ExtendedEncoder(DjangoJSONEncoder):

    def default(self, o):

        if isinstance(o, Model):
            return model_to_dict(o)

        return super().default(o)

Now use it when you use json.dumps

json.dumps(data, cls=ExtendedEncoder)

Now models, dates and everything can be serialized and it doesn’t have to be in an array or serialized and unserialized. Anything you have that is custom can just be added to the default method.

You can even use Django’s native JsonResponse this way:

from django.http import JsonResponse

JsonResponse(data, encoder=ExtendedEncoder)
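
To see the mechanism without a Django project, here is a standard-library-only sketch of the same pattern. `PlainRecord` is a hypothetical stand-in for a model instance and `SketchEncoder` plays the role of ExtendedEncoder; neither is a Django class.

```python
import datetime
import json


class PlainRecord:
    """Hypothetical stand-in for a model instance."""

    def __init__(self, **fields):
        self.fields = fields

    def to_dict(self):
        return dict(self.fields)


class SketchEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot encode itself.
        if isinstance(o, PlainRecord):
            return o.to_dict()
        if isinstance(o, datetime.date):  # also covers datetime.datetime
            return o.isoformat()
        return super().default(o)  # raises TypeError for unknown types


record = PlainRecord(question="What is up?", pub_date=datetime.date(2013, 6, 27))
print(json.dumps({"poll": record}, cls=SketchEncoder))
```

The dict returned for the record is encoded recursively, so the date inside it goes through default() as well.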

Answer 4

It sounds like what you’re asking about involves serializing the data structure of a Django model instance for interoperability. The other posters are correct: if you wanted the serialized form to be used with a python application that can query the database via Django’s api, then you would want to serialize a queryset with one object. If, on the other hand, what you need is a way to re-inflate the model instance somewhere else without touching the database or without using Django, then you have a little bit of work to do.

Here’s what I do:

First, I use demjson for the conversion. It happened to be what I found first, but it might not be the best. My implementation depends on one of its features, but there should be similar ways with other converters.

Second, implement a json_equivalent method on all models that you might need serialized. This is a magic method for demjson, but it’s probably something you’re going to want to think about no matter what implementation you choose. The idea is that you return an object that is directly convertible to json (i.e. an array or dictionary). If you really want to do this automatically:

def json_equivalent(self):
    dictionary = {}
    # _meta.get_all_field_names() was removed in Django 1.10; _meta.fields lists the concrete fields
    for field in self._meta.fields:
        dictionary[field.name] = getattr(self, field.name)
    return dictionary

This will not be helpful to you unless you have a completely flat data structure (no ForeignKeys, only numbers and strings in the database, etc.). Otherwise, you should seriously think about the right way to implement this method.

Third, call demjson.JSON.encode(instance) and you have what you want.


Answer 5

If you’re asking how to serialize a single object from a model and you know you’re only going to get one object in the queryset (for instance, using objects.get), then use something like:

import django.core.serializers
import django.http
import models

def jsonExample(request,poll_id):
    s = django.core.serializers.serialize('json',[models.Poll.objects.get(id=poll_id)])
    # s is a string with [] around it, so strip them off
    o=s.strip("[]")
    return django.http.HttpResponse(o, content_type="application/json")  # mimetype= on Django < 1.7

which would get you something of the form:

{"pk": 1, "model": "polls.poll", "fields": {"pub_date": "2013-06-27T02:29:38.284Z", "question": "What's up?"}}

Answer 6

I solved this problem by adding a serialization method to my model:

def toJSON(self):
    import simplejson
    return simplejson.dumps(dict([(attr, getattr(self, attr)) for attr in [f.name for f in self._meta.fields]]))

Here’s the verbose equivalent for those averse to one-liners:

def toJSON(self):
    fields = []
    for field in self._meta.fields:
        fields.append(field.name)

    d = {}
    for attr in fields:
        d[attr] = getattr(self, attr)

    import simplejson
    return simplejson.dumps(d)

_meta.fields is an ordered list of model fields which can be accessed from instances and from the model itself.
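
One caveat when dumping raw field values is that some types, such as datetimes, are not JSON serializable by default. A minimal sketch of one way to handle that with the `default` argument, using the standard json module rather than simplejson; the dictionary `d` and the helper `encode_extra` are hypothetical stand-ins for the values collected from `_meta.fields`.

```python
import datetime
import json

# Hypothetical field values, as the loop over _meta.fields might collect them.
d = {
    "id": 1,
    "question": "What's up?",
    "pub_date": datetime.datetime(2013, 6, 27, 2, 29, 38),
}


def encode_extra(value):
    """Called by json.dumps only for objects it cannot encode itself."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    raise TypeError("Cannot serialize %r" % (value,))


text = json.dumps(d, default=encode_extra)
print(text)
```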


回答 7

这是我的解决方案,可让您轻松自定义JSON并组织相关记录

首先在模型上实现一个方法。我称它为 json,但您可以随意命名,例如:

class Car(Model):
    ...
    def json(self):
        return {
            'manufacturer': self.manufacturer.name,
            'model': self.model,
            'colors': [color.json() for color in self.colors.all()],
        }

然后在视图中我这样做:

data = [car.json() for car in Car.objects.all()]
return HttpResponse(json.dumps(data), content_type='application/json; charset=UTF-8', status=status)

Here’s my solution, which lets you easily customize the JSON as well as organize related records.

First, implement a method on the model. I call it json, but you can call it whatever you like, e.g.:

class Car(Model):
    ...
    def json(self):
        return {
            'manufacturer': self.manufacturer.name,
            'model': self.model,
            'colors': [color.json() for color in self.colors.all()],
        }

Then in the view I do:

data = [car.json() for car in Car.objects.all()]
return HttpResponse(json.dumps(data), content_type='application/json; charset=UTF-8', status=status)
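
The pattern can be tried without Django. The sketch below uses plain Python classes as stand-ins for the models (the `Color` class, the constructor arguments, and the sample values are assumptions for illustration), with each object exposing a `json()` method that returns plain dicts and lists.

```python
import json


class Color:
    def __init__(self, name):
        self.name = name

    def json(self):
        return {"name": self.name}


class Car:
    def __init__(self, manufacturer, model, colors):
        self.manufacturer = manufacturer
        self.model = model
        self.colors = colors  # a plain list here, not a related manager

    def json(self):
        return {
            "manufacturer": self.manufacturer,
            "model": self.model,
            # json is a method, so it is called for each related object
            "colors": [color.json() for color in self.colors],
        }


cars = [Car("Acme", "Roadster", [Color("red"), Color("blue")])]
payload = json.dumps([car.json() for car in cars])
print(payload)
```

Because every `json()` method returns only dicts, lists, and strings, `json.dumps` needs no custom encoder.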

回答 8

使用 list 即可解决问题。

第1步:

 result = YOUR_MODEL_NAME.objects.values('PROP1', 'PROP2').all()

第2步:

 result = list(result)  # convert the queryset returned by values() to a list

第3步:

 return HttpResponse(json.dumps(result), content_type="application/json")

Convert the queryset to a list; that solves the problem.

Step 1:

 result = YOUR_MODEL_NAME.objects.values('PROP1', 'PROP2').all()

Step 2:

 result = list(result)  # convert the queryset returned by values() to a list

Step 3:

 return HttpResponse(json.dumps(result), content_type="application/json")

回答 9

要序列化和反序列化,请使用以下命令:

from django.core import serializers

serial = serializers.serialize("json", [obj])
...
# next() pulls the first DeserializedObject out of the generator
# .object retrieves the Django model instance from the DeserializedObject
obj = next(serializers.deserialize("json", serial)).object

To serialize and deserialize, use the following:

from django.core import serializers

serial = serializers.serialize("json", [obj])
...
# next() pulls the first DeserializedObject out of the generator
# .object retrieves the Django model instance from the DeserializedObject
obj = next(serializers.deserialize("json", serial)).object

回答 10

.values() 正是我将模型实例转换为JSON所需要的。

.values()文档:https://docs.djangoproject.com/en/3.0/ref/models/querysets/#values

名为Project的模型的示例用法。

注意:我正在使用Django Rest Framework

    @csrf_exempt
    @api_view(["GET"])
    def get_project(request):
        id = request.query_params['id']
        data = Project.objects.filter(id=id).values()
        if len(data) == 0:
            return JsonResponse(status=404, data={'message': 'Project with id {} not found.'.format(id)})
        return JsonResponse(data[0])

有效ID的结果:

{
    "id": 47,
    "title": "Project Name",
    "description": "",
    "created_at": "2020-01-21T18:13:49.693Z"
}

.values() is what I needed to convert a model instance to JSON.

.values() documentation: https://docs.djangoproject.com/en/3.0/ref/models/querysets/#values

Example usage with a model called Project.

Note: I’m using Django REST Framework.

    @csrf_exempt
    @api_view(["GET"])
    def get_project(request):
        id = request.query_params['id']
        data = Project.objects.filter(id=id).values()
        if len(data) == 0:
            return JsonResponse(status=404, data={'message': 'Project with id {} not found.'.format(id)})
        return JsonResponse(data[0])

Result from a valid id:

{
    "id": 47,
    "title": "Project Name",
    "description": "",
    "created_at": "2020-01-21T18:13:49.693Z"
}

回答 11

如果要将单个模型对象作为json响应返回给客户端,则可以执行以下简单解决方案:

from django.forms.models import model_to_dict
from django.http import JsonResponse

movie = Movie.objects.get(pk=1)
return JsonResponse(model_to_dict(movie))

If you want to return a single model object as a JSON response to a client, this simple solution works:

from django.forms.models import model_to_dict
from django.http import JsonResponse

movie = Movie.objects.get(pk=1)
return JsonResponse(model_to_dict(movie))

回答 12

使用Django序列化器python格式,

from django.core import serializers

qs = SomeModel.objects.all()
serialized_obj = serializers.serialize('python', qs)

json 和 python 格式有什么区别?

json 格式以 str 形式返回结果,而 python 格式返回由字典组成的 list(在较旧的Django版本中为 OrderedDict)。

Use the Django serializer with the python format:

from django.core import serializers

qs = SomeModel.objects.all()
serialized_obj = serializers.serialize('python', qs)

What’s the difference between the json and python formats?

The json format returns the result as a str, whereas the python format returns a list of dictionaries (OrderedDict instances in older Django versions).
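
The difference in return type can be illustrated without Django. In this sketch the `records` list is hand-written in the pk/model/fields shape the serializers use; it is an assumption for illustration, not real serializer output.

```python
import json

# Shape of what the 'python' format hands back: native lists and dicts.
records = [{"model": "app.somemodel", "pk": 1, "fields": {"name": "example"}}]

# Shape of what the 'json' format hands back: one encoded string.
as_json = json.dumps(records)

print(type(records).__name__)  # list
print(type(as_json).__name__)  # str

# The string round-trips back to the same native structure.
assert json.loads(as_json) == records
```

The native structure is convenient when you want to reshape the data before encoding it yourself.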


回答 13

似乎您不能序列化一个实例,而必须序列化一个对象的QuerySet。

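from django.core import serializers
from django.http import HttpResponse
from models import *

def getUser(request):
    # serialize() expects an iterable of objects, so pass a one-object QuerySet
    return HttpResponse(serializers.serialize('json', Users.objects.filter(id=88)))

我使用的是 Django 的 svn 开发版本,因此较早的版本中可能没有此功能。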

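It doesn’t seem you can serialize a single instance; you have to serialize a QuerySet containing one object.

from django.core import serializers
from django.http import HttpResponse
from models import *

def getUser(request):
    # serialize() expects an iterable of objects, so pass a one-object QuerySet
    return HttpResponse(serializers.serialize('json', Users.objects.filter(id=88)))

I run off the svn (development) version of Django, so this may not be available in earlier versions.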


回答 14

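ville = UneVille.objects.get(nom='lihlihlihlih')
# ... (rest of the view elided in the original)

# Skip Django's internal _state entry, which is not JSON serializable
data = {k: v for k, v in ville.__dict__.items() if k != '_state'}
return HttpResponse(simplejson.dumps(data))

我返回实例的 dict,因此它返回的内容类似于 {'field1': value, 'field2': value, ...}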

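ville = UneVille.objects.get(nom='lihlihlihlih')
# ... (rest of the view elided in the original)

# Skip Django's internal _state entry, which is not JSON serializable
data = {k: v for k, v in ville.__dict__.items() if k != '_state'}
return HttpResponse(simplejson.dumps(data))

I return the dict of my instance, so it returns something like {'field1': value, 'field2': value, ...}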


回答 15

这样呢:

def ins2dic(obj):
    # Copy first: deleting keys from obj.__dict__ itself would strip attributes off the instance
    SubDic = dict(obj.__dict__)
    del SubDic['id']
    del SubDic['_state']
    return SubDic

或排除您不想要的任何东西。

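How about this:

def ins2dic(obj):
    # Copy first: deleting keys from obj.__dict__ itself would strip attributes off the instance
    SubDic = dict(obj.__dict__)
    del SubDic['id']
    del SubDic['_state']
    return SubDic

Or exclude anything else you don’t want.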
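
A minimal sketch of the same idea with a configurable exclusion list, runnable without Django. `FakeState`, `FakeInstance`, and `instance_to_dict` are hypothetical names; the fake classes only mimic an instance that carries `id` and `_state` attributes.

```python
class FakeState:
    """Stand-in for Django's internal per-instance state object."""


class FakeInstance:
    """Stand-in for a model instance; not a real Django model."""

    def __init__(self):
        self.id = 7
        self._state = FakeState()
        self.title = "Project Name"


def instance_to_dict(obj, exclude=("id", "_state")):
    # Build a new dict so the instance's own __dict__ is left untouched.
    return {k: v for k, v in vars(obj).items() if k not in exclude}


inst = FakeInstance()
result = instance_to_dict(inst)
print(result)
```

Building a new dictionary instead of deleting keys means the instance keeps its `id` and `_state` attributes after the call.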


回答 16

与我对一个框架的期望相比,这些答案都有些取巧。如果您使用的是 Django REST Framework,我认为到目前为止最简单的方法是:

rep = YourSerializerClass().to_representation(your_instance)
json.dumps(rep)

这将直接使用Serializer,同时尊重您在其上定义的字段以及任何关联等。

All of these answers are a little hacky compared to what I would expect from a framework. If you are using Django REST Framework, I think the simplest method by far is:

rep = YourSerializerClass().to_representation(your_instance)
json.dumps(rep)

This uses the Serializer directly, respecting the fields you’ve defined on it, as well as any associations, etc.