What is the difference between json.dump() and json.dumps() in Python?

I searched the official documentation to find the difference between json.dump() and json.dumps() in Python. It is clear that they are related to the file-write option.
But what is the detailed difference between them, and in what situations does one have an advantage over the other?


Answer 0

There isn't much to add beyond what the docs say. If you want to dump the JSON to a file/socket or whatever, you should go with dump(). If you only need it as a string (for printing, parsing, or whatever), then use dumps() (dump string).

As Antti Haapala mentioned in this answer, there are some minor differences in the ensure_ascii behaviour. This is mostly due to how the underlying write() function works: it operates on chunks rather than on the whole string. Check his answer for more details on that.

json.dump()

Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object).

If ensure_ascii is False, some chunks written to fp may be unicode instances

json.dumps()

Serialize obj to a JSON formatted str

If ensure_ascii is False, the result may contain non-ASCII characters and the return value may be a unicode instance
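
A minimal sketch contrasting the two; the dictionary and file name here are just examples:

    import json

    data = {"name": "Alice", "scores": [1, 2, 3]}

    # dumps(): serialize to a str, e.g. for printing or further processing
    text = json.dumps(data, indent=2)
    print(text)

    # dump(): serialize straight to a .write()-supporting file-like object
    with open("data.json", "w") as fp:
        json.dump(data, fp, indent=2)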


Answer 1

The functions with an s in the name work with strings: loads takes a JSON string, and dumps returns one. The others take file streams.
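
A quick sketch of all four functions side by side (the object and file name are illustrative):

    import json

    obj = {"answer": 42}

    s = json.dumps(obj)               # object -> JSON str
    assert json.loads(s) == obj       # JSON str -> object

    with open("tmp.json", "w") as fp:
        json.dump(obj, fp)            # object -> file stream
    with open("tmp.json") as fp:
        assert json.load(fp) == obj   # file stream -> object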


Answer 2

They differ in memory usage and speed.

When you call jsonstr = json.dumps(mydata), it first creates a complete copy of your data in memory as a string, and only then does file.write(jsonstr) put it on disk. So this is the faster method, but it can be a problem if you have a big piece of data to save.

When you call json.dump(mydata, file), without the 's', no new memory is used, as the data is dumped in chunks. But the whole process is about 2 times slower.

Source: I checked the source code of json.dump() and json.dumps(), and I also tested both variants, measuring the time with time.time() and watching the memory usage in htop.
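
A rough sketch of how such a comparison could be run; the list size, file names, and the ~2x figure are illustrative and will vary with the shape of the data and the Python version:

    import json
    import time

    # A reasonably large structure; the size is arbitrary for illustration.
    mydata = [{"i": i, "text": "x" * 50} for i in range(1_000_000)]

    start = time.time()
    with open("out_dumps.json", "w") as f:
        f.write(json.dumps(mydata))   # full JSON string is built in memory first
    print("dumps + write:", time.time() - start)

    start = time.time()
    with open("out_dump.json", "w") as f:
        json.dump(mydata, f)          # written to the file chunk by chunk
    print("dump:", time.time() - start)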


Answer 3

One notable difference in Python 2 is that if you're using ensure_ascii=False, dump will properly write UTF-8 encoded data into the file (unless you used 8-bit strings with extended characters that are not UTF-8).

dumps, on the other hand, with ensure_ascii=False can produce a str or unicode depending on what types you used for your strings:

Serialize obj to a JSON formatted str using this conversion table. If ensure_ascii is False, the result may contain non-ASCII characters and the return value may be a unicode instance.

(emphasis mine). Note that it may still be a str instance as well.
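
A sketch of what this looks like in a Python 2 session (Python 2 only; it will not run under Python 3, where dumps always returns str):

    >>> import json
    >>> type(json.dumps({'k': 'v'}, ensure_ascii=False))       # only byte strings in, str out
    <type 'str'>
    >>> type(json.dumps({'k': u'v\xe4'}, ensure_ascii=False))  # a unicode value promotes the result
    <type 'unicode'>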

Thus you cannot use its return value to save the structure to a file without checking which type was returned and possibly encoding it with unicode.encode.

This, of course, is no longer a valid concern in Python 3, since there is no longer this 8-bit/Unicode confusion.


As for load vs loads: load considers the whole file to be one JSON document, so you cannot use it to read multiple newline-delimited JSON documents from a single file.
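
Where a file contains one JSON document per line (newline-delimited JSON), a common workaround is to parse it line by line with loads; a sketch, with an illustrative file name:

    import json

    # json.load(fp) would fail here: the file as a whole is not one JSON document.
    records = []
    with open("events.ndjson") as fp:
        for line in fp:
            line = line.strip()
            if line:                       # skip blank lines
                records.append(json.loads(line))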