加载和解析具有多个JSON对象的JSON文件

Question 1

I am trying to load and parse a JSON file in Python. But I’m stuck trying to load the file:

import json
json_data = open('file')
data = json.load(json_data)

Yields:

ValueError: Extra data: line 2 column 1 - line 225116 column 1 (char 232 - 160128774)

I looked at 18.2. json — JSON encoder and decoder in the Python documentation, but it’s pretty discouraging to read through this horrible-looking documentation.

First few lines (anonymized with randomized entries):

{"votes": {"funny": 2, "useful": 5, "cool": 1}, "user_id": "harveydennis", "name": "Jasmine Graham", "url": "http://example.org/user_details?userid=harveydennis", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 2, "cool": 4}, "user_id": "njohnson", "name": "Zachary Ballard", "url": "https://www.example.com/user_details?userid=njohnson", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 0, "cool": 4}, "user_id": "david06", "name": "Jonathan George", "url": "https://example.com/user_details?userid=david06", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 6, "useful": 5, "cool": 0}, "user_id": "santiagoerika", "name": "Amanda Taylor", "url": "https://www.example.com/user_details?userid=santiagoerika", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 8, "cool": 2}, "user_id": "rodriguezdennis", "name": "Jennifer Roach", "url": "http://www.example.com/user_details?userid=rodriguezdennis", "average_stars": 3.5, "review_count": 12, "type": "user"}

Question 2

You have a JSON Lines format text file. You need to parse your file line by line:

import json

data = []
with open('file') as f:
    for line in f:
        data.append(json.loads(line))

Each line contains valid JSON, but as a whole, it is not a valid JSON value as there is no top-level list or object definition.

Note that because the file contains JSON per line, you are saved the headaches of trying to parse it all in one go or to figure out a streaming JSON parser. You can now opt to process each line separately before moving on to the next, saving memory in the process. You probably don’t want to append each result to one list and then process everything if your file is really big.

If you have a file containing individual JSON objects with delimiters in-between, use How do I use the ‘json’ module to read in one JSON object at a time? to parse out individual objects using a buffered method.

Question 3

for those stumbling upon this question: the python jsonlines library (much younger than this question) elegantly handles files with one json document per line. see https://jsonlines.readthedocs.io/

Question 4

That is ill-formatted. You have one JSON object per line, but they are not contained in a larger data structure (ie an array). You’ll either need to reformat it so that it begins with [ and ends with ] with a comma at the end of each line, or parse it line by line as separate dictionaries.

加载和解析具有多个JSON对象的JSON文件

问题：加载和解析具有多个JSON对象的JSON文件

回答 0

回答 1

回答 2

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

7行代码 Python热力图可视化分析缺失数据处理

Python 流程图 — 一键转化代码为流程图

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

忽略带有str.contains的NaN

您如何捕获此异常？

SQLAlchemy按降序排列？

将熊猫数据框列表连接在一起

Django设置默认表单值

如何获取列表中的元素数量？

加载和解析具有多个JSON对象的JSON文件

问题：加载和解析具有多个JSON对象的JSON文件

回答 0

回答 1

回答 2

相关文章

排行榜展示

文章展示