存储Python字典

问题:存储Python字典

我习惯于使用.csv文件将数据导入和导出Python,但这存在明显的挑战。关于将字典(或字典集)存储在json或pck文件中的简单方法的任何建议?例如:

data = {}
data ['key1'] = "keyinfo"
data ['key2'] = "keyinfo2"

我想知道如何保存它,然后再将其加载回去。

I’m used to bringing data in and out of Python using CSV files, but there are obvious challenges to this. Are there simple ways to store a dictionary (or sets of dictionaries) in a JSON or pickle file?

For example:

data = {}
data ['key1'] = "keyinfo"
data ['key2'] = "keyinfo2"

I would like to know both how to save this, and then how to load it back in.


回答 0

泡菜保存:

try:
    import cPickle as pickle
except ImportError:  # python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

有关该参数的其他信息,请参见pickle模块文档protocol

酸洗负荷:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)

JSON保存:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

提供额外的参数,例如sort_keysindent以获得漂亮的结果。参数sort_keys将按字母顺序对键进行排序,而indent将使用indent=N空格缩进您的数据结构。

json.dump(data, fp, sort_keys=True, indent=4)

JSON加载:

with open('data.json', 'r') as fp:
    data = json.load(fp)

Pickle save:

try:
    import cPickle as pickle
except ImportError:  # Python 3.x
    import pickle

with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)

See the pickle module documentation for additional information regarding the protocol argument.

Pickle load:

with open('data.p', 'rb') as fp:
    data = pickle.load(fp)

JSON save:

import json

with open('data.json', 'w') as fp:
    json.dump(data, fp)

Supply extra arguments, like sort_keys or indent, to get a pretty result. The argument sort_keys will sort the keys alphabetically and indent will indent your data structure with indent=N spaces.

json.dump(data, fp, sort_keys=True, indent=4)

JSON load:

with open('data.json', 'r') as fp:
    data = json.load(fp)

回答 1

最小的示例,直接写入文件:

import json
json.dump(data, open(filename, 'wb'))
data = json.load(open(filename))

或安全地打开/关闭:

import json
with open(filename, 'wb') as outfile:
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

如果要将其保存为字符串而不是文件:

import json
json_str = json.dumps(data)
data = json.loads(json_str)

Minimal example, writing directly to a file:

import json
json.dump(data, open(filename, 'wb'))
data = json.load(open(filename))

or safely opening / closing:

import json
with open(filename, 'wb') as outfile:
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)

If you want to save it in a string instead of a file:

import json
json_str = json.dumps(data)
data = json.loads(json_str)

回答 2

另请参阅加速包ujson。 https://pypi.python.org/pypi/ujson

import ujson
with open('data.json', 'wb') as fp:
    ujson.dump(data, fp)

Also see the speeded-up package ujson:

import ujson

with open('data.json', 'wb') as fp:
    ujson.dump(data, fp)

回答 3

要写入文件:

import json
myfile.write(json.dumps(mydict))

要读取文件:

import json
mydict = json.loads(myfile.read())

myfile 是存储字典的文件的文件对象。

To write to a file:

import json
myfile.write(json.dumps(mydict))

To read from a file:

import json
mydict = json.loads(myfile.read())

myfile is the file object for the file that you stored the dict in.


回答 4

如果您正在序列化之后但不需要其他程序中的数据,则强烈建议您使用该shelve模块。将其视为持久性字典。

myData = shelve.open('/path/to/file')

# check for values.
keyVar in myData

# set values
myData[anotherKey] = someValue

# save the data for future use.
myData.close()

If you’re after serialization, but won’t need the data in other programs, I strongly recommend the shelve module. Think of it as a persistent dictionary.

myData = shelve.open('/path/to/file')

# Check for values.
keyVar in myData

# Set values
myData[anotherKey] = someValue

# Save the data for future use.
myData.close()

回答 5

如果您想要替代picklejson,则可以使用klepto

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print memo
{'y': 2, 'x': 1, 'z': 3}

使用klepto,如果使用过serialized=True,则该字典将被memo.pkl作为腌制的字典写入,而不是使用明文。

你可以在klepto这里找到:https : //github.com/uqfoundation/klepto

dill酸洗可能比酸洗更好pickle,因为dill可以在python中序列化几乎所有内容。 klepto也可以使用dill

你可以在dill这里找到:https : //github.com/uqfoundation/dill

前几行中额外的mumbo-jumbo是因为klepto可以配置为将字典存储到文件,目录上下文或SQL数据库中。无论选择什么作为后端存档,API都是相同的。它为您提供了一个“可存档”字典,您可以使用该字典loaddump与档案进行交互。

If you want an alternative to pickle or json, you can use klepto.

>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache        
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump() 
>>> 
>>> # import from 'memo.py'
>>> from memo import memo
>>> print memo
{'y': 2, 'x': 1, 'z': 3}

With klepto, if you had used serialized=True, the dictionary would have been written to memo.pkl as a pickled dictionary instead of with clear text.

You can get klepto here: https://github.com/uqfoundation/klepto

dill is probably a better choice for pickling then pickle itself, as dill can serialize almost anything in python. klepto also can use dill.

You can get dill here: https://github.com/uqfoundation/dill

The additional mumbo-jumbo on the first few lines are because klepto can be configured to store dictionaries to a file, to a directory context, or to a SQL database. The API is the same for whatever you choose as the backend archive. It gives you an “archivable” dictionary with which you can use load and dump to interact with the archive.


回答 6

这是一个老话题,但是为了完整起见,我们应该包括ConfigParser和configparser,它们分别是Python 2和3中的标准库的一部分。该模块读取和写入config / ini文件,并且(至少在Python 3中)其行为类似于字典。它的另一个好处是,您可以将多个词典存储到config / ini文件的不同部分中,并对其进行调用。甜!

Python 2.7.x示例。

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# make each dictionary a separate section in config
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])

config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# save config to file
f = open('config.ini', 'w')
config.write(f)
f.close()

# read config from file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.X示例。

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# make each dictionary a separate section in config
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# save config to file
f = open('config.ini', 'w')
config.write(f)
f.close()

# read config from file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'] )
dictB = dict(config2['dict2'] )
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

控制台输出

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

config.ini的内容

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1

For completeness, we should include ConfigParser and configparser which are part of the standard library in Python 2 and 3, respectively. This module reads and writes to a config/ini file and (at least in Python 3) behaves in a lot of ways like a dictionary. It has the added benefit that you can store multiple dictionaries into separate sections of your config/ini file and recall them. Sweet!

Python 2.7.x example.

import ConfigParser

config = ConfigParser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])
   
config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])

config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])

# Save the configuration to a file
f = open('config.ini', 'w')
config.write(f)
f.close()

# Read the configuration from a file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')

dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]

dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]

dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]

print(dictA)
print(dictB)
print(dictC)

Python 3.X example.

import configparser

config = configparser.ConfigParser()

dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}

# Make each dictionary a separate section in the configuration
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3

# Save the configuration to a file
f = open('config.ini', 'w')
config.write(f)
f.close()

# Read the configuration from a file
config2 = configparser.ConfigParser()
config2.read('config.ini')

# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'] )
dictB = dict(config2['dict2'] )
dictC = dict(config2['dict3'])

print(dictA)
print(dictB)
print(dictC)

Console output

{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}

Contents of config.ini

[dict1]
key2 = keyinfo2
key1 = keyinfo

[dict2]
k1 = hot
k2 = cross
k3 = buns

[dict3]
z = 3
y = 2
x = 1

回答 7

如果保存到json文件,最好的和最简单的方法是:

import json
with open("file.json", "wb") as f:
    f.write(json.dumps(dict).encode("utf-8"))

If save to a JSON file, the best and easiest way of doing this is:

import json
with open("file.json", "wb") as f:
    f.write(json.dumps(dict).encode("utf-8"))

回答 8

我的用例是将多个json对象保存到文件中,而marty的回答对我有所帮助。但是要满足我的用例,答案并不完整,因为每次保存新条目时,它都会覆盖旧数据。

为了将多个条目保存在一个文件中,必须检查旧内容(即在写入之前先读取)。存放json数据的典型文件将具有a listobjectas根。因此,我认为我的json文件始终具有a,list of objects并且每次向其添加数据时,我只会首先加载列表,在其中添加新数据,然后将其转储回文件(w)的仅可写实例:

def saveJson(url,sc): #this function writes the 2 values to file
    newdata = {'url':url,'sc':sc}
    json_path = "db/file.json"

    old_list= []
    with open(json_path) as myfile:  #read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path,"w") as myfile:  #overwrite the whole content
        json.dump(old_list,myfile,sort_keys=True,indent=4)

    return "sucess"

新的json文件将如下所示:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

注意:必须file.json使用[]以初始数据命名的文件,此方法才能正常工作

PS:与原始问题无关,但是通过首先检查我们的条目是否已经存在(基于1 /多个键),然后仅追加并保存数据,也可以进一步改进此方法。让我知道是否有人需要该支票,我将添加到答案中

My use case was to save multiple JSON objects to a file and marty’s answer helped me somewhat. But to serve my use case, the answer was not complete as it would overwrite the old data every time a new entry was saved.

To save multiple entries in a file, one must check for the old content (i.e., read before write). A typical file holding JSON data will either have a list or an object as root. So I considered that my JSON file always has a list of objects and every time I add data to it, I simply load the list first, append my new data in it, and dump it back to a writable-only instance of file (w):

def saveJson(url,sc): # This function writes the two values to the file
    newdata = {'url':url,'sc':sc}
    json_path = "db/file.json"

    old_list= []
    with open(json_path) as myfile:  # Read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)

    with open(json_path,"w") as myfile:  # Overwrite the whole content
        json.dump(old_list, myfile, sort_keys=True, indent=4)

    return "success"

The new JSON file will look something like this:

[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]

NOTE: It is essential to have a file named file.json with [] as initial data for this approach to work

PS: not related to original question, but this approach could also be further improved by first checking if our entry already exists (based on one or multiple keys) and only then append and save the data.