How to parse a YAML file in Python

Question: How to parse a YAML file in Python

How can I parse a YAML file in Python?


Answer 0

The easiest method that does not depend on C headers is PyYAML (documentation), which can be installed with pip install pyyaml:

#!/usr/bin/env python

import yaml

with open("example.yaml", 'r') as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

And that's it. A plain yaml.load() function also exists, but yaml.safe_load() should always be preferred, because it only builds plain data and so avoids the possibility of arbitrary code execution. Use yaml.load() only if you explicitly need the arbitrary object serialization/deserialization it provides.
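To make the difference concrete, here is a minimal sketch of how yaml.safe_load() reacts to a document that asks for an arbitrary Python object. The input strings are made up for illustration; the only assumption is that PyYAML is installed.

```python
import yaml

# A document that asks the loader to call a Python function while loading.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load has no constructor for python/* tags, so it refuses the input.
    rejected = True
    print("rejected:", exc)

# Plain data is still loaded into built-in types (dict, list, str, int, ...).
plain = yaml.safe_load("name: demo\nports: [80, 443]")
print(plain)
```

The rejection surfaces as a yaml.YAMLError subclass, so the same except clause used in the example above also covers it.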

Note that the PyYAML project supports the YAML specification only up through version 1.1. If YAML 1.2 support is needed, see ruamel.yaml as noted in this answer.
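One place where the YAML 1.1 behaviour shows up in practice is in unquoted words such as yes and no, which PyYAML resolves to booleans. The sketch below uses made-up keys; a YAML 1.2 parser would be expected to leave these unquoted words as plain strings.

```python
import yaml

doc = """
enabled: yes
country: no
quoted: "no"
"""

data = yaml.safe_load(doc)

# Under YAML 1.1 rules, unquoted yes/no become booleans;
# quoting the value keeps it a string.
print(data)
```

Quoting such values is a simple way to get the same result from both kinds of parser.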


Answer 1

Read & Write YAML files with Python 2+3 (and unicode)

# -*- coding: utf-8 -*-
import yaml
import io

# Define data
data = {
    'a list': [
        1, 
        42, 
        3.141, 
        1337, 
        'help', 
        u'€'
    ],
    'a string': 'bla',
    'another dict': {
        'foo': 'bar',
        'key': 'value',
        'the answer': 42
    }
}

# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)

# Read YAML file (with the same explicit encoding that was used for writing)
with io.open('data.yaml', 'r', encoding='utf8') as stream:
    data_loaded = yaml.safe_load(stream)

print(data == data_loaded)

Created YAML file

a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
  foo: bar
  key: value
  the answer: 42
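The allow_unicode=True argument is what keeps the euro sign readable in the file. As a small sketch of the difference (dumping to a string instead of a file, with a shortened list as sample data):

```python
import yaml

items = ['help', u'€']

# Default: non-ASCII characters are written as escape sequences.
escaped = yaml.safe_dump(items)

# allow_unicode=True: characters are written as they are.
readable = yaml.safe_dump(items, allow_unicode=True)

print(escaped)
print(readable)

# Both forms load back to the same data.
same = yaml.safe_load(escaped) == yaml.safe_load(readable) == items
```

Either output is valid YAML; the option only affects how the file looks to a human reader.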

Common file extensions

.yml and .yaml

Alternatives

For your application, the following might be important:

  • Support by other programming languages
  • Reading / writing performance
  • Compactness (file size)
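If compactness matters, one rough way to compare formats is to serialize the same data with each and look at the byte counts. The sketch below compares YAML with JSON from the standard library; the sample data is made up, and the numbers depend heavily on the data and the dump options, so treat it as a starting point rather than a benchmark.

```python
import json
import yaml

data = {
    'a list': [1, 42, 3.141, 1337, 'help'],
    'a string': 'bla',
    'another dict': {'foo': 'bar', 'key': 'value', 'the answer': 42},
}

as_yaml = yaml.safe_dump(data, default_flow_style=False)
as_json = json.dumps(data)

# Size in bytes of each serialized form.
sizes = {
    'yaml': len(as_yaml.encode('utf8')),
    'json': len(as_json.encode('utf8')),
}
print(sizes)
```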

See also: Comparison of data serialization formats

If you are instead looking for a way to write configuration files, you might want to read my short article Configuration files in Python.


Answer 2

If you have YAML that conforms to the YAML 1.2 specification (released 2009) then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).

If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.

Upgrading @Jon’s example is easy:

import ruamel.yaml as yaml

with open("example.yaml") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

Use safe_load() unless you really have full control over the input, actually need the unsafe loader (seldom the case), and know what you are doing.

If you are using pathlib Path for manipulating files, you are better off using the new API that ruamel.yaml provides:

from ruamel.yaml import YAML
from pathlib import Path

path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)

Answer 3

First install pyyaml using pip3.

Then import the yaml module and load the file into a dictionary called 'my_dict':

import yaml
with open('filename.yaml') as f:
    my_dict = yaml.safe_load(f)

That's all you need. Now the entire YAML file is in the 'my_dict' dictionary (provided the top level of the file is a mapping).
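
The result is only a dictionary when the top level of the document is a mapping. The sketch below shows what safe_load() returns for other inputs; load_mapping is a hypothetical helper added for illustration, not part of PyYAML.

```python
import yaml

as_mapping = yaml.safe_load("name: demo\nsize: 3")  # top-level mapping -> dict
as_list = yaml.safe_load("- a\n- b")                # top-level sequence -> list
as_empty = yaml.safe_load("")                       # empty document -> None


def load_mapping(text):
    """Hypothetical helper: always return a dict or raise a clear error."""
    loaded = yaml.safe_load(text)
    if loaded is None:
        return {}
    if not isinstance(loaded, dict):
        raise TypeError("expected a mapping at the top level, got %s"
                        % type(loaded).__name__)
    return loaded


print(load_mapping("name: demo\nsize: 3"))
```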


Answer 4

Example:


defaults.yaml

url: https://www.google.com

environment.py

from ruamel import yaml

# Open the file in a with block so it is closed again after loading
with open('defaults.yaml') as stream:
    data = yaml.safe_load(stream)

print(data['url'])

Answer 5

I use ruamel.yaml. Details & debate here.

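from ruamel import yaml

filename = 'example.yaml'  # placeholder path; substitute your own file

with open(filename, 'r') as fp:
    read_data = yaml.safe_load(fp)  # safe_load avoids constructing arbitrary objects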

ruamel.yaml is compatible with older PyYAML usage (apart from a few easily solved problems). As stated in the link I provided, use

from ruamel import yaml

instead of

import yaml

and it will fix most of your problems.

EDIT: As it turns out, PyYAML is not dead; it is just maintained in a different place.


Answer 6

#!/usr/bin/env python

import sys
import yaml

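def main(argv):
    # argv[0] is the path of the YAML file to check
    with open(argv[0]) as stream:
        try:
            # safe_load parses the document without constructing arbitrary objects
            print(yaml.safe_load(stream))
            return 0
        except yaml.YAMLError as exc:
            print(exc)
            return 1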

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
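
When a script like this is used to check files, it can help to report where the parse failed. PyYAML attaches a problem_mark (with zero-based line and column) to most syntax errors; the sketch below reads it defensively, since not every YAMLError carries one. describe_error is a hypothetical helper name and the sample inputs are made up.

```python
import yaml


def describe_error(text):
    """Hypothetical helper: return None if text parses, else a short location string."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        mark = getattr(exc, "problem_mark", None)
        if mark is not None:
            # mark.line and mark.column are zero-based
            return "line %d, column %d" % (mark.line + 1, mark.column + 1)
        return str(exc)
    return None


print(describe_error("key: value"))   # valid document
print(describe_error("key: [1, 2"))   # unclosed flow sequence
```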