Tag Archives: pretty-print

Any way to properly pretty-print ordered dictionaries?

Question: Any way to properly pretty-print ordered dictionaries?

I like the pprint module in Python. I use it a lot for testing and debugging. I frequently use the width option to make sure the output fits nicely within my terminal window.

It has worked fine until they added the new ordered dictionary type in Python 2.7 (another cool feature I really like). If I try to pretty-print an ordered dictionary, it doesn’t show nicely. Instead of having each key-value pair on its own line, the whole thing shows up on one long line, which wraps many times and is hard to read.

Does anyone here have a way to make it print nicely, like the old unordered dictionaries? I could probably figure something out, possibly using the PrettyPrinter.format method, if I spend enough time, but I am wondering if anyone here already knows of a solution.

UPDATE: I filed a bug report for this. You can see it at http://bugs.python.org/issue10592.


Answer 0

As a temporary workaround you can try dumping in JSON format. You lose some type information, but it looks nice and keeps the order.

import json

pprint(data, indent=4)
# ^ugly

print(json.dumps(data, indent=4))
# ^nice
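
To make the "lose some type information" caveat concrete, here is a minimal sketch. The sample data is made up for illustration. It suggests that json.dumps keeps the insertion order of an OrderedDict, while tuples come back as lists and non-string keys come back as strings after a round trip.

```python
import json
from collections import OrderedDict

# Hypothetical sample data, chosen only to show the effect.
data = OrderedDict([("zebra", (1, 2)), ("apple", {3: "three"})])

# Insertion order is kept in the JSON text.
text = json.dumps(data, indent=4)
print(text)

# A round trip preserves order but not every Python type:
# the tuple becomes a list and the integer key becomes a string.
restored = json.loads(text, object_pairs_hook=OrderedDict)
print(restored)
```

If the printed text is only meant for reading during debugging, the type changes usually do not matter; they only bite if the JSON is loaded back in.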

Answer 1

The following will work if your OrderedDict happens to be in alphabetical key order, since pprint sorts a dict's keys before printing.

pprint(dict(o.items()))
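
As a small sketch of why this only helps when the keys are already in sorted order (the variable names and sample values here are illustrative): converting to a plain dict gets the nicer dict layout, but pprint then shows the keys sorted rather than in insertion order.

```python
from collections import OrderedDict
from pprint import pformat

# Insertion order is paul, john, mary (deliberately not alphabetical).
o = OrderedDict([("paul", 2), ("john", 1), ("mary", 3)])

# By default a plain dict is shown with its keys sorted,
# so the insertion order of `o` is not what gets printed.
as_dict = pformat(dict(o.items()))
print(as_dict)
```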

Answer 2

Here’s another answer that works by overriding the stock pprint() function and using it internally. Unlike my earlier one, it will handle OrderedDicts inside another container such as a list, and it should also be able to handle any optional keyword arguments given. However, it does not have the same degree of control over the output that the other one afforded.

It operates by redirecting the stock function’s output into a temporary buffer and then word-wrapping that before sending it on to the output stream. While the final output isn’t exceptionally pretty, it’s decent and may be “good enough” to use as a workaround.

Update 2.0

Simplified by using standard library textwrap module, and modified to work in both Python 2 & 3.

from collections import OrderedDict
try:
    from cStringIO import StringIO
except ImportError:  # Python 3
    from io import StringIO
from pprint import pprint as pp_pprint
import sys
import textwrap

def pprint(object, **kwrds):
    try:
        width = kwrds['width']
    except KeyError: # unlimited, use stock function
        pp_pprint(object, **kwrds)
        return
    buffer = StringIO()
    stream = kwrds.get('stream', sys.stdout)
    kwrds.update({'stream': buffer})
    pp_pprint(object, **kwrds)
    words = buffer.getvalue().split()
    buffer.close()

    # word wrap output onto multiple lines <= width characters
    try:
        print >> stream, textwrap.fill(' '.join(words), width=width)
    except TypeError:  # Python 3
        print(textwrap.fill(' '.join(words), width=width), file=stream)

d = dict((('john',1), ('paul',2), ('mary',3)))
od = OrderedDict((('john',1), ('paul',2), ('mary',3)))
lod = [OrderedDict((('john',1), ('paul',2), ('mary',3))),
       OrderedDict((('moe',1), ('curly',2), ('larry',3))),
       OrderedDict((('weapons',1), ('mass',2), ('destruction',3)))]

Sample output:

pprint(d, width=40)

»   {'john': 1, 'mary': 3, 'paul': 2}

pprint(od, width=40)

» OrderedDict([('john', 1), ('paul', 2),
   ('mary', 3)])

pprint(lod, width=40)

» [OrderedDict([('john', 1), ('paul', 2),
   ('mary', 3)]), OrderedDict([('moe', 1),
   ('curly', 2), ('larry', 3)]),
   OrderedDict([('weapons', 1), ('mass',
   2), ('destruction', 3)])]


Answer 3

To print an ordered dict, e.g.

from collections import OrderedDict

d=OrderedDict([
    ('a', OrderedDict([
        ('a1',1),
        ('a2','sss')
    ])),
    ('b', OrderedDict([
        ('b1', OrderedDict([
            ('bb1',1),
            ('bb2',4.5)])),
        ('b2',4.5)
    ])),
])

I do

def dict_or_OrdDict_to_formatted_str(OD, mode='dict', s="", indent=' '*4, level=0):
    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False
    def fstr(s):
        return s if is_number(s) else '"%s"'%s
    if mode != 'dict':
        kv_tpl = '("%s", %s)'
        ST = 'OrderedDict([\n'; END = '])'
    else:
        kv_tpl = '"%s": %s'
        ST = '{\n'; END = '}'
    for i,k in enumerate(OD.keys()):
        if type(OD[k]) in [dict, OrderedDict]:
            level += 1
            s += (level-1)*indent+kv_tpl%(k,ST+dict_or_OrdDict_to_formatted_str(OD[k], mode=mode, indent=indent, level=level)+(level-1)*indent+END)
            level -= 1
        else:
            s += level*indent+kv_tpl%(k,fstr(OD[k]))
        if i!=len(OD)-1:
            s += ","
        s += "\n"
    return s

print dict_or_OrdDict_to_formatted_str(d)

Which yields

"a": {
    "a1": 1,
    "a2": "sss"
},
"b": {
    "b1": {
        "bb1": 1,
        "bb2": 4.5
    },
    "b2": 4.5
}

or

print dict_or_OrdDict_to_formatted_str(d, mode='OD')

which yields

("a", OrderedDict([
    ("a1", 1),
    ("a2", "sss")
])),
("b", OrderedDict([
    ("b1", OrderedDict([
        ("bb1", 1),
        ("bb2", 4.5)
    ])),
    ("b2", 4.5)
]))

Answer 4

Here’s a way that hacks the implementation of pprint. pprint sorts the keys before printing, so to preserve order, we just have to make the keys sort in the way we want.

Note that this impacts the items() function. So you might want to preserve and restore the overridden functions after doing the pprint.

from collections import OrderedDict
import pprint

class ItemKey(object):
  def __init__(self, name, position):
    self.name = name
    self.position = position
  def __cmp__(self, b):
    assert isinstance(b, ItemKey)
    return cmp(self.position, b.position)
  def __repr__(self):
    return repr(self.name)

OrderedDict.items = lambda self: [
    (ItemKey(name, i), value)
    for i, (name, value) in enumerate(self.iteritems())]
OrderedDict.__repr__ = dict.__repr__

a = OrderedDict()
a[4] = '4'
a[1] = '1'
a[2] = '2'
print pprint.pformat(a) # {4: '4', 1: '1', 2: '2'}

Answer 5

Here is my approach to pretty print an OrderedDict

from collections import OrderedDict
import json
d = OrderedDict()
d['duck'] = 'alive'
d['parrot'] = 'dead'
d['penguin'] = 'exploded'
d['Falcon'] = 'discharged'
print(d)
print(json.dumps(d,indent=4))

Output:

OrderedDict([('duck', 'alive'), ('parrot', 'dead'), ('penguin', 'exploded'), ('Falcon', 'discharged')])

{
    "duck": "alive",
    "parrot": "dead",
    "penguin": "exploded",
    "Falcon": "discharged"
}

If you want to pretty-print the dictionary with its keys in sorted order:

print(json.dumps(d, indent=4, sort_keys=True))
{
    "Falcon": "discharged",
    "duck": "alive",
    "parrot": "dead",
    "penguin": "exploded"
}

Answer 6

This is pretty crude, but I just needed a way to visualize a data structure made up of any arbitrary Mappings and Iterables and this is what I came up with before giving up. It’s recursive, so it will fall through nested structures and lists just fine. I used the Mapping and Iterable abstract base classes from collections to handle just about anything.

I was aiming for almost yaml like output with concise python code, but didn’t quite make it.

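# Python 2 imports (on Python 3 these ABCs live in collections.abc)
from collections import Mapping, Iterable, OrderedDict

def format_structure(d, level=0):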
    x = ""
    if isinstance(d, Mapping):
        lenk = max(map(lambda x: len(str(x)), d.keys()))
        for k, v in d.items():
            key_text = "\n" + " "*level + " "*(lenk - len(str(k))) + str(k)
            x += key_text + ": " + format_structure(v, level=level+lenk)
    elif isinstance(d, Iterable) and not isinstance(d, basestring):
        for e in d:
            x += "\n" + " "*level + "- " + format_structure(e, level=level+4)
    else:
        x = str(d)
    return x

and some test data using OrderedDict and lists of OrderedDicts… (sheesh Python needs OrderedDict literals sooo badly…)

d = OrderedDict([("main",
                  OrderedDict([("window",
                                OrderedDict([("size", [500, 500]),
                                             ("position", [100, 900])])),
                               ("splash_enabled", True),
                               ("theme", "Dark")])),
                 ("updates",
                  OrderedDict([("automatic", True),
                               ("servers",
                                [OrderedDict([("url", "http://server1.com"),
                                              ("name", "Stable")]),
                                 OrderedDict([("url", "http://server2.com"),
                                              ("name", "Beta")]),
                                 OrderedDict([("url", "http://server3.com"),
                                              ("name", "Dev")])]),
                               ("prompt_restart", True)])),
                 ("logging",
                  OrderedDict([("enabled", True),
                               ("rotate", True)]))])

print format_structure(d)

yields the following output:

   main: 
               window: 
                         size: 
                             - 500
                             - 500
                     position: 
                             - 100
                             - 900
       splash_enabled: True
                theme: Dark
updates: 
            automatic: True
              servers: 
                     - 
                          url: http://server1.com
                         name: Stable
                     - 
                          url: http://server2.com
                         name: Beta
                     - 
                          url: http://server3.com
                         name: Dev
       prompt_restart: True
logging: 
       enabled: True
        rotate: True

I had some thoughts along the way of using str.format() for better alignment, but didn’t feel like digging into it. You’d need to dynamically specify the field widths depending on the type of alignment you want, which would get either tricky or cumbersome.

Anyway, this shows me my data in readable hierarchical fashion, so that works for me!


Answer 7

def pprint_od(od):
    print "{"
    for key in od:
        print "%s:%s,\n" % (key, od[key]) # Fixed syntax
    print "}"

There you go ^^

for item in li:
    pprint_od(item)

or

[pprint_od(item) for item in li]  # a list comprehension runs immediately; a bare generator expression would never call pprint_od

Answer 8

I’ve tested this unholy monkey-patch based hack on Python 3.5 and it works:

import pprint

pprint.PrettyPrinter._dispatch[pprint._collections.OrderedDict.__repr__] = pprint.PrettyPrinter._pprint_dict


def unsorted_pprint(data):
    def fake_sort(*args, **kwargs):
        return args[0]
    orig_sorted = __builtins__.sorted
    try:
        __builtins__.sorted = fake_sort
        pprint.pprint(data)
    finally:
        __builtins__.sorted = orig_sorted

This makes pprint format an OrderedDict with its usual dict-based layout and also disables sorting for the duration of the call, so that no keys are actually sorted for printing.


Answer 9

As of Python 3.8, pprint.PrettyPrinter exposes the sort_dicts keyword parameter.

It is True by default; setting it to False leaves the dictionary in insertion order.

>>> from pprint import PrettyPrinter

>>> x = {'John': 1,
...      'Mary': 2,
...      'Paul': 3,
...      'Lisa': 4,
...      }

>>> PrettyPrinter(sort_dicts=False).pprint(x)

Will output (on one line, since it fits within the default width of 80):

{'John': 1, 'Mary': 2, 'Paul': 3, 'Lisa': 4}

Reference: https://docs.python.org/3/library/pprint.html
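
The same keyword is also accepted by the module-level pprint() and pformat() functions on Python 3.8+, so it combines with width. A minimal sketch, with sample data chosen for illustration:

```python
from pprint import pformat

x = {'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}

# Python 3.8+: keep insertion order, still wrap to the requested width.
unsorted_text = pformat(x, width=20, sort_dicts=False)

# Default behaviour: keys are sorted before formatting.
sorted_text = pformat(x, width=20)

print(unsorted_text)
print(sorted_text)
```

Because sort_dicts applies at every nesting level, the inner dictionary keeps its insertion order as well.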


Answer 10

The pprint() method just invokes the __repr__() method of the things in it, and OrderedDict doesn’t appear to do much in its method (or doesn’t have one or something).

Here’s a cheap solution that should work IF YOU DON’T CARE ABOUT THE ORDER BEING VISIBLE IN THE PPRINT OUTPUT, which may be a big if:

class PrintableOrderedDict(OrderedDict):
    def __repr__(self):
        return dict.__repr__(self)

I’m actually surprised that the order isn’t preserved… ah well.


Answer 11

You can also use this simplification of the kzh answer:

pprint(data.items(), indent=4)

It preserves the order and outputs almost the same as the webwurst answer (printing through a JSON dump).
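
A sketch of the same idea for Python 3, where items() returns a view rather than a list, so wrapping it in list() is assumed to be needed to get pprint's list layout. The sample data is illustrative.

```python
from collections import OrderedDict
from pprint import pformat

data = OrderedDict([('john', 1), ('paul', 2), ('mary', 3)])

# list(...) turns the items view into a list of (key, value) tuples,
# which pprint wraps one pair per line when the width is exceeded.
text = pformat(list(data.items()), width=20)
print(text)
```

The result shows pairs as tuples rather than key: value, which is the price of this shortcut.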


Answer 12

For Python < 3.8 (e.g. 3.6):

Monkey-patch pprint’s sorted in order to prevent it from sorting. This has the benefit of working recursively as well, and it is more suitable than the json option for anyone who needs, for example, the width parameter:

import pprint
pprint.sorted = lambda arg, *a, **kw: arg

>>> pprint.pprint({'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}, width=20)
{'z': 1,
 'a': 2,
 'c': {'z': 0,
       'a': 1}}

Edit: cleaning up

To clean up after this dirty business just run: pprint.sorted = sorted

For a really clean solution, you can even use a context manager:

import pprint
import contextlib

@contextlib.contextmanager
def pprint_ordered():
    pprint.sorted = lambda arg, *args, **kwargs: arg
    try:
        yield
    finally:
        # restore normal sorting even if the body raises
        pprint.sorted = sorted

# usage:

with pprint_ordered():
    pprint.pprint({'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}, width=20)

# without it    
pprint.pprint({'z': 1, 'a': 2, 'c': {'z': 0, 'a': 1}}, width=20)

# prints: 
#    
# {'z': 1,
#  'a': 2,
#  'c': {'z': 0,
#        'a': 1}}
#
# {'a': 2,
#  'c': {'a': 1,
#        'z': 0},
#  'z': 1}

Answer 13

You could redefine pprint() and intercept calls for OrderedDicts. Here’s a simple illustration. As written, the OrderedDict override code ignores any optional stream, indent, width, or depth keywords that may have been passed, but it could be enhanced to implement them. Unfortunately, this technique doesn’t handle them inside another container, such as a list of OrderedDicts.

from collections import OrderedDict
from pprint import pprint as pp_pprint

def pprint(obj, *args, **kwrds):
    if not isinstance(obj, OrderedDict):
        # use stock function
        return pp_pprint(obj, *args, **kwrds)
    else:
        # very simple sample custom implementation...
        print "{"
        for key in obj:
            print "    %r:%r" % (key, obj[key])
        print "}"

l = [10, 2, 4]
d = dict((('john',1), ('paul',2), ('mary',3)))
od = OrderedDict((('john',1), ('paul',2), ('mary',3)))
pprint(l, width=4)
# [10,
#  2,
#  4]
pprint(d)
# {'john': 1, 'mary': 3, 'paul': 2}

pprint(od)
# {
#     'john':1
#     'paul':2
#     'mary':3
# }

Answer 14

If the dictionary items are all of one type, you could use the amazing data-handling library pandas:

>>> import pandas as pd
>>> x = {'foo':1, 'bar':2}
>>> pd.Series(x)
bar    2
foo    1
dtype: int64

or

>>> import pandas as pd
>>> x = {'foo':'bar', 'baz':'bam'}
>>> pd.Series(x)
baz    bam
foo    bar
dtype: object

Pretty-print JSON data to a file using Python

Question: Pretty-print JSON data to a file using Python

A project for class involves parsing Twitter JSON data. I’m getting the data and setting it to the file without much trouble, but it’s all in one line. This is fine for the data manipulation I’m trying to do, but the file is ridiculously hard to read and I can’t examine it very well, making the code writing for the data manipulation part very difficult.

Does anyone know how to do that from within Python (i.e. not using the command line tool, which I can’t get to work)? Here’s my code so far:

header, output = client.request(twitterRequest, method="GET", body=None,
                            headers=None, force_auth_header=True)

# now write output to a file
twitterDataFile = open("twitterData.json", "wb")
# magic happens here to make it pretty-printed
twitterDataFile.write(output)
twitterDataFile.close()

Note I appreciate people pointing me to simplejson documentation and such, but as I have stated, I have already looked at that and continue to need assistance. A truly helpful reply will be more detailed and explanatory than the examples found there. Thanks

Also: Trying this in the windows command line:

more twitterData.json | python -mjson.tool > twitterData-pretty.json

results in this:

Invalid control character at: line 1 column 65535 (char 65535)

I’d give you the data I’m using, but it’s very large and you’ve already seen the code I used to make the file.


Answer 0

You should use the optional argument indent.

header, output = client.request(twitterRequest, method="GET", body=None,
                            headers=None, force_auth_header=True)

# now write output to a file
twitterDataFile = open("twitterData.json", "w")
# magic happens here to make it pretty-printed
twitterDataFile.write(simplejson.dumps(simplejson.loads(output), indent=4, sort_keys=True))
twitterDataFile.close()

Answer 1

You can parse the JSON, then output it again with indents like this:

import json
mydata = json.loads(output)
print json.dumps(mydata, indent=4)

See http://docs.python.org/library/json.html for more info.


Answer 2

import json

with open("twitterdata.json", "w") as twitter_data_file:
    json.dump(output, twitter_data_file, indent=4, sort_keys=True)

You don’t need json.dumps() if you don’t want to work with the string later; simply use json.dump(), which writes straight to the file. It’s faster too.
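
One caveat worth sketching: json.dump() serializes a Python object, so if output is still the raw response text (as it appears to be in the question), it presumably needs to be parsed with json.loads() first. The sample string and the temporary file path below are stand-ins for illustration, not the real Twitter data.

```python
import json
import os
import tempfile

# Stand-in for the raw response body from the question (assumed to be a str).
output = '{"user": {"name": "example", "id": 1}, "text": "hello"}'

# Write into a throwaway directory so the sketch has no side effects.
path = os.path.join(tempfile.mkdtemp(), "twitterData.json")
with open(path, "w") as f:
    # Parse first; dumping the raw string would just write one quoted string.
    json.dump(json.loads(output), f, indent=4, sort_keys=True)

# Read the file back to check what was written.
with open(path) as f:
    pretty = f.read()
print(pretty)
```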


Answer 3

You can use Python’s json module to pretty-print.

>>> import json
>>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)
{
    "4": 5,
    "6": 7
}

So, in your case

>>> print json.dumps(json_output, indent=4)

Answer 4

If you already have existing JSON files which you want to pretty format you could use this:

    with open('twitterdata.json', 'r+') as f:
        data = json.load(f)
        f.seek(0)
        json.dump(data, f, indent=4)
        f.truncate()

Answer 5

If you are generating a new *.json file or modifying an existing JSON file, use the “indent” parameter to get pretty-printed JSON.

import json
responseData = json.loads(output)
with open('twitterData.json','w') as twitterDataFile:    
    json.dump(responseData, twitterDataFile, indent=4)

Answer 6

import json
def writeToFile(logData, fileName, openOption="w"):
  file = open(fileName, openOption)
  file.write(json.dumps(json.loads(logData), indent=4)) 
  file.close()  

Answer 7

You could pipe the file through Python’s json.tool module and page through the result with more.

For example:

cat filename.json | python -m json.tool | more

Formatting floats without trailing zeros

Question: Formatting floats without trailing zeros

How can I format a float so that it doesn’t contain trailing zeros? In other words, I want the resulting string to be as short as possible.

For example:

3 -> "3"
3. -> "3"
3.0 -> "3"
3.1 -> "3.1"
3.14 -> "3.14"
3.140 -> "3.14"

Answer 0

Me, I’d do ('%f' % x).rstrip('0').rstrip('.') — guarantees fixed-point formatting rather than scientific notation, etc etc. Yeah, not as slick and elegant as %g, but, it works (and I don’t know how to force %g to never use scientific notation;-).
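A minimal sketch of the two-step strip, wrapped in a helper whose name (strip_zeros) is made up for illustration. Note that a bare '%f' keeps only six digits after the decimal point, so very small values collapse to "0".

```python
def strip_zeros(x):
    # '%f' always produces fixed-point text with 6 digits after the point.
    # First strip trailing zeros, then a decimal point that may be left dangling.
    return ('%f' % x).rstrip('0').rstrip('.')

for value in (3, 3.0, 3.1, 3.14, 3.140, 100.0, 1e-7):
    print(value, '->', strip_zeros(value))
```

The order of the two rstrip calls matters: stripping '0' first stops at the decimal point, so 100.0 becomes "100." and then "100" rather than "1".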


Answer 1

You could use %g to achieve this:

'%g'%(3.140)

or, for Python 2.6 or better:

'{0:g}'.format(3.140)

From the docs for format: g causes (among other things)

insignificant trailing zeros [to be] removed from the significand, and the decimal point is also removed if there are no remaining digits following it.
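A short sketch of how %g behaves on the values from the question, plus two inputs where it switches to scientific notation. The helper name g_format is an assumption for illustration; the default precision of %g is 6 significant digits.

```python
def g_format(x):
    # %g drops insignificant trailing zeros, but falls back to exponent
    # notation for very small or very large magnitudes.
    return '%g' % x

for value in (3, 3.0, 3.1, 3.140, 0.00001, 1234567.0):
    print(repr(value), '->', g_format(value))
```

This is why several of the other answers prefer a fixed-point format followed by rstrip when scientific notation is not acceptable.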


Answer 2

What about trying the easiest and probably most effective approach? The method normalize() removes all the rightmost trailing zeros.

from decimal import Decimal

print (Decimal('0.001000').normalize())
# Result: 0.001

Works in Python 2 and Python 3.

— Updated —

The only problem as @BobStein-VisiBone pointed out, is that numbers like 10, 100, 1000… will be displayed in exponential representation. This can be easily fixed using the following function instead:

from decimal import Decimal


def format_float(f):
    d = Decimal(str(f))
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
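The exponent issue can be seen directly, and one alternative (an assumption on my part, not part of the answer above) is to format the normalized Decimal with the 'f' presentation type, which always yields fixed-point text. The helper name format_decimal is hypothetical.

```python
from decimal import Decimal

# normalize() strips trailing zeros but may switch to exponent form.
print(Decimal('100').normalize())       # shows the exponent form the answer mentions

def format_decimal(f):
    # Hypothetical helper: go through str() so the float's short repr is used,
    # normalize away trailing zeros, then force fixed-point output with 'f'.
    d = Decimal(str(f)).normalize()
    return format(d, 'f')

for value in (3.140, 100.0, 0.001, 1e-7):
    print(value, '->', format_decimal(value))
```

Converting via str(f) rather than Decimal(f) avoids exposing the float's full binary expansion.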

Answer 3

After looking over answers to several similar questions, this seems to be the best solution for me:

def floatToString(inputValue):
    return ('%.15f' % inputValue).rstrip('0').rstrip('.')

My reasoning:

%g doesn’t get rid of scientific notation.

>>> '%g' % 0.000035
'3.5e-05'

15 decimal places seems to avoid strange behavior and has plenty of precision for my needs.

>>> ('%.15f' % 1.35).rstrip('0').rstrip('.')
'1.35'
>>> ('%.16f' % 1.35).rstrip('0').rstrip('.')
'1.3500000000000001'

I could have used format(inputValue, '.15f') instead of '%.15f' % inputValue, but that is a bit slower (~30%).

I could have used Decimal(inputValue).normalize(), but this has a few issues as well. For one, it is A LOT slower (~11x). I also found that although it has pretty great precision, it still suffers from precision loss when using normalize().

>>> Decimal('0.21000000000000000000000000006').normalize()
Decimal('0.2100000000000000000000000001')
>>> Decimal('0.21000000000000000000000000006')
Decimal('0.21000000000000000000000000006')

Most importantly, I would still be converting to Decimal from a float which can make you end up with something other than the number you put in there. I think Decimal works best when the arithmetic stays in Decimal and the Decimal is initialized with a string.

>>> Decimal(1.35)
Decimal('1.350000000000000088817841970012523233890533447265625')
>>> Decimal('1.35')
Decimal('1.35')

I’m sure the precision issue of Decimal.normalize() can be adjusted to what is needed using context settings, but considering the already slow speed and not needing ridiculous precision and the fact that I’d still be converting from a float and losing precision anyway, I didn’t think it was worth pursuing.

I’m not concerned with the possible “-0” result since -0.0 is a valid floating point number and it would probably be a rare occurrence anyway, but since you did mention you want to keep the string result as short as possible, you could always use an extra conditional at very little extra speed cost.

def floatToString(inputValue):
    result = ('%.15f' % inputValue).rstrip('0').rstrip('.')
    return '0' if result == '-0' else result
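As a quick check of the edge cases discussed above (small values, accumulated rounding noise, and negative zero), here is a sketch that restates the final function and exercises it. The test inputs are my own choices, not taken from the answer.

```python
def floatToString(inputValue):
    # 15 fixed decimal places, then strip trailing zeros and a dangling point.
    result = ('%.15f' % inputValue).rstrip('0').rstrip('.')
    return '0' if result == '-0' else result

for value in (1.35, 0.000035, 0.1 + 0.2, -0.0, 100.0):
    print(repr(value), '->', floatToString(value))
```

Anything smaller in magnitude than about 5e-16 rounds to zero at this precision, which is the trade-off for never producing exponent notation.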

Answer 4

Here’s a solution that worked for me. It’s a blend of the solution by PolyMesh and use of the new .format() syntax.

for num in 3, 3., 3.0, 3.1, 3.14, 3.140:
    print('{0:.2f}'.format(num).rstrip('0').rstrip('.'))

Output:

3
3
3
3.1
3.14
3.14

Answer 5

You can simply use format() to achieve this:

format(3.140, '.10g') where 10 is the precision you want.


Answer 6

>>> str(a if a % 1 else int(a))

Answer 7

While formatting is likely the most Pythonic way, here is an alternative solution using the more_itertools.rstrip tool.

import more_itertools as mit


def fmt(num, pred=None):
    iterable = str(num)
    predicate = pred if pred is not None else lambda x: x in {".", "0"}
    return "".join(mit.rstrip(iterable, predicate))

assert fmt(3) == "3"
assert fmt(3.) == "3"
assert fmt(3.0) == "3"
assert fmt(3.1) == "3.1"
assert fmt(3.14) == "3.14"
assert fmt(3.140) == "3.14"
assert fmt(3.14000) == "3.14"
assert fmt("3,0", pred=lambda x: x in set(",0")) == "3"

The number is converted to a string, which is stripped of trailing characters that satisfy a predicate. The function definition fmt is not required, but it is used here to test assertions, which all pass. Note: it works on string inputs and accepts optional predicates.

See also details on this third-party library, more_itertools.


Answer 8

If you can live with 3. and 3.0 appearing as “3.0”, a very simple approach that right-strips zeros from float representations:

print("%s"%3.140)

(thanks @ellimilial for pointing out the exceptions)


Answer 9

Using the QuantiPhy package is an option. Normally QuantiPhy is used when working with numbers with units and SI scale factors, but it has a variety of nice number formatting options.

    >>> from quantiphy import Quantity

    >>> cases = '3 3. 3.0 3.1 3.14 3.140 3.14000'.split()
    >>> for case in cases:
    ...    q = Quantity(case)
    ...    print(f'{case:>7} -> {q:p}')
          3 -> 3
         3. -> 3
        3.0 -> 3
        3.1 -> 3.1
       3.14 -> 3.14
      3.140 -> 3.14
    3.14000 -> 3.14

And it will not use e-notation in this situation:

    >>> cases = '3.14e-9 3.14 3.14e9'.split()
    >>> for case in cases:
    ...    q = Quantity(case)
    ...    print(f'{case:>7} -> {q:,p}')
    3.14e-9 -> 0
       3.14 -> 3.14
     3.14e9 -> 3,140,000,000

An alternative you might prefer is to use SI scale factors, perhaps with units.

    >>> cases = '3e-9 3.14e-9 3 3.14 3e9 3.14e9'.split()
    >>> for case in cases:
    ...    q = Quantity(case, 'm')
    ...    print(f'{case:>7} -> {q}')
       3e-9 -> 3 nm
    3.14e-9 -> 3.14 nm
          3 -> 3 m
       3.14 -> 3.14 m
        3e9 -> 3 Gm
     3.14e9 -> 3.14 Gm

Answer 10

OP would like to remove superfluous zeros and make the resulting string as short as possible.

I find the %g exponential formatting shortens the resulting string for very large and very small values. The problem comes for values that don’t need exponential notation, like 128.0, which is neither very large nor very small.

Here is one way to format numbers as short strings that uses %g exponential notation only when Decimal.normalize creates strings that are too long. This might not be the fastest solution (since it does use Decimal.normalize).

from decimal import Decimal

def floatToString(inputValue, precision=3):
    rc = str(Decimal(inputValue).normalize())
    if 'E' in rc or len(rc) > 5:
        rc = '{0:.{1}g}'.format(inputValue, precision)        
    return rc

inputs = [128.0, 32768.0, 65536, 65536 * 2, 31.5, 1.000, 10.0]

outputs = [floatToString(i) for i in inputs]

print(outputs)

# ['128', '32768', '65536', '1.31e+05', '31.5', '1', '10']

Answer 11

For float you could use this:

def format_float(num):
    return ('%i' if num == int(num) else '%s') % num

Test it:

>>> format_float(1.00000)
'1'
>>> format_float(1.1234567890000000000)
'1.123456789'

For Decimal see solution here: https://stackoverflow.com/a/42668598/5917543


Answer 12

"{:.5g}".format(x)

I use this to format floats so that trailing zeros are dropped.


Answer 13

Here’s the answer:

import numpy

num1 = 3.1400
num2 = 3.000
numpy.format_float_positional(num1, 3, trim='-')
numpy.format_float_positional(num2, 3, trim='-')

This outputs “3.14” and “3”.

trim='-' removes both the trailing zeros and the decimal point.


Answer 14

Use %g with big enough width, for example ‘%.99g’. It will print in fixed-point notation for any reasonably big number.

EDIT: it doesn’t work

>>> '%.99g' % 0.0000001
'9.99999999999999954748111825886258685613938723690807819366455078125e-08'

Answer 15

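You can use max() like this (this only works for non-negative values; for a negative non-integer such as -3.14 it returns int(x), i.e. -3):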

print(max(int(x), x))


Answer 16

You can do it like this (note that this format rounds to zero decimal places, so 3.14 becomes "3"):

python3:

"{:0.0f}".format(num)

Answer 17

When formatting with %f, specify the precision, for example

%.2f

where .2f means two digits after the decimal point.

Example:

print("Price: %.2f" % prices[product])

output:

Price: 1.50


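How to pretty-print a numpy.array without scientific notation and with given precision?

Question: How to pretty-print a numpy.array without scientific notation and with given precision?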

I’m curious, whether there is any way to print formatted numpy.arrays, e.g., in a way similar to this:

x = 1.23456
print '%.3f' % x

If I want to print the numpy.array of floats, it prints several decimals, often in ‘scientific’ format, which is rather hard to read even for low-dimensional arrays. However, numpy.array apparently has to be printed as a string, i.e., with %s. Is there a solution for this?


Answer 0

You can use set_printoptions to set the precision of the output:

import numpy as np
x=np.random.random(10)
print(x)
# [ 0.07837821  0.48002108  0.41274116  0.82993414  0.77610352  0.1023732
#   0.51303098  0.4617183   0.33487207  0.71162095]

np.set_printoptions(precision=3)
print(x)
# [ 0.078  0.48   0.413  0.83   0.776  0.102  0.513  0.462  0.335  0.712]

And suppress suppresses the use of scientific notation for small numbers:

y=np.array([1.5e-10,1.5,1500])
print(y)
# [  1.500e-10   1.500e+00   1.500e+03]
np.set_printoptions(suppress=True)
print(y)
# [    0.      1.5  1500. ]

See the docs for set_printoptions for other options.


To apply print options locally, using NumPy 1.15.0 or later, you could use the numpy.printoptions context manager. For example, inside the with-suite precision=3 and suppress=True are set:

x = np.random.random(10)
with np.printoptions(precision=3, suppress=True):
    print(x)
    # [ 0.073  0.461  0.689  0.754  0.624  0.901  0.049  0.582  0.557  0.348]

But outside the with-suite the print options are back to default settings:

print(x)    
# [ 0.07334334  0.46132615  0.68935231  0.75379645  0.62424021  0.90115836
#   0.04879837  0.58207504  0.55694118  0.34768638]
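The restore behavior can be checked directly by comparing the option dictionaries before, inside, and after the with-suite. This is a minimal sketch assuming NumPy 1.15 or later; the array values are chosen only for illustration.

```python
import numpy as np

x = np.array([1.5e-10, 1.5, 1500.0])
before = np.get_printoptions()

with np.printoptions(precision=3, suppress=True):
    inside = np.get_printoptions()   # snapshot of the temporary settings
    print(x)                         # printed without scientific notation

after = np.get_printoptions()
print(inside['precision'], inside['suppress'])
print(after == before)               # the previous settings are back in effect
```

Strictly speaking, the context manager restores whatever options were active on entry, which are the defaults only if nothing changed them earlier.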

If you are using an earlier version of NumPy, you can create the context manager yourself. For example,

import numpy as np
import contextlib

@contextlib.contextmanager
def printoptions(*args, **kwargs):
    original = np.get_printoptions()
    np.set_printoptions(*args, **kwargs)
    try:
        yield
    finally: 
        np.set_printoptions(**original)

x = np.random.random(10)
with printoptions(precision=3, suppress=True):
    print(x)
    # [ 0.073  0.461  0.689  0.754  0.624  0.901  0.049  0.582  0.557  0.348]

To prevent zeros from being stripped from the end of floats:

np.set_printoptions now has a formatter parameter which allows you to specify a format function for each type.

np.set_printoptions(formatter={'float': '{: 0.3f}'.format})
print(x)

which prints

[ 0.078  0.480  0.413  0.830  0.776  0.102  0.513  0.462  0.335  0.712]

instead of

[ 0.078  0.48   0.413  0.83   0.776  0.102  0.513  0.462  0.335  0.712]

Answer 1

You can get a subset of the np.set_printoptions functionality from the np.array_str command, which applies only to a single print statement.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.array_str.html

For example:

In [27]: x = np.array([[1.1, 0.9, 1e-6]]*3)

In [28]: print x
[[  1.10000000e+00   9.00000000e-01   1.00000000e-06]
 [  1.10000000e+00   9.00000000e-01   1.00000000e-06]
 [  1.10000000e+00   9.00000000e-01   1.00000000e-06]]

In [29]: print np.array_str(x, precision=2)
[[  1.10e+00   9.00e-01   1.00e-06]
 [  1.10e+00   9.00e-01   1.00e-06]
 [  1.10e+00   9.00e-01   1.00e-06]]

In [30]: print np.array_str(x, precision=2, suppress_small=True)
[[ 1.1  0.9  0. ]
 [ 1.1  0.9  0. ]
 [ 1.1  0.9  0. ]]

Answer 2

Unutbu gave a really complete answer (they got a +1 from me too), but here is a lo-tech alternative:

>>> x=np.random.randn(5)
>>> x
array([ 0.25276524,  2.28334499, -1.88221637,  0.69949927,  1.0285625 ])
>>> ['{:.2f}'.format(i) for i in x]
['0.25', '2.28', '-1.88', '0.70', '1.03']

As a function (using the format() syntax for formatting):

def ndprint(a, format_string='{0:.2f}'):
    print([format_string.format(v, i) for i, v in enumerate(a)])

Usage:

>>> ndprint(x)
['0.25', '2.28', '-1.88', '0.70', '1.03']

>>> ndprint(x, '{:10.4e}')
['2.5277e-01', '2.2833e+00', '-1.8822e+00', '6.9950e-01', '1.0286e+00']

>>> ndprint(x, '{:.8g}')
['0.25276524', '2.283345', '-1.8822164', '0.69949927', '1.0285625']

The index of the array is accessible in the format string:

>>> ndprint(x, 'Element[{1:d}]={0:.2f}')
['Element[0]=0.25', 'Element[1]=2.28', 'Element[2]=-1.88', 'Element[3]=0.70', 'Element[4]=1.03']

Answer 3

FYI Numpy 1.15 (release date pending) will include a context manager for setting print options locally. This means that the following will work the same as the corresponding example in the accepted answer (by unutbu and Neil G) without having to write your own context manager. E.g., using their example:

x = np.random.random(10)
with np.printoptions(precision=3, suppress=True):
    print(x)
    # [ 0.073  0.461  0.689  0.754  0.624  0.901  0.049  0.582  0.557  0.348]

Answer 4

The gem that makes it all too easy to obtain the result as a string (in today’s numpy versions) is hidden in denis’s answer: np.array2string

>>> import numpy as np
>>> x=np.random.random(10)
>>> np.array2string(x, formatter={'float_kind':'{0:.3f}'.format})
'[0.599 0.847 0.513 0.155 0.844 0.753 0.920 0.797 0.427 0.420]'

Answer 5

Years later, another one is below. But for everyday use I just

np.set_printoptions( threshold=20, edgeitems=10, linewidth=140,
    formatter = dict( float = lambda x: "%.3g" % x ))  # float arrays %.3g

''' printf( "... %.3g ... %.1f  ...", arg, arg ... ) for numpy arrays too

Example:
    printf( """ x: %.3g   A: %.1f   s: %s   B: %s """,
                   x,        A,        "str",  B )

If `x` and `A` are numbers, this is like `"format" % (x, A, "str", B)` in python.
If they're numpy arrays, each element is printed in its own format:
    `x`: e.g. [ 1.23 1.23e-6 ... ]  3 digits
    `A`: [ [ 1 digit after the decimal point ... ] ... ]
with the current `np.set_printoptions()`. For example, with
    np.set_printoptions( threshold=100, edgeitems=3, suppress=True )
only the edges of big `x` and `A` are printed.
`B` is printed as `str(B)`, for any `B` -- a number, a list, a numpy object ...

`printf()` tries to handle too few or too many arguments sensibly,
but this is iffy and subject to change.

How it works:
numpy has a function `np.array2string( A, "%.3g" )` (simplifying a bit).
`printf()` splits the format string, and for format / arg pairs
    format: % d e f g
    arg: try `np.asanyarray()`
-->  %s  np.array2string( arg, format )
Other formats and non-ndarray args are left alone, formatted as usual.

Notes:

`printf( ... end= file= )` are passed on to the python `print()` function.

Only formats `% [optional width . precision] d e f g` are implemented,
not `%(varname)format` .

%d truncates floats, e.g. 0.9 and -0.9 to 0; %.0f rounds, 0.9 to 1 .
%g is the same as %.6g, 6 digits.
%% is a single "%" character.

The function `sprintf()` returns a long string. For example,
    title = sprintf( "%s  m %g  n %g  X %.3g",
                    __file__, m, n, X )
    print( title )
    ...
    pl.title( title )

Module globals:
_fmt = "%.3g"  # default for extra args
_squeeze = np.squeeze  # (n,1) (1,n) -> (n,) print in 1 line not n

See also:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html
http://docs.python.org/2.7/library/stdtypes.html#string-formatting

'''
# http://stackoverflow.com/questions/2891790/pretty-printing-of-numpy-array


#...............................................................................
from __future__ import division, print_function
import re
import numpy as np

try:
    basestring  # Python 2
except NameError:
    basestring = str  # Python 3 has no basestring

__version__ = "2014-02-03 feb denis"

_splitformat = re.compile( r'''(
    %
    (?<! %% )  # not %%
    -? [ \d . ]*  # optional width.precision
    \w
    )''', re.X )
    # ... %3.0f  ... %g  ... %-10s ...
    # -> ['...' '%3.0f' '...' '%g' '...' '%-10s' '...']
    # odd len, first or last may be ""

_fmt = "%.3g"  # default for extra args
_squeeze = np.squeeze  # (n,1) (1,n) -> (n,) print in 1 line not n

#...............................................................................
def printf( format, *args, **kwargs ):
    print( sprintf( format, *args ), **kwargs )  # end= file=

printf.__doc__ = __doc__


def sprintf( format, *args ):
    """ sprintf( "text %.3g text %4.1f ... %s ... ", numpy arrays or ... )
        %[defg] array -> np.array2string( formatter= )
    """
    args = list(args)
    if not isinstance( format, basestring ):
        args = [format] + args
        format = ""

    tf = _splitformat.split( format )  # [ text %e text %f ... ]
    nfmt = len(tf) // 2
    nargs = len(args)
    if nargs < nfmt:
        args += (nfmt - nargs) * ["?arg?"]
    elif nargs > nfmt:
        tf += (nargs - nfmt) * [_fmt, " "]  # default _fmt

    for j, arg in enumerate( args ):
        fmt = tf[ 2*j + 1 ]
        if arg is None \
        or isinstance( arg, basestring ) \
        or (hasattr( arg, "__iter__" ) and len(arg) == 0):
            tf[ 2*j + 1 ] = "%s"  # %f -> %s, not error
            continue
        args[j], isarray = _tonumpyarray(arg)
        if isarray  and fmt[-1] in "defgEFG":
            tf[ 2*j + 1 ] = "%s"
            fmtfunc = (lambda x: fmt % x)
            formatter = dict( float_kind=fmtfunc, int=fmtfunc )
            args[j] = np.array2string( args[j], formatter=formatter )
    try:
        return "".join(tf) % tuple(args)
    except TypeError:  # shouldn't happen
        print( "error: tf %s  types %s" % (tf, map( type, args )))
        raise


def _tonumpyarray( a ):
    """ a, isarray = _tonumpyarray( a )
        ->  scalar, False
            np.asanyarray(a), float or int
            a, False
    """
    a = getattr( a, "value", a )  # cvxpy
    if np.isscalar(a):
        return a, False
    if hasattr( a, "__iter__" )  and len(a) == 0:
        return a, False
    try:
        # map .value ?
        a = np.asanyarray( a )
    except ValueError:
        return a, False
    if hasattr( a, "dtype" )  and a.dtype.kind in "fi":  # complex ?
        if callable( _squeeze ):
            a = _squeeze( a )  # np.squeeze
        return a, True
    else:
        return a, False


#...............................................................................
if __name__ == "__main__":
    import sys

    n = 5
    seed = 0
        # run this.py n= ...  in sh or ipython
    for arg in sys.argv[1:]:
        exec( arg )
    np.set_printoptions( 1, threshold=4, edgeitems=2, linewidth=80, suppress=True )
    np.random.seed(seed)

    A = np.random.exponential( size=(n,n) ) ** 10
    x = A[0]

    printf( "x: %.3g  \nA: %.1f  \ns: %s  \nB: %s ",
                x,         A,         "str",   A )
    printf( "x %%d: %d", x )
    printf( "x %%.0f: %.0f", x )
    printf( "x %%.1e: %.1e", x )
    printf( "x %%g: %g", x )
    printf( "x %%s uses np printoptions: %s", x )

    printf( "x with default _fmt: ", x )
    printf( "no args" )
    printf( "too few args: %g %g", x )
    printf( x )
    printf( x, x )
    printf( None )
    printf( "[]:", [] )
    printf( "[3]:", [3] )
    printf( np.array( [] ))
    printf( [[]] )  # squeeze

Answer 6

And here is what I use, and it’s pretty uncomplicated:

print(np.vectorize("%.2f".__mod__)(sparse))

Answer 7

I was surprised not to see the around method mentioned; it means no messing with print options.

import numpy as np

x = np.random.random([5,5])
print(np.around(x,decimals=3))

Output:
[[0.475 0.239 0.183 0.991 0.171]
 [0.231 0.188 0.235 0.335 0.049]
 [0.87  0.212 0.219 0.9   0.3  ]
 [0.628 0.791 0.409 0.5   0.319]
 [0.614 0.84  0.812 0.4   0.307]]

Answer 8

I often want different columns to have different formats. Here is how I print a simple 2D array using some variety in the formatting by converting (slices of) my NumPy array to a tuple:

import numpy as np
dat = np.random.random((10,11))*100  # Array of random values between 0 and 100
print(dat)                           # Lines get truncated and are hard to read
for i in range(10):
    print((4*"%6.2f"+7*"%9.4f") % tuple(dat[i,:]))

Answer 9

numpy.char.mod may also be useful, depending on the details of your application. For example, numpy.char.mod('Value=%4.2f', numpy.arange(5, 10, 0.1)) will return a string array with elements “Value=5.00”, “Value=5.10”, etc. (a somewhat contrived example).


Answer 10

NumPy arrays have the method round(precision), which returns a new NumPy array with elements rounded accordingly.

import numpy as np

x = np.random.random([5,5])
print(x.round(3))

Answer 11

I find that the usual float format {:9.5f} works properly — suppressing small-value e-notations — when displaying a list or an array using a loop. But that format sometimes fails to suppress its e-notation when a formatter has several items in a single print statement. For example:

import numpy as np
np.set_printoptions(suppress=True)
a3 = 4E-3
a4 = 4E-4
a5 = 4E-5
a6 = 4E-6
a7 = 4E-7
a8 = 4E-8
#--first, display separate numbers-----------
print('Case 3:  a3, a4, a5:             {:9.5f}{:9.5f}{:9.5f}'.format(a3,a4,a5))
print('Case 4:  a3, a4, a5, a6:         {:9.5f}{:9.5f}{:9.5f}{:9.5}'.format(a3,a4,a5,a6))
print('Case 5:  a3, a4, a5, a6, a7:     {:9.5f}{:9.5f}{:9.5f}{:9.5}{:9.5f}'.format(a3,a4,a5,a6,a7))
print('Case 6:  a3, a4, a5, a6, a7, a8: {:9.5f}{:9.5f}{:9.5f}{:9.5f}{:9.5}{:9.5f}'.format(a3,a4,a5,a6,a7,a8))
#---second, display a list using a loop----------
myList = [a3,a4,a5,a6,a7,a8]
print('List 6:  a3, a4, a5, a6, a7, a8: ', end='')
for x in myList: 
    print('{:9.5f}'.format(x), end='')
print()
#---third, display a numpy array using a loop------------
myArray = np.array(myList)
print('Array 6: a3, a4, a5, a6, a7, a8: ', end='')
for x in myArray:
    print('{:9.5f}'.format(x), end='')
print()

My results show the bug in cases 4, 5, and 6:

Case 3:  a3, a4, a5:               0.00400  0.00040  0.00004
Case 4:  a3, a4, a5, a6:           0.00400  0.00040  0.00004    4e-06
Case 5:  a3, a4, a5, a6, a7:       0.00400  0.00040  0.00004    4e-06  0.00000
Case 6:  a3, a4, a5, a6, a7, a8:   0.00400  0.00040  0.00004  0.00000    4e-07  0.00000
List 6:  a3, a4, a5, a6, a7, a8:   0.00400  0.00040  0.00004  0.00000  0.00000  0.00000
Array 6: a3, a4, a5, a6, a7, a8:   0.00400  0.00040  0.00004  0.00000  0.00000  0.00000

I have no explanation for this, and therefore I always use a loop for floating output of multiple values.
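
One likely explanation, judging only from the format strings shown above: every field that prints in e-notation is written as {:9.5} with no trailing f. Without a type code, Python formats a float in a general style similar to g, which switches to exponent notation for small magnitudes. np.set_printoptions only affects how NumPy prints arrays, not str.format. A minimal sketch using one of the values from the example:

```python
a6 = 4e-6

general = '{:9.5}'.format(a6)   # no type code: general format, may use an exponent
fixed = '{:9.5f}'.format(a6)    # 'f' forces fixed-point notation

print(repr(general))  # '    4e-06'
print(repr(fixed))    # '  0.00000'
```

If that reading is right, adding the missing f to each field should make the single-statement cases match the loop output.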


Answer 12

I use

def np_print(array,fmt="10.5f"):
    print((array.size*("{:"+fmt+"}")).format(*array))

It’s not difficult to modify it for multi-dimensional arrays.


Answer 13

Yet another option is to use the decimal module:

import numpy as np
from decimal import *

arr = np.array([  56.83,  385.3 ,    6.65,  126.63,   85.76,  192.72,  112.81, 10.55])
arr2 = [str(Decimal(i).quantize(Decimal('.01'))) for i in arr]

# ['56.83', '385.30', '6.65', '126.63', '85.76', '192.72', '112.81', '10.55']

Pretty printing XML in Python

Question: Pretty printing XML in Python

What is the best way (or are the various ways) to pretty print XML in Python?


Answer 0

import xml.dom.minidom

dom = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml()
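
As a self-contained sketch of the same approach, the snippet below parses a made-up inline document (the xml_string content is only an illustration) and passes indent="  " to use two spaces instead of the default tab:

```python
import xml.dom.minidom

# Hypothetical sample document; any well-formed XML string would do.
xml_string = "<issues><issue><id>1</id><title>Example</title></issue></issues>"

dom = xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml(indent="  ")  # default indent is a tab

print(pretty_xml_as_string)
```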

Answer 1

lxml is recent, updated, and includes a pretty print function

import lxml.etree as etree

x = etree.parse("filename")
print(etree.tostring(x, pretty_print=True, encoding="unicode"))

Check out the lxml tutorial: http://lxml.de/tutorial.html


Answer 2

Another solution is to borrow this indent function, for use with the ElementTree library that’s built in to Python since 2.5. Here’s what that would look like:

from xml.etree import ElementTree

def indent(elem, level=0):
    i = "\n" + level*"  "
    j = "\n" + (level-1)*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for subelem in elem:
            indent(subelem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = j
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = j
    return elem        

root = ElementTree.parse('/tmp/xmlfile').getroot()
indent(root)
ElementTree.dump(root)
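
On Python 3.9 and later, the standard library ships a comparable helper, xml.etree.ElementTree.indent, so a hand-written function may not be needed there. A minimal sketch, using a made-up inline document in place of /tmp/xmlfile:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document standing in for a parsed file.
root = ET.fromstring("<issues><issue><id>1</id></issue></issues>")

ET.indent(root, space="  ")  # modifies the tree in place (Python 3.9+)
pretty = ET.tostring(root, encoding="unicode")

print(pretty)
```

Like the function above, this works by rewriting the text and tail whitespace of each element, so it changes the tree itself rather than only the serialized output.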

Answer 3

Here’s my (hacky?) solution to get around the ugly text node problem.

import re

uglyXml = doc.toprettyxml(indent='  ')

# Raw strings keep the regex escapes (\s, \g<1>) from being treated as string escapes
text_re = re.compile(r'>\n\s+([^<>\s].*?)\n\s+</', re.DOTALL)
prettyXml = text_re.sub(r'>\g<1></', uglyXml)

print(prettyXml)

The above code will produce:

<?xml version="1.0" ?>
<issues>
  <issue>
    <id>1</id>
    <title>Add Visual Studio 2005 and 2008 solution files</title>
    <details>We need Visual Studio 2005/2008 project files for Windows.</details>
  </issue>
</issues>

Instead of this:

<?xml version="1.0" ?>
<issues>
  <issue>
    <id>
      1
    </id>
    <title>
      Add Visual Studio 2005 and 2008 solution files
    </title>
    <details>
      We need Visual Studio 2005/2008 project files for Windows.
    </details>
  </issue>
</issues>

Disclaimer: There are probably some limitations.
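
To see what the substitution does in isolation, here is a small sketch that applies the same regular expression to a hard-coded string (the ugly value is a made-up stand-in for toprettyxml output). Recent Python versions seem to keep text-only elements on one line already, in which case the regex simply finds nothing to change.

```python
import re

# Hypothetical input imitating the "ugly" layout shown above.
ugly = "<issue>\n  <id>\n    1\n  </id>\n</issue>\n"

text_re = re.compile(r">\n\s+([^<>\s].*?)\n\s+</", re.DOTALL)
pretty = text_re.sub(r">\g<1></", ugly)

print(pretty)
```

The pattern only matches when the first non-whitespace character after a tag is not another tag, which is why the outer issue element keeps its line breaks.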


Answer 4

As others pointed out, lxml has a pretty printer built in.

Be aware though that by default it changes CDATA sections to normal text, which can have nasty results.

Here’s a Python function that preserves the input file and only changes the indentation (notice the strip_cdata=False). Furthermore it makes sure the output uses UTF-8 as encoding instead of the default ASCII (notice the encoding='utf-8'):

from lxml import etree

def prettyPrintXml(xmlFilePathToPrettyPrint):
    assert xmlFilePathToPrettyPrint is not None
    parser = etree.XMLParser(resolve_entities=False, strip_cdata=False)
    document = etree.parse(xmlFilePathToPrettyPrint, parser)
    document.write(xmlFilePathToPrettyPrint, pretty_print=True, encoding='utf-8')

Example usage:

prettyPrintXml('some_folder/some_file.xml')

Answer 5

BeautifulSoup has an easy-to-use prettify() method.

It indents one space per indentation level. It works much better than lxml’s pretty_print and is short and sweet.

from bs4 import BeautifulSoup

bs = BeautifulSoup(open(xml_file), 'xml')
print(bs.prettify())

Answer 6

If you have xmllint you can spawn a subprocess and use it. xmllint --format <file> pretty-prints its input XML to standard output.

Note that this method uses a program external to Python, which makes it sort of a hack.

import subprocess

def pretty_print_xml(xml):
    # Under Python 3, xml must be bytes because the pipes are opened in binary mode
    proc = subprocess.Popen(
        ['xmllint', '--format', '/dev/stdin'],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    (output, error_output) = proc.communicate(xml)
    return output

print(pretty_print_xml(data))

Answer 7

I tried to edit “ade”s answer above, but Stack Overflow wouldn’t let me edit after I had initially provided feedback anonymously. This is a less buggy version of the function to pretty-print an ElementTree.

def indent(elem, level=0, more_sibs=False):
    i = "\n"
    if level:
        i += (level-1) * '  '
    num_kids = len(elem)
    if num_kids:
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
            if level:
                elem.text += '  '
        count = 0
        for kid in elem:
            indent(kid, level+1, count < num_kids - 1)
            count += 1
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
            if more_sibs:
                elem.tail += '  '
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i
            if more_sibs:
                elem.tail += '  '

Answer 8

If you’re using a DOM implementation, each has their own form of pretty-printing built-in:

# minidom
#
document.toprettyxml()

# 4DOM
#
xml.dom.ext.PrettyPrint(document, stream)

# pxdom (or other DOM Level 3 LS-compliant imp)
#
serializer.domConfig.setParameter('format-pretty-print', True)
serializer.writeToString(document)

If you’re using something else without its own pretty-printer — or those pretty-printers don’t quite do it the way you want —  you’d probably have to write or subclass your own serialiser.


Answer 9

I had some problems with minidom’s pretty print. I’d get a UnicodeError whenever I tried pretty-printing a document with characters outside the given encoding, eg if I had a β in a document and I tried doc.toprettyxml(encoding='latin-1'). Here’s my workaround for it:

def toprettyxml(doc, encoding):
    """Return a pretty-printed XML document in a given encoding."""
    unistr = doc.toprettyxml().replace(u'<?xml version="1.0" ?>',
                          u'<?xml version="1.0" encoding="%s"?>' % encoding)
    return unistr.encode(encoding, 'xmlcharrefreplace')
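
The key piece of this workaround is the xmlcharrefreplace error handler, which replaces any character the target encoding cannot represent with a numeric character reference. A minimal sketch on a made-up string:

```python
text = u"<greek>\u03b2</greek>"  # contains a beta, which latin-1 cannot encode

# A strict encode would raise UnicodeEncodeError here;
# xmlcharrefreplace substitutes &#NNN; references instead.
encoded = text.encode("latin-1", "xmlcharrefreplace")

print(encoded)  # b'<greek>&#946;</greek>'
```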

Answer 10

from yattag import indent

pretty_string = indent(ugly_string)

It won’t add spaces or newlines inside text nodes, unless you ask for it with:

indent(mystring, indent_text = True)

You can specify what the indentation unit should be and what the newline should look like.

pretty_xml_string = indent(
    ugly_xml_string,
    indentation = '    ',
    newline = '\r\n'
)

The doc is on http://www.yattag.org homepage.


Answer 11

I wrote a solution to walk through an existing ElementTree and use text/tail to indent it as one typically expects.

def prettify(element, indent='  '):
    queue = [(0, element)]  # (level, element)
    while queue:
        level, element = queue.pop(0)
        children = [(level + 1, child) for child in list(element)]
        if children:
            element.text = '\n' + indent * (level+1)  # for child open
        if queue:
            element.tail = '\n' + indent * queue[0][0]  # for sibling open
        else:
            element.tail = '\n' + indent * (level-1)  # for parent close
        queue[0:0] = children  # prepend so children come before siblings

Answer 12

XML pretty print for python looks pretty good for this task. (Appropriately named, too.)

An alternative is to use pyXML, which has a PrettyPrint function.


Answer 13

Here’s a Python3 solution that gets rid of the ugly newline issue (tons of whitespace), and it only uses standard libraries unlike most other implementations.

import xml.etree.ElementTree as ET
import xml.dom.minidom
import os

def pretty_print_xml_given_root(root, output_xml):
    """
    Useful for when you are editing xml data on the fly
    """
    xml_string = xml.dom.minidom.parseString(ET.tostring(root)).toprettyxml()
    xml_string = os.linesep.join([s for s in xml_string.splitlines() if s.strip()]) # remove the weird newline issue
    with open(output_xml, "w") as file_out:
        file_out.write(xml_string)

def pretty_print_xml_given_file(input_xml, output_xml):
    """
    Useful for when you want to reformat an already existing xml file
    """
    tree = ET.parse(input_xml)
    root = tree.getroot()
    pretty_print_xml_given_root(root, output_xml)

I found how to fix the common newline issue here.
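
The blank-line problem shows up when the input is already indented, because minidom keeps the existing whitespace as text nodes and then adds its own. The sketch below reproduces that on a small made-up string and applies the same filter. It joins with "\n" rather than os.linesep, on the assumption that the result will be written to a text-mode file, which translates newlines itself.

```python
import xml.dom.minidom

# Hypothetical input that already contains indentation.
already_indented = "<a>\n  <b>1</b>\n</a>"

raw = xml.dom.minidom.parseString(already_indented).toprettyxml(indent="  ")

# Drop lines that contain only whitespace.
cleaned = "\n".join(line for line in raw.splitlines() if line.strip())

print(cleaned)
```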


Answer 14

You can use the popular external library xmltodict. With unparse and pretty=True you will get the best result:

xmltodict.unparse(
    xmltodict.parse(my_xml), full_document=False, pretty=True)

full_document=False suppresses the <?xml version="1.0" encoding="UTF-8"?> declaration at the top.


Answer 15

Take a look at the vkbeautify module.

It is a python version of my very popular javascript/nodejs plugin with the same name. It can pretty-print/minify XML, JSON and CSS text. Input and output can be string/file in any combinations. It is very compact and doesn’t have any dependency.

Examples:

import vkbeautify as vkb

vkb.xml(text)                       
vkb.xml(text, 'path/to/dest/file')  
vkb.xml('path/to/src/file')        
vkb.xml('path/to/src/file', 'path/to/dest/file') 

Answer 16

If you don’t want to have to reparse, an alternative is the xmlpp.py library with its get_pprint() function. It worked nicely and smoothly for my use cases, without having to reparse to an lxml ElementTree object.


Answer 17

You can try this variation…

Install BeautifulSoup and the backend lxml (parser) libraries:

user$ pip3 install lxml bs4

Process your XML document:

from bs4 import BeautifulSoup

with open('/path/to/file.xml', 'r') as doc: 
    for line in doc: 
        print(BeautifulSoup(line, 'lxml-xml').prettify())  

Answer 18

I had this problem and solved it like this:

def write_xml_file (self, file, xml_root_element, xml_declaration=False, pretty_print=False, encoding='unicode', indent='\t'):
    pretty_printed_xml = etree.tostring(xml_root_element, xml_declaration=xml_declaration, pretty_print=pretty_print, encoding=encoding)
    if pretty_print: pretty_printed_xml = pretty_printed_xml.replace('  ', indent)
    file.write(pretty_printed_xml)

In my code this method is called like this:

try:
    with open(file_path, 'w') as file:
        file.write('<?xml version="1.0" encoding="utf-8" ?>')

        # create some xml content using etree ...

        xml_parser = XMLParser()
        xml_parser.write_xml_file(file, xml_root, xml_declaration=False, pretty_print=True, encoding='unicode', indent='\t')

except IOError:
    print("Error while writing in log file!")

This works only because etree uses two spaces to indent by default, which I don’t find emphasizes the indentation enough and is therefore not pretty. I couldn’t find any setting for etree, or any parameter for any function, to change the standard etree indent. I like how easy it is to use etree, but this was really annoying me.


Answer 19

For converting an entire xml document to a pretty xml document
(ex: assuming you’ve extracted [unzipped] a LibreOffice Writer .odt or .ods file, and you want to convert the ugly “content.xml” file to a pretty one for automated git version control and git difftooling of .odt/.ods files, such as I’m implementing here)

import xml.dom.minidom

file = open("./content.xml", 'r')
xml_string = file.read()
file.close()

parsed_xml = xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = parsed_xml.toprettyxml()

file = open("./content_new.xml", 'w')
file.write(pretty_xml_as_string)
file.close()

References:
– Thanks to Ben Noland’s answer on this page which got me most of the way there.


Answer 20

from lxml import etree
import xml.dom.minidom as mmd

xml_root = etree.parse(xml_fiel_path, etree.XMLParser())

def print_xml(xml_root):
    plain_xml = etree.tostring(xml_root).decode('utf-8')
    urgly_xml = ''.join(plain_xml .split())
    good_xml = mmd.parseString(urgly_xml)
    print(good_xml.toprettyxml(indent='    ',))

It’s working well for the xml with Chinese!


Answer 21

If for some reason you can’t get your hands on any of the Python modules that other users mentioned, I suggest the following solution for Python 2.7:

import subprocess

def makePretty(filepath):
  cmd = "xmllint --format " + filepath
  prettyXML = subprocess.check_output(cmd, shell = True)
  with open(filepath, "w") as outfile:
    outfile.write(prettyXML)

As far as I know, this solution will work on Unix-based systems that have the xmllint package installed.


Answer 22

I solved this with a few lines of code: opening the file, going through it and adding indentation, then saving it again. I was working with small xml files, and did not want to add dependencies, or more libraries to install for the user. Anyway, here is what I ended up with:

    f = open(file_name,'r')
    xml = f.read()
    f.close()

    #Removing old indendations
    raw_xml = ''        
    for line in xml:
        raw_xml += line

    xml = raw_xml

    new_xml = ''
    indent = '    '
    deepness = 0

    for i in range((len(xml))):

        new_xml += xml[i]   
        if(i<len(xml)-3):

            simpleSplit = xml[i:(i+2)] == '><'
            advancSplit = xml[i:(i+3)] == '></'        
            end = xml[i:(i+2)] == '/>'    
            start = xml[i] == '<'

            if(advancSplit):
                deepness += -1
                new_xml += '\n' + indent*deepness
                simpleSplit = False
                deepness += -1
            if(simpleSplit):
                new_xml += '\n' + indent*deepness
            if(start):
                deepness += 1
            if(end):
                deepness += -1

    f = open(file_name,'w')
    f.write(new_xml)
    f.close()

It works for me, perhaps someone will have some use of it :)


Is there a built-in function to print all the current properties and values of an object?

Question: Is there a built-in function to print all the current properties and values of an object?

So what I’m looking for here is something like PHP’s print_r function.

This is so I can debug my scripts by seeing what’s the state of the object in question.


Answer 0

You are really mixing together two different things.

Use dir(), vars() or the inspect module to get what you are interested in (I use __builtins__ as an example; you can use any object instead).

>>> l = dir(__builtins__)
>>> d = __builtins__.__dict__

Print that dictionary however fancy you like:

>>> print l
['ArithmeticError', 'AssertionError', 'AttributeError',...

or

>>> from pprint import pprint
>>> pprint(l)
['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'DeprecationWarning',
...

>>> pprint(d, indent=2)
{ 'ArithmeticError': <type 'exceptions.ArithmeticError'>,
  'AssertionError': <type 'exceptions.AssertionError'>,
  'AttributeError': <type 'exceptions.AttributeError'>,
...
  '_': [ 'ArithmeticError',
         'AssertionError',
         'AttributeError',
         'BaseException',
         'DeprecationWarning',
...

Pretty printing is also available in the interactive debugger as a command:

(Pdb) pp vars()
{'__builtins__': {'ArithmeticError': <type 'exceptions.ArithmeticError'>,
                  'AssertionError': <type 'exceptions.AssertionError'>,
                  'AttributeError': <type 'exceptions.AttributeError'>,
                  'BaseException': <type 'exceptions.BaseException'>,
                  'BufferError': <type 'exceptions.BufferError'>,
                  ...
                  'zip': <built-in function zip>},
 '__file__': 'pass.py',
 '__name__': '__main__'}

Answer 1

You want vars() mixed with pprint():

from pprint import pprint
pprint(vars(your_object))
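
A small sketch of what this looks like in practice, assuming a made-up Config class (the class and its fields are hypothetical):

```python
from pprint import pprint

class Config:
    """Hypothetical class used only for illustration."""
    def __init__(self):
        self.host = 'localhost'
        self.port = 8080
        self.tags = ['a', 'b']

cfg = Config()
state = vars(cfg)   # the same dict object as cfg.__dict__
pprint(state, width=30)
```

Note that vars() only sees instance attributes stored in __dict__; class attributes, properties and __slots__ entries do not appear.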

Answer 2

def dump(obj):
  for attr in dir(obj):
    print("obj.%s = %r" % (attr, getattr(obj, attr)))

There are many 3rd-party functions out there that add things like exception handling, national/special character printing, recursing into nested objects etc. according to their authors’ preferences. But they all basically boil down to this.


Answer 3

dir has been mentioned, but that’ll only give you the attributes’ names. If you want their values as well try __dict__.

class O:
   def __init__ (self):
      self.value = 3

o = O()

Here is the output:

>>> o.__dict__

{'value': 3}

Answer 4

You can use the “dir()” function to do this.

>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__stderr__', '__stdin__', '__stdo
t__', '_current_frames', '_getframe', 'api_version', 'argv', 'builtin_module_names', 'byteorder
, 'call_tracing', 'callstats', 'copyright', 'displayhook', 'dllhandle', 'exc_clear', 'exc_info'
 'exc_type', 'excepthook', 'exec_prefix', 'executable', 'exit', 'getcheckinterval', 'getdefault
ncoding', 'getfilesystemencoding', 'getrecursionlimit', 'getrefcount', 'getwindowsversion', 'he
version', 'maxint', 'maxunicode', 'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_
ache', 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 'setprofile', 'setrecursionlimit
, 'settrace', 'stderr', 'stdin', 'stdout', 'subversion', 'version', 'version_info', 'warnoption
', 'winver']
>>>

Another useful feature is help.

>>> help(sys)
Help on built-in module sys:

NAME
    sys

FILE
    (built-in)

MODULE DOCS
    http://www.python.org/doc/current/lib/module-sys.html

DESCRIPTION
    This module provides access to some objects used or maintained by the
    interpreter and to functions that interact strongly with the interpreter.

    Dynamic objects:

    argv -- command line arguments; argv[0] is the script pathname if known

Answer 5

To print the current state of the object you might:

>>> obj # in an interpreter

or

print repr(obj) # in a script

or

print obj

For your classes define __str__ or __repr__ methods. From the Python documentation:

__repr__(self) Called by the repr() built-in function and by string conversions (reverse quotes) to compute the “official” string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form “<…some useful description…>” should be returned. The return value must be a string object. If a class defines __repr__() but not __str__(), then __repr__() is also used when an “informal” string representation of instances of that class is required. This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.

__str__(self) Called by the str() built-in function and by the print statement to compute the “informal” string representation of an object. This differs from __repr__() in that it does not have to be a valid Python expression: a more convenient or concise representation may be used instead. The return value must be a string object.
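
As an illustration of the two methods described above, here is a minimal sketch with a hypothetical Point class (written for Python 3, where print is a function):

```python
class Point:
    """Hypothetical class used only for illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # "Official" form: looks like an expression that recreates the object.
        return 'Point(x={!r}, y={!r})'.format(self.x, self.y)

    def __str__(self):
        # "Informal" form: shorter, meant for people.
        return '({}, {})'.format(self.x, self.y)

p = Point(1, 2)
print(repr(p))   # uses __repr__
print(p)         # uses __str__
print([p])       # containers show their items with __repr__
```

Because containers fall back to __repr__ for their items, defining __repr__ is usually the more useful of the two for debugging.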


Answer 6

Might be worth checking out —

Is there a Python equivalent to Perl’s Data::Dumper?

My recommendation is this —

https://gist.github.com/1071857

Note that Perl has a module called Data::Dumper which translates object data back to Perl source code (NB: it does NOT translate code back to source, and you almost never want the object's method functions in the output). This can be used for persistence, but the common purpose is debugging.

There are a number of things standard python pprint fails to achieve, in particular it just stops descending when it sees an instance of an object and gives you the internal hex pointer of the object (errr, that pointer is not a whole lot of use by the way). So in a nutshell, python is all about this great object oriented paradigm, but the tools you get out of the box are designed for working with something other than objects.

The perl Data::Dumper allows you to control how deep you want to go, and also detects circular linked structures (that’s really important). This process is fundamentally easier to achieve in perl because objects have no particular magic beyond their blessing (a universally well defined process).
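
For what it is worth, the standard pprint module does offer a depth limit and reports self-referencing containers rather than looping forever; it still does not descend into arbitrary object instances. A minimal sketch with made-up data:

```python
from pprint import pformat

nested = {'a': {'b': {'c': {'d': 1}}}}
# depth=2 replaces anything nested deeper than two levels with "...".
shallow = pformat(nested, depth=2)
print(shallow)

loop = [1, 2]
loop.append(loop)   # the list now contains itself
# pprint marks the cycle instead of recursing without end.
cyclic = pformat(loop)
print(cyclic)
```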


Answer 7

I recommend using help(your_object).

help(dir)

 If called without an argument, return the names in the current scope.
 Else, return an alphabetized list of names comprising (some of) the attributes
 of the given object, and of attributes reachable from it.
 If the object supplies a method named __dir__, it will be used; otherwise
 the default dir() logic is used and returns:
 for a module object: the module's attributes.
 for a class object:  its attributes, and recursively the attributes
 of its bases.
 for any other object: its attributes, its class's attributes, and
 recursively the attributes of its class's base classes.

help(vars)

Without arguments, equivalent to locals().
With an argument, equivalent to object.__dict__.

Answer 8

In most cases, using __dict__ or dir() will get you the info you want. If you happen to need more detail, the standard library includes the inspect module, which gives you an impressive amount of it. Some of the real nuggets of info include:

  • names of function and method parameters
  • class hierarchies
  • source code of the implementation of function/class objects
  • local variables out of a frame object

If you’re just looking for “what attribute values does my object have?”, then dir() and __dict__ are probably sufficient. If you’re really looking to dig into the current state of arbitrary objects (keeping in mind that in python almost everything is an object), then inspect is worthy of consideration.
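
A short sketch of a few of the inspect features listed above, using hypothetical Base and Child classes. inspect.getsource is left out here because it needs the source file to be available, which is not the case in every environment:

```python
import inspect

class Base:
    pass

class Child(Base):
    def greet(self, name, punctuation='!'):
        return 'Hello, ' + name + punctuation

# Names of function and method parameters
params = list(inspect.signature(Child.greet).parameters)
print(params)

# Class hierarchy, in method resolution order
mro_names = [cls.__name__ for cls in inspect.getmro(Child)]
print(mro_names)

# Local variables out of a frame object
def show_locals():
    answer = 42
    return dict(inspect.currentframe().f_locals)

frame_locals = show_locals()
print(frame_locals)
```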


Answer 9

Is there a built-in function to print all the current properties and values of an object?

No. The most upvoted answer excludes some kinds of attributes, and the accepted answer shows how to get all attributes, including methods and parts of the non-public api. But there is no good complete builtin function for this.

So the short corollary is that you can write your own, but it will calculate properties and other calculated data-descriptors that are part of the public API, and you might not want that:

from pprint import pprint
from inspect import getmembers
from types import FunctionType

def attributes(obj):
    disallowed_names = {
      name for name, value in getmembers(type(obj)) 
        if isinstance(value, FunctionType)}
    return {
      name: getattr(obj, name) for name in dir(obj) 
        if name[0] != '_' and name not in disallowed_names and hasattr(obj, name)}

def print_attributes(obj):
    pprint(attributes(obj))

Problems with other answers

Observe the application of the currently top voted answer on a class with a lot of different kinds of data members:

from pprint import pprint

class Obj:
    __slots__ = 'foo', 'bar', '__dict__'
    def __init__(self, baz):
        self.foo = ''
        self.bar = 0
        self.baz = baz
    @property
    def quux(self):
        return self.foo * self.bar

obj = Obj('baz')
pprint(vars(obj))

only prints:

{'baz': 'baz'}

Because vars only returns the __dict__ of an object, and it’s not a copy, so if you modify the dict returned by vars, you’re also modifying the __dict__ of the object itself.

vars(obj)['quux'] = 'WHAT?!'
vars(obj)

returns:

{'baz': 'baz', 'quux': 'WHAT?!'}

— which is bad because quux is a property that we shouldn’t be setting and shouldn’t be in the namespace…

Applying the advice in the currently accepted answer (and others) is not much better:

>>> dir(obj)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', 'bar', 'baz', 'foo', 'quux']

As we can see, dir only returns all (actually just most) of the names associated with an object.

inspect.getmembers, mentioned in the comments, is similarly flawed – it returns all names and values.

From class

When teaching I have my students create a function that provides the semantically public API of an object:

def api(obj):
    return [name for name in dir(obj) if name[0] != '_']

We can extend this to provide a copy of the semantic namespace of an object, but we need to exclude __slots__ that aren’t assigned, and if we’re taking the request for “current properties” seriously, we need to exclude calculated properties (as they could become expensive, and could be interpreted as not “current”):

from types import FunctionType
from inspect import getmembers

def attrs(obj):
     disallowed_properties = {
       name for name, value in getmembers(type(obj)) 
         if isinstance(value, (property, FunctionType))}
     return {
       name: getattr(obj, name) for name in api(obj) 
         if name not in disallowed_properties and hasattr(obj, name)}

And now we do not calculate or show the property, quux:

>>> attrs(obj)
{'bar': 0, 'baz': 'baz', 'foo': ''}

Caveats

But perhaps we do know our properties aren’t expensive. We may want to alter the logic to include them as well. And perhaps we want to exclude other custom data descriptors instead.

Then we need to further customize this function. And so it makes sense that we cannot have a built-in function that magically knows exactly what we want and provides it. This is functionality we need to create ourselves.
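
One possible customization, sketched under the assumption that a simple flag is enough. The include_properties parameter and the Sample class are hypothetical and not part of the code above:

```python
from inspect import getmembers
from types import FunctionType

def attrs(obj, include_properties=False):
    """Return a copy of the public, non-method attributes of obj."""
    # Methods are always skipped; properties only when the flag is off.
    skipped_types = (FunctionType,) if include_properties else (property, FunctionType)
    disallowed = {
        name for name, value in getmembers(type(obj))
        if isinstance(value, skipped_types)}
    return {
        name: getattr(obj, name) for name in dir(obj)
        if name[0] != '_' and name not in disallowed and hasattr(obj, name)}

class Sample:
    def __init__(self):
        self.width = 2
        self.height = 3

    @property
    def area(self):
        return self.width * self.height

    def describe(self):
        return 'sample'

s = Sample()
print(attrs(s))
print(attrs(s, include_properties=True))
```

With the flag off the computed area is never evaluated; with it on, the property getter runs once per call.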

Conclusion

There is no built-in function that does this, and you should do what is most semantically appropriate for your situation.


Answer 10

A metaprogramming example that dumps an object with magic:

$ cat dump.py
#!/usr/bin/python
import sys
if len(sys.argv) > 2:
    module, metaklass  = sys.argv[1:3]
    m = __import__(module, globals(), locals(), [metaklass])
    __metaclass__ = getattr(m, metaklass)

class Data:
    def __init__(self):
        self.num = 38
        self.lst = ['a','b','c']
        self.str = 'spam'
    dumps   = lambda self: repr(self)
    __str__ = lambda self: self.dumps()

data = Data()
print data

Without arguments:

$ python dump.py
<__main__.Data instance at 0x00A052D8>

With Gnosis Utils:

$ python dump.py gnosis.magic MetaXMLPickler
<?xml version="1.0"?>
<!DOCTYPE PyObject SYSTEM "PyObjects.dtd">
<PyObject module="__main__" class="Data" id="11038416">
<attr name="lst" type="list" id="11196136" >
  <item type="string" value="a" />
  <item type="string" value="b" />
  <item type="string" value="c" />
</attr>
<attr name="num" type="numeric" value="38" />
<attr name="str" type="string" value="spam" />
</PyObject>

It is a bit outdated but still working.


Answer 11

If you’re using this for debugging, and you just want a recursive dump of everything, the accepted answer is unsatisfying because it requires that your classes have good __str__ implementations already. If that’s not the case, this works much better:

import json
print(json.dumps(YOUR_OBJECT, 
                 default=lambda obj: vars(obj),
                 indent=1))
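
To make the behaviour concrete, here is a small sketch with hypothetical Car and Engine classes standing in for YOUR_OBJECT:

```python
import json

class Engine:
    def __init__(self):
        self.cylinders = 4

class Car:
    def __init__(self):
        self.name = 'demo'
        self.engine = Engine()   # nested object, not JSON-serializable by itself
        self.wheels = [1, 2, 3, 4]

# default= is called for every object json cannot serialize; returning
# vars(obj) lets the encoder continue into the nested attributes.
dumped = json.dumps(Car(), default=vars, indent=1)
print(dumped)
```

Objects without a __dict__ (for example datetime values or classes that only use __slots__) make vars() raise TypeError, so this is a debugging aid rather than a general serializer.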

Answer 12

Try ppretty

from ppretty import ppretty


class A(object):
    s = 5

    def __init__(self):
        self._p = 8

    @property
    def foo(self):
        return range(10)


print ppretty(A(), show_protected=True, show_static=True, show_properties=True)

Output:

__main__.A(_p = 8, foo = [0, 1, ..., 8, 9], s = 5)

Answer 13

from pprint import pprint

def print_r(the_object):
    print ("CLASS: ", the_object.__class__.__name__, " (BASE CLASS: ", the_object.__class__.__bases__,")")
    pprint(vars(the_object))

Answer 14

This prints out all the object contents recursively in json or yaml indented format:

import jsonpickle # pip install jsonpickle
import json
import yaml # pip install pyyaml

serialized = jsonpickle.encode(obj, max_depth=2) # max_depth is optional
print(json.dumps(json.loads(serialized), indent=4))
print(yaml.dump(yaml.safe_load(serialized), indent=4))

Answer 15

I’ve upvoted the answer that mentions only pprint. To be clear, if you want to see all the values in a complex data structure, then do something like:

from pprint import pprint
pprint(my_var)

Where my_var is your variable of interest. When I used pprint(vars(my_var)) I got nothing, and other answers here didn’t help or the method looked unnecessarily long. By the way, in my particular case, the code I was inspecting had a dictionary of dictionaries.

Worth pointing out that with some custom classes you may just end up with an unhelpful <someobject.ExampleClass object at 0x7f739267f400> kind of output. In that case, you might have to implement a __str__ method, or try some of the other solutions. I’d still like to find something simple that works in all scenarios, without third party libraries.


Answer 16

I needed to print DEBUG info in some logs and was unable to use pprint because it would break them. Instead I did this and got virtually the same thing.

DO = DemoObject()

itemDir = DO.__dict__

for i in itemDir:
    print '{0}  :  {1}'.format(i, itemDir[i])
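
If the obstacle is that pprint writes straight to standard output, one option is pprint.pformat, which returns the formatted text as a string that can be handed to a logger. A minimal sketch; DemoObject here is a stand-in class with invented attributes:

```python
import logging
from pprint import pformat

class DemoObject:
    """Stand-in for the class in the answer; the attributes are invented."""
    def __init__(self):
        self.name = 'demo'
        self.items = list(range(3))

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

DO = DemoObject()
message = pformat(vars(DO))   # a plain string; nothing is printed yet
log.debug('state of DO:\n%s', message)
```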

Answer 17

To dump “myObject”:

from bson import json_util
import json

print(json.dumps(myObject, default=json_util.default, sort_keys=True, indent=4, separators=(',', ': ')))

I tried vars() and dir(); both failed for what I was looking for. vars() didn’t work because the object didn’t have __dict__ (exceptions.TypeError: vars() argument must have __dict__ attribute). dir() wasn’t what I was looking for: it’s just a listing of field names, doesn’t give the values or the object structure.

I think json.dumps() would work for most objects without the default=json_util.default, but I had a datetime field in the object so the standard json serializer failed. See How to overcome “datetime.datetime not JSON serializable” in python?
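
A sketch of the same idea without the bson dependency, assuming ISO 8601 strings are an acceptable representation for dates. The encode_extra helper and the record dict are hypothetical:

```python
import json
from datetime import datetime

def encode_extra(value):
    """Fallback for values the json module cannot serialize itself."""
    if isinstance(value, datetime):
        return value.isoformat()
    if hasattr(value, '__dict__'):
        return vars(value)
    raise TypeError('not JSON serializable: %r' % (value,))

record = {'id': 7, 'created': datetime(2020, 1, 2, 3, 4, 5)}
text = json.dumps(record, default=encode_extra, sort_keys=True, indent=4)
print(text)
```

The trade-off is that the date comes back as a plain string when the JSON is loaded again, which is usually fine for a debugging dump.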


Answer 18

Why not something simple:

for key,value in obj.__dict__.iteritems():
    print key,value
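
The snippet above is Python 2 (iteritems and the print statement). A Python 3 equivalent might look like this, with a made-up Thing class standing in for obj:

```python
class Thing:
    """Hypothetical class, for illustration only."""
    def __init__(self):
        self.a = 1
        self.b = 'two'

obj = Thing()
lines = []
for key, value in vars(obj).items():   # items() replaces iteritems()
    lines.append('{} {!r}'.format(key, value))
print('\n'.join(lines))
```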

Answer 19

pprint contains a “pretty printer” for producing aesthetically pleasing representations of your data structures. The formatter produces representations of data structures that can be parsed correctly by the interpreter, and are also easy for a human to read. The output is kept on a single line, if possible, and indented when split across multiple lines.


Answer 20

Just try beeprint.

It will help you not only with printing object variables, but beautiful output as well, like this:

class(NormalClassNewStyle):
  dicts: {
  },
  lists: [],
  static_props: 1,
  tupl: (1, 2)

Answer 21

For everybody struggling with

  • vars() not returning all attributes.
  • dir() not returning the attributes’ values.

The following code prints all attributes of obj with their values:

for attr in dir(obj):
        try:
            print("obj.{} = {}".format(attr, getattr(obj, attr)))
        except AttributeError:
            print("obj.{} = ?".format(attr))

Answer 22

You can try the Flask Debug Toolbar.
https://pypi.python.org/pypi/Flask-DebugToolbar

from flask import Flask
from flask_debugtoolbar import DebugToolbarExtension

app = Flask(__name__)

# the toolbar is only enabled in debug mode:
app.debug = True

# set a 'SECRET_KEY' to enable the Flask session cookies
app.config['SECRET_KEY'] = '<replace with a secret key>'

toolbar = DebugToolbarExtension(app)

Answer 23

I like working with python object built-in types keys or values.

For attributes, regardless of whether they are methods or variables:

o.keys()

For values of those attributes:

o.values()

Answer 24

This works no matter how your variables are defined within a class, inside __init__ or outside.

your_obj = YourObj()
attrs_with_value = {attr: getattr(your_obj, attr) for attr in dir(your_obj)}

How to pretty-print a JSON file?

Question: How to pretty-print a JSON file?

I have a JSON file that is a mess that I want to prettyprint– what’s the easiest way to do this in python? I know PrettyPrint takes an “object”, which I think can be a file, but I don’t know how to pass a file in– just using the filename doesn’t work.


Answer 0

The json module already implements some basic pretty printing with the indent parameter:

>>> import json
>>>
>>> your_json = '["foo", {"bar":["baz", null, 1.0, 2]}]'
>>> parsed = json.loads(your_json)
>>> print(json.dumps(parsed, indent=4, sort_keys=True))
[
    "foo", 
    {
        "bar": [
            "baz", 
            null, 
            1.0, 
            2
        ]
    }
]

To parse a file, use json.load():

with open('filename.txt', 'r') as handle:
    parsed = json.load(handle)
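
Putting the two pieces together, here is a minimal end-to-end sketch. The file name and its content are made up, and a temporary directory is used so the example does not touch real files:

```python
import json
import os
import tempfile

# Create a small, messy JSON file to work with (made-up content).
path = os.path.join(tempfile.mkdtemp(), 'messy.json')
with open(path, 'w') as handle:
    handle.write('{"b":[1,2,{"c":null}],"a":"x"}')

# json.load takes an open file object, not a filename.
with open(path) as handle:
    parsed = json.load(handle)

pretty = json.dumps(parsed, indent=4, sort_keys=True)
print(pretty)

# Optionally write the formatted text back to the same file.
with open(path, 'w') as handle:
    handle.write(pretty + '\n')
```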

Answer 1

You can do this on the command line:

python3 -m json.tool some.json

(as already mentioned in the commentaries to the question, thanks to @Kai Petzke for the python3 suggestion).

Actually, Python is not my favourite tool as far as JSON processing on the command line is concerned. For simple pretty-printing it is fine, but if you want to manipulate the JSON it can become overcomplicated. You would soon need to write a separate script file, and you could end up with maps whose keys are u"some-key" (Python unicode), which makes selecting fields more difficult and does not really go in the direction of pretty-printing.

You can also use jq:

jq . some.json

and you get colors as a bonus (and way easier extendability).

Addendum: There is some confusion in the comments about using jq to process large JSON files on the one hand, and having a very large jq program on the other. For pretty-printing a file consisting of a single large JSON entity, the practical limitation is RAM. For pretty-printing a 2GB file consisting of a single array of real-world data, the “maximum resident set size” required for pretty-printing was 5GB (whether using jq 1.5 or 1.6). Note also that jq can be used from within python after pip install jq.


Answer 2

You could use the built-in module pprint (https://docs.python.org/3.6/library/pprint.html).

Here is how to read a file containing JSON data and print it:

import json
import pprint

# json.load parses directly from an open file object.
with open('filename.txt', 'r') as f:
    json_data = json.load(f)

pprint.pprint(json_data)
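
One thing to keep in mind is that pprint shows the Python representation of the parsed data (single quotes, None, True), not JSON. The following is a small sketch of the difference, using a made-up sample string rather than a real file:

```python
import json
import pprint

# Hypothetical sample input; any valid JSON text would do.
raw = '{"name": "example", "tags": ["a", "b"], "active": true, "parent": null}'
parsed = json.loads(raw)

# pprint renders Python literals: single quotes, True, None.
as_python = pprint.pformat(parsed, width=40)

# json.dumps renders valid JSON: double quotes, true, null.
as_json = json.dumps(parsed, indent=4)

print(as_python)
print(as_json)
```

If the output has to be pasted back into something that expects JSON, the json.dumps form is probably the safer choice.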

Answer 3

Pygmentize + Python json.tool = Pretty Print with Syntax Highlighting

Pygmentize is a killer tool. See this.

I combine python json.tool with pygmentize:

echo '{"foo": "bar"}' | python -m json.tool | pygmentize -l json

See the link above for pygmentize installation instructions.



Answer 4

Use this function and stop worrying about whether your JSON is a str or a dict; it pretty-prints either:

import json

def pp_json(json_thing, sort=True, indents=4):
    # A str is assumed to hold JSON text, so parse it first;
    # anything else is serialized as-is.
    if isinstance(json_thing, str):
        json_thing = json.loads(json_thing)
    print(json.dumps(json_thing, sort_keys=sort, indent=indents))

pp_json(your_json_string_or_dict)
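
A possible variation, sketched here under the assumption that you would rather get the text back than print it: the hypothetical format_json below returns the formatted string and also accepts bytes, which json.loads can parse directly.

```python
import json

def format_json(json_thing, sort=True, indents=4):
    """Return pretty-printed JSON for a str/bytes document or a Python object."""
    if isinstance(json_thing, (str, bytes, bytearray)):
        # Text input: parse it first so it is re-serialized as a document,
        # not quoted as one big JSON string.
        json_thing = json.loads(json_thing)
    return json.dumps(json_thing, sort_keys=sort, indent=indents)

print(format_json('{"b": 1, "a": [1, 2]}'))
print(format_json({"b": 1, "a": [1, 2]}))
```

Returning the string makes it easy to log the result or write it to a file instead of always sending it to stdout.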

Answer 5

I once wrote a prettyjson() function to produce nice-looking output. You can grab the implementation from this repo.

The main feature of this function is that it tries to keep dict and list items on one line until a certain maxlinelength is reached. This produces fewer lines of JSON, and the output looks more compact and is easier to read.

You can produce this kind of output for instance:

{
  "grid": {"port": "COM5"},
  "policy": {
    "movingaverage": 5,
    "hysteresis": 5,
    "fan1": {
      "name": "CPU",
      "signal": "cpu",
      "mode": "auto",
      "speed": 100,
      "curve": [[0, 75], [50, 75], [75, 100]]
    }
  }
}

Update (Dec 2019): I moved the code into a separate repo, fixed a few bugs, and made a few other tweaks.
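
For readers who just want the general idea, here is a minimal sketch of the "keep it on one line while it fits" technique. It is not the prettyjson() implementation from the repo; the function name compact_json, its parameters, and the simplified width check (which ignores the length of the key in front of a value) are all assumptions made for illustration. It also assumes dict keys are strings.

```python
import json

def compact_json(obj, indent=2, maxlinelength=40, level=0):
    """Pretty-print obj, keeping a dict or list on one line when it fits."""
    one_line = json.dumps(obj)
    # Scalars and empty containers are always written on one line.
    if not isinstance(obj, (dict, list)) or not obj:
        return one_line
    # A container stays on one line if it fits at the current indentation.
    if level * indent + len(one_line) <= maxlinelength:
        return one_line
    pad = " " * ((level + 1) * indent)
    if isinstance(obj, dict):
        items = [
            pad + json.dumps(key) + ": "
            + compact_json(value, indent, maxlinelength, level + 1)
            for key, value in obj.items()
        ]
        opener, closer = "{", "}"
    else:
        items = [
            pad + compact_json(value, indent, maxlinelength, level + 1)
            for value in obj
        ]
        opener, closer = "[", "]"
    return opener + "\n" + ",\n".join(items) + "\n" + " " * (level * indent) + closer

data = {"grid": {"port": "COM5"}, "curve": [[0, 75], [50, 75], [75, 100]]}
print(compact_json(data))
```

The outer dict is too long for one line, so it is split, while the short inner dict and the list of pairs each stay on a single line.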


Answer 6

To pretty-print from the command line with control over the indentation and so on, you can set up an alias similar to this (print is written as a function call so that it works under both Python 2 and Python 3):

alias jsonpp="python -c 'import sys, json; print(json.dumps(json.load(sys.stdin), sort_keys=True, indent=2))'"

And then use the alias in one of these ways:

cat myfile.json | jsonpp
jsonpp < myfile.json
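
If the one-liner grows, the same filter could be kept in a small script instead. The sketch below assumes Python 3; the function name jsonpp and its parameters are made up for illustration, and in-memory streams stand in for stdin and stdout so the example runs on its own.

```python
import io
import json
import sys

def jsonpp(infile=sys.stdin, outfile=sys.stdout, indent=2, sort_keys=True):
    """Read one JSON document from infile and write it pretty-printed to outfile."""
    data = json.load(infile)
    json.dump(data, outfile, indent=indent, sort_keys=sort_keys)
    outfile.write("\n")

# Demonstration with in-memory streams instead of real stdin/stdout.
demo_out = io.StringIO()
jsonpp(io.StringIO('{"b": [1, 2], "a": null}'), demo_out)
print(demo_out.getvalue(), end="")
```

In a real script, calling jsonpp() with no arguments would read from standard input and write to standard output, like the alias does.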

Answer 7

Use pprint: https://docs.python.org/3.6/library/pprint.html

import pprint
pprint.pprint(json)

print() compared to pprint.pprint()

print(json)
{'feed': {'title': 'W3Schools Home Page', 'title_detail': {'type': 'text/plain', 'language': None, 'base': '', 'value': 'W3Schools Home Page'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.w3schools.com'}], 'link': 'https://www.w3schools.com', 'subtitle': 'Free web building tutorials', 'subtitle_detail': {'type': 'text/html', 'language': None, 'base': '', 'value': 'Free web building tutorials'}}, 'entries': [], 'bozo': 0, 'encoding': 'utf-8', 'version': 'rss20', 'namespaces': {}}

pprint.pprint(json)
{'bozo': 0,
 'encoding': 'utf-8',
 'entries': [],
 'feed': {'link': 'https://www.w3schools.com',
          'links': [{'href': 'https://www.w3schools.com',
                     'rel': 'alternate',
                     'type': 'text/html'}],
          'subtitle': 'Free web building tutorials',
          'subtitle_detail': {'base': '',
                              'language': None,
                              'type': 'text/html',
                              'value': 'Free web building tutorials'},
          'title': 'W3Schools Home Page',
          'title_detail': {'base': '',
                           'language': None,
                           'type': 'text/plain',
                           'value': 'W3Schools Home Page'}},
 'namespaces': {},
 'version': 'rss20'}
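
Notice that pprint sorted the keys alphabetically in the output above. On Python 3.8 and later, pprint accepts sort_dicts=False to keep the dict's own order. A small sketch with made-up sample data:

```python
import pprint

# Hypothetical sample data; the keys are deliberately not in alphabetical order.
feed = {"version": "rss20", "encoding": "utf-8", "bozo": 0}

# Default behaviour: keys are sorted before printing.
sorted_view = pprint.pformat(feed, width=20)

# Python 3.8+: keep insertion order instead.
ordered_view = pprint.pformat(feed, width=20, sort_dicts=False)

print(sorted_view)
print(ordered_view)
```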

Answer 8

Here’s a simple example of pretty-printing JSON to the console in Python, without requiring the JSON to be on your computer as a local file:

import pprint
import json 
from urllib.request import urlopen # (Only used to get this example)

# Getting a JSON example for this example 
r = urlopen("https://mdn.github.io/fetch-examples/fetch-json/products.json")
text = r.read() 

# To print it
pprint.pprint(json.loads(text))

Answer 9

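import json

def saveJson(data, fileToSave):
    # Open the target path and write the data as indented, key-sorted JSON.
    with open(fileToSave, 'w') as f:
        json.dump(data, f, ensure_ascii=True, indent=4, sort_keys=True)

This saves the pretty-printed JSON to a file; to display it instead, pass the same arguments to json.dumps and print the result.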
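
To see what ends up in the file, here is a self-contained round trip. It is a sketch only: the name save_json, the sample data, and the temporary file are assumptions for illustration. With ensure_ascii=True, non-ASCII characters are written as \u escapes, but the data still loads back unchanged.

```python
import json
import os
import tempfile

def save_json(data, path):
    """Write data to path as indented, key-sorted, ASCII-only JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=True, indent=4, sort_keys=True)

# Hypothetical sample data containing a non-ASCII character.
sample = {"name": "café", "values": [1, 2, 3]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    save_json(sample, path)
    with open(path, encoding="utf-8") as f:
        text = f.read()

restored = json.loads(text)
print(text)
```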


Answer 10

I think it’s better to parse the JSON first, to avoid errors:

import json

def format_response(response):
    try:
        parsed = json.loads(response.text)
    except json.JSONDecodeError:
        # Not valid JSON: return the body unchanged.
        return response.text
    return json.dumps(parsed, ensure_ascii=True, indent=4)
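
A quick way to try this without making a real HTTP request is to pass in any object that has a text attribute. In the sketch below, SimpleNamespace stands in for a response object and both sample bodies are made up.

```python
import json
from types import SimpleNamespace

def format_response(response):
    """Return indented JSON if the body parses, otherwise the raw body."""
    try:
        parsed = json.loads(response.text)
    except json.JSONDecodeError:
        return response.text
    return json.dumps(parsed, ensure_ascii=True, indent=4)

# Stand-ins for real response objects; only the .text attribute is used.
ok = SimpleNamespace(text='{"status": "ok"}')
bad = SimpleNamespace(text="<html>502 Bad Gateway</html>")

print(format_response(ok))
print(format_response(bad))
```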

Answer 11

You could try pprintjson.


Installation

$ pip3 install pprintjson

Usage

Pretty print JSON from a file using the pprintjson CLI.

$ pprintjson "./path/to/file.json"

Pretty print JSON from stdin using the pprintjson CLI.

$ echo '{ "a": 1, "b": "string", "c": true }' | pprintjson

Pretty print JSON from a string using the pprintjson CLI.

$ pprintjson -c '{ "a": 1, "b": "string", "c": true }'

Pretty print JSON from a string with an indent of 1.

$ pprintjson -c '{ "a": 1, "b": "string", "c": true }' -i 1

Pretty print JSON from a string and save output to a file output.json.

$ pprintjson -c '{ "a": 1, "b": "string", "c": true }' -o ./output.json



Answer 12

It’s far from perfect, but it does the job.

data = data.replace(',"',',\n"')

You can improve it by adding indentation and so on, but if you just want a more readable view of the JSON, this is a quick way to get one.
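
One caveat worth knowing: a plain text replacement cannot tell structure from content, so the result is for reading only. The sketch below uses a made-up input in which a string value ends with a comma; the replacement puts a raw newline inside that string and the result no longer parses, while going through the json module stays safe.

```python
import json

# Hypothetical input: the value "hello," ends with a comma, so the text
# contains ," in a place that is not a separator between members.
raw = '{"greeting":"hello,","count":2}'

naive = raw.replace(',"', ',\n"')

try:
    json.loads(naive)
    naive_is_valid = True
except json.JSONDecodeError:
    # The replacement inserted a raw newline inside the string value.
    naive_is_valid = False

# Parsing and re-serializing keeps the data intact.
safe = json.dumps(json.loads(raw), indent=4)

print(naive_is_valid)
print(safe)
```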