Python 实用宝典

Question 1

I am using the standard json module in python 2.6 to serialize a list of floats. However, I’m getting results like this:

>>> import json
>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I want the floats to be formated with only two decimal digits. The output should look like this:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

I have tried defining my own JSON Encoder class:

class MyEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, float):
            return format(obj, '.2f')
        return json.JSONEncoder.encode(self, obj)

This works for a sole float object:

>>> json.dumps(23.67, cls=MyEncoder)
'23.67'

But fails for nested objects:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I don’t want to have external dependencies, so I prefer to stick with the standard json module.

How can I achieve this?

Question 2

Note: This does not work in any recent version of Python.

Unfortunately, I believe you have to do this by monkey-patching (which, to my opinion, indicates a design defect in the standard library json package). E.g., this code:

import json
from json import encoder
encoder.FLOAT_REPR = lambda o: format(o, '.2f')
    
print(json.dumps(23.67))
print(json.dumps([23.67, 23.97, 23.87]))

emits:

23.67
[23.67, 23.97, 23.87]

as you desire. Obviously, there should be an architected way to override FLOAT_REPR so that EVERY representation of a float is under your control if you wish it to be; but unfortunately that’s not how the json package was designed:-(.

Question 3

import simplejson
    
class PrettyFloat(float):
    def __repr__(self):
        return '%.15g' % self
    
def pretty_floats(obj):
    if isinstance(obj, float):
        return PrettyFloat(obj)
    elif isinstance(obj, dict):
        return dict((k, pretty_floats(v)) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        return list(map(pretty_floats, obj))
    return obj
    
print(simplejson.dumps(pretty_floats([23.67, 23.97, 23.87])))

emits

[23.67, 23.97, 23.87]

No monkeypatching necessary.

Question 4

If you’re using Python 2.7, a simple solution is to simply round your floats explicitly to the desired precision.

>>> sys.version
'2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]'
>>> json.dumps(1.0/3.0)
'0.3333333333333333'
>>> json.dumps(round(1.0/3.0, 2))
'0.33'

This works because Python 2.7 made float rounding more consistent. Unfortunately this does not work in Python 2.6:

>>> sys.version
'2.6.6 (r266:84292, Dec 27 2010, 00:02:40) \n[GCC 4.4.5]'
>>> json.dumps(round(1.0/3.0, 2))
'0.33000000000000002'

The solutions mentioned above are workarounds for 2.6, but none are entirely adequate. Monkey patching json.encoder.FLOAT_REPR does not work if your Python runtime uses a C version of the JSON module. The PrettyFloat class in Tom Wuttke’s answer works, but only if %g encoding works globally for your application. The %.15g is a bit magic, it works because float precision is 17 significant digits and %g does not print trailing zeroes.

I spent some time trying to make a PrettyFloat that allowed customization of precision for each number. Ie, a syntax like

>>> json.dumps(PrettyFloat(1.0 / 3.0, 4))
'0.3333'

It’s not easy to get this right. Inheriting from float is awkward. Inheriting from Object and using a JSONEncoder subclass with its own default() method should work, except the json module seems to assume all custom types should be serialized as strings. Ie: you end up with the Javascript string “0.33” in the output, not the number 0.33. There may be a way yet to make this work, but it’s harder than it looks.

Question 5

Really unfortunate that dumps doesn’t allow you to do anything to floats. However loads does. So if you don’t mind the extra CPU load, you could throw it through the encoder/decoder/encoder and get the right result:

>>> json.dumps(json.loads(json.dumps([.333333333333, .432432]), parse_float=lambda x: round(float(x), 3)))
'[0.333, 0.432]'

Question 6

Here’s a solution that worked for me in Python 3 and does not require monkey patching:

import json

def round_floats(o):
    if isinstance(o, float): return round(o, 2)
    if isinstance(o, dict): return {k: round_floats(v) for k, v in o.items()}
    if isinstance(o, (list, tuple)): return [round_floats(x) for x in o]
    return o


json.dumps(round_floats([23.63437, 23.93437, 23.842347]))

Output is:

[23.63, 23.93, 23.84]

It copies the data but with rounded floats.

Question 7

If you’re stuck with Python 2.5 or earlier versions: The monkey-patch trick does not seem to work with the original simplejson module if the C speedups are installed:

$ python
Python 2.5.4 (r254:67916, Jan 20 2009, 11:06:13) 
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import simplejson
>>> simplejson.__version__
'2.0.9'
>>> simplejson._speedups
<module 'simplejson._speedups' from '/home/carlos/.python-eggs/simplejson-2.0.9-py2.5-linux-i686.egg-tmp/simplejson/_speedups.so'>
>>> simplejson.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'
>>> simplejson.encoder.c_make_encoder = None
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'
>>>

Question 8

You can do what you need to do, but it isn’t documented:

>>> import json
>>> json.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

Question 9

Alex Martelli’s solution will work for single threaded apps, but may not work for multi-threaded apps that need to control the number of decimal places per thread. Here is a solution that should work in multi threaded apps:

import threading
from json import encoder

def FLOAT_REPR(f):
    """
    Serialize a float to a string, with a given number of digits
    """
    decimal_places = getattr(encoder.thread_local, 'decimal_places', 0)
    format_str = '%%.%df' % decimal_places
    return format_str % f

encoder.thread_local = threading.local()
encoder.FLOAT_REPR = FLOAT_REPR     

#As an example, call like this:
import json

encoder.thread_local.decimal_places = 1
json.dumps([1.56, 1.54]) #Should result in '[1.6, 1.5]'

You can merely set encoder.thread_local.decimal_places to the number of decimal places you want, and the next call to json.dumps() in that thread will use that number of decimal places

Question 10

If you need to do this in python 2.7 without overriding the global json.encoder.FLOAT_REPR, here’s one way.

import json
import math

class MyEncoder(json.JSONEncoder):
    "JSON encoder that renders floats to two decimal places"

    FLOAT_FRMT = '{0:.2f}'

    def floatstr(self, obj):
        return self.FLOAT_FRMT.format(obj)

    def _iterencode(self, obj, markers=None):
        # stl JSON lame override #1
        new_obj = obj
        if isinstance(obj, float):
            if not math.isnan(obj) and not math.isinf(obj):
                new_obj = self.floatstr(obj)
        return super(MyEncoder, self)._iterencode(new_obj, markers=markers)

    def _iterencode_dict(self, dct, markers=None):
        # stl JSON lame override #2
        new_dct = {}
        for key, value in dct.iteritems():
            if isinstance(key, float):
                if not math.isnan(key) and not math.isinf(key):
                    key = self.floatstr(key)
            new_dct[key] = value
        return super(MyEncoder, self)._iterencode_dict(new_dct, markers=markers)

Then, in python 2.7:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()
>>> enc.encode([23.67, 23.98, 23.87])
'[23.67, 23.98, 23.87]'

In python 2.6, it doesn’t quite work as Matthew Schinckel points out below:

>>> import MyEncoder
>>> enc = MyEncoder()  
>>> enc.encode([23.67, 23.97, 23.87])
'["23.67", "23.97", "23.87"]'

Question 11

Pros:

Works with any JSON encoder, or even python’s repr.
Short(ish), seems to work.

Cons:

Ugly regexp hack, barely tested.

Quadratic complexity.

def fix_floats(json, decimals=2, quote='"'):
    pattern = r'^((?:(?:"(?:\\.|[^\\"])*?")|[^"])*?)(-?\d+\.\d{'+str(decimals)+'}\d+)'
    pattern = re.sub('"', quote, pattern) 
    fmt = "%%.%df" % decimals
    n = 1
    while n:
        json, n = re.subn(pattern, lambda m: m.group(1)+(fmt % float(m.group(2)).rstrip('0')), json)
    return json

Question 12

When importing the standard json module, it is enough to change the default encoder FLOAT_REPR. There isn’t really the need to import or create Encoder instances.

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.2f')

json.dumps([23.67, 23.97, 23.87]) #returns  '[23.67, 23.97, 23.87]'

Sometimes is also very useful to output as json the best representation python can guess with str. This will make sure signifficant digits are not ignored.

import json
json.dumps([23.67, 23.9779, 23.87489])
# output is'[23.670000000000002, 23.977900000000002, 23.874890000000001]'

json.encoder.FLOAT_REPR = str
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.67, 23.9779, 23.87489]'

Question 13

I agree with @Nelson that inheriting from float is awkward, but perhaps a solution that only touches the __repr__ function might be forgiveable. I ended up using the decimal package for this to reformat floats when needed. The upside is that this works in all contexts where repr() is being called, so also when simply printing lists to stdout for example. Also, the precision is runtime configurable, after the data has been created. Downside is of course that your data needs to be converted to this special float class (as unfortunately you cannot seem to monkey patch float.__repr__). For that I provide a brief conversion function.

The code:

import decimal
C = decimal.getcontext()

class decimal_formatted_float(float):
   def __repr__(self):
       s = str(C.create_decimal_from_float(self))
       if '.' in s: s = s.rstrip('0')
       return s

def convert_to_dff(elem):
    try:
        return elem.__class__(map(convert_to_dff, elem))
    except:
        if isinstance(elem, float):
            return decimal_formatted_float(elem)
        else:
            return elem

Usage example:

>>> import json
>>> li = [(1.2345,),(7.890123,4.567,890,890.)]
>>>
>>> decimal.getcontext().prec = 15
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.2345,), (7.890123, 4.567, 890, 890)]
>>> json.dumps(dff_li)
'[[1.2345], [7.890123, 4.567, 890, 890]]'
>>>
>>> decimal.getcontext().prec = 3
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.23,), (7.89, 4.57, 890, 890)]
>>> json.dumps(dff_li)
'[[1.23], [7.89, 4.57, 890, 890]]'

Question 14

Using numpy

If you actually have really long floats you can round them up/down correctly with numpy:

import json 

import numpy as np

data = np.array([23.671234, 23.97432, 23.870123])

json.dumps(np.around(data, decimals=2).tolist())

'[23.67, 23.97, 23.87]'

Question 15

I just released fjson, a small Python library to fix this issue. Install with

pip install fjson

and use just like json, with the addition of the float_format parameter:

import math
import fjson


data = {"a": 1, "b": math.pi}
print(fjson.dumps(data, float_format=".6e", indent=2))

{
  "a": 1,
  "b": 3.141593e+00
}

Python 实用宝典

格式使用标准json模块浮动

问题：格式使用标准json模块浮动

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

使用numpy

Using numpy

回答 13

有趣好用的Python教程