Python Tutorial: UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data - Python实用宝典

How does Unicode work in Python 2? I just don't get it.

Here I download data from a server and parse it as JSON.

    Traceback (most recent call last):
      File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/hubs/poll.py", line 92, in wait
        readers.get(fileno, noop).cb(fileno)
      File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/greenthread.py", line 202, in main
        result = function(*args, **kwargs)
      File "android_suggest.py", line 60, in fetch
        suggestions = suggest(chars)
      File "android_suggest.py", line 28, in suggest
        return [i['s'] for i in json.loads(opener.open('https://market.android.com/suggest/SuggRequest?json=1&query='+s+'&hl=de&gl=DE').read())]
      File "/usr/lib/python2.6/json/__init__.py", line 307, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python2.6/json/decoder.py", line 319, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib/python2.6/json/decoder.py", line 336, in raw_decode
        obj, end = self._scanner.iterscan(s, **kw).next()
      File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
        rval, next_pos = action(m, context)
      File "/usr/lib/python2.6/json/decoder.py", line 217, in JSONArray
        value, end = iterscan(s, idx=end, context=context).next()
      File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
        rval, next_pos = action(m, context)
      File "/usr/lib/python2.6/json/decoder.py", line 183, in JSONObject
        value, end = iterscan(s, idx=end, context=context).next()
      File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
        rval, next_pos = action(m, context)
      File "/usr/lib/python2.6/json/decoder.py", line 155, in JSONString
        return scanstring(match.string, match.end(), encoding, strict)
    UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data

Thank you!!

EDIT: the following string causes the error: '[{"t":"q","s":"abh\xf6ren"}]'. \xf6 should be decoded to ö (abhören).
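To see the failure in isolation, here is a minimal sketch. It assumes the response body is exactly the byte string quoted in the EDIT; the names `raw`, `decode_failed`, and `error_start` are made up for illustration. It calls `bytes.decode` directly rather than going through `json.loads`, because that call behaves the same way on Python 2.6+ and Python 3:

```python
# Assumed response body: the byte 0xF6 is "o with umlaut" in ISO-8859-1,
# but on its own it is not a valid UTF-8 sequence.
raw = b'[{"t":"q","s":"abh\xf6ren"}]'

decode_failed = False
error_start = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_failed = True
    # exc.start is the index of the first byte the codec rejected.
    error_start = exc.start

print(decode_failed)
print(error_start == raw.index(b"\xf6"))
```

The rejected byte is the `\xf6` itself. Within the inner value `abh\xf6ren` it sits at index 3, which is consistent with the "position 3-6" reported in the traceback.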

Answer

The string you are trying to parse as JSON is not encoded in UTF-8. Most likely it is encoded in ISO-8859-1. Try the following:

    # unicode() needs the raw bytes, so read the response body first
    json.loads(unicode(opener.open(...).read(), "ISO-8859-1"))

That will handle any umlauts that might appear in the JSON message.
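The same idea can be sketched without a network call. This assumes the server's body is the ISO-8859-1 byte string from the question; `raw`, `text`, and `suggestions` are illustrative names, not part of the original script. `raw.decode("iso-8859-1")` is used here as the equivalent of `unicode(raw, "ISO-8859-1")`, since it also runs on Python 3:

```python
import json

# Assumed response body, encoded as ISO-8859-1 (0xF6 is "o with umlaut").
raw = b'[{"t":"q","s":"abh\xf6ren"}]'

# Decode the bytes to text with the encoding the server actually used,
# then hand the text (not the raw bytes) to the JSON parser.
text = raw.decode("iso-8859-1")
suggestions = [item["s"] for item in json.loads(text)]

print(suggestions == [u"abh\xf6ren"])
```

Decoding explicitly keeps the encoding decision in one visible place. If the server ever switches to UTF-8, only the codec name passed to `decode` should need to change.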

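You should read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". I hope it clears up some of the questions you have about Unicode.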

​Python实用宝典 (pythondict.com)

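This article was published by Python实用宝典 (author: Python实用宝典), which holds the copyright. The content reflects the author's personal views and does not imply endorsement by Python实用宝典. If you reproduce it, please credit the source.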