Question: SyntaxError: Non-ASCII character '\xa3' in file when function returns '£'

Say I have a function:

def NewFunction():
    return '£'

I want to print some values with a pound sign in front of them, but when I try to run the program I get this error message:

SyntaxError: Non-ASCII character '\xa3' in file 'blah' but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details

Can anyone tell me how to include a pound sign in the value my function returns? I'm using it in a class, and the pound sign appears in the '__str__' method.


Answer 0

I'd recommend reading the PEP that the error points you to. The problem is that Python is reading your source file as ASCII, but the pound sign is not an ASCII character. Try using UTF-8 instead: start by putting # -*- coding: utf-8 -*- at the top of your .py file. To get more advanced, you can also define encodings on a string-by-string basis in your code. However, if you want to put a literal pound sign into your code, you need an encoding that supports it for the entire file.
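As a minimal sketch of this fix, assuming the file really is saved as UTF-8: the declaration goes on the first line, and the pound sign can then appear literally inside `__str__`. The class name `Price` and its `amount` attribute are hypothetical, invented here for illustration. With a plain string literal, Python 2 yields a UTF-8 byte string and Python 3 yields a text string.

```python
# -*- coding: utf-8 -*-
# The declaration above must be on the first or second line of the file (PEP 263).


class Price(object):
    """Hypothetical example class; not taken from the original question."""

    def __init__(self, amount):
        self.amount = amount

    def __str__(self):
        # A literal pound sign is accepted once the file encoding is declared.
        return '£{:.2f}'.format(self.amount)


if __name__ == '__main__':
    print(Price(5))
```

Whether the printed character shows up correctly still depends on the terminal's encoding, which is a separate question from the source file's encoding.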


Answer 1

Adding the following two lines at the top of my .py script worked for me (the first line was necessary):

#!/usr/bin/env python
# -*- coding: utf-8 -*- 

Answer 2

First add the # -*- coding: utf-8 -*- line to the beginning of the file, and then use u'foo' for all your non-ASCII Unicode data:

def NewFunction():
    return u'£'

or use the magic available since Python 2.6 to make it automatic:

from __future__ import unicode_literals

Answer 3

The error message tells you exactly what’s wrong. The Python interpreter needs to know the encoding of the non-ASCII character.

If you want to return U+00A3 then you can say

return u'\u00a3'

which represents this character in pure ASCII by way of a Unicode escape sequence. If you want to return a byte string containing the literal byte 0xA3, that’s

return b'\xa3'

(where in Python 2 the b is implicit; but explicit is better than implicit).
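To make the difference between the two return values concrete, here is a small sketch. It is pure ASCII, so it needs no encoding declaration; the variable names are illustrative only.

```python
pound = u'\u00a3'   # text: the single character U+00A3, written in pure ASCII

latin1_bytes = pound.encode('latin-1')   # one byte, 0xA3
utf8_bytes = pound.encode('utf-8')       # two bytes, 0xC2 0xA3

print(repr(latin1_bytes))
print(repr(utf8_bytes))
```

The same character becomes different byte sequences depending on the encoding, which is why the literal byte 0xA3 and the character U+00A3 are not interchangeable.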

The linked PEP in the error message instructs you exactly how to tell Python “this file is not pure ASCII; here’s the encoding I’m using”. If the encoding is UTF-8, that would be

# coding=utf-8

or the Emacs-compatible

# -*- encoding: utf-8 -*-

If you don’t know which encoding your editor uses to save this file, examine it with something like a hex editor and some googling. The Stack Overflow tag has a tag info page with more information and some troubleshooting tips.
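If no hex editor is at hand, a few lines of Python can do a similar inspection. This is only a sketch: `find_non_ascii` and `_demo` are hypothetical helper names, and the two sample files are generated on the fly to stand in for a file saved by an editor.

```python
import os
import tempfile


def find_non_ascii(path):
    """Return (line_number, hex_string) pairs for lines holding bytes above 0x7F."""
    hits = []
    with open(path, 'rb') as handle:   # binary mode: no decoding is attempted
        for number, line in enumerate(handle, start=1):
            high = [b for b in bytearray(line) if b > 0x7F]
            if high:
                hits.append((number, ' '.join('%02x' % b for b in high)))
    return hits


def _demo(raw):
    """Write raw bytes to a temporary .py file and inspect it."""
    fd, path = tempfile.mkstemp(suffix='.py')
    try:
        with os.fdopen(fd, 'wb') as handle:
            handle.write(raw)
        return find_non_ascii(path)
    finally:
        os.remove(path)


# The same source line as saved by a Latin-1 editor and by a UTF-8 editor.
latin1_hits = _demo(b"def NewFunction():\n    return '\xa3'\n")
utf8_hits = _demo(b"def NewFunction():\n    return '\xc2\xa3'\n")
print(latin1_hits)
print(utf8_hits)
```

A lone `a3` suggests a single-byte encoding such as Latin-1, while `c2 a3` is how UTF-8 encodes U+00A3. The bytes alone are only a hint, not proof of the encoding.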

In so many words, outside of the 7-bit ASCII range (0x00-0x7F), Python can’t and mustn’t guess what string a sequence of bytes represents. https://tripleee.github.io/8bit#a3 shows 21 possible interpretations for the byte 0xA3 and that’s only from the legacy 8-bit encodings; but it could also very well be the first byte of a multi-byte encoding. But in fact, I would guess you are actually using Latin-1, so you should have

# coding: latin-1

as the first or second line of your source file. Anyway, without knowledge of which character the byte is supposed to represent, a human would not be able to guess this, either.

A caveat: coding: latin-1 will definitely remove the error message (because there are no byte sequences which are not technically permitted in this encoding), but might produce completely the wrong result when the code is interpreted if the actual encoding is something else. You really have to know the encoding of the file with complete certainty when you declare the encoding.
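The caveat can be illustrated with a short sketch that decodes the same byte under a few codecs from the standard library. The choice of codecs is arbitrary and only meant to show the ambiguity.

```python
raw = b'\xa3'

as_latin1 = raw.decode('latin-1')     # U+00A3 POUND SIGN
as_latin2 = raw.decode('iso8859-2')   # U+0141 LATIN CAPITAL LETTER L WITH STROKE
as_cp437 = raw.decode('cp437')        # U+00FA LATIN SMALL LETTER U WITH ACUTE

# A lone 0xA3 is not valid UTF-8, so this decode is rejected.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 maps every one of the 256 byte values to a character, so decoding
# never fails, even when the result is not what the author meant.
all_bytes = bytes(bytearray(range(256)))
decoded_all = all_bytes.decode('latin-1')
```

Because Latin-1 accepts anything, a missing error message tells you nothing about whether the declared encoding is the right one.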


Answer 4

Adding the following two lines at the top of the script solved the issue for me.

#!/usr/bin/python
# coding=utf-8

Hope it helps!


Answer 5

You're probably trying to run a Python 3 file with a Python 2 interpreter. Currently (as of 2019), the python command defaults to Python 2 when both versions are installed, on Windows and most Linux distributions.

But if you are indeed working on a Python 2 script, a solution not yet mentioned on this page is to resave the file as UTF-8 with a BOM. This adds three special bytes to the start of the file, which explicitly tell the Python interpreter (and your text editor) what the file's encoding is.
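The three bytes in question are available as `codecs.BOM_UTF8`. The sketch below uses Python 3's `tokenize.detect_encoding` to show that a BOM and a coding comment are both recognized as encoding markers. This is an illustration run under Python 3, not a reproduction of the Python 2 interpreter's own logic, and `detected_encoding` is a hypothetical helper name.

```python
import codecs
import io
import tokenize


def detected_encoding(source_bytes):
    """Return the source encoding that Python 3's tokenizer detects."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# The function from the question, as UTF-8 bytes with no declaration.
body = b"def NewFunction():\n    return '\xc2\xa3'\n"

with_bom = detected_encoding(codecs.BOM_UTF8 + body)
with_comment = detected_encoding(b"# -*- coding: utf-8 -*-\n" + body)

print(repr(codecs.BOM_UTF8))
print(with_bom, with_comment)
```

Most editors offer a "UTF-8 with BOM" or "UTF-8 signature" save option that writes these bytes for you.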

