What does the b prefix before a Python string mean?

Question: What does the b prefix before a Python string mean?

In some Python source code I came across, I saw a small b before a string, as in:

b"abcdef"

I know about the u prefix, which signifies a Unicode string, and the r prefix for a raw string literal.

What does the b stand for, and in which kind of source code is it useful? It seems to behave exactly like a plain string without any prefix.


Answer 0

This is a Python 3 bytes literal. The prefix does not exist in Python 2.5 and older. In Python 2.6+ it is accepted for compatibility with 3.x and is equivalent to a plain 2.x string. (A plain string in 3.x, by contrast, is equivalent to a literal with the u prefix in 2.x.)


Answer 1

The b prefix signifies a bytes string literal.

If you see it used in Python 3 source code, the expression creates a bytes object, not a regular Unicode str object. If you see it echoed in your Python shell or as part of a list, dict or other container contents, then you see a bytes object represented using this notation.
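As a small illustration of the point above (a sketch for Python 3 only; the variable names are made up for this example), the prefix changes the type of the object, and that is also how the object is echoed inside a container:

```python
literal = b"abcdef"   # bytes literal
plain = "abcdef"      # ordinary (Unicode) str literal

kind = type(literal)                      # the bytes type, not str
container_repr = repr([literal, plain])   # how a list echoes both objects
equal = literal == plain                  # bytes and str never compare equal in Python 3

print(kind, container_repr, equal)
```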

bytes objects basically contain a sequence of integers in the range 0-255, but when represented, Python displays these bytes as ASCII characters to make their contents easier to read. Any bytes outside the printable range of ASCII characters are shown as escape sequences (e.g. \n, \x82, etc.). Inversely, you can use both ASCII characters and escape sequences to define byte values; for ASCII characters their numeric value is used (e.g. b'A' == b'\x41').
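A short sketch of that representation rule, using an arbitrary four-byte value chosen for this example:

```python
# Two printable ASCII bytes (72, 105), a newline (10) and a non-ASCII byte (130).
data = bytes([72, 105, 10, 130])

# Printable bytes are shown as characters, the others as escape sequences.
shown = repr(data)
print(shown)

# An ASCII character and its hex escape define the same byte value.
same = b'A' == b'\x41'
print(same)
```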

Because a bytes object consists of a sequence of integers, you can construct a bytes object from any other sequence of integers with values in the 0-255 range, like a list:

bytes([72, 101, 108, 108, 111])

Indexing gives you back the integers, while slicing produces a new bytes value. If the object above is bound to the name value, value[0] gives you 72, but value[:1] is b'H', as 72 is the ASCII code point for the capital letter H.
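The same example as a runnable sketch, assuming the object is bound to the name value; the out-of-range check at the end is an addition for illustration:

```python
value = bytes([72, 101, 108, 108, 111])   # the bytes for "Hello"

first_int = value[0]      # indexing returns an int
first_slice = value[:1]   # slicing returns a new bytes object
as_list = list(value)     # back to a plain list of integers

# Integers outside the 0-255 range are rejected.
try:
    bytes([256])
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

print(first_int, first_slice, as_list, out_of_range_rejected)
```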

bytes model binary data, including encoded text. If your bytes value does contain text, you need to first decode it, using the correct codec. If the data is encoded as UTF-8, for example, you can obtain a Unicode str value with:

strvalue = bytesvalue.decode('utf-8')

Conversely, to go from text in a str object to bytes you need to encode. You need to decide on an encoding to use; the default is to use UTF-8, but what you will need is highly dependent on your use case:

bytesvalue = strvalue.encode('utf-8')

You can also use the constructor, bytes(strvalue, encoding), to do the same.
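To see how the choice of encoding matters, here is a hedged round-trip sketch; the sample text and the comparison with Latin-1 are assumptions added for illustration:

```python
strvalue = "caf\u00e9"   # "café"; the last character is not ASCII

# Encoding turns text into bytes; UTF-8 uses two bytes for the accented letter.
bytesvalue = strvalue.encode('utf-8')

# The constructor form produces the same bytes.
same_bytes = bytes(strvalue, 'utf-8')

# Decoding with the same codec restores the original text.
roundtrip = bytesvalue.decode('utf-8')

# A different codec gives different bytes for the same text.
latin = strvalue.encode('latin-1')

print(bytesvalue, same_bytes, roundtrip, latin)
```

The text has four characters but the UTF-8 result has five bytes, which is why the codec used for decoding has to match the one used for encoding.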

Both the decoding and encoding methods take an extra argument to specify how errors should be handled.
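A minimal sketch of that extra argument (named errors in Python 3), using deliberately invalid input picked for this example:

```python
raw = b'caf\xe9'   # Latin-1 bytes; not valid UTF-8

# The default handler, 'strict', raises on invalid input.
try:
    raw.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the bad byte; 'ignore' drops it.
replaced = raw.decode('utf-8', errors='replace')
ignored = raw.decode('utf-8', errors='ignore')

# Encoding takes the same argument; here 'replace' substitutes '?'.
ascii_bytes = 'caf\u00e9'.encode('ascii', errors='replace')

print(strict_failed, replaced, ignored, ascii_bytes)
```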

Python 2 versions 2.6 and 2.7 also accept the b'..' literal syntax for creating string literals, to make it easier to write code that works on both Python 2 and 3.

bytes objects are immutable, just like str strings are. Use a bytearray() object if you need to have a mutable bytes value.
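A brief sketch of the difference, with example values chosen for illustration:

```python
frozen = b'hello'

# Item assignment on a bytes object fails because bytes is immutable.
try:
    frozen[0] = 72
    assignment_failed = False
except TypeError:
    assignment_failed = True

# A bytearray holds the same data but can be changed in place.
buf = bytearray(frozen)
buf[0] = 72          # replace 'h' (104) with 'H' (72)
buf.extend(b'!')     # grow the buffer

# Convert back to an immutable bytes object when done.
result = bytes(buf)

print(assignment_failed, buf, result)
```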