TypeError:’str’不支持缓冲区接口

问题:TypeError:’str’不支持缓冲区接口

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(plaintext) 

上面的python代码给了我以下错误:

Traceback (most recent call last):
  File "C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py", line 33, in <module>
    compress_string()
  File "C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py", line 15, in compress_string
    outfile.write(plaintext)
  File "C:\Python32\lib\gzip.py", line 312, in write
    self.crc = zlib.crc32(data, self.crc) & 0xffffffff
TypeError: 'str' does not support the buffer interface
plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(plaintext) 

The above python code is giving me following error:

Traceback (most recent call last):
  File "C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py", line 33, in <module>
    compress_string()
  File "C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py", line 15, in compress_string
    outfile.write(plaintext)
  File "C:\Python32\lib\gzip.py", line 312, in write
    self.crc = zlib.crc32(data, self.crc) & 0xffffffff
TypeError: 'str' does not support the buffer interface

回答 0

如果您使用的是Python3x,则string与Python 2.x的类型不同,则必须将其转换为字节(对其进行编码)。

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))

也不要使用像string或那样的变量file名作为模块或函数的名称。

编辑@汤姆

是的,非ASCII文本也会被压缩/解压缩。我使用UTF-8编码的波兰字母:

plaintext = 'Polish text: ąćęłńóśźżĄĆĘŁŃÓŚŹŻ'
filename = 'foo.gz'
with gzip.open(filename, 'wb') as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))
with gzip.open(filename, 'r') as infile:
    outfile_content = infile.read().decode('UTF-8')
print(outfile_content)

If you use Python3x then string is not the same type as for Python 2.x, you must cast it to bytes (encode it).

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))

Also do not use variable names like string or file while those are names of module or function.

EDIT @Tom

Yes, non-ASCII text is also compressed/decompressed. I use Polish letters with UTF-8 encoding:

plaintext = 'Polish text: ąćęłńóśźżĄĆĘŁŃÓŚŹŻ'
filename = 'foo.gz'
with gzip.open(filename, 'wb') as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))
with gzip.open(filename, 'r') as infile:
    outfile_content = infile.read().decode('UTF-8')
print(outfile_content)

回答 1

有一个更容易解决此问题的方法。

您只需要向t模式添加a 即可wt。这会导致Python将文件打开为文本文件,而不是二进制文件。然后一切都会正常。

完整的程序变为:

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wt") as outfile:
    outfile.write(plaintext)

There is an easier solution to this problem.

You just need to add a t to the mode so it becomes wt. This causes Python to open the file as a text file and not binary. Then everything will just work.

The complete program becomes this:

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wt") as outfile:
    outfile.write(plaintext)

回答 2

您不能将Python 3的“字符串”序列化为字节,而无需显式转换为某些编码。

outfile.write(plaintext.encode('utf-8'))

可能就是您想要的。同样适用于python 2.x和3.x。

You can not serialize a Python 3 ‘string’ to bytes without explict conversion to some encoding.

outfile.write(plaintext.encode('utf-8'))

is possibly what you want. Also this works for both python 2.x and 3.x.


回答 3

对于Python 3.x,您可以通过以下方式将文本转换为原始字节:

bytes("my data", "encoding")

例如:

bytes("attack at dawn", "utf-8")

返回的对象将与一起使用outfile.write

For Python 3.x you can convert your text to raw bytes through:

bytes("my data", "encoding")

For example:

bytes("attack at dawn", "utf-8")

The object returned will work with outfile.write.


回答 4

从py2切换到py3时,通常会出现此问题。在py2 plaintext中既是字符串也是字节数组类型。在py3 plaintext中只有一个字符串,并且在二进制模式下打开时,该方法outfile.write()实际上采用字节数组outfile,因此会引发异常。更改输入以plaintext.encode('utf-8')解决问题。继续阅读,如果这困扰您。

在py2中,file.write的声明使其看起来像您传入了一个字符串:file.write(str)。实际上,您正在传入一个字节数组,您应该已经读过这样的声明:file.write(bytes)。如果您这样阅读,问题很简单,file.write(bytes)需要一个字节类型,并且在py3中要从str中获取字节,您可以将其转换:

py3>> outfile.write(plaintext.encode('utf-8'))

为何py2 docs声明file.write使用字符串?在py2中,声明区别并不重要,因为:

py2>> str==bytes         #str and bytes aliased a single hybrid class in py2
True

py2 的str-bytes类具有一些方法/构造函数,这些方法/构造函数使其在某些方面类似于字符串类,在某些方面类似于字节数组类。方便file.write吗?

py2>> plaintext='my string literal'
py2>> type(plaintext)
str                              #is it a string or is it a byte array? it's both!

py2>> outfile.write(plaintext)   #can use plaintext as a byte array

为什么py3破坏了这个不错的系统?好吧,因为在py2中,基本字符串函数不适用于世界其他地方。测量具有非ASCII字符的单词的长度?

py2>> len('¡no')        #length of string=3, length of UTF-8 byte array=4, since with variable len encoding the non-ASCII chars = 2-6 bytes
4                       #always gives bytes.len not str.len

一直以来,您一直以为在py2 中请求字符串的len,所以您一直在从编码中获取字节数组的长度。这种含糊不清是双重责任阶层的根本问题。您实现哪个版本的方法调用?

好消息是py3可以解决此问题。它解开了strbytes类。的STR类有绳状的方法中,单独的字节类具有字节阵列方法:

py3>> len('¡ok')       #string
3
py3>> len('¡ok'.encode('utf-8'))     #bytes
4

希望知道这一点有助于揭开问题的神秘面纱,并使迁移的痛苦更容易承担。

This problem commonly occurs when switching from py2 to py3. In py2 plaintext is both a string and a byte array type. In py3 plaintext is only a string, and the method outfile.write() actually takes a byte array when outfile is opened in binary mode, so an exception is raised. Change the input to plaintext.encode('utf-8') to fix the problem. Read on if this bothers you.

In py2, the declaration for file.write made it seem like you passed in a string: file.write(str). Actually you were passing in a byte array, you should have been reading the declaration like this: file.write(bytes). If you read it like this the problem is simple, file.write(bytes) needs a bytes type and in py3 to get bytes out of a str you convert it:

py3>> outfile.write(plaintext.encode('utf-8'))

Why did the py2 docs declare file.write took a string? Well in py2 the declaration distinction didn’t matter because:

py2>> str==bytes         #str and bytes aliased a single hybrid class in py2
True

The str-bytes class of py2 has methods/constructors that make it behave like a string class in some ways and a byte array class in others. Convenient for file.write isn’t it?:

py2>> plaintext='my string literal'
py2>> type(plaintext)
str                              #is it a string or is it a byte array? it's both!

py2>> outfile.write(plaintext)   #can use plaintext as a byte array

Why did py3 break this nice system? Well because in py2 basic string functions didn’t work for the rest of the world. Measure the length of a word with a non-ASCII character?

py2>> len('¡no')        #length of string=3, length of UTF-8 byte array=4, since with variable len encoding the non-ASCII chars = 2-6 bytes
4                       #always gives bytes.len not str.len

All this time you thought you were asking for the len of a string in py2, you were getting the length of the byte array from the encoding. That ambiguity is the fundamental problem with double-duty classes. Which version of any method call do you implement?

The good news then is that py3 fixes this problem. It disentangles the str and bytes classes. The str class has string-like methods, the separate bytes class has byte array methods:

py3>> len('¡ok')       #string
3
py3>> len('¡ok'.encode('utf-8'))     #bytes
4

Hopefully knowing this helps de-mystify the issue, and makes the migration pain a little easier to bear.


回答 5

>>> s = bytes("s","utf-8")
>>> print(s)
b's'
>>> s = s.decode("utf-8")
>>> print(s)
s

好吧,如果对消除烦人的’b’字符有用,如果有人有更好的主意,请建议我或随时在这里随时编辑我。我只是新手

>>> s = bytes("s","utf-8")
>>> print(s)
b's'
>>> s = s.decode("utf-8")
>>> print(s)
s

Well if useful for you in case removing annoying ‘b’ character.If anyone got better idea please suggest me or feel free to edit me anytime in here.I’m just newbie


回答 6

为了Djangodjango.test.TestCase单元测试,我改变了我的Python2语法:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content)
    ...

要使用Python3 .decode('utf8')语法:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content.decode('utf8'))
    ...

For Django in django.test.TestCase unit testing, I changed my Python2 syntax:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content)
    ...

To use the Python3 .decode('utf8') syntax:

def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content.decode('utf8'))
    ...