StringIO in Python 3

Question: StringIO in Python 3

I am using Python 3.2.1 and I can't import the StringIO module. I use io.StringIO and it works, but I can't use it with numpy's genfromtxt like this:

import io
import numpy

x = "1 3\n 4.5 8"
numpy.genfromtxt(io.StringIO(x))

I get the following error:

TypeError: Can't convert 'bytes' object to str implicitly  

and when I write import StringIO it says

ImportError: No module named 'StringIO'

Answer 0

When I write import StringIO it says there is no such module.

From What’s New In Python 3.0:

The StringIO and cStringIO modules are gone. Instead, import the io module and use io.StringIO or io.BytesIO for text and data respectively.
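
As a minimal sketch of that split (Python 3, standard library only; the variable names are illustrative and not part of the original answer), io.StringIO holds str while io.BytesIO holds bytes:

```python
import io

# Text: io.StringIO stores and returns str.
text_stream = io.StringIO("1 3\n 4.5 8")
first_text_line = text_stream.readline()

# Data: io.BytesIO stores and returns bytes.
byte_stream = io.BytesIO(b"1 3\n 4.5 8")
first_byte_line = byte_stream.readline()

print(type(first_text_line).__name__, repr(first_text_line))
print(type(first_byte_line).__name__, repr(first_byte_line))
```

Both streams contain the same characters here; only the type of what comes back from readline() differs.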

A possibly useful method of fixing some Python 2 code to also work in Python 3 (caveat emptor):

try:
    from StringIO import StringIO ## for Python 2
except ImportError:
    from io import StringIO ## for Python 3

Note: This example may be tangential to the main issue of the question and is included only as something to consider when generically addressing the missing StringIO module. For a more direct solution to the message TypeError: Can't convert 'bytes' object to str implicitly, see this answer.


Answer 1

In my case I have used:

from io import StringIO

Answer 2

On Python 3 numpy.genfromtxt expects a bytes stream. Use the following:

numpy.genfromtxt(io.BytesIO(x.encode()))

Answer 3

Thank you OP for your question, and Roman for your answer. I had to search a bit to find this; I hope the following helps others.

Python 2.7:

See: https://docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html

import numpy as np
from StringIO import StringIO

data = "1, abc , 2\n 3, xxx, 4"

print type(data)
"""
<type 'str'>
"""

print '\n', np.genfromtxt(StringIO(data), delimiter=",", dtype="|S3", autostrip=True)
"""
[['1' 'abc' '2']
 ['3' 'xxx' '4']]
"""

print '\n', type(data)
"""
<type 'str'>
"""

print '\n', np.genfromtxt(StringIO(data), delimiter=",", autostrip=True)
"""
[[  1.  nan   2.]
 [  3.  nan   4.]]
"""

Python 3.5:

import numpy as np
from io import StringIO
import io

data = "1, abc , 2\n 3, xxx, 4"
#print(data)
"""
1, abc , 2
 3, xxx, 4
"""

#print(type(data))
"""
<class 'str'>
"""

#np.genfromtxt(StringIO(data), delimiter=",", autostrip=True)
# TypeError: Can't convert 'bytes' object to str implicitly

print('\n')
print(np.genfromtxt(io.BytesIO(data.encode()), delimiter=",", dtype="|S3", autostrip=True))
"""
[[b'1' b'abc' b'2']
 [b'3' b'xxx' b'4']]
"""

print('\n')
print(np.genfromtxt(io.BytesIO(data.encode()), delimiter=",", autostrip=True))
"""
[[  1.  nan   2.]
 [  3.  nan   4.]]
"""

Aside:

dtype="|Sx", where x is any of {1, 2, 3, …}:

dtypes. Difference between S1 and S2 in Python

“The |S1 and |S2 strings are data type descriptors; the first means the array holds strings of length 1, the second of length 2. …”
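
To make the quoted description concrete, here is a small sketch, assuming NumPy is installed; the sample values are made up for illustration. It builds an array with the fixed-width byte-string dtype "S2" and inspects what is stored:

```python
import numpy as np

# "S2" (equivalently "|S2") is a fixed-width byte-string dtype:
# every element occupies exactly 2 bytes.
arr = np.array([b"abc", b"de", b"f"], dtype="S2")

print(arr.dtype.itemsize)  # bytes per element
print(arr.tolist())        # values longer than 2 bytes are truncated
```

This is why the examples above pass dtype="|S3" for data such as "abc" and "xxx": a narrower width would cut those values short.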


Answer 4

You can use the StringIO from the six module:

import six
import numpy

x = "1 3\n 4.5 8"
numpy.genfromtxt(six.StringIO(x))

Answer 5

Roman Shapovalov’s code should work in Python 3.x as well as Python 2.6/2.7. Here it is again with the complete example:

import io
import numpy
x = "1 3\n 4.5 8"
numpy.genfromtxt(io.BytesIO(x.encode()))

Output:

array([[ 1. ,  3. ],
       [ 4.5,  8. ]])

Explanation for Python 3.x:

  • numpy.genfromtxt takes a byte stream (a file-like object interpreted as bytes instead of Unicode).
  • io.BytesIO takes a byte string and returns a byte stream. io.StringIO, on the other hand, would take a Unicode string and return a Unicode stream.
  • x gets assigned a string literal, which in Python 3.x is a Unicode string.
  • encode() takes the Unicode string x and makes a byte string out of it, thus giving io.BytesIO a valid argument.
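
The steps above can be traced with the standard library alone. In this sketch the hand-written parsing loop is only a stand-in for illustration; it is not how numpy.genfromtxt is implemented, and the names encoded, stream, and rows are assumptions introduced here:

```python
import io

x = "1 3\n 4.5 8"             # str (Unicode) in Python 3
encoded = x.encode()          # bytes; UTF-8 is the default encoding in Python 3
stream = io.BytesIO(encoded)  # byte stream wrapping those bytes

# Iterating a BytesIO yields one bytes object per line.
# Decode each line, split on whitespace, and convert the fields to floats.
rows = [[float(field) for field in line.decode().split()] for line in stream]

print(type(x).__name__, type(encoded).__name__)
print(rows)
```

The values recovered this way match the array shown in the output above.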

The only difference for Python 2.6/2.7 is that x is a byte string (assuming from __future__ import unicode_literals is not used), and then encode() takes the byte string x and still makes the same byte string out of it. So the result is the same.


Since this is one of SO’s most popular questions regarding StringIO, here’s some more explanation on the import statements and different Python versions.

Here are the classes which take a string and return a stream:

  • io.BytesIO (Python 2.6, 2.7, and 3.x) – Takes a byte string. Returns a byte stream.
  • io.StringIO (Python 2.6, 2.7, and 3.x) – Takes a Unicode string. Returns a Unicode stream.
  • StringIO.StringIO (Python 2.x) – Takes a byte string or Unicode string. If byte string, returns a byte stream. If Unicode string, returns a Unicode stream.
  • cStringIO.StringIO (Python 2.x) – Faster version of StringIO.StringIO, but can’t take Unicode strings which contain non-ASCII characters.
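
For the two io classes, the input types listed above are enforced in Python 3. The following sketch checks this; accepts is a hypothetical helper written for this illustration, not a library function:

```python
import io

def accepts(stream_class, value):
    """Return True if stream_class(value) succeeds, False if it raises TypeError."""
    try:
        stream_class(value)
        return True
    except TypeError:
        return False

results = {
    "StringIO(str)": accepts(io.StringIO, "text"),
    "StringIO(bytes)": accepts(io.StringIO, b"bytes"),
    "BytesIO(bytes)": accepts(io.BytesIO, b"bytes"),
    "BytesIO(str)": accepts(io.BytesIO, "text"),
}
for call, ok in results.items():
    print(call, "accepted" if ok else "rejected with TypeError")
```

Passing the wrong string type fails immediately at construction time instead of surfacing later as a confusing error.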

Note that StringIO.StringIO is imported as from StringIO import StringIO, then used as StringIO(...). Either that, or you do import StringIO and then use StringIO.StringIO(...). The module name and class name just happen to be the same. It’s similar to datetime that way.
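
The datetime parallel can be run on Python 3, where the StringIO module itself no longer exists. This sketch shows the two import styles for datetime, with the corresponding Python 2 StringIO forms given only as comments:

```python
import datetime                # the module
from datetime import datetime as datetime_class  # the class inside the module

# Style 1: import the module, then qualify the class with the module name.
# Python 2 equivalent: import StringIO; StringIO.StringIO(...)
via_module = datetime.datetime(2020, 1, 2)

# Style 2: import the class directly and call it by its own name.
# Python 2 equivalent: from StringIO import StringIO; StringIO(...)
via_class = datetime_class(2020, 1, 2)

print(via_module == via_class)
print(datetime.datetime is datetime_class)
```

The alias datetime_class is used here only so both styles can coexist in one file; in real code you would normally pick one style.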

What to use, depending on your supported Python versions:

  • If you only support Python 3.x: Just use io.BytesIO or io.StringIO depending on what kind of data you’re working with.

  • If you support both Python 2.6/2.7 and 3.x, or are trying to transition your code from 2.6/2.7 to 3.x: The easiest option is still to use io.BytesIO or io.StringIO. Although StringIO.StringIO is flexible and thus seems preferred for 2.6/2.7, that flexibility could mask bugs that will manifest in 3.x. For example, I had some code which used StringIO.StringIO or io.StringIO depending on Python version, but I was actually passing a byte string, so when I got around to testing it in Python 3.x it failed and had to be fixed.

    Another advantage of using io.StringIO is the support for universal newlines. If you pass the keyword argument newline='' into io.StringIO, it will be able to split lines on any of \n, \r\n, or \r. I found that StringIO.StringIO would trip up on \r in particular.

    Note that if you import BytesIO or StringIO from six, you get StringIO.StringIO in Python 2.x and the appropriate class from io in Python 3.x. If you agree with my previous paragraphs’ assessment, this is actually one case where you should avoid six and just import from io instead.

  • If you support Python 2.5 or lower and 3.x: You’ll need StringIO.StringIO for 2.5 or lower, so you might as well use six. But realize that it’s generally very difficult to support both 2.5 and 3.x, so you should consider bumping your lowest supported version to 2.6 if at all possible.
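
The newline='' behaviour mentioned in the list above can be checked with a short sketch (Python 3, standard library only; the sample text is made up). It compares the default io.StringIO, which splits lines only on \n, with one created using newline='':

```python
import io

text = "a\rb\r\nc\n"  # mixes all three line-ending styles

# Default (newline='\n'): only \n ends a line.
default_lines = io.StringIO(text).readlines()

# newline='': \r, \r\n, and \n all end a line, and are returned untranslated.
universal_lines = io.StringIO(text, newline="").readlines()

print(default_lines)
print(universal_lines)
```

With the default setting the lone \r stays embedded in the first line, while newline='' yields three separate lines.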


Answer 6

To make the examples from here work with Python 3.5.2, you can rewrite them as follows:

import io
import numpy

data = io.BytesIO(b"1, 2, 3\n4, 5, 6")
numpy.genfromtxt(data, delimiter=",")

The reason for the change may be that a file's content is raw data (bytes), which does not become text until it is decoded. genfrombytes might be a better name than genfromtxt.
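
As a sketch of that decoding step (standard library only; the UTF-8 encoding and the variable names are assumptions chosen for illustration), io.TextIOWrapper turns a byte stream into a text stream:

```python
import io

raw = io.BytesIO(b"1, 2, 3\n4, 5, 6")              # bytes, as a file on disk would hold
text = io.TextIOWrapper(raw, encoding="utf-8")     # decodes the bytes on read
lines = text.read().splitlines()

print(lines)
```

The bytes only become str once an encoding has been chosen, which is the distinction the answer is pointing at.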


Answer 7

Try this (Python 2 only, since the StringIO module does not exist in Python 3):

from StringIO import StringIO
import numpy

x = "1 3\n 4.5 8"
numpy.genfromtxt(StringIO(x))