Tag archive: string

How to print like printf in Python 3?

Question: How to print like printf in Python 3?

In Python 2 I used:

print "a=%d,b=%d" % (f(x,n),g(x,n))

I’ve tried:

print("a=%d,b=%d") % (f(x,n),g(x,n))

Answer 0

In Python2, print was a keyword which introduced a statement:

print "Hi"

In Python3, print is a function which may be invoked:

print ("Hi")

In both versions, % is an operator which requires a string on the left-hand side and a value or a tuple of values or a mapping object (like dict) on the right-hand side.

So, your line ought to look like this:

print("a=%d,b=%d" % (f(x,n),g(x,n)))

Also, the recommendation for Python3 and newer is to use {}-style formatting instead of %-style formatting:

print('a={:d}, b={:d}'.format(f(x,n),g(x,n)))

Python 3.6 introduces yet another string-formatting paradigm: f-strings.

print(f'a={f(x,n):d}, b={g(x,n):d}')
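
As a rough side-by-side of the three styles mentioned above, the sketch below formats the same two integers with the % operator, str.format and an f-string. The values a and b are placeholders standing in for f(x,n) and g(x,n), which the question does not define.

```python
# Placeholder values standing in for f(x, n) and g(x, n) from the question.
a, b = 3, 4

percent_style = "a=%d,b=%d" % (a, b)          # printf-style % operator
format_style = "a={:d},b={:d}".format(a, b)   # str.format method
fstring_style = f"a={a:d},b={b:d}"            # f-string, Python 3.6+

# In every case the string is built first, then passed to print() as one argument.
print(percent_style)
print(format_style)
print(fstring_style)
```

All three lines print the same text; the choice between them is mostly a matter of Python version and taste.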

Answer 1

The most commonly recommended way is to use the format method (see the documentation for str.format):

a, b = 1, 2

print("a={0},b={1}".format(a, b))

Answer 2

Simple printf() function from O’Reilly’s Python Cookbook.

import sys
def printf(format, *args):
    sys.stdout.write(format % args)

Example output:

i = 7
pi = 3.14159265359
printf("hi there, i=%d, pi=%.2f\n", i, pi)
# hi there, i=7, pi=3.14

Answer 3

Python 3.6 introduced f-strings for inline interpolation. What’s even nicer is it extended the syntax to also allow format specifiers with interpolation. Something I’ve been working on while I googled this (and came across this old question!):

print(f'{account:40s} ({ratio:3.2f}) -> AUD {splitAmount}')

PEP 498 has the details. It also settled my pet peeve with format specifiers in other languages: it allows specifiers that are themselves expressions. Yay! See: Format Specifiers.
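
To illustrate the point about specifiers that are themselves expressions, here is a minimal sketch. The variable names and values are hypothetical and only loosely mirror the example above.

```python
# Hypothetical values; only the nesting of the format spec matters here.
account = "Savings"
ratio = 0.256
width, precision = 12, 2

# The width and precision inside the spec are replacement fields themselves.
line = f"{account:{width}s}|{ratio:{width}.{precision}f}|"
print(line)
```

Because width and precision are ordinary variables, the column layout can be computed at run time instead of being hard-coded in the format string.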


Answer 4

Simple Example:

print("foo %d, bar %d" % (1,2))


Answer 5

A simpler one.

def printf(format, *values):
    print(format % values )

Then:

printf("Hello, this is my name %s and my age %d", "Martin", 20)

Answer 6

Because your % is outside the print(...) parentheses, you’re trying to insert your variables into the result of your print call. print(...) returns None, so this won’t work, and there’s also the small matter of you already having printed your template by this time and time travel being prohibited by the laws of the universe we inhabit.

The whole thing you want to print, including the % and its operand, needs to be inside your print(...) call, so that the string can be built before it is printed.

print( "a=%d,b=%d" % (f(x,n), g(x,n)) )

I have added a few extra spaces to make it clearer (though they are not necessary and generally not considered good style).


Answer 7

In other words, printf is absent from Python… I'm surprised! The best code is:

import sys

def printf(format, *args):
    sys.stdout.write(format % args)

This form lets you avoid printing a trailing \n; the others do not. That is why print is a poor fit here, and it also needs its arguments written in a special form. The function above has no such disadvantages. It is the standard, usual form of a printf function.


Answer 8

print("Name={}, balance={}".format(var_name, var_balance))

Printing without a newline (print 'a',) prints a space. How do I remove it?

Question: Printing without a newline (print 'a',) prints a space. How do I remove it?

I have this code:

>>> for i in xrange(20):
...     print 'a',
... 
a a a a a a a a a a a a a a a a a a a a

I want to output 'a', without ' ' like this:

aaaaaaaaaaaaaaaaaaaa

Is it possible?


Answer 0

There are a number of ways of achieving your result. If you’re just wanting a solution for your case, use string multiplication as @Ant mentions. This is only going to work if each of your print statements prints the same string. Note that it works for multiplication of any length string (e.g. 'foo' * 20 works).

>>> print 'a' * 20
aaaaaaaaaaaaaaaaaaaa

If you want to do this in general, build up a string and then print it once. This will consume a bit of memory for the string, but only make a single call to print. Note that string concatenation using += is now linear in the size of the string you’re concatenating so this will be fast.

>>> s = ''
>>> for i in xrange(20):
...     s += 'a'
... 
>>> print s
aaaaaaaaaaaaaaaaaaaa

Or you can do it more directly using sys.stdout.write(), which print is a wrapper around. This will write only the raw string you give it, without any formatting. Note that no newline is printed, even after the last of the 20 a's.

>>> import sys
>>> for i in xrange(20):
...     sys.stdout.write('a')
... 
aaaaaaaaaaaaaaaaaaaa>>> 

Python 3 changes the print statement into a print() function, which allows you to set an end parameter. You can use it in >=2.6 by importing from __future__. I’d avoid this in any serious 2.x code though, as it will be a little confusing for those who have never used 3.x. However, it should give you a taste of some of the goodness 3.x brings.

>>> from __future__ import print_function
>>> for i in xrange(20):
...     print('a', end='')
... 
aaaaaaaaaaaaaaaaaaaa>>> 
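
For readers on Python 3, the two general approaches above can be sketched as follows. The io.StringIO buffer is only an assumption made here so the result can be inspected; in real code the file argument would simply be left at its default, sys.stdout.

```python
import io

buffer = io.StringIO()  # stands in for sys.stdout so the result can be inspected

# Option 1: suppress the trailing newline on each call with end=''.
for _ in range(20):
    print("a", end="", file=buffer)

# Option 2: build the pieces first and join them once.
joined = "".join("a" for _ in range(20))

print(buffer.getvalue())
print(joined)
```

Both options produce the same twenty characters with no spaces in between.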

Answer 1

From PEP 3105: print As a Function in the What’s New in Python 2.6 document:

>>> from __future__ import print_function
>>> print('a', end='')

Obviously that only works with python 3.0 or higher (or 2.6+ with a from __future__ import print_function at the beginning). The print statement was removed and became the print() function by default in Python 3.0.


Answer 2

You can suppress the space by printing an empty string to stdout between the print statements.

>>> import sys
>>> for i in range(20):
...   print 'a',
...   sys.stdout.write('')
... 
aaaaaaaaaaaaaaaaaaaa

However, a cleaner solution is to first build the entire string you’d like to print and then output it with a single print statement.


Answer 3

You could print a backspace character ('\b'):

for i in xrange(20):
    print '\ba',

result:

aaaaaaaaaaaaaaaaaaaa

Answer 4

Python 3.x:

for i in range(20):
    print('a', end='')

Python 2.6 or 2.7:

from __future__ import print_function
for i in xrange(20):
    print('a', end='')

Answer 5

If you want them to show up one at a time, you can do this:

import time
import sys
for i in range(20):
    sys.stdout.write('a')
    sys.stdout.flush()
    time.sleep(0.5)

sys.stdout.flush() is necessary to force the character to be written each time the loop is run.


Answer 6

Just as a side note:

Printing is O(1) but building a string and then printing is O(n), where n is the total number of characters in the string. So yes, while building the string is “cleaner”, it’s not the most efficient method of doing so.

The way I would do it is as follows:

from sys import stdout
printf = stdout.write

Now you have a “print function” that prints out any string you give it without appending a newline character each time.

printf("Hello,")
printf("World!")

The output will be: Hello,World!

However, if you want to print integers, floats, or other non-string values, you’ll have to convert them to a string with the str() function.

printf(str(2) + " " + str(4))

The output will be: 2 4


Answer 7

Either what Ant says, or accumulate into a string, then print once:

s = '';
for i in xrange(20):
    s += 'a'
print s

Answer 8

without what? do you mean

>>> print 'a' * 20
aaaaaaaaaaaaaaaaaaaa

?


Answer 9

This is really simple.

For Python 3+ you only have to write the following code:

for i in range(20):
    print('a', end='')

Just convert the loop to the code above; you don't have to worry about anything else.


Answer 10

Wow!

This question was asked a long time ago.

Now, in Python 3.x, this is pretty easy.

Code:

for i in range(20):
    print('a', end='')  # the end parameter sets what is written after the value

Output:

aaaaaaaaaaaaaaaaaaaa

More about the print() function:

print(value1, value2, value3, sep='-', end='\n', file=sys.stdout, flush=False)

Here:

value1, value2, value3

You can print multiple values by separating them with commas.

sep='-'

The three values will be separated by the '-' character. You can use any character or string instead, such as sep='@' or sep='good'.

end='\n'

By default, print puts a '\n' character at the end of the output, but you can use any character or string by changing the value of end, such as end='$', end='.' or end='Hello'.

file=sys.stdout

This is the default value, the system standard output. With this argument you can send the output to a file stream instead, for example:

with open("output.txt", "w") as output_file:
    print("I am a Programmer", file=output_file)

This code creates a file named output.txt in which the output I am a Programmer is stored.

flush=False

This is the default value. With flush=True you can forcibly flush the stream.


Answer 11

As simple as that:

from os import system
import time

def printSleeping():
     sleep = "I'm sleeping"
     v = ""
     for i in sleep:
         v += i
         system('cls')  # clears the console on Windows
         print(v)
         time.sleep(0.02)

Python: write() versus writelines() and concatenated strings

Question: Python: write() versus writelines() and concatenated strings

So I’m learning Python. I am going through the lessons and ran into a problem where I had to condense a great many target.write() into a single write(), while having a "\n" between each user input variable(the object of write()).

I came up with:

nl = "\n"
lines = line1, nl, line2, nl, line3, nl
textdoc.writelines(lines)

If I try to do:

textdoc.write(lines)

I get an error. But if I type:

textdoc.write(line1 + "\n" + line2 + ....)

Then it works fine. Why am I unable to use a string for a newline in write() but I can use it in writelines()?

Python 2.7. When I searched Google, most of the resources I found were way over my head; I'm still a layperson.


Answer 0

  • writelines expects an iterable of strings
  • write expects a single string.

line1 + "\n" + line2 merges those strings together into a single string before passing it to write.

Note that if you have many lines, you may want to use "\n".join(list_of_lines).
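
The difference can be seen in a small sketch. An in-memory io.StringIO object is used here as a stand-in for the opened file, and the three line values are hypothetical user input.

```python
import io

line1, line2, line3 = "first", "second", "third"  # hypothetical user input
lines = (line1, "\n", line2, "\n", line3, "\n")

out = io.StringIO()  # in-memory stand-in for the opened file

out.writelines(lines)          # accepts any iterable of strings
try:
    out.write(lines)           # a tuple is not a string, so this fails
    write_error = None
except TypeError as exc:
    write_error = exc

# One string, one write() call.
out.write("\n".join([line1, line2, line3]) + "\n")
print(repr(out.getvalue()))
```

The writelines call and the final write call put the same text into the buffer; only the write(lines) call with a tuple is rejected.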


Answer 1

Why am I unable to use a string for a newline in write() but I can use it in writelines()?

The idea is the following: if you want to write a single string you can do this with write(). If you have a sequence of strings you can write them all using writelines().

write(arg) expects a string as argument and writes it to the file. If you provide a list of strings, it will raise an exception (by the way, show errors to us!).

writelines(arg) expects an iterable as argument (an iterable object can be a tuple, a list, a string, or an iterator in the most general sense). Each item contained in the iterator is expected to be a string. A tuple of strings is what you provided, so things worked.

The nature of the string(s) does not matter to both of the functions, i.e. they just write to the file whatever you provide them. The interesting part is that writelines() does not add newline characters on its own, so the method name can actually be quite confusing. It actually behaves like an imaginary method called write_all_of_these_strings(sequence).

What follows is an idiomatic way in Python to write a list of strings to a file while keeping each string in its own line:

lines = ['line1', 'line2']
with open('filename.txt', 'w') as f:
    f.write('\n'.join(lines))

This takes care of closing the file for you. The construct '\n'.join(lines) concatenates (connects) the strings in the list lines and uses the character ‘\n’ as glue. It is more efficient than using the + operator.

Starting from the same lines sequence, ending up with the same output, but using writelines():

lines = ['line1', 'line2']
with open('filename.txt', 'w') as f:
    f.writelines("%s\n" % l for l in lines)

This makes use of a generator expression and dynamically creates newline-terminated strings. writelines() iterates over this sequence of strings and writes every item.

Edit: Another point you should be aware of:

write() and readlines() existed before writelines() was introduced. writelines() was introduced later as a counterpart of readlines(), so that one could easily write the file content that was just read via readlines():

outfile.writelines(infile.readlines())

Really, this is the main reason why writelines has such a confusing name. Also, today, we do not really want to use this method anymore. readlines() reads the entire file to the memory of your machine before writelines() starts to write the data. First of all, this may waste time. Why not start writing parts of data while reading other parts? But, most importantly, this approach can be very memory consuming. In an extreme scenario, where the input file is larger than the memory of your machine, this approach won’t even work. The solution to this problem is to use iterators only. A working example:

with open('inputfile') as infile:
    with open('outputfile', 'w') as outfile:
        for line in infile:
            outfile.write(line)

This reads the input file line by line. As soon as one line is read, it is written to the output file. Conceptually, there is only ever a single line in memory (compared to the entire file content being in memory with the readlines/writelines approach).


Answer 2

If you just want to save and load a list, try pickle.

Saving with pickle:

import pickle

with open("yourFile", "wb") as file:
    pickle.dump(YourList, file)

And loading:

with open("yourFile", "rb") as file:
    YourList = pickle.load(file)

Answer 3

Actually, I think the problem is that your variable “lines” is a tuple, whereas write() requires a string. All you have to do is change your commas into pluses (+).

nl = "\n"
lines = line1+nl+line2+nl+line3+nl
textdoc.write(lines)

This should work.


Answer 4

Exercise 16 from Zed Shaw’s book? You can use escape characters as follows:

paragraph1 = "%s \n %s \n %s \n" % (line1, line2, line3)
target.write(paragraph1)
target.close()

How to find the first occurrence of a substring in a Python string?

Question: How to find the first occurrence of a substring in a Python string?

So if my string is “the dude is a cool dude”.
I’d like to find the first index of ‘dude’:

mystring.findfirstindex('dude') # should return 4

What is the python command for this?
Thanks.


Answer 0

find()

>>> s = "the dude is a cool dude"
>>> s.find('dude')
4

Answer 1

Quick Overview: index and find

Besides the find method there is also index. find and index both return the position of the first occurrence, but if nothing is found index raises a ValueError whereas find returns -1. Speed-wise, both have the same benchmark results.

s.find(t)    #returns: -1, or index where t starts in s
s.index(t)   #returns: Same as find, but raises ValueError if t is not in s

Additional knowledge: rfind and rindex:

In general, find and index return the smallest index where the passed-in string starts, and rfind and rindex return the largest index where it starts. Most string searching algorithms search from left to right, so functions starting with r indicate that the search happens from right to left.

So if the element you are searching for is more likely to be close to the end than to the start of the string, rfind or rindex would be faster.

s.rfind(t)   #returns: Same as find, but searches right to left
s.rindex(t)  #returns: Same as index, but searches right to left

Source: Python: Visual QuickStart Guide, Toby Donaldson
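
A short sketch of the four methods on the string from the question. The search term "xyz" is an arbitrary example of a substring that does not occur.

```python
s = "the dude is a cool dude"

first = s.find("dude")     # lowest index at which a match starts
last = s.rfind("dude")     # highest index at which a match starts
missing = s.find("xyz")    # find returns -1 when there is no match

try:
    s.index("xyz")         # index raises instead of returning -1
    index_raised = False
except ValueError:
    index_raised = True

print(first, last, missing, index_raised)
```

Note that find and rfind only agree when the substring occurs exactly once, so they are not interchangeable when the first occurrence is what you need.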


Answer 2

To implement this algorithmically, without using any built-in Python search function, you can write:

def find_pos(string,word):

    for i in range(len(string) - len(word)+1):
        if string[i:i+len(word)] == word:
            return i
    return 'Not Found'

string = "the dude is a cool dude"
word = 'dude'
print(find_pos(string,word))
# output 4

Answer 3

def find_pos(chaine,x):

    for i in range(len(chaine)):
        if chaine[i] ==x :
            return 'yes',i 
    return 'no'

Python string class like StringBuilder in C#?

Question: Python string class like StringBuilder in C#?

Is there some string class in Python like StringBuilder in C#?


Answer 0

There is no one-to-one correlation. For a really good article please see Efficient String Concatenation in Python:

Building long strings in the Python programming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods.


Answer 1

I have used the code of Oliver Crow (link given by Andrew Hare) and adapted it a bit to suit Python 2.7.3 (by using the timeit package). I ran it on my personal computer, a Lenovo T61 with 6GB RAM running Debian GNU/Linux 6.0.6 (squeeze).

Here is the result for 10,000 iterations:

method1:  0.0538418292999 secs
process size 4800 kb
method2:  0.22602891922 secs
process size 4960 kb
method3:  0.0605459213257 secs
process size 4980 kb
method4:  0.0544030666351 secs
process size 5536 kb
method5:  0.0551080703735 secs
process size 5272 kb
method6:  0.0542731285095 secs
process size 5512 kb

and for 5,000,000 iterations (method 2 was left out because it ran far too slowly, seemingly forever):

method1:  5.88603997231 secs
process size 37976 kb
method3:  8.40748500824 secs
process size 38024 kb
method4:  7.96380496025 secs
process size 321968 kb
method5:  8.03666186333 secs
process size 71720 kb
method6:  6.68192911148 secs
process size 38240 kb

It is quite obvious that the Python developers have done a great job optimizing string concatenation, and as Hoare said: “premature optimization is the root of all evil” :-)


Answer 2

Relying on compiler optimizations is fragile. The benchmarks linked in the accepted answer and numbers given by Antoine-tran are not to be trusted. Andrew Hare makes the mistake of including a call to repr in his methods. That slows all the methods equally but obscures the real penalty in constructing the string.

Use join. It’s very fast and more robust.

$ ipython3
Python 3.5.1 (default, Mar  2 2016, 03:38:02) 
IPython 4.1.2 -- An enhanced Interactive Python.

In [1]: values = [str(num) for num in range(int(1e3))]

In [2]: %%timeit
   ...: ''.join(values)
   ...: 
100000 loops, best of 3: 7.37 µs per loop

In [3]: %%timeit
   ...: result = ''
   ...: for value in values:
   ...:     result += value
   ...: 
10000 loops, best of 3: 82.8 µs per loop

In [4]: import io

In [5]: %%timeit
   ...: writer = io.StringIO()
   ...: for value in values:
   ...:     writer.write(value)
   ...: writer.getvalue()
   ...: 
10000 loops, best of 3: 81.8 µs per loop

Answer 3

Python has several things that fulfill similar purposes:

  • One common way to build large strings from pieces is to grow a list of strings and join it when you are done. This is a frequently-used Python idiom.
    • To build strings incorporating data with formatting, you would do the formatting separately.
  • For insertion and deletion at a character level, you would keep a list of length-one strings. (To make this from a string, you’d call list(your_string). You could also use a UserString.MutableString for this.)
  • (c)StringIO.StringIO is useful for things that would otherwise take a file, but less so for general string building.
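
The first and third points in the list above can be sketched as follows. The item strings and the misspelled sample text are made up for illustration.

```python
# Pattern 1: collect pieces in a list and join once at the end.
pieces = []
for number in range(5):
    pieces.append(f"item{number}")   # formatting is done per piece
built = ",".join(pieces)

# Pattern 2: edit at character level through a list of one-character strings.
chars = list("helo world")
chars.insert(3, "l")       # insert the missing letter
del chars[5]               # remove the space
edited = "".join(chars)

print(built)
print(edited)
```

In both patterns the list plays the role that a StringBuilder would play in C#, and the final join produces the immutable string.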

Answer 4

Using method 5 from above (The Pseudo File) we can get very good perf and flexibility

from cStringIO import StringIO

class StringBuilder:
     _file_str = None

     def __init__(self):
         self._file_str = StringIO()

     def Append(self, str):
         self._file_str.write(str)

     def __str__(self):
         return self._file_str.getvalue()

now using it

sb = StringBuilder()

sb.Append("Hello\n")
sb.Append("World")

print sb

Answer 5

You can try StringIO or cStringIO.


Answer 6

There is no explicit analogue. I think you are expected to use string concatenation (likely optimized, as said before) or a third-party class. I doubt such classes are much more efficient: lists in Python are dynamically typed, so I assume there is no fast char[] to use as a buffer. StringBuilder-like classes are not a premature optimization; they exist because strings are immutable in many languages, which allows many optimizations (for example, slices and substrings can reference the same buffer). StringBuilder/StringBuffer/stringstream-like classes work a lot faster than concatenating strings, which produces many small temporary objects that still need allocation and garbage collection. They are also faster than printf-like string formatting tools, because they avoid the overhead of interpreting a format pattern, which is considerable over many format calls.


Answer 7

In case you are here looking for a fast string concatenation method in Python, then you do not need a special StringBuilder class. Simple concatenation works just as well without the performance penalty seen in C#.

resultString = ""

resultString += "Append 1"
resultString += "Append 2"

See Antoine-tran’s answer for performance results


How can I check if a character in a string is a letter? (Python)

Question: How can I check if a character in a string is a letter? (Python)

I know about islower and isupper, but can you check whether or not that character is a letter? For Example:

>>> s = 'abcdefg'
>>> s2 = '123abcd'
>>> s3 = 'abcDEFG'
>>> s[0].islower()
True

>>> s2[0].islower()
False

>>> s3[0].islower()
True

Is there any way to just ask if it is a character besides doing .islower() or .isupper()?


回答 0

您可以使用str.isalpha()

例如:

s = 'a123b'

for char in s:
    print(char, char.isalpha())

输出:

a True
1 False
2 False
3 False
b True

You can use str.isalpha().

For example:

s = 'a123b'

for char in s:
    print(char, char.isalpha())

Output:

a True
1 False
2 False
3 False
b True

回答 1

str.isalpha()

如果字符串中的所有字符都是字母并且至少包含一个字符,则返回true,否则返回false。字母字符是在Unicode字符数据库中定义为“字母”的那些字符,即,具有一般类别属性为“ Lm”,“ Lt”,“ Lu”,“ Ll”或“ Lo”之一的那些字符。请注意,这与Unicode标准中定义的“字母”属性不同。

在python2.x中:

>>> s = u'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
中 True
文 True
>>> s = 'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
 False
 False
 False
 False
 False
 False
>>>

在python3.x中:

>>> s = 'a1中文'
>>> for char in s: print(char, char.isalpha())
...
a True
1 False
中 True
文 True
>>>

此代码的工作原理:

>>> def is_alpha(word):
...     try:
...         return word.encode('ascii').isalpha()
...     except:
...         return False
...
>>> is_alpha('中国')
False
>>> is_alpha(u'中国')
False
>>>

>>> a = 'ａ'  # full-width letter U+FF41, not ASCII 'a'
>>> b = 'a'
>>> ord(a), ord(b)
(65345, 97)
>>> a.isalpha(), b.isalpha()
(True, True)
>>> is_alpha(a), is_alpha(b)
(False, True)
>>>
str.isalpha()

Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard.
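
To make the category rule above concrete, here is a small sketch that looks up the general category with the standard unicodedata module. The helper name is_letter is my own, and the check assumes single-character input.

```python
import unicodedata


def is_letter(ch):
    # The letter categories are Lu, Ll, Lt, Lm and Lo; all of them start with "L"
    return unicodedata.category(ch).startswith("L")


for ch in "a1中ａ":
    # Print the character, its general category, and both checks side by side
    print(repr(ch), unicodedata.category(ch), is_letter(ch), ch.isalpha())
```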

In python2.x:

>>> s = u'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
中 True
文 True
>>> s = 'a1中文'
>>> for char in s: print char, char.isalpha()
...
a True
1 False
� False
� False
� False
� False
� False
� False
>>>

In python3.x:

>>> s = 'a1中文'
>>> for char in s: print(char, char.isalpha())
...
a True
1 False
中 True
文 True
>>>

If you only want to accept ASCII letters, this code works:

>>> def is_alpha(word):
...     try:
...         return word.encode('ascii').isalpha()
...     except:
...         return False
...
>>> is_alpha('中国')
False
>>> is_alpha(u'中国')
False
>>>

>>> a = 'ａ'  # full-width letter U+FF41, not ASCII 'a'
>>> b = 'a'
>>> ord(a), ord(b)
(65345, 97)
>>> a.isalpha(), b.isalpha()
(True, True)
>>> is_alpha(a), is_alpha(b)
(False, True)
>>>
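
On Python 3.7 and later the same ASCII-only check can be sketched without the try/except, assuming str.isascii() is available. The name is_ascii_alpha is illustrative.

```python
def is_ascii_alpha(word):
    # isascii() is True for pure-ASCII text; isalpha() then requires at least one letter
    return word.isascii() and word.isalpha()


print(is_ascii_alpha("abc"))
print(is_ascii_alpha("中国"))
```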

回答 2

我发现使用函数和基本代码可以实现此目的。这是一个接受字符串并计算大写字母,小写字母以及“其他”数量的代码。其他分类为空格,标点符号,甚至日语和中文字符。

def check(count):

    lowercase = 0
    uppercase = 0
    other = 0

    low = 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'
    upper = 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'



    for n in count:
        if n in low:
            lowercase += 1
        elif n in upper:
            uppercase += 1
        else:
            other += 1

    print("There are " + str(lowercase) + " lowercase letters.")
    print("There are " + str(uppercase) + " uppercase letters.")
    print("There are " + str(other) + " other elements to this sentence.")

I found a good way to do this using a function and basic code. This code accepts a string and counts the capital letters, the lowercase letters, and everything else (‘other’). ‘Other’ covers spaces, punctuation marks, and even Japanese and Chinese characters.

import string

def check(text):

    lowercase = 0
    uppercase = 0
    other = 0

    # ASCII letters only; spaces, punctuation and non-ASCII letters count as "other"
    low = string.ascii_lowercase
    upper = string.ascii_uppercase

    for n in text:
        if n in low:
            lowercase += 1
        elif n in upper:
            uppercase += 1
        else:
            other += 1

    print("There are " + str(lowercase) + " lowercase letters.")
    print("There are " + str(uppercase) + " uppercase letters.")
    print("There are " + str(other) + " other elements to this sentence.")

回答 3

data = "abcdefg hi j 12345"

digits_count = 0
letters_count = 0
others_count = 0

for i in data:

    if i.isdigit():
        digits_count += 1 
    elif i.isalpha():
        letters_count += 1
    else:
        others_count += 1

print("Result:")        
print("Letters=", letters_count)
print("Digits=", digits_count)

输出:

Please Enter Letters with Numbers:
abcdefg hi j 12345
Result:
Letters = 10
Digits = 5

通过使用str.isalpha()您可以检查它是否是字母。

data = "abcdefg hi j 12345"

digits_count = 0
letters_count = 0
others_count = 0

for i in data:

    if i.isdigit():
        digits_count += 1 
    elif i.isalpha():
        letters_count += 1
    else:
        others_count += 1

print("Result:")        
print("Letters=", letters_count)
print("Digits=", digits_count)

Output:

Please Enter Letters with Numbers:
abcdefg hi j 12345
Result:
Letters = 10
Digits = 5

By using str.isalpha() you can check if it is a letter.


回答 4

这有效:

any(c.isalpha() for c in 'string')

This works (it returns True if at least one character in the string is a letter):

any(c.isalpha() for c in 'string')

回答 5

这有效:

word = str(input("Enter string:"))
notChar = 0
isChar = 0
for char in word:
    if not char.isalpha():
        notChar += 1
    else:
        isChar += 1
print(isChar, " were letters; ", notChar, " were not letters.")

This works:

word = str(input("Enter string:"))
notChar = 0
isChar = 0
for char in word:
    if not char.isalpha():
        notChar += 1
    else:
        isChar += 1
print(isChar, " were letters; ", notChar, " were not letters.")

从字符串中删除数字

问题:从字符串中删除数字

如何删除字符串中的数字?

How can I remove digits from a string?


回答 0

这适合您的情况吗?

>>> s = '12abcd405'
>>> result = ''.join([i for i in s if not i.isdigit()])
>>> result
'abcd'

这利用了列表理解,这里发生的事情与此结构类似:

no_digits = []
# Iterate through the string, adding non-numbers to the no_digits list
for i in s:
    if not i.isdigit():
        no_digits.append(i)

# Now join all elements of the list with '', 
# which puts all of the characters together.
result = ''.join(no_digits)

正如@AshwiniChaudhary和@KirkStrauser指出的那样,您实际上不需要在单行代码中使用括号,从而使括号内的内容成为生成器表达式(比列表理解更有效)。即使这不符合您的分配要求,您最终还是应该阅读以下内容:):

>>> s = '12abcd405'
>>> result = ''.join(i for i in s if not i.isdigit())
>>> result
'abcd'

Would this work for your situation?

>>> s = '12abcd405'
>>> result = ''.join([i for i in s if not i.isdigit()])
>>> result
'abcd'

This makes use of a list comprehension, and what is happening here is similar to this structure:

no_digits = []
# Iterate through the string, adding non-numbers to the no_digits list
for i in s:
    if not i.isdigit():
        no_digits.append(i)

# Now join all elements of the list with '', 
# which puts all of the characters together.
result = ''.join(no_digits)

As @AshwiniChaudhary and @KirkStrauser point out, you actually do not need to use the brackets in the one-liner, making the piece inside the parentheses a generator expression (which avoids writing out an explicit list, although in CPython str.join converts its argument to a sequence internally, so the efficiency gain over a list comprehension is small here). Even if this doesn’t fit the requirements for your assignment, it is something you should read about eventually :) :

>>> s = '12abcd405'
>>> result = ''.join(i for i in s if not i.isdigit())
>>> result
'abcd'

回答 1

而且,经常把它丢进去,是经常被遗忘的str.translate,它比循环/正则表达式快得多:

对于Python 2:

from string import digits

s = 'abc123def456ghi789zero0'
res = s.translate(None, digits)
# 'abcdefghizero'

对于Python 3:

from string import digits

s = 'abc123def456ghi789zero0'
remove_digits = str.maketrans('', '', digits)
res = s.translate(remove_digits)
# 'abcdefghizero'

And, just to throw it into the mix, there is the oft-forgotten str.translate, which works a lot faster than looping or regular expressions:

For Python 2:

from string import digits

s = 'abc123def456ghi789zero0'
res = s.translate(None, digits)
# 'abcdefghizero'

For Python 3:

from string import digits

s = 'abc123def456ghi789zero0'
remove_digits = str.maketrans('', '', digits)
res = s.translate(remove_digits)
# 'abcdefghizero'

回答 2

不知道您的老师是否允许您使用过滤器,但是…

filter(lambda x: x.isalpha(), "a1a2a3s3d4f5fg6h")

返回-

'aaasdffgh'

比循环更有效率…

例:

for i in range(10):
  a = a.replace(str(i),'')

Not sure if your teacher allows you to use filters but…

filter(lambda x: x.isalpha(), "a1a2a3s3d4f5fg6h")

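In Python 2 this returns the string shown below; in Python 3, filter returns an iterator, so wrap the call in ''.join(...) to get the same string: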

'aaasdffgh'

Much more efficient than looping, for example:

for i in range(10):
    a = a.replace(str(i), '')  # str.replace returns a new string, so reassign it

回答 3

那这个呢:

out_string = filter(lambda c: not c.isdigit(), in_string)

What about this (in Python 2; in Python 3, wrap the filter call in ''.join(...) to get a string back):

out_string = filter(lambda c: not c.isdigit(), in_string)

回答 4

只是几个(其他人建议了其中一些)

方法1:

''.join(i for i in mystr if not i.isdigit())

方法2:

def removeDigits(s):
    answer = []
    for char in s:
        if not char.isdigit():
            answer.append(char)
    return ''.join(answer)

方法3:

''.join(filter(lambda x: not x.isdigit(), mystr))

方法4:

nums = set(map(str, range(10)))
''.join(i for i in mystr if i not in nums)

方法5:

''.join(i for i in mystr if ord(i) not in range(48, 58))

Just a few (others have suggested some of these)

Method 1:

''.join(i for i in mystr if not i.isdigit())

Method 2:

def removeDigits(s):
    answer = []
    for char in s:
        if not char.isdigit():
            answer.append(char)
    return ''.join(answer)

Method 3:

''.join(filter(lambda x: not x.isdigit(), mystr))

Method 4:

nums = set(map(str, range(10)))
''.join(i for i in mystr if i not in nums)

Method 5:

''.join(i for i in mystr if ord(i) not in range(48, 58))

回答 5

说st是您的未格式化的字符串,然后运行

st_nodigits=''.join(i for i in st if i.isalpha())

正如刚才提到的。但是我猜想您需要非常简单的内容,所以说s是您的字符串,st_res是没有数字的字符串,那么这是您的代码

l = ['0','1','2','3','4','5','6','7','8','9']
st_res=""
for ch in s:
 if ch not in l:
  st_res+=ch

Say st is your unformatted string, then run

st_nodigits=''.join(i for i in st if i.isalpha())

as mentioned above (note that isalpha() keeps only letters, so spaces and punctuation are dropped too). But my guess is that you need something very simple, so say s is your string and st_res is the string without digits; then here is your code:

l = ['0','1','2','3','4','5','6','7','8','9']
st_res=""
for ch in s:
    if ch not in l:
        st_res+=ch

回答 6

我很乐意使用正则表达式来完成此操作,但是由于您只能使用列表,循环,函数等。

这是我想出的:

stringWithNumbers="I have 10 bananas for my 5 monkeys!"
stringWithoutNumbers=''.join(c if c not in map(str,range(0,10)) else "" for c in stringWithNumbers)
print(stringWithoutNumbers) #I have  bananas for my  monkeys!

I’d love to use a regex to accomplish this, but since you can only use lists, loops, functions, etc., here’s what I came up with:

stringWithNumbers="I have 10 bananas for my 5 monkeys!"
stringWithoutNumbers=''.join(c if c not in map(str,range(0,10)) else "" for c in stringWithNumbers)
print(stringWithoutNumbers) #I have  bananas for my  monkeys!
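
For reference, the regex version mentioned above might look like the following sketch, for cases where the standard re module is allowed. The function name strip_digits is illustrative.

```python
import re


def strip_digits(text):
    # For str patterns, \d matches any Unicode decimal digit; use [0-9] for ASCII digits only
    return re.sub(r"\d", "", text)


print(strip_digits("I have 10 bananas for my 5 monkeys!"))
```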

回答 7

如果我正确理解您的问题,一种方法是将字符串分解为chars,然后使用循环检查该字符串中的每个char是字符串还是数字,然后将string保存到变量中,然后循环一次完成后,向用户显示

If I understand your question correctly, one way is to break the string down into characters, then loop over them and check whether each one is a digit or not. Save the non-digit characters in a variable and, once the loop is finished, display the result to the user.
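
A minimal sketch of the loop described here, assuming "a number" means a character for which str.isdigit() is true. The function name remove_digits is illustrative.

```python
def remove_digits(text):
    kept = ""
    for ch in text:
        # Keep every character that is not a digit
        if not ch.isdigit():
            kept += ch
    return kept


print(remove_digits("abc123def"))
```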


为什么在split()结果中返回空字符串?

问题:为什么在split()结果中返回空字符串?

什么是点'/segment/segment/'.split('/')回来['', 'segment', 'segment', '']

注意空元素。如果您要分割的分隔符恰好位于字符串的第一位置,并且位于字符串的末尾,那么它又能为您带来什么额外的价值呢?

What is the point of '/segment/segment/'.split('/') returning ['', 'segment', 'segment', '']?

Notice the empty elements. If you’re splitting on a delimiter that happens to be at position one and at the very end of a string, what extra value does it give you to have the empty string returned from each end?


回答 0

str.splitstr.join,所以

"/".join(['', 'segment', 'segment', ''])

让您返回原始字符串。

如果没有空字符串,则第一个和最后符串'/'将丢失join()

str.split complements str.join, so

"/".join(['', 'segment', 'segment', ''])

gets you back the original string.

If the empty strings were not there, the first and last '/' would be missing after the join()
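
A quick sketch of that round trip, and of what is lost once the empty strings are dropped:

```python
s = "/segment/segment/"
parts = s.split("/")

# Joining the split result restores the original string
round_trip = "/".join(parts)

# Dropping the empty strings loses the leading and trailing separators
without_empties = "/".join(p for p in parts if p)

print(parts, round_trip, without_empties)
```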


回答 1

更一般而言,要删除split()结果中返回的空字符串,您可能需要查看该filter函数。

例:

filter(None, '/segment/segment/'.split('/'))

退货

['segment', 'segment']

More generally, to remove empty strings returned in split() results, you may want to look at the filter function.

Example:

f = filter(None, '/segment/segment/'.split('/'))
s_all = list(f)

returns

['segment', 'segment']

回答 2

这里有两点要考虑:

  • 期望结果'/segment/segment/'.split('/')['segment', 'segment']于是合理的,但这会丢失信息。如果split()按照您想要的方式工作,如果我告诉您a.split('/') == ['segment', 'segment'],您将无法告诉我是什么a
  • 结果应该是什么'a//b'.split()['a', 'b']?或['a', '', 'b']?即,是否应split()合并相邻的定界符?如果需要,那么将很难解析由字符分隔的数据,并且某些字段可以为空。我可以肯定,有很多人确实想要上述情况的结果中的空值!

最后,归结为两点:

一致性:如果我有n定界符,则在中a,我会在n+1返回值split()

应该可以做复杂的事情,并且可以轻松地做简单的事情:如果由于想要忽略空字符串split(),可以始终这样做:

def mysplit(s, delim=None):
    return [x for x in s.split(delim) if x]

但是如果不想忽略空值,则应该可以。

该语言必须选择一种定义split()-有太多不同的用例无法满足所有人的默认要求。我认为Python的选择是不错的选择,也是最合乎逻辑的选择。(顺便说一句,我不喜欢C的原因之一strtok()是因为它合并了相邻的定界符,因此很难对其进行认真的解析/标记化处理。)

有一个exceptions:a.split()没有参数会挤压连续的空格,但是有人可以认为在这种情况下这样做是正确的。如果您不想要这种行为,则可以始终这样做a.split(' ')

There are two main points to consider here:

  • Expecting the result of '/segment/segment/'.split('/') to be equal to ['segment', 'segment'] is reasonable, but then this loses information. If split() worked the way you wanted, if I tell you that a.split('/') == ['segment', 'segment'], you can’t tell me what a was.
  • What should the result of 'a//b'.split('/') be? ['a', 'b'], or ['a', '', 'b']? I.e., should split() merge adjacent delimiters? If it did, it would be very hard to parse data that’s delimited by a character when some of the fields can be empty. I am fairly sure there are many people who do want the empty values in the result for the above case!

In the end, it boils down to two things:

Consistency: if I have n delimiters in a, I get n+1 values back after the split().

It should be possible to do complex things, and easy to do simple things: if you want to ignore empty strings as a result of the split(), you can always do:

def mysplit(s, delim=None):
    return [x for x in s.split(delim) if x]

but if one doesn’t want to ignore the empty values, one should be able to.

The language has to pick one definition of split()—there are too many different use cases to satisfy everyone’s requirement as a default. I think that Python’s choice is a good one, and is the most logical. (As an aside, one of the reasons I don’t like C’s strtok() is because it merges adjacent delimiters, making it extremely hard to do serious parsing/tokenization with it.)

There is one exception: a.split() without an argument squeezes consecutive whitespace, but one can argue that this is the right thing to do in that case. If you don’t want that behavior, you can always do a.split(' ').
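
The difference between the two forms can be seen in a short sketch (the sample strings are my own):

```python
text = "a  b"  # two spaces between the letters

no_arg = text.split()         # runs of whitespace are treated as one separator
with_space = text.split(" ")  # every single space is a separator
slashes = "a//b".split("/")   # an explicit delimiter never merges neighbours

print(no_arg, with_space, slashes)
```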


回答 3

x.split(y)始终返回列表1 + x.count(y)项是一种珍贵的规律性-为@ gnibbler本已指出,这让splitjoin对方的确切逆(因为它们显然应该是),这也正是各种分隔符连记录的语义(映射例如csv文件行[[引用网络的净额]],/etc/groupUnix中的行等等),它允许(如@Roman的回答所述)轻松检查(例如)绝对路径与相对路径(在文件路径和URL中),等等。

另一种看待它的方式是,您不应该肆意地将信息扔出窗外,以免获得任何好处。x.split(y)等于会得到什么x.strip(y).split(y)?没事,当然-它很容易使用第二种形式时,这就是你的意思,但如果第一种形式是任意视为指第二个,你有很多工作要做,当你希望第一个(如上一段所指出的,这绝非罕见。

但是实际上,根据数学规律进行思考是您可以自学的设计可传递API的最简单,最通用的方法。举一个不同的例子,对于任何有效的xy x == x[:y] + x[y:]-这立即表明为什么应该排除切片的一个极端非常重要。您可以表述不变的断言越简单,就越有可能产生的语义就是您在现实生活中需要的语义-这是神秘的事实,即数学在处理宇宙中非常有用。

尝试为split前导和尾随定界符是特殊情况的方言制定不变式…反例:像这样的字符串方法isspace并不是最大程度的简单- x.isspace()等效于x and all(c in string.whitespace for c in x)-愚蠢的前导x and是您经常发现自己编码的原因not x or x.isspace(),返回到应该is...字符串方法中设计的简单性(因此,空字符串“就是”您想要的任何东西)与街上人马的感觉相反,也许[[空集,如零&c,始终使大多数人感到困惑;-)]],但完全符合明显完善的数学常识!-)。

Having x.split(y) always return a list of 1 + x.count(y) items is a precious regularity — as @gnibbler’s already pointed out it makes split and join exact inverses of each other (as they obviously should be), it also precisely maps the semantics of all kinds of delimiter-joined records (such as csv file lines [[net of quoting issues]], lines from /etc/group in Unix, and so on), it allows (as @Roman’s answer mentioned) easy checks for (e.g.) absolute vs relative paths (in file paths and URLs), and so forth.

Another way to look at it is that you shouldn’t wantonly toss information out of the window for no gain. What would be gained in making x.split(y) equivalent to x.strip(y).split(y)? Nothing, of course: it’s easy to use the second form when that’s what you mean, but if the first form were arbitrarily deemed to mean the second one, you’d have a lot of work to do when you do want the first one (which is far from rare, as the previous paragraph points out).

But really, thinking in terms of mathematical regularity is the simplest and most general way you can teach yourself to design passable APIs. To take a different example, it’s very important that for any valid x and y x == x[:y] + x[y:] — which immediately indicates why one extreme of a slicing should be excluded. The simpler the invariant assertion you can formulate, the likelier it is that the resulting semantics are what you need in real life uses — part of the mystical fact that maths is very useful in dealing with the universe.

Try formulating the invariant for a split dialect in which leading and trailing delimiters are special-cased… counter-example: string methods such as isspace are not maximally simple — x.isspace() is equivalent to x and all(c in string.whitespace for c in x) — that silly leading x and is why you so often find yourself coding not x or x.isspace(), to get back to the simplicity which should have been designed into the is... string methods (whereby an empty string “is” anything you want — contrary to man-in-the-street horse-sense, maybe [[empty sets, like zero &c, have always confused most people;-)]], but fully conforming to obvious well-refined mathematical common-sense!-).
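
The invariants discussed in this answer can be written down as plain assertions. This is only a sketch; the helper names are my own.

```python
def check_split_invariants(x, y):
    parts = x.split(y)
    # split always returns one more item than there are delimiters
    assert len(parts) == 1 + x.count(y)
    # join is the exact inverse of split
    assert y.join(parts) == x


def check_slice_invariant(x):
    # Cutting a string at any index and gluing the halves together restores it
    for i in range(len(x) + 1):
        assert x == x[:i] + x[i:]


check_split_invariants("/segment/segment/", "/")
check_split_invariants("a,,b", ",")
check_slice_invariant("segment")

# The isspace() irregularity: the empty string is not "space",
# even though all() over its (zero) characters is vacuously true
print("".isspace(), all(c.isspace() for c in ""))
```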


回答 4

我不确定您要寻找哪种答案?您得到三个匹配,因为您有三个定界符。如果您不想要那个空的,只需使用:

'/segment/segment/'.strip('/').split('/')

I’m not sure what kind of answer you’re looking for. You get four elements because you have three delimiters. If you don’t want the empty ones, just use:

'/segment/segment/'.strip('/').split('/')

回答 5

好吧,它让您知道那里有一个定界符。因此,看到4个结果会让您知道您有3个定界符。这使您能够使用此信息执行任何所需的操作,而不是让Python删除空元素,然后让您手动检查是否需要定界符(如果需要)。

一个简单的例子:假设您要检查绝对文件名和相对文件名。这样,您就可以使用拆分完成所有操作,而不必检查文件名的第一个字符是什么。

Well, it lets you know there was a delimiter there. So, seeing 4 results lets you know you had 3 delimiters. This gives you the power to do whatever you want with this information, rather than having Python drop the empty elements, and then making you manually check for starting or ending delimiters if you need to know it.

Simple example: Say you want to check for absolute vs. relative filenames. This way you can do it all with the split, without also having to check what the first character of your filename is.
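
A rough sketch of that check, assuming '/' as the separator. The function name is_absolute is hypothetical; real code would more likely use os.path.isabs.

```python
def is_absolute(path, sep="/"):
    parts = path.split(sep)
    # An empty first element, followed by something, means the path began with the separator
    return len(parts) > 1 and parts[0] == ""


print(is_absolute("/usr/bin"), is_absolute("usr/bin"))
```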


回答 6

考虑以下最小示例:

>>> '/'.split('/')
['', '']

split必须给您定界符前后的内容'/',但没有其他字符。因此,它必须为您提供一个空字符串,从技术上讲,它在之前和之后'/',因为'' + '/' + '' == '/'

Consider this minimal example:

>>> '/'.split('/')
['', '']

split must give you what’s before and after the delimiter '/', but there are no other characters. So it has to give you the empty string, which technically precedes and follows the '/', because '' + '/' + '' == '/'.


遍历字符串

问题:遍历字符串

我有这样定义的多行字符串:

foo = """
this is 
a multi-line string.
"""

我们用作我正在编写的解析器的测试输入的字符串。解析器功能接收file-object作为输入并对其进行迭代。它还确实next()直接调用该方法以跳过行,因此我确实需要一个迭代器作为输入,而不是可迭代的。我需要一个迭代器,它可以在字符串的各个行之间进行迭代,就像file-object可以在文本文件的行之间进行迭代一样。我当然可以这样:

lineiterator = iter(foo.splitlines())

是否有更直接的方法?在这种情况下,字符串必须遍历一次才能进行拆分,然后再由解析器再次遍历。在我的测试用例中,这无关紧要,因为那里的字符串很短,我只是出于好奇而问。Python有很多有用且高效的内置程序,但是我找不到适合此需求的东西。

I have a multi-line string defined like this:

foo = """
this is 
a multi-line string.
"""

This string is used as test input for a parser I am writing. The parser function receives a file object as input and iterates over it. It also calls the next() method directly to skip lines, so I really need an iterator as input, not an iterable. I need an iterator that iterates over the individual lines of that string like a file object would over the lines of a text file. I could of course do it like this:

lineiterator = iter(foo.splitlines())

Is there a more direct way of doing this? In this scenario the string has to be traversed once for the splitting, and then again by the parser. It doesn’t matter in my test case, since the string is very short there; I am just asking out of curiosity. Python has so many useful and efficient built-ins for such stuff, but I could find nothing that suits this need.


回答 0

这是三种可能性:

foo = """
this is 
a multi-line string.
"""

def f1(foo=foo): return iter(foo.splitlines())

def f2(foo=foo):
    retval = ''
    for char in foo:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

def f3(foo=foo):
    prevnl = -1
    while True:
      nextnl = foo.find('\n', prevnl + 1)
      if nextnl < 0: break
      yield foo[prevnl + 1:nextnl]
      prevnl = nextnl

if __name__ == '__main__':
  for f in f1, f2, f3:
    print list(f())

将其作为主要脚本运行,确认这三个功能等效。使用timeit(并使用* 100for foo获得大量字符串以进行更精确的测量):

$ python -mtimeit -s'import asp' 'list(asp.f3())'
1000 loops, best of 3: 370 usec per loop
$ python -mtimeit -s'import asp' 'list(asp.f2())'
1000 loops, best of 3: 1.36 msec per loop
$ python -mtimeit -s'import asp' 'list(asp.f1())'
10000 loops, best of 3: 61.5 usec per loop

注意,我们需要list()调用以确保遍历迭代器,而不仅仅是构建迭代器。

IOW,天真的实现要快得多,甚至都不有趣:比我尝试find调用快6倍,而调用比底层方法快4倍。

经验教训:测量永远是一件好事(但必须准确);像这样的字符串方法splitlines以非常快的方式实现;通过在非常低的级别上进行编程(尤其是通过+=非常小的片段的循环)来将字符串组合在一起可能会非常慢。

编辑:添加了@Jacob的提案,对其进行了稍微修改以使其与其他提案具有相同的结果(保留行尾空白),即:

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip('\n')
        else:
            raise StopIteration

测量得出:

$ python -mtimeit -s'import asp' 'list(asp.f4())'
1000 loops, best of 3: 406 usec per loop

不如.find基于方法的方法好-仍然要牢记,因为它可能不大可能出现小的一次性错误(如f3上面所述,任何出现+1和-1的循环都应该自动触发一个个的怀疑-许多循环应该缺少这些调整并且应该进行调整-尽管我相信我的代码也是正确的,因为我能够使用其他函数检查其输出’)。

但是基于拆分的方法仍然占主导地位。

顺便说一句:可能更好的样式f4是:

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl == '': break
        yield nl.strip('\n')

至少,它不那么冗长。\n不幸的是,需要去除尾随s禁止使用来更清楚,更快速地替换while循环return iter(stri)iter在现代版本的Python中,多余的部分是多余的,我相信从2.3或2.4开始,但它也是无害的)。也许也值得尝试:

    return itertools.imap(lambda s: s.strip('\n'), stri)

或其变体-但我在这里停止,因为这几乎是strip基础,最简单和最快的一项理论练习。

Here are three possibilities:

foo = """
this is 
a multi-line string.
"""

def f1(foo=foo): return iter(foo.splitlines())

def f2(foo=foo):
    retval = ''
    for char in foo:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

def f3(foo=foo):
    prevnl = -1
    while True:
      nextnl = foo.find('\n', prevnl + 1)
      if nextnl < 0: break
      yield foo[prevnl + 1:nextnl]
      prevnl = nextnl

if __name__ == '__main__':
  for f in f1, f2, f3:
    print(list(f()))

Running this as the main script confirms the three functions are equivalent. With timeit (and a * 100 for foo to get substantial strings for more precise measurement):

$ python -mtimeit -s'import asp' 'list(asp.f3())'
1000 loops, best of 3: 370 usec per loop
$ python -mtimeit -s'import asp' 'list(asp.f2())'
1000 loops, best of 3: 1.36 msec per loop
$ python -mtimeit -s'import asp' 'list(asp.f1())'
10000 loops, best of 3: 61.5 usec per loop

Note we need the list() call to ensure the iterators are traversed, not just built.

IOW, the naive implementation is so much faster it isn’t even funny: 6 times faster than my attempt with find calls, which in turn is 4 times faster than a lower-level approach.

Lessons to retain: measurement is always a good thing (but must be accurate); string methods like splitlines are implemented in very fast ways; putting strings together by programming at a very low level (esp. by loops of += of very small pieces) can be quite slow.

Edit: added @Jacob’s proposal, slightly modified to give the same results as the others (trailing blanks on a line are kept), i.e.:

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip('\n')
        else:
            raise StopIteration

Measuring gives:

$ python -mtimeit -s'import asp' 'list(asp.f4())'
1000 loops, best of 3: 406 usec per loop

not quite as good as the .find based approach — still, worth keeping in mind because it might be less prone to small off-by-one bugs (any loop where you see occurrences of +1 and -1, like my f3 above, should automatically trigger off-by-one suspicions — and so should many loops which lack such tweaks and should have them — though I believe my code is also right since I was able to check its output with other functions’).

But the split-based approach still rules.

An aside: possibly better style for f4 would be:

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl == '': break
        yield nl.strip('\n')

at least, it’s a bit less verbose. The need to strip trailing \ns unfortunately prohibits the clearer and faster replacement of the while loop with return iter(stri) (the iter part whereof is redundant in modern versions of Python, I believe since 2.3 or 2.4, but it’s also innocuous). Maybe worth trying, also:

    return itertools.imap(lambda s: s.strip('\n'), stri)

or variations thereof — but I’m stopping here since it’s pretty much a theoretical exercise wrt the strip based, simplest and fastest, one.
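
The functions above target Python 2 (cStringIO, itertools.imap). A rough Python 3 sketch of the same three ideas follows, assuming io.StringIO as the replacement for cStringIO. The function names are mine and no timings are claimed for it. As with f3, the find-based version drops a final line that has no trailing newline.

```python
import io

foo = """
this is
a multi-line string.
"""


def lines_splitlines(text=foo):
    # Split once, then hand out an iterator over the resulting list
    return iter(text.splitlines())


def lines_find(text=foo):
    # Walk from newline to newline with str.find, yielding the text in between
    prev = -1
    while True:
        nxt = text.find("\n", prev + 1)
        if nxt < 0:
            break
        yield text[prev + 1:nxt]
        prev = nxt


def lines_stringio(text=foo):
    # A StringIO iterates line by line, keeping the trailing newline
    for line in io.StringIO(text):
        yield line.rstrip("\n")


for f in (lines_splitlines, lines_find, lines_stringio):
    print(list(f()))
```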


回答 1

我不确定您的意思是“然后再由解析器”。拆分完成后,将不再遍历字符串,而仅遍历拆分字符串列表。只要您的字符串的大小不是绝对很大,这实际上可能是最快的方法。python使用不可变字符串的事实意味着您必须始终创建一个新字符串,因此无论如何都必须这样做。

如果字符串很大,则不利之处在于内存使用情况:您将同时在内存中拥有原始字符串和拆分字符串列表,从而使所需的内存增加了一倍。迭代器方法可以节省您的开销,可以根据需要构建字符串,尽管它仍然要付出“分割”的代价。但是,如果您的字符串太大,则通常甚至要避免将未拆分的字符串存储在内存中。最好只从文件中读取字符串,该文件已经允许您以行形式遍历该字符串。

但是,如果您确实已经在内存中存储了一个巨大的字符串,则一种方法是使用StringIO,它为字符串提供了一个类似于文件的接口,包括允许逐行迭代(内部使用.find查找下一个换行符)。您将得到:

import StringIO
s = StringIO.StringIO(myString)
for line in s:
    do_something_with(line)

I’m not sure what you mean by “then again by the parser”. After the splitting has been done, there’s no further traversal of the string, only a traversal of the list of split strings. This will probably actually be the fastest way to accomplish this, so long as the size of your string isn’t absolutely huge. The fact that python uses immutable strings means that you must always create a new string, so this has to be done at some point anyway.

If your string is very large, the disadvantage is in memory usage: you’ll have the original string and a list of split strings in memory at the same time, doubling the memory required. An iterator approach can save you this, building a string as needed, though it still pays the “splitting” penalty. However, if your string is that large, you generally want to avoid even the unsplit string being in memory. It would be better just to read the string from a file, which already allows you to iterate through it as lines.

However if you do have a huge string in memory already, one approach would be to use StringIO, which presents a file-like interface to a string, including allowing iterating by line (internally using .find to find the next newline). You then get:

import StringIO
s = StringIO.StringIO(myString)
for line in s:
    do_something_with(line)

回答 2

如果我没有看错Modules/cStringIO.c,这应该是非常有效的(尽管有些冗长):

from cStringIO import StringIO

def iterbuf(buf):
    stri = StringIO(buf)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip()
        else:
            raise StopIteration

If I read Modules/cStringIO.c correctly, this should be quite efficient (although somewhat verbose):

from cStringIO import StringIO

def iterbuf(buf):
    stri = StringIO(buf)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip()
        else:
            raise StopIteration

回答 3

基于正则表达式的搜索有时比生成器方法要快:

RRR = re.compile(r'(.*)\n')
def f4(arg):
    return (i.group(1) for i in RRR.finditer(arg))

Regex-based searching is sometimes faster than the generator approach:

import re

RRR = re.compile(r'(.*)\n')
def f4(arg):
    return (i.group(1) for i in RRR.finditer(arg))

回答 4

我想你可以自己动手:

def parse(string):
    retval = ''
    for char in string:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

我不确定此实现的效率如何,但这只会在您的字符串上迭代一次。

嗯,生成器。

编辑:

当然,您还想添加想要执行的任何类型的解析操作,但这很简单。

I suppose you could roll your own:

def parse(string):
    retval = ''
    for char in string:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

I’m not sure how efficient this implementation is, but that will only iterate over your string once.

Mmm, generators.

Edit:

Of course you’ll also want to add in whatever type of parsing actions you want to take, but that’s pretty simple.


回答 5

您可以遍历“文件”,该文件将产生包括尾随换行符在内的行。要使用字符串制作“虚拟文件”,可以使用StringIO

import io  # for Py2.7 that would be import cStringIO as io

for line in io.StringIO(foo):
    print(repr(line))

You can iterate over “a file”, which produces lines, including the trailing newline character. To make a “virtual file” out of a string, you can use StringIO:

import io  # for Py2.7 that would be import cStringIO as io

for line in io.StringIO(foo):
    print(repr(line))
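
Since the question needs a true iterator that also supports direct next() calls, here is a short sketch showing that an io.StringIO object behaves that way (the sample text is my own):

```python
import io

foo = "first\nsecond\nthird\n"
lines = io.StringIO(foo)

# Skip one line explicitly, the way the parser in the question does
header = next(lines)

# The loop continues from where next() left off
rest = [line.rstrip("\n") for line in lines]

print(repr(header), rest)
```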

如何将数据作为字符串(而非文件)写入CSV格式?

问题:如何将数据作为字符串(而非文件)写入CSV格式?

我想将数据[1,2,'a','He said "what do you mean?"']转换为CSV格式的字符串。

通常会用到csv.writer()它,因为它处理所有疯狂的情况(逗号转义,引号转义,CSV方言等)。捕获的结果是csv.writer()期望输出到文件对象,而不是字符串。

我当前的解决方案是此功能有点怪异:

def CSV_String_Writeline(data):
    class Dummy_Writer:
        def write(self,instring):
            self.outstring = instring.strip("\r\n")
    dw = Dummy_Writer()
    csv_w = csv.writer( dw )
    csv_w.writerow(data)
    return dw.outstring

谁能提供一种仍然可以很好地处理边缘情况的更优雅的解决方案?

编辑:这是我最终完成的方式:

def csv2string(data):
    si = StringIO.StringIO()
    cw = csv.writer(si)
    cw.writerow(data)
    return si.getvalue().strip('\r\n')

I want to cast data like [1,2,'a','He said "what do you mean?"'] to a CSV-formatted string.

Normally one would use csv.writer() for this, because it handles all the crazy edge cases (comma escaping, quote mark escaping, CSV dialects, etc.) The catch is that csv.writer() expects to output to a file object, not to a string.

My current solution is this somewhat hacky function:

def CSV_String_Writeline(data):
    class Dummy_Writer:
        def write(self,instring):
            self.outstring = instring.strip("\r\n")
    dw = Dummy_Writer()
    csv_w = csv.writer( dw )
    csv_w.writerow(data)
    return dw.outstring

Can anyone give a more elegant solution that still handles the edge cases well?

Edit: Here’s how I ended up doing it:

def csv2string(data):
    si = StringIO.StringIO()
    cw = csv.writer(si)
    cw.writerow(data)
    return si.getvalue().strip('\r\n')

回答 0

您可以使用StringIO而不是自己的Dummy_Writer

此模块实现了类似文件的类,该类StringIO读写字符串缓冲区(也称为内存文件)。

还有cStringIO,这是StringIO该类的更快版本。

You could use StringIO instead of your own Dummy_Writer:

This module implements a file-like class, StringIO, that reads and writes a string buffer (also known as memory files).

There is also cStringIO, which is a faster version of the StringIO class.


回答 1

在Python 3中:

>>> import io
>>> import csv
>>> output = io.StringIO()
>>> csvdata = [1,2,'a','He said "what do you mean?"',"Whoa!\nNewlines!"]
>>> writer = csv.writer(output, quoting=csv.QUOTE_NONNUMERIC)
>>> writer.writerow(csvdata)
59
>>> output.getvalue()
'1,2,"a","He said ""what do you mean?""","Whoa!\nNewlines!"\r\n'

对于Python 2,需要更改一些细节:

>>> output = io.BytesIO()
>>> writer = csv.writer(output)
>>> writer.writerow(csvdata)
57L
>>> output.getvalue()
'1,2,a,"He said ""what do you mean?""","Whoa!\nNewlines!"\r\n'

In Python 3:

>>> import io
>>> import csv
>>> output = io.StringIO()
>>> csvdata = [1,2,'a','He said "what do you mean?"',"Whoa!\nNewlines!"]
>>> writer = csv.writer(output, quoting=csv.QUOTE_NONNUMERIC)
>>> writer.writerow(csvdata)
59
>>> output.getvalue()
'1,2,"a","He said ""what do you mean?""","Whoa!\nNewlines!"\r\n'

Some details need to be changed a bit for Python 2:

>>> output = io.BytesIO()
>>> writer = csv.writer(output)
>>> writer.writerow(csvdata)
57L
>>> output.getvalue()
'1,2,a,"He said ""what do you mean?""","Whoa!\nNewlines!"\r\n'
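
One way to check that the edge cases survive is to parse the generated string back with csv.reader. The sketch below is for Python 3 and uses string values throughout, on the assumption that you want an exact round trip (csv.reader returns every field as a string by default).

```python
import csv
import io

row = ["1", "2", "a", 'He said "what do you mean?"', "Whoa!\nNewlines!"]

# Write the row into an in-memory buffer.
out = io.StringIO()
csv.writer(out).writerow(row)
text = out.getvalue()

# Read it back; newline="" leaves line endings untouched for the csv module.
parsed = next(csv.reader(io.StringIO(text, newline="")))

print(parsed == row)
```

The embedded quotes and the embedded newline both come back unchanged because the writer quoted those fields.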

Answer 2

I found the answers, all in all, a bit confusing. For Python 2, this usage worked for me:

import csv, io

def csv2string(data):
    si = io.BytesIO()
    cw = csv.writer(si)
    cw.writerow(data)
    return si.getvalue().strip('\r\n')

data=[1,2,'a','He said "what do you mean?"']
print csv2string(data)

Answer 3

Since I use this quite a lot to stream results asynchronously from sanic back to the user as CSV data, I wrote the following snippet for Python 3.

The snippet lets you reuse the same StringIO buffer over and over again.


import csv
from io import StringIO


class ArgsToCsv:
    def __init__(self, separator=","):
        self.separator = separator
        self.buffer = StringIO()
        # Pass the separator through so it actually takes effect.
        self.writer = csv.writer(self.buffer, delimiter=self.separator)

    def stringify(self, *args):
        self.writer.writerow(args)
        value = self.buffer.getvalue().strip("\r\n")
        # Reset the buffer so the next call starts empty.
        self.buffer.seek(0)
        self.buffer.truncate(0)
        return value + "\n"

Example:

csv_formatter = ArgsToCsv()

output = ""  # accumulate the CSV lines here
output += csv_formatter.stringify(
    10,
    """
    lol i have some pretty
    "freaky"
    strings right here \' yo!
    """,
    [10, 20, 30],
)

Check out further usage in the GitHub gist: source and test
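
The buffer-reuse trick can also be sketched as a generator, which may suit streaming responses. This is an illustrative variant rather than the author's code, and the name iter_csv_lines is hypothetical.

```python
import csv
import io


def iter_csv_lines(rows):
    """Yield each row as one CSV line, reusing a single in-memory buffer."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
        yield buffer.getvalue()  # includes the writer's "\r\n" terminator
        # Empty the buffer so the next row starts from scratch.
        buffer.seek(0)
        buffer.truncate(0)


for line in iter_csv_lines([[1, "a,b"], [2, [10, 20]]]):
    print(repr(line))
```

Non-string values such as the list above are converted with str() by the csv module, so the list ends up as a single quoted field.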


Answer 4

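import csv
from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

# Read the whole file into a string.
with open('file.csv') as f:
    text = f.read()

# Wrap the string in a file-like object so csv can parse it.
stream = StringIO(text)

csv_file = csv.DictReader(stream)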

Answer 5

Here’s a version that works with UTF-8. csvline2string handles a single row and returns it without a trailing line break; csv2string handles many rows and keeps the line breaks:

import csv, io

def csvline2string(one_line_of_data):
    # Python 2: csv writes byte strings, so use a bytes buffer.
    # (BytesIO.StringIO does not exist; on Python 3 use io.StringIO() instead.)
    si = io.BytesIO()
    cw = csv.writer(si)
    cw.writerow(one_line_of_data)
    return si.getvalue().strip('\r\n')

def csv2string(data):
    si = io.BytesIO()
    cw = csv.writer(si)
    for one_line_of_data in data:
        cw.writerow(one_line_of_data)
    return si.getvalue()
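
On Python 3 the csv module works with text rather than bytes, so a rough equivalent of the multi-row function can use io.StringIO and encode to UTF-8 only when bytes are actually needed. The name rows_to_csv_string is an assumption made for this sketch.

```python
import csv
import io


def rows_to_csv_string(rows):
    """Format many rows as CSV text, keeping the line breaks between rows."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)  # writerows handles the loop
    return buffer.getvalue()


text = rows_to_csv_string([["naïve", "日本語"], [1, 2]])
data = text.encode("utf-8")  # encode only at the boundary, e.g. before sending
print(text)
```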