Tag archive: string

How can I print literal curly-brace characters in a Python string and also use .format on it?

Question: How can I print literal curly-brace characters in a Python string and also use .format on it?

x = " \{ Hello \} {0} "
print(x.format(42))

This gives me: KeyError: Hello\\

I want to print the output: {Hello} 42


Answer 0

You need to double the {{ and }}:

>>> x = " {{ Hello }} {0} "
>>> x.format(42)
' { Hello } 42 '

Here’s the relevant part of the Python documentation for format string syntax:

Format strings contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output. If you need to include a brace character in the literal text, it can be escaped by doubling: {{ and }}.
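As a quick check of the doubling rule quoted above, here is a minimal sketch. The variable names are illustrative and not from the original post. It formats a template with escaped braces and a real replacement field, then shows that unescaped braces are parsed as a field name.

```python
# Doubled braces come out as single literal braces;
# {0} is still treated as a replacement field.
template = "{{Hello}} {0}"
result = template.format(42)
print(result)  # {Hello} 42

# Unescaped braces are parsed as a field named "Hello",
# which is what triggers the KeyError from the question.
try:
    "{Hello} {0}".format(42)
except KeyError as exc:
    error_name = type(exc).__name__
    print(error_name)  # KeyError
```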


Answer 1

You escape the braces by doubling them.

For example:

x = "{{ Hello }} {0}"
print(x.format(42))

Answer 2

Python 3.6+ (2017)

In recent versions of Python one would use f-strings (see also PEP 498).

With f-strings one should use doubled braces, {{ and }}:

n = 42  
print(f" {{Hello}} {n} ")

produces the desired

 {Hello} 42

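If you need to evaluate an expression inside the braces instead of printing literal text, you’ll need three sets of braces: the outer two pairs are escaped literal braces and the innermost pair is the replacement field: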

hello = "HELLO"
print(f"{{{hello.lower()}}}")

produces

{hello}
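The two f-string cases above can be put side by side in a small sketch (Python 3.6+ assumed). The names `literal` and `wrapped` are only for illustration.

```python
n = 42
hello = "HELLO"

# Doubled braces are literal; {n} is evaluated.
literal = f"{{Hello}} {n}"

# "{{" + "{hello.lower()}" + "}}" gives literal braces around the value.
wrapped = f"{{{hello.lower()}}}"

print(literal)  # {Hello} 42
print(wrapped)  # {hello}
```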

Answer 3

The OP wrote this comment:

I was trying to format a small JSON for some purposes, like this: '{"all": false, "selected": "{}"}'.format(data) to get something like {"all": false, "selected": "1,2"}

It’s pretty common that the “escaping braces” issue comes up when dealing with JSON.

I suggest doing this:

import json
data = "1,2"
# Use the Python boolean False so that it is serialized as JSON false
mydict = {"all": False, "selected": data}
json.dumps(mydict)

It’s cleaner than the alternative, which is:

'{{"all": false, "selected": "{}"}}'.format(data)

Using the json library is definitely preferable when the JSON string gets more complicated than the example.
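To make the comparison concrete, the sketch below builds the same JSON text both ways. It assumes the Python boolean `False` is wanted for the `all` key, and the `tricky` value is a made-up input that shows what a hand-written template would not handle.

```python
import json

data = "1,2"

# Serializing a dict lets the json module handle quoting and braces.
via_json = json.dumps({"all": False, "selected": data})

# The hand-written template needs every literal brace doubled.
via_format = '{{"all": false, "selected": "{}"}}'.format(data)

print(via_json)    # {"all": false, "selected": "1,2"}
print(via_format)  # {"all": false, "selected": "1,2"}

# json.dumps also escapes quotes inside values, which the template would not.
tricky = 'say "hi"'
escaped = json.dumps({"selected": tricky})
print(escaped)
```

For the simple input both approaches give the same text. They diverge as soon as a value contains characters that need escaping.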


Answer 4

Try doing this:

x = " {{ Hello }} {0} "
print(x.format(42))

Answer 5

Try this:

x = "{{ Hello }} {0}"


Answer 6

Although it is not any better, just for reference, you can also do this:

>>> x = '{}Hello{} {}'
>>> print(x.format('{', '}', 42))
{Hello} 42

It can be useful, for example, when someone wants to print {argument}. It is perhaps more readable than '{{{}}}'.format('argument').

Note that you can omit argument positions (e.g. {} instead of {0}) from Python 2.7 onward.


Answer 7

If you are going to be doing this a lot, it might be good to define a utility function that lets you use arbitrary brace substitutes instead, like this:

def custom_format(string, brackets, *args, **kwargs):
    if len(brackets) != 2:
        raise ValueError('Expected two brackets. Got {}.'.format(len(brackets)))
    padded = string.replace('{', '{{').replace('}', '}}')
    substituted = padded.replace(brackets[0], '{').replace(brackets[1], '}')
    formatted = substituted.format(*args, **kwargs)
    return formatted

>>> custom_format('{{[cmd]} process 1}', brackets='[]', cmd='firefox.exe')
'{{firefox.exe} process 1}'

Note that this will work either with brackets being a string of length 2 or an iterable of two strings (for multi-character delimiters).
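To illustrate the note about multi-character delimiters, the sketch below repeats the function from this answer unchanged and calls it once with a two-character string and once with a tuple of two strings. The `<<` and `>>` delimiters and the `pid` template are arbitrary choices made for this example.

```python
def custom_format(string, brackets, *args, **kwargs):
    if len(brackets) != 2:
        raise ValueError('Expected two brackets. Got {}.'.format(len(brackets)))
    # Escape every real brace, then turn the substitute brackets into braces.
    padded = string.replace('{', '{{').replace('}', '}}')
    substituted = padded.replace(brackets[0], '{').replace(brackets[1], '}')
    return substituted.format(*args, **kwargs)

# brackets as a string of length 2
single = custom_format('{{[cmd]} process 1}', brackets='[]', cmd='firefox.exe')

# brackets as an iterable of two multi-character strings
multi = custom_format('{"pid": <<pid>>}', brackets=('<<', '>>'), pid=1234)

print(single)  # {{firefox.exe} process 1}
print(multi)   # {"pid": 1234}
```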


Answer 8

I recently ran into this, because I wanted to inject strings into preformatted JSON. My solution was to create a helper method, like this:

def preformat(msg):
    """ allow {{key}} to be used for formatting in text
    that already uses curly braces.  First switch this into
    something else, replace curlies with double curlies, and then
    switch back to regular braces
    """
    msg = msg.replace('{{', '<<<').replace('}}', '>>>')
    msg = msg.replace('{', '{{').replace('}', '}}')
    msg = msg.replace('<<<', '{').replace('>>>', '}')
    return msg

You can then do something like:

formatted = preformat("""
    {
        "foo": "{{bar}}"
    }""").format(bar="gas")

Gets the job done if performance is not an issue.


Answer 9

If you need to keep two curly braces in the string, you need 5 curly braces on each side of the variable.

>>> myvar = 'test'
>>> "{{{{{0}}}}}".format(myvar)
'{{test}}'

Answer 10

The reason is that {} is the replacement-field syntax of .format(), so in your case .format() tries to look up a field named Hello, cannot find it, and raises an error.

You can avoid this by using double curly braces {{ }}:

x = " {{ Hello }} {0} "

or

use %s for text formatting, which does not treat braces specially:

x = " { Hello } %s"
print(x % 42)

Answer 11

I stumbled upon this problem when trying to print text that I could copy and paste into a LaTeX document. I build on this answer and make use of named replacement fields:

Let's say you want to print out a product of multiple variables with indices, which in LaTeX would be $A_{ 0042 }*A_{ 3141 }*A_{ 2718 }*A_{ 0042 }$. The following code does the job with named fields, so that it stays readable even with many indices:

idx_mapping = {'i1': 42, 'i2': 3141, 'i3': 2718}
print('$A_{{ {i1:04d} }} * A_{{ {i2:04d} }} * A_{{ {i3:04d} }} * A_{{ {i1:04d} }}$'.format(**idx_mapping))

Answer 12

If you want to only print one curly brace (for example {) you can use {{, and you can add more braces later in the string if you want. For example:

>>> f'{{ there is a curly brace on the left. Oh, and 1 + 1 is {1 + 1}'
'{ there is a curly brace on the left. Oh, and 1 + 1 is 2'

Answer 13

When you’re just trying to interpolate code strings, I’d suggest using jinja2, which is a full-featured template engine for Python, e.g.:

from jinja2 import Template

foo = Template('''
#include <stdio.h>

void main() {
    printf("hello universe number {{number}}");
}
''')

for i in range(2):
    print(foo.render(number=i))

So you are not forced to double the curly braces, as many of the other answers suggest.


Answer 14

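You can do this by using a raw string, simply adding the prefix r before the string literal. Note that this only helps when you do not call .format on the string: braces in any plain string literal are printed as-is, and the r prefix changes only how backslashes are handled.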

# to print '{I am inside braces}'
print(r'{I am inside braces}')

How do I pad a string with zeros?

Question: How do I pad a string with zeros?

What is a Pythonic way to pad a numeric string with zeroes to the left, i.e. so the numeric string has a specific length?


Answer 0

Strings:

>>> n = '4'
>>> print(n.zfill(3))
004

And for numbers:

>>> n = 4
>>> print(f'{n:03}') # Preferred method, python >= 3.6
004
>>> print('%03d' % n)
004
>>> print(format(n, '03')) # python >= 2.6
004
>>> print('{0:03d}'.format(n))  # python >= 2.6 + python 3
004
>>> print('{foo:03d}'.format(foo=n))  # python >= 2.6 + python 3
004
>>> print('{:03d}'.format(n))  # python >= 2.7 + python3
004

String formatting documentation.
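The string and number cases above can be compared directly in a short sketch. The variable names are illustrative only, and the f-string line assumes Python 3.6+.

```python
text = "4"
number = 4

padded_text = text.zfill(3)            # str method: pads a string
padded_number = f"{number:03d}"        # format spec: pads an int
converted = str(number).zfill(3)       # convert to str first, then pad

# Both approaches keep a leading minus sign in front of the zeros.
negative_text = "-4".zfill(3)
negative_number = f"{-4:03d}"

print(padded_text, padded_number, converted)  # 004 004 004
print(negative_text, negative_number)         # -04 -04
```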


Answer 1

Just use the rjust method of the string object.

This example produces a string 10 characters long, padding as necessary.

>>> t = 'test'
>>> t.rjust(10, '0')
'000000test'

Answer 2

Besides zfill, you can use general string formatting:

print(f'{number:05d}') # (since Python 3.6), or
print('{:05d}'.format(number)) # or
print('{0:05d}'.format(number)) # or (explicit 0th positional arg. selection)
print('{n:05d}'.format(n=number)) # or (explicit `n` keyword arg. selection)
print(format(number, '05d'))

Documentation for string formatting and f-strings.


Answer 3

For Python 3.6+ using f-strings:

>>> i = 1
>>> f"{i:0>2}"  # Works for both numbers and strings.
'01'
>>> f"{i:02}"  # Works only for numbers.
'01'

For Python 2 to Python 3.5:

>>> "{:0>2}".format("1")  # Works for both numbers and strings.
'01'
>>> "{:02}".format(1)  # Works only for numbers.
'01'

Answer 4

>>> '99'.zfill(5)
'00099'
>>> '99'.rjust(5,'0')
'00099'

If you want the opposite (zeros added on the right):

>>> '99'.ljust(5,'0')
'99000'

Answer 5

str(n).zfill(width) will work with strings, ints, floats… and is Python 2.x and 3.x compatible:

>>> n = 3
>>> str(n).zfill(5)
'00003'
>>> n = '3'
>>> str(n).zfill(5)
'00003'
>>> n = '3.0'
>>> str(n).zfill(5)
'003.0'

Answer 6

For those who came here to understand and not just for a quick answer: I do this especially for time strings:

hour = 4
minute = 3
"{:0>2}:{:0>2}".format(hour,minute)
# prints 04:03

"{:0>3}:{:0>5}".format(hour,minute)
# prints '004:00003'

"{:0<3}:{:0<5}".format(hour,minute)
# prints '400:30000'

"{:$<3}:{:#<5}".format(hour,minute)
# prints '4$$:3####'

“0” is the fill character used for padding (the default is a space), and “2” is the minimum width of the field

“>” right-aligns the value, so the “0” fill characters are placed to the left of the string

“:” introduces the format_spec
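As a sketch of how the pieces of the format spec combine (fill character, alignment, width), the example below reuses `hour` and `minute` from the answer. The `*` fill and the variable names are chosen only for illustration.

```python
hour, minute = 4, 3

# {:0>2} means: fill with "0", align right (">"), minimum width 2
clock = "{:0>2}:{:0>2}".format(hour, minute)

# Same structure with a different fill character and alignment.
right_aligned = "{:*>5}".format(hour)  # fill goes on the left
left_aligned = "{:*<5}".format(hour)   # fill goes on the right
centered = "{:*^5}".format(hour)       # fill goes on both sides

print(clock)          # 04:03
print(right_aligned)  # ****4
print(left_aligned)   # 4****
print(centered)       # **4**
```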


Answer 7

What is the most pythonic way to pad a numeric string with zeroes to the left, i.e., so the numeric string has a specific length?

str.zfill is specifically intended to do this:

>>> '1'.zfill(4)
'0001'

Note that it is specifically intended to handle numeric strings as requested, and moves a + or - to the beginning of the string:

>>> '+1'.zfill(4)
'+001'
>>> '-1'.zfill(4)
'-001'

Here’s the help on str.zfill:

>>> help(str.zfill)
Help on method_descriptor:

zfill(...)
    S.zfill(width) -> str

    Pad a numeric string S with zeros on the left, to fill a field
    of the specified width. The string S is never truncated.

Performance

This is also the most performant of the alternative methods:

>>> import timeit
>>> min(timeit.repeat(lambda: '1'.zfill(4)))
0.18824880896136165
>>> min(timeit.repeat(lambda: '1'.rjust(4, '0')))
0.2104538488201797
>>> min(timeit.repeat(lambda: f'{1:04}'))
0.32585487607866526
>>> min(timeit.repeat(lambda: '{:04}'.format(1)))
0.34988890308886766

To compare apples to apples with the % method, whose constant expression would otherwise be pre-calculated, both expressions below are forced to be evaluated at run time (note that % is actually slower):

>>> min(timeit.repeat(lambda: '1'.zfill(0 or 4)))
0.19728074967861176
>>> min(timeit.repeat(lambda: '%04d' % (0 or 1)))
0.2347015216946602

Implementation

With a little digging, I found the implementation of the zfill method in Objects/stringlib/transmogrify.h:

static PyObject *
stringlib_zfill(PyObject *self, PyObject *args)
{
    Py_ssize_t fill;
    PyObject *s;
    char *p;
    Py_ssize_t width;

    if (!PyArg_ParseTuple(args, "n:zfill", &width))
        return NULL;

    if (STRINGLIB_LEN(self) >= width) {
        return return_self(self);
    }

    fill = width - STRINGLIB_LEN(self);

    s = pad(self, fill, 0, '0');

    if (s == NULL)
        return NULL;

    p = STRINGLIB_STR(s);
    if (p[fill] == '+' || p[fill] == '-') {
        /* move sign to beginning of string */
        p[0] = p[fill];
        p[fill] = '0';
    }

    return s;
}

Let’s walk through this C code.

It first parses the argument positionally, meaning it doesn’t allow keyword arguments:

>>> '1'.zfill(width=4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: zfill() takes no keyword arguments

It then checks if it’s the same length or longer, in which case it returns the string.

>>> '1'.zfill(0)
'1'

zfill calls pad (this pad function is also called by ljust, rjust, and center). It basically copies the contents into a new string and fills in the padding.

static inline PyObject *
pad(PyObject *self, Py_ssize_t left, Py_ssize_t right, char fill)
{
    PyObject *u;

    if (left < 0)
        left = 0;
    if (right < 0)
        right = 0;

    if (left == 0 && right == 0) {
        return return_self(self);
    }

    u = STRINGLIB_NEW(NULL, left + STRINGLIB_LEN(self) + right);
    if (u) {
        if (left)
            memset(STRINGLIB_STR(u), fill, left);
        memcpy(STRINGLIB_STR(u) + left,
               STRINGLIB_STR(self),
               STRINGLIB_LEN(self));
        if (right)
            memset(STRINGLIB_STR(u) + left + STRINGLIB_LEN(self),
                   fill, right);
    }

    return u;
}

After calling pad, zfill moves any originally preceding + or - to the beginning of the string.

Note that the original string is not actually required to be numeric:

>>> '+foo'.zfill(10)
'+000000foo'
>>> '-foo'.zfill(10)
'-000000foo'

Answer 8

width = 10
x = 5
print "%0*d" % (width, x)
> 0000000005

See the print documentation for all the exciting details!

Update for Python 3.x (7.5 years later)

That last line should now be:

print("%0*d" % (width, x))

I.e. print() is now a function, not a statement. Note that I still prefer the Old School printf() style because, IMNSHO, it reads better, and because, um, I’ve been using that notation since January, 1980. Something … old dogs .. something something … new tricks.
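For comparison, the same run-time width can be passed with the newer formatting styles as well. The sketch below shows the `*` form from this answer next to a nested replacement field. The variable names on the left are illustrative, and the f-string line assumes Python 3.6+.

```python
width = 10
x = 5

percent_style = "%0*d" % (width, x)        # * takes the width from the tuple
format_style = "{:0{}d}".format(x, width)  # nested field supplies the width
fstring_style = f"{x:0{width}d}"           # same idea inside an f-string

print(percent_style)  # 0000000005
print(format_style)   # 0000000005
print(fstring_style)  # 0000000005
```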


Answer 9

When using Python >= 3.6, the cleanest way is to use f-strings with string formatting:

>>> s = f"{1:08}"  # inline with int
>>> s
'00000001'
>>> s = f"{'1':0>8}"  # inline with str
>>> s
'00000001'
>>> n = 1
>>> s = f"{n:08}"  # int variable
>>> s
'00000001'
>>> c = "1"
>>> s = f"{c:0>8}"  # str variable
>>> s
'00000001'

I would prefer formatting with an int, since only then the sign is handled correctly:

>>> f"{-1:08}"
'-0000001'

>>> f"{1:+08}"
'+0000001'

>>> f"{'-1':0>8}"
'000000-1'

Answer 10

For zip codes saved as integers:

>>> a = 6340
>>> b = 90210
>>> print('%05d' % a)
06340
>>> print('%05d' % b)
90210

Answer 11

Quick timing comparison:

setup = '''
from random import randint
def test_1():
    num = randint(0,1000000)
    return str(num).zfill(7)
def test_2():
    num = randint(0,1000000)
    return format(num, '07')
def test_3():
    num = randint(0,1000000)
    return '{0:07d}'.format(num)
def test_4():
    num = randint(0,1000000)
    return format(num, '07d')
def test_5():
    num = randint(0,1000000)
    return '{:07d}'.format(num)
def test_6():
    num = randint(0,1000000)
    return '{x:07d}'.format(x=num)
def test_7():
    num = randint(0,1000000)
    return str(num).rjust(7, '0')
'''
import timeit
print(timeit.Timer("test_1()", setup=setup).repeat(3, 900000))
print(timeit.Timer("test_2()", setup=setup).repeat(3, 900000))
print(timeit.Timer("test_3()", setup=setup).repeat(3, 900000))
print(timeit.Timer("test_4()", setup=setup).repeat(3, 900000))
print(timeit.Timer("test_5()", setup=setup).repeat(3, 900000))
print(timeit.Timer("test_6()", setup=setup).repeat(3, 900000))
print(timeit.Timer("test_7()", setup=setup).repeat(3, 900000))


> [2.281613943830961, 2.2719342631547077, 2.261691106209631]
> [2.311480238815406, 2.318420542148333, 2.3552384305184493]
> [2.3824197456864304, 2.3457239951596485, 2.3353268829498646]
> [2.312442972404032, 2.318053102249902, 2.3054072168069872]
> [2.3482314132374853, 2.3403386400002475, 2.330108825844775]
> [2.424549090688892, 2.4346475296851438, 2.429691196530058]
> [2.3259756401716487, 2.333549212826732, 2.32049893822186]

I ran the tests with several different repetition counts. The differences are not huge, but in all tests the zfill solution was the fastest.


Answer 12

Another approach would be to use a list comprehension with a condition checking for lengths. Below is a demonstration:

# input list of strings to which we want to prepend zeros
In [71]: list_of_str = ["101010", "10101010", "11110", "0000"]

# target length for every string
In [72]: desired_len = 8

# prepend zeros to bring each string to desired_len, if it is shorter than that
In [83]: ["0"*(desired_len-len(s)) + s if len(s) < desired_len else s for s in list_of_str]
Out[83]: ['00101010', '10101010', '00011110', '00000000']

Answer 13

This works too:

 h = 2
 m = 7
 s = 3
 print("%02d:%02d:%02d" % (h, m, s))

so the output will be: “02:07:03”


Answer 14

You could also repeat “0”, prepend it to str(n) and take the rightmost slice of length width. A quick and dirty little expression; note that it truncates n if str(n) is longer than width.

def pad_left(n, width, pad="0"):
    return ((pad * width) + str(n))[-width:]

How do I check if a string is empty?

Question: How do I check if a string is empty?

Does Python have something like an empty string variable where you can do:

if myString == string.empty:

Regardless, what’s the most elegant way to check for empty string values? I find hard coding "" every time for checking an empty string not as good.


Answer 0

Empty strings are “falsy” which means they are considered false in a Boolean context, so you can just do this:

if not myString:

This is the preferred way if you know that your variable is a string. If your variable could also be some other type then you should use myString == "". See the documentation on Truth Value Testing for other values that are false in Boolean contexts.
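The difference between the two checks shows up as soon as the variable can hold something other than a string. In the sketch below, `is_empty_string` and the `candidates` list are hypothetical names introduced for illustration.

```python
def is_empty_string(value):
    """True only for the empty string itself (explicit comparison)."""
    return value == ""


candidates = ["", "text", None, 0, []]

# "not value" is true for every falsy object, not just for "".
falsy_flags = [not value for value in candidates]

# The explicit comparison singles out the empty string.
empty_flags = [is_empty_string(value) for value in candidates]

for value, falsy, empty in zip(candidates, falsy_flags, empty_flags):
    print(repr(value), falsy, empty)
```

So `not myString` is the idiomatic choice when the value is known to be a string, while the explicit comparison avoids treating `None`, `0`, or an empty list as an empty string.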


Answer 1

From PEP 8, in the “Programming Recommendations” section:

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

So you should use:

if not some_string:

or:

if some_string:

Just to clarify, sequences are evaluated to False or True in a Boolean context if they are empty or not. They are not equal to False or True.


Answer 2

The most elegant way would probably be to simply check whether it is truthy or falsy, e.g.:

if not my_string:

However, you may want to strip white space because:

 >>> bool("")
 False
 >>> bool("   ")
 True
 >>> bool("   ".strip())
 False

You should probably be a bit more explicit about this, however, unless you know for sure that the string has passed some kind of validation and can be tested this way.


Answer 3

I would test noneness before stripping. Also, I would use the fact that empty strings are False (or Falsy). This approach is similar to Apache’s StringUtils.isBlank or Guava’s Strings.isNullOrEmpty

This is what I would use to test if a string is either None OR Empty OR Blank:

def isBlank (myString):
    if myString and myString.strip():
        #myString is not None AND myString is not empty or blank
        return False
    #myString is None OR myString is empty or blank
    return True

And, the exact opposite to test if a string is not None NOR Empty NOR Blank:

def isNotBlank (myString):
    if myString and myString.strip():
        #myString is not None AND myString is not empty or blank
        return True
    #myString is None OR myString is empty or blank
    return False

More concise forms of the above code:

def isBlank (myString):
    return not (myString and myString.strip())

def isNotBlank (myString):
    return bool(myString and myString.strip())

Answer 4

I once wrote something similar to Bartek’s answer, inspired by JavaScript:

def is_not_blank(s):
    return bool(s and s.strip())

Test:

print(is_not_blank(""))     # False
print(is_not_blank("   "))  # False
print(is_not_blank("ok"))   # True
print(is_not_blank(None))   # False

Answer 5

The only really solid way of doing this is the following:

if "".__eq__(myString):

All other solutions have possible problems and edge cases where the check can fail.

len(myString)==0 can fail if myString is an object of a class that inherits from str and overrides the __len__() method.

Similarly myString == "" and myString.__eq__("") can fail if myString overrides __eq__() and __ne__().

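The reversed form "" == myString also gets fooled if myString overrides __eq__(), because Python tries the right-hand operand's method first when its type is a subclass of the left-hand operand's type.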

myString is "" and "" is myString are equivalent. They will both fail if myString is not actually a string but an instance of a subclass of str (both will return False). Also, since they are identity checks, the only reason they work at all is that Python uses string pooling (also called string interning), which reuses the same instance of a string if it is interned (see here: Why does comparing strings using either ‘==’ or ‘is’ sometimes produce a different result?). And "" is interned from the start in CPython.

The big problem with the identity check is that (as far as I could find) it is not standardised which strings are interned. That means that, theoretically, "" is not necessarily interned; that is implementation dependent.

The only way of doing this that really cannot be fooled is the one mentioned in the beginning: "".__eq__(myString). Since this explicitly calls the __eq__() method of the empty string it cannot be fooled by overriding any methods in myString and solidly works with subclasses of str.

Also, relying on the falsiness of a string might not work if the object overrides its __bool__() method.

This is not only theoretical work but might actually be relevant in real usage since I have seen frameworks and libraries subclassing str before and using myString is "" might return a wrong output there.

Also, comparing strings using is in general is a pretty evil trap since it will work correctly sometimes, but not at other times, since string pooling follows pretty strange rules.

That said, in most cases all of the mentioned solutions will work correctly. This post is mostly academic.
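
The failure modes listed above can be reproduced with a small sketch. The SneakyStr class below is hypothetical and written only for illustration: it subclasses str and overrides __eq__, __ne__, __len__ and __bool__ so that it always looks empty.

```python
class SneakyStr(str):
    """Hypothetical str subclass that pretends to be empty."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False

    def __len__(self):
        return 0

    def __bool__(self):
        return False

    # Defining __eq__ sets __hash__ to None, so restore the inherited hash.
    __hash__ = str.__hash__


my_string = SneakyStr("definitely not empty")

# Each entry maps the check, written as text, to its actual result.
checks = {
    "len(my_string) == 0": len(my_string) == 0,
    'my_string == ""': my_string == "",
    '"" == my_string': "" == my_string,
    "not my_string": not my_string,
    '"".__eq__(my_string)': "".__eq__(my_string),
}

for expression, result in checks.items():
    print(f"{expression:24} -> {result}")
```

Only the last check reports False here; the other four are fooled. One caveat worth keeping in mind: "".__eq__(x) returns NotImplemented rather than False when x is not a str at all, so the result should only be used in a condition when the argument is known to be a string.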


回答 6

测试空字符串或空字符串(较短的方法):

if myString.strip():
    print("it's not an empty or blank string")
else:
    print("it's an empty or blank string")

Test empty or blank string (shorter way):

if myString.strip():
    print("it's not an empty or blank string")
else:
    print("it's an empty or blank string")

回答 7

如果要区分空字符串和空字符串,建议使用if len(string),否则,建议仅使用if string其他人所说的方法。关于充满空格的字符串的警告仍然适用,因此请不要忘记strip

If you want to differentiate between empty and null strings, I would suggest using if len(string), otherwise, I’d suggest using simply if string as others have said. The caveat about strings full of whitespace still applies though, so don’t forget to strip.
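
A minimal sketch of telling these cases apart, assuming the value is either None or a str. The function name classify and the four labels are made up for this example. Note that len(None) raises a TypeError, so the None check has to come first.

```python
def classify(value):
    """Return 'none', 'empty', 'blank' or 'text' for a str-or-None value."""
    if value is None:
        return "none"
    if len(value) == 0:
        return "empty"
    if not value.strip():
        # Only whitespace is left, e.g. spaces, tabs or newlines.
        return "blank"
    return "text"


for sample in (None, "", "   \t\n", "hello"):
    print(repr(sample), "->", classify(sample))
```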


回答 8

if stringname:false字符串为空时给出a 。我想这不可能比这更简单。

if stringname: gives a false when the string is empty. I guess it can’t be simpler than this.


回答 9

a = ''
b = '   '
a.isspace()  # False: the empty string contains no whitespace characters
b.isspace()  # True

回答 10

每次检查空字符串时,我都发现用硬编码“”不好。

干净的代码方法

这样做:foo == ""是非常不好的做法。""是一个神奇的价值。您永远都不应检查魔术值(通常称为魔术数)

您应该做的是与描述性变量名称进行比较。

描述性变量名称

可能会认为“ empty_string”是一种描述性变量名。不是

在去做之前,请empty_string = ""以为您有一个很好的变量名可以比较。这不是“描述性变量名称”的含义。

一个好的描述性变量名称基于其上下文。你要想想空字符串什么

  • 它从何而来。
  • 为什么在那儿。
  • 为什么您需要检查它。

简单表单字段示例

您正在构建一个表单,用户可以在其中输入值。您要检查用户是否写了什么。

一个好的变量名可能是 not_filled_in

这使得代码非常可读

if formfields.name == not_filled_in:
    raise ValueError("We need your name")

全面的CSV解析示例

您正在解析CSV文件,并希望将空字符串解析为 None

(由于CSV完全基于文本,因此无法表示 None使用预定义的关键字)

一个好的变量名可能是 CSV_NONE

如果您有一个新的CSV文件,该文件None用另一个字符串表示,则使代码易于更改和调整。""

if csvfield == CSV_NONE:
    csvfield = None

毫无疑问,这段代码是否正确。很明显,它做了应该做的事情。

比较一下

if csvfield == EMPTY_STRING:
    csvfield = None

这里的第一个问题是,为什么空字符串应该得到特殊对待?

这将告诉以后的编码人员,应该始终将空字符串视为None

这是因为它将业务逻辑(应为CSV值None)与代码实现(我们实际比较的是什么)混合在一起

两者之间需要分开关注

I find hardcoding "" every time you check for an empty string not that good.

Clean code approach

Doing this: foo == "" is very bad practice. "" is a magic value. You should never check against magic values (more commonly known as magic numbers).

What you should do is compare to a descriptive variable name.

Descriptive variable names

One may think that “empty_string” is a descriptive variable name. It isn’t.

Before you go and do empty_string = "" and think you have a great variable name to compare to: this is not what “descriptive variable name” means.

A good descriptive variable name is based on its context. You have to think about what the empty string is.

  • Where does it come from?
  • Why is it there?
  • Why do you need to check for it?

Simple form field example

You are building a form where a user can enter values. You want to check if the user wrote something or not.

A good variable name may be not_filled_in

This makes the code very readable

if formfields.name == not_filled_in:
    raise ValueError("We need your name")

Thorough CSV parsing example

You are parsing CSV files and want the empty string to be parsed as None

(Since CSV is entirely text based, it cannot represent None without using predefined keywords)

A good variable name may be CSV_NONE

This makes the code easy to change and adapt if you have a new CSV file that represents None with another string than ""

if csvfield == CSV_NONE:
    csvfield = None

There is no question about whether this piece of code is correct. It is pretty clear that it does what it should do.

Compare this to

if csvfield == EMPTY_STRING:
    csvfield = None

The first question here is, Why does the empty string deserve special treatment?

This would tell future coders that an empty string should always be considered as None.

This is because it mixes business logic (What CSV value should be None) with code implementation (What are we actually comparing to)

There needs to be a separation of concerns between the two.
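
A rough sketch of the CSV case, using the standard csv module. The sample data and the parse_row helper are invented for this example, and it assumes the file represents a missing value as an empty field.

```python
import csv
import io

# Assumption for this sketch: this particular CSV source writes a missing
# value as an empty field. Another file might use "NULL" or "N/A" instead,
# in which case only this constant has to change.
CSV_NONE = ""


def parse_row(row):
    """Replace the CSV placeholder for 'no value' with None."""
    return [None if field == CSV_NONE else field for field in row]


sample = io.StringIO("name,email\nAda,\n,bob@example.com\n")
rows = [parse_row(row) for row in csv.reader(sample)]
print(rows)
```

The comparison itself is still against the empty string, but the name records why the value is special for this data source.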


回答 11

这个怎么样?也许它不是“最优雅的”,但是看起来很完整和清晰:

if (s is None) or (str(s).strip() == ""):  # string s is "empty"
    ...

How about this? Perhaps it’s not “the most elegant”, but it seems pretty complete and clear:

if (s is None) or (str(s).strip() == ""):  # string s is "empty"
    ...

回答 12

响应@ 1290。抱歉,无法格式化注释中的块。该None值在Python中不是空字符串,也不是(空格)。安德鲁·克拉克(Andrew Clark)的答案是正确的:if not myString。@rouble的答案是特定于应用程序的,不能回答OP的问题。如果您对“空白”字符串采用特殊定义,则会遇到麻烦。特别是,标准行为是str(None)产生'None'一个非空字符串。

但是,如果必须将Noneand(空格)视为“空白”字符串,这是一种更好的方法:

class weirdstr(str):
    def __new__(cls, content):
        return str.__new__(cls, content if content is not None else '')
    def __nonzero__(self):
        return bool(self.strip())

例子:

>>> normal = weirdstr('word')
>>> print normal, bool(normal)
word True

>>> spaces = weirdstr('   ')
>>> print spaces, bool(spaces)
    False

>>> blank = weirdstr('')
>>> print blank, bool(blank)
 False

>>> none = weirdstr(None)
>>> print none, bool(none)
 False

>>> if not spaces:
...     print 'This is a so-called blank string'
... 
This is a so-called blank string

满足@rouble要求,同时不破坏bool字符串的预期行为。

Responding to @1290. Sorry, no way to format blocks in comments. The None value is not an empty string in Python, and neither is (spaces). The answer from Andrew Clark is the correct one: if not myString. The answer from @rouble is application-specific and does not answer the OP’s question. You will get in trouble if you adopt a peculiar definition of what is a “blank” string. In particular, the standard behavior is that str(None) produces 'None', a non-blank string.

However if you must treat None and (spaces) as “blank” strings, here is a better way:

class weirdstr(str):
    def __new__(cls, content):
        return str.__new__(cls, content if content is not None else '')
    def __bool__(self):
        # Truthy only if something remains after stripping whitespace.
        return bool(self.strip())
    __nonzero__ = __bool__  # Python 2 spelling of the same hook

Examples:

>>> normal = weirdstr('word')
>>> print(normal, bool(normal))
word True

>>> spaces = weirdstr('   ')
>>> print(spaces, bool(spaces))
    False

>>> blank = weirdstr('')
>>> print(blank, bool(blank))
 False

>>> none = weirdstr(None)
>>> print(none, bool(none))
 False

>>> if not spaces:
...     print('This is a so-called blank string')
... 
This is a so-called blank string

Meets the @rouble requirements while not breaking the expected bool behavior of strings.


回答 13

我觉得这很优雅,因为它可以确保它是一个字符串并检查其长度:

def empty(mystring):
    assert isinstance(mystring, str)
    return len(mystring) == 0

I find this elegant as it makes sure it is a string and checks its length:

def empty(mystring):
    assert isinstance(mystring, str)
    return len(mystring) == 0

回答 14

另一个简单的方法可能是定义一个简单的函数:

def isStringEmpty(inputString):
    return len(inputString) == 0

Another easy way could be to define a simple function:

def isStringEmpty(inputString):
    return len(inputString) == 0

回答 15

not str(myString)

对于空字符串,此表达式为True。非空字符串,None和非字符串对象都将产生False,但需要注意的是,对象可能会覆盖__str__以通过返回虚假值来阻止此逻辑。

not str(myString)

This expression is True for strings that are empty. Non-empty strings, None and non-string objects will all produce False, with the caveat that an object may override __str__ to return an empty string and thwart this logic.


回答 16

您可能会看一下在Python中分配空值或字符串

这是关于比较空字符串。因此not,您可以测试您的字符串是否等于带有""空字符串的空字符串,而不是使用来测试是否为空。

You may have a look at this Assigning empty value or string in Python

This is about comparing strings that are empty. So instead of testing for emptiness with not, you may test whether your string is equal to the empty string ""…


回答 17

对于那些期望像Apache StringUtils.isBlank或Guava Strings.isNullOrEmpty这样的行为的用户:

if mystring and mystring.strip():
    print("not blank string")
else:
    print("blank string")

For those who expect behaviour like Apache StringUtils.isBlank or Guava Strings.isNullOrEmpty:

if mystring and mystring.strip():
    print("not blank string")
else:
    print("blank string")

回答 18

当您逐行读取文件并想要确定哪一行为空时,请确保您将使用.strip(),因为“空”行中有换行符:

lines = open("my_file.log", "r").readlines()

for line in lines:
    if not line.strip():
        continue

    # your code for non-empty lines

When you are reading a file line by line and want to determine which lines are empty, make sure you use .strip(), because an “empty” line still contains a newline character:

with open("my_file.log", "r") as log_file:
    for line in log_file:
        if not line.strip():
            continue

        # your code for non-empty lines

回答 19

s = ""
if not s:
    print("Empty String")
if len(s) == 0:
    print("Empty String")

回答 20

如果你只是用

not var1 

不可能将一个布尔变量False与一个空字符串区别开''

var1 = ''
not var1
> True

var1 = False
not var1
> True

但是,如果您在脚本中添加简单条件,则会有所不同:

var1  = False
not var1 and var1 != ''
> True

var1 = ''
not var1 and var1 != ''
> False

If you just use

not var1 

it is not possible to distinguish a variable that is the boolean False from an empty string '':

var1 = ''
not var1
> True

var1 = False
not var1
> True

However, if you add a simple condition to your script, the two can be told apart:

var1  = False
not var1 and var1 != ''
> True

var1 = ''
not var1 and var1 != ''
> False
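
As a sketch of a more explicit alternative, the check can be written the other way round: test that the value is a string and that it is empty. The helper name is_empty_string is invented for this example. The second printed column is the expression from the answer above, which is also True for None, 0 and [].

```python
def is_empty_string(value):
    """True only for the empty string, not for False, None, 0 or []."""
    return isinstance(value, str) and value == ""


for sample in ("", False, None, 0, [], "text"):
    # Columns: the value, the explicit check, the expression from the answer.
    print(repr(sample), is_empty_string(sample), not sample and sample != "")
```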

回答 21

如果这对某人有用,这是我构建的一种快速功能,可以用列表列表中的N / A替换空白字符串(python 2)。

y = [["1","2",""],["1","4",""]]

def replace_blank_strings_in_lists_of_lists(list_of_lists):
    new_list = []
    for one_list in list_of_lists:
        new_one_list = []
        for element in one_list:
            if element:
                new_one_list.append(element)
            else:
                new_one_list.append("N/A")
        new_list.append(new_one_list)
    return new_list


x= replace_blank_strings_in_lists_of_lists(y)
print x

这对于将列表列表发布到不接受某些字段空白的mysql数据库(在模式中标记为NN的字段,在我的情况下,这是由于复合主键引起的)非常有用。

In case this is useful to someone, here is a quick function I built to replace blank strings with N/A in lists of lists (Python 2).

y = [["1","2",""],["1","4",""]]

def replace_blank_strings_in_lists_of_lists(list_of_lists):
    new_list = []
    for one_list in list_of_lists:
        new_one_list = []
        for element in one_list:
            if element:
                new_one_list.append(element)
            else:
                new_one_list.append("N/A")
        new_list.append(new_one_list)
    return new_list


x= replace_blank_strings_in_lists_of_lists(y)
print x

This is useful for posting lists of lists to a MySQL database that does not accept blanks for certain fields (fields marked as NN in the schema; in my case, this was due to a composite primary key).


回答 22

我对”,’,’\ n’等字符串进行了一些实验。当且仅当变量foo是具有至少一个非空白字符的字符串时,我希望isNotWhitespace为True。我正在使用Python 3.6。我最终得到的是:

isWhitespace = str is type(foo) and not foo.strip()
isNotWhitespace = str is type(foo) and not not foo.strip()

如果需要,可以将其包装在方法定义中。

I did some experimentation with strings like '', ' ', '\n', etc. I want isNotWhitespace to be True if and only if the variable foo is a string with at least one non-whitespace character. I’m using Python 3.6. Here’s what I ended up with:

isWhitespace = str is type(foo) and not foo.strip()
isNotWhitespace = str is type(foo) and not not foo.strip()

Wrap this in a method definition if desired.
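
Wrapped in functions, this could look like the sketch below. The function names are made up here. Like the expressions above, it uses type(foo) is str, which rejects instances of str subclasses; isinstance(foo, str) would accept them.

```python
def is_whitespace(foo):
    """True if foo is exactly a str that is empty or whitespace-only."""
    return type(foo) is str and not foo.strip()


def is_not_whitespace(foo):
    """True if foo is exactly a str with a non-whitespace character."""
    return type(foo) is str and bool(foo.strip())


for sample in ("", " ", "\n", "a", " a ", None, 5):
    print(repr(sample), is_whitespace(sample), is_not_whitespace(sample))
```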


回答 23

如prmatta上面所述,但有误。

def isNoneOrEmptyOrBlankString (myString):
    if myString:
        if not myString.strip():
            return True
        else:
            return False
    return False

As prmatta posted above, but with mistake.

def isNoneOrEmptyOrBlankString(myString):
    # None and "" are falsy; a whitespace-only string strips down to "".
    if myString and myString.strip():
        return False
    return True

Converting an integer to a string?

Question: Converting an integer to a string?

我想在Python中将整数转换为字符串。我是徒劳地打字:

d = 15
d.str()

当我尝试将其转换为字符串时,它显示错误,例如int没有任何名为的属性str

I want to convert an integer to a string in Python. I am typecasting it in vain:

d = 15
d.str()

When I try to convert it to string, it’s showing an error like int doesn’t have any attribute called str.


回答 0

>>> str(10)
'10'
>>> int('10')
10

链接到文档:

转换为字符串是通过内置str()函数完成的,该函数基本上调用__str__()其参数的方法。

>>> str(10)
'10'
>>> int('10')
10

Links to the documentation:

Conversion to a string is done with the builtin str() function, which basically calls the __str__() method of its parameter.


回答 1

尝试这个:

str(i)

Try this:

str(i)

回答 2

Python中没有类型转换,也没有类型强制。您必须以显式方式转换变量。

要使用字符串转换对象,请使用str()函数。它适用于具有称为__str__()define 的方法的任何对象。事实上

str(a)

相当于

a.__str__()

如果要将某些内容转换为int,float等,则相同。

There is no typecast and no type coercion in Python. You have to convert your variable in an explicit way.

To convert an object to a string you use the str() function. It works with any object that has a method called __str__() defined. In fact

str(a)

is equivalent to

a.__str__()

The same if you want to convert something to int, float, etc.
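
A short sketch of what this means in practice. The Temperature class is hypothetical and exists only to show a user-defined __str__. Strictly speaking, str(a) looks the method up on the type of a, so for ordinary objects the two spellings give the same result.

```python
class Temperature:
    """Hypothetical class used only to illustrate __str__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f"{self.degrees} degrees C"


t = Temperature(21)
print(str(t))            # uses Temperature.__str__
print(t.__str__())       # same text, spelled out explicitly
print(str(15))           # int to str
print(int("15") + 1)     # str to int
print(float("1.5"))      # str to float
```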


回答 3

要管理非整数输入:

number = raw_input()
try:
    value = int(number)
except ValueError:
    value = 0

To manage non-integer inputs:

number = input()  # raw_input() in Python 2
try:
    value = int(number)
except ValueError:
    # Fall back to 0 when the input is not a valid integer.
    value = 0

回答 4

>>> i = 5
>>> print("Hello, world the number is " + i)
TypeError: must be str, not int
>>> s = str(i)
>>> print("Hello, world the number is " + s)
Hello, world the number is 5

回答 5

在Python => 3.6中,您可以使用f格式:

>>> int_value = 10
>>> f'{int_value}'
'10'
>>>

In Python >= 3.6 you can use f-string formatting:

>>> int_value = 10
>>> f'{int_value}'
'10'
>>>

回答 6

对于Python 3.6,您可以使用f-strings新功能将其转换为字符串,并且与str()函数相比,它更快,它的用法如下:

age = 45
strAge = f'{age}'

因此,Python提供了str()函数。

digit = 10
print(type(digit)) # will show <class 'int'>
convertedDigit= str(digit)
print(type(convertedDigit)) # will show <class 'str'>

有关更多详细的答案,请查看本文:将Python Int转换为String并将Python String转换为Int

For Python 3.6 and later you can use the new f-strings feature to convert to a string, and it’s faster than the str() function. It is used like this:

age = 45
strAge = f'{age}'

Python also provides the str() function for this purpose.

digit = 10
print(type(digit)) # will show <class 'int'>
convertedDigit= str(digit)
print(type(convertedDigit)) # will show <class 'str'>

For more detailed answer you can check this article: Converting Python Int to String and Python String to Int


回答 7

我认为最体面的方式是“。

i = 32   -->    `i` == '32'

The most decent way in my opinion is backticks (``). Note that this works in Python 2 only; the backtick syntax was removed in Python 3.

i = 32   -->    `i` == '32'

回答 8

可以使用%s.format

>>> "%s" % 10
'10'
>>>

(要么)

>>> '{}'.format(10)
'10'
>>>

Can use %s or .format

>>> "%s" % 10
'10'
>>>

(OR)

>>> '{}'.format(10)
'10'
>>>

回答 9

对于想要将int转换为特定数字的字符串的人,建议使用以下方法。

month = "{0:04d}".format(localtime[1])

有关更多详细信息,您可以参考堆栈溢出问题显示数字前导零

For someone who wants to convert an int to a string with a specific number of digits, the method below is recommended.

month = "{0:04d}".format(localtime[1])

For more details, you can refer to Stack Overflow question Display number with leading zeros.
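
A small sketch of the padding variants. It assumes that localtime in the snippet above is the result of time.localtime(), which is not stated in the answer, and it uses a width of 2 for the month as an illustrative choice.

```python
import time

# Assumption: localtime in the answer is a time.struct_time; index 1 is tm_mon.
month = time.localtime()[1]

padded_format = "{0:04d}".format(7)   # str.format with a format spec
padded_fstring = f"{7:04d}"           # same spec in an f-string (3.6+)
padded_zfill = str(7).zfill(4)        # convert first, then pad the string

print(padded_format, padded_fstring, padded_zfill)

month_text = "{0:02d}".format(month)  # always two digits, e.g. "03" or "11"
print(month_text)
```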


回答 10

通过在Python 3.6中引入f字符串,这也将起作用:

f'{10}' == '10'

实际上str(),它比调用速度更快,但会降低可读性。

实际上,它比%x字符串格式和.format()!快。

With the introduction of f-strings in Python 3.6, this will also work:

f'{10}' == '10'

It is actually faster than calling str(), at the cost of readability.

In fact, it’s faster than %x string formatting and .format()!
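
The speed claim is easy to check on your own machine. The sketch below is only a rough measurement with timeit; the repeat count is arbitrary, the results depend on the Python version and hardware, and it uses %s rather than %x because %x would format the number as hexadecimal.

```python
import timeit

value = 10
repeats = 200_000  # arbitrary; raise it for more stable numbers

timings = {
    "str(value)": timeit.timeit(lambda: str(value), number=repeats),
    "f'{value}'": timeit.timeit(lambda: f"{value}", number=repeats),
    "'%s' % value": timeit.timeit(lambda: "%s" % value, number=repeats),
    "'{}'.format(value)": timeit.timeit(lambda: "{}".format(value), number=repeats),
}

# Print the fastest variant first.
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{label:20} {seconds:.4f} s")
```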


Reversing a string in Python

Question: Reversing a string in Python

reversePython str对象没有内置函数。实施此方法的最佳方法是什么?

如果提供非常简洁的答案,请详细说明其效率。例如,是否将str对象转换为其他对象等。

There is no built in reverse function for Python’s str object. What is the best way of implementing this method?

If supplying a very concise answer, please elaborate on its efficiency. For example, whether the str object is converted to a different object, etc.


回答 0

怎么样:

>>> 'hello world'[::-1]
'dlrow olleh'

这是扩展片语法。它的工作方式是[begin:end:step]-离开begin和end并指定步骤-1,它反转字符串。

How about:

>>> 'hello world'[::-1]
'dlrow olleh'

This is extended slice syntax. It works by doing [begin:end:step] – by leaving begin and end off and specifying a step of -1, it reverses a string.
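
A few more slices on the same string, as a quick sketch of how begin, end and step interact (the variable names are just for this example):

```python
s = "hello world"

first_word = s[0:5]      # begin=0, end=5, step defaults to 1
every_second = s[::2]    # whole string, taking every second character
backwards = s[::-1]      # whole string, walking from the end to the start
middle_back = s[8:2:-1]  # from index 8 down to, but not including, index 2

print(first_word)    # hello
print(every_second)  # hlowrd
print(backwards)     # dlrow olleh
print(middle_back)   # row ol
```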


回答 1

@Paolo s[::-1]是最快的;较慢的方法(可能更具可读性,但这值得商))是''.join(reversed(s))

@Paolo’s s[::-1] is fastest; a slower approach (maybe more readable, but that’s debatable) is ''.join(reversed(s)).


回答 2

为字符串实现反向函数的最佳方法是什么?

我对这个问题的经验是学术上的。但是,如果您是专业人士在寻找快速答案,请使用按-1以下步骤操作的切片:

>>> 'a string'[::-1]
'gnirts a'

或更可读(但由于方法名称查找和在给定迭代器时join形成列表的事实而变慢)str.join

>>> ''.join(reversed('a string'))
'gnirts a'

或为了可读性和可重用性,将切片放入函数中

def reversed_string(a_string):
    return a_string[::-1]

然后:

>>> reversed_string('a_string')
'gnirts_a'

更长的解释

如果您对学术博览会感兴趣,请继续阅读。

Python的str对象中没有内置的反向函数。

您应该了解以下有关Python字符串的几件事:

  1. 在Python中,字符串是不可变的。更改字符串不会修改该字符串。它创建了一个新的。

  2. 字符串是可切片的。分割字符串会以给定的增量从字符串的一个点向后或向前,再到另一点,为您提供一个新的字符串。它们在下标中采用切片符号或切片对象:

    string[subscript]

下标通过在括号内包含冒号来创建切片:

    string[start:stop:step]

要在大括号之外创建切片,您需要创建一个slice对象:

    slice_obj = slice(start, stop, step)
    string[slice_obj]

可读的方法:

虽然''.join(reversed('foo'))可读,但需要str.join在另一个调用的函数上调用字符串方法,这可能会比较慢。让我们将其放在函数中-我们将回到它:

def reverse_string_readable_answer(string):
    return ''.join(reversed(string))

最高效的方法:

使用反向切片快得多:

'foo'[::-1]

但是,对于不熟悉切片或原始作者意图的人,我们如何使它更具可读性和可理解性?让我们在下标符号之外创建一个slice对象,为其指定一个描述性名称,然后将其传递给下标符号。

start = stop = None
step = -1
reverse_slice = slice(start, stop, step)
'foo'[reverse_slice]

实现为功能

为了实际实现此功能,我认为在语义上足够清晰,只需使用一个描述性名称即可:

def reversed_string(a_string):
    return a_string[::-1]

用法很简单:

reversed_string('foo')

您的老师可能想要什么:

如果您有一位讲师,他们可能希望您从一个空字符串开始,然后从旧字符串开始构建一个新字符串。您可以使用while循环使用纯语法和文字进行此操作:

def reverse_a_string_slowly(a_string):
    new_string = ''
    index = len(a_string)
    while index:
        index -= 1                    # index = index - 1
        new_string += a_string[index] # new_string = new_string + character
    return new_string

从理论上讲这是不好的,因为请记住,字符串是不可变的 -因此,每次看起来像在您的字符上附加一个字符时new_string,理论上每次都会创建一个新的字符串!但是,CPython知道如何在某些情况下对此进行优化,其中这种微不足道的情况就是其中之一。

最佳实践

从理论上讲,更好的方法是将您的子字符串收集到列表中,然后再加入它们:

def reverse_a_string_more_slowly(a_string):
    new_strings = []
    index = len(a_string)
    while index:
        index -= 1                       
        new_strings.append(a_string[index])
    return ''.join(new_strings)

但是,正如我们在下面的CPython时序中所看到的,实际上这需要花费更长的时间,因为CPython可以优化字符串连接。

时机

计时如下:

>>> a_string = 'amanaplanacanalpanama' * 10
>>> min(timeit.repeat(lambda: reverse_string_readable_answer(a_string)))
10.38789987564087
>>> min(timeit.repeat(lambda: reversed_string(a_string)))
0.6622700691223145
>>> min(timeit.repeat(lambda: reverse_a_string_slowly(a_string)))
25.756799936294556
>>> min(timeit.repeat(lambda: reverse_a_string_more_slowly(a_string)))
38.73570013046265

CPython优化了字符串连接,而其他实现可能没有

…不依赖于CPython对a + = b或a = a + b形式的语句的就地字符串连接的有效实现。即使在CPython中,这种优化也是脆弱的(仅适用于某些类型),并且在不使用引用计数的实现中根本没有这种优化。在库的性能敏感部分中,应使用”.join()形式。这将确保在各种实现方式中串联发生在线性时间内。

What is the best way of implementing a reverse function for strings?

My own experience with this question is academic. However, if you’re a pro looking for the quick answer, use a slice that steps by -1:

>>> 'a string'[::-1]
'gnirts a'

or more readably (but slower due to the method name lookups and the fact that join forms a list when given an iterator), str.join:

>>> ''.join(reversed('a string'))
'gnirts a'

or for readability and reusability, put the slice in a function

def reversed_string(a_string):
    return a_string[::-1]

and then:

>>> reversed_string('a_string')
'gnirts_a'

Longer explanation

If you’re interested in the academic exposition, please keep reading.

There is no built-in reverse function in Python’s str object.

Here are a couple of things about Python’s strings you should know:

  1. In Python, strings are immutable. Changing a string does not modify the string. It creates a new one.

  2. Strings are sliceable. Slicing a string gives you a new string from one point in the string, backwards or forwards, to another point, by given increments. They take slice notation or a slice object in a subscript:

    string[subscript]
    

The subscript creates a slice by including a colon within the square brackets:

    string[start:stop:step]

To create a slice outside of the square brackets, you’ll need to create a slice object:

    slice_obj = slice(start, stop, step)
    string[slice_obj]

A readable approach:

While ''.join(reversed('foo')) is readable, it requires calling a string method, str.join, on the result of another function call, which can be relatively slow. Let’s put this in a function – we’ll come back to it:

def reverse_string_readable_answer(string):
    return ''.join(reversed(string))

Most performant approach:

Much faster is using a reverse slice:

'foo'[::-1]

But how can we make this more readable and understandable to someone less familiar with slices or the intent of the original author? Let’s create a slice object outside of the subscript notation, give it a descriptive name, and pass it to the subscript notation.

start = stop = None
step = -1
reverse_slice = slice(start, stop, step)
'foo'[reverse_slice]

Implement as Function

To actually implement this as a function, I think it is semantically clear enough to simply use a descriptive name:

def reversed_string(a_string):
    return a_string[::-1]

And usage is simply:

reversed_string('foo')

What your teacher probably wants:

If you have an instructor, they probably want you to start with an empty string, and build up a new string from the old one. You can do this with pure syntax and literals using a while loop:

def reverse_a_string_slowly(a_string):
    new_string = ''
    index = len(a_string)
    while index:
        index -= 1                    # index = index - 1
        new_string += a_string[index] # new_string = new_string + character
    return new_string

This is theoretically bad because, remember, strings are immutable – so every time it looks like you’re appending a character onto your new_string, it’s theoretically creating a new string! However, CPython knows how to optimize this in certain cases, of which this trivial case is one.

Best Practice

Theoretically better is to collect your substrings in a list, and join them later:

def reverse_a_string_more_slowly(a_string):
    new_strings = []
    index = len(a_string)
    while index:
        index -= 1                       
        new_strings.append(a_string[index])
    return ''.join(new_strings)

However, as we will see in the timings below for CPython, this actually takes longer, because CPython can optimize the string concatenation.

Timings

Here are the timings:

>>> a_string = 'amanaplanacanalpanama' * 10
>>> min(timeit.repeat(lambda: reverse_string_readable_answer(a_string)))
10.38789987564087
>>> min(timeit.repeat(lambda: reversed_string(a_string)))
0.6622700691223145
>>> min(timeit.repeat(lambda: reverse_a_string_slowly(a_string)))
25.756799936294556
>>> min(timeit.repeat(lambda: reverse_a_string_more_slowly(a_string)))
38.73570013046265

CPython optimizes string concatenation, whereas other implementations may not:

… do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn’t present at all in implementations that don’t use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.


回答 3

快速解答(TL; DR)

### example01 -------------------
mystring  =   'coup_ate_grouping'
backwards =   mystring[::-1]
print(backwards)

### ... or even ...
mystring  =   'coup_ate_grouping'[::-1]
print(mystring)

### result01 -------------------
'''
gnipuorg_eta_puoc
'''

详细答案

背景

提供此答案是为了解决@odigity的以下问题:

哇。起初,我对Paolo提出的解决方案感到震惊,但这使我在读了第一条评论时感到的恐惧退缩了:“那太好了。做得好!” 我感到非常不安,以至于这样一个聪明的社区认为将如此神秘的方法用于如此基本的东西是一个好主意。为什么不只是s.reverse()?

问题

  • 语境
    • Python 2.x
    • Python 3.x
  • 场景:
    • 开发人员想要转换字符串
    • 转换是颠倒所有字符的顺序

陷阱

  • 开发人员可能期望像 string.reverse()
  • 较新的开发人员可能无法阅读本机惯用的(又称“ pythonic ”)解决方案
  • 开发人员可能会尝试实施自己的版本,string.reverse()以避免切片符号。
  • 在某些情况下,切片符号的输出可能是违反直觉的:
    • 参见例如example02
      • print 'coup_ate_grouping'[-4:] ## => 'ping'
      • 相比
      • print 'coup_ate_grouping'[-4:-1] ## => 'pin'
      • 相比
      • print 'coup_ate_grouping'[-1] ## => 'g'
    • 建立索引的不同结果[-1]可能会使一些开发人员失望

基本原理

Python有一种特殊的情况要注意:字符串是可迭代的类型。

排除string.reverse()方法的一个基本原理是给予python开发人员动力以利用这种特殊情况的力量。

简而言之,这简单地意味着字符串中的每个单独字符都可以像其他编程语言中的数组一样容易地作为元素顺序排列的一部分进行操作。

要了解其工作原理,请查看example02可以提供很好的概述。

示例02

### example02 -------------------
## start (with positive integers)
print 'coup_ate_grouping'[0]  ## => 'c'
print 'coup_ate_grouping'[1]  ## => 'o' 
print 'coup_ate_grouping'[2]  ## => 'u' 

## start (with negative integers)
print 'coup_ate_grouping'[-1]  ## => 'g'
print 'coup_ate_grouping'[-2]  ## => 'n' 
print 'coup_ate_grouping'[-3]  ## => 'i' 

## start:end 
print 'coup_ate_grouping'[0:4]    ## => 'coup'    
print 'coup_ate_grouping'[4:8]    ## => '_ate'    
print 'coup_ate_grouping'[8:12]   ## => '_gro'    

## start:end 
print 'coup_ate_grouping'[-4:]    ## => 'ping' (counter-intuitive)
print 'coup_ate_grouping'[-4:-1]  ## => 'pin'
print 'coup_ate_grouping'[-4:-2]  ## => 'pi'
print 'coup_ate_grouping'[-4:-3]  ## => 'p'
print 'coup_ate_grouping'[-4:-4]  ## => ''
print 'coup_ate_grouping'[0:-1]   ## => 'coup_ate_groupin'
print 'coup_ate_grouping'[0:]     ## => 'coup_ate_grouping' (counter-intuitive)

## start:end:step (or start:end:stride)
print 'coup_ate_grouping'[-1::1]  ## => 'g'   
print 'coup_ate_grouping'[-1::-1] ## => 'gnipuorg_eta_puoc'

## combinations
print 'coup_ate_grouping'[-1::-1][-4:] ## => 'puoc'

结论

对于某些不希望在学习语言上花费很多时间的采用者和开发人员来说,与理解切片符号在python中的工作方式相关的认知负担确实可能过大。

但是,一旦理解了基本原理,此方法相对于固定字符串操作方法的功能可能会非常有利。

对于那些有其他想法的人,还有其他方法,例如lambda函数,迭代器或简单的一次性函数声明。

如果需要,开发人员可以实现自己的string.reverse()方法,但是最好理解python这方面的原理。

也可以看看

Quick Answer (TL;DR)

Example

### example01 -------------------
mystring  =   'coup_ate_grouping'
backwards =   mystring[::-1]
print(backwards)

### ... or even ...
mystring  =   'coup_ate_grouping'[::-1]
print(mystring)

### result01 -------------------
'''
gnipuorg_eta_puoc
'''

Detailed Answer

Background

This answer is provided to address the following concern from @odigity:

Wow. I was horrified at first by the solution Paolo proposed, but that took a back seat to the horror I felt upon reading the first comment: “That’s very pythonic. Good job!” I’m so disturbed that such a bright community thinks using such cryptic methods for something so basic is a good idea. Why isn’t it just s.reverse()?

Problem

  • Context
    • Python 2.x
    • Python 3.x
  • Scenario:
    • Developer wants to transform a string
    • Transformation is to reverse order of all the characters

Solution

Pitfalls

  • Developer might expect something like string.reverse()
  • The native idiomatic (aka “pythonic“) solution may not be readable to newer developers
  • Developer may be tempted to implement his or her own version of string.reverse() to avoid slice notation.
  • The output of slice notation may be counter-intuitive in some cases:
    • see e.g., example02
      • print('coup_ate_grouping'[-4:]) ## => 'ping'
      • compared to
      • print('coup_ate_grouping'[-4:-1]) ## => 'pin'
      • compared to
      • print('coup_ate_grouping'[-1]) ## => 'g'
    • the different outcomes of indexing on [-1] may throw some developers off

Rationale

Python has a special circumstance to be aware of: a string is an iterable type.

One rationale for excluding a string.reverse() method is to give python developers incentive to leverage the power of this special circumstance.

In simplified terms, this simply means each individual character in a string can be easily operated on as a part of a sequential arrangement of elements, just like arrays in other programming languages.

To understand how this works, reviewing example02 can provide a good overview.

Example02

### example02 -------------------
## start (with positive integers)
print('coup_ate_grouping'[0])  ## => 'c'
print('coup_ate_grouping'[1])  ## => 'o'
print('coup_ate_grouping'[2])  ## => 'u'

## start (with negative integers)
print('coup_ate_grouping'[-1])  ## => 'g'
print('coup_ate_grouping'[-2])  ## => 'n'
print('coup_ate_grouping'[-3])  ## => 'i'

## start:end
print('coup_ate_grouping'[0:4])    ## => 'coup'
print('coup_ate_grouping'[4:8])    ## => '_ate'
print('coup_ate_grouping'[8:12])   ## => '_gro'

## start:end
print('coup_ate_grouping'[-4:])    ## => 'ping' (counter-intuitive)
print('coup_ate_grouping'[-4:-1])  ## => 'pin'
print('coup_ate_grouping'[-4:-2])  ## => 'pi'
print('coup_ate_grouping'[-4:-3])  ## => 'p'
print('coup_ate_grouping'[-4:-4])  ## => ''
print('coup_ate_grouping'[0:-1])   ## => 'coup_ate_groupin'
print('coup_ate_grouping'[0:])     ## => 'coup_ate_grouping' (counter-intuitive)

## start:end:step (or start:end:stride)
print('coup_ate_grouping'[-1::1])  ## => 'g'
print('coup_ate_grouping'[-1::-1]) ## => 'gnipuorg_eta_puoc'

## combinations
print('coup_ate_grouping'[-1::-1][-4:]) ## => 'puoc'

Conclusion

The cognitive load associated with understanding how slice notation works in python may indeed be too much for some adopters and developers who do not wish to invest much time in learning the language.

Nevertheless, once the basic principles are understood, the power of this approach over fixed string manipulation methods can be quite favorable.

For those who think otherwise, there are alternate approaches, such as lambda functions, iterators, or simple one-off function declarations.

If desired, a developer can implement her own string.reverse() method, however it is good to understand the rationale behind this aspect of python.
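
For illustration only, one way such a method could be added is a small str subclass. ReversibleStr is a hypothetical name, not something from the standard library, and the method still relies on the slice internally.

```python
class ReversibleStr(str):
    """Hypothetical str subclass adding a reverse() method."""

    def reverse(self):
        # Strings are immutable, so return a new object of the same class.
        return type(self)(self[::-1])


s = ReversibleStr("coup_ate_grouping")
print(s.reverse())                 # gnipuorg_eta_puoc
print(s.reverse().reverse() == s)  # True
```

The drawback of this design is that every plain string has to be wrapped before reverse() is available, which is one reason a standalone function is usually simpler.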



回答 4

仅当忽略Unicode修饰符/字形群集时,现有答案才是正确的。我将在稍后处理,但首先请看一些反转算法的速度:


list_comprehension  : min:   0.6μs, mean:   0.6μs, max:    2.2μs
reverse_func        : min:   1.9μs, mean:   2.0μs, max:    7.9μs
reverse_reduce      : min:   5.7μs, mean:   5.9μs, max:   10.2μs
reverse_loop        : min:   3.0μs, mean:   3.1μs, max:    6.8μs


list_comprehension  : min:   4.2μs, mean:   4.5μs, max:   31.7μs
reverse_func        : min:  75.4μs, mean:  76.6μs, max:  109.5μs
reverse_reduce      : min: 749.2μs, mean: 882.4μs, max: 2310.4μs
reverse_loop        : min: 469.7μs, mean: 577.2μs, max: 1227.6μs

您可以看到,列表推导(reversed = string[::-1])的时间在所有情况下都是最低的(即使在修正我的错字之后)。

字符串反转

如果您真的想按常识反转字符串,则方法会更加复杂。例如,采用以下字符串(棕色手指指向左黄色手指指向上)。那是两个字素,但有3个unicode码点。另一个是皮肤修饰剂

example = "👈🏾👆"

但是,如果使用任何给定的方法将其反转,则会使棕色手指指向上方黄色手指指向左侧。这样做的原因是“棕色”颜色修改器仍在中间,并应用于之前的任何内容。所以我们有

  • U:手指向上
  • M:棕色修饰剂
  • L:手指指向左

original: LMU
reversed: UML (above solutions)
reversed: ULM (correct reversal)

Unicode音素簇比修饰符代码点要复杂一些。幸运的是,用于处理库字形

>>> import grapheme
>>> g = grapheme.graphemes("👈🏾👆")
>>> list(g)
['👈🏾', '👆']

因此正确的答案是

def reverse_graphemes(string):
    g = list(grapheme.graphemes(string))
    return ''.join(g[::-1])

到目前为止也是最慢的:

list_comprehension  : min:    0.5μs, mean:    0.5μs, max:    2.1μs
reverse_func        : min:   68.9μs, mean:   70.3μs, max:  111.4μs
reverse_reduce      : min:  742.7μs, mean:  810.1μs, max: 1821.9μs
reverse_loop        : min:  513.7μs, mean:  552.6μs, max: 1125.8μs
reverse_graphemes   : min: 3882.4μs, mean: 4130.9μs, max: 6416.2μs

编码

#!/usr/bin/env python

import numpy as np
import random
import timeit
from functools import reduce
random.seed(0)


def main():
    longstring = ''.join(random.choices("ABCDEFGHIJKLM", k=2000))
    functions = [(list_comprehension, 'list_comprehension', longstring),
                 (reverse_func, 'reverse_func', longstring),
                 (reverse_reduce, 'reverse_reduce', longstring),
                 (reverse_loop, 'reverse_loop', longstring)
                 ]
    duration_list = {}
    for func, name, params in functions:
        durations = timeit.repeat(lambda: func(params), repeat=100, number=3)
        duration_list[name] = list(np.array(durations) * 1000)
        print('{func:<20}: '
              'min: {min:5.1f}μs, mean: {mean:5.1f}μs, max: {max:6.1f}μs'
              .format(func=name,
                      min=min(durations) * 10**6,
                      mean=np.mean(durations) * 10**6,
                      max=max(durations) * 10**6,
                      ))
        create_boxplot('Reversing a string of length {}'.format(len(longstring)),
                       duration_list)


def list_comprehension(string):
    return string[::-1]


def reverse_func(string):
    return ''.join(reversed(string))


def reverse_reduce(string):
    return reduce(lambda x, y: y + x, string)


def reverse_loop(string):
    reversed_str = ""
    for i in string:
        reversed_str = i + reversed_str
    return reversed_str


def create_boxplot(title, duration_list, showfliers=False):
    import seaborn as sns
    import matplotlib.pyplot as plt
    import operator
    plt.figure(num=None, figsize=(8, 4), dpi=300,
               facecolor='w', edgecolor='k')
    sns.set(style="whitegrid")
    sorted_keys, sorted_vals = zip(*sorted(duration_list.items(),
                                           key=operator.itemgetter(1)))
    flierprops = dict(markerfacecolor='0.75', markersize=1,
                      linestyle='none')
    ax = sns.boxplot(data=sorted_vals, width=.3, orient='h',
                     flierprops=flierprops,
                     showfliers=showfliers)
    ax.set(xlabel="Time in ms", ylabel="")
    plt.yticks(plt.yticks()[0], sorted_keys)
    ax.set_title(title)
    plt.tight_layout()
    plt.savefig("output-string.png")


if __name__ == '__main__':
    main()

The existing answers are only correct if Unicode Modifiers / grapheme clusters are ignored. I’ll deal with that later, but first have a look at the speed of some reversal algorithms:

list_comprehension  : min:   0.6μs, mean:   0.6μs, max:    2.2μs
reverse_func        : min:   1.9μs, mean:   2.0μs, max:    7.9μs
reverse_reduce      : min:   5.7μs, mean:   5.9μs, max:   10.2μs
reverse_loop        : min:   3.0μs, mean:   3.1μs, max:    6.8μs

list_comprehension  : min:   4.2μs, mean:   4.5μs, max:   31.7μs
reverse_func        : min:  75.4μs, mean:  76.6μs, max:  109.5μs
reverse_reduce      : min: 749.2μs, mean: 882.4μs, max: 2310.4μs
reverse_loop        : min: 469.7μs, mean: 577.2μs, max: 1227.6μs

You can see that the time for the slice (reversed = string[::-1], labelled list_comprehension in the tables even though it is a slice rather than a list comprehension) is by far the lowest in all cases (even after fixing my typo).

String Reversal

If you really want to reverse a string in the common sense, it is WAY more complicated. For example, take the following string (brown finger pointing left, yellow finger pointing up). Those are two graphemes, but 3 unicode code points. The additional one is a skin modifier.

example = "👈🏾👆"

But if you reverse it with any of the given methods, you get brown finger pointing up, yellow finger pointing left. The reason for this is that the “brown” color modifier is still in the middle and gets applied to whatever is before it. So we have

  • U: finger pointing up
  • M: brown modifier
  • L: finger pointing left

and

original: LMU
reversed: UML (above solutions)
reversed: ULM (correct reversal)

Unicode Grapheme Clusters are a bit more complicated than just modifier code points. Luckily, there is a library for handling graphemes:

>>> import grapheme
>>> g = grapheme.graphemes("👈🏾👆")
>>> list(g)
['👈🏾', '👆']

and hence the correct answer would be

def reverse_graphemes(string):
    g = list(grapheme.graphemes(string))
    return ''.join(g[::-1])

which also is by far the slowest:

list_comprehension  : min:    0.5μs, mean:    0.5μs, max:    2.1μs
reverse_func        : min:   68.9μs, mean:   70.3μs, max:  111.4μs
reverse_reduce      : min:  742.7μs, mean:  810.1μs, max: 1821.9μs
reverse_loop        : min:  513.7μs, mean:  552.6μs, max: 1125.8μs
reverse_graphemes   : min: 3882.4μs, mean: 4130.9μs, max: 6416.2μs
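
If installing the third-party grapheme package is not an option, the idea can be approximated with the standard library. The sketch below is a deliberate simplification and not full Unicode grapheme segmentation: it only keeps emoji skin-tone modifiers (U+1F3FB to U+1F3FF) and combining marks attached to the code point before them. The function names are made up for this example.

```python
import unicodedata

# Fitzpatrick skin-tone modifiers occupy U+1F3FB..U+1F3FF.
SKIN_TONES = range(0x1F3FB, 0x1F400)


def is_attached(ch):
    """True if ch modifies the preceding code point (simplified rule)."""
    return ord(ch) in SKIN_TONES or unicodedata.combining(ch) != 0


def reverse_clusters(text):
    clusters = []
    for ch in text:
        if clusters and is_attached(ch):
            clusters[-1] += ch  # glue the modifier onto its base character
        else:
            clusters.append(ch)
    return ''.join(reversed(clusters))


left, brown, up = "\U0001F448", "\U0001F3FE", "\U0001F446"
example = left + brown + up            # LMU
assert example[::-1] == up + brown + left            # UML: modifier moves to the wrong finger
assert reverse_clusters(example) == up + left + brown  # ULM: modifier stays with "left"
```

Sequences joined with zero-width joiners, flags and other multi-code-point clusters are not handled by this sketch; for those a real grapheme library is still needed.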

The Code

#!/usr/bin/env python

import numpy as np
import random
import timeit
from functools import reduce
random.seed(0)


def main():
    longstring = ''.join(random.choices("ABCDEFGHIJKLM", k=2000))
    functions = [(list_comprehension, 'list_comprehension', longstring),
                 (reverse_func, 'reverse_func', longstring),
                 (reverse_reduce, 'reverse_reduce', longstring),
                 (reverse_loop, 'reverse_loop', longstring)
                 ]
    duration_list = {}
    for func, name, params in functions:
        durations = timeit.repeat(lambda: func(params), repeat=100, number=3)
        duration_list[name] = list(np.array(durations) * 1000)
        print('{func:<20}: '
              'min: {min:5.1f}μs, mean: {mean:5.1f}μs, max: {max:6.1f}μs'
              .format(func=name,
                      min=min(durations) * 10**6,
                      mean=np.mean(durations) * 10**6,
                      max=max(durations) * 10**6,
                      ))
        create_boxplot('Reversing a string of length {}'.format(len(longstring)),
                       duration_list)


def list_comprehension(string):
    return string[::-1]


def reverse_func(string):
    return ''.join(reversed(string))


def reverse_reduce(string):
    return reduce(lambda x, y: y + x, string)


def reverse_loop(string):
    reversed_str = ""
    for i in string:
        reversed_str = i + reversed_str
    return reversed_str


def create_boxplot(title, duration_list, showfliers=False):
    import seaborn as sns
    import matplotlib.pyplot as plt
    import operator
    plt.figure(num=None, figsize=(8, 4), dpi=300,
               facecolor='w', edgecolor='k')
    sns.set(style="whitegrid")
    sorted_keys, sorted_vals = zip(*sorted(duration_list.items(),
                                           key=operator.itemgetter(1)))
    flierprops = dict(markerfacecolor='0.75', markersize=1,
                      linestyle='none')
    ax = sns.boxplot(data=sorted_vals, width=.3, orient='h',
                     flierprops=flierprops,
                     showfliers=showfliers)
    ax.set(xlabel="Time in ms", ylabel="")
    plt.yticks(plt.yticks()[0], sorted_keys)
    ax.set_title(title)
    plt.tight_layout()
    plt.savefig("output-string.png")


if __name__ == '__main__':
    main()

回答 5

1.使用切片符号

def rev_string(s): 
    return s[::-1]

2.使用reversed()函数

def rev_string(s): 
    return ''.join(reversed(s))

3.使用递归

def rev_string(s): 
    if len(s) <= 1:  # also covers the empty string, which would otherwise raise IndexError
        return s

    return s[-1] + rev_string(s[:-1])

1. using slice notation

def rev_string(s): 
    return s[::-1]

2. using reversed() function

def rev_string(s): 
    return ''.join(reversed(s))

3. using recursion

def rev_string(s): 
    if len(s) <= 1:  # also covers the empty string, which would otherwise raise IndexError
        return s

    return s[-1] + rev_string(s[:-1])

回答 6

观察它的一种比较简单的方法是:

string = 'happy'
print(string)

‘happy’

string_reversed = string[-1::-1]
print(string_reversed)

‘yppah’

用英语[-1 ::-1]读为:

“从-1开始,一直走,采取-1的步骤”

A less perplexing way to look at it would be:

string = 'happy'
print(string)

‘happy’

string_reversed = string[-1::-1]
print(string_reversed)

‘yppah’

In English [-1::-1] reads as:

“Starting at -1, go all the way, taking steps of -1”
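
To make that reading concrete, here is a small sketch that uses the built-in slice.indices() method to show which indices the slice actually visits. The variable names are chosen for illustration.

```python
s = 'happy'

# Omitting the start with a negative step also begins at the last character,
# so both spellings give the same result.
assert s[-1::-1] == s[::-1]

# slice.indices() resolves the slice to concrete (start, stop, step) values
# for a sequence of the given length.
start, stop, step = slice(-1, None, -1).indices(len(s))
print(start, stop, step)  # 4 -1 -1

# Walking those indices by hand reproduces the reversed string.
by_hand = ''.join(s[i] for i in range(start, stop, step))
print(by_hand)  # yppah
```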


回答 7

不使用reversed()或[::-1]反转python中的字符串

def reverse(test):
    n = len(test)
    x=""
    for i in range(n-1,-1,-1):
        x += test[i]
    return x

Reverse a string in python without using reversed() or [::-1]

def reverse(test):
    n = len(test)
    x=""
    for i in range(n-1,-1,-1):
        x += test[i]
    return x

回答 8

这也是一种有趣的方式:

def reverse_words_1(s):
    rev = ''
    for i in range(len(s)):
        j = ~i  # equivalent to j = -(i + 1)
        rev += s[j]
    return rev

或类似:

def reverse_words_2(s):
    rev = ''
    for i in reversed(range(len(s))):
        rev += s[i]
    return rev

使用支持.reverse()的BYTERArray的另一种“异国情调”方式

b = bytearray('Reverse this!', 'UTF-8')
b.reverse()
b.decode('UTF-8')

将生成:

'!siht esreveR'

This is also an interesting way:

def reverse_words_1(s):
    rev = ''
    for i in range(len(s)):
        j = ~i  # equivalent to j = -(i + 1)
        rev += s[j]
    return rev

or similar:

def reverse_words_2(s):
    rev = ''
    for i in reversed(range(len(s))):
        rev += s[i]
    return rev

Another, more ‘exotic’ way uses bytearray, which supports .reverse():

b = bytearray('Reverse this!', 'UTF-8')
b.reverse()
b.decode('UTF-8')

will produce:

'!siht esreveR'
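
One caveat with the bytearray approach is worth a quick sketch: it reverses bytes, not characters, so it is only safe for pure ASCII text. Any character that takes more than one byte in UTF-8 has its bytes reordered and can no longer be decoded. The helper name reverse_via_bytes is made up for this example.

```python
def reverse_via_bytes(text):
    """Reverse text by reversing its UTF-8 bytes (only safe for ASCII)."""
    b = bytearray(text, 'UTF-8')
    b.reverse()
    return b.decode('UTF-8')


print(reverse_via_bytes('Reverse this!'))  # !siht esreveR

try:
    reverse_via_bytes('caf\u00e9')  # 'é' is two bytes in UTF-8
except UnicodeDecodeError as exc:
    print('cannot decode reversed bytes:', exc.reason)
```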

回答 9

from functools import reduce  # reduce is no longer a builtin in Python 3

def reverse(input):
    # The initial value '' lets this work for the empty string as well.
    return reduce(lambda x, y: y + x, input, '')

回答 10

original = "string"

rev_index = original[::-1]
rev_func = list(reversed(list(original))) #nsfw

print(original)
print(rev_index)
print(''.join(rev_func))

回答 11

def reverse_string(string):
    length = len(string)
    temp = ''
    for i in range(length):
        temp += string[length - i - 1]
    return temp

print(reverse_string('foo')) #prints "oof"

这是通过遍历一个字符串并将其值反向分配给另一个字符串来实现的。

def reverse_string(string):
    length = len(string)
    temp = ''
    for i in range(length):
        temp += string[length - i - 1]
    return temp

print(reverse_string('foo')) #prints "oof"

This works by looping through a string and assigning its values in reverse order to another string.


回答 12

这是一个没有幻想的:

def reverse(text):
    r_text = ''
    index = len(text) - 1

    while index >= 0:
        r_text += text[index] #string canbe concatenated
        index -= 1

    return r_text

print(reverse("hello, world!"))

Here is a no-frills one:

def reverse(text):
    r_text = ''
    index = len(text) - 1

    while index >= 0:
        r_text += text[index]  # strings can be concatenated
        index -= 1

    return r_text

print(reverse("hello, world!"))

回答 13

这是一个没有[::-1]reversed(出于学习目的)的:

def reverse(text):
    new_string = []
    n = len(text)
    while (n > 0):
        new_string.append(text[n-1])
        n -= 1
    return ''.join(new_string)
print(reverse("abcd"))

您可以+=用来连接字符串,但join()速度更快。

Here is one without [::-1] or reversed (for learning purposes):

def reverse(text):
    new_string = []
    n = len(text)
    while (n > 0):
        new_string.append(text[n-1])
        n -= 1
    return ''.join(new_string)
print(reverse("abcd"))

You can use += to concatenate strings, but join() is generally faster.
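
The claim about += versus join() can be checked with timeit. The sketch below is only a rough measurement under assumed conditions (a 1000-character test string, 200 runs); CPython can often extend a string in place for +=, so the gap may be small or vary between versions. The function names are invented for the comparison.

```python
import timeit


def reverse_concat(text):
    # Build the result by repeated string concatenation.
    result = ''
    for i in range(len(text) - 1, -1, -1):
        result += text[i]
    return result


def reverse_join(text):
    # Collect characters in a list and join them once at the end.
    chars = []
    for i in range(len(text) - 1, -1, -1):
        chars.append(text[i])
    return ''.join(chars)


sample = 'abcd' * 250  # 1000 characters, an arbitrary test size
assert reverse_concat(sample) == reverse_join(sample) == sample[::-1]

for func in (reverse_concat, reverse_join):
    seconds = timeit.timeit(lambda: func(sample), number=200)
    print(f'{func.__name__}: {seconds:.4f}s for 200 runs')
```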


回答 14

递归方法:

def reverse(s): return s if len(s) <= 1 else s[-1] + reverse(s[:-1])

例:

print(reverse("Hello!"))    #!olleH

Recursive method:

def reverse(s): return s if len(s) <= 1 else s[-1] + reverse(s[:-1])

example:

print(reverse("Hello!"))    #!olleH

回答 15

以上所有解决方案都是完美的,但是如果我们尝试在python中使用for循环来反转字符串会变得有些棘手,所以这是我们如何使用for循环来反转字符串

string ="hello,world"
for i in range(-1,-len(string)-1,-1):
    print (string[i],end=(" ")) 

我希望这对某人有帮助。

All of the above solutions are fine, but reversing a string with an explicit for loop in Python can be a little tricky, so here is how to do it:

string ="hello,world"
for i in range(-1,-len(string)-1,-1):
    print (string[i],end=(" ")) 

I hope this one will be helpful for someone.


回答 16

这是我的风格:

def reverse_string(string):
    character_list = []
    for char in string:
        character_list.append(char)
    reversed_string = ""
    for char in reversed(character_list):
        reversed_string += char
    return reversed_string

That's my way:

def reverse_string(string):
    character_list = []
    for char in string:
        character_list.append(char)
    reversed_string = ""
    for char in reversed(character_list):
        reversed_string += char
    return reversed_string

回答 17

反向字符串有很多方法,但我也创建了另一种方法只是为了好玩。我认为这种方法还不错。

def reverse(_str):
    list_char = list(_str) # Create a hypothetical list. because string is immutable

    for i in range(len(list_char) // 2): # just t(n/2) to reverse a big string; // keeps the bound an int in Python 3
        list_char[i], list_char[-i - 1] = list_char[-i - 1], list_char[i]

    return ''.join(list_char)

print(reverse("Ehsan"))

There are a lot of ways to reverse a string but I also created another one just for fun. I think this approach is not that bad.

def reverse(_str):
    list_char = list(_str) # Create a hypothetical list. because string is immutable

    for i in range(len(list_char) // 2): # just t(n/2) to reverse a big string; // keeps the bound an int in Python 3
        list_char[i], list_char[-i - 1] = list_char[-i - 1], list_char[i]

    return ''.join(list_char)

print(reverse("Ehsan"))

回答 18

此类使用python魔术函数反转字符串:

class Reverse(object):
    """ Builds a reverse method using magic methods """

    def __init__(self, data):
        self.data = data
        self.index = len(data)


    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration

        self.index = self.index - 1
        return self.data[self.index]


REV_INSTANCE = Reverse('hello world')

iter(REV_INSTANCE)

rev_str = ''
for char in REV_INSTANCE:
    rev_str += char

print(rev_str)  

输出量

dlrow olleh

参考

This class uses python magic functions to reverse a string:

class Reverse(object):
    """ Builds a reverse method using magic methods """

    def __init__(self, data):
        self.data = data
        self.index = len(data)


    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration

        self.index = self.index - 1
        return self.data[self.index]


REV_INSTANCE = Reverse('hello world')

iter(REV_INSTANCE)

rev_str = ''
for char in REV_INSTANCE:
    rev_str += char

print(rev_str)  

Output

dlrow olleh

Reference


回答 19

使用python 3,您可以就地反转字符串,这意味着它不会被分配给另一个变量。首先,您必须将字符串转换为列表,然后利用该reverse()函数。

https://docs.python.org/3/tutorial/datastructures.html

def main():
    my_string = ["h", "e", "l", "l", "o"]
    print(reverseString(my_string))

def reverseString(s):
    print(s)
    s.reverse()  # reverses the list in place
    return s

if __name__ == "__main__":
    main()

Python strings are immutable, so a string itself cannot be reversed in place. You can, however, convert the string into a list of characters and reverse that list in place with the list's reverse() method, which does not create a new list.

https://docs.python.org/3/tutorial/datastructures.html

def main():
    my_string = ["h", "e", "l", "l", "o"]
    print(reverseString(my_string))

def reverseString(s):
    print(s)
    s.reverse()  # reverses the list in place
    return s

if __name__ == "__main__":
    main()

回答 20

这是简单而有意义的反向功能,易于理解和编码

def reverse_sentence(text):
    words = text.split(" ")
    reverse =""
    for word in reversed(words):
        reverse += word+ " "
    return reverse

This is a simple function that reverses the order of the words in a sentence; it is easy to understand and to code.

def reverse_sentence(text):
    words = text.split(" ")
    reverse =""
    for word in reversed(words):
        reverse += word+ " "
    return reverse

回答 21

这很简单:

print("loremipsum"[-1::-1])

从逻辑上讲:

def str_reverse_fun():
    empty_list = []
    new_str = 'loremipsum'
    index = len(new_str)
    while index:
        index = index - 1
        empty_list.append(new_str[index])
    return ''.join(empty_list)
print(str_reverse_fun())

输出:

muspimerol

Here it is, simply:

print("loremipsum"[-1::-1])

and more explicitly:

def str_reverse_fun():
    empty_list = []
    new_str = 'loremipsum'
    index = len(new_str)
    while index:
        index = index - 1
        empty_list.append(new_str[index])
    return ''.join(empty_list)
print(str_reverse_fun())

output:

muspimerol


回答 22

反转没有python魔术的字符串。

>>> def reversest(st):
    a=len(st)-1
    for i in st:
        print(st[a],end="")
        a=a-1

Reverse a string without python magic.

>>> def reversest(st):
    a=len(st)-1
    for i in st:
        print(st[a],end="")
        a=a-1

回答 23

当然,在Python中,您可以做非常漂亮的1行内容。:)
这是一个简单,全面的解决方案,可以在任何编程语言中使用。

def reverse_string(phrase):
    reversed = ""
    length = len(phrase)
    for i in range(length):
        reversed += phrase[length-1-i]
    return reversed

phrase = input("Provide a string: ")
print(reverse_string(phrase))

Sure, in Python you can do very fancy 1-line stuff. :)
Here’s a simple, all-round solution that could work in any programming language.

def reverse_string(phrase):
    result = ""  # avoid shadowing the built-in reversed()
    length = len(phrase)
    for i in range(length):
        result += phrase[length-1-i]
    return result

phrase = input("Provide a string: ")
print(reverse_string(phrase))

回答 24

s = 'hello'
ln = len(s)
i = 1
while True:
    rev = s[ln-i]
    print(rev, end=' ')
    i = i + 1
    if i == ln + 1:
        break

Output:

o l l e h

回答 25

您可以将反向功能与列表综合一起使用。但是我不明白为什么在python 3中取消了这种方法是不必要的。

string = [ char for char in reversed(string)]

You can use the reversed function with a list comprehension, although I don’t understand why this method was eliminated in Python 3; that seems unnecessary.

string = [ char for char in reversed(string)]

具有大写字母和数字的随机字符串生成

问题:具有大写字母和数字的随机字符串生成

我想生成一个大小为N的字符串。

它应该由数字和大写英文字母组成,例如:

  • 6U1S75
  • 4Z4UKK
  • U911K4

我如何以pythonic方式实现这一目标?

I want to generate a string of size N.

It should be made up of numbers and uppercase English letters such as:

  • 6U1S75
  • 4Z4UKK
  • U911K4

How can I achieve this in a pythonic way?


回答 0

一行回答:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

甚至更短,从Python 3.6开始,使用random.choices()

''.join(random.choices(string.ascii_uppercase + string.digits, k=N))

加密更安全的版本;参见https://stackoverflow.com/a/23728630/2213647

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

详细而言,具有清除函数以进一步重用:

>>> import string
>>> import random
>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
...    return ''.join(random.choice(chars) for _ in range(size))
...
>>> id_generator()
'G5G74W'
>>> id_generator(3, "6793YUIO")
'Y3U'

它是如何工作的 ?

我们导入string,一个包含常见ASCII字符序列的模块,以及random一个处理随机生成的模块。

string.ascii_uppercase + string.digits 只是串联表示大写ASCII字符和数字的字符列表:

>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.ascii_uppercase + string.digits
'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

然后,我们使用列表推导创建“ n”个元素的列表:

>>> list(range(4)) # range yields 'n' numbers; list() shows them (needed in Python 3)
[0, 1, 2, 3]
>>> ['elem' for _ in range(4)] # we use range to create 4 times 'elem'
['elem', 'elem', 'elem', 'elem']

在上面的例子中,我们使用[创建列表,但我们不这样做的id_generator功能,所以Python没有在内存中创建列表,但生成的飞行元素,一个接一个(更多相关信息点击这里)。

而不是要求创建字符串的n倍elem,我们将要求Python创建从字符序列中选取的随机字符的n倍:

>>> random.choice("abcde")
'a'
>>> random.choice("abcde")
'd'
>>> random.choice("abcde")
'b'

因此,random.choice(chars) for _ in range(size)实际上是在创建一个size字符序列。从chars以下位置随机选择的字符:

>>> [random.choice('abcde') for _ in range(3)]
['a', 'b', 'b']
>>> [random.choice('abcde') for _ in range(3)]
['e', 'b', 'e']
>>> [random.choice('abcde') for _ in range(3)]
['d', 'a', 'c']

然后,我们将它们与一个空字符串连接起来,以便序列成为一个字符串:

>>> ''.join(['a', 'b', 'b'])
'abb'
>>> [random.choice('abcde') for _ in range(3)]
['d', 'c', 'b']
>>> ''.join(random.choice('abcde') for _ in range(3))
'dac'

Answer in one line:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

or even shorter starting with Python 3.6 using random.choices():

''.join(random.choices(string.ascii_uppercase + string.digits, k=N))

A cryptographically more secure version; see https://stackoverflow.com/a/23728630/2213647:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

In details, with a clean function for further reuse:

>>> import string
>>> import random
>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
...    return ''.join(random.choice(chars) for _ in range(size))
...
>>> id_generator()
'G5G74W'
>>> id_generator(3, "6793YUIO")
'Y3U'

How does it work ?

We import string, a module that contains sequences of common ASCII characters, and random, a module that deals with random generation.

string.ascii_uppercase + string.digits just concatenates the list of characters representing uppercase ASCII chars and digits:

>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.ascii_uppercase + string.digits
'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

Then we use a list comprehension to create a list of ‘n’ elements:

>>> list(range(4)) # range yields 'n' numbers; list() shows them (needed in Python 3)
[0, 1, 2, 3]
>>> ['elem' for _ in range(4)] # we use range to create 4 times 'elem'
['elem', 'elem', 'elem', 'elem']

In the example above, we use [ to create the list, but we don’t in the id_generator function so Python doesn’t create the list in memory, but generates the elements on the fly, one by one (more about this here).

Instead of asking to create ‘n’ times the string elem, we will ask Python to create ‘n’ times a random character, picked from a sequence of characters:

>>> random.choice("abcde")
'a'
>>> random.choice("abcde")
'd'
>>> random.choice("abcde")
'b'

Therefore random.choice(chars) for _ in range(size) really creates a sequence of size characters, each picked at random from chars:

>>> [random.choice('abcde') for _ in range(3)]
['a', 'b', 'b']
>>> [random.choice('abcde') for _ in range(3)]
['e', 'b', 'e']
>>> [random.choice('abcde') for _ in range(3)]
['d', 'a', 'c']

Then we just join them with an empty string so the sequence becomes a string:

>>> ''.join(['a', 'b', 'b'])
'abb'
>>> [random.choice('abcde') for _ in range(3)]
['d', 'c', 'b']
>>> ''.join(random.choice('abcde') for _ in range(3))
'dac'

回答 1

该堆栈溢出问题是“随机字符串Python”在Google上当前排名最高的结果。当前的最佳答案是:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

这是一种极好的方法,但是随机PRNG并不是加密安全的。我假设许多研究此问题的人都希望生成用于加密或密码的随机字符串。您可以通过在上面的代码中进行一些小的更改来安全地执行此操作:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

使用random.SystemRandom()的,而不是在* nix机器只是随机使用/ dev / urandom的,并CryptGenRandom()在Windows中。这些是加密安全的PRNG。在需要安全PRNG的应用程序中使用random.choice代替random.SystemRandom().choice可能会造成灾难性的后果,并且鉴于这个问题的普遍性,我敢打赌,这个错误已经犯了很多遍了。

如果您使用的是python3.6或更高版本,则可以使用MSeifert的答案中提到的新的secrets模块:

''.join(secrets.choice(string.ascii_uppercase + string.digits) for _ in range(N))

该模块文档还讨论了生成安全令牌最佳实践的便捷方法。

This Stack Overflow question is the current top Google result for “random string Python”. The current top answer is:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

This is an excellent method, but the PRNG in random is not cryptographically secure. I assume many people researching this question will want to generate random strings for encryption or passwords. You can do this securely by making a small change in the above code:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

Using random.SystemRandom() instead of just random uses /dev/urandom on *nix machines and CryptGenRandom() in Windows. These are cryptographically secure PRNGs. Using random.choice instead of random.SystemRandom().choice in an application that requires a secure PRNG could be potentially devastating, and given the popularity of this question, I bet that mistake has been made many times already.

If you’re using python3.6 or above, you can use the new secrets module as mentioned in MSeifert’s answer:

''.join(secrets.choice(string.ascii_uppercase + string.digits) for _ in range(N))

The module docs also discuss convenient ways to generate secure tokens and best practices.
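
As a short sketch of what that looks like in practice (Python 3.6+): the helper name secure_id and the constant ALPHABET are made up for this example, while secrets.choice, secrets.token_hex and secrets.token_urlsafe are functions of the standard secrets module.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def secure_id(n=6):
    """Return n characters drawn from A-Z and 0-9 using the OS CSPRNG."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(n))


print(secure_id())                # a 6-character ID such as the ones in the question
print(secrets.token_hex(16))      # 32 hexadecimal characters
print(secrets.token_urlsafe(16))  # URL-safe text built from 16 random bytes
```

The token_* helpers are convenient when the exact alphabet does not matter; secure_id keeps the uppercase-and-digits format the question asks for.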


回答 2

只需使用Python的内置uuid:

如果您可以使用UUID,请使用内置的uuid软件包。

一线解决方案:

import uuid; uuid.uuid4().hex.upper()[0:6]

深度版本:

例:

import uuid
uuid.uuid4() #uuid4 => full random uuid
# Outputs something like: UUID('0172fc9a-1dac-4414-b88d-6b9a6feb91ea')

如果您确实需要格式(例如“ 6U1S75”),则可以这样做:

import uuid

def my_random_string(string_length=10):
    """Returns a random string of length string_length."""
    random = str(uuid.uuid4()) # Convert UUID format to a Python string.
    random = random.upper() # Make all characters uppercase.
    random = random.replace("-","") # Remove the UUID '-'.
    return random[0:string_length] # Return the random string.

print(my_random_string(6)) # For example, D9E50C

Simply use Python’s builtin uuid:

If UUIDs are okay for your purposes, use the built-in uuid package.

One Line Solution:

import uuid; uuid.uuid4().hex.upper()[0:6]

In Depth Version:

Example:

import uuid
uuid.uuid4() #uuid4 => full random uuid
# Outputs something like: UUID('0172fc9a-1dac-4414-b88d-6b9a6feb91ea')

If you need exactly your format (for example, “6U1S75”), you can do it like this:

import uuid

def my_random_string(string_length=10):
    """Returns a random string of length string_length."""
    random = str(uuid.uuid4()) # Convert UUID format to a Python string.
    random = random.upper() # Make all characters uppercase.
    random = random.replace("-","") # Remove the UUID '-'.
    return random[0:string_length] # Return the random string.

print(my_random_string(6)) # For example, D9E50C

回答 3

一种更简单,更快速但稍微少一点的随机方式是使用random.sample而不是分别选择每个字母,如果允许n次重复,则将您的随机基础扩大n倍,例如

import random
import string

char_set = string.ascii_uppercase + string.digits
print(''.join(random.sample(char_set*6, 6)))

注意:random.sample防止字符重用,乘以字符集的大小可以进行多次重复,但是与纯随机选择相比,它们的可能性仍然较小。如果我们选择长度为6的字符串,并选择“ X”作为第一个字符,则在选择示例中,第二个字符获得“ X”的几率与获得“ X”作为第二个字符的几率相同第一个字符。在random.sample实现中,将“ X”作为任何后续字符的几率仅为将其作为第一个字符的机会的6/7

A simpler, faster, but slightly less random way is to use random.sample instead of choosing each letter separately. If n repetitions are allowed, enlarge your random basis n times, e.g.

import random
import string

char_set = string.ascii_uppercase + string.digits
print(''.join(random.sample(char_set*6, 6)))

Note: random.sample prevents reuse of the same element, and multiplying the character set makes repetitions possible, but they are still less likely than they are in a purely random choice. If we go for a string of length 6 and pick ‘X’ as the first character, then in the choice example the odds of getting ‘X’ for the second character are the same as the odds of getting ‘X’ as the first character. In the random.sample implementation, the odds of getting ‘X’ as the second character are only about 5/6 of the chance of getting it as the first character (5 of the remaining 215 items, compared with 6 of 216).
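
The size of this bias can be worked out exactly rather than estimated. The sketch below assumes the same setup as the code above (the 36-character set repeated 6 times) and uses fractions.Fraction to avoid rounding; the variable names are chosen for illustration.

```python
from fractions import Fraction
import string

char_set = string.ascii_uppercase + string.digits  # 36 characters
copies = 6
pool = len(char_set) * copies                      # 216 items to sample from

# Chance that the first sampled item is 'X': 6 copies out of 216.
p_first = Fraction(copies, pool)

# Chance that the second item is 'X' given the first was 'X':
# one copy is gone, so 5 copies remain among 215 items.
p_second_given_first = Fraction(copies - 1, pool - 1)

ratio = p_second_given_first / p_first
print(p_first, p_second_given_first, ratio)  # 1/36 1/43 36/43
print(float(ratio))
```

With random.choice the ratio would be exactly 1, since every draw is independent.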


回答 4

import uuid
lowercase_str = uuid.uuid4().hex  

lowercase_str 是一个像 'cea8b32e00934aaea8c005a35d85a5c0'

uppercase_str = lowercase_str.upper()

uppercase_str'CEA8B32E00934AAEA8C005A35D85A5C0'

import uuid
lowercase_str = uuid.uuid4().hex  

lowercase_str is a random value like 'cea8b32e00934aaea8c005a35d85a5c0'

uppercase_str = lowercase_str.upper()

uppercase_str is 'CEA8B32E00934AAEA8C005A35D85A5C0'


回答 5

执行此操作的更快,更轻松,更灵活的方法是使用strgen模块(pip install StringGenerator)。

生成一个包含大写字母和数字的6个字符的随机字符串:

>>> from strgen import StringGenerator as SG
>>> SG("[\u\d]{6}").render()
u'YZI2CI'

获取唯一列表:

>>> SG("[\l\d]{10}").render_list(5,unique=True)
[u'xqqtmi1pOk', u'zmkWdUr63O', u'PGaGcPHrX2', u'6RZiUbkk2i', u'j9eIeeWgEF']

保证一个“特殊”字符字符串:

>>> SG("[\l\d]{10}&[\p]").render()
u'jaYI0bcPG*0'

随机的HTML颜色:

>>> SG("#[\h]{6}").render()
u'#CEdFCa'

等等

我们需要意识到:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

可能没有数字(或大写字符)。

strgen比上述任何一种解决方案的开发时间都更快。Ignacio提供的解决方案是运行速度最快的解决方案,并且是使用Python标准库的正确答案。但是您几乎不会以这种形式使用它。您将要使用SystemRandom(如果不可用,则使用备用版本),确保表示所需的字符集,使用(或不使用unicode),确保连续的调用产生唯一的字符串,使用字符串模块字符类之一的子集,等等。这比提供的答案需要更多的代码。概括解决方案的各种尝试都具有局限性,strgen使用简单的模板语言可以以更高的简洁性和更高的表达力来解决。

在PyPI上:

pip install StringGenerator

披露:我是strgen模块的作者。

A faster, easier and more flexible way to do this is to use the strgen module (pip install StringGenerator).

Generate a 6-character random string with upper case letters and digits:

>>> from strgen import StringGenerator as SG
>>> SG("[\u\d]{6}").render()
u'YZI2CI'

Get a unique list:

>>> SG("[\l\d]{10}").render_list(5,unique=True)
[u'xqqtmi1pOk', u'zmkWdUr63O', u'PGaGcPHrX2', u'6RZiUbkk2i', u'j9eIeeWgEF']

Guarantee one “special” character in the string:

>>> SG("[\l\d]{10}&[\p]").render()
u'jaYI0bcPG*0'

A random HTML color:

>>> SG("#[\h]{6}").render()
u'#CEdFCa'

etc.

We need to be aware that this:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

might not have a digit (or uppercase character) in it.
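
If a digit and an uppercase letter are both required, one standard-library approach is to pick one of each first, fill the rest at random, and shuffle. This is a sketch, not the strgen implementation; the function name is hypothetical, and forcing one character of each class makes the result slightly less uniform than a plain random.choice loop.

```python
import random
import string


def id_with_digit_and_letter(n=6):
    """Random A-Z/0-9 string containing at least one letter and one digit."""
    if n < 2:
        raise ValueError('need at least 2 characters')
    rng = random.SystemRandom()  # OS-backed randomness, as recommended above
    # Guarantee one character from each required class...
    chars = [rng.choice(string.ascii_uppercase), rng.choice(string.digits)]
    # ...then fill the remaining positions from the full alphabet.
    chars += [rng.choice(string.ascii_uppercase + string.digits)
              for _ in range(n - 2)]
    rng.shuffle(chars)  # so the guaranteed characters are not always first
    return ''.join(chars)


print(id_with_digit_and_letter())
```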

strgen is faster in developer-time than any of the above solutions. The solution from Ignacio is the fastest run-time performing and is the right answer using the Python Standard Library. But you will hardly ever use it in that form. You will want to use SystemRandom (or fallback if not available), make sure required character sets are represented, use unicode (or not), make sure successive invocations produce a unique string, use a subset of one of the string module character classes, etc. This all requires lots more code than in the answers provided. The various attempts to generalize a solution all have limitations that strgen solves with greater brevity and expressive power using a simple template language.

It’s on PyPI:

pip install StringGenerator

Disclosure: I’m the author of the strgen module.


回答 6

从Python 3.6开始,如果需要加密secrets模块,则应使用模块而不是模块(否则,此答案与@Ignacio Vazquez-Abrams的答案相同):random

from secrets import choice
import string

''.join([choice(string.ascii_uppercase + string.digits) for _ in range(N)])

还有一点需要注意:列表理解str.join比使用生成器表达式要快!

From Python 3.6 on you should use the secrets module if you need it to be cryptographically secure instead of the random module (otherwise this answer is identical to the one of @Ignacio Vazquez-Abrams):

from secrets import choice
import string

''.join([choice(string.ascii_uppercase + string.digits) for _ in range(N)])

One additional note: a list-comprehension is faster in the case of str.join than using a generator expression!
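
That note can be checked with a quick timeit comparison. The sketch below uses random.choice so the timing is not dominated by the OS random source; the string length (1000) and the run count (200) are arbitrary assumptions, and the measured difference depends on the Python version and machine.

```python
import random
import string
import timeit

ALPHABET = string.ascii_uppercase + string.digits
N = 1000  # arbitrary string length for the measurement


def with_generator():
    # str.join consumes a generator expression
    return ''.join(random.choice(ALPHABET) for _ in range(N))


def with_list():
    # str.join receives a fully built list
    return ''.join([random.choice(ALPHABET) for _ in range(N)])


for func in (with_generator, with_list):
    seconds = timeit.timeit(func, number=200)
    print(f'{func.__name__}: {seconds:.4f}s for 200 runs')
```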


回答 7

基于另一个Stack Overflow答案,创建随机字符串和随机十六进制数的最轻巧的方法是,比接受的答案更好的版本是:

('%06x' % random.randrange(16**6)).upper()

快多了。

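Based on another Stack Overflow answer (“Most lightweight way to create a random string and a random hexadecimal number”), a faster alternative to the accepted answer would be the following. Note that it draws only from the hexadecimal digits 0-9 and A-F, not from the full uppercase alphabet: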

('%06x' % random.randrange(16**6)).upper()

much faster.


回答 8

如果您需要一个随机字符串而不是随机字符串,则应使用它os.urandom作为源

from os import urandom
from itertools import islice, imap, repeat
import string

def rand_string(length=5):
    chars = set(string.ascii_uppercase + string.digits)
    char_gen = (c for c in imap(urandom, repeat(1)) if c in chars)
    return ''.join(islice(char_gen, None, length))

If you need a random string rather than a pseudo random one, you should use os.urandom as the source

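from os import urandom
from itertools import islice, repeat
import string

def rand_string(length=5):
    chars = set(string.ascii_uppercase + string.digits)
    # Python 3: urandom(1) returns a bytes object, so turn its single byte into a
    # character and keep it only if it is in the allowed set (rejection sampling).
    char_gen = (c for c in (chr(urandom(1)[0]) for _ in repeat(None)) if c in chars)
    return ''.join(islice(char_gen, length))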

回答 9

我以为还没有人回答这个大声笑!但是,嘿,这是我自己做的:

import random

def random_alphanumeric(limit):
    #ascii alphabet of all alphanumerals
    r = (range(48, 58) + range(65, 91) + range(97, 123))
    random.shuffle(r)
    return reduce(lambda i, s: i + chr(s), r[:random.randint(0, len(r))], "")

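I thought no one had answered this yet, lol! But hey, here’s my own go at it:

import random
from functools import reduce  # reduce is no longer a builtin in Python 3

def random_alphanumeric(limit):
    # ASCII codes of all alphanumeric characters: 0-9, A-Z, a-z
    r = list(range(48, 58)) + list(range(65, 91)) + list(range(97, 123))
    random.shuffle(r)
    # join the first `limit` shuffled codes; characters never repeat, so at most 62 come back
    return reduce(lambda i, s: i + chr(s), r[:limit], "")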

Answer 10

This method is slightly faster, and slightly more annoying, than the random.choice() method Ignacio posted.

It takes advantage of the nature of pseudo-random algorithms, and banks on bitwise AND and shift being faster than generating a new random number for each character.

import random

# must be length 32 -- 5 bits -- the question didn't specify using the full set
# of uppercase letters ;)
_ALPHABET = 'ABCDEFGHJKLMNPQRSTUVWXYZ23456789'

def generate_with_randbits(size=32):
    def chop(x):
        # yield the low 5 bits (0..31), then shift them away, until nothing is left
        while x:
            yield x & 31
            x = x >> 5
    # ljust pads with 'A' (index 0) when the high-order chunks happened to be zero
    return ''.join(_ALPHABET[x] for x in chop(random.getrandbits(size * 5))).ljust(size, 'A')

…create a generator that takes out 5 bit numbers at a time 0..31 until none left

…join() the results of the generator on a random number with the right bits

With Timeit, for 32-character strings, the timing was:

[('generate_with_random_choice', 28.92901611328125),
 ('generate_with_randbits', 20.0293550491333)]

…but for 64 character strings, randbits loses out ;)

I would probably never use this approach in production code unless I really disliked my co-workers.

edit: updated to suit the question (uppercase and digits only), and use bitwise operators & and >> instead of % and //


Answer 11

I’d do it this way:

import random
from string import digits, ascii_uppercase

legals = digits + ascii_uppercase

def rand_string(length, char_set=legals):

    output = ''
    for _ in range(length): output += random.choice(char_set)
    return output

Or just:

def rand_string(length, char_set=legals):

    return ''.join( random.choice(char_set) for _ in range(length) )

Answer 12

Use NumPy’s random.choice() function:

import numpy as np
import string        

if __name__ == '__main__':
    length = 16
    a = np.random.choice(list(string.ascii_uppercase + string.digits), length)                
    print(''.join(a))

Documentation is here http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.choice.html


Answer 13

Sometimes 0 (zero) and O (letter O) can be confused, so I use:

import uuid
# the hex digits are 0-9 and A-F, so in practice only '0' is ever replaced
uuid.uuid4().hex[:6].upper().replace('0','X').replace('O','Y')

Answer 14

>>> import string
>>> import random

The following logic still generates a 6-character random sample, without multiplying the pool by 6:

>>> print(''.join(random.sample(string.ascii_uppercase + string.digits, 6)))
JT7K3Q

Note that random.sample picks without replacement, so each character appears at most once here. Multiplying the pool by 6 is what allows a character to repeat:

>>> print(''.join(random.sample((string.ascii_uppercase + string.digits) * 6, 6)))
TK82HK
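The difference between sampling with and without replacement can be sketched as follows. The use of random.choices (available from Python 3.6) is an addition for illustration and is not part of the answer:

```python
import random
import string

pool = string.ascii_uppercase + string.digits

# without replacement: every character is distinct, length is limited to len(pool)
unique = ''.join(random.sample(pool, 6))

# with replacement: characters may repeat
repeats_allowed = ''.join(random.choices(pool, k=6))

print(unique, repeats_allowed)
```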

Answer 15

For those of you who enjoy functional Python (the code below is written for Python 2):

from itertools import imap, starmap, islice, repeat
from functools import partial
from string import letters, digits, join
from random import choice

join_chars = partial(join, sep='')
identity = lambda o: o

def irand_seqs(symbols=join_chars((letters, digits)), length=6, join=join_chars, select=choice, breakup=islice):
    """ Generates an indefinite sequence of joined random symbols each of a specific length
    :param symbols: symbols to select,
        [defaults to string.letters + string.digits, digits 0 - 9, lower and upper case English letters.]
    :param length: the length of each sequence,
        [defaults to 6]
    :param join: method used to join selected symbol, 
        [defaults to ''.join generating a string.]
    :param select: method used to select a random element from the given population.
        [defaults to random.choice, which selects a single element randomly]
    :return: indefinite iterator generating random sequences of the given [:param length]
    >>> from tools import irand_seqs
    >>> strings = irand_seqs()
    >>> a = next(strings)
    >>> assert isinstance(a, (str, unicode))
    >>> assert len(a) == 6
    >>> assert next(strings) != next(strings)
    """
    return imap(join, starmap(breakup, repeat((imap(select, repeat(symbols)), None, length))))

It generates an indefinite (infinite) iterator of joined random sequences. It first generates an indefinite sequence of randomly selected symbols from the given pool, then breaks this sequence into chunks of the requested length and joins each chunk. It should work with any sequence that supports __getitem__. By default it simply generates random strings of alphanumeric characters, though you can easily modify it to generate other things.

For example, to generate random tuples of digits:

>>> irand_tuples = irand_seqs(xrange(10), join=tuple)
>>> next(irand_tuples)
(0, 5, 5, 7, 2, 8)
>>> next(irand_tuples)
(3, 2, 2, 0, 3, 1)

If you don’t want to call next for each new value, you can simply make it callable:

>>> irand_tuples = irand_seqs(xrange(10), join=tuple)
>>> make_rand_tuples = partial(next, irand_tuples) 
>>> make_rand_tuples()
(1, 6, 2, 8, 1, 9)

If you want to generate the sequence on the fly, simply set join to identity:

>>> irand_tuples = irand_seqs(xrange(10), join=identity)
>>> selections = next(irand_tuples)
>>> next(selections)
8
>>> list(selections)
[6, 3, 8, 2, 2]

As others have mentioned, if you need more security, set an appropriate select function:

>>> from random import SystemRandom
>>> rand_strs = irand_seqs(select=SystemRandom().choice)
>>> next(rand_strs)
'QsaDxQ'

The default selector is choice, which may select the same symbol multiple times within a chunk. If you instead want each member selected at most once per chunk, one possible usage is:

>>> from random import sample
>>> irand_samples = irand_seqs(xrange(10), length=1, join=next, select=lambda pool: sample(pool, 6))
>>> next(irand_samples)
[0, 9, 2, 3, 1, 6]

We use sample as the selector to do the complete selection, so the chunks actually have length 1, and to join we simply call next, which fetches the next completely generated chunk. Granted, this example seems a bit cumbersome, and it is …
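For readers on Python 3, where imap, string.letters and string.join no longer exist, a simplified sketch of the same idea might look like this. It is an assumption-laden rewrite rather than a port: it drops the breakup parameter and uses a plain generator function.

```python
import random
import string
from itertools import islice

def irand_seqs(symbols=string.ascii_letters + string.digits, length=6,
               join=''.join, select=random.choice):
    """Yield an endless stream of joined random chunks of `length` symbols."""
    while True:
        # draw `length` symbols lazily and hand them to `join`
        yield join(select(symbols) for _ in range(length))

strings = irand_seqs()
first = next(strings)

# tuples of digits instead of strings
digit_tuples = irand_seqs(range(10), join=tuple)
one_tuple = next(digit_tuples)

# islice takes a few values from the endless iterator
batch = list(islice(irand_seqs(length=4), 3))
print(first, one_tuple, batch)
```

As in the original, passing select=random.SystemRandom().choice swaps in the operating system's random source.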


Answer 16

(1) This will give you all caps and numbers:

import string, random
passkey = ''
for x in range(8):
    # flip a coin: append either an uppercase letter or a digit
    if random.choice([1,2]) == 1:
        passkey += random.choice(string.ascii_uppercase)
    else:
        passkey += random.choice(string.digits)
print(passkey)

(2) If you later want to include lowercase letters in your key, then this will also work:

import string, random
passkey = ''
for x in range(8):
    # flip a coin: append either a letter (any case) or a digit
    if random.choice([1,2]) == 1:
        passkey += random.choice(string.ascii_letters)
    else:
        passkey += random.choice(string.digits)
print(passkey)

Answer 17

This is a take on Anurag Uniyal’s response and something that I was working on myself.

import random
import string

chars = string.ascii_uppercase + string.digits + string.punctuation

# ask how many keys to generate, then write one "<number>: <key>" line per key
count = int(input('How many 12 character keys do you want? '))
with open('Numbers.txt', 'w') as one_file:
    for _ in range(count):
        number = random.randint(1, 999)
        # chars*6 lets a character appear up to 6 times within a key
        key = ''.join(random.sample(chars*6, 12))
        one_file.write(str(number) + ": " + key + "\n")

Answer 18

>>> import random
>>> result = []
>>> chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'
>>> num = int(input('How long do you want the string to be?  '))
How long do you want the string to be?  10
>>> for k in range(num):
...    result.append(random.choice(chars))
...
>>> result = "".join(result)
>>> result
'tm2JUQ04CK'

The random.choice function picks a random entry from a sequence. The list is created first so that each character can be appended to it in the for loop. (It is named result rather than str to avoid shadowing the built-in str; on Python 2, use raw_input instead of input.) After the loop, result is ['t', 'm', '2', 'J', 'U', 'Q', '0', '4', 'C', 'K'], and result = "".join(result) turns that into 'tm2JUQ04CK'.

Hope this helps!


Answer 19

import string
from random import choice, randint
characters = string.ascii_letters + string.punctuation + string.digits
# random length between 8 and 16 characters
password = "".join(choice(characters) for x in range(randint(8, 16)))
print(password)

Answer 20

import random
import string

# pool of allowed characters; remove any you do not need
chars = list(string.ascii_lowercase + string.digits)
length = 128

# print one random string of the chosen length
for i in range(length):
    x = random.choice(chars)
    print(x, end="")
print()

Here the length of the string can be changed through the length variable used in the for loop, i.e. for i in range(length). It is a simple algorithm that is easy to understand. It uses a list, so you can discard characters that you do not need.


Answer 21

A simple one:

import string
import random
# ascii_lowercase and ascii_uppercase exist in both Python 2 and Python 3
character = string.ascii_lowercase + string.ascii_uppercase + string.digits + string.punctuation
char_len = len(character)
# you can specify your password length here
pass_len = random.randint(10,20)
password = ''
for x in range(pass_len):
    password = password + character[random.randint(0,char_len-1)]
print(password)

Answer 22

I would like to suggest another option:

import crypt  # Unix-only; the crypt module was removed in Python 3.13
n = 10
# keep the last n characters of the hash after dropping '/' and '.'
crypt.crypt("any string").replace('/', '').replace('.', '').upper()[-n:]

Paranoid mode:

import uuid
import crypt
n = 10
crypt.crypt(str(uuid.uuid4())).replace('/', '').replace('.', '').upper()[-n:]

Answer 23

Two methods:

import random, math

def randStr_1(chars:str, length:int) -> str:
    # repeat the pool until it is at least `length` long, cut it, then shuffle
    chars *= math.ceil(length / len(chars))
    chars = chars[0:length]
    chars = list(chars)
    random.shuffle(chars)

    return ''.join(chars)

def randStr_2(chars:str, length:int) -> str:
    # pick each character independently
    return ''.join(random.choice(chars) for i in range(length))


Benchmark :

from timeit import timeit

setup = """
import os, subprocess, time, string, random, math

def randStr_1(letters:str, length:int) -> str:
    letters *= math.ceil(length / len(letters))
    letters = letters[0:length]
    letters = list(letters)
    random.shuffle(letters)
    return ''.join(letters)

def randStr_2(letters:str, length:int) -> str:
    return ''.join(random.choice(letters) for i in range(length))
"""

print('Method 1 vs Method 2', ', run 10 times each.')

for length in [100,1000,10000,50000,100000,500000,1000000]:
    print(length, 'characters:')

    eff1 = timeit("randStr_1(string.ascii_letters, {})".format(length), setup=setup, number=10)
    eff2 = timeit("randStr_2(string.ascii_letters, {})".format(length), setup=setup, number=10)
    print('\t{}s : {}s'.format(round(eff1, 6), round(eff2, 6)))
    print('\tratio = {} : {}\n'.format(eff1/eff1, round(eff2/eff1, 2)))

Output :

Method 1 vs Method 2 , run 10 times each.

100 characters:
    0.001411s : 0.00179s
    ratio = 1.0 : 1.27

1000 characters:
    0.013857s : 0.017603s
    ratio = 1.0 : 1.27

10000 characters:
    0.13426s : 0.151169s
    ratio = 1.0 : 1.13

50000 characters:
    0.709403s : 0.855136s
    ratio = 1.0 : 1.21

100000 characters:
    1.360735s : 1.674584s
    ratio = 1.0 : 1.23

500000 characters:
    6.754923s : 7.160508s
    ratio = 1.0 : 1.06

1000000 characters:
    11.232965s : 14.223914s
    ratio = 1.0 : 1.27

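The first method performs better. Note, however, that it only shuffles a fixed, repeated copy of the pool, so the character counts in its output are predetermined, whereas the second method picks every character independently.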
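A third variant worth trying alongside the two above is random.choices, which draws all characters with replacement in one call (Python 3.6+). This is a sketch that was not part of the benchmark, so no timing is claimed for it; the function name is made up for illustration.

```python
import random
import string

def rand_str_choices(chars: str, length: int) -> str:
    # random.choices returns a list of `length` independent picks from `chars`
    return ''.join(random.choices(chars, k=length))

sample_output = rand_str_choices(string.ascii_letters, 20)
print(sample_output)
```

It can be dropped into the same timeit harness by adding the function to the setup string.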


Answer 24

I have gone through almost all of the answers, but none of them looks simple enough. I would suggest trying the passgen library, which is generally used to create random passwords.

You can generate random strings with your choice of length, punctuation, digits, letters and case.

Here’s the code for your case:

from passgen import passgen
string_length = int(input())
random_string = passgen(length=string_length, punctuation=False, digits=True, letters=True, case='upper')

Answer 25

Generate a random 16-byte ID containing letters, digits, '_' and '-':

import os, string
os.urandom(16).translate((f'{string.ascii_letters}{string.digits}-_'*4).encode('ascii'))
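To see why the one-liner works, here is a sketch that names the pieces. ALPHABET, TABLE and random_id are illustrative names, and decoding the result to str is an added step: the 64-character alphabet repeated 4 times gives a 256-entry table, one entry per possible byte value, so every random byte maps to an allowed character.

```python
import os
import string

# 64 symbols repeated 4 times gives a 256-entry table, one entry per byte value
ALPHABET = string.ascii_letters + string.digits + '-_'
TABLE = (ALPHABET * 4).encode('ascii')

def random_id(nbytes=16):
    # bytes.translate maps each random byte through the table; decode to get a str
    return os.urandom(nbytes).translate(TABLE).decode('ascii')

rid = random_id()
print(rid)
```

Because 256 is an exact multiple of 64, each character is equally likely.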


Answer 26

import string, random
lower = string.ascii_lowercase
upper = string.ascii_uppercase
digits = string.digits
special = '!"£$%^&*.,@#/?'

def rand_pass(l=4, u=4, d=4, s=4):
    # draw the requested number of characters from each class
    p = [random.choice(lower) for x in range(l)]
    p += [random.choice(upper) for x in range(u)]
    p += [random.choice(digits) for x in range(d)]
    p += [random.choice(special) for x in range(s)]
    # shuffle so the classes are not grouped together
    random.shuffle(p)
    return "".join(p)

print(rand_pass())
# @5U,@A4yIZvnp%51

Answer 27

I found this to be simpler and cleaner.

import random

str_Key           = ""
str_FullKey       = ""
str_CharacterPool = "01234ABCDEFfghij~>()"
for int_I in range(64):
    str_Key = random.choice(str_CharacterPool)
    str_FullKey = str_FullKey + str_Key

Just change the 64 to vary the length, and change str_CharacterPool to use letters only, letters and digits, digits only, unusual characters, or whatever you want.


Pythonic way to create a long multi-line string

Question: Pythonic way to create a long multi-line string

I have a very long query. I would like to split it over several lines in Python. A way to do it in JavaScript would be to use several strings and join them with a + operator (I know, maybe it’s not the most efficient way to do it, but I’m not really concerned about performance at this stage, just code readability). Example:

var long_string = 'some text not important. just garbage to' +
                  'illustrate my example';

I tried doing something similar in Python, but it didn’t work, so I used \ to split the long string. However, I’m not sure if this is the only/best/most Pythonic way of doing it. It looks awkward. Actual code:

query = 'SELECT action.descr as "action", '\
    'role.id as role_id,'\
    'role.descr as role'\
    'FROM '\
    'public.role_action_def,'\
    'public.role,'\
    'public.record_def, '\
    'public.action'\
    'WHERE role.id = role_action_def.role_id AND'\
    'record_def.id = role_action_def.def_id AND'\
    'action.id = role_action_def.action_id AND'\
    'role_action_def.account_id = ' + account_id + ' AND'\
    'record_def.account_id=' + account_id + ' AND'\
    'def_id=' + def_id

Answer 0

Are you talking about multi-line strings? Easy, use triple quotes to start and end them.

s = """ this is a very
        long string if I had the
        energy to type more and more ..."""

You can use single quotes too (3 of them of course at start and end) and treat the resulting string s just like any other string.

NOTE: Just as with any string, anything between the starting and ending quotes becomes part of the string, so this example has a leading blank (as pointed out by @root45). This string will also contain both blanks and newlines.

That is:

' this is a very\n        long string if I had the\n        energy to type more and more ...'

Finally, one can also construct long lines in Python like this:

s = ("this is a very"
     "long string too"
     "for sure ..."
    )

which will not include any extra blanks or newlines (this is a deliberate example showing what happens when the blanks are left out):

'this is a verylong string toofor sure ...'

No commas are required; simply place the strings to be joined inside a pair of parentheses, and be sure to account for any needed blanks and newlines.
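The two behaviours described in this answer can be checked side by side. This is a small sketch with made-up variable names, using shortened strings:

```python
# triple quotes keep every newline and every leading space
triple = """ this is a very
        long string"""

# adjacent literals inside parentheses are joined with nothing in between,
# so the trailing space after "very" has to be written explicitly
joined = ("this is a very "
          "long string too")

print(repr(triple))
print(repr(joined))
```

The repr output makes the embedded newline and indentation in the first string visible, while the second is a single line.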


Answer 1

If you don’t want a multiline string but just have a long single-line string, you can use parentheses. Just make sure you don’t include commas between the string segments; otherwise it will be a tuple.

query = ('SELECT   action.descr as "action", '
         'role.id as role_id,'
         'role.descr as role'
         ' FROM '
         'public.role_action_def,'
         'public.role,'
         'public.record_def, '
         'public.action'
         ' WHERE role.id = role_action_def.role_id AND'
         ' record_def.id = role_action_def.def_id AND'
         ' action.id = role_action_def.action_id AND'
         ' role_action_def.account_id = '+account_id+' AND'
         ' record_def.account_id='+account_id+' AND'
         ' def_id='+def_id)

In a SQL statement like what you’re constructing, multiline strings would also be fine. But if the extra whitespace a multiline string would contain would be a problem, then this would be a good way to achieve what you want.
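The comma pitfall mentioned above is easy to demonstrate. A minimal sketch, with illustrative variable names and a shortened query:

```python
# no commas: adjacent literals are concatenated into one string
as_string = ('SELECT foo '
             'FROM bar')

# with a comma: the parentheses now build a tuple of two strings
as_tuple = ('SELECT foo ',
            'FROM bar')

print(type(as_string).__name__, type(as_tuple).__name__)
```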


Answer 2

Breaking lines by \ works for me. Here is an example:

longStr = "This is a very long string " \
        "that I wrote to help somebody " \
        "who had a question about " \
        "writing long strings in Python"

Answer 3

I found myself happy with this one:

string = """This is a
very long string,
containing commas,
that I split up
for readability""".replace('\n',' ')

Answer 4

I find that when building long strings, you are usually doing something like building an SQL query, in which case this is best:

query = ' '.join((  # note double parens, join() takes an iterable
    "SELECT foo",
    "FROM bar",
    "WHERE baz",
))

What Levon suggested is good, but might be vulnerable to mistakes:

query = (
    "SELECT foo"
    "FROM bar"
    "WHERE baz"
)

query == "SELECT fooFROM barWHERE baz"  # probably not what you want
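One practical benefit of the join approach is that clauses can be added conditionally without worrying about separators. A small sketch, where build_query and its parameters are hypothetical names used only for illustration:

```python
def build_query(table, where=None):
    # collect the clauses in a list; join inserts the single spaces between them
    parts = ["SELECT foo", "FROM " + table]
    if where:
        parts.append("WHERE " + where)
    return ' '.join(parts)

print(build_query("bar", "baz"))
print(build_query("bar"))
```

For real queries, values should still be passed to the database driver as parameters rather than concatenated into the string.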

Answer 5

You can also concatenate variables in when using the """ notation:

foo = '1234'

long_string = """fosdl a sdlfklaskdf as
as df ajsdfj asdfa sld
a sdf alsdfl alsdfl """ +  foo + """ aks
asdkfkasdk fak"""

EDIT: Found a better way, with named params and .format():

body = """
<html>
<head>
</head>
<body>
    <p>Lorem ipsum.</p>
    <dl>
        <dt>Asdf:</dt>     <dd><a href="{link}">{name}</a></dd>
    </dl>
    </body>
</html>
""".format(
    link='http://www.asdf.com',
    name='Asdf',
)

print(body)

Answer 6

This approach uses:

  • just one backslash to avoid an initial linefeed (only needed with textwrap.dedent; inspect.cleandoc removes the leading blank line for you)
  • almost no internal punctuation, by using a triple-quoted string
  • the inspect module (originally textwrap) to strip away local indentation
  • Python 3.6 formatted string interpolation (f-strings) for the account_id and def_id variables.

This way looks the most pythonic to me.

# import textwrap  # See update to answer below
import inspect

# query = textwrap.dedent(f'''\
query = inspect.cleandoc(f'''
    SELECT action.descr as "action", 
    role.id as role_id,
    role.descr as role
    FROM 
    public.role_action_def,
    public.role,
    public.record_def, 
    public.action
    WHERE role.id = role_action_def.role_id AND
    record_def.id = role_action_def.def_id AND
    action.id = role_action_def.action_id AND
    role_action_def.account_id = {account_id} AND
    record_def.account_id={account_id} AND
    def_id={def_id}'''
)

Update (1/29/2019): incorporated @ShadowRanger’s suggestion to use inspect.cleandoc instead of textwrap.dedent.
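The difference between the two helpers can be seen on a shortened query. This is a sketch: the account_id value is made up, and the query is trimmed for brevity.

```python
import inspect
import textwrap

account_id = 42  # hypothetical value for illustration

raw = f'''
    SELECT role.id
    FROM public.role
    WHERE role.account_id = {account_id}
'''

# dedent removes the common indentation but keeps the outer newlines
dedented = textwrap.dedent(raw)
# cleandoc also strips the leading and trailing blank lines
cleaned = inspect.cleandoc(raw)

print(repr(dedented))
print(repr(cleaned))
```

With textwrap.dedent, the backslash after the opening quotes is what suppresses the first newline; inspect.cleandoc makes it unnecessary.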


Answer 7

In Python >= 3.6 you can use Formatted string literals (f string)

query = f'''SELECT action.descr as "action",
    role.id as role_id,
    role.descr as role
    FROM
    public.role_action_def,
    public.role,
    public.record_def,
    public.action
    WHERE role.id = role_action_def.role_id AND
    record_def.id = role_action_def.def_id AND
    action.id = role_action_def.action_id AND
    role_action_def.account_id = {account_id} AND
    record_def.account_id = {account_id} AND
    def_id = {def_id}'''

Answer 8

For example:

sql = ("select field1, field2, field3, field4 "
       "from table "
       "where condition1={} "
       "and condition2={}").format(1, 2)

Output: 'select field1, field2, field3, field4 from table 
         where condition1=1 and condition2=2'

If the condition values should be strings, quote the placeholders like this:

sql = ("select field1, field2, field3, field4 "
       "from table "
       "where condition1='{0}' "
       "and condition2='{1}'").format('2016-10-12', '2017-10-12')

Output: "select field1, field2, field3, field4 from table where
         condition1='2016-10-12' and condition2='2017-10-12'"

Answer 9

I find textwrap.dedent the best for long strings as described here:

def create_snippet():
    code_snippet = textwrap.dedent("""\
        int main(int argc, char* argv[]) {
            return 0;
        }
    """)
    do_something(code_snippet)

Answer 10

Others have mentioned the parentheses method already, but I’d like to add that with parentheses, inline comments are allowed.

Comment on each fragment:

nursery_rhyme = (
    'Mary had a little lamb,'          # Comments are great!
    'its fleece was white as snow.'
    'And everywhere that Mary went,'
    'her sheep would surely go.'       # What a pesky sheep.
)

Comment not allowed after continuation:

When using backslash line continuations (\ ), comments are not allowed. You’ll receive a SyntaxError: unexpected character after line continuation character error.

nursery_rhyme = 'Mary had a little lamb,' \  # These comments
    'its fleece was white as snow.'       \  # are invalid!
    'And everywhere that Mary went,'      \
    'her sheep would surely go.'
# => SyntaxError: unexpected character after line continuation character

Better comments for Regex strings:

Based on the example from https://docs.python.org/3/library/re.html#re.VERBOSE,

a = re.compile(
    r'\d+'  # the integral part
    r'\.'   # the decimal point
    r'\d*'  # some fractional digits
)
# With the VERBOSE flag, IDEs usually can't syntax-highlight the comments inside the string.
a = re.compile(r"""\d +  # the integral part
                   \.    # the decimal point
                   \d *  # some fractional digits""", re.X)
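
One thing to watch with the parentheses method: adjacent string literals are joined with nothing in between, so the nursery rhyme above would run together as "lamb,its fleece". A small sketch with the separating spaces written out explicitly (the variable name is only illustrative):

```python
# Adjacent literals are concatenated exactly as written, so each fragment
# that should be followed by a space has to carry that space itself.
rhyme = (
    'Mary had a little lamb, '         # trailing space keeps words apart
    'its fleece was white as snow.'    # last fragment needs no separator
)

print(rhyme)
```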

Answer 11

I personally find the following to be the best (simple, safe and Pythonic) way to write raw SQL queries in Python, especially when using Python’s sqlite3 module:

query = '''
    SELECT
        action.descr as action,
        role.id as role_id,
        role.descr as role
    FROM
        public.role_action_def,
        public.role,
        public.record_def,
        public.action
    WHERE
        role.id = role_action_def.role_id
        AND record_def.id = role_action_def.def_id
        AND action.id = role_action_def.action_id
        AND role_action_def.account_id = ?
        AND record_def.account_id = ?
        AND def_id = ?
'''
vars = (account_id, account_id, def_id)   # a tuple of query variables
cursor.execute(query, vars)   # using Python's sqlite3 module

Pros

  • Neat and simple code (Pythonic!)
  • Safe from SQL injection
  • Compatible with both Python 2 and Python 3 (it’s Pythonic after all)
  • No string concatenation required
  • No need to ensure that the right-most character of each line is a space

Cons

  • Since variables in the query are replaced by the ? placeholder, it may become a little difficult to keep track of which ? is to be substituted by which Python variable when there are lots of them in the query.
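
The con listed above can be softened with sqlite3's named placeholders, which let each value be referred to by name instead of by position. This is only a sketch: the table, its columns and the data are invented for the demo and are not the schema from the question.

```python
import sqlite3

# In-memory database with a hypothetical table, just for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record_def (id INTEGER, account_id INTEGER)")
conn.executemany(
    "INSERT INTO record_def (id, account_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20)],
)

# Named placeholders (:name) are filled from a dict, so it stays clear
# which Python value goes where, even in a long query.
query = '''
    SELECT id
    FROM record_def
    WHERE account_id = :account_id
      AND id >= :min_id
    ORDER BY id
'''
params = {"account_id": 10, "min_id": 2}
rows = conn.execute(query, params).fetchall()
print(rows)
conn.close()
```

The same dict entry can be referenced several times in one query, which would also avoid passing account_id twice as in the tuple version.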

Answer 12

I usually use something like this:

text = '''
    This string was typed to be a demo
    on how could we write a multi-line
    text in Python.
'''

If you want to remove annoying blank spaces in each line, you could do as follows:

text = '\n'.join(line.lstrip() for line in text.splitlines())
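
Note that lstrip() on every line also flattens any indentation you wanted to keep inside the text. If that matters, textwrap.dedent is a possible alternative, since it only removes the indentation common to all lines. A small comparison on a made-up string:

```python
import textwrap

text = '''
    Outer line
        nested line
    Outer again
'''

# lstrip() per line removes ALL leading whitespace, nested or not.
flattened = '\n'.join(line.lstrip() for line in text.splitlines())

# dedent() removes only the shared indentation; strip('\n') then drops
# the blank first line and the final newline.
dedented = textwrap.dedent(text).strip('\n')

print(repr(flattened))
print(repr(dedented))
```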

Answer 13

Your actual code shouldn't work: you are missing whitespace at the end of the "lines" (e.g. role.descr as roleFROM...).

Use triple quotes for a multiline string:

string = """line
  line2
  line3"""

It will contain the line breaks and extra spaces, but for SQL that’s not a problem.


Answer 14

You can also place the SQL statement in a separate file, action.sql, and load it in the .py file with

with open('action.sql') as f:
   query = f.read()

That way the SQL statements are kept separate from the Python code. If the SQL statement has parameters that need to be filled in from Python, you can use string formatting (like %s or {field}).


Answer 15

The “à la Scala” way (though I think it is the most Pythonic way, as the original question demands):

description = """
            | The intention of this module is to provide a method to 
            | pass meta information in markdown_ header files for 
            | using it in jinja_ templates. 
            | 
            | Also, to provide a method to use markdown files as jinja 
            | templates. Maybe you prefer to see the code than 
            | to install it.""".replace('\n            | \n','\n').replace('            | ',' ')

If you want the final str without line breaks, just put \n at the start of the first argument of the second replace:

.replace('\n            | ',' ')

Note: the white line between “…templates.” and “Also, …” requires a whitespace after the |.
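
If you use this margin style often, the chained replace calls can be wrapped in a small helper so they no longer depend on the exact indentation width. The function below is a hypothetical sketch loosely modeled on Scala's stripMargin, not a standard-library function. Unlike Scala's version, it also drops one optional space after the margin character.

```python
import re


def strip_margin(text, margin="|"):
    """Remove leading whitespace up to and including the margin character
    (plus one optional following space) from every line."""
    pattern = re.compile(r"^[ \t]*" + re.escape(margin) + r" ?", re.MULTILINE)
    return pattern.sub("", text)


description = strip_margin("""\
    | The intention of this module is to provide a method to
    | pass meta information in markdown header files.
    |
    | Also, to provide a method to use markdown files as templates.""")

print(description)
```

Because the space after the margin is optional in the pattern, the empty separator line does not need a trailing space after the |.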


Answer 16

tl;dr: Use """\ and """ to wrap the string, as in

string = """\
This is a long string
spanning multiple lines.
"""

From the official python documentation:

String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...'''. End of lines are automatically included in the string, but it's possible to prevent this by adding a \ at the end of the line. The following example:

print("""\
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
""")

produces the following output (note that the initial newline is not included):

Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
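
To see exactly what the backslash changes, compare the two literals below (the variable names are only illustrative):

```python
with_backslash = """\
first line
second line
"""

without_backslash = """
first line
second line
"""

# The backslash right after the opening quotes swallows the first newline.
print(repr(with_backslash))
print(repr(without_backslash))
```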

Answer 17

Try something like this. In this format it returns one continuous line, such as "you have successfully inquired about this property":

"message": f'you have successfully inquired about '
           f'{enquiring_property.title} Property owned by '
           f'{enquiring_property.client}'

Answer 18

I use a recursive function to build complex SQL Queries. This technique can generally be used to build large strings while maintaining code readability.

# Utility function to recursively resolve SQL statements.
# CAUTION: Use this function carefully, Pass correct SQL parameters {},
# TODO: This should never happen but check for infinite loops
def resolveSQL(sql_seed, sqlparams):
    sql = sql_seed % (sqlparams)
    if sql == sql_seed:
        return ' '.join([x.strip() for x in sql.split()])
    else:
        return resolveSQL(sql, sqlparams)

P.S: Have a look at the awesome python-sqlparse library to pretty print SQL queries if needed. http://sqlparse.readthedocs.org/en/latest/api/#sqlparse.format
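
Here is a rough usage sketch of the recursive idea, assuming the parameters are passed as a dict and referenced with %(name)s placeholders. The function is restated (as resolve_sql) so the example runs on its own. The fragment names and values are hypothetical and are assumed to be trusted, since they are substituted as plain text.

```python
def resolve_sql(sql_seed, sqlparams):
    """Substitute %(name)s placeholders repeatedly until nothing changes."""
    sql = sql_seed % sqlparams
    if sql == sql_seed:
        # Nothing left to substitute: collapse whitespace to single spaces.
        return ' '.join(sql.split())
    return resolve_sql(sql, sqlparams)


# Hypothetical fragments; "cond" itself contains another placeholder,
# which is why more than one substitution pass is needed.
params = {
    "table": "public.role",
    "cond": "id = %(role_id)s",
    "role_id": 3,
}
query = resolve_sql("""
    SELECT descr
    FROM %(table)s
    WHERE %(cond)s
""", params)
print(query)
```

A literal % in the SQL (for example in a LIKE pattern) would need to be written as %% for this to work.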


Answer 19

Another option that I think is more readable when the code (e.g. a variable assignment) is indented and the output string should be a one-liner (no newlines):

def some_method():

    long_string = """
a presumptuous long string 
which looks a bit nicer 
in a text editor when
written over multiple lines
""".strip('\n').replace('\n', ' ')

    return long_string 

Answer 20

Use triple quotation marks. People often use these to create docstrings at the start of programs to explain their purpose and other information relevant to its creation. People also use these in functions to explain the purpose and application of functions. Example:

'''
Filename: practice.py
File creator: me
File purpose: explain triple quotes
'''


def example():
    """This prints a string that occupies multiple lines!!"""
    print("""
    This
    is 
    a multi-line
    string!
    """)

Answer 21

I like this approach because it prioritizes readability. When we have long strings there is no other way: depending on the level of indentation you are at, you are still limited to 80 characters per line. In my view the Python style guides are still very vague here. I took @Eero Aaltonen's approach because it favors readability and common sense. Style guides should help us, not make our lives a mess. Thanks!

class ClassName():
    def method_name():
        if condition_0:
            if condition_1:
                if condition_2:
                    some_variable_0 =\
"""
some_js_func_call(
    undefined, 
    {
        'some_attr_0': 'value_0', 
        'some_attr_1': 'value_1', 
        'some_attr_2': '""" + some_variable_1 + """'
    }, 
    undefined, 
    undefined, 
    true
)
"""

Answer 22

From the official python documentation:

String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...'''. End of lines are automatically included in the string, but it's possible to prevent this by adding a \ at the end of the line. The following example:

print("""\
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
""")

produces the following output (note that the initial newline is not included):


Answer 23

For defining a long string inside a dict, keeping the newlines but omitting the spaces, I ended up defining the string in a constant like this:

LONG_STRING = \
"""
This is a long string
that contains newlines.
The newlines are important.
"""

my_dict = {
   'foo': 'bar',
   'string': LONG_STRING
}

Answer 24

As a general approach to long strings in Python you can use triple quotes, split and join:

_str = ' '.join('''Lorem ipsum dolor sit amet, consectetur adipiscing 
        elit, sed do eiusmod tempor incididunt ut labore et dolore 
        magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation 
        ullamco laboris nisi ut aliquip ex ea commodo.'''.split())

Output:

'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo.'

With regard to OP’s question relating to a SQL query, the answer below disregards the correctness of this approach to building SQL queries and focuses only on building long strings in a readable and aesthetic way without additional imports. It also disregards the computational load this entails.

Using triple quotes we build a long and readable string which we then break up into a list using split() thereby stripping the whitespace and then join it back together with ' '.join(). Finally we insert the variables using the format() command:

account_id = 123
def_id = 321

_str = '''
    SELECT action.descr AS "action", role.id AS role_id, role.descr AS role 
    FROM public.role_action_def, public.role, public.record_def, public.action
    WHERE role.id = role_action_def.role_id 
    AND record_def.id = role_action_def.def_id 
    AND action.id = role_action_def.action_id 
    AND role_action_def.account_id = {} 
    AND record_def.account_id = {} 
    AND def_id = {}
    '''

query = ' '.join(_str.split()).format(account_id, account_id, def_id)

Produces:

SELECT action.descr AS "action", role.id AS role_id, role.descr AS role FROM public.role_action_def, public.role, public.record_def, public.action WHERE role.id = role_action_def.role_id AND record_def.id = role_action_def.def_id AND action.id = role_action_def.action_id AND role_action_def.account_id = 123 AND record_def.account_id = 123 AND def_id = 321

Edit: This approach is not in line with PEP8 but I find it useful at times
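
One caveat with the split/join trick: it collapses every run of whitespace, including whitespace inside quoted SQL literals. A short sketch with a made-up query shows the effect:

```python
# Hypothetical query whose string literal contains two consecutive spaces.
sql = '''
    SELECT name
    FROM users
    WHERE note = 'two  spaces'
'''

# split() with no arguments splits on any whitespace run, so the double
# space inside the quoted literal is collapsed as well.
collapsed = ' '.join(sql.split())
print(collapsed)
```

So the technique is safest when values are supplied separately rather than written into the query text.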


Answer 25

Generally, I use list and join for multi-line comments/string.

lines = list()
lines.append('SELECT action.descr as "action", ')
lines.append('role.id as role_id,')
lines.append('role.descr as role')
lines.append('FROM ')
lines.append('public.role_action_def,')
lines.append('public.role,')
lines.append('public.record_def, ')
lines.append('public.action')
query = " ".join(lines)

You can use any string to join the list elements, such as '\n' (newline), ',' (comma) or ' ' (space).

Cheers..!!


How do I trim whitespace from a string?

Question: How do I trim whitespace from a string?

How do I remove leading and trailing whitespace from a string in Python?

For example:

" Hello " --> "Hello"
" Hello"  --> "Hello"
"Hello "  --> "Hello"
"Bob has a cat" --> "Bob has a cat"

Answer 0

Just one space, or all consecutive spaces? If the second, then strings already have a .strip() method:

>>> ' Hello '.strip()
'Hello'
>>> ' Hello'.strip()
'Hello'
>>> 'Bob has a cat'.strip()
'Bob has a cat'
>>> '   Hello   '.strip()  # ALL consecutive spaces at both ends removed
'Hello'

If you need only to remove one space however, you could do it with:

def strip_one_space(s):
    if s.endswith(" "): s = s[:-1]
    if s.startswith(" "): s = s[1:]
    return s

>>> strip_one_space("   Hello ")
'  Hello'

Also, note that str.strip() removes other whitespace characters as well (e.g. tabs and newlines). To remove only spaces, you can specify the character to remove as an argument to strip, i.e.:

>>> "  Hello\n".strip(" ")
'Hello\n'

Answer 1

As pointed out in answers above

myString.strip()

will remove all the leading and trailing whitespace characters such as \n, \r, \t, \f, space.

For more flexibility use the following

  • Removes only leading whitespace chars: myString.lstrip()
  • Removes only trailing whitespace chars: myString.rstrip()
  • Removes specific whitespace chars: myString.strip('\n') or myString.lstrip('\n\r') or myString.rstrip('\n\t') and so on.

More details are available in the docs
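
A quick sketch of the three methods side by side, on a made-up sample string:

```python
s = "\t  Hello world \r\n"

both = s.strip()              # leading and trailing whitespace removed
left = s.lstrip()             # only leading whitespace removed
right = s.rstrip()            # only trailing whitespace removed
newlines = s.rstrip("\r\n")   # only trailing CR/LF removed, the space stays

for value in (both, left, right, newlines):
    print(repr(value))
```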


Answer 2

strip is not limited to whitespace characters either:

# remove all leading/trailing commas, periods and hyphens
title = title.strip(',.-')
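
Keep in mind that the argument is treated as a set of characters, not as a prefix or suffix. The sketch below uses made-up sample values. removeprefix requires Python 3.9 or later.

```python
title = "--.,Intro to Python.,-"

# Every leading/trailing character that appears in the set is removed,
# in any order and any number of times.
cleaned = title.strip(",.-")

# Because the argument is a character set, this strips more than "www.":
host = "www.example.com".strip("cmowz.")

# To remove an exact prefix instead, str.removeprefix (Python 3.9+) can be used.
no_www = "www.example.com".removeprefix("www.")

print(cleaned, host, no_www)
```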

Answer 3

This returns a copy of myString with all leading and trailing whitespace removed:

myString.strip()

Answer 4

You want strip():

myphrases = [ " Hello ", " Hello", "Hello ", "Bob has a cat" ]

for phrase in myphrases:
    print(phrase.strip())

Answer 5

I wanted to remove the excess spaces in a string (also inside the string, not only at the beginning or end). I made this because I didn't know how to do it otherwise:

string = "Name : David         Account: 1234             Another thing: something  " 

ready = False
while ready == False:
    pos = string.find("  ")
    if pos != -1:
       string = string.replace("  "," ")
    else:
       ready = True
print(string)

This replaces double spaces with a single space until there are no double spaces left.
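
For what it's worth, the same result can probably be had in one step with a regular expression that matches any run of two or more spaces. This is an alternative sketch, not the approach the answer itself uses:

```python
import re

string = "Name : David         Account: 1234             Another thing: something  "

# " {2,}" matches two or more consecutive spaces; each run becomes one space.
collapsed = re.sub(r" {2,}", " ", string)
print(repr(collapsed))
```

Like the loop, this leaves a single trailing space; add .strip() if that should go too.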


Answer 6

I could not find a solution to what I was looking for so I created some custom functions. You can try them out.

def cleansed(s: str):
    """:param s: String to be cleansed"""
    assert s is not (None or "")
    # return trimmed(s.replace('"', '').replace("'", ""))
    return trimmed(s)


def trimmed(s: str):
    """:param s: String to be cleansed"""
    assert s is not (None or "")
    ss = trim_start_and_end(s).replace('  ', ' ')
    while '  ' in ss:
        ss = ss.replace('  ', ' ')
    return ss


def trim_start_and_end(s: str):
    """:param s: String to be cleansed"""
    assert s is not (None or "")
    return trim_start(trim_end(s))


def trim_start(s: str):
    """:param s: String to be cleansed"""
    assert s is not (None or "")
    chars = []
    for c in s:
        if c != ' ' or len(chars) > 0:
            chars.append(c)
    return "".join(chars).lower()


def trim_end(s: str):
    """:param s: String to be cleansed"""
    assert s is not (None or "")
    chars = []
    for c in reversed(s):
        if c != ' ' or len(chars) > 0:
            chars.append(c)
    return "".join(reversed(chars)).lower()


s1 = '  b Beer '
s2 = 'Beer  b    '
s3 = '      Beer  b    '
s4 = '  bread butter    Beer  b    '

cdd = trim_start(s1)
cddd = trim_end(s2)
clean1 = cleansed(s3)
clean2 = cleansed(s4)

print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s1, len(s1), cdd, len(cdd)))
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s2, len(s2), cddd, len(cddd)))
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s3, len(s3), clean1, len(clean1)))
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s4, len(s4), clean2, len(clean2)))

Answer 7

If you want to trim a specified number of spaces from the left and right, you could do this:

def remove_outer_spaces(text, num_of_leading, num_of_trailing):
    text = list(text)
    for i in range(num_of_leading):
        if text[i] == " ":
            text[i] = ""
        else:
            break

    for i in range(1, num_of_trailing+1):
        if text[-i] == " ":
            text[-i] = ""
        else:
            break
    return ''.join(text)

txt1 = "   MY name is     "
print(remove_outer_spaces(txt1, 1, 1))  # result is: "  MY name is    "
print(remove_outer_spaces(txt1, 2, 3))  # result is: " MY name is  "
print(remove_outer_spaces(txt1, 6, 8))  # result is: "MY name is"

Answer 8

This can also be done with a regular expression

import re

input  = " Hello "
output = re.sub(r'^\s+|\s+$', '', input)
# output = 'Hello'

Answer 9

How do I remove leading and trailing whitespace from a string in Python?

The solution below removes leading and trailing whitespace and also collapses intermediate whitespace, which is useful if you need a clean string without runs of multiple spaces.

>>> str_1 = '     Hello World'
>>> print(' '.join(str_1.split()))
Hello World
>>>
>>>
>>> str_2 = '     Hello      World'
>>> print(' '.join(str_2.split()))
Hello World
>>>
>>>
>>> str_3 = 'Hello World     '
>>> print(' '.join(str_3.split()))
Hello World
>>>
>>>
>>> str_4 = 'Hello      World     '
>>> print(' '.join(str_4.split()))
Hello World
>>>
>>>
>>> str_5 = '     Hello World     '
>>> print(' '.join(str_5.split()))
Hello World
>>>
>>>
>>> str_6 = '     Hello      World     '
>>> print(' '.join(str_6.split()))
Hello World
>>>
>>>
>>> str_7 = 'Hello World'
>>> print(' '.join(str_7.split()))
Hello World

As you can see, this collapses every run of whitespace in the string (the output is Hello World in all cases), wherever the whitespace is located. But if you only need to remove leading and trailing whitespace, strip() is fine.
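To make the difference concrete, here is a small sketch that puts the two approaches side by side. The helper name collapse_whitespace is only an illustrative choice and does not come from the answer.

```python
def collapse_whitespace(text):
    """Trim both ends and squeeze every inner whitespace run to one space."""
    # str.split() with no arguments splits on any run of whitespace and
    # drops empty pieces, so leading and trailing whitespace disappears.
    return ' '.join(text.split())


sample = '     Hello      World     '
stripped = sample.strip()                # only the two ends are trimmed
collapsed = collapse_whitespace(sample)  # inner runs are squeezed as well

print(repr(stripped))
print(repr(collapsed))
```

strip() keeps the inner run of spaces intact, while the split/join version returns 'Hello World'.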


为什么使用’==’或’is’比较字符串有时会产生不同的结果?

问题:为什么使用’==’或’is’比较字符串有时会产生不同的结果?

我有一个Python程序,其中将两个变量设置为value 'public'。在条件表达式我有比较var1 is var2其失败,但如果我把它改为var1 == var2返回True

现在,如果我打开Python解释器并进行相同的“是”比较,则成功。

>>> s1 = 'public'
>>> s2 = 'public'
>>> s2 is s1
True

我在这里想念什么?

I’ve got a Python program where two variables are set to the value 'public'. In a conditional expression I have the comparison var1 is var2 which fails, but if I change it to var1 == var2 it returns True.

Now if I open my Python interpreter and do the same “is” comparison, it succeeds.

>>> s1 = 'public'
>>> s2 = 'public'
>>> s2 is s1
True

What am I missing here?


回答 0

is是身份测试,==是平等测试。您的代码中发生的情况将在解释器中进行模拟,如下所示:

>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False

所以,难怪他们不一样吧?

换句话说:isid(a) == id(b)

is is identity testing, == is equality testing. What happens in your code would be emulated in the interpreter like this:

>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False

So, no wonder they’re not the same, right?

In other words: a is b is equivalent to id(a) == id(b).
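A minimal sketch of the relationship between is and id(). It uses lists rather than strings on purpose, since two separately created lists are guaranteed to be distinct objects, which keeps the result independent of any string caching. The variable names are illustrative.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the very same object

same_value = (a == b)
same_object_ab = (a is b)
same_object_ac = (a is c)

# While both objects are alive, `x is y` gives the same answer as
# comparing their id() values.
ids_agree = (same_object_ab == (id(a) == id(b))
             and same_object_ac == (id(a) == id(c)))

print(same_value, same_object_ab, same_object_ac, ids_agree)
```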


回答 1

这里的其他答案是正确的:is用于身份比较,而==用于相等比较。由于您关心的是相等性(两个字符串应包含相同的字符),因此在这种情况下,is运算符完全是错误的,您应该==改用。

is交互工作的原因是(大多数)字符串文字默认情况下是interned。从维基百科:

插入的字符串可加快字符串比较的速度,这有时是严重依赖带有字符串键的哈希表的应用程序(例如编译器和动态编程语言运行时)的性能瓶颈。在不进行实习的情况下,检查两个不同的字符串是否相等涉及检查两个字符串的每个字符。这很慢,原因有几个:字符串的长度固有地为O(n);它通常需要从多个内存区域进行读取,这需要时间。并且读取将填满处理器缓存,这意味着可用于其他需求的缓存较少。对于插入的字符串,在原始的内部操作之后,一个简单的对象身份测试就足够了;这通常被实现为指针相等性测试,

因此,当程序中有两个具有相同值的字符串文字(在程序源代码中逐字键入的单词,并用引号引起来)时,Python编译器将自动内插字符串,使它们都存储在相同的位置内存位置。(请注意,这并不总是会发生,并且发生这种情况的规则非常复杂,因此请不要在生产代码中依赖此行为!)

由于在您的交互式会话中,两个字符串实际上都存储在相同的存储位置中,因此它们具有相同的标识,因此is操作符将按预期工作。但是,如果您通过其他方法构造一个字符串(即使该字符串包含完全相同的字符),则该字符串可能相等,但它不是同一字符串 -也就是说,它具有不同的标识,因为它是存储在内存中的其他位置。

Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.

The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:

Interned strings speed up string comparisons, which are sometimes a performance bottleneck in applications (such as compilers and dynamic programming language runtimes) that rely heavily on hash tables with string keys. Without interning, checking that two different strings are equal involves examining every character of both strings. This is slow for several reasons: it is inherently O(n) in the length of the strings; it typically requires reads from several regions of memory, which take time; and the reads fills up the processor cache, meaning there is less cache available for other needs. With interned strings, a simple object identity test suffices after the original intern operation; this is typically implemented as a pointer equality test, normally just a single machine instruction with no memory reference at all.

So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn’t always happen, and the rules for when this happens are quite convoluted, so please don’t rely on this behavior in production code!)

Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string — that is, it has a different identity, because it is stored in a different place in memory.
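As a rough illustration of the last point, the sketch below builds a string at run time and compares it with a literal. Only the equality check and the sys.intern behaviour are relied upon; whether the un-interned strings happen to share an object is left unchecked because it depends on the implementation.

```python
import sys

literal = 'public'
built = ''.join(['pub', 'lic'])   # same characters, assembled at run time

# Equality compares the characters, so this is always True.
print(literal == built)

# sys.intern returns the canonical interned copy for a given value, so two
# interned strings with equal contents are the same object.
interned_a = sys.intern(literal)
interned_b = sys.intern(built)
print(interned_a is interned_b)
```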


回答 2

is关键字是对象标识一个测试而==是一个值比较。

如果使用is,则当且仅当对象是同一对象时,结果才为true。但是,==只要对象的值相同,就为真。

The is keyword is a test for object identity while == is a value comparison.

If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.


回答 3

最后要注意的一点是,您可以使用该sys.intern函数来确保获得对相同字符串的引用:

>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True

如上所述,您不应该is用来确定字符串的相等性。但这可能有助于了解您是否有某种奇怪的要求要使用is

请注意,该intern函数以前是Python 2的内置函数,但已移至sysPython 3 的模块中。

One last thing to note, you may use the sys.intern function to ensure that you’re getting a reference to the same string:

>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True

As pointed out above, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.

Note that the intern function used to be a builtin on Python 2 but was moved to the sys module in Python 3.


回答 4

is是身份测试,==是平等测试。这意味着is检查两种事物是相同的还是等同的。

假设您有一个简单的person对象。如果它的名字叫“ Jack”并且是“ 23”岁,则相当于另一个23岁的Jack,但不是同一个人。

class Person(object):
   def __init__(self, name, age):
       self.name = name
       self.age = age

   def __eq__(self, other):
       return self.name == other.name and self.age == other.age

jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)

jack1 == jack2 #True
jack1 is jack2 #False

他们是同一年龄,但他们不是同一个人。一个字符串可能等效于另一个,但它不是同一对象。

is is identity testing, == is equality testing. What this means is that is is a way to check whether two things are the same things, or just equivalent.

Say you’ve got a simple person object. If it is named ‘Jack’ and is 23 years old, it’s equivalent to another 23-year-old Jack, but it’s not the same person.

class Person(object):
   def __init__(self, name, age):
       self.name = name
       self.age = age

   def __eq__(self, other):
       return self.name == other.name and self.age == other.age

jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)

jack1 == jack2 #True
jack1 is jack2 #False

They’re the same age, but they’re not the same instance of person. A string might be equivalent to another, but it’s not the same object.


回答 5

这是一个旁注,但是在惯用的python中,您经常会看到类似以下内容:

if x is None: 
    # some clauses

这是安全的,因为保证存在Null对象的一个​​实例(即None)

This is a side note, but in idiomatic python, you will often see things like:

if x is None: 
    # some clauses

This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).
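A short sketch of the idiom in context, assuming a hypothetical function describe with an optional argument. It shows why the identity test is preferred over a truthiness test: values such as 0 and the empty string are falsy but are not None.

```python
def describe(value=None):
    # None is a singleton, so an identity check is the idiomatic test.
    if value is None:
        return 'nothing given'
    return 'got {!r}'.format(value)


print(describe())
print(describe(0))    # 0 is falsy, but it is not None
print(describe(''))   # same for the empty string
```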


回答 6

如果不确定自己在做什么,请使用’==’。如果您对此有更多了解,可以对已知对象(例如“无”)使用“ is”。

否则,您将最终想知道为什么事情不起作用以及为什么会发生这种情况:

>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False

我什至不确定在不同的python版本/实现之间是否可以保证某些事情保持不变。

If you’re not sure what you’re doing, use the ‘==’. If you have a little more knowledge about it you can use ‘is’ for known objects like ‘None’.

Otherwise you’ll end up wondering why things don’t work and why this happens:

>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False

I’m not even sure if some things are guaranteed to stay the same between different python versions/implementations.


回答 7

根据我在python中的有限经验,is用于比较两个对象以查看它们是否是同一对象,而不是两个具有相同值的不同对象。 ==用于确定值是否相同。

这是一个很好的例子:

>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True

s1是unicode字符串,并且s2是普通字符串。它们不是同一类型,但是具有相同的值。

From my limited experience with python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.

Here is a good example:

>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True

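s1 is a unicode string, and s2 is a normal (byte) string. In Python 2 they are not the same type, but they have the same value. (In Python 3 both literals are str, so this example no longer shows two different types.)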


回答 8

我认为这与以下事实有关:当“ is”比较结果为false时,将使用两个不同的对象。如果评估结果为true,则表示内部使用的是完全相同的对象,而不是创建一个新对象,这可能是因为您在不到2秒的时间内创建了它们,并且在优化和使用相同的对象。

这就是为什么您应该使用相等运算符==而不是is来比较字符串对象的值的原因。

>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>> 

在此示例中,我创建了s2,它是一个以前等于’one’的不同字符串对象,但它与并不相同s,因为解释器没有使用相同的对象,因为我最初并未将其分配给’one’,如果我有的话,会让他们成为同一个对象。

I think it has to do with the fact that, when the ‘is’ comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it’s using the same exact object and not creating a new one, possibly because the interpreter reused an existing string object as an optimization.

This is why you should be using the equality operator ==, not is, to compare the value of a string object.

>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>> 

In this example, s2 started out as a different string object (‘two’) and was then changed to the value ‘one’ with replace(). It is still not the same object as s, because the interpreter built a new string rather than reusing the existing one. If I had assigned the literal ‘one’ to s2 in the first place, the two names would have referred to the same object.


回答 9

我相信这被称为“ interned”字符串。在优化模式下,Python会这样做,Java也会这样做,C和C ++也会这样做。

如果您使用两个相同的字符串,而不是通过创建两个字符串对象来浪费内存,则具有相同内容的所有已嵌入字符串都指向相同的内存。

这导致Python“ is”运算符返回True,因为两个内容相同的字符串指向同一个字符串对象。这也将在Java和C语言中发生。

但是,这仅对节省内存有用。您不能依靠它来测试字符串是否相等,因为各种解释器和编译器以及JIT引擎不能总是这样做。

I believe that this is known as “interned” strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.

If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.

This results in the Python “is” operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.

This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.


回答 10

我回答了这个问题,尽管这个问题已经很老了,因为上面没有答案引用了语言参考

实际上,is运算符检查身份,而==运算符检查是否相等,

从语言参考:

类型影响对象行为的几乎所有方面。甚至对象身份的重要性在某种意义上也受到影响:对于不可变类型,计算新值的操作实际上可能返回对具有相同类型和值的任何现有对象的引用,而对于可变对象,则不允许这样做。例如,在a = 1之后;b = 1,取决于实现,a和b可以或可以不使用值1引用同一对象,但是在c = []之后;d = [],保证c和d引用两个不同的,唯一的,新创建的空列表。(请注意,c = d = []将相同的对象分配给c和d。)

因此,根据上述陈述,我们可以推断出,使用“ is”检查时,不可变类型的字符串可能会失败,而使用“ is”检查时,则可能会检查成功

同样适用于int,tuple也是不可变的类型

I am answering the question even though it is old, because none of the answers above quotes the language reference.

Actually, the is operator checks for identity and the == operator checks for equality.

From Language Reference:

Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)

So from the above statement we can infer that strings, being an immutable type, may fail an “is” check in some cases and pass it in others, even when their values are equal.

The same applies to int and tuple, which are also immutable types.
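The guarantees quoted from the language reference can be checked directly. This is only a sketch: it asserts the parts the reference guarantees (distinct empty lists, shared object for chained assignment) and deliberately does not check identity for the immutable values, since that is left to the implementation.

```python
c = []
d = []
e = f = []

print(c == d, c is d)   # equal, but guaranteed to be two distinct objects
print(e is f)           # chained assignment binds one object to both names

x = 1000
y = int('1000')
# For immutable values only equality is reliable; whether x and y are the
# same object is an implementation detail, so it is not tested here.
print(x == y)
```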


回答 11

==运营商测试值等价。该is运营商的测试对象的身份,Python的测试是否两者实际上是同一个对象(即住在内存中的地址相同)。

>>> a = 'banana'
>>> b = 'banana'
>>> a is b 
True

在此例如,Python只创建了一个字符串对象,都ab参照它。原因是Python在内部缓存和重用了一些字符串作为优化,实际上在内存中只有一个字符串“ banana”,由a和b共享;要触发正常行为,您需要使用更长的字符串:

>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)

创建两个列表时,将获得两个对象:

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False

在这种情况下,我们可以说这两个列表是等效的,因为它们具有相同的元素,但是不相同,因为它们不是相同的对象。如果两个对象相同,则它们也是等效的,但是如果它们相等,则它们不一定相同。

如果a引用对象,则分配b = a,然后,则两个变量都引用同一个对象:

>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True

The == operator tests value equivalence. The is operator tests object identity: Python tests whether the two are really the same object (i.e., live at the same address in memory).

>>> a = 'banana'
>>> b = 'banana'
>>> a is b 
True

In this example, Python only created one string object, and both a and b refer to it. The reason is that Python internally caches and reuses some strings as an optimization: there really is just one string ‘banana’ in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:

>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)

When you create two lists, you get two objects:

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False

In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.

If a refers to an object and you assign b = a, then both variables refer to the same object:

>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
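One practical consequence of two names referring to the same list is that a change made through one name is visible through the other. The sketch below shows this and contrasts it with a shallow copy made with list(); the name copy_of_a is just for illustration.

```python
a = [1, 2, 3]
b = a                 # alias: no new list is created
b.append(4)           # the change is visible through both names

copy_of_a = list(a)   # shallow copy: a new, independent list object
copy_of_a.append(5)   # does not affect a

print(a, b is a)
print(copy_of_a, copy_of_a is a)
```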

回答 12

is将比较内存位置。它用于对象级比较。

==将比较程序中的变量。用于在值级别进行检查。

is 检查地址级别是否相等

== 检查价值水平是否相等

is will compare the memory location. It is used for object-level comparison.

== will compare the variables in the program. It is used for checking at a value level.

is checks for address level equivalence

== checks for value level equivalence


回答 13

is是身份测试,==是相等性测试(请参阅Python文档)。

在大多数情况下,如果a is b,则a == b。但是也有exceptions,例如:

>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False

因此,您只能is用于身份测试,而不能用于相等性测试。

is is identity testing, == is equality testing (see Python Documentation).

In most cases, if a is b, then a == b. But there are exceptions, for example:

>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False

So, you can only use is for identity tests, never equality tests.
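Since neither operator gives a useful answer for detecting NaN, a short sketch of the usual alternative, math.isnan from the standard library, may help:

```python
import math

nan = float('nan')

print(nan is nan)        # identity: it is the same object
print(nan == nan)        # equality: NaN never compares equal, even to itself
print(math.isnan(nan))   # the reliable way to test for NaN
```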


如何修剪空白?

问题:如何修剪空白?

是否有Python函数可以从字符串中修剪空格(空格和制表符)?

例如:\t example string\texample string

Is there a Python function that will trim whitespace (spaces and tabs) from a string?

Example: \t example string\t → example string


回答 0

两侧的空格:

s = "  \t a string example\t  "
s = s.strip()

右侧的空格:

s = s.rstrip()

左侧的空白:

s = s.lstrip()

正如thedz所指出的,您可以提供一个参数来将任意字符剥离到以下任何函数中,如下所示:

s = s.strip(' \t\n\r')

这将去除任何空间,\t\n,或\r从左侧字符,右手侧,或该字符串的两侧。

上面的示例仅从字符串的左侧和右侧删除字符串。如果还要从字符串中间删除字符,请尝试re.sub

import re
print(re.sub(r'\s+', '', s))

那应该打印出来:

astringexample

Whitespace on both sides:

s = "  \t a string example\t  "
s = s.strip()

Whitespace on the right side:

s = s.rstrip()

Whitespace on the left side:

s = s.lstrip()

As thedz points out, you can provide an argument to strip arbitrary characters to any of these functions like this:

s = s.strip(' \t\n\r')

This will strip any space, \t, \n, or \r characters from the left-hand side, right-hand side, or both sides of the string.

The examples above only remove characters from the left-hand and right-hand sides of strings. If you want to also remove characters from the middle of a string, try re.sub:

import re
print(re.sub(r'\s+', '', s))

That should print out:

astringexample
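For reference, here is the whole answer gathered into one Python 3 sketch, with each variant applied to the original string instead of reassigning s, so the results can be compared side by side. The variable names are illustrative.

```python
import re

s = "  \t a string example\t  "

both = s.strip()                # whitespace removed from both ends
left = s.lstrip()               # leading whitespace only
right = s.rstrip()              # trailing whitespace only
no_ws = re.sub(r'\s+', '', s)   # every whitespace run removed, inner ones too

print(repr(both))
print(repr(left))
print(repr(right))
print(repr(no_ws))
```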

回答 1

Python trim方法称为strip

str.strip() #trim
str.lstrip() #ltrim
str.rstrip() #rtrim

Python trim method is called strip:

str.strip() #trim
str.lstrip() #ltrim
str.rstrip() #rtrim

回答 2

对于前导和尾随空格:

s = '   foo    \t   '
print(s.strip()) # prints "foo"

否则,一个正则表达式将起作用:

import re
pat = re.compile(r'\s+')
s = '  \t  foo   \t   bar \t  '
print(pat.sub('', s)) # prints "foobar"

For leading and trailing whitespace:

s = '   foo    \t   '
print(s.strip()) # prints "foo"

Otherwise, a regular expression works:

import re
pat = re.compile(r'\s+')
s = '  \t  foo   \t   bar \t  '
print(pat.sub('', s)) # prints "foobar"

回答 3

您还可以使用非常简单且基本的功能:str.replace(),用于空白和制表符:

>>> whitespaces = "   abcd ef gh ijkl       "
>>> tabs = "        abcde       fgh        ijkl"

>>> print(whitespaces.replace(" ", ""))
abcdefghijkl
>>> print(tabs.replace(" ", ""))
abcdefghijkl

简单容易。

You can also use a very simple and basic function, str.replace(), which removes every occurrence of the character you pass to it:

>>> whitespaces = "   abcd ef gh ijkl       "
>>> tabs = "        abcde       fgh        ijkl"

>>> print(whitespaces.replace(" ", ""))
abcdefghijkl
>>> print(tabs.replace(" ", ""))
abcdefghijkl

Simple and easy.


回答 4

#how to trim a multi line string or a file

s=""" line one
\tline two\t
line three """

#line1 starts with a space, #2 starts and ends with a tab, #3 ends with a space.

s1=s.splitlines()
print(s1)
[' line one', '\tline two\t', 'line three ']

print([i.strip() for i in s1])
['line one', 'line two', 'line three']




#more details:

#we could also have used a for loop from the beginning:
for line in s.splitlines():
    line=line.strip()
    process(line)

#we could also be reading a file line by line.. e.g. my_file=open(filename), or with open(filename) as myfile:
for line in my_file:
    line=line.strip()
    process(line)

#moot point: note splitlines() removed the newline characters, we can keep them by passing True:
#although split() will then remove them anyway..
s2=s.splitlines(True)
print(s2)
[' line one\n', '\tline two\t\n', 'line three ']
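
The file-reading variant mentioned in the comments above can be tried without a real file. In this sketch io.StringIO stands in for the object returned by open(filename), and the list name cleaned is illustrative; blank lines are skipped, which is an addition to the original loop.

```python
import io

# io.StringIO stands in for a real file opened with open(filename).
fake_file = io.StringIO(" line one\n\tline two\t\nline three \n\n")

cleaned = []
for line in fake_file:
    line = line.strip()    # removes the newline as well as spaces and tabs
    if line:               # skip lines that held only whitespace
        cleaned.append(line)

print(cleaned)
```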

回答 5

尚无人发布这些正则表达式解决方案。

匹配:

>>> import re
>>> p=re.compile('\\s*(.*\\S)?\\s*')

>>> m=p.match('  \t blah ')
>>> m.group(1)
'blah'

>>> m=p.match('  \tbl ah  \t ')
>>> m.group(1)
'bl ah'

>>> m=p.match('  \t  ')
>>> print(m.group(1))
None

搜索(您必须以不同的方式处理“仅空格”输入大小写):

>>> p1=re.compile('\\S.*\\S')

>>> m=p1.search('  \tblah  \t ')
>>> m.group()
'blah'

>>> m=p1.search('  \tbl ah  \t ')
>>> m.group()
'bl ah'

>>> m=p1.search('  \t  ')
>>> m.group()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

如果使用re.sub,则可以删除内部空格,这可能是不希望的。

No one has posted these regex solutions yet.

Matching:

>>> import re
>>> p=re.compile('\\s*(.*\\S)?\\s*')

>>> m=p.match('  \t blah ')
>>> m.group(1)
'blah'

>>> m=p.match('  \tbl ah  \t ')
>>> m.group(1)
'bl ah'

>>> m=p.match('  \t  ')
>>> print(m.group(1))
None

Searching (you have to handle the “only spaces” input case differently):

>>> p1=re.compile('\\S.*\\S')

>>> m=p1.search('  \tblah  \t ')
>>> m.group()
'blah'

>>> m=p1.search('  \tbl ah  \t ')
>>> m.group()
'bl ah'

>>> m=p1.search('  \t  ')
>>> m.group()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

If you use re.sub, you may remove inner whitespace, which could be undesirable.
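The matching approach can be wrapped in a small helper that also covers the whitespace-only case. This is a sketch, not the answerer's code: the name regex_trim is hypothetical, and the re.DOTALL flag is an addition so that inputs containing inner newlines are handled too.

```python
import re

# Same pattern as in the answer; DOTALL lets '.' span inner newlines.
_TRIM = re.compile(r'\s*(.*\S)?\s*', re.DOTALL)


def regex_trim(text):
    """Return text without leading and trailing whitespace."""
    match = _TRIM.match(text)   # always matches, since every part is optional
    # group(1) is None when the input is empty or holds only whitespace.
    return match.group(1) or ''


print(repr(regex_trim('  \t blah ')))
print(repr(regex_trim('  \tbl ah  \t ')))
print(repr(regex_trim('  \t  ')))
```

Inner whitespace is preserved, and a whitespace-only input yields an empty string instead of None.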


回答 6

空格包括空格,制表符和CRLF。因此,我们可以使用的一种优雅且单线的字符串函数是translation

' hello apple'.translate(str.maketrans('', '', ' \n\t\r'))

或者,如果您想彻底

import string
' hello  apple'.translate(str.maketrans('', '', string.whitespace))

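Whitespace includes spaces, tabs, and CR/LF. An elegant one-liner string method we can use is translate. In Python 3, build the deletion table with str.maketrans (the translate(None, chars) form only works on Python 2 byte strings):

' hello apple'.translate(str.maketrans('', '', ' \n\t\r'))

Or, if you want to be thorough:

import string
' hello  apple'.translate(str.maketrans('', '', string.whitespace))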

回答 7

(re.sub(’+’,”,(my_str.replace(’\ n’,”))))。strip()

这将删除所有不需要的空格和换行符。希望有帮助

import re
my_str = '   a     b \n c   '
formatted_str = (re.sub(' +', ' ',(my_str.replace('\n',' ')))).strip()

这将导致:

‘a b \ nc’ 将更改为 ‘ab c’

re.sub(' +', ' ', my_str.replace('\n', ' ')).strip()

This will remove all the unwanted spaces and newline characters. Hope this helps.

import re
my_str = '   a     b \n c   '
formatted_str = re.sub(' +', ' ', my_str.replace('\n', ' ')).strip()

This will result in:

'   a     b \n c   ' will be changed to 'a b c'


回答 8

    something = "\t  please_     \t remove_  all_    \n\n\n\nwhitespaces\n\t  "

    something = "".join(something.split())

输出:

please_remove_all_whitespaces


在答案中添加Le Droid的评论。用空格分隔:

    something = "\t  please     \t remove  all   extra \n\n\n\nwhitespaces\n\t  "
    something = " ".join(something.split())

输出:

请删除所有多余的空格

    something = "\t  please_     \t remove_  all_    \n\n\n\nwhitespaces\n\t  "

    something = "".join(something.split())

output:

please_remove_all_whitespaces


Adding Le Droid’s comment to the answer. To separate with a space:
    something = "\t  please     \t remove  all   extra \n\n\n\nwhitespaces\n\t  "
    something = " ".join(something.split())

output:

please remove all extra whitespaces


回答 9

如果使用Python 3:在您的打印语句中,以sep =“”结尾。这将分隔所有空间。

例:

txt="potatoes"
print("I love ",txt,"",sep="")

这将打印: 我爱土豆。

代替: 我爱土豆。

在您的情况下,由于您尝试使用\ t,因此请执行sep =“ \ t”

If using Python 3: in your print call, finish with sep="". That removes the separator that print would otherwise insert between its arguments.

EXAMPLE:

txt="potatoes"
print("I love ",txt,"",sep="")

This will print: I love potatoes.

Instead of: I love potatoes .

In your case, since you would be trying to get rid of the \t, do sep="\t".
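To see exactly what sep changes, the sketch below captures what print writes and compares the default separator with an empty one. The helper captured is hypothetical and exists only to collect the output as a string. Note that sep only controls what goes between arguments; it does not strip whitespace that is already inside a string.

```python
import contextlib
import io

txt = "potatoes"


def captured(*args, **kwargs):
    """Return what print(*args, **kwargs) would write to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        print(*args, **kwargs)
    return buffer.getvalue()


default_sep = captured("I love ", txt, ".")          # a space between each argument
empty_sep = captured("I love ", txt, ".", sep="")    # arguments joined directly

print(repr(default_sep))
print(repr(empty_sep))
```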


回答 10

在以不同的理解程度查看了这里的许多解决方案之后,我想知道如果字符串用逗号分隔该怎么办…

问题

在尝试处理联系人信息的csv时,我需要一个解决此问题的方法:修剪多余的空格和一些垃圾,但保留尾随逗号和内部空格。我要处理包含联系人注释的字段,所以我想删除垃圾,留下好东西。删除所有标点符号和谷壳后,我不想失去复合令牌之间的空白,因为我不想以后再构建。

正则表达式和模式: [\s_]+?\W+

该模式查找任何空白字符的单个实例,并且下划线(’_’)从1到无数次懒惰(尽可能少的字符),[\s_]+?而在非单词字符从1到无数个数字出现之前时间:( \W+等于[^a-zA-Z0-9_])。具体来说,这会找到大量空格:空字符(\ 0),制表符(\ t),换行符(\ n),前馈(\ f),回车符(\ r)。

我认为这样做有两个好处:

  1. 它不会删除您可能希望保持在一起的完整单词/标记之间的空格;

  2. Python的内置字符串方法strip()不在字符串内部处理,仅在左右两端进行处理,默认arg为空字符(请参见以下示例:文本中包含几行换行符,strip()而regex模式却不会将其全部删除) 。text.strip(' \n\t\r')

这超出了OP的问题,但我认为在很多情况下,像我一样,文本数据中可能会有奇怪的病理性实例(某些转义字符最终出现在某些文本中)。此外,在类似列表的字符串中,除非分隔符将两个空格字符或某些非单词字符分开,例如’-,’或’-、、、’,否则我们不希望删除分隔符。

注意:不是在谈论CSV本身的分隔符。仅在CSV内数据是列表形式的实例,即cs字符串是子字符串。

全面披露:我只处理文本约一个月,而正则表达式仅在最近两周内处理,所以我确定我缺少一些细微差别。就是说,对于较小的字符串集合(我的是在12,000行和40个奇数列的数据帧中),作为除去多余字符的最后一步,此方法效果很好,特别是如果您在其中引入了一些额外的空格想要分隔由非单词字符连接的文本,但又不想在以前没有空格的地方添加空格。

一个例子:

import re


text = "\"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, , , , \r, , \0, ff dd \n invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, \n i69rpofhfsp9t7c practice 20ignition - 20june \t\n .2134.pdf 2109                                                 \n\n\n\nklkjsdf\""

print(f"Here is the text as formatted:\n{text}\n")
print()
print("Trimming both the whitespaces and the non-word characters that follow them.")
print()
trim_ws_punctn = re.compile(r'[\s_]+?\W+')
clean_text = trim_ws_punctn.sub(' ', text)
print(clean_text)
print()
print("what about 'strip()'?")
print(f"Here is the text, formatted as is:\n{text}\n")
clean_text = text.strip(' \n\t\r')  # strip out whitespace?
print()
print(f"Here is the text, formatted as is:\n{clean_text}\n")

print()
print("Are 'text' and 'clean_text' unchanged?")
print(clean_text == text)

输出:

Here is the text as formatted:

"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, ,, , , ff dd 
 invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, 
 i69rpofhfsp9t7c practice 20ignition - 20june 
 .2134.pdf 2109                                                 



klkjsdf" 

using regex to trim both the whitespaces and the non-word characters that follow them.

"portfolio, derp, hello-world, hello-, world, founders, mentors, ffib, biff, 1, 12.18.02, 12, 2013, 9874890288, ff, series a, exit, general mailing, fr, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk,  jim.somedude@blahblah.com, dd invites,subscribed,, master, dd invites,subscribed, ff dd invites, subscribed, alumni spring 2012 deck: https: www.dropbox.com s, i69rpofhfsp9t7c practice 20ignition 20june 2134.pdf 2109 klkjsdf"

Very nice.
What about 'strip()'?

Here is the text, formatted as is:

"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, ,, , , ff dd 
 invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, 
 i69rpofhfsp9t7c practice 20ignition - 20june 
 .2134.pdf 2109                                                 



klkjsdf"


Here is the text, after stipping with 'strip':


"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, ,, , , ff dd 
 invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, 
 i69rpofhfsp9t7c practice 20ignition - 20june 
 .2134.pdf 2109                                                 



klkjsdf"
Are 'text' and 'clean_text' unchanged? 'True'

因此,strip一次删除一个空格。因此,在OP的情况下,strip()可以。但是如果情况变得更加复杂,则对于更一般的设置,正则表达式和类似的模式可能会有一定价值。

看到它在行动

Having looked at quite a few solutions here with various degrees of understanding, I wondered what to do if the string was comma separated…

the problem

While trying to process a CSV of contact information, I needed a solution to this problem: trim extraneous whitespace and some junk, but preserve trailing commas and internal whitespace. Working with a field containing notes on the contacts, I wanted to remove the garbage, leaving the good stuff. While trimming out all the punctuation and chaff, I didn’t want to lose the whitespace between compound tokens, as I didn’t want to rebuild it later.

regex and patterns: [\s_]+?\W+

The pattern looks for one or more whitespace characters or underscores (‘_’), matched lazily (as few characters as possible) by [\s_]+?, followed by one or more non-word characters, matched by \W+ (for ASCII text, \W is equivalent to [^a-zA-Z0-9_]). Specifically, this finds swaths of whitespace and junk: tabs (\t), newlines (\n), form feeds (\f), carriage returns (\r), and, through \W, null characters (\0).

I see the advantage to this as two-fold:

  1. that it doesn’t remove whitespace between the complete words/tokens that you might want to keep together;

  2. Python’s built-in string method strip() doesn’t work inside the string, only at the left and right ends, and by default it removes whitespace (see the example below: several newlines are in the text, and text.strip(' \n\t\r') does not remove them while the regex pattern does).

This goes beyond the OP’s question, but I think there are plenty of cases where we might have odd, pathological instances within the text data, as I did (somehow the escape characters ended up in some of the text). Moreover, in list-like strings, we don’t want to eliminate the delimiter unless the delimiter separates two whitespace characters or some non-word character, like ‘-,’ or ‘-, ,,,’.

NB: Not talking about the delimiter of the CSV itself. Only of instances within the CSV where the data is list-like, i.e. a comma-separated string of substrings.

Full disclosure: I’ve only been manipulating text for about a month, and regex only the last two weeks, so I’m sure there are some nuances I’m missing. That said, for smaller collections of strings (mine are in a dataframe of 12,000 rows and 40 odd columns), as a final step after a pass for removal of extraneous characters, this works exceptionally well, especially if you introduce some additional whitespace where you want to separate text joined by a non-word character, but don’t want to add whitespace where there was none before.

An example:

import re


text = "\"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, , , , \r, , \0, ff dd \n invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, \n i69rpofhfsp9t7c practice 20ignition - 20june \t\n .2134.pdf 2109                                                 \n\n\n\nklkjsdf\""

print(f"Here is the text as formatted:\n{text}\n")
print()
print("Trimming both the whitespaces and the non-word characters that follow them.")
print()
trim_ws_punctn = re.compile(r'[\s_]+?\W+')
clean_text = trim_ws_punctn.sub(' ', text)
print(clean_text)
print()
print("what about 'strip()'?")
print(f"Here is the text, formatted as is:\n{text}\n")
clean_text = text.strip(' \n\t\r')  # strip out whitespace?
print()
print(f"Here is the text, formatted as is:\n{clean_text}\n")

print()
print("Are 'text' and 'clean_text' unchanged?")
print(clean_text == text)

This outputs:

Here is the text as formatted:

"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, ,, , , ff dd 
 invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, 
 i69rpofhfsp9t7c practice 20ignition - 20june 
 .2134.pdf 2109                                                 



klkjsdf" 

using regex to trim both the whitespaces and the non-word characters that follow them.

"portfolio, derp, hello-world, hello-, world, founders, mentors, ffib, biff, 1, 12.18.02, 12, 2013, 9874890288, ff, series a, exit, general mailing, fr, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk,  jim.somedude@blahblah.com, dd invites,subscribed,, master, dd invites,subscribed, ff dd invites, subscribed, alumni spring 2012 deck: https: www.dropbox.com s, i69rpofhfsp9t7c practice 20ignition 20june 2134.pdf 2109 klkjsdf"

Very nice.
What about 'strip()'?

Here is the text, formatted as is:

"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, ,, , , ff dd 
 invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, 
 i69rpofhfsp9t7c practice 20ignition - 20june 
 .2134.pdf 2109                                                 



klkjsdf"


Here is the text, after stripping with 'strip':


"portfolio, derp, hello-world, hello-, -world, founders, mentors, :, ?, %, ,>, , ffib, biff, 1, 12.18.02, 12,  2013, 9874890288, .., ..., ...., , ff, series a, exit, general mailing, fr, , , ,, co founder, pitch_at_palace, ba, _slkdjfl_bf, sdf_jlk, )_(, jim.somedude@blahblah.com, ,dd invites,subscribed,, master, , , ,  dd invites,subscribed, ,, , , ff dd 
 invites, subscribed, , ,  , , alumni spring 2012 deck: https: www.dropbox.com s, 
 i69rpofhfsp9t7c practice 20ignition - 20june 
 .2134.pdf 2109                                                 



klkjsdf"
Are 'text' and 'clean_text' unchanged? 'True'

So strip() only removes characters from the beginning and end of a string; whitespace inside the text is left untouched. In the OP's case, strip() is fine, but if things get any more complex, a regex with a similar pattern may be of more value for general settings.
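
A minimal sketch of that difference, using a made-up sample string rather than the text above. The variable names and the `\s+` pattern are assumptions chosen for illustration, not the pattern used earlier in this answer:

```python
import re

# Hypothetical sample: whitespace at both ends and inside the string.
sample = "  portfolio,  derp,\n hello-world \t "

# strip() only touches the two ends of the string.
stripped = sample.strip()

# A regex can also collapse runs of whitespace inside the string.
collapsed = re.sub(r"\s+", " ", sample).strip()

print(repr(stripped))
print(repr(collapsed))
```

The stripped version keeps the double space and the newline in the middle, while the regex version reduces every run of whitespace to a single space.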



Answer 11

Try translate:

>>> import string
>>> print('\t\r\n  hello \r\n world \t\r\n')

  hello 
 world  
>>> # Python 3: str.maketrans replaces Python 2's string.maketrans
>>> tr = str.maketrans(string.whitespace, ' ' * len(string.whitespace))
>>> '\t\r\n  hello \r\n world \t\r\n'.translate(tr)
'     hello    world    '
>>> '\t\r\n  hello \r\n world \t\r\n'.translate(tr).replace(' ', '')
'helloworld'
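
If the goal is to delete the whitespace outright, the intermediate replacement with spaces may not be needed. A small sketch, assuming Python 3, where the third argument of `str.maketrans` lists characters to delete (the names `delete_ws`, `raw` and `result` are only illustrative):

```python
import string

# str.maketrans(x, y, z): every character in z is mapped to None (deleted).
delete_ws = str.maketrans("", "", string.whitespace)

raw = "\t\r\n  hello \r\n world \t\r\n"
result = raw.translate(delete_ws)
print(result)
```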

Answer 12

If you want to trim the whitespace off just the beginning and end of the string, you can do something like this:

some_string = "    Hello,    world!\n    "
new_string = some_string.strip()
# new_string is now "Hello,    world!"

This works a lot like Qt’s QString::trimmed() method, in that it removes leading and trailing whitespace, while leaving internal whitespace alone.

But if you’d like something like Qt’s QString::simplified() method which not only removes leading and trailing whitespace, but also “squishes” all consecutive internal whitespace to one space character, you can use a combination of .split() and " ".join, like this:

some_string = "\t    Hello,  \n\t  world!\n    "
new_string = " ".join(some_string.split())
# new_string is now "Hello, world!"

In this last example, each sequence of internal whitespace is replaced with a single space, while the whitespace at the start and end of the string is still trimmed.
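
One detail worth sketching: the squishing relies on calling .split() with no argument. The comparison below is only an illustration (the names `simplified` and `unchanged` are made up here); with an explicit " " separator the string round-trips unchanged:

```python
some_string = "\t    Hello,  \n\t  world!\n    "

# No argument: split on any run of whitespace, dropping empty strings.
simplified = " ".join(some_string.split())

# Explicit " " separator: only single spaces split, so empty strings
# and the tab/newline characters survive the round trip.
unchanged = " ".join(some_string.split(" "))

print(repr(simplified))
print(unchanged == some_string)
```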


Answer 13

I generally use the following method:

>>> import re
>>> myStr = "Hi\n Stack Over \r flow!"
>>> charList = [r"\n", r"\r", r"\t"]  # regex patterns for newline, carriage return, tab
>>> for i in charList:
...     myStr = re.sub(i, "", myStr)
...
>>> myStr
'Hi Stack Over  flow!'

Note: This only removes "\n", "\r" and "\t". It does not remove extra spaces.
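
The loop can probably be folded into a single substitution with a character class. A minimal sketch under that assumption (the names `my_str` and `cleaned` are illustrative):

```python
import re

my_str = "Hi\n Stack Over \r flow!"

# One character class covers newline, carriage return and tab.
cleaned = re.sub(r"[\n\r\t]", "", my_str)
print(repr(cleaned))
```

As with the loop, ordinary spaces are left alone, so the double space between "Over" and "flow!" remains.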


Answer 14

To remove whitespace from the middle of a string (Perl):

$p = "ATGCGAC ACGATCGACC";
$p =~ s/\s//g;
print $p;

output:

ATGCGACACGATCGACC
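
The snippet above is Perl. Since the question is about Python, a rough equivalent might look like the sketch below, which assumes the same input string and uses `re.sub` in place of the `s///g` operator:

```python
import re

p = "ATGCGAC ACGATCGACC"

# Delete every whitespace character, wherever it appears.
p = re.sub(r"\s", "", p)
print(p)
```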

Answer 15

This will remove all whitespace, including newlines, from both the beginning and end of a string:

>>> import re
>>> s = "  \n\t  \n   some \n text \n     "
>>> re.sub(r"^\s+|\s+$", "", s)
'some \n text'
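
For this input the regex appears to do the same job as the built-in strip(). A quick sketch comparing the two on this one sample string (the names `via_regex` and `via_strip` are illustrative, and the check says nothing about other inputs):

```python
import re

s = "  \n\t  \n   some \n text \n     "

via_regex = re.sub(r"^\s+|\s+$", "", s)
via_strip = s.strip()

print(via_regex == via_strip)
```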