Question: How do I get the size of a string in Python?

For example, I have a string:

str = "please answer my question"

I want to write it to a file.

But I need to know the size of the string before writing the string to the file. What function can I use to calculate the size of the string?


Answer 0

If you are talking about the length of the string, you can use len():

>>> s = 'please answer my question'
>>> len(s)  # number of characters in s
25

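If you need the size of the string object in memory, in bytes, use sys.getsizeof(). The result includes the object's overhead and depends on the Python version and platform, so it is not the number of bytes that will be written to a file: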

>>> import sys
>>> sys.getsizeof(s)
58

Also, don’t call your string variable str. It shadows the built-in str() function.
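
Since the question is about writing the string to a file, the number that usually matters is the length of the encoded text rather than the in-memory object size. The following is a minimal sketch, assuming Python 3 and UTF-8 output; `encoded_size` is a hypothetical helper name, not a standard function.

```python
import os
import tempfile


def encoded_size(text, encoding="utf-8"):
    """Return how many bytes `text` occupies once encoded (hypothetical helper)."""
    return len(text.encode(encoding))


s = "please answer my question"
expected = encoded_size(s)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    # newline="" disables newline translation, so the byte count is not altered
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(s)
    on_disk = os.path.getsize(path)

print(expected, on_disk)  # both values are 25 for this ASCII-only string
```

For ASCII-only text the result equals len(s); with non-ASCII characters or a different encoding the two numbers can differ.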


Answer 1

Python 3:

user225312’s answer is correct:

A. To count the number of characters in a str object, use the len() function:

>>> print(len('please answer my question'))
25

B. To get the memory size in bytes allocated to store a str object, use the sys.getsizeof() function (the exact value depends on the Python version and platform):

>>> from sys import getsizeof
>>> print(getsizeof('please answer my question'))
50
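
To make the difference between these measurements concrete, here is a small sketch, assuming Python 3 on CPython. It compares the character count, the encoded size in two encodings, and the in-memory object size for the same string. The variable names are illustrative only.

```python
import sys

s = "йцы"  # three Cyrillic characters

char_count = len(s)                       # number of code points
utf8_bytes = len(s.encode("utf-8"))       # 2 bytes per character for these letters
utf32_bytes = len(s.encode("utf-32-le"))  # 4 bytes per character, no byte-order mark
in_memory = sys.getsizeof(s)              # CPython object size, varies by version and build

print(char_count, utf8_bytes, utf32_bytes)  # 3 6 12
print(in_memory)  # implementation-dependent, so no fixed value is shown
```

Only the first three numbers are portable; the sys.getsizeof() result describes the interpreter's internal representation.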

Python 2:

It gets complicated for Python 2.

A. The len() function in Python 2 returns the number of bytes used to store the encoded characters in a str object.

Sometimes this equals the character count:

>>> print(len('abc'))
3

But sometimes, it won’t:

>>> print(len('йцы'))  # String contains Cyrillic symbols
6

That’s because a Python 2 str is a sequence of bytes, and the encoding of its contents (UTF-8 here) may use more than one byte per character. So, to count the characters in a str, you need to know which encoding it uses. Then you can decode it to a unicode object and get the character count:

>>> print(len('йцы'.decode('utf8')))  # String contains Cyrillic symbols
3

B. The sys.getsizeof() function does the same thing as in Python 3: it returns the number of bytes allocated to store the whole string object:

>>> print(getsizeof('йцы'))
27
>>> print(getsizeof('йцы'.decode('utf8')))
32
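
In Python 3 the same distinction shows up as bytes versus str. The sketch below, assuming Python 3, mirrors the Python 2 examples above and also shows why the encoding has to be known: decoding with the wrong one produces a different character count.

```python
raw = "йцы".encode("utf-8")  # a bytes object, like a Python 2 str

byte_count = len(raw)        # counts bytes
text = raw.decode("utf-8")   # back to text using the correct encoding
char_count = len(text)       # counts characters

# Decoding with the wrong encoding does not fail here, but every byte
# becomes its own character, so the count is wrong.
wrong_count = len(raw.decode("latin-1"))

print(byte_count, char_count, wrong_count)  # 6 3 6
```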

Answer 2

>>> s = 'abcd'
>>> len(s)
4

Answer 3

You can also use str.len() to get the length of each element in a pandas column:

data['name of column'].str.len()

Answer 4

The most Pythonic way is to use len(). Keep in mind that an escape sequence such as ‘\f’ counts as a single character, so the backslash is not counted separately, and that an invalid escape such as ‘\x’ not followed by two hex digits is a syntax error:

>>> len('foo')
3
>>> len('\foo')
3
>>> len('\xoo')
  File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \xXX escape
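
If the backslash is meant literally, it can be escaped or the literal can be written as a raw string. A short sketch of the three cases, assuming Python 3 (variable names are illustrative):

```python
form_feed = "\foo"  # "\f" is a single form-feed character, followed by "oo"
escaped = "\\foo"   # "\\" is one literal backslash, followed by "foo"
raw = r"\foo"       # raw string: the backslash is kept as-is

print(len(form_feed))  # 3
print(len(escaped))    # 4
print(len(raw))        # 4
print(escaped == raw)  # True
```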
