Check if a string ends with one of the strings from a list

What is the Pythonic way of writing the following code?

extensions = ['.mp3', '.avi']
file_name = 'test.mp3'

for extension in extensions:
    if file_name.endswith(extension):
        pass  # do stuff

I vaguely remember that the explicit for loop can be avoided by writing the check directly in the if condition. Is this true?


Answer 0

Though not widely known, str.endswith also accepts a tuple of suffixes, so you don't need a loop.

>>> 'test.mp3'.endswith(('.mp3', '.avi'))
True

Answer 1

Just use:

if file_name.endswith(tuple(extensions)):
    pass  # do stuff
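
The tuple() conversion matters because str.endswith accepts a str or a tuple of str, and rejects a list with a TypeError. A minimal sketch of that behaviour, reusing the names from the question (the variables matches and list_accepted are introduced here only for illustration):

```python
extensions = ['.mp3', '.avi']
file_name = 'test.mp3'

# A tuple of suffixes is accepted directly.
matches = file_name.endswith(tuple(extensions))
print(matches)

# Passing the list itself is rejected with a TypeError.
try:
    file_name.endswith(extensions)
    list_accepted = True
except TypeError:
    list_accepted = False
print(list_accepted)
```

If the extensions never change, defining them as a tuple from the start avoids the conversion on every call.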

Answer 2

Take the extension from the file name and check whether it is in the set of extensions:

>>> import os
>>> extensions = {'.mp3', '.avi'}
>>> file_name = 'test.mp3'
>>> extension = os.path.splitext(file_name)[1]
>>> extension in extensions
True

A set is used because membership lookups in a set take O(1) time on average (docs).
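
The set lookup is case-sensitive, so 'TEST.AVI' would not match '.avi'. A possible sketch that lower-cases the extension before the lookup; the function name has_allowed_extension and the constant ALLOWED_EXTENSIONS are hypothetical names chosen for this example, and the set is assumed to hold lower-case entries:

```python
import os

# Assumption: every entry is stored in lower case, with the leading dot.
ALLOWED_EXTENSIONS = {'.mp3', '.avi'}


def has_allowed_extension(file_name, allowed=ALLOWED_EXTENSIONS):
    """Return True if the file's final extension is in the allowed set."""
    extension = os.path.splitext(file_name)[1]
    return extension.lower() in allowed


print(has_allowed_extension('test.mp3'))
print(has_allowed_extension('TEST.AVI'))
print(has_allowed_extension('notes.txt'))
```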


Answer 3

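There are two ways: regular expressions and string (str) methods.

String methods are usually faster (roughly 2-3x in the timings below).

import re
# The dots are escaped so that they match a literal "." only.
p = re.compile(r'.*(\.mp3|\.avi)$', re.IGNORECASE)
file_name = 'test.mp3'
print(bool(p.match(file_name)))
# %timeit is an IPython/Jupyter magic, not plain Python.
%timeit bool(p.match(file_name))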

792 ns ± 1.83 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

file_name = 'test.mp3'
extensions = ('.mp3','.avi')
print(file_name.lower().endswith(extensions))
%timeit file_name.lower().endswith(extensions)

274 ns ± 4.22 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
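
The %timeit lines above only run inside IPython or Jupyter. As a rough sketch, the same comparison can be made in a plain script with the standard timeit module; the helper names by_regex and by_endswith and the repetition count n are assumptions made for this example, and the numbers will differ from machine to machine:

```python
import re
import timeit

file_name = 'test.mp3'
extensions = ('.mp3', '.avi')
pattern = re.compile(r'.*(\.mp3|\.avi)$', re.IGNORECASE)


def by_regex(name):
    # Case-insensitive match via the compiled pattern.
    return bool(pattern.match(name))


def by_endswith(name):
    # Case-insensitive match by lower-casing before the suffix check.
    return name.lower().endswith(extensions)


n = 100_000  # number of calls to time; chosen arbitrarily
regex_time = timeit.timeit(lambda: by_regex(file_name), number=n)
str_time = timeit.timeit(lambda: by_endswith(file_name), number=n)

print(f"regex:    {regex_time / n * 1e9:.0f} ns per call")
print(f"endswith: {str_time / n * 1e9:.0f} ns per call")
```

The lambda wrappers add a small constant overhead to both measurements, so the absolute figures are not directly comparable with the %timeit results.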


Answer 4

I have this:

def has_extension(filename, extension):
    # extension is given without the leading dot, e.g. "mp3"
    return filename.endswith("." + extension)
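
This helper checks a single extension, while the question asks about a list. One way it might be extended, as a sketch: build a tuple of dotted suffixes and pass it to str.endswith. The name has_any_extension is hypothetical, and stripping an optional leading dot from each entry is an assumption added here, not part of the original helper:

```python
def has_any_extension(filename, extensions):
    """Return True if filename ends with any of the given extensions.

    Entries may be written with or without the leading dot.
    """
    suffixes = tuple("." + ext.lstrip(".") for ext in extensions)
    return filename.endswith(suffixes)


print(has_any_extension('test.mp3', ['mp3', 'avi']))
print(has_any_extension('test.mp3', ['.mp3']))
print(has_any_extension('test.txt', ['mp3', 'avi']))
```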

Answer 5

I came across this while looking for something else.

I would recommend using the functions in the os.path module, because they let you make the check more general and cope with unusual cases.

You can do something like:

import os

the_file = 'aaaa/bbbb/ccc.ddd'

# os.path.splitext keeps the leading dot, so the list entries need it too.
extensions_list = ['.ddd', '.eee', '.fff']

if os.path.splitext(the_file)[-1] in extensions_list:
    pass  # Do your thing.
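
To illustrate the kind of unusual cases os.path.splitext takes care of, here is a short sketch. The sample paths are made up for this example; the behaviour shown is that of the standard library function:

```python
import os

# The extension always includes the leading dot.
normal = os.path.splitext('aaaa/bbbb/ccc.ddd')
print(normal)

# A dot in a directory name is not mistaken for an extension.
dotted_dir = os.path.splitext('aaaa/bb.bb/ccc')[1]
print(repr(dotted_dir))

# A leading dot (hidden file) does not count as an extension.
hidden = os.path.splitext('.bashrc')[1]
print(repr(hidden))

# Only the last extension is returned for names with several dots.
double = os.path.splitext('archive.tar.gz')[1]
print(double)
```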

Answer 6

Another possibility is to use the in operator:

extensions = ['.mp3','.avi']
file_name  = 'test.mp3'
if "." in file_name and file_name[file_name.rindex("."):] in extensions:
    print(True)

Answer 7

Another way, which returns the list of matching strings, is:

sample = "alexis has the control"
# In Python 3, filter returns an iterator, so wrap it in list().
matched_strings = list(filter(sample.endswith, ["trol", "ol", "troll"]))
print(matched_strings)
['trol', 'ol']
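
The same result can also be written as a list comprehension, and any() gives a plain yes/no answer when the matching strings themselves are not needed. A small sketch; the names candidates and has_match are introduced only for this example:

```python
sample = "alexis has the control"
candidates = ["trol", "ol", "troll"]

# Collect every candidate that the sample ends with.
matched_strings = [s for s in candidates if sample.endswith(s)]
print(matched_strings)

# any() stops at the first match and returns a bool.
has_match = any(sample.endswith(s) for s in candidates)
print(has_match)
```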