Case-insensitive 'in'

Question: Case-insensitive 'in'

I love using the expression

if 'MICHAEL89' in USERNAMES:
    ...

where USERNAMES is a list.


Is there a way to match items case-insensitively, or do I need a custom method? I'm just wondering whether extra code is needed for this.


Answer 0

username = 'MICHAEL89'
if username.upper() in (name.upper() for name in USERNAMES):
    ...

Alternatively:

if username.upper() in map(str.upper, USERNAMES):
    ...

Or, yes, you can make a custom method.
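
A custom helper of the kind mentioned above might look like the following minimal sketch. The function name contains_ignore_case and the sample USERNAMES list are made up for illustration; they are not part of the original answer.

```python
def contains_ignore_case(needle, haystack):
    """Return True if any item of haystack equals needle, ignoring case."""
    target = needle.upper()
    # any() stops at the first match, so the rest of the list is not scanned
    return any(item.upper() == target for item in haystack)


USERNAMES = ['alice', 'Michael89', 'BOB']  # sample data, assumed for the example

print(contains_ignore_case('MICHAEL89', USERNAMES))
print(contains_ignore_case('carol', USERNAMES))
```

Wrapping the comparison in one function keeps the normalization choice (upper, lower or casefold) in a single place.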


Answer 1

I would make a wrapper so you can be non-invasive. A minimal example:

class CaseInsensitively(object):
    def __init__(self, s):
        self.__s = s.lower()

    def __hash__(self):
        return hash(self.__s)

    def __eq__(self, other):
        # ensure proper comparison between instances of this class
        try:
            other = other.__s
        except (TypeError, AttributeError):
            try:
                other = other.lower()
            except AttributeError:
                # not string-like; fall through and compare as-is
                pass
        return self.__s == other

Now, if CaseInsensitively('MICHAEL89') in whatever: should behave as required when the right-hand side is a list. For a dict or set the lookup goes through __hash__, so it only finds keys that are already lowercase or are themselves wrapped. (It may take more effort to get similar results for substring containment, to avoid warnings in some cases involving unicode, and so on.)
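
To see how such a wrapper behaves with different containers, here is a small sketch. It uses a simplified variant of the class (a single-underscore attribute and isinstance checks, both my own choices), and the sample containers are made up.

```python
class CaseInsensitively(object):
    """Simplified variant of the wrapper, for illustration only."""

    def __init__(self, s):
        self._s = s.lower()

    def __hash__(self):
        return hash(self._s)

    def __eq__(self, other):
        if isinstance(other, CaseInsensitively):
            return self._s == other._s
        if isinstance(other, str):
            return self._s == other.lower()
        return NotImplemented


needle = CaseInsensitively('MICHAEL89')

# Lists compare item by item with ==, so mixed-case entries are found.
print(needle in ['alice', 'Michael89'])

# Sets and dicts look up by hash first; 'Michael89' and 'michael89'
# hash differently, so a mixed-case key is missed ...
print(needle in {'alice', 'Michael89'})

# ... while a container whose keys are already lowercase works.
print(needle in {'alice', 'michael89'})
```

If the container is a set or dict with mixed-case keys, normalizing the keys when they are stored is usually the simpler route.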


Answer 2

Usually (in OOP at least) you shape your object to behave the way you want. name in USERNAMES is case sensitive, so USERNAMES needs to change:

class NameList(object):
    def __init__(self, names):
        self.names = names

    def __contains__(self, name): # implements `in`
        return name.lower() in (n.lower() for n in self.names)

    def add(self, name):
        self.names.append(name)

# now this works
usernames = NameList(USERNAMES)
print(someone in usernames)

The great thing about this is that it opens the path to many improvements without changing any code outside the class. For example, you could store self.names as a set for faster lookups, or compute (n.lower() for n in self.names) only once and keep the result on the instance, and so on.
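
One of the improvements mentioned above, computing the lowercased names once and keeping them in a set, could be sketched like this. The _lowered attribute and the sample names are assumptions made for the example.

```python
class NameList(object):
    def __init__(self, names):
        self.names = list(names)
        # lowercased copies, computed once, for fast average-case lookups
        self._lowered = {n.lower() for n in self.names}

    def __contains__(self, name):  # implements `in`
        return name.lower() in self._lowered

    def add(self, name):
        self.names.append(name)
        self._lowered.add(name.lower())  # keep the cache in sync


usernames = NameList(['alice', 'Michael89'])  # sample data
usernames.add('Bob')
print('MICHAEL89' in usernames)
print('bob' in usernames)
print('carol' in usernames)
```

Code that uses name in usernames does not need to change; only the class internals do.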


Answer 3

str.casefold (available since Python 3.3) is recommended for case-insensitive string matching. @nmichaels’s solution can trivially be adapted.

Use either:

if 'MICHAEL89'.casefold() in (name.casefold() for name in USERNAMES):
    ...

Or:

if 'MICHAEL89'.casefold() in map(str.casefold, USERNAMES):
    ...

As per the docs:

Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string. For example, the German lowercase letter ‘ß’ is equivalent to “ss”. Since it is already lowercase, lower() would do nothing to ‘ß’; casefold() converts it to “ss”.
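
The difference described in the docs can be checked with a short sketch. The sample names are made up for the example.

```python
USERNAMES = ['Strasse', 'Michael89']  # sample data, assumed for the example

name = 'STRAßE'
print(name.lower())     # lower() leaves the ß in place
print(name.casefold())  # casefold() folds ß to ss

# Membership test with lower(): 'straße' is not equal to 'strasse'
lower_hit = name.lower() in (n.lower() for n in USERNAMES)

# Membership test with casefold(): both sides become 'strasse'
fold_hit = name.casefold() in (n.casefold() for n in USERNAMES)

print(lower_hit, fold_hit)
```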


Answer 4

Here’s one way:

if string1.lower() in string2.lower(): 
    ...

For this to work, both string1 and string2 must be of type str. Note that this tests whether string1 is a substring of string2, not whether it is an item of a list.
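
A short sketch of how the substring test differs from list membership. The values of string1 and string2 and the USERNAMES list are made up for the example.

```python
string1 = 'MICHAEL89'
string2 = 'xx_michael890_xx'  # a single string, not a list

# Between two strings, `in` is a substring test
substring_hit = string1.lower() in string2.lower()

USERNAMES = ['alice', 'michael890']  # sample list
# Against a list, `in` compares whole items
member_hit = string1.lower() in [n.lower() for n in USERNAMES]

print(substring_hit, member_hit)
```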


Answer 5

I think you have to write some extra code. For example:

if 'MICHAEL89' in map(lambda name: name.upper(), USERNAMES):
   ...

In this case we build a new sequence with all entries of USERNAMES converted to upper case and then compare against it. (In Python 2, map returns a full list; in Python 3, it returns a lazy iterator.)

Update

As @viraptor says, it is even better to use a generator instead of map. See @Nathon's answer.
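
In Python 3, map is itself lazy, so the membership test stops converting names at the first match. The sketch below illustrates this; the upper_logged helper and the sample list are made up for the example.

```python
USERNAMES = ['Michael89', 'alice', 'bob']  # sample data, assumed for the example
calls = []


def upper_logged(name):
    calls.append(name)  # record which entries were actually converted
    return name.upper()


found = 'MICHAEL89' in map(upper_logged, USERNAMES)
print(found, calls)
```

Only the first entry is converted here because it already matches; a generator expression behaves the same way.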


Answer 6

You could do

import re
matcher = re.compile('MICHAEL89', re.IGNORECASE)
filter(matcher.match, USERNAMES)  # match() anchors only at the start of each name

Update: played around a bit and am thinking you could get a better short-circuit type approach using

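import re
from itertools import ifilter  # Python 2 only; in Python 3 the built-in filter is already lazy

matcher = re.compile('MICHAEL89', re.IGNORECASE)
if any(ifilter(matcher.match, USERNAMES)):
    ...  # your code here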

The ifilter function is from itertools, one of my favorite modules within Python. It is lazy, like a generator: it only produces the next item when asked for it, which lets any() stop at the first match.
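
Because match() only anchors at the start of the string, a longer name such as 'michael890' also passes the test. A Python 3 sketch using fullmatch() and re.escape() avoids that; the sample list is made up for the example.

```python
import re

USERNAMES = ['alice', 'michael890', 'Michael89']  # sample data, assumed

# re.escape guards against usernames that contain regex metacharacters
matcher = re.compile(re.escape('MICHAEL89'), re.IGNORECASE)

prefix_hits = [n for n in USERNAMES if matcher.match(n)]      # start-anchored only
exact_hits = [n for n in USERNAMES if matcher.fullmatch(n)]   # whole string must match
print(prefix_hits)
print(exact_hits)

# Short-circuiting membership test
found = any(matcher.fullmatch(n) for n in USERNAMES)
print(found)
```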


Answer 7

My 5 (wrong) cents

'a' in "".join(['A']).lower()

UPDATE

Ouch, I totally agree with @jpp. I'll keep this as an example of bad practice :(


Answer 8

I needed this for a dictionary instead of a list. Jochen's solution was the most elegant for that case, so I modded it a bit:

class CaseInsensitiveDict(dict):
    ''' requests special dicts are case insensitive when using the in operator,
     this implements a similar behaviour'''
    def __contains__(self, name): # implements `in`
        return name.casefold() in (n.casefold() for n in self.keys())

Now you can convert a dictionary like so: USERNAMESDICT = CaseInsensitiveDict(USERNAMESDICT), and then use if 'MICHAEL89' in USERNAMESDICT:. Note that only the in operator becomes case-insensitive; item lookup with [] is unchanged.
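
If item lookup should ignore case as well, one option is to casefold keys when they are stored. The following is only a sketch under that assumption: it covers just __setitem__, __getitem__ and __contains__ (methods such as get() and update() are not handled), and the original spelling of the keys is lost.

```python
class CaseInsensitiveDict(dict):
    """Sketch: keys are casefolded on the way in and on lookup."""

    def __init__(self, data=None):
        super().__init__()
        for key, value in (data or {}).items():
            self[key] = value  # goes through __setitem__ below

    def __setitem__(self, key, value):
        super().__setitem__(key.casefold(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.casefold())

    def __contains__(self, key):  # implements `in`
        return super().__contains__(key.casefold())


users = CaseInsensitiveDict({'Michael89': 1, 'alice': 2})  # sample data
print('MICHAEL89' in users)
print(users['ALICE'])
```

Because the keys are normalized once, each lookup is a normal hash lookup rather than a scan over all keys.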


Answer 9

To have it in one line, this is what I did:

if any(([True if 'MICHAEL89' in username.upper() else False for username in USERNAMES])):
    print('username exists in list')

I haven't timed it, though, so I'm not sure how fast or efficient it is.
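
One way to check the speed is a quick timeit comparison between the list-comprehension form and an equivalent generator form. This is only a sketch: the generated user list, the function names and the repeat count are made up, and the numbers will vary by machine. Note that both forms test for a substring, so 'MICHAEL890' would also count as a hit.

```python
import timeit

# made-up data: 1000 non-matching names followed by one match
USERNAMES = ['user%d' % i for i in range(1000)] + ['Michael89']


def with_list():
    # builds the whole list of booleans before any() looks at it
    return any([True if 'MICHAEL89' in u.upper() else False for u in USERNAMES])


def with_generator():
    # any() can stop as soon as one entry matches
    return any('MICHAEL89' in u.upper() for u in USERNAMES)


for fn in (with_list, with_generator):
    seconds = timeit.timeit(fn, number=200)
    print(fn.__name__, seconds)
```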