Efficient way to remove keys with empty strings from a dict

Question: Efficient way to remove keys with empty strings from a dict

I have a dict and would like to remove all the keys for which there are empty value strings.

metadata = {u'Composite:PreviewImage': u'(Binary data 101973 bytes)',
            u'EXIF:CFAPattern2': u''}

What is the best way to do this?


Answer 0

Python 2.X

dict((k, v) for k, v in metadata.iteritems() if v)

Python 2.7 – 3.X

{k: v for k, v in metadata.items() if v}

Note that all of your keys have values. It’s just that some of those values are the empty string. There’s no such thing as a key in a dict without a value; if it didn’t have a value, it wouldn’t be in the dict.
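
As a quick check against the question's data, here is a minimal Python 3 sketch. The names `drop_falsy` and `drop_empty_strings` and the extra `EXIF:ISO` entry are illustrative assumptions, not part of the answer; the second variant is only relevant if values such as `0` must survive.

```python
metadata = {
    'Composite:PreviewImage': '(Binary data 101973 bytes)',
    'EXIF:CFAPattern2': '',
    'EXIF:ISO': 0,  # hypothetical extra entry, added to show the difference
}

# Truthiness test: drops '' but also 0, None, [], and other falsy values.
drop_falsy = {k: v for k, v in metadata.items() if v}

# Explicit test: drops only values equal to the empty string.
drop_empty_strings = {k: v for k, v in metadata.items() if v != ''}
```

If the values are guaranteed to be strings, the two forms give the same result.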


Answer 1

It can get even shorter than BrenBarn’s solution (and more readable, I think):

{k: v for k, v in metadata.items() if v}

Tested with Python 2.7.3.


Answer 2

If you really need to modify the original dictionary:

empty_keys = [k for k,v in metadata.iteritems() if not v]
for k in empty_keys:
    del metadata[k]

Note that we have to make a list of the empty keys because we can’t modify a dictionary while iterating through it (as you may have noticed). This is less expensive (memory-wise) than creating a brand-new dictionary, though, unless there are a lot of entries with empty values.
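
On Python 3, `iteritems()` no longer exists and `items()` returns a live view, so the same idea can be sketched as follows (the sample data is made up for illustration):

```python
metadata = {'a': 'x', 'b': '', 'c': 'y', 'd': ''}

# Collect the keys first; deleting while iterating over the view
# raises RuntimeError because the dict changes size.
empty_keys = [k for k, v in metadata.items() if not v]
for k in empty_keys:
    del metadata[k]
```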


Answer 3

BrenBarn’s solution is ideal (and pythonic, I might add). Here is another (fp) solution, however:

from operator import itemgetter
dict(filter(itemgetter(1), metadata.items()))

Answer 4

If you want a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles, I recommend looking at the remap utility from the boltons utility package.

After pip install boltons or copying iterutils.py into your project, just do:

from boltons.iterutils import remap

drop_falsey = lambda path, key, value: bool(value)
clean = remap(metadata, visit=drop_falsey)

This page has many more examples, including ones working with much larger objects from Github’s API.

It’s pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn’t handle, you can bug me to fix it right here.


Answer 5

Based on Ryan’s solution, if you also have lists and nested dictionaries:

For Python 2:

def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.iteritems() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d

For Python 3:

def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.items() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d
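
The versions above call `remove_empty_from_dict(v)` twice for every element that is kept. A sketch that cleans each child only once, assuming Python 3.8+ for the `:=` operator (the name `remove_empty` is illustrative, not from the answer):

```python
def remove_empty(d):
    """Recursively drop falsy values from nested dicts and lists."""
    if isinstance(d, dict):
        # Clean each value once, then keep it only if the result is truthy.
        return {k: cleaned for k, v in d.items() if (cleaned := remove_empty(v))}
    if isinstance(d, list):
        return [cleaned for v in d if (cleaned := remove_empty(v))]
    return d


nested = {'a': '', 'b': {'c': '', 'd': [{}, '', 'x']}, 'e': {'f': None}}
result = remove_empty(nested)
```

Like the original, this also removes `0` and `False`, because the test is plain truthiness.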

Answer 6

If you have a nested dictionary, and you want this to work even for empty sub-elements, you can use a recursive variant of BrenBarn’s suggestion:

def scrub_dict(d):
    if type(d) is dict:
        return dict((k, scrub_dict(v)) for k, v in d.iteritems() if v and scrub_dict(v))
    else:
        return d

Answer 7

Quick Answer (TL;DR)

Example01

### example01 -------------------

mydict  =   { "alpha":0,
              "bravo":"0",
              "charlie":"three",
              "delta":[],
              "echo":False,
              "foxy":"False",
              "golf":"",
              "hotel":"   ",                        
            }
newdict =   dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(vdata) ])
print newdict

### result01 -------------------
result01 ='''
{'foxy': 'False', 'charlie': 'three', 'bravo': '0'}
'''

Detailed Answer

Problem

  • Context: Python 2.x
  • Scenario: Developer wishes to modify a dictionary to exclude blank values
    • aka remove empty values from a dictionary
    • aka delete keys with blank values
    • aka filter dictionary for non-blank values over each key-value pair

Solution

  • example01 uses Python list-comprehension syntax with a simple conditional to remove “empty” values

Pitfalls

  • example01 only operates on a copy of the original dictionary (does not modify in place)
  • example01 may produce unexpected results depending on what developer means by “empty”
    • Does developer mean to keep values that are falsy?
    • If the values in the dictionary are not guaranteed to be strings, developer may have unexpected data loss.
    • result01 shows that only three key-value pairs were preserved from the original set

Alternate example

  • example02 helps deal with potential pitfalls
  • The approach is to use a more precise definition of “empty” by changing the conditional.
  • Here we only want to filter out values that evaluate to blank strings.
  • Here we also use .strip() to filter out values that consist of only whitespace.

Example02

### example02 -------------------

mydict  =   { "alpha":0,
              "bravo":"0",
              "charlie":"three",
              "delta":[],
              "echo":False,
              "foxy":"False",
              "golf":"",
              "hotel":"   ",
            }
newdict =   dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(str(vdata).strip()) ])
print newdict

### result02 -------------------
result02 ='''
{'alpha': 0,
  'bravo': '0', 
  'charlie': 'three', 
  'delta': [],
  'echo': False,
  'foxy': 'False'
  }
'''
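
For Python 3, a variant that treats only blank strings as empty, without passing every value through `str()`, might look like the sketch below. `is_blank` is a hypothetical helper name, not something from the answer.

```python
def is_blank(value):
    """True only for strings that are empty or contain only whitespace."""
    return isinstance(value, str) and not value.strip()


mydict = {"alpha": 0, "bravo": "0", "charlie": "three", "delta": [],
          "echo": False, "foxy": "False", "golf": "", "hotel": "   "}

# Keep every pair whose value is not a blank string.
newdict = {k: v for k, v in mydict.items() if not is_blank(v)}
```

This keeps the same six pairs as result02, and it does not depend on how a non-string value happens to render as text.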



Answer 8

For Python 3:

dict((k, v) for k, v in metadata.items() if v)

Answer 9

Building on the answers from patriciasz and nneonneo, and accounting for the possibility that you might want to delete keys that have only certain falsy things (e.g. '') but not others (e.g. 0), or perhaps you even want to include some truthy things (e.g. 'SPAM'), then you could make a highly specific hitlist:

unwanted = ['', u'', None, False, [], 'SPAM']

Unfortunately, this doesn’t quite work, because for example 0 in unwanted evaluates to True. We need to discriminate between 0 and other falsy things, so we have to use is:

any([0 is i for i in unwanted])

…evaluates to False.

Now use it to del the unwanted things:

unwanted_keys = [k for k, v in metadata.items() if any([v is i for i in unwanted])]
for k in unwanted_keys: del metadata[k]

If you want a new dictionary, instead of modifying metadata in place:

newdict = {k: v for k, v in metadata.items() if not any([v is i for i in unwanted])}
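
Identity checks with `is` depend on which objects the interpreter happens to share, and a freshly created `[]` is never identical to the `[]` stored in `unwanted`. One alternative, offered as a sketch rather than as this answer's method, is to require both the same type and equality (`is_unwanted` is a hypothetical helper, and the sample data is made up):

```python
unwanted = ['', None, False, [], 'SPAM']


def is_unwanted(value):
    """Match on exact type and equality, so 0 does not match False."""
    return any(type(value) is type(u) and value == u for u in unwanted)


metadata = {'a': '', 'b': 0, 'c': [], 'd': 'SPAM', 'e': 'eggs', 'f': None}
newdict = {k: v for k, v in metadata.items() if not is_unwanted(v)}
```

Here `0` survives because its type is `int`, not `bool`, while the empty list is removed even though it is a different object from the one in `unwanted`.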

Answer 10

I read all replies in this thread and some referred also to this thread: Remove empty dicts in nested dictionary with recursive function

I originally used the solution here, and it worked great:

Attempt 1: Too Hot (not performant or future-proof):

def scrub_dict(d):
    if type(d) is dict:
        return dict((k, scrub_dict(v)) for k, v in d.iteritems() if v and scrub_dict(v))
    else:
        return d

But some performance and compatibility concerns were raised in the Python 2.7 world:

  1. use isinstance instead of type
  2. unroll the list comp into for loop for efficiency
  3. use python3 safe items instead of iteritems

Attempt 2: Too Cold (Lacks Memoization):

def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v,dict):
            v = scrub_dict(v)
        if not v in (u'', None, {}):
            new_dict[k] = v
    return new_dict

DOH! This is not recursive and not at all memoizant.

Attempt 3: Just Right (so far):

def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v,dict):
            v = scrub_dict(v)
        if not v in (u'', None, {}):
            new_dict[k] = v
    return new_dict

Answer 11

Dicts mixed with Arrays

  • The Attempt 3: Just Right (so far) code from BlissRage’s answer does not properly handle array elements. I’m including a patch in case anyone needs it. The method handles lists with the statement block under if isinstance(v, list):, which scrubs the list using the original scrub_dict(d) implementation.
    @staticmethod
    def scrub_dict(d):
        new_dict = {}
        for k, v in d.items():
            if isinstance(v, dict):
                v = scrub_dict(v)
            if isinstance(v, list):
                v = scrub_list(v)
            if not v in (u'', None, {}):
                new_dict[k] = v
        return new_dict

    @staticmethod
    def scrub_list(d):
        scrubbed_list = []
        for i in d:
            if isinstance(i, dict):
                i = scrub_dict(i)
            scrubbed_list.append(i)
        return scrubbed_list
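
As written, the `@staticmethod` versions call `scrub_dict` and `scrub_list` by bare name, which only resolves if functions with those names also exist at module level. A sketch of the same logic as plain module-level functions for Python 3, assuming a class is not actually required (the sample data is made up):

```python
def scrub_dict(d):
    """Return a copy of d without '', None, or {} values, recursing into dicts and lists."""
    new_dict = {}
    for k, v in d.items():
        if isinstance(v, dict):
            v = scrub_dict(v)
        elif isinstance(v, list):
            v = scrub_list(v)
        if v not in ('', None, {}):
            new_dict[k] = v
    return new_dict


def scrub_list(items):
    """Scrub dicts found inside a list; other elements are kept as they are."""
    return [scrub_dict(i) if isinstance(i, dict) else i for i in items]


data = {'a': '', 'b': [{'c': None, 'd': 1}, 'x'], 'e': {'f': {}}}
cleaned = scrub_dict(data)
```

Inside a class, the calls would need to be qualified instead, for example with the class name.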

Answer 12

An alternative way to do this is to use a dictionary comprehension. This should be compatible with Python 2.7+:

result = {
    key: value for key, value in
    {"foo": "bar", "lorem": None}.items()
    if value
}

Answer 13

Here is an option if you are using pandas:

import numpy as np
import pandas as pd

d = dict.fromkeys(['a', 'b', 'c', 'd'])
d['b'] = 'not null'
d['c'] = ''  # empty string

print(d)

# convert `dict` to `Series` and replace any blank strings with `NaN`;
# use the `.dropna()` method and
# then convert back to a `dict`
d_ = pd.Series(d).replace('', np.nan).dropna().to_dict()

print(d_)

Answer 14

Some of the methods mentioned above also remove integers and floats with the values 0 and 0.0.

To avoid that, you can use the code below (it removes empty strings and None values from nested dictionaries and nested lists):

def remove_empty_from_dict(d):
    if type(d) is dict:
        _temp = {}
        for k,v in d.items():
            if v == None or v == "":
                pass
            elif type(v) is int or type(v) is float:
                _temp[k] = remove_empty_from_dict(v)
            elif (v or remove_empty_from_dict(v)):
                _temp[k] = remove_empty_from_dict(v)
        return _temp
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if( (str(v).strip() or str(remove_empty_from_dict(v)).strip()) and (v != None or remove_empty_from_dict(v) != None))]
    else:
        return d

Answer 15

As I am also currently writing a desktop application in Python for my work, I found that data-entry forms often have many fields, some of which are not mandatory, so the user can leave them blank. For validation, it is easy to grab all the entries and then discard the keys whose values are empty. The code below shows how to take them out with a dictionary comprehension, keeping only the entries whose value is not blank. I use Python 3.8.3.

data = {'':'', '20':'', '50':'', '100':'1.1', '200':'1.2'}

dic = {key:value for key,value in data.items() if value != ''}

print(dic)

{'100': '1.1', '200': '1.2'}

Answer 16

Some benchmarking:

1. Dict comprehension, recreating the dict

In [7]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
   ...: dic = {k: v for k, v in dic.items() if v is not None} 
   1000000 loops, best of 7: 375 ns per loop

2. Generator expression, recreating the dict using dict()

In [8]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
   ...: dic = dict((k, v) for k, v in dic.items() if v is not None)
1000000 loops, best of 7: 681 ns per loop

3. Loop and delete key if v is None

In [10]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
    ...: for k, v in dic.items():
    ...:   if v is None:
    ...:     del dic[k]
    ...: 
10000000 loops, best of 7: 160 ns per loop

So loop-and-delete is the fastest at 160 ns, the dict comprehension is about twice as slow at ~375 ns, and the version with a call to dict() is roughly twice as slow again at ~680 ns.

Wrapping 3 into a function slows it down again to about 275 ns. Also, for me PyPy was about twice as fast as plain Python.
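
The timings above come from Python 2 (`xrange`, and deleting while looping over `items()`, which returned a list there). A rough way to repeat the comparison on Python 3 with the standard `timeit` module is sketched below. The function names are illustrative, the input dict is rebuilt on every call so that each variant does the same work, and no particular result is implied.

```python
import timeit


def make_dict():
    """Build the same small test dict used in the benchmarks above."""
    dic = {str(i): i for i in range(10)}
    dic['10'] = None
    dic['5'] = None
    return dic


def with_comprehension():
    dic = make_dict()
    return {k: v for k, v in dic.items() if v is not None}


def with_dict_call():
    dic = make_dict()
    return dict((k, v) for k, v in dic.items() if v is not None)


def with_loop_delete():
    dic = make_dict()
    # list(...) takes a snapshot so deleting during the loop is safe on Python 3.
    for k, v in list(dic.items()):
        if v is None:
            del dic[k]
    return dic


for fn in (with_comprehension, with_dict_call, with_loop_delete):
    seconds = timeit.timeit(fn, number=100_000)
    print(f"{fn.__name__}: {seconds:.3f} s for 100,000 calls")
```

Each figure includes the cost of `make_dict()`, so only the differences between the three variants are meaningful.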