Filtering a list based on a list of booleans

Question 1

I have a list of values which I need to filter given the values in a list of booleans:

list_a = [1, 2, 4, 6]
filter = [True, False, True, False]

I generate a new filtered list with the following line:

filtered_list = [i for indx,i in enumerate(list_a) if filter[indx] == True]

which results in:

print(filtered_list)
[1, 4]

The line works but looks (to me) like overkill, and I was wondering whether there is a simpler way to achieve the same result.

Advice

Summary of two good pieces of advice given in the answers below:

1- Don’t name a list filter like I did, because filter is a built-in function.

2- Don’t compare things to True like I did with if filter[indx] == True, since it’s unnecessary. Just using if filter[indx] is enough.
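Both points can be seen in a short sketch. The name mask is only an illustrative replacement for filter (it does not appear in the original post); any name that is not a built-in would do.

```python
list_a = [1, 2, 4, 6]
# Hypothetical name, chosen so the built-in filter() is not shadowed.
mask = [True, False, True, False]

# Test the boolean directly instead of comparing it to True.
filtered_list = [i for indx, i in enumerate(list_a) if mask[indx]]
print(filtered_list)  # [1, 4]

# Because the name filter was never rebound, the built-in still works.
evens = list(filter(lambda x: x % 2 == 0, list_a))
print(evens)  # [2, 4, 6]
```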

Question 2

You’re looking for itertools.compress:

>>> from itertools import compress
>>> list_a = [1, 2, 4, 6]
>>> fil = [True, False, True, False]
>>> list(compress(list_a, fil))
[1, 4]

Timing comparisons (py3.x):

>>> list_a = [1, 2, 4, 6]
>>> fil = [True, False, True, False]
>>> %timeit list(compress(list_a, fil))
100000 loops, best of 3: 2.58 us per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v]  #winner
100000 loops, best of 3: 1.98 us per loop

>>> list_a = [1, 2, 4, 6]*100
>>> fil = [True, False, True, False]*100
>>> %timeit list(compress(list_a, fil))              #winner
10000 loops, best of 3: 24.3 us per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v]
10000 loops, best of 3: 82 us per loop

>>> list_a = [1, 2, 4, 6]*10000
>>> fil = [True, False, True, False]*10000
>>> %timeit list(compress(list_a, fil))              #winner
1000 loops, best of 3: 1.66 ms per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v] 
100 loops, best of 3: 7.65 ms per loop
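The %timeit lines above need IPython. As a rough sketch, a similar comparison can be run from a plain script with the standard timeit module. The wrapper functions with_compress and with_zip are names made up for this example, and the numbers will differ by machine and Python version.

```python
import timeit
from itertools import compress

list_a = [1, 2, 4, 6] * 100
fil = [True, False, True, False] * 100

def with_compress():
    return list(compress(list_a, fil))

def with_zip():
    return [i for i, v in zip(list_a, fil) if v]

# Both approaches must agree before their speed is compared.
assert with_compress() == with_zip()

for func in (with_compress, with_zip):
    # Best of 3 repeats, 1000 calls per repeat.
    best = min(timeit.repeat(func, number=1000, repeat=3))
    print(f"{func.__name__}: {best / 1000 * 1e6:.2f} us per call")
```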

Don’t use filter as a variable name; it is a built-in function.

Question 3

Like so:

filtered_list = [i for (i, v) in zip(list_a, filter) if v]

Using zip is the pythonic way to iterate over multiple sequences in parallel, without needing any indexing. This assumes both sequences have the same length (zip stops after the shortest runs out). Using itertools for such a simple case is a bit overkill …
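If a silent truncation on mismatched lengths would hide a bug, one option is to check the lengths first. The helper below, filter_by_mask, is a hypothetical name used only for this sketch.

```python
def filter_by_mask(values, mask):
    """Return the items of values whose matching mask entry is truthy.

    Hypothetical helper: raises ValueError on a length mismatch instead of
    silently truncating the way plain zip() does.
    """
    if len(values) != len(mask):
        raise ValueError(
            f"length mismatch: {len(values)} values, {len(mask)} mask entries"
        )
    return [item for item, keep in zip(values, mask) if keep]

list_a = [1, 2, 4, 6]
print(filter_by_mask(list_a, [True, False, True, False]))  # [1, 4]

# Plain zip drops the unmatched trailing items without any warning.
print([i for i, v in zip(list_a, [True, True]) if v])  # [1, 2]
```

On Python 3.10 and later, zip(list_a, mask, strict=True) raises ValueError on a length mismatch without a separate helper.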

One thing in your example that you should really stop doing is comparing things to True; this is usually unnecessary. Instead of if filter[idx] == True: ..., you can simply write if filter[idx]: ....

Question 4

With NumPy (imported as np):

In [128]: list_a = np.array([1, 2, 4, 6])
In [129]: filter = np.array([True, False, True, False])
In [130]: list_a[filter]

Out[130]: array([1, 4])

Or see Alex Szatmary’s answer if list_a can be a NumPy array but filter cannot.

NumPy usually gives you a big speed boost as well:

In [133]: list_a = [1, 2, 4, 6]*10000
In [134]: fil = [True, False, True, False]*10000
In [135]: list_a_np = np.array(list_a)
In [136]: fil_np = np.array(fil)

In [139]: %timeit list(itertools.compress(list_a, fil))
1000 loops, best of 3: 625 us per loop

In [140]: %timeit list_a_np[fil_np]
10000 loops, best of 3: 173 us per loop

Question 5

To do this using NumPy (imported as np), i.e., if you have an array a instead of list_a:

a = np.array([1, 2, 4, 6])
my_filter = np.array([True, False, True, False], dtype=bool)
a[my_filter]
> array([1, 4])

Question 6

filtered_list = [list_a[i] for i in range(len(list_a)) if filter[i]]

Question 7

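If list_a and filter are NumPy arrays (this does not work with plain Python lists), you can use list_a[filter] to get the elements where filter is True. To get the elements where it is False, use list_a[~filter].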
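A minimal sketch of mask indexing and its negation, assuming NumPy is installed. The names a, mask, kept and dropped are chosen for the example; the last part shows that the same syntax raises TypeError on plain Python lists.

```python
import numpy as np

a = np.array([1, 2, 4, 6])
mask = np.array([True, False, True, False])

kept = a[mask]      # elements where mask is True
dropped = a[~mask]  # elements where mask is False
print(kept.tolist(), dropped.tolist())  # [1, 4] [2, 6]

# Plain lists do not support boolean-mask indexing.
plain_values = [1, 2, 4, 6]
plain_mask = [True, False, True, False]
try:
    plain_values[plain_mask]
except TypeError as exc:
    print("plain list:", exc)
```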
