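Tag archive: Python

Mininet: a rapid-prototyping emulator for software-defined networks

What is Mininet?

Mininet emulates a complete network of hosts, links, and switches on a single machine. To create a sample network with two hosts and one switch, just run: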

sudo mn

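Mininet is useful for interactive development, testing, and demos, especially those using OpenFlow and SDN. OpenFlow-based network controllers prototyped in Mininet can usually be transferred to hardware with minimal changes for full line-rate execution.

How does it work?

Mininet creates virtual networks using process-based virtualization and network namespaces, features that are available in recent Linux kernels. In Mininet, hosts are emulated as bash processes running in a network namespace, so any code that would normally run on a Linux server (such as a web server or client program) should run just fine within a Mininet "Host". The Mininet "Host" has its own private network interface and can only see its own processes. Switches in Mininet are software-based switches such as Open vSwitch or the OpenFlow reference switch. Links are virtual Ethernet pairs, which live in the Linux kernel and connect the emulated switches to the emulated hosts (processes).

Features

Mininet includes:

  • A command-line launcher (mn) to instantiate networks
  • A handy Python API for creating networks of varying sizes and topologies
  • Examples (in the examples/ directory) to help you get started
  • Full API documentation via Python help() docstrings, as well as the ability to generate PDF/HTML documentation with make doc
  • Parametrized topologies (Topo subclasses) using the Mininet object. For example, a tree network may be created with the command: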

    mn --topo tree,depth=2,fanout=3

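  • A command-line interface (CLI class) which provides useful diagnostic commands (like iperf and ping), as well as the ability to run a command on a node. For example,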

    mininet> h11 ifconfig -a

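    tells host h11 to run the command ifconfig -a

  • A "cleanup" command to get rid of junk (interfaces, processes, files in /tmp, etc.) which might be left around by Mininet or Linux. Try this if things stop working!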

    mn -c
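
The tree topology command above (depth=2, fanout=3) implies a specific number of switches and hosts. As a rough sketch in plain Python (this is not Mininet code, and the function name tree_size is made up for illustration), assuming a tree whose internal nodes are switches and whose leaves are hosts:

```python
def tree_size(depth, fanout):
    """Count switches and hosts in a tree where leaves are hosts.

    Assumption: levels 0..depth-1 hold switches and level `depth` holds hosts.
    """
    switches = sum(fanout ** level for level in range(depth))
    hosts = fanout ** depth
    return switches, hosts


switches, hosts = tree_size(depth=2, fanout=3)
print(f"{switches} switches, {hosts} hosts")
```

Under that assumption, every node except the root has exactly one uplink, so the number of links is switches + hosts - 1.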

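Python 3 support

  • Mininet 2.3.0 supports both Python 3 and Python 2!
  • You can install the Python 3 and Python 2 versions of Mininet side by side, but the most recent installation determines which Python version mn uses by default
  • You can run mn directly with Python 2 or Python 3, as long as the appropriate version of Mininet is installed, e.g.
    $ sudo python2 `which mn`

  • More information on Python 3 and Python 2 support can be found in the release notes on http://docs.mininet.org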

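Other enhancements and information

  • Support for Ubuntu 20.04 LTS (as well as 18.04 and 16.04)
  • More reliable testing and CI via GitHub Actions
  • More information on this release and previous releases can be found in the release notes on http://docs.mininet.org

Installation

See INSTALL for installation instructions and details.

Documentation

In addition to the API documentation (make doc), much useful information, including a Mininet walkthrough and an introduction to the Python API, is available on the Mininet Web Site. There is also a wiki which you are encouraged to read and to contribute to, particularly the Frequently Asked Questions (FAQ) at http://faq.mininet.org

Support

Mininet is community-supported. We encourage you to join the Mininet mailing list, mininet-discuss, at: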

https://mailman.stanford.edu/mailman/listinfo/mininet-discuss

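Join us

Thanks again to all of the Mininet contributors and users!

Mininet is an open source project and is currently hosted at https://github.com/mininet. You are encouraged to download, examine, and modify the code, and to submit bug reports, bug fixes, feature requests, new features, and other issues and pull requests. Thanks to everyone who has contributed code to the Mininet project (see CONTRIBUTORS for more info!). It is because of everyone's hard work that Mininet continues to grow and improve.

Enjoy Mininet

Have fun! We look forward to seeing what you will do with Mininet to change the networking world.

Bob Lantz, on behalf of the Mininet Contributors

isort: a Python utility / library to sort imports

isort is a Python utility / library that sorts imports alphabetically and automatically separates them into sections. It provides a command-line utility, a Python library, and plugins for editors such as VS Code and PyCharm to quickly sort all your imports.

It requires Python 3.6+ to run but supports formatting Python 2 code too.

Before isort: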

from my_lib import Object
import os
from my_lib import Object3
from my_lib import Object2
import sys
from third_party import lib15, lib1, lib2, lib3, lib4, lib5, lib6, lib7, lib8, lib9, lib10, lib11, lib12, lib13, lib14
import sys
from __future__ import absolute_import
from third_party import lib3
print("Hey")
print("yo")

After isort:

from __future__ import absolute_import

import os
import sys

from third_party import (lib1, lib2, lib3, lib4, lib5, lib6, lib7, lib8,
                         lib9, lib10, lib11, lib12, lib13, lib14, lib15)

from my_lib import Object, Object2, Object3

print("Hey")
print("yo")
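
To make the before/after transformation more concrete, here is a deliberately simplified sketch of the core idea: deduplicate plain imports, merge repeated "from X import ..." lines, and sort alphabetically. It is not isort's implementation; the function name sort_imports is hypothetical, and the sketch ignores sections, line wrapping, natural ordering (lib2 before lib10), aliases, and comments.

```python
from collections import defaultdict


def sort_imports(lines):
    """Toy import sorter: merge and alphabetize simple import lines."""
    plain = set()                    # modules from `import X` lines
    from_imports = defaultdict(set)  # module -> names from `from X import ...`
    for line in lines:
        line = line.strip()
        if line.startswith("import "):
            plain.add(line[len("import "):])
        elif line.startswith("from "):
            module, names = line[len("from "):].split(" import ")
            from_imports[module].update(n.strip() for n in names.split(","))
    result = [f"import {module}" for module in sorted(plain)]
    for module in sorted(from_imports):
        names = ", ".join(sorted(from_imports[module]))
        result.append(f"from {module} import {names}")
    return result


messy = [
    "from my_lib import Object",
    "import os",
    "from my_lib import Object3",
    "from my_lib import Object2",
    "import sys",
    "import sys",
]
print("\n".join(sort_imports(messy)))
```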

Installing isort

Installing isort is as simple as:

pip install isort

Install isort with requirements.txt support:

pip install isort[requirements_deprecated_finder]

Install isort with Pipfile support:

pip install isort[pipfile_deprecated_finder]

Install isort with support for both formats:

pip install isort[requirements_deprecated_finder,pipfile_deprecated_finder]

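Using isort

From the command line

To run on specific files:

isort mypythonfile.py mypythonfile2.py

To apply recursively:

isort .

If globstar is enabled, isort . is equivalent to:

isort **/*.py

To view proposed changes without applying them:

isort mypythonfile.py --diff

Finally, to atomically run isort against a project, only applying changes if they don't introduce syntax errors:

isort --atomic .

(Note: this is disabled by default, as it prevents isort from running against code written using a different version of Python.)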

From within Python

import isort

isort.file("pythonfile.py")

or:

import isort

sorted_code = isort.code("import b\nimport a\n")

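Installing isort for your preferred text editor

Several plugins have been written that enable the use of isort from within a variety of text editors. You can find a full list of them on the isort wiki. Additionally, I will enthusiastically accept pull requests that include plugins for other text editors and add documentation for them as I am notified.

Multi-line output modes

You will notice above the "multi_line_output" setting. This setting defines how from imports wrap when they extend past the line_length limit, and it has 12 possible settings.

Indentation

To change how constant indents appear, simply change the indent property using one of the following accepted formats:

  • The number of spaces you would like. For example: 4 would cause standard 4-space indentation
  • Tab
  • A verbatim string with quotes around it

For example:

"    "

is equivalent to 4.

For the import styles that use parentheses, you can control whether or not to include a trailing comma after the last import with the include_trailing_comma option (defaults to False).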

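Intelligently balanced multi-line imports

As of isort 3.1.0, support for balanced multi-line imports has been added. With this enabled, isort will dynamically change the import length to the one that produces the most balanced grid, while staying below the maximum import length defined.

Example:

from __future__ import (absolute_import, division,
                        print_function, unicode_literals)

will be produced instead of:

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

To enable this, set balanced_wrapping to True in your config or pass the -e option to the command-line utility.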
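
One way to picture "the most balanced grid" is: keep the number of lines that ordinary wrapping needs, then shrink the allowed width as far as possible without adding a line. The sketch below does this with the standard library's textwrap. It is only an illustration of the idea and not isort's actual algorithm; the helper names and the width of 55 (a 79-character line minus the 24-character "from __future__ import (" prefix) are assumptions.

```python
import textwrap


def wrap_names(names, width):
    """Greedy wrapping of a comma-separated name list."""
    return textwrap.wrap(", ".join(names), width=width,
                         break_long_words=False, break_on_hyphens=False)


def balanced_wrap(names, max_width):
    """Narrowest wrapping that needs no more lines than wrapping at max_width."""
    best = wrap_names(names, max_width)
    for width in range(max_width - 1, 0, -1):
        candidate = wrap_names(names, width)
        if len(candidate) > len(best):
            break  # any narrower width would add a line
        best = candidate
    return best


names = ["absolute_import", "division", "print_function", "unicode_literals"]
print(wrap_names(names, 55))     # greedy: three names, then one
print(balanced_wrap(names, 55))  # balanced: two names per line
```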

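Custom sections and ordering

isort provides configuration options to change almost every aspect of how imports are organized, ordered, or grouped together in sections.

An overview of all these options is available in the isort documentation.

Skip processing of imports (outside of configuration)

To make isort ignore a single import, simply add a comment containing the text isort:skip at the end of the import line:

import module  # isort:skip

or:

from xyz import (abc,  # isort:skip
                 yo,
                 hey)

To make isort skip an entire file, simply add isort:skip_file to the module's docstring: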

""" my_module.py
    Best module ever

   isort:skip_file
"""

import b
import a

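Adding or removing an import from multiple files

isort can be run or configured to add / remove imports automatically.

A complete guide is available in the isort documentation.

Using isort to verify code

The --check-only option

isort can also be used to verify that code is correctly formatted by running it with -c. Any files that contain incorrectly sorted and/or formatted imports will be output to stderr.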

isort **/*.py -c -v

SUCCESS: /home/timothy/Projects/Open_Source/isort/isort_kate_plugin.py Everything Looks Good!
ERROR: /home/timothy/Projects/Open_Source/isort/isort/isort.py Imports are incorrectly sorted.

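One great place this can be used is with a pre-commit git hook, such as this one by @acdha:

https://gist.github.com/acdha/8717683

This can help to ensure a certain level of code quality throughout a project.

Git hook

isort provides a hook function that can be integrated into your Git pre-commit script to check Python code before committing.

More information is available in the isort documentation.

Setuptools integration

Upon installation, isort enables a setuptools command that checks Python files declared by your project.

More information is available in the isort documentation.

Spread the word

Place this badge at the top of your repository to let others know your project uses isort.

For README.md:

[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)

Or README.rst: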

.. image:: https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336
    :target: https://pycqa.github.io/isort/

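Security contact information

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.

Why isort?

isort simply stands for import sort. It was originally called "sortImports", but I got tired of typing the extra characters and came to the realization that camelCase is not Pythonic.

I wrote isort because in an organization I used to work in, the manager came in one day and decided all code must have alphabetically sorted imports. The code base was huge, and he meant for us to do it by hand. However, being a programmer, I'm too lazy to spend 8 hours mindlessly performing a function, but not too lazy to spend 16 hours automating it. I was given permission to open source sortImports and here we are :)

Get professionally supported isort with the Tidelift Subscription

Professional support for isort is available as part of the Tidelift Subscription. Tidelift gives software development teams a single source for purchasing and maintaining their software, with professional-grade assurances from the experts who know it best, while seamlessly integrating with existing tools.

Thanks, and I hope you find isort useful!

~Timothy Crosley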

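WeiboSpider: a Sina Weibo crawler that collects Weibo data with Python

This program can continuously crawl the data of one or more Sina Weibo users (such as 胡歌, 迪丽热巴, and 郭碧婷) and write the results to files or databases. The written information covers almost all of a user's Weibo data, in two categories: user information and post information. Because there is a lot of it, the details are given under "Fields collected" rather than here. If you only need user information, a setting lets you crawl user information only. The program needs a cookie to gain access to Weibo; how to obtain the cookie is explained later. If you do not want to set a cookie, you can use the cookie-free version; the two have similar functionality.

Crawl results can be written to files and databases. The supported output types are:

  • txt file (default)
  • csv file (default)
  • json file (optional)
  • MySQL database (optional)
  • MongoDB database (optional)
  • SQLite database (optional)

Downloading images and videos from posts is also supported. The downloadable files are:

  • Original images in original posts (optional)
  • Original images in reposts (optional)
  • Videos in original posts (optional)
  • Videos in reposts (optional)
  • Videos in Live Photos of original posts (cookie-free version only)
  • Videos in Live Photos of reposts (cookie-free version only)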

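Fields collected

This section describes the fields that are crawled. To distinguish this program from the cookie-free version, the information crawled by both is listed below. Information specific to the cookie-free version is marked "cookie-free version"; unmarked items are common to both.

User information

  • User id: the Weibo user id, e.g. "1669879400"; this field is in fact already known in advance
  • Nickname: the user's nickname, e.g. "Dear-迪丽热巴"
  • Gender: the user's gender
  • Birthday: the user's date of birth
  • Location: the user's location
  • Education: names of the schools the user attended, with dates
  • Work experience: names of the companies the user worked for, with dates
  • Sunshine credit (cookie-free version): the user's Sunshine credit rating
  • Registration date (cookie-free version): the date the user registered on Weibo
  • Post count: the user's total number of posts (reposts + original posts)
  • Following count: the number of accounts the user follows
  • Follower count: the number of followers the user has
  • Bio: the user's profile description
  • Profile URL (cookie-free version): URL of the user's profile page on mobile Weibo
  • Avatar URL (cookie-free version): URL of the user's avatar
  • HD avatar URL (cookie-free version): URL of the user's high-resolution avatar
  • Weibo level (cookie-free version): the user's Weibo level
  • Membership level (cookie-free version): the level of a Weibo member; 0 for ordinary users
  • Verified (cookie-free version): whether the user is verified, as a boolean
  • Verification type (cookie-free version): the type of verification, such as individual, enterprise, or government
  • Verification info: only for verified users; the verification text shown in the user's information bar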

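Post information

  • Post id: the unique identifier of the post
  • Post content: the body text of the post
  • Headline article URL: URL of the headline article in the post; '' if the post has no headline article
  • Original image URLs: URLs of the images in an original post and of the images in a repost's comment; if a post has several images, the URLs are separated by commas; "无" if there are no images
  • Video URL: URL of the video in the post; "无" if the post has no video
  • Location: the location attached to a location-tagged post
  • Publish time: the time the post was published, to the minute
  • Likes: the number of likes the post received
  • Reposts: the number of times the post was reposted
  • Comments: the number of comments on the post
  • Publishing tool: the tool used to publish the post, such as the iPhone client or HUAWEI Mate 20 Pro
  • Result files: saved in a folder named after the user's nickname inside the weibo folder of the current directory, with names of the form "user_id.csv" and "user_id.txt"
  • Post images: images in original posts and in repost comments, saved in the img folder under the folder named after the user's nickname
  • Post videos: videos in original posts, saved in the video folder under the folder named after the user's nickname
  • Post bid (cookie-free version): specific to the cookie-free version; the same value as the post id in this program
  • Topics (cookie-free version): the post's topics, i.e. the text between two # signs; multiple topics are separated by commas; '' if there are none
  • @ users (cookie-free version): the users mentioned with @ in the post; multiple users are separated by commas; '' if there are none
  • Original post (cookie-free version): specific to reposts; the post that was reposted, stored as a dict containing all the post fields above, such as post id and post content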

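Example

To see what the program actually produces, see the example document, which walks through crawling 迪丽热巴's posts and includes screenshots of some of the result files.

Runtime environment

  • Language: Python 2 / Python 3
  • OS: Windows / Linux / macOS

Usage

0. Version

This program comes in two versions. You are looking at the Python 3 version; the other is the Python 2 version, located on the python2 branch. Development currently focuses on the Python 3 version, including new features and bug fixes; the Python 2 version receives bug fixes only. Python 3 users should use the current version, and Python 2 users the Python 2 version. These instructions are for the Python 3 version.

1. Install the program

There are two ways to install: from source or with pip. Both provide exactly the same functionality. If you need to modify the source code, install from source; otherwise either method is fine.

Install from source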

$ git clone https://github.com/dataabc/weiboSpider.git
$ cd weiboSpider
$ pip install -r requirements.txt

Install with pip

$ python3 -m pip install weibo-spider

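2. Configure the program

To learn about the settings, see the program settings document.

3. Run the program

Users who installed from source can run the following command in the weiboSpider directory; users who installed with pip can run it in any directory where they have write permission: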

$ python3 -m weibo_spider

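On first run, a config.json configuration file is created automatically in the current directory. After configuring it, run the same command again to fetch posts.

If you already have a config.json file, you can also pass its path with the config_path parameter: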

$ python3 -m weibo_spider --config_path="config.json"

To choose where files (csv, txt, json, images, videos) are saved, use the output_dir parameter. For example, to save files under /home/weibo/, run:

$ python3 -m weibo_spider --output_dir="/home/weibo/"

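To supply user_ids on the command line, use the u parameter. You can give one or more user_ids separated by commas; if the list contains duplicates, the program removes them automatically. For example: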

$ python3 -m weibo_spider --u="1669879400,1223178222"

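The program fetches the posts of the users whose user_id is 1669879400 and 1223178222; how to obtain a user_id is explained later. All user_ids given this way use the since_date and end_date settings in config.json, and changing those values controls the time range that is crawled. If user_id_list in config.json is a file path, every user_id from the command line is saved to that file and its since_date is updated automatically. If it is not a path, the user_ids are saved to user_id_list.txt in the current directory, again with since_date updated automatically; the file is created if it does not exist.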
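
The comma-separated, deduplicated user_id list described above can be pictured with a few lines of Python. This is only a sketch of the behavior, not the program's own parsing code; the function name parse_user_ids is hypothetical, and keeping the first occurrence of each id in its original order is an assumption.

```python
def parse_user_ids(raw):
    """Split a comma-separated string of user_ids and drop duplicates.

    dict.fromkeys keeps the first occurrence of each id and preserves order.
    """
    parts = (part.strip() for part in raw.split(","))
    return list(dict.fromkeys(part for part in parts if part))


print(parse_user_ids("1669879400,1223178222,1669879400"))
```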

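Customize the program (optional)

This part is optional. If you do not need to customize the program or add new features, you can skip it.

The main code is in the weibo_spider.py file. The core of the program is a Spider class, and all the features above are implemented by calling the Spider class from the main function. The default calling code is:

        config = get_config()
        wb = Spider(config)
        wb.start()  # crawl the Weibo data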

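Users can call or modify the Spider class to suit their needs. Running the program yields a lot of information.

  • wb.user['nickname']: the user's nickname;
  • wb.user['gender']: the user's gender;
  • wb.user['location']: the user's location;
  • wb.user['birthday']: the user's date of birth;
  • wb.user['description']: the user's bio;
  • wb.user['verified_reason']: the user's verification;
  • wb.user['talent']: the user's tags;
  • wb.user['education']: the user's education;
  • wb.user['work']: the user's work experience;
  • wb.user['weibo_num']: post count;
  • wb.user['following']: following count;
  • wb.user['followers']: follower count;

wb.weibo: apart from the information above, wb.weibo contains all the crawled post information, such as post id, post content, original image URLs, location, publish time, publishing tool, like count, repost count, and comment count. If all posts are crawled (original + reposts), it additionally contains the original image URLs of the reposted post, whether the post is original, and so on. wb.weibo is a list containing all crawled posts: wb.weibo[0] is the first post crawled, wb.weibo[1] the second, and so on. When filter=1, wb.weibo[0] is the first original post crawled, and so on. wb.weibo[0]['id'] is the id of the first post, wb.weibo[0]['content'] is its body text, and wb.weibo[0]['publish_time'] is its publish time. Many other fields are available; they are described under "Details" below.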

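Details

If the target user has posts, then:

  • id: stores the post id. For example, wb.weibo[0]['id'] is the id of the most recent post;
  • content: stores the post's body text. For example, wb.weibo[0]['content'] is the body text of the most recent post;
  • article_url: stores the URL of the headline article in the post. For example, wb.weibo[0]['article_url'] is the headline article URL of the most recent post; if the post has no headline article, the value is '';
  • original_pictures: stores the original image URLs of an original post and the image URLs in a repost's comment. For example, wb.weibo[0]['original_pictures'] is the original image URL of the most recent post; if the post has several images, several URLs are stored, separated by commas; if the post has no images, the value is "无";
  • retweet_pictures: stores the original image URLs of the reposted post. If the most recent post is an original post, or a repost without images, the value is "无"; otherwise it is the image URL of the reposted post. If there are several images, several URLs are stored, separated by commas;
  • publish_place: stores the location the post was published from. For example, wb.weibo[0]['publish_place'] is the location of the most recent post; if the post has no location information, the value is "无";
  • publish_time: stores the post's publish time. For example, wb.weibo[0]['publish_time'] is the publish time of the most recent post;
  • up_num: stores the number of likes the post received. For example, wb.weibo[0]['up_num'] is the like count of the most recent post;
  • retweet_num: stores the number of reposts the post received. For example, wb.weibo[0]['retweet_num'] is the repost count of the most recent post;
  • comment_num: stores the number of comments the post received. For example, wb.weibo[0]['comment_num'] is the comment count of the most recent post;
  • publish_tool: stores the tool used to publish the post. For example, wb.weibo[0]['publish_tool'] is the publishing tool of the most recent post.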
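
Because wb.weibo is described as a plain list of dicts, ordinary list and dict operations are enough to work with it. The sketch below uses hand-written stand-in data rather than a real crawl; sample_weibo and both helper functions are hypothetical, and only the 'id' and 'up_num' keys named above are assumed.

```python
# Hypothetical stand-in for wb.weibo: a list of dicts, most recent post first.
sample_weibo = [
    {"id": "post-3", "up_num": 12},
    {"id": "post-2", "up_num": 40},
    {"id": "post-1", "up_num": 7},
]


def most_liked(posts):
    """Return the post dict with the highest like count, or None if empty."""
    return max(posts, key=lambda post: post["up_num"]) if posts else None


def total_likes(posts):
    """Sum the like counts over all crawled posts."""
    return sum(post["up_num"] for post in posts)


print(sample_weibo[0]["id"])            # id of the most recent post
print(most_liked(sample_weibo)["id"])   # id of the post with the most likes
print(total_likes(sample_weibo))
```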

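Crawl posts automatically on a schedule (optional)

To have the program crawl automatically at regular intervals, fetching only newly added content (not posts that have already been collected), see the document on scheduled automatic crawling.

How to get a cookie

To learn how to obtain a cookie, see the cookie document.

How to get a user_id

To learn how to obtain a user_id, see the user_id document, which explains how to obtain the user_id of one or more Weibo users.

FAQ

If an error occurs while running the program, check the FAQ page, which covers the most common problems and their solutions. If your error is not listed there, you can ask for help by opening an issue, and we will be glad to answer.

Academic research

By collecting Weibo data, this project provides data for non-commercial work such as papers and research. The academic research document lists projects that have used this program in papers or research; they are shown with the consent of their owners. Descriptions that touch on private matters were discussed with the owners, and only the parts they agreed to show are included. If an owner previously agreed to show some information that is already in the document and no longer wants it shown, they can tell me by email (chillychen1991@gmail.com) or by opening an issue, and I will remove it. Anyone who uses this project for a paper or other academic research is also welcome to present their results in the academic research document; this is entirely voluntary.

Related projects

  • weibo-crawler: has exactly the same functionality as this project, can run without a cookie, and collects more post attributes;
  • weibo-search: continuously fetches the search results for one or more Weibo keywords and writes them to files (optional), databases (optional), and so on. A keyword search finds the posts whose body text contains the given keyword, and the time range of the search can be specified. For a very popular keyword, a one-day range can yield more than 10 million results, so an N-day range can yield 10 million × N results. For most keywords, fewer than 10 million matching posts are produced per day, so the program can obtain all or nearly all results for most keywords. It also collects all the information in the search results; every post field this program collects is available there too.

Contributing

Contributions to this project are welcome. You can contribute code, make suggestions through issues (new features, improvements, and so on), or report bugs and shortcomings through issues. See "Contributing to this project" for details.

Contributors

Thanks to everyone who has contributed to this project. See the contributors page for details.

Notes

  1. user_id must not be the user_id of the crawler account. To crawl posts, you must first log in to a Weibo account, which we will call the crawler account. The pages the crawler account receives when visiting its own profile differ in format from those it receives when visiting other users' profiles, so the program cannot crawl its own posts. To crawl the crawler account's posts, see "Getting your own Weibo information".
  2. Cookies expire after roughly three months. If the program reports that the cookie is wrong or has expired, update the cookie.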

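Nameko: a Python framework for building microservices

A microservices framework for Python that lets service developers concentrate on application logic and encourages testability.

A nameko service is just a class: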

# helloworld.py

from nameko.rpc import rpc

class GreetingService:
    name = "greeting_service"

    @rpc
    def hello(self, name):
        return "Hello, {}!".format(name)

You can run it in a shell:

$ nameko run helloworld
starting services: greeting_service
...

And play with it from another:

$ nameko shell
>>> n.rpc.greeting_service.hello(name="ナメコ")
'Hello, ナメコ!'

Features

  • AMQP RPC and Events (pub-sub)
  • HTTP GET, POST, and WebSockets
  • CLI for easy and rapid development
  • Utilities for unit and integration testing

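Support

For help, comments, or questions, please go to <https://discourse.nameko.io/>

For enterprise

Available as part of the Tidelift Subscription.

The maintainers of Nameko and thousands of other packages are working with Tidelift to deliver commercial support and maintenance for the open source dependencies you use to build your applications. Save time, reduce risk, and improve code health, while paying the maintainers of the exact dependencies you use.

Security contact information

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.

Contribute

  • Fork the repository
  • Raise an issue or make a feature request

License

Apache 2.0. See LICENSE for details.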

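SQLAlchemy: the Python database toolkit

The Python SQL Toolkit and Object Relational Mapper

Introduction

SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. SQLAlchemy provides a full suite of well-known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language.

Major SQLAlchemy features include:

  • An industrial-strength ORM, built from the core on the identity map, unit of work, and data mapper patterns. These patterns allow transparent persistence of objects using a declarative configuration system. Domain models can be constructed and manipulated naturally, and changes are synchronized with the current transaction automatically.
  • A relationally oriented query system, exposing the full range of SQL's capabilities explicitly, including joins, subqueries, correlation, and almost everything else, in terms of the object model. Writing queries with the ORM uses the same techniques of relational composition you use when writing SQL. While you can drop into literal SQL at any time, it is virtually never needed.
  • A comprehensive and flexible system of eager loading for related collections and objects. Collections are cached within a session, and can be loaded on individual access, all at once using joins, or by one query per collection across the full result set.
  • A Core SQL construction system and DBAPI interaction layer. The SQLAlchemy Core is separate from the ORM and is a full database abstraction layer in its own right; it includes an extensible Python-based SQL expression language, schema metadata, connection pooling, type coercion, and custom types.
  • All primary and foreign key constraints are assumed to be composite and natural. Surrogate integer primary keys are of course still the norm, but SQLAlchemy never assumes or hardcodes this model.
  • Database introspection and generation. Database schemas can be "reflected" in one step into Python structures representing database metadata; those same structures can then generate CREATE statements right back out, all within the Core, independent of the ORM.

SQLAlchemy's philosophy:

  • SQL databases behave less like object collections the more size and performance start to matter; object collections behave less like tables and rows the more abstraction starts to matter. SQLAlchemy aims to accommodate both of these principles.
  • An ORM doesn't need to hide the "R". A relational database provides rich, set-based functionality that should be fully exposed. SQLAlchemy's ORM provides an open-ended set of patterns that allow a developer to construct a custom mediation layer between a domain model and a relational schema, turning the so-called "object relational impedance" issue into a distant memory.
  • The developer, in all cases, makes all decisions regarding the design, structure, and naming conventions of both the object model and the relational schema. SQLAlchemy only provides the means to automate the execution of these decisions.
  • With SQLAlchemy, there is no such thing as "the ORM generated a bad query": you retain full control over the structure of queries, including how joins are organized, how subqueries and correlation are used, and which columns are requested. Everything SQLAlchemy does is ultimately the result of a developer-initiated decision.
  • Don't use an ORM if the problem doesn't need one. SQLAlchemy consists of a Core and a separate ORM component. The Core offers a full SQL expression language that allows Pythonic construction of SQL constructs that render directly to SQL strings for a target database, returning result sets that are essentially enhanced DBAPI cursors.
  • Transactions should be the norm. With SQLAlchemy's ORM, nothing goes to permanent storage until commit() is called. SQLAlchemy encourages applications to create a consistent means of delineating the start and end of a series of operations.
  • Never render a literal value in a SQL statement. Bound parameters are used to the greatest degree possible, allowing query optimizers to cache query plans effectively and making SQL injection attacks a non-issue.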
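
The last two principles in the list above, transactions as the norm and bound parameters instead of literal values, can be illustrated without SQLAlchemy itself. The sketch below uses Python's built-in sqlite3 module, a swapped-in stand-in chosen so the example runs anywhere; the table name, column names, and sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# A value that would be dangerous if pasted into the SQL text directly.
hostile = "x'); DROP TABLE users; --"

# Bound parameter: the value travels separately from the SQL statement.
with conn:  # commits on success, rolls back if the block raises
    conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))

rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

# A failed unit of work leaves no trace in the table.
try:
    with conn:
        conn.execute("INSERT INTO users (name) VALUES (?)", ("temporary",))
        raise RuntimeError("something went wrong mid-transaction")
except RuntimeError:
    pass

count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(rows, count)
```

The hostile string is stored and matched as ordinary data, and the row inserted inside the failed block is rolled back, so only the first row remains.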

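Documentation

The latest documentation is at:

https://www.sqlalchemy.org/docs/

Installation / Requirements

Full documentation for installation is at Installation.

Getting help / Development / Bug reporting

Please refer to the SQLAlchemy Community Guide.

Code of Conduct

Above all, SQLAlchemy places great emphasis on polite, thoughtful, and constructive communication between users and developers. Please see our current Code of Conduct.

License

SQLAlchemy is distributed under the MIT license.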

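VisiData: a terminal spreadsheet tool for discovering and arranging data

A terminal interface for exploring and arranging tabular data.

VisiData supports tsv, csv, sqlite, json, xlsx (Excel), hdf5, and many other formats.

Platform requirements

  • Linux, OS/X, or Windows (with WSL)
  • Python 3.6+
  • Additional Python modules are required for certain formats and sources

Install

To install the latest release from PyPI:

pip3 install visidata

To install the cutting-edge develop branch (no warranty expressed or implied):

pip3 install git+https://github.com/saulpw/visidata.git@develop

See visidata.org/install for detailed instructions for all available platforms and package managers.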

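Usage

$ vd <input>
$ <command> | vd

Press Ctrl+Q to quit at any time.

Hundreds of other commands and options are also available; see the documentation.

Help and support

If you have a question, issue, or suggestion regarding VisiData, please create an issue on GitHub or chat with us at #visidata on irc.libera.chat.

If you use VisiData regularly, please support me on Patreon!

License

Code in the stable branch of this repository, including the main vd application, loaders, and plugins, is available for use and redistribution under GPLv3.

Credits

VisiData is conceived and developed by Saul Pwanson <vd@saul.pw>.

Anja Kefala <anja.kefala@gmail.com> maintains the documentation and packages for all platforms.

Many thanks to numerous other contributors, and to those wonderful users who provide feedback, for helping to make VisiData the awesome tool that it is.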

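Manim: a community-maintained Python framework for creating mathematical animations

An animation engine for explanatory math videos

Manim is an animation engine for explanatory math videos. It is used to create precise animations programmatically, as demonstrated in the videos of 3Blue1Brown.

NOTE: This repository is maintained by the Manim Community and is not associated with Grant Sanderson or 3Blue1Brown in any way (although we are definitely indebted to him for providing his work to the world). If you would like to study how Grant makes his videos, head over to his repository (3b1b/manim). This fork is updated more frequently than his, and it is the recommended one if you would like to use Manim for your own projects.

Installation

Manim requires a few dependencies that must be installed prior to using it. If you want to try it out first before installing it locally, you can do so in our online Jupyter environment.

For local installation, please visit the Documentation and follow the appropriate instructions for your operating system.

Once the dependencies have been installed, run the following in a terminal window:

pip install manim

Usage

Manim is an extremely versatile package. The following is an example Scene you can construct: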

from manim import *


class SquareToCircle(Scene):
    def construct(self):
        circle = Circle()
        square = Square()
        square.flip(RIGHT)
        square.rotate(-3 * TAU / 8)
        circle.set_fill(PINK, opacity=0.5)

        self.play(Create(square))
        self.play(Transform(square, circle))
        self.play(FadeOut(square))

In order to view the output of this scene, save the code in a file called example.py. Then, run the following in a terminal window:

manim -p -ql example.py SquareToCircle

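You should see your native video player program pop up and play a simple scene in which a square is transformed into a circle. You may find some more simple examples within the GitHub repository. You can also visit the official gallery for more advanced examples.

Manim also ships with a %%manim IPython magic which makes it convenient to use in JupyterLab (as well as classic Jupyter) notebooks. See the corresponding documentation for some guidance, and try it out online.

Command line arguments

The -p flag in the command above is for previewing, meaning the video file will automatically open when it is done rendering. The -ql flag is for a faster rendering at a lower quality.

Some other useful flags include:

  • -s to skip to the end and just show the final frame
  • -n <number> to skip ahead to the n'th animation of a scene
  • -f to show the file in the file browser

For a thorough list of command line arguments, visit the documentation.

Documentation

Documentation is in progress at ReadTheDocs.

Docker

The community also maintains a docker image (manimcommunity/manim), which can be found on DockerHub. The supported tags are listed there.

Instructions for running the docker image

Quick example

To render a scene CircleToSquare in a file test_scenes.py contained in your current working directory while preserving your user and group ID, use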

docker run --rm -it  --user="$(id -u):$(id -g)" -v "$(pwd)":/manim manimcommunity/manim manim test_scenes.py CircleToSquare -qm

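Running the image in the background

Instead of using the "throwaway container" approach sketched above, you can also create a named container that you can modify to your liking. First, run

docker run -it --name my-manim-container -v "$(pwd):/manim" manimcommunity/manim /bin/bash

to obtain an interactive shell inside your container, allowing you to, e.g., install further dependencies (like texlive packages using tlmgr). Exit the container as soon as you are satisfied. Then, before using it, start the container by running

docker start my-manim-container

Then, to render a scene CircleToSquare in a file test_scenes.py, call

docker exec -it --user="$(id -u):$(id -g)" my-manim-container manim test_scenes.py CircleToSquare -qm

JupyterLab

Another alternative is to use the docker image to spin up a local web server running JupyterLab, in whose Python kernel manim is installed and can be accessed via the %%manim cell magic. To use JupyterLab, run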

docker run -it -p 8888:8888 manimcommunity/manim jupyter lab --ip=0.0.0.0

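and then follow the instructions in the terminal.

Important notes

When executing manim within a Docker container, several command line flags (in particular -p (preview file) and -f (show output file in the file browser)) are not supported.

Help with Manim

If you need help installing or using Manim, feel free to reach out to our Discord Server or Reddit Community. If you would like to submit a bug report or feature request, please open an issue.

Contributing

Contributions to Manim are always welcome. In particular, there is a dire need for tests and documentation. For contribution guidelines, please see the documentation.

Most developers on the project use Poetry for management. You will want to have poetry installed and available in your environment. You can learn more about poetry and how to use it in its documentation.

How to cite Manim

We acknowledge the importance of good software to support research, and we note that research becomes more valuable when it is communicated effectively. To demonstrate the value of Manim, we ask that you cite Manim in your work. Currently, the best way to cite Manim is to cite the Manim home page using this BibTeX entry (the entry is for release v0.9.0, but it is easily modified):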

@Manual{Manim:v0.9.0,
  key =          {Manim},
  author =       {{The Manim Community Developers}},
  title =        {{Manim} -- {M}athematical {A}nimation {F}ramework ({V}ersion v0.9.0)},
  note =         {\url{https://www.manim.community}},
  year =         2021,
}

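This should render a citation that looks roughly like this:

  1. The Manim Community Developers, Manim – Mathematical Animation Framework (Version v0.9.0), 2021.

Code of Conduct

Our full code of conduct, and how we enforce it, can be read on our website.

License

The software is double-licensed under the MIT license, with copyright by 3blue1brown LLC (see LICENSE) and copyright by Manim Community Developers (see LICENSE.community).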

How do I check if a list is empty?

Question: How do I check if a list is empty?

For example, if passed the following:

a = []

How do I check to see if a is empty?


Answer 0

if not a:
  print("List is empty")

Using the implicit booleanness of the empty list is quite pythonic.
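
As a small illustration of that implicit booleanness, the sketch below wraps the check in a helper; the name is_empty is made up for this example. Note that the test is about the container, not its contents: a list holding a single falsy element is still non-empty.

```python
def is_empty(container):
    """True when a built-in container (list, tuple, dict, str, set) has no items."""
    return not container


samples = {"[]": [], "()": (), "{}": {}, "''": "", "[0]": [0], "[[]]": [[]]}
for label, value in samples.items():
    print(f"{label:<5} empty: {is_empty(value)}")
```

One caveat of this style: not None is also True, so the helper cannot tell an empty list from a missing one.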


Answer 1

The pythonic way to do it is from the PEP 8 style guide (where Yes means “recommended” and No means “not recommended”):

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No:  if len(seq):
     if not len(seq):

Answer 2

I prefer it explicitly:

if len(li) == 0:
    print('the list is empty')

This way it’s 100% clear that li is a sequence (list) and we want to test its size. My problem with if not li: ... is that it gives the false impression that li is a boolean variable.


Answer 3

This is the first google hit for “python test empty array” and similar queries, plus other people seem to be generalizing the question beyond just lists, so I thought I’d add a caveat for a different type of sequence that a lot of people might use.

Other methods don’t work for NumPy arrays

You need to be careful with NumPy arrays, because other methods that work fine for lists or other standard containers fail for NumPy arrays. I explain why below, but in short, the preferred method is to use size.

The “pythonic” way doesn’t work: Part 1

The “pythonic” way fails with NumPy arrays because NumPy tries to cast the array to an array of bools, and if x tries to evaluate all of those bools at once for some kind of aggregate truth value. But this doesn’t make any sense, so you get a ValueError:

>>> x = numpy.array([0,1])
>>> if x: print("x")
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The “pythonic” way doesn’t work: Part 2

But at least the case above tells you that it failed. If you happen to have a NumPy array with exactly one element, the if statement will “work”, in the sense that you don’t get an error. However, if that one element happens to be 0 (or 0.0, or False, …), the if statement will incorrectly result in False:

>>> x = numpy.array([0,])
>>> if x: print("x")
... else: print("No x")
No x

But clearly x exists and is not empty! This result is not what you wanted.

Using len can give unexpected results

For example,

len( numpy.zeros((1,0)) )

returns 1, even though the array has zero elements.

The numpythonic way

As explained in the SciPy FAQ, the correct method in all cases where you know you have a NumPy array is to use if x.size:

>>> x = numpy.array([0,1])
>>> if x.size: print("x")
x

>>> x = numpy.array([0,])
>>> if x.size: print("x")
... else: print("No x")
x

>>> x = numpy.zeros((1,0))
>>> if x.size: print("x")
... else: print("No x")
No x

If you’re not sure whether it might be a list, a NumPy array, or something else, you could combine this approach with the answer @dubiousjim gives to make sure the right test is used for each type. Not very “pythonic”, but it turns out that NumPy intentionally broke pythonicity in at least this sense.

If you need to do more than just check if the input is empty, and you’re using other NumPy features like indexing or math operations, it’s probably more efficient (and certainly more common) to force the input to be a NumPy array. There are a few nice functions for doing this quickly — most importantly numpy.asarray. This takes your input, does nothing if it’s already an array, or wraps your input into an array if it’s a list, tuple, etc., and optionally converts it to your chosen dtype. So it’s very quick whenever it can be, and it ensures that you just get to assume the input is a NumPy array. We usually even just use the same name, as the conversion to an array won’t make it back outside of the current scope:

x = numpy.asarray(x, dtype=numpy.double)

This will make the x.size check work in all cases I see on this page.
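
For code that may receive either a list or an array, one possible reading of the "right test for each type" idea is a small dispatcher: use size when the object exposes an integer size attribute, and fall back to len otherwise. This is a sketch, not code from the answer. The FakeArray class is a hypothetical stand-in that mimics the len/size mismatch of numpy.zeros((1, 0)), so the example runs without NumPy installed.

```python
def is_empty(obj):
    """Emptiness test that prefers an integer `size` attribute over len()."""
    size = getattr(obj, "size", None)
    if isinstance(size, int):
        return size == 0
    return len(obj) == 0


class FakeArray:
    """Stand-in for an array of shape (1, 0): len() is 1, but size is 0."""
    size = 0

    def __len__(self):
        return 1


print(is_empty([]), is_empty([0]), is_empty(FakeArray()))
```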


Answer 4

检查列表是否为空的最佳方法

例如,如果通过以下内容:

a = []

如何检查a是否为空?

简短答案:

将列表放在布尔上下文中(例如,使用ifor while语句)。它将测试False是否为空,True否则为空。例如:

if not a:                           # do this!
    print('a is an empty list')

人教版8

PEP 8是Python标准库中Python代码的官方Python样式指南,它断言:

对于序列(字符串,列表,元组),请使用以下事实:空序列为假。

Yes: if not seq:
     if seq:

No: if len(seq):
    if not len(seq):

我们应该期望标准库代码应尽可能地具有高性能和正确性。但是为什么会这样,为什么我们需要此指南?

说明

我经常从Python的新手那里看到这样的代码:

if len(a) == 0:                     # Don't do this!
    print('a is an empty list')

懒惰语言的用户可能会这样做:

if a == []:                         # Don't do this!
    print('a is an empty list')

这些在其各自的其他语言中都是正确的。在Python中,这甚至在语义上都是正确的。

但是我们认为它不是Python语言,因为Python通过布尔强制转换直接在列表对象的界面中支持这些语义。

文档中(并特别注意包含空列表[]):

默认情况下,除非对象的类定义了与该对象一起调用时__bool__()返回False__len__()方法或返回零的方法,否则该对象被视为true 。以下是大多数被视为错误的内置对象:

  • 定义为false的常量:NoneFalse
  • 任何数值类型的零:00.00jDecimal(0)Fraction(0, 1)
  • 空序列和集合:''()[]{}set()range(0)

以及数据模型文档:

object.__bool__(self)

调用实现真值测试和内置操作bool();应该返回FalseTrue。如果未定义此方法,__len__()则调用该方法( 如果已定义),并且如果其结果为非零,则将该对象视为true。如果一个类既未定义,也__len__() 未定义__bool__(),则其所有实例均被视为true。

object.__len__(self)

调用以实现内置函数len()。应该返回对象的长度,即> = 0的整数。此外,在布尔上下文中,未定义__bool__()方法且其__len__()方法返回零的对象被视为false。

所以代替这个:

if len(a) == 0:                     # Don't do this!
    print('a is an empty list')

或这个:

if a == []:                     # Don't do this!
    print('a is an empty list')

做这个:

if not a:
    print('a is an empty list')

做Pythonic通常可以提高性能:

它还清吗?(请注意,执行等效操作的时间越少越好:)

>>> import timeit
>>> min(timeit.repeat(lambda: len([]) == 0, repeat=100))
0.13775854044661884
>>> min(timeit.repeat(lambda: [] == [], repeat=100))
0.0984637276455409
>>> min(timeit.repeat(lambda: not [], repeat=100))
0.07878462291455435

对于规模而言,这是调用函数,构造并返回空列表的成本,您可以从上面使用的空度检查的成本中减去这些成本:

>>> min(timeit.repeat(lambda: [], repeat=100))
0.07074015751817342

我们看到,无论是与内建函数长度检查len相比,0 检查对空列表是太多比使用语言的内置语法记载高性能的少。

为什么?

对于len(a) == 0检查:

首先,Python必须检查全局变量以查看是否len有阴影。

然后,它必须调用函数load 0,并在Python中(而不是使用C)进行相等比较:

>>> import dis
>>> dis.dis(lambda: len([]) == 0)
  1           0 LOAD_GLOBAL              0 (len)
              2 BUILD_LIST               0
              4 CALL_FUNCTION            1
              6 LOAD_CONST               1 (0)
              8 COMPARE_OP               2 (==)
             10 RETURN_VALUE

并且对于[] == []它,它必须建立一个不必要的列表,然后再次在Python的虚拟机(而不是C)中执行比较操作。

>>> dis.dis(lambda: [] == [])
  1           0 BUILD_LIST               0
              2 BUILD_LIST               0
              4 COMPARE_OP               2 (==)
              6 RETURN_VALUE

因为列表的长度被缓存在对象实例头中,所以“ Pythonic”方式是一种更简单,更快速的检查:

>>> dis.dis(lambda: not [])
  1           0 BUILD_LIST               0
              2 UNARY_NOT
              4 RETURN_VALUE

来自C源代码和文档的证据

PyVarObject

这是PyObject对该ob_size字段的扩展。这仅用于具有长度概念的对象。这种类型通常不会出现在Python / C API中。它对应于由PyObject_VAR_HEAD宏扩展定义的字段。

Include / listobject.h中的c源:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size

对评论的回应:

我想指出,这也适用于非空的情况下,虽然它很丑陋与l=[]%timeit len(l) != 090.6纳秒±8.3纳秒,%timeit l != []55.6纳秒±3.09,%timeit not not l38.5±NS 0.372。但是,not not l尽管速度提高了三倍,但没有任何人可以享受。看起来很荒谬。但是速度胜出,
我想问题是要及时测试,因为这if l:足够了,但令人惊讶地%timeit bool(l)产生了101 ns±2.64 ns。有趣的是,没有这种惩罚就没有办法胁迫。%timeit l是没有用的,因为不会进行任何转换。

IPython的魔术%timeit在这里并非完全没有用:

In [1]: l = []                                                                  

In [2]: %timeit l                                                               
20 ns ± 0.155 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [3]: %timeit not l                                                           
24.4 ns ± 1.58 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit not not l                                                       
30.1 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

我们可以看到这里每增加一个线性成本not。我们希望看到成本ceteris paribus,即其他所有条件都相等-尽可能将其他所有条件最小化:

In [5]: %timeit if l: pass                                                      
22.6 ns ± 0.963 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: %timeit if not l: pass                                                  
24.4 ns ± 0.796 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [7]: %timeit if not not l: pass                                              
23.4 ns ± 0.793 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

现在让我们看一看一个空列表的情况:

In [8]: l = [1]                                                                 

In [9]: %timeit if l: pass                                                      
23.7 ns ± 1.06 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [10]: %timeit if not l: pass                                                 
23.6 ns ± 1.64 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [11]: %timeit if not not l: pass                                             
26.3 ns ± 1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

我们可以在这里看到的是,无论是将实际值传递bool给条件检查还是将列表本身传递给您,几乎没有什么区别,并且如果有的话,按原样提供列表会更快。

Python是用C编写的;它在C级别使用其逻辑。您用Python编写的任何内容都会变慢。除非您直接使用Python内置的机制,否则这可能会慢几个数量级。

Best way to check if a list is empty

For example, if passed the following:

a = []

How do I check to see if a is empty?

Short Answer:

Place the list in a boolean context (for example, with an if or while statement). It will test False if it is empty, and True otherwise. For example:

if not a:                           # do this!
    print('a is an empty list')

PEP 8

PEP 8, the official Python style guide for Python code in Python’s standard library, asserts:

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No: if len(seq):
    if not len(seq):

We should expect that standard library code should be as performant and correct as possible. But why is that the case, and why do we need this guidance?

Explanation

I frequently see code like this from experienced programmers new to Python:

if len(a) == 0:                     # Don't do this!
    print('a is an empty list')

And users of lazy languages may be tempted to do this:

if a == []:                         # Don't do this!
    print('a is an empty list')

These are correct in their respective other languages. And this is even semantically correct in Python.

But we consider it un-Pythonic because Python supports these semantics directly in the list object’s interface via boolean coercion.

From the docs (and note specifically the inclusion of the empty list, []):

By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object. Here are most of the built-in objects considered false:

  • constants defined to be false: None and False.
  • zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
  • empty sequences and collections: '', (), [], {}, set(), range(0)

And the datamodel documentation:

object.__bool__(self)

Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.

and

object.__len__(self)

Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn’t define a __bool__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
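
As a rough illustration of the two hooks quoted above, here is a minimal sketch. The Basket and AlwaysOn classes are made up for this example and do not come from the documentation; they only show which method decides the truth value.

```python
class Basket:
    """Hypothetical container: defines __len__ only, so truthiness follows its length."""

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


class AlwaysOn(Basket):
    """Hypothetical subclass: __bool__ takes precedence over __len__."""

    def __bool__(self):
        return True


empty_basket = Basket()
full_basket = Basket(["egg"])
forced = AlwaysOn()

print(bool(empty_basket))  # False: __len__() returned 0
print(bool(full_basket))   # True: __len__() returned 1
print(bool(forced))        # True: __bool__() wins even though len(forced) == 0
```

This is why if not a: works for any well-behaved container, not just for list.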

So instead of this:

if len(a) == 0:                     # Don't do this!
    print('a is an empty list')

or this:

if a == []:                     # Don't do this!
    print('a is an empty list')

Do this:

if not a:
    print('a is an empty list')

Doing what’s Pythonic usually pays off in performance:

Does it pay off? (Note that less time to perform an equivalent operation is better:)

>>> import timeit
>>> min(timeit.repeat(lambda: len([]) == 0, repeat=100))
0.13775854044661884
>>> min(timeit.repeat(lambda: [] == [], repeat=100))
0.0984637276455409
>>> min(timeit.repeat(lambda: not [], repeat=100))
0.07878462291455435

For scale, here’s the cost of calling the function and constructing and returning an empty list, which you might subtract from the costs of the emptiness checks used above:

>>> min(timeit.repeat(lambda: [], repeat=100))
0.07074015751817342

We see that either checking for length with the builtin function len compared to 0 or checking against an empty list is much less performant than using the builtin syntax of the language as documented.

Why?

For the len(a) == 0 check:

First Python has to check the globals to see if len is shadowed.

Then it must call the function, load 0, and do the equality comparison in Python (instead of with C):

>>> import dis
>>> dis.dis(lambda: len([]) == 0)
  1           0 LOAD_GLOBAL              0 (len)
              2 BUILD_LIST               0
              4 CALL_FUNCTION            1
              6 LOAD_CONST               1 (0)
              8 COMPARE_OP               2 (==)
             10 RETURN_VALUE

And for the [] == [] it has to build an unnecessary list and then, again, do the comparison operation in Python’s virtual machine (as opposed to C)

>>> dis.dis(lambda: [] == [])
  1           0 BUILD_LIST               0
              2 BUILD_LIST               0
              4 COMPARE_OP               2 (==)
              6 RETURN_VALUE

The “Pythonic” way is a much simpler and faster check since the length of the list is cached in the object instance header:

>>> dis.dis(lambda: not [])
  1           0 BUILD_LIST               0
              2 UNARY_NOT
              4 RETURN_VALUE

Evidence from the C source and documentation

PyVarObject

This is an extension of PyObject that adds the ob_size field. This is only used for objects that have some notion of length. This type does not often appear in the Python/C API. It corresponds to the fields defined by the expansion of the PyObject_VAR_HEAD macro.

From the C source in Include/listobject.h:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size

Response to comments:

I would point out that this is also true for the non-empty case, though it's pretty ugly. With l = [], %timeit len(l) != 0 gives 90.6 ns ± 8.3 ns, %timeit l != [] gives 55.6 ns ± 3.09 ns, and %timeit not not l gives 38.5 ns ± 0.372 ns. But there is no way anyone is going to enjoy not not l despite triple the speed. It looks ridiculous. But the speed wins out.
I suppose the problem is testing with timeit, since just if l: is sufficient, but surprisingly %timeit bool(l) yields 101 ns ± 2.64 ns. Interesting that there is no way to coerce to bool without this penalty. %timeit l is useless since no conversion would occur.

IPython magic, %timeit, is not entirely useless here:

In [1]: l = []                                                                  

In [2]: %timeit l                                                               
20 ns ± 0.155 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [3]: %timeit not l                                                           
24.4 ns ± 1.58 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit not not l                                                       
30.1 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

We can see there’s a bit of linear cost for each additional not here. We want to see the costs, ceteris paribus, that is, all else equal – where all else is minimized as far as possible:

In [5]: %timeit if l: pass                                                      
22.6 ns ± 0.963 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: %timeit if not l: pass                                                  
24.4 ns ± 0.796 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [7]: %timeit if not not l: pass                                              
23.4 ns ± 0.793 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Now let's look at the case of a non-empty list:

In [8]: l = [1]                                                                 

In [9]: %timeit if l: pass                                                      
23.7 ns ± 1.06 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [10]: %timeit if not l: pass                                                 
23.6 ns ± 1.64 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [11]: %timeit if not not l: pass                                             
26.3 ns ± 1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

What we can see here is that it makes little difference whether you pass in an actual bool to the condition check or the list itself, and if anything, giving the list, as is, is faster.

Python is written in C; it uses its logic at the C level. Anything you write in Python will be slower. And it will likely be orders of magnitude slower unless you’re using the mechanisms built into Python directly.


Answer 5

An empty list is itself considered false in truth value testing (see the Python documentation):

a = []
if a:
    print("not empty")

@Daren Thomas

EDIT: Another point against testing the empty list as False: What about polymorphism? You shouldn't depend on a list being a list. It should just quack like a duck. How are you going to get your duckCollection to quack "False" when it has no elements?

Your duckCollection should implement __bool__ (called __nonzero__ in Python 2) or __len__ so that if a: will work without problems.


Answer 6

Patrick’s (accepted) answer is right: if not a: is the right way to do it. Harley Holcombe’s answer is right that this is in the PEP 8 style guide. But what none of the answers explain is why it’s a good idea to follow the idiom—even if you personally find it’s not explicit enough or confusing to Ruby users or whatever.

Python code, and the Python community, has very strong idioms. Following those idioms makes your code easier to read for anyone experienced in Python. And when you violate those idioms, that’s a strong signal.

It’s true that if not a: doesn’t distinguish empty lists from None, or numeric 0, or empty tuples, or empty user-created collection types, or empty user-created not-quite-collection types, or single-element NumPy array acting as scalars with falsey values, etc. And sometimes it’s important to be explicit about that. And in that case, you know what you want to be explicit about, so you can test for exactly that. For example, if not a and a is not None: means “anything falsey except None”, while if len(a) != 0: means “only empty sequences—and anything besides a sequence is an error here”, and so on. Besides testing for exactly what you want to test, this also signals to the reader that this test is important.

But when you don’t have anything to be explicit about, anything other than if not a: is misleading the reader. You’re signaling something as important when it isn’t. (You may also be making the code less flexible, or slower, or whatever, but that’s all less important.) And if you habitually mislead the reader like this, then when you do need to make a distinction, it’s going to pass unnoticed because you’ve been “crying wolf” all over your code.
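
To illustrate the distinction, here is a minimal sketch. The describe helper is hypothetical (it is not from the answer above); it only shows how the more explicit tests separate cases that if not a: lumps together.

```python
def describe(a):
    """Classify a value using progressively more explicit tests."""
    if a is None:
        return "None"
    if isinstance(a, list) and not a:
        return "empty list"
    if not a:
        return "some other falsy value"
    return "truthy"


samples = [None, [], (), 0, "", [0], "x"]
results = [describe(s) for s in samples]
print(results)
```

If the code never needs to tell these cases apart, the plain if not a: says so more honestly than any of the explicit branches.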


Answer 7

Why check at all?

No one seems to have addressed questioning your need to test the list in the first place. Because you provided no additional context, I can imagine that you may not need to do this check in the first place, but are unfamiliar with list processing in Python.

I would argue that the most pythonic way is to not check at all, but rather to just process the list. That way it will do the right thing whether empty or full.

a = []

for item in a:
    print(item)  # do something with item

# rest of code

This has the benefit of handling any contents of a, while not requiring a specific check for emptiness. If a is empty, the dependent block will not execute and the interpreter will fall through to the next line.

If you do actually need to check the list for emptiness, the other answers are sufficient.


Answer 8

len() is an O(1) operation for Python lists, strings, dicts, and sets. Python internally keeps track of the number of elements in these containers.

JavaScript has a similar notion of truthy/falsy.


Answer 9

I had written:

if isinstance(a, (list, some, other, types, i, accept)) and not a:
    do_stuff

which was voted -1. I’m not sure if that’s because readers objected to the strategy or thought the answer wasn’t helpful as presented. I’ll pretend it was the latter, since—whatever counts as “pythonic”—this is the correct strategy. Unless you’ve already ruled out, or are prepared to handle cases where a is, for example, False, you need a test more restrictive than just if not a:. You could use something like this:

if isinstance(a, numpy.ndarray) and not a.size:
    do_stuff
elif isinstance(a, collections.abc.Sized) and not a:
    do_stuff

The first test is in response to @Mike's answer, above. The third line could also be replaced with:

elif isinstance(a, (list, tuple)) and not a:

if you only want to accept instances of particular types (and their subtypes), or with:

elif isinstance(a, (list, tuple)) and not len(a):

You can get away without the explicit type check, but only if the surrounding context already assures you that a is a value of the types you’re prepared to handle, or if you’re sure that types you’re not prepared to handle are going to raise errors (e.g., a TypeError if you call len on a value for which it’s undefined) that you’re prepared to handle. In general, the “pythonic” conventions seem to go this last way. Squeeze it like a duck and let it raise a DuckError if it doesn’t know how to quack. You still have to think about what type assumptions you’re making, though, and whether the cases you’re not prepared to handle properly really are going to error out in the right places. The Numpy arrays are a good example where just blindly relying on len or the boolean typecast may not do precisely what you’re expecting.
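
A minimal sketch of the "restrictive test" idea, using only the standard library. The is_empty helper and its error message are assumptions made for this example; NumPy arrays would still need the separate a.size check shown above.

```python
from collections.abc import Sized


def is_empty(a):
    """Return True only for sized containers with no elements.

    Raises TypeError for values with no notion of length, so that
    mistakes such as passing None or 0 surface at the call site.
    """
    if not isinstance(a, Sized):
        raise TypeError(f"expected a sized container, got {type(a).__name__}")
    return len(a) == 0


print(is_empty([]))      # True
print(is_empty((1, 2)))  # False
try:
    is_empty(None)
except TypeError as exc:
    print("rejected:", exc)
```

Whether raising is the right reaction depends on the caller; the point is only that the type assumption is made in one visible place.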


Answer 10

From documentation on truth value testing:

All values other than what is listed here are considered True

  • None
  • False
  • zero of any numeric type, for example, 0, 0.0, 0j.
  • any empty sequence, for example, '', (), [].
  • any empty mapping, for example, {}.
  • instances of user-defined classes, if the class defines a __bool__() or __len__() method, when that method returns the integer zero or bool value False.

As can be seen, empty list [] is falsy, so doing what would be done to a boolean value sounds most efficient:

if not a:
    print('"a" is empty!')

Answer 11

Here are a few ways you can check if a list is empty:

a = [] #the list

1) The pretty simple pythonic way:

if not a:
    print("a is empty")

In Python, empty containers such as lists, tuples, sets, and dicts are seen as False. One could simply treat the list as a predicate (returning a Boolean value), where a True value indicates that it is non-empty.

2) A more explicit way: use len() to find the length and check whether it equals 0:

if len(a) == 0:
    print("a is empty")

3) Or comparing it to an anonymous empty list:

if a == []:
    print("a is empty")

4) Yet another, rather silly, way is to use an exception with iter():

try:
    next(iter(a))
    # list has elements
except StopIteration:
    print("Error: a is empty")

Answer 12

I prefer the following:

if a == []:
    print("The list is empty.")

Answer 13

Method 1 (preferred):

if not a:
    print("Empty")

Method 2:

if len(a) == 0:
    print("Empty")

Method 3:

if a == []:
    print("Empty")

Answer 14

def list_test (L):
    if   L is None  : print('list is None')
    elif not L      : print('list is empty')
    else: print('list has %d elements' % len(L))

list_test(None)
list_test([])
list_test([1,2,3])

It is sometimes good to test for None and for emptiness separately as those are two different states. The code above produces the following output:

list is None 
list is empty 
list has 3 elements

It's worth noting, though, that None is falsy. So if you don't want a separate test for None-ness, you don't have to write one.

def list_test2 (L):
    if not L      : print('list is empty')
    else: print('list has %d elements' % len(L))

list_test2(None)
list_test2([])
list_test2([1,2,3])

produces the expected output:

list is empty
list is empty
list has 3 elements

Answer 15

Many answers have been given, and a lot of them are pretty good. I just wanted to add that the check

not a

will also pass for None and other types of empty structures. If you truly want to check for an empty list, you can do this:

if isinstance(a, list) and len(a)==0:
    print("Received an empty list")

Answer 16

We could use a simple if/else:

item_list=[]
if len(item_list) == 0:
    print("list is empty")
else:
    print("list is not empty")

Answer 17

If you want to check whether a list is non-empty:

l = []
if l:
    pass  # do your stuff.

If you want to check whether all the values in the list are truthy (note that all() returns True for an empty list):

l = ["", False, 0, '', [], {}, ()]
if all(bool(x) for x in l):
    pass  # do your stuff.

If you want to use both cases together:

def empty_list(lst):
    if len(lst) == 0:
        return False
    else:
        return all(bool(x) for x in lst)

Now you can use:

if empty_list(lst):
    # do your stuff.
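
The empty-list caveat mentioned above can be seen in a short sketch. The helper name all_truthy_and_nonempty is made up for this illustration; it is just one possible way to combine the two checks.

```python
def all_truthy_and_nonempty(lst):
    """Hypothetical helper: True only if lst has elements and every one is truthy."""
    return bool(lst) and all(lst)


print(all([]))                                 # True: vacuously true for an empty iterable
print(all(["", 0, []]))                        # False: contains falsy values
print(all_truthy_and_nonempty([]))             # False
print(all_truthy_and_nonempty([1, "x", [0]]))  # True
```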

Answer 18

Inspired by @dubiousjim's solution, I propose adding a general check of whether the value is iterable:

from collections.abc import Iterable

def is_empty(a):
    return not a and isinstance(a, Iterable)

Note: a string is considered to be iterable. Add and not isinstance(a, str) if you want the empty string to be excluded (on Python 2, check against (str, unicode) instead).

Test:

>>> is_empty('sss')
False
>>> is_empty(555)
False
>>> is_empty(0)
False
>>> is_empty('')
True
>>> is_empty([3])
False
>>> is_empty([])
True
>>> is_empty({})
True
>>> is_empty(())
True

Answer 19

print('not empty' if a else 'empty')

A little more practical:

a.pop() if a else None

And the shortest version:

if a: a.pop() 

Answer 20

From python3 onwards you can use

a == []

to check if the list is empty

EDIT: This works with Python 2.7 too.

I am not sure why there are so many complicated answers. It’s pretty clear and straightforward


Answer 21

You can even try using bool() like this:

    a = [1, 2, 3]
    print(bool(a))  # prints True
    a = []
    print(bool(a))  # prints False

I love this way of checking whether a list is empty or not.

Very handy and useful.


Answer 22

Simply write a helper function such as is_empty():

def is_empty(any_structure):
    if any_structure:
        print('Structure is not empty.')
        return False
    else:
        print('Structure is empty.')
        return True

It can be used for any data structure, such as a list, tuple, dictionary, and many more. You can then call it as many times as you like using just is_empty(any_structure).


Answer 23

A simple way is to check whether the length equals zero.

if len(a) == 0:
    print("a is empty")

Answer 24

The truth value of an empty list is False whereas for a non-empty list it is True.


Answer 25

What brought me here is a special use-case: I actually wanted a function to tell me if a list is empty or not. I wanted to avoid writing my own function or using a lambda-expression here (because it seemed like it should be simple enough):

foo = itertools.takewhile(is_not_empty, (f(x) for x in itertools.count(1)))

And, of course, there is a very natural way to do it:

foo = itertools.takewhile(bool, (f(x) for x in itertools.count(1)))

Of course, do not use bool in if (i.e., if bool(L):) because it’s implied. But, for the cases when “is not empty” is explicitly needed as a function, bool is the best choice.
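
A runnable sketch of that pattern follows. The producer f here is a made-up stand-in (the answer does not say what f does); it returns shorter and shorter lists so that the sequence eventually hits an empty one.

```python
import itertools


def f(x):
    """Hypothetical producer: returns shorter and shorter lists, eventually empty."""
    return list(range(3 - x))


# Collect results until f returns an empty (falsy) list.
foo = list(itertools.takewhile(bool, (f(x) for x in itertools.count(1))))
print(foo)  # [[0, 1], [0]]
```

Because itertools.count(1) is infinite, the only thing that ends the loop is bool returning False for the first empty list.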


Answer 26

To check whether a list is empty, you can use either of the following two ways. But remember that the first, which checks the sequence's length explicitly, is the less Pythonic way and should be avoided:

def enquiry(list1): 
    if len(list1) == 0: 
        return 0
    else: 
        return 1

# ––––––––––––––––––––––––––––––––

list1 = [] 

if enquiry(list1): 
    print ("The list isn't empty") 
else: 
    print("The list is Empty") 

# Result: "The list is Empty".

The second way is more Pythonic. It checks implicitly and is preferable to the previous one.

def enquiry(list1): 
    if not list1: 
        return True
    else: 
        return False

# ––––––––––––––––––––––––––––––––

list1 = [] 

if enquiry(list1): 
    print ("The list is Empty") 
else: 
    print ("The list isn't empty") 

# Result: "The list is Empty"

Hope this helps.


Finding the index of an item in a list

Question: Finding the index of an item in a list

Given a list ["foo", "bar", "baz"] and an item in the list "bar", how do I get its index (1) in Python?


Answer 0

>>> ["foo", "bar", "baz"].index("bar")
1

Reference: Data Structures > More on Lists

Caveats follow

Note that while this is perhaps the cleanest way to answer the question as asked, index is a rather weak component of the list API, and I can’t remember the last time I used it in anger. It’s been pointed out to me in the comments that because this answer is heavily referenced, it should be made more complete. Some caveats about list.index follow. It is probably worth initially taking a look at the documentation for it:

list.index(x[, start[, end]])

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.

The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.
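
A short sketch of how start and end behave; the letters list is just sample data chosen for this example.

```python
letters = ["a", "b", "c", "a", "b", "c"]

first = letters.index("b")          # search the whole list
second = letters.index("b", 2)      # start searching at position 2
bounded = letters.index("c", 3, 6)  # search only positions 3, 4 and 5

print(first, second, bounded)  # 1 4 5
```

Note that second is 4, not 2: the result is a position in the full list, not an offset from start.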

Linear time-complexity in list length

An index call checks every element of the list in order, until it finds a match. If your list is long, and you don’t know roughly where in the list it occurs, this search could become a bottleneck. In that case, you should consider a different data structure. Note that if you know roughly where to find the match, you can give index a hint. For instance, in this snippet, l.index(999_999, 999_990, 1_000_000) is roughly five orders of magnitude faster than straight l.index(999_999), because the former only has to search 10 entries, while the latter searches a million:

>>> import timeit
>>> timeit.timeit('l.index(999_999)', setup='l = list(range(0, 1_000_000))', number=1000)
9.356267921015387
>>> timeit.timeit('l.index(999_999, 999_990, 1_000_000)', setup='l = list(range(0, 1_000_000))', number=1000)
0.0004404920036904514
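
As one possible "different data structure", here is a sketch that builds a dict from value to first position once, so that later lookups avoid the linear scan. The build_index helper is an assumption for illustration, and it only works when the items are hashable and the list is not modified afterwards.

```python
def build_index(items):
    """Map each value to the position of its first occurrence (values must be hashable)."""
    positions = {}
    for i, item in enumerate(items):
        positions.setdefault(item, i)  # keep the first index, like list.index
    return positions


names = ["foo", "bar", "baz", "bar"]
positions = build_index(names)

print(positions["bar"])          # 1
print(positions.get("missing"))  # None instead of ValueError
```

Building the dict is itself linear, so this pays off only when the same list is searched many times.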

Only returns the index of the first match to its argument

A call to index searches through the list in order until it finds a match, and stops there. If you expect to need indices of more matches, you should use a list comprehension, or generator expression.

>>> [1, 1].index(1)
0
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> next(g)
0
>>> next(g)
2

Most places where I once would have used index, I now use a list comprehension or generator expression because they’re more generalizable. So if you’re considering reaching for index, take a look at these excellent Python features.
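
For instance, a generator expression combined with next() gives a first-match lookup with a default. The first_index helper below is a hypothetical name used only for this sketch.

```python
def first_index(items, target, default=None):
    """Hypothetical helper: index of the first match, or default if there is none."""
    return next((i for i, e in enumerate(items) if e == target), default)


print(first_index([1, 2, 1], 1))      # 0
print(first_index([1, 2, 1], 3))      # None
print(first_index([1, 2, 1], 3, -1))  # -1
```

The same generator could test any condition in place of e == target, which is what makes it more general than index.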

Throws if element not present in list

A call to index results in a ValueError if the item’s not present.

>>> [1, 1].index(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 2 is not in list

If the item might not be present in the list, you should either

  1. Check for it first with item in my_list (clean, readable approach), or
  2. Wrap the index call in a try/except block which catches ValueError (probably faster, at least when the list to search is long, and the item is usually present.)

Answer 1

One thing that is really helpful in learning Python is to use the interactive help function:

>>> help(["foo", "bar", "baz"])
Help on list object:

class list(object)
 ...

 |
 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value
 |

which will often lead you to the method you are looking for.


Answer 2

The majority of answers explain how to find a single index, but their methods do not return multiple indexes if the item is in the list multiple times. Use enumerate():

for i, j in enumerate(['foo', 'bar', 'baz']):
    if j == 'bar':
        print(i)

The index() function only returns the first occurrence, while enumerate() returns all occurrences.

As a list comprehension:

[i for i, j in enumerate(['foo', 'bar', 'baz']) if j == 'bar']

Here’s also another small solution with itertools.count() (which is pretty much the same approach as enumerate):

from itertools import count  # Python 3: the built-in zip is already lazy (Python 2 needed itertools.izip)
[i for i, j in zip(count(), ['foo', 'bar', 'baz']) if j == 'bar']

In the Python 2 timings below (which use izip), this was slightly more efficient for larger lists than using enumerate():

$ python -m timeit -s "from itertools import izip as zip, count" "[i for i, j in zip(count(), ['foo', 'bar', 'baz']*500) if j == 'bar']"
10000 loops, best of 3: 174 usec per loop
$ python -m timeit "[i for i, j in enumerate(['foo', 'bar', 'baz']*500) if j == 'bar']"
10000 loops, best of 3: 196 usec per loop

Answer 3

To get all indexes:

indexes = [i for i,x in enumerate(xs) if x == 'foo']

Answer 4

index() returns the first index of value!

 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value

def all_indices(value, qlist):
    indices = []
    idx = -1
    while True:
        try:
            idx = qlist.index(value, idx+1)
            indices.append(idx)
        except ValueError:
            break
    return indices

all_indices("foo", ["foo","bar","baz","foo"])
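
For reference, the call above returns [0, 3]. If the list is large and you only need the indices one at a time, the same index(value, start) technique can be written as a generator. This is only a sketch, and iter_indices is a hypothetical name, not part of the answer:

```python
def iter_indices(value, qlist):
    """Yield each index of value in qlist, using list.index with a start offset."""
    idx = -1
    while True:
        try:
            # Resume the search just past the previous match
            idx = qlist.index(value, idx + 1)
        except ValueError:
            return  # no further occurrences
        yield idx

result = list(iter_indices("foo", ["foo", "bar", "baz", "foo"]))
print(result)  # [0, 3]
```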

Answer 5

A problem will arise if the element is not in the list. This function handles the issue:

# if element is found it returns index of element else returns None

def find_element_in_list(element, list_element):
    try:
        index_element = list_element.index(element)
        return index_element
    except ValueError:
        return None

Answer 6

a = ["foo","bar","baz",'bar','any','much']

indexes = [index for index in range(len(a)) if a[index] == 'bar']

Answer 7

You have to add a condition that checks whether the element you’re searching for is in the list:

if 'your_element' in mylist:
    print(mylist.index('your_element'))
else:
    print(None)

Answer 8

All of the proposed functions here reproduce inherent language behavior but obscure what’s going on.

[i for i in range(len(mylist)) if mylist[i]==myterm]  # get the indices

[each for each in mylist if each==myterm]             # get the items

mylist.index(myterm) if myterm in mylist else None    # get the first index and fail quietly

Why write a function with exception handling if the language provides the methods to do what you want itself?


Answer 9

If you want all indexes, then you can use NumPy:

import numpy as np

array = [1, 2, 1, 3, 4, 5, 1]
item = 1
np_array = np.array(array)
item_index = np.where(np_array==item)
print(item_index)
# Out: (array([0, 2, 6], dtype=int64),)

It is a clear, readable solution.


Answer 10

Finding the index of an item given a list containing it in Python

For a list ["foo", "bar", "baz"] and an item in the list "bar", what’s the cleanest way to get its index (1) in Python?

Well, sure, there’s the index method, which returns the index of the first occurrence:

>>> l = ["foo", "bar", "baz"]
>>> l.index('bar')
1

There are a couple of issues with this method:

  • if the value isn’t in the list, you’ll get a ValueError
  • if more than one of the value is in the list, you only get the index for the first one

No values

If the value could be missing, you need to catch the ValueError.

You can do so with a reusable definition like this:

def index(a_list, value):
    try:
        return a_list.index(value)
    except ValueError:
        return None

And use it like this:

>>> print(index(l, 'quux'))
None
>>> print(index(l, 'bar'))
1

The downside is that you will probably need to check whether the returned value is None:

result = index(a_list, value)
if result is not None:
    do_something(result)
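
The explicit is not None comparison matters because 0 is a valid index but is falsy. A short sketch of the pitfall, reusing the index helper from above (redefined here so the snippet runs on its own):

```python
def index(a_list, value):
    try:
        return a_list.index(value)
    except ValueError:
        return None

l = ["foo", "bar", "baz"]
result = index(l, "foo")   # found at position 0

# A plain truthiness test wrongly treats index 0 as "not found"
found_by_truthiness = bool(result)
# Comparing against None distinguishes index 0 from a missing value
found_by_identity = result is not None

print(found_by_truthiness)  # False
print(found_by_identity)    # True
```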

More than one value in the list

If the value can occur more than once, list.index will not give you complete information:

>>> l.append('bar')
>>> l
['foo', 'bar', 'baz', 'bar']
>>> l.index('bar')              # nothing at index 3?
1

You can collect all the indexes with enumerate in a list comprehension:

>>> [index for index, v in enumerate(l) if v == 'bar']
[1, 3]
>>> [index for index, v in enumerate(l) if v == 'boink']
[]

If there are no occurrences, you can detect that with a boolean check of the result, or simply do nothing, since a loop over an empty result never runs:

indexes = [index for index, v in enumerate(l) if v == 'boink']
for index in indexes:
    do_something(index)

Better data munging with pandas

If you have pandas, you can easily get this information with a Series object:

>>> import pandas as pd
>>> series = pd.Series(l)
>>> series
0    foo
1    bar
2    baz
3    bar
dtype: object

A comparison check will return a series of booleans:

>>> series == 'bar'
0    False
1     True
2    False
3     True
dtype: bool

Pass that series of booleans to the series via subscript notation, and you get just the matching members:

>>> series[series == 'bar']
1    bar
3    bar
dtype: object

If you want just the indexes, the index attribute returns them as a pandas Index of integers:

>>> series[series == 'bar'].index
Int64Index([1, 3], dtype='int64')

And if you want them in a list or tuple, just pass them to the constructor:

>>> list(series[series == 'bar'].index)
[1, 3]

Yes, you could use a list comprehension with enumerate too, but that’s just not as elegant, in my opinion – you’re doing tests for equality in Python, instead of letting builtin code written in C handle it:

>>> [i for i, value in enumerate(l) if value == 'bar']
[1, 3]

Is this an XY problem?

The XY problem is asking about your attempted solution rather than your actual problem.

Why do you think you need the index given an element in a list?

If you already know the value, why do you care where it is in a list?

If the value isn’t there, catching the ValueError is rather verbose – and I prefer to avoid that.

I’m usually iterating over the list anyways, so I’ll usually keep a pointer to any interesting information, getting the index with enumerate.

If you’re munging data, you should probably be using pandas – which has far more elegant tools than the pure Python workarounds I’ve shown.

I do not recall needing list.index, myself. However, I have looked through the Python standard library, and I see some excellent uses for it.

There are many, many uses for it in idlelib, for GUI and text parsing.

The keyword module uses it to find comment markers in the module to automatically regenerate the list of keywords in it via metaprogramming.

In Lib/mailbox.py it seems to be using it like an ordered mapping:

key_list[key_list.index(old)] = new

and

del key_list[key_list.index(key)]

In Lib/http/cookiejar.py, it seems to be used to convert a month name into its 1-based month number:

mon = MONTHS_LOWER.index(mon.lower())+1

In Lib/tarfile.py, as in distutils, it is used to get a slice up to an item:

members = members[:members.index(tarinfo)]

In Lib/pickletools.py:

numtopop = before.index(markobject)

What these usages seem to have in common is that they seem to operate on lists of constrained sizes (important because of O(n) lookup time for list.index), and they’re mostly used in parsing (and UI in the case of Idle).

While there are use-cases for it, they are fairly uncommon. If you find yourself looking for this answer, ask yourself if what you’re doing is the most direct usage of the tools provided by the language for your use-case.


Answer 11

All indexes with the zip function:

get_indexes = lambda x, xs: [i for (y, i) in zip(xs, range(len(xs))) if x == y]

print(get_indexes(2, [1, 2, 3, 4, 5, 6, 3, 2, 3, 2]))
print(get_indexes('f', 'xsfhhttytffsafweef'))

Answer 12

Getting all the occurrences and the position of one or more (identical) items in a list

enumerate(alist) yields (n, x) pairs, so you can keep n, the index, whenever the element x equals what you are looking for.

>>> alist = ['foo', 'spam', 'egg', 'foo']
>>> foo_indexes = [n for n,x in enumerate(alist) if x=='foo']
>>> foo_indexes
[0, 3]
>>>

Let’s turn this into a function, indexlist

This function takes the item and the list as arguments and returns the positions of the item in the list, as we saw before.

def indexlist(item2find, list_or_string):
  "Returns all indexes of an item in a list or a string"
  return [n for n,item in enumerate(list_or_string) if item==item2find]

print(indexlist("1", "010101010"))

Output


[1, 3, 5, 7]

Simple

for n, i in enumerate([1, 2, 3, 4, 1]):
    if i == 1:
        print(n)

Output:

0
4

Answer 13

You can simply go with:

a = [['hand', 'head'], ['phone', 'wallet'], ['lost', 'stock']]
b = ['phone', 'lost']

res = [[x[0] for x in a].index(y) for y in b]
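
Here res holds, for each string in b, the position in a of the sublist whose first element matches. The one-liner rebuilds the list of first elements for every lookup. A sketch that builds a lookup table once instead (first_to_pos is an illustrative name, and the sketch assumes the first elements are unique and hashable):

```python
a = [['hand', 'head'], ['phone', 'wallet'], ['lost', 'stock']]
b = ['phone', 'lost']

# Map each sublist's first element to that sublist's position in a
first_to_pos = {sub[0]: pos for pos, sub in enumerate(a)}

# Look up each wanted item; a missing item raises KeyError here,
# where the original one-liner would raise ValueError
res = [first_to_pos[y] for y in b]
print(res)  # [1, 2]
```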

Answer 14

Another option

>>> a = ['red', 'blue', 'green', 'red']
>>> b = 'red'
>>> offset = 0;
>>> indices = list()
>>> for i in range(a.count(b)):
...     indices.append(a.index(b,offset))
...     offset = indices[-1]+1
... 
>>> indices
[0, 3]
>>> 

Answer 15

And now, for something completely different…

… like confirming the existence of the item before getting the index. The nice thing about this approach is the function always returns a list of indices — even if it is an empty list. It works with strings as well.

def indices(l, val):
    """Always returns a list containing the indices of val in l"""
    retval = []
    last = 0
    while val in l[last:]:
        i = l[last:].index(val)
        retval.append(last + i)
        last += i + 1
    return retval

l = ['bar','foo','bar','baz','bar','bar']
q = 'bar'
print indices(l,q)
print indices(l,'bat')
print indices('abcdaababb','a')

When pasted into an interactive python window:

Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def indices(the_list, val):
...     """Always returns a list containing the indices of val in the_list"""
...     retval = []
...     last = 0
...     while val in the_list[last:]:
...             i = the_list[last:].index(val)
...             retval.append(last + i)
...             last += i + 1   
...     return retval
... 
>>> l = ['bar','foo','bar','baz','bar','bar']
>>> q = 'bar'
>>> print indices(l,q)
[0, 2, 4, 5]
>>> print indices(l,'bat')
[]
>>> print indices('abcdaababb','a')
[0, 4, 5, 7]
>>> 

Update

After another year of heads-down python development, I’m a bit embarrassed by my original answer, so to set the record straight, one can certainly use the above code; however, the much more idiomatic way to get the same behavior would be to use list comprehension, along with the enumerate() function.

Something like this:

def indices(l, val):
    """Always returns a list containing the indices of val in the_list"""
    return [index for index, value in enumerate(l) if value == val]

l = ['bar','foo','bar','baz','bar','bar']
q = 'bar'
print indices(l,q)
print indices(l,'bat')
print indices('abcdaababb','a')

Which, when pasted into an interactive python window yields:

Python 2.7.14 |Anaconda, Inc.| (default, Dec  7 2017, 11:07:58) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def indices(l, val):
...     """Always returns a list containing the indices of val in the_list"""
...     return [index for index, value in enumerate(l) if value == val]
... 
>>> l = ['bar','foo','bar','baz','bar','bar']
>>> q = 'bar'
>>> print indices(l,q)
[0, 2, 4, 5]
>>> print indices(l,'bat')
[]
>>> print indices('abcdaababb','a')
[0, 4, 5, 7]
>>> 

And now, after reviewing this question and all the answers, I realize that this is exactly what FMc suggested in his earlier answer. At the time I originally answered this question, I didn’t even see that answer, because I didn’t understand it. I hope that my somewhat more verbose example will aid understanding.

If the single line of code above still doesn’t make sense to you, I highly recommend you Google ‘python list comprehension’ and take a few minutes to familiarize yourself. It’s just one of the many powerful features that make it a joy to use Python to develop code.


Answer 16

A variant on the answer from FMc and user7177 will give a dict that can return all indices for any entry:

>>> a = ['foo','bar','baz','bar','any', 'foo', 'much']
>>> l = dict(zip(set(a), map(lambda y: [i for i,z in enumerate(a) if z == y ], set(a))))
>>> l['foo']
[0, 5]
>>> l ['much']
[6]
>>> l
{'baz': [2], 'foo': [0, 5], 'bar': [1, 3], 'any': [4], 'much': [6]}
>>> 

You could also use this as a one-liner to get all indices for a single entry. There are no guarantees of efficiency, though I did use set(a) to reduce the number of times the lambda is called.


Answer 17

This solution is not as powerful as others, but if you’re a beginner and only know about for loops, it’s still possible to find the first index of an item while avoiding the ValueError:

def find_element(p,t):
    i = 0
    for e in p:
        if e == t:
            return i
        else:
            i +=1
    return -1
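
The manual counter can be dropped by letting enumerate supply the position. A sketch of the same first-match search, with the same -1 return value when nothing matches:

```python
def find_element(p, t):
    """Return the first index of t in p, or -1 if t does not occur."""
    for i, e in enumerate(p):
        if e == t:
            return i
    return -1

print(find_element(["foo", "bar", "baz"], "bar"))   # 1
print(find_element(["foo", "bar", "baz"], "quux"))  # -1
```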

Answer 18

Finding index of item x in list L:

idx = L.index(x) if (x in L) else -1

Answer 19

Since Python lists are zero-based, we can use the zip built-in function as follows:

>>> [i for i,j in zip(range(len(haystack)), haystack) if j == 'needle' ]

where “haystack” is the list in question and “needle” is the item to look for.

(Note: Here we are iterating using i to get the indexes, but if we need rather to focus on the items we can switch to j.)


Answer 20

name = "bar"
pairs = [["foo", 1], ["bar", 2], ["baz", 3]]
# Collect the first element of each pair so it can be searched by name
names = [item[0] for item in pairs]
print(names)
try:
    location = names.index(name)
except ValueError:
    # name is not in the list
    location = -1
print(location)

This also handles the case where the string is not in the list: location is then set to -1.


Answer 21

Python’s index() method raises a ValueError if the item is not found. You can instead make it behave like JavaScript’s indexOf() function, which returns -1 if the item is not found:

try:
    index = array.index('search_keyword')
except ValueError:
    index = -1
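
One caveat with the -1 convention: unlike in JavaScript, -1 is a valid list index in Python and refers to the last element, so an unchecked result can silently select the wrong item. A small sketch (the list contents are made up for illustration):

```python
array = ["foo", "bar", "baz"]

try:
    index = array.index("search_keyword")
except ValueError:
    index = -1

# -1 is a legal index: this silently returns the last element
wrong_item = array[index]
print(wrong_item)  # baz

# Check the sentinel explicitly before using the result
item = array[index] if index != -1 else None
print(item)  # None
```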

Answer 22

There is a more functional answer to this.

list(filter(lambda x: x[1]=="bar",enumerate(["foo", "bar", "baz", "bar", "baz", "bar", "a", "b", "c"])))

More generic form:

def get_index_of(lst, element):
    return list(map(lambda x: x[0],\
       (list(filter(lambda x: x[1]==element, enumerate(lst))))))
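
Note that the first expression returns (index, value) pairs, while get_index_of returns only the indexes. A sketch showing both results (the definition is repeated, slightly simplified, so the snippet is self-contained; words is an illustrative list):

```python
def get_index_of(lst, element):
    # Keep the (index, value) pairs whose value matches, then take the index
    return list(map(lambda x: x[0],
                    filter(lambda x: x[1] == element, enumerate(lst))))

words = ["foo", "bar", "baz", "bar"]

pairs = list(filter(lambda x: x[1] == "bar", enumerate(words)))
print(pairs)                       # [(1, 'bar'), (3, 'bar')]
print(get_index_of(words, "bar"))  # [1, 3]
```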

Answer 23

Let’s call your list lst. You can convert lst to a NumPy array and then use numpy.where to get the index of the chosen item in the list:

import numpy as np

lst = ["foo", "bar", "baz"]  # lst is a plain Python list
print(np.where(np.array(lst) == 'bar')[0][0])

# Output: 1

Answer 24

For those coming from another language, like me, a simple loop may be easier to understand and use:

mylist = ["foo", "bar", "baz", "bar"]
newlist = enumerate(mylist)
for index, item in newlist:
  if item == "bar":
    print(index, item)

I am thankful for the question “So what exactly does enumerate do?”, which helped me to understand.


Answer 25

If you only need to find an index once, the index method is fine. However, if you are going to search your data more than once, I recommend the bisect module. Keep in mind that the data must be sorted for bisect to work, so you sort the data once and then use bisect for each lookup. On my machine, the bisect module is about 20 times faster than the index method.

Here is an example of code using Python 3.8 and above syntax:

import bisect
from timeit import timeit

def bisect_search(container, value):
    return (
      index 
      if (index := bisect.bisect_left(container, value)) < len(container) 
      and container[index] == value else -1
    )

data = list(range(1000))
# value to search
value = 666

# times to test
ttt = 1000

t1 = timeit(lambda: data.index(value), number=ttt)
t2 = timeit(lambda: bisect_search(data, value), number=ttt)

print(f"{t1=:.4f}, {t2=:.4f}, diffs {t1/t2=:.2f}")

Output:

t1=0.0400, t2=0.0020, diffs t1/t2=19.60
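
Because the data must be sorted, equal values sit next to each other, so bisect can also return every index of a value. A sketch using bisect_left and bisect_right (bisect_all is a hypothetical helper name, not from the answer):

```python
import bisect

def bisect_all(container, value):
    """Return all indexes of value in a sorted container (empty list if absent)."""
    lo = bisect.bisect_left(container, value)   # first position of value
    hi = bisect.bisect_right(container, value)  # one past the last position
    return list(range(lo, hi))

data = [1, 3, 3, 3, 7, 9]
print(bisect_all(data, 3))  # [1, 2, 3]
print(bisect_all(data, 4))  # []
```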

Answer 26

If performance is of concern:

Numerous answers mention that the built-in list.index(item) method is an O(n) algorithm. That is fine if you need to perform the lookup once. But if you need to access the indices of elements many times, it makes more sense to first build a dictionary of item-index pairs (an O(n) step) and then look up each index in O(1) whenever you need it.

If you are sure that the items in your list are never repeated, you can easily:

myList = ["foo", "bar", "baz"]

# Create the dictionary
myDict = dict((e,i) for i,e in enumerate(myList))

# Lookup
myDict["bar"] # Returns 1
# myDict.get("blah") if you don't want an error to be raised if element not found.

If you may have duplicate elements, and need to return all of their indices:

from collections import defaultdict as dd
myList = ["foo", "bar", "bar", "baz", "foo"]

# Create the dictionary
myDict = dd(list)
for i,e in enumerate(myList):
    myDict[e].append(i)

# Lookup
myDict["foo"] # Returns [0, 4]
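
A caveat for the first variant: if the list does contain duplicates, later pairs overwrite earlier ones, so the dictionary ends up holding the last index of each item rather than the first. If you want list.index semantics (first occurrence), one option is setdefault. This is a sketch, and the names last_index and first_index are illustrative:

```python
myList = ["foo", "bar", "bar", "baz", "foo"]

# Later duplicates overwrite earlier entries: this keeps the LAST index
last_index = dict((e, i) for i, e in enumerate(myList))

# setdefault only stores a key the first time it is seen: keeps the FIRST index
first_index = {}
for i, e in enumerate(myList):
    first_index.setdefault(e, i)

print(last_index["foo"])   # 4
print(first_index["foo"])  # 0
```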

Answer 27

As indicated by @TerryA, many answers discuss how to find one index.

more_itertools is a third-party library with tools to locate multiple indices within an iterable.

Given

import more_itertools as mit


iterable = ["foo", "bar", "baz", "ham", "foo", "bar", "baz"]

Code

Find indices of multiple observations:

list(mit.locate(iterable, lambda x: x == "bar"))
# [1, 5]

Test multiple items:

list(mit.locate(iterable, lambda x: x in {"bar", "ham"}))
# [1, 3, 5]

See also more options with more_itertools.locate. Install via > pip install more_itertools.


Answer 28

Use a dictionary: process the list once, appending each item’s index to the dictionary entry for that item.

from collections import defaultdict

index_dict = defaultdict(list)    
word_list =  ['foo','bar','baz','bar','any', 'foo', 'much']

for word_index in range(len(word_list)) :
    index_dict[word_list[word_index]].append(word_index)

word_index_to_find = 'foo'       
print(index_dict[word_index_to_find])

# output :  [0, 5]

Answer 29

In my opinion, ["foo", "bar", "baz"].index("bar") is good, but it isn’t enough, because if “bar” isn’t in the list, a ValueError is raised. So you can use this function:

def find_index(arr, name):
    try:
        return arr.index(name)
    except ValueError:
        return -1

if __name__ == '__main__':
    print(find_index(["foo", "bar", "baz"], "bar"))

The result is:

1

If name isn’t in arr, the function returns -1. For example:

print(find_index(["foo", "bar", "baz"], "fooo"))

-1


Iterating over dictionaries using 'for' loops

Question: Iterating over dictionaries using 'for' loops

I am a bit puzzled by the following code:

d = {'x': 1, 'y': 2, 'z': 3} 
for key in d:
    print key, 'corresponds to', d[key]

What I don’t understand is the key portion. How does Python recognize that it needs only to read the key from the dictionary? Is key a special word in Python? Or is it simply a variable?


Answer 0

key is just a variable name.

for key in d:

will simply loop over the keys in the dictionary, rather than the keys and values. To loop over both key and value you can use the following:

For Python 3.x:

for key, value in d.items():

For Python 2.x:

for key, value in d.iteritems():

To test for yourself, change the word key to poop.

In Python 3.x, iteritems() was replaced with simply items(), which returns a set-like view backed by the dict, like iteritems() but even better. This is also available in 2.7 as viewitems().

The operation items() will work for both 2 and 3, but in 2 it will return a list of the dictionary’s (key, value) pairs, which will not reflect changes to the dict that happen after the items() call. If you want the 2.x behavior in 3.x, you can call list(d.items()).
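
The difference between a live view and a list snapshot can be seen directly in Python 3. A minimal sketch (the variable names view and snapshot are illustrative):

```python
d = {'x': 1, 'y': 2}

view = d.items()            # live view backed by the dict
snapshot = list(d.items())  # independent copy, like Python 2's items()

d['z'] = 3                  # change the dict after both were created

print(len(view))         # 3
print(len(snapshot))     # 2
print(('z', 3) in view)  # True
```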


Answer 1

It’s not that key is a special word, but that dictionaries implement the iterator protocol. You could do this in your class, e.g. see this question for how to build class iterators.

In the case of dictionaries, it’s implemented at the C level. The details are available in PEP 234. In particular, the section titled “Dictionary Iterators”:

  • Dictionaries implement a tp_iter slot that returns an efficient iterator that iterates over the keys of the dictionary. […] This means that we can write

    for k in dict: ...
    

    which is equivalent to, but much faster than

    for k in dict.keys(): ...
    

    as long as the restriction on modifications to the dictionary (either by the loop or by another thread) are not violated.

  • Add methods to dictionaries that return different kinds of iterators explicitly:

    for key in dict.iterkeys(): ...
    
    for value in dict.itervalues(): ...
    
    for key, value in dict.iteritems(): ...
    

    This means that for x in dict is shorthand for for x in dict.iterkeys().

In Python 3, dict.iterkeys(), dict.itervalues() and dict.iteritems() are no longer supported. Use dict.keys(), dict.values() and dict.items() instead.
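As a rough illustration of "you could do this in your class", here is a sketch of a container that supports for ... in ... by defining __iter__. The Inventory class and its contents are hypothetical and not taken from the linked question.

```python
class Inventory:
    """Hypothetical container that iterates over its keys, like a dict."""

    def __init__(self, **counts):
        self._counts = dict(counts)

    def __iter__(self):
        # Delegate to the underlying dict's key iterator.
        return iter(self._counts)

    def __getitem__(self, name):
        return self._counts[name]


inv = Inventory(apples=3, pears=5)
names = [name for name in inv]  # any iteration context calls __iter__
print(names)
```

Because __iter__ returns the dict's own key iterator, the class behaves like a dict in a for loop without any special variable name.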


Answer 2

Iterating over a dict iterates through its keys in no particular order, as you can see here:

Edit: (This is no longer the case in CPython 3.6, where dicts keep insertion order; the language guarantees that order from Python 3.7 onward)

>>> d = {'x': 1, 'y': 2, 'z': 3} 
>>> list(d)
['y', 'x', 'z']
>>> d.keys()
['y', 'x', 'z']

For your example, it is a better idea to use dict.items():

>>> d.items()
[('y', 2), ('x', 1), ('z', 3)]

In Python 2 this gives you a list of tuples (in Python 3, a view that yields the same tuples). When you loop over them like this, each tuple is unpacked into k and v automatically:

for k,v in d.items():
    print(k, 'corresponds to', v)

Using k and v as variable names when looping over a dict is quite common if the body of the loop is only a few lines. For more complicated loops it may be a good idea to use more descriptive names:

for letter, number in d.items():
    print(letter, 'corresponds to', number)

It’s a good idea to get into the habit of using format strings:

for letter, number in d.items():
    print('{0} corresponds to {1}'.format(letter, number))
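On Python 3.6 or later, the same habit can be expressed with f-strings. This is only a sketch of an alternative spelling, not something the answer itself uses; the lines list is an illustrative helper.

```python
d = {'x': 1, 'y': 2, 'z': 3}

# f-strings interpolate the loop variables directly into the text.
lines = [f'{letter} corresponds to {number}' for letter, number in d.items()]
for line in lines:
    print(line)
```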

Answer 3

key is simply a variable.

For Python2.X:

d = {'x': 1, 'y': 2, 'z': 3} 
for my_var in d:
    print my_var, 'corresponds to', d[my_var]

… or better,

d = {'x': 1, 'y': 2, 'z': 3} 
for the_key, the_value in d.iteritems():
    print the_key, 'corresponds to', the_value

For Python3.X:

d = {'x': 1, 'y': 2, 'z': 3} 
for the_key, the_value in d.items():
    print(the_key, 'corresponds to', the_value)

Answer 4

When you iterate through a dictionary using the for ... in ... syntax, it always iterates over the keys (the values are accessible using dictionary[key]).

To iterate over key-value pairs, in Python 2 use for k,v in s.iteritems(), and in Python 3 for k,v in s.items().


Answer 5

This is a very common looping idiom. in is an operator. For when to use for key in dict and when it must be for key in dict.keys() see David Goodger’s Idiomatic Python article (archived copy).


Answer 6

Iterating over dictionaries using ‘for’ loops

d = {'x': 1, 'y': 2, 'z': 3} 
for key in d:
    ...

How does Python recognize that it needs only to read the key from the dictionary? Is key a special word in Python? Or is it simply a variable?

It’s not just for loops. The important word here is “iterating”.

A dictionary is a mapping of keys to values:

d = {'x': 1, 'y': 2, 'z': 3} 

Any time we iterate over it, we iterate over the keys. The variable name key is only intended to be descriptive – and it is quite apt for the purpose.

This happens in a list comprehension:

>>> [k for k in d]
['x', 'y', 'z']

It happens when we pass the dictionary to list (or any other collection type object):

>>> list(d)
['x', 'y', 'z']

The way Python iterates is, in a context where it needs to, it calls the __iter__ method of the object (in this case the dictionary) which returns an iterator (in this case, a keyiterator object):

>>> d.__iter__()
<dict_keyiterator object at 0x7fb1747bee08>

We shouldn’t call these special methods ourselves. Instead, use the corresponding builtin function, iter:

>>> key_iterator = iter(d)
>>> key_iterator
<dict_keyiterator object at 0x7fb172fa9188>

Iterators have a __next__ method – but we call it with the builtin function, next:

>>> next(key_iterator)
'x'
>>> next(key_iterator)
'y'
>>> next(key_iterator)
'z'
>>> next(key_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

When an iterator is exhausted, it raises StopIteration. This is how Python knows to exit a for loop, or a list comprehension, or a generator expression, or any other iterative context. Once an iterator raises StopIteration it will always raise it – if you want to iterate again, you need a new one.

>>> list(key_iterator)
[]
>>> new_key_iterator = iter(d)
>>> list(new_key_iterator)
['x', 'y', 'z']
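Putting the pieces above together, a for loop over a dict can be sketched by hand with iter, next and StopIteration. This is an approximation of what the interpreter does, written out for illustration; the seen list simply stands in for a loop body.

```python
d = {'x': 1, 'y': 2, 'z': 3}

seen = []
iterator = iter(d)          # what the for statement does first
while True:
    try:
        key = next(iterator)
    except StopIteration:   # the signal that ends the loop
        break
    seen.append(key)        # stands in for the loop body

print(seen)
```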

Returning to dicts

We’ve seen dicts iterating in many contexts. What we’ve seen is that any time we iterate over a dict, we get the keys. Back to the original example:

d = {'x': 1, 'y': 2, 'z': 3} 
for key in d:
    ...

If we change the variable name, we still get the keys. Let’s try it:

>>> for each_key in d:
...     print(each_key, '=>', d[each_key])
... 
x => 1
y => 2
z => 3

If we want to iterate over the values, we need to use the .values method of dicts, or for both together, .items:

>>> list(d.values())
[1, 2, 3]
>>> list(d.items())
[('x', 1), ('y', 2), ('z', 3)]

In the example given, it would be more efficient to iterate over the items like this:

for a_key, corresponding_value in d.items():
    print(a_key, corresponding_value)

But for academic purposes, the question’s example is just fine.


Answer 7

I have a use case where I need to iterate through the dict and get each key-value pair together with an index indicating where I am. This is how I do it:

d = {'x': 1, 'y': 2, 'z': 3} 
for i, (key, value) in enumerate(d.items()):
    print(i, key, value)

Note that the parentheses around key, value are important. Without them you get a ValueError: “not enough values to unpack”.
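A small sketch of why the parentheses matter: enumerate yields 2-tuples of the form (index, (key, value)), so unpacking into three flat names fails. The pairs and message variables are only there to make the behaviour visible.

```python
d = {'x': 1, 'y': 2, 'z': 3}

# enumerate yields 2-tuples of (index, (key, value)).
pairs = list(enumerate(d.items()))

try:
    for i, key, value in enumerate(d.items()):  # missing parentheses
        pass
except ValueError as exc:
    message = str(exc)

print(pairs[0])
print(message)
```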


Answer 8

You can check the implementation of CPython’s dict type on GitHub. This is the signature of the function that implements the dict iterator:

_PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey,
             PyObject **pvalue, Py_hash_t *phash)

CPython dictobject.c


Answer 9

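To iterate over keys while modifying the dict, it is slower but safer to iterate over a copy of the keys: my_dict.keys() in Python 2 (which returns a list), or list(my_dict) in Python 3 (where keys() is a live view). If you tried to do something like this:

for key in my_dict:
    my_dict[key+"-1"] = my_dict[key]-1

it would raise a RuntimeError ("dictionary changed size during iteration") because you are adding keys while iterating over the dict. If you are absolutely set on reducing time, use the for key in my_dict way, but you have been warned ;).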
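The following sketch, assuming Python 3, shows both outcomes side by side. In Python 3, keys() returns a live view, so taking a snapshot with list(my_dict) is one way to get the safe behaviour; the broken and error names are illustrative.

```python
my_dict = {'a': 1, 'b': 2}

# Adding keys while iterating over the dict itself fails.
broken = dict(my_dict)
try:
    for key in broken:
        broken[key + "-1"] = broken[key] - 1
    error = None
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot of the keys is safe.
for key in list(my_dict):
    my_dict[key + "-1"] = my_dict[key] - 1

print(error)
print(my_dict)
```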


Answer 10

This prints the items sorted by value, in ascending order.

d = {'x': 3, 'y': 1, 'z': 2}
def by_value(item):
    return item[1]

for key, value in sorted(d.items(), key=by_value):
    print(key, '->', value)

Output: