Tag Archives: Python

Converting a Unicode string to a string in Python (containing extra symbols)

Question: Converting a Unicode string to a string in Python (containing extra symbols)

How do you convert a Unicode string (containing extra characters like £ $, etc.) into a Python string?


Answer 0

title = u"Klüft skräms inför på fédéral électoral große"
import unicodedata
unicodedata.normalize('NFKD', title).encode('ascii','ignore')
'Kluft skrams infor pa federal electoral groe'

Answer 1

You can use encode to ASCII if you don’t need to translate the non-ASCII characters:

>>> a=u"aaaàçççñññ"
>>> type(a)
<type 'unicode'>
>>> a.encode('ascii','ignore')
'aaa'
>>> a.encode('ascii','replace')
'aaa???????'
>>>

Answer 2

>>> text=u'abcd'
>>> str(text)
'abcd'

This works if the string only contains ASCII characters.
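
For example (Python 2; the exact traceback wording may vary), str() fails as soon as a non-ASCII character appears:

>>> text = u'abcd£'
>>> str(text)
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 4: ordinal not in range(128)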


Answer 3

If you have a Unicode string, and you want to write this to a file, or other serialised form, you must first encode it into a particular representation that can be stored. There are several common Unicode encodings, such as UTF-16 (uses two bytes for most Unicode characters) or UTF-8 (1-4 bytes / codepoint depending on the character), etc. To convert that string into a particular encoding, you can use:

>>> s = u'£10'
>>> s.encode('utf8')
'\xc2\xa310'
>>> s.encode('utf16')
'\xff\xfe\xa3\x001\x000\x00'

This raw string of bytes can be written to a file. However, note that when reading it back, you must know what encoding it is in and decode it using that same encoding.

When writing to files, you can get rid of this manual encode/decode process by using the codecs module. So, to open a file that encodes all Unicode strings into UTF-8, use:

import codecs
f = codecs.open('path/to/file.txt','w','utf8')
f.write(my_unicode_string)  # Stored on disk as UTF-8

Do note that anything else that is using these files must understand what encoding the file is in if they want to read them. If you are the only one doing the reading/writing this isn’t a problem, otherwise make sure that you write in a form understandable by whatever else uses the files.

In Python 3, this form of file access is the default, and the built-in open function will take an encoding parameter and always translate to/from Unicode strings (the default string object in Python 3) for files opened in text mode.
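
A minimal Python 3 sketch of that behaviour (the file path is hypothetical):

# Python 3: text-mode files encode on write and decode on read
with open('path/to/file.txt', 'w', encoding='utf8') as f:
    f.write('£10')                    # a str goes in, UTF-8 bytes land on disk

with open('path/to/file.txt', 'r', encoding='utf8') as f:
    print(f.read())                   # decoded back to the str '£10' using the same encoding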


Answer 4

Here is an example:

>>> u = u'€€€'
>>> s = u.encode('utf8')
>>> s
'\xe2\x82\xac\xe2\x82\xac\xe2\x82\xac'

Answer 5

Well, if you’re willing/ready to switch to Python 3 (which you may not be due to the backwards incompatibility with some Python 2 code), you don’t have to do any converting; all text in Python 3 is represented with Unicode strings, which also means that there’s no more usage of the u'<text>' syntax. You also have what are, in effect, strings of bytes, which are used to represent data (which may be an encoded string).

http://docs.python.org/3.1/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit

(Of course, if you’re currently using Python 3, then the problem is likely something to do with how you’re attempting to save the text to a file.)
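
A minimal Python 3 sketch of the text/data split described above:

text = 'Klüft £10'                    # str: Unicode text, no u'' prefix needed
data = text.encode('utf-8')           # bytes: b'Kl\xc3\xbcft \xc2\xa310'
assert data.decode('utf-8') == text   # decoding with the same encoding restores the str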


Answer 6

Here is an example code

import unicodedata    
raw_text = u"here $%6757 dfgdfg"
convert_text = unicodedata.normalize('NFKD', raw_text).encode('ascii','ignore')

Answer 7

The file contains a unicode-escaped string:

\"message\": \"\\u0410\\u0432\\u0442\\u043e\\u0437\\u0430\\u0446\\u0438\\u044f .....\",

For me:

 f = open("56ad62-json.log", encoding="utf-8")
 qq=f.readline() 

 print(qq)                          
 {"log":\"message\": \"\\u0410\\u0432\\u0442\\u043e\\u0440\\u0438\\u0437\\u0430\\u0446\\u0438\\u044f \\u043f\\u043e\\u043b\\u044c\\u0437\\u043e\\u0432\\u0430\\u0442\\u0435\\u043b\\u044f\"}

(qq.encode().decode("unicode-escape").encode().decode("unicode-escape")) 
# '{"log":"message": "Авторизация пользователя"}\n'

Answer 8

No answer worked for my case, where I had a string variable containing unicode chars, and none of the encode/decode approaches explained here did the job.

If I run this in a terminal

echo "no me llama mucho la atenci\u00f3n"

or

python3
>>> print("no me llama mucho la atenci\u00f3n")

The output is correct:

output: no me llama mucho la atención

But working with scripts loading this string variable didn’t work.

This is what worked on my case, in case helps anybody:

string_to_convert = "no me llama mucho la atenci\u00f3n"
print(json.dumps(json.loads(r'"%s"' % string_to_convert), ensure_ascii=False))
output: no me llama mucho la atención
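
If the variable really holds literal \u00f3-style escape sequences (rather than the already-decoded character), the unicode_escape codec is another option. This is only a sketch and assumes the remaining text is Latin-1-safe:

import codecs

raw = "no me llama mucho la atenci\\u00f3n"   # note the literal backslash-u in the data
print(codecs.decode(raw, 'unicode_escape'))   # no me llama mucho la atención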

Dealing with multiple Python versions and PIP?

Question: Dealing with multiple Python versions and PIP?

Is there any way to make pip play well with multiple versions of Python? For example, I want to use pip to explicitly install things to either my site 2.5 installation or my site 2.6 installation.

For example, with easy_install, I use easy_install-2.{5,6}.

And, yes — I know about virtualenv, and no — it’s not a solution to this particular problem.


Answer 0

The current recommendation is to use python -m pip, where python is the version of Python you would like to use. This is the recommendation because it works across all versions of Python, and in all forms of virtualenv. For example:

# The system default python:
$ python -m pip install fish

# A virtualenv's python:
$ .env/bin/python -m pip install fish

# A specific version of python:
$ python-3.6 -m pip install fish

Previous answer, left for posterity:

Since version 0.8, Pip supports pip-{version}. You can use it the same as easy_install-{version}:

$ pip-2.5 install myfoopackage
$ pip-2.6 install otherpackage
$ pip-2.7 install mybarpackage

EDIT: pip changed its schema to use pipVERSION instead of pip-VERSION in version 1.5. You should use the following if you have pip >= 1.5:

$ pip2.6 install otherpackage
$ pip2.7 install mybarpackage

Check https://github.com/pypa/pip/pull/1053 for more details
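
As a quick sanity check, pip can report which interpreter it is bound to; the output below is only illustrative:

$ python -m pip --version
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)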




Answer 1

On Windows, you can execute the pip module using a given Python version through the Python launcher, py.exe, if you chose to install it during Python 3 setup.

py -3 -m pip install packagename
py -2 -m pip install packagename

You can be even more specific and request an exact sub-version of Python:

py -3.6 -m pip install packagename

To get a list of all installed Python versions available through the launcher, run:

py --list

Alternatively, you can launch the desired Python executable directly:

C:/path/to/specific/python.exe -m pip install packagename

Answer 2

/path/to/python2.{5,6} /path/to/pip install PackageName doesn’t work?

For this to work on any python version that doesn’t have pip already installed you need to download pip and do python*version* setup.py install. For example python3.3 setup.py install. This resolves the import error in the comments. (As suggested by @hbdgaf)


Answer 3

I had python 2.6 installed by default (Amazon EC2 AMI), but needed python2.7 plus some external packages for my application. Assuming you have already installed python2.7 alongside the default python (2.6 in my case), here is how to install pip and packages for the non-default python2.7:

Install pip for your python version:

curl -O https://bootstrap.pypa.io/get-pip.py
python27 get-pip.py

Use specific pip version to install packages:

pip2.7 install mysql-connector-python --allow-external mysql-connector-python

Answer 4

It worked for me on Windows this way:

  1. I changed the name of python files python.py and pythonw.exe to python3.py pythonw3.py

  2. Then I just ran this command in the prompt:

    python3 -m pip install package


Answer 5

Other answers show how to use pip with both 2.X and 3.X Python, but do not show how to handle the case of multiple Python distributions (e.g. original Python and Anaconda Python).

I have a total of 3 Python versions: original Python 2.7 and Python 3.5 and Anaconda Python 3.5.

Here is how I install a package into:

  1. Original Python 3.5:

    /usr/bin/python3 -m pip install python-daemon
    
  2. Original Python 2.7:

    /usr/bin/python -m pip install python-daemon
    
  3. Anaconda Python 3.5:

    python3 -m pip install python-daemon
    

    or

    pip3 install python-daemon
    

    Simpler, as Anaconda overrides original Python binaries in user environment.

    Of course, installing in anaconda should be done with conda command, this is just an example.


Also, make sure that pip is installed for that specific python. You might need to manually install pip. This works in Ubuntu 16.04:

sudo apt-get install python-pip 

or

sudo apt-get install python3-pip

Answer 6

I ran into this issue myself recently and found that I wasn’t getting the right pip for Python 3, on my Linux system that also has Python 2.

First you must ensure that you have installed pip for your python version:

For Python 2:

sudo apt-get install python-pip

For Python 3:

sudo apt-get install python3-pip

Then to install packages for one version of Python or the other, simply use the following for Python 2:

pip install <package>

or for Python 3:

pip3 install <package>

Answer 7

pip is also a python package. So the easiest way to install modules to a specific python version would be below

 python2.7 /usr/bin/pip install foo

or

python2.7 -m pip install foo

Answer 8

So apparently there are multiple versions of easy_install and pip. It seems to be a big mess. Anyway, this is what I did to install Django for Python 2.7 on Ubuntu 12.10:

$ sudo easy_install-2.7 pip
Searching for pip
Best match: pip 1.1
Adding pip 1.1 to easy-install.pth file
Installing pip-2.7 script to /usr/local/bin

Using /usr/lib/python2.7/dist-packages
Processing dependencies for pip
Finished processing dependencies for pip

$ sudo pip-2.7 install django
Downloading/unpacking django
  Downloading Django-1.5.1.tar.gz (8.0Mb): 8.0Mb downloaded
  Running setup.py egg_info for package django

    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
Installing collected packages: django
  Running setup.py install for django
    changing mode of build/scripts-2.7/django-admin.py from 644 to 755

    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
    changing mode of /usr/local/bin/django-admin.py to 755
Successfully installed django
Cleaning up...

$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> 

Answer 9

From here: https://docs.python.org/3/installing/

Here is how to install packages for various versions that are installed at the same time linux, mac, posix:

python2   -m pip install SomePackage  # default Python 2
python2.7 -m pip install SomePackage  # specifically Python 2.7
python3   -m pip install SomePackage  # default Python 3
python3.4 -m pip install SomePackage  # specifically Python 3.4
python3.5 -m pip install SomePackage  # specifically Python 3.5
python3.6 -m pip install SomePackage  # specifically Python 3.6

On Windows, use the py Python launcher in combination with the -m switch:

py -2   -m pip install SomePackage  # default Python 2
py -2.7 -m pip install SomePackage  # specifically Python 2.7
py -3   -m pip install SomePackage  # default Python 3
py -3.4 -m pip install SomePackage  # specifically Python 3.4

Answer 10

On Linux, Mac OS X and other POSIX systems, use the versioned Python commands in combination with the -m switch to run the appropriate copy of pip:

python2.7 -m pip install SomePackage
python3.4 -m pip install SomePackage

(appropriately versioned pip commands may also be available)

On Windows, use the py Python launcher in combination with the -m switch:

py -2.7 -m pip install SomePackage  # specifically Python 2.7
py -3.4 -m pip install SomePackage  # specifically Python 3.4

if you get an error for py -3.4 then try:

pip install SomePackage

Answer 11

Installation of multiple versions of Python and respective Packages.

Python version on the same windows machine : 2.7 , 3.4 and 3.6

Installation of all 3 versions of Python :

  • Installed the Python 2.7 , 3.4 and 3.6 with the below paths

PATH for all 3 versions of Python :

  • Made sure the PATH variable ( in System Variables ) has below paths included – C:\Python27\;C:\Python27\Scripts;C:\Python34\;C:\Python34\Scripts;C:\Python36\;C:\Python36\Scripts\;

Renaming the executables for versions :

  • Changed the python executable name in C:\Python36 and C:\Python34 to python36 and python34 respectively.

Checked for the command prompt with all versions :

Installing the packages separately for each version
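
As a rough sketch, assuming the renamed executables from the step above are on PATH (the package name is just an example), installing for one specific version then looks like:

python   -m pip install requests   # Python 2.7 (left unrenamed)
python34 -m pip install requests   # Python 3.4
python36 -m pip install requests   # Python 3.6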


Answer 12

If you have multiple versions as well as multiple architectures (32 bit, 64 bit) you will need to add a -32 or -64 at the end of your version.

For Windows, go to cmd and type py --list and it will produce the versions you have installed. The list will look like the following:

Installed Pythons found by py Launcher for Windows
 -3.7-64 *
 -3.7-32
 -3.6-32

The full command as an example will be:

py -3.6-32 -m pip install (package)

If you want to get more indepth, to install a specific version of a package on a specific version of python, use ==(version) after the package. As an example,

py -3.6-32 -m pip install opencv-python==4.1.0.25

Answer 13

Most of the answers here address the issue but I want to add something what was continually confusing me with regard to creating an alternate installation of python in the /usr/local on CentOS 7. When I installed there, it appeared like pip was working since I could use pip2.7 install and it would install modules. However, what I couldn’t figure out was why my newly installed version of python wasn’t seeing what I was installing.

It turns out in CentOS 7 that there is already a python2.7 and a pip2.7 in the /usr/bin folder. To install pip for your new python distribution, you need to specifically tell sudo to go to /usr/local/bin

sudo /usr/local/bin/python2.7 -m ensurepip

This should get pip2.7 installed in your /usr/local/bin folder along with your version of python. The trick is that when you want to install modules, you either need to modify the sudo $PATH variable to include /usr/local/bin or you need to execute

sudo /usr/local/bin/pip2.7 install <module>

if you want to install a new module. It took me forever to remember that sudo wasn’t immediately seeing /usr/local/bin.


Answer 14

Here is my take on the problem. Works for Python3. The main features are:

  • Each Python version is compiled from source
  • All versions are installed locally
  • Does not mangle your system’s default Python installation in any way
  • Each Python version is isolated with virtualenv

The steps are as follows:

  1. If you have several extra python versions installed in some other way, get rid of them, e.g., remove $HOME/.local/lib/python3.x, etc. (also the globally installed ones). Don’t touch your system’s default python3 version though.

  2. Download source for different python versions under the following directory structure:

    $HOME/
        python_versions/ : download Python-*.tgz packages here and "tar xvf" them.  You'll get directories like this:
          Python-3.4.8/
          Python-3.6.5/
          Python-3.x.y/
          ...
    
  3. At each “Python-3.x.y/” directory, do the following (do NOT use “sudo” in any of the steps!):

    mkdir root
    ./configure --prefix=$PWD/root 
    make -j 2
    make install
    virtualenv --no-site-packages -p root/bin/python3.x env
    
  4. At “python_versions/” create files like this:

    env_python3x.bash:
    
    #!/bin/bash
    echo "type deactivate to exit"
    source $HOME/python_versions/Python-3.x.y/env/bin/activate
    
  5. Now, anytime you wish to opt for python3.x, do

    source $HOME/python_versions/env_python3x.bash
    

    to enter the virtualenv

  6. While in the virtualenv, install your favorite python packages with

    pip install --upgrade package_name
    
  7. To exit the virtualenv and python version just type “deactivate”

UPDATE

It seems that --no-site-packages is deprecated. There’s an easy fix for this: Once you have activated the virtualenv, just point the HOME env variable to somewhere else than your actual home directory, i.e.:

export HOME=some/where/else

A nice way to do this in general is:

  • Create virtualenv
  • Activate virtualenv
  • If you want to “recycle” existing libraries to your virtualenv, softlink them from your existing install, i.e. ln -s $HOME/.local/lib/python3.6/site-packages/numpy $PWD/venv/lib/python3.6/site-packages/
  • Do export PYTHONPATH=, export HOME=/some/other/dir

Now you should have custom-isolated virtualenv.


Answer 15

Context: Archlinux

Action:
Install python2-pip:
sudo pacman -S python2-pip

You now have pip2.7:
sudo pip2.7 install boto

Test (in my case I needed ‘boto’):
Run the following commands:

python2
import boto

Success: No error.

Exit: Ctrl+D


Answer 16

for example, if you set other versions (e.g. 3.5) as default and want to install pip for python 2.7:

  1. download pip at https://pypi.python.org/pypi/pip (tar)
  2. unzip tar file
  3. cd to the file’s directory
  4. sudo python2.7 setup.py install

Answer 17

You can go to for example C:\Python2.7\Scripts and then run cmd from that path. After that you can run pip2.7 install yourpackage…

That will install package for that version of Python.


Answer 18

This is probably the completely wrong thing to do (I’m a python noob), but I just went in and edited the pip file

#!/usr/bin/env python3 <-- I changed this line.

# -*- coding: utf-8 -*-
import re
import sys

from pip._internal import main

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

Answer 19

For Windows specifically: \path\to\python.exe -m pip install PackageName works.


Answer 20

for Blender:

/usr/bin $ python3.7 -m pip install irc

Filter a dict to contain only certain keys?

Question: Filter a dict to contain only certain keys?

I’ve got a dict that has a whole bunch of entries. I’m only interested in a select few of them. Is there an easy way to prune all the other ones out?


Answer 0

Constructing a new dict:

dict_you_want = { your_key: old_dict[your_key] for your_key in your_keys }

Uses dictionary comprehension.

If you use a version which lacks them (ie Python 2.6 and earlier), make it dict((your_key, old_dict[your_key]) for ...). It’s the same, though uglier.

Note that this, unlike jnnnnn's version, has stable performance (depending only on the number of your_keys) for old_dicts of any size, both in terms of speed and memory. Since this is a generator expression, it processes one item at a time, and it doesn't look through all the items of old_dict.

Removing everything in-place:

unwanted = set(keys) - set(your_dict)
for unwanted_key in unwanted: del your_dict[unwanted_key]
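
A small worked example of the comprehension above, with hypothetical data:

>>> old_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> your_keys = ('a', 'c')
>>> {your_key: old_dict[your_key] for your_key in your_keys}
{'a': 1, 'c': 3}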

Answer 1

Slightly more elegant dict comprehension:

foodict = {k: v for k, v in mydict.items() if k.startswith('foo')}

Answer 2

Here’s an example in python 2.6:

>>> a = {1:1, 2:2, 3:3}
>>> dict((key,value) for key, value in a.iteritems() if key == 1)
{1: 1}

The filtering part is the if statement.

This method is slower than delnan’s answer if you only want to select a few of very many keys.


Answer 3

You can do that with project function from my funcy library:

from funcy import project
small_dict = project(big_dict, keys)

Also take a look at select_keys.


Answer 4

Code 1:

dict = { key: key * 10 for key in range(0, 100) }
d1 = {}
for key, value in dict.items():
    if key % 2 == 0:
        d1[key] = value

Code 2:

dict = { key: key * 10 for key in range(0, 100) }
d2 = {key: value for key, value in dict.items() if key % 2 == 0}

Code 3:

dict = { key: key * 10 for key in range(0, 100) }
d3 = { key: dict[key] for key in dict.keys() if key % 2 == 0}

The performance of all pieces of code was measured with timeit using number=1000, and collected 1000 times for each piece of code.

For Python 3.6 the performance of the three ways of filtering dict keys is almost the same. For Python 2.7, Code 3 is slightly faster.
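
As a rough sketch, a measurement along these lines could be reproduced with timeit as follows (this is not the original benchmark script):

import timeit

setup = "d = {key: key * 10 for key in range(0, 100)}"
stmt = "{key: value for key, value in d.items() if key % 2 == 0}"   # Code 2
print(timeit.timeit(stmt, setup=setup, number=1000))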


Answer 5

This one liner lambda should work:

dictfilt = lambda x, y: dict([ (i,x[i]) for i in x if i in set(y) ])

Here’s an example:

my_dict = {"a":1,"b":2,"c":3,"d":4}
wanted_keys = ("c","d")

# run it
In [10]: dictfilt(my_dict, wanted_keys)
Out[10]: {'c': 3, 'd': 4}

It's a basic list comprehension iterating over your dict keys (i in x) and outputting a list of (key, value) tuple pairs when the key is in your desired key list (y). dict() wraps the whole thing to produce a dict object.


Answer 6

Given your original dictionary orig and the set of entries that you’re interested in keys:

filtered = dict(zip(keys, [orig[k] for k in keys]))

which isn't as nice as delnan's answer, but should work in every Python version of interest. It does, however, rely on every element of keys existing in your original dictionary.
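
For example, with hypothetical data, a key missing from the original dictionary raises a KeyError:

>>> orig = {'a': 1, 'b': 2}
>>> keys = ['a', 'z']
>>> dict(zip(keys, [orig[k] for k in keys]))
Traceback (most recent call last):
  ...
KeyError: 'z'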


Answer 7

Based on the accepted answer by delnan.

What if one of your wanted keys isn't in the old_dict? The delnan solution will throw a KeyError exception that you can catch. If that's not what you need, maybe you want to:

  1. only include keys that exist in both the old_dict and your set of wanted_keys.

    old_dict = {'name':"Foobar", 'baz':42}
    wanted_keys = ['name', 'age']
    new_dict = {k: old_dict[k] for k in set(wanted_keys) & set(old_dict.keys())}
    
    >>> new_dict
    {'name': 'Foobar'}
    
  2. have a default value for keys that’s not set in old_dict.

    default = None
    new_dict = {k: old_dict[k] if k in old_dict else default for k in wanted_keys}
    
    >>> new_dict
    {'age': None, 'name': 'Foobar'}
    

Answer 8

This function will do the trick:

def include_keys(dictionary, keys):
    """Filters a dict by only including certain keys."""
    key_set = set(keys) & set(dictionary.keys())
    return {key: dictionary[key] for key in key_set}

Just like delnan’s version, this one uses dictionary comprehension and has stable performance for large dictionaries (dependent only on the number of keys you permit, and not the total number of keys in the dictionary).

And just like MyGGan’s version, this one allows your list of keys to include keys that may not exist in the dictionary.

And as a bonus, here’s the inverse, where you can create a dictionary by excluding certain keys in the original:

def exclude_keys(dictionary, keys):
    """Filters a dict by excluding certain keys."""
    key_set = set(dictionary.keys()) - set(keys)
    return {key: dictionary[key] for key in key_set}

Note that unlike delnan’s version, the operation is not done in place, so the performance is related to the number of keys in the dictionary. However, the advantage of this is that the function will not modify the dictionary provided.

Edit: Added a separate function for excluding certain keys from a dict.
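
A short usage sketch of the two helpers above, with hypothetical data (comparing with == avoids relying on dict ordering, since the helpers iterate over sets):

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> include_keys(d, ['a', 'c', 'missing']) == {'a': 1, 'c': 3}
True
>>> exclude_keys(d, ['a', 'c', 'missing']) == {'b': 2}
True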


Answer 9

If we want to make a new dictionary with selected keys removed, we can make use of dictionary comprehension
For example:

d = {
'a' : 1,
'b' : 2,
'c' : 3
}
x = {key:d[key] for key in d.keys() - {'c', 'e'}} # Python 3
y = {key:d[key] for key in set(d.keys()) - {'c', 'e'}} # Python 2.*
# x is {'a': 1, 'b': 2}
# y is {'a': 1, 'b': 2}

Answer 10

Another option:

content = dict(k1='foo', k2='nope', k3='bar')
selection = ['k1', 'k3']
filtered = filter(lambda i: i[0] in selection, content.items())

But you get a list (Python 2) or an iterator (Python 3) returned by filter(), not a dict.
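
If a dict is what you ultimately need, the filter result can simply be wrapped in dict(), continuing the example above:

filtered_dict = dict(filter(lambda i: i[0] in selection, content.items()))
# {'k1': 'foo', 'k3': 'bar'}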


Answer 11

Short form:

[s.pop(k) for k in list(s.keys()) if k not in keep]

As most of the answers suggest, in order to maintain conciseness we have to create a duplicate object, be it a list or a dict. This one creates a throw-away list but deletes the keys in the original dict.


Answer 12

Here is another simple method using del in a one-liner:

for key in e_keys: del your_dict[key]

e_keys is the list of the keys to be excluded. It will update your dict rather than giving you a new one.

If you want a new output dict, then make a copy of the dict before deleting:

new_dict = your_dict.copy()           #Making copy of dict

for key in e_keys: del new_dict[key]

Answer 13

You could use python-benedict, it’s a dict subclass.

Installation: pip install python-benedict

from benedict import benedict

dict_you_want = benedict(your_dict).subset(keys=['firstname', 'lastname', 'email'])

It’s open-source on GitHub: https://github.com/fabiocaccamo/python-benedict


Disclaimer: I’m the author of this library.


Calling a function from another file in Python

Question: Calling a function from another file in Python

Set_up: I have a .py file for each function I need to use in a program.

In this program, I need to call the function from the external files.

I’ve tried:

from file.py import function(a,b)

But I get the error:

ImportError: No module named ‘file.py’; file is not a package

How do I fix this problem?


Answer 0

There isn’t any need to add file.py while importing. Just write from file import function, and then call the function using function(a, b). The reason why this may not work, is because file is one of Python’s core modules, so I suggest you change the name of your file.

Note that if you’re trying to import functions from a.py to a file called b.py, you will need to make sure that a.py and b.py are in the same directory.


Answer 1

First of all you do not need a .py.

If you have a file a.py and inside you have some functions:

def b():
  # Something
  return 1

def c():
  # Something
  return 2

And you want to import them in z.py you have to write

from a import b, c

Answer 2

You can do this in 2 ways. First is just to import the specific function you want from file.py. To do this use

from file import function

Another way is to import the entire file

import file as fl

Then you can call any function inside file.py using

fl.function(a,b)

Answer 3

You can call the function from a different directory as well, in case you cannot or do not want to have the function in the same directory you are working. You can do this in two ways (perhaps there are more alternatives, but these are the ones that have worked for me).

Alternative 1 Temporarily change your working directory

import os

os.chdir("**Put here the directory where you have the file with your function**")

from file import function

os.chdir("**Put here the directory where you were working**")

Alternative 2 Add the directory where you have your function to sys.path

import sys

sys.path.append("**Put here the directory where you have the file with your function**")

from file import function

Answer 4

If your file is in the different package structure and you want to call it from a different package, then you can call it in that fashion:

Let's say you have the following package structure in your Python project:

In the com.my.func.DifferentFunction Python file you have some functions, like:

def add(arg1, arg2):
    return arg1 + arg2

def sub(arg1, arg2) :
    return arg1 - arg2

def mul(arg1, arg2) :
    return arg1 * arg2

And you want to call different functions from Example3.py, then following way you can do it:

Define import statement in Example3.py – file for import all function

from com.my.func.DifferentFunction import *

or define each function name which you want to import

from com.my.func.DifferentFunction import add, sub, mul

Then in Example3.py you can call function for execute:

num1 = 20
num2 = 10

print("\n add : ", add(num1,num2))
print("\n sub : ", sub(num1,num2))
print("\n mul : ", mul(num1,num2))

Output:

 add :  30

 sub :  10

 mul :  200

Answer 5

Came across the same feature but I had to do the below to make it work.

If you are seeing 'ModuleNotFoundError: No module named', you probably need a dot (.) in front of the filename, as below:

from .file import function


Answer 6

First save the file in .py format (for example, my_example.py). And if that file have functions,

def xyz():

        --------

        --------

def abc():

        --------

        --------

In the calling function you just have to type the below lines.

file_name: my_example2.py

============================

import my_example


a = my_example.xyz()

b = my_example.abc()

============================


Answer 7

Rename the module to something other than ‘file’.

Then also be sure when you are calling the function that:

1) if you are importing the entire module, you reiterate the module name when calling it:

import module
module.function_name()

or

import pizza
pizza.pizza_function()

2) or if you are importing specific functions, functions with an alias, or all functions using *, you don't reiterate the module name:

from pizza import pizza_function
pizza_function()

or

from pizza import pizza_function as pf
pf()

or

from pizza import *
pizza_function()

Answer 8

Functions from a .py file (which can, of course, be in a different directory) can simply be imported by writing the directory name first and then the file name without the .py extension:

from directory_name.file_name import function_name

And later be used: function_name()
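
For clarity, a minimal sketch of the layout this implies, with hypothetical names (older Python versions may also need an __init__.py inside the directory so it is treated as a package):

# project/
#   main.py
#   utils/
#     __init__.py      # may be required, depending on the Python version
#     helpers.py       # contains: def greet(): return "hello"

# in main.py:
from utils.helpers import greet
print(greet())   # hello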


Answer 9

Inside MathMethod.Py.

def Add(a,b):
   return a+b 

def subtract(a,b):
  return a-b

Inside Main.Py

import MathMethod as MM
print(MM.Add(200, 1000))

Output: 1200


Answer 10

You don’t have to add file.py.

Just keep the file in the same location with the file from where you want to import it. Then just import your functions:

from file import a, b

Answer 11

You should have the file at the same location as that of the Python files you are trying to import. Also ‘from file import function’ is enough.


Answer 12

Append a dot (.) in front of the file name if you want to import a file which is in the same directory where you are running your code.

For example, I'm running a file named a.py and I want to import a method named addFun which is written in b.py, and b.py is in the same directory:

from .b import addFun


Answer 13

Suppose the file you want to call is anotherfile.py and the method you want to call is method1, then first import the file and then the method

from anotherfile import method1

if method1 is part of a class, let the class be class1, then

from anotherfile import class1

then create an object of class1, suppose the object name is ob1, then

ob1 = class1()
ob1.method1()

Answer 14

In my case I named my file helper.scrap.py and couldn't make it work until I changed it to helper.py.


Converting a Pandas GroupBy output from Series to DataFrame

Question: Converting a Pandas GroupBy output from Series to DataFrame

I’m starting with input data like this

df1 = pandas.DataFrame( { 
    "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , 
    "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"] } )

Which when printed appears as this:

   City     Name
0   Seattle    Alice
1   Seattle      Bob
2  Portland  Mallory
3   Seattle  Mallory
4   Seattle      Bob
5  Portland  Mallory

Grouping is simple enough:

g1 = df1.groupby( [ "Name", "City"] ).count()

and printing yields a GroupBy object:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1

But what I want eventually is another DataFrame object that contains all the rows in the GroupBy object. In other words I want to get the following result:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
Mallory Seattle      1     1

I can’t quite see how to accomplish this in the pandas documentation. Any hints would be welcome.


Answer 0

g1 here is a DataFrame. It has a hierarchical index, though:

In [19]: type(g1)
Out[19]: pandas.core.frame.DataFrame

In [20]: g1.index
Out[20]: 
MultiIndex([('Alice', 'Seattle'), ('Bob', 'Seattle'), ('Mallory', 'Portland'),
       ('Mallory', 'Seattle')], dtype=object)

Perhaps you want something like this?

In [21]: g1.add_suffix('_Count').reset_index()
Out[21]: 
      Name      City  City_Count  Name_Count
0    Alice   Seattle           1           1
1      Bob   Seattle           2           2
2  Mallory  Portland           2           2
3  Mallory   Seattle           1           1

Or something like:

In [36]: DataFrame({'count' : df1.groupby( [ "Name", "City"] ).size()}).reset_index()
Out[36]: 
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1

Answer 1

I want to slightly change the answer given by Wes, because version 0.16.2 requires as_index=False. If you don’t set it, you get an empty dataframe.

Source:

Aggregation functions will not return the groups that you are aggregating over if they are named columns, when as_index=True, the default. The grouped columns will be the indices of the returned object.

Passing as_index=False will return the groups that you are aggregating over, if they are named columns.

Aggregating functions are ones that reduce the dimension of the returned objects, for example: mean, sum, size, count, std, var, sem, describe, first, last, nth, min, max. This is what happens when you do for example DataFrame.sum() and get back a Series.

nth can act as a reducer or a filter, see here.

import pandas as pd

df1 = pd.DataFrame({"Name":["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"],
                    "City":["Seattle","Seattle","Portland","Seattle","Seattle","Portland"]})
print df1
#
#       City     Name
#0   Seattle    Alice
#1   Seattle      Bob
#2  Portland  Mallory
#3   Seattle  Mallory
#4   Seattle      Bob
#5  Portland  Mallory
#
g1 = df1.groupby(["Name", "City"], as_index=False).count()
print g1
#
#                  City  Name
#Name    City
#Alice   Seattle      1     1
#Bob     Seattle      2     2
#Mallory Portland     2     2
#        Seattle      1     1
#

EDIT:

In version 0.17.1 and later you can use subset in count and reset_index with parameter name in size:

print df1.groupby(["Name", "City"], as_index=False ).count()
#IndexError: list index out of range

print df1.groupby(["Name", "City"]).count()
#Empty DataFrame
#Columns: []
#Index: [(Alice, Seattle), (Bob, Seattle), (Mallory, Portland), (Mallory, Seattle)]

print df1.groupby(["Name", "City"])[['Name','City']].count()
#                  Name  City
#Name    City                
#Alice   Seattle      1     1
#Bob     Seattle      2     2
#Mallory Portland     2     2
#        Seattle      1     1

print df1.groupby(["Name", "City"]).size().reset_index(name='count')
#      Name      City  count
#0    Alice   Seattle      1
#1      Bob   Seattle      2
#2  Mallory  Portland      2
#3  Mallory   Seattle      1

The difference between count and size is that size counts NaN values while count does not.
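
A quick sketch with hypothetical data illustrating that difference:

import numpy as np
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Alice", "Bob"],
                   "City": ["Seattle", np.nan, "Seattle"]})

print(df.groupby("Name").size())            # Alice: 2, Bob: 1  (rows with NaN are counted)
print(df.groupby("Name")["City"].count())   # Alice: 1, Bob: 1  (NaN is skipped)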


Answer 2

Simply, this should do the task:

import pandas as pd

grouped_df = df1.groupby( [ "Name", "City"] )

pd.DataFrame(grouped_df.size().reset_index(name = "Group_Count"))

Here, grouped_df.size() pulls up the unique groupby count, and reset_index() method resets the name of the column you want it to be. Finally, the pandas Dataframe() function is called upon to create a DataFrame object.


回答 3

关键是使用reset_index()方法。

采用:

import pandas

df1 = pandas.DataFrame( { 
    "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , 
    "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"] } )

g1 = df1.groupby( [ "Name", "City"] ).count().reset_index()

现在,您在g1中有了新的数据帧:

The key is to use the reset_index() method.

Use:

import pandas

df1 = pandas.DataFrame( { 
    "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , 
    "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"] } )

g1 = df1.groupby( [ "Name", "City"] ).count().reset_index()

Now you have your new dataframe in g1:


回答 4

也许我误解了这个问题,但是如果您想将groupby转换回数据框,则可以使用.to_frame()。我想在执行此操作时重设索引,所以我也包括了该部分。

与问题无关的示例代码

df = df['TIME'].groupby(df['Name']).min()
df = df.to_frame()
df = df.reset_index(level=['Name',"TIME"])

Maybe I misunderstand the question but if you want to convert the groupby back to a dataframe you can use .to_frame(). I wanted to reset the index when I did this so I included that part as well.

example code unrelated to question

df = df['TIME'].groupby(df['Name']).min()
df = df.to_frame()
df = df.reset_index(level=['Name',"TIME"])

回答 5

我发现这对我有用。

import numpy as np
import pandas as pd

df1 = pd.DataFrame({ 
    "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , 
    "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"]})

df1['City_count'] = 1
df1['Name_count'] = 1

df1.groupby(['Name', 'City'], as_index=False).count()

I found this worked for me.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({ 
    "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , 
    "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"]})

df1['City_count'] = 1
df1['Name_count'] = 1

df1.groupby(['Name', 'City'], as_index=False).count()

回答 6

下面的解决方案可能更简单:

df1.reset_index().groupby( [ "Name", "City"],as_index=False ).count()

Below solution may be simpler:

df1.reset_index().groupby( [ "Name", "City"],as_index=False ).count()

回答 7

我按数量(Qty)对数据进行了汇总,并把结果存储到数据框中:

almo_grp_data = pd.DataFrame({'Qty_cnt' :
almo_slt_models_data.groupby( ['orderDate','Item','State Abv']
          )['Qty'].sum()}).reset_index()

I aggregated the data Qty-wise and stored the result in a dataframe:

almo_grp_data = pd.DataFrame({'Qty_cnt' :
almo_slt_models_data.groupby( ['orderDate','Item','State Abv']
          )['Qty'].sum()}).reset_index()

回答 8

这些解决方案仅对我部分起作用,因为我正在进行多个聚合。这是我要转换为数据框的分组的示例输出:

因为我想要的不仅仅是reset_index()提供的计数,所以我写了一个手动方法,把上面展示的分组输出转换为数据帧。我知道这不是最符合python/pandas风格的做法,因为它相当冗长和直白,但这正是我所需要的。基本做法是:先用上面介绍的reset_index()方法搭一个“脚手架”数据框,然后遍历分组数据框中的各个分组,取出索引,针对未分组的数据框执行计算,再把结果写入新的聚合数据框。

df_grouped = df[['Salary Basis', 'Job Title', 'Hourly Rate', 'Male Count', 'Female Count']]
df_grouped = df_grouped.groupby(['Salary Basis', 'Job Title'], as_index=False)

# Grouped gives us the indices we want for each grouping
# We cannot convert a groupedby object back to a dataframe, so we need to do it manually
# Create a new dataframe to work against
df_aggregated = df_grouped.size().to_frame('Total Count').reset_index()
df_aggregated['Male Count'] = 0
df_aggregated['Female Count'] = 0
df_aggregated['Job Rate'] = 0

def manualAggregations(indices_array):
    temp_df = df.iloc[indices_array]
    return {
        'Male Count': temp_df['Male Count'].sum(),
        'Female Count': temp_df['Female Count'].sum(),
        'Job Rate': temp_df['Hourly Rate'].max()
    }

for name, group in df_grouped:
    ix = df_grouped.indices[name]
    calcDict = manualAggregations(ix)

    for key in calcDict:
        #Salary Basis, Job Title
        columns = list(name)
        df_aggregated.loc[(df_aggregated['Salary Basis'] == columns[0]) & 
                          (df_aggregated['Job Title'] == columns[1]), key] = calcDict[key]

如果您不想用字典,也可以在for循环中内联进行计算:

    df_aggregated['Male Count'].loc[(df_aggregated['Salary Basis'] == columns[0]) & 
                                (df_aggregated['Job Title'] == columns[1])] = df['Male Count'].iloc[ix].sum()

These solutions only partially worked for me because I was doing multiple aggregations. Here is a sample of the grouped output that I wanted to convert to a dataframe:

Because I wanted more than the count provided by reset_index(), I wrote a manual method for converting the grouped output shown above into a dataframe. I understand this is not the most pythonic/pandas way of doing this as it is quite verbose and explicit, but it was all I needed. Basically, use the reset_index() method explained above to start a “scaffolding” dataframe, then loop through the group pairings in the grouped dataframe, retrieve the indices, perform your calculations against the ungrouped dataframe, and set the value in your new aggregated dataframe.

df_grouped = df[['Salary Basis', 'Job Title', 'Hourly Rate', 'Male Count', 'Female Count']]
df_grouped = df_grouped.groupby(['Salary Basis', 'Job Title'], as_index=False)

# Grouped gives us the indices we want for each grouping
# We cannot convert a groupedby object back to a dataframe, so we need to do it manually
# Create a new dataframe to work against
df_aggregated = df_grouped.size().to_frame('Total Count').reset_index()
df_aggregated['Male Count'] = 0
df_aggregated['Female Count'] = 0
df_aggregated['Job Rate'] = 0

def manualAggregations(indices_array):
    temp_df = df.iloc[indices_array]
    return {
        'Male Count': temp_df['Male Count'].sum(),
        'Female Count': temp_df['Female Count'].sum(),
        'Job Rate': temp_df['Hourly Rate'].max()
    }

for name, group in df_grouped:
    ix = df_grouped.indices[name]
    calcDict = manualAggregations(ix)

    for key in calcDict:
        #Salary Basis, Job Title
        columns = list(name)
        df_aggregated.loc[(df_aggregated['Salary Basis'] == columns[0]) & 
                          (df_aggregated['Job Title'] == columns[1]), key] = calcDict[key]

If a dictionary isn’t your thing, the calculations could be applied inline in the for loop:

    df_aggregated['Male Count'].loc[(df_aggregated['Salary Basis'] == columns[0]) & 
                                (df_aggregated['Job Title'] == columns[1])] = df['Male Count'].iloc[ix].sum()
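For reference, when several aggregations are needed at once, groupby(...).agg(...) followed by reset_index() usually avoids the manual loop entirely. This is only a hedged sketch reusing the column names from the example above, not part of the original answer:

df_aggregated = (df.groupby(['Salary Basis', 'Job Title'])
                   .agg({'Male Count': 'sum',
                         'Female Count': 'sum',
                         'Hourly Rate': 'max'})      # same calculations as manualAggregations
                   .rename(columns={'Hourly Rate': 'Job Rate'})
                   .reset_index())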

Python单元测试去哪儿了?

问题:Python单元测试去哪儿了?

如果您正在编写库或应用程序,则单元测试文件会放在哪里?

将测试文件与主应用程序代码分开是很好的选择,但是将它们放在应用程序根目录内的“ tests”子目录中是很尴尬的,因为这使得导入要测试的模块更加困难。

这里有最佳实践吗?

If you’re writing a library, or an app, where do the unit test files go?

It’s nice to separate the test files from the main app code, but it’s awkward to put them into a “tests” subdirectory inside of the app root directory, because it makes it harder to import the modules that you’ll be testing.

Is there a best practice here?


回答 0

对于文件module.py,按照Pythonic命名约定,其单元测试通常应命名为test_module.py。

放置test_module.py有几个公认的位置:

  1. 与module.py相同的目录中。
  2. 放在../tests/test_module.py中(与代码目录处于同一级别)。
  3. 放在tests/test_module.py中(位于代码目录下一级)。

我更喜欢#1,因为这样很容易找到测试并导入它们。无论您使用哪种构建系统,都可以轻松地把它配置为运行以test_开头的文件。实际上,unittest用于测试发现的默认模式就是test*.py。

For a file module.py, the unit test should normally be called test_module.py, following Pythonic naming conventions.

There are several commonly accepted places to put test_module.py:

  1. In the same directory as module.py.
  2. In ../tests/test_module.py (at the same level as the code directory).
  3. In tests/test_module.py (one level under the code directory).

I prefer #1 for its simplicity of finding the tests and importing them. Whatever build system you’re using can easily be configured to run files starting with test_. Actually, the default unittest pattern used for test discovery is test*.py.
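As a small, made-up illustration of option #1, a test file next to the module it covers is picked up by the built-in test discovery:

# module.py
def add(a, b):
    return a + b

# test_module.py (same directory)
import unittest
import module

class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(module.add(1, 2), 3)

if __name__ == '__main__':
    unittest.main()

Running python -m unittest discover from that directory then finds it via the default test*.py pattern.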


回答 1

仅1个测试文件

如果只有1个测试文件,建议将其放在顶层目录中:

module/
    lib/
        __init__.py
        module.py
    test.py

在CLI中运行测试

python test.py

许多测试文件

如果有许多测试文件,请将其放在tests文件夹中:

module/
    lib/
        __init__.py
        module.py
    tests/
        test_module.py
        test_module_function.py
# test_module.py

import unittest
from lib import module

class TestModule(unittest.TestCase):
    def test_module(self):
        pass

if __name__ == '__main__':
    unittest.main()

在CLI中运行测试

# In top-level /module/ folder
python -m tests.test_module
python -m tests.test_module_function

采用 unittest discovery

unittest discovery 将在包文件夹中找到所有测试。

在tests/文件夹中创建一个__init__.py

module/
    lib/
        __init__.py
        module.py
    tests/
        __init__.py
        test_module.py
        test_module_function.py

在CLI中运行测试

# In top-level /module/ folder

# -s, --start-directory (default current directory)
# -p, --pattern (default test*.py)

python -m unittest discover

参考

单元测试框架

Only 1 test file

If there is only one test file, putting it in the top-level directory is recommended:

module/
    lib/
        __init__.py
        module.py
    test.py

Run the test in CLI

python test.py

Many test files

If there are many test files, put them in a tests folder:

module/
    lib/
        __init__.py
        module.py
    tests/
        test_module.py
        test_module_function.py
# test_module.py

import unittest
from lib import module

class TestModule(unittest.TestCase):
    def test_module(self):
        pass

if __name__ == '__main__':
    unittest.main()

Run the test in CLI

# In top-level /module/ folder
python -m tests.test_module
python -m tests.test_module_function

Use unittest discovery

unittest discovery will find all tests in the package folder.

Create a __init__.py in tests/ folder

module/
    lib/
        __init__.py
        module.py
    tests/
        __init__.py
        test_module.py
        test_module_function.py

Run the test in CLI

# In top-level /module/ folder

# -s, --start-directory (default current directory)
# -p, --pattern (default test*.py)

python -m unittest discover

Reference

Unit test framework


回答 2

通常的做法是将tests目录放置在与模块/软件包相同的父目录中。因此,如果您的模块名为foo.py,则目录布局将如下所示:

parent_dir/
  foo.py
  tests/

当然,做这件事并没有唯一的方法。您也可以创建一个tests子目录,然后使用绝对导入来导入模块。

无论您把测试放在哪里,我都建议您使用nose来运行它们。nose会在您的目录中搜索测试。这样,您就可以把测试放在组织结构上最合理的地方。

A common practice is to put the tests directory in the same parent directory as your module/package. So if your module was called foo.py your directory layout would look like:

parent_dir/
  foo.py
  tests/

Of course there is no one way of doing it. You could also make a tests subdirectory and import the module using absolute import.

Wherever you put your tests, I would recommend you use nose to run them. Nose searches through your directories for tests. This way, you can put tests wherever they make the most sense organizationally.


回答 3

编写Pythoscope(https://pypi.org/project/pythoscope/)时我们遇到了同样的问题,它是一个为Python程序生成单元测试的工具。在选择目录之前,我们在testing-in-python邮件列表上征求了大家的意见,结果有很多不同的看法。最后,我们选择把“tests”目录放在与源代码相同的目录中,并在该目录中为父目录中的每个模块生成一个测试文件。

We had the very same question when writing Pythoscope (https://pypi.org/project/pythoscope/), which generates unit tests for Python programs. We polled people on the testing-in-python list before we chose a directory; there were many different opinions. In the end we chose to put a “tests” directory in the same directory as the source code. In that directory we generate a test file for each module in the parent directory.


回答 4

正如杰里米·坎特雷尔(Jeremy Cantrell)所述,我也倾向于将单元测试放在文件本身中,尽管我倾向于不将测试功能放在主体中,而是将所有内容放在一个文件中。

if __name__ == '__main__':
   do tests...

块中。这样最终会以“示例代码”的形式为文件添加文档,说明如何使用您要测试的python文件。

我应该补充一点,我倾向于编写非常紧凑的模块/类。如果您的模块需要大量测试,可以把它们放到另一个文件中,但即便如此,我仍然会加上:

if __name__ == '__main__':
   import tests.thisModule
   tests.thisModule.runtests()

这使任何阅读您的源代码的人都知道在哪里可以找到测试代码。

I also tend to put my unit tests in the file itself, as Jeremy Cantrell above notes, although I tend to not put the test function in the main body, but rather put everything in an

if __name__ == '__main__':
   do tests...

block. This ends up adding documentation to the file as ‘example code’ for how to use the python file you are testing.

I should add, I tend to write very tight modules/classes. If your modules require very large numbers of tests, you can put them in another, but even then, I’d still add:

if __name__ == '__main__':
   import tests.thisModule
   tests.thisModule.runtests()

This lets anybody reading your source code know where to look for the test code.


回答 5

我时不时会去了解一下关于测试放置位置的话题,每次大多数人都推荐在库代码旁边放一个单独的文件夹结构,但我发现每次的论点都一样,而且并不那么有说服力。我最终还是把测试模块放在核心模块旁边。

这样做的主要原因是:重构

当我移动代码时,我确实希望测试模块随代码一起移动;如果测试位于单独的目录树中,就很容易弄丢它们。老实说,迟早您会得到一个完全不同的文件夹结构,就像django、flask和许多其他项目那样。如果您不在乎,那也没关系。

您应该问自己的主要问题是:

我在写:

  • a)可重用的库或
  • b)构建一个将若干半独立模块捆绑在一起的项目?

如果一个:

一个单独的文件夹以及保持其结构的额外工作可能会更适合。没有人会抱怨您的测试被部署到生产环境中

但是,将测试与核心文件夹混合时,也可以将测试从分发中排除出去,这同样容易。把它放在setup.py中

find_packages("src", exclude=["*.tests", "*.tests.*", "tests.*", "tests"]) 

如果b:

就像我们每个人一样,您可能希望您正在编写可重用的库,但是大多数时候它们的生命与项目的生命息息相关。轻松维护项目的能力应该是首要任务。

然后,如果您做得很好,并且您的模块非常适合另一个项目,它很可能会被复制(而不是分叉或做成单独的库)到这个新项目中;把紧挨着代码、位于同一文件夹结构中的测试一起搬过去,比起在一个已经变得混乱的独立测试文件夹里翻找测试要容易得多。(您可能会争辩说它一开始就不应该是一团糟,但让我们现实一点。)

因此,选择权仍然在您,但我认为,把测试和代码混在一起,您能实现与使用单独文件夹完全相同的效果,而在保持整洁上花的精力更少。

Every once in a while I find myself checking out the topic of test placement, and every time the majority recommends a separate folder structure beside the library code, but I find that every time the arguments are the same and are not that convincing. I end up putting my test modules somewhere beside the core modules.

The main reason for doing this is: refactoring.

When I move things around I do want test modules to move with the code; it’s easy to lose tests if they are in a separate tree. Let’s be honest, sooner or later you end up with a totally different folder structure, like django, flask and many others. Which is fine if you don’t care.

The main question you should ask yourself is this:

Am I writing:

  • a) reusable library or
  • b) building a project that bundles together some semi-separated modules?

If a:

A separate folder and the extra effort to maintain its structure may be better suited. No one will complain about your tests getting deployed to production.

But it’s also just as easy to exclude tests from being distributed when they are mixed with the core folders; put this in the setup.py:

find_packages("src", exclude=["*.tests", "*.tests.*", "tests.*", "tests"]) 

If b:

You may wish — as every one of us do — that you are writing reusable libraries, but most of the time their life is tied to the life of the project. Ability to easily maintain your project should be a priority.

Then if you did a good job and your module is a good fit for another project, it will probably get copied — not forked or made into a separate library — into this new project, and moving tests that lay beside it in the same folder structure is easy in comparison to fishing up tests in a mess that a separate test folder had become. (You may argue that it shouldn’t be a mess in the first place but let’s be realistic here).

So the choice is still yours, but I would argue that with mixed up tests you achieve all the same things as with a separate folder, but with less effort on keeping things tidy.
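For context, a minimal setup.py sketch showing where that find_packages() filter sits; the project name and src/ layout here are invented, not taken from the answer:

from setuptools import setup, find_packages

setup(
    name='myproject',            # hypothetical project name
    version='0.1',
    package_dir={'': 'src'},     # assumes the code lives under src/
    packages=find_packages('src', exclude=['*.tests', '*.tests.*', 'tests.*', 'tests']),
)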


回答 6

我使用tests/目录,然后使用相对导入来导入主要应用程序模块。因此,在MyApp/tests/foo.py中,可能有:

from .. import foo

导入MyApp.foo模块。

I use a tests/ directory, and then import the main application modules using relative imports. So in MyApp/tests/foo.py, there might be:

from .. import foo

to import the MyApp.foo module.


回答 7

我认为没有公认的“最佳实践”。

我将测试放在应用程序代码之外的另一个目录中。然后,在运行所有测试之前,在测试运行器脚本(还执行其他一些操作)中,将主应用程序目录添加到sys.path中(允许您从任何位置导入模块)。这样,我发布时就不必从主代码中删除测试目录,从而节省了时间和精力。

I don’t believe there is an established “best practice”.

I put my tests in another directory outside of the app code. I then add the main app directory to sys.path (allowing you to import the modules from anywhere) in my test runner script (which does some other stuff as well) before running all the tests. This way I never have to remove the tests directory from the main code when I release it, saving me time and effort, if an ever so tiny amount.


回答 8

根据我在Python中开发测试框架的经验,我建议把python单元测试放在单独的目录中,并保持对称的目录结构。这样有助于只打包核心库而不打包单元测试。下面用示意图加以说明。

                              <Main Package>
                               /          \
                              /            \
                            lib           tests
                            /                \
             [module1.py, module2.py,  [ut_module1.py, ut_module2.py,
              module3.py  module4.py,   ut_module3.py, ut_module.py]
              __init__.py]

这样,当您使用rpm打包这些库时,就可以只打包主要的库模块。这尤其有助于在敏捷环境中保持可维护性。

From my experience in developing Testing frameworks in Python, I would suggest to put python unit tests in a separate directory. Maintain a symmetric directory structure. This would be helpful in packaging just the core libraries and not package the unit tests. Below is implemented through a schematic diagram.

                              <Main Package>
                               /          \
                              /            \
                            lib           tests
                            /                \
             [module1.py, module2.py,  [ut_module1.py, ut_module2.py,
              module3.py  module4.py,   ut_module3.py, ut_module.py]
              __init__.py]

In this way when you package these libraries using an rpm, you can just package the main library modules (only). This helps maintainability particularly in agile environment.


回答 9

我建议您检查GitHub上的一些主要Python项目并获得一些想法。

当代码变大并添加更多库时,最好在具有setup.py的目录中创建一个测试文件夹,并为每种测试类型(unittest,integration等)镜像项目目录结构。

例如,如果您具有如下目录结构:

myPackage/
    myapp/
       moduleA/
          __init__.py
          module_A.py
       moduleB/
          __init__.py
          module_B.py
setup.py

添加测试文件夹后,您将具有以下目录结构:

myPackage/
    myapp/
       moduleA/
          __init__.py
          module_A.py
       moduleB/
          __init__.py
          module_B.py
test/
   unit/
      myapp/
         moduleA/
            module_A_test.py
         moduleB/
            module_B_test.py
   integration/
          myapp/
             moduleA/
                module_A_test.py
             moduleB/
                module_B_test.py
setup.py

许多正确编写的Python软件包都使用相同的结构。Boto软件包就是一个很好的例子。检查https://github.com/boto/boto

I recommend you check some main Python projects on GitHub and get some ideas.

When your code gets larger and you add more libraries it’s better to create a test folder in the same directory you have setup.py and mirror your project directory structure for each test type (unittest, integration, …)

For example if you have a directory structure like:

myPackage/
    myapp/
       moduleA/
          __init__.py
          module_A.py
       moduleB/
          __init__.py
          module_B.py
setup.py

After adding test folder you will have a directory structure like:

myPackage/
    myapp/
       moduleA/
          __init__.py
          module_A.py
       moduleB/
          __init__.py
          module_B.py
test/
   unit/
      myapp/
         moduleA/
            module_A_test.py
         moduleB/
            module_B_test.py
   integration/
          myapp/
             moduleA/
                module_A_test.py
             moduleB/
                module_B_test.py
setup.py

Many properly written Python packages use the same structure. A very good example is the Boto package. Check https://github.com/boto/boto


回答 10

我是这样做的……

资料夹结构:

project/
    src/
        code.py
    tests/
    setup.py

setup.py把src/指定为包含我项目模块的位置,然后我运行:

setup.py develop

它会把我的项目添加到site-packages中,并指向我的工作副本。要运行测试,我使用:

setup.py tests

使用我配置的任何测试运行程序。

How I do it…

Folder structure:

project/
    src/
        code.py
    tests/
    setup.py

setup.py points to src/ as the location containing my project’s modules, then I run:

setup.py develop

Which adds my project into site-packages, pointing to my working copy. To run my tests I use:

setup.py tests

Using whichever test runner I’ve configured.


回答 11

我更喜欢顶层的tests目录。这确实意味着导入会变得稍微困难一些。为此,我有两个解决方案:

  1. 使用setuptools。然后,您可以把test_suite='tests.runalltests.suite'传给setup(),之后就可以简单地运行测试:python setup.py test
  2. 运行测试时设置PYTHONPATH: PYTHONPATH=. python tests/runalltests.py

下面是M2Crypto中的代码对这些功能的支持方式:

如果您更喜欢用nosetests来运行测试,则可能需要做一些稍微不同的事情。

I prefer toplevel tests directory. This does mean imports become a little more difficult. For that I have two solutions:

  1. Use setuptools. Then you can pass test_suite='tests.runalltests.suite' into setup(), and can run the tests simply: python setup.py test
  2. Set PYTHONPATH when running the tests: PYTHONPATH=. python tests/runalltests.py

Here’s how that stuff is supported by code in M2Crypto:

If you prefer to run tests with nosetests you might need to do something a little different.
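A hedged sketch of option 1, assuming a tests/runalltests.py module that exposes a suite object (note that newer setuptools releases deprecate python setup.py test, so treat this as an illustration of the answer's approach rather than current best practice):

# setup.py
from setuptools import setup, find_packages

setup(
    name='mypackage',                       # hypothetical name
    version='0.1',
    packages=find_packages(exclude=['tests']),
    test_suite='tests.runalltests.suite',   # then run: python setup.py test
)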


回答 12

我们用

app/src/code.py
app/testing/code_test.py 
app/docs/..

在每个测试文件中,我们都会把../src/插入到sys.path中。这不是最优雅的解决方案,但可行。我认为,如果有人能搞出类似java中maven那样的东西,无论您做什么项目都能直接套用一套可用的标准约定,那就太好了。

We use

app/src/code.py
app/testing/code_test.py 
app/docs/..

In each test file we insert ../src/ into sys.path. It’s not the nicest solution but it works. I think it would be great if someone came up with something like maven in java that gives you standard conventions that just work, no matter what project you work on.


回答 13

如果测试很简单,只需把它们放在docstring中,Python的大多数测试框架都能利用它:

>>> import module
>>> module.method('test')
'testresult'

对于其他更复杂的测试,我会把它们放在../tests/test_module.py或tests/test_module.py中。

If the tests are simple, simply put them in the docstring — most of the test frameworks for Python will be able to use that:

>>> import module
>>> module.method('test')
'testresult'

For other more involved tests, I’d put them either in ../tests/test_module.py or in tests/test_module.py.
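A runnable sketch of the docstring approach (module and function names are invented):

# mymath.py
def square(x):
    """Return x squared.

    >>> square(3)
    9
    >>> square(-2)
    4
    """
    return x * x

if __name__ == '__main__':
    import doctest
    doctest.testmod()   # or run: python -m doctest mymath.py -v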


回答 14

在C#中,我通常将测试分为一个单独的程序集。

到目前为止,在Python中,我倾向于编写doctest(测试位于函数的docstring中),或者把它们放在模块底部的if __name__ == "__main__"块中。

In C#, I’ve generally separated the tests into a separate assembly.

In Python — so far — I’ve tended to either write doctests, where the test is in the docstring of a function, or put them in the if __name__ == "__main__" block at the bottom of the module.


回答 15

在编写名为“foo”的程序包时,我会把单元测试放入单独的程序包“foo_test”中。其模块和子包与被测(SUT)包的模块同名,例如,模块foo.x.y的测试位于foo_test.x.y中。每个测试包的__init__.py文件都包含一个AllTests套件,其中汇总了该包的所有测试套件。setuptools提供了一种方便的方式来指定主测试包,这样在运行“python setup.py develop”之后,您就可以用“python setup.py test”运行全部测试,或者用“python setup.py test -s foo_test.x.SomeTestSuite”只运行某个特定的套件。

When writing a package called “foo”, I will put unit tests into a separate package “foo_test”. Modules and subpackages will then have the same name as the SUT package module. E.g. tests for a module foo.x.y are found in foo_test.x.y. The __init__.py files of each testing package then contain an AllTests suite that includes all test suites of the package. setuptools provides a convenient way to specify the main testing package, so that after “python setup.py develop” you can just use “python setup.py test” or “python setup.py test -s foo_test.x.SomeTestSuite” to run just a specific suite.


回答 16

我将测试与被测代码(CUT)放在同一目录中;foo.py的测试会放在foo_ut.py或类似的文件中。(我调整了测试发现过程来找到这些文件。)

这会让测试在目录列表中紧挨着代码,使测试的存在一目了然,并且在测试位于单独文件中时也能尽可能方便地打开它们。(对于命令行编辑器,用vim foo*即可;使用图形化文件管理器时,只需点击CUT文件,再点击紧挨着它的测试文件。)

正如其他人指出的那样,如果需要的话,这也使得重构和提取代码以在其他地方使用变得更加容易。

我真的不喜欢将测试放在完全不同的目录树中的想法;为什么在使用CUT打开文件时,使开发人员更难以打开测试?并不是说绝大多数开发人员都热衷于编写或调整测试,以至于他们会忽略这样做的任何障碍,而不是以障碍为借口。(根据我的经验,情况恰恰相反;即使您使它尽可能地容易,我也知道许多开发人员不会为编写测试而烦恼。)

I put my tests in the same directory as the code under test (CUT); for foo.py the tests will be in foo_ut.py or similar. (I tweak the test discovery process to find these.)

This puts the tests right beside the code in a directory listing, making it obvious that tests are there, and makes opening the tests as easy as it can possibly be when they’re in a separate file. (For command line editors, vim foo* and when using a graphical filesystem browser, just click on the CUT file and then the immediately adjacent test file.)

As others have pointed out, this also makes it easier to refactor and to extract the code for use elsewhere should that ever be necessary.

I really dislike the idea of putting tests in a completely different directory tree; why make it harder than necessary for developers to open up the tests when they’re opening the file with the CUT? It’s not like the vast majority of developers are so keen on writing or tweaking tests that they’ll ignore any barrier to doing that, instead of using the barrier as an excuse. (Quite the opposite, in my experience; even when you make it as easy as possible I know many developers who can’t be bothered to write tests.)


回答 17

我最近才开始用Python编程,所以还没真正有机会弄清楚最佳实践。不过,我写了一个模块,它会查找并运行所有测试。

所以我有:

app/
 appfile.py
test/
 appfileTest.py

等我逐步做更大的项目时,再看看效果如何。

I’ve recently started to program in Python, so I’ve not really had chance to find out best practice yet. But, I’ve written a module that goes and finds all the tests and runs them.

So, I have:

app/
 appfile.py
test/
 appfileTest.py

I’ll have to see how it goes as I progress to larger projects.


如何在Python中声明数组?

问题:如何在Python中声明数组?

如何在Python中声明数组?

我在文档中找不到对数组的任何引用。

How do I declare an array in Python?

I can’t find any reference to arrays in the documentation.


回答 0

variable = []

现在variable引用一个空列表*

当然,这是分配,而不是声明。在Python中,没有办法说“此变量不应引用列表以外的任何东西”,因为Python是动态类型的。


*默认的内置Python类型称为list,而不是数组。它是一个任意长度的有序容器,可以容纳异构对象集合(它们的类型无关紧要,可以自由混合)。请勿将其与array模块混淆,后者提供的类型更接近C array类型。内容必须是同质的(所有类型都相同),但是长度仍然是动态的。

variable = []

Now variable refers to an empty list*.

Of course this is an assignment, not a declaration. There’s no way to say in Python “this variable should never refer to anything other than a list”, since Python is dynamically typed.


*The default built-in Python type is called a list, not an array. It is an ordered container of arbitrary length that can hold a heterogenous collection of objects (their types do not matter and can be freely mixed). This should not be confused with the array module, which offers a type closer to the C array type; the contents must be homogenous (all of the same type), but the length is still dynamic.
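A minimal sketch of that distinction, purely for illustration:

from array import array

mixed = [1, 'two', 3.0]          # list: heterogeneous, holds arbitrary objects
ints = array('i', [1, 2, 3])     # array.array: homogeneous C-style integers
ints.append(4)                   # fine, length is still dynamic
# ints.append('five')            # would raise TypeError: an integer is required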


回答 1

这是Python中令人惊讶的复杂主题。

实用答案

数组由list类表示(请参阅参考文档,不要把它们与生成器混淆)。

查看用法示例:

# empty array
arr = [] 

# init with values (can contain mixed types)
arr = [1, "eels"]

# get item by index (can be negative to access end of array)
arr = [1, 2, 3, 4, 5, 6]
arr[0]  # 1
arr[-1] # 6

# get length
length = len(arr)

# supports append and insert
arr.append(8)
arr.insert(6, 7)

理论答案

在底层,Python的list是对一个真实数组的包装,该数组保存着对各个项的引用。此外,底层数组在创建时会预留一些额外空间。

其后果是:

  • 随机访问真的很便宜(访问arr[6653]和arr[0]一样快)
  • 只要底层还有额外空间,append操作就是“免费的”
  • insert操作开销很大

检查这张很棒的操作复杂性表

另外,请参见这张图片,我在其中试图展示数组、引用数组和链表之间最重要的区别:

This is a surprisingly complex topic in Python.

Practical answer

Arrays are represented by class list (see reference and do not mix them with generators).

Check out usage examples:

# empty array
arr = [] 

# init with values (can contain mixed types)
arr = [1, "eels"]

# get item by index (can be negative to access end of array)
arr = [1, 2, 3, 4, 5, 6]
arr[0]  # 1
arr[-1] # 6

# get length
length = len(arr)

# supports append and insert
arr.append(8)
arr.insert(6, 7)

Theoretical answer

Under the hood Python’s list is a wrapper for a real array which contains references to items. Also, the underlying array is created with some extra space.

Consequences of this are:

  • random access is really cheap (arr[6653] is as cheap as arr[0])
  • the append operation is ‘for free’ while there is some extra space left
  • the insert operation is expensive

Check this awesome table of operations complexity.

Also, please see this picture, where I’ve tried to show most important differences between array, array of references and linked list:
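Separately from the diagram referenced above, here is a small timing sketch that makes the cost difference concrete (exact numbers will vary by machine):

from timeit import timeit

setup = "lst = list(range(100000))"
print(timeit("lst.append(0)", setup=setup, number=10000))     # amortized O(1)
print(timeit("lst[50000]", setup=setup, number=10000))        # O(1) random access
print(timeit("lst.insert(0, 0)", setup=setup, number=10000))  # O(n): shifts every element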


回答 2

您实际上并没有声明任何东西,但这是在Python中创建数组的方式:

from array import array
intarray = array('i')

有关更多信息,请参见数组模块:http : //docs.python.org/library/array.html

现在可能您不想要数组,而是列表,但是其他人已经回答了。:)

You don’t actually declare things, but this is how you create an array in Python:

from array import array
intarray = array('i')

For more info see the array module: http://docs.python.org/library/array.html

Now, possibly you don’t want an array but a list, but others have answered that already. :)


回答 3

我认为您想要一个列表,其中前30个单元格已经填充。所以

   f = []

   for i in range(30):
       f.append(0)

斐波那契数列就是一个可以使用它的例子。请参阅欧拉计画中的问题2

I think you want a list with the first 30 cells already filled. So

   f = []

   for i in range(30):
       f.append(0)

An example of where this could be used is the Fibonacci sequence. See problem 2 in Project Euler.


回答 4

这是这样的:

my_array = [1, 'rebecca', 'allard', 15]

This is how:

my_array = [1, 'rebecca', 'allard', 15]

回答 5

对于计算,请使用如下的numpy数组:

import numpy as np

a = np.ones((3,2))        # a 2D array with 3 rows, 2 columns, filled with ones
b = np.array([1,2,3])     # a 1D array initialised using a list [1,2,3]
c = np.linspace(2,3,100)  # an array with 100 points between (and including) 2 and 3

print(a*1.5)  # all elements of a times 1.5
print(a.T+b)  # b added to the transpose of a

这些numpy数组可以从磁盘保存和加载(甚至压缩),并且包含大量元素的复杂计算的速度类似于C。

在科学计算环境中被大量使用。更多信息请参见这里。

For calculations, use numpy arrays like this:

import numpy as np

a = np.ones((3,2))        # a 2D array with 3 rows, 2 columns, filled with ones
b = np.array([1,2,3])     # a 1D array initialised using a list [1,2,3]
c = np.linspace(2,3,100)  # an array with 100 points between (and including) 2 and 3

print(a*1.5)  # all elements of a times 1.5
print(a.T+b)  # b added to the transpose of a

these numpy arrays can be saved and loaded from disk (even compressed) and complex calculations with large amounts of elements are C-like fast.

Much used in scientific environments. See here for more.


回答 6

JohnMachin的评论应该是真正的答案。我认为所有其他答案只是解决方法!所以:

array=[0]*element_count

JohnMachin’s comment should be the real answer. All the other answers are just workarounds in my opinion! So:

array=[0]*element_count

回答 7

有些回答说python中的数组由列表表示,这是不正确的。Python在标准库模块array中有一个独立的array()实现,即“array.array()”,因此把两者混为一谈是不对的。列表就是python中的列表,所以使用术语时要谨慎。

list_01 = [4, 6.2, 7-2j, 'flo', 'cro']

list_01
Out[85]: [4, 6.2, (7-2j), 'flo', 'cro']

list和array.array()之间有一个非常重要的区别:虽然这两种对象都是有序序列,但array.array()是有序的同质序列,而列表是非同质序列。

A couple of contributions suggested that arrays in python are represented by lists. This is incorrect. Python has an independent implementation of array() in the standard library module arrayarray.array()” hence it is incorrect to confuse the two. Lists are lists in python so be careful with the nomenclature used.

list_01 = [4, 6.2, 7-2j, 'flo', 'cro']

list_01
Out[85]: [4, 6.2, (7-2j), 'flo', 'cro']

There is one very important difference between list and array.array(). While both of these objects are ordered sequences, array.array() is an ordered homogeneous sequence whereas a list is a non-homogeneous sequence.


回答 8

您无需在Python中声明任何内容。您只需要使用它。我建议您从http://diveintopython.net之类的东西开始。

You don’t declare anything in Python. You just use it. I recommend you start out with something like http://diveintopython.net.


回答 9

我通常只是写a = [1,2,3],它实际上是一个list;至于数组(array),请看这个正式定义。

I would normally just do a = [1,2,3], which is actually a list, but for arrays look at this formal definition.


回答 10

作为对Lennart答案的补充,可以像这样创建一个数组:

from array import array
float_array = array("f",values)

其中values可以是元组、列表或np.array,但不能是array:

values = [1,2,3]
values = (1,2,3)
values = np.array([1,2,3],'f')
# 'i' will work here too, but if array is 'i' then values have to be int
wrong_values = array('f',[1,2,3])
# TypeError: 'array.array' object is not callable

并且输出仍然是相同的:

print(float_array)
print(float_array[1])
print(isinstance(float_array[1],float))

# array('f', [1.0, 2.0, 3.0])
# 2.0
# True

list的大多数方法也适用于数组,常见的方法是pop(),extend()和append()。

从答案和评论来看,似乎数组数据结构并不流行。我喜欢它,就像人们可能会更喜欢元组而不是列表一样。

数组结构比列表或np.array具有更严格的规则,这可以减少错误并简化调试,尤其是在处理数字数据时。

尝试将浮点数插入/附加到int数组将引发TypeError:

values = [1,2,3]
int_array = array("i",values)
int_array.append(float(1))
# or int_array.extend([float(1)])

# TypeError: integer argument expected, got float

因此,把本应是整数的值(例如索引列表)保存为数组形式,可以避免“TypeError: list indices must be integers, not float”,因为数组可以像np.array和list一样被迭代:

int_array = array('i',[1,2,3])
data = [11,22,33,44,55]
sample = []
for i in int_array:
    sample.append(data[i])

烦人的是,将int附加到float数组将导致int成为float,而不会引发异常。

np.array的条目同样保持相同的数据类型,但它不会报错,而是会更改自身的数据类型以适应新条目(通常变为double或str):

import numpy as np
numpy_int_array = np.array([1,2,3],'i')
for i in numpy_int_array:
    print(type(i))
    # <class 'numpy.int32'>
numpy_int_array_2 = np.append(numpy_int_array,int(1))
# still <class 'numpy.int32'>
numpy_float_array = np.append(numpy_int_array,float(1))
# <class 'numpy.float64'> for all values
numpy_str_array = np.append(numpy_int_array,"1")
# <class 'numpy.str_'> for all values
data = [11,22,33,44,55]
sample = []
for i in numpy_int_array_2:
    sample.append(data[i])
    # no problem here, but TypeError for the other two

在分配期间也是如此。如果指定了数据类型,则np.array将在可能的情况下将条目转换为该数据类型:

int_numpy_array = np.array([1,2,float(3)],'i')
# 3 becomes an int
int_numpy_array_2 = np.array([1,2,3.9],'i')
# 3.9 gets truncated to 3 (same as int(3.9))
invalid_array = np.array([1,2,"string"],'i')
# ValueError: invalid literal for int() with base 10: 'string'
# Same error as int('string')
str_numpy_array = np.array([1,2,3],'str')
print(str_numpy_array)
print([type(i) for i in str_numpy_array])
# ['1' '2' '3']
# <class 'numpy.str_'>

或者,本质上:

data = [1.2,3.4,5.6]
list_1 = np.array(data,'i').tolist()
list_2 = [int(i) for i in data]
print(list_1 == list_2)
# True

而数组只会给出:

invalid_array = array([1,2,3.9],'i')
# TypeError: integer argument expected, got float

因此,对特定于类型的命令使用np.array不是一个好主意。数组结构在这里很有用。list保留值的数据类型。

还有一点我觉得相当烦人:数据类型在array()中是第一个参数,而在np.array()中(通常)是第二个参数。:|

与C的关系在这里引用: Python列表与数组-何时使用?

祝您探索愉快!

注意:数组这种带类型、相当严格的特性更接近C而不是Python;按照设计,Python的函数中并没有多少特定于类型的约束。它的不流行也在协作工作中形成了正反馈,替换它通常只需要额外加一个[int(x) for x in file]。因此,完全无视数组的存在是完全可行且合理的,它不会以任何方式妨碍我们大多数人。:D

To add to Lennart’s answer, an array may be created like this:

from array import array
float_array = array("f",values)

where values can take the form of a tuple, list, or np.array, but not array:

values = [1,2,3]
values = (1,2,3)
values = np.array([1,2,3],'f')
# 'i' will work here too, but if array is 'i' then values have to be int
wrong_values = array('f',[1,2,3])
# TypeError: 'array.array' object is not callable

and the output will still be the same:

print(float_array)
print(float_array[1])
print(isinstance(float_array[1],float))

# array('f', [1.0, 2.0, 3.0])
# 2.0
# True

Most methods for list work with array as well, common ones being pop(), extend(), and append().

Judging from the answers and comments, it appears that the array data structure isn’t that popular. I like it though, the same way as one might prefer a tuple over a list.

The array structure has stricter rules than a list or np.array, and this can reduce errors and make debugging easier, especially when working with numerical data.

Attempts to insert/append a float to an int array will throw a TypeError:

values = [1,2,3]
int_array = array("i",values)
int_array.append(float(1))
# or int_array.extend([float(1)])

# TypeError: integer argument expected, got float

Keeping values which are meant to be integers (e.g. list of indices) in the array form may therefore prevent a “TypeError: list indices must be integers, not float”, since arrays can be iterated over, similar to np.array and lists:

int_array = array('i',[1,2,3])
data = [11,22,33,44,55]
sample = []
for i in int_array:
    sample.append(data[i])

Annoyingly, appending an int to a float array will cause the int to become a float, without throwing an exception.

np.array retain the same data type for its entries too, but instead of giving an error it will change its data type to fit new entries (usually to double or str):

import numpy as np
numpy_int_array = np.array([1,2,3],'i')
for i in numpy_int_array:
    print(type(i))
    # <class 'numpy.int32'>
numpy_int_array_2 = np.append(numpy_int_array,int(1))
# still <class 'numpy.int32'>
numpy_float_array = np.append(numpy_int_array,float(1))
# <class 'numpy.float64'> for all values
numpy_str_array = np.append(numpy_int_array,"1")
# <class 'numpy.str_'> for all values
data = [11,22,33,44,55]
sample = []
for i in numpy_int_array_2:
    sample.append(data[i])
    # no problem here, but TypeError for the other two

This is true during assignment as well. If the data type is specified, np.array will, wherever possible, transform the entries to that data type:

int_numpy_array = np.array([1,2,float(3)],'i')
# 3 becomes an int
int_numpy_array_2 = np.array([1,2,3.9],'i')
# 3.9 gets truncated to 3 (same as int(3.9))
invalid_array = np.array([1,2,"string"],'i')
# ValueError: invalid literal for int() with base 10: 'string'
# Same error as int('string')
str_numpy_array = np.array([1,2,3],'str')
print(str_numpy_array)
print([type(i) for i in str_numpy_array])
# ['1' '2' '3']
# <class 'numpy.str_'>

or, in essence:

data = [1.2,3.4,5.6]
list_1 = np.array(data,'i').tolist()
list_2 = [int(i) for i in data]
print(list_1 == list_2)
# True

while array will simply give:

invalid_array = array([1,2,3.9],'i')
# TypeError: integer argument expected, got float

Because of this, it is not a good idea to use np.array for type-specific commands. The array structure is useful here. list preserves the data type of the values.

And for something I find rather pesky: the data type is specified as the first argument in array(), but (usually) the second in np.array(). :|

The relation to C is referred to here: Python List vs. Array – when to use?

Have fun exploring!

Note: The typed and rather strict nature of array leans more towards C rather than Python, and by design Python does not have many type-specific constraints in its functions. Its unpopularity also creates a positive feedback in collaborative work, and replacing it mostly involves an additional [int(x) for x in file]. It is therefore entirely viable and reasonable to ignore the existence of array. It shouldn’t hinder most of us in any way. :D


回答 11

这个怎么样…

>>> a = range(12)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> a[7]
7

How about this…

>>> a = range(12)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> a[7]
7

回答 12

Python将它们称为list。您可以使用方括号和逗号编写列表文字:

>>> [6,28,496,8128]
[6, 28, 496, 8128]

Python calls them lists. You can write a list literal with square brackets and commas:

>>> [6,28,496,8128]
[6, 28, 496, 8128]

回答 13

继Lennart的回答之后,还有numpy,它实现了同质的多维数组。

Following on from Lennart, there’s also numpy which implements homogeneous multi-dimensional arrays.


回答 14

我有一个字符串数组,并且需要一个相同长度的布尔值数组(初始化为True)。这就是我所做的

strs = ["Hi","Bye"] 
bools = [ True for s in strs ]

I had an array of strings and needed an array of the same length of booleans initialized to True. This is what I did

strs = ["Hi","Bye"] 
bools = [ True for s in strs ]

回答 15

您可以创建列表并将其转换为数组,也可以使用numpy模块创建数组。以下是一些说明此问题的示例。Numpy还使使用多维数组更容易。

import numpy as np
a = np.array([1, 2, 3, 4])

#For custom inputs
a = np.array([int(x) for x in input().split()])

您还可以使用reshape函数把这个数组重塑为2X2矩阵,该函数以矩阵的维度作为输入。

mat = a.reshape(2, 2)

You can create lists and convert them into arrays or you can create array using numpy module. Below are few examples to illustrate the same. Numpy also makes it easier to work with multi-dimensional arrays.

import numpy as np
a = np.array([1, 2, 3, 4])

#For custom inputs
a = np.array([int(x) for x in input().split()])

You can also reshape this array into a 2X2 matrix using the reshape function, which takes the dimensions of the matrix as input.

mat = a.reshape(2, 2)

如何在Python中创建目录的zip存档?

问题:如何在Python中创建目录的zip存档?

如何在Python中创建目录结构的zip存档?

How can I create a zip archive of a directory structure in Python?


回答 0

正如其他人指出的那样,您应该使用zipfile。该文档告诉您可用的功能,但并未真正说明如何使用它们来压缩整个目录。我认为用一些示例代码来解释是最简单的:

#!/usr/bin/env python
import os
import zipfile

def zipdir(path, ziph):
    # ziph is zipfile handle
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file))

if __name__ == '__main__':
    zipf = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)
    zipdir('tmp/', zipf)
    zipf.close()

改编自:http : //www.devshed.com/c/a/Python/Python-UnZipped/

As others have pointed out, you should use zipfile. The documentation tells you what functions are available, but doesn’t really explain how you can use them to zip an entire directory. I think it’s easiest to explain with some example code:

#!/usr/bin/env python
import os
import zipfile

def zipdir(path, ziph):
    # ziph is zipfile handle
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file))

if __name__ == '__main__':
    zipf = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)
    zipdir('tmp/', zipf)
    zipf.close()

Adapted from: http://www.devshed.com/c/a/Python/Python-UnZipped/


回答 1

最简单的方法是使用shutil.make_archive。它支持zip和tar格式。

import shutil
shutil.make_archive(output_filename, 'zip', dir_name)

如果您需要做比压缩整个目录更复杂的事情(例如跳过某些文件),那么就需要像其他人建议的那样深入研究zipfile模块。

The easiest way is to use shutil.make_archive. It supports both zip and tar formats.

import shutil
shutil.make_archive(output_filename, 'zip', dir_name)

If you need to do something more complicated than zipping the whole directory (such as skipping certain files), then you’ll need to dig into the zipfile module as others have suggested.


回答 2

要把mydirectory的内容(包括所有文件和子目录)添加到一个新的zip文件中:

import os
import zipfile

zf = zipfile.ZipFile("myzipfile.zip", "w")
for dirname, subdirs, files in os.walk("mydirectory"):
    zf.write(dirname)
    for filename in files:
        zf.write(os.path.join(dirname, filename))
zf.close()

To add the contents of mydirectory to a new zip file, including all files and subdirectories:

import os
import zipfile

zf = zipfile.ZipFile("myzipfile.zip", "w")
for dirname, subdirs, files in os.walk("mydirectory"):
    zf.write(dirname)
    for filename in files:
        zf.write(os.path.join(dirname, filename))
zf.close()

回答 3

如何在Python中创建目录结构的zip存档?

在Python脚本中

在Python 2.7+中,shutil具有make_archive功能。

from shutil import make_archive
make_archive(
  'zipfile_name', 
  'zip',           # the archive format - or tar, bztar, gztar 
  root_dir=None,   # root for archive - current working dir if None
  base_dir=None)   # start archiving from here - cwd if None too

此处生成的压缩存档将命名为zipfile_name.zip。如果base_dir比root_dir更深,它会排除不在base_dir中的文件,但仍会把直到root_dir为止的各级父目录中的文件归档进去。

我在使用2.7的Cygwin上测试时确实遇到了问题-它需要一个root_dir参数,用于cwd:

make_archive('zipfile_name', 'zip', root_dir='.')

从外壳使用Python

您也可以在shell中通过zipfile模块使用Python:

$ python -m zipfile -c zipname sourcedir

其中zipname是您想要的目标文件名(如果想要.zip后缀请自行加上,它不会自动添加),而sourcedir是目录的路径。

压缩Python包(或者只是不想要父目录):

如果您想压缩一个带有__init__.py和__main__.py的Python包,并且不想要父目录,那么:

$ python -m zipfile -c zipname sourcedir/*

$ python zipname

将运行该软件包。(请注意,您不能将子包作为压缩存档的入口点运行。)

压缩Python应用程序:

如果您用的是python3.5+,并且特别想压缩一个Python包,请使用zipapp:

$ python -m zipapp myapp
$ python myapp.pyz

How can I create a zip archive of a directory structure in Python?

In a Python script

In Python 2.7+, shutil has a make_archive function.

from shutil import make_archive
make_archive(
  'zipfile_name', 
  'zip',           # the archive format - or tar, bztar, gztar 
  root_dir=None,   # root for archive - current working dir if None
  base_dir=None)   # start archiving from here - cwd if None too

Here the zipped archive will be named zipfile_name.zip. If base_dir is farther down from root_dir it will exclude files not in the base_dir, but still archive the files in the parent dirs up to the root_dir.

I did have an issue testing this on Cygwin with 2.7 – it wants a root_dir argument, for cwd:

make_archive('zipfile_name', 'zip', root_dir='.')

Using Python from the shell

You can do this with Python from the shell also using the zipfile module:

$ python -m zipfile -c zipname sourcedir

Where zipname is the name of the destination file you want (add .zip if you want it, it won’t do it automatically) and sourcedir is the path to the directory.

Zipping up Python (or just don’t want parent dir):

If you’re trying to zip up a python package with a __init__.py and __main__.py, and you don’t want the parent dir, it’s

$ python -m zipfile -c zipname sourcedir/*

And

$ python zipname

would run the package. (Note that you can’t run subpackages as the entry point from a zipped archive.)

Zipping a Python app:

If you have python3.5+, and specifically want to zip up a Python package, use zipapp:

$ python -m zipapp myapp
$ python myapp.pyz

回答 4

此函数会递归地压缩整个目录树,压缩其中的文件,并在存档中记录正确的相对文件名。生成的存档条目与zip -r output.zip source_dir生成的条目相同。

import os
import zipfile
def make_zipfile(output_filename, source_dir):
    relroot = os.path.abspath(os.path.join(source_dir, os.pardir))
    with zipfile.ZipFile(output_filename, "w", zipfile.ZIP_DEFLATED) as zip:
        for root, dirs, files in os.walk(source_dir):
            # add directory (needed for empty dirs)
            zip.write(root, os.path.relpath(root, relroot))
            for file in files:
                filename = os.path.join(root, file)
                if os.path.isfile(filename): # regular files only
                    arcname = os.path.join(os.path.relpath(root, relroot), file)
                    zip.write(filename, arcname)

This function will recursively zip up a directory tree, compressing the files, and recording the correct relative filenames in the archive. The archive entries are the same as those generated by zip -r output.zip source_dir.

import os
import zipfile
def make_zipfile(output_filename, source_dir):
    relroot = os.path.abspath(os.path.join(source_dir, os.pardir))
    with zipfile.ZipFile(output_filename, "w", zipfile.ZIP_DEFLATED) as zip:
        for root, dirs, files in os.walk(source_dir):
            # add directory (needed for empty dirs)
            zip.write(root, os.path.relpath(root, relroot))
            for file in files:
                filename = os.path.join(root, file)
                if os.path.isfile(filename): # regular files only
                    arcname = os.path.join(os.path.relpath(root, relroot), file)
                    zip.write(filename, arcname)
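A one-line usage sketch (the paths are hypothetical):

# produces the same entries as: zip -r output.zip source_dir
make_zipfile('output.zip', 'source_dir')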

回答 5

使用shutil,它是python标准库集的一部分。使用shutil非常简单(请参见下面的代码):

  • 第一个参数:生成的zip / tar文件的文件名,
  • 第二个参数:zip / tar,
  • 第三个参数:dir_name

码:

import shutil
shutil.make_archive('/home/user/Desktop/Filename','zip','/home/username/Desktop/Directory')

Use shutil, which is part of python standard library set. Using shutil is so simple(see code below):

  • 1st arg: Filename of resultant zip/tar file,
  • 2nd arg: zip/tar,
  • 3rd arg: dir_name

Code:

import shutil
shutil.make_archive('/home/user/Desktop/Filename','zip','/home/username/Desktop/Directory')

回答 6

要将压缩添加到生成的zip文件中,请查看此链接

您需要更改:

zip = zipfile.ZipFile('Python.zip', 'w')

改为:

zip = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)

For adding compression to the resulting zip file, check out this link.

You need to change:

zip = zipfile.ZipFile('Python.zip', 'w')

to

zip = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)

回答 7

我对Mark Byers给出的代码进行了一些更改。如果有空目录,下面的函数还会添加空目录。通过示例可以更清楚地了解添加到zip的路径是什么。

#!/usr/bin/env python
import os
import zipfile

def addDirToZip(zipHandle, path, basePath=""):
    """
    Adding directory given by \a path to opened zip file \a zipHandle

    @param basePath path that will be removed from \a path when adding to archive

    Examples:
        # add whole "dir" to "test.zip" (when you open "test.zip" you will see only "dir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir')
        zipHandle.close()

        # add contents of "dir" to "test.zip" (when you open "test.zip" you will see only it's contents)
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir', 'dir')
        zipHandle.close()

        # add contents of "dir/subdir" to "test.zip" (when you open "test.zip" you will see only contents of "subdir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir/subdir', 'dir/subdir')
        zipHandle.close()

        # add whole "dir/subdir" to "test.zip" (when you open "test.zip" you will see only "subdir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir/subdir', 'dir')
        zipHandle.close()

        # add whole "dir/subdir" with full path to "test.zip" (when you open "test.zip" you will see only "dir" and inside it only "subdir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir/subdir')
        zipHandle.close()

        # add whole "dir" and "otherDir" (with full path) to "test.zip" (when you open "test.zip" you will see only "dir" and "otherDir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir')
        addDirToZip(zipHandle, 'otherDir')
        zipHandle.close()
    """
    basePath = basePath.rstrip("\\/") + ""
    basePath = basePath.rstrip("\\/")
    for root, dirs, files in os.walk(path):
        # add dir itself (needed for empty dirs
        zipHandle.write(os.path.join(root, "."))
        # add files
        for file in files:
            filePath = os.path.join(root, file)
            inZipPath = filePath.replace(basePath, "", 1).lstrip("\\/")
            #print filePath + " , " + inZipPath
            zipHandle.write(filePath, inZipPath)

上面是一个简单函数,适用于简单情况。您可以在我的Gist中找到更优雅的类:https : //gist.github.com/Eccenux/17526123107ca0ac28e6

I’ve made some changes to the code given by Mark Byers. The function below will also add empty directories if you have them. The examples should make it clearer what path gets added to the zip.

#!/usr/bin/env python
import os
import zipfile

def addDirToZip(zipHandle, path, basePath=""):
    """
    Adding directory given by \a path to opened zip file \a zipHandle

    @param basePath path that will be removed from \a path when adding to archive

    Examples:
        # add whole "dir" to "test.zip" (when you open "test.zip" you will see only "dir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir')
        zipHandle.close()

        # add contents of "dir" to "test.zip" (when you open "test.zip" you will see only it's contents)
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir', 'dir')
        zipHandle.close()

        # add contents of "dir/subdir" to "test.zip" (when you open "test.zip" you will see only contents of "subdir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir/subdir', 'dir/subdir')
        zipHandle.close()

        # add whole "dir/subdir" to "test.zip" (when you open "test.zip" you will see only "subdir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir/subdir', 'dir')
        zipHandle.close()

        # add whole "dir/subdir" with full path to "test.zip" (when you open "test.zip" you will see only "dir" and inside it only "subdir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir/subdir')
        zipHandle.close()

        # add whole "dir" and "otherDir" (with full path) to "test.zip" (when you open "test.zip" you will see only "dir" and "otherDir")
        zipHandle = zipfile.ZipFile('test.zip', 'w')
        addDirToZip(zipHandle, 'dir')
        addDirToZip(zipHandle, 'otherDir')
        zipHandle.close()
    """
    basePath = basePath.rstrip("\\/") + ""
    basePath = basePath.rstrip("\\/")
    for root, dirs, files in os.walk(path):
        # add dir itself (needed for empty dirs
        zipHandle.write(os.path.join(root, "."))
        # add files
        for file in files:
            filePath = os.path.join(root, file)
            inZipPath = filePath.replace(basePath, "", 1).lstrip("\\/")
            #print filePath + " , " + inZipPath
            zipHandle.write(filePath, inZipPath)

Above is a simple function that should work for simple cases. You can find more elegant class in my Gist: https://gist.github.com/Eccenux/17526123107ca0ac28e6


回答 8

现代Python(3.6+)可以使用pathlib模块以类似OOP的简洁方式处理路径,并用pathlib.Path.rglob()进行递归通配。据我所知,这与George V. Reilly的答案等价:带压缩地打包,存档最顶层元素是目录,保留空目录,使用相对路径。

from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

from os import PathLike
from typing import Union


def zip_dir(zip_name: str, source_dir: Union[str, PathLike]):
    src_path = Path(source_dir).expanduser().resolve(strict=True)
    with ZipFile(zip_name, 'w', ZIP_DEFLATED) as zf:
        for file in src_path.rglob('*'):
            zf.write(file, file.relative_to(src_path.parent))

注意:如可选类型提示所指示,zip_name不能是Path对象(将在3.6.2+中修复)。

Modern Python (3.6+) using the pathlib module for concise OOP-like handling of paths, and pathlib.Path.rglob() for recursive globbing. As far as I can tell, this is equivalent to George V. Reilly’s answer: zips with compression, the topmost element is a directory, keeps empty dirs, uses relative paths.

from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

from os import PathLike
from typing import Union


def zip_dir(zip_name: str, source_dir: Union[str, PathLike]):
    src_path = Path(source_dir).expanduser().resolve(strict=True)
    with ZipFile(zip_name, 'w', ZIP_DEFLATED) as zf:
        for file in src_path.rglob('*'):
            zf.write(file, file.relative_to(src_path.parent))

Note: as optional type hints indicate, zip_name can’t be a Path object (would be fixed in 3.6.2+).
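Usage is then just (path and archive name invented):

zip_dir('project_backup.zip', '~/projects/myproject')   # the myproject folder itself becomes the archive root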


回答 9

我还有另一个可能有帮助的代码示例,它使用python3、pathlib和zipfile,应该可以在任何操作系统上运行。

from pathlib import Path
import zipfile
from datetime import datetime

DATE_FORMAT = '%y%m%d'


def date_str():
    """returns the today string year, month, day"""
    return '{}'.format(datetime.now().strftime(DATE_FORMAT))


def zip_name(path):
    """returns the zip filename as string"""
    cur_dir = Path(path).resolve()
    parent_dir = cur_dir.parents[0]
    zip_filename = '{}/{}_{}.zip'.format(parent_dir, cur_dir.name, date_str())
    p_zip = Path(zip_filename)
    n = 1
    while p_zip.exists():
        zip_filename = ('{}/{}_{}_{}.zip'.format(parent_dir, cur_dir.name,
                                             date_str(), n))
        p_zip = Path(zip_filename)
        n += 1
    return zip_filename


def all_files(path):
    """iterator returns all files and folders from path as absolute path string
    """
    for child in Path(path).iterdir():
        yield str(child)
        if child.is_dir():
            for grand_child in all_files(str(child)):
                yield str(Path(grand_child))


def zip_dir(path):
    """generate a zip"""
    zip_filename = zip_name(path)
    zip_file = zipfile.ZipFile(zip_filename, 'w')
    print('create:', zip_filename)
    for file in all_files(path):
        print('adding... ', file)
        zip_file.write(file)
    zip_file.close()


if __name__ == '__main__':
    zip_dir('.')
    print('end!')

I have another code example that may help, using python3, pathlib and zipfile. It should work in any OS.

from pathlib import Path
import zipfile
from datetime import datetime

DATE_FORMAT = '%y%m%d'


def date_str():
    """returns the today string year, month, day"""
    return '{}'.format(datetime.now().strftime(DATE_FORMAT))


def zip_name(path):
    """returns the zip filename as string"""
    cur_dir = Path(path).resolve()
    parent_dir = cur_dir.parents[0]
    zip_filename = '{}/{}_{}.zip'.format(parent_dir, cur_dir.name, date_str())
    p_zip = Path(zip_filename)
    n = 1
    while p_zip.exists():
        zip_filename = ('{}/{}_{}_{}.zip'.format(parent_dir, cur_dir.name,
                                             date_str(), n))
        p_zip = Path(zip_filename)
        n += 1
    return zip_filename


def all_files(path):
    """iterator returns all files and folders from path as absolute path string
    """
    for child in Path(path).iterdir():
        yield str(child)
        if child.is_dir():
            for grand_child in all_files(str(child)):
                yield str(Path(grand_child))


def zip_dir(path):
    """generate a zip"""
    zip_filename = zip_name(path)
    zip_file = zipfile.ZipFile(zip_filename, 'w')
    print('create:', zip_filename)
    for file in all_files(path):
        print('adding... ', file)
        zip_file.write(file)
    zip_file.close()


if __name__ == '__main__':
    zip_dir('.')
    print('end!')

回答 10

您可能想看一下zipfile模块;在http://docs.python.org/library/zipfile.html上有文档。

您可能还想用os.walk()来索引目录结构。

You probably want to look at the zipfile module; there’s documentation at http://docs.python.org/library/zipfile.html.

You may also want to use os.walk() to index the directory structure.


回答 11

这是Nux给出的答案的变体,它对我有用:

def WriteDirectoryToZipFile( zipHandle, srcPath, zipLocalPath = "", zipOperation = zipfile.ZIP_DEFLATED ):
    basePath = os.path.split( srcPath )[ 0 ]
    for root, dirs, files in os.walk( srcPath ):
        p = os.path.join( zipLocalPath, root [ ( len( basePath ) + 1 ) : ] )
        # add dir
        zipHandle.write( root, p, zipOperation )
        # add files
        for f in files:
            filePath = os.path.join( root, f )
            fileInZipPath = os.path.join( p, f )
            zipHandle.write( filePath, fileInZipPath, zipOperation )

Here is a variation on the answer given by Nux that works for me:

def WriteDirectoryToZipFile( zipHandle, srcPath, zipLocalPath = "", zipOperation = zipfile.ZIP_DEFLATED ):
    basePath = os.path.split( srcPath )[ 0 ]
    for root, dirs, files in os.walk( srcPath ):
        p = os.path.join( zipLocalPath, root [ ( len( basePath ) + 1 ) : ] )
        # add dir
        zipHandle.write( root, p, zipOperation )
        # add files
        for f in files:
            filePath = os.path.join( root, f )
            fileInZipPath = os.path.join( p, f )
            zipHandle.write( filePath, fileInZipPath, zipOperation )

回答 12

试试下面这个,对我有用。

import zipfile, os
zipf = "compress.zip"  
def main():
    directory = r"Filepath"
    toZip(directory)
def toZip(directory):
    zippedHelp = zipfile.ZipFile(zipf, "w", compression=zipfile.ZIP_DEFLATED )

    list = os.listdir(directory)
    for file_list in list:
        file_name = os.path.join(directory,file_list)

        if os.path.isfile(file_name):
            print file_name
            zippedHelp.write(file_name)
        else:
            addFolderToZip(zippedHelp,file_list,directory)
            print "---------------Directory Found-----------------------"
    zippedHelp.close()

def addFolderToZip(zippedHelp,folder,directory):
    path=os.path.join(directory,folder)
    print path
    file_list=os.listdir(path)
    for file_name in file_list:
        file_path=os.path.join(path,file_name)
        if os.path.isfile(file_path):
            zippedHelp.write(file_path)
        elif os.path.isdir(file_path):
            print "------------------sub directory found--------------------"
            addFolderToZip(zippedHelp,file_name,path)


if __name__=="__main__":
    main()

Try the one below; it worked for me.

import zipfile, os
zipf = "compress.zip"  
def main():
    directory = r"Filepath"
    toZip(directory)
def toZip(directory):
    zippedHelp = zipfile.ZipFile(zipf, "w", compression=zipfile.ZIP_DEFLATED )

    list = os.listdir(directory)
    for file_list in list:
        file_name = os.path.join(directory,file_list)

        if os.path.isfile(file_name):
            print file_name
            zippedHelp.write(file_name)
        else:
            addFolderToZip(zippedHelp,file_list,directory)
            print "---------------Directory Found-----------------------"
    zippedHelp.close()

def addFolderToZip(zippedHelp,folder,directory):
    path=os.path.join(directory,folder)
    print path
    file_list=os.listdir(path)
    for file_name in file_list:
        file_path=os.path.join(path,file_name)
        if os.path.isfile(file_path):
            zippedHelp.write(file_path)
        elif os.path.isdir(file_path):
            print "------------------sub directory found--------------------"
            addFolderToZip(zippedHelp,file_name,path)


if __name__=="__main__":
    main()

回答 13

如果想要类似常见图形文件管理器中“压缩文件夹”那样的功能,可以使用以下代码,它使用 zipfile 模块。使用此代码,得到的 zip 文件将以 path 目录作为其根文件夹。

import os
import zipfile

def zipdir(path, ziph):
    # Iterate all the directories and files
    for root, dirs, files in os.walk(path):
        # Create a prefix variable with the folder structure inside the path folder. 
        # So if a file is at the path directory will be at the root directory of the zip file
        # so the prefix will be empty. If the file belongs to a containing folder of path folder 
        # then the prefix will be that folder.
        if root.replace(path,'') == '':
                prefix = ''
        else:
                # Keep the folder structure after the path folder, append a '/' at the end 
                # and remove the first character, if it is a '/', in order to have a path like
                # folder1/folder2/file.txt
                prefix = root.replace(path, '') + '/'
                if (prefix[0] == '/'):
                        prefix = prefix[1:]
        for filename in files:
                actual_file_path = root + '/' + filename
                zipped_file_path = prefix + filename
                ziph.write(actual_file_path, zipped_file_path)


zipf = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)
zipdir('/tmp/justtest/', zipf)
zipf.close()

If you want functionality like the "compress folder" action of any common graphical file manager, you can use the following code; it uses the zipfile module. With this code, the resulting zip file will have the path directory as its root folder.

import os
import zipfile

def zipdir(path, ziph):
    # Iterate all the directories and files
    for root, dirs, files in os.walk(path):
        # Create a prefix variable with the folder structure inside the path folder. 
        # So if a file is at the path directory will be at the root directory of the zip file
        # so the prefix will be empty. If the file belongs to a containing folder of path folder 
        # then the prefix will be that folder.
        if root.replace(path,'') == '':
                prefix = ''
        else:
                # Keep the folder structure after the path folder, append a '/' at the end 
                # and remove the first character, if it is a '/', in order to have a path like
                # folder1/folder2/file.txt
                prefix = root.replace(path, '') + '/'
                if (prefix[0] == '/'):
                        prefix = prefix[1:]
        for filename in files:
                actual_file_path = root + '/' + filename
                zipped_file_path = prefix + filename
                ziph.write(actual_file_path, zipped_file_path)


zipf = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)
zipdir('/tmp/justtest/', zipf)
zipf.close()

回答 14

为了提供更大的灵活性,例如,按名称选择目录/文件,请使用:

import os
import zipfile

def zipall(ob, path, rel=""):
    basename = os.path.basename(path)
    if os.path.isdir(path):
        if rel == "":
            rel = basename
        ob.write(path, os.path.join(rel))
        for root, dirs, files in os.walk(path):
            for d in dirs:
                zipall(ob, os.path.join(root, d), os.path.join(rel, d))
            for f in files:
                ob.write(os.path.join(root, f), os.path.join(rel, f))
            break
    elif os.path.isfile(path):
        ob.write(path, os.path.join(rel, basename))
    else:
        pass

对于文件树:

.
├── dir
│   ├── dir2
│   │   └── file2.txt
│   ├── dir3
│   │   └── file3.txt
│   └── file.txt
├── dir4
│   ├── dir5
│   └── file4.txt
├── listdir.zip
├── main.py
├── root.txt
└── selective.zip

您可以例如仅选择dir4root.txt

cwd = os.getcwd()
files = [os.path.join(cwd, f) for f in ['dir4', 'root.txt']]

with zipfile.ZipFile("selective.zip", "w" ) as myzip:
    for f in files:
        zipall(myzip, f)

或者只是对脚本调用所在的目录执行 listdir,然后把其中的所有内容都加进去:

with zipfile.ZipFile("listdir.zip", "w" ) as myzip:
    for f in os.listdir():
        if f == "listdir.zip":
            # Creating a listdir.zip in the same directory
            # will include listdir.zip inside itself, beware of this
            continue
        zipall(myzip, f)

To give more flexibility, e.g. select directory/file by name use:

import os
import zipfile

def zipall(ob, path, rel=""):
    basename = os.path.basename(path)
    if os.path.isdir(path):
        if rel == "":
            rel = basename
        ob.write(path, os.path.join(rel))
        for root, dirs, files in os.walk(path):
            for d in dirs:
                zipall(ob, os.path.join(root, d), os.path.join(rel, d))
            for f in files:
                ob.write(os.path.join(root, f), os.path.join(rel, f))
            break
    elif os.path.isfile(path):
        ob.write(path, os.path.join(rel, basename))
    else:
        pass

For a file tree:

.
├── dir
│   ├── dir2
│   │   └── file2.txt
│   ├── dir3
│   │   └── file3.txt
│   └── file.txt
├── dir4
│   ├── dir5
│   └── file4.txt
├── listdir.zip
├── main.py
├── root.txt
└── selective.zip

You can e.g. select only dir4 and root.txt:

cwd = os.getcwd()
files = [os.path.join(cwd, f) for f in ['dir4', 'root.txt']]

with zipfile.ZipFile("selective.zip", "w" ) as myzip:
    for f in files:
        zipall(myzip, f)

Or just listdir in script invocation directory and add everything from there:

with zipfile.ZipFile("listdir.zip", "w" ) as myzip:
    for f in os.listdir():
        if f == "listdir.zip":
            # Creating a listdir.zip in the same directory
            # will include listdir.zip inside itself, beware of this
            continue
        zipall(myzip, f)

回答 15

假设您要压缩当前目录中的所有文件夹(子目录)。

import os
import zipfile

for root, dirs, files in os.walk("."):
    for sub_dir in dirs:
        zip_you_want = sub_dir+".zip"
        zip_process = zipfile.ZipFile(zip_you_want, "w", zipfile.ZIP_DEFLATED)
        zip_process.write(file_you_want_to_include)  # placeholder: path of the file to include
        zip_process.close()

        print("Successfully zipped directory: {sub_dir}".format(sub_dir=sub_dir))

Say you want to zip all the folders (subdirectories) in the current directory.

import os
import zipfile

for root, dirs, files in os.walk("."):
    for sub_dir in dirs:
        zip_you_want = sub_dir+".zip"
        zip_process = zipfile.ZipFile(zip_you_want, "w", zipfile.ZIP_DEFLATED)
        zip_process.write(file_you_want_to_include)  # placeholder: path of the file to include
        zip_process.close()

        print("Successfully zipped directory: {sub_dir}".format(sub_dir=sub_dir))

回答 16

下面是一种简洁的方法,可在归档时保留父目录之下的文件夹层次结构:

import glob
import os
import zipfile

with zipfile.ZipFile(fp_zip, "w", zipfile.ZIP_DEFLATED) as zipf:
    for fp in glob.glob(os.path.join(parent, "**/*"), recursive=True):
        base = os.path.commonpath([parent, fp])
        zipf.write(fp, arcname=fp.replace(base, ""))

如果需要,可以将其改为使用 pathlib 进行文件 globbing。

For a concise way to retain the folder hierarchy under the parent directory to be archived:

import glob
import os
import zipfile

with zipfile.ZipFile(fp_zip, "w", zipfile.ZIP_DEFLATED) as zipf:
    for fp in glob.glob(os.path.join(parent, "**/*"), recursive=True):
        base = os.path.commonpath([parent, fp])
        zipf.write(fp, arcname=fp.replace(base, ""))

If you want, you could change this to use pathlib for file globbing.
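
For example, a possible pathlib-based variant might look like this (a sketch; parent and fp_zip here are hypothetical names standing for the directory to archive and the output file, as above):

import pathlib
import zipfile

parent = pathlib.Path("parent")   # assumed: directory to archive
fp_zip = "parent.zip"             # assumed: output archive name

with zipfile.ZipFile(fp_zip, "w", zipfile.ZIP_DEFLATED) as zipf:
    # rglob("*") walks the tree recursively, similar to glob's "**/*" with recursive=True.
    for fp in parent.rglob("*"):
        if fp.is_file():
            zipf.write(str(fp), arcname=str(fp.relative_to(parent)))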


回答 17

这里已经有这么多答案了,我还是希望贡献一下我自己的版本。它基于最初的那个答案(顺便一提),但视角更直观一些,并且为每个 zipfile 的创建使用了上下文管理器,还对 os.walk() 的结果进行了排序,以获得有序的输出。

有了这些文件夹和其中的文件(以及其他文件夹),我想为每个 cap_ 文件夹各创建一个 .zip:

$ tree -d
.
├── cap_01
|    ├── 0101000001.json
|    ├── 0101000002.json
|    ├── 0101000003.json
|
├── cap_02
|    ├── 0201000001.json
|    ├── 0201000002.json
|    ├── 0201001003.json
|
├── cap_03
|    ├── 0301000001.json
|    ├── 0301000002.json
|    ├── 0301000003.json
| 
├── docs
|    ├── map.txt
|    ├── main_data.xml
|
├── core_files
     ├── core_master
     ├── core_slave

这是我应用的内容,并带有注释,以使您更好地理解该过程。

$ cat zip_cap_dirs.py 
""" Zip 'cap_*' directories. """           
import os                                                                       
import zipfile as zf                                                            


for root, dirs, files in sorted(os.walk('.')):                                                                                               
    if 'cap_' in root:                                                          
        print(f"Compressing: {root}")                                           
        # Defining .zip name, according to Capítulo.                            
        cap_dir_zip = '{}.zip'.format(root)                                     
        # Opening zipfile context for current root dir.                         
        with zf.ZipFile(cap_dir_zip, 'w', zf.ZIP_DEFLATED) as new_zip:          
            # Iterating over os.walk list of files for the current root dir.    
            for f in files:                                                     
                # Defining relative path to files from current root dir.        
                f_path = os.path.join(root, f)                                  
                # Writing the file on the .zip file of the context              
                new_zip.write(f_path) 

基本上,在对 os.walk(path) 的每次迭代中,我都会为 zipfile 的创建打开一个上下文;随后遍历 files(当前 root 目录下的文件列表),基于当前 root 目录构造每个文件的相对路径,并将其写入正在运行的 zipfile 上下文中。

输出显示如下:

$ python3 zip_cap_dirs.py
Compressing: ./cap_01
Compressing: ./cap_02
Compressing: ./cap_03

要查看每个.zip目录的内容,可以使用以下less命令:

$ less cap_01.zip

Archive:  cap_01.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
  22017  Defl:N     2471  89% 2019-09-05 08:05 7a3b5ec6  cap_01/0101000001.json
  21998  Defl:N     2471  89% 2019-09-05 08:05 155bece7  cap_01/0101000002.json
  23236  Defl:N     2573  89% 2019-09-05 08:05 55fced20  cap_01/0101000003.json
--------          ------- ---                           -------
  67251             7515  89%                            3 files

So many answers here, and I hope I might contribute my own version, which is based on the original answer (by the way), but with a more graphical perspective, also using a context manager for each zipfile setup and sorting os.walk(), in order to have an ordered output.

Having these folders and their files (among other folders), I wanted to create a .zip for each cap_ folder:

$ tree -d
.
├── cap_01
|    ├── 0101000001.json
|    ├── 0101000002.json
|    ├── 0101000003.json
|
├── cap_02
|    ├── 0201000001.json
|    ├── 0201000002.json
|    ├── 0201001003.json
|
├── cap_03
|    ├── 0301000001.json
|    ├── 0301000002.json
|    ├── 0301000003.json
| 
├── docs
|    ├── map.txt
|    ├── main_data.xml
|
├── core_files
     ├── core_master
     ├── core_slave

Here’s what I applied, with comments for better understanding of the process.

$ cat zip_cap_dirs.py 
""" Zip 'cap_*' directories. """           
import os                                                                       
import zipfile as zf                                                            


for root, dirs, files in sorted(os.walk('.')):                                                                                               
    if 'cap_' in root:                                                          
        print(f"Compressing: {root}")                                           
        # Defining .zip name, according to Capítulo.                            
        cap_dir_zip = '{}.zip'.format(root)                                     
        # Opening zipfile context for current root dir.                         
        with zf.ZipFile(cap_dir_zip, 'w', zf.ZIP_DEFLATED) as new_zip:          
            # Iterating over os.walk list of files for the current root dir.    
            for f in files:                                                     
                # Defining relative path to files from current root dir.        
                f_path = os.path.join(root, f)                                  
                # Writing the file on the .zip file of the context              
                new_zip.write(f_path) 

Basically, for each iteration over os.walk(path), I’m opening a context for the zipfile setup and afterwards iterating over files, which is a list of files from the current root directory, forming the relative path for each file based on the current root directory, and appending it to the zipfile context that is running.

And the output is presented like this:

$ python3 zip_cap_dirs.py
Compressing: ./cap_01
Compressing: ./cap_02
Compressing: ./cap_03

To see the contents of each .zip file, you can use the less command:

$ less cap_01.zip

Archive:  cap_01.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
  22017  Defl:N     2471  89% 2019-09-05 08:05 7a3b5ec6  cap_01/0101000001.json
  21998  Defl:N     2471  89% 2019-09-05 08:05 155bece7  cap_01/0101000002.json
  23236  Defl:N     2573  89% 2019-09-05 08:05 55fced20  cap_01/0101000003.json
--------          ------- ---                           -------
  67251             7515  89%                            3 files

回答 18

这是使用pathlib和上下文管理器的一种现代方法。将文件直接放在zip中,而不放在子文件夹中。

import pathlib
import zipfile


def zip_dir(filename: str, dir_to_zip: pathlib.Path):
    with zipfile.ZipFile(filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
        # Use glob instead of iterdir(), to cover all subdirectories.
        for directory in dir_to_zip.glob('**'):
            for file in directory.iterdir():
                if not file.is_file():
                    continue
                # Strip the first component, so we don't create an unneeded subdirectory
                # containing everything.
                zip_path = pathlib.Path(*file.parts[1:])
                # Use a string, since zipfile doesn't support pathlib directly.
                zipf.write(str(file), str(zip_path))

Here’s a modern approach, using pathlib, and a context manager. Puts the files directly in the zip, rather than in a subfolder.

import pathlib
import zipfile


def zip_dir(filename: str, dir_to_zip: pathlib.Path):
    with zipfile.ZipFile(filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
        # Use glob instead of iterdir(), to cover all subdirectories.
        for directory in dir_to_zip.glob('**'):
            for file in directory.iterdir():
                if not file.is_file():
                    continue
                # Strip the first component, so we don't create an unneeded subdirectory
                # containing everything.
                zip_path = pathlib.Path(*file.parts[1:])
                # Use a string, since zipfile doesn't support pathlib directly.
                zipf.write(str(file), str(zip_path))

回答 19

我通过将 Mark Byers 的解决方案与 Reimund 和 Morten Zilmer 的评论(相对路径、包含空目录)合并,准备了一个函数。按照最佳实践,在构造 ZipFile 文件对象时使用了 with。

该函数还会准备一个默认的 zip 文件名,由被压缩的目录名加上 '.zip' 扩展名组成。因此,它只需要一个参数:要压缩的源目录。

import os
import zipfile

def zip_dir(path_dir, path_file_zip=''):
    if not path_file_zip:
        path_file_zip = os.path.join(
            os.path.dirname(path_dir), os.path.basename(path_dir)+'.zip')
    with zipfile.ZipFile(path_file_zip, 'w', zipfile.ZIP_DEFLATED) as zip_file:
        for root, dirs, files in os.walk(path_dir):
            for file_or_dir in files + dirs:
                zip_file.write(
                    os.path.join(root, file_or_dir),
                    os.path.relpath(os.path.join(root, file_or_dir),
                                    os.path.join(path_dir, os.path.pardir)))

I prepared a function by consolidating Mark Byers’ solution with Reimund and Morten Zilmer’s comments (relative path and including empty directories). As a best practice, with is used in ZipFile’s file construction.

The function also prepares a default zip file name with the zipped directory name and ‘.zip’ extension. Therefore, it works with only one argument: the source directory to be zipped.

import os
import zipfile

def zip_dir(path_dir, path_file_zip=''):
    if not path_file_zip:
        path_file_zip = os.path.join(
            os.path.dirname(path_dir), os.path.basename(path_dir)+'.zip')
    with zipfile.ZipFile(path_file_zip, 'w', zipfile.ZIP_DEFLATED) as zip_file:
        for root, dirs, files in os.walk(path_dir):
            for file_or_dir in files + dirs:
                zip_file.write(
                    os.path.join(root, file_or_dir),
                    os.path.relpath(os.path.join(root, file_or_dir),
                                    os.path.join(path_dir, os.path.pardir)))

回答 20

# import required python modules
# zipfile and os are part of the standard library, so no pip install is needed

import os, zipfile

# Change to the directory where you want your new zip file to be

os.chdir('Type your destination')

# Create a new zipfile ( I called it myfile )

zf = zipfile.ZipFile('myfile.zip', 'w')

# os.walk gives a directory tree. Access the files using a for loop

for dirpath, dirnames, files in os.walk('Type your directory'):
    zf.write(dirpath)
    for file in files:
        zf.write(os.path.join(dirpath, file))

zf.close()
# import required python modules
# zipfile and os are part of the standard library, so no pip install is needed

import os, zipfile

# Change to the directory where you want your new zip file to be

os.chdir('Type your destination')

# Create a new zipfile ( I called it myfile )

zf = zipfile.ZipFile('myfile.zip', 'w')

# os.walk gives a directory tree. Access the files using a for loop

for dirpath, dirnames, files in os.walk('Type your directory'):
    zf.write(dirpath)
    for file in files:
        zf.write(os.path.join(dirpath, file))

zf.close()

回答 21

好了,在阅读这些建议之后,我想出了一种非常类似的、可在 2.7.x 下工作的方式,它不会创建“奇怪的”目录名(类似绝对路径的名称),并且只会在 zip 中创建指定的文件夹。

或者,如果您需要让 zip 内部包含一个装有所选目录内容的文件夹,也可以使用它。

import os
import zipfile

def zipDir(path, ziph):
    """
    Inserts directory (path) into zipfile instance (ziph)
    """
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file),
                       os.path.basename(os.path.normpath(path)) + "\\" + file)

def makeZip(pathToFolder):
    """
    Creates a zip file with the specified folder
    """
    zipf = zipfile.ZipFile(pathToFolder + 'file.zip', 'w', zipfile.ZIP_DEFLATED)
    zipDir(pathToFolder, zipf)
    zipf.close()
    print("Zip file saved to: " + pathToFolder)

makeZip("c:\\path\\to\\folder\\to\\insert\\into\\zipfile")

Well, after reading the suggestions I came up with a very similar way that works with 2.7.x without creating “funny” directory names (absolute-like names), and will only create the specified folder inside the zip.

Or just in case you needed your zip to contain a folder inside with the contents of the selected directory.

import os
import zipfile

def zipDir(path, ziph):
    """
    Inserts directory (path) into zipfile instance (ziph)
    """
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file),
                       os.path.basename(os.path.normpath(path)) + "\\" + file)

def makeZip(pathToFolder):
    """
    Creates a zip file with the specified folder
    """
    zipf = zipfile.ZipFile(pathToFolder + 'file.zip', 'w', zipfile.ZIP_DEFLATED)
    zipDir(pathToFolder, zipf)
    zipf.close()
    print("Zip file saved to: " + pathToFolder)

makeZip("c:\\path\\to\\folder\\to\\insert\\into\\zipfile")

回答 22

用于创建 zip 文件的函数。

import os
import zipfile

def CREATEZIPFILE(zipname, path):
    #function to create a zip file
    #Parameters: zipname - name of the zip file; path - name of folder/file to be put in zip file

    zipf = zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED)
    zipf.setpassword(b"password") #note: setpassword only affects extraction; zipfile cannot create encrypted archives

    #checks if the path is file or directory
    if os.path.isdir(path):
        for files in os.listdir(path):
            zipf.write(os.path.join(path, files), files)

    elif os.path.isfile(path):
        zipf.write(path, path)
    zipf.close()

Function to create zip file.

import os
import zipfile

def CREATEZIPFILE(zipname, path):
    #function to create a zip file
    #Parameters: zipname - name of the zip file; path - name of folder/file to be put in zip file

    zipf = zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED)
    zipf.setpassword(b"password") #note: setpassword only affects extraction; zipfile cannot create encrypted archives

    #checks if the path is file or directory
    if os.path.isdir(path):
        for files in os.listdir(path):
            zipf.write(os.path.join(path, files), files)

    elif os.path.isfile(path):
        zipf.write(path, path)
    zipf.close()

回答 23

使用zipfly

import zipfly

paths = [
    {
        'fs': '/path/to/large/file'
    },
]

zfly = zipfly.ZipFly( paths = paths )

with open("large.zip", "wb") as f:
    for i in zfly.generator():
        f.write(i)

Using zipfly

import zipfly

paths = [
    {
        'fs': '/path/to/large/file'
    },
]

zfly = zipfly.ZipFly( paths = paths )

with open("large.zip", "wb") as f:
    for i in zfly.generator():
        f.write(i)

Python退出命令-为什么要使用这么多?何时使用?

问题:Python退出命令-为什么要使用这么多?何时使用?

似乎python支持许多不同的命令来停止脚本执行。
我发现的选择是: quit()exit()sys.exit()os._exit()

我有遗漏吗?它们之间有什么区别?分别应该在什么时候使用?

It seems that python supports many different commands to stop script execution.
The choices I’ve found are: quit(), exit(), sys.exit(), os._exit()

Have I missed any? What’s the difference between them? When would you use each?


回答 0

让我来分别介绍一下它们:

  1. quit()只是引发SystemExit异常。

    此外,如果您打印它,它将显示一条消息:

    >>> print (quit)
    Use quit() or Ctrl-Z plus Return to exit
    >>>
    

    包含此功能是为了帮助不了解Python的人。毕竟,新手尝试退出Python的最有可能的事情之一就是输入quit

    然而,quit 不应该在生产代码中使用。这是因为它仅在加载了 site 模块后才起作用。因此,这个函数应仅在解释器中使用。

  2. exit() 是 quit 的别名(反之亦然)。它们同时存在只是为了让 Python 对用户更友好。

    此外,在打印时它还会给出一条消息:

    >>> print (exit)
    Use exit() or Ctrl-Z plus Return to exit
    >>>
    

    然而,与 quit 一样,exit 也被认为不适合在生产代码中使用,应仅保留在解释器中使用。这是因为它同样依赖于 site 模块。

  3. sys.exit() 同样会引发 SystemExit 异常。这意味着在这方面它与 quit 和 exit 是相同的。

    但是,与这两者不同,sys.exit 可以放心地在生产代码中使用。这是因为 sys 模块始终存在。

  4. os._exit() 退出程序时不会调用清理处理程序、不刷新 stdio 缓冲区等。因此,这不是标准的退出方式,仅应在特殊情况下使用,其中最常见的是在由 os.fork 创建的子进程中。

    请注意,在给出的四种方法中,只有这一种在行为上是独特的。

总结起来,这四种方法都会退出程序。但是,前两种被认为不适合在生产代码中使用,最后一种是非标准的、粗暴的方式,只应在特殊场景下使用。因此,如果要正常退出程序,请使用第三种方法:sys.exit。


或者(我认为这样更好),您可以直接做 sys.exit 在幕后所做的事情,运行:

raise SystemExit

这样,您无需先导入sys

不过,这只是风格上的选择,完全由您决定。

Let me give some information on them:

  1. quit() simply raises the SystemExit exception.

    Furthermore, if you print it, it will give a message:

    >>> print (quit)
    Use quit() or Ctrl-Z plus Return to exit
    >>>
    

    This functionality was included to help people who do not know Python. After all, one of the most likely things a newbie will try to exit Python is typing in quit.

    Nevertheless, quit should not be used in production code. This is because it only works if the site module is loaded. Instead, this function should only be used in the interpreter.

  2. exit() is an alias for quit (or vice-versa). They exist together simply to make Python more user-friendly.

    Furthermore, it too gives a message when printed:

    >>> print (exit)
    Use exit() or Ctrl-Z plus Return to exit
    >>>
    

    However, like quit, exit is considered bad to use in production code and should be reserved for use in the interpreter. This is because it too relies on the site module.

  3. sys.exit() also raises the SystemExit exception. This means that it is the same as quit and exit in that respect.

    Unlike those two however, sys.exit is considered good to use in production code. This is because the sys module will always be there.

  4. os._exit() exits the program without calling cleanup handlers, flushing stdio buffers, etc. Thus, it is not a standard way to exit and should only be used in special cases. The most common of these is in the child process(es) created by os.fork.

    Note that, of the four methods given, only this one is unique in what it does.

Summed up, all four methods exit the program. However, the first two are considered bad to use in production code and the last is a non-standard, dirty way that is only used in special scenarios. So, if you want to exit a program normally, go with the third method: sys.exit.


Or, even better in my opinion, you can just do directly what sys.exit does behind the scenes and run:

raise SystemExit

This way, you do not need to import sys first.

However, this choice is simply one on style and is purely up to you.


回答 1

函数* quit()、exit() 和 sys.exit() 的作用方式相同:它们都会引发 SystemExit 异常。因此没有真正的区别,不同之处只在于 sys.exit() 始终可用,而 exit() 和 quit() 仅在导入了 site 模块时才可用。

os._exit() 函数比较特殊,它会立即退出而不调用任何清理函数(例如,它不会刷新缓冲区)。它是为高度专门化的用例而设计的……基本上,只用于 os.fork() 调用之后的子进程中。

结论

  • 在REPL中使用exit()quit()

  • 在脚本中使用 sys.exit(),或者按喜好使用 raise SystemExit()。

  • 在 os.fork() 调用之后,在子进程中使用 os._exit() 退出。

所有这些都可以不带参数调用,或者您也可以指定退出状态,例如 exit(1) 或 raise SystemExit(1) 以状态 1 退出。请注意,可移植程序的退出状态码应限制在 0-255 范围内;如果执行 raise SystemExit(256),在许多系统上该值会被截断,进程实际上将以状态 0 退出。

脚注

*其实,quit() 和 exit() 是可调用的实例对象,但我认为把它们称作函数也没有问题。

The functions* quit(), exit(), and sys.exit() function in the same way: they raise the SystemExit exception. So there is no real difference, except that sys.exit() is always available but exit() and quit() are only available if the site module is imported.

The os._exit() function is special, it exits immediately without calling any cleanup functions (it doesn’t flush buffers, for example). This is designed for highly specialized use cases… basically, only in the child after an os.fork() call.
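
For instance, a rough POSIX-only sketch of that fork-then-_exit pattern (illustrative, not part of the original answer):

import os

pid = os.fork()                      # POSIX only
if pid == 0:
    # Child process: do its work, then leave immediately.
    # os._exit() deliberately skips cleanup handlers and stdio flushing.
    os.write(1, b"hello from the child\n")
    os._exit(0)
else:
    # Parent process: wait for the child and read its exit status.
    _, status = os.waitpid(pid, 0)
    print("child exit code:", os.WEXITSTATUS(status))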

Conclusion

  • Use exit() or quit() in the REPL.

  • Use sys.exit() in scripts, or raise SystemExit() if you prefer.

  • Use os._exit() for child processes to exit after a call to os.fork().

All of these can be called without arguments, or you can specify the exit status, e.g., exit(1) or raise SystemExit(1) to exit with status 1. Note that portable programs are limited to exit status codes in the range 0-255, if you raise SystemExit(256) on many systems this will get truncated and your process will actually exit with status 0.
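
As a small illustration of the status-code behaviour described above (a sketch using subprocess, not part of the original answer):

import subprocess
import sys

# A child interpreter that exits via SystemExit(3) reports status 3 to its parent.
proc = subprocess.run([sys.executable, "-c", "raise SystemExit(3)"])
print(proc.returncode)   # 3

# sys.exit() with a non-integer argument prints it to stderr and exits with status 1.
proc = subprocess.run([sys.executable, "-c", "import sys; sys.exit('something went wrong')"])
print(proc.returncode)   # 1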

Footnotes

* Actually, quit() and exit() are callable instance objects, but I think it’s okay to call them functions.


回答 2

退出的不同方式

os._exit()

  • 退出过程而不调用清理处理程序。

exit(0)

  • 干净的出口,没有任何错误/问题。

exit(1)

  • 存在一些问题/错误/问题,这就是程序退出的原因。

sys.exit()

  • 当系统和python关闭时;这意味着程序运行后正在使用的内存更少。

quit()

  • 关闭python文件。

摘要

基本上它们都做相同的事情,不过这也取决于您使用它的目的。

我认为您不会遗漏任何东西,建议您习惯quit()exit()

主要是在处理大文件或使用 python 控制终端时,您才会用到 sys.exit() 和 os._exit()。

否则主要使用exit()quit()

Different Means of Exiting

os._exit():

  • Exit the process without calling the cleanup handlers.

exit(0):

  • a clean exit without any errors / problems.

exit(1):

  • There was some issue / error / problem and that is why the program is exiting.

sys.exit():

  • When the system and python shuts down; it means less memory is being used after the program is run.

quit():

  • Closes the python file.

Summary

Basically they all do the same thing, however, it also depends on what you are doing it for.

I don’t think you left anything out and I would recommend getting used to quit() or exit().

You would use sys.exit() and os._exit() mainly if you are using big files or are using python to control terminal.

Otherwise mainly use exit() or quit().


回答 3

sys.exit 是退出的规范方法。

在内部,sys.exit 只是引发 SystemExit。不过,调用 sys.exit 比直接引发 SystemExit 更为惯用。

os._exit 是一个低级系统调用,它直接退出而不调用任何清理处理程序。

quit 和 exit 的存在只是为了提供一种退出 Python 提示符的简便方法。它们面向新用户,或不小心进入 Python 提示符、并不想了解正确退出语法的用户。这些用户很可能会尝试输入 exit 或 quit。虽然这不会退出解释器,但它至少会打印一条消息,告诉他们退出的办法:

>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()
$

本质上,这只是利用了一个事实:解释器会打印您在提示符下输入的任何表达式的 __repr__。

sys.exit is the canonical way to exit.

Internally sys.exit just raises SystemExit. However, calling sys.exit is more idiomatic than raising SystemExit directly.

os._exit is a low-level system call that exits directly without calling any cleanup handlers.

quit and exit exist only to provide an easy way out of the Python prompt. This is for new users or users who accidentally entered the Python prompt, and don’t want to know the right syntax. They are likely to try typing exit or quit. While this will not exit the interpreter, it at least issues a message that tells them a way out:

>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()
$

This is essentially just a hack that utilizes the fact that the interpreter prints the __repr__ of any expression that you enter at the prompt.


如果字典键不可用,则返回None

问题:如果字典键不可用,则返回None

我需要一种方法:如果字典的键存在,就获取它对应的值;如果不存在,就直接返回 None。

但是,如果您查找一个不存在的键,Python 会引发 KeyError 异常。我知道可以先检查键是否存在,但我在寻找更直接明了的方式。如果键不存在,有没有办法直接返回 None?

I need a way to get a dictionary value if its key exists, or simply return None, if it does not.

However, Python raises a KeyError exception if you search for a key that does not exist. I know that I can check for the key, but I am looking for something more explicit. Is there a way to just return None if the key does not exist?


回答 0

您可以使用 dict.get()

value = d.get(key)

如果 key 不在 d 中,它将返回 None。您还可以提供一个不同的默认值,用来代替 None 返回:

value = d.get(key, "empty")

You can use dict.get()

value = d.get(key)

which will return None if key is not in d. You can also provide a different default value that will be returned instead of None:

value = d.get(key, "empty")

回答 1

别再奇怪了。它内置在语言中。

    >>> help(dict)

    Help on class dict in module builtins:

    class dict(object)
     |  dict() -> new empty dictionary
     |  dict(mapping) -> new dictionary initialized from a mapping object's
     |      (key, value) pairs
    ...
     |  
     |  get(...)
     |      D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.
     |  
    ...

Wonder no more. It’s built into the language.

    >>> help(dict)

    Help on class dict in module builtins:

    class dict(object)
     |  dict() -> new empty dictionary
     |  dict(mapping) -> new dictionary initialized from a mapping object's
     |      (key, value) pairs
    ...
     |  
     |  get(...)
     |      D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.
     |  
    ...

回答 2

采用 dict.get

如果key在字典中,则返回key的值,否则返回默认值。如果未提供default,则默认为None,因此此方法永远不会引发KeyError。

Use dict.get

Returns the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
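
A short illustration of that behaviour (the dictionary here is hypothetical):

d = {'a': 1}
print(d.get('a'))       # 1
print(d.get('b'))       # None, no KeyError raised
print(d.get('b', 0))    # 0, the explicit default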


回答 3

您应该使用 dict 类中的 get() 方法:

d = {}
r = d.get('missing_key', None)

这将导致r == None。如果在字典中找不到键,则get函数将返回第二个参数。

You should use the get() method from the dict class

d = {}
r = d.get('missing_key', None)

This will result in r == None. If the key isn’t found in the dictionary, the get function returns the second argument.


回答 4

如果您想要一个更透明的解决方案,可以通过继承 dict 来获得这种行为:

class NoneDict(dict):
    def __getitem__(self, key):
        return dict.get(self, key)

>>> foo = NoneDict([(1,"asdf"), (2,"qwerty")])
>>> foo[1]
'asdf'
>>> foo[2]
'qwerty'
>>> foo[3] is None
True

If you want a more transparent solution, you can subclass dict to get this behavior:

class NoneDict(dict):
    def __getitem__(self, key):
        return dict.get(self, key)

>>> foo = NoneDict([(1,"asdf"), (2,"qwerty")])
>>> foo[1]
'asdf'
>>> foo[2]
'qwerty'
>>> foo[3] is None
True

回答 5

我通常在这种情况下使用defaultdict。您提供一个不带任何参数的工厂方法,并在看到新键时创建一个值。当您想在新键上返回空列表之类的功能时,它会更有用(请参见示例)。

from collections import defaultdict
d = defaultdict(lambda: None)
print d['new_key']  # prints 'None'

I usually use a defaultdict for situations like this. You supply a factory method that takes no arguments and creates a value when it sees a new key. It’s more useful when you want to return something like an empty list on new keys (see the examples).

from collections import defaultdict
d = defaultdict(lambda: None)
print d['new_key']  # prints 'None'
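
The "empty list on new keys" case mentioned above would look something like this (illustrative example, Python 3 syntax):

from collections import defaultdict

groups = defaultdict(list)          # unknown keys start out as an empty list
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

print(groups["a"])    # ['apple', 'avocado']
print(groups["z"])    # [] -- created on first access, no KeyError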

回答 6

您可以使用dict对象的get()方法,就像其他人已经建议的那样。另外,根据您正在执行的操作,您可能可以使用如下try/except套件:

try:
   <to do something with d[key]>
except KeyError:
   <deal with it not being there>

这被认为是处理这种情况的非常“Pythonic”的方法。

You could use a dict object’s get() method, as others have already suggested. Alternatively, depending on exactly what you’re doing, you might be able use a try/except suite like this:

try:
   <to do something with d[key]>
except KeyError:
   <deal with it not being there>

Which is considered to be a very “Pythonic” approach to handling the case.
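
A concrete version of that pattern might look like this (the dictionary and messages are illustrative):

d = {'name': 'Alice'}

try:
    greeting = "Hello, " + d['name']
except KeyError:
    greeting = "Hello, stranger"

print(greeting)    # "Hello, Alice"; with an empty dict it would fall back to the default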


回答 7

单行的解决方案是:

item['key'] if 'key' in item else None

在尝试将字典值添加到新列表并想要提供默认值时,这很有用:

例如。

row = [item['key'] if 'key' in item else 'default_value']

A one line solution would be:

item['key'] if 'key' in item else None

This is useful when trying to add dictionary values to a new list and want to provide a default:

eg.

row = [item['key'] if 'key' in item else 'default_value']

回答 8

就像其他人说的那样,您可以使用get()。

但如果要检查键是否存在,您也可以这样做:

d = {}
if 'keyname' in d:

    # d['keyname'] exists
    pass

else:

    # d['keyname'] does not exist
    pass

As others have said above, you can use get().

But to check for a key, you can also do:

d = {}
if 'keyname' in d:

    # d['keyname'] exists
    pass

else:

    # d['keyname'] does not exist
    pass

回答 9

python2 与 python3 之间可能出现的差异让我吃了一惊。我将根据我最终在 python3 下的做法来回答。我的目标很简单:检查字典格式的 json 响应是否给出了错误。我的字典叫 token,而我要找的键是 "error"。我查找键 "error",如果它不存在就把值设为 None,然后检查该值是否为 None;如果是,就继续执行我的代码。如果确实存在键 "error",则执行 else 分支。

if ((token.get('error', None)) is None):
    do something

I was taken aback by what was possible in python2 vs python3. I will answer it based on what I ended up doing for python3. My objective was simple: check if a json response in dictionary format gave an error or not. My dictionary is called "token" and the key that I am looking for is "error". I look for key "error" and, if it is not there, set it to a value of None, then check whether the value is None; if so, proceed with my code. An else statement would handle it if I do have the key "error".

if ((token.get('error', None)) is None):
    do something

回答 10

如果返回 False 也可以接受,那么还可以使用内置函数 hasattr:

e=dict()
hasattr(e, 'message')
>>> False

If you can do it with False, then, there’s also the hasattr built-in funtion:

e=dict()
hasattr(e, 'message')
>>> False