How do I subtract a day from a date?

July 25, 2021 Python实用宝典

Question: How do I subtract a day from a date?

I have a Python datetime.datetime object. What is the best way to subtract one day?

Answer 0

You can use a timedelta object:

from datetime import datetime, timedelta

days_to_subtract = 1  # example value
d = datetime.today() - timedelta(days=days_to_subtract)
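
As a concrete illustration, the same subtraction works on date objects and takes care of month boundaries. The dates below are arbitrary sample values, not part of the original answer.

```python
from datetime import date, datetime, timedelta

one_day = timedelta(days=1)

# Month boundaries are handled by the arithmetic itself.
previous_day = date(2021, 3, 1) - one_day
print(previous_day)        # 2021-02-28

# datetime objects keep their time-of-day component.
previous_moment = datetime(2021, 7, 25, 9, 30) - one_day
print(previous_moment)     # 2021-07-24 09:30:00
```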

Answer 1

Subtract datetime.timedelta(days=1)

Answer 2

If your Python datetime object is timezone-aware, then you should be careful to avoid errors around DST transitions (or changes in UTC offset for other reasons):

from datetime import datetime, timedelta
from tzlocal import get_localzone # pip install tzlocal

# NOTE: normalize() and localize() are pytz methods, so this assumes that
# get_localzone() returns a pytz timezone (as older tzlocal releases did).
DAY = timedelta(1)
local_tz = get_localzone()   # get local timezone
now = datetime.now(local_tz) # get timezone-aware datetime object
day_ago = local_tz.normalize(now - DAY) # exactly 24 hours ago, time may differ
naive = now.replace(tzinfo=None) - DAY # same time
yesterday = local_tz.localize(naive, is_dst=None) # but elapsed hours may differ

In general, day_ago and yesterday may differ if the UTC offset for the local timezone has changed in the last day.

For example, daylight saving time (summer time) ends on Sun 2-Nov-2014 at 02:00:00 A.M. in the America/Los_Angeles timezone, so if:

import pytz # pip install pytz

local_tz = pytz.timezone('America/Los_Angeles')
now = local_tz.localize(datetime(2014, 11, 2, 10), is_dst=None)
# 2014-11-02 10:00:00 PST-0800

then day_ago and yesterday differ:

day_ago is exactly 24 hours ago (relative to now), but at 11 am rather than at 10 am like now
yesterday is yesterday at 10 am, but that is 25 hours ago (relative to now), not 24 hours

The pendulum module handles this automatically:

>>> import pendulum  # $ pip install pendulum

>>> now = pendulum.create(2014, 11, 2, 10, tz='America/Los_Angeles')
>>> day_ago = now.subtract(hours=24)  # exactly 24 hours ago
>>> yesterday = now.subtract(days=1)  # yesterday at 10 am but it is 25 hours ago

>>> (now - day_ago).in_hours()
24
>>> (now - yesterday).in_hours()
25

>>> now
<Pendulum [2014-11-02T10:00:00-08:00]>
>>> day_ago
<Pendulum [2014-11-01T11:00:00-07:00]>
>>> yesterday
<Pendulum [2014-11-01T10:00:00-07:00]>
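
The same distinction can be sketched with only the standard library. This is an illustration, not part of the original answer: it assumes Python 3.9+ (for zoneinfo) and that the IANA time zone database is available (on Windows that usually means installing the tzdata package). The elapsed() helper is a hypothetical name introduced here.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

LA = ZoneInfo("America/Los_Angeles")
now = datetime(2014, 11, 2, 10, tzinfo=LA)  # 10:00 PST, after DST ended at 02:00

# Same wall-clock time on the previous calendar day.
yesterday = now - timedelta(days=1)

# Exactly 24 elapsed hours earlier: do the arithmetic in UTC, then convert back.
day_ago = (now.astimezone(timezone.utc) - timedelta(hours=24)).astimezone(LA)


def elapsed(later, earlier):
    """Real elapsed time between two aware datetimes, measured in UTC."""
    return later.astimezone(timezone.utc) - earlier.astimezone(timezone.utc)


print(yesterday.isoformat())    # 2014-11-01T10:00:00-07:00
print(day_ago.isoformat())      # 2014-11-01T11:00:00-07:00
print(elapsed(now, yesterday))  # 1 day, 1:00:00
print(elapsed(now, day_ago))    # 1 day, 0:00:00
```

The subtraction is routed through UTC because Python ignores the offset when it subtracts two aware datetimes that share the same tzinfo object, so now - yesterday alone would report 24 hours.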

Answer 3

Just to elaborate on an alternative method and a use case where it helps:

Subtract 1 day from the current datetime:

from datetime import datetime, timedelta
print(datetime.now() + timedelta(days=-1))  # Here, I am adding a negative timedelta

This is useful when you want to, say, add 5 days and subtract 5 hours from the current datetime, i.e. what is the datetime 5 days from now, less 5 hours?

from datetime import datetime, timedelta
print(datetime.now() + timedelta(days=5, hours=-5))

It can be used similarly with other parameters, e.g. seconds, weeks, etc.
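
To make the mixed-sign case concrete, here is a small sketch with a fixed starting moment (an arbitrary sample value chosen for illustration) so that the results are reproducible:

```python
from datetime import datetime, timedelta

# timedelta normalizes mixed-sign arguments into a single duration.
delta = timedelta(days=5, hours=-5)
print(delta)                                    # 4 days, 19:00:00

base = datetime(2021, 7, 25, 12, 0)             # arbitrary sample moment
print(base + delta)                             # 2021-07-30 07:00:00
print(base + timedelta(weeks=1, seconds=-30))   # 2021-08-01 11:59:30
```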

Answer 4

Another nice function I like to use when I want to compute, for example, the first/last day of the last month or other relative time deltas is relativedelta from the dateutil package (a powerful extension to the datetime library):

import datetime as dt
from dateutil.relativedelta import relativedelta
# get the first day of this month and the last day of last month
today = dt.date.today()
first_day_this_month = dt.date(day=1, month=today.month, year=today.year)
last_day_last_month = first_day_this_month - relativedelta(days=1)
print(first_day_this_month, last_day_last_month)

Output:

2015-03-01 2015-02-28
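
For this particular computation the standard library is enough. The sketch below is an alternative to the dateutil version, not the original author's code; month_boundaries is a hypothetical helper name and the input dates are sample values.

```python
import datetime as dt


def month_boundaries(today):
    """Return (first day of this month, last day and first day of last month)."""
    first_this = today.replace(day=1)
    last_prev = first_this - dt.timedelta(days=1)  # step back across the boundary
    first_prev = last_prev.replace(day=1)
    return first_this, last_prev, first_prev


first_this, last_prev, first_prev = month_boundaries(dt.date(2015, 3, 17))
print(first_this, last_prev, first_prev)  # 2015-03-01 2015-02-28 2015-02-01

# Leap years need no special handling.
print(month_boundaries(dt.date(2016, 3, 5))[1])  # 2016-02-29
```

relativedelta becomes more useful for steps such as "one month earlier", which timedelta cannot express.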

Answer 5

The excellent arrow module also exists:

import arrow
utc = arrow.utcnow()
utc_yesterday = utc.shift(days=-1)
print(utc, '\n', utc_yesterday)

output:

2017-04-06T11:17:34.431397+00:00 
 2017-04-05T11:17:34.431397+00:00

Import a module from a relative path

July 25, 2021 Python实用宝典

Question: Import a module from a relative path

How do I import a Python module given its relative path?

For example, if dirFoo contains Foo.py and dirBar, and dirBar contains Bar.py, how do I import Bar.py into Foo.py?

Here’s a visual representation:

dirFoo\
    Foo.py
    dirBar\
        Bar.py

Foo wishes to include Bar, but restructuring the folder hierarchy is not an option.

Answer 0

Assuming that both your directories are real Python packages (they have an __init__.py file inside them), here is a safe solution for including modules relative to the location of the script.

I assume that you want to do this because you need to include a set of modules with your script. I use this in production in several products, and it works in many special scenarios, such as scripts called from another directory or executed with python execute instead of opening a new interpreter.

 import os, sys, inspect
 # realpath() will make your script run, even if you symlink it :)
 cmd_folder = os.path.realpath(os.path.abspath(os.path.split(inspect.getfile( inspect.currentframe() ))[0]))
 if cmd_folder not in sys.path:
     sys.path.insert(0, cmd_folder)

 # Use this if you want to include modules from a subfolder
 cmd_subfolder = os.path.realpath(os.path.abspath(os.path.join(os.path.split(inspect.getfile( inspect.currentframe() ))[0],"subfolder")))
 if cmd_subfolder not in sys.path:
     sys.path.insert(0, cmd_subfolder)

 # Info:
 # cmd_folder = os.path.dirname(os.path.abspath(__file__)) # DO NOT USE __file__ !!!
 # __file__ fails if the script is called in different ways on Windows.
 # __file__ fails if someone does os.chdir() before.
 # sys.argv[0] also fails, because it does not always contain the path.

As a bonus, this approach lets you force Python to use your module instead of the ones installed on the system.

Warning! I don’t really know what happens when the current module is inside an egg file. It probably fails too.
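
The path handling above can also be sketched with pathlib. This is an illustrative variant, not the answer's own code: add_to_sys_path is a hypothetical helper, and the commented call relies on `__file__`, which the answer above distrusts, so treat that part as an assumption about your environment.

```python
import sys
import tempfile
from pathlib import Path


def add_to_sys_path(directory):
    """Resolve symlinks in `directory` and prepend it to sys.path if not listed yet."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# In a script this would typically be called as
#     add_to_sys_path(Path(__file__).parent / "subfolder")
# A temporary directory stands in for that folder here.
with tempfile.TemporaryDirectory() as tmp:
    first = add_to_sys_path(tmp)
    second = add_to_sys_path(tmp)      # second call is a no-op
    occurrences = sys.path.count(first)
    sys.path.remove(first)             # clean up after the demonstration

print(first == second, occurrences)    # True 1
```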

Answer 1

Be sure that dirBar has the __init__.py file — this makes a directory into a Python package.

Answer 2

You could also add the subdirectory to your Python path so that it imports as a normal script.

import sys
sys.path.insert(0, <path to dirFoo>)
import Bar

Answer 3

import os
import sys
lib_path = os.path.abspath(os.path.join(__file__, '..', '..', '..', 'lib'))
sys.path.append(lib_path)

import mymodule

Answer 4

Just do simple things to import the .py file from a different folder.

Let’s say you have a directory like:

lib/abc.py

Then just put an empty file in the lib folder, named

__init__.py

And then use

from lib.abc import <Your Module name>

Keep the __init__.py file in every folder of the hierarchy of the import module.

Answer 5

If you structure your project this way:

src\
  __init__.py
  main.py
  dirFoo\
    __init__.py
    Foo.py
  dirBar\
    __init__.py
    Bar.py

Then from Foo.py you should be able to do:

import dirFoo.Foo

Or:

from dirFoo.Foo import FooObject

Per Tom’s comment, this does require that the src folder is accessible either via site_packages or your search path. Also, as he mentions, __init__.py is implicitly imported when you first import a module in that package/directory. Typically __init__.py is simply an empty file.

Answer 6

The easiest method is to use sys.path.append().

However, you may be also interested in the imp module. It provides access to internal import functions.

import imp  # deprecated since Python 3.4 and removed in Python 3.12

# mod_name is the filename without the .py/.pyc extension
py_mod = imp.load_source(mod_name, filename_path)  # Loads .py file
py_mod = imp.load_compiled(mod_name, filename_path)  # Loads .pyc file

This can be used to load modules dynamically when you don’t know a module’s name.

I’ve used this in the past to create a plugin-type interface to an application, where the user would write a script with application-specific functions and just drop their script in a specific directory.

Also, these functions may be useful:

imp.find_module(name[, path])
imp.load_module(name, file, pathname, description)
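
On current Python versions the same kind of dynamic loading is usually done with importlib.util. The sketch below is one possible replacement for imp.load_source, not the original author's code; load_module_from_path and the sample plugin file are hypothetical names made up for the demonstration.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(mod_name, file_path):
    """Load the .py file at `file_path` as a module named `mod_name`."""
    spec = importlib.util.spec_from_file_location(mod_name, str(file_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {mod_name!r} from {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[mod_name] = module   # register so the module can be found by name
    spec.loader.exec_module(module)  # actually run the file's code
    return module


# Demonstration: write a tiny "plugin" to a temporary directory and load it.
with tempfile.TemporaryDirectory() as plugin_dir:
    plugin_file = Path(plugin_dir) / "sample_plugin.py"
    plugin_file.write_text("def run():\n    return 'plugin ran'\n")
    plugin = load_module_from_path("sample_plugin", plugin_file)
    plugin_result = plugin.run()

print(plugin_result)  # plugin ran
```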

Answer 7

This is the relevant PEP:

http://www.python.org/dev/peps/pep-0328/

In particular, presuming dirFoo is a directory up from dirBar…

In dirFoo\Foo.py:

from ..dirBar import Bar

Answer 8

The easiest way, without any modification to your script, is to set the PYTHONPATH environment variable, because sys.path is initialized from these locations:

The directory containing the input script (or the current directory).
PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
The installation-dependent default.

Just run:

export PYTHONPATH=/absolute/path/to/your/module

Your sys.path will contain the path above, as shown below:

import sys
print(sys.path)

['', '/absolute/path/to/your/module', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PIL', '/usr/lib/python2.7/dist-packages/gst-0.10', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.7', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client', '/usr/lib/python2.7/dist-packages/ubuntuone-client', '/usr/lib/python2.7/dist-packages/ubuntuone-control-panel', '/usr/lib/python2.7/dist-packages/ubuntuone-couch', '/usr/lib/python2.7/dist-packages/ubuntuone-installer', '/usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol']

Answer 9

In my opinion the best choice is to put __init__.py in the folder and import the file with

from dirBar.Bar import *

Using sys.path.append() is not recommended, because something might go wrong if your file has the same name as an existing Python package. I haven’t tested that, but it would be ambiguous.

Answer 10

The quick-and-dirty way for Linux users

If you are just tinkering around and don’t care about deployment issues, you can use a symbolic link (assuming your filesystem supports it) to make the module or package directly visible in the folder of the requesting module.

ln -s (path)/module_name.py

ln -s (path)/package_name

Note: A “module” is any file with a .py extension and a “package” is any folder that contains the file __init__.py (which can be an empty file). From a usage standpoint, modules and packages are identical — both expose their contained “definitions and statements” as requested via the import command.

See: http://docs.python.org/2/tutorial/modules.html

Answer 11

from .dirBar import Bar

instead of:

from dirBar import Bar

just in case there could be another dirBar installed and confuse a foo.py reader.

Answer 12

For this case to import Bar.py into Foo.py, first I’d turn these folders into Python packages like so:

dirFoo\
    __init__.py
    Foo.py
    dirBar\
        __init__.py
        Bar.py

Then I would do it like this in Foo.py:

from .dirBar import Bar

If I wanted the namespacing to look like Bar.whatever, or

from . import dirBar

If I wanted the namespacing dirBar.Bar.whatever. This second case is useful if you have more modules under the dirBar package.
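
To see the package layout and the relative import work end to end, the sketch below builds the same dirFoo/dirBar tree in a temporary directory and imports it. The greet() and run() functions are hypothetical contents invented for the demonstration; only the layout and the import line come from the answer.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg = Path(root) / "dirFoo"
    (pkg / "dirBar").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "dirBar" / "__init__.py").write_text("")
    (pkg / "dirBar" / "Bar.py").write_text(
        "def greet():\n    return 'hello from Bar'\n"
    )
    (pkg / "Foo.py").write_text(
        "from .dirBar import Bar\n\ndef run():\n    return Bar.greet()\n"
    )

    # The directory that CONTAINS dirFoo must be importable, and Foo must be
    # imported as part of the package for the relative import to resolve.
    sys.path.insert(0, root)
    try:
        foo = importlib.import_module("dirFoo.Foo")
        result = foo.run()
    finally:
        sys.path.remove(root)

print(result)  # hello from Bar
```

Running Foo.py directly as a script would not work this way, because a script run as `__main__` has no parent package for the leading dot to refer to.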

Answer 13

Add an __init__.py file:

dirFoo\
    Foo.py
    dirBar\
        __init__.py
        Bar.py

Then add this code to the start of Foo.py:

import sys
sys.path.append('dirBar')
import Bar

Answer 14

Relative sys.path example:

# /lib/my_module.py
# /src/test.py

import os
import sys

if __name__ == '__main__' and __package__ is None:
    sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../lib')))
import my_module

Based on this answer.

Answer 15

Well, as you mention, usually you want to have access to a folder with your modules relative to where your main script is run, so you just import them.

Solution:

I have the script in D:/Books/MyBooks.py and some modules (like oldies.py). I need to import from subdirectory D:/Books/includes:

import sys,site
site.addsitedir(sys.path[0] + '\\includes')
print (sys.path)  # Just verify it is there
import oldies

Place a print('done') in oldies.py to verify that everything works. This approach always works because, according to the Python documentation, when sys.path is initialized at program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter.

If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.

Answer 16

You can simply use: from Desktop.filename import something

Example:

Given a file named test.py in the directory Users/user/Desktop, the following imports everything from it.

The code:

from Desktop.test import *

But make sure you create an empty file called “__init__.py” in that directory.

Answer 17

Another solution would be to install the py-require package and then use the following in Foo.py

import require
Bar = require('./dirBar/Bar')

Answer 18

Here’s a way to import a file from one level above, using the relative path.

Basically, just move the working directory up a level (or any relative location), add that to your path, then move the working directory back where it started.

import os
import sys

# to import from one level above:
cwd = os.getcwd()
os.chdir("..")
parent_path = os.getcwd()  # the directory one level above the original cwd
sys.path.append(parent_path)
os.chdir(cwd)

Answer 19

I’m not experienced with Python, so if anything I say is wrong, just tell me. If your file hierarchy is arranged like this:

project\
    module_1.py 
    module_2.py

module_1.py defines a function called func_1(), and module_2.py contains:

from module_1 import func_1

def func_2():
    func_1()

if __name__ == '__main__':
    func_2()

and you run python module_2.py in cmd, it will run what func_1() defines. That is usually how we import files at the same level. But when you write from .module_1 import func_1 in module_2.py, the Python interpreter will say No module named '__main__.module_1'; '__main__' is not a package. To fix this, we keep the change we just made, move both modules into a package, and add a third module as a caller to run module_2.py.

project\
    package_1\
        module_1.py
        module_2.py
    main.py

main.py:

from package_1.module_2 import func_2

def func_3():
    func_2()

if __name__ == '__main__':
    func_3()

The reason we add a . before module_1 in module_2.py is that if we don’t and then run main.py, the Python interpreter will say No module named 'module_1'. That is a little tricky, since module_1.py is right beside module_2.py. Now let func_1() in module_1.py do something:

def func_1():
    print(__name__)

Here __name__ holds the fully qualified name of the module in which func_1 is defined. If we keep the . before module_1 and run main.py, it prints package_1.module_1, not module_1. This shows that module names are resolved starting from the directory of main.py; the . means that module_1 is in the same package as module_2.py itself. Without the dot, Python looks for module_1 at the same level as main.py: it can find package_1 there, but not what is “under” it.

Now let’s make it a bit more complicated. You have a config.ini, and a module that defines a function to read it, at the same level as main.py.

project\
    package_1\
        module_1.py
        module_2.py
    config.py
    config.ini
    main.py

And for some unavoidable reason, you have to call it from module_2.py, so it has to import from the level above. module_2.py:

 from .. import config
 pass

Two dots mean import from the level above (three dots reach two levels up, and so on). Now when we run main.py, the interpreter will say: ValueError: attempted relative import beyond top-level package. The top-level package here is package_1, because main.py is run as a script and its directory is not itself a package. config.py sits beside main.py, at the same level, so it is not inside any package that main.py imports, and a relative import cannot reach it. To fix this, the simplest way is:

project\
    package_1\
        module_1.py
        module_2.py
    config.py
    config.ini
main.py

I think this is consistent with the principle for arranging a project’s file hierarchy: put modules with different functions in different folders, leave a single top-level caller outside, and then you can import however you want.

Answer 20

This also works, and is much simpler than anything with the sys module:

with open("C:/yourpath/foobar.py") as f:
    exec(f.read())  # exec, not eval: eval() only accepts a single expression
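
If the goal is to run a file and get at the names it defines, the standard library's runpy module offers a tidier route than executing the source in your own namespace. This is a sketch of an alternative, not the answer's method; the file contents below are made up for the demonstration.

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "foobar.py"
    script.write_text("ANSWER = 6 * 7\nGREETING = 'hi from foobar'\n")

    # run_path executes the file and returns its resulting globals as a dict,
    # leaving the caller's own namespace untouched.
    namespace = runpy.run_path(str(script))

print(namespace["ANSWER"])     # 42
print(namespace["GREETING"])   # hi from foobar
```

Either way, only run files you trust: both approaches execute arbitrary code.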

Answer 21

Call me overly cautious, but I like to make mine more portable because it’s unsafe to assume that files will always be in the same place on every computer. Personally I have the code look up the file path first. I use Linux so mine would look like this:

import os, sys
from subprocess import Popen, PIPE
try:
    # find prints one matching path per line; take the first match
    found = Popen("find / -name 'file' -type f", shell=True, stdout=PIPE).stdout.read().decode().splitlines()[0]
    path = os.path.dirname(found)  # sys.path entries must be directories
    if path not in sys.path:
        sys.path.append(path)
except IndexError:
    raise RuntimeError("You must have FILE to run this program!")

That is of course unless you plan to package these together. But if that’s the case you don’t really need two separate files anyway.

How do I check file size in Python?

July 25, 2021 Python实用宝典

Question: How do I check file size in Python?

I am writing a Python script in Windows. I want to do something based on the file size. For example, if the size is greater than 0, I will send an email to somebody, otherwise continue to other things.

How do I check the file size?

Answer 0

You need the st_size property of the object returned by os.stat. You can get it by either using pathlib (Python 3.4+):

>>> from pathlib import Path
>>> Path('somefile.txt').stat()
os.stat_result(st_mode=33188, st_ino=6419862, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=1564, st_atime=1584299303, st_mtime=1584299400, st_ctime=1584299400)
>>> Path('somefile.txt').stat().st_size
1564

or using os.stat:

>>> import os
>>> os.stat('somefile.txt')
os.stat_result(st_mode=33188, st_ino=6419862, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=1564, st_atime=1584299303, st_mtime=1584299400, st_ctime=1584299400)
>>> os.stat('somefile.txt').st_size
1564

Output is in bytes.

Answer 1

Using os.path.getsize:

>>> import os
>>> b = os.path.getsize("/path/isa_005.mp3")
>>> b
2071611

The output is in bytes.

Answer 2

The other answers work for real files, but if you need something that works for “file-like objects”, try this:

import os

# f is a file-like object.
f.seek(0, os.SEEK_END)
size = f.tell()

It works for real files and StringIO’s, in my limited testing. (Python 2.7.3.) The “file-like object” API isn’t really a rigorous interface, of course, but the API documentation suggests that file-like objects should support seek() and tell().

Edit

Another difference between this and os.stat() is that you can stat() a file even if you don’t have permission to read it. Obviously the seek/tell approach won’t work unless you have read permission.

Edit 2

At Jonathon’s suggestion, here’s a paranoid version. (The version above leaves the file pointer at the end of the file, so if you were to try to read from the file, you’d get zero bytes back!)

# f is a file-like object. 
old_file_position = f.tell()
f.seek(0, os.SEEK_END)
size = f.tell()
f.seek(old_file_position, os.SEEK_SET)
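
Wrapped in a function, the "paranoid" version might look like the sketch below. stream_size is a hypothetical helper name, and the example assumes a seekable binary stream (for text-mode files, tell() returns an opaque position rather than a byte count).

```python
import io
import os


def stream_size(f):
    """Return the size in bytes of a seekable binary stream, restoring its position."""
    old_position = f.tell()
    try:
        f.seek(0, os.SEEK_END)
        return f.tell()
    finally:
        # Runs even if seek/tell raises, so the caller's position is preserved.
        f.seek(old_position, os.SEEK_SET)


buf = io.BytesIO(b"hello world")
buf.read(5)                  # move the position away from the start
size = stream_size(buf)
rest = buf.read()            # reading continues where it left off

print(size)   # 11
print(rest)   # b' world'
```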

Answer 3

import os


def convert_bytes(num):
    """
    this function will convert bytes to MB.... GB... etc
    """
    for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if num < 1024.0:
            return "%3.1f %s" % (num, x)
        num /= 1024.0


def file_size(file_path):
    """
    this function will return the file size
    """
    if os.path.isfile(file_path):
        file_info = os.stat(file_path)
        return convert_bytes(file_info.st_size)


# Lets check the file size of MS Paint exe 
# or you can use any file path
file_path = r"C:\Windows\System32\mspaint.exe"
print(file_size(file_path))

Result:

6.1 MB

Answer 4

Using pathlib (added in Python 3.4 or a backport available on PyPI):

from pathlib import Path
file = Path() / 'doc.txt'  # or Path('./doc.txt')
size = file.stat().st_size

This is really only an interface around os.stat, but using pathlib provides an easy way to access other file related operations.

Answer 5

There is a bit-shift trick I use when I want to convert from bytes to any other unit. Each right shift by 10 divides by 1024 (2^10), so it moves the value up one unit (bytes to KB, KB to MB, and so on) and discards any remainder.

Example: 5GB are 5368709120 bytes

print (5368709120 >> 10)  # 5242880 kilobytes (kB)
print (5368709120 >> 20 ) # 5120 megabytes (MB)
print (5368709120 >> 30 ) # 5 gigabytes (GB)
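
As a sketch, the shifts can be collected in a lookup table. The names SHIFTS and to_unit are assumptions made for illustration. Note that shifting floors the result, so 1536 bytes comes out as 1 KB, not 1.5.

```python
# Hypothetical helper: map a unit name to the number of bits to shift.
SHIFTS = {"KB": 10, "MB": 20, "GB": 30}


def to_unit(num_bytes, unit):
    """Convert a byte count to whole KB/MB/GB, dropping any remainder."""
    return num_bytes >> SHIFTS[unit]


print(to_unit(5368709120, "GB"))
print(to_unit(1536, "KB"))
```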

Answer 6

Strictly sticking to the question, the Python code (+ pseudo-code) would be:

import os
file_path = r"<path to your file>"
if os.stat(file_path).st_size > 0:
    <send an email to somebody>
else:
    <continue to other things>

Answer 7

# Get the file size, print it, process it...
# os.stat() provides the file size in the st_size attribute.
# The size is reported in bytes.

import os

fsize = os.stat('filepath')
print('size: ' + str(fsize.st_size))

# Check whether the file is smaller than 10 MB

if fsize.st_size < 10000000:
    pass  # process it ...

Answer 8

We have two options; both involve importing the os module.

1) os.stat() returns an object with many fields, including the file's timestamps (such as the last-modified time). Among them, st_size gives the exact size of the file in bytes. It is an attribute, not a method, so it takes no parentheses.

os.stat("filename").st_size

2) os.path.getsize() returns the same value directly. The path may be absolute or relative; a relative path is resolved against the current working directory.

os.path.getsize("path of file")
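
A minimal sketch comparing the two calls on a throwaway file (the file name sample.bin and its 2048-byte content are assumptions made for the example):

```python
import os
import tempfile

# Create a temporary 2048-byte file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.bin")  # hypothetical file name
    with open(path, "wb") as fh:
        fh.write(b"x" * 2048)

    size_from_stat = os.stat(path).st_size      # attribute, no parentheses
    size_from_getsize = os.path.getsize(path)   # convenience wrapper

print(size_from_stat, size_from_getsize)
```

Both calls report the same number of bytes.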

Q&A

How do I get a module's path?

July 25, 2021 Python实用宝典

Question: How do I get a module's path?

I want to detect whether a module has changed. Now, using inotify is simple, you just need to know the directory you want to get notifications from.

How do I retrieve a module’s path in Python?

Answer 0

import a_module
print(a_module.__file__)

Will actually give you the path to the .pyc file that was loaded, at least on Mac OS X. So I guess you can do:

import os
path = os.path.abspath(a_module.__file__)

You can also try:

path = os.path.dirname(a_module.__file__)

To get the module’s directory.
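
Put together, a sketch using a standard-library package (json is only an example; any module loaded from a file should behave the same way, while built-in modules such as sys have no __file__ at all):

```python
import json  # example module; substitute the module you want to watch
import os

module_file = os.path.abspath(json.__file__)  # full path to the loaded file
module_dir = os.path.dirname(module_file)     # directory to hand to inotify

print(module_file)
print(module_dir)
```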

Answer 1

There is an inspect module in Python.

Official documentation

The inspect module provides several useful functions to help get information about live objects such as modules, classes, methods, functions, tracebacks, frame objects, and code objects. For example, it can help you examine the contents of a class, retrieve the source code of a method, extract and format the argument list for a function, or get all the information you need to display a detailed traceback.

Example:

>>> import os
>>> import inspect
>>> inspect.getfile(os)
'/usr/lib64/python2.7/os.pyc'
>>> inspect.getfile(inspect)
'/usr/lib64/python2.7/inspect.pyc'
>>> os.path.dirname(inspect.getfile(inspect))
'/usr/lib64/python2.7'

Answer 2

As the other answers have said, the best way to do this is with __file__ (demonstrated again below). However, there is an important caveat, which is that __file__ does NOT exist if you are running the module on its own (i.e. as __main__).

For example, say you have two files (both of which are on your PYTHONPATH):

#/path1/foo.py
import bar
print(bar.__file__)

and

#/path2/bar.py
import os
print(os.getcwd())
print(__file__)

Running foo.py will give the output:

/path1        # "import bar" causes the line "print(os.getcwd())" to run
/path2/bar.py # then "print(__file__)" runs
/path2/bar.py # then the import statement finishes and "print(bar.__file__)" runs

HOWEVER if you try to run bar.py on its own, you will get:

/path2                              # "print(os.getcwd())" still works fine
Traceback (most recent call last):  # but __file__ doesn't exist if bar.py is running as main
  File "/path2/bar.py", line 3, in <module>
    print(__file__)
NameError: name '__file__' is not defined

Hope this helps. This caveat cost me a lot of time and confusion while testing the other solutions presented.

Answer 3

I will try tackling a few variations on this question as well:

finding the path of the called script
finding the path of the currently executing script
finding the directory of the called script

(Some of these questions have been asked on SO, but have been closed as duplicates and redirected here.)

Caveats of Using __file__

For a module that you have imported:

import something
something.__file__

will return the absolute path of the module. However, given the following script foo.py:

#foo.py
print '__file__', __file__

Calling it with ‘python foo.py’ will return simply ‘foo.py’. If you add a shebang:

#!/usr/bin/python
#foo.py
print '__file__', __file__

and call it using ./foo.py, it will return ‘./foo.py’. Calling it from a different directory (e.g. put foo.py in directory bar), then calling either

python bar/foo.py

or adding a shebang and executing the file directly:

bar/foo.py

will return ‘bar/foo.py’ (the relative path).

Finding the directory

Now going from there to get the directory, os.path.dirname(__file__) can also be tricky. At least on my system, it returns an empty string if you call it from the same directory as the file. ex.

# foo.py
import os
print '__file__ is:', __file__
print 'os.path.dirname(__file__) is:', os.path.dirname(__file__)

will output:

__file__ is: foo.py
os.path.dirname(__file__) is:

In other words, it returns an empty string, so this does not seem reliable if you want to use it for the current file (as opposed to the file of an imported module). To get around this, you can wrap it in a call to abspath:

# foo.py
import os
print 'os.path.abspath(__file__) is:', os.path.abspath(__file__)
print 'os.path.dirname(os.path.abspath(__file__)) is:', os.path.dirname(os.path.abspath(__file__))

which outputs something like:

os.path.abspath(__file__) is: /home/user/bar/foo.py
os.path.dirname(os.path.abspath(__file__)) is: /home/user/bar

Note that abspath() does NOT resolve symlinks. If you want to do this, use realpath() instead. For example, making a symlink file_import_testing_link pointing to file_import_testing.py, with the following content:

import os
print 'abspath(__file__)',os.path.abspath(__file__)
print 'realpath(__file__)',os.path.realpath(__file__)

executing will print absolute paths something like:

abspath(__file__) /home/user/file_test_link
realpath(__file__) /home/user/file_test.py

file_import_testing_link -> file_import_testing.py

Using inspect

@SummerBreeze mentions using the inspect module.

This seems to work well, and is quite concise, for imported modules:

import os
import inspect
print 'inspect.getfile(os) is:', inspect.getfile(os)

obediently returns the absolute path. However for finding the path of the currently executing script, I did not see a way to use it.
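
One possible way to use inspect for the current file is to ask for the file of the current frame. This is a sketch for Python 3, not something the answer above tested; the helper name current_file is an assumption, and inspect.currentframe() may return None on implementations other than CPython.

```python
import inspect
import os


def current_file():
    """Return the absolute path of the file this function is defined in."""
    frame = inspect.currentframe()  # may be None on non-CPython implementations
    if frame is None:
        return None
    # inspect.getfile() on a frame returns the file name of its code object.
    return os.path.abspath(inspect.getfile(frame))


print(current_file())
```

In an interactive session the frame has no real file, so the result is a placeholder such as <stdin> joined to the working directory.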

Answer 4

I don’t get why no one is talking about this, but to me the simplest solution is using imp.find_module("modulename") (see the imp module documentation):

import imp
imp.find_module("os")

It gives a tuple with the path in second position:

(<open file '/usr/lib/python2.7/os.py', mode 'U' at 0x7f44528d7540>,
'/usr/lib/python2.7/os.py',
('.py', 'U', 1))

The advantage of this method over the “inspect” one is that you don’t need to import the module to make it work, and you can use a string in input. Useful when checking modules called in another script for example.

EDIT:

In python3, importlib module should do:

Doc of importlib.util.find_spec:

Return the spec for the specified module.

First, sys.modules is checked to see if the module was already imported. If so, then sys.modules[name].__spec__ is returned. If that happens to be set to None, then ValueError is raised. If the module is not in sys.modules, then sys.meta_path is searched for a suitable spec with the value of 'path' given to the finders. None is returned if no spec could be found.

If the name is for submodule (contains a dot), the parent module is automatically imported.

The name and package arguments work the same as importlib.import_module(). In other words, relative module names (with leading dots) work.
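
A short sketch of the importlib approach. The helper name find_module_path is an assumption; json is used only as an example. For a top-level name, find_spec() locates the module without importing it.

```python
import importlib.util


def find_module_path(name):
    """Return where a module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin is a file path for ordinary modules; for some modules it is
    # a marker string such as 'built-in' or 'frozen' instead.
    return spec.origin


print(find_module_path("json"))
print(find_module_path("no_such_module_xyz"))
```

For a package such as json, origin points at the package's __init__.py; the directories are available in spec.submodule_search_locations.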

Answer 5

This was trivial.

Each module has a __file__ variable that shows its path, which may be relative to where you are right now.

Therefore, getting the directory of the module you want notifications for is as simple as:

import os
os.path.dirname(__file__)

Answer 6

import os
path = os.path.abspath(__file__)
dir_path = os.path.dirname(path)

Answer 7

import module
print(module.__path__)

Packages support one more special attribute, __path__. This is initialized to be a list containing the name of the directory holding the package’s __init__.py before the code in that file is executed. This variable can be modified; doing so affects future searches for modules and subpackages contained in the package.

While this feature is not often needed, it can be used to extend the set of modules found in a package.

Source

Answer 8

Command Line Utility

You can tweak it to a command line utility,

python-which <package name>

Create /usr/local/bin/python-which

#!/usr/bin/env python

import importlib
import os
import sys

args = sys.argv[1:]
if len(args) > 0:
    module = importlib.import_module(args[0])
    print(os.path.dirname(module.__file__))

Make it executable

sudo chmod +x /usr/local/bin/python-which

Answer 9

So I spent a fair amount of time trying to do this with py2exe. The problem was to get the base folder of the script whether it was being run as a Python script or as a py2exe executable. It also had to work whether it was being run from the current folder, another folder or (this was the hardest) from the system’s path.

Eventually I used this approach, using sys.frozen as an indicator of running in py2exe:

import os,sys
if hasattr(sys,'frozen'): # only when running in py2exe this exists
    base = sys.prefix
else: # otherwise this is a regular python script
    base = os.path.dirname(os.path.realpath(__file__))

Answer 10

You can just import your module, then enter its name at the prompt, and you’ll get its full path:

>>> import os
>>> os
<module 'os' from 'C:\\Users\\Hassan Ashraf\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\os.py'>
>>>

Answer 11

If you want to retrieve the package’s root path from any of its modules, the following works (tested on Python 3.6):

from . import __path__ as ROOT_PATH
print(ROOT_PATH)

The main __init__.py path can also be referenced by using __file__ instead.

Hope this helps!

Answer 12

If the only caveat of using __file__ is when current, relative directory is blank (ie, when running as a script from the same directory where the script is), then a trivial solution is:

import os.path
mydir = os.path.dirname(__file__) or '.'
full  = os.path.abspath(mydir)
print __file__, mydir, full

And the result:

$ python teste.py 
teste.py . /home/user/work/teste

The trick is in or '.' after the dirname() call. It sets the dir as ., which means current directory and is a valid directory for any path-related function.

Thus, using abspath() is not truly needed. But if you use it anyway, the trick is not needed: abspath() accepts blank paths and properly interprets it as the current directory.
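
Both points can be checked without a real script by passing the bare file name directly (a sketch; the literal "teste.py" stands in for a relative __file__):

```python
import os

script_dir = os.path.dirname("teste.py")  # empty string: no directory part
safe_dir = script_dir or "."              # fall back to the current directory

print(repr(script_dir), safe_dir)
# abspath() treats the empty string as the current directory, too.
print(os.path.abspath(script_dir) == os.path.abspath(safe_dir))
```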

Answer 13

I’d like to contribute with one common scenario (in Python 3) and explore a few approaches to it.

The built-in function open() accepts either relative or absolute path as its first argument. The relative path is treated as relative to the current working directory though so it is recommended to pass the absolute path to the file.

Simply said, if you run a script file with the following code, it is not guaranteed that the example.txt file will be created in the same directory where the script file is located:

with open('example.txt', 'w'):
    pass

To fix this code we need to get the path to the script and make it absolute. To ensure the path to be absolute we simply use the os.path.realpath() function. To get the path to the script there are several common functions that return various path results:

os.getcwd()
os.path.realpath('example.txt')
sys.argv[0]
__file__

Both os.getcwd() and os.path.realpath() return path results based on the current working directory, which is generally not what we want. The first element of the sys.argv list is the path of the root script (the script you run), regardless of whether you access the list in the root script itself or in any of its modules. It might come in handy in some situations. The __file__ variable contains the path of the module in which it is referenced.

The following code correctly creates a file example.txt in the same directory where the script is located:

import os

filedir = os.path.dirname(os.path.realpath(__file__))
filepath = os.path.join(filedir, 'example.txt')

with open(filepath, 'w'):
    pass
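
The same idea can be written with pathlib. This is a sketch, and the helper name sibling_path is an assumption; it takes the script path as an argument so it can be tried without a real script, and in practice you would pass __file__.

```python
from pathlib import Path


def sibling_path(script_file, name):
    """Return the path of `name` inside the directory that holds `script_file`."""
    # resolve() makes the path absolute and follows symlinks, like realpath().
    return Path(script_file).resolve().parent / name


# In a real script: sibling_path(__file__, "example.txt")
print(sibling_path("some_dir/script.py", "example.txt"))
```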

Answer 14

If you would like to know the absolute path from your script, you can use a Path object:

from pathlib import Path

print(Path().absolute())
print(Path().resolve())
print(Path.cwd())

cwd() method

Return a new path object representing the current directory (as returned by os.getcwd()).

resolve() method

Make the path absolute, resolving any symlinks. A new path object is returned.

Answer 15

From within modules of a python package I had to refer to a file that resided in the same directory as package. Ex.

some_dir/
  maincli.py
  top_package/
    __init__.py
    level_one_a/
      __init__.py
      my_lib_a.py
      level_two/
        __init__.py
        hello_world.py
    level_one_b/
      __init__.py
      my_lib_b.py

So in above I had to call maincli.py from my_lib_a.py module knowing that top_package and maincli.py are in the same directory. Here’s how I get the path to maincli.py:

import sys
import os
import imp


class ConfigurationException(Exception):
    pass


# inside of my_lib_a.py
def get_maincli_path():
    maincli_path = os.path.abspath(imp.find_module('maincli')[1])
    # top_package = __package__.split('.')[0]
    # mod = sys.modules.get(top_package)
    # modfile = mod.__file__
    # pkg_in_dir = os.path.dirname(os.path.dirname(os.path.abspath(modfile)))
    # maincli_path = os.path.join(pkg_in_dir, 'maincli.py')

    if not os.path.exists(maincli_path):
        err_msg = 'This script expects that "maincli.py" be installed to the '\
        'same directory: "{0}"'.format(maincli_path)
        raise ConfigurationException(err_msg)

    return maincli_path

Based on posting by PlasmaBinturong I modified the code.

Answer 16

If you wish to do this dynamically in a “program” try this code:
My point is, you may not know the exact name of the module to “hardcode” it. It may be selected from a list or may not be currently running to use __file__.

(I know, it will not work in Python 3)

global modpath
modname = 'os' #This can be any module name on the fly
#Create a file called "modname.py"
f=open("modname.py","w")
f.write("import "+modname+"\n")
f.write("modpath = "+modname+"\n")
f.close()
#Call the file with execfile()
execfile('modname.py')
print modpath
<module 'os' from 'C:\Python27\lib\os.pyc'>

I tried to get rid of the “global” issue but found cases where it did not work. I think execfile() can be emulated in Python 3. Since this is in a program, it can easily be put in a method or module for reuse.
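
For Python 3, where execfile() no longer exists, a module whose name is only known at run time can be loaded with importlib.import_module() instead of generating a file. This swaps in a different technique from the one above; it is a sketch, and the helper name module_location is an assumption.

```python
import importlib


def module_location(modname):
    """Import a module by name and return its file path (None if it has none)."""
    module = importlib.import_module(modname)
    # Built-in modules such as sys have no __file__ attribute.
    return getattr(module, "__file__", None)


modname = "json"  # this can be any module name chosen on the fly
print(module_location(modname))
```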

Answer 17

If you installed it using pip, “pip show” works great (‘Location’)

$ pip show detectron2

Name: detectron2
Version: 0.1
Summary: Detectron2 is FAIR next-generation research platform for object detection and segmentation.
Home-page: https://github.com/facebookresearch/detectron2
Author: FAIR
Author-email: None
License: UNKNOWN
Location: /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages
Requires: yacs, tabulate, tqdm, pydot, tensorboard, Pillow, termcolor, future, cloudpickle, matplotlib, fvcore

Answer 18

Here is a quick bash script in case it’s useful to anyone. I just want to be able to set an environment variable so that I can pushd to the code.

#!/bin/bash
module=${1:?"I need a module name"}

python << EOI
import $module
import os
print(os.path.dirname($module.__file__))
EOI

Shell example:

[root@sri-4625-0004 ~]# export LXML=$(get_python_path.sh lxml)
[root@sri-4625-0004 ~]# echo $LXML
/usr/lib64/python2.7/site-packages/lxml
[root@sri-4625-0004 ~]#

Q&A

How do I print a number with commas as thousands separators?

July 25, 2021 Python实用宝典

Question: How do I print a number with commas as thousands separators?

I am trying to print an integer in Python 2.6.1 with commas as thousands separators. For example, I want to show the number 1234567 as 1,234,567. How would I go about doing this? I have seen many examples on Google, but I am looking for the simplest practical way.

It does not need to be locale-specific to decide between periods and commas. I would prefer something as simple as reasonably possible.

Answer 0

Locale unaware

'{:,}'.format(value)  # For Python ≥2.7
f'{value:,}'  # For Python ≥3.6

Locale aware

import locale
locale.setlocale(locale.LC_ALL, '')  # Use '' for auto, or force e.g. to 'en_US.UTF-8'

'{:n}'.format(value)  # For Python ≥2.7
f'{value:n}'  # For Python ≥3.6

Reference

Per Format Specification Mini-Language,

The ',' option signals the use of a comma for a thousands separator. For a locale aware separator, use the 'n' integer presentation type instead.
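
A quick check of the locale-unaware form, using the number from the question (a sketch; the f-string line needs Python 3.6 or later):

```python
value = 1234567

# All three spellings apply the same ',' format specifier.
print(format(value, ","))
print("{:,}".format(value))
print(f"{value:,}")
```

Each line prints 1,234,567.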

Answer 1

I got this to work:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US')
'en_US'
>>> locale.format("%d", 1255000, grouping=True)
'1,255,000'

Sure, you don’t need internationalization support, but it’s clear, concise, and uses a built-in library.

P.S. That “%d” is the usual %-style formatter. You can have only one formatter, but it can be whatever you need in terms of field width and precision settings.

P.P.S. If you can’t get locale to work, I’d suggest a modified version of Mark’s answer:

def intWithCommas(x):
    if type(x) not in [type(0), type(0L)]:
        raise TypeError("Parameter must be an integer.")
    if x < 0:
        return '-' + intWithCommas(-x)
    result = ''
    while x >= 1000:
        x, r = divmod(x, 1000)
        result = ",%03d%s" % (r, result)
    return "%d%s" % (x, result)

Recursion is useful for the negative case, but one recursion per comma seems a bit excessive to me.
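
The function above relies on the Python 2 literal 0L. A possible Python 3 port, offered as a sketch (the name int_with_commas and the isinstance check are substitutions, not part of the original answer):

```python
def int_with_commas(x):
    """Format an integer with commas as thousands separators."""
    if not isinstance(x, int):
        raise TypeError("Parameter must be an integer.")
    if x < 0:
        return '-' + int_with_commas(-x)   # recurse once for the sign
    result = ''
    while x >= 1000:
        x, r = divmod(x, 1000)             # peel off the last three digits
        result = ",%03d%s" % (r, result)
    return "%d%s" % (x, result)


print(int_with_commas(1255000))
print(int_with_commas(-1234567))
```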

Answer 2

For inefficiency and unreadability it’s hard to beat:

>>> import itertools
>>> s = '-1234567'
>>> ','.join(["%s%s%s" % (x[0], x[1] or '', x[2] or '') for x in itertools.izip_longest(s[::-1][::3], s[::-1][1::3], s[::-1][2::3])])[::-1].replace('-,','-')

Answer 3

Here is the locale grouping code after removing irrelevant parts and cleaning it up a little:

(The following only works for integers)

def group(number):
    s = '%d' % number
    groups = []
    while s and s[-1].isdigit():
        groups.append(s[-3:])
        s = s[:-3]
    return s + ','.join(reversed(groups))

>>> group(-23432432434.34)
'-23,432,432,434'

There are already some good answers in here. I just want to add this for future reference. In python 2.7 there is going to be a format specifier for thousands separator. According to python docs it works like this

>>> '{:20,.2f}'.format(f)
'18,446,744,073,709,551,616.00'

In python3.1 you can do the same thing like this:

>>> format(1234567, ',d')
'1,234,567'

Answer 4

I’m surprised that no one has mentioned that you can do this with f-strings in Python 3.6 as easy as this:

>>> num = 10000000
>>> print(f"{num:,}")
10,000,000

… where the part after the colon is the format specifier. The comma is the separator character you want, so f"{num:_}" uses underscores instead of a comma.

This is equivalent to using format(num, ",") in older versions of Python 3.

Answer 5

Here’s a one-line regex replacement:

re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % val)

Works only for integral outputs:

import re
val = 1234567890
re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % val)
# Returns: '1,234,567,890'

val = 1234567890.1234567890
re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % val)
# Returns: '1,234,567,890'

Or, for floats with fewer than 4 decimal digits, change the format specifier to %.3f:

re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1,", "%.3f" % val)
# Returns: '1,234,567,890.123'

NB: It doesn’t work correctly with more than three decimal digits, because it will also try to group the decimal part:

re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1,", "%.5f" % val)
# Returns: '1,234,567,890.12,346'

How it works

Let’s break it down:

re.sub(pattern, repl, string)

pattern = \
    "(\d)           # Find one digit...
     (?=            # that is followed by...
         (\d{3})+   # one or more groups of three digits...
         (?!\d)     # which are not followed by any more digits.
     )",

repl = \
    r"\1,",         # Replace that one digit by itself, followed by a comma,
                    # and continue looking for more matches later in the string.
                    # (re.sub() replaces all matches it finds in the input)

string = \
    "%d" % val      # Format the string as a decimal to begin with

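The annotated breakdown above is not runnable as written. One way to keep the comments inside a working pattern is re.VERBOSE; the following is a minimal sketch of that idea, where the names THOUSANDS and add_commas are made up for illustration:

```python
import re

# Same pattern as in the answer, compiled with re.VERBOSE so that whitespace
# and "#" comments inside the pattern are ignored by the regex engine.
THOUSANDS = re.compile(r"""
    (\d)           # find one digit...
    (?=            # that is followed by...
        (\d{3})+   # one or more groups of three digits...
        (?!\d)     # which are not followed by any more digits
    )
""", re.VERBOSE)

def add_commas(val):
    """Hypothetical helper: format val as an integer, then insert commas."""
    return THOUSANDS.sub(r"\1,", "%d" % val)

print(add_commas(1234567890))  # 1,234,567,890
print(add_commas(999))         # 999
```
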
Answer 6

This is what I do for floats. Although, honestly, I’m not sure which versions it works for – I’m using 2.7:

my_number = 4385893.382939491

my_string = '{:0,.2f}'.format(my_number)

Returns: 4,385,893.38

Update: I recently had an issue with this format (couldn’t tell you the exact reason), but was able to fix it by dropping the 0:

my_string = '{:,.2f}'.format(my_number)

Answer 7

You can also use '{:n}'.format( value ) for a locale-aware representation. I think this is the simplest way to get a locale-based solution.

For more information, search for “thousands” in the Python documentation.

For currency, you can use locale.currency, setting the flag grouping:

Code

import locale

locale.setlocale( locale.LC_ALL, '' )
locale.currency( 1234567.89, grouping = True )

Output

'Portuguese_Brazil.1252'
'R$ 1.234.567,89'
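
A minimal sketch of the '{:n}' variant mentioned above. The helper name format_for_locale is made up for illustration, and the output depends entirely on the locale configured on the machine, so no expected output is shown. Note that locale.setlocale changes process-wide state.

```python
import locale

def format_for_locale(value):
    """Hypothetical helper: group digits according to the user's default locale."""
    try:
        # '' selects the user's default locale from the environment
        locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        pass  # keep whatever locale is already active
    return '{:n}'.format(value)

print(format_for_locale(1234567))
```

In the default "C" locale no grouping is applied at all, so this only helps when a real locale is configured.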

Answer 8

Slightly expanding the answer of Ian Schneider:

If you want to use a custom thousands separator, the simplest solution is:

'{:,}'.format(value).replace(',', your_custom_thousands_separator)

Examples

'{:,.2f}'.format(123456789.012345).replace(',', ' ')

If you want the German representation like this, it gets a bit more complicated:

('{:,.2f}'.format(123456789.012345)
          .replace(',', ' ')  # 'save' the thousands separators 
          .replace('.', ',')  # dot to comma
          .replace(' ', '.')) # thousand separators to dot
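
The three chained replace calls can also be done in one pass with str.translate, which avoids the temporary placeholder character. This is only a sketch of that alternative; format_number and its parameter names are assumptions, not part of the answer above, and the f-string needs Python 3.6 or later.

```python
def format_number(value, thousands_sep=',', decimal_sep='.', places=2):
    """Hypothetical helper: format with custom thousands and decimal separators."""
    text = f'{value:,.{places}f}'  # e.g. '123,456,789.01'
    # Swap both separators at once, so neither replacement clobbers the other.
    table = str.maketrans({',': thousands_sep, '.': decimal_sep})
    return text.translate(table)

print(format_number(123456789.012345, '.', ','))  # 123.456.789,01
print(format_number(1234.5, ' ', ','))            # 1 234,50
```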

Answer 9

I’m sure there must be a standard library function for this, but it was fun to try to write it myself using recursion so here’s what I came up with:

def intToStringWithCommas(x):
    if type(x) is not int and type(x) is not long:
        raise TypeError("Not an integer!")
    if x < 0:
        return '-' + intToStringWithCommas(-x)
    elif x < 1000:
        return str(x)
    else:
        return intToStringWithCommas(x / 1000) + ',' + '%03d' % (x % 1000)

Having said that, if someone else does find a standard way to do it, you should use that instead.
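
The function above targets Python 2: long no longer exists in Python 3, and / on integers returns a float there. A sketch of the same recursion for Python 3 might look like this (the snake_case name is an illustrative choice):

```python
def int_to_string_with_commas(x):
    """Python 3 sketch of the recursive approach above (name is illustrative)."""
    if not isinstance(x, int):
        raise TypeError("Not an integer!")
    if x < 0:
        return '-' + int_to_string_with_commas(-x)
    if x < 1000:
        return str(x)
    # // is floor division; plain / would produce a float in Python 3
    return int_to_string_with_commas(x // 1000) + ',' + '%03d' % (x % 1000)

print(int_to_string_with_commas(-1234567))  # -1,234,567
print(int_to_string_with_commas(1000000))   # 1,000,000
```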

Answer 10

From the comments to activestate recipe 498181 I reworked this:

import re
def thous(x, sep=',', dot='.'):
    num, _, frac = str(x).partition(dot)
    num = re.sub(r'(\d{3})(?=\d)', r'\1'+sep, num[::-1])[::-1]
    if frac:
        num += dot + frac
    return num

It uses a regular expression feature, lookahead, i.e. (?=\d), to make sure that only groups of three digits that have a digit ‘after’ them get a comma. I say ‘after’ because the string is reversed at this point.

[::-1] just reverses a string.

Answer 11

The accepted answer is fine, but I actually prefer format(number,','). Easier for me to interpret and remember.

https://docs.python.org/3/library/functions.html#format

Answer 12

Python 3

Integers (without decimal):

"{:,d}".format(1234567)

Floats (with decimal):

"{:,.2f}".format(1234567)

where the number before f specifies the number of decimal places.

Bonus

Quick-and-dirty starter function for the Indian lakhs/crores numbering system (12,34,567):

https://stackoverflow.com/a/44832241/4928578
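
The linked answer is not reproduced here. As an independent sketch of the grouping it refers to (last three digits first, then groups of two), something like the following could serve as a starting point; format_indian is a made-up name and this is not the code from that link:

```python
def format_indian(n):
    """Hypothetical helper: group an integer in the lakhs/crores style."""
    sign = '-' if n < 0 else ''
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]  # the last group has three digits
    groups = []
    while head:                            # the remaining groups have two
        groups.append(head[-2:])
        head = head[:-2]
    return sign + ','.join(reversed(groups)) + ',' + tail

print(format_indian(1234567))  # 12,34,567
print(format_indian(100000))   # 1,00,000
```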

Answer 13

From Python 2.7 on (the ',' format specifier was added in 2.7 and 3.1), you can do this:

def format_builtin(n):
    return format(n, ',')

For older Python versions, and just for your information, here are two manual solutions. They turn floats into ints, but negative numbers work correctly:

def format_number_using_lists(number):
    string = '%d' % number
    result_list = list(string)
    indexes = range(len(string))
    for index in indexes[::-3][1:]:
        if result_list[index] != '-':
            result_list.insert(index+1, ',')
    return ''.join(result_list)

A few things to notice here:

this line, string = '%d' % number, neatly converts a number to a string; it supports negatives and drops the fractional part of floats, turning them into ints;
this slice, indexes[::-3], returns every third item starting from the end, so I used another slice, [1:], to drop the first of those (the index of the last digit), because I don’t need a comma after the last digit;
this conditional, if result_list[index] != '-', supports negative numbers: it avoids inserting a comma right after the minus sign.

And a more hardcore version:

def format_number_using_generators_and_list_comprehensions(number):
    string = '%d' % number
    generator = reversed( 
        [
            value+',' if (index!=0 and value!='-' and index%3==0) else value
            for index,value in enumerate(reversed(string))
        ]
    )
    return ''.join(generator)

Answer 14

I am a Python beginner, but an experienced programmer. I have Python 3.5, so I can just use the comma, but this is nonetheless an interesting programming exercise. Consider the case of an unsigned integer. The most readable Python program for adding thousands separators appears to be:

def add_commas(instr):
    out = [instr[0]]
    for i in range(1, len(instr)):
        if (len(instr) - i) % 3 == 0:
            out.append(',')
        out.append(instr[i])
    return ''.join(out)

It is also possible to use a list comprehension:

def add_commas(instr):
    rng = reversed(range(1, len(instr) + (len(instr) - 1)//3 + 1))
    out = [',' if j%4 == 0 else instr[-(j - j//4)] for j in rng]
    return ''.join(out)

This is shorter, and could be a one liner, but you will have to do some mental gymnastics to understand why it works. In both cases we get:

for i in range(1, 11):
    instr = '1234567890'[:i]
    print(instr, add_commas(instr))

1 1
12 12
123 123
1234 1,234
12345 12,345
123456 123,456
1234567 1,234,567
12345678 12,345,678
123456789 123,456,789
1234567890 1,234,567,890

The first version is the more sensible choice, if you want the program to be understood.

Answer 15

Here’s one that works for floats too:

def float2comma(f):
    s = str(abs(f)) # Convert to a string
    decimalposition = s.find(".") # Look for decimal point
    if decimalposition == -1:
        decimalposition = len(s) # If no decimal, then just work from the end
    out = "" 
    for i in range(decimalposition+1, len(s)): # do the decimal
        if not (i-decimalposition-1) % 3 and i-decimalposition-1: out = out+","
        out = out+s[i]      
    if len(out):
        out = "."+out # add the decimal point if necessary
    for i in range(decimalposition-1,-1,-1): # working backwards from decimal point
        if not (decimalposition-i-1) % 3 and decimalposition-i-1: out = ","+out
        out = s[i]+out      
    if f < 0:
        out = "-"+out
    return out

Usage Example:

>>> float2comma(10000.1111)
'10,000.111,1'
>>> float2comma(656565.122)
'656,565.122'
>>> float2comma(-656565.122)
'-656,565.122'

Answer 16

One liner for Python 2.5+ and Python 3 (positive int only):

''.join(reversed([x + (',' if i and not i % 3 else '') for i, x in enumerate(reversed(str(1234567)))]))

Answer 17

Universal solution

I have found some issues with the dot separator in the previous top-voted answers. I have designed a universal solution where you can use whatever you want as a thousands separator without modifying the locale. I know it’s not the most elegant solution, but it gets the job done. Feel free to improve it!

def format_integer(number, thousand_separator='.'):
    def reverse(string):
        string = "".join(reversed(string))
        return string

    s = reverse(str(number))
    count = 0
    result = ''
    for char in s:
        count = count + 1
        if count % 3 == 0:
            if len(s) == count:
                result = char + result
            else:
                result = thousand_separator + char + result
        else:
            result = char + result
    return result


print(format_integer(50))
# 50
print(format_integer(500))
# 500
print(format_integer(50000))
# 50.000
print(format_integer(50000000))
# 50.000.000
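
One possible improvement, offered only as a sketch: format_integer above treats the minus sign as a digit, so format_integer(-500) comes out as '-.500'. The variant below (format_integer_signed is a made-up name) strips the sign first and groups the digits by slicing:

```python
def format_integer_signed(number, thousand_separator='.'):
    """Hypothetical variant of format_integer that also handles negatives."""
    sign = '-' if number < 0 else ''
    digits = str(abs(number))
    groups = []
    while digits:
        groups.append(digits[-3:])  # peel off the last three digits
        digits = digits[:-3]
    return sign + thousand_separator.join(reversed(groups))

print(format_integer_signed(50000000))       # 50.000.000
print(format_integer_signed(-500))           # -500
print(format_integer_signed(-1234567, ' '))  # -1 234 567
```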

Answer 18

This formats money, including the commas:

def format_money(money, presym='$', postsym=''):
    fmt = '%0.2f' % money
    dot = fmt.find('.')     # str method; no need to import the string module
    ret = []
    if money < 0 :
        ret.append('(')
        p0 = 1              # skip the minus sign; negatives are shown in parentheses
    else :
        p0 = 0
    ret.append(presym)
    p1 = (dot-p0) % 3 + p0
    if p1 == p0 : p1 += 3   # digit count is a multiple of 3: avoid an empty first group
    while True :
        ret.append(fmt[p0:p1])
        if p1 == dot : break
        ret.append(',')
        p0 = p1
        p1 += 3
    ret.append(fmt[dot:])   # decimals
    ret.append(postsym)
    if money < 0 : ret.append(')')
    return ''.join(ret)

Answer 19

I have a python 2 and python 3 version of this code. I know that the question was asked for python 2 but now (8 years later lol) people will probably be using python 3.

Python 3 Code:

import random
number = str(random.randint(1, 10000000))
comma_placement = 4
print('The original number is: {}. '.format(number))
while True:
    if len(number) % 3 == 0:
        for i in range(0, len(number) // 3 - 1):
            number = number[0:len(number) - comma_placement + 1] + ',' + number[len(number) - comma_placement + 1:]
            comma_placement = comma_placement + 4
    else:
        for i in range(0, len(number) // 3):
            number = number[0:len(number) - comma_placement + 1] + ',' + number[len(number) - comma_placement + 1:]
            comma_placement = comma_placement + 4  # advance past the comma just inserted
    break
print('The new and improved number is: {}'.format(number))

Python 2 Code: (Edit. The python 2 code isn’t working. I am thinking that the syntax is different).

import random
number = str(random.randint(1, 10000000))
comma_placement = 4
print 'The original number is: %s.' % (number)
while True:
    if len(number) % 3 == 0:
        for i in range(0, len(number) // 3 - 1):
            number = number[0:len(number) - comma_placement + 1] + ',' + number[len(number) - comma_placement + 1:]
            comma_placement = comma_placement + 4
    else:
        for i in range(0, len(number) // 3):
            number = number[0:len(number) - comma_placement + 1] + ',' + number[len(number) - comma_placement + 1:]
            comma_placement = comma_placement + 4  # advance past the comma just inserted
    break
print 'The new and improved number is: %s.' % (number)

Answer 20

I’m using python 2.5 so I don’t have access to the built-in formatting.

I looked at the Django code for intcomma (intcomma_recurs in the code below) and realized it’s inefficient: it’s recursive, and compiling the regex on every run is not a good thing either. This is not necessarily an ‘issue’, as Django isn’t really THAT focused on this kind of low-level performance. Also, I was expecting a factor of 10 difference in performance, but it’s only 3 times slower.

Out of curiosity I implemented a few versions of intcomma to see what the performance advantage of using regex is. My test data shows a slight advantage on this task, but surprisingly little.

I was also pleased to confirm what I suspected: the reverse xrange approach is unnecessary in the no-regex case, but it does make the code look slightly better at the cost of ~10% performance.

Also, I assume that what you’re passing in is a string that looks somewhat like a number. Results are undefined otherwise.

from __future__ import with_statement
from contextlib import contextmanager
import re,time

re_first_num = re.compile(r"\d")
def intcomma_noregex(value):
    end_offset, start_digit, period = len(value),re_first_num.search(value).start(),value.rfind('.')
    if period == -1:
        period=end_offset
    segments,_from_index,leftover = [],0,(period-start_digit) % 3
    for _index in xrange(start_digit+3 if not leftover else start_digit+leftover,period,3):
        segments.append(value[_from_index:_index])
        _from_index=_index
    if not segments:
        return value
    segments.append(value[_from_index:])
    return ','.join(segments)

def intcomma_noregex_reversed(value):
    end_offset, start_digit, period = len(value),re_first_num.search(value).start(),value.rfind('.')
    if period == -1:
        period=end_offset
    _from_index,segments = end_offset,[]
    for _index in xrange(period-3,start_digit,-3):
        segments.append(value[_index:_from_index])
        _from_index=_index
    if not segments:
        return value
    segments.append(value[:_from_index])
    return ','.join(reversed(segments))

re_3digits = re.compile(r'(?<=\d)\d{3}(?!\d)')
def intcomma(value):
    segments,last_endoffset=[],len(value)
    while last_endoffset > 3:
        digit_group = re_3digits.search(value,0,last_endoffset)
        if not digit_group:
            break
        segments.append(value[digit_group.start():last_endoffset])
        last_endoffset=digit_group.start()
    if not segments:
        return value
    if last_endoffset:
        segments.append(value[:last_endoffset])
    return ','.join(reversed(segments))

def intcomma_recurs(value):
    """
    Converts an integer to a string containing commas every three digits.
    For example, 3000 becomes '3,000' and 45000 becomes '45,000'.
    """
    new = re.sub(r"^(-?\d+)(\d{3})", r'\g<1>,\g<2>', str(value))
    if value == new:
        return new
    else:
        return intcomma_recurs(new)

@contextmanager
def timed(save_time_func):
    begin=time.time()
    try:
        yield
    finally:
        save_time_func(time.time()-begin)

def testset_xsimple(func):
    func('5')

def testset_simple(func):
    func('567')

def testset_onecomma(func):
    func('567890')

def testset_complex(func):
    func('-1234567.024')

def testset_average(func):
    func('-1234567.024')
    func('567')
    func('5674')

if __name__ == '__main__':
    print 'Test results:'
    for test_data in ('5','567','1234','1234.56','-253892.045'):
        for func in (intcomma,intcomma_noregex,intcomma_noregex_reversed,intcomma_recurs):
            print func.__name__,test_data,func(test_data)
    times=[]
    def overhead(x):
        pass
    for test_run in xrange(1,4):
        for func in (intcomma,intcomma_noregex,intcomma_noregex_reversed,intcomma_recurs,overhead):
            for testset in (testset_xsimple,testset_simple,testset_onecomma,testset_complex,testset_average):
                for x in xrange(1000): # prime the test
                    testset(func)
                with timed(lambda x:times.append(((test_run,func,testset),x))):
                    for x in xrange(50000):
                        testset(func)
    for (test_run,func,testset),_delta in times:
        print test_run,func.__name__,testset.__name__,_delta

And here are the test results:

intcomma 5 5
intcomma_noregex 5 5
intcomma_noregex_reversed 5 5
intcomma_recurs 5 5
intcomma 567 567
intcomma_noregex 567 567
intcomma_noregex_reversed 567 567
intcomma_recurs 567 567
intcomma 1234 1,234
intcomma_noregex 1234 1,234
intcomma_noregex_reversed 1234 1,234
intcomma_recurs 1234 1,234
intcomma 1234.56 1,234.56
intcomma_noregex 1234.56 1,234.56
intcomma_noregex_reversed 1234.56 1,234.56
intcomma_recurs 1234.56 1,234.56
intcomma -253892.045 -253,892.045
intcomma_noregex -253892.045 -253,892.045
intcomma_noregex_reversed -253892.045 -253,892.045
intcomma_recurs -253892.045 -253,892.045
1 intcomma testset_xsimple 0.0410001277924
1 intcomma testset_simple 0.0369999408722
1 intcomma testset_onecomma 0.213000059128
1 intcomma testset_complex 0.296000003815
1 intcomma testset_average 0.503000020981
1 intcomma_noregex testset_xsimple 0.134000062943
1 intcomma_noregex testset_simple 0.134999990463
1 intcomma_noregex testset_onecomma 0.190999984741
1 intcomma_noregex testset_complex 0.209000110626
1 intcomma_noregex testset_average 0.513000011444
1 intcomma_noregex_reversed testset_xsimple 0.124000072479
1 intcomma_noregex_reversed testset_simple 0.12700009346
1 intcomma_noregex_reversed testset_onecomma 0.230000019073
1 intcomma_noregex_reversed testset_complex 0.236999988556
1 intcomma_noregex_reversed testset_average 0.56299996376
1 intcomma_recurs testset_xsimple 0.348000049591
1 intcomma_recurs testset_simple 0.34600019455
1 intcomma_recurs testset_onecomma 0.625
1 intcomma_recurs testset_complex 0.773999929428
1 intcomma_recurs testset_average 1.6890001297
1 overhead testset_xsimple 0.0179998874664
1 overhead testset_simple 0.0190000534058
1 overhead testset_onecomma 0.0190000534058
1 overhead testset_complex 0.0190000534058
1 overhead testset_average 0.0309998989105
2 intcomma testset_xsimple 0.0360000133514
2 intcomma testset_simple 0.0369999408722
2 intcomma testset_onecomma 0.207999944687
2 intcomma testset_complex 0.302000045776
2 intcomma testset_average 0.523000001907
2 intcomma_noregex testset_xsimple 0.139999866486
2 intcomma_noregex testset_simple 0.141000032425
2 intcomma_noregex testset_onecomma 0.203999996185
2 intcomma_noregex testset_complex 0.200999975204
2 intcomma_noregex testset_average 0.523000001907
2 intcomma_noregex_reversed testset_xsimple 0.130000114441
2 intcomma_noregex_reversed testset_simple 0.129999876022
2 intcomma_noregex_reversed testset_onecomma 0.236000061035
2 intcomma_noregex_reversed testset_complex 0.241999864578
2 intcomma_noregex_reversed testset_average 0.582999944687
2 intcomma_recurs testset_xsimple 0.351000070572
2 intcomma_recurs testset_simple 0.352999925613
2 intcomma_recurs testset_onecomma 0.648999929428
2 intcomma_recurs testset_complex 0.808000087738
2 intcomma_recurs testset_average 1.81900000572
2 overhead testset_xsimple 0.0189998149872
2 overhead testset_simple 0.0189998149872
2 overhead testset_onecomma 0.0190000534058
2 overhead testset_complex 0.0179998874664
2 overhead testset_average 0.0299999713898
3 intcomma testset_xsimple 0.0360000133514
3 intcomma testset_simple 0.0360000133514
3 intcomma testset_onecomma 0.210000038147
3 intcomma testset_complex 0.305999994278
3 intcomma testset_average 0.493000030518
3 intcomma_noregex testset_xsimple 0.131999969482
3 intcomma_noregex testset_simple 0.136000156403
3 intcomma_noregex testset_onecomma 0.192999839783
3 intcomma_noregex testset_complex 0.202000141144
3 intcomma_noregex testset_average 0.509999990463
3 intcomma_noregex_reversed testset_xsimple 0.125999927521
3 intcomma_noregex_reversed testset_simple 0.126999855042
3 intcomma_noregex_reversed testset_onecomma 0.235999822617
3 intcomma_noregex_reversed testset_complex 0.243000030518
3 intcomma_noregex_reversed testset_average 0.56200003624
3 intcomma_recurs testset_xsimple 0.337000131607
3 intcomma_recurs testset_simple 0.342000007629
3 intcomma_recurs testset_onecomma 0.609999895096
3 intcomma_recurs testset_complex 0.75
3 intcomma_recurs testset_average 1.68300008774
3 overhead testset_xsimple 0.0189998149872
3 overhead testset_simple 0.018000125885
3 overhead testset_onecomma 0.018000125885
3 overhead testset_complex 0.0179998874664
3 overhead testset_average 0.0299999713898

Answer 21

This is baked into Python per PEP 378 -> https://www.python.org/dev/peps/pep-0378/

Just use format(1000, ',d') to show an integer with a thousands separator.

More formats are described in the PEP.
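
As a minimal sketch of the format option that PEP 378 describes (the sample values are illustrative and not taken from the answer), the `,` option works through `format()`, `str.format()` and, on Python 3.6+, f-strings:

```python
# Thousands separators via the "," format option from PEP 378.
n = 1234567
x = -253892.045

as_int = format(n, ',d')          # integer with comma separators
via_method = '{:,}'.format(n)     # same option through str.format()
as_float = f'{x:,.3f}'            # floats: combine "," with a precision
underscored = format(n, '_d')     # Python 3.6+: underscore separator (PEP 515)

print(as_int, via_method, as_float, underscored)
```

The `,` option always groups with commas; for locale-dependent separators the `n` presentation type is the usual alternative.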

Answer 22

Here is another variant using a generator function that works for integers:

def ncomma(num):
    def _helper(num):
        # assert isinstance(numstr, basestring)
        numstr = '%d' % num
        for ii, digit in enumerate(reversed(numstr)):
            if ii and ii % 3 == 0 and digit.isdigit():
                yield ','
            yield digit

    return ''.join(reversed([n for n in _helper(num)]))

And here’s a test:

>>> for i in (0, 99, 999, 9999, 999999, 1000000, -1, -111, -1111, -111111, -1000000):
...     print i, ncomma(i)
... 
0 0
99 99
999 999
9999 9,999
999999 999,999
1000000 1,000,000
-1 -1
-111 -111
-1111 -1,111
-111111 -111,111
-1000000 -1,000,000

Answer 23

Just subclass long (or float, or whatever). This is highly practical, because this way you can still use your numbers in math ops (and therefore existing code), but they will all print nicely in your terminal.

>>> class number(long):

        def __init__(self, value):
            self = value

        def __repr__(self):
            s = str(self)
            l = [x for x in s if x in '1234567890']
            for x in reversed(range(len(s)-1)[::3]):
                l.insert(-x, ',')
            l = ''.join(l[1:])
            return ('-'+l if self < 0 else l) 

>>> number(-100000)
-100,000
>>> number(-100)
-100
>>> number(-12345)
-12,345
>>> number(928374)
928,374
>>> 345
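
Python 3 has no `long`, and `str()` on an `int` subclass falls back to `__repr__`, so the class above would need adjusting there. A minimal sketch of the same idea, assuming `int` as the base class and a hypothetical class name `Number`:

```python
class Number(int):
    """An int that prints with thousands separators but still does math."""

    def __repr__(self):
        # Convert to a plain int first so formatting never re-enters __repr__.
        return format(int(self), ',')


print(repr(Number(-100000)))
print(repr(Number(928374)))
print(Number(1000) + 1)   # arithmetic returns a plain int, so no separators here
```

Note that results of arithmetic are ordinary `int` objects; they would have to be wrapped in `Number` again to keep the formatted display.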

Answer 24

Italy:

>>> import locale
>>> locale.setlocale(locale.LC_ALL,"")
'Italian_Italy.1252'
>>> f"{1000:n}"
'1.000'

Answer 25

For floats:

float('1,234.52'.replace(',', ''))
# returns 1234.52

For ints:

int('1,234'.replace(',', ''))
# returns 1234

Q&A

What are the differences between the urllib, urllib2, urllib3 and requests modules?

July 25, 2021 Python实用宝典

Question: What are the differences between the urllib, urllib2, urllib3 and requests modules?

In Python, what are the differences between the urllib, urllib2, urllib3 and requests modules? Why are there three? They seem to do the same thing…

Answer 0

I know it’s been said already, but I’d highly recommend the requests Python package.

If you’ve used languages other than Python, you probably think urllib and urllib2 are easy to use, need little code, and are highly capable; that’s how I used to think. But the requests package is so useful and concise that everyone should be using it.

First, it supports a fully RESTful API, and is as easy as:

import requests

resp = requests.get('http://www.mywebsite.com/user')
resp = requests.post('http://www.mywebsite.com/user')
resp = requests.put('http://www.mywebsite.com/user/put')
resp = requests.delete('http://www.mywebsite.com/user/delete')

Whether the request is a GET or a POST, you never have to encode parameters yourself; it simply takes a dictionary as an argument and is good to go:

userdata = {"firstname": "John", "lastname": "Doe", "password": "jdoe123"}
resp = requests.post('http://www.mywebsite.com/user', data=userdata)

Plus it even has a built-in JSON decoder (again, I know json.loads() isn’t a lot more to write, but this sure is convenient):

resp.json()

Or if your response data is just text, use:

resp.text

This is just the tip of the iceberg. This is the list of features from the requests site:

International Domains and URLs
Keep-Alive & Connection Pooling
Sessions with Cookie Persistence
Browser-style SSL Verification
Basic/Digest Authentication
Elegant Key/Value Cookies
Automatic Decompression
Unicode Response Bodies
Multipart File Uploads
Connection Timeouts
.netrc support
Python 2.6—3.4
Thread-safe.

Answer 1

urllib2 provides some extra functionality, namely the urlopen() function can allow you to specify headers (normally you’d have had to use httplib in the past, which is far more verbose.) More importantly though, urllib2 provides the Request class, which allows for a more declarative approach to doing a request:

import urllib
from urllib2 import Request, urlopen

r = Request(url='http://www.mysite.com')
r.add_header('User-Agent', 'awesome fetcher')
r.add_data(urllib.urlencode({'foo': 'bar'}))  # attach the form-encoded body
response = urlopen(r)

Note that urlencode() is only in urllib, not urllib2.
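
For readers on Python 3, where these pieces were reorganized, a rough equivalent is sketched below. It assumes the Python 3 standard library layout (`urllib.request` and `urllib.parse`) and only builds the request object without sending it; the URL and header value are the placeholders from the snippet above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# urlencode() lives in urllib.parse in Python 3.
body = urlencode({'foo': 'bar'}).encode('ascii')   # request bodies must be bytes

# Request takes the body directly; the old add_data() method no longer exists.
r = Request(url='http://www.mysite.com', data=body)
r.add_header('User-Agent', 'awesome fetcher')

print(r.get_method())   # a request that carries a body defaults to POST
print(r.data)
# To actually send it: urllib.request.urlopen(r)
```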

There are also handlers for implementing more advanced URL support in urllib2. The short answer is, unless you’re working with legacy code, you probably want to use the URL opener from urllib2, but you still need to import urllib for some of the utility functions.

Bonus answer: With Google App Engine, you can use any of httplib, urllib or urllib2, but all of them are just wrappers for Google’s URL Fetch API. That is, you are still subject to the same limitations such as ports, protocols, and the length of the response allowed. You can use the core of the libraries as you would expect for retrieving HTTP URLs, though.

Answer 2

urllib and urllib2 are both Python modules that do URL request related stuff but offer different functionalities.

1) urllib2 can accept a Request object to set the headers for a URL request, urllib accepts only a URL.

2) urllib provides the urlencode method which is used for the generation of GET query strings, urllib2 doesn’t have such a function. This is one of the reasons why urllib is often used along with urllib2.

Requests: Requests is a simple, easy-to-use HTTP library written in Python.

1) Python Requests encodes the parameters automatically so you just pass them as simple arguments, unlike in the case of urllib, where you need to use the urllib.urlencode() function to encode the parameters before passing them.

2) It automatically decodes the response into Unicode.

3) Requests also has far more convenient error handling. If your authentication failed, urllib2 would raise a urllib2.URLError, while Requests would return a normal response object, as expected. All you have to do is check whether the request was successful via the boolean response.ok.

Answer 3

One considerable difference concerns porting Python 2 to Python 3. urllib2 does not exist in Python 3; its methods were ported to urllib. So if you are using it heavily and want to migrate to Python 3 in the future, consider using urllib. However, the 2to3 tool will automatically do most of the work for you.

Answer 4

Just to add to the existing answers, I don’t see anyone mentioning that python requests is not a native library. If you are ok with adding dependencies, then requests is fine. However, if you are trying to avoid adding dependencies, urllib is a native python library that is already available to you.

Answer 5

I like the urllib.urlencode function, and it doesn’t appear to exist in urllib2.

>>> urllib.urlencode({'abc':'d f', 'def': '-!2'})
'abc=d+f&def=-%212'

Answer 6

To get the content of a url:

try: # Try importing requests first.
    import requests
except ImportError:
    try: # Try importing Python3 urllib
        import urllib.request
    except ImportError: # Now importing Python2 urllib
        import urllib


def get_content(url):
    try:  # Using requests.
        return requests.get(url).content # requests.get() returns requests.models.Response.
    except NameError:
        try: # Using Python3 urllib.
            with urllib.request.urlopen(url) as response:
                return response.read() # response is an http.client.HTTPResponse.
        except AttributeError: # Using Python2 urllib.
            return urllib.urlopen(url).read() # urlopen() returns an instance.

It’s hard to write code that works across Python 2, Python 3 and the requests dependency, because the urlopen() functions and requests.get() return different types:

Python3 urllib.request.urlopen() returns a http.client.HTTPResponse
Python2 urllib.urlopen(url) returns an instance
Requests requests.get(url) returns a requests.models.Response
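
If only Python 3 needs supporting, one way to sidestep the type differences is to normalize everything to bytes. The sketch below is an illustration under that assumption, not the answer's code: `fetch_bytes` is a hypothetical helper name, and the `data:` URL is used only so the example runs without network access.

```python
import urllib.request


def fetch_bytes(url, timeout=10):
    """Return the response body as bytes, whatever response class urlopen yields."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# urllib.request understands data: URLs, so this call works offline.
print(fetch_bytes('data:text/plain,hello'))
```

Callers then only ever deal with `bytes` and can decode explicitly, for example with `.decode('utf-8')`, when they need text.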

Answer 7

You should generally use urllib2, since it makes things a bit easier at times by accepting Request objects and will also raise a URLError on protocol errors. With Google App Engine though, you can’t use either. You have to use the URL Fetch API that Google provides in its sandboxed Python environment.

Answer 8

A key point that I find missing in the above answers is that urllib returns an object of type <class 'http.client.HTTPResponse'> whereas requests returns <class 'requests.models.Response'>.

Because of this, the read() method can be used with urllib but not with requests.

P.S.: requests already has so many methods that it hardly needs one more like read() ;>

Q&A

How to drop rows of a Pandas DataFrame whose value in a certain column is NaN

July 25, 2021 Python实用宝典

Question: How to drop rows of a Pandas DataFrame whose value in a certain column is NaN

I have this DataFrame and want only the records whose EPS column is not NaN:

>>> df
                 STK_ID  EPS  cash
STK_ID RPT_Date                   
601166 20111231  601166  NaN   NaN
600036 20111231  600036  NaN    12
600016 20111231  600016  4.3   NaN
601009 20111231  601009  NaN   NaN
601939 20111231  601939  2.5   NaN
000001 20111231  000001  NaN   NaN

…i.e. something like df.drop(....) to get this resulting dataframe:

                  STK_ID  EPS  cash
STK_ID RPT_Date                   
600016 20111231  600016  4.3   NaN
601939 20111231  601939  2.5   NaN

How do I do that?

Answer 0

Don’t drop, just take the rows where EPS is not NA:

df = df[df['EPS'].notna()]

Answer 1

This question is already resolved, but…

…also consider the solution suggested by Wouter in his original comment. The ability to handle missing data, including dropna(), is built into pandas explicitly. Aside from potentially improved performance over doing it manually, these functions also come with a variety of options which may be useful.

In [24]: df = pd.DataFrame(np.random.randn(10,3))

In [25]: df.iloc[::2,0] = np.nan; df.iloc[::4,1] = np.nan; df.iloc[::3,2] = np.nan;

In [26]: df
Out[26]:
          0         1         2
0       NaN       NaN       NaN
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
3  0.672201  0.964789       NaN
4       NaN       NaN  0.050742
5 -1.250970  0.030561 -2.678622
6       NaN  1.036043       NaN
7  0.049896 -0.308003  0.823295
8       NaN       NaN  0.637482
9 -0.310130  0.078891       NaN

In [27]: df.dropna()     #drop all rows that have any NaN values
Out[27]:
          0         1         2
1  2.677677 -1.466923 -0.750366
5 -1.250970  0.030561 -2.678622
7  0.049896 -0.308003  0.823295

In [28]: df.dropna(how='all')     #drop only if ALL columns are NaN
Out[28]:
          0         1         2
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
3  0.672201  0.964789       NaN
4       NaN       NaN  0.050742
5 -1.250970  0.030561 -2.678622
6       NaN  1.036043       NaN
7  0.049896 -0.308003  0.823295
8       NaN       NaN  0.637482
9 -0.310130  0.078891       NaN

In [29]: df.dropna(thresh=2)   #Drop row if it does not have at least two values that are **not** NaN
Out[29]:
          0         1         2
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
3  0.672201  0.964789       NaN
5 -1.250970  0.030561 -2.678622
7  0.049896 -0.308003  0.823295
9 -0.310130  0.078891       NaN

In [30]: df.dropna(subset=[1])   #Drop only if NaN in specific column (as asked in the question)
Out[30]:
          0         1         2
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
3  0.672201  0.964789       NaN
5 -1.250970  0.030561 -2.678622
6       NaN  1.036043       NaN
7  0.049896 -0.308003  0.823295
9 -0.310130  0.078891       NaN

There are also other options (See docs at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html), including dropping columns instead of rows.

Pretty handy!

Answer 2

I know this has already been answered, but just for the sake of a purely pandas solution to this specific question as opposed to the general description from Aman (which was wonderful) and in case anyone else happens upon this:

import pandas as pd
df = df[pd.notnull(df['EPS'])]

Answer 3

You can use this:

df.dropna(subset=['EPS'], how='all', inplace=True)

Answer 4

Simplest of all solutions:

filtered_df = df[df['EPS'].notnull()]

The above solution is way better than using np.isfinite()

Answer 5

You could use dataframe method notnull or inverse of isnull, or numpy.isnan:

In [332]: df[df.EPS.notnull()]
Out[332]:
   STK_ID  RPT_Date  STK_ID.1  EPS  cash
2  600016  20111231    600016  4.3   NaN
4  601939  20111231    601939  2.5   NaN


In [334]: df[~df.EPS.isnull()]
Out[334]:
   STK_ID  RPT_Date  STK_ID.1  EPS  cash
2  600016  20111231    600016  4.3   NaN
4  601939  20111231    601939  2.5   NaN


In [347]: df[~np.isnan(df.EPS)]
Out[347]:
   STK_ID  RPT_Date  STK_ID.1  EPS  cash
2  600016  20111231    600016  4.3   NaN
4  601939  20111231    601939  2.5   NaN

Answer 6

Simple and easy way

df.dropna(subset=['EPS'],inplace=True)

source: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html

Answer 7

Yet another solution, which uses the fact that np.nan != np.nan:

In [149]: df.query("EPS == EPS")
Out[149]:
                 STK_ID  EPS  cash
STK_ID RPT_Date
600016 20111231  600016  4.3   NaN
601939 20111231  601939  2.5   NaN
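
The trick relies on IEEE 754 semantics, under which NaN compares unequal to everything, including itself. A plain-Python sketch of the same filter, using a hypothetical list of dicts in place of the DataFrame:

```python
import math

nan = float('nan')

# Hypothetical rows standing in for the DataFrame in the question.
rows = [
    {'STK_ID': '601166', 'EPS': nan},
    {'STK_ID': '600016', 'EPS': 4.3},
    {'STK_ID': '601939', 'EPS': 2.5},
    {'STK_ID': '000001', 'EPS': nan},
]

# NaN is the only float value for which x == x is False.
kept = [row for row in rows if row['EPS'] == row['EPS']]

print([row['STK_ID'] for row in kept])
print(nan == nan, math.isnan(nan))
```

The self-comparison is compact but easy to misread, which is one reason an explicit check such as `math.isnan` (or `notna()` in pandas) is often preferred.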

Answer 8

Another version:

df[~df['EPS'].isna()]

Answer 9

In datasets with a large number of columns, it’s even better to see how many columns contain null values and how many don’t.

print("No. of columns containing null values")
print(len(df.columns[df.isna().any()]))

print("No. of columns not containing null values")
print(len(df.columns[df.notna().all()]))

print("Total no. of columns in the dataframe")
print(len(df.columns))

For example, my dataframe contained 82 columns, of which 19 contained at least one null value.

Further, you can also automatically remove columns and rows depending on which has more null values.
Here is the code which does this intelligently:

df = df.drop(df.columns[df.isna().sum()>len(df.columns)],axis = 1)
df = df.dropna(axis = 0).reset_index(drop=True)

Note: Above code removes all of your null values. If you want null values, process them before.

Answer 10

It may be added that '&' can be used to add additional conditions, e.g.

df = df[(df.EPS > 2.0) & (df.EPS < 4.0)]

Notice that when evaluating the statements, pandas needs the parentheses.

Answer 11

For some reason none of the previously submitted answers worked for me. This basic solution did:

df = df[df.EPS >= 0]

Though of course that will drop rows with negative numbers, too. So if you want to keep those, combine both comparisons instead (NaN fails both):

df = df[(df.EPS >= 0) | (df.EPS <= 0)]

Answer 12

One solution can be

df = df[df.isnull().sum(axis=1) <= Cutoff_value]

Another way can be

df = df.dropna(thresh=(df.shape[1] - Cutoff_value))

I hope these are useful.

Q&A

Python `if x is not None` or `if not x is None`?

July 25, 2021 Python实用宝典

Question: Python `if x is not None` or `if not x is None`?

I’ve always thought of the if not x is None version to be more clear, but Google’s style guide and PEP-8 both use if x is not None. Is there any minor performance difference (I’m assuming not), and is there any case where one really doesn’t fit (making the other a clear winner for my convention)?*

*I’m referring to any singleton, rather than just None.

…to compare singletons like None. Use is or is not.

Answer 0

There’s no performance difference, as they compile to the same bytecode:

Python 2.6.2 (r262:71600, Apr 15 2009, 07:20:39)
>>> import dis
>>> def f(x):
...    return x is not None
...
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               0 (None)
              6 COMPARE_OP               9 (is not)
              9 RETURN_VALUE
>>> def g(x):
...   return not x is None
...
>>> dis.dis(g)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               0 (None)
              6 COMPARE_OP               9 (is not)
              9 RETURN_VALUE

Stylistically, I try to avoid not x is y. Although the compiler will always treat it as not (x is y), a human reader might misunderstand the construct as (not x) is y. If I write x is not y then there is no ambiguity.

Answer 1

Both Google’s and Python’s style guides recommend the best practice:

if x is not None:
    # Do something about x

Using not x can cause unwanted results.

See below:

>>> x = 1
>>> not x
False
>>> x = [1]
>>> not x
False
>>> x = 0
>>> not x
True
>>> x = [0]         # You don't want to fall in this one.
>>> not x
False

You may be interested to see what literals are evaluated to True or False in Python:

Truth Value Testing

Edit for comment below:

I just did some more testing. not x is None doesn’t negate x first and then compare the result to None. In fact, it seems the is operator has a higher precedence when used that way:

>>> x
[0]
>>> not x is None
True
>>> not (x is None)
True
>>> (not x) is None
False

Therefore, not x is None is just, in my honest opinion, best avoided.

More edit:

I just did more testing and can confirm that bukzor’s comment is correct. (At least, I wasn’t able to prove it otherwise.)

This means if x is not None has exactly the same result as if not x is None. I stand corrected. Thanks bukzor.

However, my answer still stands: Use the conventional if x is not None. :]
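
To make the grouping concrete, the sketch below checks that `not x is y` always matches `x is not y`, while the misreading `(not x) is y` does not. The sentinel object `MISSING` is a hypothetical addition, included to show that the same holds for any singleton and not only None.

```python
MISSING = object()   # hypothetical sentinel singleton

for x in (0, [0], '', None, MISSING):
    for y in (None, MISSING):
        # "not x is y" parses as "not (x is y)", i.e. the same as "x is not y".
        assert (not x is y) == (x is not y)

x = [0]
grouped_by_parser = not x is None    # not (x is None)
misread = (not x) is None            # (not x) is a bool, so it is never None

print(grouped_by_parser, misread)
```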

Answer 2

Code should be written to be understandable to the programmer first, and the compiler or interpreter second. The “is not” construct resembles English more closely than “not is”.

Answer 3

Python if x is not None or if not x is None?

TLDR: The bytecode compiler compiles both to x is not None, so for readability’s sake, use if x is not None.

Readability

We use Python because we value things like human readability, usability, and correctness of various paradigms of programming over performance.

Python optimizes for readability, especially in this context.

Parsing and Compiling the Bytecode

The not binds more weakly than is, so there is no logical difference here. See the documentation:

The operators is and is not test for object identity: x is y is true if and only if x and y are the same object. x is not y yields the inverse truth value.

The is not is specifically provided for in the Python grammar as a readability improvement for the language:

comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'

And so it is a unitary element of the grammar as well.

Of course, it is not parsed the same:

>>> import ast
>>> ast.dump(ast.parse('x is not None').body[0].value)
"Compare(left=Name(id='x', ctx=Load()), ops=[IsNot()], comparators=[Name(id='None', ctx=Load())])"
>>> ast.dump(ast.parse('not x is None').body[0].value)
"UnaryOp(op=Not(), operand=Compare(left=Name(id='x', ctx=Load()), ops=[Is()], comparators=[Name(id='None', ctx=Load())]))"

But then the byte compiler will actually translate the not ... is to is not (the listing below is from an older CPython; the exact opcodes vary between versions):

>>> import dis
>>> dis.dis(lambda x, y: x is not y)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                1 (y)
              6 COMPARE_OP               9 (is not)
              9 RETURN_VALUE
>>> dis.dis(lambda x, y: not x is y)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                1 (y)
              6 COMPARE_OP               9 (is not)
              9 RETURN_VALUE

So for the sake of readability and using the language as it was intended, please use is not.

To not use it is not wise.
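Since the exact bytecode differs between CPython versions, a version-independent way to check the two claims (different parse trees, same result) is sketched below. The helper name top_node and the sample values are assumptions made for illustration.

```python
import ast


def top_node(expr):
    """Return the class name of the top-level AST node of an expression."""
    return type(ast.parse(expr, mode="eval").body).__name__


# The two spellings parse to different trees...
print(top_node("x is not None"))  # Compare
print(top_node("not x is None"))  # UnaryOp

# ...but they evaluate to the same result for every sample value.
samples = [None, 0, [], [0], "text"]
for x in samples:
    assert (x is not None) == (not x is None)
```

This only checks behaviour and parse structure; it deliberately makes no claim about which opcodes a given interpreter emits.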

Answer 4

The answer is simpler than people are making it.

There’s no technical advantage either way, and “x is not y” is what everybody else uses, which makes it the clear winner. It doesn’t matter that it “looks more like English” or not; everyone uses it, which means every user of Python–even Chinese users, whose language Python looks nothing like–will understand it at a glance, where the slightly less common syntax will take a couple extra brain cycles to parse.

Don’t be different just for the sake of being different, at least in this field.

Answer 5

The is not operator is preferred over negating the result of is for stylistic reasons. “if x is not None:” reads just like English, but “if not x is None:” requires understanding of the operator precedence and does not read like English.

If there is a performance difference my money is on is not, but this almost certainly isn’t the motivation for the decision to prefer that technique. It would obviously be implementation-dependent. Since is isn’t overridable, it should be easy to optimise out any distinction anyhow.

Answer 6

Personally, I use

if not (x is None):

which is understood immediately without ambiguity by every programmer, even those not expert in the Python syntax.

Answer 7

if not x is None is more similar to other programming languages, but if x is not None definitely sounds more clear (and is more grammatically correct in English) to me.

That said, it seems like more of a preference thing to me.

Answer 8

I prefer the more readable form x is not y, rather than having to think about operator precedence in order to write readable code.

How to fix “Attempted relative import in non-package” even with __init__.py

July 25, 2021 Python实用宝典

Question: How to fix “Attempted relative import in non-package” even with __init__.py

I’m trying to follow PEP 328, with the following directory structure:

pkg/
  __init__.py
  components/
    core.py
    __init__.py
  tests/
    core_test.py
    __init__.py

In core_test.py I have the following import statement:

from ..components.core import GameLoopEvents

However, when I run it, I get the following error:

tests$ python core_test.py 
Traceback (most recent call last):
  File "core_test.py", line 3, in <module>
    from ..components.core import GameLoopEvents
ValueError: Attempted relative import in non-package

Searching around I found “relative path not working even with __init__.py” and “Import a module from a relative path” but they didn’t help.

Is there anything I’m missing here?

Answer 0

Yes. You’re not using it as a package.

python -m pkg.tests.core_test

Answer 1

To elaborate on Ignacio Vazquez-Abrams’s answer:

The Python import mechanism works relative to the __name__ of the current file. When you execute a file directly, it doesn’t have its usual name, but has "__main__" as its name instead. So relative imports don’t work.

You can, as Ignacio suggested, execute it using the -m option. If you have a part of your package that is meant to be run as a script, you can also use the __package__ attribute to tell that file what name it’s supposed to have in the package hierarchy.

See http://www.python.org/dev/peps/pep-0366/ for details.
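To see the difference between the two ways of running a file, the following minimal sketch builds a throwaway package in a temporary directory and runs the same module once as a script and once with -m. It assumes a Python 3 interpreter; the module name show.py and the helper run_both_ways are hypothetical and exist only for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Build a throwaway package and run one module as a script and with -m."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        tests_dir = root / "pkg" / "tests"
        tests_dir.mkdir(parents=True)
        (root / "pkg" / "__init__.py").write_text("")
        (tests_dir / "__init__.py").write_text("")
        # show.py is a hypothetical module that only reports its own identity.
        (tests_dir / "show.py").write_text("print(__name__, __package__)\n")

        # Direct execution: the file is run as a standalone script.
        as_script = subprocess.run(
            [sys.executable, str(tests_dir / "show.py")],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        # Module execution: run from the directory that contains pkg/.
        as_module = subprocess.run(
            [sys.executable, "-m", "pkg.tests.show"],
            cwd=root, capture_output=True, text=True, check=True,
        ).stdout.strip()
        return as_script, as_module


as_script, as_module = run_both_ways()
print("python show.py           ->", as_script)
print("python -m pkg.tests.show ->", as_module)
```

In both cases __name__ is "__main__", but only the -m run knows which package the module belongs to, which is what a relative import needs.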

Answer 2

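You can use import components.core directly if you append the package directory (the parent of the directory containing the script) to sys.path:

if __name__ == '__main__' and __package__ is None:
    # Running as a plain script: make the parent of this file's directory importable.
    import sys
    from os import path
    sys.path.append(path.dirname(path.dirname(path.abspath(__file__))))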

Answer 3

It depends on how you want to launch your script.

If you want to launch your UnitTest from the command line in a classic way, that is:

python tests/core_test.py

Then, since in this case ‘components’ and ‘tests’ are sibling folders, you can make the module importable by adding their parent directory to sys.path, using either the insert or the append method of that list. Something like:

import sys
from os import path
sys.path.append( path.dirname( path.dirname( path.abspath(__file__) ) ) )
from components.core import GameLoopEvents

Otherwise, you can launch your script with the ‘-m’ argument (note that in this case, we are talking about a package, and thus you must not give the ‘.py’ extension), that is:

python -m pkg.tests.core_test

In such a case, you can simply use the relative import as you were doing:

from ..components.core import GameLoopEvents

Finally, you can mix the two approaches, so that your script will work no matter how it is called. For example:

if __name__ == '__main__':
    if __package__ is None:
        import sys
        from os import path
        sys.path.append( path.dirname( path.dirname( path.abspath(__file__) ) ) )
        from components.core import GameLoopEvents
    else:
        from ..components.core import GameLoopEvents

Answer 4

In core_test.py, do the following:

import sys
sys.path.append('../components')
from core import GameLoopEvents

Answer 5

If your use case is for running tests, and it seems that it is, then you can do the following. Instead of running your test script as python core_test.py, use a testing framework such as pytest. Then on the command line you can enter

$ py.test

That will run the tests in your directory. This gets around the issue of __name__ being __main__ that was pointed out by @BrenBarn. Next, put an empty __init__.py file into your test directory; this will make the test directory part of your package. Then you will be able to do

from ..components.core import GameLoopEvents

However, if you run your test script as a main program, then things will fail once again. So just use the test runner. Maybe this also works with other test runners such as nosetests, but I haven’t checked it. Hope this helps.

Answer 6

My quick-fix is to add the directory to the path:

import sys
sys.path.insert(0, '../components/')

Answer 7

The issue is with how you run the test.

You ran python core_test.py, which produces the error ValueError: Attempted relative import in non-package.

Reason: you are running a module that belongs to a package as a standalone script, outside the package context.

Instead, run the module as part of its package.

If this is your project structure:

pkg/
  __init__.py
  components/
    core.py
    __init__.py
  tests/
    core_test.py
    __init__.py

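cd pkg

python -m tests.core_test # don't use .py

Note that here tests is the top-level package, so an import that climbs above it, such as from ..components.core in the question, still fails. For that import, run from the directory that contains pkg/:

python -m pkg.tests.core_test

Use a single . to import from a folder in the same directory. For each level up, add one more dot.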

hi/
  hello.py
how.py

In how.py:

from .hi import hello

If you want to import how from hello.py:

from .. import how

Answer 8

Old thread. I found that adding __all__ = ['submodule', ...] to the __init__.py file and then using from <CURRENT_MODULE> import * in the target works fine.

Answer 9

You can use from pkg.components.core import GameLoopEvents. For example, in PyCharm I simply import from the root package, and it works.

Answer 10

As Paolo said, we have 2 invocation methods:

1) python -m tests.core_test
2) python tests/core_test.py

One difference between them is the sys.path[0] string. Since the interpreter searches sys.path when importing, we can do the following in tests/core_test.py:

if __name__ == '__main__':
    import sys
    from pathlib import Path
    sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
    from components import core
    <other stuff>

After this, we can also run core_test.py in other ways:

cd tests
python core_test.py
python -m core_test
...

Note: tested only on Python 3.6.

Answer 11

This approach worked for me and is less cluttered than some solutions:

try:
  from ..components.core import GameLoopEvents
except ValueError:
  from components.core import GameLoopEvents

The parent directory is in my PYTHONPATH, and there are __init__.py files in the parent directory and this directory.

The above always worked in Python 2, but Python 3 sometimes hit an ImportError or ModuleNotFoundError (the latter is new in Python 3.6 and is a subclass of ImportError), so the following tweak works for me in both Python 2 and 3:

try:
  from ..components.core import GameLoopEvents
except ( ValueError, ImportError):
  from components.core import GameLoopEvents

Answer 12

Try this

import components
from components import *

Answer 13

If someone is looking for a workaround, I stumbled upon one. Here’s a bit of context. I wanted to test one of the methods I have in a file. When I ran it from within

if __name__ == "__main__":

it always complained about the relative imports. I tried to apply the above solutions, but they failed to work, since there were many nested files, each with multiple imports.

Here’s what I did. I just created a launcher, an external program that imports the necessary methods and calls them. Though not a great solution, it works.

Answer 14

Here’s one way which will piss off everyone but work pretty well. In tests run:

ln -s ../components components

Then just import components like you normally would.

Answer 15

This is very confusing, and if you are using an IDE like PyCharm, it is a little more confusing. What worked for me:

1. Configure the PyCharm project settings (if you are running Python from a VE or from the Python directory).
2. There is nothing wrong with the way you defined it. Sometimes it works with from folder1.file1 import class. If it does not work, use import folder1.file1.
3. Your environment variable should be set correctly in the system, or provided as a command-line argument.

Answer 16

Because your code contains if __name__ == "__main__", which means it is run as a script rather than imported as part of a package, you’d better use sys.path.append() to solve the problem.

Maximum and minimum values for integers

July 25, 2021 Python实用宝典

Question: Maximum and minimum values for integers

I am looking for the minimum and maximum values for integers in Python. For example, in Java, we have Integer.MIN_VALUE and Integer.MAX_VALUE. Is there something like this in Python?

Answer 0

Python 3

In Python 3, this question doesn’t apply. The plain int type is unbounded.

However, you might actually be looking for information about the current interpreter’s word size, which will be the same as the machine’s word size in most cases. That information is still available in Python 3 as sys.maxsize, which is the maximum value representable by a signed word. Equivalently, it’s the size of the largest possible list or in-memory sequence.

Generally, the maximum value representable by an unsigned word will be sys.maxsize * 2 + 1, and the number of bits in a word will be math.log2(sys.maxsize * 2 + 2). See this answer for more information.
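The relationships above can be checked directly. This is a minimal sketch; the variable names are illustrative, and it uses int.bit_length() in place of math.log2 so that the bit count comes back as an int rather than a float.

```python
import sys

# sys.maxsize is the largest value a signed machine word can hold.
signed_max = sys.maxsize
# The corresponding unsigned word has one more usable bit.
unsigned_max = sys.maxsize * 2 + 1

# Number of bits in a word; equals math.log2(sys.maxsize * 2 + 2).
word_bits = unsigned_max.bit_length()

print(signed_max, unsigned_max, word_bits)
```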

Python 2

In Python 2, the maximum value for plain int values is available as sys.maxint:

>>> sys.maxint
9223372036854775807

You can calculate the minimum value with -sys.maxint - 1 as shown here.

Python seamlessly switches from plain to long integers once you exceed this value. So most of the time, you won’t need to know it.

Answer 1

If you just need a number that’s bigger than all others, you can use

float('inf')

Similarly, for a number smaller than all others:

float('-inf')

This works in both python 2 and 3.
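A typical use is as a starting sentinel when searching for a minimum. The function name smallest and the sample inputs below are hypothetical, chosen only to illustrate the idea.

```python
def smallest(values):
    """Return the smallest number in values, or float('inf') if it is empty."""
    best = float("inf")  # sentinel: compares greater than every finite number
    for v in values:
        if v < best:
            best = v
    return best


print(smallest([7, 3, 10**30]))
print(smallest([]))
```

Because Python compares ints and floats by value, the sentinel also works for integers far larger than any machine word.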

Answer 2

The sys.maxint constant was removed in Python 3.0; use sys.maxsize instead.

Integers

PEP 237: Essentially, long renamed to int. That is, there is only one built-in integral type, named int; but it behaves mostly like the old long type.

PEP 238: An expression like 1/2 returns a float. Use 1//2 to get the truncating behavior. (The latter syntax has existed for years, at least since Python 2.2.)

The sys.maxint constant was removed, since there is no longer a limit to the value of integers. However, sys.maxsize can be used as an integer larger than any practical list or string index. It conforms to the implementation’s “natural” integer size and is typically the same as sys.maxint in previous releases on the same platform (assuming the same build options).

The repr() of a long integer doesn’t include the trailing L anymore, so code that unconditionally strips that character will chop off the last digit instead. (Use str() instead.)

Octal literals are no longer of the form 0720; use 0o720 instead.

Reference: https://docs.python.org/3/whatsnew/3.0.html#integers

Answer 3

In Python 2, integers automatically switch from a fixed-size int representation to a variable-width long representation once you pass the value sys.maxint, which is either 2³¹ – 1 or 2⁶³ – 1 depending on your platform. Notice the L that gets appended here:

>>> 9223372036854775807
9223372036854775807
>>> 9223372036854775808
9223372036854775808L

From the Python manual:

Numbers are created by numeric literals or as the result of built-in functions and operators. Unadorned integer literals (including binary, hex, and octal numbers) yield plain integers unless the value they denote is too large to be represented as a plain integer, in which case they yield a long integer. Integer literals with an 'L' or 'l' suffix yield long integers ('L' is preferred because 1l looks too much like eleven!).

Python tries very hard to pretend its integers are mathematical integers and are unbounded. It can, for instance, calculate a googol with ease:

>>> 10**100
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000L
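In Python 3 there is a single int type, so the same behaviour can be checked without the L suffix. A minimal sketch, assuming a Python 3 interpreter; the variable names are illustrative.

```python
import sys

beyond = sys.maxsize + 1  # no overflow: the result is still an int
googol = 10 ** 100

print(type(beyond).__name__)  # int
print(len(str(googol)))       # 101 digits: a 1 followed by 100 zeros
```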

Answer 4

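For Python 3, the closest equivalents are the word-size limits (int itself is unbounded):

import sys
max_size = sys.maxsize       # named max_size to avoid shadowing the built-in max()
min_size = -sys.maxsize - 1  # named min_size to avoid shadowing the built-in min()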

Answer 5

You may use ‘inf’ like this:

import math
bool_true = 0 < math.inf
bool_false = 0 < -math.inf

Refer: math — Mathematical functions

Answer 6

If you want the max for array or list indices (equivalent to size_t in C/C++), you can use numpy:

np.iinfo(np.intp).max

This is the same as sys.maxsize; the advantage is that you don’t need to import sys just for this.

If you want the max for the native C int on the machine:

np.iinfo(np.intc).max

You can look at the other available types in the documentation.

For floats you can also use sys.float_info.max.

Answer 7

I rely heavily on commands like this.

python -c 'import sys; print(sys.maxsize)'

Max int returned: 9223372036854775807

For more on sys, see:

https://docs.python.org/3/library/sys.html

https://docs.python.org/3/library/sys.html#sys.maxsize
