Python中的CURL替代

问题:Python中的CURL替代

我在PHP中使用了一个cURL调用:

curl -i -H’接受:应用程序/ xml’-u登录名:密钥“ https://app.streamsend.com/emails

我需要一种在Python中执行相同操作的方法。在Python中是否可以使用cURL替代。我知道urllib,但我是Python菜鸟,也不知道如何使用它。

I have a cURL call that I use in PHP:

curl -i -H ‘Accept: application/xml’ -u login:key “https://app.streamsend.com/emails

I need a way to do the same thing in Python. Is there an alternative to cURL in Python. I know of urllib but I’m a Python noob and have no idea how to use it.


回答 0

import urllib2

manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, 'https://app.streamsend.com/emails', 'login', 'key')
handler = urllib2.HTTPBasicAuthHandler(manager)

director = urllib2.OpenerDirector()
director.add_handler(handler)

req = urllib2.Request('https://app.streamsend.com/emails', headers = {'Accept' : 'application/xml'})

result = director.open(req)
# result.read() will contain the data
# result.info() will contain the HTTP headers

# To get say the content-length header
length = result.info()['Content-Length']

您的cURL调用使用urllib2代替。完全未经测试。

import urllib2

manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, 'https://app.streamsend.com/emails', 'login', 'key')
handler = urllib2.HTTPBasicAuthHandler(manager)

director = urllib2.OpenerDirector()
director.add_handler(handler)

req = urllib2.Request('https://app.streamsend.com/emails', headers = {'Accept' : 'application/xml'})

result = director.open(req)
# result.read() will contain the data
# result.info() will contain the HTTP headers

# To get say the content-length header
length = result.info()['Content-Length']

Your cURL call using urllib2 instead. Completely untested.


回答 1

您可以使用“ 请求:人类的HTTP”用户指南中描述的HTTP请求。

You can use HTTP Requests that are described in the Requests: HTTP for Humans user guide.


回答 2

这是一个使用urllib2的简单示例,该示例针对GitHub的API进行了基本身份验证。

import urllib2

u='username'
p='userpass'
url='https://api.github.com/users/username'

# simple wrapper function to encode the username & pass
def encodeUserData(user, password):
    return "Basic " + (user + ":" + password).encode("base64").rstrip()

# create the request object and set some headers
req = urllib2.Request(url)
req.add_header('Accept', 'application/json')
req.add_header("Content-type", "application/x-www-form-urlencoded")
req.add_header('Authorization', encodeUserData(u, p))
# make the request and print the results
res = urllib2.urlopen(req)
print res.read()

此外,如果将其包装在脚本中并从终端运行,则可以将响应字符串传递到“ mjson.tool”以启用漂亮的打印。

>> basicAuth.py | python -mjson.tool

最后要注意的一点是,urllib2仅支持GET&POST请求。
如果您需要使用其他HTTP动词(例如DELETE,PUT等),则可能需要看一下PYCURL

Here’s a simple example using urllib2 that does a basic authentication against GitHub’s API.

import urllib2

u='username'
p='userpass'
url='https://api.github.com/users/username'

# simple wrapper function to encode the username & pass
def encodeUserData(user, password):
    return "Basic " + (user + ":" + password).encode("base64").rstrip()

# create the request object and set some headers
req = urllib2.Request(url)
req.add_header('Accept', 'application/json')
req.add_header("Content-type", "application/x-www-form-urlencoded")
req.add_header('Authorization', encodeUserData(u, p))
# make the request and print the results
res = urllib2.urlopen(req)
print res.read()

Furthermore if you wrap this in a script and run it from a terminal you can pipe the response string to ‘mjson.tool’ to enable pretty printing.

>> basicAuth.py | python -mjson.tool

One last thing to note, urllib2 only supports GET & POST requests.
If you need to use other HTTP verbs like DELETE, PUT, etc you’ll probably want to take a look at PYCURL


回答 3

如果您使用的是这样的命令来调用curl,则可以使用来在Python中执行相同的操作subprocess。例:

subprocess.call(['curl', '-i', '-H', '"Accept: application/xml"', '-u', 'login:key', '"https://app.streamsend.com/emails"'])

或者,如果您想将其作为像PHP一样具有更结构化的api,可以尝试PycURL

If you are using a command to just call curl like that, you can do the same thing in Python with subprocess. Example:

subprocess.call(['curl', '-i', '-H', '"Accept: application/xml"', '-u', 'login:key', '"https://app.streamsend.com/emails"'])

Or you could try PycURL if you want to have it as a more structured api like what PHP has.


回答 4

import requests

url = 'https://example.tld/'
auth = ('username', 'password')

r = requests.get(url, auth=auth)
print r.content

这是我所能获得的最简单的方法。

import requests

url = 'https://example.tld/'
auth = ('username', 'password')

r = requests.get(url, auth=auth)
print r.content

This is the simplest I’ve been able to get it.


回答 5

例如,如何使用urllib以及一些糖语法。我知道请求和其他库,但是urllib是python的标准库,不需要单独安装任何东西。

兼容Python 2/3。

import sys
if sys.version_info.major == 3:
  from urllib.request import HTTPPasswordMgrWithDefaultRealm, HTTPBasicAuthHandler, Request, build_opener
  from urllib.parse import urlencode
else:
  from urllib2 import HTTPPasswordMgrWithDefaultRealm, HTTPBasicAuthHandler, Request, build_opener
  from urllib import urlencode


def curl(url, params=None, auth=None, req_type="GET", data=None, headers=None):
  post_req = ["POST", "PUT"]
  get_req = ["GET", "DELETE"]

  if params is not None:
    url += "?" + urlencode(params)

  if req_type not in post_req + get_req:
    raise IOError("Wrong request type \"%s\" passed" % req_type)

  _headers = {}
  handler_chain = []

  if auth is not None:
    manager = HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, auth["user"], auth["pass"])
    handler_chain.append(HTTPBasicAuthHandler(manager))

  if req_type in post_req and data is not None:
    _headers["Content-Length"] = len(data)

  if headers is not None:
    _headers.update(headers)

  director = build_opener(*handler_chain)

  if req_type in post_req:
    if sys.version_info.major == 3:
      _data = bytes(data, encoding='utf8')
    else:
      _data = bytes(data)

    req = Request(url, headers=_headers, data=_data)
  else:
    req = Request(url, headers=_headers)

  req.get_method = lambda: req_type
  result = director.open(req)

  return {
    "httpcode": result.code,
    "headers": result.info(),
    "content": result.read()
  }


"""
Usage example:
"""

Post data:
  curl("http://127.0.0.1/", req_type="POST", data='cascac')

Pass arguments (http://127.0.0.1/?q=show):
  curl("http://127.0.0.1/", params={'q': 'show'}, req_type="POST", data='cascac')

HTTP Authorization:
  curl("http://127.0.0.1/secure_data.txt", auth={"user": "username", "pass": "password"})

功能不完整,可能不理想,但显示了基本的表示形式和要使用的概念。可以根据口味添加或更改其他内容。

12/08更新

是GitHub实时更新源的链接。目前支持:

  • 授权

  • 兼容CRUD

  • 自动字符集检测

  • 自动编码(压缩)检测

Some example, how to use urllib for that things, with some sugar syntax. I know about requests and other libraries, but urllib is standard lib for python and doesn’t require anything to be installed separately.

Python 2/3 compatible.

import sys
if sys.version_info.major == 3:
  from urllib.request import HTTPPasswordMgrWithDefaultRealm, HTTPBasicAuthHandler, Request, build_opener
  from urllib.parse import urlencode
else:
  from urllib2 import HTTPPasswordMgrWithDefaultRealm, HTTPBasicAuthHandler, Request, build_opener
  from urllib import urlencode


def curl(url, params=None, auth=None, req_type="GET", data=None, headers=None):
  post_req = ["POST", "PUT"]
  get_req = ["GET", "DELETE"]

  if params is not None:
    url += "?" + urlencode(params)

  if req_type not in post_req + get_req:
    raise IOError("Wrong request type \"%s\" passed" % req_type)

  _headers = {}
  handler_chain = []

  if auth is not None:
    manager = HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, auth["user"], auth["pass"])
    handler_chain.append(HTTPBasicAuthHandler(manager))

  if req_type in post_req and data is not None:
    _headers["Content-Length"] = len(data)

  if headers is not None:
    _headers.update(headers)

  director = build_opener(*handler_chain)

  if req_type in post_req:
    if sys.version_info.major == 3:
      _data = bytes(data, encoding='utf8')
    else:
      _data = bytes(data)

    req = Request(url, headers=_headers, data=_data)
  else:
    req = Request(url, headers=_headers)

  req.get_method = lambda: req_type
  result = director.open(req)

  return {
    "httpcode": result.code,
    "headers": result.info(),
    "content": result.read()
  }


"""
Usage example:
"""

Post data:
  curl("http://127.0.0.1/", req_type="POST", data='cascac')

Pass arguments (http://127.0.0.1/?q=show):
  curl("http://127.0.0.1/", params={'q': 'show'}, req_type="POST", data='cascac')

HTTP Authorization:
  curl("http://127.0.0.1/secure_data.txt", auth={"user": "username", "pass": "password"})

Function is not complete and possibly is not ideal, but shows a basic representation and concept to use. Additional things could be added or changed by taste.

12/08 update

Here is a GitHub link to live updated source. Currently supporting:

  • authorization

  • CRUD compatible

  • automatic charset detection

  • automatic encoding(compression) detection


回答 6

如果它正在从您要查找的命令行运行以上所有内容,那么我建议您使用HTTPie。这是一个奇妙的卷曲替代品,超级容易方便,以使用(和定制)。

这是来自GitHub的(简洁而准确的)描述;

HTTPie(发音为aych-tee-tee-pie)是命令行HTTP客户端。其目标是使CLI与Web服务的交互尽可能对人类友好。

它提供了一个简单的http命令,该命令允许使用简单自然的语法发送任意HTTP请求,并显示彩色输出。HTTPie可用于测试,调试和通常与HTTP服务器交互。


有关身份验证的文档应为您提供足够的指针来解决您的问题。当然,以上所有答案也是准确的,并且提供了完成同一任务的不同方法。


只是为了您不必离开Stack Overflow,这是它简而言之的功能。

Basic auth:

$ http -a username:password example.org
Digest auth:

$ http --auth-type=digest -a username:password example.org
With password prompt:

$ http -a username example.org

If it’s running all of the above from the command line that you’re looking for, then I’d recommend HTTPie. It is a fantastic cURL alternative and is super easy and convenient to use (and customize).

Here’s is its (succinct and precise) description from GitHub;

HTTPie (pronounced aych-tee-tee-pie) is a command line HTTP client. Its goal is to make CLI interaction with web services as human-friendly as possible.

It provides a simple http command that allows for sending arbitrary HTTP requests using a simple and natural syntax, and displays colorized output. HTTPie can be used for testing, debugging, and generally interacting with HTTP servers.


The documentation around authentication should give you enough pointers to solve your problem(s). Of course, all of the answers above are accurate as well, and provide different ways of accomplishing the same task.


Just so you do NOT have to move away from Stack Overflow, here’s what it offers in a nutshell.

Basic auth:

$ http -a username:password example.org
Digest auth:

$ http --auth-type=digest -a username:password example.org
With password prompt:

$ http -a username example.org

使用cython和mingw进行编译会产生gcc:错误:无法识别的命令行选项’-mno-cygwin’

问题:使用cython和mingw进行编译会产生gcc:错误:无法识别的命令行选项’-mno-cygwin’

我正在尝试使用mingw(64位)在win 7 64位中使用cython编译python扩展。
我正在使用Python 2.6(Active Python 2.6.6)和足够的distutils.cfg文件(将mingw设置为编译器)

执行时

> C:\Python26\programas\Cython>python setup.py build_ext --inplace

我收到一条错误消息,说gcc没有-mno-cygwin选项:

> C:\Python26\programas\Cython>python setup.py build_ext --inplace
running build_ext
skipping 'hello2.c' Cython extension (up-to-date)
building 'hello2' extension
C:\mingw\bin\gcc.exe -mno-cygwin -mdll -O -Wall -IC:\Python26\include -IC:\Python26\PC -c hello2.c -o build\temp.win-amd64-2.6\Release\hello2.o
gcc: error: unrecognized command line option '-mno-cygwin'
error: command 'gcc' failed with exit status 1

gcc是:

C:\>gcc --version
gcc (GCC) 4.7.0 20110430 (experimental)
Copyright (C) 2011 Free Software Foundation, Inc.

我该如何解决?

I’m trying to compile a python extension with cython in win 7 64-bit using mingw (64-bit).
I’m working with Python 2.6 (Active Python 2.6.6) and with the adequate distutils.cfg file (setting mingw as the compiler)

When executing

> C:\Python26\programas\Cython>python setup.py build_ext --inplace

I get an error saying that gcc has not an -mno-cygwin option:

> C:\Python26\programas\Cython>python setup.py build_ext --inplace
running build_ext
skipping 'hello2.c' Cython extension (up-to-date)
building 'hello2' extension
C:\mingw\bin\gcc.exe -mno-cygwin -mdll -O -Wall -IC:\Python26\include -IC:\Python26\PC -c hello2.c -o build\temp.win-amd64-2.6\Release\hello2.o
gcc: error: unrecognized command line option '-mno-cygwin'
error: command 'gcc' failed with exit status 1

gcc is:

C:\>gcc --version
gcc (GCC) 4.7.0 20110430 (experimental)
Copyright (C) 2011 Free Software Foundation, Inc.

How could I fix it?


回答 0

听起来好像GCC 4.7.0终于删除了不推荐使用的-mno-cygwin选项,但是distutils尚未赶上它。请安装较旧版本的MinGW,或者distutils\cygwinccompiler.py在Python目录中进行编辑以删除的所有实例-mno-cygwin

It sounds like GCC 4.7.0 has finally removed the deprecated -mno-cygwin option, but distutils has not yet caught up with it. Either install a slightly older version of MinGW, or edit distutils\cygwinccompiler.py in your Python directory to remove all instances of -mno-cygwin.


回答 1

在解决我发现的这些问题和以下问题的过程中,我在此线程中编写了一个配方。我在这里复制它,以防它可能对其他人有用:


在Win 7 64位中使用mingw编译器使用python 2.6.6逐步编译64位cython扩展的配方

安装mingw编译器
1)安装tdm64-gcc-4.5.2.exe进行64位编译

将补丁应用到python.h
2)按照http://bugs.python.org/file12411/mingw-w64.patch中的指示在C:\ python26 \ include中修改python.h

修改distutils
Edit 2013:注意,比python 2.7.6和3.3.3–mno-cygwin最终已被删除,因此可以跳过第3步

3)在Python26 \ Lib \ distutils \ cygwinccompiler.py中删除Mingw32CCompiler类中对gcc的调用中的所有参数-mno-cygwin
4)在同一模块中,修改get_msvcr()以返回空列表,而不是[‘msvcr90 ‘]当msc_ver ==’1500’时。

产生libpython26.a文件(64位python中不包括)
编辑2013:通过从gohlke下载并安装libpython26.a可以跳过以下步骤5-10

5)从mingw-w64-bin_x86_64- mingw_20101003_sezero.zip中获取gendef.exe(tmd64发行版中不提供gendef.exe。另一种解决方案是从源代码编译gendef …)
6)复制python26.dll(位于C中) \ windows \ system32)到用户目录(C:\ Users \ myname)
7)使用以下命令生成python26.def文件:

gendef.exe C:\ Users \ myname \ python26.dll

8)将生成的python.def文件(位于执行gendef的文件夹中)移至用户目录
9)生成libpython.a,其内容如下:

dlltool -v –dllname python26.dll –def C:\ Users \ myname \ python26.def –output-lib C:\ Users \ myname \ libpython26.a

10)将创建的libpython26.a移至C:\ Python26 \ libs

产生您的.pyd扩展名
。11)按照cython教程(http://docs.cython.org/src/quickstart/build.html)中的指示,创建一个hello.pyx测试文件和setup.py文件
。12)编译为

python setup.py build_ext –inplace

做完了!

During the process of solving these and the following problems I found, I wrote a recipe in this thread. I reproduce it here in case it could be of utility for others:


Step by step recipe to compile 64-bit cython extensions with python 2.6.6 with mingw compiler in win 7 64-bit

Install mingw compiler
1) Install tdm64-gcc-4.5.2.exe for 64-bit compilation

Apply patch to python.h
2) Modify python.h in C:\python26\include as indicated in http://bugs.python.org/file12411/mingw-w64.patch

Modify distutils
Edit 2013: Note than in python 2.7.6 and 3.3.3 -mno-cygwin has been finally removed so step 3 can be skipped.

3) Eliminate all the parameters -mno-cygwin fom the call to gcc in the Mingw32CCompiler class in Python26\Lib\distutils\cygwinccompiler.py
4) In the same module, modify get_msvcr() to return an empty list instead of [‘msvcr90’] when msc_ver == ‘1500’ .

Produce the libpython26.a file (not included in 64 bit python)
Edit 2013: the following steps 5-10 can be skipped by downloading and installing libpython26.a from gohlke.

5) Obtain gendef.exe from mingw-w64-bin_x86_64- mingw_20101003_sezero.zip (gendef.exe is not available in the tmd64 distribution. Another solution is to compile gendef from source…)
6) Copy python26.dll (located at C\windows\system32) to the user directory (C:\Users\myname)
7) Produce the python26.def file with:

gendef.exe C:\Users\myname\python26.dll

8) Move the python.def file produced (located in the folder from where gendef was executed) to the user directory
9) Produce the libpython.a with:

dlltool -v –dllname python26.dll –def C:\Users\myname \python26.def –output-lib C:\Users\myname\libpython26.a

10) Move the created libpython26.a to C:\Python26\libs

Produce your .pyd extension
11) Create a test hello.pyx file and a setup.py file as indicated in cython tutorial (http://docs.cython.org/src/quickstart/build.html)
12) Compile with

python setup.py build_ext –inplace

Done!


回答 2

现在,该错误已在Python 2.7.6版本候选1中修复。

补丁提交在这里

已解决的问题跟踪器线程在此处

This bug has now been fixed in Python 2.7.6 release candidate 1.

The patching commit is here.

The resolved issue tracker thread is here.


回答 3

试试这个 。它确实适用于错误
https://github.com/develersrl/gccwinbinaries


在Python脚本中,如何设置PYTHONPATH?

问题:在Python脚本中,如何设置PYTHONPATH?

我知道如何在/ etc / profile和环境变量中进行设置。

但是,如果我想在脚本中进行设置怎么办?是导入os,sys吗?我该怎么做?

I know how to set it in my /etc/profile and in my environment variables.

But what if I want to set it during a script? Is it import os, sys? How do I do it?


回答 0

您没有设置PYTHONPATH,而是向中添加条目sys.path。这是应该在其中搜索Python软件包的目录列表,因此您只需将目录追加到该列表即可。

sys.path.append('/path/to/whatever')

实际上,sys.path是通过分割PYTHONPATH路径分隔符:上的值来初始化的(在类似Linux的系统上,;在Windows上)。

您也可以使用来添加目录site.addsitedir,该方法还将考虑.pth您传递的目录内存在的文件。(对于您在中指定的目录,情况并非如此PYTHONPATH。)

You don’t set PYTHONPATH, you add entries to sys.path. It’s a list of directories that should be searched for Python packages, so you can just append your directories to that list.

sys.path.append('/path/to/whatever')

In fact, sys.path is initialized by splitting the value of PYTHONPATH on the path separator character (: on Linux-like systems, ; on Windows).

You can also add directories using site.addsitedir, and that method will also take into account .pth files existing within the directories you pass. (That would not be the case with directories you specify in PYTHONPATH.)


回答 1

您可以通过os.environ以下方式获取和设置环境变量:

import os
user_home = os.environ["HOME"]

os.environ["PYTHONPATH"] = "..."

但是,由于您的解释器已经在运行,因此不会起作用。你最好用

import sys
sys.path.append("...")

这是您PYTHONPATH将在解释程序启动时转换为的数组。

You can get and set environment variables via os.environ:

import os
user_home = os.environ["HOME"]

os.environ["PYTHONPATH"] = "..."

But since your interpreter is already running, this will have no effect. You’re better off using

import sys
sys.path.append("...")

which is the array that your PYTHONPATH will be transformed into on interpreter startup.


回答 2

如果您sys.path.append('dir/to/path')不加检查就放了它,则可以在中生成一个长列表sys.path。为此,我建议这样做:

import sys
import os # if you want this directory

try:
    sys.path.index('/dir/path') # Or os.getcwd() for this directory
except ValueError:
    sys.path.append('/dir/path') # Or os.getcwd() for this directory

If you put sys.path.append('dir/to/path') without check it is already added, you could generate a long list in sys.path. For that, I recommend this:

import sys
import os # if you want this directory

try:
    sys.path.index('/dir/path') # Or os.getcwd() for this directory
except ValueError:
    sys.path.append('/dir/path') # Or os.getcwd() for this directory

回答 3

PYTHONPATH结尾于sys.path,您可以在运行时进行修改。

import sys
sys.path += ["whatever"]

PYTHONPATH ends up in sys.path, which you can modify at runtime.

import sys
sys.path += ["whatever"]

回答 4

您可以通过设置PYTHONPATHos.environ['PATHPYTHON']=/some/path然后需要调用os.system('python')以重新启动python shell,以使新添加的路径生效。

you can set PYTHONPATH, by os.environ['PATHPYTHON']=/some/path, then you need to call os.system('python') to restart the python shell to make the newly added path effective.


回答 5

我的Linux也可以:

import sys
sys.path.extend(["/path/to/dotpy/file/"])

I linux this works too:

import sys
sys.path.extend(["/path/to/dotpy/file/"])

如何在python中使用鼻子测试/单元测试来断言输出?

问题:如何在python中使用鼻子测试/单元测试来断言输出?

我正在为下一个功能编写测试:

def foo():
    print 'hello world!'

因此,当我要测试此功能时,代码将如下所示:

import sys
from foomodule import foo
def test_foo():
    foo()
    output = sys.stdout.getline().strip() # because stdout is an StringIO instance
    assert output == 'hello world!'

但是,如果我使用-s参数运行鼻子测试,则测试将崩溃。如何使用unittest或鼻子模块捕获输出?

I’m writing tests for a function like next one:

def foo():
    print 'hello world!'

So when I want to test this function the code will be like this:

import sys
from foomodule import foo
def test_foo():
    foo()
    output = sys.stdout.getline().strip() # because stdout is an StringIO instance
    assert output == 'hello world!'

But if I run nosetests with -s parameter the test crashes. How can I catch the output with unittest or nose module?


回答 0

我使用此上下文管理器捕获输出。通过临时替换,它最终使用了与其他一些答案相同的技术sys.stdout。我更喜欢上下文管理器,因为它将所有簿记功能都包装到一个函数中,因此,我不必重新编写任何最终代码,也不必为此编写设置和拆卸功能。

import sys
from contextlib import contextmanager
from StringIO import StringIO

@contextmanager
def captured_output():
    new_out, new_err = StringIO(), StringIO()
    old_out, old_err = sys.stdout, sys.stderr
    try:
        sys.stdout, sys.stderr = new_out, new_err
        yield sys.stdout, sys.stderr
    finally:
        sys.stdout, sys.stderr = old_out, old_err

像这样使用它:

with captured_output() as (out, err):
    foo()
# This can go inside or outside the `with` block
output = out.getvalue().strip()
self.assertEqual(output, 'hello world!')

此外,由于退出with块后将恢复原始输出状态,因此我们可以使用与第一个捕获块相同的功能来设置第二个捕获块,使用设置和拆卸功能是不可能的,并且在最终尝试写入时会变得很冗长手动阻止。当测试的目的是将两个函数的结果相互比较而不是与某个预先计算的值进行比较时,该功能就派上用场了。

I use this context manager to capture output. It ultimately uses the same technique as some of the other answers by temporarily replacing sys.stdout. I prefer the context manager because it wraps all the bookkeeping into a single function, so I don’t have to re-write any try-finally code, and I don’t have to write setup and teardown functions just for this.

import sys
from contextlib import contextmanager
from StringIO import StringIO

@contextmanager
def captured_output():
    new_out, new_err = StringIO(), StringIO()
    old_out, old_err = sys.stdout, sys.stderr
    try:
        sys.stdout, sys.stderr = new_out, new_err
        yield sys.stdout, sys.stderr
    finally:
        sys.stdout, sys.stderr = old_out, old_err

Use it like this:

with captured_output() as (out, err):
    foo()
# This can go inside or outside the `with` block
output = out.getvalue().strip()
self.assertEqual(output, 'hello world!')

Furthermore, since the original output state is restored upon exiting the with block, we can set up a second capture block in the same function as the first one, which isn’t possible using setup and teardown functions, and gets wordy when writing try-finally blocks manually. That ability came in handy when the goal of a test was to compare the results of two functions relative to each other rather than to some precomputed value.


回答 1

如果确实要执行此操作,则可以在测试期间重新分配sys.stdout。

def test_foo():
    import sys
    from foomodule import foo
    from StringIO import StringIO

    saved_stdout = sys.stdout
    try:
        out = StringIO()
        sys.stdout = out
        foo()
        output = out.getvalue().strip()
        assert output == 'hello world!'
    finally:
        sys.stdout = saved_stdout

但是,如果我正在编写此代码,则希望将可选out参数传递给该foo函数。

def foo(out=sys.stdout):
    out.write("hello, world!")

然后测试要简单得多:

def test_foo():
    from foomodule import foo
    from StringIO import StringIO

    out = StringIO()
    foo(out=out)
    output = out.getvalue().strip()
    assert output == 'hello world!'

If you really want to do this, you can reassign sys.stdout for the duration of the test.

def test_foo():
    import sys
    from foomodule import foo
    from StringIO import StringIO

    saved_stdout = sys.stdout
    try:
        out = StringIO()
        sys.stdout = out
        foo()
        output = out.getvalue().strip()
        assert output == 'hello world!'
    finally:
        sys.stdout = saved_stdout

If I were writing this code, however, I would prefer to pass an optional out parameter to the foo function.

def foo(out=sys.stdout):
    out.write("hello, world!")

Then the test is much simpler:

def test_foo():
    from foomodule import foo
    from StringIO import StringIO

    out = StringIO()
    foo(out=out)
    output = out.getvalue().strip()
    assert output == 'hello world!'

回答 2

从2.7版开始,您不再需要重新分配sys.stdout,这是通过bufferflag提供的。而且,这是鼻子测试的默认行为。

这是在非缓冲上下文中失败的示例:

import sys
import unittest

def foo():
    print 'hello world!'

class Case(unittest.TestCase):
    def test_foo(self):
        foo()
        if not hasattr(sys.stdout, "getvalue"):
            self.fail("need to run in buffered mode")
        output = sys.stdout.getvalue().strip() # because stdout is an StringIO instance
        self.assertEquals(output,'hello world!')

您可以设置缓冲区通过unit2命令行标志-b--bufferunittest.main选择。相反的是通过nosetestflag 实现的--nocapture

if __name__=="__main__":   
    assert not hasattr(sys.stdout, "getvalue")
    unittest.main(module=__name__, buffer=True, exit=False)
    #.
    #----------------------------------------------------------------------
    #Ran 1 test in 0.000s
    #
    #OK
    assert not hasattr(sys.stdout, "getvalue")

    unittest.main(module=__name__, buffer=False)
    #hello world!
    #F
    #======================================================================
    #FAIL: test_foo (__main__.Case)
    #----------------------------------------------------------------------
    #Traceback (most recent call last):
    #  File "test_stdout.py", line 15, in test_foo
    #    self.fail("need to run in buffered mode")
    #AssertionError: need to run in buffered mode
    #
    #----------------------------------------------------------------------
    #Ran 1 test in 0.002s
    #
    #FAILED (failures=1)

Since version 2.7, you do not need anymore to reassign sys.stdout, this is provided through buffer flag. Moreover, it is the default behavior of nosetest.

Here is a sample failing in non buffered context:

import sys
import unittest

def foo():
    print 'hello world!'

class Case(unittest.TestCase):
    def test_foo(self):
        foo()
        if not hasattr(sys.stdout, "getvalue"):
            self.fail("need to run in buffered mode")
        output = sys.stdout.getvalue().strip() # because stdout is an StringIO instance
        self.assertEquals(output,'hello world!')

You can set buffer through unit2 command line flag -b, --buffer or in unittest.main options. The opposite is achieved through nosetest flag --nocapture.

if __name__=="__main__":   
    assert not hasattr(sys.stdout, "getvalue")
    unittest.main(module=__name__, buffer=True, exit=False)
    #.
    #----------------------------------------------------------------------
    #Ran 1 test in 0.000s
    #
    #OK
    assert not hasattr(sys.stdout, "getvalue")

    unittest.main(module=__name__, buffer=False)
    #hello world!
    #F
    #======================================================================
    #FAIL: test_foo (__main__.Case)
    #----------------------------------------------------------------------
    #Traceback (most recent call last):
    #  File "test_stdout.py", line 15, in test_foo
    #    self.fail("need to run in buffered mode")
    #AssertionError: need to run in buffered mode
    #
    #----------------------------------------------------------------------
    #Ran 1 test in 0.002s
    #
    #FAILED (failures=1)

回答 3

对于我来说,这些答案很多都失败了,因为您无法from StringIO import StringIO使用Python3。这是基于@naxa的注释和Python Cookbook的最小工作片段。

from io import StringIO
from unittest.mock import patch

with patch('sys.stdout', new=StringIO()) as fakeOutput:
    print('hello world')
    self.assertEqual(fakeOutput.getvalue().strip(), 'hello world')

A lot of these answers failed for me because you can’t from StringIO import StringIO in Python 3. Here’s a minimum working snippet based on @naxa’s comment and the Python Cookbook.

from io import StringIO
from unittest.mock import patch

with patch('sys.stdout', new=StringIO()) as fakeOutput:
    print('hello world')
    self.assertEqual(fakeOutput.getvalue().strip(), 'hello world')

回答 4

在python 3.5中,您可以使用contextlib.redirect_stdout()StringIO()。这是对代码的修改

import contextlib
from io import StringIO
from foomodule import foo

def test_foo():
    temp_stdout = StringIO()
    with contextlib.redirect_stdout(temp_stdout):
        foo()
    output = temp_stdout.getvalue().strip()
    assert output == 'hello world!'

In python 3.5 you can use contextlib.redirect_stdout() and StringIO(). Here’s the modification to your code

import contextlib
from io import StringIO
from foomodule import foo

def test_foo():
    temp_stdout = StringIO()
    with contextlib.redirect_stdout(temp_stdout):
        foo()
    output = temp_stdout.getvalue().strip()
    assert output == 'hello world!'

回答 5

我只是在学习Python,并且发现自己遇到了与上述类似的问题,并且对带有输出的方法进行了单元测试。我上面对foo模块的传递单元测试最终看起来像这样:

import sys
import unittest
from foo import foo
from StringIO import StringIO

class FooTest (unittest.TestCase):
    def setUp(self):
        self.held, sys.stdout = sys.stdout, StringIO()

    def test_foo(self):
        foo()
        self.assertEqual(sys.stdout.getvalue(),'hello world!\n')

I’m only just learning Python and found myself struggling with a similar problem to the one above with unit tests for methods with output. My passing unit test for foo module above has ended up looking like this:

import sys
import unittest
from foo import foo
from StringIO import StringIO

class FooTest (unittest.TestCase):
    def setUp(self):
        self.held, sys.stdout = sys.stdout, StringIO()

    def test_foo(self):
        foo()
        self.assertEqual(sys.stdout.getvalue(),'hello world!\n')

回答 6

编写测试通常会向我们展示一种更好的编写代码的方法。与Shane的答案类似,我想提出另一种看待这个问题的方式。您是否真的要断言您的程序输出了某个字符串,还是只是构造了一个用于输出的字符串?这变得更容易测试,因为我们可能可以假设Python print语句正确完成了工作。

def foo_msg():
    return 'hello world'

def foo():
    print foo_msg()

然后您的测试非常简单:

def test_foo_msg():
    assert 'hello world' == foo_msg()

当然,如果您确实需要测试程序的实际输出,则可以忽略。:)

Writing tests often shows us a better way to write our code. Similar to Shane’s answer, I’d like to suggest yet another way of looking at this. Do you really want to assert that your program outputted a certain string, or just that it constructed a certain string for output? This becomes easier to test, since we can probably assume that the Python print statement does its job correctly.

def foo_msg():
    return 'hello world'

def foo():
    print foo_msg()

Then your test is very simple:

def test_foo_msg():
    assert 'hello world' == foo_msg()

Of course, if you really have a need to test your program’s actual output, then feel free to disregard. :)


回答 7

基于Rob Kennedy的回答,我编写了一个基于类的上下文管理器版本来缓冲输出。

用法就像:

with OutputBuffer() as bf:
    print('hello world')
assert bf.out == 'hello world\n'

这是实现:

from io import StringIO
import sys


class OutputBuffer(object):

    def __init__(self):
        self.stdout = StringIO()
        self.stderr = StringIO()

    def __enter__(self):
        self.original_stdout, self.original_stderr = sys.stdout, sys.stderr
        sys.stdout, sys.stderr = self.stdout, self.stderr
        return self

    def __exit__(self, exception_type, exception, traceback):
        sys.stdout, sys.stderr = self.original_stdout, self.original_stderr

    @property
    def out(self):
        return self.stdout.getvalue()

    @property
    def err(self):
        return self.stderr.getvalue()

Based on Rob Kennedy’s answer, I wrote a class-based version of the context manager to buffer the output.

Usage is like:

with OutputBuffer() as bf:
    print('hello world')
assert bf.out == 'hello world\n'

Here’s the implementation:

from io import StringIO
import sys


class OutputBuffer(object):

    def __init__(self):
        self.stdout = StringIO()
        self.stderr = StringIO()

    def __enter__(self):
        self.original_stdout, self.original_stderr = sys.stdout, sys.stderr
        sys.stdout, sys.stderr = self.stdout, self.stderr
        return self

    def __exit__(self, exception_type, exception, traceback):
        sys.stdout, sys.stderr = self.original_stdout, self.original_stderr

    @property
    def out(self):
        return self.stdout.getvalue()

    @property
    def err(self):
        return self.stderr.getvalue()

回答 8

或考虑使用pytest,它具有内置的断言stdout和stderr支持。查看文件

def test_myoutput(capsys): # or use "capfd" for fd-level
    print("hello")
    captured = capsys.readouterr()
    assert captured.out == "hello\n"
    print("next")
    captured = capsys.readouterr()
    assert captured.out == "next\n"

Or consider using pytest, it has built-in support for asserting stdout and stderr. See docs

def test_myoutput(capsys): # or use "capfd" for fd-level
    print("hello")
    captured = capsys.readouterr()
    assert captured.out == "hello\n"
    print("next")
    captured = capsys.readouterr()
    assert captured.out == "next\n"

回答 9

无论n611x007本体使用已经建议unittest.mock,但这个答案适应阿卡门斯的向您展示如何轻松地包裹unittest.TestCase方法与嘲笑互动stdout

import io
import unittest
import unittest.mock

msg = "Hello World!"


# function we will be testing
def foo():
    print(msg, end="")


# create a decorator which wraps a TestCase method and pass it a mocked
# stdout object
mock_stdout = unittest.mock.patch('sys.stdout', new_callable=io.StringIO)


class MyTests(unittest.TestCase):

    @mock_stdout
    def test_foo(self, stdout):
        # run the function whose output we want to test
        foo()
        # get its output from the mocked stdout
        actual = stdout.getvalue()
        expected = msg
        self.assertEqual(actual, expected)

Both n611x007 and Noumenon already suggested using unittest.mock, but this answer adapts Acumenus’s to show how you can easily wrap unittest.TestCase methods to interact with a mocked stdout.

import io
import unittest
import unittest.mock

msg = "Hello World!"


# function we will be testing
def foo():
    print(msg, end="")


# create a decorator which wraps a TestCase method and pass it a mocked
# stdout object
mock_stdout = unittest.mock.patch('sys.stdout', new_callable=io.StringIO)


class MyTests(unittest.TestCase):

    @mock_stdout
    def test_foo(self, stdout):
        # run the function whose output we want to test
        foo()
        # get its output from the mocked stdout
        actual = stdout.getvalue()
        expected = msg
        self.assertEqual(actual, expected)

回答 10

基于此线程中所有出色的答案,这就是我解决的方法。我想尽可能地保留它。我使用增强的单元测试机制setUp(),以捕获sys.stdoutsys.stderr,增加了新的API断言检查捕获的值对一个预期值,然后还原sys.stdoutsys.stderrtearDown(). I did this to keep a similar unit test API as the built-in单元测试API while still being able to unit test values printed tosys.stdout的orsys.stderr`。

import io
import sys
import unittest


class TestStdout(unittest.TestCase):

    # before each test, capture the sys.stdout and sys.stderr
    def setUp(self):
        self.test_out = io.StringIO()
        self.test_err = io.StringIO()
        self.original_output = sys.stdout
        self.original_err = sys.stderr
        sys.stdout = self.test_out
        sys.stderr = self.test_err

    # restore sys.stdout and sys.stderr after each test
    def tearDown(self):
        sys.stdout = self.original_output
        sys.stderr = self.original_err

    # assert that sys.stdout would be equal to expected value
    def assertStdoutEquals(self, value):
        self.assertEqual(self.test_out.getvalue().strip(), value)

    # assert that sys.stdout would not be equal to expected value
    def assertStdoutNotEquals(self, value):
        self.assertNotEqual(self.test_out.getvalue().strip(), value)

    # assert that sys.stderr would be equal to expected value
    def assertStderrEquals(self, value):
        self.assertEqual(self.test_err.getvalue().strip(), value)

    # assert that sys.stderr would not be equal to expected value
    def assertStderrNotEquals(self, value):
        self.assertNotEqual(self.test_err.getvalue().strip(), value)

    # example of unit test that can capture the printed output
    def test_print_good(self):
        print("------")

        # use assertStdoutEquals(value) to test if your
        # printed value matches your expected `value`
        self.assertStdoutEquals("------")

    # fails the test, expected different from actual!
    def test_print_bad(self):
        print("@=@=")
        self.assertStdoutEquals("@-@-")


if __name__ == '__main__':
    unittest.main()

运行单元测试时,输出为:

$ python3 -m unittest -v tests/print_test.py
test_print_bad (tests.print_test.TestStdout) ... FAIL
test_print_good (tests.print_test.TestStdout) ... ok

======================================================================
FAIL: test_print_bad (tests.print_test.TestStdout)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tests/print_test.py", line 51, in test_print_bad
    self.assertStdoutEquals("@-@-")
  File "/tests/print_test.py", line 24, in assertStdoutEquals
    self.assertEqual(self.test_out.getvalue().strip(), value)
AssertionError: '@=@=' != '@-@-'
- @=@=
+ @-@-


----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (failures=1)

Building on all the awesome answers in this thread, this is how I solved it. I wanted to keep it as stock as possible. I augmented the unit test mechanism using setUp() to capture sys.stdout and sys.stderr, added new assert APIs to check the captured values against an expected value and then restore sys.stdout and sys.stderr upon tearDown(). I did this to keep a similar unit test API as the built-inunittestAPI while still being able to unit test values printed tosys.stdoutorsys.stderr`.

import io
import sys
import unittest


class TestStdout(unittest.TestCase):

    # before each test, capture the sys.stdout and sys.stderr
    def setUp(self):
        self.test_out = io.StringIO()
        self.test_err = io.StringIO()
        self.original_output = sys.stdout
        self.original_err = sys.stderr
        sys.stdout = self.test_out
        sys.stderr = self.test_err

    # restore sys.stdout and sys.stderr after each test
    def tearDown(self):
        sys.stdout = self.original_output
        sys.stderr = self.original_err

    # assert that sys.stdout would be equal to expected value
    def assertStdoutEquals(self, value):
        self.assertEqual(self.test_out.getvalue().strip(), value)

    # assert that sys.stdout would not be equal to expected value
    def assertStdoutNotEquals(self, value):
        self.assertNotEqual(self.test_out.getvalue().strip(), value)

    # assert that sys.stderr would be equal to expected value
    def assertStderrEquals(self, value):
        self.assertEqual(self.test_err.getvalue().strip(), value)

    # assert that sys.stderr would not be equal to expected value
    def assertStderrNotEquals(self, value):
        self.assertNotEqual(self.test_err.getvalue().strip(), value)

    # example of unit test that can capture the printed output
    def test_print_good(self):
        print("------")

        # use assertStdoutEquals(value) to test if your
        # printed value matches your expected `value`
        self.assertStdoutEquals("------")

    # fails the test, expected different from actual!
    def test_print_bad(self):
        print("@=@=")
        self.assertStdoutEquals("@-@-")


if __name__ == '__main__':
    unittest.main()

When the unit test is run, the output is:

$ python3 -m unittest -v tests/print_test.py
test_print_bad (tests.print_test.TestStdout) ... FAIL
test_print_good (tests.print_test.TestStdout) ... ok

======================================================================
FAIL: test_print_bad (tests.print_test.TestStdout)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tests/print_test.py", line 51, in test_print_bad
    self.assertStdoutEquals("@-@-")
  File "/tests/print_test.py", line 24, in assertStdoutEquals
    self.assertEqual(self.test_out.getvalue().strip(), value)
AssertionError: '@=@=' != '@-@-'
- @=@=
+ @-@-


----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (failures=1)

如何使用BeautifulSoup查找节点的子节点

问题:如何使用BeautifulSoup查找节点的子节点

我想获取所有<a>属于以下子项的标签<li>

<div>
<li class="test">
    <a>link1</a>
    <ul> 
       <li>  
          <a>link2</a> 
       </li>
    </ul>
</li>
</div>

我知道如何找到像这样的特定类的元素:

soup.find("li", { "class" : "test" }) 

但是我不知道如何找到所有<a>的孩子的孩子,<li class=test>而不是其他孩子的孩子。

就像我想选择:

<a>link1</a>

I want to get all the <a> tags which are children of <li>:

<div>
<li class="test">
    <a>link1</a>
    <ul> 
       <li>  
          <a>link2</a> 
       </li>
    </ul>
</li>
</div>

I know how to find element with particular class like this:

soup.find("li", { "class" : "test" }) 

But I don’t know how to find all <a> which are children of <li class=test> but not any others.

Like I want to select:

<a>link1</a>

回答 0

试试这个

li = soup.find('li', {'class': 'text'})
children = li.findChildren("a" , recursive=False)
for child in children:
    print child

Try this

li = soup.find('li', {'class': 'text'})
children = li.findChildren("a" , recursive=False)
for child in children:
    print(child)

回答 1

DOC中有一个超小部分,显示了如何查找/ find_all 直接子级。

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#the-recursive-argument

在您需要的情况下,link1是第一个直接子级:

# for only first direct child
soup.find("li", { "class" : "test" }).find("a", recursive=False)

如果您想要所有直系子女:

# for all direct children
soup.find("li", { "class" : "test" }).findAll("a", recursive=False)

There’s a super small section in the DOCs that shows how to find/find_all direct children.

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#the-recursive-argument

In your case as you want link1 which is first direct child:

# for only first direct child
soup.find("li", { "class" : "test" }).find("a", recursive=False)

If you want all direct children:

# for all direct children
soup.find("li", { "class" : "test" }).findAll("a", recursive=False)

回答 2

也许你想做

soup.find("li", { "class" : "test" }).find('a')

Perhaps you want to do

soup.find("li", { "class" : "test" }).find('a')

回答 3

试试这个:

li = soup.find("li", { "class" : "test" })
children = li.find_all("a") # returns a list of all <a> children of li

其他提醒:

find方法仅获取第一个出现的子元素。find_all方法获取所有后代元素,并存储在列表中。

try this:

li = soup.find("li", { "class" : "test" })
children = li.find_all("a") # returns a list of all <a> children of li

other reminders:

The find method only gets the first occurring child element. The find_all method gets all descendant elements and are stored in a list.


回答 4

“如何找到所有人a的孩子,<li class=test>而不是其他孩子?”

给定下面的HTML(我添加了另一个<a>以显示select和之间的区别select_one):

<div>
  <li class="test">
    <a>link1</a>
    <ul>
      <li>
        <a>link2</a>
      </li>
    </ul>
    <a>link3</a>
  </li>
</div>

解决方案是使用放在两个CSS选择器之间的子组合器>):

>>> soup.select('li.test > a')
[<a>link1</a>, <a>link3</a>]

如果您只想找到第一个孩子:

>>> soup.select_one('li.test > a')
<a>link1</a>

“How to find all a which are children of <li class=test> but not any others?”

Given the HTML below (I added another <a> to show te difference between select and select_one):

<div>
  <li class="test">
    <a>link1</a>
    <ul>
      <li>
        <a>link2</a>
      </li>
    </ul>
    <a>link3</a>
  </li>
</div>

The solution is to use child combinator (>) that is placed between two CSS selectors:

>>> soup.select('li.test > a')
[<a>link1</a>, <a>link3</a>]

In case you want to find only the first child:

>>> soup.select_one('li.test > a')
<a>link1</a>

回答 5

另一种方法-创建一个过滤器函数,该函数返回True所有所需标签:

def my_filter(tag):
    return (tag.name == 'a' and
        tag.parent.name == 'li' and
        'test' in tag.parent['class'])

然后只需调用find_all参数:

for a in soup(my_filter): # or soup.find_all(my_filter)
    print a

Yet another method – create a filter function that returns True for all desired tags:

def my_filter(tag):
    return (tag.name == 'a' and
        tag.parent.name == 'li' and
        'test' in tag.parent['class'])

Then just call find_all with the argument:

for a in soup(my_filter): # or soup.find_all(my_filter)
    print a

泡菜还是json?

问题:泡菜还是json?

我需要将一个dict密钥为type str且值为ints 的小对象保存到磁盘,然后将其恢复。像这样:

{'juanjo': 2, 'pedro':99, 'other': 333}

最佳选择是什么,为什么?使用pickle或使用序列化它simplejson

我正在使用Python 2.6。

I need to save to disk a little dict object whose keys are of the type str and values are ints and then recover it. Something like this:

{'juanjo': 2, 'pedro':99, 'other': 333}

What is the best option and why? Serialize it with pickle or with simplejson?

I am using Python 2.6.


回答 0

如果您没有任何互操作性要求(例如,您将仅使用Python使用数据)并且二进制格式很好,请使用cPickle,它将为您提供真正快速的Python对象序列化。

如果您希望互操作性或想要一种文本格式来存储数据,请使用JSON(或其他一些适当的格式,具体取决于您的约束)。

If you do not have any interoperability requirements (e.g. you are just going to use the data with Python) and a binary format is fine, go with cPickle which gives you really fast Python object serialization.

If you want interoperability or you want a text format to store your data, go with JSON (or some other appropriate format depending on your constraints).


回答 1

对于序列化,我更喜欢JSON而不是pickle。取消选择可以运行任意代码,并且pickle用于在程序之间传输数据或在会话之间存储数据是一个安全漏洞。JSON不会引入安全漏洞,并且已标准化,因此,您可以根据需要使用不同语言的程序访问数据。

I prefer JSON over pickle for my serialization. Unpickling can run arbitrary code, and using pickle to transfer data between programs or store data between sessions is a security hole. JSON does not introduce a security hole and is standardized, so the data can be accessed by programs in different languages if you ever need to.


回答 2

您可能还会发现一些有趣的图表,可以进行比较:http : //kovshenin.com/archives/pickle-vs-json-which-is-faster/

You might also find this interesting, with some charts to compare: http://kovshenin.com/archives/pickle-vs-json-which-is-faster/


回答 3

如果您主要关注速度和空间,请使用cPickle,因为cPickle比JSON快。

如果您更关注互操作性,安全性和/或人类可读性,请使用JSON。


其他答案中引用的测试结果记录在2010年,2016年使用cPickle 协议2更新的测试显示:

  • cPickle 3.8倍更快的加载速度
  • cPickle 1.5倍读取速度更快
  • cPickle编码稍小

使用这个gist可以自己重现这一点,它基于康斯坦丁在其他答案中引用的基准,但是使用协议2而不是pickle的cPickle,并且使用pickle的json(因为json比simplejson快)来使用json ,例如

wget https://gist.github.com/jdimatteo/af317ef24ccf1b3fa91f4399902bb534/raw/03e8dbab11b5605bc572bc117c8ac34cfa959a70/pickle_vs_json.py
python pickle_vs_json.py

在不错的2015 Xeon处理器上使用python 2.7的结果:

Dir Entries Method  Time    Length

dump    10  JSON    0.017   1484510
load    10  JSON    0.375   -
dump    10  Pickle  0.011   1428790
load    10  Pickle  0.098   -
dump    20  JSON    0.036   2969020
load    20  JSON    1.498   -
dump    20  Pickle  0.022   2857580
load    20  Pickle  0.394   -
dump    50  JSON    0.079   7422550
load    50  JSON    9.485   -
dump    50  Pickle  0.055   7143950
load    50  Pickle  2.518   -
dump    100 JSON    0.165   14845100
load    100 JSON    37.730  -
dump    100 Pickle  0.107   14287900
load    100 Pickle  9.907   -

带有pickle协议3的Python 3.4甚至更快。

If you are primarily concerned with speed and space, use cPickle because cPickle is faster than JSON.

If you are more concerned with interoperability, security, and/or human readability, then use JSON.


The tests results referenced in other answers were recorded in 2010, and the updated tests in 2016 with cPickle protocol 2 show:

  • cPickle 3.8x faster loading
  • cPickle 1.5x faster reading
  • cPickle slightly smaller encoding

Reproduce this yourself with this gist, which is based on the Konstantin’s benchmark referenced in other answers, but using cPickle with protocol 2 instead of pickle, and using json instead of simplejson (since json is faster than simplejson), e.g.

wget https://gist.github.com/jdimatteo/af317ef24ccf1b3fa91f4399902bb534/raw/03e8dbab11b5605bc572bc117c8ac34cfa959a70/pickle_vs_json.py
python pickle_vs_json.py

Results with python 2.7 on a decent 2015 Xeon processor:

Dir Entries Method  Time    Length

dump    10  JSON    0.017   1484510
load    10  JSON    0.375   -
dump    10  Pickle  0.011   1428790
load    10  Pickle  0.098   -
dump    20  JSON    0.036   2969020
load    20  JSON    1.498   -
dump    20  Pickle  0.022   2857580
load    20  Pickle  0.394   -
dump    50  JSON    0.079   7422550
load    50  JSON    9.485   -
dump    50  Pickle  0.055   7143950
load    50  Pickle  2.518   -
dump    100 JSON    0.165   14845100
load    100 JSON    37.730  -
dump    100 Pickle  0.107   14287900
load    100 Pickle  9.907   -

Python 3.4 with pickle protocol 3 is even faster.


回答 4

JSON还是泡菜?JSON 泡菜怎么样!您可以使用jsonpickle。它易于使用,并且磁盘上的文件是JSON,因此可读。

http://jsonpickle.github.com/

JSON or pickle? How about JSON and pickle! You can use jsonpickle. It easy to use and the file on disk is readable because it’s JSON.

http://jsonpickle.github.com/


回答 5

我尝试了几种方法,发现使用cPickle并将dumps方法的协议参数设置为:cPickle.dumps(obj, protocol=cPickle.HIGHEST_PROTOCOL)是最快的转储方法。

import msgpack
import json
import pickle
import timeit
import cPickle
import numpy as np

num_tests = 10

obj = np.random.normal(0.5, 1, [240, 320, 3])

command = 'pickle.dumps(obj)'
setup = 'from __main__ import pickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("pickle:  %f seconds" % result)

command = 'cPickle.dumps(obj)'
setup = 'from __main__ import cPickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("cPickle:   %f seconds" % result)


command = 'cPickle.dumps(obj, protocol=cPickle.HIGHEST_PROTOCOL)'
setup = 'from __main__ import cPickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("cPickle highest:   %f seconds" % result)

command = 'json.dumps(obj.tolist())'
setup = 'from __main__ import json, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("json:   %f seconds" % result)


command = 'msgpack.packb(obj.tolist())'
setup = 'from __main__ import msgpack, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("msgpack:   %f seconds" % result)

输出:

pickle         :   0.847938 seconds
cPickle        :   0.810384 seconds
cPickle highest:   0.004283 seconds
json           :   1.769215 seconds
msgpack        :   0.270886 seconds

I have tried several methods and found out that using cPickle with setting the protocol argument of the dumps method as: cPickle.dumps(obj, protocol=cPickle.HIGHEST_PROTOCOL) is the fastest dump method.

import msgpack
import json
import pickle
import timeit
import cPickle
import numpy as np

num_tests = 10

obj = np.random.normal(0.5, 1, [240, 320, 3])

command = 'pickle.dumps(obj)'
setup = 'from __main__ import pickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("pickle:  %f seconds" % result)

command = 'cPickle.dumps(obj)'
setup = 'from __main__ import cPickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("cPickle:   %f seconds" % result)


command = 'cPickle.dumps(obj, protocol=cPickle.HIGHEST_PROTOCOL)'
setup = 'from __main__ import cPickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("cPickle highest:   %f seconds" % result)

command = 'json.dumps(obj.tolist())'
setup = 'from __main__ import json, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("json:   %f seconds" % result)


command = 'msgpack.packb(obj.tolist())'
setup = 'from __main__ import msgpack, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("msgpack:   %f seconds" % result)

Output:

pickle         :   0.847938 seconds
cPickle        :   0.810384 seconds
cPickle highest:   0.004283 seconds
json           :   1.769215 seconds
msgpack        :   0.270886 seconds

回答 6

就个人而言,我通常更喜欢JSON,因为数据是人类可读的。当然,如果您需要序列化JSON不会接受的内容,则可以使用pickle。

但是对于大多数数据存储而言,您不需要序列化任何奇怪的东西,而JSON则容易得多,并且始终允许您在文本编辑器中将其弹出并自行检查数据。

速度不错,但是对于大多数数据集而言,差异可以忽略不计;无论如何,Python通常并不太快。

Personally, I generally prefer JSON because the data is human-readable. Definitely, if you need to serialize something that JSON won’t take, than use pickle.

But for most data storage, you won’t need to serialize anything weird and JSON is much easier and always allows you to pop it open in a text editor and check out the data yourself.

The speed is nice, but for most datasets the difference is negligible; Python generally isn’t too fast anyways.


如何在Python 2中发送HEAD HTTP请求?

问题:如何在Python 2中发送HEAD HTTP请求?

我在这里尝试做的是获取给定URL的标头,以便确定MIME类型。我希望能够查看是否http://somedomain/foo/将返回例如HTML文档或JPEG图像。因此,我需要弄清楚如何发送HEAD请求,以便无需下载内容就可以读取MIME类型。有人知道这样做的简单方法吗?

What I’m trying to do here is get the headers of a given URL so I can determine the MIME type. I want to be able to see if http://somedomain/foo/ will return an HTML document or a JPEG image for example. Thus, I need to figure out how to send a HEAD request so that I can read the MIME type without having to download the content. Does anyone know of an easy way of doing this?


回答 0

编辑:此答案有效,但是现在您应该使用下面其他答案中提到的请求库。


使用httplib

>>> import httplib
>>> conn = httplib.HTTPConnection("www.google.com")
>>> conn.request("HEAD", "/index.html")
>>> res = conn.getresponse()
>>> print res.status, res.reason
200 OK
>>> print res.getheaders()
[('content-length', '0'), ('expires', '-1'), ('server', 'gws'), ('cache-control', 'private, max-age=0'), ('date', 'Sat, 20 Sep 2008 06:43:36 GMT'), ('content-type', 'text/html; charset=ISO-8859-1')]

还有一个getheader(name)获取特定的标头。

edit: This answer works, but nowadays you should just use the requests library as mentioned by other answers below.


Use httplib.

>>> import httplib
>>> conn = httplib.HTTPConnection("www.google.com")
>>> conn.request("HEAD", "/index.html")
>>> res = conn.getresponse()
>>> print res.status, res.reason
200 OK
>>> print res.getheaders()
[('content-length', '0'), ('expires', '-1'), ('server', 'gws'), ('cache-control', 'private, max-age=0'), ('date', 'Sat, 20 Sep 2008 06:43:36 GMT'), ('content-type', 'text/html; charset=ISO-8859-1')]

There’s also a getheader(name) to get a specific header.


回答 1

urllib2可用于执行HEAD请求。这比使用httplib更好,因为urllib2为您解析URL,而不是要求您将URL分为主机名和路径。

>>> import urllib2
>>> class HeadRequest(urllib2.Request):
...     def get_method(self):
...         return "HEAD"
... 
>>> response = urllib2.urlopen(HeadRequest("http://google.com/index.html"))

头可以通过response.info()像以前一样使用。有趣的是,您可以找到重定向到的URL:

>>> print response.geturl()
http://www.google.com.au/index.html

urllib2 can be used to perform a HEAD request. This is a little nicer than using httplib since urllib2 parses the URL for you instead of requiring you to split the URL into host name and path.

>>> import urllib2
>>> class HeadRequest(urllib2.Request):
...     def get_method(self):
...         return "HEAD"
... 
>>> response = urllib2.urlopen(HeadRequest("http://google.com/index.html"))

Headers are available via response.info() as before. Interestingly, you can find the URL that you were redirected to:

>>> print response.geturl()
http://www.google.com.au/index.html

回答 2

强制Requests方式:

import requests

resp = requests.head("http://www.google.com")
print resp.status_code, resp.text, resp.headers

Obligatory Requests way:

import requests

resp = requests.head("http://www.google.com")
print resp.status_code, resp.text, resp.headers

回答 3

我认为也应该提及Requests库。

I believe the Requests library should be mentioned as well.


回答 4

只是:

import urllib2
request = urllib2.Request('http://localhost:8080')
request.get_method = lambda : 'HEAD'

response = urllib2.urlopen(request)
response.info().gettype()

编辑:我刚开始意识到有httplib2:D

import httplib2
h = httplib2.Http()
resp = h.request("http://www.google.com", 'HEAD')
assert resp[0]['status'] == 200
assert resp[0]['content-type'] == 'text/html'
...

连结文字

Just:

import urllib2
request = urllib2.Request('http://localhost:8080')
request.get_method = lambda : 'HEAD'

response = urllib2.urlopen(request)
response.info().gettype()

Edit: I’ve just came to realize there is httplib2 :D

import httplib2
h = httplib2.Http()
resp = h.request("http://www.google.com", 'HEAD')
assert resp[0]['status'] == 200
assert resp[0]['content-type'] == 'text/html'
...

link text


回答 5

为了完整起见,有一个与使用httplib接受的答案等效的Python3答案。

基本相同的代码只是该库不再被称为httplib而是http.client

from http.client import HTTPConnection

conn = HTTPConnection('www.google.com')
conn.request('HEAD', '/index.html')
res = conn.getresponse()

print(res.status, res.reason)

For completeness to have a Python3 answer equivalent to the accepted answer using httplib.

It is basically the same code just that the library isn’t called httplib anymore but http.client

from http.client import HTTPConnection

conn = HTTPConnection('www.google.com')
conn.request('HEAD', '/index.html')
res = conn.getresponse()

print(res.status, res.reason)

回答 6

import httplib
import urlparse

def unshorten_url(url):
    parsed = urlparse.urlparse(url)
    h = httplib.HTTPConnection(parsed.netloc)
    h.request('HEAD', parsed.path)
    response = h.getresponse()
    if response.status/100 == 3 and response.getheader('Location'):
        return response.getheader('Location')
    else:
        return url
import httplib
import urlparse

def unshorten_url(url):
    parsed = urlparse.urlparse(url)
    h = httplib.HTTPConnection(parsed.netloc)
    h.request('HEAD', parsed.path)
    response = h.getresponse()
    if response.status/100 == 3 and response.getheader('Location'):
        return response.getheader('Location')
    else:
        return url

回答 7

顺便说一句,当使用httplib时(至少在2.5.2上),尝试读取HEAD请求的响应将阻塞(在读取行中),随后失败。如果您未在响应中发出已读消息,则无法在连接上发送另一个请求,则需要打开一个新请求。或者接受两次请求之间的长时间延迟。

As an aside, when using the httplib (at least on 2.5.2), trying to read the response of a HEAD request will block (on readline) and subsequently fail. If you do not issue read on the response, you are unable to send another request on the connection, you will need to open a new one. Or accept a long delay between requests.


回答 8

我发现httplib比urllib2快一点。我为两个程序计时-一个使用httplib,另一个使用urllib2-将HEAD请求发送到10,000个URL。httplib的速度快了几分钟。 httplib的总统计为:实际6m21.334s用户0m2.124s sys 0m16.372s

的urllib2的总的统计是:实9m1.380s用户0m16.666s SYS 0m28.565s

有人对此有意见吗?

I have found that httplib is slightly faster than urllib2. I timed two programs – one using httplib and the other using urllib2 – sending HEAD requests to 10,000 URL’s. The httplib one was faster by several minutes. httplib‘s total stats were: real 6m21.334s user 0m2.124s sys 0m16.372s

And urllib2‘s total stats were: real 9m1.380s user 0m16.666s sys 0m28.565s

Does anybody else have input on this?


回答 9

还有另一种方法(类似于Pawel的答案):

import urllib2
import types

request = urllib2.Request('http://localhost:8080')
request.get_method = types.MethodType(lambda self: 'HEAD', request, request.__class__)

只是为了避免在实例级别使用无限制的方法。

And yet another approach (similar to Pawel answer):

import urllib2
import types

request = urllib2.Request('http://localhost:8080')
request.get_method = types.MethodType(lambda self: 'HEAD', request, request.__class__)

Just to avoid having unbounded methods at instance level.


回答 10

可能更容易:使用urllib或urllib2。

>>> import urllib
>>> f = urllib.urlopen('http://google.com')
>>> f.info().gettype()
'text/html'

f.info()是类似字典的对象,因此您可以执行f.info()[‘content-type’]等。

http://docs.python.org/library/urllib.html
http://docs.python.org/library/urllib2.html
http://docs.python.org/library/httplib.html

文档指出,通常不直接使用httplib。

Probably easier: use urllib or urllib2.

>>> import urllib
>>> f = urllib.urlopen('http://google.com')
>>> f.info().gettype()
'text/html'

f.info() is a dictionary-like object, so you can do f.info()[‘content-type’], etc.

http://docs.python.org/library/urllib.html
http://docs.python.org/library/urllib2.html
http://docs.python.org/library/httplib.html

The docs note that httplib is not normally used directly.


Python / Django:登录到runserver下的控制台,登录到Apache下的文件

问题:Python / Django:登录到runserver下的控制台,登录到Apache下的文件

print在之下运行Django应用程序时manage.py runserver,如何将跟踪消息发送至控制台(如),而在Apache下运行应用程序时如何将这些消息发送至日志文件?

我回顾了Django日志记录,尽管它对高级用途的灵活性和可配置性给我留下了深刻的印象,但我仍然对如何处理简单的用例感到困惑。

How can I send trace messages to the console (like print) when I’m running my Django app under manage.py runserver, but have those messages sent to a log file when I’m running the app under Apache?

I reviewed Django logging and although I was impressed with its flexibility and configurability for advanced uses, I’m still stumped with how to handle my simple use-case.


回答 0

在mod_wsgi下运行时,打印到stderr的文本将显示在httpd的错误日志中。您可以print直接使用,也可以logging改用。

print >>sys.stderr, 'Goodbye, cruel world!'

Text printed to stderr will show up in httpd’s error log when running under mod_wsgi. You can either use print directly, or use logging instead.

print >>sys.stderr, 'Goodbye, cruel world!'

回答 1

这是一个基于Django日志记录的解决方案。它使用DEBUG设置,而不是实际检查您是否在运行开发服务器,但是如果您找到一种更好的检查方法,则应该很容易进行调整。

LOGGING = {
    'version': 1,
    'formatters': {
        'verbose': {
            'format': '%(levelname)s %(asctime)s %(module)s %(process)d %(thread)d %(message)s'
        },
        'simple': {
            'format': '%(levelname)s %(message)s'
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'simple'
        },
        'file': {
            'level': 'DEBUG',
            'class': 'logging.FileHandler',
            'filename': '/path/to/your/file.log',
            'formatter': 'simple'
        },
    },
    'loggers': {
        'django': {
            'handlers': ['file'],
            'level': 'DEBUG',
            'propagate': True,
        },
    }
}

if DEBUG:
    # make all loggers use the console.
    for logger in LOGGING['loggers']:
        LOGGING['loggers'][logger]['handlers'] = ['console']

有关详细信息,请参见https://docs.djangoproject.com/en/dev/topics/logging/

Here’s a Django logging-based solution. It uses the DEBUG setting rather than actually checking whether or not you’re running the development server, but if you find a better way to check for that it should be easy to adapt.

LOGGING = {
    'version': 1,
    'formatters': {
        'verbose': {
            'format': '%(levelname)s %(asctime)s %(module)s %(process)d %(thread)d %(message)s'
        },
        'simple': {
            'format': '%(levelname)s %(message)s'
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'simple'
        },
        'file': {
            'level': 'DEBUG',
            'class': 'logging.FileHandler',
            'filename': '/path/to/your/file.log',
            'formatter': 'simple'
        },
    },
    'loggers': {
        'django': {
            'handlers': ['file'],
            'level': 'DEBUG',
            'propagate': True,
        },
    }
}

if DEBUG:
    # make all loggers use the console.
    for logger in LOGGING['loggers']:
        LOGGING['loggers'][logger]['handlers'] = ['console']

see https://docs.djangoproject.com/en/dev/topics/logging/ for details.


回答 2

您可以配置settings.py文件登录。

一个例子:

if DEBUG:
    # will output to your console
    logging.basicConfig(
        level = logging.DEBUG,
        format = '%(asctime)s %(levelname)s %(message)s',
    )
else:
    # will output to logging file
    logging.basicConfig(
        level = logging.DEBUG,
        format = '%(asctime)s %(levelname)s %(message)s',
        filename = '/my_log_file.log',
        filemode = 'a'
    )

但是,这取决于设置DEBUG,也许您不必担心它的设置方式。关于如何确定我的Django应用程序是否正在开发服务器上运行的信息,请参见此答案以更好的方式编写该条件。编辑:上面的示例来自Django 1.1项目,自该版本以来,Django中的日志记录配置已有所更改。

You can configure logging in your settings.py file.

One example:

if DEBUG:
    # will output to your console
    logging.basicConfig(
        level = logging.DEBUG,
        format = '%(asctime)s %(levelname)s %(message)s',
    )
else:
    # will output to logging file
    logging.basicConfig(
        level = logging.DEBUG,
        format = '%(asctime)s %(levelname)s %(message)s',
        filename = '/my_log_file.log',
        filemode = 'a'
    )

However that’s dependent upon setting DEBUG, and maybe you don’t want to have to worry about how it’s set up. See this answer on How can I tell whether my Django application is running on development server or not? for a better way of writing that conditional. Edit: the example above is from a Django 1.1 project, logging configuration in Django has changed somewhat since that version.


回答 3

我用这个:

logging.conf:

[loggers]
keys=root,applog
[handlers]
keys=rotateFileHandler,rotateConsoleHandler

[formatters]
keys=applog_format,console_format

[formatter_applog_format]
format=%(asctime)s-[%(levelname)-8s]:%(message)s

[formatter_console_format]
format=%(asctime)s-%(filename)s%(lineno)d[%(levelname)s]:%(message)s

[logger_root]
level=DEBUG
handlers=rotateFileHandler,rotateConsoleHandler

[logger_applog]
level=DEBUG
handlers=rotateFileHandler
qualname=simple_example

[handler_rotateFileHandler]
class=handlers.RotatingFileHandler
level=DEBUG
formatter=applog_format
args=('applog.log', 'a', 10000, 9)

[handler_rotateConsoleHandler]
class=StreamHandler
level=DEBUG
formatter=console_format
args=(sys.stdout,)

testapp.py:

import logging
import logging.config

def main():
    logging.config.fileConfig('logging.conf')
    logger = logging.getLogger('applog')

    logger.debug('debug message')
    logger.info('info message')
    logger.warn('warn message')
    logger.error('error message')
    logger.critical('critical message')
    #logging.shutdown()

if __name__ == '__main__':
    main()

I use this:

logging.conf:

[loggers]
keys=root,applog
[handlers]
keys=rotateFileHandler,rotateConsoleHandler

[formatters]
keys=applog_format,console_format

[formatter_applog_format]
format=%(asctime)s-[%(levelname)-8s]:%(message)s

[formatter_console_format]
format=%(asctime)s-%(filename)s%(lineno)d[%(levelname)s]:%(message)s

[logger_root]
level=DEBUG
handlers=rotateFileHandler,rotateConsoleHandler

[logger_applog]
level=DEBUG
handlers=rotateFileHandler
qualname=simple_example

[handler_rotateFileHandler]
class=handlers.RotatingFileHandler
level=DEBUG
formatter=applog_format
args=('applog.log', 'a', 10000, 9)

[handler_rotateConsoleHandler]
class=StreamHandler
level=DEBUG
formatter=console_format
args=(sys.stdout,)

testapp.py:

import logging
import logging.config

def main():
    logging.config.fileConfig('logging.conf')
    logger = logging.getLogger('applog')

    logger.debug('debug message')
    logger.info('info message')
    logger.warn('warn message')
    logger.error('error message')
    logger.critical('critical message')
    #logging.shutdown()

if __name__ == '__main__':
    main()

回答 4

您可以使用tagalog(https://github.com/dorkitude/tagalog)轻松完成此操作

例如,当标准python模块写入以附加模式打开的文件对象时,App Engine模块(https://github.com/dorkitude/tagalog/blob/master/tagalog_appengine.py)会覆盖此行为,而是使用logging.INFO

要在App Engine项目中获得此行为,只需执行以下操作:

import tagalog.tagalog_appengine as tagalog
tagalog.log('whatever message', ['whatever','tags'])

您可以自己扩展模块并覆盖日志功能,而不会遇到太大困难。

You can do this pretty easily with tagalog (https://github.com/dorkitude/tagalog)

For instance, while the standard python module writes to a file object opened in append mode, the App Engine module (https://github.com/dorkitude/tagalog/blob/master/tagalog_appengine.py) overrides this behavior and instead uses logging.INFO.

To get this behavior in an App Engine project, one could simply do:

import tagalog.tagalog_appengine as tagalog
tagalog.log('whatever message', ['whatever','tags'])

You could extend the module yourself and overwrite the log function without much difficulty.


回答 5

这在我的local.py中效果很好,省去了常规日志记录的麻烦:

from .settings import *

LOGGING['handlers']['console'] = {
    'level': 'DEBUG',
    'class': 'logging.StreamHandler',
    'formatter': 'verbose'
}
LOGGING['loggers']['foo.bar'] = {
    'handlers': ['console'],
    'propagate': False,
    'level': 'DEBUG',
}

This works quite well in my local.py, saves me messing up the regular logging:

from .settings import *

LOGGING['handlers']['console'] = {
    'level': 'DEBUG',
    'class': 'logging.StreamHandler',
    'formatter': 'verbose'
}
LOGGING['loggers']['foo.bar'] = {
    'handlers': ['console'],
    'propagate': False,
    'level': 'DEBUG',
}

Django:使用形式在一个模板中的多个模型

问题:Django:使用形式在一个模板中的多个模型

我正在构建一个支持票证跟踪应用程序,并希望从一个页面创建一些模型。票证通过ForeignKey属于客户。注释也通过ForeignKey属于票证。我想选择一个客户(这是一个单独的项目)或创建一个新客户,然后创建一个工单,最后创建一个分配给新工单的便笺。

由于我是Django的新手,因此我倾向于反复工作,每次尝试新功能。我玩过ModelForms,但是我想隐藏一些字段并进行一些复杂的验证。似乎我正在寻找的控制级别需要表单集或手动完成所有操作,并完成一个繁琐的手工编码模板页面,而我试图避免这种情况。

我缺少一些可爱的功能吗?有人对使用表单集有很好的参考或示例吗?我花了整个周末为他们准备API文档,但我仍然一无所知。如果我分解并手工编码所有内容,这是设计问题吗?

I’m building a support ticket tracking app and have a few models I’d like to create from one page. Tickets belong to a Customer via a ForeignKey. Notes belong to Tickets via a ForeignKey as well. I’d like to have the option of selecting a Customer (that’s a whole separate project) OR creating a new Customer, then creating a Ticket and finally creating a Note assigned to the new ticket.

Since I’m fairly new to Django, I tend to work iteratively, trying out new features each time. I’ve played with ModelForms but I want to hide some of the fields and do some complex validation. It seems like the level of control I’m looking for either requires formsets or doing everything by hand, complete with a tedious, hand-coded template page, which I’m trying to avoid.

Is there some lovely feature I’m missing? Does someone have a good reference or example for using formsets? I spent a whole weekend on the API docs for them and I’m still clueless. Is it a design issue if I break down and hand-code everything?


回答 0

使用ModelForms实在不是太难。假设您有A,B和C表单。您打印出每个表单和页面,现在需要处理POST。

if request.POST():
    a_valid = formA.is_valid()
    b_valid = formB.is_valid()
    c_valid = formC.is_valid()
    # we do this since 'and' short circuits and we want to check to whole page for form errors
    if a_valid and b_valid and c_valid:
        a = formA.save()
        b = formB.save(commit=False)
        c = formC.save(commit=False)
        b.foreignkeytoA = a
        b.save()
        c.foreignkeytoB = b
        c.save()

是用于自定义验证的文档。

This really isn’t too hard to implement with ModelForms. So lets say you have Forms A, B, and C. You print out each of the forms and the page and now you need to handle the POST.

if request.POST():
    a_valid = formA.is_valid()
    b_valid = formB.is_valid()
    c_valid = formC.is_valid()
    # we do this since 'and' short circuits and we want to check to whole page for form errors
    if a_valid and b_valid and c_valid:
        a = formA.save()
        b = formB.save(commit=False)
        c = formC.save(commit=False)
        b.foreignkeytoA = a
        b.save()
        c.foreignkeytoB = b
        c.save()

Here are the docs for custom validation.


回答 1

我一天前也处于相同的情况,这是我的2美分:

1)我可以在这里找到单一形式的多个模型输入的最短,最简洁的演示:http : //collingrady.wordpress.com/2008/02/18/editing-multiple-objects-in-django-with-newforms/

简而言之:为每个模型创建一个表单<form>,使用prefixkeyarg 将它们都提交到模板中,并进行视图句柄验证。如果存在依赖关系,只需确保在依赖关系之前保存“父”模型,并在提交“子”模型的保存之前使用父代的ID作为外键。链接中有演示。

2)也许表单集可打成了这样做,但据我的钻研,表单集主要用于输入相同的模型,它的倍数可能 /模型由外键可以任意捆绑到另一种模式。但是,似乎没有默认选项可输入多个模型的数据,而这并不是表单集的含义。

I just was in about the same situation a day ago, and here are my 2 cents:

1) I found arguably the shortest and most concise demonstration of multiple model entry in single form here: http://collingrady.wordpress.com/2008/02/18/editing-multiple-objects-in-django-with-newforms/ .

In a nutshell: Make a form for each model, submit them both to template in a single <form>, using prefix keyarg and have the view handle validation. If there is dependency, just make sure you save the “parent” model before dependant, and use parent’s ID for foreign key before commiting save of “child” model. The link has the demo.

2) Maybe formsets can be beaten into doing this, but as far as I delved in, formsets are primarily for entering multiples of the same model, which may be optionally tied to another model/models by foreign keys. However, there seem to be no default option for entering more than one model’s data and that’s not what formset seems to be meant for.


回答 2

我最近遇到了一些问题,只是想出了解决方法。假设您有三个类,Primary,B,C,并且B,C具有Primary的外键

    class PrimaryForm(ModelForm):
        class Meta:
            model = Primary

    class BForm(ModelForm):
        class Meta:
            model = B
            exclude = ('primary',)

    class CForm(ModelForm):
         class Meta:
            model = C
            exclude = ('primary',)

    def generateView(request):
        if request.method == 'POST': # If the form has been submitted...
            primary_form = PrimaryForm(request.POST, prefix = "primary")
            b_form = BForm(request.POST, prefix = "b")
            c_form = CForm(request.POST, prefix = "c")
            if primary_form.is_valid() and b_form.is_valid() and c_form.is_valid(): # All validation rules pass
                    print "all validation passed"
                    primary = primary_form.save()
                    b_form.cleaned_data["primary"] = primary
                    b = b_form.save()
                    c_form.cleaned_data["primary"] = primary
                    c = c_form.save()
                    return HttpResponseRedirect("/viewer/%s/" % (primary.name))
            else:
                    print "failed"

        else:
            primary_form = PrimaryForm(prefix = "primary")
            b_form = BForm(prefix = "b")
            c_form = Form(prefix = "c")
     return render_to_response('multi_model.html', {
     'primary_form': primary_form,
     'b_form': b_form,
     'c_form': c_form,
      })

此方法应允许您执行所需的任何验证,以及在同一页面上生成所有三个对象。我还使用了JavaScript和隐藏字段来允许在同一页面上生成多个B,C对象。

I very recently had the some problem and just figured out how to do this. Assuming you have three classes, Primary, B, C and that B,C have a foreign key to primary

    class PrimaryForm(ModelForm):
        class Meta:
            model = Primary

    class BForm(ModelForm):
        class Meta:
            model = B
            exclude = ('primary',)

    class CForm(ModelForm):
         class Meta:
            model = C
            exclude = ('primary',)

    def generateView(request):
        if request.method == 'POST': # If the form has been submitted...
            primary_form = PrimaryForm(request.POST, prefix = "primary")
            b_form = BForm(request.POST, prefix = "b")
            c_form = CForm(request.POST, prefix = "c")
            if primary_form.is_valid() and b_form.is_valid() and c_form.is_valid(): # All validation rules pass
                    print "all validation passed"
                    primary = primary_form.save()
                    b_form.cleaned_data["primary"] = primary
                    b = b_form.save()
                    c_form.cleaned_data["primary"] = primary
                    c = c_form.save()
                    return HttpResponseRedirect("/viewer/%s/" % (primary.name))
            else:
                    print "failed"

        else:
            primary_form = PrimaryForm(prefix = "primary")
            b_form = BForm(prefix = "b")
            c_form = Form(prefix = "c")
     return render_to_response('multi_model.html', {
     'primary_form': primary_form,
     'b_form': b_form,
     'c_form': c_form,
      })

This method should allow you to do whatever validation you require, as well as generating all three objects on the same page. I have also used javascript and hidden fields to allow the generation of multiple B,C objects on the same page.


回答 3

来自的MultiModelFormdjango-betterforms是一个方便的包装程序,可以执行Gnudiff的答案中所述。它将regulars包装ModelForm在单个类中,该类透明(至少对于基本用法而言)用作单个形式。我从下面的文档中复制了一个示例。

# forms.py
from django import forms
from django.contrib.auth import get_user_model
from betterforms.multiform import MultiModelForm
from .models import UserProfile

User = get_user_model()

class UserEditForm(forms.ModelForm):
    class Meta:
        fields = ('email',)

class UserProfileForm(forms.ModelForm):
    class Meta:
        fields = ('favorite_color',)

class UserEditMultiForm(MultiModelForm):
    form_classes = {
        'user': UserEditForm,
        'profile': UserProfileForm,
    }

# views.py
from django.views.generic import UpdateView
from django.core.urlresolvers import reverse_lazy
from django.shortcuts import redirect
from django.contrib.auth import get_user_model
from .forms import UserEditMultiForm

User = get_user_model()

class UserSignupView(UpdateView):
    model = User
    form_class = UserEditMultiForm
    success_url = reverse_lazy('home')

    def get_form_kwargs(self):
        kwargs = super(UserSignupView, self).get_form_kwargs()
        kwargs.update(instance={
            'user': self.object,
            'profile': self.object.profile,
        })
        return kwargs

The MultiModelForm from django-betterforms is a convenient wrapper to do what is described in Gnudiff’s answer. It wraps regular ModelForms in a single class which is transparently (at least for basic usage) used as a single form. I’ve copied an example from their docs below.

# forms.py
from django import forms
from django.contrib.auth import get_user_model
from betterforms.multiform import MultiModelForm
from .models import UserProfile

User = get_user_model()

class UserEditForm(forms.ModelForm):
    class Meta:
        fields = ('email',)

class UserProfileForm(forms.ModelForm):
    class Meta:
        fields = ('favorite_color',)

class UserEditMultiForm(MultiModelForm):
    form_classes = {
        'user': UserEditForm,
        'profile': UserProfileForm,
    }

# views.py
from django.views.generic import UpdateView
from django.core.urlresolvers import reverse_lazy
from django.shortcuts import redirect
from django.contrib.auth import get_user_model
from .forms import UserEditMultiForm

User = get_user_model()

class UserSignupView(UpdateView):
    model = User
    form_class = UserEditMultiForm
    success_url = reverse_lazy('home')

    def get_form_kwargs(self):
        kwargs = super(UserSignupView, self).get_form_kwargs()
        kwargs.update(instance={
            'user': self.object,
            'profile': self.object.profile,
        })
        return kwargs

回答 4

我目前有一个解决方法功能(它通过了我的单元测试)。当您只想从其他模型中添加有限数量的字段时,这是一个很好的解决方案。

我在这里想念什么吗?

class UserProfileForm(ModelForm):
    def __init__(self, instance=None, *args, **kwargs):
        # Add these fields from the user object
        _fields = ('first_name', 'last_name', 'email',)
        # Retrieve initial (current) data from the user object
        _initial = model_to_dict(instance.user, _fields) if instance is not None else {}
        # Pass the initial data to the base
        super(UserProfileForm, self).__init__(initial=_initial, instance=instance, *args, **kwargs)
        # Retrieve the fields from the user model and update the fields with it
        self.fields.update(fields_for_model(User, _fields))

    class Meta:
        model = UserProfile
        exclude = ('user',)

    def save(self, *args, **kwargs):
        u = self.instance.user
        u.first_name = self.cleaned_data['first_name']
        u.last_name = self.cleaned_data['last_name']
        u.email = self.cleaned_data['email']
        u.save()
        profile = super(UserProfileForm, self).save(*args,**kwargs)
        return profile

I currently have a workaround functional (it passes my unit tests). It is a good solution to my opinion when you only want to add a limited number of fields from other models.

Am I missing something here ?

class UserProfileForm(ModelForm):
    def __init__(self, instance=None, *args, **kwargs):
        # Add these fields from the user object
        _fields = ('first_name', 'last_name', 'email',)
        # Retrieve initial (current) data from the user object
        _initial = model_to_dict(instance.user, _fields) if instance is not None else {}
        # Pass the initial data to the base
        super(UserProfileForm, self).__init__(initial=_initial, instance=instance, *args, **kwargs)
        # Retrieve the fields from the user model and update the fields with it
        self.fields.update(fields_for_model(User, _fields))

    class Meta:
        model = UserProfile
        exclude = ('user',)

    def save(self, *args, **kwargs):
        u = self.instance.user
        u.first_name = self.cleaned_data['first_name']
        u.last_name = self.cleaned_data['last_name']
        u.email = self.cleaned_data['email']
        u.save()
        profile = super(UserProfileForm, self).save(*args,**kwargs)
        return profile

回答 5

“我想隐藏一些字段并进行一些复杂的验证。”

我从内置的管理界面开始。

  1. 构建ModelForm以显示所需的字段。

  2. 用表单中的验证规则扩展表单。通常这是一种clean方法。

    确保这部分工作正常。

完成此操作后,您可以离开内置的管理界面。

然后,您可以在单个网页上四处查找与部分相关的表单。这是一堆模板材料,可在一个页面上显示所有表单。

然后,您必须编写视图函数来读取和验证各种表单内容,并执行各种对象saves()。

“如果我分解并手动编码所有内容,这是设计问题吗?” 不,只是很多时间而没有太多好处。

“I want to hide some of the fields and do some complex validation.”

I start with the built-in admin interface.

  1. Build the ModelForm to show the desired fields.

  2. Extend the Form with the validation rules within the form. Usually this is a clean method.

    Be sure this part works reasonably well.

Once this is done, you can move away from the built-in admin interface.

Then you can fool around with multiple, partially related forms on a single web page. This is a bunch of template stuff to present all the forms on a single page.

Then you have to write the view function to read and validated the various form things and do the various object saves().

“Is it a design issue if I break down and hand-code everything?” No, it’s just a lot of time for not much benefit.


回答 6

根据Django文档,内联表单集用于此目的:“内联表单集是模型表单集之上的一个小抽象层。它们简化了通过外键处理相关对象的情况”。

参见https://docs.djangoproject.com/en/dev/topics/forms/modelforms/#inline-formsets

According to Django documentation, inline formsets are for this purpose: “Inline formsets is a small abstraction layer on top of model formsets. These simplify the case of working with related objects via a foreign key”.

See https://docs.djangoproject.com/en/dev/topics/forms/modelforms/#inline-formsets


在sudo下运行pip install是否可以接受并且安全?

问题:在sudo下运行pip install是否可以接受并且安全?

我已经开始使用Mac来安装Python软件包,就像在工作中使用Windows PC一样。但是,在Mac上,我在写入日志文件或站点程序包时经常遇到权限被拒绝的错误。

因此,我考虑过pip install <package>sudosudo 下运行,但是考虑到我只是想将其安装在当前用户帐户下,是否安全/可接受地使用sudo?

日志文件I / O错误的示例回溯:

Command /usr/bin/python -c "import setuptools;__file__='/Users/markwalker/build/pycrypto/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --single-version-externally-managed --record /var/folders/tq/hy1fz_4j27v6rstzzw4vymnr0000gp/T/pip-k6f2FU-record/install-record.txt failed with error code 1 in /Users/markwalker/build/pycrypto
Storing complete log in /Users/markwalker/Library/Logs/pip.log
Traceback (most recent call last):
  File "/usr/local/bin/pip", line 8, in <module>
    load_entry_point('pip==1.1', 'console_scripts', 'pip')()
  File "/Library/Python/2.7/site-packages/pip-1.1-py2.7.egg/pip/__init__.py", line 116, in main
    return command.main(args[1:], options)
  File "/Library/Python/2.7/site-packages/pip-1.1-py2.7.egg/pip/basecommand.py", line 141, in main
    log_fp = open_logfile(log_fn, 'w')
  File "/Library/Python/2.7/site-packages/pip-1.1-py2.7.egg/pip/basecommand.py", line 168, in open_logfile
    log_fp = open(filename, mode)
IOError: [Errno 13] Permission denied: '/Users/markwalker/Library/Logs/pip.log'

更新 这可能取决于权限,但是最好的方法是为您的python项目使用虚拟环境。sudo pip除非绝对必要,否则应避免运行。

I’ve started to use my Mac to install Python packages in the same way I do with my Windows PC at work; however on my Mac I’ve come across frequent permission denied errors while writing to log files or site-packages.

Therefore I thought about running pip install <package> under sudo but is that a safe/acceptable use of sudo considering I’m just wanting this to be installed under my current user account?

Example traceback from a logfile I/O error:

Command /usr/bin/python -c "import setuptools;__file__='/Users/markwalker/build/pycrypto/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --single-version-externally-managed --record /var/folders/tq/hy1fz_4j27v6rstzzw4vymnr0000gp/T/pip-k6f2FU-record/install-record.txt failed with error code 1 in /Users/markwalker/build/pycrypto
Storing complete log in /Users/markwalker/Library/Logs/pip.log
Traceback (most recent call last):
  File "/usr/local/bin/pip", line 8, in <module>
    load_entry_point('pip==1.1', 'console_scripts', 'pip')()
  File "/Library/Python/2.7/site-packages/pip-1.1-py2.7.egg/pip/__init__.py", line 116, in main
    return command.main(args[1:], options)
  File "/Library/Python/2.7/site-packages/pip-1.1-py2.7.egg/pip/basecommand.py", line 141, in main
    log_fp = open_logfile(log_fn, 'w')
  File "/Library/Python/2.7/site-packages/pip-1.1-py2.7.egg/pip/basecommand.py", line 168, in open_logfile
    log_fp = open(filename, mode)
IOError: [Errno 13] Permission denied: '/Users/markwalker/Library/Logs/pip.log'

Update This was likely down to permissions, however the best approach is to use virtual environments for your python projects. Running sudo pip should be avoided unless absolutely necessary.


回答 0

使用虚拟环境

$ virtualenv myenv
.. some output ..
$ source myenv/bin/activate
(myenv) $ pip install what-i-want

sudo当您要为全局的系统级Python安装安装内容时,才使用或提升权限。

最好使用虚拟环境为您隔离软件包。这样一来,您就可以畅玩而不会污染全局python安装。

另外,virtualenv不需要提升的权限。

Use a virtual environment:

$ virtualenv myenv
.. some output ..
$ source myenv/bin/activate
(myenv) $ pip install what-i-want

You only use sudo or elevated permissions when you want to install stuff for the global, system-wide Python installation.

It is best to use a virtual environment which isolates packages for you. That way you can play around without polluting the global python install.

As a bonus, virtualenv does not need elevated permissions.


回答 1

它是可接受的安全运行pip installsudo

它不安全并且被皱着眉头–请参阅运行“ sudo pip”有什么风险? 要在主目录中安装Python软件包,您不需要root特权。见描述--user选项点子。

Is it acceptable & safe to run pip install under sudo?

It’s not safe and it’s being frowned upon – see What are the risks of running ‘sudo pip’? To install Python package in your home directory you don’t need root privileges. See description of --user option to pip.


回答 2

您最初的问题是pip无法将日志写入文件夹。

IOError: [Errno 13] Permission denied: '/Users/markwalker/Library/Logs/pip.log'

您需要将cd放入一个文件夹,在该文件夹中,调用的进程可以像/tmp这样写,cd /tmp然后重新调用该命令可能会起作用,但这不是您想要的。

实际上对于这种特殊情况(您不希望sudo用于安装python软件包)并且不需要全局软件包安装,可以使用如下--user标记:

pip install --user <packagename>

它会很好地工作。

我假设您具有一个用户python python安装,并且不想打扰有关virtualenv(不是很用户友好)或pipenv的阅读

正如评论部分中的某些人指出的那样,除非您不知道该怎么办并陷入困境,否则下一个方法不是一个好主意:

针对全局包的另一种方法例如您要执行的操作:

chown -R $USER /Library/Python/2.7/site-packages/

或更一般地

chown -R $USER <path to your global pip packages>

Your original problem is that pip cannot write the logs to the folder.

IOError: [Errno 13] Permission denied: '/Users/markwalker/Library/Logs/pip.log'

You need to cd into a folder in which the process invoked can write like /tmp so a cd /tmp and re invoking the command will probably work but is not what you want.

BUT actually for this particular case (you not wanting to use sudo for installing python packages) and no need for global package installs you can use the --user flag like this :

pip install --user <packagename>

and it will work just fine.

I assume you have a one user python python installation and do not want to bother with reading about virtualenv (which is not very userfriendly) or pipenv.

As some people in the comments section have pointed out the next approach is not a very good idea unless you do not know what to do and got stuck:

Another approach for global packages like in your case you want to do something like :

chown -R $USER /Library/Python/2.7/site-packages/

or more generally

chown -R $USER <path to your global pip packages>

回答 3

因为我遇到了同样的问题,所以我想强调一下,布莱恩·凯恩的第一条评论实际上是“ IOError:[Errno 13]”问题的解决方案:

如果在temp目录(cd /tmp)中执行,那么如果我运行IOError就不会再发生sudo pip install foo

Because I had the same problem, I want to stress that actually the first comment by Brian Cain is the solution to the “IOError: [Errno 13]”-problem:

If executed in the temp directory (cd /tmp), the IOError does not occur anymore if I run sudo pip install foo.


回答 4

virtualenvwrapper成功安装后,我在安装时遇到问题virtualenv

执行此操作后,我的终端抱怨:

pip install virtualenvwrapper

因此,我尝试失败(不推荐)

sudo pip install virtualenvwrapper

然后,我成功安装了它:

pip install --user virtualenvwrapper

I had a problem installing virtualenvwrapper after successfully installing virtualenv.

My terminal complained after I did this:

pip install virtualenvwrapper

So, I unsuccessfully tried this (NOT RECOMMENDED):

sudo pip install virtualenvwrapper

Then, I successfully installed it with this:

pip install --user virtualenvwrapper

回答 5

看来您的权限搞砸了。键入chown -R markwalker ~在终端和尝试pip一次?让我知道您是否已排序。

It looks like your permissions are messed up. Type chown -R markwalker ~ in the Terminal and try pip again? Let me know if you’re sorted.


有趣好用的Python教程

退出移动版
微信支付
请使用 微信 扫码支付