在Python中转义HTML的最简单方法是什么?

问题:在Python中转义HTML的最简单方法是什么?

cgi.escape似乎是一种可能的选择。它运作良好吗?有什么更好的东西吗?

cgi.escape seems like one possible choice. Does it work well? Is there something that is considered better?


回答 0

cgi.escape很好 它逃脱了:

  • <&lt;
  • >&gt;
  • &&amp;

对于所有HTML而言,这就足够了。

编辑:如果您有非ASCII字符,您还想转义,以便包含在使用不同编码的另一个编码文档中,如Craig所说,只需使用:

data.encode('ascii', 'xmlcharrefreplace')

不要忘了解码dataunicode第一,使用任何编码它编码的。

但是根据我的经验,如果您unicode从头开始一直都在工作,那么这种编码是没有用的。只需在文档头中指定的编码末尾进行编码(utf-8以实现最大兼容性)。

例:

>>> cgi.escape(u'<a>bá</a>').encode('ascii', 'xmlcharrefreplace')
'&lt;a&gt;b&#225;&lt;/a&gt;

另外值得一提的(感谢Greg)是额外的quote参数cgi.escape。将其设置为时Truecgi.escape还转义双引号字符("),因此您可以在XML / HTML属性中使用结果值。

编辑:请注意,cgi.escape已在Python 3.2中弃用,转而使用html.escape,它的功能相同,但quote默认情况下为True。

cgi.escape is fine. It escapes:

  • < to &lt;
  • > to &gt;
  • & to &amp;

That is enough for all HTML.

EDIT: If you have non-ascii chars you also want to escape, for inclusion in another encoded document that uses a different encoding, like Craig says, just use:

data.encode('ascii', 'xmlcharrefreplace')

Don’t forget to decode data to unicode first, using whatever encoding it was encoded.

However in my experience that kind of encoding is useless if you just work with unicode all the time from start. Just encode at the end to the encoding specified in the document header (utf-8 for maximum compatibility).

Example:

>>> cgi.escape(u'<a>bá</a>').encode('ascii', 'xmlcharrefreplace')
'&lt;a&gt;b&#225;&lt;/a&gt;

Also worth of note (thanks Greg) is the extra quote parameter cgi.escape takes. With it set to True, cgi.escape also escapes double quote chars (") so you can use the resulting value in a XML/HTML attribute.

EDIT: Note that cgi.escape has been deprecated in Python 3.2 in favor of html.escape, which does the same except that quote defaults to True.


回答 1

在Python 3.2中,新 html,引入模块,该模块用于从HTML标记转义保留字符。

它具有一个功能escape()

>>> import html
>>> html.escape('x > 2 && x < 7 single quote: \' double quote: "')
'x &gt; 2 &amp;&amp; x &lt; 7 single quote: &#x27; double quote: &quot;'

In Python 3.2 a new html module was introduced, which is used for escaping reserved characters from HTML markup.

It has one function escape():

>>> import html
>>> html.escape('x > 2 && x < 7 single quote: \' double quote: "')
'x &gt; 2 &amp;&amp; x &lt; 7 single quote: &#x27; double quote: &quot;'

回答 2

如果您希望在URL中转义HTML:

这可能不是OP想要的(问题并没有明确指出转义是在哪种上下文中使用的),但是Python的本机库urllib有一种方法可以转义需要安全包含在URL中的HTML实体。

以下是一个示例:

#!/usr/bin/python
from urllib import quote

x = '+<>^&'
print quote(x) # prints '%2B%3C%3E%5E%26'

在这里找到文件

If you wish to escape HTML in a URL:

This is probably NOT what the OP wanted (the question doesn’t clearly indicate in which context the escaping is meant to be used), but Python’s native library urllib has a method to escape HTML entities that need to be included in a URL safely.

The following is an example:

#!/usr/bin/python
from urllib import quote

x = '+<>^&'
print quote(x) # prints '%2B%3C%3E%5E%26'

Find docs here


回答 3

还有出色的markupsafe软件包

>>> from markupsafe import Markup, escape
>>> escape("<script>alert(document.cookie);</script>")
Markup(u'&lt;script&gt;alert(document.cookie);&lt;/script&gt;')

markupsafe程序包经过精心设计,并且可能是逃避转义的最通用,最Python化的方法,恕我直言,因为:

  1. return(Markup)是从unicode派生的类(即isinstance(escape('str'), unicode) == True
  2. 它可以正确处理unicode输入
  3. 它适用于Python(2.6、2.7、3.3和pypy)
  4. 它尊重对象(即具有__html__属性的对象)和模板重载(__html_format__)的自定义方法。

There is also the excellent markupsafe package.

>>> from markupsafe import Markup, escape
>>> escape("<script>alert(document.cookie);</script>")
Markup(u'&lt;script&gt;alert(document.cookie);&lt;/script&gt;')

The markupsafe package is well engineered, and probably the most versatile and Pythonic way to go about escaping, IMHO, because:

  1. the return (Markup) is a class derived from unicode (i.e. isinstance(escape('str'), unicode) == True
  2. it properly handles unicode input
  3. it works in Python (2.6, 2.7, 3.3, and pypy)
  4. it respects custom methods of objects (i.e. objects with a __html__ property) and template overloads (__html_format__).

回答 4

cgi.escape 从转义HTML标记和字符实体的有限意义上讲,应该可以逃脱HTML。

但是,您可能还必须考虑编码问题:如果要引用的HTML在特定的编码中包含非ASCII字符,那么还必须注意在引用时要合理地表示这些字符。也许您可以将它们转换为实体。否则,您应确保在“源” HTML和嵌入页面之间进行正确的编码转换,以避免损坏非ASCII字符。

cgi.escape should be good to escape HTML in the limited sense of escaping the HTML tags and character entities.

But you might have to also consider encoding issues: if the HTML you want to quote has non-ASCII characters in a particular encoding, then you would also have to take care that you represent those sensibly when quoting. Perhaps you could convert them to entities. Otherwise you should ensure that the correct encoding translations are done between the “source” HTML and the page it’s embedded in, to avoid corrupting the non-ASCII characters.


回答 5

没有库,纯python,可以安全地将文本转义为html文本:

text.replace('&', '&amp;').replace('>', '&gt;').replace('<', '&lt;'
        ).encode('ascii', 'xmlcharrefreplace')

No libraries, pure python, safely escapes text into html text:

text.replace('&', '&amp;').replace('>', '&gt;').replace('<', '&lt;'
        ).encode('ascii', 'xmlcharrefreplace')

回答 6

cgi.escape 扩展的

此版本进行了改进cgi.escape。它还保留空格和换行符。返回一个unicode字符串。

def escape_html(text):
    """escape strings for display in HTML"""
    return cgi.escape(text, quote=True).\
           replace(u'\n', u'<br />').\
           replace(u'\t', u'&emsp;').\
           replace(u'  ', u' &nbsp;')

例如

>>> escape_html('<foo>\nfoo\t"bar"')
u'&lt;foo&gt;<br />foo&emsp;&quot;bar&quot;'

cgi.escape extended

This version improves cgi.escape. It also preserves whitespace and newlines. Returns a unicode string.

def escape_html(text):
    """escape strings for display in HTML"""
    return cgi.escape(text, quote=True).\
           replace(u'\n', u'<br />').\
           replace(u'\t', u'&emsp;').\
           replace(u'  ', u' &nbsp;')

for example

>>> escape_html('<foo>\nfoo\t"bar"')
u'&lt;foo&gt;<br />foo&emsp;&quot;bar&quot;'

回答 7

不是最简单的方法,但仍然很简单。与cgi.escape模块的主要区别-如果您已经&amp;在文本中使用了它,它仍然可以正常工作。从评论中可以看到:

cgi.escape版本

def escape(s, quote=None):
    '''Replace special characters "&", "<" and ">" to HTML-safe sequences.
    If the optional flag quote is true, the quotation mark character (")
is also translated.'''
    s = s.replace("&", "&amp;") # Must be done first!
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s

正则表达式版本

QUOTE_PATTERN = r"""([&<>"'])(?!(amp|lt|gt|quot|#39);)"""
def escape(word):
    """
    Replaces special characters <>&"' to HTML-safe sequences. 
    With attention to already escaped characters.
    """
    replace_with = {
        '<': '&gt;',
        '>': '&lt;',
        '&': '&amp;',
        '"': '&quot;', # should be escaped in attributes
        "'": '&#39'    # should be escaped in attributes
    }
    quote_pattern = re.compile(QUOTE_PATTERN)
    return re.sub(quote_pattern, lambda x: replace_with[x.group(0)], word)

Not the easiest way, but still straightforward. The main difference from cgi.escape module – it still will work properly if you already have &amp; in your text. As you see from comments to it:

cgi.escape version

def escape(s, quote=None):
    '''Replace special characters "&", "<" and ">" to HTML-safe sequences.
    If the optional flag quote is true, the quotation mark character (")
is also translated.'''
    s = s.replace("&", "&amp;") # Must be done first!
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s

regex version

QUOTE_PATTERN = r"""([&<>"'])(?!(amp|lt|gt|quot|#39);)"""
def escape(word):
    """
    Replaces special characters <>&"' to HTML-safe sequences. 
    With attention to already escaped characters.
    """
    replace_with = {
        '<': '&gt;',
        '>': '&lt;',
        '&': '&amp;',
        '"': '&quot;', # should be escaped in attributes
        "'": '&#39'    # should be escaped in attributes
    }
    quote_pattern = re.compile(QUOTE_PATTERN)
    return re.sub(quote_pattern, lambda x: replace_with[x.group(0)], word)

回答 8

对于Python 2.7中的旧代码,可以通过BeautifulSoup4做到:

>>> bs4.dammit import EntitySubstitution
>>> esub = EntitySubstitution()
>>> esub.substitute_html("r&d")
'r&amp;d'

For legacy code in Python 2.7, can do it via BeautifulSoup4:

>>> bs4.dammit import EntitySubstitution
>>> esub = EntitySubstitution()
>>> esub.substitute_html("r&d")
'r&amp;d'

创建简单的python网络服务的最佳方法

问题:创建简单的python网络服务的最佳方法

我已经使用python多年了,但是我对python Web编程的经验很少。我想创建一个非常简单的Web服务,该服务公开来自现有python脚本的一些功能以供公司使用。它可能会在csv中返回结果。什么是最快的方法?如果它影响您的建议,那么我很可能会在此之后添加更多功能。

I’ve been using python for years, but I have little experience with python web programming. I’d like to create a very simple web service that exposes some functionality from an existing python script for use within my company. It will likely return the results in csv. What’s the quickest way to get something up? If it affects your suggestion, I will likely be adding more functionality to this, down the road.


回答 0

看看werkzeug。Werkzeug最初是WSGI应用程序各种实用程序的简单集合,现已成为最高级的WSGI实用程序模块之一。它包括功能强大的调试器,功能齐全的请求和响应对象,用于处理实体标签的HTTP实用程序,高速缓存控制标头,HTTP日期,cookie处理,文件上传,功能强大的URL路由系统和一堆由社区提供的附加模块。

它包括许多可与http配合使用的出色工具,并且具有可以在不同环境(cgi,fcgi,apache / mod_wsgi或用于调试的普通简单python服务器)中与wsgi一起使用的优点。

Have a look at werkzeug. Werkzeug started as a simple collection of various utilities for WSGI applications and has become one of the most advanced WSGI utility modules. It includes a powerful debugger, full featured request and response objects, HTTP utilities to handle entity tags, cache control headers, HTTP dates, cookie handling, file uploads, a powerful URL routing system and a bunch of community contributed addon modules.

It includes lots of cool tools to work with http and has the advantage that you can use it with wsgi in different environments (cgi, fcgi, apache/mod_wsgi or with a plain simple python server for debugging).


回答 1

web.py可能是最简单的Web框架。“裸” CGI比较简单,但是在提供一项实际上可以做某事的服务时,您完全是一个人。

“你好,世界!” 根据web.py是不是比裸CGI版本更长的时间,但它增加了URL映射,HTTP命令的区别,和查询参数解析为免费

import web

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())

class hello:        
    def GET(self, name):
        if not name: 
            name = 'world'
        return 'Hello, ' + name + '!'

if __name__ == "__main__":
    app.run()

web.py is probably the simplest web framework out there. “Bare” CGI is simpler, but you’re completely on your own when it comes to making a service that actually does something.

“Hello, World!” according to web.py isn’t much longer than an bare CGI version, but it adds URL mapping, HTTP command distinction, and query parameter parsing for free:

import web

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())

class hello:        
    def GET(self, name):
        if not name: 
            name = 'world'
        return 'Hello, ' + name + '!'

if __name__ == "__main__":
    app.run()

回答 2

在线获取Python脚本的最简单方法是使用CGI:

#!/usr/bin/python

print "Content-type: text/html"
print

print "<p>Hello world.</p>"

将该代码放入驻留在Web服务器CGI目录中的脚本中,使其可执行并运行。cgi当您需要接受用户的参数时,该模块具有许多有用的实用程序。

The simplest way to get a Python script online is to use CGI:

#!/usr/bin/python

print "Content-type: text/html"
print

print "<p>Hello world.</p>"

Put that code in a script that lives in your web server CGI directory, make it executable, and run it. The cgi module has a number of useful utilities when you need to accept parameters from the user.


回答 3

原始CGI有点痛苦,Django有点重量级。关于它,有许多更简单,更轻便的框架,例如CherryPy。值得一看。

Raw CGI is kind of a pain, Django is kind of heavyweight. There are a number of simpler, lighter frameworks about, e.g. CherryPy. It’s worth looking around a bit.


回答 4

看一下WSGI参考实现。您已经在Python库中拥有它。这很简单。

Look at the WSGI reference implementation. You already have it in your Python libraries. It’s quite simple.


回答 5

如果您指的是“ Web服务”,则其他程序SimpleXMLRPCServer可能访问了某些 适合您的东西。自版本2.2起,所有Python安装中均包含该功能。

对于简单的人类可访问的东西,我通常使用Python的SimpleHTTPServer,它也随每次安装一起提供。显然,您还可以通过客户端程序访问SimpleHTTPServer。

If you mean with “Web Service” something accessed by other Programms SimpleXMLRPCServer might be right for you. It is included with every Python install since Version 2.2.

For Simple human accessible things I usually use Pythons SimpleHTTPServer which also comes with every install. Obviously you also could access SimpleHTTPServer by client programs.


回答 6

如果您拥有一个好的Web框架,生活将会很简单。Django中的Web服务很简单。定义模型,编写返回CSV文档的视图函数。跳过模板。

Life is simple if you get a good web framework. Web services in Django are easy. Define your model, write view functions that return your CSV documents. Skip the templates.


回答 7

如果您用SOAP / WSDL表示“ Web服务”,则可能需要看一下 使用Python和SOAPpy生成WSDL

If you mean “web service” in SOAP/WSDL sense, you might want to look at Generating a WSDL using Python and SOAPpy


回答 8

也许是扭曲的 http://twistedmatrix.com/trac/


python中的属性文件(类似于Java属性)

问题:python中的属性文件(类似于Java属性)

给定以下格式(.properties.ini):

propertyName1=propertyValue1
propertyName2=propertyValue2
...
propertyNameN=propertyValueN

对于Java,有一个Properties类,该类提供了解析/与上述格式交互的功能。

python标准库(2.x)中有类似的东西吗?

如果没有,我还有什么其他选择?

Given the following format (.properties or .ini):

propertyName1=propertyValue1
propertyName2=propertyValue2
...
propertyNameN=propertyValueN

For Java there is the Properties class that offers functionality to parse / interact with the above format.

Is there something similar in python‘s standard library (2.x) ?

If not, what other alternatives do I have ?


回答 0

对于.ini文件,有一个ConfigParser模块,它提供与.ini文件兼容的格式。

无论如何,没有任何可用于解析完整的.properties文件的文件,当我必须这样做时,我只是使用jython(我在谈论脚本)。

For .ini files there is the ConfigParser module that provides a format compatible with .ini files.

Anyway there’s nothing available for parsing complete .properties files, when I have to do that I simply use jython (I’m talking about scripting).


回答 1

我能够使用它ConfigParser,没有人显示如何执行此操作的任何示例,因此这里是属性文件的简单python阅读器和属性文件的示例。请注意,扩展名仍然.properties,但是我必须添加一个节标题,类似于您在.ini文件中看到的标题……有点混蛋,但是可以工作。

python文件: PythonPropertyReader.py

#!/usr/bin/python    
import ConfigParser
config = ConfigParser.RawConfigParser()
config.read('ConfigFile.properties')

print config.get('DatabaseSection', 'database.dbname');

属性文件: ConfigFile.properties

[DatabaseSection]
database.dbname=unitTest
database.user=root
database.password=

有关更多功能,请阅读:https : //docs.python.org/2/library/configparser.html

I was able to get this to work with ConfigParser, no one showed any examples on how to do this, so here is a simple python reader of a property file and example of the property file. Note that the extension is still .properties, but I had to add a section header similar to what you see in .ini files… a bit of a bastardization, but it works.

The python file: PythonPropertyReader.py

#!/usr/bin/python    
import ConfigParser
config = ConfigParser.RawConfigParser()
config.read('ConfigFile.properties')

print config.get('DatabaseSection', 'database.dbname');

The property file: ConfigFile.properties

[DatabaseSection]
database.dbname=unitTest
database.user=root
database.password=

For more functionality, read: https://docs.python.org/2/library/configparser.html


回答 2

Java属性文件通常也是有效的python代码。您可以将myconfig.properties文件重命名为myconfig.py。然后像这样导入文件

import myconfig

并直接访问属性

print myconfig.propertyName1

A java properties file is often valid python code as well. You could rename your myconfig.properties file to myconfig.py. Then just import your file, like this

import myconfig

and access the properties directly

print myconfig.propertyName1

回答 3

我知道这是一个非常老的问题,但是我现在需要它,因此我决定实现自己的解决方案,一个纯python解决方案,它涵盖了大多数用例(不是全部):

def load_properties(filepath, sep='=', comment_char='#'):
    """
    Read the file passed as parameter as a properties file.
    """
    props = {}
    with open(filepath, "rt") as f:
        for line in f:
            l = line.strip()
            if l and not l.startswith(comment_char):
                key_value = l.split(sep)
                key = key_value[0].strip()
                value = sep.join(key_value[1:]).strip().strip('"') 
                props[key] = value 
    return props

您可以sep将’:’ 更改为以下格式的文件:

key : value

该代码可以正确解析以下行:

url = "http://my-host.com"
name = Paul = Pablo
# This comment line will be ignored

您将获得以下命令:

{"url": "http://my-host.com", "name": "Paul = Pablo" }

I know that this is a very old question, but I need it just now and I decided to implement my own solution, a pure python solution, that covers most uses cases (not all):

def load_properties(filepath, sep='=', comment_char='#'):
    """
    Read the file passed as parameter as a properties file.
    """
    props = {}
    with open(filepath, "rt") as f:
        for line in f:
            l = line.strip()
            if l and not l.startswith(comment_char):
                key_value = l.split(sep)
                key = key_value[0].strip()
                value = sep.join(key_value[1:]).strip().strip('"') 
                props[key] = value 
    return props

You can change the sep to ‘:’ to parse files with format:

key : value

The code parses correctly lines like:

url = "http://my-host.com"
name = Paul = Pablo
# This comment line will be ignored

You’ll get a dict with:

{"url": "http://my-host.com", "name": "Paul = Pablo" }

回答 4

如果可以选择文件格式,我建议使用.ini和Python的ConfigParser,如上所述。如果您需要与Java .properties文件兼容,那么我已经为它编写了一个名为jprops的库。我们使用的是pyjavaproperties,但是遇到各种限制后,我最终实现了自己的。它完全支持.properties格式,包括unicode支持和对转义序列的更好支持。Jprops还可以解析任何类似文件的对象,而pyjavaproperties仅适用于磁盘上的实际文件。

If you have an option of file formats I suggest using .ini and Python’s ConfigParser as mentioned. If you need compatibility with Java .properties files I have written a library for it called jprops. We were using pyjavaproperties, but after encountering various limitations I ended up implementing my own. It has full support for the .properties format, including unicode support and better support for escape sequences. Jprops can also parse any file-like object while pyjavaproperties only works with real files on disk.


回答 5

如果您没有多行属性并且需求非常简单,那么可以使用几行代码为您解决:

档案t.properties

a=b
c=d
e=f

Python代码:

with open("t.properties") as f:
    l = [line.split("=") for line in f.readlines()]
    d = {key.strip(): value.strip() for key, value in l}

if you don’t have multi line properties and a very simple need, a few lines of code can solve it for you:

File t.properties:

a=b
c=d
e=f

Python code:

with open("t.properties") as f:
    l = [line.split("=") for line in f.readlines()]
    d = {key.strip(): value.strip() for key, value in l}

回答 6

这不完全是属性,但是Python确实有一个很好的库来解析配置文件。另请参见此食谱:java.util.Properties的python替代品

This is not exactly properties but Python does have a nice library for parsing configuration files. Also see this recipe: A python replacement for java.util.Properties.


回答 7

这是我的项目的链接:https : //sourceforge.net/projects/pyproperties/。它是一个库,其中包含用于处理Python 3.x的* .properties文件的方法。

但这不是基于java.util.Properties

Here is link to my project: https://sourceforge.net/projects/pyproperties/. It is a library with methods for working with *.properties files for Python 3.x.

But it is not based on java.util.Properties


回答 8

这个是java.util.Propeties的一对一替换

从文档中:

  def __parse(self, lines):
        """ Parse a list of lines and create
        an internal property dictionary """

        # Every line in the file must consist of either a comment
        # or a key-value pair. A key-value pair is a line consisting
        # of a key which is a combination of non-white space characters
        # The separator character between key-value pairs is a '=',
        # ':' or a whitespace character not including the newline.
        # If the '=' or ':' characters are found, in the line, even
        # keys containing whitespace chars are allowed.

        # A line with only a key according to the rules above is also
        # fine. In such case, the value is considered as the empty string.
        # In order to include characters '=' or ':' in a key or value,
        # they have to be properly escaped using the backslash character.

        # Some examples of valid key-value pairs:
        #
        # key     value
        # key=value
        # key:value
        # key     value1,value2,value3
        # key     value1,value2,value3 \
        #         value4, value5
        # key
        # This key= this value
        # key = value1 value2 value3

        # Any line that starts with a '#' is considerered a comment
        # and skipped. Also any trailing or preceding whitespaces
        # are removed from the key/value.

        # This is a line parser. It parses the
        # contents like by line.

This is a one-to-one replacement of java.util.Propeties

From the doc:

  def __parse(self, lines):
        """ Parse a list of lines and create
        an internal property dictionary """

        # Every line in the file must consist of either a comment
        # or a key-value pair. A key-value pair is a line consisting
        # of a key which is a combination of non-white space characters
        # The separator character between key-value pairs is a '=',
        # ':' or a whitespace character not including the newline.
        # If the '=' or ':' characters are found, in the line, even
        # keys containing whitespace chars are allowed.

        # A line with only a key according to the rules above is also
        # fine. In such case, the value is considered as the empty string.
        # In order to include characters '=' or ':' in a key or value,
        # they have to be properly escaped using the backslash character.

        # Some examples of valid key-value pairs:
        #
        # key     value
        # key=value
        # key:value
        # key     value1,value2,value3
        # key     value1,value2,value3 \
        #         value4, value5
        # key
        # This key= this value
        # key = value1 value2 value3

        # Any line that starts with a '#' is considerered a comment
        # and skipped. Also any trailing or preceding whitespaces
        # are removed from the key/value.

        # This is a line parser. It parses the
        # contents like by line.

回答 9

您可以在ConfigParser.RawConfigParser.readfp此处定义的类似文件的对象中使用-> https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.readfp

定义一个覆盖的类 readline在属性文件的实际内容之前添加节名称。

我已将其打包到返回dict定义的所有属性的类中。

import ConfigParser

class PropertiesReader(object):

    def __init__(self, properties_file_name):
        self.name = properties_file_name
        self.main_section = 'main'

        # Add dummy section on top
        self.lines = [ '[%s]\n' % self.main_section ]

        with open(properties_file_name) as f:
            self.lines.extend(f.readlines())

        # This makes sure that iterator in readfp stops
        self.lines.append('')

    def readline(self):
        return self.lines.pop(0)

    def read_properties(self):
        config = ConfigParser.RawConfigParser()

        # Without next line the property names will be lowercased
        config.optionxform = str

        config.readfp(self)
        return dict(config.items(self.main_section))

if __name__ == '__main__':
    print PropertiesReader('/path/to/file.properties').read_properties()

You can use a file-like object in ConfigParser.RawConfigParser.readfp defined here -> https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.readfp

Define a class that overrides readline that adds a section name before the actual contents of your properties file.

I’ve packaged it into the class that returns a dict of all the properties defined.

import ConfigParser

class PropertiesReader(object):

    def __init__(self, properties_file_name):
        self.name = properties_file_name
        self.main_section = 'main'

        # Add dummy section on top
        self.lines = [ '[%s]\n' % self.main_section ]

        with open(properties_file_name) as f:
            self.lines.extend(f.readlines())

        # This makes sure that iterator in readfp stops
        self.lines.append('')

    def readline(self):
        return self.lines.pop(0)

    def read_properties(self):
        config = ConfigParser.RawConfigParser()

        # Without next line the property names will be lowercased
        config.optionxform = str

        config.readfp(self)
        return dict(config.items(self.main_section))

if __name__ == '__main__':
    print PropertiesReader('/path/to/file.properties').read_properties()

回答 10

我已经用过了,这个库非常有用

from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print(p)
print(p.items())
print(p['name3'])
p['name3'] = 'changed = value'

i have used this, this library is very useful

from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print(p)
print(p.items())
print(p['name3'])
p['name3'] = 'changed = value'

回答 11

这就是我在项目中所做的事情:我只创建了另一个名为properties.py的.py文件,其中包含我在项目中使用的所有常见变量/属性,并且在任何文件中都需要引用这些变量,

from properties import *(or anything you need)

当我经常更改开发人员的位置并且某些常见变量与本地环境有关时,使用此方法来保持svn的和平。对我来说效果很好,但不确定是否可以在正式的开发环境中建议使用此方法。

This is what I’m doing in my project: I just create another .py file called properties.py which includes all common variables/properties I used in the project, and in any file need to refer to these variables, put

from properties import *(or anything you need)

Used this method to keep svn peace when I was changing dev locations frequently and some common variables were quite relative to local environment. Works fine for me but not sure this method would be suggested for formal dev environment etc.


回答 12

import json
f=open('test.json')
x=json.load(f)
f.close()
print(x)

test.json的内容:{“主机”:“ 127.0.0.1”,“用户”:“ jms”}

import json
f=open('test.json')
x=json.load(f)
f.close()
print(x)

Contents of test.json: {“host”: “127.0.0.1”, “user”: “jms”}


回答 13

我创建了一个与Java的Properties类几乎相似的python模块(实际上就像Spring中的PropertyPlaceholderConfigurer一样,它允许您使用$ {variable-reference}来引用已定义的property)

编辑:您可以通过运行命令(当前已针对python 3测试)来安装此软件包。
pip install property

该项目托管在GitHub上

示例:(详细文档可在此处找到)

假设您在my_file.properties文件中定义了以下属性

foo = I am awesome
bar = ${chocolate}-bar
chocolate = fudge

加载上述属性的代码

from properties.p import Property

prop = Property()
# Simply load it into a dictionary
dic_prop = prop.load_property_files('my_file.properties')

I have created a python module that is almost similar to the Properties class of Java ( Actually it is like the PropertyPlaceholderConfigurer in spring which lets you use ${variable-reference} to refer to already defined property )

EDIT : You may install this package by running the command(currently tested for python 3).
pip install property

The project is hosted on GitHub

Example : ( Detailed documentation can be found here )

Let’s say you have the following properties defined in my_file.properties file

foo = I am awesome
bar = ${chocolate}-bar
chocolate = fudge

Code to load the above properties

from properties.p import Property

prop = Property()
# Simply load it into a dictionary
dic_prop = prop.load_property_files('my_file.properties')

回答 14

如果您需要以简单的方式从属性文件中的部分读取所有值:

您的config.properties文件布局:

[SECTION_NAME]  
key1 = value1  
key2 = value2  

您的代码:

   import configparser

   config = configparser.RawConfigParser()
   config.read('path_to_config.properties file')

   details_dict = dict(config.items('SECTION_NAME'))

这将为您提供一个字典,其中的键与配置文件中的键及其相应值相同。

details_dict 是:

{'key1':'value1', 'key2':'value2'}

现在获取key1的值: details_dict['key1']

将所有内容放到仅从配置文件读取该部分一次的方法中(在程序运行期间第一次调用该方法)。

def get_config_dict():
    if not hasattr(get_config_dict, 'config_dict'):
        get_config_dict.config_dict = dict(config.items('SECTION_NAME'))
    return get_config_dict.config_dict

现在调用上面的函数并获取所需键的值:

config_details = get_config_dict()
key_1_value = config_details['key1'] 

————————————————– ———–

扩展上述方法,自动逐节阅读,然后按节名和键名进行访问。

def get_config_section():
    if not hasattr(get_config_section, 'section_dict'):
        get_config_section.section_dict = dict()

        for section in config.sections():
            get_config_section.section_dict[section] = 
                             dict(config.items(section))

    return get_config_section.section_dict

要访问:

config_dict = get_config_section()

port = config_dict['DB']['port'] 

(此处“ DB”是配置文件中的节名称,“端口”是“ DB”节下的键。)

If you need to read all values from a section in properties file in a simple manner:

Your config.properties file layout :

[SECTION_NAME]  
key1 = value1  
key2 = value2  

You code:

   import configparser

   config = configparser.RawConfigParser()
   config.read('path_to_config.properties file')

   details_dict = dict(config.items('SECTION_NAME'))

This will give you a dictionary where keys are same as in config file and their corresponding values.

details_dict is :

{'key1':'value1', 'key2':'value2'}

Now to get key1’s value : details_dict['key1']

Putting it all in a method which reads that section from config file only once(the first time the method is called during a program run).

def get_config_dict():
    if not hasattr(get_config_dict, 'config_dict'):
        get_config_dict.config_dict = dict(config.items('SECTION_NAME'))
    return get_config_dict.config_dict

Now call the above function and get the required key’s value :

config_details = get_config_dict()
key_1_value = config_details['key1'] 

————————————————————-

Extending the approach mentioned above, reading section by section automatically and then accessing by section name followed by key name.

def get_config_section():
    if not hasattr(get_config_section, 'section_dict'):
        get_config_section.section_dict = dict()

        for section in config.sections():
            get_config_section.section_dict[section] = 
                             dict(config.items(section))

    return get_config_section.section_dict

To access:

config_dict = get_config_section()

port = config_dict['DB']['port'] 

(here ‘DB’ is a section name in config file and ‘port’ is a key under section ‘DB’.)


回答 15

下面的两行代码显示了如何使用Python List Comprehension加载“ java样式”属性文件。

split_properties=[line.split("=") for line in open('/<path_to_property_file>)]
properties={key: value for key,value in split_properties }

请查看以下帖子以了解详细信息 https://ilearnonlinesite.wordpress.com/2017/07/24/reading-property-file-in-python-using-comprehension-and-generators/

Below 2 lines of code shows how to use Python List Comprehension to load ‘java style’ property file.

split_properties=[line.split("=") for line in open('/<path_to_property_file>)]
properties={key: value for key,value in split_properties }

Please have a look at below post for details https://ilearnonlinesite.wordpress.com/2017/07/24/reading-property-file-in-python-using-comprehension-and-generators/


回答 16

您可以将参数“ fromfile_prefix_chars”与argparse一起使用,以从配置文件中读取内容,如下所示-

临时

parser = argparse.ArgumentParser(fromfile_prefix_chars='#')
parser.add_argument('--a')
parser.add_argument('--b')
args = parser.parse_args()
print(args.a)
print(args.b)

配置文件

--a
hello
--b
hello dear

运行命令

python temp.py "#config"

you can use parameter “fromfile_prefix_chars” with argparse to read from config file as below—

temp.py

parser = argparse.ArgumentParser(fromfile_prefix_chars='#')
parser.add_argument('--a')
parser.add_argument('--b')
args = parser.parse_args()
print(args.a)
print(args.b)

config file

--a
hello
--b
hello dear

Run command

python temp.py "#config"

回答 17

我使用ConfigParser进行了如下操作。该代码假定在放置BaseTest的同一目录中有一个名为config.prop的文件:

配置文件

[CredentialSection]
app.name=MyAppName

BaseTest.py:

import unittest
import ConfigParser

class BaseTest(unittest.TestCase):
    def setUp(self):
        __SECTION = 'CredentialSection'
        config = ConfigParser.ConfigParser()
        config.readfp(open('config.prop'))
        self.__app_name = config.get(__SECTION, 'app.name')

    def test1(self):
        print self.__app_name % This should print: MyAppName

I did this using ConfigParser as follows. The code assumes that there is a file called config.prop in the same directory where BaseTest is placed:

config.prop

[CredentialSection]
app.name=MyAppName

BaseTest.py:

import unittest
import ConfigParser

class BaseTest(unittest.TestCase):
    def setUp(self):
        __SECTION = 'CredentialSection'
        config = ConfigParser.ConfigParser()
        config.readfp(open('config.prop'))
        self.__app_name = config.get(__SECTION, 'app.name')

    def test1(self):
        print self.__app_name % This should print: MyAppName

回答 18

这就是我编写的用于解析文件并将其设置为env变量的内容,该变量会跳过注释,并且非关键值行添加了开关来指定hg:d

  • -h或–help打印用法摘要
  • -c指定用于标识注释的字符
  • -s prop文件中键和值之间的分隔符
  • 并指定需要解析的属性文件,例如:python EnvParamSet.py -c#-s = env.properties

    import pipes
    import sys , getopt
    import os.path
    
    class Parsing :
    
            def __init__(self , seprator , commentChar , propFile):
            self.seprator = seprator
            self.commentChar = commentChar
            self.propFile  = propFile
    
        def  parseProp(self):
            prop = open(self.propFile,'rU')
            for line in prop :
                if line.startswith(self.commentChar)==False and  line.find(self.seprator) != -1  :
                    keyValue = line.split(self.seprator)
                    key =  keyValue[0].strip() 
                    value = keyValue[1].strip() 
                            print("export  %s=%s" % (str (key),pipes.quote(str(value))))
    
    
    
    
    class EnvParamSet:
    
        def main (argv):
    
            seprator = '='
            comment =  '#'
    
            if len(argv)  is 0:
                print "Please Specify properties file to be parsed "
                sys.exit()
            propFile=argv[-1] 
    
    
            try :
                opts, args = getopt.getopt(argv, "hs:c:f:", ["help", "seprator=","comment=", "file="])
            except getopt.GetoptError,e:
                print str(e)
                print " possible  arguments  -s <key value sperator > -c < comment char >    <file> \n  Try -h or --help "
                sys.exit(2)
    
    
            if os.path.isfile(args[0])==False:
                print "File doesnt exist "
                sys.exit()
    
    
            for opt , arg  in opts :
                if opt in ("-h" , "--help"):
                    print " hg:d  \n -h or --help print usage summary \n -c Specify char that idetifes comment  \n -s Sperator between key and value in prop file \n  specify file  "
                    sys.exit()
                elif opt in ("-s" , "--seprator"):
                    seprator = arg 
                elif opt in ("-c"  , "--comment"):
                    comment  = arg
    
            p = Parsing( seprator, comment , propFile)
            p.parseProp()
    
        if __name__ == "__main__":
                main(sys.argv[1:])

This is what i had written to parse file and set it as env variables which skips comments and non key value lines added switches to specify hg:d

  • -h or –help print usage summary
  • -c Specify char that identifies comment
  • -s Separator between key and value in prop file
  • and specify properties file that needs to be parsed eg : python EnvParamSet.py -c # -s = env.properties

    import pipes
    import sys , getopt
    import os.path
    
    class Parsing :
    
            def __init__(self , seprator , commentChar , propFile):
            self.seprator = seprator
            self.commentChar = commentChar
            self.propFile  = propFile
    
        def  parseProp(self):
            prop = open(self.propFile,'rU')
            for line in prop :
                if line.startswith(self.commentChar)==False and  line.find(self.seprator) != -1  :
                    keyValue = line.split(self.seprator)
                    key =  keyValue[0].strip() 
                    value = keyValue[1].strip() 
                            print("export  %s=%s" % (str (key),pipes.quote(str(value))))
    
    
    
    
    class EnvParamSet:
    
        def main (argv):
    
            seprator = '='
            comment =  '#'
    
            if len(argv)  is 0:
                print "Please Specify properties file to be parsed "
                sys.exit()
            propFile=argv[-1] 
    
    
            try :
                opts, args = getopt.getopt(argv, "hs:c:f:", ["help", "seprator=","comment=", "file="])
            except getopt.GetoptError,e:
                print str(e)
                print " possible  arguments  -s <key value sperator > -c < comment char >    <file> \n  Try -h or --help "
                sys.exit(2)
    
    
            if os.path.isfile(args[0])==False:
                print "File doesnt exist "
                sys.exit()
    
    
            for opt , arg  in opts :
                if opt in ("-h" , "--help"):
                    print " hg:d  \n -h or --help print usage summary \n -c Specify char that idetifes comment  \n -s Sperator between key and value in prop file \n  specify file  "
                    sys.exit()
                elif opt in ("-s" , "--seprator"):
                    seprator = arg 
                elif opt in ("-c"  , "--comment"):
                    comment  = arg
    
            p = Parsing( seprator, comment , propFile)
            p.parseProp()
    
        if __name__ == "__main__":
                main(sys.argv[1:])
    

回答 19

Lightbend发布了Typesafe Config库,该库可解析属性文件以及一些基于JSON的扩展。Lightbend的库仅用于JVM,但似乎已被广泛采用,并且现在有许多语言的端口,包括Python:https : //github.com/chimpler/pyhocon

Lightbend has released the Typesafe Config library, which parses properties files and also some JSON-based extensions. Lightbend’s library is only for the JVM, but it seems to be widely adopted and there are now ports in many languages, including Python: https://github.com/chimpler/pyhocon


回答 20

您可以使用以下函数,它是@mvallebr的修改代码。它尊重属性文件的注释,忽略空的新行,并允许检索单个键值。

def getProperties(propertiesFile ="/home/memin/.config/customMemin/conf.properties", key=''):
    """
    Reads a .properties file and returns the key value pairs as dictionary.
    if key value is specified, then it will return its value alone.
    """
    with open(propertiesFile) as f:
        l = [line.strip().split("=") for line in f.readlines() if not line.startswith('#') and line.strip()]
        d = {key.strip(): value.strip() for key, value in l}

        if key:
            return d[key]
        else:
            return d

You can use the following function, which is the modified code of @mvallebr. It respects the properties file comments, ignores empty new lines, and allows retrieving a single key value.

def getProperties(propertiesFile ="/home/memin/.config/customMemin/conf.properties", key=''):
    """
    Reads a .properties file and returns the key value pairs as dictionary.
    if key value is specified, then it will return its value alone.
    """
    with open(propertiesFile) as f:
        l = [line.strip().split("=") for line in f.readlines() if not line.startswith('#') and line.strip()]
        d = {key.strip(): value.strip() for key, value in l}

        if key:
            return d[key]
        else:
            return d

回答 21

这对我有用。

from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print p
print p.items()
print p['name3']

this works for me.

from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print p
print p.items()
print p['name3']

回答 22

我遵循configparser方法,对我来说效果很好。创建了一个PropertyReader文件,并在其中使用了配置解析器以准备与每个部分相对应的属性。

**使用Python 2.7

PropertyReader.py文件的内容:

#!/usr/bin/python
import ConfigParser

class PropertyReader:

def readProperty(self, strSection, strKey):
    config = ConfigParser.RawConfigParser()
    config.read('ConfigFile.properties')
    strValue = config.get(strSection,strKey);
    print "Value captured for "+strKey+" :"+strValue
    return strValue

读取架构文件的内容:

from PropertyReader import *

class ReadSchema:

print PropertyReader().readProperty('source1_section','source_name1')
print PropertyReader().readProperty('source2_section','sn2_sc1_tb')

.properties文件的内容:

[source1_section]
source_name1:module1
sn1_schema:schema1,schema2,schema3
sn1_sc1_tb:employee,department,location
sn1_sc2_tb:student,college,country

[source2_section]
source_name1:module2
sn2_schema:schema4,schema5,schema6
sn2_sc1_tb:employee,department,location
sn2_sc2_tb:student,college,country

I followed configparser approach and it worked quite well for me. Created one PropertyReader file and used config parser there to ready property to corresponding to each section.

**Used Python 2.7

Content of PropertyReader.py file:

#!/usr/bin/python
import ConfigParser

class PropertyReader:

def readProperty(self, strSection, strKey):
    config = ConfigParser.RawConfigParser()
    config.read('ConfigFile.properties')
    strValue = config.get(strSection,strKey);
    print "Value captured for "+strKey+" :"+strValue
    return strValue

Content of read schema file:

from PropertyReader import *

class ReadSchema:

print PropertyReader().readProperty('source1_section','source_name1')
print PropertyReader().readProperty('source2_section','sn2_sc1_tb')

Content of .properties file:

[source1_section]
source_name1:module1
sn1_schema:schema1,schema2,schema3
sn1_sc1_tb:employee,department,location
sn1_sc2_tb:student,college,country

[source2_section]
source_name1:module2
sn2_schema:schema4,schema5,schema6
sn2_sc1_tb:employee,department,location
sn2_sc2_tb:student,college,country

回答 23

在python模块中创建一个字典并将所有内容存储到其中并访问它,例如:

dict = {
       'portalPath' : 'www.xyx.com',
       'elementID': 'submit'}

现在访问它,您只需执行以下操作:

submitButton = driver.find_element_by_id(dict['elementID'])

create a dictionary in your python module and store everything into it and access it, for example:

dict = {
       'portalPath' : 'www.xyx.com',
       'elementID': 'submit'}

Now to access it you can simply do:

submitButton = driver.find_element_by_id(dict['elementID'])

如何在Python中逐行读取文件?

问题:如何在Python中逐行读取文件?

在史前时代(Python 1.4)中,我们做到了:

fp = open('filename.txt')
while 1:
    line = fp.readline()
    if not line:
        break
    print line

在Python 2.1之后,我们做到了:

for line in open('filename.txt').xreadlines():
    print line

在Python 2.3中获得便捷的迭代器协议之前,它可以做到:

for line in open('filename.txt'):
    print line

我看过一些使用更详细的示例:

with open('filename.txt') as fp:
    for line in fp:
        print line

这是首选的方法吗?

[edit]我知道with语句可以确保关闭文件…但是为什么文件对象的迭代器协议中没有包含该语句呢?

In pre-historic times (Python 1.4) we did:

fp = open('filename.txt')
while 1:
    line = fp.readline()
    if not line:
        break
    print line

after Python 2.1, we did:

for line in open('filename.txt').xreadlines():
    print line

before we got the convenient iterator protocol in Python 2.3, and could do:

for line in open('filename.txt'):
    print line

I’ve seen some examples using the more verbose:

with open('filename.txt') as fp:
    for line in fp:
        print line

is this the preferred method going forwards?

[edit] I get that the with statement ensures closing of the file… but why isn’t that included in the iterator protocol for file objects?


回答 0

首选以下原因正是有一个原因:

with open('filename.txt') as fp:
    for line in fp:
        print line

CPython的相对确定性的引用计数方案对垃圾回收来说,我们都被宠坏了。如果其他假设的Python实现with使用某种其他方案来回收内存,则它们在没有该块的情况下不一定会“足够快地”关闭文件。

在这样的实现中,如果您的代码打开文件的速度比垃圾收集器调用孤立文件句柄上的终结器的速度快,则可能会从OS收到“打开太多文件”错误。通常的解决方法是立即触发GC,但这是一个讨厌的技巧,必须由可能遇到错误的每个函数(包括库中的函数)来完成。什么样的恶梦。

或者,您可以只使用该with块。

奖金问题

(如果仅对问题的客观方面感兴趣,请立即停止阅读。)

为什么文件对象的迭代器协议中未包含该代码?

这是有关API设计的主观问题,因此我有两个部分的主观答案。

从直觉上讲,这是错的,因为它使迭代器协议执行两项单独的操作(遍历行关闭文件句柄),并且使外观简单的函数执行两项操作通常不是一个好主意。在这种情况下,感觉特别糟糕,因为迭代器以准功能,基于值的方式与文件内容相关联,但是管理文件句柄是完全独立的任务。对于阅读代码的人来说,将两者无形地压成一个动作是令人惊讶的,这使得推理程序行为变得更加困难。

其他语言基本上得出了相同的结论。Haskell简要调情了所谓的“惰性IO”,它允许您遍历文件并在到达流末尾时自动将其关闭,但是如今,在Haskell和Haskell中几乎普遍不建议使用惰性IO。用户大多转向更明确的资源管理,例如Conduit,其行为更像withPython中的块。

从技术上讲,您可能需要对Python中的文件句柄进行某些操作,如果迭代关闭了文件句柄,这些操作将无法正常工作。例如,假设我需要遍历文件两次:

with open('filename.txt') as fp:
    for line in fp:
        ...
    fp.seek(0)
    for line in fp:
        ...

虽然这是一种不太常见的用例,但请考虑以下事实:我可能刚刚将底部的三行代码添加到了原来具有前三行的现有代码库中。如果迭代关闭了该文件,我将无法执行该操作。因此,将迭代和资源管理分开保持可以更轻松地将代码块组合成一个更大的,可运行的Python程序。

可组合性是语言或API最重要的可用性功能之一。

There is exactly one reason why the following is preferred:

with open('filename.txt') as fp:
    for line in fp:
        print line

We are all spoiled by CPython’s relatively deterministic reference-counting scheme for garbage collection. Other, hypothetical implementations of Python will not necessarily close the file “quickly enough” without the with block if they use some other scheme to reclaim memory.

In such an implementation, you might get a “too many files open” error from the OS if your code opens files faster than the garbage collector calls finalizers on orphaned file handles. The usual workaround is to trigger the GC immediately, but this is a nasty hack and it has to be done by every function that could encounter the error, including those in libraries. What a nightmare.

Or you could just use the with block.

Bonus Question

(Stop reading now if are only interested in the objective aspects of the question.)

Why isn’t that included in the iterator protocol for file objects?

This is a subjective question about API design, so I have a subjective answer in two parts.

On a gut level, this feels wrong, because it makes iterator protocol do two separate things—iterate over lines and close the file handle—and it’s often a bad idea to make a simple-looking function do two actions. In this case, it feels especially bad because iterators relate in a quasi-functional, value-based way to the contents of a file, but managing file handles is a completely separate task. Squashing both, invisibly, into one action, is surprising to humans who read the code and makes it more difficult to reason about program behavior.

Other languages have essentially come to the same conclusion. Haskell briefly flirted with so-called “lazy IO” which allows you to iterate over a file and have it automatically closed when you get to the end of the stream, but it’s almost universally discouraged to use lazy IO in Haskell these days, and Haskell users have mostly moved to more explicit resource management like Conduit which behaves more like the with block in Python.

On a technical level, there are some things you may want to do with a file handle in Python which would not work as well if iteration closed the file handle. For example, suppose I need to iterate over the file twice:

with open('filename.txt') as fp:
    for line in fp:
        ...
    fp.seek(0)
    for line in fp:
        ...

While this is a less common use case, consider the fact that I might have just added the three lines of code at the bottom to an existing code base which originally had the top three lines. If iteration closed the file, I wouldn’t be able to do that. So keeping iteration and resource management separate makes it easier to compose chunks of code into a larger, working Python program.

Composability is one of the most important usability features of a language or API.


回答 1

是,

with open('filename.txt') as fp:
    for line in fp:
        print line

是要走的路。

它并不冗长。更安全。

Yes,

with open('filename.txt') as fp:
    for line in fp:
        print line

is the way to go.

It is not more verbose. It is more safe.


回答 2

如果您被多余的行关闭,则可以使用包装函数,如下所示:

def with_iter(iterable):
    with iterable as iter:
        for item in iter:
            yield item

for line in with_iter(open('...')):
    ...

在Python 3.3中,该yield from语句会使此操作更短:

def with_iter(iterable):
    with iterable as iter:
        yield from iter

if you’re turned off by the extra line, you can use a wrapper function like so:

def with_iter(iterable):
    with iterable as iter:
        for item in iter:
            yield item

for line in with_iter(open('...')):
    ...

in Python 3.3, the yield from statement would make this even shorter:

def with_iter(iterable):
    with iterable as iter:
        yield from iter

回答 3

f = open('test.txt','r')
for line in f.xreadlines():
    print line
f.close()
f = open('test.txt','r')
for line in f.xreadlines():
    print line
f.close()

我可以使用`pip`代替`easy_install`来实现`python setup.py install`依赖关系解析吗?

问题:我可以使用`pip`代替`easy_install`来实现`python setup.py install`依赖关系解析吗?

python setup.py install会自动安装requires=[]使用中列出的软件包easy_install。我该如何使用它pip呢?

python setup.py install will automatically install packages listed in requires=[] using easy_install. How do I get it to use pip instead?


回答 0

是的你可以。您可以从网络或计算机上的tarball或文件夹中安装软件包。例如:

从网络上的tarball安装

pip install https://pypi.python.org/packages/source/r/requests/requests-2.3.0.tar.gz

从本地tarball安装

wget https://pypi.python.org/packages/source/r/requests/requests-2.3.0.tar.gz
pip install requests-2.3.0.tar.gz

从本地文件夹安装

tar -zxvf requests-2.3.0.tar.gz
cd requests-2.3.0
pip install .

您可以删除requests-2.3.0文件夹。

从本地文件夹安装(可编辑模式)

pip install -e .

这将以可编辑模式安装软件包。您对代码所做的任何更改将立即在整个系统中应用。如果您是程序包开发人员并且想要测试更改,这将很有用。这也意味着您必须在不中断安装的情况下删除文件夹。

Yes you can. You can install a package from a tarball or a folder, on the web or your computer. For example:

Install from tarball on web

pip install https://pypi.python.org/packages/source/r/requests/requests-2.3.0.tar.gz

Install from local tarball

wget https://pypi.python.org/packages/source/r/requests/requests-2.3.0.tar.gz
pip install requests-2.3.0.tar.gz

Install from local folder

tar -zxvf requests-2.3.0.tar.gz
cd requests-2.3.0
pip install .

You can delete the requests-2.3.0 folder.

Install from local folder (editable mode)

pip install -e .

This installs the package in editable mode. Any changes you make to the code will immediately apply across the system. This is useful if you are the package developer and want to test changes. It also means you can’t delete the folder without breaking the install.


回答 1

您可以先pip install归档python setup.py sdist。您也pip install -e .可以像python setup.py develop

You can pip install a file perhaps by python setup.py sdist first. You can also pip install -e . which is like python setup.py develop.


回答 2

如果您真的python setup.py install愿意使用,可以尝试如下操作:

from setuptools import setup, find_packages
from setuptools.command.install import install as InstallCommand


class Install(InstallCommand):
    """ Customized setuptools install command which uses pip. """

    def run(self, *args, **kwargs):
        import pip
        pip.main(['install', '.'])
        InstallCommand.run(self, *args, **kwargs)


setup(
    name='your_project',
    version='0.0.1a',
    cmdclass={
        'install': Install,
    },
    packages=find_packages(),
    install_requires=['simplejson']
)

If you are really set on using python setup.py install you could try something like this:

from setuptools import setup, find_packages
from setuptools.command.install import install as InstallCommand


class Install(InstallCommand):
    """ Customized setuptools install command which uses pip. """

    def run(self, *args, **kwargs):
        import pip
        pip.main(['install', '.'])
        InstallCommand.run(self, *args, **kwargs)


setup(
    name='your_project',
    version='0.0.1a',
    cmdclass={
        'install': Install,
    },
    packages=find_packages(),
    install_requires=['simplejson']
)

如何执行两个列表的按元素相乘?

问题:如何执行两个列表的按元素相乘?

我想执行元素明智的乘法,将两个列表按值在Python中相乘,就像我们在Matlab中可以做到的那样。

这就是我在Matlab中要做的。

a = [1,2,3,4]
b = [2,3,4,5]
a .* b = [2, 6, 12, 20]

对于from 和from的每个组合x * y,列表理解将给出16个列表条目。不确定如何映射。xayb

如果有人对此感兴趣,我有一个数据集,并想乘以Numpy.linspace(1.0, 0.5, num=len(dataset)) =)

I want to perform an element wise multiplication, to multiply two lists together by value in Python, like we can do it in Matlab.

This is how I would do it in Matlab.

a = [1,2,3,4]
b = [2,3,4,5]
a .* b = [2, 6, 12, 20]

A list comprehension would give 16 list entries, for every combination x * y of x from a and y from b. Unsure of how to map this.

If anyone is interested why, I have a dataset, and want to multiply it by Numpy.linspace(1.0, 0.5, num=len(dataset)) =).


回答 0

使用列表理解与zip():混合。

[a*b for a,b in zip(lista,listb)]

Use a list comprehension mixed with zip():.

[a*b for a,b in zip(lista,listb)]

回答 1

由于您已经在使用numpy,因此将数据存储在numpy数组而不是列表中很有意义。完成此操作后,您将免费获得类似智能元素的产品:

In [1]: import numpy as np

In [2]: a = np.array([1,2,3,4])

In [3]: b = np.array([2,3,4,5])

In [4]: a * b
Out[4]: array([ 2,  6, 12, 20])

Since you’re already using numpy, it makes sense to store your data in a numpy array rather than a list. Once you do this, you get things like element-wise products for free:

In [1]: import numpy as np

In [2]: a = np.array([1,2,3,4])

In [3]: b = np.array([2,3,4,5])

In [4]: a * b
Out[4]: array([ 2,  6, 12, 20])

回答 2

使用np.multiply(a,b):

import numpy as np
a = [1,2,3,4]
b = [2,3,4,5]
np.multiply(a,b)

Use np.multiply(a,b):

import numpy as np
a = [1,2,3,4]
b = [2,3,4,5]
np.multiply(a,b)

回答 3

您可以尝试将每个元素乘以一个循环。这样做的捷径是

ab = [a[i]*b[i] for i in range(len(a))]

You can try multiplying each element in a loop. The short hand for doing that is

ab = [a[i]*b[i] for i in range(len(a))]

回答 4

还有一个答案:

-1…需要导入
+1…非常易读

import operator
a = [1,2,3,4]
b = [10,11,12,13]

list(map(operator.mul, a, b))

输出[10、22、36、52]

Yet another answer:

-1 … requires import
+1 … is very readable

import operator
a = [1,2,3,4]
b = [10,11,12,13]

list(map(operator.mul, a, b))

outputs [10, 22, 36, 52]


回答 5

相当直观的方法:

a = [1,2,3,4]
b = [2,3,4,5]
ab = []                        #Create empty list
for i in range(0, len(a)):
     ab.append(a[i]*b[i])      #Adds each element to the list

Fairly intuitive way of doing this:

a = [1,2,3,4]
b = [2,3,4,5]
ab = []                        #Create empty list
for i in range(0, len(a)):
     ab.append(a[i]*b[i])      #Adds each element to the list

回答 6

您可以使用 lambda

foo=[1,2,3,4]
bar=[1,2,5,55]
l=map(lambda x,y:x*y,foo,bar)

you can multiplication using lambda

foo=[1,2,3,4]
bar=[1,2,5,55]
l=map(lambda x,y:x*y,foo,bar)

回答 7

对于大型列表,我们可以反复进行:

product_iter_object = itertools.imap(operator.mul, [1,2,3,4], [2,3,4,5])

product_iter_object.next() 给出输出列表中的每个元素。

输出将是两个输入列表中较短者的长度。

For large lists, we can do it the iter-way:

product_iter_object = itertools.imap(operator.mul, [1,2,3,4], [2,3,4,5])

product_iter_object.next() gives each of the element in the output list.

The output would be the length of the shorter of the two input lists.


回答 8

创建一个数组;将每个列表乘以数组;将数组转换为列表

import numpy as np

a = [1,2,3,4]
b = [2,3,4,5]

c = (np.ones(len(a))*a*b).tolist()

[2.0, 6.0, 12.0, 20.0]

create an array of ones; multiply each list times the array; convert array to a list

import numpy as np

a = [1,2,3,4]
b = [2,3,4,5]

c = (np.ones(len(a))*a*b).tolist()

[2.0, 6.0, 12.0, 20.0]

回答 9

gahooa的答案对于标题中所述的问题是正确的,但是如果列表已经是numpy格式大于十,它将更快(3个数量级)并且可读性更高,如NPE。我得到这些时间:

0.0049ms -> N = 4, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0075ms -> N = 4, a = [i for i in range(N)], c = a * b
0.0167ms -> N = 4, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0013ms -> N = 4, a = np.arange(N), c = a * b
0.0171ms -> N = 40, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0095ms -> N = 40, a = [i for i in range(N)], c = a * b
0.1077ms -> N = 40, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0013ms -> N = 40, a = np.arange(N), c = a * b
0.1485ms -> N = 400, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0397ms -> N = 400, a = [i for i in range(N)], c = a * b
1.0348ms -> N = 400, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0020ms -> N = 400, a = np.arange(N), c = a * b

即从以下测试程序。

import timeit

init = ['''
import numpy as np
N = {}
a = {}
b = np.linspace(0.0, 0.5, len(a))
'''.format(i, j) for i in [4, 40, 400] 
                  for j in ['[i for i in range(N)]', 'np.arange(N)']]

func = ['''c = [a*b for a,b in zip(a, b)]''',
'''c = a * b''']

for i in init:
  for f in func:
    lines = i.split('\n')
    print('{:6.4f}ms -> {}, {}, {}'.format(
           timeit.timeit(f, setup=i, number=1000), lines[2], lines[3], f))

gahooa’s answer is correct for the question as phrased in the heading, but if the lists are already numpy format or larger than ten it will be MUCH faster (3 orders of magnitude) as well as more readable, to do simple numpy multiplication as suggested by NPE. I get these timings:

0.0049ms -> N = 4, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0075ms -> N = 4, a = [i for i in range(N)], c = a * b
0.0167ms -> N = 4, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0013ms -> N = 4, a = np.arange(N), c = a * b
0.0171ms -> N = 40, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0095ms -> N = 40, a = [i for i in range(N)], c = a * b
0.1077ms -> N = 40, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0013ms -> N = 40, a = np.arange(N), c = a * b
0.1485ms -> N = 400, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0397ms -> N = 400, a = [i for i in range(N)], c = a * b
1.0348ms -> N = 400, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0020ms -> N = 400, a = np.arange(N), c = a * b

i.e. from the following test program.

import timeit

init = ['''
import numpy as np
N = {}
a = {}
b = np.linspace(0.0, 0.5, len(a))
'''.format(i, j) for i in [4, 40, 400] 
                  for j in ['[i for i in range(N)]', 'np.arange(N)']]

func = ['''c = [a*b for a,b in zip(a, b)]''',
'''c = a * b''']

for i in init:
  for f in func:
    lines = i.split('\n')
    print('{:6.4f}ms -> {}, {}, {}'.format(
           timeit.timeit(f, setup=i, number=1000), lines[2], lines[3], f))

回答 10

可以使用枚举。

a = [1, 2, 3, 4]
b = [2, 3, 4, 5]

ab = [val * b[i] for i, val in enumerate(a)]

Can use enumerate.

a = [1, 2, 3, 4]
b = [2, 3, 4, 5]

ab = [val * b[i] for i, val in enumerate(a)]

回答 11

map功能在这里可能非常有用。使用map我们可以将任何函数应用于可迭代对象的每个元素。

Python 3.x

>>> def my_mul(x,y):
...     return x*y
...
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>>
>>> list(map(my_mul,a,b))
[2, 6, 12, 20]
>>>

当然:

map(f, iterable)

相当于

[f(x) for x in iterable]

因此,我们可以通过以下方式获得解决方案:

>>> [my_mul(x,y) for x, y in zip(a,b)]
[2, 6, 12, 20]
>>>

在Python 2.x中map()意味着:将函数应用于可迭代的每个元素并构造一个新列表。在Python 3.x中,map构造迭代器而不是列表。

代替my_mul我们可以使用 mul运算符

Python 2.7

>>>from operator import mul # import mul operator
>>>a = [1,2,3,4]
>>>b = [2,3,4,5]
>>>map(mul,a,b)
[2, 6, 12, 20]
>>>

Python 3.5+

>>> from operator import mul
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>> [*map(mul,a,b)]
[2, 6, 12, 20]
>>>

请注意,由于map()构造了迭代器,因此我们使用*可迭代的拆包运算符来获取列表。解压缩方法比list构造函数要快一些:

>>> list(map(mul,a,b))
[2, 6, 12, 20]
>>>

The map function can be very useful here. Using map we can apply any function to each element of an iterable.

Python 3.x

>>> def my_mul(x,y):
...     return x*y
...
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>>
>>> list(map(my_mul,a,b))
[2, 6, 12, 20]
>>>

Of course:

map(f, iterable)

is equivalent to

[f(x) for x in iterable]

So we can get our solution via:

>>> [my_mul(x,y) for x, y in zip(a,b)]
[2, 6, 12, 20]
>>>

In Python 2.x map() means: apply a function to each element of an iterable and construct a new list. In Python 3.x, map construct iterators instead of lists.

Instead of my_mul we could use mul operator

Python 2.7

>>>from operator import mul # import mul operator
>>>a = [1,2,3,4]
>>>b = [2,3,4,5]
>>>map(mul,a,b)
[2, 6, 12, 20]
>>>

Python 3.5+

>>> from operator import mul
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>> [*map(mul,a,b)]
[2, 6, 12, 20]
>>>

Please note that since map() constructs an iterator we use * iterable unpacking operator to get a list. The unpacking approach is a bit faster then the list constructor:

>>> list(map(mul,a,b))
[2, 6, 12, 20]
>>>

回答 12

要维护列表类型,并在一行中完成(当然,在将numpy导入为np之后):

list(np.array([1,2,3,4]) * np.array([2,3,4,5]))

要么

list(np.array(a) * np.array(b))

To maintain the list type, and do it in one line (after importing numpy as np, of course):

list(np.array([1,2,3,4]) * np.array([2,3,4,5]))

or

list(np.array(a) * np.array(b))

回答 13

您可以将其用于相同长度的列表

def lstsum(a, b):
    c=0
    pos = 0
for element in a:
   c+= element*b[pos]
   pos+=1
return c

you can use this for lists of the same length

def lstsum(a, b):
    c=0
    pos = 0
for element in a:
   c+= element*b[pos]
   pos+=1
return c

将int转换为ASCII并返回Python

问题:将int转换为ASCII并返回Python

我正在为我的站点制作URL缩短器,而我目前的计划(我愿意接受建议)是使用节点ID来生成缩短的URL。因此,从理论上讲,节点26可能是short.com/z,节点1可能是short.com/a,节点52可能是short.com/Z,节点104可能是short.com/ZZ。当用户转到该URL时,我需要撤消该过程(显然)。

我可以想到一些可行的方法来解决此问题,但我想还有更好的方法。有什么建议?

I’m working on making a URL shortener for my site, and my current plan (I’m open to suggestions) is to use a node ID to generate the shortened URL. So, in theory, node 26 might be short.com/z, node 1 might be short.com/a, node 52 might be short.com/Z, and node 104 might be short.com/ZZ. When a user goes to that URL, I need to reverse the process (obviously).

I can think of some kludgy ways to go about this, but I’m guessing there are better ones. Any suggestions?


回答 0

ASCII转换为int:

ord('a')

97

然后返回一个字符串:

  • 在Python2中: str(unichr(97))
  • 在Python3中: chr(97)

'a'

ASCII to int:

ord('a')

gives 97

And back to a string:

  • in Python2: str(unichr(97))
  • in Python3: chr(97)

gives 'a'


回答 1

>>> ord("a")
97
>>> chr(97)
'a'
>>> ord("a")
97
>>> chr(97)
'a'

回答 2

如果多个字符绑定在一个整数/长整数内,这就是我的问题:

s = '0123456789'
nchars = len(s)
# string to int or long. Type depends on nchars
x = sum(ord(s[byte])<<8*(nchars-byte-1) for byte in range(nchars))
# int or long to string
''.join(chr((x>>8*(nchars-byte-1))&0xFF) for byte in range(nchars))

Yield'0123456789'x = 227581098929683594426425L

If multiple characters are bound inside a single integer/long, as was my issue:

s = '0123456789'
nchars = len(s)
# string to int or long. Type depends on nchars
x = sum(ord(s[byte])<<8*(nchars-byte-1) for byte in range(nchars))
# int or long to string
''.join(chr((x>>8*(nchars-byte-1))&0xFF) for byte in range(nchars))

Yields '0123456789' and x = 227581098929683594426425L


回答 3

BASE58编码URL怎么样?像flickr这样。

# note the missing lowercase L and the zero etc.
BASE58 = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ' 
url = ''
while node_id >= 58:
    div, mod = divmod(node_id, 58)
    url = BASE58[mod] + url
    node_id = int(div)

return 'http://short.com/%s' % BASE58[node_id] + url

将其转换为数字也没什么大不了的。

What about BASE58 encoding the URL? Like for example flickr does.

# note the missing lowercase L and the zero etc.
BASE58 = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ' 
url = ''
while node_id >= 58:
    div, mod = divmod(node_id, 58)
    url = BASE58[mod] + url
    node_id = int(div)

return 'http://short.com/%s' % BASE58[node_id] + url

Turning that back into a number isn’t a big deal either.


回答 4

使用hex(id)[2:]int(urlpart, 16)。还有其他选择。对您的id进行base32编码也可以正常工作,但是我不知道有没有内置Python进行base32编码的库。

显然,在Python 2.4中使用base64模块引入了base32编码器。您可以尝试使用b32encodeb32decode。你应该给True两者的casefoldmap01期权b32decode的情况下,人们写下你的短网址。

实际上,我收回了这一点。我仍然认为base32编码是一个好主意,但是该模块对于URL缩短的情况没有用。您可以查看模块中的实现,并针对此特定情况进行自己的设计。:-)

Use hex(id)[2:] and int(urlpart, 16). There are other options. base32 encoding your id could work as well, but I don’t know that there’s any library that does base32 encoding built into Python.

Apparently a base32 encoder was introduced in Python 2.4 with the base64 module. You might try using b32encode and b32decode. You should give True for both the casefold and map01 options to b32decode in case people write down your shortened URLs.

Actually, I take that back. I still think base32 encoding is a good idea, but that module is not useful for the case of URL shortening. You could look at the implementation in the module and make your own for this specific case. :-)


我如何让Pyflakes忽略声明?

问题:我如何让Pyflakes忽略声明?

我们的许多模块都始于:

try:
    import json
except ImportError:
    from django.utils import simplejson as json  # Python 2.4 fallback.

…这是整个文件中唯一的Pyflakes警告:

foo/bar.py:14: redefinition of unused 'json' from line 12

我如何让Pyflakes忽略这一点?

(通常我会去阅读文档,但是链接断开了。如果没有人回答,我只会阅读源代码。)

A lot of our modules start with:

try:
    import json
except ImportError:
    from django.utils import simplejson as json  # Python 2.4 fallback.

…and it’s the only Pyflakes warning in the entire file:

foo/bar.py:14: redefinition of unused 'json' from line 12

How can I get Pyflakes to ignore this?

(Normally I’d go read the docs but the link is broken. If nobody has an answer, I’ll just read the source.)


回答 0

如果您可以改用flake8-包裹pyflakes和pep8 checker-则以

# NOQA

(其中的空格非常大-代码末尾与之间的2个空格,在代码与文本#之间的一个空格NOQA)将告诉检查程序忽略该行上的任何错误。

If you can use flake8 instead – which wraps pyflakes as well as the pep8 checker – a line ending with

# NOQA

(in which the space is significant – 2 spaces between the end of the code and the #, one between it and the NOQA text) will tell the checker to ignore any errors on that line.


回答 1

我知道这是在不久前被质疑的,并且已经得到答复。

但是我想补充一下我通常使用的内容:

try:
    import json
    assert json  # silence pyflakes
except ImportError:
    from django.utils import simplejson as json  # Python 2.4 fallback.

I know this was questioned some time ago and is already answered.

But I wanted to add what I usually use:

try:
    import json
    assert json  # silence pyflakes
except ImportError:
    from django.utils import simplejson as json  # Python 2.4 fallback.

回答 2

是的,不幸的是dimod.org和所有好东西都一起倒了。

看一下pyflakes代码,在我看来pyflakes是经过设计的,因此可以很容易地将其用作“嵌入式快速检查器”。

为了实现忽略功能,您将需要编写自己的调用pyflakes检查器。

在这里您可以找到一个主意:http : //djangosnippets.org/snippets/1762/

请注意,以上代码段仅用于同一行中的注释位置。为了忽略整个块,您可能需要在块docstring中添加’pyflakes:ignore’并基于node.doc进行过滤。

祝好运!


我正在使用Pocket-lint进行各种静态代码分析。以下是在Pocket-Lint中忽略pyflakes所做的更改:https ://code.launchpad.net/~adiroiban/pocket-lint/907742/+merge/102882

Yep, unfortunately dimod.org is down together with all goodies.

Looking at the pyflakes code, it seems to me that pyflakes is designed so that it will be easy to use it as an “embedded fast checker”.

For implementing ignore functionality you will need to write your own that calls the pyflakes checker.

Here you can find an idea: http://djangosnippets.org/snippets/1762/

Note that the above snippet only for for comments places on the same line. For ignoring a whole block you might want to add ‘pyflakes:ignore’ in the block docstring and filter based on node.doc.

Good luck!


I am using pocket-lint for all kind of static code analysis. Here are the changes made in pocket-lint for ignoring pyflakes: https://code.launchpad.net/~adiroiban/pocket-lint/907742/+merge/102882


回答 3

引用github问题票证

尽管此修复程序仍在进行中,但是如果您想知道,可以通过以下方法解决:

try:
    from unittest.runner import _WritelnDecorator
    _WritelnDecorator; # workaround for pyflakes issue #13
except ImportError:
    from unittest import _WritelnDecorator

用所需的实体(模块,函数,类)替换_unittest和_WritelnDecorator

deemoowoor

To quote from the github issue ticket:

While the fix is still coming, this is how it can be worked around, if you’re wondering:

try:
    from unittest.runner import _WritelnDecorator
    _WritelnDecorator; # workaround for pyflakes issue #13
except ImportError:
    from unittest import _WritelnDecorator

Substitude _unittest and _WritelnDecorator with the entities (modules, functions, classes) you need

deemoowoor


回答 4

这是pyflakes的Monkey补丁,添加了# bypass_pyflakes注释选项。

passive_pyflakes.py

#!/usr/bin/env python

from pyflakes.scripts import pyflakes
from pyflakes.checker import Checker


def report_with_bypass(self, messageClass, *args, **kwargs):
    text_lineno = args[0] - 1
    with open(self.filename, 'r') as code:
        if code.readlines()[text_lineno].find('bypass_pyflakes') >= 0:
            return
    self.messages.append(messageClass(self.filename, *args, **kwargs))

# monkey patch checker to support bypass
Checker.report = report_with_bypass

pyflakes.main()

如果将其另存为bypass_pyflakes.py,则可以将其调用为python bypass_pyflakes.py myfile.py

http://chase-seibert.github.com/blog/2013/01/11/bypass_pyflakes.html

Here is a monkey patch for pyflakes that adds a # bypass_pyflakes comment option.

bypass_pyflakes.py

#!/usr/bin/env python

from pyflakes.scripts import pyflakes
from pyflakes.checker import Checker


def report_with_bypass(self, messageClass, *args, **kwargs):
    text_lineno = args[0] - 1
    with open(self.filename, 'r') as code:
        if code.readlines()[text_lineno].find('bypass_pyflakes') >= 0:
            return
    self.messages.append(messageClass(self.filename, *args, **kwargs))

# monkey patch checker to support bypass
Checker.report = report_with_bypass

pyflakes.main()

If you save this as bypass_pyflakes.py, then you can invoke it as python bypass_pyflakes.py myfile.py.

http://chase-seibert.github.com/blog/2013/01/11/bypass_pyflakes.html


回答 5

您也可以使用导入__import__。它不是pythonic,但是pyflakes不再警告您。请参阅的文档__import__

try:
    import json
except ImportError:
    __import__('django.utils', globals(), locals(), ['json'], -1)

You can also import with __import__. It’s not pythonic, but pyflakes does not warn you anymore. See documentation for __import__ .

try:
    import json
except ImportError:
    __import__('django.utils', globals(), locals(), ['json'], -1)

回答 6

我创建了一个带有一些awk魔术的shell脚本来帮助我。有了这个的所有生产线import typingfrom typing import#$(后者是我在这里使用一个特殊的注释)被排除($1是Python脚本的文件名):

result=$(pyflakes -- "$1" 2>&1)

# check whether there is any output
if [ "$result" ]; then

    # lines to exclude
    excl=$(awk 'BEGIN { ORS="" } /(#\$)|(import +typing)|(from +typing +import )/ { print sep NR; sep="|" }' "$1")

    # exclude lines if there are any (otherwise we get invalid regex)
    [ "$excl" ] &&
        result=$(awk "! /^[^:]+:(${excl}):/" <<< "$result")

fi

# now echo "$result" or such ...

基本上,它会记录行号并动态创建一个正则表达式。

I created a little shell script with some awk magic to help me. With this all lines with import typing, from typing import or #$ (latter is a special comment I am using here) are excluded ($1 is the file name of the Python script):

result=$(pyflakes -- "$1" 2>&1)

# check whether there is any output
if [ "$result" ]; then

    # lines to exclude
    excl=$(awk 'BEGIN { ORS="" } /(#\$)|(import +typing)|(from +typing +import )/ { print sep NR; sep="|" }' "$1")

    # exclude lines if there are any (otherwise we get invalid regex)
    [ "$excl" ] &&
        result=$(awk "! /^[^:]+:(${excl}):/" <<< "$result")

fi

# now echo "$result" or such ...

Basically it notes the line numbers and dynamically creates a regex out it.


使用Python从字符串中删除数字以外的字符?

问题:使用Python从字符串中删除数字以外的字符?

如何从字符串中删除除数字以外的所有字符?

How can I remove all characters except numbers from string?


回答 0

在Python 2. *中,到目前为止最快的方法是.translate

>>> x='aaa12333bb445bb54b5b52'
>>> import string
>>> all=string.maketrans('','')
>>> nodigs=all.translate(all, string.digits)
>>> x.translate(all, nodigs)
'1233344554552'
>>> 

string.maketrans生成一个转换表(长度为256的字符串),在这种情况下,该转换表与''.join(chr(x) for x in range(256))(更快地制作;-)相同。.translate应用转换表(这里无关紧要,因为all本质上是指身份),并删除第二个参数(关键部分)中存在的字符。

.translate在Unicode字符串(和Python 3中的字符串)上的工作方式大不相同-我确实希望问题能说明感兴趣的是哪个Python的主要发行版!)-并不是那么简单,也不是那么快,尽管仍然非常有用。

回到2. *,性能差异令人印象深刻……:

$ python -mtimeit -s'import string; all=string.maketrans("", ""); nodig=all.translate(all, string.digits); x="aaa12333bb445bb54b5b52"' 'x.translate(all, nodig)'
1000000 loops, best of 3: 1.04 usec per loop
$ python -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'
100000 loops, best of 3: 7.9 usec per loop

将事情加速7到8倍几乎不是花生,因此该translate方法非常值得了解和使用。另一种流行的非RE方法…

$ python -mtimeit -s'x="aaa12333bb445bb54b5b52"' '"".join(i for i in x if i.isdigit())'
100000 loops, best of 3: 11.5 usec per loop

比RE慢50%,因此该.translate方法将其击败了一个数量级。

在Python 3或Unicode中,您需要传递.translate一个映射(以普通字符而不是直接字符作为键),该映射返回None要删除的内容。这是删除“除以下所有内容外”几个字符的一种便捷方式:

import string

class Del:
  def __init__(self, keep=string.digits):
    self.comp = dict((ord(c),c) for c in keep)
  def __getitem__(self, k):
    return self.comp.get(k)

DD = Del()

x='aaa12333bb445bb54b5b52'
x.translate(DD)

也发出'1233344554552'。但是,将其放在xx.py中,我们可以…:

$ python3.1 -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'
100000 loops, best of 3: 8.43 usec per loop
$ python3.1 -mtimeit -s'import xx; x="aaa12333bb445bb54b5b52"' 'x.translate(xx.DD)'
10000 loops, best of 3: 24.3 usec per loop

…表明性能优势对于这种“删除”任务消失了,而变成了性能下降。

In Python 2.*, by far the fastest approach is the .translate method:

>>> x='aaa12333bb445bb54b5b52'
>>> import string
>>> all=string.maketrans('','')
>>> nodigs=all.translate(all, string.digits)
>>> x.translate(all, nodigs)
'1233344554552'
>>> 

string.maketrans makes a translation table (a string of length 256) which in this case is the same as ''.join(chr(x) for x in range(256)) (just faster to make;-). .translate applies the translation table (which here is irrelevant since all essentially means identity) AND deletes characters present in the second argument — the key part.

.translate works very differently on Unicode strings (and strings in Python 3 — I do wish questions specified which major-release of Python is of interest!) — not quite this simple, not quite this fast, though still quite usable.

Back to 2.*, the performance difference is impressive…:

$ python -mtimeit -s'import string; all=string.maketrans("", ""); nodig=all.translate(all, string.digits); x="aaa12333bb445bb54b5b52"' 'x.translate(all, nodig)'
1000000 loops, best of 3: 1.04 usec per loop
$ python -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'
100000 loops, best of 3: 7.9 usec per loop

Speeding things up by 7-8 times is hardly peanuts, so the translate method is well worth knowing and using. The other popular non-RE approach…:

$ python -mtimeit -s'x="aaa12333bb445bb54b5b52"' '"".join(i for i in x if i.isdigit())'
100000 loops, best of 3: 11.5 usec per loop

is 50% slower than RE, so the .translate approach beats it by over an order of magnitude.

In Python 3, or for Unicode, you need to pass .translate a mapping (with ordinals, not characters directly, as keys) that returns None for what you want to delete. Here’s a convenient way to express this for deletion of “everything but” a few characters:

import string

class Del:
  def __init__(self, keep=string.digits):
    self.comp = dict((ord(c),c) for c in keep)
  def __getitem__(self, k):
    return self.comp.get(k)

DD = Del()

x='aaa12333bb445bb54b5b52'
x.translate(DD)

also emits '1233344554552'. However, putting this in xx.py we have…:

$ python3.1 -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'
100000 loops, best of 3: 8.43 usec per loop
$ python3.1 -mtimeit -s'import xx; x="aaa12333bb445bb54b5b52"' 'x.translate(xx.DD)'
10000 loops, best of 3: 24.3 usec per loop

…which shows the performance advantage disappears, for this kind of “deletion” tasks, and becomes a performance decrease.


回答 1

使用re.sub,如下所示:

>>> import re
>>> re.sub('\D', '', 'aas30dsa20')
'3020'

\D 匹配任何非数字字符,因此,上面的代码实质上是将每个非数字字符替换为空字符串。

或者您可以使用filter,就像这样(在Python 2中):

>>> filter(str.isdigit, 'aas30dsa20')
'3020'

由于在Python 3中,filter返回的是迭代器而不是list,因此您可以使用以下代码:

>>> ''.join(filter(str.isdigit, 'aas30dsa20'))
'3020'

Use re.sub, like so:

>>> import re
>>> re.sub('\D', '', 'aas30dsa20')
'3020'

\D matches any non-digit character so, the code above, is essentially replacing every non-digit character for the empty string.

Or you can use filter, like so (in Python 2):

>>> filter(str.isdigit, 'aas30dsa20')
'3020'

Since in Python 3, filter returns an iterator instead of a list, you can use the following instead:

>>> ''.join(filter(str.isdigit, 'aas30dsa20'))
'3020'

回答 2

s=''.join(i for i in s if i.isdigit())

另一个生成器变体。

s=''.join(i for i in s if i.isdigit())

Another generator variant.


回答 3

您可以使用过滤器:

filter(lambda x: x.isdigit(), "dasdasd2313dsa")

在python3.0上,您必须加入这个(有点丑陋的:()

''.join(filter(lambda x: x.isdigit(), "dasdasd2313dsa"))

You can use filter:

filter(lambda x: x.isdigit(), "dasdasd2313dsa")

On python3.0 you have to join this (kinda ugly :( )

''.join(filter(lambda x: x.isdigit(), "dasdasd2313dsa"))

回答 4

按照拜耳的回答:

''.join(i for i in s if i.isdigit())

along the lines of bayer’s answer:

''.join(i for i in s if i.isdigit())

回答 5

您可以使用Regex轻松完成此操作

>>> import re
>>> re.sub("\D","","£70,000")
70000

You can easily do it using Regex

>>> import re
>>> re.sub("\D","","£70,000")
70000

回答 6

x.translate(None, string.digits)

将从字符串中删除所有数字。要删除字母并保留数字,请执行以下操作:

x.translate(None, string.letters)
x.translate(None, string.digits)

will delete all digits from string. To delete letters and keep the digits, do this:

x.translate(None, string.letters)

回答 7

这位操作员在评论中提到他想保留小数位。可以通过re.sub方法(按照第二个方法和恕我直言的最佳答案)来完成,方法是明确列出要保留的字符,例如

>>> re.sub("[^0123456789\.]","","poo123.4and5fish")
'123.45'

The op mentions in the comments that he wants to keep the decimal place. This can be done with the re.sub method (as per the second and IMHO best answer) by explicitly listing the characters to keep e.g.

>>> re.sub("[^0123456789\.]","","poo123.4and5fish")
'123.45'

回答 8

Python 3的快速版本:

# xx3.py
from collections import defaultdict
import string
_NoneType = type(None)

def keeper(keep):
    table = defaultdict(_NoneType)
    table.update({ord(c): c for c in keep})
    return table

digit_keeper = keeper(string.digits)

这是与regex的性能比较:

$ python3.3 -mtimeit -s'import xx3; x="aaa12333bb445bb54b5b52"' 'x.translate(xx3.digit_keeper)'
1000000 loops, best of 3: 1.02 usec per loop
$ python3.3 -mtimeit -s'import re; r = re.compile(r"\D"); x="aaa12333bb445bb54b5b52"' 'r.sub("", x)'
100000 loops, best of 3: 3.43 usec per loop

对我来说,它比正则表达式快3倍多。它也比class Del上面更快,因为defaultdict它使用C语言而不是(慢)Python进行所有查找。这是我在同一系统上的版本,以进行比较。

$ python3.3 -mtimeit -s'import xx; x="aaa12333bb445bb54b5b52"' 'x.translate(xx.DD)'
100000 loops, best of 3: 13.6 usec per loop

A fast version for Python 3:

# xx3.py
from collections import defaultdict
import string
_NoneType = type(None)

def keeper(keep):
    table = defaultdict(_NoneType)
    table.update({ord(c): c for c in keep})
    return table

digit_keeper = keeper(string.digits)

Here’s a performance comparison vs. regex:

$ python3.3 -mtimeit -s'import xx3; x="aaa12333bb445bb54b5b52"' 'x.translate(xx3.digit_keeper)'
1000000 loops, best of 3: 1.02 usec per loop
$ python3.3 -mtimeit -s'import re; r = re.compile(r"\D"); x="aaa12333bb445bb54b5b52"' 'r.sub("", x)'
100000 loops, best of 3: 3.43 usec per loop

So it’s a little bit more than 3 times faster than regex, for me. It’s also faster than class Del above, because defaultdict does all its lookups in C, rather than (slow) Python. Here’s that version on my same system, for comparison.

$ python3.3 -mtimeit -s'import xx; x="aaa12333bb445bb54b5b52"' 'x.translate(xx.DD)'
100000 loops, best of 3: 13.6 usec per loop

回答 9

使用生成器表达式:

>>> s = "foo200bar"
>>> new_s = "".join(i for i in s if i in "0123456789")

Use a generator expression:

>>> s = "foo200bar"
>>> new_s = "".join(i for i in s if i in "0123456789")

回答 10

丑陋但可行:

>>> s
'aaa12333bb445bb54b5b52'
>>> a = ''.join(filter(lambda x : x.isdigit(), s))
>>> a
'1233344554552'
>>>

Ugly but works:

>>> s
'aaa12333bb445bb54b5b52'
>>> a = ''.join(filter(lambda x : x.isdigit(), s))
>>> a
'1233344554552'
>>>

回答 11

$ python -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'

100000次循环,每循环3:2.48微秒最佳

$ python -mtimeit -s'import re; x="aaa12333bab445bb54b5b52"' '"".join(re.findall("[a-z]+",x))'

100000次循环,最好为3:每个循环2.02微秒

$ python -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'

100000次循环,每循环3:2.37最佳

$ python -mtimeit -s'import re; x="aaa12333bab445bb54b5b52"' '"".join(re.findall("[a-z]+",x))'

100000次循环,每循环3:1.97最佳

我已经观察到联接比sub快。

$ python -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'

100000 loops, best of 3: 2.48 usec per loop

$ python -mtimeit -s'import re; x="aaa12333bab445bb54b5b52"' '"".join(re.findall("[a-z]+",x))'

100000 loops, best of 3: 2.02 usec per loop

$ python -mtimeit -s'import re;  x="aaa12333bb445bb54b5b52"' 're.sub(r"\D", "", x)'

100000 loops, best of 3: 2.37 usec per loop

$ python -mtimeit -s'import re; x="aaa12333bab445bb54b5b52"' '"".join(re.findall("[a-z]+",x))'

100000 loops, best of 3: 1.97 usec per loop

I had observed that join is faster than sub.


回答 12

您可以阅读每个字符。如果是数字,则将其包括在答案中。该str.isdigit() 方法是一种知道字符是否为数字的方法。

your_input = '12kjkh2nnk34l34'
your_output = ''.join(c for c in your_input if c.isdigit())
print(your_output) # '1223434'

You can read each character. If it is digit, then include it in the answer. The str.isdigit() method is a way to know if a character is digit.

your_input = '12kjkh2nnk34l34'
your_output = ''.join(c for c in your_input if c.isdigit())
print(your_output) # '1223434'

回答 13

不是一行代码,但非常简单:

buffer = ""
some_str = "aas30dsa20"

for char in some_str:
    if not char.isdigit():
        buffer += char

print( buffer )

Not a one liner but very simple:

buffer = ""
some_str = "aas30dsa20"

for char in some_str:
    if not char.isdigit():
        buffer += char

print( buffer )

回答 14

我用这个 'letters'应该包含您要删除的所有字母:

Output = Input.translate({ord(i): None for i in 'letters'}))

例:

Input = "I would like 20 dollars for that suit" Output = Input.translate({ord(i): None for i in 'abcdefghijklmnopqrstuvwxzy'})) print(Output)

输出: 20

I used this. 'letters' should contain all the letters that you want to get rid of:

Output = Input.translate({ord(i): None for i in 'letters'}))

Example:

Input = "I would like 20 dollars for that suit" Output = Input.translate({ord(i): None for i in 'abcdefghijklmnopqrstuvwxzy'})) print(Output)

Output: 20


我应该使用scipy.pi,numpy.pi还是math.pi?

问题:我应该使用scipy.pi,numpy.pi还是math.pi?

在使用SciPy的和NumPy的一个项目,我应该使用scipy.pinumpy.pimath.pi

In a project using SciPy and NumPy, should I use scipy.pi, numpy.pi, or math.pi?


回答 0

>>> import math
>>> import numpy as np
>>> import scipy
>>> math.pi == np.pi == scipy.pi
True

所以没关系,它们都是相同的值。

这三个模块均提供pi值的唯一原因是,如果仅使用三个模块之一,则可以方便地访问pi,而不必导入另一个模块。他们没有为pi提供不同的值。

>>> import math
>>> import numpy as np
>>> import scipy
>>> math.pi == np.pi == scipy.pi
True

So it doesn’t matter, they are all the same value.

The only reason all three modules provide a pi value is so if you are using just one of the three modules, you can conveniently have access to pi without having to import another module. They’re not providing different values for pi.


回答 1

需要注意的一件事是,当然,并非所有库都将对pi使用相同的含义,因此知道您使用的内容永远不会有任何伤害。例如,符号数学库Sympy对pi的表示与math和numpy不同:

import math
import numpy
import scipy
import sympy

print(math.pi == numpy.pi)
> True
print(math.pi == scipy.pi)
> True
print(math.pi == sympy.pi)
> False

One thing to note is that not all libraries will use the same meaning for pi, of course, so it never hurts to know what you’re using. For example, the symbolic math library Sympy’s representation of pi is not the same as math and numpy:

import math
import numpy
import scipy
import sympy

print(math.pi == numpy.pi)
> True
print(math.pi == scipy.pi)
> True
print(math.pi == sympy.pi)
> False

有趣好用的Python教程

退出移动版
微信支付
请使用 微信 扫码支付