Java or Python for natural language processing

Question: Java or Python for natural language processing

I would like to know which programming language is better for natural language processing: Java or Python? I have found lots of questions and answers about it, but I am still lost in choosing which one to use.

I also want to know which NLP library to use for Java, since there are lots of them (LingPipe, GATE, OpenNLP, StanfordNLP). For Python, most programmers recommend NLTK.

But if I am to do some text processing or information extraction on unstructured data (just free-form plain English text) to get some useful information, what is the best option? Java or Python? Which library is suitable?

Updated

What I want to do is extract useful product information from unstructured data (e.g. users write advertisements for mobiles or laptops in many different forms, in not very standard English).


Answer 0

Java vs Python for NLP is very much a preference or necessity. Depending on the company/projects you’ll need to use one or the other and often there isn’t much of a choice unless you’re heading a project.

Other than NLTK (www.nltk.org), there are actually other libraries for text processing in python:

(for more, see https://pypi.python.org/pypi?%3Aaction=search&term=natural+language+processing&submit=search)

For Java, there’re tonnes of others but here’s another list:

This is a nice comparison for basic string processing, see http://nltk.googlecode.com/svn/trunk/doc/howto/nlp-python.html

A useful comparison of GATE vs UIMA vs OpenNLP, see https://www.assembla.com/spaces/extraction-of-cost-data/wiki/Gate-vs-UIMA-vs-OpenNLP?version=4

If you’re uncertain which language to go for in NLP, personally I’d say “any language that will give you the desired analysis/output”; see Which language or tools to learn for natural language processing?

Here’s a pretty recent (2017) list of NLP tools: https://github.com/alvations/awesome-community-curated-nlp

An older list of NLP tools (2013): http://web.archive.org/web/20130703190201/http://yauhenklimovich.wordpress.com/2013/05/20/tools-nlp


Other than language processing tools, you would very much need machine learning tools to incorporate into NLP pipelines.

There’s a whole range in Python and Java, and once again it’s up to preference and whether the libraries are user-friendly enough:

Machine Learning libraries in python:

(for more, see https://pypi.python.org/pypi?%3Aaction=search&term=machine+learning&submit=search)


With the recent (2015) deep learning tsunami in NLP, possibly you could consider: https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software

I’ll avoid listing deep learning tools out of non-favoritism / neutrality.


Other Stackoverflow questions that also asked for NLP/ML tools:


Answer 1

The question is very open ended. That said, rather than choose one, below is a comparison depending on the language that you would like to use (since there are good libraries available in both languages).

Python

In terms of Python, the first place you should look at is the Python Natural Language Toolkit. As they note in their description, NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

There is also some excellent code that you can look up that originated out of Google’s Natural Language Toolkit project that is Python based. You can find a link to that code here on GitHub.

Java

The first place to look would be Stanford’s Natural Language Processing Group. All of the software distributed there is written in Java. All recent distributions require Oracle Java 6+ or OpenJDK 7+. Distribution packages include components for command-line invocation, jar files, a Java API, and source code.

Another great, more general option that you see in a lot of machine learning environments is Weka. Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.


How do you configure Django for simple development and deployment?

Question: How do you configure Django for simple development and deployment?

I tend to use SQLite when doing Django development, but on a live server something more robust is often needed (MySQL/PostgreSQL, for example). Invariably, there are other changes to make to the Django settings as well: different logging locations / intensities, media paths, etc.

How do you manage all these changes to make deployment a simple, automated process?


Answer 0

Update: django-configurations has been released which is probably a better option for most people than doing it manually.

If you would prefer to do things manually, my earlier answer still applies:

I have multiple settings files.

  • settings_local.py – host-specific configuration, such as database name, file paths, etc.
  • settings_development.py – configuration used for development, e.g. DEBUG = True.
  • settings_production.py – configuration used for production, e.g. SERVER_EMAIL.

I tie these all together with a settings.py file that first imports settings_local.py, and then one of the other two. It decides which to load using two settings inside settings_local.py: DEVELOPMENT_HOSTS and PRODUCTION_HOSTS. settings.py calls platform.node() to find the hostname of the machine it is running on, looks for that hostname in the two lists, and loads the second settings file depending on which list contains it.

That way, the only thing you really need to worry about is keeping the settings_local.py file up to date with the host-specific configuration, and everything else is handled automatically.

Check out an example here.
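
The hostname-based selection described in this answer could be sketched roughly as follows. This is only an illustration under assumptions, not the author's actual file: the helper name pick_settings_module and the example hostnames are made up, and a real settings.py would go on to import the module that is chosen.

```python
import platform

# Hypothetical host lists; in the answer these live in settings_local.py.
DEVELOPMENT_HOSTS = ["dev-laptop"]
PRODUCTION_HOSTS = ["web01", "web02"]


def pick_settings_module(hostname, development_hosts, production_hosts):
    """Return the name of the second settings module to load for this host."""
    if hostname in development_hosts:
        return "settings_development"
    if hostname in production_hosts:
        return "settings_production"
    raise RuntimeError("Host %r is in neither host list" % hostname)


if __name__ == "__main__":
    # platform.node() returns the network name of the current machine.
    try:
        print(pick_settings_module(platform.node(), DEVELOPMENT_HOSTS, PRODUCTION_HOSTS))
    except RuntimeError as exc:
        print(exc)
```

Raising an error for an unknown host is a design choice made here for the sketch; silently falling back to the development settings would also be possible, but it risks running a new production machine with DEBUG enabled.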


Answer 1

Personally, I use a single settings.py for the project and just have it look up the hostname it’s running on. My development machines have hostnames that start with “gabriel”, so I just have this:

import socket
if socket.gethostname().startswith('gabriel'):
    LIVEHOST = False
else: 
    LIVEHOST = True

then in other parts I have things like:

if LIVEHOST:
    DEBUG = False
    PREPEND_WWW = True
    MEDIA_URL = 'http://static1.grsites.com/'
else:
    DEBUG = True
    PREPEND_WWW = False
    MEDIA_URL = 'http://localhost:8000/static/'

and so on. A little bit less readable, but it works fine and saves having to juggle multiple settings files.


Answer 2

At the end of settings.py I have the following:

try:
    from settings_local import *
except ImportError:
    pass

This way, if I want to override the default settings, I just need to put settings_local.py right next to settings.py.


Answer 3

I have two files. settings_base.py contains the common/default settings and is checked into source control. Each deployment has a separate settings.py, which executes from settings_base import * at the beginning and then overrides settings as needed.


Answer 4

The simplest way I found was to:

1) use the default settings.py for local development and 2) create a production-settings.py starting with:

import os
from settings import *

And then just override the settings that differ in production:

DEBUG = False
TEMPLATE_DEBUG = DEBUG


DATABASES = {
    'default': {
           ....
    }
}

Answer 5

Somewhat related, for the issue of deploying Django itself with multiple databases, you may want to take a look at Djangostack. You can download a completely free installer that allows you to install Apache, Python, Django, etc. As part of the installation process we allow you to select which database you want to use (MySQL, SQLite, PostgreSQL). We use the installers extensively when automating deployments internally (they can be run in unattended mode).


Answer 6

I have my settings.py file in an external directory. That way, it doesn’t get checked into source control, or over-written by a deploy. I put this in the settings.py file under my Django project, along with any default settings:

import os.path

def _load_settings(path):
    print("Loading configuration from %s" % path)
    if os.path.exists(path):
        settings = {}
        # Run the file in a scratch namespace, then copy the names into this
        # module's globals (exec replaces Python 2's execfile).
        with open(path) as f:
            exec(compile(f.read(), path, "exec"), globals(), settings)
        for setting in settings:
            globals()[setting] = settings[setting]

_load_settings("/usr/local/conf/local_settings.py")

Note: This is very dangerous if you can’t trust local_settings.py.


Answer 7

In addition to the multiple settings files mentioned by Jim, I also tend to place two settings at the top of my settings.py file: BASE_DIR and BASE_URL, set to the path of the code and the URL of the base of the site. All other settings are then written to build on these.

For example, with BASE_DIR = "/home/sean/myapp/", you can write MEDIA_ROOT = "%smedia/" % BASE_DIR.

So when moving the project I only have to edit these settings and not search the whole file.
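
As a minimal sketch of this idea, the derived settings might look like the following. The BASE_URL value and the MEDIA_URL setting are assumptions added for illustration; only the BASE_DIR and MEDIA_ROOT example comes from the answer.

```python
# Only these two lines need editing when the project moves.
BASE_DIR = "/home/sean/myapp/"
BASE_URL = "http://example.com/"  # hypothetical site URL

# Everything else is built from the two base settings.
MEDIA_ROOT = "%smedia/" % BASE_DIR
MEDIA_URL = "%smedia/" % BASE_URL
```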

I would also recommend looking at fabric and Capistrano (Ruby tool, but it can be used to deploy Django applications) which facilitate automation of remote deployment.


Answer 8

Well, I use this configuration:

At the end of settings.py:

#settings.py
try:
    from locale_settings import *
except ImportError:
    pass

And in locale_settings.py:

#locale_settings.py
class Settings(object):

    def __init__(self):
        import settings
        self.settings = settings

    def __getattr__(self, name):
        return getattr(self.settings, name)

settings = Settings()

INSTALLED_APPS = settings.INSTALLED_APPS + (
    'gunicorn',)

# Delete duplicate settings maybe not needed, but I prefer to do it.
del settings
del Settings

Answer 9

So many complicated answers!

Every settings.py file comes with:

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

I use that directory to set the DEBUG variable like this (replace the path with the directory where your dev code is):

DEBUG=False
if(BASE_DIR=="/path/to/my/dev/dir"):
    DEBUG = True

Then, every time the settings.py file is moved, DEBUG will be False and it’s your production environment.

Every time you need different settings than the ones in your dev environment just use:

if DEBUG:
    pass  # Debug setting
else:
    pass  # Release setting

Answer 10

I think whether you need to step up from SQLite depends on the size of the site. I’ve successfully used SQLite on several smaller live sites and it runs great.


Answer 11

I use an environment variable:

import os

if os.environ.get('WEB_MODE') == 'production':
    from settings_production import *
else:
    from settings_dev import *

I believe this is a much better approach, because eventually you need special settings for your test environment, and you can easily add it to this condition.
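
One way the condition could grow to cover a test environment is a small lookup table, sketched below. The "test" mode, the settings_test module name and the helper settings_module_for are assumptions for illustration and are not part of the answer; a real settings file would then import the module that is returned.

```python
import os

# Hypothetical mapping from WEB_MODE values to settings modules.
SETTINGS_BY_MODE = {
    "production": "settings_production",
    "test": "settings_test",
}


def settings_module_for(mode):
    """Pick a settings module name, falling back to the dev settings."""
    return SETTINGS_BY_MODE.get(mode, "settings_dev")


if __name__ == "__main__":
    # An unset WEB_MODE gives None, which falls back to settings_dev.
    print(settings_module_for(os.environ.get("WEB_MODE")))
```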


Answer 12

This is an older post but I think if I add this useful library it will simplify things.

Use django-configurations

Quickstart

pip install django-configurations

Then subclass the included configurations.Configuration class in your project’s settings.py or any other module you’re using to store the settings constants, e.g.:

# mysite/settings.py

from configurations import Configuration

class Dev(Configuration):
    DEBUG = True

Set the DJANGO_CONFIGURATION environment variable to the name of the class you just created, e.g. in ~/.bashrc:

export DJANGO_CONFIGURATION=Dev

and the DJANGO_SETTINGS_MODULE environment variable to the module import path as usual, e.g. in bash:

export DJANGO_SETTINGS_MODULE=mysite.settings

Alternatively supply the --configuration option when using Django management commands along the lines of Django’s default --settings command line option, e.g.:

python manage.py runserver --settings=mysite.settings --configuration=Dev

To enable Django to use your configuration you now have to modify your manage.py or wsgi.py script to use django-configurations’ versions of the appropriate starter functions, e.g. a typical manage.py using django-configurations would look like this:

#!/usr/bin/env python

import os
import sys

if __name__ == "__main__":
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')
    os.environ.setdefault('DJANGO_CONFIGURATION', 'Dev')

    from configurations.management import execute_from_command_line

    execute_from_command_line(sys.argv)

Notice in line 10 we don’t use the common tool django.core.management.execute_from_command_line but instead configurations.management.execute_from_command_line.

The same applies to your wsgi.py file, e.g.:

import os

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')
os.environ.setdefault('DJANGO_CONFIGURATION', 'Dev')

from configurations.wsgi import get_wsgi_application

application = get_wsgi_application()

Here we don’t use the default django.core.wsgi.get_wsgi_application function but instead configurations.wsgi.get_wsgi_application.

That’s it! You can now use your project with manage.py and your favorite WSGI enabled server.


Answer 13

In fact you should probably consider having the same (or almost the same) configs for your development and production environment. Otherwise, situations like “Hey, it works on my machine” will happen from time to time.

So in order to automate your deployment and eliminate those “works on my machine” (WOMM) issues, just use Docker.


Anaconda vs. EPD Enthought vs. manual installation of Python [closed]

Question: Anaconda vs. EPD Enthought vs. manual installation of Python [closed]

What are the relative merits / downsides of various Python bundles (EPD / Anaconda) vs. a manual install?

I have installed EPD Academic, and I have no issues with it. It provides more packages than I think I will ever need, and it is very easy to update using enpkg enstaller. The EPD academic licence requires yearly renewal, however, and the free version does not do updates as easily.

At the moment I really only use a handful of packages such as Pandas, NumPy, SciPy, matplotlib, IPython, Statsmodels and their respective dependencies.

For such limited use am I better off with manual install and pip install --upgrade 'package' or do the bundles offer anything over and above this?


Answer 0

Update 2015: Nowadays I always recommend Anaconda. It includes lots of Python packages for scientific computing, data science, web development, etc. It also provides a superior environment tool, conda, which lets you easily switch between environments, even between Python 2 and 3. It is also updated very quickly when a new version of a package is released, and you can just run conda update packagename to update it.

Original answer below:

On Windows, the complicated part is compiling the math packages, so I think a manual install is a viable option only if you are interested in Python alone, without other packages.

So it is better to choose either EPD (now Canopy) or Anaconda.

Anaconda has around 270 packages, including the most important for most scientific applications and data analysis, that is, NumPy, SciPy, Pandas, IPython, matplotlib, Scikit-learn. So if this is enough for you, I would choose Anaconda.

Instead, if you are interested in other packages, and even more if you use any of the Enthought packages (Chaco for example is very useful for realtime data visualization), then EPD/Canopy is probably a better choice. The Academic version has a larger number of packages in the base install, and many more in the repository. Anaconda also includes Chaco.


Answer 1

I have tried various Windows distributions in the last year, trying to find one suitable for my work environment (behind a proxy, but without access to the proxy configuration).

Here is my feedback from experience:

EPD/Canopy: We had a license for EPD, but it was old and we were unable to update because of the weird proxy situation. In order to add some packages (such as a recent version of xlrd/xlwt), I compiled from source. To update SciPy and NumPy, I used the precompiled installers from http://www.lfd.uci.edu/~gohlke/pythonlibs/, but that would sometimes break compatibility. I loved having a fully configured Py2exe and Cython that simply worked out of the box.

After a while, I tried installing the free version of Canopy, but it lacks Cython, py2exe and some specific advanced packages I needed, so I never really used it. Some of my colleagues bought the full Canopy license, but we’re still not sure how they’re going to update…

Python(x,y): Not wanting to struggle with licenses, I installed Python(x,y) at home. The only downside I noticed right now is that the standard installation requires you to select which packages you want. It’s both a good and a bad point, because I can’t be sure that my clients will have the exact same configuration as I do when I install. (The Enthought tool suite can be installed in Python(x,y).) After using Python(x,y) for a while, I just noticed I installed the 32 bit version. Although it is not clear on their website, it seems they don’t have a 64 bit version as of July 2015. I’m going to uninstall it and get a 64 bit distribution.

Anaconda: When I first wrote this, Anaconda didn’t seem to have enough packages yet. A couple of years later, it seems much better, I’m going to give it a try!

Manual: In order to avoid version compatibility issues with our old EPD version, I ended up using manual Python installation and adding additional packages from the LFD website linked above. It works great, but I would still suggest Canopy to a new user who requires advanced packages (like GDAL or PyFITS).

Summary: If you go for Canopy, get the full licence (Academic or purchased). Else, go with Python(x,y), it will end up being the same.

On Ubuntu: No need for a distribution. It’s all relatively recent (+/- 6 months is tolerable) and pre-compiled. You just need to execute sudo apt-get install python python-scipy and it’s there! Most advanced packages are there as well.


Answer 2

The other answers cover the ground quite nicely, so I just want to remark on one particular aspect that nobody has mentioned yet. It is probably fairly niche, but it may potentially make or break Anaconda or Canopy for some people under Linux systems:

Anaconda Python builds use the UCS4 Unicode mode, whereas Enthought Canopy uses UCS2.

What this means in practical terms is that if you rely on any extensions which you cannot compile yourself for whatever reason (e.g. pre-compiled proprietary libraries), if they happen not to be built for a Python version with the same mode, you may sooner or later run into errors that look something like undefined symbol: PyUnicodeUCS4_AsUTF8String.

According to PEP 513, UCS4 currently seems to be more popular and recommended. Also, the whole UCS compatibility issue seems to affect only Python 2.x and 3.x versions before 3.3.
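
To check which mode a given interpreter was built with, one option is to look at sys.maxunicode, which is 0xFFFF on narrow (UCS2) builds and 0x10FFFF on wide (UCS4) builds. The helper name unicode_build below is made up for this sketch. On Python 3.3 and later the value is always 0x10FFFF, so the check is only informative on older interpreters.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify an interpreter as a narrow (UCS2) or wide (UCS4) build."""
    return "UCS4" if maxunicode > 0xFFFF else "UCS2"


if __name__ == "__main__":
    print(unicode_build())
```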


Answer 3

I used Anaconda for years and liked it quite a bit. Unfortunately, IPython Notebook (now Jupyter) is unavailable without the enterprise edition.

I want to use Jupyter notebooks in the classroom, so I switched to Canopy. It seems easy enough to install all of the packages we need. Admittedly, we haven’t tested them all.


How do I install PyQt4 on Windows using pip?

Question: How do I install PyQt4 on Windows using pip?

I’m using Python 3.4 on Windows. When I run a script, it complains

ImportError: No Module named 'PyQt4'

So I tried to install it, but pip install PyQt4 gives

Could not find any downloads that satisfy the requirement PyQt4

although it does show up when I run pip search PyQt4. I tried to pip install python-qt, which installed successfully but that didn’t solve the problem.

What am I doing wrong?


Answer 0

Here are Windows wheel packages built by Christoph Gohlke – Python Windows Binary packages – PyQt

In the filenames, cp27 means CPython version 2.7, cp35 means CPython 3.5, etc.

Since Qt is a more complicated system with a compiled C++ codebase underlying the python interface it provides you, it can be more complex to build than just a pure python code package, which means it can be hard to install it from source.

Make sure you grab the correct Windows wheel file (python version, 32/64 bit), and then use pip to install it – e.g:

C:\path\where\wheel\is\> pip install PyQt4-4.11.4-cp35-none-win_amd64.whl

This should install properly if you are running an x64 build of Python 3.5.
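
If you are unsure which wheel file matches your interpreter, a small script along these lines can print the two filename parts to look for. It is a sketch under assumptions: the helper name wheel_tags is made up, and it only covers the cpXY and win32/win_amd64 parts of the filename discussed in this answer.

```python
import struct
import sys


def wheel_tags(version_info, pointer_bits):
    """Return the (Python tag, Windows platform tag) pair to look for."""
    python_tag = "cp%d%d" % (version_info[0], version_info[1])
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


if __name__ == "__main__":
    # struct.calcsize("P") is the size of a pointer in bytes.
    print(wheel_tags(sys.version_info, struct.calcsize("P") * 8))
```

Checking the pointer size of the interpreter itself matters because a 32-bit Python can be installed on 64-bit Windows, in which case the win32 wheel is the one that fits.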


Answer 1

PyQt4 is no longer supported, but you can install PyQt5 with pip:

pip install PyQt5

Answer 2

You can’t use pip. You have to download from the Riverbank website and run the installer for your version of python. If there is no install for your version, you will have to install Python for one of the available installers, or build from source (which is rather involved). Other answers and comments have the links.


Answer 3

If you install PyQt4 on Windows, files wind up here by default:

C:\Python27\Lib\site-packages\PyQt4\*.*

but it also leaves a file here:

C:\Python27\Lib\site-packages\sip.pyd

If you copy both sip.pyd and the PyQt4 folder into your virtualenv, things will work fine.

For example:

mkdir c:\code
cd c:\code
virtualenv BACKUP
cd c:\code\BACKUP\scripts
activate

Then with windows explorer copy from C:\Python27\Lib\site-packages the file (sip.pyd) and folder (PyQt4) mentioned above to C:\code\BACKUP\Lib\site-packages\

Then back at CLI:

cd ..                 
(c:\code\BACKUP)
python backup.py

The problem with trying to launch a script which calls PyQt4 from within virtualenv is that the virtualenv does not have PyQt4 installed and it doesn’t know how to reference the default installation described above. But follow these steps to copy PyQt4 into your virtualenv and things should work great.


回答 4

可以从网站下载页面直接获得较早的PyQt .exe安装程序。现在,随着PyQt4.12的发布,安装程序已被弃用。您可以通过编译它们使库以某种方式工作,但这意味着要花很多时间。

否则,您可以使用以前的发行版来解决您的目的。可以从以下网站下载.exe Windows安装程序:

https://sourceforge.net/projects/pyqt/files/PyQt4/PyQt-4.11.4/

Earlier PyQt .exe installers were available directly from the website download page. Now with the release of PyQt4.12 , installers have been deprecated. You can make the libraries work somehow by compiling them but that would mean going to great lengths of trouble.

Otherwise you can use the previous distributions to solve your purpose. The .exe windows installers can be downloaded from :

https://sourceforge.net/projects/pyqt/files/PyQt4/PyQt-4.11.4/


回答 5

看来您可能需要对PyQt4进行一些手动安装。

http://pyqt.sourceforge.net/Docs/PyQt4/installation.html

这可能会有所帮助,在教程/逐步设置格式中可能会有所帮助:

http://movingthelamppost.com/blog/html/2013/07/12/installing_pyqt____because_it_s_too_good_for_pip_or_easy_install_.html

It looks like you may have to do a bit of manual installation for PyQt4.

http://pyqt.sourceforge.net/Docs/PyQt4/installation.html

This might help a bit more; it is in more of a tutorial/step-by-step format:

http://movingthelamppost.com/blog/html/2013/07/12/installing_pyqt____because_it_s_too_good_for_pip_or_easy_install_.html


回答 6

使用当前最新的python 3.6.5

pip3 install PyQt5

工作良好

With current latest python 3.6.5

pip3 install PyQt5

works fine


回答 7

尝试使用PyQt5:

pip install PyQt5

链接上将操作系统用于PyQt4。

或在链接上为您的平台下载支持的车轮。

否则,链接可用于Windows可执行安装程序。希望这可以帮助您安装PyQt4或PyQt5。

Try this for PyQt5:

pip install PyQt5

Use the operating system on this link for PyQt4.

Or download the supported wheel for your platform on this link.

Else use this link for the windows executable installer. Hopefully this helps you to install either PyQt4 or PyQt5.


回答 8

对于Windows:

从此处下载适当版本的PyQt4:

并使用pip进行安装(Python3.6的示例-64位)

 pip install PyQt4-4.11.4-cp36-cp36m-win_amd64.whl

For Windows:

download the appropriate version of the PyQt4 from here:

and install it using pip (example for Python3.6 – 64bit)

 pip install PyQt4-4.11.4-cp36-cp36m-win_amd64.whl

回答 9

为Windows 10和python 3.5+安装PyQt5。

pip install PyQt5

install PyQt5 for Windows 10 and python 3.5+.

pip install PyQt5


回答 10

如果在安装PyQt4时出错。

错误:此平台不支持PyQt4-4.11.4-cp27-cp27m-win_amd64.whl。

我的系统类型是64位,但要解决这个错误我已经安装了32位Windows系统的PyQt4的,即PyQt4-4.11.4-cp27-cp27m-win32.whl点击这里查看更多版本

请根据您安装的python版本选择合适的PyQt4版本。

If you have error while installing PyQt4.

Error: PyQt4-4.11.4-cp27-cp27m-win_amd64.whl is not a supported wheel on this platform.

My system is 64-bit, but to resolve this error I installed the 32-bit Windows build of PyQt4, i.e. PyQt4-4.11.4-cp27-cp27m-win32.whl (click here to see more versions).

Kindly select appropriate version of PyQt4 according to your installed python version.
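
The "not a supported wheel on this platform" error usually means the tags in the wheel's file name (cp27, win_amd64 and so on) do not match the interpreter that pip belongs to, and a Python install can be 32-bit even on 64-bit Windows. A minimal sketch for checking what your interpreter expects, assuming CPython; the helper name interpreter_summary is made up for illustration:

```python
import struct
import sys


def interpreter_summary():
    """Return (python_tag, bits) for the interpreter running this script."""
    # Pointer size in bytes times 8 gives the interpreter's bitness (32 or 64).
    bits = struct.calcsize("P") * 8
    # CPython wheels use tags such as cp27 or cp36 (major + minor version).
    python_tag = "cp{}{}".format(sys.version_info[0], sys.version_info[1])
    return python_tag, bits


if __name__ == "__main__":
    tag, bits = interpreter_summary()
    print("Python tag: {}, interpreter is {}-bit".format(tag, bits))
```

If this reports a 32-bit interpreter, the win32 wheel is the matching one, which would explain why it installed where the win_amd64 wheel did not.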


回答 11

您也可以使用此命令来安装PyQt5。

pip3 install PyQt5

You can also use this command to install PyQt5.

pip3 install PyQt5

回答 12

我正在使用PyCharm,并且能够安装PyQt5。

PyQt4以及PyQt4Enhanced和windows_whl都无法安装,我猜这是因为不再支持Qt4。

I am using PyCharm, and was able to install PyQt5.

PyQt4, PyQt4Enhanced and windows_whl all failed to install; I'm guessing that's because Qt4 is no longer supported.


将字符串转换为DataFrame中的float

问题:将字符串转换为DataFrame中的float

如何隐藏包含NaN浮点数的字符串和值的DataFrame列。还有另一列的值为字符串和浮点数;如何将整个列转换为浮点数。

How do I convert a DataFrame column containing strings and NaN values to floats? There is also another column whose values are strings and floats; how do I convert that entire column to floats?


回答 0

注意: pd.convert_objects现在已弃用。您应该使用pd.Series.astype(float)pd.to_numeric其他答案中所述。

在0.11中可用。强制转换(或将其设置为nan),即使astype失败也会起作用。它也按系列进行排序,因此不会转换为完整的字符串列

In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))

In [11]: df
Out[11]: 
     A    B
0  1.0  1.0
1    1  foo

In [12]: df.dtypes
Out[12]: 
A    object
B    object
dtype: object

In [13]: df.convert_objects(convert_numeric=True)
Out[13]: 
   A   B
0  1   1
1  1 NaN

In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]: 
A    float64
B    float64
dtype: object

NOTE: pd.convert_objects has now been deprecated. You should use pd.Series.astype(float) or pd.to_numeric as described in other answers.

This is available in 0.11. It forces conversion (or sets the value to NaN), so it works even where astype would fail. It also converts series by series, so it won't convert, say, a column made up entirely of strings.

In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))

In [11]: df
Out[11]: 
     A    B
0  1.0  1.0
1    1  foo

In [12]: df.dtypes
Out[12]: 
A    object
B    object
dtype: object

In [13]: df.convert_objects(convert_numeric=True)
Out[13]: 
   A   B
0  1   1
1  1 NaN

In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]: 
A    float64
B    float64
dtype: object

回答 1

你可以试试看df.column_name = df.column_name.astype(float)。至于这些NaN值,您需要指定如何转换它们,但是您可以使用该.fillna方法来进行转换。

例:

In [12]: df
Out[12]: 
     a    b
0  0.1  0.2
1  NaN  0.3
2  0.4  0.5

In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)

In [14]: df.a = df.a.astype(float).fillna(0.0)

In [15]: df
Out[15]: 
     a    b
0  0.1  0.2
1  0.0  0.3
2  0.4  0.5

In [16]: df.a.values
Out[16]: array([ 0.1,  0. ,  0.4])

You can try df.column_name = df.column_name.astype(float). As for the NaN values, you need to specify how they should be converted, but you can use the .fillna method to do it.

Example:

In [12]: df
Out[12]: 
     a    b
0  0.1  0.2
1  NaN  0.3
2  0.4  0.5

In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)

In [14]: df.a = df.a.astype(float).fillna(0.0)

In [15]: df
Out[15]: 
     a    b
0  0.1  0.2
1  0.0  0.3
2  0.4  0.5

In [16]: df.a.values
Out[16]: array([ 0.1,  0. ,  0.4])

回答 2

在较新版本的熊猫(0.17及更高版本)中,可以使用to_numeric函数。它允许您转换整个数据框或仅转换单个列。它还使您能够选择如何处理无法转换为数值的内容:

import pandas as pd
s = pd.Series(['1.0', '2', -3])
pd.to_numeric(s)
s = pd.Series(['apple', '1.0', '2', -3])
pd.to_numeric(s, errors='ignore')
pd.to_numeric(s, errors='coerce')

In newer versions of pandas (0.17 and up), you can use the to_numeric function. It converts one Series (column) at a time; to convert a whole DataFrame, apply it column by column. It also lets you choose how to treat values that can't be converted to numbers:

import pandas as pd
s = pd.Series(['1.0', '2', -3])
pd.to_numeric(s)
s = pd.Series(['apple', '1.0', '2', -3])
pd.to_numeric(s, errors='ignore')
pd.to_numeric(s, errors='coerce')
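
As a rough sketch of the whole-DataFrame case, to_numeric can be applied column by column with DataFrame.apply. The frame and its column names below are made up for illustration:

```python
import pandas as pd

# Hypothetical frame: every column holds strings, some of them not numeric.
df = pd.DataFrame({"a": ["1.0", "2", "apple"], "b": ["0.5", None, "3"]})

# to_numeric works on one Series at a time; apply() runs it per column.
# errors="coerce" turns anything unparseable into NaN instead of raising.
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted.dtypes)
print(converted)
```

Whether coercing to NaN is acceptable depends on the data; leaving errors at its default of 'raise' is the safer choice when unparseable values should stop the script.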

回答 3

df['MyColumnName'] = df['MyColumnName'].astype('float64') 
df['MyColumnName'] = df['MyColumnName'].astype('float64') 

回答 4

您必须先将空字符串('')替换为np.nan,然后再转换为float。即:

df['a']=df.a.replace('',np.nan).astype(float)

You have to replace empty strings ('') with np.nan before converting to float, i.e.:

df['a']=df.a.replace('',np.nan).astype(float)

回答 5

这是一个例子

                            GHI             Temp  Power Day_Type
2016-03-15 06:00:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:01:00 -7.99999952505459e-7    18.2    0   NaN
2016-03-15 06:02:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:03:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:04:00 -7.99999952505459e-7    18.3    0   NaN

但是如果这都是字符串值…就像我这样…将所需的列转换为浮点数:

df_inv_29['GHI'] = df_inv_29.GHI.astype(float)
df_inv_29['Temp'] = df_inv_29.Temp.astype(float)
df_inv_29['Power'] = df_inv_29.Power.astype(float)

您的数据框现在将具有浮点值:-)

Here is an example

                            GHI             Temp  Power Day_Type
2016-03-15 06:00:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:01:00 -7.99999952505459e-7    18.2    0   NaN
2016-03-15 06:02:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:03:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:04:00 -7.99999952505459e-7    18.3    0   NaN

But if these are all string values, as they were in my case, convert the desired columns to floats:

df_inv_29['GHI'] = df_inv_29.GHI.astype(float)
df_inv_29['Temp'] = df_inv_29.Temp.astype(float)
df_inv_29['Power'] = df_inv_29.Power.astype(float)

Your dataframe will now have float values :-)


如何从命令行编译Visual Studio项目?

问题:如何从命令行编译Visual Studio项目?

我正在为使用MonotoneCMake,Visual Studio Express 2008和自定义测试的大型C ++解决方案编写结帐,构建,分发,测试和提交周期的脚本。

所有其他部分似乎都非常简单明了,但是我不明白如何在没有GUI的情况下编译Visual Studio解决方案。

该脚本是用Python编写的,但是给出的答案允许我仅调用os.system。

I’m scripting the checkout, build, distribution, test, and commit cycle for a large C++ solution that is using Monotone, CMake, Visual Studio Express 2008, and custom tests.

All of the other parts seem pretty straight-forward, but I don’t see how to compile the Visual Studio solution without getting the GUI.

The script is written in Python, but an answer that would allow me to just make a call to: os.system would do.


回答 0

我知道有两种方法可以做到。

方法1
第一种方法(我更喜欢)是使用msbuild

msbuild project.sln /Flags...

方法2
您还可以运行:

vcexpress project.sln /build /Flags...

vcexpress选项立即返回,并且不打印任何输出。我想这可能就是您想要的脚本。

请注意,DevEnv并未随Visual Studio Express 2008一起分发(我花了很多时间试图弄清第一次遇到类似问题的时间)。

因此,最终结果可能是:

os.system("msbuild project.sln /p:Configuration=Debug")

您还需要确保您的环境变量正确,因为默认情况下,系统路径上不包含msbuild和vcexpress。启动Visual Studio构建环境并从那里运行脚本,或者修改Python中的路径(使用os.putenv)。

I know of two ways to do it.

Method 1
The first method (which I prefer) is to use msbuild:

msbuild project.sln /Flags...

Method 2
You can also run:

vcexpress project.sln /build /Flags...

The vcexpress option returns immediately and does not print any output. I suppose that might be what you want for a script.

Note that DevEnv is not distributed with Visual Studio Express 2008 (I spent a lot of time trying to figure that out when I first had a similar issue).

So, the end result might be:

os.system("msbuild project.sln /p:Configuration=Debug")

You’ll also want to make sure your environment variables are correct, as msbuild and vcexpress are not by default on the system path. Either start the Visual Studio build environment and run your script from there, or modify the paths in Python (with os.putenv).
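
As an alternative sketch to os.system, the subprocess module returns MSBuild's exit code directly and avoids shell quoting problems with paths that contain spaces. The function names below are illustrative only, and msbuild is assumed to be either on the PATH or passed as a full path:

```python
import subprocess


def build_command(solution, configuration="Debug", msbuild="msbuild"):
    """Return the argument list for an MSBuild invocation (nothing is run)."""
    return [msbuild, solution, "/p:Configuration={}".format(configuration)]


def run_build(solution, configuration="Debug", msbuild="msbuild"):
    """Run MSBuild and return its exit code; 0 conventionally means success."""
    return subprocess.call(build_command(solution, configuration, msbuild))


if __name__ == "__main__":
    # Only prints the command; call run_build("project.sln") to actually build.
    print(build_command("project.sln"))
```

Passing the full path to msbuild.exe through the msbuild argument sidesteps the PATH issue described above.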


回答 1

MSBuild通常可以正常工作,但是我之前遇到过困难。你可能有更好的运气

devenv YourSolution.sln /Build 

MSBuild usually works, but I’ve run into difficulties before. You may have better luck with

devenv YourSolution.sln /Build 

回答 2

老实说,我必须加2美分。

您可以使用msbuild.exe来完成msbuild.exe的版本很多 。

C:\Windows\Microsoft.NET\Framework64\v2.0.50727\msbuild.exe
C:\Windows\Microsoft.NET\Framework64\v3.5\msbuild.exe
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe
C:\Windows\Microsoft.NET\Framework\v2.0.50727\msbuild.exe
C:\Windows\Microsoft.NET\Framework\v3.5\msbuild.exe
C:\Windows\Microsoft.NET\Framework\v4.0.30319\msbuild.exe

使用您需要的版本。基本上,您必须使用最后一个。

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe

那么怎么做。

  1. 运行命令窗口

  2. 输入msbuild.exe的路径

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe

  3. 输入项目解决方案的路径,例如

"C:\Users\Clark.Kent\Documents\visual studio 2012\Projects\WpfApplication1\WpfApplication1.sln"

  4. 在解决方案路径后添加所需的任何标志。

  5. 按ENTER

请注意,您可以获得有关所有可能标记的帮助,例如

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe /help

To be honest I have to add my 2 cents.

You can do it with msbuild.exe. There are many versions of msbuild.exe:

C:\Windows\Microsoft.NET\Framework64\v2.0.50727\msbuild.exe
C:\Windows\Microsoft.NET\Framework64\v3.5\msbuild.exe
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe
C:\Windows\Microsoft.NET\Framework\v2.0.50727\msbuild.exe
C:\Windows\Microsoft.NET\Framework\v3.5\msbuild.exe
C:\Windows\Microsoft.NET\Framework\v4.0.30319\msbuild.exe

Use the version you need. In most cases that is the last one:

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe

So, how do you do it?

  1. Run the COMMAND window

  2. Input the path to msbuild.exe

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe

  3. Input the path to the project solution, like

"C:\Users\Clark.Kent\Documents\visual studio 2012\Projects\WpfApplication1\WpfApplication1.sln"

  4. Add any flags you need after the solution path.

  5. Press ENTER

Note you can get help about all possible flags like

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\msbuild.exe /help


回答 3

使用msbuild其他人指出的方法对我有用,但我需要做的不止于此。首先,msbuild需要访问编译器。这可以通过运行以下命令来完成:

"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"

然后msbuild不在我的$ PATH中,所以我不得不通过它的显式路径运行它:

"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe" myproj.sln

最后,我的项目使用了诸如的一些变量$(VisualStudioDir)。看来这些没有被设置,msbuild所以我不得不通过/property选项手动设置它们:

"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe" /property:VisualStudioDir="C:\Users\Administrator\Documents\Visual Studio 2013" myproj.sln

那行最终使我能够编译我的项目。

奖励:命令行工具使用30天后似乎不需要注册,就像基于GUI的“免费” Visual Studio Community版本一样。有了Microsoft注册要求,该版本几乎是免费的。如果有的话免费在Facebook上…

Using msbuild as pointed out by others worked for me but I needed to do a bit more than just that. First of all, msbuild needs to have access to the compiler. This can be done by running:

"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"

Then msbuild was not in my $PATH so I had to run it via its explicit path:

"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe" myproj.sln

Lastly, my project was making use of some variables like $(VisualStudioDir). It seems those do not get set by msbuild so I had to set them manually via the /property option:

"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe" /property:VisualStudioDir="C:\Users\Administrator\Documents\Visual Studio 2013" myproj.sln

That line then finally allowed me to compile my project.

Bonus: it seems that the command line tools do not require a registration after 30 days of using them like the “free” GUI-based Visual Studio Community edition does. With the Microsoft registration requirement in place, that version is hardly free. Free-as-in-facebook if anything…


回答 4

MSBuild是您的朋友。

msbuild "C:\path to solution\project.sln"

MSBuild is your friend.

msbuild "C:\path to solution\project.sln"

回答 5

DEVENV在许多情况下都能很好地工作,但是在WIXPROJ上构建我的WIX安装程序时,我得到的只是Out日志中的“ CATASTROPHIC”错误。

这有效:MSBUILD /Path/PROJECT.WIXPROJ /t:Build /p:Configuration=Release

DEVENV works well in many cases, but on a WIXPROJ to build my WIX installer, all I got is “CATASTROPHIC” error in the Out log.

This works: MSBUILD /Path/PROJECT.WIXPROJ /t:Build /p:Configuration=Release


将文本放在matplotlib图的左上角

问题:将文本放在matplotlib图的左上角

如何在matplotlib图形的左上角(或右上角)放置文本,例如,左上角图例所在的位置,还是绘图的顶部,但在左上角?例如,如果它是一个plt.scatter(),那么将在散点图的平方内放置一些东西,将其放在最左上角。

我想在不理想地知道例如散点图的比例的情况下进行此操作,因为它会随数据集的不同而变化。我只希望它的文字大致在左上方,或大致在右上方。使用图例类型定位时,它无论如何都不应与任何散点图点重叠。

谢谢!

How can I put text in the top left (or top right) corner of a matplotlib figure, e.g. where a top left legend would be, or on top of the plot but in the top left corner? E.g. if it’s a plt.scatter(), then something that would be within the square of the scatter, put in the top left most corner.

I’d like to do this without ideally knowing the scale of the scatterplot being plotted for example, since it will change from dataset to data set. I just want it the text to be roughly in the upper left, or roughly in the upper right. With legend type positioning it should not overlap with any scatter plot points anyway.

thanks!


回答 0

您可以使用text

text(x, y, s, fontsize=12)

text 可以相对于轴指定坐标,因此文本的位置将与绘图的大小无关:

默认转换指定文本在数据坐标中,或者,您也可以在坐标轴中指定文本(0,0是左下角,而1,1是右上角)。下面的示例将文本放置在轴的中心::

text(0.5, 0.5,'matplotlib',
     horizontalalignment='center',
     verticalalignment='center',
     transform = ax.transAxes)

要防止文本干扰散点图的任何点,都是比较困难的。比较简单的方法是将y_axis(的ymax ylim((ymin,ymax))轴)设置为比点的最大y坐标高一点的值。这样,您将始终拥有文本的可用空间。

编辑:这里有一个例子:

In [17]: from pylab import figure, text, scatter, show
In [18]: f = figure()
In [19]: ax = f.add_subplot(111)
In [20]: scatter([3,5,2,6,8],[5,3,2,1,5])
Out[20]: <matplotlib.collections.CircleCollection object at 0x0000000007439A90>
In [21]: text(0.1, 0.9,'matplotlib', ha='center', va='center', transform=ax.transAxes)
Out[21]: <matplotlib.text.Text object at 0x0000000007415B38>
In [22]:

ha和va参数设置文本相对于插入点的对齐方式。即。ha =’left’是一个很好的设置,可以防止在手动缩小(变窄)帧时长文本从左轴移出。

You can use text.

text(x, y, s, fontsize=12)

text coordinates can be given relative to the axes, so the position of your text will be independent of the size of the plot:

The default transform specifies that text is in data coords; alternatively, you can specify text in axes coords (0,0 is lower-left and 1,1 is upper-right). The example below places text in the center of the axes:

text(0.5, 0.5,'matplotlib',
     horizontalalignment='center',
     verticalalignment='center',
     transform = ax.transAxes)

Preventing the text from overlapping any point of your scatter is more difficult, as far as I know. The easiest method is to set the upper y limit (ymax in ylim((ymin, ymax))) to a value a bit higher than the maximum y-coordinate of your points. That way you will always have free space for the text.

EDIT: here you have an example:

In [17]: from pylab import figure, text, scatter, show
In [18]: f = figure()
In [19]: ax = f.add_subplot(111)
In [20]: scatter([3,5,2,6,8],[5,3,2,1,5])
Out[20]: <matplotlib.collections.CircleCollection object at 0x0000000007439A90>
In [21]: text(0.1, 0.9,'matplotlib', ha='center', va='center', transform=ax.transAxes)
Out[21]: <matplotlib.text.Text object at 0x0000000007415B38>
In [22]:

The ha and va parameters set the alignment of your text relative to the insertion point, e.g. ha='left' is a good setting to keep a long text from running past the left axis when the frame is made narrower manually.
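
Putting the pieces of this answer together, a minimal sketch for a label pinned to the top-left corner might look like this. The 0.02/0.98 offsets and the 15% headroom are arbitrary choices, not values from the answer:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([3, 5, 2, 6, 8], [5, 3, 2, 1, 5])

# (0.02, 0.98) is in axes coordinates: just inside the top-left corner.
# ha="left" / va="top" anchor the text's own top-left corner at that point.
label = ax.text(0.02, 0.98, "matplotlib",
                ha="left", va="top", transform=ax.transAxes)

# Leave some headroom above the highest point so the label stays clear of the data.
ymin, ymax = ax.get_ylim()
ax.set_ylim(ymin, ymax + 0.15 * (ymax - ymin))

fig.canvas.draw()  # render; use plt.show() or fig.savefig(...) in a real script
```

Because the position is given in axes coordinates, the label stays in the corner whatever the data range turns out to be.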


回答 1

一种解决方案是使用该plt.legend功能,即使您不需要实际的图例。您可以使用loc关键字词指定图例框的位置。可以在此网站上找到更多信息但我还提供了一个示例,说明如何放置图例:

ax.scatter(xa,ya, marker='o', s=20, c="lightgreen", alpha=0.9)
ax.scatter(xb,yb, marker='o', s=20, c="dodgerblue", alpha=0.9)
ax.scatter(xc,yc, marker='o', s=20, c="firebrick", alpha=1.0)
ax.scatter(xd,yd, marker='o', s=20, c="goldenrod", alpha=0.9)
line1 = Line2D(range(10), range(10), marker='o', color="goldenrod")
line2 = Line2D(range(10), range(10), marker='o',color="firebrick")
line3 = Line2D(range(10), range(10), marker='o',color="lightgreen")
line4 = Line2D(range(10), range(10), marker='o',color="dodgerblue")
plt.legend((line1,line2,line3, line4),('line1','line2', 'line3', 'line4'),numpoints=1, loc=2) 

请注意,因为loc=2,图例位于图的左上角。并且如果文本与图重叠,则可以使用来使其变小legend.fontsize,从而使图例变小。

One solution would be to use the plt.legend function, even if you don't want an actual legend. You can specify the placement of the legend box by using the loc keyword. More information can be found at this website, but I've also included an example showing how to place a legend:

ax.scatter(xa,ya, marker='o', s=20, c="lightgreen", alpha=0.9)
ax.scatter(xb,yb, marker='o', s=20, c="dodgerblue", alpha=0.9)
ax.scatter(xc,yc, marker='o', s=20, c="firebrick", alpha=1.0)
ax.scatter(xd,yd, marker='o', s=20, c="goldenrod", alpha=0.9)
line1 = Line2D(range(10), range(10), marker='o', color="goldenrod")
line2 = Line2D(range(10), range(10), marker='o',color="firebrick")
line3 = Line2D(range(10), range(10), marker='o',color="lightgreen")
line4 = Line2D(range(10), range(10), marker='o',color="dodgerblue")
plt.legend((line1,line2,line3, line4),('line1','line2', 'line3', 'line4'),numpoints=1, loc=2) 

Note that because loc=2, the legend is in the upper-left corner of the plot. And if the text overlaps with the plot, you can make it smaller by using legend.fontsize, which will then make the legend smaller.


检查是否安装了Python软件包

问题:检查是否安装了Python软件包

检查软件包是否在Python脚本中安装的好方法是什么?我知道从解释器很容易,但是我需要在脚本中完成。

我想我可以检查安装过程中在系统上是否创建了目录,但是我觉得有更好的方法。我试图确保已安装Skype4Py软件包,如果没有,我将安装它。

我完成支票的想法

  • 检查典型安装路径中的目录
  • 尝试导入软件包,如果抛出异常,则安装软件包

What’s a good way to check if a package is installed while within a Python script? I know it’s easy from the interpreter, but I need to do it within a script.

I guess I could check if there’s a directory on the system that’s created during the installation, but I feel like there’s a better way. I’m trying to make sure the Skype4Py package is installed, and if not I’ll install it.

My ideas for accomplishing the check

  • check for a directory in the typical install path
  • try to import the package and, if an exception is thrown, install the package

回答 0

如果您的意思是python脚本,请执行以下操作:

Python 3.3+使用sys.modules和find_spec

import importlib.util
import sys

# For illustrative purposes.
name = 'itertools'

if name in sys.modules:
    print(f"{name!r} already in sys.modules")
elif (spec := importlib.util.find_spec(name)) is not None:
    # If you choose to perform the actual import ...
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    print(f"{name!r} has been imported")
else:
    print(f"can't find the {name!r} module")

Python 3:

try:
    import mymodule
except ImportError as e:
    pass  # module doesn't exist, deal with it.

Python 2:

try:
    import mymodule
except ImportError, e:
    pass  # module doesn't exist, deal with it.

If you mean a python script, just do something like this:

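Python 3.3+: use sys.modules and find_spec (as written, the snippet below uses f-strings and the := operator, so it needs Python 3.8 or later):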

import importlib.util
import sys

# For illustrative purposes.
name = 'itertools'

if name in sys.modules:
    print(f"{name!r} already in sys.modules")
elif (spec := importlib.util.find_spec(name)) is not None:
    # If you choose to perform the actual import ...
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    print(f"{name!r} has been imported")
else:
    print(f"can't find the {name!r} module")

Python 3:

try:
    import mymodule
except ImportError as e:
    pass  # module doesn't exist, deal with it.

Python 2:

try:
    import mymodule
except ImportError, e:
    pass  # module doesn't exist, deal with it.

回答 1

更新的答案

更好的方法是:

import subprocess
import sys

reqs = subprocess.check_output([sys.executable, '-m', 'pip', 'freeze'])
installed_packages = [r.decode().split('==')[0] for r in reqs.split()]

结果:

print(installed_packages)

[
    "Django",
    "six",
    "requests",
]

检查是否requests已安装:

if 'requests' in installed_packages:
    pass  # Do something

为什么这样呢?有时您会遇到应用名称冲突。从应用程序命名空间导入无法全面了解系统上已安装的内容。

注意,建议的解决方案有效:

  • 使用pip从PyPI或任何其他替代来源(例如pip install http://some.site/package-name.zip或任何其他存档类型)进行安装时。
  • 使用手动安装时python setup.py install
  • 从系统存储库安装时,例如sudo apt install python-requests

情况下,当它可能无法正常工作:

  • 在开发模式下安装时,例如python setup.py develop
  • 在开发模式下安装时,例如pip install -e /path/to/package/source/

旧答案

更好的方法是:

import pip
installed_packages = pip.get_installed_distributions()

对于pip> = 10.x,请使用:

from pip._internal.utils.misc import get_installed_distributions

为什么这样呢?有时您会遇到应用名称冲突。从应用程序命名空间导入无法全面了解系统上已安装的内容。

结果,您得到一个pkg_resources.Distribution对象列表。请参阅以下示例:

print installed_packages
[
    "Django 1.6.4 (/path-to-your-env/lib/python2.7/site-packages)",
    "six 1.6.1 (/path-to-your-env/lib/python2.7/site-packages)",
    "requests 2.5.0 (/path-to-your-env/lib/python2.7/site-packages)",
]

列出清单:

flat_installed_packages = [package.project_name for package in installed_packages]

[
    "Django",
    "six",
    "requests",
]

检查是否requests已安装:

if 'requests' in flat_installed_packages:
    pass  # Do something

Updated answer

A better way of doing this is:

import subprocess
import sys

reqs = subprocess.check_output([sys.executable, '-m', 'pip', 'freeze'])
installed_packages = [r.decode().split('==')[0] for r in reqs.split()]

The result:

print(installed_packages)

[
    "Django",
    "six",
    "requests",
]

Check if requests is installed:

if 'requests' in installed_packages:
    pass  # Do something

Why this way? Sometimes you have app name collisions. Importing from the app namespace doesn’t give you the full picture of what’s installed on the system.

Note that the proposed solution works:

  • When using pip to install from PyPI or from any other alternative source (like pip install http://some.site/package-name.zip or any other archive type).
  • When installing manually using python setup.py install.
  • When installing from system repositories, like sudo apt install python-requests.

Cases when it might not work:

  • When installing in development mode, like python setup.py develop.
  • When installing in development mode, like pip install -e /path/to/package/source/.

Old answer

A better way of doing this is:

import pip
installed_packages = pip.get_installed_distributions()

For pip>=10.x use:

from pip._internal.utils.misc import get_installed_distributions

Why this way? Sometimes you have app name collisions. Importing from the app namespace doesn’t give you the full picture of what’s installed on the system.

As a result, you get a list of pkg_resources.Distribution objects. See the following as an example:

print installed_packages
[
    "Django 1.6.4 (/path-to-your-env/lib/python2.7/site-packages)",
    "six 1.6.1 (/path-to-your-env/lib/python2.7/site-packages)",
    "requests 2.5.0 (/path-to-your-env/lib/python2.7/site-packages)",
]

Make a list of it:

flat_installed_packages = [package.project_name for package in installed_packages]

[
    "Django",
    "six",
    "requests",
]

Check if requests is installed:

if 'requests' in flat_installed_packages:
    pass  # Do something
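
One caveat with splitting pip freeze output on whitespace and '==' is that lines such as "name @ file:///..." or "-e ..." do not follow the name==version shape. A slightly more defensive parser could be sketched like this; names_from_freeze is a made-up helper name, and the sample lines in the usage part are invented:

```python
def names_from_freeze(freeze_output):
    """Extract distribution names from the text printed by `pip freeze`."""
    names = []
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-e ")):
            # Skip blanks, comments and editable installs (listed as VCS URLs or paths).
            continue
        # "name==1.0" and "name @ file:///..." both start with the name.
        name = line.split("==")[0].split(" @ ")[0].strip()
        names.append(name)
    return names


if __name__ == "__main__":
    sample = "Django==1.6.4\nsix==1.6.1\nrequests @ file:///tmp/requests.whl\n"
    print(names_from_freeze(sample))
```

Feed it the decoded output of the subprocess call from the updated answer, for example names_from_freeze(reqs.decode()).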

回答 2

从Python 3.3开始,您可以使用find_spec()方法

import importlib.util

# For illustrative purposes.
package_name = 'pandas'

spec = importlib.util.find_spec(package_name)
if spec is None:
    print(package_name +" is not installed")

As of Python 3.3, you can use the find_spec() method

import importlib.util

# For illustrative purposes.
package_name = 'pandas'

spec = importlib.util.find_spec(package_name)
if spec is None:
    print(package_name +" is not installed")

回答 3

如果要从终端机取支票,可以运行

pip3 show package_name

如果未返回任何内容,则表示未安装该软件包。

如果您想自动执行此检查,以便例如可以在丢失时安装它,则可以在bash脚本中包含以下内容:

if pip3 show package_name >/dev/null 2>&1; then  # use pip instead of pip3 for Python 2
   echo "Installed" #Replace with your actions
else
   echo "Not Installed" #Replace with your actions, 'pip3 install --upgrade package_name' ?
fi

If you want to have the check from the terminal, you can run

pip3 show package_name

and if nothing is returned, the package is not installed.

If perhaps you want to automate this check, so that for example you can install it if missing, you can have the following in your bash script:

if pip3 show package_name >/dev/null 2>&1; then  # use pip instead of pip3 for Python 2
   echo "Installed" #Replace with your actions
else
   echo "Not Installed" #Replace with your actions, 'pip3 install --upgrade package_name' ?
fi

回答 4

作为此答案的扩展:

对于Python 2. *,pip show <package_name>将执行相同的任务。

例如pip show numpy将返回以下内容:

Name: numpy
Version: 1.11.1
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://www.numpy.org
Author: NumPy Developers
Author-email: numpy-discussion@scipy.org
License: BSD
Location: /home/***/anaconda2/lib/python2.7/site-packages
Requires: 
Required-by: smop, pandas, tables, spectrum, seaborn, patsy, odo, numpy-stl, numba, nfft, netCDF4, MDAnalysis, matplotlib, h5py, GridDataFormats, dynd, datashape, Bottleneck, blaze, astropy

As an extension of this answer:

For Python 2.*, pip show <package_name> will perform the same task.

For example, pip show numpy will return the following or something similar:

Name: numpy
Version: 1.11.1
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://www.numpy.org
Author: NumPy Developers
Author-email: numpy-discussion@scipy.org
License: BSD
Location: /home/***/anaconda2/lib/python2.7/site-packages
Requires: 
Required-by: smop, pandas, tables, spectrum, seaborn, patsy, odo, numpy-stl, numba, nfft, netCDF4, MDAnalysis, matplotlib, h5py, GridDataFormats, dynd, datashape, Bottleneck, blaze, astropy

回答 5

您可以使用setuptools中的pkg_resources模块。例如:

import pkg_resources

package_name = 'cool_package'
try:
    cool_package_dist_info = pkg_resources.get_distribution(package_name)
except pkg_resources.DistributionNotFound:
    print('{} not installed'.format(package_name))
else:
    print(cool_package_dist_info)

请注意,python模块和python包之间有区别。一个软件包可以包含多个模块,并且模块名称可能与软件包名称不匹配。

You can use the pkg_resources module from setuptools. For example:

import pkg_resources

package_name = 'cool_package'
try:
    cool_package_dist_info = pkg_resources.get_distribution(package_name)
except pkg_resources.DistributionNotFound:
    print('{} not installed'.format(package_name))
else:
    print(cool_package_dist_info)

Note that there is a difference between a Python module and a Python package (distribution). A package can contain multiple modules, and the module names might not match the package name.
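
setuptools now documents pkg_resources as deprecated in favor of importlib.metadata, which is in the standard library from Python 3.8. A minimal sketch of the same check with it; installed_version is a made-up helper name, and the lookup is by distribution name (what you pass to pip install), not by import name:

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    package_name = "cool_package"
    version = installed_version(package_name)
    if version is None:
        print("{} not installed".format(package_name))
    else:
        print("{} {}".format(package_name, version))
```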


回答 6

打开命令提示符类型

pip3 list

Open your command prompt and type

pip3 list

回答 7

我想对此主题添加一些想法/发现。我正在编写一个脚本,检查定制程序的所有要求。python模块也有很多检查。

有一个小问题

try:
   import ..
except:
   ..

解。在我的情况下,其中一个python模块称为python-nmap,但是您使用导入了它,import nmap并且看到名称不匹配。因此,使用上述解决方案进行的测试将返回False结果,并且还会在命中时导入该模块,但对于简单的测试/检查,可能无需使用大量内存。

我也发现

import pip
installed_packages = pip.get_installed_distributions()

installed_packages只有pip安装了软件包。在我的系统上,pip freeze通过40python模块返回,而installed_packages只有1,我手动安装了该模块(python-nmap)。

下面我知道的另一种解决方案可能与该问题无关,但是我认为将测试功能与执行安装的功能分开是一种很好的做法,这可能对某些人有用。

对我有用的解决方案。它基于此答案如何在不导入的情况下检查python模块是否存在

from imp import find_module

def checkPythonmod(mod):
    try:
        op = find_module(mod)
        return True
    except ImportError:
        return False

注意:此解决方案也无法通过名称找到模块python-nmap,我必须nmap改用(易于使用),但是在这种情况下,模块将不会加载到内存中。

I’d like to add some thoughts/findings of mine to this topic. I’m writing a script that checks all requirements for a custom made program. There are many checks with python modules too.

There’s a little issue with the

try:
   import ..
except:
   ..

solution. In my case one of the Python packages is called python-nmap, but you import it with import nmap, so the names do not match. A test with the above solution that uses the package name therefore returns a false result. It also imports the module when it is found, and there may be no need to spend that memory on a simple test/check.

I also found that

import pip
installed_packages = pip.get_installed_distributions()

installed_packages will contain only the packages that have been installed with pip. On my system pip freeze returns over 40 Python modules, while installed_packages has only 1, the one I installed manually (python-nmap).

The solution below may not be strictly relevant to the question, but I think it is good practice to keep the test function separate from the one that performs the install, so it might be useful for some.

This is the solution that worked for me. It is based on this answer: How to check if a python module exists without importing it

from imp import find_module

def checkPythonmod(mod):
    try:
        op = find_module(mod)
        return True
    except ImportError:
        return False

NOTE: this solution can't find the module by the name python-nmap either; I have to use nmap instead (easy to live with), but in this case the module is not loaded into memory at all.
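
The imp module was deprecated for a long time and removed in Python 3.12, so on current interpreters the same check can be sketched with importlib.util.find_spec. The function name module_available is made up here; like the imp version, it takes the import name (nmap), not the distribution name (python-nmap):

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be imported, without importing it.

    For dotted names, find_spec imports the parent packages first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised e.g. when a parent package of a dotted name is missing.
        return False


if __name__ == "__main__":
    for candidate in ("json", "nmap"):
        print(candidate, module_available(candidate))
```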


回答 8

如果您希望脚本安装缺少的软件包并继续,则可以执行以下操作(在“ python-krbV”软件包中的“ krbV”模块示例中):

import pip
import sys

for m, pkg in [('krbV', 'python-krbV')]:
    try:
        setattr(sys.modules[__name__], m, __import__(m))
    except ImportError:
        pip.main(['install', pkg])
        setattr(sys.modules[__name__], m, __import__(m))

If you'd like your script to install missing packages and continue, you could do something like this (using the 'krbV' module from the 'python-krbV' package as an example):

import pip
import sys

for m, pkg in [('krbV', 'python-krbV')]:
    try:
        setattr(sys.modules[__name__], m, __import__(m))
    except ImportError:
        pip.main(['install', pkg])
        setattr(sys.modules[__name__], m, __import__(m))
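
pip.main was moved out of pip's public API in pip 10, so on newer installs the call above fails. A sketch of the same idea that runs pip as a subprocess instead; ensure_module and pip_install are illustrative names, and the installer argument exists only so the install step can be swapped out:

```python
import importlib
import subprocess
import sys


def pip_install(dist_name):
    """Install a distribution with the pip that belongs to this interpreter."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", dist_name])


def ensure_module(module_name, dist_name, installer=pip_install):
    """Import module_name, installing dist_name first if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        installer(dist_name)
        importlib.invalidate_caches()  # make the freshly installed files visible
        return importlib.import_module(module_name)
```

With the example from this answer, usage would be krbV = ensure_module('krbV', 'python-krbV').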

回答 9

A quick way is to use the python command-line interpreter. Simply type import <your module name>; you will see an error if the module is missing.

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
>>> import sys
>>> import jocker
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named jocker
$

Answer 10

Hmmm … the closest I saw to a convenient answer was using the command line to try the import. But I prefer to avoid even that.

How about ‘pip freeze | grep pkgname’? I tried it and it works well. It also shows the installed version and whether the package is installed under version control (install) or editable (develop).


Answer 11

You can use this:

class myError(Exception):
    pass  # Or do something else here.

try:
    import mymodule
except ImportError as e:
    raise myError("an error occurred")

Answer 12

In the terminal, type:

pip show some_package_name

Example

pip show matplotlib
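
If you would rather do the same lookup from inside a script, here is a rough sketch using importlib.metadata from the standard library (Python 3.8 and later). The helper name installed_version is made up for this example, and, like pip show, it expects the distribution name rather than the import name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if matplotlib is installed, otherwise None
print(installed_version("matplotlib"))
```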

Answer 13

I would like to comment on @ice.nicer’s reply but I cannot, so … My observation is that packages with dashes are saved with underscores, not only with dots as pointed out in @dwich’s comment.

For example, you do pip3 install sphinx-rtd-theme, but:

  • importlib.util.find_spec("sphinx_rtd_theme") returns an object
  • importlib.util.find_spec("sphinx-rtd-theme") returns None
  • importlib.util.find_spec("sphinx.rtd.theme") raises ModuleNotFoundError

Moreover, some names are totally changed. For example, you do pip3 install pyyaml but it is saved simply as yaml.

I am using Python 3.8.


Answer 14

if pip3 list | grep -sE '^some_command\s+[0-9]' >/dev/null; then
  # installed ...
else
  # not installed ...
fi

Answer 15

Go with option #2. If ImportError is thrown, then the package is not installed (or not in sys.path).


Using %f with strftime() in Python to get microseconds

Question: Using %f with strftime() in Python to get microseconds

I’m trying to use strftime() with microsecond precision, which seems possible using %f (as stated here). However, when I try the following code:

import time
from time import strftime

print strftime("%H:%M:%S.%f")

…I get the hour, the minutes and the seconds, but %f prints as %f, with no sign of the microseconds. I’m running Python 2.6.5 on Ubuntu, so it should be fine and %f should be supported (it’s supported for 2.6 and above, as far as I know.)


Answer 0

You can use datetime’s strftime function to get this. The problem is that time’s strftime accepts a time tuple that does not carry microsecond information.

from datetime import datetime
datetime.now().strftime("%H:%M:%S.%f")

Should do the trick!


Answer 1

You are looking at the wrong documentation. The time module has different documentation.

You can use the datetime module strftime like this:

>>> from datetime import datetime
>>>
>>> now = datetime.now()
>>> now.strftime("%H:%M:%S.%f")
'12:19:40.948000'
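
To make the behaviour of %f easier to see, here is a small sketch with a fixed datetime instead of now(), so the result is reproducible. The date and the variable names are only illustrative.

```python
from datetime import datetime

# A fixed moment, chosen only for illustration
moment = datetime(2017, 1, 16, 16, 51, 33, 519106)

# %f always expands to six zero-padded digits of microseconds
micro = moment.strftime("%H:%M:%S.%f")

# Dropping the last three digits leaves millisecond precision
milli = micro[:-3]

print(micro)  # 16:51:33.519106
print(milli)  # 16:51:33.519
```

Slicing the string truncates rather than rounds, which is usually what you want for timestamps.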

Answer 2

With Python’s time module you can’t get microseconds with %f.

For those who still want to use only the time module, here is a workaround:

import time

now = time.time()
mlsec = repr(now).split('.')[1][:3]
print time.strftime("%Y-%m-%d %H:%M:%S.{} %Z".format(mlsec), time.localtime(now))

You should get something like 2017-01-16 16:42:34.625 EET (yes, I use milliseconds, as that is precise enough).

To break the code into details, paste the below code into a Python console:

import time

# Get current timestamp
now = time.time()

# Debug now
now
print now
type(now)

# Debug strf time
struct_now = time.localtime(now)
print struct_now
type(struct_now)

# Print nicely formatted date
print time.strftime("%Y-%m-%d %H:%M:%S %Z", struct_now)

# Get milliseconds
mlsec = repr(now).split('.')[1][:3]
print mlsec

# Get your required timestamp string
timestamp = time.strftime("%Y-%m-%d %H:%M:%S.{} %Z".format(mlsec), struct_now)
print timestamp

For clarification purposes, I also paste my Python 2.7.12 result here:

>>> import time
>>> # get current timestamp
... now = time.time()
>>> # debug now
... now
1484578293.519106
>>> print now
1484578293.52
>>> type(now)
<type 'float'>
>>> # debug strf time
... struct_now = time.localtime(now)
>>> print struct_now
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=16, tm_hour=16, tm_min=51, tm_sec=33, tm_wday=0, tm_yday=16, tm_isdst=0)
>>> type(struct_now)
<type 'time.struct_time'>
>>> # print nicely formatted date
... print time.strftime("%Y-%m-%d %H:%M:%S %Z", struct_now)
2017-01-16 16:51:33 EET
>>> # get miliseconds
... mlsec = repr(now).split('.')[1][:3]
>>> print mlsec
519
>>> # get your required timestamp string
... timestamp = time.strftime("%Y-%m-%d %H:%M:%S.{} %Z".format(mlsec), struct_now)
>>> print timestamp
2017-01-16 16:51:33.519 EET
>>>

Answer 3

This should do the job:

import datetime
datetime.datetime.now().strftime("%H:%M:%S.%f")

It produces a string of the form

HH:MM:SS.microseconds, for example 14:38:19.425961


Answer 4

You can also get microsecond precision from the time module using its time() function.
(time.time() returns the time in seconds since the epoch as a float. Its fractional part carries the sub-second portion, down to microseconds where the platform clock provides them, which is what you want.)

>>> from time import time
>>> time()
... 1310554308.287459   # the fractional part is what you want.


# comparison with strftime -
>>> from datetime import datetime
>>> from time import time
>>> datetime.now().strftime("%f"), time()
... ('287389', 1310554310.287459)

Answer 5

When the “%f” for microseconds isn’t working, please use the following method:

import datetime

def getTimeStamp():
    dt = datetime.datetime.now()
    # Zero-pad to six digits so that e.g. 4 microseconds becomes "000004"
    return dt.strftime("%Y%j%H%M%S") + "%06d" % dt.microsecond

Answer 6

If you want speed, try this:

import time

def _timestamp(prec=0):
    t = time.time()
    s = time.strftime("%H:%M:%S", time.localtime(t))
    if prec > 0:
        s += ("%.9f" % (t % 1,))[1:2+prec]
    return s

Here prec is the precision: how many decimal places you want. Note that, unlike some other solutions presented here, the function has no issues with leading zeros in the fractional part.


Answer 7

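If you want an integer number of milliseconds since the epoch, try this code. Note that %s is a platform-specific extension (it works with glibc on Linux, for example) rather than a directive documented by Python, and the result is a string of digits that you can pass to int():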

import datetime
print(datetime.datetime.now().strftime("%s%f")[:13])

Output:

1545474382803

Check if an argparse optional argument is set

Question: Check if an argparse optional argument is set

I would like to check whether an optional argparse argument has been set by the user or not.

Can I safely check using isset?

Something like this:

if(isset(args.myArg)):
    #do something
else:
    #do something else

Does this work the same for float / int / string type arguments?

I could set a default parameter and check it (e.g., set myArg = -1, or “” for a string, or “NOT_SET”). However, the value I ultimately want to use is only calculated later in the script. So I would be setting it to -1 as a default, and then updating it to something else later. This seems a little clumsy in comparison with simply checking if the value was set by the user.


Answer 0

I think that optional arguments (specified with --) are initialized to None if they are not supplied, so you can test with is not None. Try the example below:

import argparse as ap

def main():
    parser = ap.ArgumentParser(description="My Script")
    parser.add_argument("--myArg")
    args, leftovers = parser.parse_known_args()

    if args.myArg is not None:
        print("myArg has been set (value is %s)" % args.myArg)

if __name__ == "__main__":
    main()
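
On the float / int / string part of the question: the None test behaves the same for every type, whereas a plain truthiness test does not. Here is a small sketch (the parse helper and the type=float choice are just assumptions for the demo) that feeds argument lists directly to parse_args:

```python
import argparse


def parse(argv):
    """Parse an explicit argument list so the demo does not depend on sys.argv."""
    parser = argparse.ArgumentParser(description="My Script")
    parser.add_argument("--myArg", type=float)
    return parser.parse_args(argv)


for argv in ([], ["--myArg", "0"]):
    args = parse(argv)
    # `if args.myArg:` would treat 0.0 as "not set"; `is not None` does not
    state = "set" if args.myArg is not None else "not set"
    print(argv, "->", state)
```

So a user-supplied 0 (or an empty string) still counts as set, which a bare if args.myArg: would get wrong.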

Answer 1

As @Honza notes, is None is a good test. It’s the default default, and the user can’t give you a string that duplicates it.

You can specify another default, such as default='mydefaultvalue', and test for that. But what if the user specifies that string? Does that count as setting or not?

You can also specify default=argparse.SUPPRESS. Then if the user does not use the argument, it will not appear in the args namespace. But testing that might be more complicated:

args.foo # raises an AttributeError
hasattr(args, 'foo')  # returns False
getattr(args, 'foo', 'other') # returns 'other'

Internally the parser keeps a list of seen_actions, and uses it for ‘required’ and ‘mutually_exclusive’ testing. But it isn’t available to you outside of parse_args.
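
One way around the "what if the user types the default string" problem is to use a unique sentinel object as the default, since nothing typed on the command line can ever be that object. This is only a sketch: _NOT_SET, parse and foo_was_given are hypothetical names, not part of argparse.

```python
import argparse

# Hypothetical sentinel: a fresh object that no command-line string can equal
_NOT_SET = object()


def parse(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--foo", default=_NOT_SET)
    return parser.parse_args(argv)


def foo_was_given(args):
    """True only if --foo appeared on the command line."""
    return args.foo is not _NOT_SET


print(foo_was_given(parse([])))
print(foo_was_given(parse(["--foo", "mydefaultvalue"])))
```

The real default can then be filled in later in the script, once it is known, without any ambiguity about what the user typed.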


Answer 2

I think using the option default=argparse.SUPPRESS makes most sense. Then, instead of checking if the argument is not None, one checks if the argument is in the resulting namespace.

Example:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--foo", default=argparse.SUPPRESS)
ns = parser.parse_args()

print("Parsed arguments: {}".format(ns))
print("foo in namespace?: {}".format("foo" in ns))

Usage:

$ python argparse_test.py --foo 1
Parsed arguments: Namespace(foo='1')
foo in namespace?: True

Argument is not supplied:

$ python argparse_test.py
Parsed arguments: Namespace()
foo in namespace?: False

Answer 3

You can check an optionally passed flag with the store_true and store_false argument actions:

import argparse

argparser = argparse.ArgumentParser()
argparser.add_argument('-flag', dest='flag_exists', action='store_true')

print(argparser.parse_args([]))
# Namespace(flag_exists=False)
print(argparser.parse_args(['-flag']))
# Namespace(flag_exists=True)

This way, you don’t have to worry about checking with is not None. You simply check for True or False. Read more about these options in the argparse docs.


Answer 4

If your argument is positional (i.e. it doesn’t have a “-” or a “--” prefix, just the argument, typically a file name), then you can use the nargs parameter to do this:

import argparse

parser = argparse.ArgumentParser(description='Foo is a program that does things')
parser.add_argument('filename', nargs='?')
args = parser.parse_args()

if args.filename is not None:
    print('The file name is {}'.format(args.filename))
else:
    print('Oh well ; No args, no problems')

Answer 5

Here is my solution for checking whether an argparse variable is being used:

import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-1", "--first", required=True)
ap.add_argument("-2", "--second", required=True)
ap.add_argument("-3", "--third", required=False)
# Collect all arguments into a dict called args
args = vars(ap.parse_args())
if args["third"] is not None:
    pass  # do something

This might give more insight into the above answer, which I used and adapted to work for my program.


Answer 6

In order to address @kcpr’s comment on the (currently accepted) answer by @Honza Osobne:

Unfortunately it doesn’t work when the argument has its default value defined.

One can first check whether the argument was provided by setting the default=argparse.SUPPRESS option and testing for membership in the Namespace object (see @hpaulj’s and @Erasmus Cedernaes’ answers and this python3 doc), and then, if it hasn’t been provided, set it to a default value.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--infile', default=argparse.SUPPRESS)
args = parser.parse_args()
if 'infile' in args: 
    # the argument is in the namespace, it's been provided by the user
    # set it to what has been provided
    theinfile = args.infile
    print('argument \'--infile\' was given, set to {}'.format(theinfile))
else:
    # the argument isn't in the namespace
    # set it to a default value
    theinfile = 'your_default.txt'
    print('argument \'--infile\' was not given, set to default {}'.format(theinfile))

Usage

$ python3 testargparse_so.py
argument '--infile' was not given, set to default your_default.txt

$ python3 testargparse_so.py --infile user_file.txt
argument '--infile' was given, set to user_file.txt

Answer 7

Very simple: after defining the args variable with ‘args = parser.parse_args()’, it contains the values of all the parsed arguments. To check whether an argument is set, assuming action="store_true" is used:

if args.argument_name:
    pass  # do something
else:
    pass  # do something else
