How to make Firefox headless programmatically in Selenium with Python?

Question: How to make Firefox headless programmatically in Selenium with Python?

I am running this code with Python, Selenium, and Firefox, but I still get the 'headed' version of Firefox:

binary = FirefoxBinary('C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe', log_file=sys.stdout)
binary.add_command_line_options('-headless')
self.driver = webdriver.Firefox(firefox_binary=binary)

I also tried some variations of the binary:

binary = FirefoxBinary('C:\\Program Files\\Nightly\\firefox.exe', log_file=sys.stdout)
binary.add_command_line_options("--headless")

Answer 0

To run Firefox headlessly, you can set the headless property through the Options() class as follows:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("http://google.com/")
print ("Headless Firefox Initialized")
driver.quit()

There is another way to get headless mode. If you need to enable or disable headless mode in Firefox without changing the code, set the environment variable MOZ_HEADLESS to any value when you want Firefox to run headless, and leave it unset otherwise.

This is very useful with continuous integration, for example, when you want to run the functional tests on the server but still be able to run them in normal mode on your PC.

$ MOZ_HEADLESS=1 python manage.py test # testing example in Django with headless Firefox

or

$ export MOZ_HEADLESS=1   # this way you only have to set it once
$ python manage.py test functional/tests/directory
$ unset MOZ_HEADLESS      # if you want to disable headless mode
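
The same switch can also be set from Python when the tests are launched as a subprocess. This is only a minimal sketch and not part of the original answer: the helper name firefox_env is hypothetical, and it relies on nothing beyond the MOZ_HEADLESS variable described above.

```python
import os


def firefox_env(headless, base_env=None):
    """Return a copy of the environment with MOZ_HEADLESS set or removed.

    firefox_env is a hypothetical helper name used for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    if headless:
        # Per the answer above, any value enables headless mode.
        env["MOZ_HEADLESS"] = "1"
    else:
        # Leaving the variable unset keeps the normal (visible) browser.
        env.pop("MOZ_HEADLESS", None)
    return env
```

For example, subprocess.run(["python", "manage.py", "test"], env=firefox_env(True)) would run the Django tests from the shell example with headless Firefox, while firefox_env(False) keeps the browser window visible.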

Outro

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?


Answer 1

The first answer doesn't work anymore.

This worked for me:

from selenium.webdriver.firefox.options import Options as FirefoxOptions
from selenium import webdriver

options = FirefoxOptions()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.get("http://google.com")

Answer 2

My answer:

set_headless(headless=True) is deprecated. 

https://seleniumhq.github.io/selenium/docs/api/py/webdriver_firefox/selenium.webdriver.firefox.options.html

options.headless = True

works for me


Answer 3

Just a note for people who may find this later (and want a Java way of achieving this): FirefoxOptions is also capable of enabling headless mode:

FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.setHeadless(true);

Answer 4

I used the code below to choose the driver type, headless or headed, for both Firefox and Chrome:

# Selenium 3-style calls: the driver path is passed straight to the constructor
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.firefox.options import Options as FFOption

# `browser` is the browser type, passed in as a string

if browser.lower() == 'chrome':
    driver = webdriver.Chrome(r'..\drivers\chromedriver')
elif browser.lower() == 'headless chrome':
    ch_Options = Options()
    ch_Options.add_argument('--headless')
    ch_Options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(r'..\drivers\chromedriver', options=ch_Options)
elif browser.lower() == 'firefox':
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe')
elif browser.lower() == 'headless firefox':
    ff_option = FFOption()
    ff_option.add_argument('--headless')
    ff_option.add_argument("--disable-gpu")
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=ff_option)
elif browser.lower() == 'ie':
    driver = webdriver.Ie(r'..\drivers\IEDriverServer')
else:
    raise Exception('Invalid Browser Type')

Save Dataframe to csv directly to s3 Python

Question: Save Dataframe to csv directly to s3 Python

I have a pandas DataFrame that I want to upload to a new CSV file. The problem is that I don’t want to save the file locally before transferring it to s3. Is there any method like to_csv for writing the dataframe to s3 directly? I am using boto3.
Here is what I have so far:

import boto3
s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
read_file = s3.get_object(Bucket, Key)
df = pd.read_csv(read_file['Body'])

# Make alterations to DataFrame

# Then export DataFrame to CSV through direct transfer to s3

Answer 0

You can use:

from io import StringIO # python3; python2: BytesIO 
import boto3

bucket = 'my_bucket_name' # already created on S3
csv_buffer = StringIO()
df.to_csv(csv_buffer)
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, 'df.csv').put(Body=csv_buffer.getvalue())

Answer 1

You can directly use the S3 path. I am using Pandas 0.24.1

In [1]: import pandas as pd

In [2]: df = pd.DataFrame( [ [1, 1, 1], [2, 2, 2] ], columns=['a', 'b', 'c'])

In [3]: df
Out[3]:
   a  b  c
0  1  1  1
1  2  2  2

In [4]: df.to_csv('s3://experimental/playground/temp_csv/dummy.csv', index=False)

In [5]: pd.__version__
Out[5]: '0.24.1'

In [6]: new_df = pd.read_csv('s3://experimental/playground/temp_csv/dummy.csv')

In [7]: new_df
Out[7]:
   a  b  c
0  1  1  1
1  2  2  2

Release Note:

S3 File Handling

pandas now uses s3fs for handling S3 connections. This shouldn’t break any code. However, since s3fs is not a required dependency, you will need to install it separately, like boto in prior versions of pandas. GH11915.


Answer 2

I like s3fs which lets you use s3 (almost) like a local filesystem.

You can do this:

import s3fs

bytes_to_write = df.to_csv(None).encode()
fs = s3fs.S3FileSystem(key=key, secret=secret)
with fs.open('s3://bucket/path/to/file.csv', 'wb') as f:
    f.write(bytes_to_write)

s3fs supports only the rb and wb modes for opening a file, which is why I encode the CSV text into bytes_to_write first.


Answer 3

This is a more up to date answer:

import s3fs

s3 = s3fs.S3FileSystem(anon=False)

# Use 'w' for py3, 'wb' for py2
with s3.open('<bucket-name>/<filename>.csv','w') as f:
    df.to_csv(f)

The problem with StringIO is that it will eat away at your memory. With this method, you are streaming the file to s3, rather than converting it to string, then writing it into s3. Holding the pandas dataframe and its string copy in memory seems very inefficient.

If you are working on an EC2 instance, you can give it an IAM role that allows writing to S3, so you don't need to pass in credentials directly. However, you can also connect to a bucket by passing credentials to the S3FileSystem() function. See the documentation: https://s3fs.readthedocs.io/en/latest/


Answer 4

If you pass None as the first argument to to_csv() the data will be returned as a string. From there it’s an easy step to upload that to S3 in one go.

It should also be possible to pass a StringIO object to to_csv(), but using a string will be easier.
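
The string-based approach described above could look roughly like the sketch below. The function names dataframe_to_csv_bytes and upload_csv are hypothetical, index=False is my own choice, and client is assumed to be a boto3 S3 client such as the one created in the question.

```python
import pandas as pd


def dataframe_to_csv_bytes(df):
    """Serialize a DataFrame to UTF-8 encoded CSV bytes without touching disk."""
    # Passing None as the first argument makes to_csv return the CSV as a str.
    csv_text = df.to_csv(None, index=False)
    return csv_text.encode("utf-8")


def upload_csv(client, df, bucket, key):
    """Upload the DataFrame as a CSV object in a single put_object call.

    `client` is assumed to be a boto3 S3 client (boto3.client("s3")).
    """
    client.put_object(Bucket=bucket, Key=key, Body=dataframe_to_csv_bytes(df))
```

Keeping the serialization separate from the upload makes the CSV step easy to check locally before any S3 call is made.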


Answer 5

You can also use the AWS Data Wrangler:

import awswrangler as wr
    
wr.s3.to_csv(
    df=df,
    path="s3://...",
)

Note that it will handle multipart upload for you to make the upload faster.


Answer 6

I found this can be done using client also and not just resource.

from io import StringIO
import boto3
s3 = boto3.client("s3",
                  region_name=region_name,
                  aws_access_key_id=aws_access_key_id,
                  aws_secret_access_key=aws_secret_access_key)
csv_buf = StringIO()
df.to_csv(csv_buf, header=True, index=False)
csv_buf.seek(0)
s3.put_object(Bucket=bucket, Body=csv_buf.getvalue(), Key='path/test.csv')

Answer 7

Since you are using boto3.client(), try:

import boto3
from io import StringIO #python3 
s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
def copy_to_s3(client, df, bucket, filepath):
    csv_buf = StringIO()
    df.to_csv(csv_buf, header=True, index=False)
    csv_buf.seek(0)
    client.put_object(Bucket=bucket, Body=csv_buf.getvalue(), Key=filepath)
    print(f'Copy {df.shape[0]} rows to S3 Bucket {bucket} at {filepath}, Done!')

copy_to_s3(client=s3, df=df_to_upload, bucket='abc', filepath='def/test.csv')

Answer 8

I found a very simple solution that seems to work:

s3 = boto3.client("s3")

s3.put_object(
    Body=open("filename.csv").read(),
    Bucket="your-bucket",
    Key="your-key"
)

Hope that helps !


Answer 9

I read a CSV with two columns from an S3 bucket and load the contents of the file into a pandas DataFrame.

Example:

config.json

{
  "credential": {
    "access_key": "xxxxxx",
    "secret_key": "xxxxxx"
  },
  "s3": {
    "bucket": "mybucket",
    "key": "csv/user.csv"
  }
}

cls_config.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import json

class cls_config(object):

    def __init__(self,filename):

        self.filename = filename


    def getConfig(self):

        fileName = os.path.join(os.path.dirname(__file__), self.filename)
        with open(fileName) as f:
            config = json.load(f)
        return config

cls_pandas.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pandas as pd
import io

class cls_pandas(object):

    def __init__(self):
        pass

    def read(self,stream):

        df = pd.read_csv(io.StringIO(stream), sep = ",")
        return df

cls_s3.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import boto3
import json

class cls_s3(object):

    def  __init__(self,access_key,secret_key):

        self.s3 = boto3.client('s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key)

    def getObject(self,bucket,key):

        read_file = self.s3.get_object(Bucket=bucket, Key=key)
        body = read_file['Body'].read().decode('utf-8')
        return body

test.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from cls_config import *
from cls_s3 import *
from cls_pandas import *

class test(object):

    def __init__(self):
        self.conf = cls_config('config.json')

    def process(self):

        conf = self.conf.getConfig()

        bucket = conf['s3']['bucket']
        key = conf['s3']['key']

        access_key = conf['credential']['access_key']
        secret_key = conf['credential']['secret_key']

        s3 = cls_s3(access_key,secret_key)
        ob = s3.getObject(bucket,key)

        pa = cls_pandas()
        df = pa.read(ob)

        print(df)

if __name__ == '__main__':
    test = test()
    test.process()

How do I get a list of all the duplicate items using pandas in python?

Question: How do I get a list of all the duplicate items using pandas in python?

I have a list of items that likely has some export issues. I would like to get a list of the duplicate items so I can manually compare them. When I try to use the pandas duplicated method, it only returns the first duplicate. Is there a way to get all of the duplicates and not just the first one?

A small subsection of my dataset looks like this:

ID,ENROLLMENT_DATE,TRAINER_MANAGING,TRAINER_OPERATOR,FIRST_VISIT_DATE
1536D,12-Feb-12,"06DA1B3-Lebanon NH",,15-Feb-12
F15D,18-May-12,"06405B2-Lebanon NH",,25-Jul-12
8096,8-Aug-12,"0643D38-Hanover NH","0643D38-Hanover NH",25-Jun-12
A036,1-Apr-12,"06CB8CF-Hanover NH","06CB8CF-Hanover NH",9-Aug-12
8944,19-Feb-12,"06D26AD-Hanover NH",,4-Feb-12
1004E,8-Jun-12,"06388B2-Lebanon NH",,24-Dec-11
11795,3-Jul-12,"0649597-White River VT","0649597-White River VT",30-Mar-12
30D7,11-Nov-12,"06D95A3-Hanover NH","06D95A3-Hanover NH",30-Nov-11
3AE2,21-Feb-12,"06405B2-Lebanon NH",,26-Oct-12
B0FE,17-Feb-12,"06D1B9D-Hartland VT",,16-Feb-12
127A1,11-Dec-11,"064456E-Hanover NH","064456E-Hanover NH",11-Nov-12
161FF,20-Feb-12,"0643D38-Hanover NH","0643D38-Hanover NH",3-Jul-12
A036,30-Nov-11,"063B208-Randolph VT","063B208-Randolph VT",
475B,25-Sep-12,"06D26AD-Hanover NH",,5-Nov-12
151A3,7-Mar-12,"06388B2-Lebanon NH",,16-Nov-12
CA62,3-Jan-12,,,
D31B,18-Dec-11,"06405B2-Lebanon NH",,9-Jan-12
20F5,8-Jul-12,"0669C50-Randolph VT",,3-Feb-12
8096,19-Dec-11,"0649597-White River VT","0649597-White River VT",9-Apr-12
14E48,1-Aug-12,"06D3206-Hanover NH",,
177F8,20-Aug-12,"063B208-Randolph VT","063B208-Randolph VT",5-May-12
553E,11-Oct-12,"06D95A3-Hanover NH","06D95A3-Hanover NH",8-Mar-12
12D5F,18-Jul-12,"0649597-White River VT","0649597-White River VT",2-Nov-12
C6DC,13-Apr-12,"06388B2-Lebanon NH",,
11795,27-Feb-12,"0643D38-Hanover NH","0643D38-Hanover NH",19-Jun-12
17B43,11-Aug-12,,,22-Oct-12
A036,11-Aug-12,"06D3206-Hanover NH",,19-Jun-12

My code looks like this currently:

df_bigdata_duplicates = df_bigdata[df_bigdata.duplicated(cols='ID')]

There are a couple of duplicate items. But when I use the above code, I only get the first item. In the API reference, I see how I can get the last item, but I would like to have all of them so I can visually inspect them to see why I am getting the discrepancy. So, in this example I would like to get all three A036 entries and both 11795 entries and any other duplicated entries, instead of just the first one. Any help is most appreciated.


Answer 0

Method #1: print all rows where the ID is one of the IDs in duplicated:

>>> import pandas as pd
>>> df = pd.read_csv("dup.csv")
>>> ids = df["ID"]
>>> df[ids.isin(ids[ids.duplicated()])].sort("ID")
       ID ENROLLMENT_DATE        TRAINER_MANAGING        TRAINER_OPERATOR FIRST_VISIT_DATE
24  11795       27-Feb-12      0643D38-Hanover NH      0643D38-Hanover NH        19-Jun-12
6   11795        3-Jul-12  0649597-White River VT  0649597-White River VT        30-Mar-12
18   8096       19-Dec-11  0649597-White River VT  0649597-White River VT         9-Apr-12
2    8096        8-Aug-12      0643D38-Hanover NH      0643D38-Hanover NH        25-Jun-12
12   A036       30-Nov-11     063B208-Randolph VT     063B208-Randolph VT              NaN
3    A036        1-Apr-12      06CB8CF-Hanover NH      06CB8CF-Hanover NH         9-Aug-12
26   A036       11-Aug-12      06D3206-Hanover NH                     NaN        19-Jun-12

but I couldn’t think of a nice way to prevent repeating ids so many times. I prefer method #2: groupby on the ID.

>>> pd.concat(g for _, g in df.groupby("ID") if len(g) > 1)
       ID ENROLLMENT_DATE        TRAINER_MANAGING        TRAINER_OPERATOR FIRST_VISIT_DATE
6   11795        3-Jul-12  0649597-White River VT  0649597-White River VT        30-Mar-12
24  11795       27-Feb-12      0643D38-Hanover NH      0643D38-Hanover NH        19-Jun-12
2    8096        8-Aug-12      0643D38-Hanover NH      0643D38-Hanover NH        25-Jun-12
18   8096       19-Dec-11  0649597-White River VT  0649597-White River VT         9-Apr-12
3    A036        1-Apr-12      06CB8CF-Hanover NH      06CB8CF-Hanover NH         9-Aug-12
12   A036       30-Nov-11     063B208-Randolph VT     063B208-Randolph VT              NaN
26   A036       11-Aug-12      06D3206-Hanover NH                     NaN        19-Jun-12

Answer 1

With Pandas version 0.17, you can set ‘keep = False’ in the duplicated function to get all the duplicate items.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(['a','b','c','d','a','b'])

In [3]: df
Out[3]: 
       0
    0  a
    1  b
    2  c
    3  d
    4  a
    5  b

In [4]: df[df.duplicated(keep=False)]
Out[4]: 
       0
    0  a
    1  b
    4  a
    5  b

Answer 2

df[df.duplicated(['ID'], keep=False)]

It returns all the duplicated rows.

According to documentation:

keep : {‘first’, ‘last’, False}, default ‘first’

  • first : Mark duplicates as True except for the first occurrence.
  • last : Mark duplicates as True except for the last occurrence.
  • False : Mark all duplicates as True.
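
To make the three keep settings concrete, here is a small sketch on made-up data (the IDs are borrowed from the question, but this six-row frame is only an illustration):

```python
import pandas as pd

# Six rows: A036 appears three times, 8096 twice, 1536D once.
df = pd.DataFrame({"ID": ["A036", "8096", "A036", "1536D", "8096", "A036"]})

# keep="first": every occurrence except the first one is flagged.
flag_first = df.duplicated(["ID"], keep="first").tolist()

# keep="last": every occurrence except the last one is flagged.
flag_last = df.duplicated(["ID"], keep="last").tolist()

# keep=False: every row whose ID occurs more than once is flagged.
flag_all = df.duplicated(["ID"], keep=False).tolist()

# Selecting with the keep=False mask returns all rows of the repeated IDs.
all_duplicates = df[df.duplicated(["ID"], keep=False)]
```

Only keep=False flags the first A036 and 8096 rows together with their later repeats, which is what the question asks for.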

Answer 3

As I am unable to comment, I am posting this as a separate answer.

To find duplicates based on more than one column, list every column name as below. With the default keep='first', this returns each duplicated row except its first occurrence:

df[df[['product_uid', 'product_title', 'user']].duplicated() == True]
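
A short sketch of how the default compares with keep=False for a multi-column check. The column names come from the answer above; the sample rows are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "product_uid": [1, 1, 2, 1],
    "product_title": ["a", "a", "b", "a"],
    "user": ["x", "x", "y", "z"],
})
cols = ["product_uid", "product_title", "user"]

# Default keep="first": only the later repeat (row 1) is selected.
later_repeats = df[df.duplicated(subset=cols)]

# keep=False: both rows of the repeated combination (rows 0 and 1) are selected.
all_repeats = df[df.duplicated(subset=cols, keep=False)]
```

Row 3 is not selected in either case, because its user value differs, so the combination of the three columns is unique.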

Answer 4

df[df['ID'].duplicated() == True]

This worked for me


Answer 5

Using an element-wise logical or and setting the take_last argument of the pandas duplicated method to both True and False you can obtain a set from your dataframe that includes all of the duplicates.

df_bigdata_duplicates = df_bigdata[
    df_bigdata.duplicated(cols='ID', take_last=False) |
    df_bigdata.duplicated(cols='ID', take_last=True)
]

Answer 6

This may not be a solution to the question, but to illustrate examples:

import pandas as pd

df = pd.DataFrame({
    'A': [1,1,3,4],
    'B': [2,2,5,6],
    'C': [3,4,7,6],
})

print(df)
df.duplicated(keep=False)
df.duplicated(['A','B'], keep=False)

The outputs:

   A  B  C
0  1  2  3
1  1  2  4
2  3  5  7
3  4  6  6

0    False
1    False
2    False
3    False
dtype: bool

0     True
1     True
2    False
3    False
dtype: bool

Answer 7

sort("ID") no longer seems to work; it appears to be deprecated according to the sort documentation, so use sort_values("ID") instead to sort after filtering the duplicates, as follows:

df[df.ID.duplicated(keep=False)].sort_values("ID")

Answer 8

For my database duplicated(keep=False) did not work until the column was sorted.

data.sort_values(by=['Order ID'], inplace=True)
df = data[data['Order ID'].duplicated(keep=False)]

Answer 9

df[df.duplicated(['ID'])==True].sort_values('ID')



ImportError: No module named pip

Question: ImportError: No module named pip

OS: Mac OS X 10.7.5 Python Ver: 2.7.5

I have installed setuptools 1.0 with ez_setup.py from https://pypi.python.org/pypi/setuptools. Then I downloaded the pip 1.4.1 package from https://pypi.python.org/pypi/pip/1.4.1.

Running (sudo) python setup.py install in iTerm shows:

running install
running bdist_egg
running egg_info
writing requirements to pip.egg-info/requires.txt
writing pip.egg-info/PKG-INFO
writing top-level names to pip.egg-info/top_level.txt
writing dependency_links to pip.egg-info/dependency_links.txt
writing entry points to pip.egg-info/entry_points.txt
warning: manifest_maker: standard file 'setup.py' not found

reading manifest file 'pip.egg-info/SOURCES.txt'
writing manifest file 'pip.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.6-intel/egg
running install_lib
warning: install_lib: 'build/lib' does not exist -- no Python modules to install

creating build/bdist.macosx-10.6-intel/egg
creating build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/PKG-INFO -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/SOURCES.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/dependency_links.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/entry_points.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/not-zip-safe -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/requires.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/top_level.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
creating 'dist/pip-1.4.1-py2.7.egg' and adding 'build/bdist.macosx-10.6-intel/egg' to it
removing 'build/bdist.macosx-10.6-intel/egg' (and everything under it)
Processing pip-1.4.1-py2.7.egg
removing '/Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg' (and everything under it)
creating /Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg
Extracting pip-1.4.1-py2.7.egg to /Users/dl/Library/Python/2.7/lib/python/site-packages
pip 1.4.1 is already the active version in easy-install.pth
Installing pip script to /Users/dl/Library/Python/2.7/bin
Installing pip-2.7 script to /Users/dl/Library/Python/2.7/bin

Installed /Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg
Processing dependencies for pip==1.4.1
Finished processing dependencies for pip==1.4.1

然后我输入pip install,错误信息显示为

Traceback (most recent call last):   File
"/Library/Frameworks/Python.framework/Versions/2.7/bin/pip", line 9,
in <module>
load_entry_point('pip==1.4.1', 'console_scripts', 'pip')()   File "build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 357, in
load_entry_point   File
"build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 2394, in
load_entry_point   File
"build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 2108, in
load ImportError: No module named pip

曾经遇到过相同问题并且可以给我一些解决方法的人吗?

OS: Mac OS X 10.7.5 Python Ver: 2.7.5

I installed setuptools 1.0 with ez_setup.py from https://pypi.python.org/pypi/setuptools. Then I downloaded the pip 1.4.1 package from https://pypi.python.org/pypi/pip/1.4.1.

Running (sudo) python setup.py install in iTerm shows:

running install
running bdist_egg
running egg_info
writing requirements to pip.egg-info/requires.txt
writing pip.egg-info/PKG-INFO
writing top-level names to pip.egg-info/top_level.txt
writing dependency_links to pip.egg-info/dependency_links.txt
writing entry points to pip.egg-info/entry_points.txt
warning: manifest_maker: standard file 'setup.py' not found

reading manifest file 'pip.egg-info/SOURCES.txt'
writing manifest file 'pip.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.6-intel/egg
running install_lib
warning: install_lib: 'build/lib' does not exist -- no Python modules to install

creating build/bdist.macosx-10.6-intel/egg
creating build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/PKG-INFO -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/SOURCES.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/dependency_links.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/entry_points.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/not-zip-safe -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/requires.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/top_level.txt -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
creating 'dist/pip-1.4.1-py2.7.egg' and adding 'build/bdist.macosx-10.6-intel/egg' to it
removing 'build/bdist.macosx-10.6-intel/egg' (and everything under it)
Processing pip-1.4.1-py2.7.egg
removing '/Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg' (and everything under it)
creating /Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg
Extracting pip-1.4.1-py2.7.egg to /Users/dl/Library/Python/2.7/lib/python/site-packages
pip 1.4.1 is already the active version in easy-install.pth
Installing pip script to /Users/dl/Library/Python/2.7/bin
Installing pip-2.7 script to /Users/dl/Library/Python/2.7/bin

Installed /Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg
Processing dependencies for pip==1.4.1
Finished processing dependencies for pip==1.4.1

Then I ran pip install, and the error message looked like this:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/pip", line 9, in <module>
    load_entry_point('pip==1.4.1', 'console_scripts', 'pip')()
  File "build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 357, in load_entry_point
  File "build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 2394, in load_entry_point
  File "build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 2108, in load
ImportError: No module named pip

Has anyone run into the same problem and can give me some tips to solve it?


Answer 0

I had the same problem. My solution:

For Python 3

sudo apt-get install python3-pip

For Python 2

sudo apt-get install python-pip

Answer 1

On a Mac, brew is a better option, since apt-get is not available. Command:

brew install python

If you have both python2 and python3 installed on your machine,

python2.7 -m ensurepip --default-pip

should solve the issue.

If instead you are missing pip from python 3 then simply change python2.7 to python3 in the command above.


Answer 2

After installing ez_setup, you should have easy_install available. To install pip just do:

easy_install pip

Answer 3

With macOS 10.15 and Homebrew 2.1.6 I was getting this error with Python 3.7. I just needed to run:

python3 -m ensurepip

Now python3 -m pip works for me.


Answer 4

Try to install pip through Python:

Please go to: https://pip.pypa.io/en/stable/installing/

and download get-pip.py, and then run:

(sudo) python get-pip.py

Answer 5

I ran into this same issue when I attempted to install the nova client.

spencers-macbook-pro:python-novaclient root# python  setup.py install    
running install
/usr/bin/python: No module named pip
error: /usr/bin/python -m pip.__init__ install   'pbr>=0.5.21,<1.0' 'iso8601>=0.1.4' 'PrettyTable>=0.6,<0.8' 'requests>=1.1' 'simplejson>=2.0.9' 'six' 'Babel>=0.9.6' returned 1

I use homebrew so I worked around the issue with sudo easy_install pip

spencers-macbook-pro:python-novaclient root# brew search pip
aespipe     brew-pip    lesspipe    pipebench   pipemeter   spiped  pipeviewer

If you meant "pip" precisely:

Homebrew provides pip via: `brew install python`. However you will then
have two Pythons installed on your Mac, so alternatively you can:
    sudo easy_install pip
spencers-macbook-pro:python-novaclient root# sudo easy_install pip

The commands should be similar if you use macports.


Answer 6

I think none of these answers above can fix your problem.

I was also confused by this problem once. You should manually install pip following the official pip installation guide (which currently involves running a single get-pip.py Python script).

After that, just run sudo pip install Django. The error will be gone.


Answer 7

I know this thread is old, but I just solved the problem for myself on OS X differently than described here.

Basically I reinstalled Python 2.7 through brew, and it comes with pip.

Install Xcode if not already:

xcode-select --install

Install Brew as described here:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Then install Python through Brew:

brew install python

And you’re done. In my case I just needed to install pyserial.

pip install pyserial

Answer 8

I downloaded pip binaries from here and it resolved the issue.


Answer 9

In terminal try this:

ls -lA /usr/local/bin | grep pip

In my case I get:

-rwxr-xr-x 1 root  root      284 Сен 13 16:20 pip
-rwxr-xr-x 1 root  root      204 Окт 27 16:37 pip2
-rwxr-xr-x 1 root  root      204 Окт 27 16:37 pip2.7
-rwxr-xr-x 1 root  root      292 Сен 13 16:20 pip-3.4

So pip2 || pip2.7 in my case works, and pip


Answer 10

My Python version is 3.7.3, and this command worked:

python3.7 -m pip install requests

requests library – for retrieving data from web APIs.

This runs the pip module and asks it to find the requests library on PyPI.org (the Python Package Index) and install it on your local system so that it becomes available for you to import.
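
Since pip is installed per interpreter, it can help to ask the interpreter itself whether it sees pip. The following is a minimal sketch of that check, not an official tool; the helper names pip_is_importable and ensurepip_command are made up for this illustration, and ensurepip may be missing on some Linux distributions that package pip separately.

```python
import importlib.util
import sys


def pip_is_importable():
    """Return True if the interpreter running this script can import pip."""
    return importlib.util.find_spec("pip") is not None


def ensurepip_command(default_pip=False):
    """Build the ensurepip command line for the interpreter running this script."""
    cmd = [sys.executable, "-m", "ensurepip"]
    if default_pip:
        cmd.append("--default-pip")  # also install the unversioned "pip" script
    return cmd


if pip_is_importable():
    print("pip is importable from", sys.executable)
else:
    print("pip is missing; try:", " ".join(ensurepip_command()))
```

Running the script with a specific interpreter (for example python3.7 check_pip.py, where the file name is arbitrary) reports on that interpreter only.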


Answer 11

I solved a similar error on Linux by setting PYTHONPATH to the site-packages location. This was after running python get-pip.py --prefix /home/chet/pip.

[chet@rhel1 ~]$ ~/pip/bin/pip -V
Traceback (most recent call last):
  File "/home/chet/pip/bin/pip", line 7, in <module>
    from pip import main
ImportError: No module named pip

[chet@rhel1 ~]$ export PYTHONPATH=/home/chet/pip/lib/python2.6/site-packages

[chet@rhel1 ~]$ ~/pip/bin/pip -V
pip 9.0.1 from /home/chet/pip/lib/python2.6/site-packages (python 2.6)
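
The PYTHONPATH value above follows the usual POSIX layout of prefix/lib/pythonX.Y/site-packages. As a rough sketch under that assumption (the helper name site_packages_for_prefix is hypothetical, and other platforms or install schemes may use a different layout), the path can be derived like this:

```python
import posixpath
import sys


def site_packages_for_prefix(prefix, version=None):
    """Guess the site-packages directory under prefix (POSIX layout assumed)."""
    major, minor = version if version is not None else sys.version_info[:2]
    return posixpath.join(
        prefix, "lib", "python{}.{}".format(major, minor), "site-packages"
    )


# Reproduce the value used in the transcript above (Python 2.6, prefix /home/chet/pip).
path = site_packages_for_prefix("/home/chet/pip", (2, 6))
print("export PYTHONPATH=" + path)
```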

Answer 12

Tested on Linux: you can download pip directly from https://pypi.org/simple/pip/, untar it, and use it directly with your latest Python.

tar -xvf  pip-0.2.tar.gz
cd pip-0.2

Check the contents:

anant$ ls
docs  pip.egg-info  pip-log.txt  pip.py  PKG-INFO  regen-docs  scripts  setup.cfg  setup.py  tests

Execute directly:

anant$ python pip.py --help
Usage: pip.py COMMAND [OPTIONS]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -E DIR, --environment=DIR
                        virtualenv environment to run pip in (either give the
                        interpreter or the environment base directory)
  -v, --verbose         Give more output
  -q, --quiet           Give less output
  --log=FILENAME        Log file where a complete (maximum verbosity) record
                        will be kept
  --proxy=PROXY         Specify a proxy in the form
                        user:passwd@proxy.server:port. Note that the
                        user:password@ is optional and required only if you
                        are behind an authenticated proxy.  If you provide
                        user@proxy.server:port then you will be prompted for a
                        password.
  --timeout=SECONDS     Set the socket timeout (default 15 seconds)

Answer 13

Here’s a minimal set of instructions for upgrading to Python 3 using MacPorts:

sudo port install py37-pip
sudo port select --set pip pip37
sudo port select --set pip3 pip37
sudo pip install numpy scipy matplotlib

I ran some old code and it works again after this upgrade.


Answer 14

I followed the advice at this URL and renamed the python39._pth file. That solved the issue.

https://michlstechblog.info/blog/python-install-python-with-pip-on-windows-by-the-embeddable-zip-file/#more-5606

ren python39._pth python39._pth.save

Answer 15

On some Linux distributions, such as Ubuntu, first run apt-get update and then try installing the python-pip package. Without apt-get update, you might get an error such as:

E: Unable to locate package python-pip

1. Update:

sudo apt-get update

2. Install the pip package:

For python2

sudo apt-get install python-pip

or

For python3

sudo apt-get install python3-pip

And done!


"UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure." when plotting a figure with pyplot in PyCharm

Question: "UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure." when plotting a figure with pyplot in PyCharm

I am trying to plot a simple graph using pyplot, e.g.:

import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()

but the figure does not appear and I get the following message:

UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.

I saw in several places that one had to change the configuration of matplotlib using the following:

import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

I did this, but then got an error message because it cannot find a module:

ModuleNotFoundError: No module named 'tkinter'

Then, I tried to install “tkinter” using pip install tkinter (inside the virtual environment), but it does not find it:

Collecting tkinter
  Could not find a version that satisfies the requirement tkinter (from versions: )
No matching distribution found for tkinter

I should also mention that I am running all this on Pycharm Community Edition IDE using a virtual environment, and that my operating system is Linux/Ubuntu 18.04.

I would like to know how I can solve this problem in order to be able to display the graph.


Answer 0

I found a solution to my problem (thanks to the help of ImportanceOfBeingErnest).

All I had to do was to install tkinter through the Linux bash terminal using the following command:

sudo apt-get install python3-tk

instead of installing it with pip or directly in the virtual environment in Pycharm.
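
To confirm whether the interpreter used by the virtual environment can actually see tkinter, a quick check along the following lines may help. This is only a sketch: the helper names tkinter_available and pick_backend are invented for this example, and a successful import does not guarantee that a display is available.

```python
def tkinter_available():
    """Return True if this interpreter can import tkinter."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


def pick_backend():
    """Choose 'TkAgg' when tkinter imports, otherwise the non-GUI 'Agg'."""
    return "TkAgg" if tkinter_available() else "Agg"


print("backend to request:", pick_backend())
# With matplotlib installed, pass the result to matplotlib.use(...)
# before importing matplotlib.pyplot.
```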


Answer 1

In my case, the error message was implying that I was working in a headless console. So plt.show() could not work. What worked was calling plt.savefig:

import matplotlib.pyplot as plt

plt.plot([1,2,3], [5,7,4])
plt.savefig("mygraph.png")

I found the answer on a github repository.
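
If the same script has to run both on a desktop and on a headless machine, one option is to decide between plt.show() and plt.savefig() at run time. The sketch below assumes a Linux setup where the DISPLAY or WAYLAND_DISPLAY environment variable signals a GUI session; has_display and show_or_save are hypothetical helper names, not part of matplotlib.

```python
import os


def has_display(environ=None):
    """Guess whether a GUI display is reachable (Linux X11/Wayland assumption)."""
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


def show_or_save(plt, path="mygraph.png", environ=None):
    """Call plt.show() when a display seems available, else plt.savefig(path)."""
    if has_display(environ):
        plt.show()
        return "shown"
    plt.savefig(path)
    return "saved"


# Typical use, with matplotlib installed:
#   import matplotlib.pyplot as plt
#   plt.plot([1, 2, 3], [5, 7, 4])
#   show_or_save(plt)
```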


Answer 2

If you use Arch Linux (or distributions like Manjaro or Antergos), simply type:

sudo pacman -S tk

And all will work perfectly!


Answer 3

Try import tkinter, because PyCharm already installed tkinter for you. I looked at Install tkinter for Python.

You can maybe try:

import tkinter
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()

as a way to get tkinter loaded.

I tried it your way, and it runs without error on my computer and successfully shows the figure. Maybe that is because PyCharm has tkinter as a system package, so you don’t need to install it. But if you can’t find tkinter, you can go to Tkdocs to see how to install it; as it mentions, tkinter is a core package for Python.


Answer 4

I too had this issue in PyCharm. It occurs because the tkinter module is not installed on your machine.

To install it, follow the steps below for your operating system.

For ubuntu users

 sudo apt-get install python-tk

or

 sudo apt-get install python3-tk

For Centos users

 sudo yum install python-tkinter

or

 sudo yum install python3-tkinter

For Windows, use pip to install tk

After installing tkinter, restart PyCharm and run your code; it will work.


Answer 5

Simply install:

pip3 install PyQt5==5.9.2

It works for me.


Answer 6

You can change the backend matplotlib uses from agg to the Tkinter-based TkAgg with this command:

matplotlib.use('TKAgg',warn=False, force=True)

Answer 7

Linux Mint 19. This helped me:

sudo apt install tk-dev

P.S. Recompile the Python interpreter after installing the package.


Answer 8

Just in case this helps anybody.

Python version: 3.7.7
Platform: Ubuntu 18.04.4 LTS

The system came with Python 3.6.9 by default, but I had installed my own Python 3.7.7 on it (built from source).

tkinter was not working even though help('modules') showed tkinter in the list.

The following steps worked for me:

  1. Install the Tk development package:

    sudo apt-get install tk-dev

  2. Rebuild Python. Navigate to your Python source folder and run the checks:

    cd Python-3.7.7
    sudo ./configure --enable-optimizations

  3. Build using the make command (here 8 is the number of processors; check yours with the nproc command):

    sudo make -j 8

  4. Install using:

    sudo make altinstall

Don’t use sudo make install; it will overwrite the default 3.6.9 version, which might be messy later.

  5. Check tkinter now:

    python3.7 -m tkinter

A window will pop up; your tkinter is ready now.


Answer 9

After upgrading lots of packages (Spyder 3 to 4, Keras and Tensorflow and lots of their dependencies), I had the same problem today! I cannot figure out what happened, but the (conda-based) virtual environment that kept using Spyder 3 did not have the problem. Although installing tkinter or changing the backend, via matplotlib.use('TkAgg') as shown above, or this nice post on how to change the backend, might well resolve the problem, I don’t see these as rigorous solutions. For me, uninstalling matplotlib and reinstalling it was magic and the problem was solved.

pip uninstall matplotlib

… then, install

pip install matplotlib

From all the above, this could be a package management problem, and BTW, I use both conda and pip, whenever feasible.


Answer 10

When I ran into this error on Spyder, I changed from running my code line by line to highlighting my block of plotting code and running that all at once. Voila, the image appeared.


Answer 11

I added %matplotlib inline and my plot showed up in Jupyter Notebook.


Answer 12

The comment by @xicocaio should be highlighted.

tkinter is Python version-specific in the sense that sudo apt-get install python3-tk will install tkinter exclusively for your default version of Python. If you have different Python versions within various virtual environments, you will have to install tkinter for the Python version used in that virtual environment, for example sudo apt-get install python3.7-tk. Not doing this will still lead to No module named 'tkinter' errors, even after installing it for the global Python version.


Useful code which uses reduce()? [closed]

Question: Useful code which uses reduce()? [closed]

Does anyone here have any useful code which uses reduce() function in python? Is there any code other than the usual + and * that we see in the examples?

See Fate of reduce() in Python 3000 by GvR.


Answer 0

The other uses I’ve found for it besides + and * were with and and or, but now we have any and all to replace those cases.

foldl and foldr do come up in Scheme a lot…

Here’s some cute usages:

Flatten a list

Goal: turn [[1, 2, 3], [4, 5], [6, 7, 8]] into [1, 2, 3, 4, 5, 6, 7, 8].

reduce(list.__add__, [[1, 2, 3], [4, 5], [6, 7, 8]], [])

List of digits to a number

Goal: turn [1, 2, 3, 4, 5, 6, 7, 8] into 12345678.

Ugly, slow way:

int("".join(map(str, [1,2,3,4,5,6,7,8])))

Pretty reduce way:

reduce(lambda a,d: 10*a+d, [1,2,3,4,5,6,7,8], 0)
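
For readers on Python 3, where reduce is no longer a builtin, the two snippets above can be run as the following sketch. The wrapper names flatten and digits_to_number are chosen here for illustration only.

```python
from functools import reduce  # reduce lives in functools on Python 3


def flatten(lists):
    """Concatenate a list of lists into one flat list."""
    return reduce(list.__add__, lists, [])


def digits_to_number(digits):
    """Fold decimal digits, most significant first, into one integer."""
    return reduce(lambda acc, d: 10 * acc + d, digits, 0)


print(flatten([[1, 2, 3], [4, 5], [6, 7, 8]]))       # [1, 2, 3, 4, 5, 6, 7, 8]
print(digits_to_number([1, 2, 3, 4, 5, 6, 7, 8]))    # 12345678
```

Note that list concatenation inside reduce copies the accumulated list at every step, so for long inputs itertools.chain.from_iterable is usually the cheaper way to flatten.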

Answer 1

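reduce() can be used to find the least common multiple of 3 or more numbers:

#!/usr/bin/env python
from functools import reduce
try:
    from math import gcd       # Python 3.5+
except ImportError:
    from fractions import gcd  # Python 2; removed from fractions in Python 3.9

def lcm(*args):
    # Fold pairwise: lcm(a, b) = a * b // gcd(a, b)
    return reduce(lambda a,b: a * b // gcd(a, b), args)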

Example:

>>> lcm(100, 23, 98)
112700
>>> lcm(*range(1, 20))
232792560

Answer 2

reduce() could be used to resolve dotted names (where eval() is too unsafe to use):

>>> import __main__
>>> reduce(getattr, "os.path.abspath".split('.'), __main__)
<function abspath at 0x009AB530>

Answer 3

Find the intersection of N given lists:

input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

result = reduce(set.intersection, map(set, input_list))

returns:

result = set([3, 4, 5])

via: Python – Intersection of two lists
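
A self-contained Python 3 version of the same idea might look like the sketch below. The function name intersect_all is an assumption of this example, and it expects at least one input list, because reduce without an initial value fails on an empty sequence.

```python
from functools import reduce


def intersect_all(lists):
    """Intersect any number of lists; assumes at least one list is given."""
    return reduce(set.intersection, map(set, lists))


input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]
print(sorted(intersect_all(input_list)))  # [3, 4, 5]
```

The same result is available without reduce as set.intersection(*map(set, input_list)), since set.intersection accepts any number of arguments.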


Answer 4

I think reduce is a silly command. Hence:

reduce(lambda hold,next:hold+chr(((ord(next.upper())-65)+13)%26+65),'znlorabggbbhfrshy','')

Answer 5

The usage of reduce that I found in my code involved the situation where I had some class structure for logic expression and I needed to convert a list of these expression objects to a conjunction of the expressions. I already had a function make_and to create a conjunction given two expressions, so I wrote reduce(make_and,l). (I knew the list wasn’t empty; otherwise it would have been something like reduce(make_and,l,make_true).)

This is exactly the reason that (some) functional programmers like reduce (or fold functions, as such functions are typically called). There are often already many binary functions like +, *, min, max, concatenation and, in my case, make_and and make_or. Having a reduce makes it trivial to lift these operations to lists (or trees or whatever you got, for fold functions in general).

Of course, if certain instantiations (such as sum) are often used, then you don’t want to keep writing reduce. However, instead of defining the sum with some for-loop, you can just as easily define it with reduce.

Readability, as mentioned by others, is indeed an issue. You could argue, however, that the only reason why people find reduce less “clear” is that it is not a function that many people know and/or use.
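
To make the make_and example concrete, here is a toy sketch of lifting a binary constructor to a whole list with reduce. The classes Var and And, the TRUE placeholder, and the conjunction helper are all hypothetical stand-ins; the answer does not show its actual expression classes.

```python
from functools import reduce


class Var:
    """A named variable in a toy expression tree (hypothetical)."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


class And:
    """Conjunction of two sub-expressions (hypothetical)."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __repr__(self):
        return "({!r} & {!r})".format(self.left, self.right)


def make_and(a, b):
    """Binary constructor: build the conjunction of two expressions."""
    return And(a, b)


def conjunction(exprs, empty=Var("TRUE")):
    """Fold make_and over exprs; return `empty` when there is nothing to fold."""
    exprs = list(exprs)
    return reduce(make_and, exprs) if exprs else empty


print(conjunction([Var("a"), Var("b"), Var("c")]))  # ((a & b) & c)
```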


Answer 6

Function composition: If you already have a list of functions that you’d like to apply in succession, such as:

color = lambda x: x.replace('brown', 'blue')
speed = lambda x: x.replace('quick', 'slow')
work = lambda x: x.replace('lazy', 'industrious')
fs = [str.lower, color, speed, work, str.title]

Then you can apply them all consecutively with:

>>> call = lambda s, func: func(s)
>>> s = "The Quick Brown Fox Jumps Over the Lazy Dog"
>>> reduce(call, fs, s)
'The Slow Blue Fox Jumps Over The Industrious Dog'

In this case, method chaining may be more readable. But sometimes it isn’t possible, and this kind of composition may be more readable and maintainable than a f1(f2(f3(f4(x)))) kind of syntax.
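
The same pattern can be wrapped in a small reusable helper. The sketch below is one possible way to do it; the name pipeline is an assumption of this example rather than a standard library function.

```python
from functools import reduce


def pipeline(*funcs):
    """Return a function that applies funcs to a value from left to right."""
    return lambda value: reduce(lambda acc, f: f(acc), funcs, value)


shout = pipeline(str.strip, str.upper, lambda s: s + "!")
print(shout("  hello "))  # HELLO!
```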


Answer 7

If you already have the path a/b/c/.. as a list, you could replace value = json_obj['a']['b']['c']['d']['e'] with:

value = reduce(dict.__getitem__, 'abcde', json_obj)

For an example, see Change values in dict of nested dicts using items in a list.
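
A slightly more general sketch uses operator.getitem, which also works when the nested structure mixes dictionaries and lists. The helper name get_path and the sample data are invented for this illustration.

```python
from functools import reduce
import operator


def get_path(data, keys):
    """Index into nested containers, one key or index at a time."""
    return reduce(operator.getitem, keys, data)


config = {"a": {"b": [10, 20, {"c": "found"}]}}
print(get_path(config, ["a", "b", 2, "c"]))  # found
```

A missing key still raises KeyError (or IndexError for lists), just as the chained subscript expression would.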


Answer 8

@Blair Conrad: You could also implement your glob/reduce using sum, like so:

files = sum([glob.glob(f) for f in args], [])

This is less verbose than either of your two examples, is perfectly Pythonic, and is still only one line of code.

So to answer the original question, I personally try to avoid using reduce because it’s never really necessary and I find it to be less clear than other approaches. However, some people get used to reduce and come to prefer it to list comprehensions (especially Haskell programmers). But if you’re not already thinking about a problem in terms of reduce, you probably don’t need to worry about using it.


Answer 9

reduce can be used to support chained attribute lookups:

reduce(getattr, ('request', 'user', 'email'), self)

Of course, this is equivalent to

self.request.user.email

but it’s useful when your code needs to accept an arbitrary list of attributes.

(Chained attributes of arbitrary length are common when dealing with Django models.)
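
As a runnable illustration that does not need Django, the sketch below resolves an attribute chain on the standard os module. The helper name resolve is an assumption of this example.

```python
import os
from functools import reduce


def resolve(root, names):
    """Follow each attribute name in turn, starting from root."""
    return reduce(getattr, names, root)


abspath = resolve(os, "path.abspath".split("."))
print(abspath is os.path.abspath)  # True
```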


Answer 10

reduce is useful when you need to find the union or intersection of a sequence of set-like objects.

>>> reduce(operator.or_, ({1}, {1, 2}, {1, 3}))  # union
{1, 2, 3}
>>> reduce(operator.and_, ({1}, {1, 2}, {1, 3}))  # intersection
{1}

(Apart from actual sets, one example of such objects is Django’s Q objects.)

On the other hand, if you’re dealing with bools, you should use any and all:

>>> any((True, False, True))
True

Answer 11

After grepping my code, it seems the only thing I’ve used reduce for is calculating the factorial:

reduce(operator.mul, xrange(1, x+1) or (1,))

Answer 12

I’m writing a compose function for a language, so I construct the composed function using reduce along with my apply operator.

In a nutshell, compose takes a list of functions to compose into a single function. If I have a complex operation that is applied in stages, I want to put it all together like so:

complexop = compose(stage4, stage3, stage2, stage1)

This way, I can then apply it to an expression like so:

complexop(expression)

And I want it to be equivalent to:

stage4(stage3(stage2(stage1(expression))))

Now, to build my internal objects, I want it to say:

Lambda([Symbol('x')], Apply(stage4, Apply(stage3, Apply(stage2, Apply(stage1, Symbol('x'))))))

(The Lambda class builds a user-defined function, and Apply builds a function application.)

Now, reduce, unfortunately, folds the wrong way, so I wound up using, roughly:

reduce(lambda x,y: Apply(y, x), reversed(args + [Symbol('x')]))

To figure out what reduce produces, try these in the REPL:

reduce(lambda x, y: (x, y), range(1, 11))
reduce(lambda x, y: (y, x), reversed(range(1, 11)))
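
The same folding idea can be sketched for ordinary Python callables. This is only an illustration under the assumption that the stages are plain functions; it does not use the Lambda, Apply, or Symbol classes from the answer, and stage1/stage2 here are made-up examples (Python 3):

```python
from functools import reduce


def compose(*funcs):
    """Return a function such that compose(f, g, h)(x) == f(g(h(x)))."""
    def composed(x):
        # Fold over the reversed list so the right-most function runs first.
        return reduce(lambda acc, f: f(acc), reversed(funcs), x)
    return composed


def stage1(x):
    return x + 1


def stage2(x):
    return x * 2


complexop = compose(stage2, stage1)
print(complexop(3))  # evaluates stage2(stage1(3))
```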

Answer 13

reduce can be used to get the list with the maximum nth element

reduce(lambda x,y: x if x[2] > y[2] else y,[[1,2,3,4],[5,2,5,7],[1,6,0,2]])

would return [5, 2, 5, 7], as it is the list with the largest third element.
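
For comparison, a short Python 3 sketch showing that the built-in max with a key function gives the same result for this data. One difference to keep in mind: on ties, the reduce version above keeps the later list, while max keeps the earlier one.

```python
from functools import reduce

rows = [[1, 2, 3, 4], [5, 2, 5, 7], [1, 6, 0, 2]]

# The reduce approach from the answer
by_reduce = reduce(lambda x, y: x if x[2] > y[2] else y, rows)

# The same selection expressed with max and a key function
by_max = max(rows, key=lambda row: row[2])

print(by_reduce, by_max)
```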


Answer 14

Reduce isn’t limited to scalar operations; it can also be used to sort things into buckets. (This is what I use reduce for most often).

Imagine a case in which you have a list of objects, and you want to re-organize it hierarchically based on properties stored flatly in the object. In the following example, I produce a list of metadata objects related to articles in an XML-encoded newspaper with the articles function. articles generates a list of XML elements, and then maps through them one by one, producing objects that hold some interesting info about them. On the front end, I’m going to want to let the user browse the articles by section/subsection/headline. So I use reduce to take the list of articles and return a single dictionary that reflects the section/subsection/article hierarchy.

from lxml import etree
from Reader import Reader

class IssueReader(Reader):
    def articles(self):
        arts = self.q('//div3')  # inherited ... runs an xpath query against the issue
        subsection = etree.XPath('./ancestor::div2/@type')
        section = etree.XPath('./ancestor::div1/@type')
        header_text = etree.XPath('./head//text()')
        return map(lambda art: {
            'text_id': self.id,
            'path': self.getpath(art)[0],
            'subsection': (subsection(art)[0] or '[none]'),
            'section': (section(art)[0] or '[none]'),
            'headline': (''.join(header_text(art)) or '[none]')
        }, arts)

    def by_section(self):
        arts = self.articles()

        def extract(acc, art):  # acc for accumulator
            section = acc.get(art['section'], False)
            if section:
                subsection = section.get(art['subsection'], False)
                if subsection:
                    subsection.append(art)
                else:
                    section[art['subsection']] = [art]
            else:
                acc[art['section']] = {art['subsection']: [art]}
            return acc

        return reduce(extract, arts, {})

I give both functions here because I think it shows how map and reduce can complement each other nicely when dealing with objects. The same thing could have been accomplished with a for loop, … but spending some serious time with a functional language has tended to make me think in terms of map and reduce.

By the way, if anybody has a better way to set properties like I’m doing in extract, where the parents of the property you want to set might not exist yet, please let me know.
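
One possible answer to that last question is dict.setdefault, which returns the existing value or inserts and returns a default. The sketch below is an assumption-laden illustration in Python 3: the article dictionaries are made-up sample data, not output of the IssueReader class above.

```python
from functools import reduce


def extract(acc, art):
    # setdefault creates each missing parent on the way down,
    # so no explicit "does this key exist yet?" branching is needed.
    acc.setdefault(art["section"], {}).setdefault(art["subsection"], []).append(art)
    return acc


# Hypothetical sample data
arts = [
    {"section": "news", "subsection": "local", "headline": "A"},
    {"section": "news", "subsection": "world", "headline": "B"},
    {"section": "news", "subsection": "local", "headline": "C"},
    {"section": "sport", "subsection": "[none]", "headline": "D"},
]

by_section = reduce(extract, arts, {})
print(sorted(by_section))
```

collections.defaultdict would be another option when the nesting depth is fixed.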


Answer 15

Not sure if this is what you are after but you can search source code on Google.

Follow the link for a search on ‘function:reduce() lang:python’ on Google Code search

At first glance the following projects use reduce()

  • MoinMoin
  • Zope
  • Numeric
  • ScientificPython

etc. etc. but then these are hardly surprising since they are huge projects.

The functionality of reduce can be done using function recursion which I guess Guido thought was more explicit.

Update:

Since Google’s Code Search was discontinued on 15-Jan-2012, besides reverting to regular Google searches, there’s something called Code Snippets Collection that looks promising. A number of other resources are mentioned in the answers to this (closed) question: Replacement for Google Code Search?

Update 2 (29-May-2017):

A good source for Python examples (in open-source code) is the Nullege search engine.


Answer 16

import os

files = [
    # full filenames
    "var/log/apache/errors.log",
    "home/kane/images/avatars/crusader.png",
    "home/jane/documents/diary.txt",
    "home/kane/images/selfie.jpg",
    "var/log/abc.txt",
    "home/kane/.vimrc",
    "home/kane/images/avatars/paladin.png",
]

# unfolding of plain filename list to file-tree
fs_tree = ({}, # dict of folders
           []) # list of files
for full_name in files:
    path, fn = os.path.split(full_name)
    reduce(
        # this function walks deep into path
        # and creates placeholders for subfolders
        lambda d, k: d[0].setdefault(k,         # walk deep
                                     ({}, [])), # or create subfolder storage
        path.split(os.path.sep),
        fs_tree
    )[1].append(fn)

print fs_tree
#({'home': (
#    {'jane': (
#        {'documents': (
#           {},
#           ['diary.txt']
#        )},
#        []
#    ),
#    'kane': (
#       {'images': (
#          {'avatars': (
#             {},
#             ['crusader.png',
#             'paladin.png']
#          )},
#          ['selfie.jpg']
#       )},
#       ['.vimrc']
#    )},
#    []
#  ),
#  'var': (
#     {'log': (
#         {'apache': (
#            {},
#            ['errors.log']
#         )},
#         ['abc.txt']
#     )},
#     [])
#},
#[])

Answer 17

def dump(fname,iterable):
  with open(fname,'w') as f:
    reduce(lambda x, y: f.write(unicode(y,'utf-8')), iterable)

Answer 18

I used reduce to concatenate a list of PostgreSQL search vectors with the || operator in sqlalchemy-searchable:

vectors = (self.column_vector(getattr(self.table.c, column_name))
           for column_name in self.indexed_columns)
concatenated = reduce(lambda x, y: x.op('||')(y), vectors)
compiled = concatenated.compile(self.conn)

Answer 19

I have an old Python implementation of pipegrep that uses reduce and the glob module to build a list of files to process:

files = []
files.extend(reduce(lambda x, y: x + y, map(glob.glob, args)))

I found it handy at the time, but it’s really not necessary, as something similar is just as good, and probably more readable

files = []
for f in args:
    files.extend(glob.glob(f))

Answer 20

Let’s say some yearly statistics are stored in a list of Counters. We want to find the MIN/MAX value for each month across the different years. For example, the minimum for January would be 10, and for February it would be 15. We need to store the results in a new Counter.

from collections import Counter

stat2011 = Counter({"January": 12, "February": 20, "March": 50, "April": 70, "May": 15,
           "June": 35, "July": 30, "August": 15, "September": 20, "October": 60,
           "November": 13, "December": 50})

stat2012 = Counter({"January": 36, "February": 15, "March": 50, "April": 10, "May": 90,
           "June": 25, "July": 35, "August": 15, "September": 20, "October": 30,
           "November": 10, "December": 25})

stat2013 = Counter({"January": 10, "February": 60, "March": 90, "April": 10, "May": 80,
           "June": 50, "July": 30, "August": 15, "September": 20, "October": 75,
           "November": 60, "December": 15})

stat_list = [stat2011, stat2012, stat2013]

print reduce(lambda x, y: x & y, stat_list)     # MIN
print reduce(lambda x, y: x | y, stat_list)     # MAX
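
A condensed Python 3 sketch of the same idea, using operator.and_ and operator.or_ in place of the lambdas and a two-month subset of the data above. Note that Counter's & and | keep only positive counts, so this assumes the statistics are all greater than zero.

```python
from collections import Counter
from functools import reduce
import operator

# Two-month subset of the yearly statistics
stats = [
    Counter(January=12, February=20),
    Counter(January=36, February=15),
    Counter(January=10, February=60),
]

monthly_min = reduce(operator.and_, stats)  # & keeps the minimum per key
monthly_max = reduce(operator.or_, stats)   # | keeps the maximum per key

print(monthly_min)
print(monthly_max)
```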

Answer 21

I have objects representing some kind of overlapping intervals (genomic exons), and redefined their intersection using __and__:

class Exon:
    def __init__(self):
        ...
    def __and__(self,other):
        ...
        length = self.length + other.length  # (e.g.)
        return self.__class__(...length,...)

Then when I have a collection of them (for instance, in the same gene), I use

intersection = reduce(lambda x,y: x&y, exons)

Answer 22

I just found a useful application of reduce: splitting a string without removing the delimiter. The code is taken entirely from the Programatically Speaking blog. Here’s the code:

reduce(lambda acc, elem: acc[:-1] + [acc[-1] + elem] if elem == "\n" else acc + [elem], re.split("(\n)", "a\nb\nc\n"), [])

Here’s the result:

['a\n', 'b\n', 'c\n', '']

Note that it handles edge cases that the popular answer on SO doesn’t. For a more in-depth explanation, see the original blog post.


Answer 23

Using reduce() to find out if a list of dates are consecutive:

from datetime import date, timedelta


def checked(d1, d2):
    """
    We assume the date list is sorted.
    If d2 & d1 are different by 1, everything up to d2 is consecutive, so d2
    can advance to the next reduction.
    If d2 & d1 are not different by 1, returning d1 - 1 for the next reduction
    will guarantee the result produced by reduce() to be something other than
    the last date in the sorted date list.

    Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered consecutive
    Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered not consecutive

    """
    #if (d2 - d1).days == 1 or (d2 - d1).days == 0:  # for Definition 1
    if (d2 - d1).days == 1:                          # for Definition 2
        return d2
    else:
        return d1 + timedelta(days=-1)

# datelist = [date(2014, 1, 1), date(2014, 1, 3),
#             date(2013, 12, 31), date(2013, 12, 30)]

# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
#             date(2014, 2, 21), date(2014, 2, 22)]

datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]

datelist.sort()

if datelist[-1] == reduce(checked, datelist):
    print "dates are consecutive"
else:
    print "dates are not consecutive"

How do I find the first occurrence of a substring in a Python string?

Question: How do I find the first occurrence of a substring in a Python string?

So if my string is “the dude is a cool dude”.
I’d like to find the first index of ‘dude’:

mystring.findfirstindex('dude') # should return 4

What is the python command for this?
Thanks.


Answer 0

find()

>>> s = "the dude is a cool dude"
>>> s.find('dude')
4

Answer 1

Quick Overview: index and find

Besides the find method there is also index. find and index both yield the same result, the position of the first occurrence, but if nothing is found index raises a ValueError whereas find returns -1. Speedwise, both have the same benchmark results.

s.find(t)    #returns: -1, or index where t starts in s
s.index(t)   #returns: Same as find, but raises ValueError if t is not in s

Additional knowledge: rfind and rindex:

In general, find and index return the smallest index where the passed-in string starts, and rfind and rindex return the largest index where it starts. Most string searching algorithms search from left to right, so functions starting with r indicate that the search happens from right to left.

So if the substring you are searching for is likely to be closer to the end of the string than to the start, rfind or rindex would be faster.

s.rfind(t)   #returns: Same as find, but searched right to left
s.rindex(t)  #returns: Same as index, but searches right to left

Source: Python: Visual QuickStart Guide, Toby Donaldson
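
A small sketch that puts the four methods side by side on the string from the question. The substring "bro" is just an arbitrary example of something that is not present:

```python
s = "the dude is a cool dude"

first = s.find("dude")    # lowest index where "dude" starts
last = s.rfind("dude")    # highest index where "dude" starts
missing = s.find("bro")   # find signals "not found" with -1

try:
    s.index("bro")        # index signals "not found" with an exception
    raised = False
except ValueError:
    raised = True

print(first, last, missing, raised)
```

Which of the two styles fits better depends on whether a missing substring is an expected case (check for -1) or an error (let the ValueError propagate).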


Answer 2

To implement this algorithmically, without relying on Python's built-in string search methods, you can write:

def find_pos(string,word):

    for i in range(len(string) - len(word)+1):
        if string[i:i+len(word)] == word:
            return i
    return 'Not Found'

string = "the dude is a cool dude"
word = 'dude'
print(find_pos(string,word))
# output 4

Answer 3

def find_pos(chaine,x):

    for i in range(len(chaine)):
        if chaine[i] ==x :
            return 'yes',i 
    return 'no'

Pass a parameter to a Fabric task

Question: Pass a parameter to a Fabric task

How can I pass a parameter to a fabric task when calling “fab” from the command line? For example:

def task(something=''):
    print "You said %s" % something
$ fab task "hello"
You said hello

Done.

Is it possible to do this without prompting with fabric.operations.prompt?


Answer 0

Fabric 2 task arguments documentation:

http://docs.pyinvoke.org/en/latest/concepts/invoking-tasks.html#task-command-line-arguments


Fabric 1.X uses the following syntax for passing arguments to tasks:

 fab task:'hello world'
 fab task:something='hello'
 fab task:foo=99,bar=True
 fab task:foo,bar

You can read more about it in Fabric docs.


Answer 1

Fabric arguments are understood with very basic string parsing, so you have to be a bit careful with how you send them.

Here are a few examples of different ways to pass arguments to the following test function:

@task
def test(*args, **kwargs):
    print("args:", args)
    print("named args:", kwargs)

$ fab "test:hello world"
('args:', ('hello world',))
('named args:', {})

$ fab "test:hello,world"
('args:', ('hello', 'world'))
('named args:', {})

$ fab "test:message=hello world"
('args:', ())
('named args:', {'message': 'hello world'})

$ fab "test:message=message \= hello\, world"
('args:', ())
('named args:', {'message': 'message = hello, world'})

I use double quotes here to take the shell out of the equation, but single quotes may be better on some platforms. Also note the escapes for characters that Fabric considers delimiters.

More details in the docs: http://docs.fabfile.org/en/1.14/usage/fab.html#per-task-arguments


Answer 2

In Fabric 2, simply add the argument to your task function. For example, to pass the version argument to task deploy:

@task
def deploy(context, version):
    ...

Run it as follows:

fab -H host deploy --version v1.2.3

Fabric even documents the options automatically:

$ fab --help deploy
Usage: fab [--core-opts] deploy [--options] [other tasks here ...]

Docstring:
  none

Options:
  -v STRING, --version=STRING

Answer 3

You need to pass all Python variables as strings, especially if you are using sub-process to run the scripts, or you will get an error. You will need to convert the variables back to int/boolean types separately.

def print_this(var):
    print str(var)

fab print_this:'hello world'
fab print_this:var='hello'
fab print_this:'99'
fab print_this:var='True'
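
Converting the received strings back might look roughly like the sketch below. The helper names to_bool and to_int are hypothetical and not part of Fabric; the accepted spellings of "true" are an assumption you may want to adjust.

```python
def to_bool(text):
    """Interpret common command-line spellings of a boolean (assumed set)."""
    return str(text).strip().lower() in ("1", "true", "yes", "y", "on")


def to_int(text, default=0):
    """Convert text to int, falling back to default when it is not a number."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return default


# Values as a task would receive them from the command line: always strings
print(to_int("99"), to_bool("True"), to_bool("False"))
```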

Answer 4

If someone is looking to pass parameters from one task to another in fabric2, just use the environment dictionary for that:

@task
def qa(ctx):
  ctx.config.run.env['counter'] = 22
  ctx.config.run.env['conn'] = Connection('qa_host')

@task
def sign(ctx):
  print(ctx.config.run.env['counter'])
  conn = ctx.config.run.env['conn']
  conn.run('touch mike_was_here.txt')

And run:

fab2 qa sign

round() doesn't seem to be rounding properly

Question: round() doesn't seem to be rounding properly

The documentation for the round() function states that you pass it a number, and the positions past the decimal to round. Thus it should do this:

n = 5.59
round(n, 1) # 5.6

But, in actuality, good old floating point weirdness creeps in and you get:

5.5999999999999996

For the purposes of UI, I need to display 5.6. I poked around the Internet and found some documentation that this is dependent on my implementation of Python. Unfortunately, this occurs on both my Windows dev machine and each Linux server I’ve tried. See here also.

Short of creating my own round library, is there any way around this?


Answer 0

I can’t help the way it’s stored, but at least formatting works correctly:

'%.1f' % round(n, 1) # Gives you '5.6'

Answer 1

Formatting works correctly even without having to round:

"%.1f" % n

Answer 2

If you use the Decimal module you can approximate without the use of the ’round’ function. Here is what I’ve been using for rounding especially when writing monetary applications:

from decimal import Decimal, ROUND_UP

Decimal(str(16.2)).quantize(Decimal('.01'), rounding=ROUND_UP)

This will return a Decimal Number which is 16.20.


Answer 3

round(5.59, 1) is working fine. The problem is that 5.6 cannot be represented exactly in binary floating point.

>>> 5.6
5.5999999999999996
>>> 

As Vinko says, you can use string formatting to do rounding for display.

Python has a module for decimal arithmetic if you need that.
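
To see the stored value for yourself, one option (sketched here for Python 3) is to pass the float to Decimal, which converts it exactly instead of rounding it for display:

```python
from decimal import Decimal

n = round(5.59, 1)

exact = Decimal(n)         # the binary value that is actually stored
wanted = Decimal("5.6")    # the decimal value we had in mind
shown = format(n, ".1f")   # rounding applied only for display

print(exact)
print(shown)
```

The float is the closest representable value to 5.6, so round did its job; only the display needs formatting.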


Answer 4

You get ‘5.6’ if you do str(round(n, 1)) instead of just round(n, 1).


Answer 5

You can switch the data type to an integer:

>>> n = 5.59
>>> int(n * 10) / 10.0
5.5
>>> int(n * 10 + 0.5)
56

And then display the number by inserting the locale’s decimal separator.

However, Jimmy’s answer is better.


Answer 6

Floating point math is vulnerable to slight, but annoying, precision inaccuracies. If you can work with integer or fixed point, you will be guaranteed precision.


Answer 7

Take a look at the Decimal module

Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.

and

Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have an exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.

Decimal provides the kind of operations that make it easy to write apps that require floating point operations and also need to present those results in a human readable format, e.g., accounting.
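
The 1.1 + 2.2 case from the quote can be reproduced with a few lines (Python 3). Note that the Decimal values are built from strings; building them from floats would carry the binary error along.

```python
from decimal import Decimal

float_sum = 1.1 + 2.2
decimal_sum = Decimal("1.1") + Decimal("2.2")

print(float_sum)
print(decimal_sum)
```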


Answer 8

printf the sucker.

print '%.1f' % 5.59  # returns 5.6

Answer 9

It’s a big problem indeed. Try out this code:

print "%.2f" % (round((2*4.4+3*5.6+3*4.4)/8,2),)

It displays 4.85. Then you do:

print "Media = %.1f" % (round((2*4.4+3*5.6+3*4.4)/8,1),)

and it shows 4.8. If you do the calculations by hand, the exact answer is 4.85, but if you try:

print "Media = %.20f" % (round((2*4.4+3*5.6+3*4.4)/8,20),)

you can see the truth: a floating-point number is stored as the nearest finite sum of fractions whose denominators are powers of two.


Answer 10

You can use the string format operator %, similar to sprintf.

mystring = "%.2f" % 5.5999

Answer 11

Works Perfect

format(5.59, '.1f') # to display
float(format(5.59, '.1f')) #to round

Answer 12

I am doing:

int(round( x , 0))

In this case, we first round properly at the unit level, then we convert to integer to avoid printing a float.

so

>>> int(round(5.59,0))
6

I think this answer works better than formatting the string, and it also makes more sense to me to use the round function.


Answer 13

I would avoid relying on round() at all in this case. Consider

print(round(61.295, 2))
print(round(1.295, 2))

will output

61.3
1.29

which is not a desired output if you need solid rounding to the nearest integer. To bypass this behavior go with math.ceil() (or math.floor() if you want to round down):

from math import ceil
decimal_count = 2
print(ceil(61.295 * 10 ** decimal_count) / 10 ** decimal_count)
print(ceil(1.295 * 10 ** decimal_count) / 10 ** decimal_count)

outputs

61.3
1.3

Hope that helps.
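
If the goal is round-half-up rather than always rounding up, the standard decimal module is one alternative. The following is a minimal sketch, not the answer author's method; the helper name round_half_up is hypothetical, and it assumes the float's short repr (via str()) is the decimal value you intended:

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places=2):
    """Round value to `places` decimals, with ties going away from zero."""
    # Decimal(10) ** -2 is Decimal('0.01'); quantize() rounds to that exponent.
    quantum = Decimal(10) ** -places
    # str(value) yields the short repr ("1.295"), not the binary approximation.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up(61.295))  # 61.30
print(round_half_up(1.295))   # 1.30
print(round_half_up(1.291))   # 1.29
```

Unlike the ceil() approach, 1.291 stays at 1.29 here, because only exact ties are pushed upward.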


Answer 14

Code:

x1 = 5.63
x2 = 5.65
print(float('%.2f' % round(x1,1)))  # gives you '5.6'
print(float('%.2f' % round(x2,1)))  # gives you '5.7'

Output:

5.6
5.7

Answer 15

Here’s where I see round failing. What if you wanted to round these two numbers to one decimal place: 23.45 and 23.55? I was taught that rounding them should give 23.4 and 23.6, the “rule” being that you round up if the preceding digit is odd and do not round up if it is even. The round function in Python simply truncates the 5.
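
The rule described here is commonly called round-half-to-even (banker's rounding). A minimal sketch of applying it with the standard decimal module follows; the helper name round_half_even is hypothetical, and it assumes the numbers are supplied as strings so that no binary approximation is involved:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(text, places=1):
    """Round a decimal literal given as a string, sending ties to the even digit."""
    quantum = Decimal(10) ** -places  # Decimal('0.1') for places=1
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)

print(round_half_even("23.45"))  # 23.4
print(round_half_even("23.55"))  # 23.6
```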


Answer 16

The problem occurs only when the last digit is 5. For example, 0.045 is stored internally as 0.044999999999999… You could simply change the last digit to 6 and then round off. This will give you the desired results.

import re
from decimal import Decimal


def custom_round(num, precision=0):
    # Get the type of given number
    type_num = type(num)
    # If the given type is not a valid number type, raise TypeError
    if type_num not in [int, float, Decimal]:
        raise TypeError("type {} doesn't define __round__ method".format(type_num.__name__))
    # If passed number is int, there is no rounding off.
    if type_num == int:
        return num
    # Convert number to string.
    str_num = str(num).lower()
    # We will remove negative context from the number and add it back in the end
    negative_number = False
    if num < 0:
        negative_number = True
        str_num = str_num[1:]
    # If number is in format 1e-12 or 2e+13, we have to convert it to
    # to a string in standard decimal notation.
    if 'e-' in str_num:
        # For 1.23e-7, e_power = 7
        e_power = int(re.findall('e-[0-9]+', str_num)[0][2:])
        # For 1.23e-7, number = 123
        number = ''.join(str_num.split('e-')[0].split('.'))
        zeros = ''
        # Number of zeros = e_power - 1 = 6
        for i in range(e_power - 1):
            zeros = zeros + '0'
        # Scientific notation 1.23e-7 in regular decimal = 0.000000123
        str_num = '0.' + zeros + number
    if 'e+' in str_num:
        # For 1.23e+7, e_power = 7
        e_power = int(re.findall(r'e\+[0-9]+', str_num)[0][2:])
        # For 1.23e+7, number_characteristic = 1
        # characteristic is number left of decimal point.
        number_characteristic = str_num.split('e+')[0].split('.')[0]
        # For 1.23e+7, number_mantissa = 23
        # mantissa is number right of decimal point.
        number_mantissa = str_num.split('e+')[0].split('.')[1]
        # For 1.23e+7, number = 123
        number = number_characteristic + number_mantissa
        zeros = ''
        # Eg: for this condition = 1.23e+7
        if e_power >= len(number_mantissa):
            # Number of zeros = e_power - mantissa length = 5
            for i in range(e_power - len(number_mantissa)):
                zeros = zeros + '0'
            # Scientific notation 1.23e+7 in regular decimal = 12300000.0
            str_num = number + zeros + '.0'
        # Eg: for this condition = 1.23e+1
        if e_power < len(number_mantissa):
            # In this case, we only need to shift the decimal e_power digits to the right
            # So we just copy the digits from mantissa to characteristic and then remove
            # them from mantissa.
            for i in range(e_power):
                number_characteristic = number_characteristic + number_mantissa[i]
            number_mantissa = number_mantissa[e_power:]
            # Scientific notation 1.23e+1 in regular decimal = 12.3
            str_num = number_characteristic + '.' + number_mantissa
    # characteristic is number left of decimal point.
    characteristic_part = str_num.split('.')[0]
    # mantissa is number right of decimal point.
    mantissa_part = str_num.split('.')[1]
    # If number is supposed to be rounded to whole number,
    # check first decimal digit. If more than 5, return
    # characteristic + 1 else return characteristic
    if precision == 0:
        if mantissa_part and int(mantissa_part[0]) >= 5:
            return type_num(int(characteristic_part) + 1)
        return type_num(characteristic_part)
    # Get the precision of the given number.
    num_precision = len(mantissa_part)
    # Rounding off is done only if number precision is
    # greater than requested precision
    if num_precision <= precision:
        return num
    # Replace the last '5' with 6 so that rounding off returns desired results
    if str_num[-1] == '5':
        str_num = re.sub('5$', '6', str_num)
    result = round(type_num(str_num), precision)
    # If the number was negative, add negative context back
    if negative_number:
        result = result * -1
    return result

Answer 17

Another potential option is:

def hard_round(number, decimal_places=0):
    """
    Function:
    - Rounds a float value to a specified number of decimal places
    - Fixes issues with floating point binary approximation rounding in python
    Requires:
    - `number`:
        - Type: int|float
        - What: The number to round
    Optional:
    - `decimal_places`:
        - Type: int 
        - What: The number of decimal places to round to
        - Default: 0
    Example:
    ```
    hard_round(5.6,1)
    ```
    """
    return int(number*(10**decimal_places)+0.5)/(10**decimal_places)
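
One caveat with hard_round: int() truncates toward zero, so negative inputs are treated differently from positive ones (hard_round(-2.5) returns -2.0, while hard_round(2.5) returns 3.0). A hedged sketch of a sign-aware variant follows, under the assumption that ties should round away from zero; the name hard_round_signed is hypothetical and not part of the answer:

```python
import math

def hard_round_signed(number, decimal_places=0):
    """Round half away from zero, treating negative numbers symmetrically."""
    factor = 10 ** decimal_places
    # Work on the magnitude so that floor() behaves the same for both signs.
    magnitude = math.floor(abs(number) * factor + 0.5) / factor
    # Restore the original sign.
    return math.copysign(magnitude, number)

print(hard_round_signed(2.5))    # 3.0
print(hard_round_signed(-2.5))   # -3.0
```

Like the original, this still multiplies binary floats, so inputs that are not exactly representable can land on either side of a tie.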

Answer 18

What about:

round(n,1)+epsilon

pip install fails with the following error: OSError: [Errno 13] Permission denied on directory

Question: pip install fails with the following error: OSError: [Errno 13] Permission denied on directory

pip install -r requirements.txt fails with the exception below: OSError: [Errno 13] Permission denied: '/usr/local/lib/.... What’s wrong and how do I fix this? (I am trying to set up Django.)

Installing collected packages: amqp, anyjson, arrow, beautifulsoup4, billiard, boto, braintree, celery, cffi, cryptography, Django, django-bower, django-braces, django-celery, django-crispy-forms, django-debug-toolbar, django-disqus, django-embed-video, django-filter, django-merchant, django-pagination, django-payments, django-storages, django-vote, django-wysiwyg-redactor, easy-thumbnails, enum34, gnureadline, idna, ipaddress, ipython, kombu, mock, names, ndg-httpsclient, Pillow, pyasn1, pycparser, pycrypto, PyJWT, pyOpenSSL, python-dateutil, pytz, requests, six, sqlparse, stripe, suds-jurko
Cleaning up...
Exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 283, in run
    requirement_set.install(install_options, global_options, root=options.root_path)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1436, in install
    requirement.install(install_options, global_options, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 672, in install
    self.move_wheel_files(self.source_dir, root=root)
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 902, in move_wheel_files
    pycompile=self.pycompile,
  File "/usr/lib/python2.7/dist-packages/pip/wheel.py", line 206, in move_wheel_files
    clobber(source, lib_dir, True)
  File "/usr/lib/python2.7/dist-packages/pip/wheel.py", line 193, in clobber
    os.makedirs(destsubdir)
  File "/usr/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/amqp-1.4.6.dist-info'

Answer 0

Option a) Create a virtualenv, activate it and install:

virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt

Option b) Install in your homedir:

pip install --user -r requirements.txt

I recommend the safer option (a), so that this project’s requirements do not interfere with those of other projects.
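
Before running pip install under option (a), it can help to confirm that the virtualenv is actually active. A minimal sketch, assuming Python 3 with the standard venv module or a recent virtualenv release (very old virtualenv versions set sys.real_prefix instead, which this check does not cover); the function name in_virtualenv is hypothetical:

```python
import sys

def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside an environment, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base interpreter.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

print("virtualenv active:", in_virtualenv())
print("packages will be installed under:", sys.prefix)
```

If this prints False after you have run the activate script, pip is probably still the system-wide one, which would explain a Permission denied error.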


Answer 1

We should really stop advising the use of sudo with pip install. It’s better to first try pip install --user. If this fails then take a look at the top post here.

The reason you shouldn’t use sudo is as follows:

When you run pip with sudo, you are running arbitrary Python code from the Internet as a root user, which is quite a big security risk. If someone puts up a malicious project on PyPI and you install it, you give an attacker root access to your machine.


Answer 2

You are trying to install a package on the system-wide path without having the permission to do so.

  1. In general, you can use sudo to temporarily obtain superuser permissions at your responsibility in order to install the package on the system-wide path:

     sudo pip install -r requirements.txt
    

    Find more about sudo here.

    Actually, this is a bad idea and there’s no good use case for it, see @wim’s comment.

  2. If you don’t want to make system-wide changes, you can install the package on your per-user path using the --user flag.

    All it takes is:

     pip install --user -r requirements.txt
    
  3. Finally, for even finer grained control, you can also use a virtualenv, which might be the superior solution for a development environment, especially if you are working on multiple projects and want to keep track of each one’s dependencies.

    After activating your virtualenv with

    $ source my-virtualenv/bin/activate

    the following command will install the package inside the virtualenv (and not on the system-wide path):

    pip install -r requirements.txt
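
To check which of the three cases applies, you can ask Python where packages would go and whether the current user may write there. This is a rough sketch using only the standard library; the function name describe_install_targets is hypothetical, and on distribution-patched Pythons the directory reported by sysconfig may differ from the one pip actually writes to:

```python
import os
import site
import sysconfig

def describe_install_targets():
    """Report the interpreter's site-packages directory and the per-user one."""
    # Inside a virtualenv this is the environment's own site-packages.
    system_dir = sysconfig.get_paths()["purelib"]
    # Target of `pip install --user`; it may not exist yet.
    user_dir = site.getusersitepackages()
    return {
        "system_dir": system_dir,
        "system_writable": os.access(system_dir, os.W_OK),
        "user_dir": user_dir,
    }

for key, value in describe_install_targets().items():
    print(key, "=", value)
```

When system_writable is False, a plain pip install is likely to hit the Permission denied error, and --user or a virtualenv is the way around it.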


Answer 3

Just clarifying what worked for me after much pain with permission-denied errors on Linux (Ubuntu-based). Leveraging Bert’s answer above, I now use …

$ pip install --user <package-name>

or if running pip on a requirements file …

$ pip install --user -r requirements.txt

and these work reliably for every pip install including creating virtual environments.

However, the cleanest solution in my further experience has been to install python-virtualenv and virtualenvwrapper with sudo apt-get install at the system level.

Then, inside virtual environments, use pip install without the --user flag AND without sudo. Much cleaner, safer, and easier overall.


Answer 4

The user doesn’t have write permission for some Python installation paths. You can grant it with:

sudo chown -R $USER /absolute/path/to/directory

So grant the permission, then try the install again. If the error names further paths, grant permission for those as well:

sudo chown -R $USER /usr/local/lib/python2.7/

Answer 5

If you need permissions, you cannot use ‘pip’ with ‘sudo’. You can do a trick, so that you can use ‘sudo’ and install package. Just place ‘sudo python -m …’ in front of your pip command.

sudo python -m pip install --user package_name

Answer 6

So, I got this exact same error for a completely different reason. Due to a totally separate but known Homebrew + pip bug, I had followed this workaround listed in Google Cloud’s help docs, where you create a .pydistutils.cfg file in your home directory. This file has special config that you’re only supposed to use when installing certain libraries. I should have removed that .pydistutils.cfg file after installing the packages, but I forgot to do so. So the fix for me was actually just…

rm ~/.pydistutils.cfg

And then everything worked as normal. Of course, if you have some config in that file for a real reason, then you won’t want to just straight rm that file. But in case anyone else did that workaround, and forgot to remove that file, this did the trick for me!


Answer 7

It is a permission problem. Either change the owner of the directory:

sudo chown -R $USER /path/to/your/python/installed/directory

By default it would be /usr/local/lib/python2.7/

Or try:

pip install --user package_name

Then run pip install -r requirements.txt, which will install inside your environment.

Don’t run sudo pip install -r requirements.txt, which will install into an arbitrary Python path.

