
How do I turn a Python datetime into a string with a readable date format?

Question: How do I turn a Python datetime into a string with a readable date format?

t = e['updated_parsed']
dt = datetime.datetime(t[0], t[1], t[2], t[3], t[4], t[5], t[6])
print(dt)
# prints: 2010-01-28 08:39:49.000003

How do I turn that into a string like this?

"January 28, 2010"

Answer 0

The datetime class has a strftime method. The Python docs list the format codes it accepts.

For this specific example, it would look something like:

my_datetime.strftime("%B %d, %Y")
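As a minimal sketch of the call above, assuming a fixed datetime built from the timestamp in the question (my_datetime is only a placeholder name) and the default C locale, since %B is locale-dependent:

```python
from datetime import datetime

# Placeholder value matching the timestamp shown in the question.
my_datetime = datetime(2010, 1, 28, 8, 39, 49, 3)

# %B = full month name, %d = zero-padded day of month, %Y = four-digit year.
formatted = my_datetime.strftime("%B %d, %Y")
print(formatted)  # January 28, 2010
```

Note that %d is zero-padded, so the 5th of a month would come out as "05".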

Answer 1

Here is how you can accomplish the same using Python's general formatting function:

>>> from datetime import datetime
>>> "{:%B %d, %Y}".format(datetime.now())

The formatting characters used here are the same as those used by strftime. Don't miss the leading : in the format specifier.

In most cases, using format() instead of strftime() can make the code more readable, easier to write, and consistent with the way the rest of the formatted output is generated:

>>> "{} today's date is: {:%B %d, %Y}".format("Andre", datetime.now())

Compare the above with the following strftime() alternative:

>>> "{} today's date is {}".format("Andre", datetime.now().strftime("%B %d, %Y"))

Moreover, the following is not going to work, because the % operator tries to consume the date directives as well:

>>> datetime.now().strftime("%s %B %d, %Y" % "Andre")
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    datetime.now().strftime("%s %B %d, %Y" % "Andre")
TypeError: not enough arguments for format string

And so on…
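A small sketch of why the format-spec approach works, assuming a fixed date for reproducibility: whatever follows the colon in a replacement field is handed to the datetime object's own formatting hook, which interprets it like a strftime pattern. The variable names are illustrative only.

```python
from datetime import datetime

dt = datetime(2010, 1, 28, 8, 39, 49)

# The text after ':' is passed to datetime.__format__, which
# formats it the same way strftime would.
via_format = "{:%B %d, %Y}".format(dt)
via_builtin = format(dt, "%B %d, %Y")
via_strftime = dt.strftime("%B %d, %Y")

print(via_format == via_builtin == via_strftime)  # True
```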


Answer 2

Use f-strings, in Python 3.6+:

from datetime import datetime

date_string = f'{datetime.now():%Y-%m-%d %H:%M:%S%z}'
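One detail worth checking in the pattern above: %z only produces output for timezone-aware datetimes. datetime.now() without a tz argument is naive, so %z expands to an empty string. A short sketch, assuming UTC is acceptable for illustration:

```python
from datetime import datetime, timezone

naive = datetime(2010, 1, 28, 8, 39, 49)
aware = datetime(2010, 1, 28, 8, 39, 49, tzinfo=timezone.utc)

# Same pattern, applied to a naive and an aware datetime.
naive_string = f'{naive:%Y-%m-%d %H:%M:%S%z}'
aware_string = f'{aware:%Y-%m-%d %H:%M:%S%z}'

print(naive_string)  # 2010-01-28 08:39:49
print(aware_string)  # 2010-01-28 08:39:49+0000
```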

Answer 3

Very old question, I know. But with the new f-strings (starting from Python 3.6) there are fresh options. So here, for completeness:

from datetime import datetime

dt = datetime.now()

# str.format
strg = '{:%B %d, %Y}'.format(dt)
print(strg)  # July 22, 2017

# datetime.strftime
strg = dt.strftime('%B %d, %Y')
print(strg)  # July 22, 2017

# f-strings in python >= 3.6
strg = f'{dt:%B %d, %Y}'
print(strg)  # July 22, 2017

The "strftime() and strptime() Behavior" section of the Python docs explains what the format specifiers mean.


Answer 4

Read about strftime in the official docs.


Answer 5

A Python datetime object has a ctime() method, which returns a string in a readable format.

>>> from datetime import datetime
>>> a = datetime.now()
>>> a.ctime()
'Mon May 21 18:35:18 2018'

Answer 6

Is this for formatting the date?

def format_date(day, month, year):
    # {} means "insert the string representation of the next argument here"
    return "{}/{}/{}".format(day, month, year)

How to join two sets in one line without using "|"

Question: How to join two sets in one line without using "|"

Assume that S and T are assigned sets. Without using the join operator |, how can I find the union of the two sets? This, for example, finds the intersection:

S = {1, 2, 3, 4}
T = {3, 4, 5, 6}
S_intersect_T = { i for i in S if i in T }

So how can I find the union of two sets in one line without using |?


Answer 0

You can use the union method for sets: set.union(other_set)

Note that it returns a new set i.e it doesn’t modify itself.
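A brief sketch using the sets from the question, to show that union() leaves both operands untouched (S_union_T is an illustrative name):

```python
S = {1, 2, 3, 4}
T = {3, 4, 5, 6}

# union() builds a new set; S and T are not modified.
S_union_T = S.union(T)

print(S_union_T == {1, 2, 3, 4, 5, 6})  # True
print(S == {1, 2, 3, 4})                # True

# union() also accepts any iterable, not only sets.
with_list = S.union([7, 8])
```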


Answer 1

You could use the or_ alias:

>>> from operator import or_
>>> from functools import reduce  # needed in Python 3, where reduce is no longer a builtin
>>> reduce(or_, [{1, 2, 3, 4}, {3, 4, 5, 6}])
{1, 2, 3, 4, 5, 6}

Answer 2

If you are fine with modifying the original set (which you may want to do in some cases), you can use set.update():

S.update(T)

The return value is None, but S will be updated to be the union of the original S and T.
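A minimal sketch of that behavior, using the sets from the question; the copy named original_S is only there to show what changed:

```python
S = {1, 2, 3, 4}
T = {3, 4, 5, 6}

original_S = set(S)   # keep a copy, because update() mutates S in place
result = S.update(T)  # update() returns None

print(result)      # None
print(S == {1, 2, 3, 4, 5, 6})  # True
```

Because the return value is None, writing S = S.update(T) would discard the set.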


Answer 3

Assuming you also can't use s.union(t), which is equivalent to s | t, you could try:

>>> from itertools import chain
>>> set(chain(s,t))
set([1, 2, 3, 4, 5, 6])

Or, if you want a comprehension,

>>> {i for j in (s,t) for i in j}
set([1, 2, 3, 4, 5, 6])
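The chain approach extends naturally to any number of sets. A sketch, assuming the sets are collected in a list (the name sets and the third set are made up for illustration):

```python
from itertools import chain

sets = [{1, 2, 3, 4}, {3, 4, 5, 6}, {6, 7}]

# chain.from_iterable yields every element of every set in turn;
# set() then discards the duplicates.
union_all = set(chain.from_iterable(sets))

print(union_all == {1, 2, 3, 4, 5, 6, 7})  # True
```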

Answer 4

If by join you mean union, try this:

set(list(s) + list(t))

It's a bit of a hack, but I can't think of a better one-liner to do it.


Answer 5

Suppose you have two lists:

A = [1, 2, 3, 4]
B = [3, 4, 5, 6]

Then you can find the union of A and B as follows:

union = set(A).union(set(B))

If you also want the intersection and the non-intersection (the elements that are in only one of the two), you can do this:

intersection = set(A).intersection(set(B))
non_intersection = union - intersection

Answer 6

You can just unpack both sets into one like this:

>>> set_1 = {1, 2, 3, 4}
>>> set_2 = {3, 4, 5, 6}
>>> union = {*set_1, *set_2}
>>> union
{1, 2, 3, 4, 5, 6}

The * unpacks the set. Unpacking is where an iterable (e.g. a set or list) is represented as every item it yields. This means the above example simplifies to {1, 2, 3, 4, 3, 4, 5, 6} which then simplifies to {1, 2, 3, 4, 5, 6} because the set can only contain unique items.


Answer 7

You can use union, or a simple list comprehension (A must be a set):

[A.add(_) for _ in B]

Afterwards, A will also contain all the elements of B.


Python: change the script's working directory to the script's own directory

Question: Python: change the script's working directory to the script's own directory

I run a Python script from crontab every minute:

* * * * * /home/udi/foo/bar.py

/home/udi/foo has some necessary subdirectories, like /home/udi/foo/log and /home/udi/foo/config, which /home/udi/foo/bar.py refers to.

The problem is that crontab runs the script from a different working directory, so trying to open ./log/bar.log fails.

Is there a nice way to tell the script to change the working directory to the script’s own directory? I would fancy a solution that would work for any script location, rather than explicitly telling the script where it is.

EDIT:

os.chdir(os.path.dirname(sys.argv[0]))

Was the most compact elegant solution. Thanks for your answers and explanations!


Answer 0

This will change your current working directory so that opening relative paths will work:

import os
os.chdir("/home/udi/foo")

However, you asked how to change into whatever directory your Python script is located, even if you don’t know what directory that will be when you’re writing your script. To do this, you can use the os.path functions:

import os

abspath = os.path.abspath(__file__)
dname = os.path.dirname(abspath)
os.chdir(dname)

This takes the filename of your script, converts it to an absolute path, then extracts the directory of that path, then changes into that directory.
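An alternative sketch that avoids changing the process-wide working directory: build absolute paths from the script's location instead. The helper name path_next_to is hypothetical, and the example path is the one from the question; in a real script you would pass __file__ as the first argument.

```python
import os
from pathlib import Path

def path_next_to(script_file, relative):
    """Return `relative` resolved against the directory containing script_file."""
    # Same steps as above: absolute path of the script, then its directory.
    script_dir = Path(os.path.abspath(script_file)).parent
    return script_dir / relative

# In bar.py this would be: path_next_to(__file__, "log/bar.log")
log_path = path_next_to("/home/udi/foo/bar.py", "log/bar.log")
print(log_path)
```

This keeps relative paths elsewhere in the program (for example, ones given on the command line) meaning what the caller expects.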


Answer 1

You can get a shorter version by using sys.path[0].

os.chdir(sys.path[0])

From http://docs.python.org/library/sys.html#sys.path

As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter


Answer 2

Don't do this.

Your scripts and your data should not be mashed into one big directory. Put your code in some known location (site-packages or /var/opt/udi or something) separate from your data. Use good version control on your code to be sure that you have current and previous versions separated from each other so you can fall back to previous versions and test future versions.

Bottom line: Do not mingle code and data.

Data is precious. Code comes and goes.

Provide the working directory as a command-line argument value. You can provide a default as an environment variable. Don’t deduce it (or guess at it)

Make it a required argument value and do this.

import os
import sys

# The default comes from the environment; a command-line argument overrides it.
working = os.environ.get("WORKING_DIRECTORY", "/some/default")
if len(sys.argv) > 1:
    working = sys.argv[1]
os.chdir(working)

Do not “assume” a directory based on the location of your software. It will not work out well in the long run.


Answer 3

Change your crontab command to

* * * * * (cd /home/udi/foo/ || exit 1; ./bar.py)

The (...) starts a sub-shell that your crond executes as a single command. The || exit 1 causes your cronjob to fail in case that the directory is unavailable.

Though the other solutions may be more elegant in the long run for your specific scripts, my example could still be useful in cases where you can’t modify the program or command that you want to execute.


Encrypt and decrypt using PyCrypto AES 256

Question: Encrypt and decrypt using PyCrypto AES 256

I'm trying to build two functions using PyCrypto that accept two parameters: the message and the key, and then encrypt/decrypt the message.

I found several links on the web to help me out, but each one of them has flaws:

This one at codekoala uses os.urandom, which is discouraged by PyCrypto.

Moreover, the key I give to the function is not guaranteed to have the exact length expected. What can I do to make that happen?

Also, there are several modes; which one is recommended? I don't know what to use :/

Finally, what exactly is the IV? Can I provide a different IV for encrypting and decrypting, or will this give a different result?

Edit: Removed the code part since it was not secure.


Answer 0

Here is my implementation, which works for me with some fixes. It normalizes the key derived from the secret phrase to 32 bytes and uses a 16-byte IV:

import base64
import hashlib
from Crypto import Random
from Crypto.Cipher import AES

class AESCipher(object):

    def __init__(self, key): 
        self.bs = AES.block_size
        self.key = hashlib.sha256(key.encode()).digest()

    def encrypt(self, raw):
        raw = self._pad(raw)
        iv = Random.new().read(AES.block_size)
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return base64.b64encode(iv + cipher.encrypt(raw.encode()))

    def decrypt(self, enc):
        enc = base64.b64decode(enc)
        iv = enc[:AES.block_size]
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return self._unpad(cipher.decrypt(enc[AES.block_size:])).decode('utf-8')

    def _pad(self, s):
        return s + (self.bs - len(s) % self.bs) * chr(self.bs - len(s) % self.bs)

    @staticmethod
    def _unpad(s):
        return s[:-ord(s[len(s)-1:])]

Answer 1

You may need the following two functions: pad, to pad the input (when encrypting), and unpad, to remove the padding (when decrypting), for when the length of the input is not a multiple of BLOCK_SIZE.

BS = 16
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS) 
unpad = lambda s : s[:-ord(s[len(s)-1:])]
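The lambdas above pad text strings, which only lines up with the block size when every character encodes to a single byte. As a hedged sketch (Python 3 only, not part of the original answer), the same scheme can be applied to bytes, with a validity check on removal. The names BLOCK_SIZE, pad and unpad mirror the answer; the error handling is an assumption.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def pad(data: bytes) -> bytes:
    """Append n bytes of value n so the result is a multiple of BLOCK_SIZE."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n

def unpad(padded: bytes) -> bytes:
    """Remove the padding added by pad(); raise ValueError if it is malformed."""
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("invalid padding")
    n = padded[-1]
    if n < 1 or n > BLOCK_SIZE or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]

# Encode first, then pad, so the length is counted in bytes.
message = "héllo".encode("utf-8")
padded = pad(message)
print(len(message), len(padded))  # 6 16
```

Input that is already a multiple of the block size gains one full block of padding, which is what lets unpad work unambiguously.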

So you're asking about the length of the key? You can use the MD5 digest of the key rather than using it directly.

Also, in my limited experience with PyCrypto, the IV is used to vary the encrypted output when the input is the same. So choose the IV as a random string, include it as part of the encrypted output, and then use it to decrypt the message.

And here’s my implementation, hope it will be useful for you:

import base64
from Crypto.Cipher import AES
from Crypto import Random

class AESCipher:
    def __init__( self, key ):
        self.key = key

    def encrypt( self, raw ):
        raw = pad(raw)
        iv = Random.new().read( AES.block_size )
        cipher = AES.new( self.key, AES.MODE_CBC, iv )
        return base64.b64encode( iv + cipher.encrypt( raw ) ) 

    def decrypt( self, enc ):
        enc = base64.b64decode(enc)
        iv = enc[:16]
        cipher = AES.new(self.key, AES.MODE_CBC, iv )
        return unpad(cipher.decrypt( enc[16:] ))

Answer 2

Let me address your question about "modes." AES256 is a kind of block cipher. It takes as input a 32-byte key and a 16-byte string, called the block, and outputs a block. We use AES in a mode of operation in order to encrypt. The solutions above suggest using CBC, which is one example. Another is called CTR, and it's somewhat easier to use:

import binascii

from Crypto.Cipher import AES
from Crypto.Util import Counter
from Crypto import Random

# AES supports multiple key sizes: 16 (AES128), 24 (AES192), or 32 (AES256).
key_bytes = 32

# Takes as input a 32-byte key and an arbitrary-length plaintext and returns a
# pair (iv, ciphertext). "iv" stands for initialization vector.
def encrypt(key, plaintext):
    assert len(key) == key_bytes

    # Choose a random, 16-byte IV.
    iv = Random.new().read(AES.block_size)

    # Convert the IV to a Python integer.
    iv_int = int(binascii.hexlify(iv), 16) 

    # Create a new Counter object with IV = iv_int.
    ctr = Counter.new(AES.block_size * 8, initial_value=iv_int)

    # Create AES-CTR cipher.
    aes = AES.new(key, AES.MODE_CTR, counter=ctr)

    # Encrypt and return IV and ciphertext.
    ciphertext = aes.encrypt(plaintext)
    return (iv, ciphertext)

# Takes as input a 32-byte key, a 16-byte IV, and a ciphertext, and outputs the
# corresponding plaintext.
def decrypt(key, iv, ciphertext):
    assert len(key) == key_bytes

    # Initialize counter for decryption. iv should be the same as the output of
    # encrypt().
    iv_int = int(binascii.hexlify(iv), 16)
    ctr = Counter.new(AES.block_size * 8, initial_value=iv_int)

    # Create AES-CTR cipher.
    aes = AES.new(key, AES.MODE_CTR, counter=ctr)

    # Decrypt and return the plaintext.
    plaintext = aes.decrypt(ciphertext)
    return plaintext

key = Random.new().read(key_bytes)
(iv, ciphertext) = encrypt(key, b'hella')
print(decrypt(key, iv, ciphertext))

This is often referred to as AES-CTR. I would advise caution in using AES-CBC with PyCrypto. The reason is that it requires you to specify the padding scheme, as exemplified by the other solutions given. In general, if you’re not very careful about the padding, there are attacks that completely break encryption!

Now, it’s important to note that the key must be a random, 32-byte string; a password does not suffice. Normally, the key is generated like so:

# Nominal way to generate a fresh key. This calls the system's random number
# generator (RNG).
key1 = Random.new().read(key_bytes)

A key may be derived from a password, too:

# It's also possible to derive a key from a password, but it's important that
# the password have high entropy, meaning difficult to predict.
password = "This is a rather weak password."

# For added security, we add a "salt", which increases the entropy.
#
# In this example, we use the same RNG to produce the salt that we used to
# produce key1.
salt_bytes = 8 
salt = Random.new().read(salt_bytes)

# Stands for "Password-based key derivation function 2"
from Crypto.Protocol.KDF import PBKDF2
key2 = PBKDF2(password, salt, key_bytes)

Some solutions above suggest using SHA256 for deriving the key, but this is generally considered bad cryptographic practice. Check out wikipedia for more on modes of operation.
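For readers without PyCrypto at hand, the password-based derivation can also be sketched with the standard library's hashlib.pbkdf2_hmac (Python 3.4+). This is not the answer's code: the iteration count and the 16-byte salt are assumed values chosen for illustration, and the salt must be stored alongside the ciphertext so the same key can be derived again.

```python
import hashlib
import secrets

key_bytes = 32  # AES-256 key length
password = b"This is a rather weak password."

# A fresh random salt per password; it is not secret.
salt = secrets.token_bytes(16)

# Assumed work factor; higher values slow down brute-force guessing.
iterations = 100_000

key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=key_bytes)
print(len(key))  # 32
```

Unlike a single SHA-256 call, the repeated iterations make each password guess expensive for an attacker.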


Answer 3

For anyone who would like to use urlsafe_b64encode and urlsafe_b64decode, here is the version that works for me (after spending some time on the unicode issue):

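import base64
import hashlib
from Crypto import Random
from Crypto.Cipher import AES
from django.conf import settings  # assumption: SECRET_KEY comes from Django settings

BS = 16
key = hashlib.md5(settings.SECRET_KEY).hexdigest()[:BS]
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)
unpad = lambda s : s[:-ord(s[len(s)-1:])]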

class AESCipher:
    def __init__(self, key):
        self.key = key

    def encrypt(self, raw):
        raw = pad(raw)
        iv = Random.new().read(AES.block_size)
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return base64.urlsafe_b64encode(iv + cipher.encrypt(raw)) 

    def decrypt(self, enc):
        enc = base64.urlsafe_b64decode(enc.encode('utf-8'))
        iv = enc[:BS]
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return unpad(cipher.decrypt(enc[BS:]))

Answer 4

You can get a passphrase out of an arbitrary password by using a cryptographic hash function (NOT Python's builtin hash) like SHA-1 or SHA-256. Python includes support for both in its standard library:

import hashlib

# Byte-string literals are required in Python 3 and also work in Python 2.
hashlib.sha1(b"this is my awesome password").digest() # => a 20 byte string
hashlib.sha256(b"another awesome password").digest() # => a 32 byte string

You can truncate a cryptographic hash value just by using [:16] or [:24] and it will retain its security up to the length you specify.
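A short sketch of that truncation, using the SHA-256 example above to obtain keys of each AES length. The variable names are illustrative, and, as another answer in this thread points out, a single hash of a low-entropy password is generally considered weak for key derivation.

```python
import hashlib

digest = hashlib.sha256(b"another awesome password").digest()

# Slice the 32-byte digest down to the key size the cipher expects.
aes128_key = digest[:16]
aes192_key = digest[:24]
aes256_key = digest  # already 32 bytes

print(len(aes128_key), len(aes192_key), len(aes256_key))  # 16 24 32
```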


Answer 5

Grateful for the other answers, which inspired this one but didn't work for me.

After spending hours trying to figure out how it works, I came up with the implementation below, using the newest PyCryptodomex library (it is another story how I managed to set it up behind a proxy, on Windows, in a virtualenv.. phew)

When working on your implementation, remember to write down the padding, encoding and encrypting steps (and their reverse). You have to pack and unpack keeping the order in mind.

import base64
import hashlib
from Cryptodome.Cipher import AES
from Cryptodome.Random import get_random_bytes

__key__ = hashlib.sha256(b'16-character key').digest()

def encrypt(raw):
    BS = AES.block_size
    pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)

    raw = base64.b64encode(pad(raw).encode('utf8'))
    iv = get_random_bytes(AES.block_size)
    cipher = AES.new(key= __key__, mode= AES.MODE_CFB,iv= iv)
    return base64.b64encode(iv + cipher.encrypt(raw))

def decrypt(enc):
    unpad = lambda s: s[:-ord(s[-1:])]

    enc = base64.b64decode(enc)
    iv = enc[:AES.block_size]
    cipher = AES.new(__key__, AES.MODE_CFB, iv)
    return unpad(base64.b64decode(cipher.decrypt(enc[AES.block_size:])).decode('utf8'))

Answer 6

For the benefit of others, here is my decryption implementation, which I arrived at by combining the answers of @Cyril and @Marcus. It assumes the message comes in via an HTTP request with the encryptedText quoted and base64-encoded.

import base64
import urllib2
from Crypto.Cipher import AES


def decrypt(quotedEncodedEncrypted):
    key = 'SecretKey'

    encodedEncrypted = urllib2.unquote(quotedEncodedEncrypted)

    cipher = AES.new(key)
    decrypted = cipher.decrypt(base64.b64decode(encodedEncrypted))[:16]

    for i in range(1, len(base64.b64decode(encodedEncrypted))/16):
        cipher = AES.new(key, AES.MODE_CBC, base64.b64decode(encodedEncrypted)[(i-1)*16:i*16])
        decrypted += cipher.decrypt(base64.b64decode(encodedEncrypted)[i*16:])[:16]

    return decrypted.strip()

Answer 7

Another take on this (heavily derived from the solutions above), but it:

  • uses null for padding
  • does not use lambda (never been a fan)
  • tested with python 2.7 and 3.6.5

    #!/usr/bin/python2.7
    # you'll have to adjust for your setup, e.g., #!/usr/bin/python3
    
    
    import base64, re
    from Crypto.Cipher import AES
    from Crypto import Random
    from django.conf import settings
    
    class AESCipher:
        """
          Usage:
          aes = AESCipher( settings.SECRET_KEY[:16], 32)
          encryp_msg = aes.encrypt( 'ppppppppppppppppppppppppppppppppppppppppppppppppppppppp' )
          msg = aes.decrypt( encryp_msg )
          print("'{}'".format(msg))
        """
        def __init__(self, key, blk_sz):
            self.key = key
            self.blk_sz = blk_sz
    
        def encrypt( self, raw ):
            if raw is None or len(raw) == 0:
                raise NameError("No value given to encrypt")
            raw = raw + '\0' * (self.blk_sz - len(raw) % self.blk_sz)
            raw = raw.encode('utf-8')
            iv = Random.new().read( AES.block_size )
            cipher = AES.new( self.key.encode('utf-8'), AES.MODE_CBC, iv )
            return base64.b64encode( iv + cipher.encrypt( raw ) ).decode('utf-8')
    
        def decrypt( self, enc ):
            if enc is None or len(enc) == 0:
                raise NameError("No value given to decrypt")
            enc = base64.b64decode(enc)
            iv = enc[:16]
            cipher = AES.new(self.key.encode('utf-8'), AES.MODE_CBC, iv )
            return re.sub(b'\x00*$', b'', cipher.decrypt( enc[16:])).decode('utf-8')
    

回答 8

I have used both the Crypto and PyCryptodomex libraries, and it is blazing fast…

import base64
import hashlib
from Cryptodome.Cipher import AES as domeAES
from Cryptodome.Random import get_random_bytes
from Crypto import Random
from Crypto.Cipher import AES as cryptoAES

# AES is imported only under the aliases cryptoAES and domeAES, so use one of them here
BLOCK_SIZE = cryptoAES.block_size

key = "my_secret_key".encode()
__key__ = hashlib.sha256(key).digest()
print(__key__)

def encrypt(raw):
    BS = cryptoAES.block_size
    pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)
    raw = base64.b64encode(pad(raw).encode('utf8'))
    iv = get_random_bytes(cryptoAES.block_size)
    cipher = cryptoAES.new(key= __key__, mode= cryptoAES.MODE_CFB,iv= iv)
    a= base64.b64encode(iv + cipher.encrypt(raw))
    IV = Random.new().read(BLOCK_SIZE)
    aes = domeAES.new(__key__, domeAES.MODE_CFB, IV)
    b = base64.b64encode(IV + aes.encrypt(a))
    return b

def decrypt(enc):
    passphrase = __key__
    encrypted = base64.b64decode(enc)
    IV = encrypted[:BLOCK_SIZE]
    aes = domeAES.new(passphrase, domeAES.MODE_CFB, IV)
    enc = aes.decrypt(encrypted[BLOCK_SIZE:])
    unpad = lambda s: s[:-ord(s[-1:])]
    enc = base64.b64decode(enc)
    iv = enc[:cryptoAES.block_size]
    cipher = cryptoAES.new(__key__, cryptoAES.MODE_CFB, iv)
    b=  unpad(base64.b64decode(cipher.decrypt(enc[cryptoAES.block_size:])).decode('utf8'))
    return b

encrypted_data =encrypt("Hi Steven!!!!!")
print(encrypted_data)
print("=======")
decrypted_data = decrypt(encrypted_data)
print(decrypted_data)

Answer 9

It’s a little late, but I think this will be helpful. No one has mentioned using a scheme like PKCS#7 padding. You can use it instead of the previous functions to pad (when encrypting) and unpad (when decrypting). The full source code is below.

import base64
import hashlib
from Crypto import Random
from Crypto.Cipher import AES
import pkcs7
class Encryption:

    def __init__(self):
        pass

    def Encrypt(self, PlainText, SecurePassword):
        pw_encode = SecurePassword.encode('utf-8')
        text_encode = PlainText.encode('utf-8')

        key = hashlib.sha256(pw_encode).digest()
        iv = Random.new().read(AES.block_size)

        cipher = AES.new(key, AES.MODE_CBC, iv)
        pad_text = pkcs7.encode(text_encode)
        msg = iv + cipher.encrypt(pad_text)

        EncodeMsg = base64.b64encode(msg)
        return EncodeMsg

    def Decrypt(self, Encrypted, SecurePassword):
        decodbase64 = base64.b64decode(Encrypted.decode("utf-8"))
        pw_encode = SecurePassword.encode('utf-8')

        iv = decodbase64[:AES.block_size]
        key = hashlib.sha256(pw_encode).digest()

        cipher = AES.new(key, AES.MODE_CBC, iv)
        msg = cipher.decrypt(decodbase64[AES.block_size:])
        pad_text = pkcs7.decode(msg)

        decryptedString = pad_text.decode('utf-8')
        return decryptedString

# pkcs7.py: the helper module imported above (Python 2 code: StringIO, xrange)
import StringIO
import binascii


def decode(text, k=16):
    nl = len(text)
    val = int(binascii.hexlify(text[-1]), 16)
    if val > k:
        raise ValueError('Input is not padded or padding is corrupt')

    l = nl - val
    return text[:l]


def encode(text, k=16):
    l = len(text)
    output = StringIO.StringIO()
    val = k - (l % k)
    for _ in xrange(val):
        output.write('%02x' % val)
    return text + binascii.unhexlify(output.getvalue())
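
The pkcs7 helpers above rely on Python 2 names (StringIO, xrange). As a minimal sketch of the same padding rule for Python 3, assuming a 16-byte block and using the hypothetical names pkcs7_pad and pkcs7_unpad (they are not part of the answer's module):

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes, each of value N, so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len

def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Remove PKCS#7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("Input length is not a positive multiple of the block size")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("Input is not padded or padding is corrupt")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("Input is not padded or padding is corrupt")
    return padded[:-pad_len]

padded = pkcs7_pad(b"abc")
print(padded)
print(pkcs7_unpad(padded))
```

Input that is already a multiple of the block size still gets one full block of padding, which is what lets the unpad step always read the last byte as the pad length. PyCryptodome also ships pad and unpad in Crypto.Util.Padding, which may be preferable to hand-written helpers.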


Answer 10

https://stackoverflow.com/a/21928790/11402877

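A UTF-8-compatible version of _pad from the answer linked above: encode first, so the padding is computed on bytes rather than on characters.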

def _pad(self, s):
    s = s.encode()
    res = s + (self.bs - len(s) % self.bs) * chr(self.bs - len(s) % self.bs).encode()
    return res
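
To see why the order matters, here is a small standalone sketch. The function name pad and the sample string are assumptions for illustration; the padding rule is the same one used in _pad above.

```python
def pad(text, block_size=16):
    """Encode to UTF-8 first, then pad so the byte length is a multiple of block_size."""
    data = text.encode("utf-8")
    pad_len = block_size - len(data) % block_size
    return data + bytes([pad_len]) * pad_len

text = "h\u00e9llo w\u00f6rld"  # 11 characters, but 13 bytes in UTF-8
char_count = len(text)
byte_count = len(text.encode("utf-8"))
padded = pad(text)

print(char_count, byte_count, len(padded))
```

Padding computed from the character count would add 5 bytes here and produce 18 bytes, which is not a multiple of the AES block size.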

Answer 11

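import base64
import hashlib

from Crypto import Random
from Crypto.Cipher import AES

BLOCK_SIZE = 16

def trans(key):
    # Derive a 16-byte AES key from the passphrase (key must be bytes on Python 3)
    return hashlib.md5(key).digest()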

def encrypt(message, passphrase):
    passphrase = trans(passphrase)
    IV = Random.new().read(BLOCK_SIZE)
    aes = AES.new(passphrase, AES.MODE_CFB, IV)
    return base64.b64encode(IV + aes.encrypt(message))

def decrypt(encrypted, passphrase):
    passphrase = trans(passphrase)
    encrypted = base64.b64decode(encrypted)
    IV = encrypted[:BLOCK_SIZE]
    aes = AES.new(passphrase, AES.MODE_CFB, IV)
    return aes.decrypt(encrypted[BLOCK_SIZE:])

pandas loc vs. iloc vs. ix vs. at vs. iat?

Question: pandas loc vs. iloc vs. ix vs. at vs. iat?

I recently began branching out from my safe place (R) into Python and am a bit confused by cell localization/selection in pandas. I’ve read the documentation, but I’m still struggling to understand the practical implications of the various localization/selection options.

  • Is there a reason why I should ever use .loc or .iloc over the most general option, .ix?
  • I understand that .loc, .iloc, .at, and .iat may provide some guaranteed correctness that .ix can’t offer, but I’ve also read that .ix tends to be the fastest solution across the board.
  • Please explain the real-world, best-practices reasoning behind using anything other than .ix.

Answer 0

loc: works on index labels only
iloc: works on integer position
ix: can get data from the DataFrame without it being in the index
at: gets scalar values. It’s a very fast loc
iat: gets scalar values. It’s a very fast iloc

http://pyciencia.blogspot.com/2015/05/obtener-y-filtrar-datos-de-un-dataframe.html

Note: As of pandas 0.20.0, the .ix indexer is deprecated in favour of the stricter .iloc and .loc indexers.
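
A minimal sketch of the four accessors side by side. The DataFrame and its integer labels are made up for illustration; the labels deliberately differ from the positions so that label-based and position-based lookups cannot be confused.

```python
import pandas as pd

# Integer labels that deliberately differ from the integer positions.
df = pd.DataFrame({"A": ["a", "b", "c"], "B": [54, 67, 89]}, index=[10, 20, 30])

by_label = df.loc[10]           # row whose index label is 10
by_position = df.iloc[0]        # first row, whatever its label
scalar_label = df.at[20, "B"]   # single value by row and column label
scalar_position = df.iat[1, 1]  # single value by row and column position

print(by_label.equals(by_position))
print(scalar_label, scalar_position)

# Label 0 does not exist in this index, so .loc raises KeyError.
try:
    df.loc[0]
    found_label_zero = True
except KeyError:
    found_label_zero = False
print(found_label_zero)
```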


Answer 1

Updated for pandas 0.20, given that ix is deprecated. This demonstrates not only how to use loc, iloc, at, iat, and set_value, but also how to accomplish mixed positional/label-based indexing.


loc: label based
Allows you to pass 1-D arrays as indexers. Arrays can be either slices (subsets) of the index or column, or they can be boolean arrays which are equal in length to the index or columns.

Special Note: when a scalar indexer is passed, loc can assign a new index or column value that didn’t exist before.

# label based, but we can use position values
# to get the labels from the index object
df.loc[df.index[2], 'ColName'] = 3

df.loc[df.index[1:3], 'ColName'] = 3
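
The special note about assigning new labels can be sketched as follows. The small DataFrame and the column name Other are assumptions for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"ColName": [1, 2, 3]}, index=["A", "B", "C"])

# Scalar row label that does not exist yet: .loc adds a new row.
df.loc["D", "ColName"] = 4

# Column label that does not exist yet: .loc adds a new column
# (rows that were not assigned get NaN).
df.loc["A", "Other"] = 10

print(df)
```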

iloc: position based
Similar to loc, except with positions rather than index values. However, you cannot assign new columns or indices.

# position based, but we can get the position
# from the columns object via the `get_loc` method
df.iloc[2, df.columns.get_loc('ColName')] = 3

df.iloc[2, 4] = 3

df.iloc[:3, 2:4] = 3
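
A short sketch of the restriction on new rows, using a made-up three-row DataFrame: assigning to a position past the end raises IndexError instead of enlarging the frame.

```python
import pandas as pd

df = pd.DataFrame({"ColName": [1, 2, 3]})

try:
    df.iloc[3, 0] = 4  # position 3 is past the last row (valid positions are 0..2)
    enlarged = True
except IndexError as exc:
    enlarged = False
    print("IndexError:", exc)

print(len(df))
```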

at: label based
Works very similarly to loc for scalar indexers. Cannot operate on array indexers. Can assign new indices and columns.

The advantage over loc is that it is faster.
The disadvantage is that you can’t use arrays for indexers.

# label based, but we can use position values
# to get the labels from the index object
df.at[df.index[2], 'ColName'] = 3

df.at['C', 'ColName'] = 3

iat: position based
Works similarly to iloc. Cannot operate on array indexers. Cannot assign new indices and columns.

The advantage over iloc is that it is faster.
The disadvantage is that you can’t use arrays for indexers.

# position based, but we can get the position
# from the columns object via the `get_loc` method
df.iat[2, df.columns.get_loc('ColName')] = 3

set_value: label based
Works very similarly to loc for scalar indexers. Cannot operate on array indexers. Can assign new indices and columns.

Advantage: super fast, because there is very little overhead.
Disadvantage: there is very little overhead because pandas is not doing a bunch of safety checks. Use at your own risk. Also, this is not intended for public use.

# label based, but we can use position values
# to get the labels from the index object
df.set_value(df.index[2], 'ColName', 3)

set_value with takable=True: position based
Works similarly to iloc. Cannot operate on array indexers. Cannot assign new indices and columns.

Advantage: super fast, because there is very little overhead.
Disadvantage: there is very little overhead because pandas is not doing a bunch of safety checks. Use at your own risk. Also, this is not intended for public use.

# position based, but we can get the position
# from the columns object via the `get_loc` method
df.set_value(2, df.columns.get_loc('ColName'), 3, takable=True)
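
set_value was deprecated in pandas 0.21 and removed in pandas 1.0, so the two set_value calls above only run on older versions. A sketch of the same scalar assignments with .at and .iat, using a made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"ColName": [0, 0, 0], "Other": [0, 0, 0]}, index=["A", "B", "C"])

# Label-based scalar assignment, in place of df.set_value(label, col, value)
df.at[df.index[2], "ColName"] = 3

# Position-based scalar assignment, in place of set_value(..., takable=True)
df.iat[0, df.columns.get_loc("Other")] = 5

print(df)
```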

Answer 2

There are two primary ways that pandas makes selections from a DataFrame.

  • By Label
  • By Integer Location

The documentation uses the term position for referring to integer location. I do not like this terminology as I feel it is confusing. Integer location is more descriptive and is exactly what .iloc stands for. The key word here is INTEGER – you must use integers when selecting by integer location.

Before showing the summary let’s all make sure that …

.ix is deprecated and ambiguous and should never be used

There are three primary indexers for pandas. We have the indexing operator itself (the brackets []), .loc, and .iloc. Let’s summarize them:

  • [] – Primarily selects subsets of columns, but can select rows as well. Cannot simultaneously select rows and columns.
  • .loc – selects subsets of rows and columns by label only
  • .iloc – selects subsets of rows and columns by integer location only

I almost never use .at or .iat, as they add no additional functionality and offer only a small performance increase. I would discourage their use unless you have a very time-sensitive application. Regardless, here is their summary:

  • .at selects a single scalar value in the DataFrame by label only
  • .iat selects a single scalar value in the DataFrame by integer location only

In addition to selection by label and integer location, boolean selection (also known as boolean indexing) exists.


Examples explaining .loc, .iloc, boolean selection and .at and .iat are shown below

We will first focus on the differences between .loc and .iloc. Before we talk about the differences, it is important to understand that DataFrames have labels that help identify each column and each row. Let’s take a look at a sample DataFrame:

import pandas as pd

df = pd.DataFrame({'age':[30, 2, 12, 4, 32, 33, 69],
                   'color':['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
                   'food':['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
                   'height':[165, 70, 120, 80, 180, 172, 150],
                   'score':[4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
                   'state':['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']
                   },
                  index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

All the words in bold are the labels. The labels, age, color, food, height, score and state are used for the columns. The other labels, Jane, Nick, Aaron, Penelope, Dean, Christina, Cornelia are used as labels for the rows. Collectively, these row labels are known as the index.


The primary ways to select particular rows in a DataFrame are with the .loc and .iloc indexers. Each of these indexers can also be used to simultaneously select columns but it is easier to just focus on rows for now. Also, each of the indexers use a set of brackets that immediately follow their name to make their selections.

.loc selects data only by labels

We will first talk about the .loc indexer, which only selects data by the index or column labels. In our sample DataFrame, we have provided meaningful names as values for the index. Many DataFrames will not have any meaningful names and will instead default to just the integers from 0 to n-1, where n is the length (number of rows) of the DataFrame.

There are many different inputs you can use for .loc; three of them are:

  • A string
  • A list of strings
  • Slice notation using strings as the start and stop values

Selecting a single row with .loc with a string

To select a single row of data, place the index label inside of the brackets following .loc.

df.loc['Penelope']

This returns the row of data as a Series

age           4
color     white
food      Apple
height       80
score       3.3
state        AL
Name: Penelope, dtype: object

Selecting multiple rows with .loc with a list of strings

df.loc[['Cornelia', 'Jane', 'Dean']]

This returns a DataFrame with the rows in the order specified in the list:

Selecting multiple rows with .loc with slice notation

Slice notation is defined by start, stop, and step values. When slicing by label, pandas includes the stop value in the result. The following slices from Aaron to Dean, inclusive. Its step size is not explicitly defined, so it defaults to 1.

df.loc['Aaron':'Dean']

Complex slices can be taken in the same manner as Python lists.
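
A small sketch of such label slices. It rebuilds only the index and the age column of the sample DataFrame so that it runs on its own.

```python
import pandas as pd

names = ["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"]
df = pd.DataFrame({"age": [30, 2, 12, 4, 32, 33, 69]}, index=names)

inclusive = df.loc["Aaron":"Dean"]            # the stop label is included
every_other = df.loc["Aaron":"Christina":2]   # every second row between the labels
open_ended = df.loc["Dean":]                  # from Dean to the last row

print(list(inclusive.index))
print(list(every_other.index))
print(list(open_ended.index))
```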

.iloc selects data only by integer location

Let’s now turn to .iloc. Every row and column of data in a DataFrame has an integer location that defines it. This is in addition to the label that is visually displayed in the output. The integer location is simply the number of rows/columns from the top/left beginning at 0.

There are many different inputs you can use for .iloc; three of them are:

  • An integer
  • A list of integers
  • Slice notation using integers as the start and stop values

Selecting a single row with .iloc with an integer

df.iloc[4]

This returns the 5th row (integer location 4) as a Series

age           32
color       gray
food      Cheese
height       180
score        1.8
state         AK
Name: Dean, dtype: object

Selecting multiple rows with .iloc with a list of integers

df.iloc[[2, -2]]

This returns a DataFrame of the third and second-to-last rows:

Selecting multiple rows with .iloc with slice notation

df.iloc[:5:3]
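
To make the integer-location selections concrete, here is a sketch that prints which rows each one picks. It rebuilds only the index and the age column of the sample DataFrame so that it runs on its own.

```python
import pandas as pd

names = ["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"]
df = pd.DataFrame({"age": [30, 2, 12, 4, 32, 33, 69]}, index=names)

stepped = df.iloc[:5:3]      # positions 0 and 3
picked = df.iloc[[2, -2]]    # third row and second-to-last row
half_open = df.iloc[1:3]     # positions 1 and 2; the stop position is excluded

print(list(stepped.index))
print(list(picked.index))
print(list(half_open.index))
```

Unlike label slices with .loc, integer slices with .iloc exclude the stop position, just like Python lists.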


Simultaneous selection of rows and columns with .loc and .iloc

One excellent feature of both .loc and .iloc is their ability to select rows and columns simultaneously. In the examples above, all the columns were returned from each selection. We can choose columns with the same types of inputs as we do for rows. We simply need to separate the row and column selection with a comma.

For example, we can select rows Jane and Dean with just the columns height, score, and state like this:

df.loc[['Jane', 'Dean'], 'height':]

This uses a list of labels for the rows and slice notation for the columns

We can naturally do similar operations with .iloc using only integers.

df.iloc[[1,4], 2]
Nick      Lamb
Dean    Cheese
Name: food, dtype: object

Simultaneous selection with labels and integer location

.ix was used to make selections simultaneously with labels and integer location, which was useful but at times confusing and ambiguous; thankfully, it has been deprecated. If you need to make a selection with a mix of labels and integer locations, you will have to convert both selections to labels or both to integer locations.

For instance, if we want to select rows Nick and Cornelia along with columns 2 and 4, we could use .loc by converting the integers to labels with the following:

col_names = df.columns[[2, 4]]
df.loc[['Nick', 'Cornelia'], col_names] 

Or alternatively, convert the index labels to integers with the get_loc index method.

labels = ['Nick', 'Cornelia']
index_ints = [df.index.get_loc(label) for label in labels]
df.iloc[index_ints, [2, 4]]

Boolean Selection

The .loc indexer can also do boolean selection. For instance, if we are interested in finding all the rows where age is above 30 and return just the food and score columns we can do the following:

df.loc[df['age'] > 30, ['food', 'score']] 

You can replicate this with .iloc, but you cannot pass it a boolean Series. You must convert the boolean Series into a NumPy array like this:

df.iloc[(df['age'] > 30).values, [2, 4]] 

Selecting all rows

It is possible to use .loc/.iloc for just column selection. You can select all the rows by using a colon like this:

df.loc[:, 'color':'score':2]


The indexing operator, [], can select rows and columns too, but not simultaneously.

Most people are familiar with the primary purpose of the DataFrame indexing operator, which is to select columns. A string selects a single column as a Series and a list of strings selects multiple columns as a DataFrame.

df['food']

Jane          Steak
Nick           Lamb
Aaron         Mango
Penelope      Apple
Dean         Cheese
Christina     Melon
Cornelia      Beans
Name: food, dtype: object

Using a list selects multiple columns

df[['food', 'score']]

What people are less familiar with is that, when slice notation is used, selection happens by row labels or by integer location. This is very confusing and something that I almost never use, but it does work.

df['Penelope':'Christina'] # slice rows by label

df[2:6:2] # slice rows by integer location

The explicitness of .loc/.iloc for selecting rows is highly preferred. The indexing operator alone is unable to select rows and columns simultaneously.

df[3:5, 'color']
TypeError: unhashable type: 'slice'

Selection by .at and .iat

Selection with .at is nearly identical to .loc but it only selects a single ‘cell’ in your DataFrame. We usually refer to this cell as a scalar value. To use .at, pass it both a row and column label separated by a comma.

df.at['Christina', 'color']
'black'

Selection with .iat is nearly identical to .iloc, but it only selects a single scalar value. You must pass it an integer for both the row and column locations.

df.iat[2, 5]
'FL'

Answer 3

df = pd.DataFrame({'A':['a', 'b', 'c'], 'B':[54, 67, 89]}, index=[100, 200, 300])

df

                        A   B
                100     a   54
                200     b   67
                300     c   89
In [19]:    
df.loc[100]

Out[19]:
A     a
B    54
Name: 100, dtype: object

In [20]:    
df.iloc[0]

Out[20]:
A     a
B    54
Name: 100, dtype: object

In [24]:    
df2 = df.set_index([df.index,'A'])
df2

Out[24]:
        B
    A   
100 a   54
200 b   67
300 c   89

In [25]:    
df2.ix[100, 'a']

Out[25]:    
B    54
Name: (100, a), dtype: int64
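
Since .ix was removed in pandas 1.0, the last lookup no longer runs on current versions. A sketch of the same MultiIndex selection with .loc, passing the full key as a tuple:

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c"], "B": [54, 67, 89]}, index=[100, 200, 300])
df2 = df.set_index([df.index, "A"])

row = df2.loc[(100, "a")]          # full key as a tuple: returns the row as a Series
value = df2.loc[(100, "a"), "B"]   # same row, one column: returns a scalar

print(row)
print(value)
```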

Answer 4

Let’s start with this small df:

import pandas as pd
import time as tm
import numpy as np
n=10
a=np.arange(0,n**2)
df=pd.DataFrame(a.reshape(n,n))

So we’ll have:

df
Out[25]: 
        0   1   2   3   4   5   6   7   8   9
    0   0   1   2   3   4   5   6   7   8   9
    1  10  11  12  13  14  15  16  17  18  19
    2  20  21  22  23  24  25  26  27  28  29
    3  30  31  32  33  34  35  36  37  38  39
    4  40  41  42  43  44  45  46  47  48  49
    5  50  51  52  53  54  55  56  57  58  59
    6  60  61  62  63  64  65  66  67  68  69
    7  70  71  72  73  74  75  76  77  78  79
    8  80  81  82  83  84  85  86  87  88  89
    9  90  91  92  93  94  95  96  97  98  99

With this we have:

df.iloc[3,3]
Out[33]: 33

df.iat[3,3]
Out[34]: 33

df.iloc[:3,:3]
Out[35]: 
    0   1   2
0   0   1   2
1  10  11  12
2  20  21  22



df.iat[:3,:3]
Traceback (most recent call last):
   ... omissis ...
ValueError: At based indexing on an integer index can only have integer indexers

Thus we cannot use .iat for subsets; for those we must use .iloc.

But let’s try both to select from a larger df and check the speed…

# -*- coding: utf-8 -*-
"""
Created on Wed Feb  7 09:58:39 2018

@author: Fabio Pomi
"""

import pandas as pd
import time as tm
import numpy as np
n=1000
a=np.arange(0,n**2)
df=pd.DataFrame(a.reshape(n,n))
t1=tm.time()
for j in df.index:
    for i in df.columns:
        a=df.iloc[j,i]
t2=tm.time()
for j in df.index:
    for i in df.columns:
        a=df.iat[j,i]
t3=tm.time()
iloc_time=t2-t1
iat_time=t3-t2
prc = iloc_time/iat_time *100
print('\niloc:%f iat:%f prc:%f' %(iloc_time,iat_time,prc))

iloc:10.485600 iat:7.395423 prc:141.784987

So with .iloc we can manage subsets and with .iat only a single scalar, but .iat is faster than .iloc

:-)
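
The same comparison can be sketched with the standard timeit module, which avoids the manual time bookkeeping. The frame size, the repeat count, and the helper name read_all are arbitrary choices for illustration, and the timings will differ between machines and pandas versions.

```python
import timeit

import numpy as np
import pandas as pd

n = 100
df = pd.DataFrame(np.arange(n ** 2).reshape(n, n))

def read_all(accessor):
    """Read every cell through the given positional accessor and return the total."""
    total = 0
    for row in range(n):
        for col in range(n):
            total += accessor[row, col]
    return total

# Each call reads all n * n cells; number=3 repeats that three times.
iloc_seconds = timeit.timeit(lambda: read_all(df.iloc), number=3)
iat_seconds = timeit.timeit(lambda: read_all(df.iat), number=3)

print("iloc: %.3fs  iat: %.3fs" % (iloc_seconds, iat_seconds))
```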


How to execute a curl command using Python

Question: How to execute a curl command using Python

I want to execute a curl command in python.

Usually, I just need to enter the command in a terminal and press the return key. However, I don’t know how this works in Python.

The command is shown below:

curl -d @request.json --header "Content-Type: application/json" https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere

There is a request.json file to be sent to get a response.

I searched a lot and got confused. I tried to write a piece of code, although I could not fully understand it. It didn’t work:

import pycurl
import StringIO

response = StringIO.StringIO()
c = pycurl.Curl()
c.setopt(c.URL, 'https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere')
c.setopt(c.WRITEFUNCTION, response.write)
c.setopt(c.HTTPHEADER, ['Content-Type: application/json','Accept-Charset: UTF-8'])
c.setopt(c.POSTFIELDS, '@request.json')
c.perform()
c.close()
print response.getvalue()
response.close()

The error message is ‘Parse Error’. Can anyone tell me how to fix it, or how to get the response from the server correctly?


Answer 0

For the sake of simplicity, maybe you should consider using the Requests library.

An example with json response content would be something like:

import requests
r = requests.get('https://github.com/timeline.json')
r.json()

If you are looking for further information, the Quickstart section has lots of working examples.

EDIT:

For your specific curl translation:

import requests
url = 'https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere'
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}
# Send the file contents as the request body, like curl's -d @request.json
with open("request.json", "rb") as payload:
    r = requests.post(url, data=payload, headers=headers)
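If installing Requests is not an option, the same POST can probably be sketched with the standard library alone. This is only a minimal sketch and not the answer's method: build_request and post_json_file are hypothetical helper names, and the sketch assumes the server accepts the raw contents of request.json as the body.

```python
import urllib.request


def build_request(url, body):
    """Build a POST request carrying a JSON body (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=body,  # must be bytes
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json_file(url, path):
    """Send the contents of the file at `path` and return the response text."""
    with open(path, "rb") as f:
        req = build_request(url, f.read())
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Calling post_json_file(url, "request.json") with the URL from the question would then play the role of the curl command.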

Answer 1

Just use this website. It’ll convert any curl command into Python, Node.js, PHP, R, or Go.

Example:

curl -X POST -H 'Content-type: application/json' --data '{"text":"Hello, World!"}' https://hooks.slack.com/services/asdfasdfasdf

becomes this in Python:

import requests

headers = {
    'Content-type': 'application/json',
}

data = '{"text":"Hello, World!"}'

response = requests.post('https://hooks.slack.com/services/asdfasdfasdf', headers=headers, data=data)

Answer 2

import requests
url = "https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere"
data = requests.get(url).json()  # .json() is a method, so it has to be called

Maybe?

If you are trying to send a file:

files = {'request_file': open('request.json', 'rb')}
r = requests.post(url, files=files)
print(r.text)
print(r.json())

Ah, thanks @LukasGraf, now I better understand what his original code is doing:

import json
import requests
url = "https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere"
with open("request.json") as f:
    my_json_data = json.load(f)
# json= serializes the dict and sets the Content-Type: application/json header
req = requests.post(url, json=my_json_data)
print(req.text)
print(req.json())  # maybe?

Answer 3

curl -d @request.json --header "Content-Type: application/json" https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere

Its Python implementation would look like this:

import requests

headers = {
    'Content-Type': 'application/json',
}

params = (
    ('key', 'mykeyhere'),
)

data = open('request.json')
response = requests.post('https://www.googleapis.com/qpxExpress/v1/trips/search', headers=headers, params=params, data=data)

#NB. Original query string below. It seems impossible to parse and
#reproduce query strings 100% accurately so the one below is given
#in case the reproduced version is not "correct".
# response = requests.post('https://www.googleapis.com/qpxExpress/v1/trips/search?key=mykeyhere', headers=headers, data=data)

Check this link; it helps convert a curl command to Python, PHP and Node.js.


Answer 4

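My answer applies to Python 2.6.2 (the commands module used here was removed in Python 3, where subprocess.getstatusoutput is the equivalent).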

import commands

status, output = commands.getstatusoutput("curl -H \"Content-Type:application/json\" -k -u (few other parameters required) -X GET https://example.org -s")

print output

I apologize for not providing the required parameters, because they are confidential.


Answer 5

Some background: I went looking for exactly this question because I had to do something to retrieve content, but all I had available was an old version of python with inadequate SSL support. If you’re on an older MacBook, you know what I’m talking about. In any case, curl runs fine from a shell (I suspect it has modern SSL support linked in) so sometimes you want to do this without using requests or urllib2.

You can use the subprocess module to execute curl and get at the retrieved content:

import subprocess

# 'response' contains the retrieved content as a byte string.
# Use '-s' to keep curl quiet while it does its job, but
# it's useful to omit that while you're still writing code
# so you know if curl is working.
response = subprocess.check_output(['curl', '-s', baseURL % page_num])

Python 3’s subprocess module also contains .run() with a number of useful options. I’ll leave it to someone who is actually running python 3 to provide that answer.
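For readers on Python 3, a sketch of the .run() variant might look like the following. It is only an illustration under stated assumptions: build_curl_command and run_curl are hypothetical helper names, and the flags simply mirror the curl command from the question.

```python
import subprocess


def build_curl_command(url, json_path):
    """Return the argument list for the curl call from the question."""
    return [
        "curl", "-s",
        "-d", "@" + json_path,
        "--header", "Content-Type: application/json",
        url,
    ]


def run_curl(url, json_path):
    """Run curl and return its standard output as text."""
    result = subprocess.run(
        build_curl_command(url, json_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems, since no shell is involved.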


Answer 6

This could be achieved with an approach like the one below (URL is a placeholder):

import json
import subprocess

# Run curl quietly and parse its output as JSON
data = subprocess.check_output(['curl', '-s', URL])
r = json.loads(data)


How to save a Seaborn plot to a file

Question: How to save a Seaborn plot to a file

I tried the following code (test_seaborn.py):

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
import seaborn as sns
sns.set()
df = sns.load_dataset('iris')
sns_plot = sns.pairplot(df, hue='species', size=2.5)
fig = sns_plot.get_figure()
fig.savefig("output.png")
#sns.plt.show()

But I get this error:

  Traceback (most recent call last):
  File "test_searborn.py", line 11, in <module>
    fig = sns_plot.get_figure()
AttributeError: 'PairGrid' object has no attribute 'get_figure'

I expect the final output.png will exist and look like this:

How can I resolve the problem?


Answer 0

Remove the get_figure and just use sns_plot.savefig('output.png')

df = sns.load_dataset('iris')
sns_plot = sns.pairplot(df, hue='species', size=2.5)
sns_plot.savefig("output.png")

Answer 1

The suggested solutions are incompatible with Seaborn 0.8.1, giving the following errors because the Seaborn interface has changed:

AttributeError: 'AxesSubplot' object has no attribute 'fig'
When trying to access the figure

AttributeError: 'AxesSubplot' object has no attribute 'savefig'
when trying to use the savefig directly as a function

The following calls allow you to access the figure (Seaborn 0.8.1 compatible):

swarm_plot = sns.swarmplot(...)
fig = swarm_plot.get_figure()
fig.savefig(...) 

as seen previously in this answer.

UPDATE: I have recently used PairGrid object from seaborn to generate a plot similar to the one in this example. In this case, since GridPlot is not a plot object like, for example, sns.swarmplot, it has no get_figure() function. It is possible to directly access the matplotlib figure by

fig = myGridPlotObject.fig

Like previously suggested in other posts in this thread.
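The two cases described in this answer (grid objects that have savefig, and Axes objects that only have get_figure) can be handled by one small helper. This is a hedged sketch: save_seaborn_plot is a hypothetical name, and it relies only on the attributes mentioned in this thread rather than on any particular seaborn version.

```python
def save_seaborn_plot(plot, path):
    """Save whatever a seaborn plotting call returned (hypothetical helper)."""
    if hasattr(plot, "savefig"):
        # Grid objects such as the PairGrid returned by pairplot
        # expose savefig directly.
        plot.savefig(path)
    else:
        # Axes-level functions such as swarmplot return a matplotlib Axes,
        # which gives access to its figure through get_figure().
        plot.get_figure().savefig(path)
```

With this, save_seaborn_plot(sns.pairplot(df), "output.png") and save_seaborn_plot(sns.swarmplot(...), "output.png") would both go through the same call.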


Answer 2

Some of the above solutions did not work for me. The .fig attribute was not found when I tried that and I was unable to use .savefig() directly. However, what did work was:

sns_plot.figure.savefig("output.png")

I am a newer Python user, so I do not know if this is due to an update. I wanted to mention it in case anybody else runs into the same issues as I did.


Answer 3

You should just be able to use the savefig method of sns_plot directly.

sns_plot.savefig("output.png")

For clarity with your code if you did want to access the matplotlib figure that sns_plot resides in then you can get it directly with

fig = sns_plot.fig

In this case there is no get_figure method as your code assumes.


Answer 4

I use distplot and get_figure to save picture successfully.

sns_hist = sns.distplot(df_train['SalePrice'])
fig = sns_hist.get_figure()
fig.savefig('hist.png')

Answer 5

Fewer lines for 2019 searchers:

import matplotlib.pyplot as plt
import seaborn as sns

df = sns.load_dataset('iris')
sns_plot = sns.pairplot(df, hue='species', height=2.5)
plt.savefig('output.png')

UPDATE NOTE: size was changed to height.


Answer 6

This works for me

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

sns.factorplot(x='holiday',data=data,kind='count',size=5,aspect=1)
plt.savefig('holiday-vs-count.png')

Answer 7

It’s also possible to just create a matplotlib figure object and then use plt.savefig(...):

from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset('iris')
plt.figure() # Push new figure on stack
sns_plot = sns.pairplot(df, hue='species', size=2.5)
plt.savefig('output.png') # Save that figure

Answer 8

You would get an error for using sns.figure.savefig("output.png") in seaborn 0.8.1.

Instead use:

import seaborn as sns

df = sns.load_dataset('iris')
sns_plot = sns.pairplot(df, hue='species', size=2.5)
sns_plot.savefig("output.png")

Answer 9

Just FYI, the below command worked in seaborn 0.8.1 so I guess the initial answer is still valid.

sns_plot = sns.pairplot(data, hue='species', size=3)
sns_plot.savefig("output.png")

How do I upgrade to Python 3.6 with conda?

Question: How do I upgrade to Python 3.6 with conda?

I’m new to Conda package management and I want to get the latest version of Python to use f-strings in my code. Currently my version is (python -V):

Python 3.5.2 :: Anaconda 4.2.0 (x86_64)

How would I upgrade to Python 3.6?


Answer 0

Anaconda has not updated python internally to 3.6.

a) Method 1

  1. To update Python, type conda update python
  2. To update Anaconda, type conda update anaconda
  3. To upgrade between Python versions such as 3.5 to 3.6, you’ll have to run

    conda install python=$pythonversion$
    

b) Method 2 – Create a new environment (Better Method)

conda create --name py36 python=3.6

c) To get the absolute latest python(3.6.5 at time of writing)

conda create --name py365 python=3.6.5 --channel conda-forge

You can see all this from here

Also, refer to this for force upgrading

EDIT: Anaconda now has a Python 3.6 version here
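After switching environments, it can be worth confirming from inside Python that the active interpreter really is 3.6 or newer, since f-strings are a syntax error on anything older. A minimal sketch, with supports_fstrings as a hypothetical helper name:

```python
import sys


def supports_fstrings(version_info=sys.version_info):
    """Return True if the given version is 3.6 or newer (f-strings need 3.6)."""
    return tuple(version_info[:2]) >= (3, 6)
```

For the interpreter in the question (3.5.2) this returns False; inside a py36 environment it should return True.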


Answer 1

Creating a new environment will install python 3.6:

$ conda create --name 3point6 python=3.6
Fetching package metadata .......
Solving package specifications: ..........

Package plan for installation in environment /Users/dstansby/miniconda3/envs/3point6:

The following NEW packages will be INSTALLED:

    openssl:    1.0.2j-0     
    pip:        9.0.1-py36_1 
    python:     3.6.0-0      
    readline:   6.2-2        
    setuptools: 27.2.0-py36_0
    sqlite:     3.13.0-0     
    tk:         8.5.18-0     
    wheel:      0.29.0-py36_0
    xz:         5.2.2-1      
    zlib:       1.2.8-3 

Answer 2

I found this page with detailed instructions to upgrade Anaconda to a major newer version of Python (from Anaconda 4.0+). First,

conda update conda
conda remove argcomplete conda-manager

I also had to conda remove some packages not on the official list:

  • backports_abc
  • beautiful-soup
  • blaze-core

Depending on packages installed on your system, you may get additional UnsatisfiableError errors – simply add those packages to the remove list. Next, install the version of Python,

conda install python==3.6

which takes a while, after which a message indicated to conda install anaconda-client, so I did

conda install anaconda-client

which said it’s already there. Finally, following the directions,

conda update anaconda

I did this in the Windows 10 command prompt, but things should be similar in Mac OS X.


Answer 3

In the past, I have found it quite difficult to try to upgrade in-place.

Note: my use-case for Anaconda is as an all-in-one Python environment. I don’t bother with separate virtual environments. If you’re using conda to create environments, this may be destructive because conda creates environments with hard-links inside your Anaconda/envs directory.

So if you use environments, you may first want to export your environments. After activating your environment, do something like:

conda env export > environment.yml

After backing up your environments (if necessary), you may remove your old Anaconda (it’s very simple to uninstall Anaconda):

$ rm -rf ~/anaconda3/

and replace it by downloading the new Anaconda, e.g. Linux, 64 bit:

$ cd ~/Downloads
$ wget https://repo.continuum.io/archive/Anaconda3-4.3.0-Linux-x86_64.sh 

(see here for a more recent one),

and then executing it:

$ bash Anaconda3-4.3.0-Linux-x86_64.sh 

Answer 4

I’m using macOS Mojave.

These 4 steps worked for me.

  1. conda update conda
  2. conda install python=3.6
  3. conda install anaconda-client
  4. conda update anaconda

Answer 5

Best method I found:

source activate old_env
conda env export > old_env.yml

Then process it with something like this:

with open('old_env.yml', 'r') as fin, open('new_env.yml', 'w') as fout:
    for line in fin:
        if 'py35' in line:  # replace by the version you want to supersede
            line = line[:line.rfind('=')] + '\n'
        fout.write(line)

Then manually edit the first line (name: ...) and the last line (prefix: ...) to reflect your new environment name, and run:

conda env create -f new_env.yml

You might need to manually remove or change the version pin of a few packages for which the pinned version from old_env is incompatible with, or missing for, the new Python version.

I wish there was a built-in, easier way…
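To make the effect of the loop above easier to see, the same slicing can be wrapped in a function and tried on a couple of sample lines. This is a sketch only: strip_build_pins is a hypothetical name, and the sample lines assume the name=version=build format that conda env export writes.

```python
def strip_build_pins(lines, tag="py35"):
    """Drop the build string from lines whose build mentions `tag`."""
    out = []
    for line in lines:
        if tag in line:
            # Cut at the last '=' so only name=version remains
            line = line[:line.rfind("=")] + "\n"
        out.append(line)
    return out


sample = ["  - numpy=1.11.1=py35_0\n", "  - zlib=1.2.8=3\n"]
print(strip_build_pins(sample))
# ['  - numpy=1.11.1\n', '  - zlib=1.2.8=3\n']
```

Lines without the tag pass through unchanged, which is why packages that are not tied to a Python build keep their full pin.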


Error message: “’chromedriver’ executable needs to be available in the path”

Question: Error message: “’chromedriver’ executable needs to be available in the path”

I am using selenium with python and have downloaded the chromedriver for my windows computer from this site: http://chromedriver.storage.googleapis.com/index.html?path=2.15/

After downloading the zip file, I unpacked the zip file to my downloads folder. Then I put the path to the executable binary (C:\Users\michael\Downloads\chromedriver_win32) into the Environment Variable “Path”.

However, when I run the following code:

  from selenium import webdriver

  driver = webdriver.Chrome()

… I keep getting the following error message:

WebDriverException: Message: 'chromedriver' executable needs to be available in the path. Please look at     http://docs.seleniumhq.org/download/#thirdPartyDrivers and read up at http://code.google.com/p/selenium/wiki/ChromeDriver

But – as explained above – the executable is(!) in the path … what is going on here?


Answer 0

You can test whether it actually is in the PATH: open a cmd window, type chromedriver (assuming your chromedriver executable is still named like this) and hit Enter. If Starting ChromeDriver 2.15.322448 appears, the PATH is set appropriately and something else is going wrong.

Alternatively you can use a direct path to the chromedriver like this:

 driver = webdriver.Chrome('/path/to/chromedriver') 

So in your specific case:

 driver = webdriver.Chrome("C:/Users/michael/Downloads/chromedriver_win32/chromedriver.exe")
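The PATH seen by a Python process can differ from the one shown in the system settings, for example when the IDE or terminal was started before the variable was changed. A minimal sketch for checking this from Python itself, using the standard library's shutil.which; find_chromedriver and path_entries are hypothetical helper names:

```python
import os
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver if it is on PATH, else None."""
    return shutil.which(name)


def path_entries():
    """Return the directories this Python process searches for executables."""
    return os.environ.get("PATH", "").split(os.pathsep)
```

If find_chromedriver() returns None while the folder is listed in the system settings, printing path_entries() should show whether the running process has picked up the change.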

Answer 1

I see the discussions still talk about the old way of setting up chromedriver by downloading the binary and configuring the path manually.

This can be done automatically using webdriver-manager

pip install webdriver-manager

Now the code in the question will work with the change below:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())

The same can be used to set up the Firefox, Edge and IE drivers.


Answer 2

The same happens with PyCharm Community Edition: as with cmd, you must restart your IDE in order to reload the path variables. Restart your IDE and it should be fine.


Answer 3

On Ubuntu:

sudo apt install chromium-chromedriver

On Debian:

sudo apt install chromium-driver

On macOS install https://brew.sh/ then do

brew cask install chromedriver

Answer 4

We have to pass the path as a raw string, i.e. put the letter r before the string. I tested it this way, and it works.

driver = webdriver.Chrome(r"C:/Users/michael/Downloads/chromedriver_win32/chromedriver.exe")

Answer 5

Some additional input/clarification for future readers of this thread, to avoid tinkering with the PATH env. variable at the Windows level and restart of the Windows system: (copy of my answer from https://stackoverflow.com/a/49851498/9083077 as applicable to Chrome):

(1) Download chromedriver (as described in this thread earlier) and place the (unzipped) chromedriver.exe at X:\Folder\of\your\choice

(2) Python code sample:

import os
os.environ["PATH"] += os.pathsep + r'X:\Folder\of\your\choice'

from selenium import webdriver
browser = webdriver.Chrome()
browser.get('http://localhost:8000')
assert 'Django' in browser.title

Notes: (1) It may take about 5 seconds for the sample code (in the referenced answer) to open up the Firefox browser for the specified url. (2) The python console would show the following error if there’s no server already running at the specified url or serving a page with the title containing the string ‘Django’: assert ‘Django’ in browser.title AssertionError
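Because the snippet above appends to PATH every time it runs, repeated runs in the same process (for example in a notebook) keep growing the variable. A small sketch that adds the folder only once; add_to_path is a hypothetical helper, and the environ parameter exists only so the function can be tried on a plain dict:

```python
import os


def add_to_path(directory, environ=None):
    """Append `directory` to PATH unless it is already listed."""
    environ = os.environ if environ is None else environ
    entries = [e for e in environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.append(directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]
```

Calling add_to_path(r'X:\Folder\of\your\choice') before creating the webdriver would replace the += line above.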


Answer 6

For Linux and OSX

Step 1: Download chromedriver

# You can find more recent/older versions at http://chromedriver.storage.googleapis.com/
# Also make sure to pick the right driver, based on your Operating System
wget http://chromedriver.storage.googleapis.com/81.0.4044.69/chromedriver_mac64.zip

For debian: wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip

Step 2: Add chromedriver to /usr/local/bin

unzip chromedriver_mac64.zip
sudo mv chromedriver /usr/local/bin
sudo chown root:root /usr/local/bin/chromedriver
sudo chmod +x /usr/local/bin/chromedriver

You should now be able to run

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('http://localhost:8000')

without any issues


Answer 7

When you unzip chromedriver, please do specify an exact location so that you can trace it later. Below, you are getting the right chromedriver for your OS, and then unzipping it to an exact location, which could be provided as argument later on in your code.

wget http://chromedriver.storage.googleapis.com/2.10/chromedriver_linux64.zip
unzip chromedriver_linux64.zip -d /home/virtualenv/python2.7.9/


Answer 8

If you are working with the Robot Framework RIDE, you can download chromedriver.exe from its official website and keep this .exe file in the C:\Python27\Scripts directory. Now add this path to your environment variable, e.g. C:\Python27\Scripts\chromedriver.exe.

Restart your computer and run same test case again. You will not get this problem again.


Answer 9

According to the instruction, you need to include the path to ChromeDriver when instantiating webdriver.Chrome eg.:

driver = webdriver.Chrome('/path/to/chromedriver')

Answer 10

Before you add the chromedriver to your path, make sure it’s the same version as your browser.

If not, you will need to match versions: either update/downgrade your Chrome, or upgrade/downgrade your webdriver.

I recommend updating Chrome to the latest version you can, and then matching the webdriver to it.

To update chrome:

  • On the top right corner, click on the three dots.
  • click help -> About Google Chrome
  • update the version and restart chrome

Then download the compatible version from here: http://chromedriver.chromium.org/downloads .

Note: The newest chromedriver doesn’t always match the newest version of chrome!

Now you can add it to the PATH:

  1. Create a new folder somewhere on your computer to hold your web drivers. I created a folder named webdrivers in C:\Program Files

  2. Copy the folder path. In my case it was C:\Program Files\webdrivers

  3. Right-click This PC -> Properties, then:

     1. On the right, click Advanced system settings
     2. Click Environment Variables
     3. Under System variables, select Path and click Edit
     4. Click New
     5. Paste the path you copied before
     6. Click OK on all the windows

That’s it! I use PyCharm and had to reopen it. The same probably applies to other IDEs and terminals, since they read PATH when they start.
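To confirm that the change was picked up, you can ask Python where it finds the driver. This is a small sketch using only the standard library; find_driver is a name chosen here for illustration.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver found on PATH, or None."""
    return shutil.which(name)


location = find_driver()
if location is None:
    print("chromedriver is not on PATH for this process")
else:
    print("chromedriver found at", location)
```

If this prints the "not on PATH" message from inside the IDE but finds the driver in a fresh terminal, the IDE is probably still running with the old environment and needs a restart.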


Answer 11

If it still doesn’t work after you are sure that PATH is set correctly, try restarting the computer.

In my case, on Windows 7, I always got WebDriverException: Message: for chromedriver, geckodriver and IEDriverServer, even though I was fairly sure the path was correct. After restarting the computer, everything worked.


Answer 12

In my case, the error disappeared after I copied the chromedriver file to the c:\Windows folder. This is because the Windows directory is on the PATH that the Python script searches for chromedriver.


Answer 13

If you are using a remote interpreter, you also have to check whether its executable PATH is defined. In my case, switching from a remote Docker interpreter to a local interpreter solved the problem.
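One way to see what a given interpreter (local, remote, or IDE-launched) actually has on its PATH is to print it from inside that interpreter. This is a minimal diagnostic sketch; path_entries is a helper name introduced here.

```python
import os
import sys


def path_entries():
    """Return the PATH directories visible to this interpreter, in search order."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


print("Interpreter:", sys.executable)
for entry in path_entries():
    print("PATH entry:", entry)
```

Running the same script under each interpreter and comparing the output shows whether the folder that holds the driver is visible to the one that fails.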


Answer 14

I encountered the same problem. I’m using PyCharm to write programs, and I think the problem lies in the environment setup in PyCharm rather than in the OS. I solved it by opening the script configuration and manually editing PATH in its environment variables. Hope you find this helpful!


Answer 15

Add the webdriver (chromedriver.exe or geckodriver.exe) to C:\Windows. This worked in my case.


Answer 16

The best way may be to get the current directory and append the rest of the path to it, as in this code (it works on Windows; on Linux you can use something like pwd). Note the raw string: without the r prefix, the \t in the path would be read as a tab character.

webdriveraddress = os.popen("cd").read().replace("\n", '') + r'\path\to\webdriver'
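The same idea can be written without spawning a shell and without platform-specific separators. This is a sketch using os.getcwd and os.path.join; the directory and file names are placeholders.

```python
import os


def driver_path(*parts):
    """Join the current working directory with the given path components."""
    return os.path.join(os.getcwd(), *parts)


# "drivers" and "chromedriver" are placeholder names for illustration.
print(driver_path("drivers", "chromedriver"))
```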


Answer 17

When I downloaded chromedriver.exe, I simply moved it into a folder on PATH (C:\Windows\System32\chromedriver.exe) and had exactly the same problem.

For me, the solution was to use a different folder on PATH: I moved it to the PyCharm Community bin folder, which was also on PATH. For example:

  • C:\Windows\System32\chromedriver.exe -> raised the exception
  • C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\bin\chromedriver.exe -> worked fine

Answer 18

I had this issue on macOS Mojave running the Robot test framework and Chrome 77. The following solved the problem. Kudos to @Navarasu for pointing me in the right direction.

$ pip install webdriver-manager --user # install webdriver-manager lib for python
$ python # open python prompt

Next, at the Python prompt:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(ChromeDriverManager().install())

# ctrl+d to exit

This leads to the following error:

Checking for mac64 chromedriver:xx.x.xxxx.xx in cache
There is no cached driver. Downloading new one...
Trying to download new driver from http://chromedriver.storage.googleapis.com/xx.x.xxxx.xx/chromedriver_mac64.zip
...
TypeError: makedirs() got an unexpected keyword argument 'exist_ok'

  • The output did at least give me the newest download link
    • Download and unzip chromedriver to wherever you want
    • For example: ~/chromedriver/chromedriver

Open ~/.bash_profile in an editor and add:

export PATH="$HOME/chromedriver:$PATH"

Open a new terminal window, ta-da 🎉
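As for the TypeError in the log: os.makedirs only accepts exist_ok from Python 3.2 onward, so the plain python command here most likely started a Python 2 interpreter. The sketch below assumes Python 3.2 or newer and shows the call behaving as intended; the directory layout is a placeholder created under a temporary folder.

```python
import os
import sys
import tempfile

# exist_ok was added to os.makedirs in Python 3.2.
print("Running Python %d.%d" % sys.version_info[:2])

base = tempfile.mkdtemp()
target = os.path.join(base, "drivers", "chromedriver")  # placeholder layout

os.makedirs(target, exist_ok=True)  # creates the nested directories
os.makedirs(target, exist_ok=True)  # second call is a no-op instead of an error
print(os.path.isdir(target))
```

If that is the cause, running the original snippet with python3 instead of python may let webdriver-manager complete the download by itself.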


Answer 19

I had this problem on Webdriver 3.8.0 (Chrome 73.0.3683.103 and ChromeDriver 73.0.3683.68). The problem disappeared after I did

pip install -U selenium

to upgrade Webdriver to 3.14.1.
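To see which Selenium version is installed before and after the upgrade, you can query the package metadata. This is a sketch assuming Python 3.8 or newer, where importlib.metadata is available; installed_version is a helper name chosen here.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(package="selenium"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("selenium version:", installed_version())
```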


Answer 20

The surest way is this:

Download and unzip chromedriver and put ‘chromedriver.exe’ in C:\Python27\Scripts. Then you do not need to provide the path of the driver, just

driver = webdriver.Chrome()

You are done; there is no need to add paths or anything else.


Answer 21

Check the path to your Chrome driver; it may not be getting picked up from there. Simply copy and paste the driver location into the code.


Answer 22

(for Mac users) I had the same problem, but I solved it in this simple way: put your chromedriver.exe in the same folder as the script you run, and then write these statements in Python:

import os

os.environ["PATH"] += os.pathsep + r'X:/your/folder/script/'
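A slightly more defensive variant of the same idea, as a sketch: it avoids adding the same folder twice and only changes PATH for the current process and the programs it starts. The function name add_to_path and the use of the current working directory are assumptions for illustration.

```python
import os


def add_to_path(directory):
    """Append a directory to this process's PATH unless it is already listed."""
    entries = os.environ.get("PATH", "").split(os.pathsep)
    if directory not in entries:
        entries.append(directory)
        os.environ["PATH"] = os.pathsep.join(e for e in entries if e)
    return os.environ.get("PATH", "")


# Assumption: the driver sits in the current working directory.
add_to_path(os.getcwd())
```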


How to install 2 Anacondas (Python 2 and 3) on Mac OS

Question: How to install 2 Anacondas (Python 2 and 3) on Mac OS

I’m relatively new to Mac OS. I’ve just installed Xcode (for the C++ compiler) and Anaconda with the latest Python 3 (for myself). Now I’m wondering how to properly install a second Anaconda (for work) with Python 2.

I need both versions to work with IPython and the Spyder IDE. Ideally, the two Python environments would be completely separate. For example, I would like to write something like conda install scikit-learn for the Python 3 environment and something like conda2 install scikit-learn for Python 2.


Answer 0

There is no need to install Anaconda again. Conda, the package manager for Anaconda, fully supports separated environments. The easiest way to create an environment for Python 2.7 is to do

conda create -n python2 python=2.7 anaconda

This will create an environment named python2 that contains the Python 2.7 version of Anaconda. You can activate this environment with

source activate python2

This will put that environment (typically ~/anaconda/envs/python2) at the front of your PATH, so that when you type python at the terminal it will load the Python from that environment.

If you don’t want all of Anaconda, you can replace anaconda in the command above with whatever packages you want. You can use conda to install packages in that environment later, either by using the -n python2 flag to conda, or by activating the environment.
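After activating an environment, a quick way to confirm which interpreter the shell now picks up is to ask Python itself. This is a minimal sketch written to run under both Python 2.7 and Python 3.

```python
import sys

# Shows which interpreter is running and where its environment lives on disk.
print("Python %d.%d" % sys.version_info[:2])
print("Executable: %s" % sys.executable)
print("Environment prefix: %s" % sys.prefix)
```

With the python2 environment active, the prefix would be expected to point inside the envs/python2 folder mentioned above.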


Answer 1

Edit: Please make sure that you have both Python versions installed on your computer.

My answer may be too late for you, but it can help someone else with the same problem!

You don’t have to download both Anacondas.

If you are using Spyder and Jupyter in an Anaconda environment:

If you already have Anaconda 2, type in the terminal:

    python3 -m pip install ipykernel

    python3 -m ipykernel install --user

If you already have Anaconda 3, type in the terminal:

    python2 -m pip install ipykernel

    python2 -m ipykernel install --user

Then, before using Spyder, you can choose the Python environment. Sometimes you will see only root and your new Python environment; in that case root is your first Anaconda environment.

The same applies to Jupyter: you can choose the Python version there as well.

I hope it helps.


Answer 2

This may be helpful if you have more than one Python version installed and don’t know how to tell your IDEs to use a specific version.

  1. Install the latest version of Anaconda.
  2. Open the navigator by typing anaconda-navigator in the terminal.
  3. Open Environments. Click Create and choose your Python version there.
  4. A new environment will be created for that Python version, and you can install the IDEs listed there just by clicking Install.
  5. Launch the IDE from your environment so that it uses the Python version specified for that environment.

Hope it helps!