Question: How do I percent-encode URL parameters in Python?

If I do

url = "http://example.com?p=" + urllib.quote(query)
  1. It doesn’t encode / to %2F (breaks OAuth normalization)
  2. It doesn’t handle Unicode (it throws an exception)

Is there a better library?


Answer 0

Python 2

From the docs:

urllib.quote(string[, safe])

Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-' are never quoted. By default, this function is intended for quoting the path section of the URL. The optional safe parameter specifies additional characters that should not be quoted; its default value is '/'.

That means passing '' (an empty string) for safe will solve your first issue:

>>> urllib.quote('/test')
'/test'
>>> urllib.quote('/test', safe='')
'%2Ftest'

About the second issue, there is a bug report about it here. Apparently it was fixed in Python 3. You can work around it by encoding the string as UTF-8 first, like this:

>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller

By the way, have a look at urlencode.

Python 3

The same, except replace urllib.quote with urllib.parse.quote.
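
As a minimal Python 3 sketch of the two fixes above applied to the URL from the question (the helper name build_url and the sample value are made up for illustration, not part of the original answer):

```python
from urllib.parse import quote, unquote


def build_url(base, value):
    """Append `value` as the p= query parameter, percent-encoding it.

    build_url is a hypothetical helper for this example only.
    safe="" tells quote() to escape "/" as well, instead of leaving it as is.
    """
    return base + "?p=" + quote(value, safe="")


# In Python 3, quote() encodes non-ASCII text as UTF-8 before escaping.
url = build_url("http://example.com", "a/b Müller")
print(url)  # http://example.com?p=a%2Fb%20M%C3%BCller

# Round trip: unquote() restores the original text.
print(unquote(url.split("=", 1)[1]))  # a/b Müller
```

Note that the space comes out as %20 rather than +, which is the form that percent-encoding (as opposed to form encoding) produces.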


Answer 1

In Python 3, urllib.quote has been moved to urllib.parse.quote, and it handles Unicode by default.

>>> from urllib.parse import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
>>> quote('/El Niño/')
'/El%20Ni%C3%B1o/'

Answer 2

My answer is similar to Paolo’s answer.

I think the requests module is much better. It is based on urllib3. You can try this:

>>> from requests.utils import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'

Answer 3

If you're using Django, you can use urlquote:

>>> from django.utils.http import urlquote
>>> urlquote(u"Müller")
u'M%C3%BCller'

Note that changes to Python since this answer was published mean that this is now a legacy wrapper. From the Django 2.1 source code for django.utils.http:

A legacy compatibility wrapper to Python's urllib.parse.quote() function.
(was used for unicode handling on Python 2)
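
Since the docstring above describes urlquote as a wrapper around urllib.parse.quote(), a rough Python 3 equivalent that avoids the Django import might look like this (a sketch only; it assumes the default safe='/' is acceptable for your use):

```python
from urllib.parse import quote

# Python 3 strings are Unicode, so no u"" prefix or manual encoding is needed.
encoded = quote("Müller")
print(encoded)  # M%C3%BCller
```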

Answer 4

It is better to use urlencode here. There is not much difference for a single parameter, but IMHO it makes the code clearer. (A function named quote_plus looks confusing, especially to those coming from other languages.)

In [21]: query='lskdfj/sdfkjdf/ksdfj skfj'

In [22]: val=34

In [23]: from urllib.parse import urlencode

In [24]: encoded = urlencode(dict(p=query,val=val))

In [25]: print(f"http://example.com?{encoded}")
http://example.com?p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34
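
One caveat with the output above: urlencode uses quote_plus by default, so spaces become +. If you need %20 instead (for example, for the OAuth normalization mentioned in the question), a possible approach is the quote_via parameter, available in Python 3.5 and later. The variable names below are illustrative only:

```python
from urllib.parse import quote, urlencode

params = {"p": "lskdfj/sdfkjdf/ksdfj skfj", "val": 34}

# Default behaviour: quote_plus is used, so the space becomes "+".
form_encoded = urlencode(params)

# quote_via=quote switches to plain percent-encoding, so the space becomes "%20".
# urlencode passes safe="" by default, so "/" is still escaped as %2F.
percent_encoded = urlencode(params, quote_via=quote)

print(form_encoded)     # p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34
print(percent_encoded)  # p=lskdfj%2Fsdfkjdf%2Fksdfj%20skfj&val=34
```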

Docs

urlencode: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlencode

quote_plus: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_plus

