Question: Add parameters to a given URL in Python
Suppose I was given a URL.
It might already have GET parameters (e.g. http://example.com/search?q=question) or it might not (e.g. http://example.com/).
And now I need to add some parameters to it, like {'lang':'en','tag':'python'}. In the first case I’m going to have http://example.com/search?q=question&lang=en&tag=python and in the second, http://example.com/search?lang=en&tag=python.
Is there any standard way to do this?
Answer 0
There are a couple of quirks with the urllib and urlparse modules. Here’s a working example:
try:
    import urlparse
    from urllib import urlencode
except ImportError:  # For Python 3
    import urllib.parse as urlparse
    from urllib.parse import urlencode

url = "http://stackoverflow.com/search?q=question"
params = {'lang': 'en', 'tag': 'python'}

# Index 4 of the 6-tuple returned by urlparse() is the query string.
url_parts = list(urlparse.urlparse(url))
query = dict(urlparse.parse_qsl(url_parts[4]))
query.update(params)
url_parts[4] = urlencode(query)

print(urlparse.urlunparse(url_parts))
ParseResult, the result of urlparse(), is read-only, so we need to convert it to a list before we can attempt to modify its data.
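A caveat with the dict(...) conversion above: if the original query repeats a key (for example tag=a&tag=b), only the last value survives. Below is a minimal sketch, assuming Python 3 and assuming that new parameters should override existing keys of the same name, that keeps the query as a list of pairs instead. The function name add_params_keep_repeats is made up for illustration and is not part of the standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_params_keep_repeats(url, params):
    """Add params to url, keeping repeated keys that are not overridden.

    Hypothetical helper, not part of the standard library.
    """
    parts = urlparse(url)
    # Keep existing (key, value) pairs whose key is not being overridden.
    pairs = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in params]
    pairs.extend(params.items())
    # _replace() returns a new ParseResult, so no list conversion is needed.
    return urlunparse(parts._replace(query=urlencode(pairs)))


print(add_params_keep_repeats(
    "http://example.com/search?tag=a&tag=b&q=question", {"lang": "en"}))
# http://example.com/search?tag=a&tag=b&q=question&lang=en
```

Whether a repeated key should be kept, replaced, or appended to depends on the server that consumes the URL, so treat the override rule here as one possible choice.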
Answer 1
Why
I wasn’t satisfied with any of the solutions on this page (come on, where is our favorite copy-paste thing?), so I wrote my own based on the answers here. It tries to be complete and more Pythonic. I’ve added a handler for dict and bool values in arguments to be more consumer-side (JS) friendly, but it is optional and you can drop it.
How it works
Test 1: Adding new arguments, handling arrays and bool values:
url = 'http://stackoverflow.com/test'
new_params = {'answers': False, 'data': ['some','values']}
add_url_params(url, new_params) == \
    'http://stackoverflow.com/test?data=some&data=values&answers=false'
Test 2: Rewriting existing args, handling dict values:
url = 'http://stackoverflow.com/test/?question=false'
new_params = {'question': {'__X__':'__Y__'}}
add_url_params(url, new_params) == \
    'http://stackoverflow.com/test/?question=%7B%22__X__%22%3A+%22__Y__%22%7D'
Talk is cheap. Show me the code.
The code itself. I’ve tried to describe it in detail:
from json import dumps

try:
    from urllib import urlencode, unquote
    from urlparse import urlparse, parse_qsl, ParseResult
except ImportError:
    # Python 3 fallback
    from urllib.parse import (
        urlencode, unquote, urlparse, parse_qsl, ParseResult
    )


def add_url_params(url, params):
    """ Add GET params to provided URL being aware of existing.

    :param url: string of target URL
    :param params: dict containing requested params to be added
    :return: string with updated URL

    >> url = 'http://stackoverflow.com/test?answers=true'
    >> new_params = {'answers': False, 'data': ['some','values']}
    >> add_url_params(url, new_params)
    'http://stackoverflow.com/test?data=some&data=values&answers=false'
    """
    # Unquoting URL first so we don't lose existing args
    url = unquote(url)
    # Extracting url info
    parsed_url = urlparse(url)
    # Extracting URL arguments from parsed URL
    get_args = parsed_url.query
    # Converting URL arguments to dict
    parsed_get_args = dict(parse_qsl(get_args))
    # Merging URL arguments dict with new params
    parsed_get_args.update(params)

    # Bool and dict values should be converted to JSON-friendly values;
    # you may throw this part away if you don't like it :)
    parsed_get_args.update(
        {k: dumps(v) for k, v in parsed_get_args.items()
         if isinstance(v, (bool, dict))}
    )

    # Converting URL arguments to a proper query string
    encoded_get_args = urlencode(parsed_get_args, doseq=True)
    # Creating a new parsed result object based on the provided one, with
    # the new URL arguments. The same thing happens inside of urlparse.
    new_url = ParseResult(
        parsed_url.scheme, parsed_url.netloc, parsed_url.path,
        parsed_url.params, encoded_get_args, parsed_url.fragment
    ).geturl()

    return new_url
Please be aware that there may be some issues; if you find one, please let me know and we will make this thing better.
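One issue of this kind may be the up-front unquote(url) call in add_url_params: if an existing value contains a percent-encoded ampersand (%26), unquoting the whole URL before parsing turns it into a real separator. The snippet below is a small sketch, assuming Python 3, that only demonstrates the difference; the example URL is made up.

```python
from urllib.parse import parse_qsl, unquote, urlparse

url = "http://example.com/search?q=a%26b"  # the value of q is the string "a&b"

# Parsing the URL as-is decodes the value correctly.
as_is = parse_qsl(urlparse(url).query)

# Unquoting first turns %26 into a literal "&", which then splits the value.
unquoted_first = parse_qsl(urlparse(unquote(url)).query)

print(as_is)           # [('q', 'a&b')]
print(unquoted_first)  # [('q', 'a')]
```

Since parse_qsl already decodes each key and value, dropping the unquote step is probably safe for URLs that are properly encoded to begin with.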
Answer 2
You want to use URL encoding if the strings can contain arbitrary data (for example, characters such as ampersands and slashes need to be encoded).
Check out urllib.urlencode (Python 2):
>>> import urllib
>>> urllib.urlencode({'lang':'en','tag':'python'})
'lang=en&tag=python'
In Python 3:
>>> from urllib import parse
>>> parse.urlencode({'lang':'en','tag':'python'})
'lang=en&tag=python'
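To make the point about arbitrary data concrete, here is a short sketch, assuming Python 3, with made-up parameter values that contain a space, an ampersand and a slash. It shows urlencode escaping them and parse_qsl recovering the original values.

```python
from urllib.parse import parse_qsl, urlencode

params = {"q": "cats & dogs", "path": "a/b"}

query = urlencode(params)
print(query)  # q=cats+%26+dogs&path=a%2Fb

# Decoding the query string gives back the original values.
assert dict(parse_qsl(query)) == params
```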
Answer 3
You can also use the furl module: https://github.com/gruns/furl
>>> from furl import furl
>>> print(furl('http://example.com/search?q=question').add({'lang':'en','tag':'python'}).url)
http://example.com/search?q=question&lang=en&tag=python
Answer 4
Outsource it to the battle-tested requests library.
This is how I would do it:
from requests.models import PreparedRequest
url = 'http://example.com/search?q=question'
params = {'lang':'en','tag':'python'}
req = PreparedRequest()
req.prepare_url(url, params)
print(req.url)
Answer 5
If you are using the requests lib:
import requests
...
params = {'tag': 'python'}
requests.get(url, params=params)
Answer 6
Yes: use urllib.
From the examples in the Python 2 documentation:
>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
>>> print f.geturl() # Prints the final URL with parameters.
>>> print f.read() # Prints the contents
Answer 7
Based on this answer, one-liner for simple cases (Python 3 code):
from urllib.parse import urlparse, urlencode
url = "https://stackoverflow.com/search?q=question"
params = {'lang':'en','tag':'python'}
url += ('&' if urlparse(url).query else '?') + urlencode(params)
or:
url += ('&', '?')[urlparse(url).query == ''] + urlencode(params)
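The "simple cases" qualifier matters: if the URL carries a fragment (#...), plain concatenation puts the new parameters after it, where they become part of the fragment. The following sketch, assuming Python 3 and a made-up example URL, contrasts that with splitting the URL first.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

url = "http://example.com/search?q=question#results"
params = {"lang": "en"}

# Naive concatenation places the new parameter after the fragment.
naive = url + "&" + urlencode(params)

# Splitting first keeps the fragment at the end.
parts = urlsplit(url)
query = "&".join(filter(None, [parts.query, urlencode(params)]))
fixed = urlunsplit(parts._replace(query=query))

print(naive)  # http://example.com/search?q=question#results&lang=en
print(fixed)  # http://example.com/search?q=question&lang=en#results
```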
Answer 8
I find this more elegant than the two top answers:
from urllib.parse import urlencode, urlparse, parse_qs


def merge_url_query_params(url: str, additional_params: dict) -> str:
    url_components = urlparse(url)
    original_params = parse_qs(url_components.query)
    # The {**a, **b} merge needs Python 3.5+. On older versions you could
    # update original_params with additional_params instead, but here
    # no variable is mutated.
    merged_params = {**original_params, **additional_params}
    updated_query = urlencode(merged_params, doseq=True)
    # _replace() is how you can create a new NamedTuple with a changed field
    return url_components._replace(query=updated_query).geturl()


assert merge_url_query_params(
    'http://example.com/search?q=question',
    {'lang':'en','tag':'python'},
) == 'http://example.com/search?q=question&lang=en&tag=python'
The most important things I dislike in the top answers (they are nevertheless good):
- Łukasz: having to remember the index at which the query is in the URL components
- Sapphire64: the very verbose way of creating the updated ParseResult
What’s bad about my response is the magic-looking dict merge using unpacking, but I prefer that to updating an already existing dictionary because of my prejudice against mutability.
Answer 9
I liked Łukasz’s version, but since the urllib and urlparse functions are somewhat awkward to use in this case, I think it’s more straightforward to do something like this (Python 2):
import urllib
import urlparse

params = urllib.urlencode(params)
if urlparse.urlparse(url)[4]:
    print(url + '&' + params)
else:
    print(url + '?' + params)
Answer 10
Use the various urlparse functions to tear apart the existing URL, urllib.urlencode() on the combined dictionary, then urlparse.urlunparse() to put it all back together again.
Or just take the result of urllib.urlencode() and concatenate it to the URL appropriately.
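The first approach described above can be sketched as follows. This assumes Python 3, where the same functions live in urllib.parse; the function name add_params is hypothetical and chosen only for this illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_params(url, new_params):
    """Hypothetical helper following the three steps described above."""
    # 1. Tear the URL apart into its six components.
    scheme, netloc, path, params, query, fragment = urlparse(url)
    # 2. Combine the existing query with the new parameters and re-encode.
    combined = dict(parse_qsl(query))
    combined.update(new_params)
    # 3. Put everything back together.
    return urlunparse(
        (scheme, netloc, path, params, urlencode(combined), fragment))


print(add_params("http://example.com/search?q=question",
                 {"lang": "en", "tag": "python"}))
# http://example.com/search?q=question&lang=en&tag=python
```

The same call also works for a URL without a query string, in which case only the new parameters end up after the question mark.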
Answer 11
Yet another answer (Python 2):
import urllib
import urlparse


def addGetParameters(url, newParams):
    (scheme, netloc, path, params, query, fragment) = urlparse.urlparse(url)
    queryList = urlparse.parse_qsl(query, keep_blank_values=True)
    for key in newParams:
        queryList.append((key, newParams[key]))
    return urlparse.urlunparse((scheme, netloc, path, params, urllib.urlencode(queryList), fragment))
Answer 12
Here is how I implemented it.
import urllib

params = urllib.urlencode({'lang':'en','tag':'python'})
url = ''
if request.GET:
    url = request.url + '&' + params
else:
    url = request.url + '?' + params
Worked like a charm. However, I would have liked a cleaner way to implement this.
Another way of implementing the above is to put it in a function.
import urllib

def add_url_param(request, **params):
    new_url = ''
    _params = dict(**params)
    _params = urllib.urlencode(_params)
    if _params:
        if request.GET:
            new_url = request.url + '&' + _params
        else:
            new_url = request.url + '?' + _params
    else:
        new_url = request.url
    return new_url
Answer 13
In Python 2.5:
import cgi
import urllib
import urlparse


def add_url_param(url, **params):
    n = 3  # index of the query component in the urlsplit() result
    parts = list(urlparse.urlsplit(url))
    d = dict(cgi.parse_qsl(parts[n]))  # use cgi.parse_qs for list values
    d.update(params)
    parts[n] = urllib.urlencode(d)
    return urlparse.urlunsplit(parts)


url = "http://stackoverflow.com/search?q=question"
add_url_param(url, lang='en') == "http://stackoverflow.com/search?q=question&lang=en"