Python 实用宝典

Question 1

Suppose I was given a URL.
It might already have GET parameters (e.g. http://example.com/search?q=question) or it might not (e.g. http://example.com/).

And now I need to add some parameters to it like {'lang':'en','tag':'python'}. In the first case I’m going to have http://example.com/search?q=question&lang=en&tag=python and in the second — http://example.com/search?lang=en&tag=python.

Is there any standard way to do this?

Question 2

There are a couple of quirks with the urllib and urlparse modules. Here’s a working example:

try:
    import urlparse
    from urllib import urlencode
except: # For Python 3
    import urllib.parse as urlparse
    from urllib.parse import urlencode

url = "http://stackoverflow.com/search?q=question"
params = {'lang':'en','tag':'python'}

url_parts = list(urlparse.urlparse(url))
query = dict(urlparse.parse_qsl(url_parts[4]))
query.update(params)

url_parts[4] = urlencode(query)

print(urlparse.urlunparse(url_parts))

ParseResult, the result of urlparse(), is read-only and we need to convert it to a list before we can attempt to modify its data.

Question 3

Why

I’ve been not satisfied with all the solutions on this page (come on, where is our favorite copy-paste thing?) so I wrote my own based on answers here. It tries to be complete and more Pythonic. I’ve added a handler for dict and bool values in arguments to be more consumer-side (JS) friendly, but they are yet optional, you can drop them.

How it works

Test 1: Adding new arguments, handling Arrays and Bool values:

url = 'http://stackoverflow.com/test'
new_params = {'answers': False, 'data': ['some','values']}

add_url_params(url, new_params) == \
    'http://stackoverflow.com/test?data=some&data=values&answers=false'

Test 2: Rewriting existing args, handling DICT values:

url = 'http://stackoverflow.com/test/?question=false'
new_params = {'question': {'__X__':'__Y__'}}

add_url_params(url, new_params) == \
    'http://stackoverflow.com/test/?question=%7B%22__X__%22%3A+%22__Y__%22%7D'

Talk is cheap. Show me the code.

Code itself. I’ve tried to describe it in details:

from json import dumps

try:
    from urllib import urlencode, unquote
    from urlparse import urlparse, parse_qsl, ParseResult
except ImportError:
    # Python 3 fallback
    from urllib.parse import (
        urlencode, unquote, urlparse, parse_qsl, ParseResult
    )


def add_url_params(url, params):
    """ Add GET params to provided URL being aware of existing.

    :param url: string of target URL
    :param params: dict containing requested params to be added
    :return: string with updated URL

    >> url = 'http://stackoverflow.com/test?answers=true'
    >> new_params = {'answers': False, 'data': ['some','values']}
    >> add_url_params(url, new_params)
    'http://stackoverflow.com/test?data=some&data=values&answers=false'
    """
    # Unquoting URL first so we don't loose existing args
    url = unquote(url)
    # Extracting url info
    parsed_url = urlparse(url)
    # Extracting URL arguments from parsed URL
    get_args = parsed_url.query
    # Converting URL arguments to dict
    parsed_get_args = dict(parse_qsl(get_args))
    # Merging URL arguments dict with new params
    parsed_get_args.update(params)

    # Bool and Dict values should be converted to json-friendly values
    # you may throw this part away if you don't like it :)
    parsed_get_args.update(
        {k: dumps(v) for k, v in parsed_get_args.items()
         if isinstance(v, (bool, dict))}
    )

    # Converting URL argument to proper query string
    encoded_get_args = urlencode(parsed_get_args, doseq=True)
    # Creating new parsed result object based on provided with new
    # URL arguments. Same thing happens inside of urlparse.
    new_url = ParseResult(
        parsed_url.scheme, parsed_url.netloc, parsed_url.path,
        parsed_url.params, encoded_get_args, parsed_url.fragment
    ).geturl()

    return new_url

Please be aware that there may be some issues, if you’ll find one please let me know and we will make this thing better

Question 4

You want to use URL encoding if the strings can have arbitrary data (for example, characters such as ampersands, slashes, etc. will need to be encoded).

Check out urllib.urlencode:

>>> import urllib
>>> urllib.urlencode({'lang':'en','tag':'python'})
'lang=en&tag=python'

In python3:

from urllib import parse
parse.urlencode({'lang':'en','tag':'python'})

Question 5

You can also use the furl module https://github.com/gruns/furl

>>> from furl import furl
>>> print furl('http://example.com/search?q=question').add({'lang':'en','tag':'python'}).url
http://example.com/search?q=question&lang=en&tag=python

Question 6

Outsource it to the battle tested requests library.

This is how I will do it:

from requests.models import PreparedRequest
url = 'http://example.com/search?q=question'
params = {'lang':'en','tag':'python'}
req = PreparedRequest()
req.prepare_url(url, params)
print(req.url)

Question 7

If you are using the requests lib:

import requests
...
params = {'tag': 'python'}
requests.get(url, params=params)

Question 8

Yes: use urllib.

From the examples in the documentation:

>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
>>> print f.geturl() # Prints the final URL with parameters.
>>> print f.read() # Prints the contents

Question 9

Based on this answer, one-liner for simple cases (Python 3 code):

from urllib.parse import urlparse, urlencode


url = "https://stackoverflow.com/search?q=question"
params = {'lang':'en','tag':'python'}

url += ('&' if urlparse(url).query else '?') + urlencode(params)

or:

url += ('&', '?')[urlparse(url).query == ''] + urlencode(params)

Question 10

I find this more elegant than the two top answers:

from urllib.parse import urlencode, urlparse, parse_qs

def merge_url_query_params(url: str, additional_params: dict) -> str:
    url_components = urlparse(url)
    original_params = parse_qs(url_components.query)
    # Before Python 3.5 you could update original_params with 
    # additional_params, but here all the variables are immutable.
    merged_params = {**original_params, **additional_params}
    updated_query = urlencode(merged_params, doseq=True)
    # _replace() is how you can create a new NamedTuple with a changed field
    return url_components._replace(query=updated_query).geturl()

assert merge_url_query_params(
    'http://example.com/search?q=question',
    {'lang':'en','tag':'python'},
) == 'http://example.com/search?q=question&lang=en&tag=python'

The most important things I dislike in the top answers (they are nevertheless good):

Łukasz: having to remember the index at which the query is in the URL components
Sapphire64: the very verbose way of creating the updated ParseResult

What’s bad about my response is the magically looking dict merge using unpacking, but I prefer that to updating an already existing dictionary because of my prejudice against mutability.

Question 11

I liked Łukasz version, but since urllib and urllparse functions are somewhat awkward to use in this case, I think it’s more straightforward to do something like this:

params = urllib.urlencode(params)

if urlparse.urlparse(url)[4]:
    print url + '&' + params
else:
    print url + '?' + params

Question 12

Use the various urlparse functions to tear apart the existing URL, urllib.urlencode() on the combined dictionary, then urlparse.urlunparse() to put it all back together again.

Or just take the result of urllib.urlencode() and concatenate it to the URL appropriately.

Question 13

Yet another answer:

def addGetParameters(url, newParams):
    (scheme, netloc, path, params, query, fragment) = urlparse.urlparse(url)
    queryList = urlparse.parse_qsl(query, keep_blank_values=True)
    for key in newParams:
        queryList.append((key, newParams[key]))
    return urlparse.urlunparse((scheme, netloc, path, params, urllib.urlencode(queryList), fragment))

Question 14

Here is how I implemented it.

import urllib

params = urllib.urlencode({'lang':'en','tag':'python'})
url = ''
if request.GET:
   url = request.url + '&' + params
else:
   url = request.url + '?' + params

Worked like a charm. However, I would have liked a more cleaner way to implement this.

Another way of implementing the above is put it in a method.

import urllib

def add_url_param(request, **params):
   new_url = ''
   _params = dict(**params)
   _params = urllib.urlencode(_params)

   if _params:
      if request.GET:
         new_url = request.url + '&' + _params
      else:
         new_url = request.url + '?' + _params
   else:
      new_url = request.url

   return new_ur

Question 15

In python 2.5

import cgi
import urllib
import urlparse

def add_url_param(url, **params):
    n=3
    parts = list(urlparse.urlsplit(url))
    d = dict(cgi.parse_qsl(parts[n])) # use cgi.parse_qs for list values
    d.update(params)
    parts[n]=urllib.urlencode(d)
    return urlparse.urlunsplit(parts)

url = "http://stackoverflow.com/search?q=question"
add_url_param(url, lang='en') == "http://stackoverflow.com/search?q=question&lang=en"

Python 实用宝典

在Python中将参数添加到给定的URL

问题：在Python中将参数添加到给定的URL

回答 0

回答 1

为什么

这个怎么运作

谈话很便宜。给我看代码。

Why

How it works

Talk is cheap. Show me the code.

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

有趣好用的Python教程