问题:如何使用请求下载图像

我正在尝试使用python的requests模块从网络下载并保存图像。

这是我使用的(工作)代码:

img = urllib2.urlopen(settings.STATICMAP_URL.format(**data))
with open(path, 'w') as f:
    f.write(img.read())

这是使用requests以下代码的新代码(无效):

r = requests.get(settings.STATICMAP_URL.format(**data))
if r.status_code == 200:
    img = r.raw.read()
    with open(path, 'w') as f:
        f.write(img)

您能帮助我从响应中使用什么属性requests吗?

I’m trying to download and save an image from the web using python’s requests module.

Here is the (working) code I used:

img = urllib2.urlopen(settings.STATICMAP_URL.format(**data))
with open(path, 'w') as f:
    f.write(img.read())

Here is the new (non-working) code using requests:

r = requests.get(settings.STATICMAP_URL.format(**data))
if r.status_code == 200:
    img = r.raw.read()
    with open(path, 'w') as f:
        f.write(img)

Can you help me on what attribute from the response to use from requests?


回答 0

您可以使用response.rawfile对象,也可以遍历响应。

response.raw默认情况下,使用类似文件的对象不会解码压缩的响应(使用GZIP或deflate)。您可以通过将decode_content属性设置为Truerequests将其设置False为控制自身解码)来强制为您解压缩。然后,您可以使用shutil.copyfileobj()Python将数据流式传输到文件对象:

import requests
import shutil

r = requests.get(settings.STATICMAP_URL.format(**data), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)        

要遍历响应,请使用循环;这样迭代可确保在此阶段对数据进行解压缩:

r = requests.get(settings.STATICMAP_URL.format(**data), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        for chunk in r:
            f.write(chunk)

这将以128字节的块读取数据;如果您觉得其他块大小更好,请使用具有自定义块大小的Response.iter_content()方法

r = requests.get(settings.STATICMAP_URL.format(**data), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        for chunk in r.iter_content(1024):
            f.write(chunk)

请注意,您需要以二进制模式打开目标文件,以确保python不会尝试为您翻译换行符。我们还设置stream=Truerequests不先将整个图像下载到内存中。

You can either use the response.raw file object, or iterate over the response.

To use the response.raw file-like object will not, by default, decode compressed responses (with GZIP or deflate). You can force it to decompress for you anyway by setting the decode_content attribute to True (requests sets it to False to control decoding itself). You can then use shutil.copyfileobj() to have Python stream the data to a file object:

import requests
import shutil

r = requests.get(settings.STATICMAP_URL.format(**data), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)        

To iterate over the response use a loop; iterating like this ensures that data is decompressed by this stage:

r = requests.get(settings.STATICMAP_URL.format(**data), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        for chunk in r:
            f.write(chunk)

This’ll read the data in 128 byte chunks; if you feel another chunk size works better, use the Response.iter_content() method with a custom chunk size:

r = requests.get(settings.STATICMAP_URL.format(**data), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        for chunk in r.iter_content(1024):
            f.write(chunk)

Note that you need to open the destination file in binary mode to ensure python doesn’t try and translate newlines for you. We also set stream=True so that requests doesn’t download the whole image into memory first.


回答 1

从请求中获取类似文件的对象,然后将其复制到文件中。这也将避免将整个事件立即读入内存。

import shutil

import requests

url = 'http://example.com/img.png'
response = requests.get(url, stream=True)
with open('img.png', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)
del response

Get a file-like object from the request and copy it to a file. This will also avoid reading the whole thing into memory at once.

import shutil

import requests

url = 'http://example.com/img.png'
response = requests.get(url, stream=True)
with open('img.png', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)
del response

回答 2

怎么样,一个快速的解决方案。

import requests

url = "http://craphound.com/images/1006884_2adf8fc7.jpg"
response = requests.get(url)
if response.status_code == 200:
    with open("/Users/apple/Desktop/sample.jpg", 'wb') as f:
        f.write(response.content)

How about this, a quick solution.

import requests

url = "http://craphound.com/images/1006884_2adf8fc7.jpg"
response = requests.get(url)
if response.status_code == 200:
    with open("/Users/apple/Desktop/sample.jpg", 'wb') as f:
        f.write(response.content)

回答 3

我同样需要使用请求下载图像。我首先尝试了Martijn Pieters的答案,并且效果很好。但是,当我对该简单函数进行概要分析时,发现与urllib和urllib2相比,它使用了许多函数调用。

然后,我尝试了请求模块的作者推荐方式

import requests
from PIL import Image
# python2.x, use this instead  
# from StringIO import StringIO
# for python3.x,
from io import StringIO

r = requests.get('https://example.com/image.jpg')
i = Image.open(StringIO(r.content))

这大大减少了函数调用的次数,从而加快了我的应用程序的速度。这是我的探查器的代码和结果。

#!/usr/bin/python
import requests
from StringIO import StringIO
from PIL import Image
import profile

def testRequest():
    image_name = 'test1.jpg'
    url = 'http://example.com/image.jpg'

    r = requests.get(url, stream=True)
    with open(image_name, 'wb') as f:
        for chunk in r.iter_content():
            f.write(chunk)

def testRequest2():
    image_name = 'test2.jpg'
    url = 'http://example.com/image.jpg'

    r = requests.get(url)

    i = Image.open(StringIO(r.content))
    i.save(image_name)

if __name__ == '__main__':
    profile.run('testUrllib()')
    profile.run('testUrllib2()')
    profile.run('testRequest()')

testRequest的结果:

343080 function calls (343068 primitive calls) in 2.580 seconds

以及testRequest2的结果:

3129 function calls (3105 primitive calls) in 0.024 seconds

I have the same need for downloading images using requests. I first tried the answer of Martijn Pieters, and it works well. But when I did a profile on this simple function, I found that it uses so many function calls compared to urllib and urllib2.

I then tried the way recommended by the author of requests module:

import requests
from PIL import Image
# python2.x, use this instead  
# from StringIO import StringIO
# for python3.x,
from io import StringIO

r = requests.get('https://example.com/image.jpg')
i = Image.open(StringIO(r.content))

This much more reduced the number of function calls, thus speeded up my application. Here is the code of my profiler and the result.

#!/usr/bin/python
import requests
from StringIO import StringIO
from PIL import Image
import profile

def testRequest():
    image_name = 'test1.jpg'
    url = 'http://example.com/image.jpg'

    r = requests.get(url, stream=True)
    with open(image_name, 'wb') as f:
        for chunk in r.iter_content():
            f.write(chunk)

def testRequest2():
    image_name = 'test2.jpg'
    url = 'http://example.com/image.jpg'

    r = requests.get(url)

    i = Image.open(StringIO(r.content))
    i.save(image_name)

if __name__ == '__main__':
    profile.run('testUrllib()')
    profile.run('testUrllib2()')
    profile.run('testRequest()')

The result for testRequest:

343080 function calls (343068 primitive calls) in 2.580 seconds

And the result for testRequest2:

3129 function calls (3105 primitive calls) in 0.024 seconds

回答 4

这可能比使用容易requests。这是我唯一一次建议不要使用requestsHTTP的东西。

二班轮使用urllib

>>> import urllib
>>> urllib.request.urlretrieve("http://www.example.com/songs/mp3.mp3", "mp3.mp3")

还有一个名为Python的漂亮模块wget,非常易于使用。在这里找到。

这证明了设计的简单性:

>>> import wget
>>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
>>> filename = wget.download(url)
100% [................................................] 3841532 / 3841532>
>> filename
'razorback.mp3'

请享用。

编辑:您还可以添加out参数以指定路径。

>>> out_filepath = <output_filepath>    
>>> filename = wget.download(url, out=out_filepath)

This might be easier than using requests. This is the only time I’ll ever suggest not using requests to do HTTP stuff.

Two liner using urllib:

>>> import urllib
>>> urllib.request.urlretrieve("http://www.example.com/songs/mp3.mp3", "mp3.mp3")

There is also a nice Python module named wget that is pretty easy to use. Found here.

This demonstrates the simplicity of the design:

>>> import wget
>>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
>>> filename = wget.download(url)
100% [................................................] 3841532 / 3841532>
>> filename
'razorback.mp3'

Enjoy.

Edit: You can also add an out parameter to specify a path.

>>> out_filepath = <output_filepath>    
>>> filename = wget.download(url, out=out_filepath)

回答 5

以下代码段下载文件。

该文件以其文件名保存在指定的url中。

import requests

url = "http://example.com/image.jpg"
filename = url.split("/")[-1]
r = requests.get(url, timeout=0.5)

if r.status_code == 200:
    with open(filename, 'wb') as f:
        f.write(r.content)

Following code snippet downloads a file.

The file is saved with its filename as in specified url.

import requests

url = "http://example.com/image.jpg"
filename = url.split("/")[-1]
r = requests.get(url, timeout=0.5)

if r.status_code == 200:
    with open(filename, 'wb') as f:
        f.write(r.content)

回答 6

有两种主要方法:

  1. 使用.content(最简单/官方的)(请参见Zhenyi Zhang的答案):

    import io  # Note: io.BytesIO is StringIO.StringIO on Python2.
    import requests
    
    r = requests.get('http://lorempixel.com/400/200')
    r.raise_for_status()
    with io.BytesIO(r.content) as f:
        with Image.open(f) as img:
            img.show()
  2. 使用.raw(请参阅Martijn Pieters的答案):

    import requests
    
    r = requests.get('http://lorempixel.com/400/200', stream=True)
    r.raise_for_status()
    r.raw.decode_content = True  # Required to decompress gzip/deflate compressed responses.
    with PIL.Image.open(r.raw) as img:
        img.show()
    r.close()  # Safety when stream=True ensure the connection is released.

两者的时间都没有明显差异。

There are 2 main ways:

  1. Using .content (simplest/official) (see Zhenyi Zhang’s answer):

    import io  # Note: io.BytesIO is StringIO.StringIO on Python2.
    import requests
    
    r = requests.get('http://lorempixel.com/400/200')
    r.raise_for_status()
    with io.BytesIO(r.content) as f:
        with Image.open(f) as img:
            img.show()
    
  2. Using .raw (see Martijn Pieters’s answer):

    import requests
    
    r = requests.get('http://lorempixel.com/400/200', stream=True)
    r.raise_for_status()
    r.raw.decode_content = True  # Required to decompress gzip/deflate compressed responses.
    with PIL.Image.open(r.raw) as img:
        img.show()
    r.close()  # Safety when stream=True ensure the connection is released.
    

Timing both shows no noticeable difference.


回答 7

就像导入图像和请求一样容易

from PIL import Image
import requests

img = Image.open(requests.get(url, stream = True).raw)
img.save('img1.jpg')

As easy as to import Image and requests

from PIL import Image
import requests

img = Image.open(requests.get(url, stream = True).raw)
img.save('img1.jpg')

回答 8

这是一个更加用户友好的答案,仍然使用流式传输。

只需定义这些函数并调用即可getImage()。默认情况下,它将使用与url相同的文件名并写入当前目录,但是两者都可以更改。

import requests
from StringIO import StringIO
from PIL import Image

def createFilename(url, name, folder):
    dotSplit = url.split('.')
    if name == None:
        # use the same as the url
        slashSplit = dotSplit[-2].split('/')
        name = slashSplit[-1]
    ext = dotSplit[-1]
    file = '{}{}.{}'.format(folder, name, ext)
    return file

def getImage(url, name=None, folder='./'):
    file = createFilename(url, name, folder)
    with open(file, 'wb') as f:
        r = requests.get(url, stream=True)
        for block in r.iter_content(1024):
            if not block:
                break
            f.write(block)

def getImageFast(url, name=None, folder='./'):
    file = createFilename(url, name, folder)
    r = requests.get(url)
    i = Image.open(StringIO(r.content))
    i.save(file)

if __name__ == '__main__':
    # Uses Less Memory
    getImage('http://www.example.com/image.jpg')
    # Faster
    getImageFast('http://www.example.com/image.jpg')

request胆量getImage()是根据这里的答案而的胆量getImageFast()是根据以上答案。

Here is a more user-friendly answer that still uses streaming.

Just define these functions and call getImage(). It will use the same file name as the url and write to the current directory by default, but both can be changed.

import requests
from StringIO import StringIO
from PIL import Image

def createFilename(url, name, folder):
    dotSplit = url.split('.')
    if name == None:
        # use the same as the url
        slashSplit = dotSplit[-2].split('/')
        name = slashSplit[-1]
    ext = dotSplit[-1]
    file = '{}{}.{}'.format(folder, name, ext)
    return file

def getImage(url, name=None, folder='./'):
    file = createFilename(url, name, folder)
    with open(file, 'wb') as f:
        r = requests.get(url, stream=True)
        for block in r.iter_content(1024):
            if not block:
                break
            f.write(block)

def getImageFast(url, name=None, folder='./'):
    file = createFilename(url, name, folder)
    r = requests.get(url)
    i = Image.open(StringIO(r.content))
    i.save(file)

if __name__ == '__main__':
    # Uses Less Memory
    getImage('http://www.example.com/image.jpg')
    # Faster
    getImageFast('http://www.example.com/image.jpg')

The request guts of getImage() are based on the answer here and the guts of getImageFast() are based on the answer above.


回答 9

我将发布答案,因为我没有足够的代表发表评论,但是使用Blairg23发布的wget,您还可以为路径提供out参数。

 wget.download(url, out=path)

I’m going to post an answer as I don’t have enough rep to make a comment, but with wget as posted by Blairg23, you can also provide an out parameter for the path.

 wget.download(url, out=path)

回答 10

这是谷歌搜索有关如何下载带有请求的二进制文件的第一个响应。如果您需要下载包含请求的任意文件,可以使用:

import requests
url = 'https://s3.amazonaws.com/lab-data-collections/GoogleNews-vectors-negative300.bin.gz'
open('GoogleNews-vectors-negative300.bin.gz', 'wb').write(requests.get(url, allow_redirects=True).content)

This is the first response that comes up for google searches on how to download a binary file with requests. In case you need to download an arbitrary file with requests, you can use:

import requests
url = 'https://s3.amazonaws.com/lab-data-collections/GoogleNews-vectors-negative300.bin.gz'
open('GoogleNews-vectors-negative300.bin.gz', 'wb').write(requests.get(url, allow_redirects=True).content)

回答 11

这就是我做的

import requests
from PIL import Image
from io import BytesIO

url = 'your_url'
files = {'file': ("C:/Users/shadow/Downloads/black.jpeg", open('C:/Users/shadow/Downloads/black.jpeg', 'rb'),'image/jpg')}
response = requests.post(url, files=files)

img = Image.open(BytesIO(response.content))
img.show()

This is how I did it

import requests
from PIL import Image
from io import BytesIO

url = 'your_url'
files = {'file': ("C:/Users/shadow/Downloads/black.jpeg", open('C:/Users/shadow/Downloads/black.jpeg', 'rb'),'image/jpg')}
response = requests.post(url, files=files)

img = Image.open(BytesIO(response.content))
img.show()

回答 12

您可以执行以下操作:

import requests
import random

url = "https://images.pexels.com/photos/1308881/pexels-photo-1308881.jpeg? auto=compress&cs=tinysrgb&dpr=1&w=500"
name=random.randrange(1,1000)
filename=str(name)+".jpg"
response = requests.get(url)
if response.status_code.ok:
   with open(filename,'w') as f:
    f.write(response.content)

You can do something like this:

import requests
import random

url = "https://images.pexels.com/photos/1308881/pexels-photo-1308881.jpeg? auto=compress&cs=tinysrgb&dpr=1&w=500"
name=random.randrange(1,1000)
filename=str(name)+".jpg"
response = requests.get(url)
if response.status_code.ok:
   with open(filename,'w') as f:
    f.write(response.content)

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。