基本的HTTP文件下载并保存到python中的磁盘上？

Question 1

I’m new to Python and I’ve been going through the Q&A on this site, for an answer to my question. However, I’m a beginner and I find it difficult to understand some of the solutions. I need a very basic solution.

Could someone please explain a simple solution to ‘Downloading a file through http’ and ‘Saving it to disk, in Windows’, to me?

I’m not sure how to use shutil and os modules, either.

The file I want to download is under 500 MB and is an .gz archive file.If someone can explain how to extract the archive and utilise the files in it also, that would be great!

Here’s a partial solution, that I wrote from various answers combined:

import requests
import os
import shutil

global dump

def download_file():
    global dump
    url = "http://randomsite.com/file.gz"
    file = requests.get(url, stream=True)
    dump = file.raw

def save_file():
    global dump
    location = os.path.abspath("D:\folder\file.gz")
    with open("file.gz", 'wb') as location:
        shutil.copyfileobj(dump, location)
    del dump

Could someone point out errors (beginner level) and explain any easier methods to do this?

Thanks!

Question 2

A clean way to download a file is:

import urllib

testfile = urllib.URLopener()
testfile.retrieve("http://randomsite.com/file.gz", "file.gz")

This downloads a file from a website and names it file.gz. This is one of my favorite solutions, from Downloading a picture via urllib and python.

This example uses the urllib library, and it will directly retrieve the file form a source.

Question 3

As mentioned here:

import urllib
urllib.urlretrieve ("http://randomsite.com/file.gz", "file.gz")

EDIT: If you still want to use requests, take a look at this question or this one.

Question 4

I use wget.

Simple and good library if you want to example?

import wget

file_url = 'http://johndoe.com/download.zip'

file_name = wget.download(file_url)

wget module support python 2 and python 3 versions

Question 5

Four methods using wget, urllib and request.

#!/usr/bin/python
import requests
from StringIO import StringIO
from PIL import Image
import profile as profile
import urllib
import wget


url = 'https://tinypng.com/images/social/website.jpg'

def testRequest():
    image_name = 'test1.jpg'
    r = requests.get(url, stream=True)
    with open(image_name, 'wb') as f:
        for chunk in r.iter_content():
            f.write(chunk)

def testRequest2():
    image_name = 'test2.jpg'
    r = requests.get(url)
    i = Image.open(StringIO(r.content))
    i.save(image_name)

def testUrllib():
    image_name = 'test3.jpg'
    testfile = urllib.URLopener()
    testfile.retrieve(url, image_name)

def testwget():
    image_name = 'test4.jpg'
    wget.download(url, image_name)

if __name__ == '__main__':
    profile.run('testRequest()')
    profile.run('testRequest2()')
    profile.run('testUrllib()')
    profile.run('testwget()')

testRequest – 4469882 function calls (4469842 primitive calls) in 20.236 seconds

testRequest2 – 8580 function calls (8574 primitive calls) in 0.072 seconds

testUrllib – 3810 function calls (3775 primitive calls) in 0.036 seconds

testwget – 3489 function calls in 0.020 seconds

Question 6

For Python3+ URLopener is deprecated. And when used you will get error as below:

url_opener = urllib.URLopener() AttributeError: module ‘urllib’ has no attribute ‘URLopener’

So, try:

import urllib.request 
urllib.request.urlretrieve(url, filename)

Question 7

Exotic Windows Solution

import subprocess

subprocess.run("powershell Invoke-WebRequest {} -OutFile {}".format(your_url, filename), shell=True)

Question 8

I started down this path because ESXi’s wget is not compiled with SSL and I wanted to download an OVA from a vendor’s website directly onto the ESXi host which is on the other side of the world.

I had to disable the firewall(lazy)/enable https out by editing the rules(proper)

created the python script:

import ssl
import shutil
import tempfile
import urllib.request
context = ssl._create_unverified_context()

dlurl='https://somesite/path/whatever'
with urllib.request.urlopen(durl, context=context) as response:
    with open("file.ova", 'wb') as tmp_file:
        shutil.copyfileobj(response, tmp_file)

ESXi libraries are kind of paired down but the open source weasel installer seemed to use urllib for https… so it inspired me to go down this path

Question 9

Another clean way to save the file is this:

import csv
import urllib

urllib.retrieve("your url goes here" , "output.csv")

基本的HTTP文件下载并保存到python中的磁盘上？

问题：基本的HTTP文件下载并保存到python中的磁盘上？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

7行代码 Python热力图可视化分析缺失数据处理

Python 流程图 — 一键转化代码为流程图

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

如何从熊猫数据框中删除行列表？

如何剖析Python中的内存使用情况？

熊猫行动中的进度指示器

创建单独变量字典的更简单方法？

如何在Ubuntu上通过pip安装python3版本的软件包？

创建零填充的熊猫数据框

基本的HTTP文件下载并保存到python中的磁盘上？

问题：基本的HTTP文件下载并保存到python中的磁盘上？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

相关文章

排行榜展示

文章展示