I am new to gevents and greenlets. I found some good documentation on how to work with them, but none gave me justification on how and when I should use greenlets!
What are they really good at?
Is it a good idea to use them in a proxy server or not?
Why not threads?
What I am not sure about is how they can provide us with concurrency if they’re basically co-routines.
Greenlets provide concurrency but not parallelism. Concurrency is when code can run independently of other code. Parallelism is the execution of concurrent code simultaneously. Parallelism is particularly useful when there’s a lot of work to be done in userspace, and that’s typically CPU-heavy stuff. Concurrency is useful for breaking apart problems, enabling different parts to be scheduled and managed more easily in parallel.
Greenlets really shine in network programming where interactions with one socket can occur independently of interactions with other sockets. This is a classic example of concurrency. Because each greenlet runs in its own context, you can continue to use synchronous APIs without threading. This is good because threads are very expensive in terms of virtual memory and kernel overhead, so the concurrency you can achieve with threads is significantly less. Additionally, threading in Python is more expensive and more limited than usual due to the GIL. Alternatives for achieving concurrency are usually projects like Twisted, libevent, libuv, node.js, etc., where all your code shares the same execution context and registers event handlers.
It’s an excellent idea to use greenlets (with appropriate networking support such as through gevent) for writing a proxy, as your handling of each request is able to execute independently and should be written as such.
Greenlets provide concurrency for the reasons I gave earlier. Concurrency is not parallelism. By concealing event registration and performing scheduling for you on calls that would normally block the current thread, projects like gevent expose this concurrency without requiring change to an asynchronous API, and at significantly less cost to your system.
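To make "concurrency without parallelism" concrete, here is a minimal sketch that uses plain Python generators as stand-ins for greenlets. It is not how greenlet or gevent is implemented; the names worker and run_round_robin are made up for illustration. Each task runs a step, then hands control back at a yield, which is roughly where gevent would switch away on a call that would block.

```python
from collections import deque


def worker(name, steps, log):
    """A task that hands control back to the scheduler after every step."""
    for i in range(steps):
        log.append((name, i))
        yield  # cooperative switch point (gevent switches on blocking I/O)


def run_round_robin(tasks):
    """Run generator-based tasks one step at a time on a single thread."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
        except StopIteration:
            continue            # task finished, do not reschedule it
        ready.append(task)      # otherwise put it at the back of the line


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

The two tasks interleave even though only one of them ever executes at any instant, which is the distinction the answer draws between concurrency and parallelism.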
Taking @Max’s answer and adding some relevance to it for scaling, you can see the difference. I achieved this by changing the URLs to be filled as follows:
URLS_base = ['www.google.com', 'www.example.com', 'www.python.org', 'www.yahoo.com', 'www.ubc.ca', 'www.wikipedia.org']
URLS = []
for _ in range(10000):
    for url in URLS_base:
        URLS.append(url)
I had to drop the multiprocess version as it fell over before I reached 500; but at 10,000 iterations:
Using gevent it took: 3.756914
-----------
Using multi-threading it took: 15.797028
So you can see there is a significant difference in I/O using gevent.
Correcting for @TemporalBeing's answer above: greenlets are not “faster” than threads, and spawning 60000 threads to solve a concurrency problem is an incorrect programming technique; a small pool of threads is appropriate instead. Here is a more reasonable comparison (from my reddit post in response to people citing this SO post).
import gevent
from gevent import socket as gsock
import socket as sock
import threading
from datetime import datetime
def timeit(fn, URLS):
    t1 = datetime.now()
    fn()
    t2 = datetime.now()
    print(
        "%s / %d hostnames, %s seconds" % (
            fn.__name__,
            len(URLS),
            (t2 - t1).total_seconds()
        )
    )
def run_gevent_without_a_timeout():
    ip_numbers = []
    def greenlet(domain_name):
        ip_numbers.append(gsock.gethostbyname(domain_name))
    jobs = [gevent.spawn(greenlet, domain_name) for domain_name in URLS]
    gevent.joinall(jobs)
    assert len(ip_numbers) == len(URLS)
def run_threads_correctly():
    ip_numbers = []
    def process():
        while queue:
            try:
                domain_name = queue.pop()
            except IndexError:
                pass
            else:
                ip_numbers.append(sock.gethostbyname(domain_name))
    threads = [threading.Thread(target=process) for i in range(50)]
    queue = list(URLS)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert len(ip_numbers) == len(URLS)
URLS_base = ['www.google.com', 'www.example.com', 'www.python.org',
             'www.yahoo.com', 'www.ubc.ca', 'www.wikipedia.org']
for NUM in (5, 50, 500, 5000, 10000):
    URLS = []
    for _ in range(NUM):
        for url in URLS_base:
            URLS.append(url)
    print("--------------------")
    timeit(run_gevent_without_a_timeout, URLS)
    timeit(run_threads_correctly, URLS)
The misunderstanding everyone has about non-blocking IO with Python is the belief that the Python interpreter can attend to the work of retrieving results from sockets at a large scale faster than the network connections themselves can return IO. While this is certainly true in some cases, it is not true nearly as often as people think, because the Python interpreter is really, really slow. In my blog post here, I illustrate some graphical profiles that show that for even very simple things, if you are dealing with crisp and fast network access to things like databases or DNS servers, those services can come back a lot faster than the Python code can attend to many thousands of those connections.
This is interesting enough to analyze.
Here is some code to compare the performance of greenlets versus a multiprocessing pool versus multi-threading:
import gevent
from gevent import socket as gsock
import socket as sock
from multiprocessing import Pool
from threading import Thread
from datetime import datetime
class IpGetter(Thread):
    def __init__(self, domain):
        Thread.__init__(self)
        self.domain = domain
    def run(self):
        self.ip = sock.gethostbyname(self.domain)
if __name__ == "__main__":
    URLS = ['www.google.com', 'www.example.com', 'www.python.org', 'www.yahoo.com', 'www.ubc.ca', 'www.wikipedia.org']
    t1 = datetime.now()
    jobs = [gevent.spawn(gsock.gethostbyname, url) for url in URLS]
    gevent.joinall(jobs, timeout=2)
    t2 = datetime.now()
    print "Using gevent it took: %s" % (t2-t1).total_seconds()
    print "-----------"
    t1 = datetime.now()
    pool = Pool(len(URLS))
    results = pool.map(sock.gethostbyname, URLS)
    t2 = datetime.now()
    pool.close()
    print "Using multiprocessing it took: %s" % (t2-t1).total_seconds()
    print "-----------"
    t1 = datetime.now()
    threads = []
    for url in URLS:
        t = IpGetter(url)
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    t2 = datetime.now()
    print "Using multi-threading it took: %s" % (t2-t1).total_seconds()
Here are the results:
Using gevent it took: 0.083758
-----------
Using multiprocessing it took: 0.023633
-----------
Using multi-threading it took: 0.008327
I think that greenlet claims that it is not bound by GIL unlike the multithreading library. Moreover, Greenlet doc says that it is meant for network operations. For a network intensive operation, thread-switching is fine and you can see that the multithreading approach is pretty fast.
Also, it’s always preferable to use Python’s official libraries; I tried installing greenlet on Windows and encountered a DLL dependency problem, so I ran this test on a Linux VM.
Always try to write code with the hope that it runs on any machine.
I am opening a file which has 100,000 URLs. I need to send an HTTP request to each URL and print the status code. I am using Python 2.6, and so far looked at the many confusing ways Python implements threading/concurrency. I have even looked at the python concurrence library, but cannot figure out how to write this program correctly. Has anyone come across a similar problem? I guess generally I need to know how to perform thousands of tasks in Python as fast as possible – I suppose that means ‘concurrently’.
Twistedless solution:
from urlparse import urlparse
from threading import Thread
import httplib, sys
from Queue import Queue
concurrent = 200
def doWork():
    while True:
        url = q.get()
        status, url = getStatus(url)
        doSomethingWithResult(status, url)
        q.task_done()
def getStatus(ourl):
    try:
        url = urlparse(ourl)
        conn = httplib.HTTPConnection(url.netloc)
        conn.request("HEAD", url.path)
        res = conn.getresponse()
        return res.status, ourl
    except:
        return "error", ourl
def doSomethingWithResult(status, url):
    print status, url
q = Queue(concurrent * 2)
for i in range(concurrent):
    t = Thread(target=doWork)
    t.daemon = True
    t.start()
try:
    for url in open('urllist.txt'):
        q.put(url.strip())
    q.join()
except KeyboardInterrupt:
    sys.exit(1)
A solution using the tornado asynchronous networking library:
from tornado import ioloop, httpclient
i = 0
def handle_request(response):
    print(response.code)
    global i
    i -= 1
    if i == 0:
        ioloop.IOLoop.instance().stop()
http_client = httpclient.AsyncHTTPClient()
for url in open('urls.txt'):
    i += 1
    http_client.fetch(url.strip(), handle_request, method='HEAD')
ioloop.IOLoop.instance().start()
Things have changed quite a bit since 2010 when this was posted. I haven’t tried all the other answers, but I have tried a few, and I found this to work the best for me using python3.6.
I was able to fetch about 150 unique domains per second running on AWS.
import pandas as pd
import concurrent.futures
import requests
import time
out = []
CONNECTIONS = 100
TIMEOUT = 5
tlds = open('../data/sample_1k.txt').read().splitlines()
urls = ['http://{}'.format(x) for x in tlds[1:]]
def load_url(url, timeout):
    ans = requests.head(url, timeout=timeout)
    return ans.status_code
with concurrent.futures.ThreadPoolExecutor(max_workers=CONNECTIONS) as executor:
    future_to_url = (executor.submit(load_url, url, TIMEOUT) for url in urls)
    time1 = time.time()
    for future in concurrent.futures.as_completed(future_to_url):
        try:
            data = future.result()
        except Exception as exc:
            data = str(type(exc))
        finally:
            out.append(data)
            print(str(len(out)),end="\r")
    time2 = time.time()
print(f'Took {time2-time1:.2f} s')
print(pd.Series(out).value_counts())
Threads are absolutely not the answer here. They will provide both process and kernel bottlenecks, as well as throughput limits that are not acceptable if the overall goal is “the fastest way”.
A little bit of twisted and its asynchronous HTTP client would give you much better results.
A good approach to solving this problem is to first write the code required to get one result, then incorporate threading code to parallelize the application.
In a perfect world this would simply mean simultaneously starting 100,000 threads which output their results into a dictionary or list for later processing, but in practice you are limited in how many parallel HTTP requests you can issue in this fashion. Locally, you have limits in how many sockets you can open concurrently, how many threads of execution your Python interpreter will allow. Remotely, you may be limited in the number of simultaneous connections if all the requests are against one server, or many. These limitations will probably necessitate that you write the script in such a way as to only poll a small fraction of the URLs at any one time (100, as another poster mentioned, is probably a decent thread pool size, although you may find that you can successfully deploy many more).
You can follow this design pattern to resolve the above issue:
Start a thread which launches new request threads until the number of currently running threads (you can track them via threading.active_count() or by pushing the thread objects into a data structure) is >= your maximum number of simultaneous requests (say 100), then sleeps for a short timeout. This thread should terminate when there are no more URLs to process. Thus, the thread will keep waking up, launching new threads, and sleeping until you are finished.
Have the request threads store their results in some data structure for later retrieval and output. If the structure you are storing the results in is a list or dict in CPython, you can safely append or insert unique items from your threads without locks, but if you write to a file or require more complex cross-thread data interaction you should use a mutual exclusion lock to protect this state from corruption.
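The two steps above can be sketched roughly as follows. This is a variant rather than the exact pattern described: instead of a launcher that sleeps and polls threading.active_count(), the launcher blocks on a semaphore that caps the number of running threads. The function fake_request is a hypothetical stand-in for the real HTTP call, and the cap of 4 is only for the demonstration.

```python
import threading


def fake_request(url):
    # Hypothetical stand-in for an HTTP request; returns a made-up "status".
    return len(url)


def check_all(urls, max_in_flight=4):
    """Run one thread per URL, never more than max_in_flight at a time."""
    results = {}
    results_lock = threading.Lock()             # protects the shared dict
    slots = threading.BoundedSemaphore(max_in_flight)

    def task(url):
        try:
            status = fake_request(url)
            with results_lock:
                results[url] = status
        finally:
            slots.release()                     # free a slot for the launcher

    threads = []
    for url in urls:
        slots.acquire()                         # blocks while all slots are busy
        t = threading.Thread(target=task, args=(url,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return results


statuses = check_all(["http://a.example", "http://bb.example", "http://ccc.example"])
print(statuses)
```

The lock is not strictly needed for single dict assignments in CPython, as the answer notes, but it keeps the sketch correct if the result handling grows more complex.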
I would suggest you use the threading module. You can use it to launch and track running threads. Python’s threading support is bare, but the description of your problem suggests that it is completely sufficient for your needs.
Finally, if you’d like to see a pretty straightforward application of a parallel network application written in Python, check out ssh.py. It’s a small library which uses Python threading to parallelize many SSH connections. The design is close enough to your requirements that you may find it to be a good resource.
If you’re looking to get the best performance possible, you might want to consider using Asynchronous I/O rather than threads. The overhead associated with thousands of OS threads is non-trivial and the context switching within the Python interpreter adds even more on top of it. Threading will certainly get the job done but I suspect that an asynchronous route will provide better overall performance.
Specifically, I’d suggest the async web client in the Twisted library (http://www.twistedmatrix.com). It has an admittedly steep learning curve, but it is quite easy to use once you get a good handle on Twisted’s style of asynchronous programming.
A HowTo on Twisted’s asynchronous web client API is available at:
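Separately from Twisted, the same asynchronous idea can be sketched with the standard library's asyncio module. This is a swapped-in technique, not the Twisted API the answer recommends. The coroutine fake_head is a hypothetical placeholder for a real non-blocking HTTP HEAD request, and the limit of 100 concurrent requests is an assumption.

```python
import asyncio


async def fake_head(url):
    # Hypothetical placeholder for a real asynchronous HTTP HEAD request.
    await asyncio.sleep(0.01)
    return 200


async def fetch_status(url, limiter):
    async with limiter:                      # cap the number of requests in flight
        try:
            return url, await fake_head(url)
        except Exception as exc:             # report failures instead of crashing
            return url, str(exc)


async def check_all(urls, max_concurrent=100):
    limiter = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the input URLs
    return await asyncio.gather(*(fetch_status(u, limiter) for u in urls))


urls = ["http://example.com/%d" % i for i in range(10)]
statuses = asyncio.run(check_all(urls))
print(statuses[:3])
```

All requests share one thread and one event loop, so there is no per-request OS thread and no thread context switching.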
Using a thread pool is a good option, and will make this fairly easy. Unfortunately, python doesn’t have a standard library that makes thread pools ultra easy. But here is a decent library that should get you started:
http://www.chrisarndt.de/projects/threadpool/
Code example from their site:
pool = ThreadPool(poolsize)
requests = makeRequests(some_callable, list_of_args, callback)
[pool.putRequest(req) for req in requests]
pool.wait()
Create epoll object,
open many client TCP sockets,
adjust their send buffers to be a bit more than request header,
send a request header — it should be immediate, just placing into a buffer,
register socket in epoll object,
do .poll on epoll object,
read first 3 bytes from each socket from .poll,
write them to sys.stdout followed by \n (don’t flush),
close the client socket.
Limit number of sockets opened simultaneously — handle errors when sockets are created. Create a new socket only if another is closed.
Adjust OS limits.
Try forking into a few (not many) processes: this may help to use CPU a bit more effectively.
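A rough sketch of the readiness-polling part of the steps above, under several assumptions: it uses the portable selectors module (which picks epoll on Linux) rather than select.epoll directly, and it uses local socket.socketpair() connections with canned replies instead of real HTTP servers. The name read_prefixes is made up for illustration.

```python
import selectors
import socket


def read_prefixes(socks, nbytes=3, timeout=1.0):
    """Wait for sockets to become readable and grab the first bytes of each."""
    sel = selectors.DefaultSelector()        # epoll on Linux, kqueue on BSD/macOS
    for s in socks:
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ)
    prefixes = []
    remaining = len(socks)
    while remaining:
        events = sel.select(timeout)
        if not events:
            break                            # nothing became readable in time
        for key, _mask in events:
            s = key.fileobj
            # On a real network recv() may return fewer than nbytes bytes.
            prefixes.append(s.recv(nbytes))
            sel.unregister(s)
            s.close()
            remaining -= 1
    sel.close()
    return prefixes


# Demo: three local connections, each "server" end sends a canned reply.
pairs = [socket.socketpair() for _ in range(3)]
for i, (server_end, _client_end) in enumerate(pairs):
    server_end.sendall(("20%d OK\r\n" % i).encode("ascii"))
prefixes = read_prefixes([client_end for _server_end, client_end in pairs])
for server_end, _client_end in pairs:
    server_end.close()
print(sorted(prefixes))
```

The send-buffer tuning, the limit on simultaneously open sockets, and the OS limit adjustments from the list are left out of this sketch.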
For your case, threading will probably do the trick as you’ll probably be spending most time waiting for a response. There are helpful modules like Queue in the standard library that might help.
I did a similar thing with parallel downloading of files before and it was good enough for me, but it wasn’t on the scale you are talking about.
If your task was more CPU-bound, you might want to look at the multiprocessing module, which will allow you to utilize more CPUs/cores/threads (more processes that won’t block each other since the locking is per process)
Consider using Windmill, although Windmill probably can’t do that many threads.
You could do it with a hand rolled Python script on 5 machines, each one connecting outbound using ports 40000-60000, opening 100,000 port connections.
Also, it might help to do a sample test with a nicely threaded QA app such as OpenSTA in order to get an idea of how much each server can handle.
Also, try looking into just using simple Perl with the LWP::ConnCache class. You’ll probably get more performance (more connections) that way.
This twisted async web client goes pretty fast.
#!/usr/bin/python2.7
from twisted.internet import reactor
from twisted.internet.defer import Deferred, DeferredList, DeferredLock
from twisted.internet.defer import inlineCallbacks
from twisted.web.client import Agent, HTTPConnectionPool
from twisted.web.http_headers import Headers
from pprint import pprint
from collections import defaultdict
from urlparse import urlparse
from random import randrange
import fileinput
pool = HTTPConnectionPool(reactor)
pool.maxPersistentPerHost = 16
agent = Agent(reactor, pool)
locks = defaultdict(DeferredLock)
codes = {}
def getLock(url, simultaneous = 1):
    return locks[urlparse(url).netloc, randrange(simultaneous)]
@inlineCallbacks
def getMapping(url):
    # Limit ourselves to 4 simultaneous connections per host
    # Tweak this number, but it should be no larger than pool.maxPersistentPerHost
    lock = getLock(url,4)
    yield lock.acquire()
    try:
        resp = yield agent.request('HEAD', url)
        codes[url] = resp.code
    except Exception as e:
        codes[url] = str(e)
    finally:
        lock.release()
dl = DeferredList(getMapping(url.strip()) for url in fileinput.input())
dl.addCallback(lambda _: reactor.stop())
reactor.run()
pprint(codes)
The easiest way would be to use Python’s built-in threading library. The threads are not truly parallel: they have issues (like serialization of execution under the GIL), but they are good enough. You’d want a queue & thread pool. One option is here, but it’s trivial to write your own. You can’t parallelize all 100,000 calls, but you can fire off 100 (or so) of them at the same time.
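A hand-written queue plus thread pool might look something like the sketch below, written for Python 3 (the queue module) and not taken from the option linked above. The names run_pool and _STOP are invented for illustration, and the handler passed in stands for whatever does the real HTTP call.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a worker to exit


def run_pool(items, handler, num_workers=100):
    """Apply handler to every item using a fixed number of worker threads."""
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is _STOP:
                break
            try:
                results.put((item, handler(item)))
            except Exception as exc:          # keep the worker alive on errors
                results.put((item, exc))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(_STOP)                      # one sentinel per worker
    for t in threads:
        t.join()

    collected = []
    while not results.empty():                # safe: all producers have finished
        collected.append(results.get())
    return collected


squares = sorted(run_pool(range(5), lambda x: x * x, num_workers=3))
print(squares)
```

Only num_workers calls are ever in flight at once, which matches the advice to fire off about 100 at a time rather than all 100,000.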
I have not seen clear examples with use-cases for Pool.apply, Pool.apply_async and Pool.map. I am mainly using Pool.map; what are the advantages of others?
Back in the old days of Python, to call a function with arbitrary arguments, you would use apply:
apply(f,args,kwargs)
apply still exists in Python 2.7 though not in Python 3, and is generally not used anymore. Nowadays,
f(*args,**kwargs)
is preferred. The multiprocessing.Pool class tries to provide a similar interface.
Pool.apply is like Python apply, except that the function call is performed in a separate process. Pool.apply blocks until the function is completed.
Pool.apply_async is also like Python’s built-in apply, except that the call returns immediately instead of waiting for the result. An AsyncResult object is returned. You call its get() method to retrieve the result of the function call. The get() method blocks until the function is completed. Thus, pool.apply(func, args, kwargs) is equivalent to pool.apply_async(func, args, kwargs).get().
In contrast to Pool.apply, the Pool.apply_async method also has a callback which, if supplied, is called when the function is complete. This can be used instead of calling get().
For example:
import multiprocessing as mp
import time
def foo_pool(x):
    time.sleep(2)
    return x*x
result_list = []
def log_result(result):
    # This is called whenever foo_pool(i) returns a result.
    # result_list is modified only by the main process, not the pool workers.
    result_list.append(result)
def apply_async_with_callback():
    pool = mp.Pool()
    for i in range(10):
        pool.apply_async(foo_pool, args = (i, ), callback = log_result)
    pool.close()
    pool.join()
    print(result_list)
if __name__ == '__main__':
    apply_async_with_callback()
may yield a result such as
[1, 0, 4, 9, 25, 16, 49, 36, 81, 64]
Notice, unlike pool.map, the order of the results may not correspond to the order in which the pool.apply_async calls were made.
So, if you need to run a function in a separate process, but want the current process to block until that function returns, use Pool.apply. Like Pool.apply, Pool.map blocks until the complete result is returned.
If you want the Pool of worker processes to perform many function calls asynchronously, use Pool.apply_async. The order of the results is not guaranteed to be the same as the order of the calls to Pool.apply_async.
Notice also that you could call a number of different functions with Pool.apply_async (not all calls need to use the same function).
In contrast, Pool.map applies the same function to many arguments.
However, unlike Pool.apply_async, the results are returned in an order corresponding to the order of the arguments.
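As a compact illustration of the points above, the following sketch uses multiprocessing.pool.ThreadPool, which exposes the same apply, apply_async and map methods as Pool but runs the calls in threads. That substitution is an assumption made so the snippet runs without a `__main__` guard. The helper functions square and negate are invented for the example.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


def negate(x):
    return -x


with ThreadPool(4) as pool:
    # apply blocks until the single call has finished
    blocking = pool.apply(square, (3,))
    # apply_async returns immediately; get() then blocks for the result
    via_async = pool.apply_async(square, (3,)).get()
    # different functions can be submitted to the same pool
    handles = [pool.apply_async(f, (5,)) for f in (square, negate)]
    mixed = [h.get() for h in handles]
    # map applies one function to many arguments and keeps argument order
    mapped = pool.map(square, [3, 1, 2])

print(blocking, via_async, mixed, mapped)
```

Collecting the AsyncResult handles and calling get() on them in submission order gives ordered results; it is the callback invocation order that is not guaranteed.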
pool.apply(f, args): f is only executed in ONE of the workers of the pool. So ONE of the processes in the pool will run f(args).
pool.map(f, iterable): This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. So you take advantage of all the processes in the pool.
Here is an overview in a table format in order to show the differences between Pool.apply, Pool.apply_async, Pool.map and Pool.map_async. When choosing one, you have to take multi-args, concurrency, blocking, and ordering into account:
                   | Multi-args   Concurrency   Blocking   Ordered-results
---------------------------------------------------------------------------
Pool.map           | no           yes           yes        yes
Pool.map_async     | no           yes           no         yes
Pool.apply         | yes          no            yes        no
Pool.apply_async   | yes          yes           no         no
Pool.starmap       | yes          yes           yes        yes
Pool.starmap_async | yes          yes           no         no
Notes:
Pool.imap and Pool.imap_unordered – lazier versions of map; imap_unordered yields results in arbitrary order as they complete.
The Pool.starmap method is very similar to the map method, except that it accepts multiple arguments.
Async methods submit all the tasks at once and retrieve the results once they are finished. Use the get method to obtain the results.
The Pool.map (or Pool.apply) methods are very similar to Python's built-in map (or apply). They block the main process until all the tasks complete and return the result.
Examples:
map
Is called for a list of jobs at one time
results = pool.map(func, [1, 2, 3])
apply
Can only be called for one job
for x, y in [[1, 1], [2, 2]]:
    results.append(pool.apply(func, (x, y)))
apply_async
def collect_result(result):
    results.append(result)
Can only be called for one job and executes a job in the background in parallel
for x, y in [[1, 1], [2, 2]]:
    pool.apply_async(worker, (x, y), callback=collect_result)
starmap
Is a variant of pool.map which supports multiple arguments
pool.starmap(func, [(1, 1), (2, 1), (3, 1)])
starmap_async
A combination of starmap() and map_async() that iterates over an iterable of iterables and calls func with the iterables unpacked. Returns a result object.
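The starmap_async entry has no example, so here is a small sketch. It assumes Python 3.3+ and uses multiprocessing.pool.ThreadPool (same method names as Pool, but thread-based) so that it runs as a plain script. The function power is invented for illustration.

```python
from multiprocessing.pool import ThreadPool


def power(base, exponent):
    return base ** exponent


with ThreadPool(2) as pool:
    # Returns an AsyncResult immediately; each tuple is unpacked into arguments
    async_result = pool.starmap_async(power, [(1, 1), (2, 1), (3, 2)])
    # get() blocks until every call has finished; results keep input order
    values = async_result.get(timeout=10)

print(values)
```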
I am trying to understand threading in Python. I’ve looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I’m having trouble understanding them.
How do you clearly show tasks being divided for multi-threading?
Timing results for the pool.map URL-fetching example:
Single thread: 14.4 seconds
4 Pool: 3.1 seconds
8 Pool: 1.4 seconds
13 Pool: 1.3 seconds
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)
Which is the multithreaded version of:
results = []
for item in my_array:
    results.append(my_function(item))
Description
Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.
Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.
Implementation
Parallel versions of the map function are provided by two libraries: multiprocessing, and also its little known, but equally fantastic step child: multiprocessing.dummy.
multiprocessing.dummy is exactly the same as the multiprocessing module, but uses threads instead (an important distinction: use multiple processes for CPU-intensive tasks, and threads for (and during) I/O):
multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.
import urllib2
from multiprocessing.dummy import Pool as ThreadPool
urls = [
    'http://www.python.org',
    'http://www.python.org/about/',
    'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
    'http://www.python.org/doc/',
    'http://www.python.org/download/',
    'http://www.python.org/getit/',
    'http://www.python.org/community/',
    'https://wiki.python.org/moin/',
]
# Make the Pool of workers
pool = ThreadPool(4)
# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)
# Close the pool and wait for the work to finish
pool.close()
pool.join()
Here’s a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.
import Queue
import threading
import urllib2
# Called by each thread
def get_url(q, url):
    q.put(urllib2.urlopen(url).read())
theurls = ["http://google.com", "http://yahoo.com"]
q = Queue.Queue()
for u in theurls:
    t = threading.Thread(target=get_url, args = (q,u))
    t.daemon = True
    t.start()
s = q.get()
print s
This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, in order to put its contents on the queue; each thread is a daemon (won’t keep the process up if main thread ends — that’s more common than not); the main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the results and terminates (which takes down any subthreads that might still be running, since they’re daemon threads).
Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn’t use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there’s a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work’s results, by the way, and they’re intrinsically threadsafe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).
However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let’s consider the problem of summing a large range by summing subranges in parallel:
import threading
class SummingThread(threading.Thread):
    def __init__(self,low,high):
        super(SummingThread, self).__init__()
        self.low=low
        self.high=high
        self.total=0
    def run(self):
        for i in range(self.low,self.high):
            self.total+=i
thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result
Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially albeit interleaved (with the added overhead of context switching) in CPython due to the global interpreter lock.
Like others mentioned, CPython can use threads only for I/O waits due to GIL.
If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing:
from multiprocessing import Process
def f(name):
    print 'hello', name
if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
Just a note: A queue is not required for threading.
This is the simplest example I could imagine that shows several threads running concurrently (the loop below starts nine).
import threading
from random import randint
from time import sleep
def print_number(number):
    # Sleeps a random 1 to 10 seconds
    rand_int_var = randint(1, 10)
    sleep(rand_int_var)
    print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"
thread_list = []
for i in range(1, 10):
    # Instantiates the thread
    # (i) does not make a sequence, so (i,)
    t = threading.Thread(target=print_number, args=(i,))
    # Sticks the thread in a list so that it remains accessible
    thread_list.append(t)
# Starts threads
for thread in thread_list:
    thread.start()
# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
    thread.join()
# Demonstrates that the main process waited for threads to complete
print "Done"
The answer from Alex Martelli helped me. However, here is a modified version that I thought was more useful (at least to me).
Updated: works in both Python 2 and Python 3
try:
    # For Python 3
    import queue
    from urllib.request import urlopen
except ImportError:
    # For Python 2
    import Queue as queue
    from urllib2 import urlopen
import threading
worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']
# Load up a queue with your data. This will handle locking
q = queue.Queue()
for url in worker_data:
    q.put(url)
# Define a worker function
def worker(url_queue):
    queue_full = True
    while queue_full:
        try:
            # Get your data off the queue, and do some work
            url = url_queue.get(False)
            data = urlopen(url).read()
            print(len(data))
        except queue.Empty:
            queue_full = False
# Create as many threads as you want
thread_count = 5
for i in range(thread_count):
    t = threading.Thread(target=worker, args=(q,))
    t.start()
I found this very useful: create as many threads as cores and let them execute a (large) number of tasks (in this case, calling a shell program):
import queue  # This module is named Queue in Python 2
import threading
import multiprocessing
import subprocess
q = queue.Queue()
for i in range(30):  # Put 30 tasks in the queue
    q.put(i)
def worker():
    while True:
        item = q.get()
        # Execute a task: call a shell program and wait until it completes
        subprocess.call("echo " + str(item), shell=True)
        q.task_done()
cpus = multiprocessing.cpu_count()  # Detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
q.join()  # Block until all tasks are done
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
import concurrent.futures
import math
PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]
def is_prime(n):
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True
def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))
if __name__ == '__main__':
    main()
def sqr(val):
    import time
    time.sleep(0.1)
    return val * val
def process_result(result):
    print(result)
def process_these_asap(tasks):
    import concurrent.futures
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = []
        for task in tasks:
            futures.append(executor.submit(sqr, task))
        for future in concurrent.futures.as_completed(futures):
            process_result(future.result())
        # Or instead of all this just do:
        # results = executor.map(sqr, tasks)
        # list(map(process_result, results))
def main():
    tasks = list(range(10))
    print('Processing {} tasks'.format(len(tasks)))
    process_these_asap(tasks)
    print('Done')
    return 0
if __name__ == '__main__':
    import sys
    sys.exit(main())
The executor approach might seem familiar to all those who have gotten their hands dirty with Java before.
Also, on a side note: to keep the universe sane, don't forget to shut down your pools/executors if you don't use a with block (which is so awesome that it does it for you).
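As a minimal sketch of that side note, here is one way the cleanup could look when a with block is not used. The helper names square and squares_without_with are made up for illustration; shutdown(wait=True) is the standard executor method that a with block calls for you on exit.

```python
from concurrent.futures import ThreadPoolExecutor

def square(value):
    # Hypothetical stand-in for real work
    return value * value

def squares_without_with(values):
    executor = ThreadPoolExecutor(max_workers=4)
    try:
        futures = [executor.submit(square, v) for v in values]
        # Collect results in submission order
        return [f.result() for f in futures]
    finally:
        # Without a with block, release the worker threads explicitly,
        # even if one of the tasks raised an exception
        executor.shutdown(wait=True)

if __name__ == '__main__':
    print(squares_without_with([1, 2, 3]))
```

The try/finally pair is what the with statement would otherwise give you for free.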
Most documentation and tutorials use Python's threading and queue modules, which can seem overwhelming for beginners.
Perhaps consider the concurrent.futures.ThreadPoolExecutor class of Python 3.
Combined with a with clause and a comprehension, it can be a real charm.
from concurrent.futures import ThreadPoolExecutor, as_completed
def get_url(url):
    # Your actual program here. Using threading.Lock() if necessary
    return ""
# List of URLs to fetch
urls = ["url1", "url2"]
with ThreadPoolExecutor(max_workers=5) as executor:
    # Submit one task per URL; the pool creates the threads
    futures = {executor.submit(get_url, url) for url in urls}
    # as_completed() yields each future as soon as it has finished
    for f in as_completed(futures):
        # Get the results
        rs = f.result()
I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.5 million. I have used all four methods here:
import math
import timeit
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def time_stuff(fn):
    """
    Measure time of execution of a function
    """
    def wrapper(*args, **kwargs):
        t0 = timeit.default_timer()
        fn(*args, **kwargs)
        t1 = timeit.default_timer()
        print("{} seconds".format(t1 - t0))
    return wrapper
def find_primes_in(nmin, nmax):
    """
    Compute a list of prime numbers between the given minimum and maximum arguments
    """
    primes = []
    # Loop from minimum to maximum
    for current in range(nmin, nmax + 1):
        # Take the square root of the current number
        sqrt_n = int(math.sqrt(current))
        found = False
        # Check if any number from 2 to the square root divides the current number under consideration
        for number in range(2, sqrt_n + 1):
            # If divisible we have found a factor, hence this is not a prime number, let's move to the next one
            if current % number == 0:
                found = True
                break
        # If not divisible, add this number to the list of primes that we have found so far
        if not found:
            primes.append(current)
    # I am merely printing the length of the array containing all the primes, but feel free to do what you want
    print(len(primes))
@time_stuff
def sequential_prime_finder(nmin, nmax):
    """
    Use the main process and main thread to compute everything in this case
    """
    find_primes_in(nmin, nmax)
@time_stuff
def threading_prime_finder(nmin, nmax):
    """
    If the minimum is 1000 and the maximum is 2000 and we have four workers,
    1000 - 1250 to worker 1
    1250 - 1500 to worker 2
    1500 - 1750 to worker 3
    1750 - 2000 to worker 4
    so let's split the minimum and maximum values according to the number of workers
    """
    nrange = nmax - nmin
    threads = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)
        # Start the thread with the minimum and maximum split up to compute
        # Parallel computation will not work here due to the GIL since this is a CPU-bound task
        t = threading.Thread(target=find_primes_in, args=(start, end))
        threads.append(t)
        t.start()
    # Don't forget to wait for the threads to finish
    for t in threads:
        t.join()
@time_stuff
def processing_prime_finder(nmin, nmax):
    """
    Split the minimum, maximum interval similar to the threading method above, but use processes this time
    """
    nrange = nmax - nmin
    processes = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)
        p = multiprocessing.Process(target=find_primes_in, args=(start, end))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
@time_stuff
def thread_executor_prime_finder(nmin, nmax):
    """
    Split the min max interval similar to the threading method, but use a thread pool executor this time.
    The pool manages the threads for us, but this method is still slow due to the GIL
    limitations since we are doing a CPU-bound task.
    """
    nrange = nmax - nmin
    with ThreadPoolExecutor(max_workers=8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)
@time_stuff
def process_executor_prime_finder(nmin, nmax):
    """
    Split the min max interval similar to the threading method, but use the process pool executor.
    This is the fastest method recorded so far as it manages processes efficiently + overcomes GIL limitations.
    RECOMMENDED METHOD FOR CPU-BOUND TASKS
    """
    nrange = nmax - nmin
    with ProcessPoolExecutor(max_workers=8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)
def main():
    nmin = int(1e7)
    nmax = int(1.05e7)
    print("Sequential Prime Finder Starting")
    sequential_prime_finder(nmin, nmax)
    print("Threading Prime Finder Starting")
    threading_prime_finder(nmin, nmax)
    print("Processing Prime Finder Starting")
    processing_prime_finder(nmin, nmax)
    print("Thread Executor Prime Finder Starting")
    thread_executor_prime_finder(nmin, nmax)
    print("Process Executor Finder Starting")
    process_executor_prime_finder(nmin, nmax)
# The guard keeps child processes from re-running main() on platforms that spawn a fresh interpreter
if __name__ == '__main__':
    main()
Here are the results on my four-core Mac OS X machine:
Sequential Prime Finder Starting
9.708213827005238 seconds
Threading Prime Finder Starting
9.81836523200036 seconds
Processing Prime Finder Starting
3.2467174359990167 seconds
Thread Executor Prime Finder Starting
10.228896902000997 seconds
Process Executor Finder Starting
2.656402041000547 seconds
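The range splitting described in the docstrings above can also be factored into a small helper. This is only a sketch and not part of the benchmark: split_range is a hypothetical name, and it uses non-overlapping inclusive chunks so that a boundary value such as 1250 is not handed to two workers, which the original start/end arithmetic does.

```python
def split_range(nmin, nmax, parts):
    """Split the inclusive range [nmin, nmax] into up to `parts` contiguous chunks."""
    total = nmax - nmin + 1
    chunks = []
    start = nmin
    for i in range(parts):
        # Spread the remainder over the first chunks so sizes differ by at most one
        size = total // parts + (1 if i < total % parts else 0)
        if size == 0:
            continue  # More workers than numbers: skip empty chunks
        chunks.append((start, start + size - 1))
        start += size
    return chunks

if __name__ == '__main__':
    for chunk in split_range(1000, 2000, 4):
        print(chunk)
```

Each (start, end) pair could then be passed to find_primes_in, which already treats both ends as inclusive.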
Here is a very simple example of a CSV import using threading. (The imports you need may differ depending on your project.)
Helper Functions:
from threading import Thread
from project import app
import csv
def import_handler(csv_file_name):
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()
def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as csv_file:
            reader = csv.DictReader(csv_file)
            for row in reader:
                # DB operation/query
                pass
I would like to contribute with a simple example and the explanations I’ve found useful when I had to tackle this problem myself.
In this answer you will find some information about Python’s GIL (global interpreter lock) and a simple day-to-day example written using multiprocessing.dummy plus some simple benchmarks.
Global Interpreter Lock (GIL)
Python doesn’t allow multi-threading in the truest sense of the word. It has a multi-threading package, but if you want to multi-thread to speed your code up, then it’s usually not a good idea to use it.
Python has a construct called the global interpreter lock (GIL).
The GIL makes sure that only one of your ‘threads’ can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL onto the next thread.
This happens very quickly, so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns executing Python code.
All this GIL passing adds overhead to execution. This means that if you want to make your code run faster, then using the threading package often isn't a good idea.
There are reasons to use Python's threading package. If you want to run some things simultaneously, and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (like some I/O) then it could make a lot of sense. But the threading library won't let you use extra CPU cores.
Multi-threading can be outsourced to the operating system (by doing multi-processing), to some external application that calls your Python code (for example, Spark or Hadoop), or to some code that your Python code calls (for example, you could have your Python code call a C function that does the expensive multi-threaded stuff).
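To make the "waiting for I/O" case concrete, here is a small sketch that compares waiting sequentially with waiting in threads. The function names are made up for illustration, and time.sleep is only a stand-in for a blocking I/O call (it releases the GIL while it waits, as real blocking I/O does).

```python
import threading
import time

def fake_io(seconds):
    # Stand-in for a blocking I/O call; the GIL is released while sleeping
    time.sleep(seconds)

def run_sequential(count, seconds):
    start = time.perf_counter()
    for _ in range(count):
        fake_io(seconds)
    return time.perf_counter() - start

def run_threaded(count, seconds):
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(seconds,)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == '__main__':
    print("sequential: {:.2f}s".format(run_sequential(4, 0.2)))
    print("threaded:   {:.2f}s".format(run_threaded(4, 0.2)))
```

The threaded run should take roughly as long as one wait rather than four, because the waits overlap. Replacing fake_io with a pure-Python computation would be expected to remove that advantage.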
Why This Matters
Because lots of people spend a lot of time trying to find bottlenecks in their fancy Python multi-threaded code before they learn what the GIL is.
Once this information is clear, here’s my code:
#!/usr/bin/env python
from multiprocessing.dummy import Pool
from subprocess import PIPE, Popen
import time
import os
# In the variable pool_size we define the "parallelness".
# For CPU-bound tasks, it doesn't make sense to create more pool workers
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create quite a few more pool workers than cores, since the workers
# will probably spend most of their time blocked (waiting for I/O to complete).
pool_size = 8
def do_ping(ip):
    if os.name == 'nt':
        print("Using Windows Ping to " + ip)
        proc = Popen(['ping', ip], stdout=PIPE)
        return proc.communicate()[0]
    else:
        print("Using Linux / Unix Ping to " + ip)
        proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
        return proc.communicate()[0]
os.system('cls' if os.name == 'nt' else 'clear')
print("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com", "www.facebook.com", "www.pinterest.com", "www.microsoft.com"]
result = {}
for website_name in website_names:
    result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Now we do the same without threading, just to compare time
print("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
    do_ping(website_name)
print("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
    output[key] = value.get()
print("\nOutput aggregated in a Dictionary:")
print(output)
print("\n")
print("\nPretty printed output: ")
for key, value in output.items():
    print(key + "\n")
    print(value)
Here is a simple multithreading example that should be helpful. You can run it and easily see how multithreading works in Python. I used a lock to keep other threads from proceeding until earlier threads have finished their work. With this line of code,
tLock = threading.BoundedSemaphore(value=4)
you can allow a fixed number of threads to run at a time and hold back the rest, which run later, once earlier threads have finished.
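The semaphore line above could be used along these lines. This is a sketch under my own assumptions, not the answerer's original program: run_limited, the task body, and the peak counter are all made up for illustration. Only threading.BoundedSemaphore, threading.Lock, and threading.Thread are real library calls.

```python
import threading
import time

def run_limited(num_tasks, limit, work_seconds=0.05):
    """Start num_tasks threads but let at most `limit` of them work at once."""
    slots = threading.BoundedSemaphore(value=limit)
    state_lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def task():
        # Blocks here while `limit` other threads already hold a slot
        with slots:
            with state_lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            time.sleep(work_seconds)  # Stand-in for the real work
            with state_lock:
                state["running"] -= 1

    threads = [threading.Thread(target=task) for _ in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Highest number of threads that were ever inside the guarded section
    return state["peak"]

if __name__ == '__main__':
    print("Peak concurrency:", run_limited(10, 4))
```

Using the semaphore as a context manager releases the slot even if the task raises, which is easy to forget with explicit acquire() and release() calls.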
Borrowing from this post, we know how to choose between multithreading, multiprocessing, and async/asyncio, and when to use each.
Python 3 has a built-in library for concurrency and parallelism: concurrent.futures
So I'll demonstrate with an experiment that runs four tasks (calls to sleep()) in a thread pool:
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time
def concurrent(max_worker=1):
    futures = []
    tick = time()
    with ThreadPoolExecutor(max_workers=max_worker) as executor:
        futures.append(executor.submit(sleep, 2))  # Two seconds sleep
        futures.append(executor.submit(sleep, 1))
        futures.append(executor.submit(sleep, 7))
        futures.append(executor.submit(sleep, 3))
        for future in as_completed(futures):
            if future.result() is not None:
                print(future.result())
    print('Total elapsed time by {} workers:'.format(max_worker), time()-tick)
concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)
Output:
Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507
[NOTE]:
As you can see in the above results, three workers were already enough for those four tasks: the total time is bounded by the longest task (7 seconds), so a fourth or fifth worker gains nothing.
If you have a CPU-bound task instead of an I/O-bound or blocking one (multiprocessing vs threading), you could change ThreadPoolExecutor to ProcessPoolExecutor.
None of the previous solutions actually used multiple cores on my GNU/Linux server (where I don’t have administrator rights). They just ran on a single core.
I used the lower-level os.fork interface to spawn multiple processes. This is the code that worked for me:
from os import fork
values = ['different', 'values', 'for', 'threads']
for i in range(len(values)):
    p = fork()
    if p == 0:
        # In the child process: do the work, then leave the loop so the child does not fork again
        my_function(values[i])
        break
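One thing the snippet above leaves open is that the parent never waits for its children, and each child carries on with the rest of the script after the loop. A possible variation, sketched here under my own assumptions (run_in_children and the example my_function body are hypothetical, and this only works on Unix-like systems where os.fork exists), exits each child explicitly and collects the exit codes in the parent:

```python
import os

def run_in_children(func, values):
    """Fork one child per value, run func(value) in it, and return the exit codes."""
    pids = []
    for value in values:
        pid = os.fork()
        if pid == 0:
            # Child process: do the work, then exit without running the parent's code
            try:
                func(value)
                os._exit(0)
            except BaseException:
                os._exit(1)
        pids.append(pid)
    exit_codes = []
    for pid in pids:
        # Parent process: wait for each child so none is left as a zombie
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes

def my_function(value):
    # Hypothetical stand-in for the real work. os.write bypasses Python's
    # buffered stdout, which os._exit would not flush.
    os.write(1, ("processing " + value + "\n").encode())

if __name__ == '__main__':
    print(run_in_children(my_function, ['different', 'values', 'for', 'threads']))
```

An exit code of 0 means the child finished normally, and 1 means func raised an exception in that child.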
Answer 18
import threading
import requests
def send():
    r = requests.get('https://www.stackoverlow.com')
thread = []
# Pass the function itself; target=send() would call it immediately in the main thread
t = threading.Thread(target=send)
thread.append(t)
t.start()