I have a multi-threading Python program, and a utility function, writeLog(message), that writes out a timestamp followed by the message. Unfortunately, the resultant log file gives no indication of which thread is generating which message.
I would like writeLog() to be able to add something to the message to identify which thread is calling it. Obviously I could just make the threads pass this information in, but that would be a lot more work. Is there some thread equivalent of os.getpid() that I could use?
Using the logging module, you can automatically add the current thread identifier to each log entry. Just use one of these LogRecord mapping keys in your logger format string: %(thread)d for the thread ID, or %(threadName)s for the thread name.
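As a minimal sketch of that setup (the format string and the thread name are arbitrary choices for illustration):
import logging
import threading

logging.basicConfig(
    format='%(asctime)s [%(threadName)s] %(message)s',
    level=logging.INFO,
)

def writeLog(message):
    # the logging module fills in %(threadName)s for the calling thread automatically
    logging.info(message)

threading.Thread(target=writeLog, args=('hello from a worker',), name='worker-1').start()
writeLog('hello from the main thread')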
The thread.get_ident() function returns a long integer on Linux. It’s not really a thread id.
I use this method to really get the thread id on Linux:
import ctypes
libc = ctypes.cdll.LoadLibrary('libc.so.6')
# System dependent, see e.g. /usr/include/x86_64-linux-gnu/asm/unistd_64.h
SYS_gettid = 186
def getThreadId():
"""Returns OS thread id - Specific to Linux"""
return libc.syscall(SYS_gettid)
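A quick way to see it in action, reusing getThreadId() from the snippet above (the worker count and print format are arbitrary):
import threading

def work():
    # each OS thread should report a distinct kernel-level id
    print('%s has OS tid %d' % (threading.current_thread().name, getThreadId()))

for _ in range(3):
    threading.Thread(target=work).start()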
A thread has a name.
The name can be passed to the constructor,
and read or changed through the name attribute.
...
Thread.name
A string used for identification purposes only.
It has no semantics. Multiple threads may
be given the same name. The initial name is set by the constructor.
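So writeLog() could look the name up itself; a minimal sketch (writeLog is the function name from the question, the timestamp format is arbitrary):
import threading
import time

def writeLog(message):
    # current_thread().name identifies the caller without the threads passing anything in
    name = threading.current_thread().name
    print('%s [%s] %s' % (time.strftime('%H:%M:%S'), name, message))

writeLog('works from any thread')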
Similarly to @brucexin, I needed to get the OS-level thread identifier (which != thread.get_ident()) and to use something like the below, so as not to depend on particular syscall numbers or be amd64-only:
---- 8< ---- (xos.pyx)
"""module xos complements standard module os"""
cdef extern from "<sys/syscall.h>":
long syscall(long number, ...)
const int SYS_gettid
# gettid returns current OS thread identifier.
def gettid():
    return syscall(SYS_gettid)
and (test.py):
---- 8< ---- (test.py)
import pyximport; pyximport.install()
import xos
...
print 'my tid: %d' % xos.gettid()
I notice that it is often suggested to use queues with multiple threads, instead of lists and .pop(). Is this because lists are not thread-safe, or for some other reason?
Lists themselves are thread-safe. In CPython the GIL protects against concurrent accesses to them, and other implementations take care to use a fine-grained lock or a synchronized datatype for their list implementations. However, while lists themselves won't become corrupted by concurrent access, the list's data is not protected. For example:
L[0] += 1
is not guaranteed to actually increase L[0] by one if another thread does the same thing, because += is not an atomic operation. (Very, very few operations in Python are actually atomic, because most of them can cause arbitrary Python code to be called.) You should use Queues because if you just use an unprotected list, you may get or delete the wrong item because of race conditions.
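A minimal sketch of protecting that increment with a lock (the thread count is arbitrary):
import threading

L = [0]
lock = threading.Lock()

def safe_increment():
    # the lock makes the read-modify-write of += effectively atomic
    with lock:
        L[0] += 1

threads = [threading.Thread(target=safe_increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(L[0])  # reliably 100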
To clarify a point in Thomas' excellent answer, it should be mentioned that append() is thread-safe.
This is because there is no concern that data being read will be in the same place once we go to write to it. The append() operation does not read data, it only writes data to the list.
I recently had this case where I needed to append to a list continuously in one thread, loop through the items in another thread, check whether each item was ready (an AsyncResult in my case), and remove it from the list only if it was ready.
I could not find any examples that demonstrated my problem clearly.
Here is an example demonstrating adding to a list in one thread continuously and removing from the same list in another thread continuously.
The flawed version runs fine on small numbers, but keep the numbers big enough, run it a few times, and you will see the error.
The FLAWED version
import threading
import time
# Change this number as you please, bigger numbers will get the error quickly
count = 1000
l = []
def add():
for i in range(count):
l.append(i)
time.sleep(0.0001)
def remove():
for i in range(count):
l.remove(i)
time.sleep(0.0001)
t1 = threading.Thread(target=add)
t2 = threading.Thread(target=remove)
t1.start()
t2.start()
t1.join()
t2.join()
print(l)
Output when ERROR
Exception in thread Thread-63:
Traceback (most recent call last):
File "/Users/zup/.pyenv/versions/3.6.8/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/Users/zup/.pyenv/versions/3.6.8/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "<ipython-input-30-ecfbac1c776f>", line 13, in remove
l.remove(i)
ValueError: list.remove(x): x not in list
Version that uses locks
import threading
import time
count = 1000
l = []
lock = threading.RLock()
def add():
with lock:
for i in range(count):
l.append(i)
time.sleep(0.0001)
def remove():
with lock:
for i in range(count):
l.remove(i)
time.sleep(0.0001)
t1 = threading.Thread(target=add)
t2 = threading.Thread(target=remove)
t1.start()
t2.start()
t1.join()
t2.join()
print(l)
Output
[] # Empty list
Conclusion
As mentioned in the earlier answers, while the act of appending or popping elements from the list itself is thread-safe, what is not thread-safe is when you append in one thread and pop in another.
A somewhat clumsy ascii-art to demonstrate the mechanism:
The join() is presumably called by the main-thread. It could also be called by another thread, but would needlessly complicate the diagram.
join-calling should be placed in the track of the main-thread, but to express thread-relation and keep it as simple as possible, I chose to place it in the child-thread instead.
without join:
+---+---+------------------ main-thread
| |
| +........... child-thread(short)
+.................................. child-thread(long)
with join:
+---+---+------------------***********+### main-thread
| | |
| +...........join() | child-thread(short)
+......................join()...... child-thread(long)
with join and daemon thread:
+-+--+---+------------------***********+### parent-thread
| | | |
| | +...........join() | child-thread(short)
| +......................join()...... child-thread(long)
+,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, child-thread(long + daemonized)
'-' main-thread/parent-thread/main-program execution
'.' child-thread execution
'#' optional parent-thread execution after join()-blocked parent-thread could
continue
'*' main-thread 'sleeping' in join-method, waiting for child-thread to finish
',' daemonized thread - 'ignores' lifetime of other threads;
terminates when main-programs exits; is normally meant for
join-independent tasks
So the reason you don’t see any changes is because your main-thread does nothing after your join.
You could say join is (only) relevant for the execution-flow of the main-thread.
If, for example, you want to concurrently download a bunch of pages to concatenate them into a single large page, you may start concurrent downloads using threads, but need to wait until the last page/thread is finished before you start assembling a single page out of many. That’s when you use join().
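A bare-bones sketch of that shape (the URL list and the fetch body are placeholders, not a real download):
import threading

pages = {}

def fetch(url):
    pages[url] = '<html>...</html>'  # placeholder for a real download

threads = [threading.Thread(target=fetch, args=(u,)) for u in ('url1', 'url2')]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until the last page/thread is finished
combined = ''.join(pages.values())  # only now is it safe to assemble the single page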
join([timeout])
Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs.
This means that the main thread, which spawns t and d, waits for t to finish before it itself finishes.
Depending on the logic your program employs, you may want to wait until a thread finishes before your main thread continues.
Also from the docs:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.
Here the master thread explicitly waits for the t thread to finish before it calls print the second time.
Alternatively if we had this:
print 'Test one'
print 'Test two'
t.join()
We’ll get this output:
Test one
Test two
Test non-daemon
Here we do our job in the main thread and then we wait for the t thread to finish. In this case we might even remove the explicit joining t.join() and the program will implicitly wait for t to finish.
Thanks for this topic – it has helped me a lot too.
I learned something about .join() today.
These threads run in parallel:
d.start()
t.start()
d.join()
t.join()
And these run sequentially (not what I wanted):
d.start()
d.join()
t.start()
t.join()
In particular, I was trying to be clever and tidy:
class Kiki(threading.Thread):
def __init__(self, time):
super(Kiki, self).__init__()
self.time = time
self.start()
self.join()
This works! But it runs sequentially. I can put the self.start() in __init__, but not the self.join(). That has to be done after every thread has been started.
join() is what causes the main thread to wait for your thread to finish. Otherwise, your thread runs all by itself.
So one way to think of join() is as a "hold" on the main thread – it sort of de-threads your thread and executes it sequentially in the main thread, before the main thread can continue. It assures that your thread is complete before the main thread moves forward. Note that this means it's also ok if your thread is already finished before you call the join() – the main thread is simply released immediately when join() is called.
In fact, it just now occurs to me that the main thread waits at d.join() until thread d finishes before it moves on to t.join().
In fact, to be very clear, consider this code:
import threading
import time
class Kiki(threading.Thread):
def __init__(self, time):
super(Kiki, self).__init__()
self.time = time
self.start()
def run(self):
print self.time, " seconds start!"
for i in range(0,self.time):
time.sleep(1)
print "1 sec of ", self.time
print self.time, " seconds finished!"
t1 = Kiki(3)
t2 = Kiki(2)
t3 = Kiki(1)
t1.join()
print "t1.join() finished"
t2.join()
print "t2.join() finished"
t3.join()
print "t3.join() finished"
It produces this output (note how the print statements are threaded into each other.)
$ python test_thread.py
32 seconds start! seconds start!1
seconds start!
1 sec of 1
1 sec of 1 seconds finished!
21 sec of
3
1 sec of 3
1 sec of 2
2 seconds finished!
1 sec of 3
3 seconds finished!
t1.join() finished
t2.join() finished
t3.join() finished
$
The t1.join() is holding up the main thread. All three threads complete before the t1.join() finishes and the main thread moves on to execute the print then t2.join() then print then t3.join() then print.
Corrections welcome. I’m also new to threading.
(Note: in case you’re interested, I’m writing code for a DrinkBot, and I need threading to run the ingredient pumps concurrently rather than sequentially — less time to wait for each drink.)
With join – the interpreter will wait until your thread gets completed or terminated
>>> from threading import Thread
>>> import time
>>> def sam():
... print 'started'
... time.sleep(10)
... print 'waiting for 10sec'
...
>>> t = Thread(target=sam)
>>> t.start()
started
>>> t.join() # with join interpreter will wait until your process get completed or terminated
done? # this line printed after thread execution stopped i.e after 10sec
waiting for 10sec
>>> done?
Without join – the interpreter won't wait until the thread gets terminated,
>>> t = Thread(target=sam)
>>> t.start()
started
>>> print 'yes done' #without join interpreter wont wait until process get terminated
yes done
>>> waiting for 10sec
When calling join(t) for both a non-daemon thread and a daemon thread, the main thread (or main process) waits t seconds, and can then go further to work on its own. During the t seconds of waiting, both of the child threads should do what they can, such as printing out some text. After the t seconds, if the non-daemon thread still didn't finish its job, it can still finish it after the main process finishes its own; but the daemon thread just missed its opportunity window and will eventually die when the Python program exits. Please correct me if there is something wrong.
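A minimal sketch of that timing (the sleep durations and the 1-second join timeouts are arbitrary):
import threading
import time

def worker(label, delay):
    time.sleep(delay)
    print(label + ' finished')

t = threading.Thread(target=worker, args=('non-daemon', 2))
d = threading.Thread(target=worker, args=('daemon', 5))
d.daemon = True
t.start()
d.start()
t.join(1)  # wait at most 1 second for each child
d.join(1)
print('main done')  # printed after about 2 seconds of waiting
# The non-daemon thread still gets to print afterwards; the daemon is killed
# at interpreter exit and never reaches its print.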
In Python 3.x, join() is used to join a thread with the calling thread (usually the main thread), i.e. when join() is called on a particular thread, the calling thread will stop executing until the execution of the joined thread is complete.
#1 - Without Join():
import threading
import time
def loiter():
print('You are loitering!')
time.sleep(5)
print('You are not loitering anymore!')
t1 = threading.Thread(target = loiter)
t1.start()
print('Hey, I do not want to loiter!')
'''
Output without join()-->
You are loitering!
Hey, I do not want to loiter!
You are not loitering anymore! #After 5 seconds --> This statement will be printed
'''
#2 - With Join():
import threading
import time
def loiter():
print('You are loitering!')
time.sleep(5)
print('You are not loitering anymore!')
t1 = threading.Thread(target = loiter)
t1.start()
t1.join()
print('Hey, I do not want to loiter!')
'''
Output with join() -->
You are loitering!
You are not loitering anymore! #After 5 seconds --> This statement will be printed
Hey, I do not want to loiter!
'''
This example demonstrates the .join() operation:
import threading
import time
def threaded_worker():
for r in range(10):
print('Other: ', r)
time.sleep(2)
thread_ = threading.Timer(1, threaded_worker)
thread_.daemon = True  # If the main thread exits, this thread will be killed too.
thread_.start()
flag = True
for i in range(10):
print('Main: ', i)
time.sleep(2)
if flag and i > 4:
print(
'''
Threaded_worker() joined to the main thread.
Now we have a sequential behavior instead of concurrency.
''')
thread_.join()
flag = False
Out:
Main: 0
Other: 0
Main: 1
Other: 1
Main: 2
Other: 2
Main: 3
Other: 3
Main: 4
Other: 4
Main: 5
Other: 5
Threaded_worker() joined to the main thread.
Now we have a sequential behavior instead of concurrency.
Other: 6
Other: 7
Other: 8
Other: 9
Main: 6
Main: 7
Main: 8
Main: 9
“What’s the use of using join()?” you say. Really, it’s the same answer as “what’s the use of closing files, since python and the OS will close my file for me when my program exits?”.
It’s simply a matter of good programming. You should join() your threads at the point in the code that the thread should not be running anymore, either because you positively have to ensure the thread is not running to interfere with your own code, or that you want to behave correctly in a larger system.
You might say “I don’t want my code to delay giving an answer” just because of the additional time that the join() might require. This may be perfectly valid in some scenarios, but you now need to take into account that your code is “leaving cruft around for python and the OS to clean up”. If you do this for performance reasons, I strongly encourage you to document that behavior. This is especially true if you’re building a library/package that others are expected to utilize.
There’s no reason to not join(), other than performance reasons, and I would argue that your code does not need to perform that well.
A thread can be flagged as a “daemon thread”. The significance of this
flag is that the entire Python program exits when only daemon threads
are left. The initial value is inherited from the creating thread.
Does anyone have a clearer explanation of what that means or a practical example showing where you would set threads as daemonic?
Clarify it for me: so the only situation you wouldn’t set threads as daemonic, is when you want them to continue running after the main thread exits?
Some threads do background tasks, like sending keepalive packets, or performing periodic garbage collection, or whatever. These are only useful when the main program is running, and it’s okay to kill them off once the other, non-daemon, threads have exited.
Without daemon threads, you’d have to keep track of them, and tell them to exit, before your program can completely quit. By setting them as daemon threads, you can let them run and forget about them, and when your program quits, any daemon threads are killed automatically.
Let’s say you’re making some kind of dashboard widget. As part of this, you want it to display the unread message count in your email box. So you make a little thread that will:
Connect to the mail server and ask how many unread messages you have.
Signal the GUI with the updated count.
Sleep for a little while.
When your widget starts up, it would create this thread, designate it a daemon, and start it. Because it’s a daemon, you don’t have to think about it; when your widget exits, the thread will stop automatically.
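A hedged sketch of that mail-checker thread (every name and the 10-second interval are invented for illustration):
import threading
import time

def check_mail_loop():
    while True:  # an infinite loop is fine in a daemon: it dies with the program
        unread = 3  # placeholder for asking the mail server
        print('unread messages: %d' % unread)  # placeholder for signaling the GUI
        time.sleep(10)

t = threading.Thread(target=check_mail_loop)
t.daemon = True  # when the widget exits, the thread stops automatically
t.start()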
Other posters gave some examples for situations in which you’d use daemon threads. My recommendation, however, is never to use them.
It’s not because they’re not useful, but because there are some bad side effects you can experience if you use them. Daemon threads can still execute after the Python runtime starts tearing down things in the main thread, causing some pretty bizarre exceptions.
A simpler way to think about it, perhaps: when main returns, your process will not exit if there are non-daemon threads still running.
A bit of advice: Clean shutdown is easy to get wrong when threads and synchronization are involved – if you can avoid it, do so. Use daemon threads whenever possible.
Chris already explained what daemon threads are, so let's talk about practical usage. Many thread pool implementations use daemon threads for task workers. Workers are threads which execute tasks from the task queue.
Workers need to keep waiting for tasks in the task queue indefinitely, as they don't know when a new task will appear. The thread which assigns tasks (say, the main thread) only knows when the tasks are over. The main thread waits on the task queue to get empty and then exits. If the workers are user threads, i.e. non-daemon, the program won't terminate. It will keep waiting for these indefinitely running workers, even though the workers aren't doing anything useful. Mark the workers daemon threads, and the main thread will take care of killing them as soon as it's done handling tasks.
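A minimal sketch of that worker pattern (queue contents and worker count are arbitrary):
import queue  # 'Queue' in Python 2
import threading

q = queue.Queue()

def worker():
    while True:  # waits for tasks forever; the daemon flag lets the program exit anyway
        item = q.get()
        print('processed %d' % item)
        q.task_done()

for _ in range(3):
    t = threading.Thread(target=worker)
    t.daemon = True  # the main thread won't be held back by these at exit
    t.start()

for i in range(10):
    q.put(i)
q.join()  # the main thread waits until the queue is drained, then exits, killing the daemons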
Quoting Chris: "… when your program quits, any daemon threads are killed automatically." I think that sums it up. You should be careful when you use them, as they terminate abruptly when the main program executes to completion.
When your second thread is non-daemon, your application's primary main thread cannot quit, because its exit criterion is also tied to the exit of the non-daemon thread(s). Threads cannot be forcibly killed in Python, therefore your app will have to really wait for the non-daemon thread(s) to exit. If this behavior is not what you want, then set your second thread as daemon so that it won't hold back your application from exiting.
You may use the signal package if you are running on UNIX:
In [1]: import signal
# Register an handler for the timeout
In [2]: def handler(signum, frame):
...: print("Forever is over!")
...: raise Exception("end of time")
...:
# This function *may* run for an indetermined time...
In [3]: def loop_forever():
...: import time
...: while 1:
...: print("sec")
...: time.sleep(1)
...:
...:
# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0
# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0
In [6]: try:
...: loop_forever()
...: except Exception, exc:
...: print(exc)
....:
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time
# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0
10 seconds after the call signal.alarm(10), the handler is called. This raises an exception that you can intercept from the regular Python code.
This module doesn’t play well with threads (but then, who does?)
Note that since we raise an exception when timeout happens, it may end up caught and ignored inside the function, for example of one such function:
def loop_forever():
while 1:
print('sec')
try:
time.sleep(10)
except:
continue
回答 1
您可以multiprocessing.Process用来精确地做到这一点。
码
import multiprocessing
import time
# bardef bar():for i in range(100):print"Tick"
time.sleep(1)if __name__ =='__main__':# Start bar as a process
p = multiprocessing.Process(target=bar)
p.start()# Wait for 10 seconds or until process finishes
p.join(10)# If thread is still activeif p.is_alive():print"running... let's kill it..."# Terminate
p.terminate()
p.join()
You can use multiprocessing.Process to do exactly that.
Code
import multiprocessing
import time
# bar
def bar():
for i in range(100):
print "Tick"
time.sleep(1)
if __name__ == '__main__':
# Start bar as a process
p = multiprocessing.Process(target=bar)
p.start()
# Wait for 10 seconds or until process finishes
p.join(10)
# If thread is still active
if p.is_alive():
print "running... let's kill it..."
# Terminate - may not work if process is stuck for good
p.terminate()
# OR Kill - will work for sure, no chance for process to finish nicely however
# p.kill()
p.join()
import sys
import threading
from time import sleep
try:
    import thread
except ImportError:
    import _thread as thread  # use the Python 3 module name if the Python 2 one is missing

Now we have imported our functionality from the standard library.
exit_after decorator
Next we need a function to terminate the main() from the child thread:
def quit_function(fn_name):
# print to stderr, unbuffered in Python 2.
print('{0} took too long'.format(fn_name), file=sys.stderr)
sys.stderr.flush() # Python 3 stderr is likely buffered.
thread.interrupt_main() # raises KeyboardInterrupt
And here is the decorator itself:
def exit_after(s):
'''
use as decorator to exit process if
function takes longer than s seconds
'''
def outer(fn):
def inner(*args, **kwargs):
timer = threading.Timer(s, quit_function, args=[fn.__name__])
timer.start()
try:
result = fn(*args, **kwargs)
finally:
timer.cancel()
return result
return inner
return outer
Usage
And here’s the usage that directly answers your question about exiting after 5 seconds!:
@exit_after(5)
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
sleep(1)
print('countdown finished')
Demo:
>>> countdown(3)
countdown started
3, 2, 1, 0, countdown finished
>>> countdown(10)
countdown started
10, 9, 8, 7, 6, countdown took too long
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 6, in countdown
KeyboardInterrupt
The second function call will not finish, instead the process should exit with a traceback!
KeyboardInterrupt does not always stop a sleeping thread
Note that sleep will not always be interrupted by a keyboard interrupt, on Python 2 on Windows, e.g.:
@exit_after(1)
def sleep10():
sleep(10)
print('slept 10 seconds')
>>> sleep10()
sleep10 took too long # Note that it hangs here about 9 more seconds
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 3, in sleep10
KeyboardInterrupt

And if you catch the KeyboardInterrupt in the caller, you can handle the timeout and carry on:
>>> try:
...     countdown(10)
... except KeyboardInterrupt:
...     print('do something else')
...
countdown started
10, 9, 8, 7, 6, countdown took too long
do something else
I have a different proposal: a pure function (with the same API as the threading suggestion) that seems to work fine (based on suggestions on this thread).
def timeout(func, args=(), kwargs={}, timeout_duration=1, default=None):
import signal
class TimeoutError(Exception):
pass
def handler(signum, frame):
raise TimeoutError()
# set the timeout handler
signal.signal(signal.SIGALRM, handler)
signal.alarm(timeout_duration)
try:
result = func(*args, **kwargs)
except TimeoutError as exc:
result = default
finally:
signal.alarm(0)
return result
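A hedged usage sketch of the function above (the values are arbitrary; note that signal.alarm is Unix-only and must run in the main thread):
import time

# time.sleep(3) exceeds the 1-second timeout, so the default is returned
print(timeout(time.sleep, args=(3,), timeout_duration=1, default='too slow'))
# a fast call returns its own result (None for time.sleep)
print(timeout(time.sleep, args=(0,), timeout_duration=1, default='too slow'))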
I ran across this thread when searching for a timeout call on unit tests. I didn’t find anything simple in the answers or 3rd party packages so I wrote the decorator below you can drop right into code:
import multiprocessing.pool
import functools
def timeout(max_timeout):
"""Timeout decorator, parameter in seconds."""
def timeout_decorator(item):
"""Wrap the original function."""
@functools.wraps(item)
def func_wrapper(*args, **kwargs):
"""Closure for function."""
pool = multiprocessing.pool.ThreadPool(processes=1)
async_result = pool.apply_async(item, args, kwargs)
# raises a TimeoutError if execution exceeds max_timeout
return async_result.get(max_timeout)
return func_wrapper
return timeout_decorator
Then it’s as simple as this to timeout a test or any function you like:
@timeout(5.0) # if execution takes longer than 5 seconds, raise a TimeoutError
def test_base_regression(self):
...
The stopit package, found on pypi, seems to handle timeouts well.
I like the @stopit.threading_timeoutable decorator, which adds a timeout parameter to the decorated function, which does what you expect, it stops the function.
There are a lot of suggestions, but none using concurrent.futures, which I think is the most legible way to handle this.
from concurrent.futures import ProcessPoolExecutor
# Warning: this does not terminate function if timeout
def timeout_five(fnc, *args, **kwargs):
with ProcessPoolExecutor() as p:
f = p.submit(fnc, *args, **kwargs)
return f.result(timeout=5)
Super simple to read and maintain.
We make a pool, submit a single process and then wait up to 5 seconds before raising a TimeoutError that you could catch and handle however you needed.
Native to python 3.2+ and backported to 2.7 (pip install futures).
Switching between threads and processes is as simple as replacing ProcessPoolExecutor with ThreadPoolExecutor.
If you want to terminate the Process on timeout I would suggest looking into Pebble.
import time
import timeout_decorator
@timeout_decorator.timeout(5)
def mytest():
print "Start"
for i in range(1,10):
time.sleep(1)
print "%d seconds have passed" % i
if __name__ == '__main__':
mytest()
Most of the solutions presented here work wonderfully under Linux at first glance – because we have fork() and signals – but on Windows things look a bit different.
And when it comes to subthreads on Linux, you can't use signals anymore.
In order to spawn a process under Windows, it needs to be picklable – and many decorated functions or class methods are not.
So you need to use a better pickler like dill and multiprocess (not pickle and multiprocessing) – that's why you can't use ProcessPoolExecutor (or only with limited functionality).
For the timeout itself – you need to define what timeout means, because on Windows it will take considerable (and not determinable) time to spawn the process. This can be tricky on short timeouts. Let's assume spawning the process takes about 0.5 seconds (easily!). If you give a timeout of 0.2 seconds, what should happen?
Should the function time out after 0.5 + 0.2 seconds (so let the method run for 0.2 seconds)?
Or should the called process time out after 0.2 seconds (in that case, the decorated function will ALWAYS time out, because in that time it is not even spawned)?
Also, nested decorators can be nasty, and you can't use signals in a subthread. If you want to create a truly universal, cross-platform decorator, all this needs to be taken into consideration (and tested).
Other issues are passing exceptions back to the caller, as well as logging issues (if used in the decorated function – logging to files in another process is NOT supported).
I tried to cover all the edge cases; you might look into the package wrapt_timeout_decorator, or at least test your own solutions inspired by the unittests used there.
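A small sketch of the pickling point (the decorator here is invented purely for illustration):
import pickle

def with_wrapper(fn):
    def wrapper(*args, **kwargs):  # a closure, defined inside another function
        return fn(*args, **kwargs)
    return wrapper

@with_wrapper
def task():
    return 42

# pickle serializes functions by module-level name; the closure 'wrapper'
# cannot be looked up that way, so serialization fails
try:
    pickle.dumps(task)
except Exception as exc:
    print('cannot pickle: %s' % exc)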
@Alexis Eggermont – unfortunately I don't have enough points to comment – maybe someone else can notify you – I think I solved your import issue.
import time
from wrapt_timeout_decorator import *
@timeout(5)
def mytest(message):
print(message)
for i in range(1,10):
time.sleep(1)
print('{} seconds have passed'.format(i))
def main():
mytest('starting')
if __name__ == '__main__':
main()
Gives the following exception:
TimeoutError: Function mytest timed out after 5 seconds
We can use signals for the same. I think the below example will be useful for you. It is very simple compared to threads.
import signal

class myException(Exception):
    pass

def timeout(signum, frame):
    raise myException
#this is an infinite loop, never ending under normal circumstances
def main():
print 'Starting Main ',
while 1:
print 'in main ',
#SIGALRM is only usable on a unix platform
signal.signal(signal.SIGALRM, timeout)
#change 5 to however many seconds you need
signal.alarm(5)
try:
main()
except myException:
print "whoops"
#!/usr/bin/python
# lightly modified version of http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
"""alarm.py: Permits multiple SIGALRM events to be queued.
Uses a `heapq` to store the objects to be called when an alarm signal is
raised, so that the next alarm is always at the top of the heap.
"""
import heapq
import signal
from time import time
__version__ = '$Revision: 2539 $'.split()[1]
alarmlist = []
__new_alarm = lambda t, f, a, k: (t + time(), f, a, k)
__next_alarm = lambda: int(round(alarmlist[0][0] - time())) if alarmlist else None
__set_alarm = lambda: signal.alarm(max(__next_alarm(), 1))
class TimeoutError(Exception):
def __init__(self, message, id_=None):
self.message = message
self.id_ = id_
class Timeout:
''' id_ allows for nested timeouts. '''
def __init__(self, id_=None, seconds=1, error_message='Timeout'):
self.seconds = seconds
self.error_message = error_message
self.id_ = id_
def handle_timeout(self):
raise TimeoutError(self.error_message, self.id_)
def __enter__(self):
self.this_alarm = alarm(self.seconds, self.handle_timeout)
def __exit__(self, type, value, traceback):
try:
cancel(self.this_alarm)
except ValueError:
pass
def __clear_alarm():
"""Clear an existing alarm.
If the alarm signal was set to a callable other than our own, queue the
previous alarm settings.
"""
oldsec = signal.alarm(0)
oldfunc = signal.signal(signal.SIGALRM, __alarm_handler)
if oldsec > 0 and oldfunc != __alarm_handler:
heapq.heappush(alarmlist, (__new_alarm(oldsec, oldfunc, [], {})))
def __alarm_handler(*zargs):
"""Handle an alarm by calling any due heap entries and resetting the alarm.
Note that multiple heap entries might get called, especially if calling an
entry takes a lot of time.
"""
try:
nextt = __next_alarm()
while nextt is not None and nextt <= 0:
(tm, func, args, keys) = heapq.heappop(alarmlist)
func(*args, **keys)
nextt = __next_alarm()
finally:
if alarmlist: __set_alarm()
def alarm(sec, func, *args, **keys):
"""Set an alarm.
When the alarm is raised in `sec` seconds, the handler will call `func`,
passing `args` and `keys`. Return the heap entry (which is just a big
tuple), so that it can be cancelled by calling `cancel()`.
"""
__clear_alarm()
try:
newalarm = __new_alarm(sec, func, args, keys)
heapq.heappush(alarmlist, newalarm)
return newalarm
finally:
__set_alarm()
def cancel(alarm):
"""Cancel an alarm by passing the heap entry returned by `alarm()`.
It is an error to try to cancel an alarm which has already occurred.
"""
__clear_alarm()
try:
alarmlist.remove(alarm)
heapq.heapify(alarmlist)
finally:
if alarmlist: __set_alarm()
and a usage example:
import alarm
from time import sleep
try:
with alarm.Timeout(id_='a', seconds=5):
try:
with alarm.Timeout(id_='b', seconds=2):
sleep(3)
except alarm.TimeoutError as e:
print 'raised', e.id_
sleep(30)
except alarm.TimeoutError as e:
print 'raised', e.id_
else:
print 'nope.'
I am trying to understand threading in Python. I’ve looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I’m having trouble understanding them.
How do you clearly show tasks being divided for multi-threading?
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)
Which is the multithreaded version of:
results = []
for item in my_array:
results.append(my_function(item))
Description
Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.
Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.
Implementation
Parallel versions of the map function are provided by two libraries: multiprocessing, and also its little-known, but equally fantastic stepchild: multiprocessing.dummy.
multiprocessing.dummy is exactly the same as multiprocessing module, but uses threads instead (an important distinction – use multiple processes for CPU-intensive tasks; threads for (and during) I/O):
multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.
import urllib2
from multiprocessing.dummy import Pool as ThreadPool
urls = [
'http://www.python.org',
'http://www.python.org/about/',
'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
'http://www.python.org/doc/',
'http://www.python.org/download/',
'http://www.python.org/getit/',
'http://www.python.org/community/',
'https://wiki.python.org/moin/',
]
# Make the Pool of workers
pool = ThreadPool(4)
# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)
# Close the pool and wait for the work to finish
pool.close()
pool.join()

And the timing results:

Single thread: 14.4 seconds
4 Pool: 3.1 seconds
8 Pool: 1.4 seconds
13 Pool: 1.3 seconds
Here’s a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.
import Queue
import threading
import urllib2
# Called by each thread
def get_url(q, url):
q.put(urllib2.urlopen(url).read())
theurls = ["http://google.com", "http://yahoo.com"]
q = Queue.Queue()
for u in theurls:
t = threading.Thread(target=get_url, args = (q,u))
t.daemon = True
t.start()
s = q.get()
print s
This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, in order to put its contents on the queue; each thread is a daemon (won’t keep the process up if main thread ends — that’s more common than not); the main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the results and terminates (which takes down any subthreads that might still be running, since they’re daemon threads).
Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn’t use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there’s a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work’s results, by the way, and they’re intrinsically threadsafe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).
However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let’s consider the problem of summing a large range by summing subranges in parallel:
import threading
class SummingThread(threading.Thread):
def __init__(self,low,high):
super(SummingThread, self).__init__()
self.low=low
self.high=high
self.total=0
def run(self):
for i in range(self.low,self.high):
self.total+=i
thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join() # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result
Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially albeit interleaved (with the added overhead of context switching) in CPython due to the global interpreter lock.
Like others mentioned, CPython can use threads only for I/O waits due to GIL.
If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing:
from multiprocessing import Process
def f(name):
print 'hello', name
if __name__ == '__main__':
p = Process(target=f, args=('bob',))
p.start()
p.join()
Just a note: A queue is not required for threading.
This is the simplest example I could imagine that shows multiple threads running concurrently.
import threading
from random import randint
from time import sleep
def print_number(number):
# Sleeps a random 1 to 10 seconds
rand_int_var = randint(1, 10)
sleep(rand_int_var)
print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"
thread_list = []
for i in range(1, 10):
# Instantiates the thread
# (i) does not make a sequence, so (i,)
t = threading.Thread(target=print_number, args=(i,))
# Sticks the thread in a list so that it remains accessible
thread_list.append(t)
# Starts threads
for thread in thread_list:
thread.start()
# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
thread.join()
# Demonstrates that the main process waited for threads to complete
print "Done"
The answer from Alex Martelli helped me. However, here is a modified version that I thought was more useful (at least to me).
Updated: works in both Python 2 and Python 3
try:
# For Python 3
import queue
from urllib.request import urlopen
except:
# For Python 2
import Queue as queue
from urllib2 import urlopen
import threading
worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']
# Load up a queue with your data. This will handle locking
q = queue.Queue()
for url in worker_data:
q.put(url)
# Define a worker function
def worker(url_queue):
queue_full = True
while queue_full:
try:
# Get your data off the queue, and do some work
url = url_queue.get(False)
data = urlopen(url).read()
print(len(data))
except queue.Empty:
queue_full = False
# Create as many threads as you want
thread_count = 5
for i in range(thread_count):
t = threading.Thread(target=worker, args = (q,))
t.start()
I found this very useful: create as many threads as cores and let them execute a (large) number of tasks (in this case, calling a shell program):
import Queue
import threading
import multiprocessing
import subprocess
q = Queue.Queue()
for i in range(30): # Put 30 tasks in the queue
q.put(i)
def worker():
while True:
item = q.get()
# Execute a task: call a shell program and wait until it completes
subprocess.call("echo " + str(item), shell=True)
q.task_done()
cpus = multiprocessing.cpu_count() # Detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
t = threading.Thread(target=worker)
t.daemon = True
t.start()
q.join() # Block until all tasks are done
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
'http://www.cnn.com/',
'http://europe.wsj.com/',
'http://www.bbc.co.uk/',
'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
with urllib.request.urlopen(url, timeout=timeout) as conn:
return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
# Start the load operations and mark each future with its URL
future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
for future in concurrent.futures.as_completed(future_to_url):
url = future_to_url[future]
try:
data = future.result()
except Exception as exc:
print('%r generated an exception: %s' % (url, exc))
else:
print('%r page is %d bytes' % (url, len(data)))
import concurrent.futures
import math
PRIMES = [
112272535095293,
112582705942171,
112272535095293,
115280095190773,
115797848077099,
1099726899285419]
def is_prime(n):
if n % 2 == 0:
return False
sqrt_n = int(math.floor(math.sqrt(n)))
for i in range(3, sqrt_n + 1, 2):
if n % i == 0:
return False
return True
def main():
with concurrent.futures.ProcessPoolExecutor() as executor:
for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
print('%d is prime: %s' % (number, prime))
if __name__ == '__main__':
main()
def sqr(val):
import time
time.sleep(0.1)
return val * val
def process_result(result):
print(result)
def process_these_asap(tasks):
import concurrent.futures
with concurrent.futures.ProcessPoolExecutor() as executor:
futures = []
for task in tasks:
futures.append(executor.submit(sqr, task))
for future in concurrent.futures.as_completed(futures):
process_result(future.result())
# Or instead of all this just do:
# results = executor.map(sqr, tasks)
# list(map(process_result, results))
def main():
tasks = list(range(10))
print('Processing {} tasks'.format(len(tasks)))
process_these_asap(tasks)
print('Done')
return 0
if __name__ == '__main__':
import sys
sys.exit(main())
The executor approach might seem familiar to all those who have gotten their hands dirty with Java before.
Also on a side note: To keep the universe sane, don’t forget to close your pools/executors if you don’t use with context (which is so awesome that it does it for you)
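A tiny sketch of the manual variant (the worker count and the submitted call are arbitrary):
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
future = executor.submit(pow, 2, 10)
print(future.result())  # 1024
executor.shutdown(wait=True)  # what the with-block would otherwise do for you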
Most documentation and tutorials use Python's threading and Queue modules, and they can seem overwhelming for beginners.
Perhaps consider the concurrent.futures.ThreadPoolExecutor module of Python 3.
Combined with the with clause and a set comprehension, it can be a real charm.
from concurrent.futures import ThreadPoolExecutor, as_completed
def get_url(url):
# Your actual program here. Using threading.Lock() if necessary
return ""
# List of URLs to fetch
urls = ["url1", "url2"]
with ThreadPoolExecutor(max_workers=5) as executor:
# Create threads
futures = {executor.submit(get_url, url) for url in urls}
    # as_completed() yields each future as soon as it finishes
for f in as_completed(futures):
# Get the results
rs = f.result()
I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.05 million. I compare a sequential baseline and four parallel methods here:
import math
import timeit
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def time_stuff(fn):
"""
Measure time of execution of a function
"""
def wrapper(*args, **kwargs):
t0 = timeit.default_timer()
fn(*args, **kwargs)
t1 = timeit.default_timer()
print("{} seconds".format(t1 - t0))
return wrapper
def find_primes_in(nmin, nmax):
"""
Compute a list of prime numbers between the given minimum and maximum arguments
"""
primes = []
# Loop from minimum to maximum
for current in range(nmin, nmax + 1):
# Take the square root of the current number
sqrt_n = int(math.sqrt(current))
found = False
        # Check whether any number from 2 to the square root + 1 divides the current number under consideration
for number in range(2, sqrt_n + 1):
            # If divisible, we have found a factor, hence this is not a prime number; let's move on to the next one
if current % number == 0:
found = True
break
# If not divisible, add this number to the list of primes that we have found so far
if not found:
primes.append(current)
# I am merely printing the length of the array containing all the primes, but feel free to do what you want
print(len(primes))
@time_stuff
def sequential_prime_finder(nmin, nmax):
"""
Use the main process and main thread to compute everything in this case
"""
find_primes_in(nmin, nmax)
@time_stuff
def threading_prime_finder(nmin, nmax):
"""
If the minimum is 1000 and the maximum is 2000 and we have four workers,
1000 - 1250 to worker 1
1250 - 1500 to worker 2
1500 - 1750 to worker 3
1750 - 2000 to worker 4
so let’s split the minimum and maximum values according to the number of workers
"""
nrange = nmax - nmin
threads = []
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
# Start the thread with the minimum and maximum split up to compute
# Parallel computation will not work here due to the GIL since this is a CPU-bound task
        t = threading.Thread(target=find_primes_in, args=(start, end))
threads.append(t)
t.start()
# Don’t forget to wait for the threads to finish
for t in threads:
t.join()
@time_stuff
def processing_prime_finder(nmin, nmax):
"""
Split the minimum, maximum interval similar to the threading method above, but use processes this time
"""
nrange = nmax - nmin
processes = []
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
        p = multiprocessing.Process(target=find_primes_in, args=(start, end))
processes.append(p)
p.start()
for p in processes:
p.join()
@time_stuff
def thread_executor_prime_finder(nmin, nmax):
"""
Split the min max interval similar to the threading method, but use a thread pool executor this time.
This method is slightly faster than using pure threading as the pools manage threads more efficiently.
This method is still slow due to the GIL limitations since we are doing a CPU-bound task.
"""
nrange = nmax - nmin
    with ThreadPoolExecutor(max_workers=8) as e:
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
e.submit(find_primes_in, start, end)
@time_stuff
def process_executor_prime_finder(nmin, nmax):
"""
Split the min max interval similar to the threading method, but use the process pool executor.
    This is the fastest method recorded so far, as it manages processes efficiently and overcomes GIL limitations.
RECOMMENDED METHOD FOR CPU-BOUND TASKS
"""
nrange = nmax - nmin
    with ProcessPoolExecutor(max_workers=8) as e:
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
e.submit(find_primes_in, start, end)
def main():
nmin = int(1e7)
nmax = int(1.05e7)
print("Sequential Prime Finder Starting")
sequential_prime_finder(nmin, nmax)
print("Threading Prime Finder Starting")
threading_prime_finder(nmin, nmax)
print("Processing Prime Finder Starting")
processing_prime_finder(nmin, nmax)
print("Thread Executor Prime Finder Starting")
thread_executor_prime_finder(nmin, nmax)
print("Process Executor Finder Starting")
process_executor_prime_finder(nmin, nmax)
main()
Here are the results on my Mac OS X four-core machine:
Sequential Prime Finder Starting
9.708213827005238 seconds
Threading Prime Finder Starting
9.81836523200036 seconds
Processing Prime Finder Starting
3.2467174359990167 seconds
Thread Executor Prime Finder Starting
10.228896902000997 seconds
Process Executor Finder Starting
2.656402041000547 seconds
Here is a very simple example of CSV import using threading. (Library imports may differ for different purposes.)
Helper Functions:
from threading import Thread
from project import app
import csv
def import_handler(csv_file_name):
thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
thr.start()
def dump_async_csv_data(csv_file_name):
with app.app_context():
        with open(csv_file_name) as csv_file:
            reader = csv.DictReader(csv_file)
            for row in reader:
# DB operation/query
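A hypothetical call site (the file path is illustrative): the handler returns immediately while the CSV is processed in a background thread.
import_handler('uploads/data.csv')  # returns right away; dump_async_csv_data runs in the background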
I would like to contribute a simple example and the explanations I've found useful when I had to tackle this problem myself.
In this answer you will find some information about Python's GIL (global interpreter lock) and a simple day-to-day example written using multiprocessing.dummy, plus some simple benchmarks. Note that, despite its name, multiprocessing.dummy replicates the multiprocessing API as a thin wrapper around the threading module, so the Pool used below is actually a thread pool.
Global Interpreter Lock (GIL)
Python doesn’t allow multi-threading in the truest sense of the word. It has a multi-threading package, but if you want to multi-thread to speed your code up, then it’s usually not a good idea to use it.
Python has a construct called the global interpreter lock (GIL).
The GIL makes sure that only one of your ‘threads’ can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL onto the next thread.
This happens very quickly so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core.
All this GIL passing adds overhead to execution. This means that if you want to make your code run faster then using the threading
package often isn’t a good idea.
There are reasons to use Python’s threading package. If you want to run some things simultaneously, and efficiency is not a concern,
then it’s totally fine and convenient. Or if you are running code that needs to wait for something (like some I/O) then it could make a lot of sense. But the threading library won’t let you use extra CPU cores.
Multi-threading can be outsourced to the operating system (by doing multi-processing), to some external application that calls your Python code (for example, Spark or Hadoop), or to some code that your Python code calls (for example: you could have your Python code call a C function that does the expensive multi-threaded stuff).
Why This Matters
Because lots of people spend a lot of time trying to find bottlenecks in their fancy Python multi-threaded code before they learn what the GIL is.
Once this information is clear, here’s my code:
#!/bin/python
from multiprocessing.dummy import Pool
from subprocess import PIPE, Popen
import time
import os
# In the variable pool_size we define the "parallelness".
# For CPU-bound tasks, it doesn't make sense to create more Pool processes
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create quite a few more Pool processes than cores, since the processes
# will probably spend most of their time blocked (waiting for I/O to complete).
pool_size = 8
def do_ping(ip):
    if os.name == 'nt':
        print("Using Windows Ping to " + ip)
        proc = Popen(['ping', ip], stdout=PIPE)
        return proc.communicate()[0]
    else:
        print("Using Linux / Unix Ping to " + ip)
        proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
        return proc.communicate()[0]
os.system('cls' if os.name == 'nt' else 'clear')
print("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com", "www.facebook.com", "www.pinterest.com", "www.microsoft.com"]
result = {}
for website_name in website_names:
    result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Now we do the same without threading, just to compare time
print("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
    do_ping(website_name)
print("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
    output[key] = value.get()
print("\nOutput aggregated in a Dictionary:")
print(output)
print("\n")
print("\nPretty printed output: ")
for key, value in output.items():
    print(key + "\n")
    print(value)
Here is multi-threading with a simple example which will be helpful. You can run it and easily see how multi-threading works in Python. I used a bounded semaphore to prevent other threads from proceeding until the previous threads have finished their work. With this line of code,
tLock = threading.BoundedSemaphore(value=4)
you can allow up to four threads at a time; the rest are held back and run later, after the previous threads have finished.
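A minimal sketch of such a semaphore-gated worker (the worker function, its sleep, and the thread count are illustrative assumptions, not the author's original code):
import threading
import time

tLock = threading.BoundedSemaphore(value=4)  # at most four threads proceed at once

def worker(n):
    with tLock:               # blocks while four other threads hold the semaphore
        print('thread %d working' % n)
        time.sleep(1)         # simulate some work; the semaphore is released on exit

threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()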
Borrowing from this post, we know about choosing between multithreading, multiprocessing, and async/asyncio, and their usage.
Python 3 has a new built-in library for concurrency and parallelism: concurrent.futures.
So I'll demonstrate, through an experiment, running four tasks (i.e., sleep() calls) in a thread-pool manner:
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time
def concurrent(max_worker=1):
futures = []
tick = time()
with ThreadPoolExecutor(max_workers=max_worker) as executor:
futures.append(executor.submit(sleep, 2)) # Two seconds sleep
futures.append(executor.submit(sleep, 1))
futures.append(executor.submit(sleep, 7))
futures.append(executor.submit(sleep, 3))
for future in as_completed(futures):
if future.result() is not None:
print(future.result())
print('Total elapsed time by {} workers:'.format(max_worker), time()-tick)
concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)
Output:
Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507
[NOTE]:
As you can see in the above results, the best case was 3 workers for those four tasks: the total time is bounded by the longest task (7 seconds), and with 3 workers the remaining tasks fit alongside it. With 2 workers, one worker ends up running the 1-second and then the 7-second task back to back (8 seconds), and with 1 worker all four run sequentially (2 + 1 + 7 + 3 = 13 seconds).
If you have a CPU-bound task instead of an I/O-bound or blocking one (multiprocessing vs threading), you can change the ThreadPoolExecutor to ProcessPoolExecutor, as sketched below.
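A minimal sketch of that swap (the CPU-bound task, summing a large range, is illustrative); only the executor class changes:
from concurrent.futures import ProcessPoolExecutor

if __name__ == '__main__':  # required for process pools on platforms that spawn workers
    with ProcessPoolExecutor(max_workers=3) as executor:
        futures = [executor.submit(sum, range(10**7)) for _ in range(4)]
        for future in futures:
            print(future.result())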
None of the previous solutions actually used multiple cores on my GNU/Linux server (where I don’t have administrator rights). They just ran on a single core.
I used the lower-level os.fork interface to spawn multiple processes. This is the code that worked for me:
import os

values = ['different', 'values', 'for', 'threads']
for i in range(len(values)):
    pid = os.fork()
    if pid == 0:                 # we are in a child process
        my_function(values[i])   # the real work (defined elsewhere)
        os._exit(0)              # the child must exit here, or it keeps running the loop itself
# the parent waits for all of its children to finish
for _ in values:
    os.wait()
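Since os.fork is Unix-only, a portable near-equivalent (a sketch reusing the same values list and my_function) uses multiprocessing:
from multiprocessing import Process

processes = [Process(target=my_function, args=(v,)) for v in values]
for p in processes:
    p.start()
for p in processes:
    p.join()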
import threading
import requests

def send():
    r = requests.get('https://www.stackoverflow.com')

threads = []
t = threading.Thread(target=send)  # pass the function itself; target=send() would run it in the main thread
threads.append(t)
t.start()
t.join()
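The list in the original suggests more than one thread was intended; a sketch of that pattern (the URL list is illustrative, and send() is reworked here to take the URL as an argument):
def send(url):
    print(url, requests.get(url).status_code)

threads = []
for url in ['https://www.stackoverflow.com', 'https://www.python.org']:
    t = threading.Thread(target=send, args=(url,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()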