Tag Archives: Python

How does asyncio actually work?

Question: How does asyncio actually work?


This question is motivated by another question of mine: How to await in cdef?

There are tons of articles and blog posts on the web about asyncio, but they are all very superficial. I couldn’t find any information about how asyncio is actually implemented and what makes I/O asynchronous. I was trying to read the source code, but it’s thousands of lines of not-the-highest-grade C code, a lot of which deals with auxiliary objects, but most crucially, it is hard to connect the Python syntax with the C code it would translate into.

Asyncio’s own documentation is even less helpful. There’s no information there about how it works, only some guidelines on how to use it, which are also sometimes misleading / very poorly written.

I’m familiar with Go’s implementation of coroutines, and was kind of hoping that Python did the same thing. If that were the case, the code I came up with in the post linked above would have worked. Since it didn’t, I’m now trying to figure out why. My best guess so far is as follows; please correct me where I’m wrong:

  1. Procedure definitions of the form async def foo(): ... are actually interpreted as methods of a class inheriting from coroutine.
  2. Perhaps async def is actually split into multiple methods by await statements, where the object on which these methods are called is able to keep track of the progress it has made through the execution so far.
  3. If the above is true, then, essentially, execution of a coroutine boils down to some global manager calling methods of a coroutine object (a loop?).
  4. The global manager is somehow (how?) aware of when I/O operations are performed by Python (only?) code and is able to choose one of the pending coroutine methods to execute after the currently executing method relinquishes control (hits an await statement).

In other words, here’s my attempt at “desugaring” of some asyncio syntax into something more understandable:

async def coro(name):
    print('before', name)
    await asyncio.sleep()
    print('after', name)

asyncio.gather(coro('first'), coro('second'))

# translated from async def coro(name)
class Coro(coroutine):
    def before(self, name):
        print('before', name)

    def after(self, name):
        print('after', name)

    def __init__(self, name):
        self.name = name
        self.parts = self.before, self.after
        self.pos = 0

    def __call__():
        self.parts[self.pos](self.name)
        self.pos += 1

    def done(self):
        return self.pos == len(self.parts)


# translated from asyncio.gather()
class AsyncIOManager:

    def gather(*coros):
        while not every(c.done() for c in coros):
            coro = random.choice(coros)
            coro()

Should my guess prove correct, then I have a problem. How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended while I/O happens outside the interpreter? What exactly is meant by I/O? If my Python procedure calls the C open() procedure, and it in turn sends an interrupt to the kernel, relinquishing control to it, how does the Python interpreter know about this, and how is it able to continue running some other code while the kernel code does the actual I/O, until it wakes up the Python procedure which sent the interrupt originally? How can the Python interpreter, in principle, be aware of this happening?


Answer 0


How does asyncio work?

Before answering this question we need to understand a few base terms, skip these if you already know any of them.

Generators

Generators are objects that allow us to suspend the execution of a Python function. User-defined generators are implemented using the keyword yield. By creating a normal function containing the yield keyword, we turn that function into a generator:

>>> def test():
...     yield 1
...     yield 2
...
>>> gen = test()
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, calling next() on the generator causes the interpreter to load test’s frame and return the yielded value. Calling next() again causes the frame to be loaded onto the interpreter stack again, where it continues and yields another value.

By the third call to next(), our generator is finished and StopIteration is raised.

Communicating with a generator

A lesser-known feature of generators is the fact that you can communicate with them using two methods: send() and throw().

>>> def test():
...     val = yield 1
...     print(val)
...     yield 2
...     yield 3
...
>>> gen = test()
>>> next(gen)
1
>>> gen.send("abc")
abc
2
>>> gen.throw(Exception())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in test
Exception

Upon calling gen.send(), the value is passed into the generator and becomes the return value of the yield expression.

gen.throw(), on the other hand, allows raising exceptions inside the generator; the exception is raised at the spot where the generator was suspended at yield.

Returning values from generators

Returning a value from a generator results in the value being put inside the StopIteration exception. We can later recover the value from the exception and use it as needed.

>>> def test():
...     yield 1
...     return "abc"
...
>>> gen = test()
>>> next(gen)
1
>>> try:
...     next(gen)
... except StopIteration as exc:
...     print(exc.value)
...
abc

Behold, a new keyword: yield from

Python 3.4 came with the addition of a new keyword: yield from. What that keyword allows us to do is pass any next(), send() and throw() on to an innermost nested generator. If the inner generator returns a value, it also becomes the return value of yield from:

>>> def inner():
...     inner_result = yield 2
...     print('inner', inner_result)
...     return 3
...
>>> def outer():
...     yield 1
...     val = yield from inner()
...     print('outer', val)
...     yield 4
...
>>> gen = outer()
>>> next(gen)
1
>>> next(gen) # Goes inside inner() automatically
2
>>> gen.send("abc")
inner abc
outer 3
4

I’ve written an article to further elaborate on this topic.

Putting it all together

With the introduction of the new keyword yield from in Python 3.4, we became able to create generators inside generators that, just like a tunnel, pass data back and forth between the innermost and the outermost generators. This spawned a new meaning for generators – coroutines.

Coroutines are functions that can be stopped and resumed while being run. In Python, they are defined using the async def keyword. Much like generators, they too use their own form of yield from, which is await. Before async and await were introduced in Python 3.5, we created coroutines in the exact same way generators were created (with yield from instead of await).

async def inner():
    return 1

async def outer():
    await inner()
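
For illustration, here is a minimal sketch of the pre-3.5 generator-based style mentioned above, using types.coroutine to mark a generator as awaitable (the function names are made up for this example):

import types

@types.coroutine
def inner_gen():
    # a generator-based coroutine: yield suspends, return passes a value back
    yield
    return 1

async def outer():
    # awaiting the generator-based coroutine works like awaiting a native one
    return await inner_gen()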

Like every iterator or generator that implements the __iter__() method, coroutines implement __await__(), which allows them to continue every time await coro is called.

There’s a nice sequence diagram inside the Python docs that you should check out.

In asyncio, apart from coroutine functions, we have 2 important objects: tasks and futures.

Futures

Futures are objects that have the __await__() method implemented, and their job is to hold a certain state and result. The state can be one of the following:

  1. PENDING – future does not have any result or exception set.
  2. CANCELLED – future was cancelled using fut.cancel()
  3. FINISHED – future was finished, either by a result set using fut.set_result() or by an exception set using fut.set_exception()

The result, just as you have guessed, can either be a Python object that will be returned, or an exception that may be raised.

Another important feature of future objects is that they contain a method called add_done_callback(). This method allows functions to be called as soon as the future is done – whether it finished with a result or raised an exception.
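
As a minimal sketch of that behavior (assuming Python 3.7+ for asyncio.run; the callback fires once the future is marked done):

import asyncio

async def main():
    fut = asyncio.get_running_loop().create_future()

    # the callback receives the completed future as its only argument
    fut.add_done_callback(lambda f: print('future done:', f.result()))

    # setting a result marks the future FINISHED and schedules the callback
    fut.set_result(42)
    print(await fut)  # awaiting an already-finished future returns its result

asyncio.run(main())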

Tasks

Task objects are special futures, which wrap around coroutines and communicate with the innermost and outermost coroutines. Every time a coroutine awaits a future, the future is passed all the way back to the task (just like in yield from), and the task receives it.

Next, the task binds itself to the future. It does so by calling add_done_callback() on the future. From now on, if the future is ever done – by being cancelled, passed an exception or passed a Python object as a result – the task’s callback will be called, and the task will rise back up to life.

Asyncio

The final burning question we must answer is – how is the IO implemented?

Deep inside asyncio, we have an event loop. An event loop of tasks. The event loop’s job is to call tasks every time they are ready and coordinate all that effort into one single working machine.

The IO part of the event loop is built upon a single crucial function called select. Select is a blocking function, implemented by the operating system underneath, that allows waiting on sockets for incoming or outgoing data. Upon data being received it wakes up and returns the sockets that received data, or the sockets that are ready for writing.

When you try to receive or send data over a socket through asyncio, what actually happens below is that asyncio first checks whether the socket has any data that can be immediately read or sent. If its .send() buffer is full, or its .recv() buffer is empty, the socket is registered with the select function (by simply adding it to one of the lists, rlist for recv and wlist for send), and the appropriate function awaits a newly created future object tied to that socket.

When all available tasks are waiting for futures, the event loop calls select and waits. When one of the sockets has incoming data, or its send buffer has drained, asyncio checks the future object tied to that socket and sets it to done.

Now all the magic happens. The future is set to done, the task that added itself before with add_done_callback() rises back up to life, and calls .send() on the coroutine, which resumes the innermost coroutine (because of the await chain), and you read the newly received data from the nearby buffer it was spilled into.

The chain of calls again, in the case of recv():

  1. select.select waits.
  2. A ready socket, with data, is returned.
  3. Data from the socket is moved into a buffer.
  4. future.set_result() is called.
  5. The task that added itself with add_done_callback() is now woken up.
  6. The task calls .send() on the coroutine, which goes all the way into the innermost coroutine and wakes it up.
  7. Data is read from the buffer and returned to our humble user.

In summary, asyncio uses generator capabilities that allow pausing and resuming functions. It uses yield from capabilities that allow passing data back and forth between the innermost and the outermost generators. It uses all of those in order to halt a function’s execution while it waits for IO to complete (by using the OS select function).

And the best of all? While one function is paused, another may run and interleave with it – that delicate fabric is asyncio.


Answer 1


Talking about async/await and asyncio is not the same thing. The first is a fundamental, low-level construct (coroutines) while the latter is a library using these constructs. Consequently, there is no single, ultimate answer.

The following is a general description of how async/await and asyncio-like libraries work. That is, there may be other tricks on top (there are…) but they are inconsequential unless you build them yourself. The difference should be negligible unless you already know enough to not have to ask such a question.

1. Coroutines versus subroutines in a nut shell

Just like subroutines (functions, procedures, …), coroutines (generators, …) are an abstraction of call stack and instruction pointer: there is a stack of executing code pieces, and each is at a specific instruction.

The distinction of def versus async def is merely for clarity. The actual difference is return versus yield. From this, await and yield from lift that difference from individual calls to entire stacks.

1.1. Subroutines

A subroutine represents a new stack level to hold local variables, and a single traversal of its instructions to reach an end. Consider a subroutine like this:

def subfoo(bar):
     qux = 3
     return qux * bar

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Notably, 4. means that a subroutine always starts at the same state. Everything exclusive to the function itself is lost upon completion. A function cannot be resumed, even if there are instructions after return.

root -\
  :    \- subfoo --\
  :/--<---return --/
  |
  V

1.2. Coroutines as persistent subroutines

A coroutine is like a subroutine, but can exit without destroying its state. Consider a coroutine like this:

 def cofoo(bar):
      qux = yield bar  # yield marks a break point
      return qux

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
    1. once at a yield, push its value to the calling stack but store the stack and instruction pointer
    2. once calling into yield, restore stack and instruction pointer and push arguments to qux
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Note the addition of 2.1 and 2.2 – a coroutine can be suspended and resumed at predefined points. This is similar to how a subroutine is suspended during calling another subroutine. The difference is that the active coroutine is not strictly bound to its calling stack. Instead, a suspended coroutine is part of a separate, isolated stack.

root -\
  :    \- cofoo --\
  :/--<+--yield --/
  |    :
  V    :

This means that suspended coroutines can be freely stored or moved between stacks. Any call stack that has access to a coroutine can decide to resume it.
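
A minimal sketch of this property, using a plain generator as the coroutine (the suspended object can be resumed from a completely different call site):

def cofoo(bar):
    qux = yield bar  # suspension point
    return qux

suspended = cofoo(3)
print(next(suspended))      # prints 3; cofoo is now suspended at yield

def elsewhere(coro):
    # resume the stored coroutine from an unrelated call stack
    try:
        coro.send(7)
    except StopIteration as stop:
        print(stop.value)   # prints 7, cofoo's return value

elsewhere(suspended)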

1.3. Traversing the call stack

So far, our coroutine only goes down the call stack with yield. A subroutine can go down and up the call stack with return and (). For completeness, coroutines also need a mechanism to go up the call stack. Consider a coroutine like this:

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

When you run it, that means it still allocates the stack and instruction pointer like a subroutine. When it suspends, that still is like storing a subroutine.

However, yield from does both. It suspends stack and instruction pointer of wrap and runs cofoo. Note that wrap stays suspended until cofoo finishes completely. Whenever cofoo suspends or something is sent, cofoo is directly connected to the calling stack.

1.4. Coroutines all the way down

As established, yield from allows connecting two scopes across another, intermediate one. When applied recursively, that means the top of the stack can be connected to the bottom of the stack.

root -\
  :    \-> coro_a -yield-from-> coro_b --\
  :/ <-+------------------------yield ---/
  |    :
  :\ --+-- coro_a.send----------yield ---\
  :                             coro_b <-/

Note that root and coro_b do not know about each other. This makes coroutines much cleaner than callbacks: coroutines are still built on a 1:1 relation, like subroutines. Coroutines suspend and resume their entire existing execution stack up until a regular call point.

Notably, root could have an arbitrary number of coroutines to resume. Yet, it can never resume more than one at the same time. Coroutines of the same root are concurrent but not parallel!

1.5. Python’s async and await

The explanation has so far explicitly used the yield and yield from vocabulary of generators – the underlying functionality is the same. The new Python 3.5 syntax async and await exists mainly for clarity.

def foo():  # subroutine?
     return None

def foo():  # coroutine?
     yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
     await foofoo()  # coroutine!
     return None

The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.
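
As an illustrative sketch (the Ticker class is made up for this example), an async iterator implements __aiter__/__anext__ so that every step may await, which a bare for loop could not suspend on:

import asyncio

class Ticker:
    def __init__(self, count):
        self.count = count

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.count <= 0:
            raise StopAsyncIteration
        self.count -= 1
        await asyncio.sleep(0.01)  # each step suspends the coroutine
        return self.count

async def main():
    async for tick in Ticker(3):  # async for keeps the await chain intact
        print(tick)

asyncio.run(main())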

2. Anatomy of a simple event loop

By itself, a coroutine has no concept of yielding control to another coroutine. It can only yield control to the caller at the bottom of a coroutine stack. This caller can then switch to another coroutine and run it.

This root node of several coroutines is commonly an event loop: on suspension, a coroutine yields an event on which it wants to resume. In turn, the event loop is capable of efficiently waiting for these events to occur. This allows it to decide which coroutine to run next, or how to wait before resuming.

Such a design implies that there is a set of pre-defined events that the loop understands. Several coroutines await each other, until finally an event is awaited. This event can communicate directly with the event loop by yielding control.

loop -\
  :    \-> coroutine --await--> event --\
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  :\ --+-- send(reply) -------- yield --\
  :        coroutine <--yield-- event <-/

The key is that coroutine suspension allows the event loop and events to directly communicate. The intermediate coroutine stack does not require any knowledge about which loop is running it, nor how events work.

2.1.1. Events in time

The simplest event to handle is reaching a point in time. This is a fundamental building block of threaded code as well: a thread repeatedly sleeps until a condition is true. However, a regular sleep blocks execution by itself – we want other coroutines to not be blocked. Instead, we want to tell the event loop when it should resume the current coroutine stack.

2.1.2. Defining an Event

An event is simply a value we can identify – be it via an enum, a type or some other identity. We can define this with a simple class that stores our target time. In addition to storing the event information, we can allow the class to be awaited directly.

class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self
    
    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)

This class only stores the event – it does not say how to actually handle it.

The only special feature is __await__ – it is what the await keyword looks for. Practically, it is an iterator, but one that is not available to the regular iteration machinery.

2.2.1. Awaiting an event

Now that we have an event, how do coroutines react to it? We should be able to express the equivalent of sleep by awaiting our event. To better see what is going on, we wait twice for half the time:

import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)

We can directly instantiate and run this coroutine. Similar to a generator, using coroutine.send runs the coroutine until it yields a result.

coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)

This gives us two AsyncSleep events and then a StopIteration when the coroutine is done. Notice that the only delay is from time.sleep in the loop! Each AsyncSleep only stores an offset from the current time.

2.2.2. Event + Sleep

At this point, we have two separate mechanisms at our disposal:

  • AsyncSleep Events that can be yielded from inside a coroutine
  • time.sleep that can wait without impacting coroutines

Notably, these two are orthogonal: neither one affects or triggers the other. As a result, we can come up with our own strategy to sleep to meet the delay of an AsyncSleep.

2.3. A naive event loop

If we have several coroutines, each can tell us when it wants to be woken up. We can then wait until the first of them wants to be resumed, then for the one after, and so on. Notably, at each point we only care about which one is next.

This makes for a straightforward scheduling:

  1. sort coroutines by their desired wake up time
  2. pick the first that wants to wake up
  3. wait until this point in time
  4. run this coroutine
  5. repeat from 1.

A trivial implementation does not need any advanced concepts. A list allows sorting coroutines by wake-up time. Waiting is a regular time.sleep. Running coroutines works just like before with coroutine.send.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])

Of course, this has ample room for improvement. We can use a heap for the wait queue or a dispatch table for events. We could also fetch return values from the StopIteration and assign them to the coroutine. However, the fundamental principle remains the same.
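
For instance, a minimal sketch of the heap-based wait queue (AsyncSleep and the coroutine protocol stay exactly as above; the counter only breaks ties so coroutines are never compared):

import heapq
import time

def run(*coroutines):
    """Cooperatively run all ``coroutines``, with a heap as the wait queue"""
    waiting = [(0, i, coroutine) for i, coroutine in enumerate(coroutines)]
    heapq.heapify(waiting)
    counter = len(waiting)  # tie-breaker so coroutines are never compared
    while waiting:
        # pop the earliest wake-up time instead of sorting the whole list
        until, _, coroutine = heapq.heappop(waiting)
        time.sleep(max(0.0, until - time.time()))
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        if isinstance(command, AsyncSleep):
            heapq.heappush(waiting, (command.until, counter, coroutine))
            counter += 1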

2.4. Cooperative Waiting

The AsyncSleep event and run event loop are a fully working implementation of timed events.

async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))

This cooperatively switches between each of the five coroutines, suspending each for 0.1 seconds. Even though the event loop is synchronous, it still executes the work in 0.5 seconds instead of 2.5 seconds. Each coroutine holds state and acts independently.

3. I/O event loop

An event loop that supports sleep is suitable for polling. However, waiting for I/O on a file handle can be done more efficiently: the operating system implements I/O and thus knows which handles are ready. Ideally, an event loop should support an explicit “ready for I/O” event.

3.1. The select call

Python already has an interface to query the OS for I/O-ready handles. When called with handles to read or write, it returns the handles that are ready to read or write:

readable, writeable, _ = select.select(rlist, wlist, xlist, timeout)

For example, we can open a file for writing and wait for it to be ready:

write_target = open('/tmp/foo', 'w')
readable, writeable, _ = select.select([], [write_target], [])

Once select returns, writeable contains our open file.

3.2. Basic I/O event

Similar to the AsyncSleep request, we need to define an event for I/O. With the underlying select logic, the event must refer to a readable object – say an open file. In addition, we store how much data to read.

class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = b''  # bytes, since the demo below opens files in binary mode

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )

As with AsyncSleep we mostly just store the data required for the underlying system call. This time, __await__ is capable of being resumed multiple times – until our desired amount has been read. In addition, we return the I/O result instead of just resuming.

3.3. Augmenting an event loop with read I/O

The basis for our event loop is still the run defined previously. First, we need to track the read requests. This is no longer a sorted schedule; we simply map read requests to coroutines.

# new
waiting_read = {}  # type: Dict[file, coroutine]

Since select.select takes a timeout parameter, we can use it in place of time.sleep.

# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [])

This gives us all readable files – if there are any, we run the corresponding coroutine. If there are none, we have waited long enough for our current coroutine to run.

# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort()
    coroutine = waiting_read[readable[0]]

Finally, we have to actually listen for read requests.

# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...

3.4. Putting it together

The above was a bit of a simplification. We need to do some switching so as not to starve sleeping coroutines when we can always read. We need to handle having nothing to read or nothing to wait for. However, the end result still fits into 30 LOC.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                waiting.sort()
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine

3.5. Cooperative I/O

The AsyncSleep, AsyncRead and run implementations are now fully functional, able to sleep and/or read. Same as for sleepy, we can define a helper to test reading:

async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))

Running this, we can see that our I/O is interleaved with the waiting task:

id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B

4. Non-Blocking I/O

While I/O on files gets the concept across, it is not really suitable for a library like asyncio: the select call always returns for files, and both open and read may block indefinitely. This blocks all coroutines of an event loop – which is bad. Libraries like aiofiles use threads and synchronization to fake non-blocking I/O and events on files.

However, sockets do allow for non-blocking I/O – and their inherent latency makes it much more critical. When used in an event loop, waiting for data and retrying can be wrapped without blocking anything.

4.1. Non-Blocking I/O event

Similar to our AsyncRead, we can define a suspend-and-read event for sockets. Instead of taking a file, we take a socket – which must be non-blocking. Also, our __await__ uses socket.recv instead of file.read.

class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )

In contrast to AsyncRead, __await__ performs truly non-blocking I/O. When data is available, it always reads. When no data is available, it always suspends. That means the event loop is only blocked while we perform useful work.

4.2. Un-Blocking the event loop

As far as the event loop is concerned, nothing changes much. The event to listen for is still the same as for files – a file descriptor marked ready by select.

# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine

At this point, it should be obvious that AsyncRead and AsyncRecv are the same kind of event. We could easily refactor them to be one event with an exchangeable I/O component. In effect, the event loop, coroutines and events cleanly separate a scheduler, arbitrary intermediate code and the actual I/O.
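
A minimal sketch of such a refactoring (the class name and the step parameter are made up for this example; it follows the try-first flavor of AsyncRecv):

class AsyncReadEvent:
    """Unified read event: ``step`` performs one non-blocking read attempt"""
    def __init__(self, source, step, amount=1):
        self.source = source  # file or socket, watched via ``select``
        self._step = step     # e.g. ``lambda: sock.recv(1024)``
        self.amount = amount
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self._step()
            except BlockingIOError:
                yield self    # not ready: suspend until ``select`` says so
        return self._buffer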

4.3. The ugly side of non-blocking I/O

In principle, what you should do at this point is replicate the logic of read as a recv for AsyncRecv. However, this is much uglier now: you have to handle early returns when functions would block inside the kernel and instead yield control back to you. For example, opening a connection takes considerably more code than opening a file:

# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass

Long story short, what remains is a few dozen lines of exception handling. The events and event loop already work at this point.

id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5

Addendum

Example code at github


Answer 2


Your coro desugaring is conceptually correct, but slightly incomplete.

await doesn’t suspend unconditionally, but only if it encounters a blocking call. How does it know that a call is blocking? This is decided by the code being awaited. For example, an awaitable implementation of socket read could be desugared to:

def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except EWOULDBLOCK:
        event_loop.add_reader(sock.fileno, current_task())
        return SUSPEND

In real asyncio the equivalent code modifies the state of a Future instead of returning magic values, but the concept is the same. When appropriately adapted to a generator-like object, the above code can be awaited.
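
A minimal sketch of such an adaptation, reusing the hypothetical read() and SUSPEND from above (types.coroutine marks the wrapper as awaitable):

import types

@types.coroutine
def read_awaitable(sock, n):
    # wraps the SUSPEND-returning read() into something await can drive
    while True:
        data = read(sock, n)
        if data is SUSPEND:
            yield  # suspend; the event loop resumes us when sock is ready
        else:
            return data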

On the caller side, when your coroutine contains:

data = await read(sock, 1024)

It desugars into something close to:

data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)

People familiar with generators tend to describe the above in terms of yield from, which does the suspension automatically.

The suspension chain continues all the way up to the event loop, which notices that the coroutine is suspended, removes it from the runnable set, and goes on to execute coroutines that are runnable, if any. If no coroutines are runnable, the loop waits in select() until a file descriptor that some coroutine is interested in becomes ready for IO. (The event loop maintains a file-descriptor-to-coroutine mapping.)

In the above example, once select() tells the event loop that sock is readable, it will re-add coro to the runnable set, so it will be continued from the point of suspension.

In other words:

  1. Everything happens in the same thread by default.

  2. The event loop is responsible for scheduling the coroutines and waking them up when whatever they were waiting for (typically an IO call that would normally block, or a timeout) becomes ready.

For insight into coroutine-driving event loops, I recommend this talk by Dave Beazley, where he demonstrates coding an event loop from scratch in front of a live audience.


Answer 3


It all boils down to the two main challenges that asyncio is addressing:

  • How to perform multiple I/O in a single thread?
  • How to implement cooperative multitasking?

The answer to the first point has been around for a long while and is called a select loop. In Python, it is implemented in the selectors module.
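
A minimal sketch of a select loop built on the selectors module (the server socket here is only for illustration):

import selectors
import socket

selector = selectors.DefaultSelector()

server = socket.socket()
server.bind(('localhost', 0))  # any free port, for illustration
server.listen()
server.setblocking(False)

# register interest in read-readiness; ``data`` may carry any label
selector.register(server, selectors.EVENT_READ, data='accept')

# one pass of the loop: blocks until a registered handle is ready or timeout
for key, events in selector.select(timeout=1.0):
    print('ready:', key.data, key.fileobj)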

The second question is related to the concept of coroutines, i.e. functions that can stop their execution and be restored later on. In Python, coroutines are implemented using generators and the yield from statement. That’s what is hiding behind the async/await syntax.

More resources in this answer.


EDIT: Addressing your comment about goroutines:

The closest equivalent to a goroutine in asyncio is actually not a coroutine but a task (see the difference in the documentation). In Python, a coroutine (or a generator) knows nothing about the concepts of an event loop or I/O. It is simply a function that can stop its execution using yield while keeping its current state, so it can be restored later on. The yield from syntax allows chaining them in a transparent way.

Now, within an asyncio task, the coroutine at the very bottom of the chain always ends up yielding a future. This future then bubbles up to the event loop, and gets integrated into the inner machinery. When the future is set to done by some other inner callback, the event loop can restore the task by sending the future back into the coroutine chain.


EDIT: Addressing some of the questions in your post:

How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter?

No, nothing happens in a thread. I/O is always managed by the event loop, mostly through file descriptors. However, the registration of those file descriptors is usually hidden by high-level coroutines, which do the dirty work for you.

What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?

An I/O is any blocking call. In asyncio, all the I/O operations should go through the event loop, because as you said, the event loop has no way to be aware that a blocking call is being performed in some synchronous code. That means you’re not supposed to use a synchronous open within the context of a coroutine. Instead, use a dedicated library such as aiofiles, which provides an asynchronous version of open.
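
For illustration, a short sketch of the aiofiles usage pattern (assuming aiofiles is installed; the file path is arbitrary):

import asyncio
import aiofiles

async def main():
    # aiofiles.open returns an async context manager; reads are awaitable
    async with aiofiles.open('/etc/hostname') as f:
        contents = await f.read()
    print(contents)

asyncio.run(main())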


Make sure only one instance of a program is running

Question: Make sure only one instance of a program is running


Is there a Pythonic way to have only one instance of a program running?

The only reasonable solution I’ve come up with is trying to run it as a server on some port; a second program trying to bind to the same port then fails. But it’s not really a great idea – maybe there’s something more lightweight than this?

(Take into consideration that the program is expected to fail sometimes, e.g. segfault – so things like a “lock file” won’t work.)


Answer 0


The following code should do the job; it is cross-platform and runs on Python 2.4-3.2. I tested it on Windows, OS X and Linux.

from tendo import singleton
me = singleton.SingleInstance() # will sys.exit(-1) if other instance is running

The latest version of the code is available in singleton.py. Please file bugs here.

You can install tendo using one of the following methods:
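
For example, with pip (other install methods, such as downloading from PyPI, also work):

pip install tendo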


Answer 1


Simple, cross-platform solution, found in another question by zgoda:

import fcntl
import os
import sys

def instance_already_running(label="default"):
    """
    Detect if an instance with the label is already running, globally
    at the operating system level.

    Using `os.open` ensures that the file pointer won't be closed
    by Python's garbage collector after the function's scope is exited.

    The lock will be released when the program exits, or could be
    released if the file pointer were closed.
    """

    lock_file_pointer = os.open(f"/tmp/instance_{label}.lock", os.O_WRONLY | os.O_CREAT)

    try:
        fcntl.lockf(lock_file_pointer, fcntl.LOCK_EX | fcntl.LOCK_NB)
        already_running = False
    except IOError:
        already_running = True

    return already_running

A lot like S.Lott’s suggestion, but with the code.
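
For illustration, a minimal usage sketch of the function above (the label string "my-app" is an arbitrary example):

import sys

if instance_already_running("my-app"):
    sys.exit("Another instance is already running; exiting.")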


回答 2

此代码特定于Linux。它使用“抽象” UNIX域套接字,但是它很简单并且不会留下过时的锁定文件。与上面的解决方案相比,我更喜欢它,因为它不需要专门保留的TCP端口。

import sys

try:
    import socket
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    ## Create an abstract socket, by prefixing it with null.
    s.bind('\0postconnect_gateway_notify_lock')
except socket.error as e:
    error_code = e.args[0]
    error_string = e.args[1]
    print "Process already running (%d:%s). Exiting" % (error_code, error_string)
    sys.exit(0)

可以更改唯一字符串 postconnect_gateway_notify_lock,以允许多个需要强制单实例的程序各自使用。

This code is Linux specific. It uses ‘abstract’ UNIX domain sockets, but it is simple and won’t leave stale lock files around. I prefer it to the solution above because it doesn’t require a specially reserved TCP port.

import sys

try:
    import socket
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    ## Create an abstract socket, by prefixing it with null.
    s.bind('\0postconnect_gateway_notify_lock')
except socket.error as e:
    error_code = e.args[0]
    error_string = e.args[1]
    print "Process already running (%d:%s). Exiting" % (error_code, error_string)
    sys.exit(0)

The unique string postconnect_gateway_notify_lock can be changed to allow multiple programs that need a single instance enforced.


回答 3

我不知道这是否足够 pythonic,但是在 Java 世界中,在指定端口上监听是一种使用非常广泛的解决方案,因为它适用于所有主要平台,并且在程序崩溃时也不会有问题。

侦听端口的另一个优点是可以将命令发送到正在运行的实例。例如,当用户第二次启动该程序时,您可以向运行中的实例发送命令以告诉它打开另一个窗口(例如Firefox就是这样做的。我不知道他们是否使用TCP端口或命名管道,或者这样的东西,虽然)。

I don’t know if it’s pythonic enough, but in the Java world listening on a defined port is a pretty widely used solution, as it works on all major platforms and doesn’t have any problems with crashing programs.

Another advantage of listening to a port is that you can send a command to the running instance. For example, when the user starts the program a second time, you could send the running instance a command to tell it to open another window (that's what Firefox does, for example; I don't know if they use TCP ports or named pipes or something like that, though).
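
As a rough Python sketch of this port-based approach (the port number 47200 is an arbitrary assumption; any fixed, otherwise-unused port works):

import socket
import sys

try:
    # Bind to a fixed localhost port; a second instance fails here
    # with "address already in use" and exits.
    lock_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lock_socket.bind(("127.0.0.1", 47200))
    lock_socket.listen(1)
except OSError:
    sys.exit("Another instance is already running")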


回答 4

以前从未编写过python,但这是我刚刚在mycheckpoint中实现的功能,以防止crond将其启动两次或更多次:

import os
import sys
import fcntl
fh=0
def run_once():
    global fh
    fh=open(os.path.realpath(__file__),'r')
    try:
        fcntl.flock(fh,fcntl.LOCK_EX|fcntl.LOCK_NB)
    except:
        os._exit(0)

run_once()

在另一个问题(http://stackoverflow.com/questions/2959474)发布之后,找到了 Slava-N 的建议。它以函数的形式调用,锁定正在执行的脚本文件(而不是 pid 文件),并保持锁定直到脚本结束(无论是正常结束还是出错)。

Never written python before, but this is what I’ve just implemented in mycheckpoint, to prevent it being started twice or more by crond:

import os
import sys
import fcntl
fh=0
def run_once():
    global fh
    fh=open(os.path.realpath(__file__),'r')
    try:
        fcntl.flock(fh,fcntl.LOCK_EX|fcntl.LOCK_NB)
    except:
        os._exit(0)

run_once()

Found Slava-N's suggestion after posting this in another issue (http://stackoverflow.com/questions/2959474). This one is called as a function, locks the executing script's file (not a pid file) and maintains the lock until the script ends (normally or with an error).


回答 5

使用一个 pid 文件。您有一个已知的位置,"/path/to/pidfile",并且在启动时执行类似下面的操作(部分是伪代码,因为我还没喝咖啡,不想太费劲):

import os, os.path
pidfilePath = """/path/to/pidfile"""
if os.path.exists(pidfilePath):
   pidfile = open(pidfilePath,"r")
   pidString = pidfile.read()
   if <pidString is equal to os.getpid()>:
      # something is real weird
      Sys.exit(BADCODE)
   else:
      <use ps or pidof to see if the process with pid pidString is still running>
      if  <process with pid == 'pidString' is still running>:
          Sys.exit(ALREADAYRUNNING)
      else:
          # the previous server must have crashed
          <log server had crashed>
          <reopen pidfilePath for writing>
          pidfile.write(os.getpid())
else:
    <open pidfilePath for writing>
    pidfile.write(os.getpid())

因此,换句话说,您正在检查是否存在pidfile。如果不是,请将您的pid写入该文件。如果pidfile存在,则检查pid是否为正在运行的进程的pid;如果是这样,则您有另一个正在运行的实时进程,因此只需关闭。如果不是,则先前的进程崩溃了,因此将其记录下来,然后将您自己的pid写入旧文件中。然后继续。

Use a pid file. You have some known location, “/path/to/pidfile” and at startup you do something like this (partially pseudocode because I’m pre-coffee and don’t want to work all that hard):

import os, os.path
pidfilePath = """/path/to/pidfile"""
if os.path.exists(pidfilePath):
   pidfile = open(pidfilePath,"r")
   pidString = pidfile.read()
   if <pidString is equal to os.getpid()>:
      # something is real weird
      Sys.exit(BADCODE)
   else:
      <use ps or pidof to see if the process with pid pidString is still running>
      if  <process with pid == 'pidString' is still running>:
          Sys.exit(ALREADAYRUNNING)
      else:
          # the previous server must have crashed
          <log server had crashed>
          <reopen pidfilePath for writing>
          pidfile.write(os.getpid())
else:
    <open pidfilePath for writing>
    pidfile.write(os.getpid())

So, in other words, you’re checking if a pidfile exists; if not, write your pid to that file. If the pidfile does exist, then check to see if the pid is the pid of a running process; if so, then you’ve got another live process running, so just shut down. If not, then the previous process crashed, so log it, and then write your own pid to the file in place of the old one. Then continue.
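
For reference, a runnable sketch of this pid-file approach, assuming a Unix-like system; os.kill(pid, 0) stands in for the <use ps or pidof ...> placeholder, and the pidfile path is the same hypothetical location as above:

import os
import sys

PIDFILE = "/path/to/pidfile"

def pid_is_running(pid):
    # Signal 0 performs error checking only: it raises OSError if no
    # process with that pid exists (or if we lack permission).
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    return True

if os.path.exists(PIDFILE):
    with open(PIDFILE) as f:
        old_pid = int(f.read().strip() or "0")
    if old_pid == os.getpid():
        sys.exit("pidfile contains our own pid; something is really weird")
    if old_pid and pid_is_running(old_pid):
        sys.exit("Already running")
    # Otherwise the previous instance must have crashed; fall through.

with open(PIDFILE, "w") as f:
    f.write(str(os.getpid()))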


回答 6

您已经在另一个线程中找到了对类似问题的答复,因此为了完整起见,请参见如何在 Windows 上使用命名互斥锁(named mutex)实现相同目的。

http://code.activestate.com/recipes/474070/

You already found a reply to a similar question in another thread, so for completeness' sake see how to achieve the same on Windows using a named mutex.

http://code.activestate.com/recipes/474070/


回答 7

这可能有效。

  1. 尝试在已知位置创建 PID 文件。如果失败,则说明有人锁定了该文件,您直接退出即可。

  2. 正常完成后,请关闭并删除PID文件,以便其他人可以覆盖它。

您可以将程序包装在Shell脚本中,即使程序崩溃,该脚本也会删除PID文件。

如果程序挂起,也可以使用PID文件将其杀死。

This may work.

  1. Attempt to create a PID file at a known location. If you fail, someone has the file locked and you're done.

  2. When you finish normally, close and remove the PID file, so someone else can overwrite it.

You can wrap your program in a shell script that removes the PID file even if your program crashes.

You can, also, use the PID file to kill the program if it hangs.
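
For step 2, the standard-library atexit module can handle the normal-exit cleanup; a small sketch (the PIDFILE path is a hypothetical example):

import atexit
import os

PIDFILE = "/tmp/myapp.pid"

def remove_pidfile():
    # Runs on normal interpreter exit, but not after a hard crash such
    # as a segfault; hence the shell-script wrapper suggested above.
    try:
        os.remove(PIDFILE)
    except OSError:
        pass

atexit.register(remove_pidfile)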


回答 8

在 UNIX 上,使用锁定文件是一种非常普遍的方法。如果程序崩溃,则必须手动清理。您可以将 PID 存储在文件中,并在启动时检查是否有使用此 PID 的进程,如果没有则覆盖锁定文件。(但是,您还需要对"读文件-检查 pid-重写文件"这一过程整体加锁。)您可以在 os 包中找到获取和检查 pid 所需的内容。检查具有给定 pid 的进程是否存在的常见方法是向其发送非致命信号。

其他替代方法是将其与 flock 或 POSIX 信号量结合使用。

如 saua 所建议的那样,打开网络套接字可能是最简单、最可移植的方法。

Using a lock-file is a quite common approach on Unix. If it crashes, you have to clean up manually. You could store the PID in the file, and on startup check if there is a process with this PID, overriding the lock-file if not. (However, you also need a lock around the read-file-check-pid-rewrite-file sequence.) You will find what you need for getting and checking a pid in the os package. The common way of checking if there exists a process with a given pid is to send it a non-fatal signal.

Other alternatives could be combining this with flock or posix semaphores.

Opening a network socket, as saua proposed, would probably be the easiest and most portable.


回答 9

对于在应用程序中使用 wxPython 的任何人,您可以使用此处记录的 wx.SingleInstanceChecker 功能。

我个人使用 wx.App 的一个子类,它利用 wx.SingleInstanceChecker,当应用程序已有实例在运行时从 OnInit() 返回 False,如下所示:

import wx

class SingleApp(wx.App):
    """
    class that extends wx.App and only permits a single running instance.
    """

    def OnInit(self):
        """
        wx.App init function that returns False if the app is already running.
        """
        self.name = "SingleApp-%s".format(wx.GetUserId())
        self.instance = wx.SingleInstanceChecker(self.name)
        if self.instance.IsAnotherRunning():
            wx.MessageBox(
                "An instance of the application is already running", 
                "Error", 
                 wx.OK | wx.ICON_WARNING
            )
            return False
        return True

这是一个可直接替换 wx.App 的简单类,用于禁止多个实例。要使用它,只需在代码中将 wx.App 替换为 SingleApp,如下所示:

app = SingleApp(redirect=False)
frame = wx.Frame(None, wx.ID_ANY, "Hello World")
frame.Show(True)
app.MainLoop()

For anybody using wxPython for their application, you can use the function wx.SingleInstanceChecker documented here.

I personally use a subclass of wx.App which makes use of wx.SingleInstanceChecker and returns False from OnInit() if there is an existing instance of the app already executing like so:

import wx

class SingleApp(wx.App):
    """
    class that extends wx.App and only permits a single running instance.
    """

    def OnInit(self):
        """
        wx.App init function that returns False if the app is already running.
        """
        self.name = "SingleApp-%s".format(wx.GetUserId())
        self.instance = wx.SingleInstanceChecker(self.name)
        if self.instance.IsAnotherRunning():
            wx.MessageBox(
                "An instance of the application is already running", 
                "Error", 
                 wx.OK | wx.ICON_WARNING
            )
            return False
        return True

This is a simple drop-in replacement for wx.App that prohibits multiple instances. To use it simply replace wx.App with SingleApp in your code like so:

app = SingleApp(redirect=False)
frame = wx.Frame(None, wx.ID_ANY, "Hello World")
frame.Show(True)
app.MainLoop()

回答 10

这是我最终采用的仅适用于 Windows 的解决方案。将以下内容放入一个模块中,例如命名为 'onlyone.py' 或其他任何名称,然后将该模块直接包含在您的 __main__ python 脚本文件中。

import win32event, win32api, winerror, time, sys, os
main_path = os.path.abspath(sys.modules['__main__'].__file__).replace("\\", "/")

first = True
while True:
        mutex = win32event.CreateMutex(None, False, main_path + "_{<paste YOUR GUID HERE>}")
        if win32api.GetLastError() == 0:
            break
        win32api.CloseHandle(mutex)
        if first:
            print "Another instance of %s running, please wait for completion" % main_path
            first = False
        time.sleep(1)

说明

该代码尝试创建一个互斥锁,其名称来自脚本的完整路径。我们使用正斜杠来避免与实际文件系统的潜在混淆。

优点

  • 不需要配置或“魔术”标识符,请根据需要在许多不同的脚本中使用它。
  • 周围没有陈旧的文件,互斥锁将与您一起消亡。
  • 等待时打印有用的消息

Here is my eventual Windows-only solution. Put the following into a module, perhaps called 'onlyone.py', or whatever. Include that module directly into your __main__ python script file.

import win32event, win32api, winerror, time, sys, os
main_path = os.path.abspath(sys.modules['__main__'].__file__).replace("\\", "/")

first = True
while True:
        mutex = win32event.CreateMutex(None, False, main_path + "_{<paste YOUR GUID HERE>}")
        if win32api.GetLastError() == 0:
            break
        win32api.CloseHandle(mutex)
        if first:
            print "Another instance of %s running, please wait for completion" % main_path
            first = False
        time.sleep(1)

Explanation

The code attempts to create a mutex with name derived from the full path to the script. We use forward-slashes to avoid potential confusion with the real file system.

Advantages

  • No configuration or ‘magic’ identifiers needed, use it in as many different scripts as needed.
  • No stale files left around, the mutex dies with you.
  • Prints a helpful message when waiting

回答 11

Windows上对此的最佳解决方案是使用@zgoda建议的互斥锁。

import win32event
import win32api
from winerror import ERROR_ALREADY_EXISTS

mutex = win32event.CreateMutex(None, False, 'name')
last_error = win32api.GetLastError()

if last_error == ERROR_ALREADY_EXISTS:
   print("App instance already running")

有些答案使用了 Windows 上不可用的 fcntl(它也包含在 @sorin 的 tendo 软件包中);如果您尝试使用像 pyinstaller 这样做静态导入的软件包来冻结您的 python 应用,就会引发错误。

此外,使用锁定文件方法还会导致read-only数据库文件出现问题(sqlite3)。

The best solution for this on windows is to use mutexes as suggested by @zgoda.

import win32event
import win32api
from winerror import ERROR_ALREADY_EXISTS

mutex = win32event.CreateMutex(None, False, 'name')
last_error = win32api.GetLastError()

if last_error == ERROR_ALREADY_EXISTS:
   print("App instance already running")

Some answers use fcntl (also included in @sorin's tendo package), which is not available on Windows; and should you try to freeze your Python app using a package like pyinstaller, which does static imports, it throws an error.

Also, using the lock-file method creates a read-only problem with database files (I experienced this with sqlite3).


回答 12

我将其发布为答案,因为我是新用户,并且Stack Overflow尚未允许我投票。

Sorin Sbarnea的解决方案可在OS X,Linux和Windows下为我工作,对此我深表感谢。

但是,tempfile.gettempdir() 在 OS X 和 Windows 下是一种行为,而在其他一些/许多/所有(?)*nix 下是另一种行为(先忽略 OS X 也是 Unix 这一事实!)。这一区别对这段代码很重要。

OS X和Windows具有用户特定的临时目录,因此,一个用户创建的临时文件对另一用户不可见。相比之下,在许多版本的* nix(我测试过Ubuntu 9,RHEL 5,OpenSolaris 2008和FreeBSD 8)下,临时目录对于所有用户都是/ tmp。

这意味着在多用户计算机上创建锁文件时,它是在/ tmp中创建的,只有第一次创建锁文件的用户才能运行该应用程序。

一种可能的解决方案是将当前用户名嵌入到锁定文件的名称中。

值得注意的是,OP抢占端口的解决方案在多用户计算机上也会出现异常。

I’m posting this as an answer because I’m a new user and Stack Overflow won’t let me vote yet.

Sorin Sbarnea’s solution works for me under OS X, Linux and Windows, and I am grateful for it.

However, tempfile.gettempdir() behaves one way under OS X and Windows and another under other some/many/all(?) *nixes (ignoring the fact that OS X is also Unix!). The difference is important to this code.

OS X and Windows have user-specific temp directories, so a tempfile created by one user isn’t visible to another user. By contrast, under many versions of *nix (I tested Ubuntu 9, RHEL 5, OpenSolaris 2008 and FreeBSD 8), the temp dir is /tmp for all users.

That means that when the lockfile is created on a multi-user machine, it’s created in /tmp and only the user who creates the lockfile the first time will be able to run the application.

A possible solution is to embed the current username in the name of the lock file.

It’s worth noting that the OP’s solution of grabbing a port will also misbehave on a multi-user machine.
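
A small sketch of that possible solution; getpass.getuser() is in the standard library, and the file-name pattern is an arbitrary example:

import getpass
import os
import tempfile

# Per-user lock path: on systems with a shared /tmp, each user still
# gets an independent lock file.
lock_path = os.path.join(tempfile.gettempdir(),
                         "myapp_{}.lock".format(getpass.getuser()))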


回答 13

我在我的 gentoo 上使用 single_process;

pip install single_process

例如

from single_process import single_process

@single_process
def main():
    print 1

if __name__ == "__main__":
    main()   

参考:https://pypi.python.org/pypi/single_process/1.0

I use single_process on my gentoo;

pip install single_process

example:

from single_process import single_process

@single_process
def main():
    print 1

if __name__ == "__main__":
    main()   

refer: https://pypi.python.org/pypi/single_process/1.0


回答 14

我一直怀疑使用进程组应该有一个不错的POSIXy解决方案,而不必使用文件系统,但是我不太确定。就像是:

启动时,您的进程会向特定组中的所有进程发送“ kill -0”。如果存在任何此类进程,则退出。然后,它加入该组。没有其他进程使用该组。

但是,这存在一个竞争条件:多个进程可能恰好同时执行此操作,最终全部加入该组并同时运行。而等到您添加了某种互斥锁把它做到严密无误时,您就不再需要进程组了。

如果您的进程仅由 cron 启动,每分钟或每小时一次,这可能是可以接受的,但让我有些紧张的是,它恰恰会在您最不希望出错的那天出错。

我猜毕竟这不是一个很好的解决方案,除非有人可以改进?

I keep suspecting there ought to be a good POSIXy solution using process groups, without having to hit the file system, but I can’t quite nail it down. Something like:

On startup, your process sends a ‘kill -0’ to all processes in a particular group. If any such processes exist, it exits. Then it joins the group. No other processes use that group.

However, this has a race condition – multiple processes could all do this at precisely the same time and all end up joining the group and running simultaneously. By the time you’ve added some sort of mutex to make it watertight, you no longer need the process groups.

This might be acceptable if your process only gets started by cron, once every minute or every hour, but it makes me a bit nervous that it would go wrong precisely on the day when you don’t want it to.

I guess this isn’t a very good solution after all, unless someone can improve on it?


回答 15

上周,我遇到了这个确切的问题,尽管我确实找到了一些好的解决方案,但我还是决定制作一个非常简单干净的python软件包并将其上传到PyPI。它与tendo的不同之处在于它可以锁定任何字符串资源名称。尽管您当然可以锁定__file__以达到相同的效果。

安装方式: pip install quicklock

使用它非常简单:

[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> # Let's create a lock so that only one instance of a script will run
...
>>> singleton('hello world')
>>>
>>> # Let's try to do that again, this should fail
...
>>> singleton('hello world')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nate/live/gallery/env/lib/python2.7/site-packages/quicklock/quicklock.py", line 47, in singleton
    raise RuntimeError('Resource <{}> is currently locked by <Process {}: "{}">'.format(resource, other_process.pid, other_process.name()))
RuntimeError: Resource <hello world> is currently locked by <Process 24801: "python">
>>>
>>> # But if we quit this process, we release the lock automatically
...
>>> ^D
[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> singleton('hello world')
>>>
>>> # No exception was thrown, we own 'hello world'!

看一下:https://pypi.python.org/pypi/quicklock

I ran into this exact problem last week, and although I did find some good solutions, I decided to make a very simple and clean python package and uploaded it to PyPI. It differs from tendo in that it can lock any string resource name. Although you could certainly lock __file__ to achieve the same effect.

Install with: pip install quicklock

Using it is extremely simple:

[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> # Let's create a lock so that only one instance of a script will run
...
>>> singleton('hello world')
>>>
>>> # Let's try to do that again, this should fail
...
>>> singleton('hello world')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nate/live/gallery/env/lib/python2.7/site-packages/quicklock/quicklock.py", line 47, in singleton
    raise RuntimeError('Resource <{}> is currently locked by <Process {}: "{}">'.format(resource, other_process.pid, other_process.name()))
RuntimeError: Resource <hello world> is currently locked by <Process 24801: "python">
>>>
>>> # But if we quit this process, we release the lock automatically
...
>>> ^D
[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> singleton('hello world')
>>>
>>> # No exception was thrown, we own 'hello world'!

Take a look: https://pypi.python.org/pypi/quicklock


回答 16

在罗伯托·罗萨里奥(Roberto Rosario)的回答的基础上,我提出了以下功能:

SOCKET = None
def run_single_instance(uniq_name):
    try:
        import socket
        global SOCKET
        SOCKET = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        ## Create an abstract socket, by prefixing it with null.
        # this relies on a feature only in linux, when current process quits, the
        # socket will be deleted.
        SOCKET.bind('\0' + uniq_name)
        return True
    except socket.error as e:
        return False

我们需要定义全局 SOCKET 变量,因为它只有在整个进程退出时才会被垃圾回收。如果我们在函数中声明局部变量,它会在函数退出后超出作用域,套接字也就随之被删除。

所有的功劳应该归功于罗伯托·罗萨里奥(Roberto Rosario),因为我只是澄清和阐述了他的代码。而且此代码仅在Linux上有效,如https://troydhanson.github.io/network/Unix_domain_sockets.html中以下引用的文字所述:

Linux 具有一项特殊功能:如果 UNIX 域套接字的路径名以空字节 \0 开头,则其名称不会映射到文件系统中,因此不会与文件系统中的其他名称冲突。同样,当服务器关闭抽象命名空间中的 UNIX 域侦听套接字时,该名称即被删除;而使用常规的 UNIX 域套接字时,文件在服务器关闭后仍然存在。

Building upon Roberto Rosario’s answer, I come up with the following function:

SOCKET = None
def run_single_instance(uniq_name):
    try:
        import socket
        global SOCKET
        SOCKET = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        ## Create an abstract socket, by prefixing it with null.
        # this relies on a feature only in linux, when current process quits, the
        # socket will be deleted.
        SOCKET.bind('\0' + uniq_name)
        return True
    except socket.error as e:
        return False

We need to define the global SOCKET variable since it will only be garbage collected when the whole process quits. If we declared a local variable in the function, it would go out of scope after the function exits, and thus the socket would be deleted.

All the credit should go to Roberto Rosario, since I only clarify and elaborate upon his code. And this code will work only on Linux, as the following quoted text from https://troydhanson.github.io/network/Unix_domain_sockets.html explains:

Linux has a special feature: if the pathname for a UNIX domain socket begins with a null byte \0, its name is not mapped into the filesystem. Thus it won’t collide with other names in the filesystem. Also, when a server closes its UNIX domain listening socket in the abstract namespace, its file is deleted; with regular UNIX domain sockets, the file persists after the server closes it.
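
A possible way to use the function above at program start (the label string is arbitrary):

import sys

if not run_single_instance("my-app"):
    sys.exit("Another instance is already running")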


回答 17

linux示例

此方法基于创建一个在应用程序关闭后自动删除的临时文件。程序启动时,我们检查该文件是否存在;如果文件存在(说明已有一个实例在运行),则程序关闭;否则它将创建该文件并继续执行程序。

from tempfile import NamedTemporaryFile
import os
import sys

# Exit if a 'lock01_' temp file already exists in /tmp; otherwise create one
# that is deleted automatically when the program exits.
if [f for f in os.listdir('/tmp') if f.find('lock01_') != -1]:
    sys.exit()
f = NamedTemporaryFile(prefix='lock01_', delete=True)

YOUR CODE COMES HERE

linux example

This method is based on the creation of a temporary file that is automatically deleted after you close the application. On program launch we verify the existence of the file; if the file exists (there is a pending execution), the program is closed; otherwise it creates the file and continues the execution of the program.

from tempfile import NamedTemporaryFile
import os
import sys

# Exit if a 'lock01_' temp file already exists in /tmp; otherwise create one
# that is deleted automatically when the program exits.
if [f for f in os.listdir('/tmp') if f.find('lock01_') != -1]:
    sys.exit()
f = NamedTemporaryFile(prefix='lock01_', delete=True)

YOUR CODE COMES HERE

回答 18

在 Linux 系统上,还可以通过 pgrep 查询进程列表中该脚本出现的实例数(选项 -a 会显示完整的命令行字符串)。例如

import os
import sys
import subprocess

procOut = subprocess.check_output( "/bin/pgrep -u $UID -a python", shell=True, 
                                   executable="/bin/bash", universal_newlines=True)

if procOut.count( os.path.basename(__file__)) > 1 :        
    sys.exit( ("found another instance of >{}<, quitting."
              ).format( os.path.basename(__file__)))

如果该限制应适用于所有用户,请删除 -u $UID。免责声明:a) 假定脚本的(基本)名称是唯一的;b) 可能存在竞争条件。

On a Linux system one could also ask pgrep for the number of instances of the script found in the process list (option -a reveals the full command-line string). E.g.

import os
import sys
import subprocess

procOut = subprocess.check_output( "/bin/pgrep -u $UID -a python", shell=True, 
                                   executable="/bin/bash", universal_newlines=True)

if procOut.count( os.path.basename(__file__)) > 1 :        
    sys.exit( ("found another instance of >{}<, quitting."
              ).format( os.path.basename(__file__)))

Remove -u $UID if the restriction should apply to all users. Disclaimer: a) it is assumed that the script’s (base)name is unique, b) there might be race conditions.


回答 19

import sys,os

# start program
try:  # (1)
    os.unlink('lock')  # (2)
    fd=os.open("lock", os.O_CREAT|os.O_EXCL) # (3)  
except: 
    try: fd=os.open("lock", os.O_CREAT|os.O_EXCL) # (4) 
    except:  
        print "Another Program running !.."  # (5)
        sys.exit()  

# your program  ...
# ...

# exit program
try: os.close(fd)  # (6)
except: pass
try: os.unlink('lock')  
except: pass
sys.exit()  

Google App Engine的项目结构

问题:Google App Engine的项目结构

我刚问世时就在Google App Engine中启动了一个应用程序,以使用该技术并从事一个我一直想了很久但从未尝试过的宠物项目。结果是BowlSK。但是,随着它的增长和功能的添加,使其变得井井有条变得非常困难-主要是因为这是我的第一个python项目,在开始工作之前我对此一无所知。

我有的:

  • 主级别包含:
    • 所有.py文件(不知道如何使程序包正常工作)
    • 主页面的所有.html模板
  • 子目录:
    • 用于CSS,图片,JS等的单独文件夹。
    • 包含用于子目录类型网址的.html模板的文件夹

示例:
http://www.bowlsk.com/ 映射到 HomePage(默认包),模板位于 "index.html"
http://www.bowlsk.com/games/view-series.html?series=7130 映射到 ViewSeriesPage(同样是默认包),模板位于 "games/view-series.html"

真讨厌 我如何重组?我有两个想法:

  • 主文件夹包含:appdef,索引,main.py?

    • 代码的子文件夹。这一定是我的第一个包裹吗?
    • 模板的子文件夹。文件夹层次结构将与包层次结构匹配
    • CSS,图像,JS等的单个子文件夹。
  • 主文件夹包含appdef,索引,main.py?

    • 代码+模板的子文件夹。这样,我就在模板旁边设置了处理程序类,因为在此阶段,我要添加许多功能,因此对一个进行修改意味着对另一个进行了修改。同样,我必须将此文件夹名称作为Class的第一个软件包名称吗?我希望文件夹为“ src”,但我不希望我的Class为“ src.WhateverPage”

有最佳做法吗?随着Django 1.0的出现,当它成为正式的GAE模板引擎时,我现在可以做些什么来提高与它的集成能力?我将简单地开始尝试这些事情,然后看一看似乎更好,但是pyDev的重构支持似乎不能很好地处理程序包的移动,因此使所有这些再次工作可能不是一件容易的事。

I started an application in Google App Engine right when it came out, to play with the technology and work on a pet project that I had been thinking about for a long time but never gotten around to starting. The result is BowlSK. However, as it has grown, and features have been added, it has gotten really difficult to keep things organized – mainly due to the fact that this is my first python project, and I didn’t know anything about it until I started working.

What I have:

  • Main Level contains:
    • all .py files (didn’t know how to make packages work)
    • all .html templates for main level pages
  • Subdirectories:
    • separate folders for css, images, js, etc.
    • folders that hold .html templates for subdirecty-type urls

Example:
http://www.bowlsk.com/ maps to HomePage (default package), template at “index.html”
http://www.bowlsk.com/games/view-series.html?series=7130 maps to ViewSeriesPage (again, default package), template at “games/view-series.html”

It’s nasty. How do I restructure? I had 2 ideas:

  • Main Folder containing: appdef, indexes, main.py?

    • Subfolder for code. Does this have to be my first package?
    • Subfolder for templates. Folder heirarchy would match package heirarchy
    • Individual subfolders for css, images, js, etc.
  • Main Folder containing appdef, indexes, main.py?

    • Subfolder for code + templates. This way I have the handler class right next to the template, because in this stage, I’m adding lots of features, so modifications to one mean modifications to the other. Again, do I have to have this folder name be the first package name for my classes? I’d like the folder to be “src”, but I don’t want my classes to be “src.WhateverPage”

Is there a best practice? With Django 1.0 on the horizon, is there something I can do now to improve my ability to integrate with it when it becomes the official GAE templating engine? I would simply start trying these things, and seeing which seems better, but pyDev’s refactoring support doesn’t seem to handle package moves very well, so it will likely be a non-trivial task to get all of this working again.


回答 0

首先,我建议您看看《使用 Python、Django 和 Google App Engine 进行快速开发》。

GvR在幻灯片演示文稿的第10页上描述了常规/标准项目布局。

在这里,我将从该页面发布布局/结构的略微修改版本。我本人几乎遵循这种模式。您还提到了打包方面的问题。只要确保您的每个子文件夹都有一个__init__.py文件即可。如果它为空也可以。

样板文件

  • 这些项目之间几乎没有差异
  • app.yaml:将所有非静态请求定向到main.py
  • main.py:初始化应用并发送所有请求

项目布局

  • static/*:静态文件;由 App Engine 直接提供
  • myapp/*.py:特定于应用的 python 代码
    • views.py、models.py、tests.py、__init__.py 等
  • templates/*.html:模板(或 myapp/templates/*.html)

以下是一些可能也有帮助的代码示例:

main.py

import wsgiref.handlers

from google.appengine.ext import webapp
from myapp.views import *

application = webapp.WSGIApplication([
  ('/', IndexHandler),
  ('/foo', FooHandler)
], debug=True)

def main():
  wsgiref.handlers.CGIHandler().run(application)

myapp / views.py

import os
import datetime
import logging
import time

from google.appengine.api import urlfetch
from google.appengine.ext.webapp import template
from google.appengine.api import users
from google.appengine.ext import webapp
from models import *

class IndexHandler(webapp.RequestHandler):
  def get(self):
    date = "foo"
    # Do some processing        
    template_values = {'data': data }
    path = os.path.join(os.path.dirname(__file__) + '/../templates/', 'main.html')
    self.response.out.write(template.render(path, template_values))

class FooHandler(webapp.RequestHandler):
  def get(self):
    #logging.debug("start of handler")
    pass

myapp / models.py

from google.appengine.ext import db

class SampleModel(db.Model):
  pass

我认为这种布局非常适合新的和相对较小的中型项目。对于较大的项目,我建议分解视图和模型以使其具有以下子文件夹:

项目布局

  • static/:静态文件;由 App Engine 直接提供
    • js/*.js
    • images/*.gif|png|jpg
    • css/*.css
  • myapp/:应用程序结构
    • models/*.py
    • views/*.py
    • tests/*.py
    • templates/*.html:模板

First, I would suggest you have a look at "Rapid Development with Python, Django, and Google App Engine".

GvR describes a general/standard project layout on page 10 of his slide presentation.

Here I’ll post a slightly modified version of the layout/structure from that page. I pretty much follow this pattern myself. You also mentioned you had trouble with packages. Just make sure each of your sub folders has an __init__.py file. It’s ok if its empty.

Boilerplate files

  • These hardly vary between projects
  • app.yaml: direct all non-static requests to main.py
  • main.py: initialize app and send it all requests

Project lay-out

  • static/*: static files; served directly by App Engine
  • myapp/*.py: app-specific python code
    • views.py, models.py, tests.py, __init__.py, and more
  • templates/*.html: templates (or myapp/templates/*.html)

Here are some code examples that may help as well:

main.py

import wsgiref.handlers

from google.appengine.ext import webapp
from myapp.views import *

application = webapp.WSGIApplication([
  ('/', IndexHandler),
  ('/foo', FooHandler)
], debug=True)

def main():
  wsgiref.handlers.CGIHandler().run(application)

myapp/views.py

import os
import datetime
import logging
import time

from google.appengine.api import urlfetch
from google.appengine.ext.webapp import template
from google.appengine.api import users
from google.appengine.ext import webapp
from models import *

class IndexHandler(webapp.RequestHandler):
  def get(self):
    date = "foo"
    # Do some processing        
    template_values = {'data': data }
    path = os.path.join(os.path.dirname(__file__) + '/../templates/', 'main.html')
    self.response.out.write(template.render(path, template_values))

class FooHandler(webapp.RequestHandler):
  def get(self):
    #logging.debug("start of handler")
    pass

myapp/models.py

from google.appengine.ext import db

class SampleModel(db.Model):
  pass

I think this layout works great for new and relatively small to medium projects. For larger projects I would suggest breaking up the views and models to have their own sub-folders with something like:

Project lay-out

  • static/: static files; served directly by App Engine
    • js/*.js
    • images/*.gif|png|jpg
    • css/*.css
  • myapp/: app structure
    • models/*.py
    • views/*.py
    • tests/*.py
    • templates/*.html: templates

回答 1

我通常的布局如下所示:

  • app.yaml
  • index.yaml
  • request.py - 包含基本的 WSGI 应用
  • lib
    • __init__.py - 通用功能,包括请求处理程序基类
  • controllers - 包含所有处理程序。request.py 导入这些。
  • templates
    • 控制器使用的所有 django 模板
  • model
    • 所有数据存储区模型类
  • static
    • 静态文件(css、图像等)。由 app.yaml 映射到 /static

如果不清楚,我可以提供一些示例,说明我的 app.yaml、request.py、lib/__init__.py 和示例控制器是什么样子。

My usual layout looks something like this:

  • app.yaml
  • index.yaml
  • request.py – contains the basic WSGI app
  • lib
    • __init__.py – common functionality, including a request handler base class
  • controllers – contains all the handlers. request.py imports these.
  • templates
    • all the django templates, used by the controllers
  • model
    • all the datastore model classes
  • static
    • static files (css, images, etc). Mapped to /static by app.yaml

I can provide examples of what my app.yaml, request.py, lib/__init__.py, and sample controllers look like, if this isn't clear.


回答 2

我今天实现了一个 Google App Engine 样板,并将其签入到了 github 上。这与上面 Nick Johnson(曾任职于 Google)所描述的思路一致。

点击此链接gae-boilerplate

I implemented a google app engine boilerplate today and checked it on github. This is along the lines described by Nick Johnson above (who used to work for Google).

Follow this link gae-boilerplate


回答 3

我认为第一种选择被认为是最佳实践,并将代码文件夹作为您的第一个软件包。Guido van Rossum 开发的 Rietveld 项目是一个很好的学习范例。看看吧:http://code.google.com/p/rietveld

关于 Django 1.0,我建议您开始使用 Django 主干代码,而不是 GAE 内置的 django 移植版本。再次建议看看 Rietveld 是怎么做的。

I think the first option is considered the best practice. And make the code folder your first package. The Rietveld project developed by Guido van Rossum is a very good model to learn from. Have a look at it: http://code.google.com/p/rietveld

With regard to Django 1.0, I suggest you start using the Django trunk code instead of the GAE built in django port. Again, have a look at how it’s done in Rietveld.


回答 4

我喜欢webpy,因此在Google App Engine上将其用作模板框架。
我的软件包文件夹通常是这样组织的:

app.yaml
application.py
index.yaml
/app
   /config
   /controllers
   /db
   /lib
   /models
   /static
        /docs
        /images
        /javascripts
        /stylesheets
   test/
   utility/
   views/

这是一个例子。

I like webpy so I’ve adopted it as templating framework on Google App Engine.
My package folders are typically organized like this:

app.yaml
application.py
index.yaml
/app
   /config
   /controllers
   /db
   /lib
   /models
   /static
        /docs
        /images
        /javascripts
        /stylesheets
   test/
   utility/
   views/

Here is an example.


回答 5

在代码布局方面,我并没有完全了解最新的最佳实践等,但是当我完成第一个GAE应用程序时,我在第二个选项中使用了一些东西,其中代码和模板彼此相邻。

造成这种情况的原因有两个:一是将代码和模板放在一起;二是我的目录结构布局模仿了网站的布局,这(对我而言)让我更容易记住所有内容的位置。

I am not entirely up to date on the latest best practices, et cetera, when it comes to code layout, but when I did my first GAE application, I used something along the lines of your second option, where the code and templates are next to each other.

There were two reasons for this – one, it kept the code and templates nearby, and secondly, I had the directory structure layout mimic that of the website – making it (for me) a bit easier to remember where everything was.


无法通过pip安装Scipy

问题:无法通过pip安装Scipy

当通过pip安装scipy时:

pip install scipy

Pip无法构建scipy,并引发以下错误:

Cleaning up...
Command /Users/administrator/dev/KaggleAux/env/bin/python2.7 -c "import setuptools, tokenize;__file__='/Users/administrator/dev/KaggleAux/env/build/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/7698ng4d4nxd49q1845jd9340000gn/T/pip-eO8gua-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/administrator/dev/KaggleAux/env/bin/../include/site/python2.7 failed with error code 1 in /Users/administrator/dev/KaggleAux/env/build/scipy
Storing debug log for failure in /Users/administrator/.pip/pip.log

如何才能让 scipy 成功构建?这可能是 OSX Yosemite 下的新问题,因为我刚刚升级,而之前安装 scipy 从未遇到过问题。


调试日志:

Cleaning up...
  Removing temporary dir /Users/administrator/dev/KaggleAux/env/build...
Command /Users/administrator/dev/KaggleAux/env/bin/python2.7 -c "import setuptools, tokenize;__file__='/Users/administrator/dev/KaggleAux/env/build/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/7698ng4d4nxd49q1845jd9340000gn/T/pip-eO8gua-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/administrator/dev/KaggleAux/env/bin/../include/site/python2.7 failed with error code 1 in /Users/administrator/dev/KaggleAux/env/build/scipy
Exception information:
Traceback (most recent call last):
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/commands/install.py", line 283, in run
    requirement_set.install(install_options, global_options, root=options.root_path)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/req.py", line 1435, in install
    requirement.install(install_options, global_options, *args, **kwargs)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/req.py", line 706, in install
    cwd=self.source_dir, filter_stdout=self._filter_install, show_stdout=False)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/util.py", line 697, in call_subprocess
    % (command_desc, proc.returncode, cwd))
InstallationError: Command /Users/administrator/dev/KaggleAux/env/bin/python2.7 -c "import setuptools, tokenize;__file__='/Users/administrator/dev/KaggleAux/env/build/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/7698ng4d4nxd49q1845jd9340000gn/T/pip-eO8gua-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/administrator/dev/KaggleAux/env/bin/../include/site/python2.7 failed with error code 1 in /Users/administrator/dev/KaggleAux/env/build/scipy

When installing scipy through pip with :

pip install scipy

Pip fails to build scipy and throws the following error:

Cleaning up...
Command /Users/administrator/dev/KaggleAux/env/bin/python2.7 -c "import setuptools, tokenize;__file__='/Users/administrator/dev/KaggleAux/env/build/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/7698ng4d4nxd49q1845jd9340000gn/T/pip-eO8gua-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/administrator/dev/KaggleAux/env/bin/../include/site/python2.7 failed with error code 1 in /Users/administrator/dev/KaggleAux/env/build/scipy
Storing debug log for failure in /Users/administrator/.pip/pip.log

How can I get scipy to build successfully? This may be a new issue with OSX Yosemite since I just upgraded and haven’t had issues installing scipy before.


Debug log:

Cleaning up...
  Removing temporary dir /Users/administrator/dev/KaggleAux/env/build...
Command /Users/administrator/dev/KaggleAux/env/bin/python2.7 -c "import setuptools, tokenize;__file__='/Users/administrator/dev/KaggleAux/env/build/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/7698ng4d4nxd49q1845jd9340000gn/T/pip-eO8gua-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/administrator/dev/KaggleAux/env/bin/../include/site/python2.7 failed with error code 1 in /Users/administrator/dev/KaggleAux/env/build/scipy
Exception information:
Traceback (most recent call last):
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/commands/install.py", line 283, in run
    requirement_set.install(install_options, global_options, root=options.root_path)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/req.py", line 1435, in install
    requirement.install(install_options, global_options, *args, **kwargs)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/req.py", line 706, in install
    cwd=self.source_dir, filter_stdout=self._filter_install, show_stdout=False)
  File "/Users/administrator/dev/KaggleAux/env/lib/python2.7/site-packages/pip/util.py", line 697, in call_subprocess
    % (command_desc, proc.returncode, cwd))
InstallationError: Command /Users/administrator/dev/KaggleAux/env/bin/python2.7 -c "import setuptools, tokenize;__file__='/Users/administrator/dev/KaggleAux/env/build/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/7698ng4d4nxd49q1845jd9340000gn/T/pip-eO8gua-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/administrator/dev/KaggleAux/env/bin/../include/site/python2.7 failed with error code 1 in /Users/administrator/dev/KaggleAux/env/build/scipy

回答 0

向 SciPy 团队提交 issue 后,我们发现您需要使用以下命令升级 pip:

pip install --upgrade pip

在 Python 3 中则是:

python3 -m pip install --upgrade pip

这样 SciPy 才能正确安装。为什么?因为:

必须告知较旧版本的 pip 使用 wheel,IIRC 是通过 --use-wheel。或者,您可以升级 pip 本身,之后它应该会自动使用 wheel。

升级 pip 可以解决此问题,但您也可以只使用 --use-wheel 标志。

After opening up an issue with the SciPy team, we found that you need to upgrade pip with:

pip install --upgrade pip

And in Python 3 this works:

python3 -m pip install --upgrade pip

for SciPy to install properly. Why? Because:

Older versions of pip have to be told to use wheels, IIRC with --use-wheel. Or you can upgrade pip itself; then it should pick up the wheels.

Upgrading pip solves the issue, but you might be able to just use the --use-wheel flag as well.
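
For reference, the flag route on an older pip would look like this (note that --use-wheel existed only on older pip releases and was later removed once wheels became the default):

pip install --use-wheel scipy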


回答 1

安装了 64 位 Python 的 Microsoft Windows 用户需要从这里下载 64 位的 Scipy .whl 文件,然后 cd 进入您下载 .whl 文件的文件夹并运行:

pip install scipy-0.16.1-cp27-none-win_amd64.whl

Microsoft Windows users of 64-bit Python installations will need to download the 64-bit .whl of Scipy from here, then simply cd into the folder you've downloaded the .whl file to and run:

pip install scipy-0.16.1-cp27-none-win_amd64.whl

回答 2

在ubuntu下安装Scipy时遇到相同的问题。
我不得不使用命令:

$ sudo apt-get install libatlas-base-dev gfortran
$ sudo pip3 install scipy

您可以在此处获得更多详细信息:使用 pip 安装 SciPy。
抱歉,不知道在 OS X Yosemite 下该如何操作。

I faced the same problem when installing Scipy under Ubuntu.
I had to use the commands:

$ sudo apt-get install libatlas-base-dev gfortran
$ sudo pip3 install scipy

You can get more details here Installing SciPy with pip
Sorry don’t know how to do it under OS X Yosemite.


回答 3

在Windows 10中,大多数选项将不起作用。跟着这些步骤:

在 Windows 10 的 CMD 中,您无法直接使用大多数常见命令来下载 scipy,例如 wget、克隆 scipy 的 github、pip install scipy 等。

要安装,请转到 pythonlibs 的 .whl 文件页面:如果您使用 32 位的 python 2.7,请下载 numpy-1.11.2rc1+mkl-cp27-cp27m-win32.whl 和 scipy-0.18.1-cp27-cp27m-win32.whl;如果是 64 位的 python 2.7,请下载 numpy-1.11.2rc1+mkl-cp27-cp27m-win_amd64.whl 和 scipy-0.18.1-cp27-cp27m-win_amd64.whl。

下载后,将文件保存在您的 python 目录下,在我的情况下是 c:\>python27

然后运行:

pip install C:\Python27\numpy-1.11.2rc1+mkl-cp27-cp27m-win32.whl 
pip install C:\Python27\scipy-0.18.1-cp27-cp27m-win32.whl

注意:

  1. scipy 需要 numpy 作为依赖项,这就是为什么我们在 scipy 之前先下载 numpy。
  2. .whl 文件名中的 cp27 表示这些文件专门用于 python 2.7,而 cp33 表示 python 3.x(确切地说 >= 3.3)。

In windows 10, most options will not work. Follow these steps:

In Windows 10 with CMD, you cannot download scipy directly using most of the well known commands like wget, cloning scipy github, pip install scipy, etc

To install, go to the pythonlibs .whl files, and if you are using python 2.7 32 bit then download numpy-1.11.2rc1+mkl-cp27-cp27m-win32.whl and scipy-0.18.1-cp27-cp27m-win32.whl, or if python 2.7 64 bit then download numpy-1.11.2rc1+mkl-cp27-cp27m-win_amd64.whl and scipy-0.18.1-cp27-cp27m-win_amd64.whl

After downloading, save the files under your python directory; in my case it was c:\>python27

Then run:

pip install C:\Python27\numpy-1.11.2rc1+mkl-cp27-cp27m-win32.whl 
pip install C:\Python27\scipy-0.18.1-cp27-cp27m-win32.whl

Note:

  1. scipy needs numpy as a dependency, so that's why we are downloading numpy before scipy.
  2. cp27 in .whl file names means that these files are meant for python 2.7, and cp33 stands for python 3.x, specifically >=3.3

回答 4

在这个答案中找到一些线索后,我通过执行以下操作解决了问题:

brew install gcc 
pip install scipy

(其中第一步在我的 2011 Mac Book Air 上花了 96 分钟,所以希望您不着急!)

After finding this answer for some clues, I got this working by doing

brew install gcc 
pip install scipy

(The first of these steps took 96 minutes on my 2011 Mac Book Air so I hope you’re not in a hurry!)


回答 5

如果您是python的新手,请分步阅读或直接进入最后一步。请按照以下方法在Windows 64位,Python 64位上安装scipy 0.18.1。如果以下命令不起作用,请继续进行操作

pip install scipy

注意以下版本

  1. Python

  2. Windows

  3. .whl版本的numpy和scipy文件

  4. 首先安装numpy和scipy。

    pip install FileName.whl
  5. 对于 Numpy:http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy 对于 Scipy:http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy

注意文件名(检查版本号)。

例如:scipy-0.18.1-cp35-cp35m-win_amd64.whl

要检查您的点子支持哪个版本,请转到下面的第2点。

如果您正在使用.whl文件。可能会发生以下错误。

  1. 您正在使用pip版本7.1.0,但是版本8.1.2可用。

您应该考虑通过 'python -m pip install --upgrade pip' 命令进行升级

  1. 在此平台上不支持scipy-0.15.1-cp33-none-win_amd64.whl.whl

对于上述错误:启动Python并输入:

import pip
print(pip.pep425tags.get_supported())

输出:

[(’cp35’,’cp35m’,’win_amd64’),(’cp35’,’none’,’win_amd64’),(’py3’,’none’,’win_amd64’),(’cp35’,’none ‘,’any’),(’cp3’,’none’,’any’),(’py35’,’none’,’any’),(’py3’,’none’,’any’),( ‘py34’,’none’,’any’),(’py33’,’none’,’any’),(’py32’,’none’,’any’),(’py31’,’none’, ‘any’),(’py30’,’none’,’any’)]

在输出中,您会看到cp35在那里,因此为numpy和scipy下载cp35,欢迎进一步编辑。

If you are totally new to python read step by step or go directly to last step. Follow the below method to install scipy 0.18.1 on Windows 64-bit , Python 64-bit . If below command is not working then proceed further

pip install scipy

Be careful with the versions of

  1. Python

  2. Windows

  3. .whl version of numpy and scipy files

  4. First install numpy and scipy.

    pip install FileName.whl
    
  5. For Numpy:http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy For Scipy:http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy

Be aware of the file name (check the version number).

Ex :scipy-0.18.1-cp35-cp35m-win_amd64.whl

To check which version is supported by your pip, go to point No 2 below.

If you are using .whl file . Following errors are likely to occur .

  1. You are using pip version 7.1.0, however version 8.1.2 is available.

You should consider upgrading via the 'python -m pip install --upgrade pip' command

  1. scipy-0.15.1-cp33-none-win_amd64.whl.whl is not supported wheel on this platform

For the above error: start Python and type :

import pip
print(pip.pep425tags.get_supported())

Output:

[(‘cp35’, ‘cp35m’, ‘win_amd64’), (‘cp35’, ‘none’, ‘win_amd64’), (‘py3’, ‘none’, ‘win_amd64’), (‘cp35’, ‘none’, ‘any’), (‘cp3’, ‘none’, ‘any’), (‘py35’, ‘none’, ‘any’), (‘py3’, ‘none’, ‘any’), (‘py34’, ‘none’, ‘any’), (‘py33’, ‘none’, ‘any’), (‘py32’, ‘none’, ‘any’), (‘py31’, ‘none’, ‘any’), (‘py30’, ‘none’, ‘any’)]

In the output you will observe cp35 is there, so download cp35 for numpy as well as scipy. Further edits are most welcome.


回答 6

对于Windows 10

C:\目录> pip install scipy-0.19.0rc2-cp35-cp35m-win_amd64.whl

For Windows 10

C:\directory> pip install scipy-0.19.0rc2-cp35-cp35m-win_amd64.whl


回答 7

与其费力地下载特定的软件包,我更喜欢使用 Conda 这条更快的路。pip 有它自己的问题。

  • Python -v(3.6.0)
  • Windows 10(64位)

Conda,从以下位置安装 conda:https://conda.io/docs/install/quick.html#windows-miniconda-install

命令提示符

C:\Users\xyz>conda install -c anaconda scipy=0.18.1
Fetching package metadata .............
Solving package specifications:

在环境 C:\Users\xyz\Miniconda3 中安装的软件包计划:

将安装以下新软件包:

mkl:       2017.0.1-0         anaconda
numpy:     1.12.0-py36_0      anaconda
scipy:     0.18.1-np112py36_1 anaconda

以下软件包将被更高优先级的 channel 所取代:

conda:     4.3.11-py36_0               --> 4.3.11-py36_0 anaconda
conda-env: 2.6.0-0                     --> 2.6.0-0       anaconda

是否继续([y]/n)?y

conda-env-2.6. 100% |###############################| Time: 0:00:00  32.92 kB/s
mkl-2017.0.1-0 100% |###############################| Time: 0:00:24   5.45 MB/s
numpy-1.12.0-p 100% |###############################| Time: 0:00:00   5.09 MB/s
scipy-0.18.1-n 100% |###############################| Time: 0:00:02   5.59 MB/s
conda-4.3.11-p 100% |###############################| Time: 0:00:00   4.70 MB/s

Rather than going the harder route of downloading specific packages. I prefer to go the faster route of using Conda. pip has its issues.

  • Python -v (3.6.0)
  • Windows 10 (64 bit)

Conda , install conda from : https://conda.io/docs/install/quick.html#windows-miniconda-install

command prompt

C:\Users\xyz>conda install -c anaconda scipy=0.18.1
Fetching package metadata .............
Solving package specifications:

Package plan for installation in environment C:\Users\xyz\Miniconda3:

The following NEW packages will be INSTALLED:

mkl:       2017.0.1-0         anaconda
numpy:     1.12.0-py36_0      anaconda
scipy:     0.18.1-np112py36_1 anaconda

The following packages will be SUPERCEDED by a higher-priority channel:

conda:     4.3.11-py36_0               --> 4.3.11-py36_0 anaconda
conda-env: 2.6.0-0                     --> 2.6.0-0       anaconda

Proceed ([y]/n)? y

conda-env-2.6. 100% |###############################| Time: 0:00:00  32.92 kB/s
mkl-2017.0.1-0 100% |###############################| Time: 0:00:24   5.45 MB/s
numpy-1.12.0-p 100% |###############################| Time: 0:00:00   5.09 MB/s
scipy-0.18.1-n 100% |###############################| Time: 0:00:02   5.59 MB/s
conda-4.3.11-p 100% |###############################| Time: 0:00:00   4.70 MB/s

回答 8

  1. 从 http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy 下载 SciPy
  2. 进入下载文件所在的目录,并用 pip install 安装该文件。
  3. 转到python shell,运行import scipy;它对我没有任何错误。
  1. Download SciPy from http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy
  2. Go into the directory the downloaded file is in and pip install the file.
  3. Go to python shell, run import scipy; it worked for me with no errors.

回答 9

这是 pip 之外的替代方法。使用 pip 安装 scipy 时我也遇到了相同的错误。

然后我下载并安装了MiniConda。然后,我使用以下命令安装pytables。

conda install -c conda-forge scipy

请参考以下屏幕截图。

This is an alternative to pip. I also had the same error when installing scipy with pip.

Then I downloaded and installed MiniConda. And then I used the below command to install pytables.

conda install -c conda-forge scipy

Please refer to the screenshot below.


回答 10

我可以建议的最好方法是

  1. 从此位置下载wheel文件以获取您的python版本

  2. 将文件移动到主驱动器,例如C:>

  3. 运行Cmd并输入以下内容

    • pip install scipy-1.0.0rc1-cp36-none-win_amd64.whl

请注意,这是我在 python 3.6.2 上使用的版本,应该可以正常安装

之后您可能需要运行以下命令,以确保您所有的 python 附加组件都是最新的:

pip list --outdated

The best method I could suggest is this:

  1. Download the wheel file from this location for your version of python

  2. Move the file to your Main Drive eg C:>

  3. Run Cmd and enter the following

    • pip install scipy-1.0.0rc1-cp36-none-win_amd64.whl

Please note this is the version I am using for my Python 3.6.2; it should install fine.

You may want to run this command afterwards to make sure all your python add-ons are up to date:

pip list --outdated

回答 11

或者,从 http://www.lfd.uci.edu/~gohlke/pythonlibs 手动下载并安装适合您的 Scipy 版本。考虑您的 Python 版本(python --version)和系统架构(32/64 位),并相应地选择 Scipy 版本:scipy-0.18.1-cp27-cp27m-win32 用于 Python 2.7 32 位;scipy-0.18.1-cp27-cp27m-win_amd64 用于 Python 2.7 64 位。否则安装时会弹出错误 scipy-0.15.1-cp33-none-win_amd64.whl.whl is not supported wheel on this platform。

现在将目录切换到下载文件所在位置,并执行命令 pip install scipy-0.15.1-cp33-none-win_amd64.whl.whl(适当更改文件名)。

我添加此答案的唯一原因是Arun的答案(对我自己有用)没有提及我所遇到的有关32/64位匹配的任何内容。

Alternatively, manually download and execute the http://www.lfd.uci.edu/~gohlke/pythonlibs Scipy version suitable for you. Consider your Python version (python --version) and your system architecture (32/64 bit). Choose the Scipy version accordingly: scipy-0.18.1-cp27-cp27m-win32 for Python 2.7 32 bit; scipy-0.18.1-cp27-cp27m-win_amd64 for Python 2.7 64 bit. Otherwise the error "scipy-0.15.1-cp33-none-win_amd64.whl.whl is not supported wheel on this platform" will pop up on installation.

Now change directory to the downloaded file and execute command pip install scipy-0.15.1-cp33-none-win_amd64.whl.whl (change file name appropriately)

I have added this answer only because Arun's answer (which I found useful) did not mention anything about the 32/64-bit matching issue which I faced.


回答 12

如果您使用的是CentOS,则需要按以下方式安装lapack-devel:

 $ yum install lapack-devel

If you are using CentOS you need to install lapack-devel like so:

 $ yum install lapack-devel

回答 13

尝试从下面的链接下载scipy文件

https://sourceforge.net/projects/scipy/?source=typ_redirect

这将是一个.exe文件,您只需要运行它即可。但请确保选择与您的python版本相对应的scipy版本。

运行scipy.exe文件时,它将找到python目录并进行安装。

Try downloading the scipy file from the below link

https://sourceforge.net/projects/scipy/?source=typ_redirect

It will be a .exe file and you just need to run it. But be sure to choose the scipy version corresponding to your python version.

When the scipy.exe file is run, it will locate the python directory and scipy will be installed.


回答 14

使用 wheel 文件进行安装:从 http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy 下载,然后安装:

pip install c:\jjjj\ggg\fdadf.whl

Use the wheel file to install: download it from http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy, then install with:

pip install c:\jjjj\ggg\fdadf.whl

回答 15

我遇到了同样的问题,并且成功使用了sudo

$ sudo pip install scipy

I was having the same issue, and I succeeded using sudo:

$ sudo pip install scipy

回答 16

最简单的方法是执行以下步骤:为 python [2.n < python < 3.n] 修复 scipy。

从以下位置下载必要的文件:http : //www.lfd.uci.edu/~gohlke/pythonlibs/

下载numpy + mkl的版本(需要运行scipy),然后为您的python类型(2.n python编写为2n)或(3.n python编写为3n)下载scipy,n是一个变量。请注意,您必须知道您拥有32位还是64位处理器。

在计算机上的某个位置创建一个目录,例如 [C:\DIRECTORY],用于存放文件 numpy+mkl.whl 和 scipy.whl

下载完两个文件后,在计算机上找到文件的位置,然后将其移动到您创建的目录中。

示例:首先需要安装文件才能安装scipy

C:\DIRECTORY\numpy\numpy-0.0.0+mkl-cp2n-cp2nm-win_amd32.whl

示例:第二个文件安装在

C:\DIRECTORY\scipy\scipy-0.0.0-cp2n-cp2nm-win_amd32.whl

转到命令提示符并针对python 2.n版本继续以下示例:

py -2.n -m pip install C:\DIRECTORY\numpy\numpy-0.0.0+mkl-cp2n-cp2nm-win_amd32.whl

应该安装

py -2.n -m pip install C:\DIRECTORY\scipy\scipy-0.0.0-cp2n-cp2nm-win_amd32.whl

应该安装

如下测试python IDLE上的两个模块:

import numpy

import scipy

如果没有错误返回,则模块正在工作。

IFDAAS

The easiest way is in the following steps: Fixing scipy for python [ 2.n < python < 3.n ]

Download the necessary files from: http://www.lfd.uci.edu/~gohlke/pythonlibs/

Download the version of numpy+mkl (needed to run scipy) and then download scipy for your python type (2.n python written as 2n) or (3.n python written as 3n), n is a variable. Note you must know whether you have a 32bit or 64bit processor.

Create a directory somewhere on your computer, for example [C:\DIRECTORY], to hold the files numpy+mkl.whl and scipy.whl

Once both file are downloaded, find the location of the file on your computer and move it to the directory you created.

Example: First file installation is needed for scipy is in

C:\DIRECTORY\numpy\numpy-0.0.0+mkl-cp2n-cp2nm-win_amd32.whl

Example: Second file installation is in

C:\DIRECTORY\scipy\scipy-0.0.0-cp2n-cp2nm-win_amd32.whl

Go to your command prompt and proceed the following example for a python version 2.n:

py -2.n -m pip install C:\DIRECTORY\numpy\numpy-0.0.0+mkl-cp2n-cp2nm-win_amd32.whl

should install

py -2.n -m pip install C:\DIRECTORY\scipy\scipy-0.0.0-cp2n-cp2nm-win_amd32.whl

should install

Test both modules on your python IDLE as following:

import numpy

import scipy

the modules are working if no errors are returned.

IFDAAS


回答 17

对于Windows(在我的情况下为7):

  1. http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy下载scipy-0.19.1-cp36-cp36m-win32.whl
  2. 创建一个带有内容的some.bat文件

    @echo off
    C:\Python36\python.exe -m pip -V
    C:\Python36\python.exe -m pip install scipy-0.19.1-cp36-cp36m-win32.whl
    C:\Python36\python.exe -m pip list
    pause

  3. 然后运行此批处理文件some.bat

  4. 调用 python shell "C:\Python36\pythonw.exe" "C:\Python36\Lib\idlelib\idle.pyw",并用以下语句测试 scipy 是否已安装:

import scipy

For windows(7 in my case):

  1. download scipy-0.19.1-cp36-cp36m-win32.whl from http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy
  2. create one some.bat file with content

    @echo off
    C:\Python36\python.exe -m pip -V
    C:\Python36\python.exe -m pip install scipy-0.19.1-cp36-cp36m-win32.whl
    C:\Python36\python.exe -m pip list
    pause

  3. then run this batch file some.bat

  4. call the python shell "C:\Python36\pythonw.exe" "C:\Python36\Lib\idlelib\idle.pyw" and test if scipy was installed with

import scipy


回答 18

在 Windows 10 上 100% 成功安装 scipy 的简单方法是:只需执行 ====> pip install scipy==1.0.0rc2

晚点再谢我 :)

The easy way to install scipy on Windows 10 100% is this: Just pip this ====> pip install scipy==1.0.0rc2

Thank me later :)


回答 19

我在Python 3.7(3.7.0b4)中遇到了类似的问题。这是由于有关某些编码假设的某些更改(Python 3.6 >> Python 3.7)

结果,许多软件包安装(例如通过pip)失败。

I experienced similar issues with Python 3.7 (3.7.0b4). This was due to some changes regarding some encoding assumptions (Python 3.6 >> Python 3.7)

As a result lots of package installations (e.g. via pip) failed.


回答 20

您可以测试以下答案:

python -m pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose

You can test this answer:

python -m pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose

Python中“(1,)== 1”是什么意思?

问题:Python中“(1,)== 1”是什么意思?

我正在测试元组结构，当我像这样使用==运算符时发现很奇怪：

>>>  (1,) == 1,
Out: (False,)

当我将这两个表达式分配给变量时,结果为true:

>>> a = (1,)
>>> b = 1,
>>> a==b
Out: True

在我看来，这个问题与《Python元组尾随逗号语法规则》那个问题不同。我问的是==运算符两侧表达式如何分组。

I’m testing the tuple structure, and I found it’s strange when I use the == operator like:

>>>  (1,) == 1,
Out: (False,)

When I assign these two expressions to a variable, the result is true:

>>> a = (1,)
>>> b = 1,
>>> a==b
Out: True

This question is different from Python tuple trailing comma syntax rule, in my view. I am asking about how the expressions around the == operator are grouped.


回答 0

其他答案已经向您展示了该行为是由运算符优先级导致的，如这里的文档所述。

我要向您展示的是，下次遇到类似问题时如何自己找到答案。您可以使用ast模块来解构表达式的解析过程：

>>> import ast
>>> source_code = '(1,) == 1,'
>>> print(ast.dump(ast.parse(source_code), annotate_fields=False))
Module([Expr(Tuple([Compare(Tuple([Num(1)], Load()), [Eq()], [Num(1)])], Load()))])

从中我们可以看到代码已按照Tim Peters的解释进行了解析:

Module([Expr(
    Tuple([
        Compare(
            Tuple([Num(1)], Load()), 
            [Eq()], 
            [Num(1)]
        )
    ], Load())
)])

Other answers have already shown you that the behaviour is due to operator precedence, as documented here.

I’m going to show you how to find the answer yourself next time you have a question similar to this. You can deconstruct how the expression parses using the ast module:

>>> import ast
>>> source_code = '(1,) == 1,'
>>> print(ast.dump(ast.parse(source_code), annotate_fields=False))
Module([Expr(Tuple([Compare(Tuple([Num(1)], Load()), [Eq()], [Num(1)])], Load()))])

From this we can see that the code gets parsed as Tim Peters explained:

Module([Expr(
    Tuple([
        Compare(
            Tuple([Num(1)], Load()), 
            [Eq()], 
            [Num(1)]
        )
    ], Load())
)])

回答 1

这只是运算符的优先级。你的第一个

(1,) == 1,

分组方式如下：

((1,) == 1),

因此，它用“将单元素元组1,与整数1进行相等比较”的结果构建了一个单元素元组。两者并不相等，所以最终得到的是单元素元组(False,)。

This is just operator precedence. Your first

(1,) == 1,

groups like so:

((1,) == 1),

so builds a tuple with a single element from the result of comparing the one-element tuple (1,) to the integer 1 for equality. They're not equal, so you get the 1-tuple (False,) as a result.
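
To see the grouping concretely, here is a minimal sketch (the variable names are illustrative, not from the answer):

# Explicit grouping reproduces the surprising result:
a = ((1,) == 1),      # a 1-tuple containing the comparison result
print(a)              # (False,)

# Comparing two 1-tuples is what was probably intended:
b = (1,) == (1,)
print(b)              # True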


回答 2

当你做

>>> (1,) == 1,

它将元组(1,)与一个整数进行比较得到False，并由此构建出一个元组。

相反,当您分配变量时,两个相等的元组会相互比较。

你可以试试:

>>> x = 1,
>>> x
(1,)

When you do

>>> (1,) == 1,

it builds a tuple with the result from comparing the tuple (1,) with an integer and thus returning False.

Instead when you assign to variables, the two equal tuples are compared with each other.

You can try:

>>> x = 1,
>>> x
(1,)

遍历字符串

问题:遍历字符串

我有这样定义的多行字符串:

foo = """
this is 
a multi-line string.
"""

我们将这个字符串用作我正在编写的解析器的测试输入。解析器函数接收一个file-object作为输入并对其进行迭代。它还会直接调用next()方法来跳过行，因此我确实需要一个迭代器作为输入，而不是一个可迭代对象。我需要一个迭代器，它能像file-object遍历文本文件的行那样遍历该字符串的各行。我当然可以这样做：

lineiterator = iter(foo.splitlines())

是否有更直接的方法？在这种情况下，字符串必须先被遍历一次来完成拆分，然后再被解析器遍历一次。在我的测试用例中这无关紧要，因为那里的字符串很短；我只是出于好奇才问。Python为这类需求提供了很多有用且高效的内置功能，但我找不到适合这个需求的。

I have a multi-line string defined like this:

foo = """
this is 
a multi-line string.
"""

We use this string as test input for a parser I am writing. The parser function receives a file-object as input and iterates over it. It also calls the next() method directly to skip lines, so I really need an iterator as input, not an iterable. I need an iterator that iterates over the individual lines of that string like a file-object would over the lines of a text file. I could of course do it like this:

lineiterator = iter(foo.splitlines())

Is there a more direct way of doing this? In this scenario the string has to be traversed once for the splitting, and then again by the parser. It doesn't matter in my test case, since the string is very short there; I am just asking out of curiosity. Python has so many useful and efficient built-ins for such stuff, but I could find nothing that suits this need.


回答 0

这是三种可能性:

foo = """
this is 
a multi-line string.
"""

def f1(foo=foo): return iter(foo.splitlines())

def f2(foo=foo):
    retval = ''
    for char in foo:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

def f3(foo=foo):
    prevnl = -1
    while True:
      nextnl = foo.find('\n', prevnl + 1)
      if nextnl < 0: break
      yield foo[prevnl + 1:nextnl]
      prevnl = nextnl

if __name__ == '__main__':
  for f in f1, f2, f3:
    print list(f())

将其作为主脚本运行，可确认这三个函数是等效的。使用timeit测量（并将foo乘以100，以获得足够长的字符串来进行更精确的测量）：

$ python -mtimeit -s'import asp' 'list(asp.f3())'
1000 loops, best of 3: 370 usec per loop
$ python -mtimeit -s'import asp' 'list(asp.f2())'
1000 loops, best of 3: 1.36 msec per loop
$ python -mtimeit -s'import asp' 'list(asp.f1())'
10000 loops, best of 3: 61.5 usec per loop

注意,我们需要list()调用以确保遍历迭代器,而不仅仅是构建迭代器。

换句话说，朴素的实现快得多，简直令人难以置信：比我用find调用的尝试快6倍，而后者又比更底层的方法快4倍。

经验教训：测量永远是好事（但必须准确）；像splitlines这样的字符串方法是以非常快的方式实现的；通过非常底层的编程方式（尤其是用+=循环拼接非常小的片段）来组装字符串可能会相当慢。

编辑：添加了@Jacob的提案，并稍作修改以使其与其他方案给出相同的结果（保留行尾的空白），即：

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip('\n')
        else:
            raise StopIteration

测量得出:

$ python -mtimeit -s'import asp' 'list(asp.f4())'
1000 loops, best of 3: 406 usec per loop

不如基于.find的方法快，但仍然值得牢记，因为它可能更不容易出现细小的差一（off-by-one）错误（任何出现+1和-1的循环，比如上面我的f3，都应该自动引起对差一错误的怀疑；许多缺少这类调整、而本应有它们的循环也是如此。不过我相信我的代码也是正确的，因为我能够用其他函数核对它的输出）。

但是基于拆分的方法仍然占主导地位。

顺便说一句：f4可能更好的写法是：

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl == '': break
        yield nl.strip('\n')

至少，它不那么冗长。不幸的是，需要去掉行尾的\n，这使得无法用更清晰、更快的return iter(stri)来替换while循环（其中iter部分在现代版本的Python中是多余的，我相信从2.3或2.4起就是如此，但它也是无害的）。也许也值得一试：

    return itertools.imap(lambda s: s.strip('\n'), stri)

或其变体。但我就到此为止，因为相对于基于strip的那个最简单、最快的方案，这基本上只是一个理论练习。

Here are three possibilities:

foo = """
this is 
a multi-line string.
"""

def f1(foo=foo): return iter(foo.splitlines())

def f2(foo=foo):
    retval = ''
    for char in foo:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

def f3(foo=foo):
    prevnl = -1
    while True:
      nextnl = foo.find('\n', prevnl + 1)
      if nextnl < 0: break
      yield foo[prevnl + 1:nextnl]
      prevnl = nextnl

if __name__ == '__main__':
  for f in f1, f2, f3:
    print list(f())

Running this as the main script confirms the three functions are equivalent. With timeit (and a * 100 for foo to get substantial strings for more precise measurement):

$ python -mtimeit -s'import asp' 'list(asp.f3())'
1000 loops, best of 3: 370 usec per loop
$ python -mtimeit -s'import asp' 'list(asp.f2())'
1000 loops, best of 3: 1.36 msec per loop
$ python -mtimeit -s'import asp' 'list(asp.f1())'
10000 loops, best of 3: 61.5 usec per loop

Note we need the list() call to ensure the iterators are traversed, not just built.

IOW, the naive implementation is so much faster it isn’t even funny: 6 times faster than my attempt with find calls, which in turn is 4 times faster than a lower-level approach.

Lessons to retain: measurement is always a good thing (but must be accurate); string methods like splitlines are implemented in very fast ways; putting strings together by programming at a very low level (esp. by loops of += of very small pieces) can be quite slow.

Edit: added @Jacob’s proposal, slightly modified to give the same results as the others (trailing blanks on a line are kept), i.e.:

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip('\n')
        else:
            raise StopIteration

Measuring gives:

$ python -mtimeit -s'import asp' 'list(asp.f4())'
1000 loops, best of 3: 406 usec per loop

not quite as good as the .find based approach — still, worth keeping in mind because it might be less prone to small off-by-one bugs (any loop where you see occurrences of +1 and -1, like my f3 above, should automatically trigger off-by-one suspicions — and so should many loops which lack such tweaks and should have them — though I believe my code is also right since I was able to check its output with other functions’).

But the split-based approach still rules.

An aside: possibly better style for f4 would be:

from cStringIO import StringIO

def f4(foo=foo):
    stri = StringIO(foo)
    while True:
        nl = stri.readline()
        if nl == '': break
        yield nl.strip('\n')

at least, it’s a bit less verbose. The need to strip trailing \ns unfortunately prohibits the clearer and faster replacement of the while loop with return iter(stri) (the iter part whereof is redundant in modern versions of Python, I believe since 2.3 or 2.4, but it’s also innocuous). Maybe worth trying, also:

    return itertools.imap(lambda s: s.strip('\n'), stri)

or variations thereof — but I’m stopping here since it’s pretty much a theoretical exercise wrt the strip based, simplest and fastest, one.


回答 1

我不确定您的意思是“然后再由解析器”。拆分完成后,将不再遍历字符串,而仅遍历拆分字符串列表。只要您的字符串的大小不是绝对很大,这实际上可能是最快的方法。python使用不可变字符串的事实意味着您必须始终创建一个新字符串,因此无论如何都必须这样做。

如果字符串很大,则不利之处在于内存使用情况:您将同时在内存中拥有原始字符串和拆分字符串列表,从而使所需的内存增加了一倍。迭代器方法可以节省您的开销,可以根据需要构建字符串,尽管它仍然要付出“分割”的代价。但是,如果您的字符串太大,则通常甚至要避免将未拆分的字符串存储在内存中。最好只从文件中读取字符串,该文件已经允许您以行形式遍历该字符串。

但是,如果您确实已经在内存中存储了一个巨大的字符串,则一种方法是使用StringIO,它为字符串提供了一个类似于文件的接口,包括允许逐行迭代(内部使用.find查找下一个换行符)。您将得到:

import StringIO
s = StringIO.StringIO(myString)
for line in s:
    do_something_with(line)

I’m not sure what you mean by “then again by the parser”. After the splitting has been done, there’s no further traversal of the string, only a traversal of the list of split strings. This will probably actually be the fastest way to accomplish this, so long as the size of your string isn’t absolutely huge. The fact that python uses immutable strings means that you must always create a new string, so this has to be done at some point anyway.

If your string is very large, the disadvantage is in memory usage: you’ll have the original string and a list of split strings in memory at the same time, doubling the memory required. An iterator approach can save you this, building a string as needed, though it still pays the “splitting” penalty. However, if your string is that large, you generally want to avoid even the unsplit string being in memory. It would be better just to read the string from a file, which already allows you to iterate through it as lines.

However if you do have a huge string in memory already, one approach would be to use StringIO, which presents a file-like interface to a string, including allowing iterating by line (internally using .find to find the next newline). You then get:

import StringIO
s = StringIO.StringIO(myString)
for line in s:
    do_something_with(line)

回答 2

如果我没有看错Modules/cStringIO.c,这应该是非常有效的(尽管有些冗长):

from cStringIO import StringIO

def iterbuf(buf):
    stri = StringIO(buf)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip()
        else:
            raise StopIteration

If I read Modules/cStringIO.c correctly, this should be quite efficient (although somewhat verbose):

from cStringIO import StringIO

def iterbuf(buf):
    stri = StringIO(buf)
    while True:
        nl = stri.readline()
        if nl != '':
            yield nl.strip()
        else:
            raise StopIteration

回答 3

基于正则表达式的搜索有时比生成器方法要快:

RRR = re.compile(r'(.*)\n')
def f4(arg):
    return (i.group(1) for i in RRR.finditer(arg))

Regex-based searching is sometimes faster than generator approach:

RRR = re.compile(r'(.*)\n')
def f4(arg):
    return (i.group(1) for i in RRR.finditer(arg))

回答 4

我想你可以自己动手:

def parse(string):
    retval = ''
    for char in string:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

我不确定此实现的效率如何,但这只会在您的字符串上迭代一次。

嗯,生成器。

编辑:

当然,您还想添加想要执行的任何类型的解析操作,但这很简单。

I suppose you could roll your own:

def parse(string):
    retval = ''
    for char in string:
        retval += char if not char == '\n' else ''
        if char == '\n':
            yield retval
            retval = ''
    if retval:
        yield retval

I’m not sure how efficient this implementation is, but that will only iterate over your string once.

Mmm, generators.

Edit:

Of course you’ll also want to add in whatever type of parsing actions you want to take, but that’s pretty simple.


回答 5

您可以遍历“一个文件”，它会产生包括尾随换行符在内的各行。要用字符串制作一个“虚拟文件”，可以使用StringIO：

import io  # for Py2.7 that would be import cStringIO as io

for line in io.StringIO(foo):
    print(repr(line))

You can iterate over “a file”, which produces lines, including the trailing newline character. To make a “virtual file” out of a string, you can use StringIO:

import io  # for Py2.7 that would be import cStringIO as io

for line in io.StringIO(foo):
    print(repr(line))
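
As a supplement, here is a minimal Python 3 sketch (the string value is an illustrative assumption) showing that this also covers the question's requirement of calling next() directly to skip lines, just like a real file object:

import io

foo = "header\nline 1\nline 2\n"

lines = io.StringIO(foo)
next(lines)                # skip the header line, as the parser in the question does
for line in lines:
    print(repr(line))      # each line keeps its trailing '\n', like a text file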

使用python list comprehension根据条件查找元素的索引

问题:使用python list comprehension根据条件查找元素的索引

当来自Matlab背景时,以下Python代码似乎很冗长

>>> a = [1, 2, 3, 1, 2, 3]
>>> [index for index,value in enumerate(a) if value > 2]
[2, 5]

在Matlab中,我可以写:

>> a = [1, 2, 3, 1, 2, 3];
>> find(a>2)
ans =
     3     6

是否有使用Python编写此代码的简便方法,还是只保留长版本?


感谢您对Python语法原理的所有建议和解释。

在numpy网站上找到以下内容后,我想我已经找到了喜欢的解决方案:

http://docs.scipy.org/doc/numpy/user/basics.indexing.html#boolean-or-mask-index-arrays

将来自该网站的信息应用于上述我的问题,将得到以下结果:

>>> from numpy import array
>>> a = array([1, 2, 3, 1, 2, 3])
>>> b = a>2 
array([False, False, True, False, False, True], dtype=bool)
>>> r = array(range(len(b)))
>>> r[b]
array([2, 5])

然后,下面的内容应该可以工作(但是我手头没有Python解释器来对其进行测试):

class my_array(numpy.array):
    def find(self, b):
        r = array(range(len(b)))
        return r[b]


>>> a = my_array([1, 2, 3, 1, 2, 3])
>>> a.find(a>2)
[2, 5]

The following Python code appears to be very long winded when coming from a Matlab background

>>> a = [1, 2, 3, 1, 2, 3]
>>> [index for index,value in enumerate(a) if value > 2]
[2, 5]

When in Matlab I can write:

>> a = [1, 2, 3, 1, 2, 3];
>> find(a>2)
ans =
     3     6

Is there a short hand method of writing this in Python, or do I just stick with the long version?


Thank you for all the suggestions and explanation of the rationale for Python’s syntax.

After finding the following on the numpy website, I think I have found a solution I like:

http://docs.scipy.org/doc/numpy/user/basics.indexing.html#boolean-or-mask-index-arrays

Applying the information from that website to my problem above, would give the following:

>>> from numpy import array
>>> a = array([1, 2, 3, 1, 2, 3])
>>> b = a>2 
array([False, False, True, False, False, True], dtype=bool)
>>> r = array(range(len(b)))
>>> r[b]
array([2, 5])

The following should then work (but I haven’t got a Python interpreter on hand to test it):

class my_array(numpy.array):
    def find(self, b):
        r = array(range(len(b)))
        return r[b]


>>> a = my_array([1, 2, 3, 1, 2, 3])
>>> a.find(a>2)
[2, 5]

回答 0

  • 在Python中，您根本不会为此使用索引，而只是处理值：[value for value in a if value > 2]。通常，需要处理索引就意味着您没有采用最佳做法。

  • 如果确实需要类似于Matlab的API,则可以使用numpy,这是Python中用于多维数组和数值数学的软件包,受Matlab的启发很大。您将使用numpy数组而不是列表。

    >>> import numpy
    >>> a = numpy.array([1, 2, 3, 1, 2, 3])
    >>> a
    array([1, 2, 3, 1, 2, 3])
    >>> numpy.where(a > 2)
    (array([2, 5]),)
    >>> a > 2
    array([False, False,  True, False, False,  True], dtype=bool)
    >>> a[numpy.where(a > 2)]
    array([3, 3])
    >>> a[a > 2]
    array([3, 3])
  • In Python, you wouldn’t use indexes for this at all, but just deal with the values—[value for value in a if value > 2]. Usually dealing with indexes means you’re not doing something the best way.

  • If you do need an API similar to Matlab’s, you would use numpy, a package for multidimensional arrays and numerical math in Python which is heavily inspired by Matlab. You would be using a numpy array instead of a list.

    >>> import numpy
    >>> a = numpy.array([1, 2, 3, 1, 2, 3])
    >>> a
    array([1, 2, 3, 1, 2, 3])
    >>> numpy.where(a > 2)
    (array([2, 5]),)
    >>> a > 2
    array([False, False,  True, False, False,  True], dtype=bool)
    >>> a[numpy.where(a > 2)]
    array([3, 3])
    >>> a[a > 2]
    array([3, 3])
    

回答 1

其他方式:

>>> [i for i in range(len(a)) if a[i] > 2]
[2, 5]

通常，请记住：虽然find是一个现成的函数，但列表推导是一种通用因而非常强大的解决方案。没有什么能阻止您用Python自己编写一个find函数，并在以后按需使用它。即：

>>> def find_indices(lst, condition):
...   return [i for i, elem in enumerate(lst) if condition(elem)]
... 
>>> find_indices(a, lambda e: e > 2)
[2, 5]

请注意,我在这里使用列表来模仿Matlab。使用生成器和迭代器会更Pythonic。

Another way:

>>> [i for i in range(len(a)) if a[i] > 2]
[2, 5]

In general, remember that while find is a ready-cooked function, list comprehensions are a general, and thus very powerful solution. Nothing prevents you from writing a find function in Python and use it later as you wish. I.e.:

>>> def find_indices(lst, condition):
...   return [i for i, elem in enumerate(lst) if condition(elem)]
... 
>>> find_indices(a, lambda e: e > 2)
[2, 5]

Note that I’m using lists here to mimic Matlab. It would be more Pythonic to use generators and iterators.
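
For instance, a hedged generator variant of find_indices (the name ifind_indices is my own, not from the answer):

def ifind_indices(lst, condition):
    # lazy variant: yields matching indices one at a time instead of building a list
    return (i for i, elem in enumerate(lst) if condition(elem))

idxs = ifind_indices([1, 2, 3, 1, 2, 3], lambda e: e > 2)
print(next(idxs))   # 2, consume one index at a time
print(list(idxs))   # [5], the remaining matches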


回答 2

对我来说,效果很好:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 1, 2, 3])
>>> np.where(a > 2)[0]
[2 5]

For me it works well:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 1, 2, 3])
>>> np.where(a > 2)[0]
[2 5]

回答 3

也许另一个问题是：“一旦获得这些索引，您打算用它们做什么？”如果要用它们创建另一个列表，那么在Python中它们是不必要的中间步骤。如果想要所有与给定条件匹配的值，只需使用内置的filter：

matchingVals = filter(lambda x : x>2, a)

或编写您自己的列表推导：

matchingVals = [x for x in a if x > 2]

如果要把它们从列表中删除，那么Pythonic的方式不一定是真的从列表中删除，而是像创建新列表一样编写列表推导，并通过在赋值左侧使用listvar[:]进行就地赋值：

a[:] = [x for x in a if x <= 2]

Matlab之所以提供find，是因为它以数组为中心的模型是通过数组索引来选择项目的。当然，您也可以在Python中这样做，但正如@EliBendersky已经提到的，更Pythonic的方式是使用迭代器和生成器。

Maybe another question is, “what are you going to do with those indices once you get them?” If you are going to use them to create another list, then in Python, they are an unnecessary middle step. If you want all the values that match a given condition, just use the builtin filter:

matchingVals = filter(lambda x : x>2, a)

Or write your own list comprehension:

matchingVals = [x for x in a if x > 2]

If you want to remove them from the list, then the Pythonic way is not to necessarily remove from the list, but write a list comprehension as if you were creating a new list, and assigning back in-place using the listvar[:] on the left-hand-side:

a[:] = [x for x in a if x <= 2]

Matlab supplies find because its array-centric model works by selecting items using their array indices. You can do this in Python, certainly, but the more Pythonic way is using iterators and generators, as already mentioned by @EliBendersky.


回答 4

即使回答得很晚：我认为这仍然是一个很好的问题，而且恕我直言，Python（在没有numpy之类的额外库或工具包的情况下）仍然缺乏一种方便的方法，来根据手动定义的过滤器获取列表元素的索引。

您可以手动定义一个提供该功能的函数：

def indices(list, filtr=lambda x: bool(x)):
    return [i for i,x in enumerate(list) if filtr(x)]

print(indices([1,0,3,5,1], lambda x: x==1))

输出：[0, 4]

在我的想象中，完美的方式是创建list的一个子类，并把indices添加为类方法。这样只需要提供过滤函数即可：

class MyList(list):
    def __init__(self, *args):
        list.__init__(self, *args)
    def indices(self, filtr=lambda x: bool(x)):
        return [i for i,x in enumerate(self) if filtr(x)]

my_list = MyList([1,0,3,5,1])
my_list.indices(lambda x: x==1)

我在这里对该主题做了更详细的阐述：http://tinyurl.com/jajrr87

Even if it’s a late answer: I think this is still a very good question and IMHO Python (without additional libraries or toolkits like numpy) is still lacking a convenient method to access the indices of list elements according to a manually defined filter.

You could manually define a function, which provides that functionality:

def indices(list, filtr=lambda x: bool(x)):
    return [i for i,x in enumerate(list) if filtr(x)]

print(indices([1,0,3,5,1], lambda x: x==1))

Yields: [0, 4]

In my imagination the perfect way would be making a child class of list and adding the indices function as class method. In this way only the filter method would be needed:

class MyList(list):
    def __init__(self, *args):
        list.__init__(self, *args)
    def indices(self, filtr=lambda x: bool(x)):
        return [i for i,x in enumerate(self) if filtr(x)]

my_list = MyList([1,0,3,5,1])
my_list.indices(lambda x: x==1)

I elaborated a bit more on that topic here: http://tinyurl.com/jajrr87


保存交互式Matplotlib图形

问题:保存交互式Matplotlib图形

有没有一种方法可以保存Matplotlib图形,以便可以重新打开它并恢复典型的交互作用?(就像MATLAB中的.fig格式一样?)

我发现自己为了生成这些交互式图形而多次运行同一脚本。或者，我会向同事发送多个静态PNG文件来展示绘图的不同方面。我宁愿发送图形对象，让他们自己与之交互。

Is there a way to save a Matplotlib figure such that it can be re-opened and have typical interaction restored? (Like the .fig format in MATLAB?)

I find myself running the same scripts many times to generate these interactive figures. Or I’m sending my colleagues multiple static PNG files to show different aspects of a plot. I’d rather send the figure object and have them interact with it themselves.


回答 0

这将是一个很棒的功能，但据我所知（AFAIK），Matplotlib并没有实现它，而且由于图形的存储方式，自己实现可能也很困难。

我建议：（a）将数据处理与图形生成分开（用唯一的名称保存数据），并编写一个图形生成脚本（加载保存数据的指定文件），再按需编辑；或（b）另存为PDF/SVG/PostScript格式，并在Adobe Illustrator（或Inkscape）之类的高级图形编辑器中编辑。

2012年秋季后的编辑：正如其他人在下面指出的（尽管在此提及，因为这是公认的答案），Matplotlib从1.2版开始允许您对图形进行pickle。如发行说明所述，这是一项实验性功能，不支持在一个matplotlib版本中保存图形并在另一个版本中打开。从不受信任的来源恢复pickle通常也是不安全的。

对于共享/以后编辑的绘图（需要先进行大量数据处理，并且可能需要在几个月后调整，比如在科学出版物的同行评审期间），我仍然建议这样的工作流程：（1）编写一个数据处理脚本，在生成绘图之前将处理后的数据（即进入绘图的数据）保存到文件中；（2）编写一个单独的绘图生成脚本（可按需调整）来重新创建绘图。这样，对于每个绘图，您都可以快速运行脚本并重新生成它（并用新数据快速复用您的绘图设置）。话虽如此，对图形进行pickle对于短期/交互式/探索性数据分析可能会很方便。

This would be a great feature, but AFAIK it isn’t implemented in Matplotlib and likely would be difficult to implement yourself due to the way figures are stored.

I'd suggest either (a) separating the data processing from generating the figure (saving the data with a unique name) and writing a figure-generating script (loading a specified file of the saved data) and editing as you see fit, or (b) saving as PDF/SVG/PostScript format and editing in some fancy figure editor like Adobe Illustrator (or Inkscape).

EDIT post Fall 2012: As others pointed out below (though mentioning here as this is the accepted answer), Matplotlib since version 1.2 allowed you to pickle figures. As the release notes state, it is an experimental feature and does not support saving a figure in one matplotlib version and opening in another. It’s also generally unsecure to restore a pickle from an untrusted source.

For sharing/later editing plots (that require significant data processing first and may need to be tweaked months later say during peer review for a scientific publication), I still recommend the workflow of (1) have a data processing script that before generating a plot saves the processed data (that goes into your plot) into a file, and (2) have a separate plot generation script (that you adjust as necessary) to recreate the plot. This way for each plot you can quickly run a script and re-generate it (and quickly copy over your plot settings with new data). That said, pickling a figure could be convenient for short term/interactive/exploratory data analysis.
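
A minimal sketch of the two-script workflow described above (the file names, the .npy format, and the sine data are illustrative assumptions, not part of the answer):

# process_data.py -- step (1): save the processed data that goes into the plot
import numpy as np

x = np.linspace(0, 10, 200)
y = np.sin(x)                   # stand-in for expensive data processing
np.save('processed_x.npy', x)
np.save('processed_y.npy', y)

# make_plot.py -- step (2): regenerate (and tweak) the figure at any time
import numpy as np
import matplotlib.pyplot as plt

x = np.load('processed_x.npy')
y = np.load('processed_y.npy')
plt.plot(x, y)
plt.xlabel('x')
plt.show()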


回答 1

我刚刚发现了如何做到这一点。@pelson提到的“实验性pickle支持”效果很好。

试试这个:

# Plot something
import matplotlib.pyplot as plt
fig,ax = plt.subplots()
ax.plot([1,2,3],[10,-10,30])

交互式调整后,将图形对象另存为二进制文件:

import pickle
pickle.dump(fig, open('FigureObject.fig.pickle', 'wb')) # This is for Python 3 - py2 may need `file` instead of `open`

稍后打开该图形时，您的调整会被保留，GUI交互性也依然存在：

import pickle
figx = pickle.load(open('FigureObject.fig.pickle', 'rb'))

figx.show() # Show the figure, edit it, etc.!

您甚至可以从图中提取数据:

data = figx.axes[0].lines[0].get_data()

（它适用于线条、pcolor和imshow；pcolormesh可以通过一些技巧来重建展平的数据。）

这个出色的技巧来自《Saving Matplotlib Figures Using Pickle》一文。

I just found out how to do this. The “experimental pickle support” mentioned by @pelson works quite well.

Try this:

# Plot something
import matplotlib.pyplot as plt
fig,ax = plt.subplots()
ax.plot([1,2,3],[10,-10,30])

After your interactive tweaking, save the figure object as a binary file:

import pickle
pickle.dump(fig, open('FigureObject.fig.pickle', 'wb')) # This is for Python 3 - py2 may need `file` instead of `open`

Later, open the figure and the tweaks should be saved and GUI interactivity should be present:

import pickle
figx = pickle.load(open('FigureObject.fig.pickle', 'rb'))

figx.show() # Show the figure, edit it, etc.!

You can even extract the data from the plots:

data = figx.axes[0].lines[0].get_data()

(It works for lines, pcolor & imshow – pcolormesh works with some tricks to reconstruct the flattened data.)

I got the excellent tip from Saving Matplotlib Figures Using Pickle.


回答 2

从Matplotlib 1.2开始，我们提供了实验性的pickle支持。试一试，看看它是否适合您的情况。如果您遇到任何问题，请通过Matplotlib邮件列表或在github.com/matplotlib/matplotlib上开一个issue告知我们。

As of Matplotlib 1.2, we now have experimental pickle support. Give that a go and see if it works well for your case. If you have any issues, please let us know on the Matplotlib mailing list or by opening an issue on github.com/matplotlib/matplotlib.


回答 3

为什么不发送Python脚本呢?MATLAB的.fig文件要求收件人具有MATLAB才能显示它们,因此,这等效于发送需要Matplotlib显示的Python脚本。

或者（免责声明：我还没有尝试过），您可以尝试对该图进行pickle：

import pickle
output = open('interactive figure.pickle', 'wb')
pickle.dump(gcf(), output)
output.close()

Why not just send the Python script? MATLAB’s .fig files require the recipient to have MATLAB to display them, so that’s about equivalent to sending a Python script that requires Matplotlib to display.

Alternatively (disclaimer: I haven’t tried this yet), you could try pickling the figure:

import pickle
output = open('interactive figure.pickle', 'wb')
pickle.dump(gcf(), output)
output.close()

回答 4

好问题。这是pylab.save的文档文本：

pylab不再提供save函数，尽管旧的pylab函数仍然可以作为matplotlib.mlab.save使用（您仍然可以在pylab中以“mlab.save”引用它）。但是，对于纯文本文件，我们建议使用numpy.savetxt。对于保存numpy数组，我们建议使用numpy.save及与之对应的numpy.load，它们在pylab中以np.save和np.load的形式提供。

Good question. Here is the doc text from pylab.save:

pylab no longer provides a save function, though the old pylab function is still available as matplotlib.mlab.save (you can still refer to it in pylab as “mlab.save”). However, for plain text files, we recommend numpy.savetxt. For saving numpy arrays, we recommend numpy.save, and its analog numpy.load, which are available in pylab as np.save and np.load.


回答 5

我想出了一种相对简单（但稍微有些不寻常）的方法来保存我的matplotlib图形。它是这样工作的：

import sys  # 下面的save_plot调用需要sys.argv
import libscript

import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2*np.pi*t)

#<plot>
plt.plot(t, s)
plt.xlabel('time (s)')
plt.ylabel('voltage (mV)')
plt.title('About as simple as it gets, folks')
plt.grid(True)
plt.show()
#</plot>

save_plot(fileName='plot_01.py',obj=sys.argv[0],sel='plot',ctx=libscript.get_ctx(ctx_global=globals(),ctx_local=locals()))

其中save_plot函数定义如下（便于理解逻辑的简单版本）：

def save_plot(fileName='',obj=None,sel='',ctx={}):
    """
    Save of matplolib plot to a stand alone python script containing all the data and configuration instructions to regenerate the interactive matplotlib figure.

    Parameters
    ----------
    fileName : [string] Path of the python script file to be created.
    obj : [object] Function or python object containing the lines of code to create and configure the plot to be saved.
    sel : [string] Name of the tag enclosing the lines of code to create and configure the plot to be saved.
    ctx : [dict] Dictionary containing the execution context. Values for variables not defined in the lines of code for the plot will be fetched from the context.

    Returns
    -------
    Return ``'done'`` once the plot has been saved to a python script file. This file contains all the input data and configuration to re-create the original interactive matplotlib figure.
    """
    import os
    import libscript

    N_indent=4

    src=libscript.get_src(obj=obj,sel=sel)
    src=libscript.prepend_ctx(src=src,ctx=ctx,debug=False)
    src='\n'.join([' '*N_indent+line for line in src.split('\n')])

    if(os.path.isfile(fileName)): os.remove(fileName)
    with open(fileName,'w') as f:
        f.write('import sys\n')
        f.write('sys.dont_write_bytecode=True\n')
        f.write('def main():\n')
        f.write(src+'\n')

        f.write('if(__name__=="__main__"):\n')
        f.write(' '*N_indent+'main()\n')

    return 'done'

或者像下面这样定义save_plot函数（使用zip压缩以生成更小的图形文件的改进版本）：

def save_plot(fileName='',obj=None,sel='',ctx={}):

    import os
    import json
    import zlib
    import base64
    import libscript

    N_indent=4
    level=9#0 to 9, default: 6
    src=libscript.get_src(obj=obj,sel=sel)
    obj=libscript.load_obj(src=src,ctx=ctx,debug=False)
    bin=base64.b64encode(zlib.compress(json.dumps(obj),level))

    if(os.path.isfile(fileName)): os.remove(fileName)
    with open(fileName,'w') as f:
        f.write('import sys\n')
        f.write('sys.dont_write_bytecode=True\n')
        f.write('def main():\n')
        f.write(' '*N_indent+'import base64\n')
        f.write(' '*N_indent+'import zlib\n')
        f.write(' '*N_indent+'import json\n')
        f.write(' '*N_indent+'import libscript\n')
        f.write(' '*N_indent+'bin="'+str(bin)+'"\n')
        f.write(' '*N_indent+'obj=json.loads(zlib.decompress(base64.b64decode(bin)))\n')
        f.write(' '*N_indent+'libscript.exec_obj(obj=obj,tempfile=False)\n')

        f.write('if(__name__=="__main__"):\n')
        f.write(' '*N_indent+'main()\n')

    return 'done'

这使用了我自己的libscript模块，该模块主要依赖inspect和ast模块。如果有人感兴趣，我可以尝试在Github上分享它（首先需要做一些清理，并且我需要先开始使用Github）。

这个save_plot函数和libscript模块背后的思想是：获取创建图形的python指令（使用inspect模块），分析它们（使用ast模块）以提取其依赖的所有变量、函数和模块导入，从执行上下文中提取这些内容并将它们序列化为python指令（变量的代码类似t=[0.0,2.0,0.01]…，模块的代码类似import matplotlib.pyplot as plt…），前置在图形指令之前。生成的python指令被保存为一个python脚本，执行它就会重新构建原始的matplotlib图形。

可以想象,这对于大多数(如果不是全部)matplotlib图形都适用。

I figured out a relatively simple way (yet slightly unconventional) to save my matplotlib figures. It works like this:

import sys  # needed below for sys.argv
import libscript

import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2*np.pi*t)

#<plot>
plt.plot(t, s)
plt.xlabel('time (s)')
plt.ylabel('voltage (mV)')
plt.title('About as simple as it gets, folks')
plt.grid(True)
plt.show()
#</plot>

save_plot(fileName='plot_01.py',obj=sys.argv[0],sel='plot',ctx=libscript.get_ctx(ctx_global=globals(),ctx_local=locals()))

with function save_plot defined like this (simple version to understand the logic):

def save_plot(fileName='',obj=None,sel='',ctx={}):
    """
    Save of matplolib plot to a stand alone python script containing all the data and configuration instructions to regenerate the interactive matplotlib figure.

    Parameters
    ----------
    fileName : [string] Path of the python script file to be created.
    obj : [object] Function or python object containing the lines of code to create and configure the plot to be saved.
    sel : [string] Name of the tag enclosing the lines of code to create and configure the plot to be saved.
    ctx : [dict] Dictionary containing the execution context. Values for variables not defined in the lines of code for the plot will be fetched from the context.

    Returns
    -------
    Return ``'done'`` once the plot has been saved to a python script file. This file contains all the input data and configuration to re-create the original interactive matplotlib figure.
    """
    import os
    import libscript

    N_indent=4

    src=libscript.get_src(obj=obj,sel=sel)
    src=libscript.prepend_ctx(src=src,ctx=ctx,debug=False)
    src='\n'.join([' '*N_indent+line for line in src.split('\n')])

    if(os.path.isfile(fileName)): os.remove(fileName)
    with open(fileName,'w') as f:
        f.write('import sys\n')
        f.write('sys.dont_write_bytecode=True\n')
        f.write('def main():\n')
        f.write(src+'\n')

        f.write('if(__name__=="__main__"):\n')
        f.write(' '*N_indent+'main()\n')

    return 'done'

or defining function save_plot like this (better version using zip compression to produce lighter figure files):

def save_plot(fileName='',obj=None,sel='',ctx={}):

    import os
    import json
    import zlib
    import base64
    import libscript

    N_indent=4
    level=9#0 to 9, default: 6
    src=libscript.get_src(obj=obj,sel=sel)
    obj=libscript.load_obj(src=src,ctx=ctx,debug=False)
    bin=base64.b64encode(zlib.compress(json.dumps(obj),level))

    if(os.path.isfile(fileName)): os.remove(fileName)
    with open(fileName,'w') as f:
        f.write('import sys\n')
        f.write('sys.dont_write_bytecode=True\n')
        f.write('def main():\n')
        f.write(' '*N_indent+'import base64\n')
        f.write(' '*N_indent+'import zlib\n')
        f.write(' '*N_indent+'import json\n')
        f.write(' '*N_indent+'import libscript\n')
        f.write(' '*N_indent+'bin="'+str(bin)+'"\n')
        f.write(' '*N_indent+'obj=json.loads(zlib.decompress(base64.b64decode(bin)))\n')
        f.write(' '*N_indent+'libscript.exec_obj(obj=obj,tempfile=False)\n')

        f.write('if(__name__=="__main__"):\n')
        f.write(' '*N_indent+'main()\n')

    return 'done'

This makes use of a module libscript of my own, which mostly relies on the modules inspect and ast. I can try to share it on Github if interest is expressed (it would first require some cleanup and me to get started with Github).

The idea behind this save_plot function and libscript module is to fetch the python instructions that create the figure (using module inspect), analyze them (using module ast) to extract all variables, functions and modules import it relies on, extract these from the execution context and serialize them as python instructions (code for variables will be like t=[0.0,2.0,0.01] … and code for modules will be like import matplotlib.pyplot as plt …) prepended to the figure instructions. The resulting python instructions are saved as a python script whose execution will re-build the original matplotlib figure.

As you can imagine, this works well for most (if not all) matplotlib figures.


virtualenvwrapper和Python 3

问题:virtualenvwrapper和Python 3

我在ubuntu lucid上安装了python 3.3.1并成功创建了virtualenv,如下所示

virtualenv envpy331 --python=/usr/local/bin/python3.3

这在我的主目录下创建了一个envpy331文件夹。

我也安装了virtualenvwrapper。但是文档中只支持python的2.4-2.7版本。有人尝试过用它管理python3的virtualenv吗？如果有，能告诉我怎么做吗？

I installed python 3.3.1 on ubuntu lucid and successfully created a virtualenv as below

virtualenv envpy331 --python=/usr/local/bin/python3.3

this created a folder envpy331 on my home dir.

I also have virtualenvwrapper installed. But in the docs only 2.4-2.7 versions of python are supported. Has anyone tried to organize the python3 virtualenv? If so, can you tell me how?


回答 0

virtualenvwrapper的最新版本是在Python 3.2下测试的。它很有可能也适用于Python 3.3。

The latest version of virtualenvwrapper is tested under Python3.2. Chances are good it will work with Python3.3 too.


回答 1

如果您已经安装了python3以及virtualenvwrapper,那么在虚拟环境中使用python3的唯一操作就是使用以下命令创建环境:

which python3 #Output: /usr/bin/python3
mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

或者,(至少在使用brew的OSX上):

mkvirtualenv --python=`which python3` nameOfEnvironment

开始使用该环境后，您会看到只要键入python，用的就是python3

If you already have python3 installed as well virtualenvwrapper the only thing you would need to do to use python3 with the virtual environment is creating an environment using:

which python3 #Output: /usr/bin/python3
mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

Or, (at least on OSX using brew):

mkvirtualenv --python=`which python3` nameOfEnvironment

Start using the environment and you’ll see that as soon as you type python you’ll start using python3


回答 2

您可以让virtualenvwrapper使用自定义的Python二进制文件，而不是运行virtualenvwrapper自身所用的那个。为此，您需要使用virtualenv所使用的VIRTUALENV_PYTHON变量：

$ export VIRTUALENV_PYTHON=/usr/bin/python3
$ mkvirtualenv -a myproject myenv
Running virtualenv with interpreter /usr/bin/python3
New python executable in myenv/bin/python3
Also creating executable in myenv/bin/python
(myenv)$ python
Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

You can make virtualenvwrapper use a custom Python binary instead of the one virtualenvwrapper is run with. To do that you need to use VIRTUALENV_PYTHON variable which is utilized by virtualenv:

$ export VIRTUALENV_PYTHON=/usr/bin/python3
$ mkvirtualenv -a myproject myenv
Running virtualenv with interpreter /usr/bin/python3
New python executable in myenv/bin/python3
Also creating executable in myenv/bin/python
(myenv)$ python
Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

回答 3

virtualenvwrapper现在允许您指定不带路径的python可执行文件。

因此(至少在OSX上)mkvirtualenv --python=python3 nameOfEnvironment就足够了。

virtualenvwrapper now lets you specify the python executable without the path.

So (on OSX at least)mkvirtualenv --python=python3 nameOfEnvironment will suffice.


回答 4

在Ubuntu上，使用mkvirtualenv -p python3 env_name即可创建使用python3的virtualenv。

在环境内部，使用python --version进行验证。

On Ubuntu; using mkvirtualenv -p python3 env_name loads the virtualenv with python3.

Inside the env, use python --version to verify.


回答 5

您可以将其添加到您的.bash_profile或类似文件中:

alias mkvirtualenv3='mkvirtualenv --python=`which python3`'

然后在要创建python 3环境时使用mkvirtualenv3代替mkvirtualenv

You can add this to your .bash_profile or similar:

alias mkvirtualenv3='mkvirtualenv --python=`which python3`'

Then use mkvirtualenv3 instead of mkvirtualenv when you want to create a python 3 environment.


回答 6

我发现在Ubuntu的命令行中运行

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

和

export VIRTUALENVWRAPPER_VIRTUALENV=/usr/bin/virtualenv-3.4

会强制mkvirtualenv使用python3和virtualenv-3.4。之后仍需执行

mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

来创建环境。这里假设您的python3位于/usr/bin/python3，virtualenv-3.4位于/usr/local/bin/virtualenv-3.4。

I find that running

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

and

export VIRTUALENVWRAPPER_VIRTUALENV=/usr/bin/virtualenv-3.4

in the command line on Ubuntu forces mkvirtualenv to use python3 and virtualenv-3.4. One still has to do

mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

to create the environment. This is assuming that you have python3 in /usr/bin/python3 and virtualenv-3.4 in /usr/local/bin/virtualenv-3.4.


回答 7

virtualenvwrapper在bitbucket问题跟踪器上的这篇帖子可能会让您感兴趣。那里提到，virtualenvwrapper的大多数功能都可以与Python 3.3中的venv虚拟环境一起使用。

This post on the bitbucket issue tracker of virtualenvwrapper may be of interest. It is mentioned there that most of virtualenvwrapper’s functions work with the venv virtual environments in Python 3.3.


回答 8

我像这样将export VIRTUALENV_PYTHON=/usr/bin/python3添加到我的~/.bashrc中：

export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENV_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

然后运行source .bashrc

之后您就可以为每个新环境指定python版本：mkvirtualenv --python=python2 env_name

I added export VIRTUALENV_PYTHON=/usr/bin/python3 to my ~/.bashrc like this:

export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENV_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

then run source .bashrc

and you can specify the python version for each new env mkvirtualenv --python=python2 env_name


如何在Python中获得对当前模块属性的引用

问题:如何在Python中获得对当前模块属性的引用

我想做的事情在命令行中看起来像这样:

>>> import mymodule
>>> names = dir(mymodule)

我如何在mymodule自身内部获得对mymodule中定义的所有名称的引用？

像这样:

# mymodule.py
names = dir(__thismodule__)

What I’m trying to do would look like this in the command line:

>>> import mymodule
>>> names = dir(mymodule)

How can I get a reference to all the names defined in mymodule from within mymodule itself?

Something like this:

# mymodule.py
names = dir(__thismodule__)

回答 0

只需使用globals()

globals()—返回表示当前全局符号表的字典。这始终是当前模块的字典(在函数或方法内部,这是定义它的模块,而不是从中调用它的模块)。

http://docs.python.org/library/functions.html#globals

Just use globals()

globals() — Return a dictionary representing the current global symbol table. This is always the dictionary of the current module (inside a function or method, this is the module where it is defined, not the module from which it is called).

http://docs.python.org/library/functions.html#globals
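
A minimal sketch of using globals() from inside a module (the module contents here are illustrative):

# mymodule.py
x = 1

def f():
    pass

# collect the names defined so far, skipping dunder entries such as __name__
names = [name for name in globals() if not name.startswith('__')]
print(names)   # ['x', 'f']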


回答 1

如前所述，globals()为您提供的是一个字典，而dir()给出的是模块中已定义名称的列表。我通常看到的做法是这样的：

import sys
dir(sys.modules[__name__])

As previously mentioned, globals gives you a dictionary as opposed to dir() which gives you a list of the names defined in the module. The way I typically see this done is like this:

import sys
dir(sys.modules[__name__])

回答 2

现在回答可能为时已晚，但我没能为自己找到正确的答案。在python 3.7.x中，最接近且最精确的解决方案（比inspect.stack()更快）如下：

  # search for first module in the stack
  import inspect
  import sys

  stack_frame = inspect.currentframe()
  while stack_frame:
    print('***', stack_frame.f_code.co_name, stack_frame.f_code.co_filename, stack_frame.f_lineno)
    if stack_frame.f_code.co_name == '<module>':
      if stack_frame.f_code.co_filename != '<stdin>':
        caller_module = inspect.getmodule(stack_frame)
      else:
        # piped or interactive import
        caller_module = sys.modules['__main__']
      if caller_module is not None:
        pass  # ... do something here ...
      break
    stack_frame = stack_frame.f_back

优点：

  • 比globals()方法更精确。
  • 不依赖于中间的堆栈帧，这些帧可能被hook或pytest之类的第三方工具添加，例如：
*** foo ... ..
*** boo ... ..
*** runtest c:\python\x86\37\lib\site-packages\xonsh\pytest_plugin.py 58
*** pytest_runtest_call c:\python\x86\37\lib\site-packages\_pytest\runner.py 125
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** <lambda> c:\python\x86\37\lib\site-packages\_pytest\runner.py 201
*** from_call c:\python\x86\37\lib\site-packages\_pytest\runner.py 229
*** call_runtest_hook c:\python\x86\37\lib\site-packages\_pytest\runner.py 201
*** call_and_report c:\python\x86\37\lib\site-packages\_pytest\runner.py 176
*** runtestprotocol c:\python\x86\37\lib\site-packages\_pytest\runner.py 95
*** pytest_runtest_protocol c:\python\x86\37\lib\site-packages\_pytest\runner.py 80
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** pytest_runtestloop c:\python\x86\37\lib\site-packages\_pytest\main.py 258
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** _main c:\python\x86\37\lib\site-packages\_pytest\main.py 237
*** wrap_session c:\python\x86\37\lib\site-packages\_pytest\main.py 193
*** pytest_cmdline_main c:\python\x86\37\lib\site-packages\_pytest\main.py 230
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** main c:\python\x86\37\lib\site-packages\_pytest\config\__init__.py 90
*** <module> c:\Python\x86\37\Scripts\pytest.exe\__main__.py 7
  • 可以处理python管道或交互式会话。

缺点:

  • 相当精确，因此可能返回在可执行文件（如pytest.exe）中注册的模块，而这可能不是您想要的。
  • inspect.getmodule在有效模块上仍可能返回None，具体取决于hook的情况

我有一个针对python的扩展：如何在给定完整路径的情况下导入模块？

该扩展包含针对这种情况的包装函数：

def tkl_get_stack_frame_module_by_offset(skip_stack_frames = 0, use_last_frame_on_out_of_stack = False):
  ...

def tkl_get_stack_frame_module_by_name(name = '<module>'):
  ...

您只需要正确初始化该扩展即可：

# portable import to the global space
sys.path.append(<path-to-tacklelib-module-directory>)
import tacklelib as tkl

tkl.tkl_init(tkl, global_config = {'log_import_module':os.environ.get('TACKLELIB_LOG_IMPORT_MODULE')})

# cleanup
del tkl # must be instead of `tkl = None`, otherwise the variable would be still persist
sys.path.pop()

# use `tkl_*` functions directly from here ...

It might be late to answer, but I didn't find the correct answer for myself. The closest and most precise solution (faster than inspect.stack()) in Python 3.7.x:

  # search for first module in the stack
  import inspect
  import sys

  stack_frame = inspect.currentframe()
  while stack_frame:
    print('***', stack_frame.f_code.co_name, stack_frame.f_code.co_filename, stack_frame.f_lineno)
    if stack_frame.f_code.co_name == '<module>':
      if stack_frame.f_code.co_filename != '<stdin>':
        caller_module = inspect.getmodule(stack_frame)
      else:
        # piped or interactive import
        caller_module = sys.modules['__main__']
      if caller_module is not None:
        pass  # ... do something here ...
      break
    stack_frame = stack_frame.f_back

Pros:

  • More precise than the globals() method.
  • Does not depend on intermediate stack frames, which can be added, for example, via hooking or by 3rd-party tools like pytest:
*** foo ... ..
*** boo ... ..
*** runtest c:\python\x86\37\lib\site-packages\xonsh\pytest_plugin.py 58
*** pytest_runtest_call c:\python\x86\37\lib\site-packages\_pytest\runner.py 125
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** <lambda> c:\python\x86\37\lib\site-packages\_pytest\runner.py 201
*** from_call c:\python\x86\37\lib\site-packages\_pytest\runner.py 229
*** call_runtest_hook c:\python\x86\37\lib\site-packages\_pytest\runner.py 201
*** call_and_report c:\python\x86\37\lib\site-packages\_pytest\runner.py 176
*** runtestprotocol c:\python\x86\37\lib\site-packages\_pytest\runner.py 95
*** pytest_runtest_protocol c:\python\x86\37\lib\site-packages\_pytest\runner.py 80
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** pytest_runtestloop c:\python\x86\37\lib\site-packages\_pytest\main.py 258
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** _main c:\python\x86\37\lib\site-packages\_pytest\main.py 237
*** wrap_session c:\python\x86\37\lib\site-packages\_pytest\main.py 193
*** pytest_cmdline_main c:\python\x86\37\lib\site-packages\_pytest\main.py 230
*** _multicall c:\python\x86\37\lib\site-packages\pluggy\callers.py 187
*** <lambda> c:\python\x86\37\lib\site-packages\pluggy\manager.py 86
*** _hookexec c:\python\x86\37\lib\site-packages\pluggy\manager.py 92
*** __call__ c:\python\x86\37\lib\site-packages\pluggy\hooks.py 286
*** main c:\python\x86\37\lib\site-packages\_pytest\config\__init__.py 90
*** <module> c:\Python\x86\37\Scripts\pytest.exe\__main__.py 7
  • Can handle python piped or interactive session.

Cons:

  • Rather precise: it can return modules registered in an executable (like pytest.exe), which might not be what you want.
  • inspect.getmodule may still return None on valid modules, depending on hooking

I have an extension for Python: How to import a module given the full path?

The extension has wrapper functions for that case:

def tkl_get_stack_frame_module_by_offset(skip_stack_frames = 0, use_last_frame_on_out_of_stack = False):
  ...

def tkl_get_stack_frame_module_by_name(name = '<module>'):
  ...

You just have to initialize the extension properly:

# portable import to the global space
sys.path.append(<path-to-tacklelib-module-directory>)
import tacklelib as tkl

tkl.tkl_init(tkl, global_config = {'log_import_module':os.environ.get('TACKLELIB_LOG_IMPORT_MODULE')})

# cleanup
del tkl # must be instead of `tkl = None`, otherwise the variable would be still persist
sys.path.pop()

# use `tkl_*` functions directly from here ...