逐行读取子流程标准输出

Question 1

My python script uses subprocess to call a linux utility that is very noisy. I want to store all of the output to a log file and show some of it to the user. I thought the following would work, but the output doesn’t show up in my application until the utility has produced a significant amount of output.

#fake_utility.py, just generates lots of output over time
import time
i = 0
while True:
   print hex(i)*512
   i += 1
   time.sleep(0.5)

#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
for line in proc.stdout:
   #the real code does filtering here
   print "test:", line.rstrip()

The behavior I really want is for the filter script to print each line as it is received from the subprocess. Sorta like what tee does but with python code.

What am I missing? Is this even possible?

Update:

If a sys.stdout.flush() is added to fake_utility.py, the code has the desired behavior in python 3.1. I’m using python 2.6. You would think that using proc.stdout.xreadlines() would work the same as py3k, but it doesn’t.

Update 2:

Here is the minimal working code.

#fake_utility.py, just generates lots of output over time
import sys, time
for i in range(10):
   print i
   sys.stdout.flush()
   time.sleep(0.5)

#display out put line by line
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
#works in python 3.0+
#for line in proc.stdout:
for line in iter(proc.stdout.readline,''):
   print line.rstrip()

Question 2

It’s been a long time since I last worked with Python, but I think the problem is with the statement for line in proc.stdout, which reads the entire input before iterating over it. The solution is to use readline() instead:

#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
while True:
  line = proc.stdout.readline()
  if not line:
    break
  #the real code does filtering here
  print "test:", line.rstrip()

Of course you still have to deal with the subprocess’ buffering.

Note: according to the documentation the solution with an iterator should be equivalent to using readline(), except for the read-ahead buffer, but (or exactly because of this) the proposed change did produce different results for me (Python 2.5 on Windows XP).

Question 3

Bit late to the party, but was surprised not to see what I think is the simplest solution here:

import io
import subprocess

proc = subprocess.Popen(["prog", "arg"], stdout=subprocess.PIPE)
for line in io.TextIOWrapper(proc.stdout, encoding="utf-8"):  # or another encoding
    # do something with line

(This requires Python 3.)

Question 4

Indeed, if you sorted out the iterator then buffering could now be your problem. You could tell the python in the sub-process not to buffer its output.

proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)

becomes

proc = subprocess.Popen(['python','-u', 'fake_utility.py'],stdout=subprocess.PIPE)

I have needed this when calling python from within python.

Question 5

You want to pass these extra parameters to subprocess.Popen:

bufsize=1, universal_newlines=True

Then you can iterate as in your example. (Tested with Python 3.5)

Question 6

A function that allows iterating over both stdout and stderr concurrently, in realtime, line by line

In case you need to get the output stream for both stdout and stderr at the same time, you can use the following function.

The function uses Queues to merge both Popen pipes into a single iterator.

Here we create the function read_popen_pipes():

from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor


def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()


def read_popen_pipes(p):

    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()

        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)

        while True:

            if p.poll() is not None and q_stdout.empty() and q_stderr.empty():
                break

            out_line = err_line = ''

            try:
                out_line = q_stdout.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_stderr.get_nowait()
            except Empty:
                pass

            yield (out_line, err_line)

read_popen_pipes() in use:

import subprocess as sp


with sp.Popen(my_cmd, stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:

    for out_line, err_line in read_popen_pipes(p):

        # Do stuff with each line, e.g.:
        print(out_line, end='')
        print(err_line, end='')

    return p.poll() # return status-code

Question 7

You can also read lines w/o loop. Works in python3.6.

import os
import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE)
list_of_byte_strings = process.stdout.readlines()

Question 8

I tried this with python3 and it worked, source

def output_reader(proc):
    for line in iter(proc.stdout.readline, b''):
        print('got line: {0}'.format(line.decode('utf-8')), end='')


def main():
    proc = subprocess.Popen(['python', 'fake_utility.py'],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)

    t = threading.Thread(target=output_reader, args=(proc,))
    t.start()

    try:
        time.sleep(0.2)
        import time
        i = 0

        while True:
        print (hex(i)*512)
        i += 1
        time.sleep(0.5)
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=0.2)
            print('== subprocess exited with rc =', proc.returncode)
        except subprocess.TimeoutExpired:
            print('subprocess did not terminate in time')
    t.join()

Question 9

The following modification of Rômulo’s answer works for me on Python 2 and 3 (2.7.12 and 3.6.1):

import os
import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE)
while True:
  line = process.stdout.readline()
  if line != '':
    os.write(1, line)
  else:
    break

Question 10

Dunno when this has been added to the subprocess module, but with Python 3 you should be fine with using proc.stdout.splitlines():

for line in proc.stdout.splitlines():
   print "stdout:", line

逐行读取子流程标准输出

问题：逐行读取子流程标准输出

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

7行代码 Python热力图可视化分析缺失数据处理

Python 流程图 — 一键转化代码为流程图

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

如何修剪字符串中的空格？

熊猫可以自动识别日期吗？

多重处理：使用tqdm显示进度条

我在python中遇到关键错误

Pytest-pytest框架使编写小测试变得很容易，而且可以扩展以支持复杂的功能测试。

将if-elif-else语句放在一行上吗？

逐行读取子流程标准输出

问题：逐行读取子流程标准输出

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

相关文章

排行榜展示

文章展示