Python vs Bash - In which kind of tasks does each one outrun the other performance-wise?

Obviously Python is more user-friendly. A quick search on Google shows many results saying that, because Python is byte-compiled, it is usually faster. I even found one result claiming that you can see an improvement of over 2000% on dictionary-based operations.

What is your experience on this matter? In which kind of task is each one a clear winner?


Answer 0

Typical mainframe flow…

Input Disk/Tape/User (runtime) --> Job Control Language (JCL) --> Output Disk/Tape/Screen/Printer
                                   |                          ^
                                   v                          |
                                   `--> COBOL Program --------' 

Typical Linux flow…

Input Disk/SSD/User (runtime) --> sh/bash/ksh/zsh/... ----------> Output Disk/SSD/Screen/Printer
                                   |                          ^
                                   v                          |
                                   `--> Python script --------'
                                   |                          ^
                                   v                          |
                                   `--> awk script -----------'
                                   |                          ^
                                   v                          |
                                   `--> sed script -----------'
                                   |                          ^
                                   v                          |
                                   `--> C/C++ program --------'
                                   |                          ^
                                   v                          |
                                   `--> Java program ---------'
                                   |                          ^
                                   v                          |
                                   :                          :

Shells are the glue of Linux

Linux shells like sh/ksh/bash/… provide input/output/flow-control designation facilities much like the old mainframe Job Control Language… but on steroids! They are Turing complete languages in their own right while being optimized to efficiently pass data and control to and from other executing processes written in any language the O/S supports.

Most Linux applications, regardless of what language the bulk of the program is written in, depend on shell scripts, and Bash has become the most common. Clicking an icon on the desktop usually runs a short Bash script. That script, either directly or indirectly, knows where all the files needed are and sets variables and command line parameters, finally calling the program. That’s a shell’s simplest use.

Linux as we know it, however, would hardly be Linux without the thousands of shell scripts that start up the system, respond to events, control execution priorities, and compile, configure and run programs. Many of these are quite large and complex.

Shells provide an infrastructure that lets us use pre-built components that are linked together at run time rather than compile time. Those components are free-standing programs in their own right that can be used alone or in other combinations without recompiling. The syntax for calling them is indistinguishable from that of a Bash builtin command, and there are in fact numerous builtin commands for which there is also a stand-alone executable on the system, often having additional options.

There is no language-wide difference between Python and Bash in performance. It entirely depends on how each is coded and which external tools are called.

Any of the well-known tools like awk, sed, grep, bc, dc, tr, etc. will leave either language in the dust at the operations they are built for. Bash is then preferred for anything without a graphical user interface, since it is easier and more efficient to call such tools and pass data back from them with Bash than with Python.

Performance

Whether the overall throughput and/or responsiveness will be better or worse than the equivalent Python depends on which programs the Bash shell script calls and how well suited they are to the subtasks they are given. To complicate matters, Python, like most languages, can also call other executables, though it is more cumbersome and thus not done as often.
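
To make "more cumbersome" concrete, here is a minimal sketch of calling an external executable from Python and getting its output back. The helper name run_tool is made up for illustration, and the sketch assumes Python 3.7 or later (for the capture_output and text parameters of subprocess.run).

```python
import subprocess


def run_tool(argv, input_text=""):
    """Run an external program, feed it text on stdin, return (exit_code, stdout).

    run_tool is a hypothetical helper name, not part of any library.
    """
    result = subprocess.run(
        argv,
        input=input_text,     # sent to the child's stdin
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # work with str rather than bytes
        check=False,          # report the exit code instead of raising
    )
    return result.returncode, result.stdout
```

In Bash the same round trip is a one-liner such as out=$(printf 'abc' | tr a-z A-Z), which is roughly the difference in effort this answer is talking about.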

User Interface

One area where Python is the clear winner is user interface. That makes it an excellent language for building local or client-server applications, as GUI toolkits are directly available to it (Tkinter ships with the standard library, and GTK is available through the PyGObject bindings), and it is far more intuitive than Bash.

Bash only understands text. Other tools must be called for a GUI and data passed back from them. A Python script is one option. Faster but less flexible options are the binaries like YAD, Zenity, and GTKDialog.

While shells like Bash work well with GUIs like Yad, GtkDialog (embedded XML-like interface to GTK+ functions), dialog, and xmessage, Python is much more capable and so better for complex GUI windows.

Summary

Building with shell scripts is like assembling a computer with off-the-shelf components the way desktop PCs are.

Building with Python, C++ or most any other language is more like building a computer by soldering the chips (libraries) and other electronic parts together the way smartphones are.

The best results are usually obtained by using a combination of languages where each does what it does best. One developer calls this “polyglot programming”.


Answer 1

Generally, bash works better than python only in those environments where python is not available. :)

Seriously, I have to deal with both languages daily, and will take python instantly over bash if given the choice. Alas, I am forced to use bash on certain “small” platforms because someone has (mistakenly, IMHO) decided that python is “too large” to fit.

While it is true that bash might be faster than python for some select tasks, it can never be as quick to develop with, or as easy to maintain (at least after you get past 10 lines of code or so). Bash’s sole strong point wrt python or ruby or lua, etc., is its ubiquity.


Answer 2

Developer efficiency matters much more to me in scenarios where both bash and Python are sensible choices.

Some tasks lend themselves well to bash, and others to Python. It also isn’t unusual for me to start something as a bash script and change it to Python as it evolves over several weeks.

A big advantage Python has is in corner cases around filename handling, and it also has glob, shutil, subprocess, and others for common scripting needs.
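
As a rough illustration of the filename point, the sketch below copies files that match a pattern without any quoting concerns, because each name is passed around as a single value rather than being re-split on whitespace. The function name copy_matching and its parameters are invented for this example.

```python
import shutil
from pathlib import Path


def copy_matching(src_dir, dst_dir, pattern="*.txt"):
    """Copy files matching `pattern` from src_dir to dst_dir.

    Returns the copied file names, sorted. Names containing spaces or a
    leading dash need no special treatment here.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in src.glob(pattern):
        if path.is_file():
            shutil.copy2(path, dst / path.name)  # copies data and metadata
            copied.append(path.name)
    return sorted(copied)
```

A shell loop doing the same job has to quote every expansion and guard against names that begin with a dash being read as options.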


Answer 3

When you are writing scripts, performance does not matter (in most cases).
If you care about performance, ‘Python vs Bash’ is the wrong question.

Python:
+ easier to write
+ easier to maintain
+ easier code reuse (try to find a universal, error-proof way to include files with common code in sh, I dare you)
+ you can do OOP with it too!
+ easier argument parsing. Well, not easier, exactly: it is still too wordy for my taste, but Python has the argparse facility built in.
– ugly, ugly ‘subprocess’. Try to chain commands and not cry a river over how ugly your code becomes, especially if you care about exit codes.
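
For a sense of what chaining commands while still collecting every exit code looks like with subprocess, here is one possible sketch. The function name run_pipeline is hypothetical, and feeding the input through a temporary file is just one way to avoid pipe deadlocks, not the only one.

```python
import subprocess
import tempfile


def run_pipeline(commands, input_bytes=b""):
    """Run commands as cmd1 | cmd2 | ... and return (exit_codes, final_stdout).

    `commands` is a non-empty list of argv lists. Input is staged in a
    temporary file so the parent never blocks while writing to a full pipe.
    """
    with tempfile.TemporaryFile() as source:
        source.write(input_bytes)
        source.flush()
        source.seek(0)
        procs = []
        upstream = source
        for argv in commands:
            proc = subprocess.Popen(argv, stdin=upstream, stdout=subprocess.PIPE)
            if upstream is not source:
                # The parent drops its copy of the pipe so that EOF and
                # SIGPIPE propagate between the children.
                upstream.close()
            upstream = proc.stdout
            procs.append(proc)
        output, _ = procs[-1].communicate()
        exit_codes = [p.wait() for p in procs]
    return exit_codes, output
```

Something like run_pipeline([["sort"], ["uniq", "-c"]], data) stands in for the Bash line sort | uniq -c, with the returned list of exit codes playing the role of pipefail or PIPESTATUS.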

Bash:
+ ubiquity, as was said earlier.
+ simple command chaining. That’s how you glue different commands together in a simple way. Also, Bash (not sh) has some improvements, like pipefail, so chaining is really short and expressive.
+ does not require third-party programs to be installed; it can be executed right away.
– god, it’s full of gotchas. IFS, CDPATH... thousands of them.

If you are writing a script bigger than 100 LOC: choose Python
If you need path manipulation in the script: choose Python (3)
If you need something like an alias but slightly more complicated: choose Bash/sh
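
The path manipulation rule can be illustrated with pathlib, which is part of the Python 3 standard library. The two helper names below are made up for the example, and PurePosixPath is used so that the results do not depend on the platform running the code.

```python
from pathlib import PurePosixPath


def backup_name(path, suffix=".bak"):
    """Return `path` with `suffix` appended to the file name."""
    p = PurePosixPath(path)
    return str(p.with_name(p.name + suffix))


def swap_extension(path, new_ext):
    """Return `path` with its last extension replaced by `new_ext`."""
    return str(PurePosixPath(path).with_suffix(new_ext))
```

For example, backup_name("/var/log/my app/error.log") gives "/var/log/my app/error.log.bak". The shell equivalents lean on dirname, basename, and parameter expansion, each of which needs careful quoting.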

Anyway, one should try both to get an idea of what they are capable of.

Maybe the answer could be extended with points on packaging and IDE support, but I’m not familiar with those sides.

As always, you have to choose between a turd sandwich and a giant douche. And remember, just a few years ago Perl was the new hope. Where is it now?


Answer 4

Performance-wise, bash outperforms python in process startup time.

Here are some measurements from my core i7 laptop running Linux Mint:

Starting process                       Startup time

empty /bin/sh script                   1.7 ms
empty /bin/bash script                 2.8 ms
empty python script                    11.1 ms
python script with a few libs*         110 ms

*Python loaded libs are: os, os.path, json, time, requests, threading, subprocess

This shows a huge difference. However, bash execution time degrades quickly if it has to do anything sensible, since it usually must call external processes.

If you care about performance, use bash only for:

  • really simple and frequently called scripts
  • scripts that mainly call other processes
  • when you need minimal friction between manual administrative actions and scripting: quickly check a few commands, then put them in a file.sh
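
Startup figures like the ones in the table above could be gathered with a sketch along these lines. It is not the method used for the table (the answer does not say how the numbers were taken), and the name mean_startup_ms is invented here. The result includes the cost of spawning the process from Python, so only the relative differences are meaningful.

```python
import subprocess
import time


def mean_startup_ms(argv, runs=20):
    """Average wall-clock milliseconds to start `argv` and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    return (time.perf_counter() - start) * 1000 / runs
```

On a system that has the interpreters installed, one could compare mean_startup_ms(["/bin/sh", "-c", ":"]), mean_startup_ms(["/bin/bash", "-c", ":"]) and mean_startup_ms(["python3", "-c", "pass"]); the absolute values will vary with hardware and configuration.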

Answer 5

Bash is primarily a batch / shell scripting language with far less support for various data types and all sorts of quirks around control structures — not to mention compatibility issues.

Which is faster? Neither, because you are not comparing apples to apples here. If you had to sort an ASCII text file and you used tools like zcat, sort, uniq, and sed, then you would smoke Python performance-wise.

However, if you need a proper programming environment that supports floating point and various control flow, then Python wins hands down. If you wrote, say, a recursive algorithm in both Bash and Python, the Python version would win by an order of magnitude or more.
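
As a small example of the kind of task meant here, the sketch below combines recursion with floating-point arithmetic, neither of which Bash handles comfortably on its own. The function name bisect_root is made up, and the sketch assumes f(lo) and f(hi) have opposite signs.

```python
def bisect_root(f, lo, hi, tol=1e-9):
    """Recursively narrow [lo, hi] until it brackets a root of f within tol.

    Assumes f(lo) and f(hi) have opposite signs.
    """
    mid = (lo + hi) / 2.0
    if hi - lo < tol:
        return mid
    # Keep the half of the interval where the sign change still occurs.
    if (f(lo) <= 0) == (f(mid) <= 0):
        return bisect_root(f, mid, hi, tol)
    return bisect_root(f, lo, mid, tol)
```

Calling bisect_root(lambda x: x * x - 2, 0.0, 2.0) approximates the square root of 2. A Bash version would have to shell out to bc or awk for every floating-point step.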


Answer 6

If you are looking to cobble together a quick utility with minimal effort, bash is good. For a wrapper round an application, bash is invaluable.

Anything that may have you coming back over and over to add improvements is probably (though not always) better suited to a language like Python, as Bash code comprising over 1000 lines gets very painful to maintain. Bash code is also irritating to debug when it gets long.

Part of the problem with this kind of question is, from my experience, that shell scripts are usually all custom tasks. There have been very few shell scripting tasks that I have come across where there is already a solution freely available.


Answer 7

I believe there are two scenarios where Bash performance is at least equal:

  • Scripting of command line utilities
  • Scripts which take only a short time to execute; where starting the Python interpreter takes more time than the operation itself

That said, I usually don’t really concern myself with performance of the scripting language itself. If performance is a real issue you don’t script but program (possibly in Python).


Answer 8

I’m posting this late answer primarily because Google likes this question.

I believe the issue and context really should be about the workflow, not the tools. The overall philosophy is always “Use the right tool for the job.” But before this comes one that many often forget when they get lost in the tools: “Get the job done.”

When I have a problem that isn’t completely defined, I almost always start with Bash. I have solved some gnarly problems in large Bash scripts that are both readable and maintainable.

But when does the problem start to exceed what Bash should be asked to do? I have some checks I use to give me warnings:

  1. Am I wishing Bash had 2D (or higher) arrays? If yes, it’s time to realize that Bash is not a great data processing language.
  2. Am I doing more work preparing data for other utilities than I am actually running those utilities? If yes, time again to realize Bash is not a great data processing language.
  3. Is my script simply getting too large to manage? If yes, it is important to realize that while Bash can import script libraries, it lacks a package system like other languages. It’s really a “roll your own” language compared to most others. Then again, it has an enormous amount of functionality built in (some say too much…)

The list goes on. Bottom line: when you are working harder to keep your scripts running than you are adding features, it’s time to leave Bash.
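
The first check, wishing for 2D arrays, is the easiest to picture. In Python a table is simply a list of lists, as in the sketch below. The function names are invented for this example.

```python
def parse_table(text):
    """Turn whitespace-separated lines of numbers into a list of rows of floats."""
    return [
        [float(cell) for cell in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]


def column_sums(rows):
    """Sum each column of a rectangular table."""
    return [sum(column) for column in zip(*rows)]
```

With rows = parse_table("1 2 3\n4 5 6\n"), a cell is addressed as rows[1][2] and column_sums(rows) adds up each column. In Bash the usual workarounds are encoding both indices into one associative-array key or handing the whole job to awk.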

Let’s assume you’ve decided to move your work to Python. If your Bash scripts are clean, the initial conversion is quite straightforward. There are even several converters / translators that will do the first pass for you.

The next question is: What do you give up moving to Python?

  1. All calls to external utilities must be wrapped in something from the subprocess module (or equivalent). There are multiple ways to do this, and until Python 3.7 it took some effort to get it right (3.7 added the capture_output and text parameters to subprocess.run(), so it handles most common cases on its own).

  2. Surprisingly, Python has no standard platform-independent non-blocking utility (with timeout) for polling the keyboard (stdin). The Bash read command is an awesome tool for simple user interaction. My most common use is to show a spinner until the user presses a key, while also running a polling function (with each spinner step) to make sure things are still running well. This is a harder problem than it would appear at first, so I often simply make a call to Bash: Expensive, but it does precisely what I need.

  3. If you are developing on an embedded or memory-constrained system, Python’s memory footprint can be many times larger than Bash’s (depending on the task at hand). Plus, there is almost always an instance of Bash already in memory, which may not be the case for Python.

  4. For scripts that run once and exit quickly, Python’s startup time can be much longer than Bash’s. But if the script contains significant calculations, Python quickly pulls ahead.

  5. Python has the most comprehensive package system on the planet. When Bash gets even slightly complex, Python probably has a package that makes whole chunks of Bash become a single call. However, finding the right package(s) to use is the biggest and most daunting part of becoming a Pythonista. Fortunately, Google and StackExchange are your friends.
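
The spinner-with-polling pattern from point 2 can be approximated on Unix-like systems with the select module, as sketched below. This is not platform-independent (select only accepts sockets on Windows), and unlike Bash's read with a timeout and a one-character limit, it only notices input once the terminal delivers it, which in the default line-buffered mode means after Enter. The function names are hypothetical.

```python
import select
import sys


def wait_for_input(stream, timeout_s):
    """Return True if `stream` becomes readable within timeout_s seconds (Unix only)."""
    ready, _, _ = select.select([stream], [], [], timeout_s)
    return bool(ready)


def spin_until_input(stream=None, step_s=0.1, max_steps=50, poll=lambda: None):
    """Show a spinner, calling poll() once per step, until input arrives.

    Returns True if input arrived, False if max_steps elapsed first.
    """
    stream = sys.stdin if stream is None else stream
    frames = "|/-\\"
    for step in range(max_steps):
        sys.stderr.write("\r" + frames[step % len(frames)])
        sys.stderr.flush()
        poll()  # health check run with each spinner step
        if wait_for_input(stream, step_s):
            return True
    return False
```

Reacting to a single key press without Enter would additionally require switching the terminal to raw or cbreak mode (for example with the termios and tty modules), which is part of why this is harder than it first appears.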


Answer 9

I don’t know if this is accurate, but I have found that python/ruby work much better for scripts that have a lot of mathematical computations. Otherwise you have to use dc or some other “arbitrary precision calculator”. It just becomes a very big pain. With python you have much more control over floats vs ints, and it is sometimes much easier to perform a lot of computations.
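
A brief sketch of the control over floats vs ints mentioned here, using only the standard library. The helper names are made up for the example. Bash's built-in arithmetic is integer-only, so each of these would normally mean a call out to bc or dc.

```python
from decimal import Decimal
from fractions import Fraction


def divide_both_ways(a, b):
    """Return (integer quotient, floating-point quotient) of a and b."""
    return a // b, a / b


def exact_total(amounts):
    """Add decimal strings such as '0.1' without binary rounding error."""
    return sum((Decimal(a) for a in amounts), Decimal("0"))


def exact_ratio_sum(pairs):
    """Add (numerator, denominator) pairs as exact fractions."""
    return sum((Fraction(n, d) for n, d in pairs), Fraction(0))
```

For instance, divide_both_ways(7, 2) returns (3, 3.5), and exact_total(["0.1", "0.2"]) equals Decimal("0.3") exactly. Python integers are also arbitrary precision, so a value like 2 ** 100 needs no special handling.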

In particular, I would never work with a bash script to handle binary information or bytes. Instead I would use something like python (maybe) or C++ or even Node.JS.
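
To illustrate the binary data point, here is a minimal sketch using the struct module. The 10-byte header layout (a 4-byte magic string, a 16-bit version, a 32-bit length, all little-endian) is invented purely for the example and does not correspond to any real file format.

```python
import struct

# Hypothetical header layout: 4-byte magic, uint16 version, uint32 payload length.
HEADER_FORMAT = "<4sHI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_header(data):
    """Unpack the made-up header from the start of `data`."""
    magic, version, length = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    return magic, version, length
```

Doing the same in Bash usually means piping through od or xxd and reassembling the numbers by hand, since shell variables cannot hold NUL bytes.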


Answer 10

Performance-wise, both can do equally well, so the question becomes: which saves more development time?

Bash relies on calling other commands and piping them together to create new ones. This has the advantage that you can quickly create new programs just with code borrowed from other people, no matter what programming language they used.

This also has the side effect of resisting change in sub-commands pretty well, as the interface between them is just plain text.

Additionally, Bash is very permissive about how you write it. This means it will work well in a wider variety of contexts, but it also relies on the programmer intending to code in a clean, safe manner. Otherwise Bash won’t stop you from building a mess.

Python is more structured in style, so a messy programmer won’t be as messy. It will also work on operating systems other than Linux, making it instantly more appropriate if you need that kind of portability.

But it isn’t as simple for calling other commands. So if your operating system is Unix, you will most likely find that developing in Bash is the fastest way to develop.

When to use Bash:

  • It’s a non-graphical program, or the engine of a graphical one.
  • It’s only for Unix.

When to use Python:

  • It’s a graphical program.
  • It has to work on Windows.