Tag archive: scripting

How to make a Python script run like a service or daemon in Linux

Question: How to make a Python script run like a service or daemon in Linux

I have written a Python script that checks a certain e-mail address and passes new e-mails to an external program. How can I get this script to execute 24/7, for example by turning it into a daemon or service in Linux? Would I also need a never-ending loop in the program, or can it be done by just having the code re-executed periodically?


Answer 0

You have two options here.

  1. Make a proper cron job that calls your script. Cron is the common name for a GNU/Linux daemon that periodically launches scripts according to a schedule you set. You add your script to a crontab, or place a symlink to it in a special directory, and the daemon handles launching it in the background. You can read more on Wikipedia. There are various different cron daemons, but your GNU/Linux system should already have one installed.

  2. Use some kind of Python approach (a library, for example) so that your script can daemonize itself. Yes, this will require a simple event loop (where your events are timer firings, possibly provided by a sleep function).

I wouldn't recommend choosing 2, because you would in fact be duplicating cron functionality. The Linux system paradigm is to let multiple simple tools interact to solve your problem. Unless there are additional reasons why you need a daemon (beyond triggering periodically), choose the other approach.

Also, if you use daemonize with a loop and a crash happens, no one will check the mail after that (as pointed out by Ivan Nevostruev in comments to this answer). Whereas if the script is added as a cron job, it will simply trigger again.
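For illustration, a crontab entry along these lines (the schedule, interpreter path and script path are hypothetical) would run a mail-checking script every five minutes and append its output to a log:

# open the current user's crontab for editing
$ crontab -e

# run the (hypothetical) mail checker every five minutes
*/5 * * * * /usr/bin/python3 /home/user/check_mail.py >> /home/user/check_mail.log 2>&1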


Answer 1

Here's a nice class that is taken from here (updated to run under Python 3):

#!/usr/bin/env python3

import sys, os, time, atexit
from signal import SIGTERM

class Daemon:
    """
    A generic daemon class.

    Usage: subclass the Daemon class and override the run() method
    """
    def __init__(self, pidfile, stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
        self.stdin = stdin
        self.stdout = stdout
        self.stderr = stderr
        self.pidfile = pidfile

    def daemonize(self):
        """
        Do the UNIX double-fork magic; see Stevens' "Advanced
        Programming in the UNIX Environment" for details (ISBN 0201563177)
        http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16
        """
        try:
            pid = os.fork()
            if pid > 0:
                # exit first parent
                sys.exit(0)
        except OSError as e:
            sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror))
            sys.exit(1)

        # decouple from parent environment
        os.chdir("/")
        os.setsid()
        os.umask(0)

        # do second fork
        try:
            pid = os.fork()
            if pid > 0:
                # exit from second parent
                sys.exit(0)
        except OSError as e:
            sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror))
            sys.exit(1)

        # redirect standard file descriptors
        sys.stdout.flush()
        sys.stderr.flush()
        si = open(self.stdin, 'r')
        so = open(self.stdout, 'a+')
        se = open(self.stderr, 'a+')
        os.dup2(si.fileno(), sys.stdin.fileno())
        os.dup2(so.fileno(), sys.stdout.fileno())
        os.dup2(se.fileno(), sys.stderr.fileno())

        # write pidfile
        atexit.register(self.delpid)
        pid = str(os.getpid())
        with open(self.pidfile, 'w+') as f:
            f.write("%s\n" % pid)

    def delpid(self):
        os.remove(self.pidfile)

    def start(self):
        """
        Start the daemon
        """
        # Check for a pidfile to see if the daemon already runs
        try:
            with open(self.pidfile, 'r') as pf:
                pid = int(pf.read().strip())
        except IOError:
            pid = None

        if pid:
            message = "pidfile %s already exists. Daemon already running?\n"
            sys.stderr.write(message % self.pidfile)
            sys.exit(1)

        # Start the daemon
        self.daemonize()
        self.run()

    def stop(self):
        """
        Stop the daemon
        """
        # Get the pid from the pidfile
        try:
            with open(self.pidfile, 'r') as pf:
                pid = int(pf.read().strip())
        except IOError:
            pid = None

        if not pid:
            message = "pidfile %s does not exist. Daemon not running?\n"
            sys.stderr.write(message % self.pidfile)
            return  # not an error in a restart

        # Try killing the daemon process
        try:
            while 1:
                os.kill(pid, SIGTERM)
                time.sleep(0.1)
        except OSError as err:
            err = str(err)
            if err.find("No such process") > 0:
                if os.path.exists(self.pidfile):
                    os.remove(self.pidfile)
            else:
                print(err)
                sys.exit(1)

    def restart(self):
        """
        Restart the daemon
        """
        self.stop()
        self.start()

    def run(self):
        """
        You should override this method when you subclass Daemon. It will be called after the process has been
        daemonized by start() or restart().
        """

Answer 2

You should use the python-daemon library; it takes care of everything.

From PyPI: Library to implement a well-behaved Unix daemon process.
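As a minimal sketch of how the library is typically used (the mail-checking function below is a placeholder), the DaemonContext context manager detaches the process and your code then runs inside it:

import time
import daemon  # pip install python-daemon

def check_mail_forever():
    # placeholder for the real mail-checking loop
    while True:
        time.sleep(60)

with daemon.DaemonContext():
    check_mail_forever()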


Answer 3

You can use fork() to detach your script from the tty and have it continue to run, like so:

import os, sys

fpid = os.fork()
if fpid != 0:
    # parent process: the child (pid fpid) carries on as the daemon
    sys.exit(0)

Of course you also need to implement an endless loop, like

from time import sleep

while True:
    do_your_check()
    sleep(5)

Hope this gets you started.


Answer 4

You can also make the Python script run as a service using a shell script. First create a shell script to run the Python script, like this (scriptname is an arbitrary name):

#!/bin/sh
script='/home/.. full path to script'
/usr/bin/python $script &

Now make a file in /etc/init.d/scriptname

#! /bin/sh

PATH=/bin:/usr/bin:/sbin:/usr/sbin
DAEMON=/home/.. path to shell script scriptname created to run python script
PIDFILE=/var/run/scriptname.pid

test -x $DAEMON || exit 0

. /lib/lsb/init-functions

case "$1" in
  start)
     log_daemon_msg "Starting feedparser"
     start_daemon -p $PIDFILE $DAEMON
     log_end_msg $?
   ;;
  stop)
     log_daemon_msg "Stopping feedparser"
     killproc -p $PIDFILE $DAEMON
     PID=`ps x |grep feed | head -1 | awk '{print $1}'`
     kill -9 $PID
     log_end_msg $?
   ;;
  force-reload|restart)
     $0 stop
     $0 start
   ;;
  status)
     status_of_proc -p $PIDFILE $DAEMON scriptname && exit 0 || exit $?
   ;;
 *)
   echo "Usage: /etc/init.d/scriptname {start|stop|restart|force-reload|status}"
   exit 1
  ;;
esac

exit 0

Now you can start and stop your Python script using /etc/init.d/scriptname start and /etc/init.d/scriptname stop.
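To have the init script start at boot on Debian/Ubuntu-style systems, you would typically also mark it executable and register it (commands shown for illustration; adjust the script name to your own):

$ sudo chmod +x /etc/init.d/scriptname
$ sudo update-rc.d scriptname defaults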


Answer 5

A simple and supported version is Daemonize.

Install it from the Python Package Index (PyPI):

$ pip install daemonize

and then use it like:

...
import os, sys
from daemonize import Daemonize
...
def main():
    pass  # your code here

if __name__ == '__main__':
    myname = os.path.basename(sys.argv[0])
    pidfile = '/tmp/%s' % myname       # any name
    daemon = Daemonize(app=myname, pid=pidfile, action=main)
    daemon.start()

Answer 6

cron is clearly a great choice for many purposes. However, it doesn't create a service or daemon as you requested in the OP. cron just runs jobs periodically (meaning the job starts and stops), and no more often than once per minute. There are issues with cron: for example, if a prior instance of your script is still running the next time the cron schedule comes around and launches a new instance, is that OK? cron doesn't handle dependencies; it just tries to start a job when the schedule says to.

If you find a situation where you truly need a daemon (a process that never stops running), take a look at supervisord. It provides a simple way to wrap an ordinary, non-daemonized script or program and make it operate like a daemon. This is a much better approach than creating a native Python daemon.
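For illustration, a minimal supervisord program section (the program name, paths and log locations are hypothetical) dropped into /etc/supervisor/conf.d/ could look like this:

[program:mailcheck]
command=/usr/bin/python3 /home/user/check_mail.py
autostart=true
autorestart=true
stdout_logfile=/var/log/mailcheck.out.log
stderr_logfile=/var/log/mailcheck.err.log

After adding it, supervisord is told to pick up the new program (for example with supervisorctl reread and supervisorctl update) and will then keep the process running and restart it if it crashes.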


Answer 7

How about using the nohup command on Linux?

I use it for running my commands on my Bluehost server.

Please advise if I am wrong.
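A typical invocation (script and log names are hypothetical) detaches the command from the terminal's hangup signal so it keeps running after you log out:

$ nohup python3 check_mail.py > check_mail.log 2>&1 &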


Answer 8

If you are using a terminal (ssh or something) and you want to keep a long-running script working after you log out from the terminal, you can try screen:

apt-get install screen

Create a detached virtual terminal inside it (named abc here): screen -dmS abc

Now connect to abc: screen -r abc

Now you can run the Python script: python keep_sending_mails.py

From now on you can close your terminal directly; the Python script will keep running rather than being shut down, since keep_sending_mails.py's PID is a child of the virtual screen session rather than of the terminal (ssh).

If you want to go back and check your script's running status, you can use screen -r abc again.


Answer 9

First, read up on mail aliases. A mail alias will do this inside the mail system without you having to fool around with daemons or services or anything of the sort.

You can write a simple script that will be executed by sendmail each time a mail message is sent to a specific mailbox.

See http://www.feep.net/sendmail/tutorial/intro/aliases.html

If you really want to write a needlessly complex server, you can do this:

nohup python myscript.py &

That's all it takes. Your script simply loops and sleeps.

import time

def do_the_work():
    pass  # one round of polling -- checking email, whatever.

while True:
    time.sleep(600)  # 10 min.
    try:
        do_the_work()
    except:
        pass
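For the alias approach, the sendmail side is a single line in /etc/aliases that pipes mail for a given local address into your script (the alias name and script path below are made up; the pipe syntax is the standard aliases format):

# /etc/aliases
mailbot: "|/usr/bin/python3 /home/user/handle_incoming_mail.py"

After editing the file you would run newaliases to rebuild the alias database; the script then receives each incoming message on its standard input.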

Answer 10

I would recommend this solution. You need to subclass it and override the run method.

import sys
import os
from signal import SIGTERM
from abc import ABCMeta, abstractmethod


class Daemon(metaclass=ABCMeta):

    def __init__(self, pidfile):
        self._pidfile = pidfile

    @abstractmethod
    def run(self):
        pass

    def _daemonize(self):
        # detach from the parent process
        pid = os.fork()

        # stop the parent process
        if pid > 0:
            sys.exit(0)

        # write pid into a pidfile
        with open(self._pidfile, 'w') as f:
            print(os.getpid(), file=f)

    def start(self):
        # if the daemon is already started, throw an error
        if os.path.exists(self._pidfile):
            raise Exception("Daemon is already started")

        # create and switch to the daemon process
        self._daemonize()

        # run the body of the daemon
        self.run()

    def stop(self):
        # check that the pidfile exists
        if os.path.exists(self._pidfile):
            # read pid from the file
            with open(self._pidfile, 'r') as f:
                pid = int(f.read().strip())

            # remove the pidfile
            os.remove(self._pidfile)

            # kill daemon
            os.kill(pid, SIGTERM)

        else:
            raise Exception("Daemon is not started")

    def restart(self):
        self.stop()
        self.start()

Answer 11

To create something that runs like a service, you can use the following approach.

The first thing you must do is install the Cement framework: Cement is a CLI framework that you can deploy your application on.

Command-line interface of the app:

interface.py

from cement.core.foundation import CementApp
from cement.core.controller import CementBaseController, expose
import YourApp


class MyBaseController(CementBaseController):
    class Meta:
        label = 'base'
        description = "your application description"
        arguments = [
            (['-r', '--run'],
             dict(action='store_true', help='Run your application')),
            (['-v', '--version'],
             dict(action='version', version="Your app version")),
            (['-s', '--stop'],
             dict(action='store_true', help="Stop your application")),
        ]

    @expose(hide=True)
    def default(self):
        if self.app.pargs.run:
            # start running your app from here
            YourApp.yourApp()
        if self.app.pargs.stop:
            # stop your application
            YourApp.yourApp.stop()


class App(CementApp):
    class Meta:
        label = 'Uptime'
        base_controller = 'base'
        handlers = [MyBaseController]


with App() as app:
    app.run()

The YourApp.py class:

import threading

class yourApp:
    def __init__(self):
        self.loger = log_exception.exception_loger()
        thread = threading.Thread(target=self.start, args=())
        thread.daemon = True
        thread.start()

    def start(self):
        # do everything you want here
        pass

    def stop(self):
        # do some things to stop your application
        pass

Keep in mind that your app must run on a thread to be a daemon.

To run the app, just do this on the command line:

python interface.py --help


Answer 12

Use whatever service manager your system offers; for example, under Ubuntu use upstart. This will handle all the details for you, such as starting on boot, restarting on crash, etc.
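As a rough sketch (the job name and script path are invented, and the exact stanzas depend on your upstart version), an upstart job in /etc/init/mailcheck.conf could look like:

description "check mail and hand it to an external program"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
exec /usr/bin/python3 /home/user/check_mail.py

It would then be managed with start mailcheck and stop mailcheck, and respawn takes care of restarting after a crash.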


Answer 13

Assuming that you really do want your loop to run 24/7 as a background service:

For a solution that doesn't involve injecting your code with libraries, since you are using Linux you can simply create a systemd service unit:

[Unit]
Description = <Your service description here>
# Start after network interfaces are made available
After = network.target

[Service]
Type = simple
ExecStart = python <Path of the script you want to run>
# User and group to run the script as
User = <user>
Group = <group>
# Restart when there are errors
Restart = on-failure
SyslogIdentifier = <Name of logs for the service>
RestartSec = 5
TimeoutStartSec = infinity

[Install]
# Start when the normal multi-user system comes up
WantedBy = multi-user.target

Place that file in your systemd unit folder (usually /etc/systemd/system/), in a *.service file, and install it using the following systemctl commands (they will likely require sudo privileges):

systemctl enable <service file name without .service extension>

systemctl daemon-reload

systemctl start <service file name without .service extension>

You can then check that your service is running by using the command:

systemctl | grep running

What is the purpose of #!/usr/bin/python3?

Question: What is the purpose of #!/usr/bin/python3?

I have noticed this in a couple of scripting languages, but in this example I am using Python. In many tutorials, they would start with #!/usr/bin/python3 on the first line. I don't understand why we have this.

  • Shouldn't the operating system know it's a Python script (obviously it's installed, since you are making a reference to it)?
  • What if the user is using an operating system that isn't Unix-based?
  • The language is installed in a different folder for whatever reason.
  • The user has a different version, especially when it's not a full version number (like Python3 vs Python32).

If anything, I could see this breaking the Python script because of the reasons listed above.


Answer 0

#!/usr/bin/python3 is a shebang line.

A shebang line defines where the interpreter is located. In this case, the python3 interpreter is located at /usr/bin/python3. A shebang line could also point to a bash, ruby, perl or any other scripting language's interpreter, for example: #!/bin/bash.

Without the shebang line, the operating system does not know it's a Python script, even if you set the execution flag on the script and run it like ./script.py. To make the script run in python3 by default, either invoke it as python3 script.py or set the shebang line.

You can use #!/usr/bin/env python3 for portability across different systems, in case they have the language interpreter installed in different locations.
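For illustration (the file name is made up), the shebang is what lets you execute the file directly once it is marked executable:

$ cat hello.py
#!/usr/bin/env python3
print("hello")

$ chmod +x hello.py
$ ./hello.py    # the kernel reads the shebang and starts python3 for you
hello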


Answer 1

That's called a hash-bang. If you run the script from the shell, it will inspect the first line to figure out what program should be started to interpret the script.

A non-Unix-based OS will use its own rules for figuring out how to run the script. Windows, for example, will use the filename extension, and the # will cause the first line to be treated as a comment.

If the path to the Python executable is wrong, then naturally the script will fail. It is easy to create links to the actual executable from whatever location is specified by standard convention.


Answer 2

This line helps find the program executable that will run the script. This shebang notation is fairly standard across most scripting languages (at least as used on grown-up operating systems).

An important aspect of this line is specifying which interpreter will be used. On many development-centered Linux distributions, for example, it is normal to have several versions of Python installed at the same time.

Python 2.x and Python 3 are not 100% compatible, so this difference can be very important. So #! /usr/bin/python and #! /usr/bin/python3 are not the same (and neither is quite the same as #! /usr/bin/env python3, as noted elsewhere on this page).


Answer 3

  1. And this line is how it knows.

  2. It is ignored.

  3. Then it will fail to run, and the line should be changed to point to the proper location. Or env should be used.

  4. Then it will fail to run, and would probably fail to run under a different version regardless.


Answer 4

To clarify how the shebang line works on Windows, from the Python 3.7 documentation:

  • If the first line of a script file starts with #!, it is known as a "shebang" line. Linux and other Unix-like operating systems have native support for such lines and they are commonly used on such systems to indicate how a script should be executed.
  • The Python Launcher for Windows allows the same facilities to be used with Python scripts on Windows.
  • To allow shebang lines in Python scripts to be portable between Unix and Windows, the launcher supports a number of 'virtual' commands to specify which interpreter to use. The supported virtual commands are:
    • /usr/bin/env python
      • The /usr/bin/env form of shebang line has one further special property. Before looking for installed Python interpreters, this form will search the executable PATH for a Python executable. This corresponds to the behaviour of the Unix env program, which performs a PATH search.
    • /usr/bin/python
    • /usr/local/bin/python
    • python

Answer 5

Actually, determining what type a file is can be very complicated, so the operating system can't just know. It can make lots of guesses based on:

  • extension
  • UTI
  • MIME

But the command line doesn't bother with all of that, because it runs on a limited backwards-compatible layer from a time when that fancy machinery didn't mean anything. If you double-click the file, sure, a modern OS can figure it out, but if you run it from a terminal then no, because the terminal doesn't care about your OS-specific file-typing APIs.

Regarding the other points: it's a convenience. It's similarly possible to run

python3 path/to/your/script

If your python isn't at the specified path, then it won't work, but we tend to install things so that stuff like this works, not the other way around. It doesn't actually matter whether you're under *nix; whether this line is honoured is up to whatever launches the script, because the shebang is a shell/loader convention. So, for example, you can run bash under Windows.

You can actually omit this line entirely; it just means the caller will have to specify an interpreter. Also, don't put your interpreters in nonstandard locations and then try to call scripts without providing an interpreter.


Python recursive folder read

Question: Python recursive folder read

I have a C++/Obj-C background and I am just discovering Python (been writing it for about an hour). I am writing a script to recursively read the contents of text files in a folder structure.

The problem I have is that the code I have written will only work one folder deep. I can see why in the code (see #hardcoded path); I just don't know how to move forward with Python since my experience with it is brand new.

Python code:

import os
import sys

rootdir = sys.argv[1]

for root, subFolders, files in os.walk(rootdir):

    for folder in subFolders:
        outfileName = rootdir + "/" + folder + "/py-outfile.txt" # hardcoded path
        folderOut = open( outfileName, 'w' )
        print "outfileName is " + outfileName

        for file in files:
            filePath = rootdir + '/' + file
            f = open( filePath, 'r' )
            toWrite = f.read()
            print "Writing '" + toWrite + "' to" + filePath
            folderOut.write( toWrite )
            f.close()

        folderOut.close()

Answer 0

Make sure you understand the three return values of os.walk:

for root, subdirs, files in os.walk(rootdir):

which have the following meaning:

  • root: the current path that is being "walked through"
  • subdirs: entries in root that are directories
  • files: entries in root (not in subdirs) that are not directories

And please use os.path.join instead of concatenating with a slash! Your problem is filePath = rootdir + '/' + file: you must concatenate the currently "walked" folder instead of the topmost folder. So that must be filePath = os.path.join(root, file). By the way, "file" is a builtin, so you don't normally use it as a variable name.

Another problem is your loops, which should be like this, for example:

import os
import sys

walk_dir = sys.argv[1]

print('walk_dir = ' + walk_dir)

# If your current working directory may change during script execution, it's recommended to
# immediately convert program arguments to an absolute path. Then the variable root below will
# be an absolute path as well. Example:
# walk_dir = os.path.abspath(walk_dir)
print('walk_dir (absolute) = ' + os.path.abspath(walk_dir))

for root, subdirs, files in os.walk(walk_dir):
    print('--\nroot = ' + root)
    list_file_path = os.path.join(root, 'my-directory-list.txt')
    print('list_file_path = ' + list_file_path)

    with open(list_file_path, 'wb') as list_file:
        for subdir in subdirs:
            print('\t- subdirectory ' + subdir)

        for filename in files:
            file_path = os.path.join(root, filename)

            print('\t- file %s (full path: %s)' % (filename, file_path))

            with open(file_path, 'rb') as f:
                f_content = f.read()
                list_file.write(('The file %s contains:\n' % filename).encode('utf-8'))
                list_file.write(f_content)
                list_file.write(b'\n')

If you didn't know, the with statement for files is a shorthand:

with open('filename', 'rb') as f:
    dosomething()

# is effectively the same as

f = open('filename', 'rb')
try:
    dosomething()
finally:
    f.close()

Answer 1

If you are using Python 3.5 or above, you can get this done in one line.

import glob

# root_dir needs a trailing slash (i.e. /root/dir/)
for filename in glob.iglob(root_dir + '**/*.txt', recursive=True):
    print(filename)

As mentioned in the documentation:

If recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories.

If you want every file, you can use

import glob

for filename in glob.iglob(root_dir + '**/**', recursive=True):
    print(filename)

Answer 2

Agreeing with Dave Webb: os.walk will yield an item for each directory in the tree. The fact is, you just don't have to care about subFolders.

Code like this should work:

import os
import sys

rootdir = sys.argv[1]

for folder, subs, files in os.walk(rootdir):
    with open(os.path.join(folder, 'python-outfile.txt'), 'w') as dest:
        for filename in files:
            with open(os.path.join(folder, filename), 'r') as src:
                dest.write(src.read())

Answer 3

TL;DR: This is the equivalent of find -type f, going over all files in all folders below and including the current one:

import os

for currentpath, folders, files in os.walk('.'):
    for file in files:
        print(os.path.join(currentpath, file))

As already mentioned in other answers, os.walk() is the answer, but it could be explained better. It's quite simple! Let's walk through this tree:

docs/
└── doc1.odt
pics/
todo.txt

With this code:

for currentpath, folders, files in os.walk('.'):
    print(currentpath)

currentpath is the folder it is currently looking at. This will output:

.
./docs
./pics

So it loops three times, because there are three folders: the current one, docs, and pics. In every loop, it fills the variables folders and files with all folders and files. Let's show them:

for currentpath, folders, files in os.walk('.'):
    print(currentpath, folders, files)

This shows us:

# currentpath  folders           files
.              ['pics', 'docs']  ['todo.txt']
./pics         []                []
./docs         []                ['doc1.odt']

So in the first line, we see that we are in folder ., that it contains two folders, namely pics and docs, and that there is one file, namely todo.txt. You don't have to do anything to recurse into those folders, because, as you can see, it recurses automatically and just gives you the files in any subfolders, and in any subfolders of those (though we don't have any in the example).

If you just want to loop through all files, the equivalent of find -type f, you can do this:

for currentpath, folders, files in os.walk('.'):
    for file in files:
        print(os.path.join(currentpath, file))

This outputs:

./todo.txt
./docs/doc1.odt

Answer 4

The pathlib library is really great for working with files. You can do a recursive glob on a Path object like so.

from pathlib import Path

for elem in Path('/path/to/my/files').rglob('*.*'):
    print(elem)
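Note that the pattern '*.*' only matches names that contain a dot. A variant of the same idea (sketch; the directory path is a placeholder) that visits every file regardless of extension is to match everything and filter with is_file():

from pathlib import Path

for elem in Path('/path/to/my/files').rglob('*'):
    if elem.is_file():  # skip directories
        print(elem)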

Answer 5

If you want a flat list of all paths under a given dir (like find . in the shell):

import os

files = [
    os.path.join(parent, name)
    for (parent, subdirs, files) in os.walk(YOUR_DIRECTORY)
    for name in files + subdirs
]

To only include full paths to files under the base dir, leave out + subdirs.


Answer 6

import glob
import os

root_dir = <root_dir_here>

for filename in glob.iglob(root_dir + '**/**', recursive=True):
    if os.path.isfile(filename):
        with open(filename, 'r') as file:
            print(file.read())

**/** is used to get all files recursively, including directories.

if os.path.isfile(filename) is used to check whether the filename variable is a file or a directory; if it is a file, then we can read it. Here I am printing the file's contents.


Answer 7

I've found the following to be the easiest:

from glob import glob
import os

files = [f for f in glob('rootdir/**', recursive=True) if os.path.isfile(f)]

Using glob('some/path/**', recursive=True) gets all files, but it also includes directory names. Adding the if os.path.isfile(f) condition filters this list to existing files only.


Answer 8

Use os.path.join() to construct your paths: it's neater.

import os
import sys
rootdir = sys.argv[1]
for root, subFolders, files in os.walk(rootdir):
    for folder in subFolders:
        outfileName = os.path.join(root, folder, "py-outfile.txt")
        folderOut = open( outfileName, 'w' )
        print "outfileName is " + outfileName
        for file in files:
            filePath = os.path.join(root, file)
            toWrite = open( filePath).read()
            print "Writing '" + toWrite + "' to" + filePath
            folderOut.write( toWrite )
        folderOut.close()

Answer 9

os.walk does a recursive walk by default. For each dir, starting from root, it yields a 3-tuple (dirpath, dirnames, filenames).

from os import walk
from os.path import splitext, join

def select_files(root, files):
    """
    simple logic here to filter out interesting files
    .py files in this example
    """

    selected_files = []

    for file in files:
        # do concatenation here to get the full path
        full_path = join(root, file)
        ext = splitext(file)[1]

        if ext == ".py":
            selected_files.append(full_path)

    return selected_files

def build_recursive_dir_tree(path):
    """
    path    -    where to begin folder scan
    """
    selected_files = []

    for root, dirs, files in walk(path):
        selected_files += select_files(root, files)

    return selected_files

Answer 10

Try this:

import os
import sys

for root, subdirs, files in os.walk(path):

    for file in os.listdir(root):

        filePath = os.path.join(root, file)

        if os.path.isdir(filePath):
            pass

        else:
            f = open(filePath, 'r')
            # Do Stuff

Answer 11

I think the problem is that you're not processing the output of os.walk correctly.

Firstly, change:

filePath = rootdir + '/' + file

to:

filePath = root + '/' + file

rootdir is your fixed starting directory; root is a directory returned by os.walk.

Secondly, you don't need to indent your file-processing loop, as it makes no sense to run it for each subdirectory. root will be set to each subdirectory in turn. You don't need to process the subdirectories by hand unless you want to do something with the directories themselves.


How to delete items from a dictionary while iterating over it?

Question: How to delete items from a dictionary while iterating over it?

Is it legitimate to delete items from a dictionary in Python while iterating over it?

For example:

for k, v in mydict.iteritems():
    if k == val:
        del mydict[k]

The idea is to remove elements that don't meet a certain condition from the dictionary, instead of creating a new dictionary that's a subset of the one being iterated over.

Is this a good solution? Are there more elegant/efficient ways?


Answer 0

EDIT:

This answer will not work in Python 3 and will give a RuntimeError:

RuntimeError: dictionary changed size during iteration.

This happens because mydict.keys() returns an iterator, not a list. As pointed out in the comments, simply convert mydict.keys() to a list with list(mydict.keys()) and it should work.


A simple test in the console shows you cannot modify a dictionary while iterating over it:

>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> for k, v in mydict.iteritems():
...    if k == 'two':
...        del mydict[k]
...
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

As stated in delnan's answer, deleting entries causes problems when the iterator tries to move on to the next entry. Instead, use the keys() method to get a list of the keys and work with that:

>>> for k in mydict.keys():
...    if k == 'two':
...        del mydict[k]
...
>>> mydict
{'four': 4, 'three': 3, 'one': 1}

If you need to delete based on the items' values, use the items() method instead:

>>> for k, v in mydict.items():
...     if v == 3:
...         del mydict[k]
...
>>> mydict
{'four': 4, 'one': 1}

Answer 1

You could also do it in two steps:

remove = [k for k in mydict if k == val]
for k in remove: del mydict[k]

My favorite approach is usually to just make a new dict:

# Python 2.7 and 3.x
mydict = { k:v for k,v in mydict.items() if k!=val }

# before Python 2.7
mydict = dict((k,v) for k,v in mydict.iteritems() if k!=val)

Answer 2

You can't modify a collection while iterating over it. That way lies madness; most notably, if you were allowed to delete the current item, the iterator would have to move on (+1), and the next call to next would take you beyond that (+2), so you'd end up skipping one element (the one right behind the one you deleted). You have two options (see the sketch after this list):

  • Copy all keys (or values, or both, depending on what you need), then iterate over those. You can use .keys() et al. for this (in Python 3, pass the resulting iterator to list). This could be highly wasteful space-wise, though.
  • Iterate over mydict as usual, saving the keys to delete in a separate collection to_delete. When you're done iterating mydict, delete all items in to_delete from mydict. This saves some space (depending on how many keys are deleted and how many stay) over the first approach, but also requires a few more lines.
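A minimal sketch of both options, using a made-up predicate should_remove and the question's mydict:

def should_remove(k, v):
    return v is None  # hypothetical condition

# Option 1: iterate over a copy of the keys
for k in list(mydict.keys()):
    if should_remove(k, mydict[k]):
        del mydict[k]

# Option 2: collect the keys first, delete afterwards
to_delete = [k for k, v in mydict.items() if should_remove(k, v)]
for k in to_delete:
    del mydict[k]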

Answer 3

Iterate over a copy instead, such as the one returned by items():

for k, v in list(mydict.items()):

Answer 4

It's cleanest to use list(mydict):

>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> for k in list(mydict):
...     if k == 'three':
...         del mydict[k]
... 
>>> mydict
{'four': 4, 'two': 2, 'one': 1}

This corresponds to a parallel structure for lists:

>>> mylist = ['one', 'two', 'three', 'four']
>>> for k in list(mylist):                            # or mylist[:]
...     if k == 'three':
...         mylist.remove(k)
... 
>>> mylist
['one', 'two', 'four']

Both work in Python 2 and Python 3.


Answer 5

You can use a dictionary comprehension.

d = {k: d[k] for k in d if d[k] != val}


Answer 6

With Python 3, iterating directly over dic.keys() while deleting will raise the "dictionary changed size" error. You can use this alternative way instead.

Tested with Python 3: it works fine and the error "dictionary changed size during iteration" is not raised:

my_dic = {1: 10, 2: 20, 3: 30}
# It is important to cast here, because the .keys() method returns a dict_keys view object.
key_list = list(my_dic.keys())

# Iterate over the list:
for k in key_list:
    print(key_list)
    print(my_dic)
    del my_dic[k]


print(my_dic)
# {}

回答 7

您可以先构建要删除的键列表,然后遍历该列表以删除它们。

dict = {'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4}
delete = []
for k,v in dict.items():
    if v%2 == 1:
        delete.append(k)
for i in delete:
    del dict[i]

You could first build a list of keys to delete, and then iterate over that list deleting them.

dict = {'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4}
delete = []
for k,v in dict.items():
    if v%2 == 1:
        delete.append(k)
for i in delete:
    del dict[i]

回答 8

如果您要删除的项目始终位于dict迭代的“开始”,则有一种方法可能合适

while mydict:
    key, value = next(iter(mydict.items()))
    if should_delete(key, value):
       del mydict[key]
    else:
       break

仅保证“开始”对于某些Python版本/实现是一致的。例如,Python 3.7新增功能

dict对象的插入顺序保留性质已声明是Python语言规范的正式组成部分。

这种方式避免了很多其他答案所暗示的dict副本,至少在Python 3中如此。

There is a way that may be suitable if the items you want to delete are always at the “beginning” of the dict iteration

while mydict:
    key, value = next(iter(mydict.items()))
    if should_delete(key, value):
       del mydict[key]
    else:
       break

The “beginning” is only guaranteed to be consistent for certain Python versions/implementations. For example from What’s New In Python 3.7

the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.

This way avoids a copy of the dict that a lot of the other answers suggest, at least in Python 3.
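
As a usage sketch, with a hypothetical should_delete predicate (not part of the original answer) that drops entries with negative values:

def should_delete(key, value):
    # purely illustrative predicate
    return value < 0

mydict = {'a': -1, 'b': -2, 'c': 3, 'd': 4}

# Deletes from the "front" of the iteration order and stops at the first
# entry that should be kept (relies on insertion order, guaranteed in 3.7+).
while mydict:
    key, value = next(iter(mydict.items()))
    if should_delete(key, value):
        del mydict[key]
    else:
        break

print(mydict)  # {'c': 3, 'd': 4}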


回答 9

我在Python3中尝试了上述解决方案,但在将对象存储在dict中时,似乎这是唯一对我有用的解决方案。基本上,您会复制dict()并对其进行迭代,同时删除原始词典中的条目。

        tmpDict = realDict.copy()
        for key, value in tmpDict.items():
            if value:
                del(realDict[key])

I tried the above solutions in Python3 but this one seems to be the only one working for me when storing objects in a dict. Basically you make a copy of your dict() and iterate over that while deleting the entries in your original dictionary.

        tmpDict = realDict.copy()
        for key, value in tmpDict.items():
            if value:
                del(realDict[key])

如何获取当前正在执行的文件的路径和名称?

问题:如何获取当前正在执行的文件的路径和名称?

我有调用其他脚本文件的脚本,但是我需要获取该进程中当前正在运行的文件的文件路径。

例如,假设我有三个文件。使用execfile

  • script_1.py 调用 script_2.py。
  • script_2.py 又调用 script_3.py。

我怎样才能从 script_3.py 内部的代码获得 script_3.py 的文件名和路径,而无需由 script_2.py 以参数形式传入这些信息?

(执行os.getcwd()将返回原始启动脚本的文件路径,而不是当前文件的路径。)

I have scripts calling other script files but I need to get the filepath of the file that is currently running within the process.

For example, let’s say I have three files. Using execfile:

  • script_1.py calls script_2.py.
  • In turn, script_2.py calls script_3.py.

How can I get the file name and path of script_3.py, from code within script_3.py, without having to pass that information as arguments from script_2.py?

(Executing os.getcwd() returns the original starting script’s filepath not the current file’s.)


回答 0

p1.py:

execfile("p2.py")

p2.py:

import inspect, os
print (inspect.getfile(inspect.currentframe())) # script filename (usually with path)
print (os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))) # script directory

p1.py:

execfile("p2.py")

p2.py:

import inspect, os
print (inspect.getfile(inspect.currentframe())) # script filename (usually with path)
print (os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))) # script directory

回答 1

__file__

正如其他人所说。您可能还想使用os.path.realpath消除符号链接:

import os

os.path.realpath(__file__)
__file__

as others have said. You may also want to use os.path.realpath to eliminate symlinks:

import os

os.path.realpath(__file__)
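
A small sketch of the difference, assuming the script might be reached through a symlink (the printed paths are machine-specific):

import os

print(os.path.abspath(__file__))   # absolute path as referenced, may still point at a symlink
print(os.path.realpath(__file__))  # symlinks resolved to the real file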

回答 2

更新2018-11-28:

以下是使用Python 2和3进行实验的摘要。

main.py-运行foo.py
foo.py-运行lib / bar.py
lib / bar.py-打印文件路径表达式

| Python | Run statement       | Filepath expression                    |
|--------+---------------------+----------------------------------------|
|      2 | execfile            | os.path.abspath(inspect.stack()[0][1]) |
|      2 | from lib import bar | __file__                               |
|      3 | exec                | (wasn't able to obtain it)             |
|      3 | import lib.bar      | __file__                               |

对于 Python 2,更清晰的做法是改用包,这样就可以使用 from lib import bar:只需在两个文件夹中各添加一个空的 __init__.py 文件即可。

对于 Python 3,execfile 不存在了,最接近的替代是 exec(open(<filename>).read()),不过这会影响堆栈帧。最简单的做法是直接使用 import foo 和 import lib.bar,不需要 __init__.py 文件。

另请参见import和execfile之间的区别


原始答案:

这是基于该线程答案的实验-Windows上的Python 2.7.10。

基于堆栈(inspect.stack)的方法似乎是唯一能给出可靠结果的方法。其中最后两个的语法最短,即:

print os.path.abspath(inspect.stack()[0][1])                   # C:\filepaths\lib\bar.py
print os.path.dirname(os.path.abspath(inspect.stack()[0][1]))  # C:\filepaths\lib

真希望这些能作为函数加进 sys!归功于 @Usagi 和 @pablog。

实验基于以下三个文件,并在 main.py 所在文件夹中用 python main.py 运行(也尝试过使用绝对路径的 execfile,以及从另一个文件夹调用)。

C:\filepaths\main.py: execfile('foo.py')
C:\filepaths\foo.py: execfile('lib/bar.py')
C:\filepaths\lib\bar.py:

import sys
import os
import inspect

print "Python " + sys.version
print

print __file__                                        # main.py
print sys.argv[0]                                     # main.py
print inspect.stack()[0][1]                           # lib/bar.py
print sys.path[0]                                     # C:\filepaths
print

print os.path.realpath(__file__)                      # C:\filepaths\main.py
print os.path.abspath(__file__)                       # C:\filepaths\main.py
print os.path.basename(__file__)                      # main.py
print os.path.basename(os.path.realpath(sys.argv[0])) # main.py
print

print sys.path[0]                                     # C:\filepaths
print os.path.abspath(os.path.split(sys.argv[0])[0])  # C:\filepaths
print os.path.dirname(os.path.abspath(__file__))      # C:\filepaths
print os.path.dirname(os.path.realpath(sys.argv[0]))  # C:\filepaths
print os.path.dirname(__file__)                       # (empty string)
print

print inspect.getfile(inspect.currentframe())         # lib/bar.py

print os.path.abspath(inspect.getfile(inspect.currentframe())) # C:\filepaths\lib\bar.py
print os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) # C:\filepaths\lib
print

print os.path.abspath(inspect.stack()[0][1])          # C:\filepaths\lib\bar.py
print os.path.dirname(os.path.abspath(inspect.stack()[0][1]))  # C:\filepaths\lib
print

Update 2018-11-28:

Here is a summary of experiments with Python 2 and 3. With

main.py – runs foo.py
foo.py – runs lib/bar.py
lib/bar.py – prints filepath expressions

| Python | Run statement       | Filepath expression                    |
|--------+---------------------+----------------------------------------|
|      2 | execfile            | os.path.abspath(inspect.stack()[0][1]) |
|      2 | from lib import bar | __file__                               |
|      3 | exec                | (wasn't able to obtain it)             |
|      3 | import lib.bar      | __file__                               |

For Python 2, it might be clearer to switch to packages so can use from lib import bar – just add empty __init__.py files to the two folders.

For Python 3, execfile doesn’t exist – the nearest alternative is exec(open(<filename>).read()), though this affects the stack frames. It’s simplest to just use import foo and import lib.bar – no __init__.py files needed.

See also Difference between import and execfile
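
If the goal in Python 3 is to run another file and still have a meaningful __file__ inside it, the standard-library runpy module is one alternative to exec(open(...).read()); this is a sketch of that idea, not something from the original answer:

import runpy

# Runs lib/bar.py with __file__ set to the given path, so the
# path expressions shown above keep working inside bar.py.
result_globals = runpy.run_path('lib/bar.py', run_name='__main__')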


Original Answer:

Here is an experiment based on the answers in this thread – with Python 2.7.10 on Windows.

The stack-based ones are the only ones that seem to give reliable results. The last two have the shortest syntax, i.e. –

print os.path.abspath(inspect.stack()[0][1])                   # C:\filepaths\lib\bar.py
print os.path.dirname(os.path.abspath(inspect.stack()[0][1]))  # C:\filepaths\lib

Here’s to these being added to sys as functions! Credit to @Usagi and @pablog

Based on the following three files, and running main.py from its folder with python main.py (also tried execfiles with absolute paths and calling from a separate folder).

C:\filepaths\main.py: execfile('foo.py')
C:\filepaths\foo.py: execfile('lib/bar.py')
C:\filepaths\lib\bar.py:

import sys
import os
import inspect

print "Python " + sys.version
print

print __file__                                        # main.py
print sys.argv[0]                                     # main.py
print inspect.stack()[0][1]                           # lib/bar.py
print sys.path[0]                                     # C:\filepaths
print

print os.path.realpath(__file__)                      # C:\filepaths\main.py
print os.path.abspath(__file__)                       # C:\filepaths\main.py
print os.path.basename(__file__)                      # main.py
print os.path.basename(os.path.realpath(sys.argv[0])) # main.py
print

print sys.path[0]                                     # C:\filepaths
print os.path.abspath(os.path.split(sys.argv[0])[0])  # C:\filepaths
print os.path.dirname(os.path.abspath(__file__))      # C:\filepaths
print os.path.dirname(os.path.realpath(sys.argv[0]))  # C:\filepaths
print os.path.dirname(__file__)                       # (empty string)
print

print inspect.getfile(inspect.currentframe())         # lib/bar.py

print os.path.abspath(inspect.getfile(inspect.currentframe())) # C:\filepaths\lib\bar.py
print os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) # C:\filepaths\lib
print

print os.path.abspath(inspect.stack()[0][1])          # C:\filepaths\lib\bar.py
print os.path.dirname(os.path.abspath(inspect.stack()[0][1]))  # C:\filepaths\lib
print

回答 3

我认为这更干净:

import inspect
print inspect.stack()[0][1]

并获得与以下信息相同的信息:

print inspect.getfile(inspect.currentframe())

其中[0]是堆栈中的当前帧(堆栈的顶部),[1]是文件名,请增加以在堆栈中向后移动,即

print inspect.stack()[1][1]

将是调用当前框架的脚本的文件名。另外,使用[-1]将使您到达堆栈的底部,即原始调用脚本。

I think this is cleaner:

import inspect
print inspect.stack()[0][1]

and gets the same information as:

print inspect.getfile(inspect.currentframe())

Where [0] is the current frame in the stack (top of stack) and [1] is for the file name, increase to go backwards in the stack i.e.

print inspect.stack()[1][1]

would be the file name of the script that called the current frame. Also, using [-1] will get you to the bottom of the stack, the original calling script.
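
A two-file sketch of walking up the stack; the file names helpers.py and main_script.py are hypothetical:

# helpers.py
import inspect

def report_frames():
    print(inspect.stack()[0][1])   # .../helpers.py      - the current frame
    print(inspect.stack()[1][1])   # .../main_script.py  - the caller's frame
    print(inspect.stack()[-1][1])  # bottom of the stack: the originally invoked script

# main_script.py
#   import helpers
#   helpers.report_frames()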


回答 4

import os
os.path.dirname(__file__) # relative directory path
os.path.abspath(__file__) # absolute file path
os.path.basename(__file__) # the file name only
import os
os.path.dirname(__file__) # relative directory path
os.path.abspath(__file__) # absolute file path
os.path.basename(__file__) # the file name only

回答 5

如果您的脚本仅包含一个文件,则标记为“最佳”的建议都是正确的。

如果您想从一个可能被作为模块导入的文件中得到可执行脚本的名称(即传给 Python 解释器、作为当前程序入口的根文件),则需要这样做(假设这段代码位于名为 foo.py 的文件中):

import inspect

print inspect.stack()[-1][1]

因为堆栈上的最后一项([-1])正是最先进入堆栈的那一项(堆栈是 LIFO/FILO 数据结构)。

然后在文件 bar.py 中,如果您 import foo,它打印的将是 bar.py 而不是 foo.py;而 foo.py 才是下面所有这些表达式的值:

  • __file__
  • inspect.getfile(inspect.currentframe())
  • inspect.stack()[0][1]

The suggestions marked as best are all true if your script consists of only one file.

If you want to find out the name of the executable (i.e. the root file passed to the python interpreter for the current program) from a file that may be imported as a module, you need to do this (let’s assume this is in a file named foo.py):

import inspect

print inspect.stack()[-1][1]

Because the last thing ([-1]) on the stack is the first thing that went into it (stacks are LIFO/FILO data structures).

Then in file bar.py if you import foo it’ll print bar.py, rather than foo.py, which would be the value of all of these:

  • __file__
  • inspect.getfile(inspect.currentframe())
  • inspect.stack()[0][1]

回答 6

import os
print os.path.basename(__file__)

这只会给我们文件名。即如果文件的绝对路径为 c:\abcd\abc.py,则第二行将打印 abc.py。

import os
print os.path.basename(__file__)

this will give us the filename only. i.e. if abspath of file is c:\abcd\abc.py then 2nd line will print abc.py


回答 7

不太清楚您所说的“进程中当前正在运行的文件的文件路径”是什么意思。sys.argv[0] 通常包含由 Python 解释器调用的脚本的位置。查看 sys 文档以获取更多详细信息。

正如 @Tim 和 @Pat Notz 指出的那样,__file__ 属性提供了对以下内容的访问:

加载该模块的文件(如果它是从文件加载的)

It’s not entirely clear what you mean by “the filepath of the file that is currently running within the process”. sys.argv[0] usually contains the location of the script that was invoked by the Python interpreter. Check the sys documentation for more details.

As @Tim and @Pat Notz have pointed out, the __file__ attribute provides access to

the file from which the module was loaded, if it was loaded from a file
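
A quick way to see the difference is to print both values from a module that can either be run directly or be imported; this is a small sketch, not part of the quoted answer:

import sys

print(sys.argv[0])  # the script handed to the interpreter on the command line
print(__file__)     # the file this particular module was loaded from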


回答 8

我有一个必须在Windows环境下工作的脚本。这段代码是我完成的:

import os,sys
PROJECT_PATH = os.path.abspath(os.path.split(sys.argv[0])[0])

这是一个不明智的决定。但这不需要外部库,这对我来说是最重要的。

I have a script that must work under windows environment. This code snipped is what I’ve finished with:

import os,sys
PROJECT_PATH = os.path.abspath(os.path.split(sys.argv[0])[0])

it’s quite a hacky decision. But it requires no external libraries and it’s the most important thing in my case.


回答 9

尝试这个,

import os
os.path.dirname(os.path.realpath(__file__))

Try this,

import os
os.path.dirname(os.path.realpath(__file__))

回答 10

import os
os.path.dirname(os.path.abspath(__file__))

无需检查或任何其他库。

当我需要导入一个位于其他目录(不同于被执行脚本所在目录)的脚本,而该脚本又使用与它位于同一文件夹中的配置文件时,这个方法对我有效。

import os
os.path.dirname(os.path.abspath(__file__))

No need for inspect or any other library.

This worked for me when I had to import a script (from a different directory then the executed script), that used a configuration file residing in the same folder as the imported script.


回答 11

__file__属性适用于包含主要执行代码的文件以及导入的模块。

参见https://web.archive.org/web/20090918095828/http://pyref.infogami.com/__file__

The __file__ attribute works for both the file containing the main execution code as well as imported modules.

See https://web.archive.org/web/20090918095828/http://pyref.infogami.com/__file__


回答 12

import sys

print sys.path[0]

这将打印当前正在执行的脚本所在的目录

import sys

print sys.path[0]

this would print the directory containing the currently executing script


回答 13

我认为就是 __file__。听起来您可能还想看看 inspect 模块。

I think it’s just __file__. Sounds like you may also want to check out the inspect module.


回答 14

您可以使用 inspect.stack()

import inspect,os
inspect.stack()[0]  => (<frame object at 0x00AC2AC0>, 'g:\\Python\\Test\\_GetCurrentProgram.py', 15, '<module>', ['print inspect.stack()[0]\n'], 0)
os.path.abspath (inspect.stack()[0][1]) => 'g:\\Python\\Test\\_GetCurrentProgram.py'

You can use inspect.stack()

import inspect,os
inspect.stack()[0]  => (<frame object at 0x00AC2AC0>, 'g:\\Python\\Test\\_GetCurrentProgram.py', 15, '<module>', ['print inspect.stack()[0]\n'], 0)
os.path.abspath (inspect.stack()[0][1]) => 'g:\\Python\\Test\\_GetCurrentProgram.py'

回答 15

由于Python 3相当主流,因此我想提供一个pathlib答案,因为我认为它现在可能是访问文件和路径信息的更好工具。

from pathlib import Path

current_file: Path = Path(__file__).resolve()

如果要获取当前文件所在的目录,只需在 Path() 语句后加上 .parent 即可:

current_path: Path = Path(__file__).parent.resolve()

Since Python 3 is fairly mainstream, I wanted to include a pathlib answer, as I believe that it is probably now a better tool for accessing file and path information.

from pathlib import Path

current_file: Path = Path(__file__).resolve()

If you are seeking the directory of the current file, it is as easy as adding .parent to the Path() statement:

current_path: Path = Path(__file__).parent.resolve()

回答 16

import sys
print sys.argv[0]
import sys
print sys.argv[0]

回答 17

这应该工作:

import os,sys
filename=os.path.basename(os.path.realpath(sys.argv[0]))
dirname=os.path.dirname(os.path.realpath(sys.argv[0]))

This should work:

import os,sys
filename=os.path.basename(os.path.realpath(sys.argv[0]))
dirname=os.path.dirname(os.path.realpath(sys.argv[0]))

回答 18

print(__file__)
print(__import__("pathlib").Path(__file__).parent)
print(__file__)
print(__import__("pathlib").Path(__file__).parent)

回答 19

获取执行脚本的目录

 print os.path.dirname( inspect.getfile(inspect.currentframe()))

To get directory of executing script

 print os.path.dirname( inspect.getfile(inspect.currentframe()))

回答 20

我一直只使用“当前工作目录”或CWD的os功能。这是标准库的一部分,非常容易实现。这是一个例子:

    import os
    base_directory = os.getcwd()

I have always just used the os feature of Current Working Directory, or CWD. This is part of the standard library, and is very easy to implement. Here is an example:

    import os
    base_directory = os.getcwd()

回答 21

我使用过基于 __file__ 的方法,
os.path.abspath(__file__)
但有一个小问题:第一次运行代码时它返回 .py 文件,之后的运行则给出 *.pyc 文件的名称,
所以我改用:
inspect.getfile(inspect.currentframe())
或
sys._getframe().f_code.co_filename

I used the approach with __file__
os.path.abspath(__file__)
but there is a little trick, it returns the .py file when the code is run the first time, next runs give the name of *.pyc file
so I stayed with:
inspect.getfile(inspect.currentframe())
or
sys._getframe().f_code.co_filename


回答 22

我编写了一个函数,它考虑了 Eclipse 调试器和 unittest 的情况。它返回您启动的第一个脚本所在的文件夹。您可以选择传入 __file__ 变量,但关键是您不必在整个调用层次结构中传递这个变量。

也许您可以处理其他我没看到的特殊情况,但是对我来说还可以。

import inspect, os
def getRootDirectory(_file_=None):
    """
    Get the directory of the root execution file
    Can help: http://stackoverflow.com/questions/50499/how-do-i-get-the-path-and-name-of-the-file-that-is-currently-executing
    For eclipse user with unittest or debugger, the function search for the correct folder in the stack
    You can pass __file__ (with 4 underscores) if you want the caller directory
    """
    # If we don't have the __file__ :
    if _file_ is None:
        # We get the last :
        rootFile = inspect.stack()[-1][1]
        folder = os.path.abspath(rootFile)
        # If we use unittest :
        if ("/pysrc" in folder) & ("org.python.pydev" in folder):
            previous = None
            # We search from left to right the case.py :
            for el in inspect.stack():
                currentFile = os.path.abspath(el[1])
                if ("unittest/case.py" in currentFile) | ("org.python.pydev" in currentFile):
                    break
                previous = currentFile
            folder = previous
        # We return the folder :
        return os.path.dirname(folder)
    else:
        # We return the folder according to specified __file__ :
        return os.path.dirname(os.path.realpath(_file_))

I wrote a function which takes into account the Eclipse debugger and unittest. It returns the folder of the first script you launch. You can optionally specify the __file__ var, but the main thing is that you don’t have to share this variable across all your calling hierarchy.

Maybe you can handle other particular stack cases I didn’t see, but for me it’s OK.

import inspect, os
def getRootDirectory(_file_=None):
    """
    Get the directory of the root execution file
    Can help: http://stackoverflow.com/questions/50499/how-do-i-get-the-path-and-name-of-the-file-that-is-currently-executing
    For eclipse user with unittest or debugger, the function search for the correct folder in the stack
    You can pass __file__ (with 4 underscores) if you want the caller directory
    """
    # If we don't have the __file__ :
    if _file_ is None:
        # We get the last :
        rootFile = inspect.stack()[-1][1]
        folder = os.path.abspath(rootFile)
        # If we use unittest :
        if ("/pysrc" in folder) & ("org.python.pydev" in folder):
            previous = None
            # We search from left to right the case.py :
            for el in inspect.stack():
                currentFile = os.path.abspath(el[1])
                if ("unittest/case.py" in currentFile) | ("org.python.pydev" in currentFile):
                    break
                previous = currentFile
            folder = previous
        # We return the folder :
        return os.path.dirname(folder)
    else:
        # We return the folder according to specified __file__ :
        return os.path.dirname(os.path.realpath(_file_))

回答 23

要保持跨平台(macOS / Windows / Linux)的迁移一致性,请尝试:

path = r'%s' % os.getcwd().replace('\\','/')

To keep the migration consistency across platforms (macOS/Windows/Linux), try:

path = r'%s' % os.getcwd().replace('\\','/')


回答 24

最简单的方法是:

script_1.py中:

import subprocess
subprocess.call(['python3',<path_to_script_2.py>])

script_2.py中:

sys.argv[0]

PS:我已经尝试过execfile,但是由于它以字符串形式读取script_2.py,所以sys.argv[0]返回了<string>

Simplest way is:

in script_1.py:

import subprocess
subprocess.call(['python3',<path_to_script_2.py>])

in script_2.py:

sys.argv[0]

P.S.: I’ve tried execfile, but since it reads script_2.py as a string, sys.argv[0] returned <string>.


回答 25

这是我使用的方法,这样我就可以把代码放到任何地方都不出问题。__name__ 总是有定义的,但 __file__ 只有在代码以文件形式运行时才有定义(例如,在 IDLE/iPython 中就没有)。

    if '__file__' in globals():
        self_name = globals()['__file__']
    elif '__file__' in locals():
        self_name = locals()['__file__']
    else:
        self_name = __name__

或者,可以这样写:

self_name = globals().get('__file__', locals().get('__file__', __name__))

Here is what I use so I can throw my code anywhere without issue. __name__ is always defined, but __file__ is only defined when the code is run as a file (e.g. not in IDLE/iPython).

    if '__file__' in globals():
        self_name = globals()['__file__']
    elif '__file__' in locals():
        self_name = locals()['__file__']
    else:
        self_name = __name__

Alternatively, this can be written as:

self_name = globals().get('__file__', locals().get('__file__', __name__))

回答 26

这些答案大多数都是用Python 2.x或更早版本编写的。在Python 3.x中,print函数的语法已更改为需要括号,即print()。

因此,Python 2.x中来自user13993的较早的高分答案:

import inspect, os
print inspect.getfile(inspect.currentframe()) # script filename (usually with path)
print os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) # script directory

在Python 3.x中成为:

import inspect, os
print(inspect.getfile(inspect.currentframe())) # script filename (usually with path)
print(os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) ) # script directory

Most of these answers were written in Python version 2.x or earlier. In Python 3.x the syntax for the print function has changed to require parentheses, i.e. print().

So, this earlier high score answer from user13993 in Python 2.x:

import inspect, os
print inspect.getfile(inspect.currentframe()) # script filename (usually with path)
print os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) # script directory

Becomes in Python 3.x:

import inspect, os
print(inspect.getfile(inspect.currentframe())) # script filename (usually with path)
print(os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) ) # script directory

回答 27

如果您只想要文件名,而不带 ./ 或 .py,可以试试这个:

# filename = testscript.py
file_name = __file__[2:-3]

file_name 将打印 testscript,您可以通过更改 [] 中的索引来得到任何想要的部分

If you want just the filename without ./ or .py, you can try this:

# filename = testscript.py
file_name = __file__[2:-3]

file_name will print testscript; you can generate whatever you want by changing the index inside []


回答 28

import os

import wx


# print the current working directory (not necessarily this file's folder)
print(os.getcwd())

icon = wx.Icon(os.getcwd() + '/img/image.png', wx.BITMAP_TYPE_PNG, 16, 16)

# put the icon on the frame
self.SetIcon(icon)
import os

import wx


# print the current working directory (not necessarily this file's folder)
print(os.getcwd())

icon = wx.Icon(os.getcwd() + '/img/image.png', wx.BITMAP_TYPE_PNG, 16, 16)

# put the icon on the frame
self.SetIcon(icon)

获取Python中当前脚本的名称

问题:获取Python中当前脚本的名称

我正在尝试获取当前正在运行的Python脚本的名称。

我有一个名为 foo.py 的脚本,我想做类似这样的事情来得到脚本名称:

print Scriptname

I’m trying to get the name of the Python script that is currently running.

I have a script called foo.py and I’d like to do something like this in order to get the script name:

print Scriptname

回答 0

您可以使用__file__获取当前文件的名称。在主模块中使用时,这是最初调用的脚本的名称。

如果要省略目录部分(可能存在),可以使用os.path.basename(__file__)

You can use __file__ to get the name of the current file. When used in the main module, this is the name of the script that was originally invoked.

If you want to omit the directory part (which might be present), you can use os.path.basename(__file__).


回答 1

import sys
print sys.argv[0]

对于 python foo.py,这将打印 foo.py;对于 python dir/foo.py,则打印 dir/foo.py,等等。它是传给 python 的第一个参数。(请注意,经过 py2exe 打包后它将是 foo.exe。)

import sys
print sys.argv[0]

This will print foo.py for python foo.py, dir/foo.py for python dir/foo.py, etc. It’s the first argument to python. (Note that after py2exe it would be foo.exe.)


回答 2

为了完整起见,我认为值得总结各种可能的结果,并为每种结果的确切行为提供参考:

  • __file__是当前正在执行的文件,如官方文档中所述

    __file__ 是加载该模块的文件的路径名(如果它是从文件加载的)。对于某些类型的模块,例如静态链接进解释器的 C 模块,__file__ 属性可能不存在;对于从共享库动态加载的扩展模块,它是共享库文件的路径名。

    从 Python 3.4 起(参见 issue 18416),__file__ 始终是绝对路径,除非当前执行的文件是使用相对路径直接执行的脚本(而不是通过解释器的 -m 命令行选项运行)。

  • __main__.__file__(需要import __main__)仅访问主模块的上述__file__属性,例如,从命令行调用的脚本的属性。

  • sys.argv[0](需要import sys)是从命令行调用的脚本名称,并且可能是绝对路径,如官方文档中所述

    argv[0] 是脚本名称(它是否为完整路径名取决于操作系统)。如果命令是通过解释器的 -c 命令行选项执行的,argv[0] 会被设置为字符串 '-c'。如果没有脚本名称传给 Python 解释器,argv[0] 则为空字符串。

    正如这个问题的另一个回答所提到的,通过 py2exe 或 PyInstaller 等工具转换成独立可执行程序的 Python 脚本,在使用这种方法时可能得不到预期结果(也就是说,sys.argv[0] 保存的是可执行文件的名称,而不是该可执行文件内主 Python 文件的名称)。

  • 如果上述选项似乎都不起作用(可能是由于不规则的导入操作造成的),那么 inspect 模块可能会有用。特别是,对 inspect.currentframe() 调用 inspect.getfile(...) 也许可行,不过在不提供 Python 堆栈帧的实现中,后者会返回 None。


处理符号链接

如果当前脚本是符号链接,那么以上所有方法返回的都将是符号链接的路径而不是真实文件的路径,因此应对它们的结果调用 os.path.realpath(...) 来提取后者。


提取实际文件名的进一步操作

可以对上述任何结果调用 os.path.basename(...) 来提取实际的文件名,还可以再对文件名调用 os.path.splitext(...) 来去掉后缀,如 os.path.splitext(os.path.basename(...)) 所示。

从 Python 3.4 起(参见 PEP 428),pathlib 模块的 PurePath 类同样可用于上述任何结果。具体来说,pathlib.PurePath(...).name 提取实际文件名,pathlib.PurePath(...).stem 提取不带后缀的文件名。

For completeness’ sake, I thought it would be worthwhile summarizing the various possible outcomes and supplying references for the exact behaviour of each:

  • __file__ is the currently executing file, as detailed in the official documentation:

    __file__ is the pathname of the file from which the module was loaded, if it was loaded from a file. The __file__ attribute may be missing for certain types of modules, such as C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.

    From Python3.4 onwards, per issue 18416, __file__ is always an absolute path, unless the currently executing file is a script that has been executed directly (not via the interpreter with the -m command line option) using a relative path.

  • __main__.__file__ (requires importing __main__) simply accesses the aforementioned __file__ attribute of the main module, e.g. of the script that was invoked from the command line.

  • sys.argv[0] (requires importing sys) is the script name that was invoked from the command line, and might be an absolute path, as detailed in the official documentation:

    argv[0] is the script name (it is operating system dependent whether this is a full pathname or not). If the command was executed using the -c command line option to the interpreter, argv[0] is set to the string '-c'. If no script name was passed to the Python interpreter, argv[0] is the empty string.

    As mentioned in another answer to this question, Python scripts that were converted into stand-alone executable programs via tools such as py2exe or PyInstaller might not display the desired result when using this approach (i.e. sys.argv[0] would hold the name of the executable rather than the name of the main Python file within that executable).

  • If none of the aforementioned options seem to work, probably due to an irregular import operation, the inspect module might prove useful. In particular, invoking inspect.getfile(...) on inspect.currentframe() could work, although the latter would return None when running in an implementation without Python stack frame.


Handling symbolic links

If the current script is a symbolic link, then all of the above would return the path of the symbolic link rather than the path of the real file and os.path.realpath(...) should be invoked in order to extract the latter.


Further manipulations that extract the actual file name

os.path.basename(...) may be invoked on any of the above in order to extract the actual file name and os.path.splitext(...) may be invoked on the actual file name in order to truncate its suffix, as in os.path.splitext(os.path.basename(...)).

From Python 3.4 onwards, per PEP 428, the PurePath class of the pathlib module may be used as well on any of the above. Specifically, pathlib.PurePath(...).name extracts the actual file name and pathlib.PurePath(...).stem extracts the actual file name without its suffix.
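
Putting those manipulations together on a single (hypothetical) path stored in some_path:

import os
import pathlib

some_path = '/tmp/example/script.py'  # illustrative path, not from the answer

print(os.path.basename(some_path))                       # script.py
print(os.path.splitext(os.path.basename(some_path))[0])  # script
print(pathlib.PurePath(some_path).name)                  # script.py
print(pathlib.PurePath(some_path).stem)                  # script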


回答 3

注意 __file__ 给出的是这段代码所在的文件,它可能是被导入的模块,因而不同于被解释执行的主文件。要获取主文件,可以使用特殊的 __main__ 模块:

import __main__ as main
print(main.__file__)

注意 __main__.__file__在Python 2.7中有效,但在3.2中无效,因此请使用上述import-as语法使其具有可移植性。

Note that __file__ will give the file where this code resides, which can be imported and different from the main file being interpreted. To get the main file, the special __main__ module can be used:

import __main__ as main
print(main.__file__)

Note that __main__.__file__ works in Python 2.7 but not in 3.2, so use the import-as syntax as above to make it portable.


回答 4

上面的答案都不错,但我发现结合上面的结果使用这种方法更方便。
它得到的是实际的脚本文件名,而不是路径。

import sys    
import os    
file_name =  os.path.basename(sys.argv[0])

The above answers are good. But I found this method more efficient using the above results.
This results in the actual script file name, not a path.

import sys    
import os    
file_name =  os.path.basename(sys.argv[0])

回答 5

对于现代Python版本(3.4+),Path(__file__).name应该更加惯用。另外,Path(__file__).stem为您提供不带.py扩展名的脚本名称。

For modern Python versions (3.4+), Path(__file__).name should be more idiomatic. Also, Path(__file__).stem gives you the script name without the .py extension.


回答 6

尝试这个:

print __file__

Try this:

print __file__

回答 7

注意:如果您使用的是Python 3+,则应改用print()函数

假设文件名为foo.py,则以下代码段

import sys
print sys.argv[0][:-3]

要么

import sys
print sys.argv[0][::-1][3:][::-1]

至于具有更多字符的其他扩展名,例如文件名 foo.pypy

import sys
print sys.argv[0].split('.')[0]

如果要从绝对路径中提取

import sys
print sys.argv[0].split('/')[-1].split('.')[0]

将输出 foo

Note: If you are using Python 3+, then you should use the print() function instead

Assuming that the filename is foo.py, the below snippet

import sys
print sys.argv[0][:-3]

or

import sys
print sys.argv[0][::-1][3:][::-1]

As for other extentions with more characters, for example the filename foo.pypy

import sys
print sys.argv[0].split('.')[0]

If you want to extract from an absolute path

import sys
print sys.argv[0].split('/')[-1].split('.')[0]

will output foo
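
A less index-sensitive alternative (not what this answer uses) is to let os.path strip the directory and the extension, which also copes with extensions of any length:

import os
import sys

name = os.path.splitext(os.path.basename(sys.argv[0]))[0]
print(name)  # foo, for foo.py as well as foo.pypy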


回答 8

sys中的第一个参数将是当前文件名,因此它将起作用

   import sys
   print sys.argv[0] # will print the file name

The first argument in sys will be the current file name so this will work

   import sys
   print sys.argv[0] # will print the file name

回答 9

如果您执行的是异常导入(例如,这是一个选项文件),请尝试:

import inspect
print (inspect.getfile(inspect.currentframe()))

请注意,这将返回文件的绝对路径。

If you’re doing an unusual import (e.g., it’s an options file), try:

import inspect
print (inspect.getfile(inspect.currentframe()))

Note that this will return the absolute path to the file.


回答 10

我们可以尝试使用此命令来获取当前脚本名称(不带扩展名)。

import os

script_name = os.path.splitext(os.path.basename(__file__))[0]

we can try this to get current script name without extension.

import os

script_name = os.path.splitext(os.path.basename(__file__))[0]

回答 11

由于OP要求提供当前脚本文件的名称,所以我希望

import os, sys
os.path.split(sys.argv[0])[1]

Since the OP asked for the name of the current script file I would prefer

import os, sys
os.path.split(sys.argv[0])[1]

回答 12

我快速的肮脏解决方案:

__file__.split('/')[-1:][0]

My fast dirty solution:

__file__.split('/')[-1:][0]

回答 13

os.path.abspath(__file__) 将为您提供绝对路径(relpath() 也可用)。

sys.argv[-1] 会给你一个相对的路径。

os.path.abspath(__file__) will give you an absolute path (relpath() available as well).

sys.argv[-1] will give you a relative path.


回答 14

所有这些答案都很不错,但是有一些问题,您乍一看可能看不到。

让我们先定义我们想要什么:我们要的是被执行脚本的名称,而不是当前模块的名称。因此,__file__ 只有在被执行的脚本中使用时才有效,在被导入的模块中则不行。sys.argv 也不可靠:如果您的程序是被 pytest 调用的呢?或者被 pydoc runner 调用?或者被 uwsgi 调用?

-还有第三种获取脚本名称的方法,我在答案中没有看到-您可以检查堆栈。

另一个问题是,您(或某些其他程序)可以篡改sys.argv并且__main__.__file__-它可能存在,但可能不存在。它可能有效或无效。至少您可以检查脚本(所需结果)是否存在!

我在 GitHub 上的库 bitranox/lib_programname 正是这么做的:

  • 检查是否__main__存在
  • 检查是否__main__.__file__存在
  • 检查 __main__.__file__ 是否给出有效结果(该脚本是否存在?)
  • 如果不是,请检查sys.argv:
  • sys.argv中是否有pytest,docrunner等?->如果是,请忽略
  • 我们可以在这里得到有效的结果吗?
  • 如果不是:检查堆栈并从那里获取结果
  • 如果堆栈也未给出有效结果,则抛出异常。

通过这种方式,我的解决方案到目前为止可以在 setup.py test、uwsgi、pytest、pycharm pytest、pycharm docrunner (doctest)、dreampie 和 eclipse 下正常工作。

Doug Hellmann 也有一篇关于该问题的不错的博客文章《用 Python 确定进程的名称》(Determining the Name of a Process from Python)。

All these answers are great, but they have some problems you might not see at first glance.

Let’s define what we want – we want the name of the script that was executed, not the name of the current module – so __file__ will only work if it is used in the executed script, not in an imported module. sys.argv is also questionable – what if your program was called by pytest? or the pydoc runner? or if it was called by uwsgi?

And – there is a third method of getting the script name I haven’t seen in the answers – you can inspect the stack.

Another problem is that you (or some other program) can tamper with sys.argv and __main__.__file__ – it might be present, it might be not. It might be valid, or not. At least you can check if the script (the desired result) exists!

My library bitranox/lib_programname on GitHub does exactly that:

  • check if __main__ is present
  • check if __main__.__file__ is present
  • does give __main__.__file__ a valid result (does that script exist ?)
  • if not: check sys.argv:
  • is there pytest, docrunner, etc in the sys.argv ? –> if yes, ignore that
  • can we get a valid result here ?
  • if not: inspect the stack and get the result from there possibly
  • if also the stack does not give a valid result, then throw an Exception.

That way, my solution has so far worked with setup.py test, uwsgi, pytest, pycharm pytest, pycharm docrunner (doctest), dreampie, and eclipse.

There is also a nice blog article about that problem from Doug Hellmann, “Determining the Name of a Process from Python”.
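
A minimal sketch of that fallback chain; this is only an illustration of the order of checks, not the actual lib_programname implementation:

import inspect
import os
import sys


def guess_program_path():
    # 1) the __file__ of the __main__ module, if it points at a real file
    main_module = sys.modules.get('__main__')
    candidate = getattr(main_module, '__file__', None)
    if candidate and os.path.exists(candidate):
        return os.path.abspath(candidate)

    # 2) sys.argv[0], unless it is empty or looks like a test/doc runner
    candidate = sys.argv[0] if sys.argv else ''
    if candidate and 'pytest' not in candidate and 'docrunner' not in candidate \
            and os.path.exists(candidate):
        return os.path.abspath(candidate)

    # 3) fall back to the outermost frame on the stack
    candidate = inspect.stack()[-1][1]
    if os.path.exists(candidate):
        return os.path.abspath(candidate)

    raise RuntimeError('could not determine the program path')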


回答 15

从Python 3.5开始,您可以简单地执行以下操作:

from pathlib import Path
Path(__file__).stem

在此处查看更多信息:https://docs.python.org/3.5/library/pathlib.html#pathlib.PurePath.stem

例如,我的用户目录下有一个名为 test.py 的文件,内容如下:

from pathlib import Path

print(Path(__file__).stem)
print(__file__)

运行此输出:

>>> python3.6 test.py
test
test.py

As of Python 3.5 you can simply do:

from pathlib import Path
Path(__file__).stem

See more here: https://docs.python.org/3.5/library/pathlib.html#pathlib.PurePath.stem

For example, I have a file under my user directory named test.py with this inside:

from pathlib import Path

print(Path(__file__).stem)
print(__file__)

running this outputs:

>>> python3.6 test.py
test
test.py