如何打开文件夹中的每个文件?

问题:如何打开文件夹中的每个文件?

我有一个python脚本parse.py,该脚本在脚本中打开一个文件,例如file1,然后执行一些操作,可能会打印出字符总数。

filename = 'file1'
f = open(filename, 'r')
content = f.read()
print filename, len(content)

现在,我正在使用stdout将结果定向到我的输出文件-输出

python parse.py >> output

但是,我不想按文件手动处理此文件,有没有办法自动处理每个文件?喜欢

ls | awk '{print}' | python parse.py >> output 

然后问题是如何从standardin中读取文件名?还是已经有一些内置函数可以轻松执行ls和此类工作?

谢谢!

I have a python script parse.py, which in the script open a file, say file1, and then do something maybe print out the total number of characters.

filename = 'file1'
f = open(filename, 'r')
content = f.read()
print filename, len(content)

Right now, I am using stdout to direct the result to my output file – output

python parse.py >> output

However, I don’t want to do this file by file manually, is there a way to take care of every single file automatically? Like

ls | awk '{print}' | python parse.py >> output 

Then the problem is how could I read the file name from standardin? or there are already some built-in functions to do the ls and those kind of work easily?

Thanks!


回答 0

操作系统

您可以使用以下命令列出当前目录中的所有文件os.listdir

import os
for filename in os.listdir(os.getcwd()):
   with open(os.path.join(os.cwd(), filename), 'r') as f: # open in readonly mode
      # do your stuff

球状

或者,您可以根据glob模块的文件模式仅列出一些文件:

import glob
for filename in glob.glob('*.txt'):
   with open(os.path.join(os.cwd(), filename), 'r') as f: # open in readonly mode
      # do your stuff

不必是当前目录,您可以在所需的任何路径中列出它们:

path = '/some/path/to/file'
for filename in glob.glob(os.path.join(path, '*.txt')):
   with open(os.path.join(os.cwd(), filename), 'r') as f: # open in readonly mode
      # do your stuff

管道 或者您甚至可以使用指定的管道来使用fileinput

import fileinput
for line in fileinput.input():
    # do your stuff

然后将其与管道一起使用:

ls -1 | python parse.py

Os

You can list all files in the current directory using os.listdir:

import os
for filename in os.listdir(os.getcwd()):
   with open(os.path.join(os.cwd(), filename), 'r') as f: # open in readonly mode
      # do your stuff

Glob

Or you can list only some files, depending on the file pattern using the glob module:

import glob
for filename in glob.glob('*.txt'):
   with open(os.path.join(os.cwd(), filename), 'r') as f: # open in readonly mode
      # do your stuff

It doesn’t have to be the current directory you can list them in any path you want:

path = '/some/path/to/file'
for filename in glob.glob(os.path.join(path, '*.txt')):
   with open(os.path.join(os.cwd(), filename), 'r') as f: # open in readonly mode
      # do your stuff

Pipe Or you can even use the pipe as you specified using fileinput

import fileinput
for line in fileinput.input():
    # do your stuff

And then use it with piping:

ls -1 | python parse.py

回答 1

你应该尝试使用os.walk

yourpath = 'path'

import os
for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        print(os.path.join(root, name))
        stuff
    for name in dirs:
        print(os.path.join(root, name))
        stuff

you should try using os.walk

yourpath = 'path'

import os
for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        print(os.path.join(root, name))
        stuff
    for name in dirs:
        print(os.path.join(root, name))
        stuff

回答 2

我一直在寻找这个答案:

import os,glob
folder_path = '/some/path/to/file'
for filename in glob.glob(os.path.join(folder_path, '*.htm')):
  with open(filename, 'r') as f:
    text = f.read()
    print (filename)
    print (len(text))

您也可以选择“ * .txt”或文件名的另一端

I was looking for this answer:

import os,glob
folder_path = '/some/path/to/file'
for filename in glob.glob(os.path.join(folder_path, '*.htm')):
  with open(filename, 'r') as f:
    text = f.read()
    print (filename)
    print (len(text))

you can choose as well ‘*.txt’ or other ends of your filename


回答 3

您实际上可以只使用os模块来完成这两个操作:

  1. 列出文件夹中的所有文件
  2. 按文件类型,文件名等对文件进行排序

这是一个简单的例子:

import os #os module imported here
location = os.getcwd() # get present working directory location here
counter = 0 #keep a count of all files found
csvfiles = [] #list to store all csv files found at location
filebeginwithhello = [] # list to keep all files that begin with 'hello'
otherfiles = [] #list to keep any other file that do not match the criteria

for file in os.listdir(location):
    try:
        if file.endswith(".csv"):
            print "csv file found:\t", file
            csvfiles.append(str(file))
            counter = counter+1

        elif file.startswith("hello") and file.endswith(".csv"): #because some files may start with hello and also be a csv file
            print "csv file found:\t", file
            csvfiles.append(str(file))
            counter = counter+1

        elif file.startswith("hello"):
            print "hello files found: \t", file
            filebeginwithhello.append(file)
            counter = counter+1

        else:
            otherfiles.append(file)
            counter = counter+1
    except Exception as e:
        raise e
        print "No files found here!"

print "Total files found:\t", counter

现在,您不仅列出了文件夹中的所有文件,而且(可选)按起始名称,文件类型等排序。刚才遍历每个列表并做您的工作。

You can actually just use os module to do both:

  1. list all files in a folder
  2. sort files by file type, file name etc.

Here’s a simple example:

import os #os module imported here
location = os.getcwd() # get present working directory location here
counter = 0 #keep a count of all files found
csvfiles = [] #list to store all csv files found at location
filebeginwithhello = [] # list to keep all files that begin with 'hello'
otherfiles = [] #list to keep any other file that do not match the criteria

for file in os.listdir(location):
    try:
        if file.endswith(".csv"):
            print "csv file found:\t", file
            csvfiles.append(str(file))
            counter = counter+1

        elif file.startswith("hello") and file.endswith(".csv"): #because some files may start with hello and also be a csv file
            print "csv file found:\t", file
            csvfiles.append(str(file))
            counter = counter+1

        elif file.startswith("hello"):
            print "hello files found: \t", file
            filebeginwithhello.append(file)
            counter = counter+1

        else:
            otherfiles.append(file)
            counter = counter+1
    except Exception as e:
        raise e
        print "No files found here!"

print "Total files found:\t", counter

Now you have not only listed all the files in a folder but also have them (optionally) sorted by starting name, file type and others. Just now iterate over each list and do your stuff.


回答 4

import pyautogui
import keyboard
import time
import os
import pyperclip

os.chdir("target directory")

# get the current directory
cwd=os.getcwd()

files=[]

for i in os.walk(cwd):
    for j in i[2]:
        files.append(os.path.abspath(j))

os.startfile("C:\Program Files (x86)\Adobe\Acrobat 11.0\Acrobat\Acrobat.exe")
time.sleep(1)


for i in files:
    print(i)
    pyperclip.copy(i)
    keyboard.press('ctrl')
    keyboard.press_and_release('o')
    keyboard.release('ctrl')
    time.sleep(1)

    keyboard.press('ctrl')
    keyboard.press_and_release('v')
    keyboard.release('ctrl')
    time.sleep(1)
    keyboard.press_and_release('enter')
    keyboard.press('ctrl')
    keyboard.press_and_release('p')
    keyboard.release('ctrl')
    keyboard.press_and_release('enter')
    time.sleep(3)
    keyboard.press('ctrl')
    keyboard.press_and_release('w')
    keyboard.release('ctrl')
    pyperclip.copy('')
import pyautogui
import keyboard
import time
import os
import pyperclip

os.chdir("target directory")

# get the current directory
cwd=os.getcwd()

files=[]

for i in os.walk(cwd):
    for j in i[2]:
        files.append(os.path.abspath(j))

os.startfile("C:\Program Files (x86)\Adobe\Acrobat 11.0\Acrobat\Acrobat.exe")
time.sleep(1)


for i in files:
    print(i)
    pyperclip.copy(i)
    keyboard.press('ctrl')
    keyboard.press_and_release('o')
    keyboard.release('ctrl')
    time.sleep(1)

    keyboard.press('ctrl')
    keyboard.press_and_release('v')
    keyboard.release('ctrl')
    time.sleep(1)
    keyboard.press_and_release('enter')
    keyboard.press('ctrl')
    keyboard.press_and_release('p')
    keyboard.release('ctrl')
    keyboard.press_and_release('enter')
    time.sleep(3)
    keyboard.press('ctrl')
    keyboard.press_and_release('w')
    keyboard.release('ctrl')
    pyperclip.copy('')

回答 5

下面的代码读取包含我们正在运行的脚本的目录中所有可用的文本文件。然后,它将打开每个文本文件,并将文本行中的单词存储到列表中。存储单词后,我们逐行打印每个单词

import os, fnmatch

listOfFiles = os.listdir('.')
pattern = "*.txt"
store = []
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        _fileName = open(entry,"r")
        if _fileName.mode == "r":
            content = _fileName.read()
            contentList = content.split(" ")
            for i in contentList:
                if i != '\n' and i != "\r\n":
                    store.append(i)

for i in store:
    print(i)

The code below reads for any text files available in the directory which contains the script we are running. Then it opens every text file and stores the words of the text line into a list. After store the words we print each word line by line

import os, fnmatch

listOfFiles = os.listdir('.')
pattern = "*.txt"
store = []
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        _fileName = open(entry,"r")
        if _fileName.mode == "r":
            content = _fileName.read()
            contentList = content.split(" ")
            for i in contentList:
                if i != '\n' and i != "\r\n":
                    store.append(i)

for i in store:
    print(i)