Python: Searching for a string in multiple text files

Published 2024-09-27 21:22:45


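What I want to do:

  1. Extract the sample1.tgz file.
  2. Store the contents in a "sample1" directory.
  3. Search for a string in the text files under sample1/nvram2/logs.

Full path => C:\Users\username\scripts\sample1\nvram2\logs\version.txt

Note: the text files have different extensions.

Examples: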

textFile.txt 
textFile.txt.0 
textFile.txt.1 
textFile.log 
textFile
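
To see how a wildcard pattern treats the example names above, here is a small sketch using the standard-library fnmatch module, which glob builds on for its shell-style matching. The list of names is copied from the examples; nothing here touches the disk.

```python
import fnmatch

# File names copied from the examples above.
names = ["textFile.txt", "textFile.txt.0", "textFile.txt.1", "textFile.log", "textFile"]

# "*.txt" only matches names that END in ".txt".
only_txt = fnmatch.filter(names, "*.txt")

# "*" matches every name, whatever its extension (or lack of one).
everything = fnmatch.filter(names, "*")

print(only_txt)
print(everything)
```

So a pattern ending in "*" rather than "*.txt" is what covers names such as textFile.txt.0 and textFile.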

What I have tried:

import os,tarfile, glob

string_to_search=input("Enter the string you want to search : ")

#all_files holds all the files in current directory
all_files = [f for f in os.listdir('.') if os.path.isfile(f)] 
for current_file in all_files:
    print("Reading " + current_file)

    if (current_file.endswith(".tgz")) or (current_file.endswith("tar.gz")):
        tar = tarfile.open(current_file, "r:gz")
        #file_name contains only name by removing the extension
        file_name=os.path.splitext(current_file)[0]
        os.makedirs(file_name) #make directory with the file name
        output_file_path=file_name  #Path to store the files after extraction
        tar.extractall(output_file_path) #extract the current file
        tar.close()
        #---Following code is to find  the string from all the files in a directory---
        path=output_file_path + '\nvram2\logs\*'
        files=glob.glob(path)

        for file1 in files:
            with open(file1) as f2:
                for line in f2:
                    if string_to_search in line:
                        #print file name which contains the string
                        print(file1)
                        #print the line which contains the string
                        print(str(line))

Problem:

I think the problem is with the path. When I run the code with the path below, it works.

path='\nvram2\logs\*.txt'

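But that only checks files with the ".txt" extension, and I want to search files with any extension.

When I try the following code, it does not work. Here output_file_path contains sample1, i.e. the directory name.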

path=output_file_path + '\nvram2\logs\*'
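
One likely reason the concatenated path fails: in an ordinary string literal, \n is the escape sequence for a newline, so the resulting path does not contain a folder called nvram2 preceded by a backslash. The sketch below compares an ordinary literal with a raw literal and with os.path.join; the directory name sample1 is taken from the question, and nothing is read from disk.

```python
import os

output_file_path = "sample1"  # directory name from the question

# Ordinary literal: "\n" is read as a single newline character.
broken = output_file_path + "\nvram2"

# Raw literal: the backslash and the "n" are kept as two characters.
raw = output_file_path + r"\nvram2"

# os.path.join inserts the separator for the current platform,
# so no backslashes need to be typed at all.
joined = os.path.join(output_file_path, "nvram2", "logs", "*")

print(repr(broken))
print(repr(raw))
print(joined)
```

Printing with repr makes the hidden newline visible, which is a quick way to check any path built from string literals.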

3 answers

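After extracting the files to a folder, you can use os.walk to visit every file under the given path and check each one for the string.

Sample code: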

import os

# Extract tar file
# ...
# ...

path = os.path.join(output_file_path, 'nvram2', 'logs')

for dirpath, dirs, files in os.walk(path):
    # dirpath : path of the directory currently being visited
    # dirs : sub-directories found in the current directory
    # files : files found in the current directory

    # iterate over each file
    for file in files:

        # build actual path of the file by joining to dirpath
        file_path = os.path.join(dirpath, file)

        # open file
        with open(file_path) as file_desc:

            # iterate over each line, enumerate is used to get line count
            for ln_no, line in enumerate(file_desc):
                if string_to_search in line:
                    print('Filename: {}'.format(file))
                    print('Text: {}'.format(line.strip()))
                    print('Line No: {}\n'.format(ln_no + 1))

Here is the complete code that solved the problem:

import os,tarfile, glob

string_to_search=input("Enter the string you want to search : ")

#all_files holds all the files in current directory
all_files = [f for f in os.listdir('.') if os.path.isfile(f)] 
for current_file in all_files: 
    if (current_file.endswith(".tgz")) or (current_file.endswith("tar.gz")):
        tar = tarfile.open(current_file, "r:gz")
        #file_name contains only name by removing the extension
        file_name=os.path.splitext(current_file)[0] 
        os.makedirs(file_name) #make directory with the file name
        output_file_path=file_name  #Path to store the files after extraction
        tar.extractall(output_file_path) #extract the current file
        tar.close()

        # Search every file in the extracted logs directory for the string
        path1 = os.path.join(output_file_path, 'nvram2', 'logs')
        for my_file1 in glob.glob(os.path.join(path1, "*")):
            if os.path.isfile(my_file1):  # skip folders
                with open(my_file1, errors='ignore') as my_file2:
                    # start=1 so the reported line numbers are 1-based
                    for line_no, line in enumerate(my_file2, start=1):
                        if string_to_search in line:
                            print(string_to_search + " is found in " + my_file1 + "; Line Number = " + str(line_no))

I got help from this answer. The "path and file not found" problem was solved by joining the directory with the file name.
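
As a compact illustration of joining the directory with the file name, here is a minimal sketch that uses pathlib, whose iterdir() already yields paths that include the directory. The helper name find_in_dir and the demo file contents are hypothetical and only serve the example; it is not the code from the thread.

```python
import tempfile
from pathlib import Path


def find_in_dir(directory, needle):
    """Return (path, line_number, line) for every line containing needle."""
    hits = []
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():  # skip sub-folders
            continue
        # errors="ignore" drops bytes that cannot be decoded
        with entry.open(errors="ignore") as handle:
            for line_no, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((entry, line_no, line.rstrip("\n")))
    return hits


# Demo on a throwaway directory with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp)
    (logs / "version.txt").write_text("build 1\nversion 2.0\n")
    (logs / "textFile.txt.0").write_text("nothing here\n")
    (logs / "textFile").write_text("version 3.1\n")
    results = [(p.name, n, text) for p, n, text in find_in_dir(logs, "version")]

print(results)
```

Returning the hits instead of printing them inside the loop keeps the search reusable, for example to count matches per file afterwards.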

You can add a condition that checks whether ".txt" appears in file1:

logs_dir = os.path.join(output_file_path, 'nvram2', 'logs')

for file1 in os.listdir(logs_dir):
    if '.txt' in file1:
        # os.listdir returns bare names, so join them with the directory
        with open(os.path.join(logs_dir, file1)) as f2:
            for line in f2:
                if string_to_search in line:
                    # print the name of the file that contains the string
                    print(file1)
                    # print the line that contains the string
                    print(line)
