Need to find the most frequently occurring line length, then find the lines whose length differs from the rest

Posted 2024-10-03 13:17:11

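I have a program that scans a file. I need it to find the line length that occurs most often in the file, and then find the lines whose length differs from that. Any tips?

I think I need to completely change my first if statement.

This is what I have so far: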

import sys
import os 
import schedule
import time


#Using schedule, a continuous loop running every 1 second will monitor the folder "filedrop"
#defining the "job" that will be done
def job():
#Find any file in the folder "filedrop" with a .txt then scan it
    for file in os.listdir("/Users/justinstarkman/Desktop/Filedrop/"):
        if file.endswith(".txt"):
        #open the file and check to see if it empty
        #if the file is empty it will be moved to the failure folder with a message why
        #if the file has contents the scanning will continue
            with open(file) as f:
                flenme = os.path.basename(file)         #gets the file name for later
                cnt= os.stat(file).st_size              #file size in bytes (0 means the file is empty)
                try:
                    if cnt > 0:
                        fle = [line.strip() for line in f]          #strips the \n
                        min_char = (min(fle, key = len))            #finds min amount of character
                        max_char = (max(fle, key = len))            #finds max amount of character
                        length_max = len(max_char)                  #converts to numbers to compare lengths
                        length_min = len(min_char)

                        #moving files based on whether all records are the same length or not
                        if length_max == length_min:   
                            os.rename("/Users/justinstarkman/Desktop/Filedrop/{0}".format(flenme) ,"/Users/justinstarkman/Desktop/Filedrop/Success/{0}".format(flenme))

                        else:
                            os.rename("/Users/justinstarkman/Desktop/Filedrop/{0}".format(flenme) ,"/Users/justinstarkman/Desktop/Filedrop/Failure/{0}".format(flenme))     
                            os.chdir("/Users/justinstarkman/Desktop/Filedrop/Failure")  #change directory to add to the failed_file document    
                            FldFle= open("failed_files.txt","a")
                            FldFle.write('--"%s" was moved to the "Failure" folder.\n' % (flenme));
                            FldFle.close()
                            os.chdir("/Users/justinstarkman/Desktop/Filedrop")
                    else:
                        os.rename("/Users/justinstarkman/Desktop/Filedrop/{0}".format(flenme) ,"/Users/justinstarkman/Desktop/Filedrop/Failure/{0}".format(flenme))
                        os.chdir("/Users/justinstarkman/Desktop/Filedrop/Failure")      #change directory to add to the failed_file document    
                        FldFle= open("failed_files.txt","a")
                        FldFle.write('--"%s" was moved to the "Failure" folder because it was an empty file.\n' % (flenme));
                        FldFle.close()
                        os.chdir("/Users/justinstarkman/Desktop/Filedrop")

                except Exception:
                    print('ERROR')
#the program will check the folder for a new file every 1 second
schedule.every(1).seconds.do(job)   



while True:
    schedule.run_pending()
    time.sleep(1)

1 answer
User
#1 · Posted 2024-10-03 13:17:11

For each file, you can do the following:

import collections

line_lengths = [len(line.strip()) for line in f]

# most common length
mcl = collections.Counter(line_lengths).most_common(1)[0][0]

# indexes of the lines whose length differs from the most common one
idx = [n for n, v in enumerate(line_lengths) if v != mcl]

That should be everything you need to check whether the file is valid (len(idx) == 0) and to see which lines have a different length.
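To make the snippet easier to drop into the job() function, here is a minimal sketch that wraps the same Counter approach in a helper. The name find_odd_lines and the (None, []) return value for empty input are my own assumptions, not something from the question. The empty case matters because most_common(1) returns an empty list when there are no lines, so indexing it with [0] would raise an IndexError.

```python
import collections


def find_odd_lines(lines):
    """Return (most_common_length, indexes_of_lines_with_another_length).

    `lines` is any iterable of strings, for example an open file object.
    find_odd_lines is a hypothetical helper name used for illustration.
    """
    # Same measurement as in the answer: length of each stripped line
    line_lengths = [len(line.strip()) for line in lines]

    # Assumption: an empty file has no most common length, so return (None, [])
    if not line_lengths:
        return None, []

    # Most common length
    mcl = collections.Counter(line_lengths).most_common(1)[0][0]

    # Zero-based indexes of the lines whose length differs from it
    idx = [n for n, v in enumerate(line_lengths) if v != mcl]
    return mcl, idx


sample = ["abcd\n", "efgh\n", "ij\n", "klmn\n"]
mcl, idx = find_odd_lines(sample)
print(mcl, idx)  # prints: 4 [2]
```

Inside job() you could call it as mcl, idx = find_odd_lines(f) and treat the file as a success when idx is empty and mcl is not None. The indexes are zero-based, so add 1 if you want to report line numbers in failed_files.txt.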
