Removing x-line paragraphs from a text file with Python

Posted 2024-09-24 22:25:52


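I have a very long text file in which every paragraph has either 6 or 7 lines. I need to write all the seven-line paragraphs to one file and all the six-line paragraphs to another, or alternatively delete the 6-line (or 7-line) paragraphs. Paragraphs are separated by one blank line (or two blank lines). Sample of the text file: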

Firs Name Last Name
address1
Address2
Note 1
Note 2
Note3
Note 4

First Name LastName
add 1
add 2
Note2
Note3
Note4

etc...

I would like to use Python 3 on Windows. Any help is welcome. Thanks!


1 answer
User
Answer 1 · posted 2024-09-24 22:25:52

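As a welcome to stackoverflow, and because I assume you have searched further for code by now, I suggest the following code.

It verifies that no paragraph has more than 7 lines or fewer than 6. When such paragraphs exist in the source, it emits a warning.

You will want to remove all the prints to get clean code, but with them in place you can follow the algorithm.

I think there is no error in it, but don't take that as one hundred percent certain.

This is not the only way to do it, but I chose an approach that works for files of any size: iterating one line at a time. One could read the whole file at once and then split it into a list of lines, or process it with a regular expression; however, when a file is very large, reading all of it at once consumes a lot of memory.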

# Python 3 version (print() calls), as requested in the question.
with open('source.txt') as fsource,\
     open('SIX.txt','w') as six,  open('SEVEN.txt','w') as seven:

    buf = []                  # lines of the paragraph being collected
    cnt = 0                   # number of the line just read
    exceeding7paragraphs = 0  # paragraphs with more than 7 lines
    tinyparagraphs = 0        # paragraphs with fewer than 6 lines

    line = 'go'
    while line:
        line = fsource.readline()
        cnt += 1
        buf.append(line)

        # <= 6 (not < 6): a void 6th line means the paragraph has only 5 lines
        if len(buf)<=6 and line.rstrip('\n\r')=='':
            if len(buf)>1: # leading blank lines are not a paragraph
                tinyparagraphs += 1
            print(cnt,repr(line),"this line of paragraph < 6 is void,"+\
                  "\nthe treatment of all this paragraph is skipped\n"+\
                  '\n# '+str(cnt)+' '+ repr(line)+" skipped line ")
            buf = []
            while line and line.rstrip('\n\r')=='':
                line = fsource.readline()
                cnt += 1
                if line=='':
                    print("line",cnt,"is '' , EOF -> the program will be stopped")
                elif line.rstrip('\n\r')=='':
                    print('#',cnt,repr(line))
                else:
                    buf.append(line)
                    print('!',cnt,repr(line),' put in void buf')
        else:
            print(cnt,repr(line),' put in buf')

        if len(buf)==6:
            line = fsource.readline() # reading a potential seventh line of a paragraph
            cnt += 1

            if line.rstrip('\n\r'): # means the content of the seventh line isn't void
                buf.append(line)
                print(cnt,repr(line),'seventh line put in buf')
                line = fsource.readline()
                cnt += 1

                if line.rstrip('\n\r'): # means the content of the eighth line isn't void
                    exceeding7paragraphs += 1
                    print(cnt,repr(line),"the eighth line isn't void,"+\
                          "\nthe treatment of all this paragraph is skipped"+\
                          "\neighth line skipped")
                    buf = []
                    while line and line.rstrip('\n\r'):
                        line = fsource.readline()
                        cnt += 1
                        if line=='':
                            print("line",cnt,"is '' , EOF -> the program will be stopped")
                        elif line.rstrip('\n\r')=='':
                            print('\n#',cnt,repr(line))
                        else:
                            print(str(cnt) + ' ' + repr(line)+' skipped line')

                else:
                    if line=='':
                        print(cnt,"line is '' , EOF -> the program will be stopped\n")
                    else: # line.rstrip('\n\r') is ''
                        print(cnt,'eighth line is void',repr(line))
                    seven.write(''.join(buf) + '\n')
                    print(buf,'\n',len(buf),'lines recorded in file SEVEN\n')
                    buf = []

            else:
                print(cnt,repr(line),'seventh line: void')
                six.write(''.join(buf) + '\n')
                print(buf,'\n',len(buf),'lines recorded in file SIX')
                buf = []
                if line=='':
                    print("line",cnt,"is '' , EOF -> the program will be stopped")
                else:
                    print('\nthe line is',cnt, repr(line))

            # skip the void lines separating paragraphs; keep the next first line
            while line and line.rstrip('\n\r')=='':
                line = fsource.readline()
                cnt += 1
                if line=='':
                    print("line",cnt,"is '' , EOF -> the program will be stopped")
                elif line.rstrip('\n\r')=='':
                    print('#',cnt,repr(line))
                else: # line.rstrip('\n\r') != ''
                    buf.append(line)
                    print('!',cnt,repr(line),' put in void buf')

if exceeding7paragraphs>0:
    print('\nWARNING :'+\
          '\nThere are '+str(exceeding7paragraphs)+' paragraphs whose number of lines exceeds 7.')

if tinyparagraphs>0:
    print('\nWARNING :'+\
          '\nThere are '+str(tinyparagraphs)+' paragraphs whose number of lines is less than 6.')


print('\n====================================================================')
print('File SIX\n')
with open('SIX.txt') as six:
    print(six.read())


print('====================================================================')
print('File SEVEN\n')
with open('SEVEN.txt') as seven:
    print(seven.read())
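
For comparison, the alternative mentioned earlier (reading the whole file at once and splitting it with a regular expression) might look roughly like the sketch below. It is not part of the original answer; the function names `split_paragraphs` and `sort_by_length` are made up for illustration, and it assumes the file fits comfortably in memory and that any run of blank lines separates two paragraphs.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs; each paragraph is a list of lines."""
    # Normalise Windows and old Mac line endings to '\n'.
    text = text.replace('\r\n', '\n').replace('\r', '\n')
    # A separator is a newline followed by one or more blank lines
    # (lines containing only spaces or tabs count as blank here).
    blocks = re.split(r'\n(?:[ \t]*\n)+', text.strip())
    return [block.split('\n') for block in blocks if block]


def sort_by_length(text):
    """Return (six, seven, other) lists of paragraphs by line count."""
    six, seven, other = [], [], []
    for para in split_paragraphs(text):
        if len(para) == 6:
            six.append(para)
        elif len(para) == 7:
            seven.append(para)
        else:
            other.append(para)  # neither 6 nor 7 lines: report separately
    return six, seven, other
```

To use it, one would read the source with `open('source.txt').read()`, pass the text to `sort_by_length`, and write each group back out with the paragraphs joined by a blank line. The trade-off is the one already stated: shorter code, but the whole file is held in memory.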

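I also upvoted your question, because the problem does not look that easy to solve, and so that you are not left with a first post that carries a downvote, which is a demoralizing start. As others have said, try to present your question better next time.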

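Edit:

Here is simplified code that only handles paragraphs of 6 or 7 lines, separated by exactly 1 or 2 blank lines, as stated in the wording of the question: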

[The simplified code block is missing from this copy of the answer.]
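
A simplified line-by-line version could look something like the sketch below. This is not the answerer's original code, which is not available here; `iter_paragraphs` and `split_file` are hypothetical names, and the sketch treats any run of blank lines as a separator rather than checking for exactly one or two. Paragraphs that have neither 6 nor 7 lines are counted and skipped.

```python
def iter_paragraphs(lines):
    """Yield each paragraph as a list of lines without line endings."""
    buf = []
    for line in lines:
        if line.strip():               # non-blank line: part of a paragraph
            buf.append(line.rstrip('\r\n'))
        elif buf:                      # blank line closes the current paragraph
            yield buf
            buf = []
    if buf:                            # last paragraph with no trailing blank line
        yield buf


def split_file(source_path, six_path, seven_path):
    """Write 6-line and 7-line paragraphs to separate files.

    Returns the number of paragraphs skipped for having another length.
    """
    skipped = 0
    with open(source_path) as src, \
         open(six_path, 'w') as six, open(seven_path, 'w') as seven:
        targets = {6: six, 7: seven}
        for para in iter_paragraphs(src):
            target = targets.get(len(para))
            if target is None:
                skipped += 1
                continue
            # One blank line after each paragraph keeps the output readable.
            target.write('\n'.join(para) + '\n\n')
    return skipped
```

Calling `split_file('source.txt', 'SIX.txt', 'SEVEN.txt')` would mirror the file names used in the longer script. Because the file is iterated one line at a time, memory use stays small regardless of the file size.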
