How can I extract part of a line from a text file in Python when the line only partially follows a pattern?

0:00 txt txt e-mail1_to_extract txt_to_extract1 txt txt /data
0:00 txt txt e-mail2_to_extract txt_to_extract2 txt txt /data
0:00 txt txt txt e-mail3_to_extract txt_to_extract3 txt txt /var
0:00 txt txt txt txt e-mail4_to_extract txt_to_extract4 txt txt /var
0:00 txt txt e-mail5_to_extract txt_to_extract5 txt txt /data

with open('content.txt') as infile, open('extraction.txt', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "0:00":
            copy = True
            continue
        elif line.strip() == "/":
            copy = False
            continue
        elif copy:
            outfile.write(line)

1 Answer

User

#1 · Posted on 2024-10-01 13:35:38

I used a sample file in the format you provided:

0:00 txt txt123 abc@abs.com txt_to_extract1 txt6456 txtssss /data
0:00 txt11 txt111 abd@rtx.vg txt_to_extract2 txtssss txtffff /data
0:00 txt111 txt123 txt tyrr@rgahb.com txt_to_extract3 txtosvbsvs txtkkkk /var
0:00 txt456 txt3663 srsr31415s@gagha.gha txt e-mail4_to_extract txt_to_extract4 txabjahsjat txtasba /var
0:00 txtGJK txtfggg gfa456vaj@aghaha.com txt_to_extract5 txtbxajla txtzbaza /data

I used the following code (the check function decides whether a token is an e-mail address; adjust the regex to suit your data):

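import re

# Simple pattern for a lowercase e-mail address; adjust it to match your data.
regex = r'^[a-z0-9]+[._]?[a-z0-9]+@\w+\.\w{2,3}$'

def check(email):
    """Return True if the token looks like an e-mail address."""
    return re.search(regex, email) is not None

def getcols(tokens):
    """Yield "email next_token" for each e-mail token that has a successor."""
    # Stop one short of the end so tokens[i + 1] always exists.
    for i in range(len(tokens) - 1):
        if check(tokens[i]):
            yield tokens[i] + " " + tokens[i + 1]


with open('TestData.txt') as infile, open('extraction.txt', 'w') as outfile:
    for line in infile:
        # Split on whitespace and write every e-mail plus the token after it.
        for cols in getcols(line.split()):
            outfile.write(cols + "\n")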

I get the following output:

abc@abs.com txt_to_extract1
abd@rtx.vg txt_to_extract2
tyrr@rgahb.com txt_to_extract3
srsr31415s@gagha.gha txt
gfa456vaj@aghaha.com txt_to_extract5
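
As a possible alternative, here is a minimal sketch that applies one regular expression to each whole line instead of splitting it into tokens first. It assumes the e-mail address is always followed by whitespace and another token. The names PAIR and extract_pairs are made up for this illustration, and the pattern is only as strict as the one above, so adjust it to your data.

```python
import re

# Hypothetical pattern: an e-mail-like token that starts at a token boundary,
# followed by whitespace and the next whitespace-delimited token.
PAIR = re.compile(r'(?<!\S)([a-z0-9]+[._]?[a-z0-9]+@\w+\.\w{2,3})\s+(\S+)')

def extract_pairs(lines):
    """Return a list of "email next_token" strings, one per match."""
    pairs = []
    for line in lines:
        for email, following in PAIR.findall(line):
            pairs.append(email + " " + following)
    return pairs

# Two lines taken from the sample file above.
sample = [
    "0:00 txt txt123 abc@abs.com txt_to_extract1 txt6456 txtssss /data",
    "0:00 txt111 txt123 txt tyrr@rgahb.com txt_to_extract3 txtosvbsvs txtkkkk /var",
]
print(extract_pairs(sample))
```

Like the token-based loop, this returns whatever token directly follows the address, so a line with an extra field after the e-mail still yields that extra field.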
