Removing the words but keeping the numbers from a text file in Python

Posted 2024-09-27 00:19:51


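I have a column in which each cell may contain one or more 6-digit numbers, or none at all. The data must stay in the order listed in the original worksheet: the result for A1 must stay in row 1, A2 in row 2, and so on.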

For example:

Cell A1:
Lipodystrophy: congenital generalized: type 2: 269700; Encephalopathy: progressive: with or without lipodystrophy: 615924; Silver spastic paraplegia syndrome: 270685; and Neuropathy: distal hereditary motor: type VA: 600794

becomes:

269700, 615924, 270685, 600794

2 answers

Use a regular expression.

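import csv
import re

# Python 3: open CSV files in text mode with newline='' (not 'wb').
with open('input.csv', newline='') as fin, \
        open('output.csv', 'w', newline='') as fout:
    csv_in = csv.reader(fin, delimiter='\t')
    csv_out = csv.writer(fout)
    for row in csv_in:
        # row is a list of cells; re.findall needs a string, so join them.
        matchList = re.findall(r'\d{6}', ' '.join(row))
        # A row with no match is written as an empty row, keeping row order.
        csv_out.writerow(matchList)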

The pattern is something like "\d{6}" or "\d\d\d\d\d\d".
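
As a quick check of that pattern on the sample cell from the question, here is a minimal sketch. The word boundaries (`\b`) are an addition that is not part of the answer above; they are assumed to be wanted so that a longer run of digits is not partially matched. The helper name `extract_codes` is hypothetical.

```python
import re

# Hypothetical helper: return every standalone 6-digit number in a cell.
# \b on both sides is an assumption: it rejects 5- or 7-digit runs.
SIX_DIGITS = re.compile(r'\b\d{6}\b')

def extract_codes(cell):
    return SIX_DIGITS.findall(cell)

cell = ("Lipodystrophy: congenital generalized: type 2: 269700; "
        "Encephalopathy: progressive: with or without lipodystrophy: "
        "615924; Silver spastic paraplegia syndrome: 270685; "
        "and Neuropathy: distal hereditary motor: type VA: 600794")

result = ', '.join(extract_codes(cell))
print(result)
```

The lone "2" in "type 2" is skipped because it is not six digits long.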

Try this one-liner:

in_string = ("Lipodystrophy: congenital generalized: type 2: 269700; "
             "Encephalopathy: progressive: with or without lipodystrophy: "
             "615924; Silver spastic paraplegia syndrome: 270685; "
             "and Neuropathy: distal hereditary motor: type VA: 600794")

output = ', '.join([word for word in in_string.replace(';', '').split()
                    if word.isdigit()])

Output:

print(output)
>>> 269700, 615924, 270685, 600794

Or, using an input file:

with open('input.csv') as fin, open('output.csv', 'w') as fout:
    # One output line per input line, so row order is preserved.
    output = '\n'.join(','.join(word for word in line.replace(';', '').split()
                                if word.isdigit()) for line in fin)
    fout.write(output)
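
The question asks that each result stay on the same row as its source cell, including cells that contain no number. Below is a minimal sketch of that behaviour, assuming one cell per line and using in-memory `io.StringIO` objects in place of real files. The function name `numbers_per_line` and the sample lines are hypothetical, and the regular expression is swapped in for `isdigit()` so that only 6-digit tokens are kept.

```python
import io
import re

def numbers_per_line(fin, fout):
    """Write one output line per input line, keeping row order."""
    for line in fin:
        # A line without any 6-digit number yields an empty output line,
        # so row N of the output still corresponds to row N of the input.
        fout.write(', '.join(re.findall(r'\b\d{6}\b', line)) + '\n')

# Hypothetical sample data standing in for input.csv.
fin = io.StringIO(
    "Silver spastic paraplegia syndrome: 270685\n"
    "No number in this cell\n"
    "Lipodystrophy: type 2: 269700; Encephalopathy: 615924\n"
)
fout = io.StringIO()
numbers_per_line(fin, fout)
rows = fout.getvalue().splitlines()
print(rows)
```

With real files, the two `StringIO` objects would be replaced by the handles returned from `open()`.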
