How to extract data from a list in Python when there are extra spaces between the fields

Published 2024-10-01 19:22:05


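The code tries to extract fields from a file (format: group, team, val1, val2). Some results are correct when a line has no extra spaces, but lines with extra spaces in the middle produce wrong results.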

data = {}
with open('source.txt') as f:
    for line in f:
        print ("this is the line data: ", line)
        
        needed = line.split()[0:2]
        print ("this is what i need: ", needed)

source.txt (format: group, team, val1, val2)

alpha diehard group 1 54,00.01
bravo nevermindteam 3 500,000.00
charlie team ultimatum 1 27,722.29 ($250.45)
charlie team ultimatum 10 252,336,733.383 ($492.06)
delta beyond-imagination 2 11 ($10)
echo double doubt 5 143,299.00 ($101)
echo double doubt 8 145,300 ($125.01)
falcon revengers 3 0.1234
falcon revengers 5 9.19
lima almost done 6 45.00181 ($38.9)
romeo ontheway home 12 980

I am trying to extract only the values that come before val1 (the group and team):

alpha diehard group
bravo nevermindteam
charlie team ultimatum
delta beyond-imagination
echo double doubt
falcon revengers
lima almost done
romeo ontheway home

3 Answers

This is how I would do it: iterate over the words and stop when you hit a number:

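with open('source.txt') as f:
    for line in f:
        print("this is the line data: ", line)

        split_line = line.split()
        # Default to the whole line in case no numeric token is found.
        stop = len(split_line)
        for i, word in enumerate(split_line):
            if word.isnumeric():
                stop = i
                break

        # Everything before the first purely numeric token (val1).
        needed = split_line[:stop]

        print("this is what i need: ", needed)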

Using a regular expression:

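import re

with open('source.txt') as f:
    for line in f:
        # Lazily capture everything before the first digit on the line.
        found = re.search(r"(.*?)\d", line)
        if found is None:
            continue  # no digit on this line, e.g. a blank line
        needed = found.group(1).split()
        print(needed)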

Output:

['alpha', 'diehard', 'group']
['bravo', 'nevermindteam']
['charlie', 'team', 'ultimatum']
['charlie', 'team', 'ultimatum']
['delta', 'beyond-imagination']
['echo', 'double', 'doubt']
['echo', 'double', 'doubt']
['falcon', 'revengers']
['falcon', 'revengers']
['lima', 'almost', 'done']
['romeo', 'ontheway', 'home']
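
The output above repeats a group once per source line, whereas the question lists each group only once. Here is a minimal sketch of one way to collapse the duplicates while keeping first-seen order. It works on a few sample lines held in memory rather than on source.txt. The helper name `group_name` and the `SAMPLE` list are illustrative assumptions, not part of the original answer, and extra spaces were added to one sample line on purpose.

```python
import re

# Illustrative sample adapted from the question's source.txt (assumption:
# the last line has extra spaces added to show they do not matter).
SAMPLE = [
    "charlie team ultimatum 1 27,722.29 ($250.45)",
    "charlie team ultimatum 10 252,336,733.383 ($492.06)",
    "delta beyond-imagination 2 11 ($10)",
    "falcon   revengers 3 0.1234",
]

def group_name(line):
    """Return the words before the first digit, joined by single spaces."""
    found = re.search(r"(.*?)\d", line)
    prefix = found.group(1) if found else line
    return " ".join(prefix.split())

# dict.fromkeys keeps the first occurrence of each key in insertion order.
unique_groups = list(dict.fromkeys(group_name(line) for line in SAMPLE))
print(unique_groups)
```

Joining the words back with single spaces before de-duplicating means that two lines differing only in spacing are treated as the same group.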

Try this:

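with open('source.txt') as f:
    for line in f:
        # Keep only alphabetic words. Hyphens are removed before the check
        # so that names such as "beyond-imagination" are not dropped.
        new_line = ' '.join(filter(lambda s: s.replace('-', '').isalpha(), line.split()))
        print(new_line)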

This code is robust to the number of spaces.
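
For context, here is a small sketch of how the two forms of `str.split` treat repeated spaces. The sample line is adapted from source.txt with extra spaces added as an assumption. The empty strings produced by `split(' ')` fail `isalpha()`, which is why a filter based on `isalpha()` is unaffected by them.

```python
# Sample line adapted from source.txt, with extra spaces added (assumption).
line = "echo  double   doubt 5 143,299.00 ($101)\n"

# split(' ') cuts at every single space, so runs of spaces yield empty strings
# and the trailing newline stays attached to the last token.
by_space = line.split(' ')

# split() with no argument cuts at any run of whitespace and drops the newline.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```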

Using a regular expression:

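import re

with open('source.txt', 'r') as f:
    # Strip the trailing run of digits, punctuation and whitespace from each line.
    # No flags are passed: the pattern has no ^ or $ anchors, and a fourth
    # positional argument to re.sub would be read as count, not flags.
    text = re.sub(r'[0-9,.()$\s]+\n', '\n', f.read() + '\n')

print(text)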
