Replacing isalpha() so that it also accepts underscores?

Posted 2024-09-24 22:31:41


I am processing data with Python 3 and need to read a results file that looks like this:

ENERGY_BOUNDS 
  1.964033E+07  1.733253E+07  1.491825E+07  1.384031E+07  1.161834E+07  1.000000E+07  8.187308E+06  6.703200E+06
  6.065307E+06  5.488116E+06  4.493290E+06  3.678794E+06  3.011942E+06  2.465970E+06  2.231302E+06  2.018965E+06
GAMMA_INTERFACE
     0
EIGENVALUE 
  1.219034E+00

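I want to search the file for a specific identifier (ENERGY_BOUNDS in this example), start reading the values that follow it (not the identifier itself), and stop when I reach the next identifier. My problem is that I use isalpha() to detect the next identifier, and some identifiers contain underscores, for which isalpha() returns False. Here is my code: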

def read_data_from_file(file_name, identifier):
    with open(file_name, 'r') as read_obj:
        list_of_results = []
        # Read all lines in the file one by one
        for line in read_obj:
            # For each line, check if line contains the string
            if identifier in line:
            # If yes, read the next line
                nextValue = next(read_obj)
                while(not nextValue.strip().isalpha()): # Keep on reading until next identifier appears 
                    list_of_results.extend(nextValue.split())
                    nextValue = next(read_obj)
    return(list_of_results)

I think I need a regular expression, but I am stuck on how to write it. Any help would be greatly appreciated.


3 answers
take = False

with open('path/to/input') as infile:
    for line in infile:  # iterate over the file object, not the built-in input
        stripped = line.strip()
        if stripped == "ENERGY_BOUNDS":
            take = True
            continue  # we don't actually want this line
        # a non-empty line made only of letters and underscores is the next section
        if stripped and all(char.isalpha() or char == "_" for char in stripped):
            take = False
        if take:
            print(line)  # or whatever else you want to do with this line

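You can use the following regular expression: ^[A-Z]+(?:_[A-Z]+)*$

You can also bound the length of the first segment (the part before the first underscore), for example ^[A-Z]{2,10}(?:_[A-Z]+)*$, where {2,10} means {MIN,MAX} characters: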


You can try the pattern in this demo: https://regex101.com/r/9jESAH/35

For more details, see this answer.
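As a rough sketch of how that pattern could replace the isalpha() check, the function below scans lines and collects the tokens that follow a given identifier. The names IDENTIFIER_RE and read_values are hypothetical, and the sketch assumes identifiers are upper-case words joined by single underscores, each on its own line, as in the sample file.

```python
import re

# Assumption: an identifier line is upper-case words joined by underscores.
IDENTIFIER_RE = re.compile(r"^[A-Z]+(?:_[A-Z]+)*$")


def read_values(lines, identifier):
    """Return the whitespace-separated tokens that follow `identifier`.

    `lines` can be any iterable of strings, for example an open file.
    Reading stops at the next identifier line or at the end of the input.
    """
    results = []
    taking = False
    for line in lines:
        stripped = line.strip()
        if IDENTIFIER_RE.match(stripped):
            if taking:
                break  # reached the section after the one we wanted
            taking = stripped == identifier
            continue  # never collect the identifier line itself
        if taking:
            results.extend(stripped.split())
    return results
```

With a file this could be called as `with open(file_name) as f: values = read_values(f, "ENERGY_BOUNDS")`. Because the loop is a plain for loop rather than repeated next() calls, it also ends cleanly when the requested section is the last one in the file.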

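Here is another option.

Simply iterate over the file until you find the identifier. Then keep iterating in a second for loop, converting each token to float, until the next identifier raises a ValueError.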

def read_data_from_file(file_name, identifier):
    with open(file_name, 'r') as f:
        list_of_results = []
        for line in f:
            if identifier in line:
                break

        for line in f:
            try:
                list_of_results.extend(map(float, line.split()))
            except ValueError:
                break
    return list_of_results
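The stopping condition above relies on float() rejecting identifier text. A small sketch of that check in isolation, using a hypothetical helper name is_numeric_line:

```python
def is_numeric_line(line):
    """Return True if every whitespace-separated token parses as a float."""
    try:
        for token in line.split():
            float(token)
        return True
    except ValueError:
        return False


print(is_numeric_line("  1.964033E+07  1.733253E+07"))  # data line
print(is_numeric_line("GAMMA_INTERFACE"))               # identifier line
print(is_numeric_line(""))                              # blank line, no tokens
```

A blank line has no tokens, so it does not stop the loop. One caveat: float() also accepts strings such as "inf" and "nan" in any letter case, so this approach would not stop at an identifier spelled INF or NAN.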
