Extracting data between specified strings using regular expressions

Posted 2024-10-02 12:29:20


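Question 1: I want to extract the data between "Target Information" and the line before "Group Information", and store it in a variable or another suitable structure.

Question 2: Next, I want to extract the data from "Group Information" to the end of the file, and store it in a variable or another suitable structure.

Question 3: In both of the sections above, I want to extract the line that follows each line starting with "Name".

With the code below, I am able to get the data between "Target Information" and "Group Information" and capture it in the required_lines variable.

Next, I tried to get the line after the "Name" line, but that failed. Can this logic be implemented with regex calls?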

# Extract the lines between "Target Information" and "Group Information"
with open('showrcopy.txt', 'r') as f:
    file = f.readlines()

required_lines1 = []
required_lines = []
inRecordingMode = False
for line in file:
    if not inRecordingMode:
        if line.startswith('Target Information'):
            inRecordingMode = True
    elif line.startswith('Group Information'):
        inRecordingMode = False
    else:
        required_lines.append(line.strip())

print(required_lines)


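# Extract the line after the line "Name"

def gen():
    for x in required_lines:
        yield x

for line in gen():
    if "Name" in line:
        # next(gen()) builds a brand-new generator on every call, so this
        # prints the first element of required_lines, not the line after "Name"
        print(next(gen()))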

showrcopy.txt

root@gnodee184119:/home/usr/redsuren# date; showrcopy -qw
Tue Aug 24 00:20:38 PDT 2021

Remote Copy System Information
Status: Started, Normal

Target Information

Name  ID Type Status Policy        QW-Server                  QW-Ver  Q-Status    Q-Status-Qual     ATF-Timeout
s2976  4 IP   ready  mirror_config https://10.157.35.148:8443 4.0.007 Re-starting Quorum not stable          10

Link Information

Target  Node  Address       Status Options
s2976   0:9:1 192.168.20.21 Up     -
s2976   1:9:1 192.168.20.22 Up     -
receive 0:9:1 192.168.10.21 Up     -
receive 1:9:1 192.168.10.22 Up     -

Group Information

Name                      Target     Status   Role       Mode     Options
SG_hpux_vgcgloack.r518634 s2976      Started  Primary    Sync     auto_recover,auto_failover,path_management,auto_synchronize,active_active
  LocalVV              ID   RemoteVV             ID   SyncStatus    LastSyncTime
  vgcglock_SG_cluster 13496 vgcglock_SG_cluster 28505 Synced        NA

Name                Target     Status   Role       Mode     Options
aix_rcg1_AA.r518634 s2976      Started  Primary    Sync     auto_recover,auto_failover,path_management,auto_synchronize,active_active
  LocalVV         ID   RemoteVV      ID   SyncStatus    LastSyncTime
  tpvvA_aix_r.2  20149 tpvvA_aix.2  41097 Synced        NA
  tpvvA_aix_r.3  20150 tpvvA_aix.3  41098 Synced        NA
  tpvvA_aix_r.4  20151 tpvvA_aix.4  41099 Synced        NA
  tpvvA_aix_r.5  20152 tpvvA_aix.5  41100 Synced        NA
  tpvvA_aix_r.6  20153 tpvvA_aix.6  41101 Synced        NA
  tpvvA_aix_r.7  20154 tpvvA_aix.7  41102 Synced        NA
  tpvvA_aix_r.8  20155 tpvvA_aix.8  41103 Synced        NA
  tpvvA_aix_r.9  20156 tpvvA_aix.9  41104 Synced        NA
  tpvvA_aix_r.10 20157 tpvvA_aix.10 41105 Synced        NA

1 Answer
User
Answer 1 · Posted 2024-10-02 12:29:20

Here is a regular-expression solution for extracting the target information and the group information:

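import re

with open("./showrcopy.txt", "r") as f:
    text = f.read()

# [\s\S]* matches any character, including newlines
target_info_pattern = re.compile(r"Target Information([\s\S]*)Group Information")
group_info_pattern = re.compile(r"Group Information([\s\S]*)")

# Strip each captured section and split it into lines
target_info = target_info_pattern.findall(text)[0].strip().split("\n")
group_info = group_info_pattern.findall(text)[0].strip().split("\n")

# Index 0 is the "Name" header line, index 1 is the line right after it
# (for group_info this is only the line after the first "Name" header)
target_info_line_after_name = target_info[1]
group_info_line_after_name = group_info[1]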

And the lines you are interested in:

>>> target_info_line_after_name
's2976  4 IP   ready  mirror_config https://10.157.35.148:8443 4.0.007 Re-starting Quorum not stable          10'

>>> group_info_line_after_name
'SG_hpux_vgcgloack.r518634 s2976      Started  Primary    Sync     auto_recover,auto_failover,path_management,auto_synchronize,active_active'
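
The question asks for the line after every line starting with "Name", and the "Group Information" section contains more than one such header. A minimal sketch of one way to collect all of them with `re.MULTILINE` follows. The helper names `split_sections` and `lines_after_name` are hypothetical, and `SAMPLE` is a trimmed stand-in for showrcopy.txt (columns and rows shortened), not the real file:

```python
import re

# Trimmed stand-in for showrcopy.txt (assumption: same layout, fewer columns)
SAMPLE = """\
Target Information

Name  ID Type Status
s2976  4 IP   ready

Link Information

Target  Node  Address       Status Options
s2976   0:9:1 192.168.20.21 Up     -

Group Information

Name                      Target     Status
SG_hpux_vgcgloack.r518634 s2976      Started
  LocalVV              ID
  vgcglock_SG_cluster 13496

Name                Target     Status
aix_rcg1_AA.r518634 s2976      Started
  LocalVV         ID
  tpvvA_aix_r.2  20149
"""

# A line that starts with "Name", then a capture of the whole next line
LINE_AFTER_NAME = re.compile(r"^Name\b.*\n(.*)$", re.MULTILINE)


def split_sections(text):
    """Return (target_section, group_section) as strings."""
    flags = re.DOTALL | re.MULTILINE
    # Lazy match up to, but not including, the "Group Information" line
    target = re.search(r"^Target Information\n(.*?)(?=^Group Information)", text, flags)
    # Everything after the "Group Information" line until the end of the text
    group = re.search(r"^Group Information\n(.*)", text, flags)
    if target is None or group is None:
        raise ValueError("expected section headers not found")
    return target.group(1), group.group(1)


def lines_after_name(section):
    """Return every line that directly follows a line starting with 'Name'."""
    return [line.strip() for line in LINE_AFTER_NAME.findall(section)]


target_section, group_section = split_sections(SAMPLE)
print(lines_after_name(target_section))
print(lines_after_name(group_section))
```

With the real file, `SAMPLE` would be replaced by the text read from showrcopy.txt. This sketch assumes the section headers start at the beginning of a line and that lines end with a plain `\n`.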
