在提取多个模式之间的数据时需要Python解决方案保存到csv

import re inFile = open("input_new.txt",encoding='utf-8') outFile = open("result.txt", "w") buffer1 = "" keepCurrentSet = True for line in inFile: buffer1=buffer1+(line) buffer1=re.findall(r"(?<=QUESTION NO:\s\d\s*) (.*?) (?=A\.)", buffer1) outFile.write("".join(buffer1)) inFile.close() outFile.close()

1条回答

网友

1楼 · 发布于 2024-09-27 20:16:14

由于您拥有的问答文本文件是结构化文件，因此可以采用以下方法将单个问答数据转换为csv行

import re

text = """Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?

A. Fast processor to help with network traffic analysis

B. They must be dual-homed

C. Similar RAM requirements

D. Fast network interface cards

Answer: B Explanation:

Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security."""

text1 = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', text)
text1 = re.sub(r'\n+Answer:|Explanation:\n+', ',', text1)

print(text1)

最终输出：

Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly functi
on?, Fast processor to help with network traffic analysis, They must be dual-homed, Similar RAM requirements, Fast network interface c
ards, B ,Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy pur
poses, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing prevent
ive security.

正如您所看到的，text变量保存着一个问题-答案数据（如果您使用python从文件中读取该数据，就会得到换行符）

接下来，我所做的是

text1 = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', text)

例如\n+A\.匹配多项选择答案中类似“A.”的文本（\n+匹配多项选择选项前面的任何换行符）。使用上面的代码，我们将MCQ选项标记转换为,

在最后一步，我这样做

text1 = re.sub(r'\n+Answer:|Explanation:\n+', ',', text1)

它负责实际的答案和解释

您可以通过对文件中的每个问题循环上述逻辑来推断这一点，以获得所需的输出

编辑：

您还可以使用以下代码一次处理多个问题：

import re

text = """QUESTION NO: 1

Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?

A. Fast processor to help with network traffic analysis

B. They must be dual-homed

C. Similar RAM requirements

D. Fast network interface cards

Answer: B Explanation:

Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security.

QUESTION NO: 2

Which of the following is an application that requires a host application for replication?

A. Micro

B. Worm

C. Trojan

D.

Virus

Answer: D Explanation:

Computer viruses infect a variety of different subsystems on their hosts. A computer virus is a malware that, when executed, replicates by reproducing itself or infecting other programs by modifying them. Infecting computer programs can include as well, data files, or the boot sector of the hard drive. When this replication succeeds, the affected areas are then said to be "infected"."""


text1 = re.sub(r'QUESTION NO: \d+', '\n', text)
text1 = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', text1)
text1 = re.sub(r'\n+Answer:|Explanation:\n+', ',', text1)
text1 = re.sub(r'\n+', '\n', text1)

print(text1)

这将打印以下内容：

Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly functi
on?, Fast processor to help with network traffic analysis, They must be dual-homed, Similar RAM requirements, Fast network interface c
ards, B ,Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy pur
poses, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing prevent
ive security.                                                                                                                         
Which of the following is an application that requires a host application for replication?, Micro, Worm, Trojan,                      
Virus, D ,Computer viruses infect a variety of different subsystems on their hosts. A computer virus is a malware that, when executed,
 replicates by reproducing itself or infecting other programs by modifying them. Infecting computer programs can include as well, data
 files, or the boot sector of the hard drive. When this replication succeeds, the affected areas are then said to be "infected".

相关问题更多 >

编程相关推荐

热门问题

热门文章