How to split a file into keys and values (or any other format) using Python

Posted 2024-09-26 22:54:23


Suppose I have a doc file with the following content:

Therapeutic Focus and Assessment : Describe the (1) types of interventions (such as pharmacologic, surgical, preventive, lifestyle, self-care) and (2) administration and intensity of the intervention (including dosage, strength, duration, frequency).

Follow-up and Outcomes : Please describe the clinical course of this case including all follow-up visits as well as (1) intervention modification, interruption, or discontinuation, and the reasons; (2) adherence to the intervention and how this was assessed;

Discussion : Please describe the strengths and limitations of this case report including case management, and the scientific and medical literature related to this case report.

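In this file, I want to separate each title from its content. That means I will have 3 titles and 3 contents. I want to use each title as a key and its content as the value. How can I extract this information with a regex?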

A slight variation in the file structure (additional question):

Therapeutic Focus and Assessment : Describe the (1) types of interventions (such as pharmacologic, surgical, preventive, lifestyle, self-care) and (2) administration and intensity of the intervention (including dosage, strength, duration, frequency).

Discussion :

Please describe the strengths and limitations of this case report including case > management, and the scientific. Health : medical literature related to this > case report.

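Suppose I have a file like this, where the content of the first paragraph is on the same line as its title, while the content of the second paragraph is separated from its title by a blank line. The same paragraph also contains an additional section. In that case, how would I split it?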


1 Answer
User
#1 · Posted 2024-09-26 22:54:23

Here is an approach that splits on characters instead of using a regular expression. The example is written in Java.

String document = "Header: blah blah \n Header: blah blah";

// One section per line
String[] sections = document.split("\n");
String[] headers = new String[sections.length];
String[] bodies = new String[sections.length];

for (int i = 0; i < sections.length; i++) {
    // Split on the first ':' only, so colons inside the body are kept
    String[] parts = sections[i].split(":", 2);
    headers[i] = parts[0].trim();
    bodies[i] = parts.length > 1 ? parts[1].trim() : "";
}
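
Since the question asks about Python, the same character-based idea might look roughly like this. It is only a sketch: `split_sections` is a hypothetical helper name, and it assumes every section sits on a single line in the form `Header : body`.

```python
def split_sections(document):
    """Map each header to its body, assuming one 'Header : body' section per line."""
    sections = {}
    for line in document.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        # partition splits on the first ':' only, so colons in the body survive
        header, sep, body = line.partition(":")
        if not sep:
            continue  # no colon on this line, so it is not treated as a section
        sections[header.strip()] = body.strip()
    return sections


document = (
    "Discussion : Please describe the strengths and limitations.\n"
    "\n"
    "Follow-up and Outcomes : Please describe the clinical course."
)
for header, body in split_sections(document).items():
    print(repr(header), "->", repr(body))
```

Returning a dictionary gives the key/value shape the question asks for; if the same header can occur twice, a list of pairs would be the safer container.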

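If you need to split on something more complex than line breaks and ":", the same split method also accepts a regular expression pattern, but from the look of your data this may be enough for you.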
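
For the variant in the additional question, where a body can start on a later line and a new header can appear mid-paragraph, a regex-based sketch in Python could look like the following. The header pattern is an assumption read off the sample text, not a general rule: a header is taken to be capitalised words at the start of a line or right after a sentence-ending period, followed by a space and a colon. `HEADER` and `split_by_headers` are hypothetical names.

```python
import re

# Assumed header shape: capitalised words at the start of a line, or right
# after a sentence-ending period, followed by " :" (space before the colon).
HEADER = re.compile(r"(?:^|(?<=\.\s))([A-Z][\w\- ]*?)\s:(?:\s|$)", re.MULTILINE)


def split_by_headers(text):
    """Return {header: body}; each body runs until the next header or the end."""
    matches = list(HEADER.finditer(text))
    sections = {}
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(text)
        # Collapse line breaks so a body spread over several lines becomes one string
        sections[current.group(1)] = " ".join(text[current.end():end].split())
    return sections


text = (
    "Therapeutic Focus and Assessment : Describe the (1) types of interventions.\n"
    "\n"
    "Discussion :\n"
    "\n"
    "Please describe the strengths and limitations of this case report. "
    "Health : medical literature related to this case report.\n"
)
for header, body in split_by_headers(text).items():
    print(repr(header), "->", repr(body))
```

Any sentence in a body that happens to begin with capitalised words followed by " :" would also be picked up as a header, so the pattern may need tightening, for example by matching against a known list of section titles.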
