Answers to: How to split a string into several parts in Python

How to split a string into several parts in Python

Category: Python Q&A

1 answer

Anonymous, 1 day ago

Using a regular expression, you can split on <code><=></code> or <code>+</code> to get the individual compounds, each still carrying its coefficient. After splitting, use <code>lstrip</code> to remove the leading coefficient (including forms such as <code>(n+1)</code>) and <code>strip</code> to remove any trailing whitespace.

<pre><code>import re

str1 = 'Polyphosphate + n H2O <=> (n+1) Oligophosphate'
str2 = '16 ATP + 16 H2O + 8 Reduced ferredoxin <=> 8 e- + 16 Orthophosphate + 16 ADP + 8 Oxidized ferredoxin'

# Split on " + " or " <=> ", then strip leading coefficient characters
# (digits, "n", parentheses, "+") and any surrounding whitespace.
res1 = [i.lstrip(" 0123456789n()+").strip() for i in re.split(r" \+ | <=> ", str1)]
res2 = [i.lstrip(" 0123456789n()+").strip() for i in re.split(r" \+ | <=> ", str2)]

print(res1)
# ['Polyphosphate', 'H2O', 'Oligophosphate']
print(res2)
# ['ATP', 'H2O', 'Reduced ferredoxin', 'e-', 'Orthophosphate', 'ADP', 'Oxidized ferredoxin']
</code></pre>

<hr/>

Given your changing requirements:

<blockquote> In some compound, it may also exist the number or some other char, for example, '5-Aminolevulinate' or '(+)-Bisdechlorogeodin' </blockquote>

here is another, slightly less elegant solution, with an additional, more complex example. Instead of stripping characters from the left, it splits each term at a space preceded by a character that is not a lowercase letter and keeps the last piece:

<pre><code>import re

str1 = 'Polyphosphate + n H2O <=> (n+1) Oligophosphate'
str2 = '16 ATP + 16 H2O + 8 Reduced ferredoxin <=> 8 e- + 16 Orthophosphate + 16 ADP + 8 Oxidized ferredoxin'
str3 = '5-Aminolevulinate + 8 Reduced ferredoxin <=> 8 e- + 16 Orthophosphate + (+)-Bisdechlorogeodin + (n+1) Oligophosphate'

# A coefficient such as "16" or "(n+1)" ends in a non-lowercase character
# followed by a space; a bare "n " coefficient is removed by lstrip.
res1 = [re.split(r"[^a-z] ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str1)]
res2 = [re.split(r"[^a-z] ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str2)]
res3 = [re.split(r"[^a-z] ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str3)]

print(res1)
# ['Polyphosphate', 'H2O', 'Oligophosphate']
print(res2)
# ['ATP', 'H2O', 'Reduced ferredoxin', 'e-', 'Orthophosphate', 'ADP', 'Oxidized ferredoxin']
print(res3)
# ['5-Aminolevulinate', 'Reduced ferredoxin', 'e-', 'Orthophosphate', '(+)-Bisdechlorogeodin', 'Oligophosphate']
</code></pre>

<hr/>

To address your now-deleted comment, and to cover further possible requirements:

<blockquote> During the experiment, there exist new compounds, for example ''2 GTP <=> Diphosphate + P1,P4-Bis(5'-guanosyl) tetraphosphate'', the compound is 'P1,P4-Bis(5'-guanosyl) tetraphosphate' </blockquote>

<pre><code>import re

str1 = 'Polyphosphate + n H2O <=> (n+1) Oligophosphate'
str2 = '16 ATP + 16 H2O + 8 Reduced ferredoxin <=> 8 e- + 16 Orthophosphate + 16 ADP + 8 Oxidized ferredoxin'
str3 = '5-Aminolevulinate + 8 Reduced ferredoxin <=> 8 e- + 16 Orthophosphate + (+)-Bisdechlorogeodin + (n+1) Oligophosphate'
str4 = '2 GTP <=> Diphosphate + 8 e- + 16 Orthophosphate + 12 (+)-Bisdechlorogeodin + (n+1) P1,P4-Bis(5\'-guanosyl) tetraphosphate'

# Excluding ")" from the first character keeps a name such as
# "...guanosyl) tetraphosphate" intact, while "\)?" still matches "(n+1) ".
res1 = [re.split(r"[^a-z\)]\)? ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str1)]
res2 = [re.split(r"[^a-z\)]\)? ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str2)]
res3 = [re.split(r"[^a-z\)]\)? ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str3)]
res4 = [re.split(r"[^a-z\)]\)? ", i)[-1].lstrip("n ").strip() for i in re.split(r" \+ | <=> ", str4)]

print(res1)
# ['Polyphosphate', 'H2O', 'Oligophosphate']
print(res2)
# ['ATP', 'H2O', 'Reduced ferredoxin', 'e-', 'Orthophosphate', 'ADP', 'Oxidized ferredoxin']
print(res3)
# ['5-Aminolevulinate', 'Reduced ferredoxin', 'e-', 'Orthophosphate', '(+)-Bisdechlorogeodin', 'Oligophosphate']
print(res4)
# ['GTP', 'Diphosphate', 'e-', 'Orthophosphate', '(+)-Bisdechlorogeodin', "P1,P4-Bis(5'-guanosyl) tetraphosphate"]
</code></pre>

(Note: I added some arbitrary extra terms to the equation to try to make sure the result is correct in more cases. I have not necessarily caught every edge case, but it works for the given examples.)
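One edge case the approaches above do not cover: <code>lstrip("n ")</code> also eats a leading "n" that belongs to the name itself, so a term such as 'nitrate' would come back as 'itrate'. As a possible alternative (not part of the original answer), the sketch below removes the coefficient with a single anchored pattern instead. The names <code>split_compounds</code>, <code>SEPARATOR</code> and <code>COEFFICIENT</code> are hypothetical, and the set of coefficient forms it recognizes (an integer, a bare "n", or an expression like "(n+1)") is an assumption based only on the examples in this thread.

```python
import re

# Hypothetical helper names. The coefficient forms handled here (integers,
# a bare "n", and "(n+1)"-style expressions) are assumptions drawn from the
# examples in this thread, not an exhaustive grammar.
SEPARATOR = re.compile(r" \+ | <=> ")
COEFFICIENT = re.compile(r"^(?:\d+|n|\(n[+-]\d+\))\s+")


def split_compounds(equation):
    """Return the compound names in a reaction string, without coefficients."""
    terms = SEPARATOR.split(equation.strip())
    # The coefficient must be followed by whitespace, so names that merely
    # start with a digit or "n" ("5-Aminolevulinate", "nitrate") are kept whole.
    return [COEFFICIENT.sub("", term.strip()) for term in terms]


print(split_compounds("10 ATP + n H2O <=> (n+1) Oligophosphate + nitrate"))
# ['ATP', 'H2O', 'Oligophosphate', 'nitrate']
```

Because the pattern is anchored at the start of each term and requires whitespace after the coefficient, it leaves names like '(+)-Bisdechlorogeodin' and "P1,P4-Bis(5'-guanosyl) tetraphosphate" untouched. Coefficient forms outside the assumed set (for example "2n") would need the pattern extended.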
