How to create a Python dictionary from values in a text file being read

Posted 2024-09-27 07:18:49

I have a file, peaks_ee.xpk, and I am trying to build a dictionary from the values in it with Python.

import re

j = 0
atom_lines = []
# Collect the lines that mention at least two atom names such as 1.H1' or 2.H8.
with open("peaks_ee.xpk", "r") as atomName:
    for name in atomName:
        float_str = re.findall(r"\d\.H\d'?", name)
        if len(float_str) > 1:
            j = j + 1
            value1 = 'Atom ' + str(j) + ' ' + float_str[0] + ' ' + float_str[1] + '\n'
            atom_lines.append(value1)  # append keeps the lines in file order
# Append the collected atom lines to tclust.txt.
with open("tclust.txt", "a") as tclust_atom:
    for value1 in atom_lines:
        tclust_atom.write(value1)

I am reading the file peaks_ee.xpk. This is what peaks_ee.xpk looks like:

peaks_ee

Here is a sample snippet from peaks_ee.xpk:

label dataset sw sf
1H 1H_2
NOESY_F1eF2e.nv
4807.69238281 4803.07373047
600.402832031 600.402832031
1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9
0 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
1 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
2 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
3 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
4 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
5 {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
6 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
7 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
8 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
9 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
10 {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} {4.H1'} 5.74125 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
11 {4.H1'} 5.74125 0.05000 0.10000 ++ {0.0} {} {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
12 {3.H1'} 5.54935 0.05000 0.10000 ++ {0.0} {} {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
13 {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} {3.H1'} 5.54935 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
14 {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} {3.H1'} 5.54935 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
15 {3.H1'} 5.54935 0.05000 0.10000 ++ {0.0} {} {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
16 {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
17 {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
18 {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
19 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
20 {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} {4.H1'} 5.74125 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
21 {4.H1'} 5.74125 0.05000 0.10000 ++ {0.0} {} {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
22 {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} {3.H1'} 5.54935 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
23 {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} {3.H6} 7.53261 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
24 {3.H1'} 5.54935 0.05000 0.10000 ++ {0.0} {} {4.H8} 7.49932 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0

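I want to make a dictionary keyed by atom name. The atom names in peaks_ee.xpk are "1.H1'", "2.H8", and so on. I want each value to be the chemical shift, for example "5.82020" and "7.61004" (these come from line 0 of peaks_ee.xpk). For example, I want the dictionary to look like: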

dict = { "1.H1'":"5.82020", "2.H8":"7.61004"...}

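But the next line reuses 2.H8 and 1.H1', so nothing from it needs to be added to the dictionary. The line after that (line 2) should be added, because it has a new atom named 1.H8, so the dictionary should be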

dict = {"1.H1'":"5.82020", "2.H8":"7.61004", "1.H8":"8.13712", ...}

How can I do this?

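Edit: If I have another file, "ee_pinkH1.xpk", that I want to read in to check whether the chemical shift values there fall within a certain range and then print those values, is this the code?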

Here is my full code:

import os
import sys
import re

i = 0
peak_lines = []
with open("ee_pinkH1.xpk", "r") as peakPPM:
    for PPM in peakPPM:
        float_num = re.findall(r"[\s][1-9]{1}\.[0-9]+", PPM)
        if len(float_num) > 1:
            i = i + 1
            value = ('Peak ' + str(i) + ' ' + str(float_num[0]) + ' 0.05 ' + str(float_num[1]) + ' 0.05 ' + '\n')
            peak_lines.append(value)  # append keeps the peaks in file order
# Start tclust.txt afresh, then write one line per peak.
with open("tclust.txt", "w+") as tclust_peak:
    tclust_peak.write('rbclust \n')
    for value in peak_lines:
        tclust_peak.write(value)

j = 0;
contents_atom = []
atom_lines=[]
result = {}
with open ("peaks_ee.xpk","r") as atomName:
    for name in atomName.readlines():
        for match in rex.finditer(line):
            name,shift = match.groups()
        if name not in result: 
            result[name] = float(shift)
            float_str = re.findall("\d\.H\d'?", name)
            if (len(float_str)>1):
                j = j+1
                if peakPPM = 'ee_pinkH1.xpk':
                    if 5<=float_num<=6.25:
                        value1 = ('Atom ' + str(j) + ' ' + str(float_str[0]) + ' ' + str(float_str[1]) + '\n')
                    atom_lines.insert(-1,value1)

tclust_atom = open("tclust.txt","a")
for value1 in atom_lines:
    tclust_atom.write(value1)
tclust_atom.close()

2 answers

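You can extend the regex pattern to include the chemical shift and pull what you need out of each match. Put parentheses around the parts of the pattern you want to keep so that they are captured.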

import re
# Raw string so that \d and \s reach the regex engine unchanged.
pattern = r'''{(\d\.H\d'?)}\s(\d\.\d+)\s'''
rex = re.compile(pattern)
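
To see what the two capturing groups return, here is a small sketch that runs the pattern on row 0 of the sample data from the question. The pattern is the same as above, only written as a raw string with the literal braces escaped; the variable names `line` and `pairs` are just for illustration.

```python
import re

# Same pattern as above, as a raw string with the literal braces escaped.
rex = re.compile(r"\{(\d\.H\d'?)\}\s(\d\.\d+)\s")

# Row 0 of the sample data in the question.
line = ("0 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} "
        "{2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0")

# With two groups in the pattern, findall returns one (name, shift) tuple per match.
pairs = rex.findall(line)
print(pairs)  # [("1.H1'", '5.82020'), ('2.H8', '7.61004')]
```

Note that `\d\.` matches a single-digit residue number only and that the pattern expects whitespace after the shift. Both hold for the sample rows shown, but a file with residue numbers above 9 would probably need `\d+` in that position.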

Iterate over all the matches; the name and shift are in the match.groups() tuple. If the name has not been seen yet, add it to the dictionary.

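filepath = "peaks_ee.xpk"  # the file from the question
with open(filepath) as atom_name:
    data = atom_name.read()
result = {}
for match in rex.finditer(data):
    name, shift = match.groups()
    #print(name,shift)
    # Only the first shift seen for each atom name is stored.
    if name not in result:
        result[name] = float(shift)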

If the file is too large to read all at once, extract the information one line at a time.

result = {}
with open(filepath) as atom_name:
    for line in atom_name:
        for match in rex.finditer(line):
            name, shift = match.groups()
            #print(name,shift)
            if name not in result:
                result[name] = float(shift)
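
Putting the pieces together, the following is a minimal self-contained sketch of the line-by-line approach, plus a range filter of the kind the question's edit asks about. The function names `read_shifts` and `shifts_in_range` are hypothetical, the 5 to 6.25 bounds are taken from the code in the question, and the demo writes three rows from the question to a temporary file instead of reading the real peaks_ee.xpk.

```python
import os
import re
import tempfile

REX = re.compile(r"\{(\d\.H\d'?)\}\s(\d\.\d+)\s")


def read_shifts(path):
    """Return {atom name: first chemical shift seen} for an .xpk peak list."""
    result = {}
    with open(path) as handle:
        for line in handle:
            for name, shift in REX.findall(line):
                if name not in result:
                    result[name] = float(shift)
    return result


def shifts_in_range(shifts, low, high):
    """Keep only the atoms whose shift lies within [low, high]."""
    return {name: value for name, value in shifts.items() if low <= value <= high}


# Demo input: rows 0 to 2 of the sample in the question.
sample = (
    "0 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0\n"
    "1 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0\n"
    "2 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".xpk", delete=False) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

shifts = read_shifts(demo_path)
os.remove(demo_path)

print(shifts)
print(shifts_in_range(shifts, 5.0, 6.25))
```

Because the values are stored as floats, the range check is a plain numeric comparison; if the original text of each shift matters (for example to write it back out unchanged), storing the string and converting only for the comparison would be an alternative.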

Before adding a key, just use the in operator to check whether it is already in the dictionary.

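shifts = {}
with open("peaks_ee.xpk") as atomName:
    for line in atomName:
        fields = line.split()
        # Skip header lines; peak rows carry the atom label in braces.
        if len(fields) < 3 or not fields[1].startswith("{"):
            continue
        atom_name = fields[1][1:-1]  # strip the braces
        if atom_name not in shifts:
            shifts[atom_name] = float(fields[2])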
Since each line appears to contain more than one key-value pair to check, you can repeat this check within each line, as follows:

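shifts = {}
with open("peaks_ee.xpk") as atomName:
    for line in atomName:
        fields = line.split()
        # Skip header lines; peak rows carry the atom label in braces.
        if len(fields) < 10 or not fields[1].startswith("{"):
            continue
        # First atom: label in field 1, shift in field 2.
        atom_name = fields[1][1:-1]
        if atom_name not in shifts:
            shifts[atom_name] = float(fields[2])
        # Second atom: label in field 8, shift in field 9.
        atom_name = fields[8][1:-1]
        if atom_name not in shifts:
            shifts[atom_name] = float(fields[9])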
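
The repeated check can also be written as a loop over the column pairs, which avoids copying the same three lines for each atom. This is only a sketch: the helper name `shifts_from_rows` and the constant `PAIR_COLUMNS` are made up for illustration, and the positions (1, 2) and (8, 9) are assumed from the sample rows in the question.

```python
# Assumed positions of (label field, shift field) after str.split(),
# based on the sample rows in the question.
PAIR_COLUMNS = ((1, 2), (8, 9))


def shifts_from_rows(rows, pair_columns=PAIR_COLUMNS):
    """Map each atom name to the first chemical shift seen for it."""
    needed = max(shift_col for _, shift_col in pair_columns) + 1
    shifts = {}
    for row in rows:
        fields = row.split()
        # Header lines are too short or lack a braced label in field 1.
        if len(fields) < needed or not fields[1].startswith("{"):
            continue
        for label_col, shift_col in pair_columns:
            atom_name = fields[label_col][1:-1]  # strip the braces
            # setdefault stores the value only when the key is new.
            shifts.setdefault(atom_name, float(fields[shift_col]))
    return shifts


sample_rows = [
    "label dataset sw sf",
    "0 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0",
    "1 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0",
    "2 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0",
]
atom_shifts = shifts_from_rows(sample_rows)
print(atom_shifts)
```

Splitting on whitespace depends on every row having the same column layout; the regex approach in the other answer does not care where in the row the braced label appears.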

By the way, did you mean to edit this post? I also replied on your earlier duplicate post.
