Building a dictionary from a text file based on a matching pattern and character indexes

S1645BS5010 11 2558180123.98N0185135.88W 91175.71997031.83098.5346232936
R0001 91823.71996951.410.80002 91824.81996938.811.00003 91825.91996926.311.01
R0004 91827.01996913.811.10005 91828.11996901.311.10006 91829.21996888.711.11
R0007 91830.31996876.211.20008 91831.41996863.711.20009 91832.51996851.211.31
S1645BS5010 13 2563180126.23N0185138.97W 91086.31997103.13098.5346233020
R0001 91822.91997032.810.90002 91824.01997020.311.10003 91825.21997007.711.21
R0004 91826.31996995.211.20005 91827.41996982.711.30006 91828.51996970.211.31
R0007 91829.51996957.611.40008 91830.61996945.111.40009 91831.71996932.611.51

3 Answers

User

#1 · Edited 2024-10-06 12:25:20

Work through it one step at a time. You're on the right track. If you read the file line by line, there are three cases:

    lines with "S" set the key
    lines with "R" have the values
    others...who knows.

So consider:

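shot_dict = {}
key = None  # no 'S' line seen yet
with open(file, 'r') as f:  # file: path to your data file
  for line in f:
    if line.startswith('S'):
      key = line[21:25]
      shot_dict[key] = []   # or look into defaultdict
    elif line.startswith('R') and key is not None:  # picks up the lines after each 'S' line
      # add to the dictionary under the current key;
      # extend() takes one list, whereas append() accepts only a single argument
      shot_dict[key].extend([line[...], line[...], ...])  # pseudocode: fill in the slices you need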
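
The three cases above can be sketched as a small function that is runnable on its own. This is only an illustration: `build_shot_dict`, its parameters, and the toy records are made-up names and data, not the asker's real file layout, and the slices are passed in rather than hard-coded because the real column positions are not settled here.

```python
def build_shot_dict(lines, key_slice, value_slices):
    """Group slices of 'R' lines under the key cut from the preceding 'S' line."""
    shot_dict = {}
    key = None  # no 'S' line seen yet
    for line in lines:
        if line.startswith('S'):
            key = line[key_slice]
            shot_dict[key] = []
        elif line.startswith('R') and key is not None:
            shot_dict[key].extend(line[s] for s in value_slices)
        # any other line is ignored
    return shot_dict


# Toy records (hypothetical layout, much narrower than the real file)
toy_lines = ["S A01", "R 1.5 2.5", "R 3.5 4.5", "H header", "S B02", "R 5.5 6.5"]

result = build_shot_dict(
    toy_lines,
    key_slice=slice(2, 5),                     # characters 2..4 of an 'S' line
    value_slices=[slice(2, 5), slice(6, 9)],   # two 3-character fields of an 'R' line
)
print(result)  # {'A01': ['1.5', '2.5', '3.5', '4.5'], 'B02': ['5.5', '6.5']}
```

Passing `slice` objects keeps the column positions in one place, so adjusting them for the real file does not touch the loop.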

User

#2 · Edited 2024-10-06 12:25:20

Using collections.defaultdict

from collections import defaultdict

file_name='text.pp'
shot_no = defaultdict(list)

with open(file_name , 'r') as f:
    for line in f:
        if line.strip():
            if line.startswith('S'):
                key = line[21:25]
            elif line.startswith('R'):
                shot_no[key].extend([line[23:26], line[49:54], line[75:80]])

print(shot_no)

Output

defaultdict(<class 'list'>, {'2563': ['10.', '11.10', '11.21', '11.', '11.30', '11.31', '11.', '11.40', '11.51'], '2558': ['10.', '11.00', '11.01', '11.', '11.10', '11.11', '11.', '11.20', '11.31']})

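I think you need to fix the indexes; they do not produce what you showed as your expected output. I also don't know whether you want to convert the values to float or Decimal.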
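
One possible way to adjust the indexes and convert the values is sketched here. It rests on two assumptions that are not confirmed by the question: that the 'R' line has two spaces after each 4-digit counter (which is what makes `line[23:26]` yield `'10.'`), and that each value is four characters wide, so every slice ends at start + 4.

```python
from decimal import Decimal

# Spacing reconstructed (assumed) so the columns line up with the slices used above
r_line = "R0001  91823.71996951.410.80002  91824.81996938.811.00003  91825.91996926.311.01"

# Assumed field width of 4 characters: [23:27], [49:53], [75:79]
value_slices = [slice(23, 27), slice(49, 53), slice(75, 79)]

as_text = [r_line[s] for s in value_slices]
as_float = [float(v) for v in as_text]        # binary floating point
as_decimal = [Decimal(v) for v in as_text]    # exact decimal, built from the string

print(as_text)     # ['10.8', '11.0', '11.0']
print(as_float)    # [10.8, 11.0, 11.0]
```

`Decimal` is built from the string rather than from a float so that `'10.8'` stays exactly 10.8; which of the two types is appropriate depends on what the values are used for.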

If you want to preserve insertion order, you may need an OrderedDict, with the part that adds the values adjusted accordingly.

Using collections.OrderedDict

from collections import OrderedDict

file_name='text.pp'
shot_no = OrderedDict()

with open(file_name , 'r') as f:
    for line in f:
        if line.strip():
            if line.startswith('S'):
                key = line[21:25]
            elif line.startswith('R'):
                shot_no.setdefault(key, []).extend([line[23:26], line[49:54], line[75:80]])

print(shot_no)

Output

OrderedDict([('2558', ['10.', '11.00', '11.01', '11.', '11.10', '11.11', '11.', '11.20', '11.31']), ('2563', ['10.', '11.10', '11.21', '11.', '11.30', '11.31', '11.', '11.40', '11.51'])])

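Edit: in Python 3.7+ a regular dict works as well, because according to the documentation "the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec". In 3.6 this behavior was considered an implementation detail that should not be relied on, so before 3.7 you must use OrderedDict.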
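
A minimal sketch of the plain-dict variant, assuming Python 3.7 or later. The sample lines are shortened, and their spacing is an assumption chosen so that the shot number sits at `[21:25]` and the first value at `[23:27]`.

```python
# Spacing is assumed: padded so the slices below hit the intended columns
lines = [
    "S1645BS5010" + " " * 6 + "11" + " " * 2 + "2558180123.98N0185135.88W",
    "R0001  91823.71996951.410.8",
    "S1645BS5010" + " " * 6 + "13" + " " * 2 + "2563180126.23N0185138.97W",
    "R0001  91822.91997032.810.9",
]

shot_no = {}   # plain dict: keeps insertion order on Python 3.7+
key = None
for line in lines:
    if line.startswith('S'):
        key = line[21:25]
    elif line.startswith('R') and key is not None:
        shot_no.setdefault(key, []).append(line[23:27])

print(shot_no)        # {'2558': ['10.8'], '2563': ['10.9']}
print(list(shot_no))  # ['2558', '2563'], the order in which the keys were first added
```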

User

#3 · Edited 2024-10-06 12:25:20

file = r'Z:\Sei\text.pp'  # raw string, so that \t is not read as a tab character

shot_dict = {} #creating empty dictionary

with open(file , 'r') as f:
    for line in f:
        if len(line) > 0 and line.startswith('S'):
            shot_dict[line[:11]] = line[21:25] #writing into the dictionary
print (shot_dict) #see the dictionary

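Note that dictionary keys must be unique, so think carefully about what you use as the key.
If you use line[:11] and it contains duplicates, you will lose data: each later value overwrites the one already stored under that key.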
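
The overwrite can be seen with the two 'S' records from the question, which share the same first 11 characters. This is a sketch only: the spacing inside the lines is assumed (padded so the shot number sits at `[21:25]`), and collecting values into a list is one possible workaround, not necessarily what the asker needs.

```python
# Both sample 'S' lines start with the same 11 characters (spacing assumed)
s_lines = [
    "S1645BS5010" + " " * 6 + "11" + " " * 2 + "2558180123.98N0185135.88W",
    "S1645BS5010" + " " * 6 + "13" + " " * 2 + "2563180126.23N0185138.97W",
]

# Plain assignment: the second record replaces the first
overwritten = {}
for line in s_lines:
    overwritten[line[:11]] = line[21:25]
print(overwritten)  # {'S1645BS5010': '2563'}

# Collecting into a list keeps every value seen for the key
grouped = {}
for line in s_lines:
    grouped.setdefault(line[:11], []).append(line[21:25])
print(grouped)      # {'S1645BS5010': ['2558', '2563']}
```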
