Euclidean distance between xyz coordinates in Python

Posted 2024-10-04 01:27:56


I have a data file that looks like this:

HETATM    1  H10 XSHQ    0      10.139   2.231   0.091  1.00  0.00           H
HETATM    2   N1 XSHQ    0       9.641   1.386  -0.104  1.00  0.00           N
HETATM    3   H9 XSHQ    0       9.773   1.133  -1.063  1.00  0.00           H
HETATM    4   C1 XSHQ    0       8.245   1.531   0.230  1.00  0.00           H

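The XYZ coordinates are in columns 6, 7, and 8, and the letter associated with each point is in the last column. I want to compute the distances between the points that have the letter H in the last column. How can I do that? I know this is the code I need for the calculation, but I'm not sure how to feed it the values from columns 6, 7, and 8, and only for rows whose last column is H: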

^{pr2}$

3 answers

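Of course, @smonener's answer is already correct, and using a data structure like OrderedDict is a good idea, but if you just want to use standard approaches, you could try: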

from scipy.spatial import distance

# Load data from file
with open('datafile.txt') as datafile:
    # Skip blank lines so that item[-1] below never hits an empty list
    contents = [line.split() for line in datafile if line.strip()]

# Extract the xyz coordinates, if there is an H in the last column
coords = []
for i, item in enumerate(contents):
    if item[-1] == 'H':
        coords.append([[float(x) for x in item[5:8]], i+1])

# Show results
for i in range(len(coords)):
    for j in range(i+1, len(coords)):
        dist = distance.euclidean(coords[i][0], coords[j][0])
        print('({}, {}): {:.5f}'.format(coords[i][1], coords[j][1], dist))
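
For readers who want to avoid the scipy dependency, here is a minimal sketch of the same idea using only the standard library. It is not part of the original answer: the helper name `h_distances` is hypothetical, the rows are copied from the question instead of being read from a file, and `math.dist` assumes Python 3.8 or newer.

```python
import math
from itertools import combinations

# Sample rows copied from the question; in practice they would come from the data file.
SAMPLE = """\
HETATM    1  H10 XSHQ    0      10.139   2.231   0.091  1.00  0.00           H
HETATM    2   N1 XSHQ    0       9.641   1.386  -0.104  1.00  0.00           N
HETATM    3   H9 XSHQ    0       9.773   1.133  -1.063  1.00  0.00           H
HETATM    4   C1 XSHQ    0       8.245   1.531   0.230  1.00  0.00           H
"""


def h_distances(lines):
    """Return {(serial_a, serial_b): distance} for rows whose last column is 'H'.

    Hypothetical helper: it assumes whitespace-separated rows with the serial
    number in column 2 and x, y, z in columns 6 to 8.
    """
    points = []
    for line in lines:
        fields = line.split()
        if fields and fields[-1] == 'H':
            serial = int(fields[1])
            xyz = tuple(float(x) for x in fields[5:8])
            points.append((serial, xyz))
    # combinations() visits every unordered pair exactly once
    return {(a[0], b[0]): math.dist(a[1], b[1])
            for a, b in combinations(points, 2)}


result = h_distances(SAMPLE.splitlines())
for (m, n), d in result.items():
    print('({}, {}): {:.5f}'.format(m, n, d))
```

Keying the result by the serial number in column 2, as opposed to the line index, keeps the labels stable if the file contains blank or header lines.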

I use a regexp to extract the data, then filter the rows according to the rule.

The demo code is as follows:

[Image: screenshot of the demo code, not preserved]
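
The answerer's code is only available as an image, so the following is a rough sketch of what a regexp-based approach could look like, not a reconstruction of it. The pattern `ROW` and the helpers `parse_h_atoms` and `euclidean` are hypothetical names, and the pattern assumes every row has exactly the eleven whitespace-separated fields shown in the question.

```python
import re
from itertools import combinations
from math import sqrt

# Hypothetical pattern: record name, serial number, three skipped fields,
# x, y, z, two more fields, then the element letter in the last column.
ROW = re.compile(
    r'^HETATM\s+(\d+)\s+\S+\s+\S+\s+\S+\s+'
    r'(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+'
    r'\S+\s+\S+\s+(\w+)\s*$'
)


def parse_h_atoms(lines):
    """Extract (serial, (x, y, z)) from matching rows, keeping only element 'H'."""
    atoms = []
    for line in lines:
        match = ROW.match(line)
        if match and match.group(5) == 'H':
            serial = int(match.group(1))
            xyz = tuple(float(match.group(k)) for k in (2, 3, 4))
            atoms.append((serial, xyz))
    return atoms


def euclidean(p, q):
    """Plain Euclidean distance between two equal-length coordinate tuples."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


rows = [
    'HETATM    1  H10 XSHQ    0      10.139   2.231   0.091  1.00  0.00           H',
    'HETATM    2   N1 XSHQ    0       9.641   1.386  -0.104  1.00  0.00           N',
    'HETATM    3   H9 XSHQ    0       9.773   1.133  -1.063  1.00  0.00           H',
    'HETATM    4   C1 XSHQ    0       8.245   1.531   0.230  1.00  0.00           H',
]

atoms = parse_h_atoms(rows)
pairs = {(a[0], b[0]): euclidean(a[1], b[1]) for a, b in combinations(atoms, 2)}
for (m, n), d in pairs.items():
    print('{} - {}: {:.6f}'.format(m, n, d))
```

Rows that do not match the pattern are silently skipped, which may or may not be what you want for malformed input.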

A simple solution using generator expressions

From PEP 289 Generator Expressions
Rationale
Experience with list comprehensions has shown their widespread utility throughout Python. However, many of the use cases do not need to have a full list created in memory. Instead, they only need to iterate over the elements one at a time.

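because

  1. you don't need to store intermediate results
  2. you may have a large data set

and ^{} from the itertools standard library module, because you need to compute the distance for every pair of interesting points in the data set.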

$ cat euclid.py
from scipy.spatial.distance import euclidean
from itertools import combinations

lines = ['HETATM 1 H10 XSHQ 0  10.139 2.231  0.091 1.00 0.00 H',
         'HETATM 2  N1 XSHQ 0   9.641 1.386 -0.104 1.00 0.00 N',
         'HETATM 3  H9 XSHQ 0   9.773 1.133 -1.063 1.00 0.00 H',
         'HETATM 4  C1 XSHQ 0   8.245 1.531  0.230 1.00 0.00 H']

H_lines = (line for line in lines if line[-1]=='H')
H_lists = (line.split() for line in H_lines)
H_data = ((int(tok[1]), [float(x) for x in tok[5:8]]) for tok in H_lists)
H_dist = {(i[0], j[0]):euclidean(i[1], j[1])
             for i, j in combinations(H_data, 2)}

for m, n in H_dist:
    print('Distance between points %d and %d is %.6f'%(
             m, n, H_dist[m, n]))
$ python3 euclid.py
Distance between points 1 and 3 is 1.634404
Distance between points 1 and 4 is 2.023995
Distance between points 3 and 4 is 2.040842
$ 
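
One caveat when moving from the in-memory `lines` list to a real file: lines read from a file handle end with a newline, so a test like `line[-1]=='H'` would no longer match. Below is a hedged sketch of the same lazy pipeline reading from a file-like object. The generator name `h_points` is hypothetical, `io.StringIO` stands in for an opened data file, and `math.dist` (Python 3.8+) replaces scipy's `euclidean` only to keep the example dependency-free.

```python
import io
from itertools import combinations
from math import dist  # available from Python 3.8


def h_points(handle):
    """Lazily yield (serial, [x, y, z]) for rows whose last field is 'H'.

    Splitting first and testing the last token avoids the trailing-newline
    problem that a plain line[-1] check has on lines read from a file.
    """
    for line in handle:
        tok = line.split()
        if tok and tok[-1] == 'H':
            yield int(tok[1]), [float(x) for x in tok[5:8]]


# io.StringIO is a stand-in for an opened data file (assumption for this sketch).
handle = io.StringIO(
    'HETATM 1 H10 XSHQ 0  10.139 2.231  0.091 1.00 0.00 H\n'
    'HETATM 2  N1 XSHQ 0   9.641 1.386 -0.104 1.00 0.00 N\n'
    'HETATM 3  H9 XSHQ 0   9.773 1.133 -1.063 1.00 0.00 H\n'
    'HETATM 4  C1 XSHQ 0   8.245 1.531  0.230 1.00 0.00 H\n'
)

# Each distance is computed only when the list comprehension pulls the next pair.
distances = [(i, j, dist(p, q))
             for (i, p), (j, q) in combinations(h_points(handle), 2)]

for i, j, d in distances:
    print('Distance between points %d and %d is %.6f' % (i, j, d))
```

Note that `combinations` collects the items it receives before producing pairs, so the memory saving here comes from never storing the non-H rows or the split token lists, not from avoiding storage of the H points themselves.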
