Calculating the total distance travelled for each unique ID

Posted 2024-10-02 00:41:57

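I have a DataFrame with three columns. One column holds the x coordinates and another holds the y coordinates. As you can see, there is also a "trackiD" column, which associates each pair of x and y coordinates with a specific, unique track ID.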

    trackiD   X_COORDINATES     Y_COORDINATES
        
     2        542.299805        23.388090
     2        544.108215        23.575758
     2        545.300598        23.962421
     2        546.417053        25.049328
     2        546.198669        24.830357
     2        546.724915        24.916084
     2        547.037048        24.918982
     2        547.011963        24.785202
     2        547.649231        24.845772
     3        547.600525        24.613401
     3        547.891479        24.268734
     3        548.580505        24.459103
     3        548.144409        23.915531
     3        548.626770        23.922005
     4        548.527222        24.134670
     4        548.504211        23.642254
     4        548.936584        24.028818
     4        548.627869        23.295454

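What I want to do:

  • Take each consecutive pair of x and y coordinates and use the Pythagorean distance formula, sqrt((x2 - x1)^2 + (y2 - y1)^2), to compute the increment in distance travelled between them. Append each increment to a list, then sum the list to get the total distance travelled. Note that I only want to do this within each group of coordinates that shares a trackiD: compute the sum of increments for trackiD 2, then repeat the process separately for trackiD 3 and 4, and so on. In the end, the total distance travelled for each unique track ID should be stored in a new list.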
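The steps above can be sketched in plain Python, without pandas. This is only a minimal illustration of the computation being asked for: the sample rows and the function name `total_distance_per_track` are hypothetical and do not come from the question's data.

```python
import math
from collections import defaultdict

# Hypothetical sample: (track id, x, y) rows in the order they were recorded.
rows = [
    (2, 0.0, 0.0),
    (2, 3.0, 4.0),
    (2, 3.0, 10.0),
    (3, 1.0, 1.0),
    (3, 4.0, 5.0),
]


def total_distance_per_track(rows):
    """Sum the straight-line distances between consecutive points of each track."""
    last_point = {}              # track id -> most recent (x, y) seen for it
    totals = defaultdict(float)  # track id -> running total distance
    for track_id, x, y in rows:
        if track_id in last_point:
            x_prev, y_prev = last_point[track_id]
            # math.hypot(dx, dy) is sqrt(dx**2 + dy**2)
            totals[track_id] += math.hypot(x - x_prev, y - y_prev)
        else:
            # First point of a track: register the track with a distance of 0.
            totals[track_id] += 0.0
        last_point[track_id] = (x, y)
    return dict(totals)


totals = total_distance_per_track(rows)
print(totals)  # {2: 11.0, 3: 5.0}
```

Track 2 covers 5 + 6 = 11 units and track 3 covers 5 units; `list(totals.values())` gives the plain list of per-track totals described above.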

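This is my current code. It runs, but the list it returns contains only a single, large, and probably incorrect value (shown below). Also, on Stack Overflow the "value" assignment appears to be truncated and wrapped across several lines, which is not the case when I run it in my Jupyter notebook.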

    def pythag_dis(U_id):
        c = data.Unique_id == U_id
        df = data[c]
        df.reset_index(inplace=True)
        k = sorted(df.trackId.unique())
        i = 0
        j = 1
        length = len(k)
        while i < length:
            condition = df.trackId == k[i]
            df2 = df[condition]
            df2.reset_index(inplace=True)
            value = math.sqrt(
                (df.Object_Center_0.iloc[j] - df.Object_Center_0.iloc[i]) ** 2
                + (df.Object_Center_1.iloc[j] - df.Object_Center_1.iloc[i]) ** 2
            )
            mylist = []
            mylist.append(value)
            fulldistance = sum(mylist)
            mylist2 = []
            mylist2.append(fulldistance)
            i += 1
        return mylist2

    pythag_dis('1CCM0701')

OUTPUT: [1976.075585650214]
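Judging only from the code as posted, one reason the result holds a single element is that `mylist` and `mylist2` are re-created on every pass of the `while` loop, so only the last pass survives. The sketch below isolates that pattern; the dictionary of per-track totals and both function names are hypothetical stand-ins, not part of the original code.

```python
# Hypothetical per-track totals standing in for the real distance computation.
per_track_totals = {2: 11.0, 3: 5.0, 4: 2.5}


def reset_inside_loop(totals):
    """Mimics the posted pattern: the result list is re-created on each pass."""
    for track_id in sorted(totals):
        results = []  # earlier passes are thrown away here
        results.append(totals[track_id])
    return results


def create_before_loop(totals):
    """Creates the result list once, so every pass contributes an entry."""
    results = []
    for track_id in sorted(totals):
        results.append(totals[track_id])
    return results


print(reset_inside_loop(per_track_totals))   # [2.5]
print(create_before_loop(per_track_totals))  # [11.0, 5.0, 2.5]
```

The posted loop also reads from `df` rather than the per-track `df2` and never advances `j`, so the single value it returns is not a per-track sum either.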

2 answers

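First, create two new columns, X_SHIFTED and Y_SHIFTED, which hold the coordinates of the previous point within each track ID. We can do this by combining df.groupby and df.shift: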

df[['X_SHIFTED', 'Y_SHIFTED']] = df.groupby('trackiD').shift()
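To see what the grouped shift does, here is a small sketch on a hypothetical toy frame (the values are made up; only the column names follow the question). It assigns the two columns one at a time, which gives the same result as the single statement above.

```python
import pandas as pd

# Hypothetical toy data: two tracks with three and two points.
toy = pd.DataFrame(
    {
        "trackiD": [2, 2, 2, 3, 3],
        "X_COORDINATES": [0.0, 3.0, 3.0, 1.0, 4.0],
        "Y_COORDINATES": [0.0, 4.0, 10.0, 1.0, 5.0],
    }
)

# shift() moves values down by one row *within each track*, so every row
# receives the previous point of the same track.
toy["X_SHIFTED"] = toy.groupby("trackiD")["X_COORDINATES"].shift()
toy["Y_SHIFTED"] = toy.groupby("trackiD")["Y_COORDINATES"].shift()

print(toy)
```

The first row of each track has no predecessor, so its shifted values are NaN; in particular, the first row of track 3 does not pick up the last point of track 2.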

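Then simply apply the Euclidean distance formula between the points (X_COORDINATES, Y_COORDINATES) and (X_SHIFTED, Y_SHIFTED). We can do this row by row with df.apply (axis=1) together with math.dist (available in Python 3.8 and later):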

import math

df['DIST'] = df.apply(
    lambda row: math.dist(
        (row['X_COORDINATES'], row['Y_COORDINATES']),
        (row['X_SHIFTED'], row['Y_SHIFTED'])
    ), axis=1)

Output:

    trackiD  X_COORDINATES  Y_COORDINATES   X_SHIFTED  Y_SHIFTED      DIST
0         2     542.299805      23.388090         NaN        NaN       NaN
1         2     544.108215      23.575758  542.299805  23.388090  1.818122
2         2     545.300598      23.962421  544.108215  23.575758  1.253509
3         2     546.417053      25.049328  545.300598  23.962421  1.558152
4         2     546.198669      24.830357  546.417053  25.049328  0.309257
5         2     546.724915      24.916084  546.198669  24.830357  0.533183
6         2     547.037048      24.918982  546.724915  24.916084  0.312146
7         2     547.011963      24.785202  547.037048  24.918982  0.136112
8         2     547.649231      24.845772  547.011963  24.785202  0.640140
9         3     547.600525      24.613401         NaN        NaN       NaN
10        3     547.891479      24.268734  547.600525  24.613401  0.451054
11        3     548.580505      24.459103  547.891479  24.268734  0.714841
12        3     548.144409      23.915531  548.580505  24.459103  0.696886
13        3     548.626770      23.922005  548.144409  23.915531  0.482404
14        4     548.527222      24.134670         NaN        NaN       NaN
15        4     548.504211      23.642254  548.527222  24.134670  0.492953
16        4     548.936584      24.028818  548.504211  23.642254  0.579981
17        4     548.627869      23.295454  548.936584  24.028818  0.795693

To get the total distance for each track, use:

df.groupby('trackiD')['DIST'].sum()

Output:

trackiD
2    6.560621
3    2.345185
4    1.868628
Name: DIST, dtype: float64
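As a possible alternative to the row-wise df.apply, the same totals can be sketched with vectorized operations: a grouped diff() for the coordinate differences and np.hypot for the distance. This is a swapped-in technique, not the method used above, and it assumes the rows of each track are already in travel order. The data is copied from the question.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df = pd.DataFrame(
    {
        "trackiD": [2] * 9 + [3] * 5 + [4] * 4,
        "X_COORDINATES": [
            542.299805, 544.108215, 545.300598, 546.417053, 546.198669,
            546.724915, 547.037048, 547.011963, 547.649231,
            547.600525, 547.891479, 548.580505, 548.144409, 548.626770,
            548.527222, 548.504211, 548.936584, 548.627869,
        ],
        "Y_COORDINATES": [
            23.388090, 23.575758, 23.962421, 25.049328, 24.830357,
            24.916084, 24.918982, 24.785202, 24.845772,
            24.613401, 24.268734, 24.459103, 23.915531, 23.922005,
            24.134670, 23.642254, 24.028818, 23.295454,
        ],
    }
)

# Per-track differences between each point and the previous one; the first
# row of every track has no predecessor, so its differences are NaN.
deltas = df.groupby("trackiD")[["X_COORDINATES", "Y_COORDINATES"]].diff()

# np.hypot computes sqrt(dx**2 + dy**2) element-wise.
df["DIST"] = np.hypot(deltas["X_COORDINATES"], deltas["Y_COORDINATES"])

# sum() skips NaN by default, so the first row of each track adds nothing.
totals = df.groupby("trackiD")["DIST"].sum()
print(totals)
```

`totals.tolist()` returns the per-track totals as a plain Python list, which is the form the question asks for.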

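A possible solution using pandas: I use groupby with shift to pair up consecutive coordinates, compute the distances, and then sum the distances within each group: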

import math
import numpy as np
import pandas as pd

def distance(row):
    x1, y1, x2, y2 = row["X_COORDINATES"], row["Y_COORDINATES"], row["X2"], row["Y2"]
    if np.isnan(x2) or np.isnan(y2):
        return 0
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

df["X2"] = df.groupby("trackiD")["X_COORDINATES"].shift(-1)
df["Y2"] = df.groupby("trackiD")["Y_COORDINATES"].shift(-1)

df["distance"] = df.apply(distance, axis=1)
df.groupby("trackiD")["distance"].sum()

Output:

trackiD
2    6.560621
3    2.345185
4    1.868628
Name: distance, dtype: float64

Test DataFrame:

df = pd.DataFrame(
    {
        "trackiD": {
            0: 2,
            1: 2,
            2: 2,
            3: 2,
            4: 2,
            5: 2,
            6: 2,
            7: 2,
            8: 2,
            9: 3,
            10: 3,
            11: 3,
            12: 3,
            13: 3,
            14: 4,
            15: 4,
            16: 4,
            17: 4,
        },
        "X_COORDINATES": {
            0: 542.299805,
            1: 544.108215,
            2: 545.300598,
            3: 546.417053,
            4: 546.198669,
            5: 546.724915,
            6: 547.037048,
            7: 547.011963,
            8: 547.649231,
            9: 547.600525,
            10: 547.891479,
            11: 548.580505,
            12: 548.144409,
            13: 548.62677,
            14: 548.527222,
            15: 548.504211,
            16: 548.936584,
            17: 548.627869,
        },
        "Y_COORDINATES": {
            0: 23.38809,
            1: 23.575758,
            2: 23.962421,
            3: 25.049328,
            4: 24.830357,
            5: 24.916084,
            6: 24.918982,
            7: 24.785202,
            8: 24.845772,
            9: 24.613401,
            10: 24.268734,
            11: 24.459103,
            12: 23.915531,
            13: 23.922005,
            14: 24.13467,
            15: 23.642254,
            16: 24.028818,
            17: 23.295454,
        },
    }
)
