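I am trying to add two columns to a DataFrame, neighbor_50 and neighbor_100, that count the vehicles surrounding each vehicle, per direction and per time frame, based on distance. If another vehicle is within a distance of 50, I add one to neighbor_50, and likewise for 100. I am supposed to do this using pandas only.
The distance should be computed from each vehicle's x and y positions with the following equation:
distance = ((x2 - x1)**2 + (y2 - y1)**2)**0.5
I used the following code: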
    #import numpy as np
    df['neighbor_50'] = 0
    df['neighbor_100'] = 0
    frame_group = df.groupby(['frame', 'direction'])
    list_keys = list(frame_group.indices.keys())
    for key in list_keys:
        frame, direction = key[0], key[1]
        #new_df = df.loc[(df['frame'] == frame) & (df['direction'] == direction)]
        mask1 = (df['frame'] == frame) & (df['direction'] == direction)
        ids = df[mask1]['id']
        for i in ids:
            for j in ids:
                if i != j:
                    #distance = sqrt((x2-x1)**2 + (y2-y1)**2)
                    maski = (df['frame'] == frame) & (df['direction'] == direction) & (df['id'] == i)
                    maskj = (df['frame'] == frame) & (df['direction'] == direction) & (df['id'] == j)
                    x2 = df[maski]['x'].iloc[0]
                    x1 = df[maskj]['x'].iloc[0]
                    y2 = df[maski]['y'].iloc[0]
                    y1 = df[maskj]['y'].iloc[0]
                    distance = ((x2 - x1)**2 + (y2 - y1)**2)**0.5
                    #distance = np.hypot((x2 - x1),(y2 - y1))
                    mask = (df['frame'] == frame) & (df['direction'] == direction) & (df['id'] == i)
                    if distance <= 50:
                        df.loc[mask, 'neighbor_50'] += 1
                    if distance <= 100:
                        df.loc[mask, 'neighbor_100'] += 1
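The problem is that it takes a very long time to finish because the data is large, even when I use NumPy.
Update: by avoiding repeated calculations for the same pair of IDs, I managed to cut the time in half, but it is still very slow: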
    import numpy as np
    df['neighbor_50'] = 0
    df['neighbor_100'] = 0
    frame_group = df.groupby(['frame', 'direction'])
    list_keys = list(frame_group.indices.keys())
    for key in list_keys:
        frame, direction = key[0], key[1]
        #new_df = df.loc[(df['frame'] == frame) & (df['direction'] == direction)]
        mask = (df['frame'] == frame) & (df['direction'] == direction)
        ids = df[mask]['id'].values
        for i in range(len(ids) - 1):
            id1 = ids[i]
            for j in range(i + 1, len(ids)):
                id2 = ids[j]
                maski = (df['frame'] == frame) & (df['direction'] == direction) & (df['id'] == id1)
                maskj = (df['frame'] == frame) & (df['direction'] == direction) & (df['id'] == id2)
                x2 = df[maski]['x'].iloc[0]
                x1 = df[maskj]['x'].iloc[0]
                y2 = df[maski]['y'].iloc[0]
                y1 = df[maskj]['y'].iloc[0]
                #distance = ((x2 - x1)**2 + (y2 - y1)**2)**0.5
                distance = np.hypot((x2 - x1), (y2 - y1))
                if distance <= 100:
                    df.loc[maski, 'neighbor_100'] += 1
                    df.loc[maskj, 'neighbor_100'] += 1
                if distance <= 50:
                    df.loc[maski, 'neighbor_50'] += 1
                    df.loc[maskj, 'neighbor_50'] += 1
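There are several ways to do this, but without using scipy or numpy, this is probably the fastest:
Output:
Note: the output columns may be float, because a vehicle with no neighbors gets the value NaN, which cannot be represented as an int. However, if every row has at least one neighbor, the dtype will be int.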
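As an illustration of a pandas-only approach of this kind, here is a minimal sketch. It is not necessarily the code this answer originally showed. It assumes the columns are named frame, direction, id, x and y as in the question, and that each id appears at most once per frame and direction. The function name count_neighbors and the _other suffix are made up for the example. The idea is to self-merge the frame on (frame, direction) so that every pair of vehicles in a group becomes one row, compute all distances in one vectorized step, and count the pairs that fall within each radius.

```python
import pandas as pd


def count_neighbors(df, radii=(50, 100)):
    """Add one neighbor_<r> column per radius (hypothetical helper)."""
    keys = ["frame", "direction", "id"]

    # Pair every vehicle with every vehicle in the same frame and direction.
    pairs = df.merge(df, on=["frame", "direction"], suffixes=("", "_other"))
    # Drop the rows where a vehicle is paired with itself.
    pairs = pairs[pairs["id"] != pairs["id_other"]]

    # Euclidean distance for all pairs at once, same formula as the question.
    dist = ((pairs["x"] - pairs["x_other"]) ** 2
            + (pairs["y"] - pairs["y_other"]) ** 2) ** 0.5

    out = df.copy()
    for r in radii:
        # Number of other vehicles within radius r, per vehicle.
        counts = (
            pairs[dist <= r]
            .groupby(keys)
            .size()
            .rename(f"neighbor_{r}")
            .reset_index()
        )
        # Left merge keeps every vehicle; those without neighbors get NaN.
        out = out.merge(counts, on=keys, how="left")
    return out


# Small made-up example.
example = pd.DataFrame({
    "frame": [1, 1, 1, 1],
    "direction": ["N", "N", "N", "S"],
    "id": [1, 2, 3, 4],
    "x": [0.0, 30.0, 0.0, 0.0],
    "y": [0.0, 30.0, 90.0, 0.0],
})
print(count_neighbors(example))
```

The left merge is what produces the NaN values mentioned in the note; chaining `.fillna(0).astype(int)` on the new columns would turn them into integer counts. The self-merge creates one row per pair of vehicles in each group, so memory use grows quadratically with group size, which may matter for very dense frames.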