How do I merge two time series with different time values in Python?

Posted 2024-10-03 04:39:37

I have:

highhz =  [(0,1),(2,2),(4,4),(5,5),(6,6),(7,7),(8,8)]
lowhz=    [(1.5,1.5),(5.6,5.6)]

I want:

alldata =  [(0,1,1.5),
            (2,2,NaN),
            (4,4,NaN),
            (5,5,5.6),
            (6,6,NaN),
            (7,7,NaN),
            (8,8,NaN)]

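That is, attach the values from the second, low-frequency source to the ordinates of the high-frequency source, forming a combined table that has the time ordinates of the high-frequency source, with NaN wherever there is no low-frequency data.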

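Any idea how to do this in a Pythonic way? In C I would use two moving pointers, and in Lisp I would recurse, but even if I ported those algorithms to Python they would not look very idiomatic.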
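
For comparison, the two-pointer merge mentioned above could be written in Python roughly as follows. This is only a sketch: the function name `merge_series` is hypothetical, both lists are assumed to be sorted by time, and each low-frequency value is assumed to belong to the latest high-frequency time at or before it.

```python
def merge_series(high, low):
    """Merge two sorted lists of (time, value) tuples with two moving pointers."""
    nan = float('nan')
    merged = []
    j = 0  # pointer into the low-frequency list
    for i, (t, v) in enumerate(high):
        # The current high-frequency interval ends where the next sample begins.
        t_next = high[i + 1][0] if i + 1 < len(high) else float('inf')
        # Skip low-frequency samples that lie before the current time.
        while j < len(low) and low[j][0] < t:
            j += 1
        if j < len(low) and low[j][0] < t_next:
            merged.append((t, v, low[j][1]))
            j += 1
        else:
            merged.append((t, v, nan))
    return merged


highhz = [(0, 1), (2, 2), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
lowhz = [(1.5, 1.5), (5.6, 5.6)]
alldata = merge_series(highhz, lowhz)
print(alldata)
```

Only the first low-frequency value in each interval is kept here; further values in the same interval are skipped.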


3 answers

You can use a list comprehension, but it produces extra entries such as (0, 1, 'Nan'), and I don't know why :)

>>> [i+(max(j),) if max(i)<max(j)<max(i)+1 else i+('Nan',) for i in highhz for j in lowhz]
[(0, 1, 1.5), (0, 1, 'Nan'), (2, 2, 'Nan'), (2, 2, 'Nan'), (4, 4, 'Nan'), (4, 4, 'Nan'), (5, 5, 'Nan'), (5, 5, 5.6), (6, 6, 'Nan'), (6, 6, 'Nan'), (7, 7, 'Nan'), (7, 7, 'Nan'), (8, 8, 'Nan'), (8, 8, 'Nan')]
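
The extra rows come from the nested `for`: the comprehension emits one tuple for every (highhz, lowhz) pair, so 7 × 2 = 14 tuples come out. Below is a minimal sketch of one way to get a single row per highhz entry while keeping the same `max(i) < max(j) < max(i) + 1` test. The helper name `first_match` is hypothetical and not part of the original answer, and it uses a real NaN instead of the string 'Nan'.

```python
highhz = [(0, 1), (2, 2), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
lowhz = [(1.5, 1.5), (5.6, 5.6)]


def first_match(high, lows):
    """Return the first low value strictly between max(high) and max(high) + 1, else NaN."""
    top = max(high)
    return next((max(low) for low in lows if top < max(low) < top + 1),
                float('nan'))


# One output tuple per highhz entry, instead of one per (highhz, lowhz) pair.
alldata = [high + (first_match(high, lowhz),) for high in highhz]
print(alldata)
```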

Does this solve your problem?

highhz =  [(0,1),(2,2),(4,4),(5,5),(6,6),(7,7),(8,8)]
lowhz=    [(1.5,1.5),(5.6,5.6)]

# Key each lowhz time by its floor (integer part).
low_d = {int(x): x for x, _ in lowhz}
# {1: 1.5, 5: 5.6}

# dict.get() takes an optional default, returned when the key is missing.
# Note that the lookup key is y, the second element of each highhz tuple.
result = [(x, y, low_d.get(y, None)) for x, y in highhz]
# [(0, 1, 1.5),
#  (2, 2, None),
#  (4, 4, None),
#  (5, 5, 5.6),
#  (6, 6, None),
#  (7, 7, None),
#  (8, 8, None)]

# Or, as @Ashwini Chaudhary suggested, use NaN as the default:
result = [(x, y, low_d.get(y, float('nan'))) for x, y in highhz]
# Same rows as above, with nan in place of None.
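
One practical difference between the two defaults: NaN never compares equal to anything, including itself, so rows padded with `float('nan')` have to be checked with `math.isnan` rather than `==`, whereas `None` can be tested with `is None`. A small sketch using only the standard library (the variable names are illustrative):

```python
import math

row_nan = (2, 2, float('nan'))
row_none = (2, 2, None)

eq_result = row_nan[2] == float('nan')   # False: NaN is not equal to NaN
isnan_result = math.isnan(row_nan[2])    # True
none_result = row_none[2] is None        # True
print(eq_result, isnan_result, none_result)
```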

Here is one way, using collections.OrderedDict and bisect:

from collections import OrderedDict
from bisect import bisect_left
from pprint import pprint

# Map each highhz time to a list holding its value.
dct = OrderedDict()
for t, v in highhz:
    dct.setdefault(t, []).append(v)
times = list(dct)

# Append each lowhz value to the closest highhz time below it.
for t, v in lowhz:
    ind = bisect_left(times, t) - 1
    dct[times[ind]].append(v)

# Rows that received no lowhz value get a NaN.
for k, v in dct.items():
    if len(v) == 1:
        v.append(float('nan'))

print([[k] + v for k, v in dct.items()])
# [[0, 1, 1.5], [2, 2, nan], [4, 4, nan], [5, 5, 5.6], [6, 6, nan], [7, 7, nan], [8, 8, nan]]

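A slightly modified version of the code above, which pads every row to the same length with NaN when more than one item falls between two times: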

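highhz = [(0, 1), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
lowhz = [(1.5, 1.5), (2, 2), (5.6, 5.6), (5.7, 10), (5.8, 20)]

# Rebuild dct from the new data with the same loops as above.
dct = OrderedDict()
for t, v in highhz:
    dct.setdefault(t, []).append(v)
times = list(dct)
for t, v in lowhz:
    dct[times[bisect_left(times, t) - 1]].append(v)

# Length of the longest row.
max_n = len(max(dct.values(), key=len))

# Pad every shorter row with NaN so that all rows have the same length.
for v in dct.values():
    v.extend([float('nan')] * (max_n - len(v)))

pprint([[k] + v for k, v in dct.items()])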

[[0, 1, 1.5, 2, nan],
 [3, 3, nan, nan, nan],
 [4, 4, nan, nan, nan],
 [5, 5, 5.6, 10, 20],
 [6, 6, nan, nan, nan],
 [7, 7, nan, nan, nan],
 [8, 8, nan, nan, nan]]
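
A note on the lookup `bisect_left(times, t) - 1` used in both versions: it picks the last time strictly below `t`, so a lowhz time that exactly equals a highhz time is attached to the previous row. If it should go to the matching row instead, `bisect_right` is one option. A small sketch of the difference (the variable names are illustrative; a lowhz time earlier than `times[0]` gives index -1 with either function and would need separate handling):

```python
from bisect import bisect_left, bisect_right

times = [0, 2, 4, 5, 6, 7, 8]

between = bisect_left(times, 1.5) - 1      # 0 -> times[0] == 0
exact_left = bisect_left(times, 4) - 1     # 1 -> times[1] == 2, the previous row
exact_right = bisect_right(times, 4) - 1   # 2 -> times[2] == 4, the matching row
print(times[between], times[exact_left], times[exact_right])
```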
