Find integers with close values in a numpy array and merge them

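3 Answers

User

Answer 1 · edited 2024-10-03 23:28:14

Try something like this. I added a few extra steps just to show the flow. The idea is to split the sorted data into groups of adjacent values and decide, based on how spread out each group is, whether to merge it.

So, as you described, you can chunk the data into sets of 3 numbers: if the difference between the maximum and the minimum of a set is less than 50, take the mean of the set, otherwise leave the values as they are. (The example call below passes 5 as the `close` threshold; adjust it to your data.)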

import pandas as pd
import numpy as np
arr = np.ravel([1,24,5.3, 12, 8, 45, 14, 18, 33, 15, 19, 22])
arr.sort()

def reshape_arr(a, n): # n is number of consecutive adjacent items you want to compare for averaging
    hold = len(a)%n
    if hold != 0:
        container = a[-hold:] #numbers that do not fit on the array will be excluded for averaging
        a = a[:-hold].reshape(-1,n)
    else:
        a = a.reshape(-1,n)
        container = None
    return a, container
def get_mean(a, close): # close = how close adjacent numbers need to be, in order to be averaged together
    my_list=[]
    for i in range(len(a)):
        if a[i].max()-a[i].min() > close:
            for j in range(len(a[i])):
                my_list.append(a[i][j])
        else:
            my_list.append(a[i].mean())
    return my_list  
def final_list(a, c): # add any elements held in the container to the final list
    if c is not None:
        c = c.tolist()
        for i in range(len(c)):
            a.append(c[i])
    return a 

arr, container = reshape_arr(arr,3)
arr = get_mean(arr, 5)
print(final_list(arr, container))

User

Answer 2 · edited 2024-10-03 23:28:14

You can use fuzzywuzzy here to measure the closeness ratio between the two datasets.

See: http://jonathansoma.com/lede/algorithms-2017/classes/fuzziness-matplotlib/fuzzing-matching-in-pandas-with-fuzzywuzzy/

User

Answer 3 · edited 2024-10-03 23:28:14

Taking Gustavo's answer and adapting it to my needs:

def reshape_arr(a, close):
    flag = True
    while flag is not False:
        array = a.sort_values().unique()
        l = len(array)
        flag = False
        for i in range(l):
            previous_item = next_item = None
            if i > 0:
                previous_item = array[i - 1]
            if i < (l - 1):
                next_item = array[i + 1]
            if previous_item is not None:
                if abs(array[i] - previous_item) < close:
                    average = (array[i] + previous_item) / 2
                    flag = True
                    #find matching values in a, and replace with the average
                    a.replace(previous_item, value=average, inplace=True)
                    a.replace(array[i], value=average, inplace=True)

            if next_item is not None:
                if abs(next_item - array[i]) < close:
                    flag = True
                    average = (array[i] + next_item) / 2
                    # find matching values in a, and replace with the average
                    a.replace(array[i], value=average, inplace=True)
                    a.replace(next_item, value=average, inplace=True)
    return a

If I do something like this:

 candlesticks['support'] = reshape_arr(supres_df['support'], 150)

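where candlesticks is the main DataFrame I am working with and supres_df is another DataFrame that I massage before applying it to the main one.

It works, but it is very slow. I am now trying to optimize it.

I added the while loop because, after averaging, the averages can end up close enough to each other to be averaged again, so I loop until no more averaging is needed. This is complete newbie work, so if you see something silly, please comment.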
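One possible direction for the optimization, as a minimal sketch rather than a drop-in replacement: sort the distinct values once, start a new cluster whenever the gap to the previous value is at least `close`, and map every value to the mean of its cluster. This is a different technique from the repeated pairwise averaging above and can give different numbers, because a chain of close neighbours is merged into a single mean in one pass. The function name `merge_close` and the sample data are hypothetical, made up for illustration.

```python
def merge_close(values, close):
    """Map each distinct value to the mean of its cluster.

    A cluster is a run of sorted distinct values in which every gap
    between neighbours is smaller than `close`.
    """
    distinct = sorted(set(values))
    mapping = {}
    cluster = []
    for v in distinct:
        # A gap of at least `close` ends the current cluster.
        if cluster and v - cluster[-1] >= close:
            mean = sum(cluster) / len(cluster)
            mapping.update((c, mean) for c in cluster)
            cluster = []
        cluster.append(v)
    if cluster:
        # Flush the last cluster.
        mean = sum(cluster) / len(cluster)
        mapping.update((c, mean) for c in cluster)
    return mapping


levels = [100, 120, 400, 410, 900]  # made-up sample data
mapping = merge_close(levels, 150)
merged = [mapping[v] for v in levels]
print(merged)  # [110.0, 110.0, 405.0, 405.0, 900.0]
```

Because neighbouring clusters are separated by a gap of at least `close`, their means are also at least `close` apart, so no second pass is needed. With a pandas Series, the resulting dictionary could presumably be applied in one call with `Series.map(mapping)` instead of many `replace` calls.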
