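<p>Try something like this. I added a few extra steps just to show the flow.
The idea is to split the sorted data into groups of adjacent values and decide, based on how spread out each group is, whether to average it.</p>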
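<p>So, as you described, you can arrange the data into sets of 3 numbers. If the difference between the maximum and minimum of a set is less than 50, take its mean; otherwise leave the values as they are. Note that the code below passes 5 as <code>close</code> instead of 50, and that it also averages a set whose spread is exactly equal to the threshold.</p>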
<p><a href="https://i.stack.imgur.com/Gr2Zg.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/Gr2Zg.png" alt="enter image description here"/></a></p>
<pre><code>import numpy as np

arr = np.ravel([1, 24, 5.3, 12, 8, 45, 14, 18, 33, 15, 19, 22])
arr.sort()

def reshape_arr(a, n):  # n is the number of consecutive adjacent items to compare for averaging
    hold = len(a) % n
    if hold != 0:
        container = a[-hold:]  # leftover numbers that do not fill a full group are excluded from averaging
        a = a[:-hold].reshape(-1, n)
    else:
        a = a.reshape(-1, n)
        container = None
    return a, container

def get_mean(a, close):  # close = how close adjacent numbers must be in order to be averaged together
    my_list = []
    for row in a:
        if row.max() - row.min() > close:
            my_list.extend(row.tolist())  # spread too wide: keep the values as they are
        else:
            my_list.append(float(row.mean()))  # close enough: replace the group with its mean
    return my_list

def final_list(a, c):  # add any elements held in the container to the final list
    if c is not None:
        a.extend(c.tolist())
    return a

arr, container = reshape_arr(arr, 3)
arr = get_mean(arr, 5)
print(final_list(arr, container))
</code></pre>
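<p>If you would rather have a single function, here is a minimal sketch of the same idea. The name <code>average_close_groups</code> and the default values for <code>n</code> and <code>close</code> are my own assumptions, not part of your question. It uses <code>np.ptp</code> to get max minus min for every group at once.</p>
<pre><code>import numpy as np

def average_close_groups(values, n=3, close=5):
    """Sort values, split them into consecutive groups of n, and replace each
    group whose spread (max - min) does not exceed `close` with its mean.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    a = np.sort(np.asarray(values, dtype=float))
    full = len(a) - len(a) % n        # how many items fill complete groups
    groups = a[:full].reshape(-1, n)
    leftover = a[full:]               # trailing items that do not fill a group
    spread = np.ptp(groups, axis=1)   # max - min of each group

    result = []
    for row, s in zip(groups, spread):
        if s > close:
            result.extend(row.tolist())       # too spread out: keep as is
        else:
            result.append(float(row.mean()))  # close enough: average
    result.extend(leftover.tolist())          # leftovers are appended unchanged
    return result

data = [1, 24, 5.3, 12, 8, 45, 14, 18, 33, 15, 19, 22]
print(average_close_groups(data, n=3, close=5))
</code></pre>
<p>With this sample data, two of the four groups are close enough to be averaged (to roughly 13.67 and 19.67), and the other two are kept as they are.</p>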