How do I compute the mean and standard deviation of multiple DataFrames at once?

5113.440 1 0.25846 0.10166 27.96867 0.94852 -0.25846 268.29305 5113.434129
5074.760 3 0.68155 0.16566 120.18771 3.02654 -0.68155 101.02457 5074.745627
5083.340 2 0.74771 0.13267 105.59355 2.15700 -0.74771 157.52406 5083.337081
5088.150 1 0.28689 0.12986 39.65747 2.43339 -0.28689 164.40787 5088.141849
5090.780 1 0.61464 0.14479 94.72901 2.78712 -0.61464 132.25865 5090.773443

import os
import pandas as pd

# First sample
path_to_files = '/home/Desktop/computed_2d_blaze/'
lst = []
for filen in [x for x in os.listdir(path_to_files) if '.ares' in x]:
    df = pd.read_table(path_to_files+filen, skiprows=0, usecols=(0,1,2,3,4,8),
                       names=['wave','num','stlines','fwhm','EWs','MeasredWave'],
                       delimiter=r'\s+')
    # Keep one row per wavelength: the one with the largest stlines value
    df = df.sort_values('stlines', ascending=False)
    df = df.drop_duplicates('wave')
    df = df.reset_index(drop=True)
    lst.append(df)

# Second sample
path_to_files1 = '/home/Desktop/computed_1d/'
lst1 = []
for filen in [x for x in os.listdir(path_to_files1) if '.ares' in x]:
    df1 = pd.read_table(path_to_files1+filen, skiprows=0, usecols=(0,1,2,3,4,8),
                        names=['wave','num','stlines','fwhm','EWs','MeasredWave'],
                        delimiter=r'\s+')
    df1 = df1.sort_values('stlines', ascending=False)
    df1 = df1.drop_duplicates('wave')
    df1 = df1.reset_index(drop=True)
    lst1.append(df1)

2 Answers

User

#1 · Edited 2024-10-03 23:24:48

You should not use apply. Just use a boolean mask:

mask = df['waves'].between(lower_outlier, upper_outlier)
df[mask].plot(x='waves', y='stlines')
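For illustration, here is a minimal self-contained sketch of the same boolean-mask idea. The rows are taken from the sample data in the question, whose wavelength column is named `wave`, so the sketch uses that name. The bounds `lower_outlier` and `upper_outlier` are assumed values chosen only for the example; the answer does not say how they are computed.

```python
import pandas as pd

# Wavelength and stlines values copied from the question's sample rows.
df = pd.DataFrame({
    'wave': [5074.760, 5083.340, 5088.150, 5090.780, 5113.440],
    'stlines': [0.68155, 0.74771, 0.28689, 0.61464, 0.25846],
})

# Assumed bounds, picked only for illustration.
lower_outlier, upper_outlier = 5080.0, 5100.0

# between() returns a boolean Series; both ends are inclusive by default.
mask = df['wave'].between(lower_outlier, upper_outlier)

inliers = df[mask]    # rows whose wavelength lies within the bounds
outliers = df[~mask]  # rows outside the bounds
```

Because the mask is computed on the whole column at once, no Python-level function is called per row, which is the reason to prefer it over apply.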

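User

#2 · Edited 2024-10-03 23:24:48

One solution that comes to mind is to write a function that flags outliers based on upper and lower bounds, then slice the DataFrames by the resulting index: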

df1 = pd.DataFrame({'wave': [1, 2, 3, 4, 5]})

df2 = pd.DataFrame({'stlines': [0.1, 0.2, 0.3, 0.4, 0.5]})

def outlier(value, upper, lower):
    """
    Return True if value lies within [lower, upper], i.e. is not an outlier
    """
    # Check if input value is within bounds
    in_bounds = (value <= upper) and (value >= lower)

    return in_bounds

# Boolean mask over the wave column of df1: True where the value is within bounds
outlier_index = df1.wave.apply(lambda x: outlier(x, 4, 1))

# Return df2 without the rows at outlier positions
df2[outlier_index]

# Return df1 without the rows at outlier positions
df1[outlier_index]
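As a rough check, the sketch below compares the apply-based mask with a vectorized `between` call on the same toy frames. It assumes, as the code above does, that both frames share the same default integer index, since a boolean mask built from one frame is aligned to the other by index label.

```python
import pandas as pd

df1 = pd.DataFrame({'wave': [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'stlines': [0.1, 0.2, 0.3, 0.4, 0.5]})

def outlier(value, upper, lower):
    # True when the value lies within [lower, upper]
    return (value <= upper) and (value >= lower)

# Row-by-row mask, as in the answer above
mask_apply = df1.wave.apply(lambda x: outlier(x, 4, 1))

# Vectorized mask with the same inclusive bounds
mask_vectorized = df1.wave.between(1, 4)

# Both approaches flag the same rows
same = mask_apply.tolist() == mask_vectorized.tolist()

# The mask from df1 selects the matching rows of df2 because the indexes agree
filtered = df2[mask_vectorized]
```

If the two frames had different indexes (for example after sorting or dropping rows in only one of them), the mask would need to be realigned first, or both columns kept in a single frame.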
