Check whether n consecutive data points fall below a given value

Posted 2024-10-01 02:38:22


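I'm working with a spectrum in Python and have fitted a line to it. I want code that detects whether there are 10 consecutive data points on the spectrum that fall below the fitted line. Does anyone know a simple and quick way to do this?

What I have right now is something like this: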

count = 0
for i in range(lowerbound, upperbound):
    if spectrum[i] < fittedline[i]:
        count += 1
    if count > 15:
        pass  # do whatever

If I change the first if statement to:

if spectrum[i] < fittedline[i] and spectrum[i+1] < fittedline[i+1] and ...:  # and so on

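I believe the algorithm would work, but if I want the user to enter the number of consecutive data points that must fall below the fitted line, is there a smarter way to do this automatically?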


2 Answers

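My advice is to research and use existing libraries before developing special-purpose functionality.

In this case, some very smart people have developed the numerical Python library NumPy. It is widely used in scientific projects and offers a large number of useful functions that are tested and optimized.

The following line counts the points that lie below the fitted line: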

number_of_points = (np.array(spectrum) < np.array(fittedline)).sum()

But let's go step by step:

spectrum = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fittedline = [1, 2, 10, 10, 10, 10, 10, 8, 9, 10]

# Import numerical python module
import numpy as np

# Convert your lists to numpy arrays
spectrum_array = np.array(spectrum)
fittedline_array = np.array(fittedline)

# Subtract the fitted line from the spectrum
difference = spectrum_array - fittedline_array
# >>> array([ 0,  0, -7, -6, -5, -4, -3,  0,  0,  0])

# Identify points where condition is met
condition_check_array = difference < 0.0
# >>> array([False, False,  True,  True,  True,  True,  True, False, False, False])

# Get the number of points where condition is met
number_of_points = condition_check_array.sum()
# >>> 5

# Get index of points where condition is met
index_of_points = np.where(difference < 0)
# >>> (array([2, 3, 4, 5, 6], dtype=int64),)

print(f"{number_of_points} points found at location {index_of_points[0][0]}-{index_of_points[0][-1]}!")

# Now same functionality in a simple function
def get_point_count(spectrum, fittedline):  
    return (np.array(spectrum) < np.array(fittedline)).sum()

get_point_count(spectrum, fittedline)
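
The count above includes every point below the line, consecutive or not. A minimal sketch of restricting it to runs of consecutive points, assuming a hypothetical helper `find_runs_below` (not part of NumPy) and a user-chosen `min_length`:

```python
import numpy as np

def find_runs_below(spectrum, fittedline, min_length):
    """Return inclusive (start, end) index pairs of runs in which
    spectrum < fittedline holds for at least min_length consecutive points."""
    below = np.asarray(spectrum) < np.asarray(fittedline)
    # Pad with False on both sides so every run has a detectable start and end
    padded = np.concatenate(([False], below, [False]))
    # +1 marks a False->True transition (run start), -1 a True->False one (run end)
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)  # exclusive end indices
    return [(int(s), int(e) - 1) for s, e in zip(starts, ends) if e - s >= min_length]

spectrum = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fittedline = [1, 2, 10, 10, 10, 10, 10, 8, 9, 10]
print(find_runs_below(spectrum, fittedline, 5))  # [(2, 6)]
```

An empty result means no run of the requested length exists, so `bool(find_runs_below(...))` answers the original yes/no question.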
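
Now suppose that instead of 10 points your spectrum has a very large number of them (1,000,000 in the benchmark below). Code efficiency then becomes a key consideration, and NumPy can help: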

import time

number_of_samples = 1000000
spectrum = [1] * number_of_samples
# >>> [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]
fittedline = [0] * number_of_samples
fittedline[2:7] =[2] * 5
# >>> [0, 0, 2, 2, 2, 2, 2, 0, 0, 0, ...]

# With numpy
start_time = time.time()
number_of_points = (np.array(spectrum) < np.array(fittedline)).sum()
numpy_time = time.time() - start_time
print(" - %s seconds  -" % (numpy_time))


# With ad hoc loop and ifs
start_time = time.time()
count=0
for i in range(0, len(spectrum)):
    if spectrum[i] < fittedline[i]:
        count += 1
    else: # If the current point is NOT below the threshold, reset the count
        count = 0
adhoc_time = time.time() - start_time
print(" - %s seconds  -" % (adhoc_time))

print("Ad hoc is {:3.1f}% slower".format(100 * (adhoc_time / numpy_time - 1)))

>>> - 0.20999646186828613 seconds  -
>>> - 0.28800177574157715 seconds  -
>>>Ad hoc is 37.1% slower
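
Note that the NumPy timing above includes converting the Python lists to arrays, and the loop resets its counter, so the two snippets do not compute the same quantity. A sketch of a more like-for-like measurement, assuming the standard-library `timeit` module; the helper names `count_with_numpy` and `count_with_loop` are made up for illustration, and timings will vary by machine:

```python
import timeit

import numpy as np

number_of_samples = 1_000_000
spectrum = [1] * number_of_samples
fittedline = [0] * number_of_samples
fittedline[2:7] = [2] * 5

# Convert once, outside the timed region, so only the comparison is measured
spectrum_array = np.array(spectrum)
fittedline_array = np.array(fittedline)

def count_with_numpy():
    # Vectorized comparison followed by a sum over the boolean mask
    return int((spectrum_array < fittedline_array).sum())

def count_with_loop():
    # Plain Python loop that counts the same quantity (no reset)
    count = 0
    for s, f in zip(spectrum, fittedline):
        if s < f:
            count += 1
    return count

repeats = 3
numpy_time = timeit.timeit(count_with_numpy, number=repeats) / repeats
loop_time = timeit.timeit(count_with_loop, number=repeats) / repeats
print(f"NumPy: {numpy_time:.6f} s per call")
print(f"Loop:  {loop_time:.6f} s per call")
```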

Your attempt was very close! For consecutive points, you just need to reset the count whenever a point does not meet your condition.

num_points = int(input("How many points must be less than the fitted line? "))

count = 0
for i in range(lowerbound, upperbound):
    if spectrum[i] < fittedline[i]:
        count += 1
    else: # If the current point is NOT below the threshold, reset the count
        count = 0

    if count >= num_points:
        print(f"{count} consecutive points found at location {i-count+1}-{i}!")

Let's test it:

lowerbound = 0
upperbound = 10

num_points = 5

spectrum = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fittedline = [1, 2, 10, 10, 10, 10, 10, 8, 9, 10]

Running the code with these values gives:

5 consecutive points found at location 2-6!
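
One thing to be aware of: once a run reaches `num_points`, the loop prints again for every further point in that run, so a run of 7 with `num_points = 5` produces three messages. A possible variant that reports each run once, sketched with `itertools.groupby`; the function name `consecutive_runs_below` is hypothetical:

```python
from itertools import groupby

def consecutive_runs_below(spectrum, fittedline, num_points,
                           lowerbound=0, upperbound=None):
    """Return inclusive (start, end) index pairs of runs of at least
    num_points consecutive points where spectrum < fittedline."""
    if upperbound is None:
        upperbound = len(spectrum)
    flags = (spectrum[i] < fittedline[i] for i in range(lowerbound, upperbound))
    runs = []
    index = lowerbound
    # groupby yields maximal stretches of identical True/False flags
    for is_below, group in groupby(flags):
        length = sum(1 for _ in group)
        if is_below and length >= num_points:
            runs.append((index, index + length - 1))
        index += length
    return runs

spectrum = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fittedline = [1, 2, 10, 10, 10, 10, 10, 8, 9, 10]
for start, end in consecutive_runs_below(spectrum, fittedline, 5):
    print(f"{end - start + 1} consecutive points found at location {start}-{end}!")
```

Here `num_points` can come straight from `int(input(...))` as in the loop version.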
