Finding the right cutoff value

from scipy.stats.mstats import trimboth
import numpy as np

theList = np.log10(1+np.arange(.1, 100))
theMedian = np.median(theList)

# trim 15% from each end, then take half the remaining range as the cutoff
trimmedList = trimboth(theList, proportiontocut=0.15)
a = (trimmedList.max() - trimmedList.min()) * 0.5

#check how many elements fall into the range
sel = (theList > (theMedian - a)) * (theList < (theMedian + a))

print(np.sum(sel) / float(len(theList)))

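3 Answers

User

#1 · edited 2024-09-24 22:30:21

You first need to symmetrize the distribution by folding every value below the median over to the right. You can then use the standard scipy.stats functions on this one-sided distribution: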

from scipy.stats import scoreatpercentile
import numpy as np

theList = np.log10(1+np.arange(.1, 100))
theMedian = np.median(theList)

oneSidedList = theList.copy()           # real copy; theList[:] would only be a view of the NumPy array
# fold over to the right all values left of the median
oneSidedList[theList < theMedian] = 2*theMedian - theList[theList < theMedian]

# find the 70th centile of the one-sided distribution
a = scoreatpercentile(oneSidedList, 70) - theMedian

#check how many elements fall into the range
sel = (theList > (theMedian - a)) * (theList < (theMedian + a))

print(np.sum(sel) / float(len(theList)))

This gives a result of 0.7, as required.
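As a cross-check, folding about the median amounts to working with absolute deviations from the median. The following is a minimal sketch under that reading, not the answerer's code: it swaps scoreatpercentile for np.percentile, and the names abs_dev, inside and fraction are introduced here purely for illustration.

```python
import numpy as np

the_list = np.log10(1 + np.arange(0.1, 100))
the_median = np.median(the_list)

# Distance of every element from the median: the folded distribution,
# shifted so that the median sits at zero.
abs_dev = np.abs(the_list - the_median)

# Half-width a such that about 70% of the elements lie within median +/- a.
a = np.percentile(abs_dev, 70)

# Check how many elements fall into the symmetric range.
inside = (the_list > the_median - a) & (the_list < the_median + a)
fraction = inside.mean()
print(fraction)
```

Because the original list is never modified here, the final check runs against the unfolded data.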

User

#2 · edited 2024-09-24 22:30:21

What you want is scipy.stats.mstats.trimboth. Set proportiontocut=0.15. After trimming, take (max-min)/2.

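User

#3 · edited 2024-09-24 22:30:21

Let me restate the problem a bit. You know the length of the list and the fraction of its numbers you want to cover. Given that, you can determine the difference between the first and last indices in the list that bound the desired range. The goal is then to find the indices that minimize a cost function expressing the desired symmetry about the median.

Let the smaller index be n1 and the larger index be n2; they are not independent. The list values at those indices are x[n1] = m-b and x[n2] = m+c, where m is the median. You now want to choose n1 (and therefore n2) so that b and c are as close as possible. That happens when (b - c)**2 is minimal, which is easy to find with numpy.argmin. Following the example in the question, here is an interactive session that demonstrates the approach: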

$ python
Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> theList = np.log10(1+np.arange(.1, 100))
>>> theMedian = np.median(theList)
>>> listHead = theList[0:30]
>>> listTail = theList[-30:]
>>> b = np.abs(listHead - theMedian)
>>> c = np.abs(listTail - theMedian)
>>> squaredDiff = (b - c) ** 2
>>> np.argmin(squaredDiff)
25
>>> listHead[25] - theMedian, listTail[25] - theMedian
(-0.2874888056626983, 0.27859407466756614)
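The session above hard-codes the slice length 30 for a 100-element list and a fraction of 0.7. A minimal sketch of the same search for a general list and fraction might look like the following; the function name symmetric_window and its return values are hypothetical, and it assumes 0 < fraction < 1.

```python
import numpy as np

def symmetric_window(values, fraction):
    """Hypothetical helper: pick n1 (and n2 = n1 + k) so that x[n1] and x[n2]
    are as symmetric about the median as possible."""
    x = np.sort(np.asarray(values, dtype=float))
    m = np.median(x)
    k = int(round(fraction * len(x)))  # index offset between n1 and n2 (70 in the session)
    n_starts = len(x) - k              # number of candidate values for n1 (30 in the session)
    head = x[:n_starts]                # candidates for x[n1]
    tail = x[k:]                       # matching x[n2] for each candidate
    b = np.abs(head - m)               # distance of the lower end from the median
    c = np.abs(tail - m)               # distance of the upper end from the median
    n1 = int(np.argmin((b - c) ** 2))  # most symmetric pair
    return n1, n1 + k, b[n1], c[n1]

the_list = np.log10(1 + np.arange(0.1, 100))
n1, n2, b, c = symmetric_window(the_list, 0.7)
print(n1, n2, b, c)
```

For the example list this picks the same pair of elements as the session, with n2 reported as an index into the full list rather than into the tail slice.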
