Find the combination with the smallest standard deviation

Posted 2024-09-28 22:59:25


Choosing 108 out of 110 is fast, but choosing 108 out of 522 this way takes far too long. Any help?

from itertools import combinations
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#read df
df = pd.read_csv('capacity.csv', header=None)
#plot df
df[1].hist()
plt.ylabel('Quantity')
plt.xlabel('Capacity / Ah')
plt.savefig('capacity_hist.png')
#numpy.ndarray is quicker than pandas.DataFrame here.
nda = df[1].values
# 110 choose 108 and find the best combination
nda = nda[:110]  # keep the first 110 values
combin = combinations(nda, 108)
the_best_list = []
num = 0
tmp = nda.std()
for i in combin:
    num += 1
    current = np.std(i)  # compute the standard deviation once per combination
    if current < tmp:
        tmp = current
        the_best_list = i
#shows the result
print(num)
print(tmp)
print(the_best_list)

The shape of df is (522, 2).
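For a sense of scale, the number of candidate subsets in each case can be counted directly. This is a minimal sketch using only `math.comb` from the standard library (Python 3.8 or later); the variable names are illustrative.

```python
import math

# Number of ways to choose 108 items out of 110 and out of 522.
small = math.comb(110, 108)
large = math.comb(522, 108)

print(small)            # 5995
print(len(str(large)))  # number of decimal digits in the larger count
```

The first count is small enough to enumerate in a loop; the second is larger than 10**100, so an exhaustive loop over `combinations` cannot finish.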

The histogram of df is as follows: a normal distribution?

The file capacity.csv looks like this:

1,38.913 2,38.904 3,38.925 4,38.872 6,38.876 7,38.968 8,38.896 9,38.893 10,38.915 11,38.974 12,38.885 13,38.982 14,38.944 16,38.844 18,38.914 19,38.913 20,38.824 21,38.926 22,38.964 23,37.295 24,38.807 25,38.908 27,38.927 28,38.83 30,38.943 32,39.013 33,38.751 36,38.92 37,38.869 38,38.909 39,38.9 40,38.892 41,38.9 42,38.951 43,38.726 44,38.937 45,38.757 46,38.867 47,38.882 48,38.952 49,38.918 50,38.875 51,38.998 52,38.888 54,37.822 56,38.982 57,38.922 58,38.934 59,38.938 60,39.035 61,38.955 63,38.935 64,38.946 66,38.983 67,38.983 69,38.886 71,38.884 72,38.964 73,39.06 74,38.869 75,38.926 76,38.972 78,38.851 79,38.989 80,38.902 81,38.998 82,38.897 83,38.98 85,37.939 86,38.947 87,38.617 89,38.981 90,38.957 91,38.851 92,38.978 93,38.831 94,39.001 95,38.942 96,39.003 97,38.978 98,38.915 99,38.872 100,38.977 101,38.932 102,38.583 103,38.966 104,38.935 105,38.906 106,39.004 107,38.989 109,38.852 110,38.925 111,38.183 113,38.896 114,38.979 116,38.914 117,38.666 119,38.9 120,38.952 121,38.806 122,37.957 123,38.922 124,38.844 125,38.786 126,38.95 128,38.875 129,38.954 130,38.912 131,38.93 132,37.785 133,38.883 134,38.911 135,38.859 136,38.802 137,38.909 138,38.892 139,37.872 140,38.897 141,38.985 142,39 143,38.916 144,38.902 145,38.906 146,37.863 147,38.905 148,38.733 149,38.9 150,38.851 151,38.9 152,38.855 153,38.931 155,38.924 156,38.864 157,38.869 158,38.943 159,38.978 160,39.018 161,38.992 162,38.654 163,38.95 164,38.887 165,38.966 166,38.98 167,38.862 168,38.96 170,38.893 171,38.931 172,38.894 173,38.985 174,38.941 175,38.92 176,38.911 178,38.952 179,38.711 183,38.945 184,38.893 185,38.882 186,38.807 187,38.968 188,38.958 189,38.88 190,38.937 191,38.899 192,38.922 193,36.259 194,38.901 195,38.946 196,38.971 197,38.916 198,38.968 201,38.888 203,38.872 204,38.815 205,38.861 206,38.909 207,39.023 208,38.832 209,38.959 210,38.964 211,38.91 212,38.952 213,39.033 214,38.987 215,38.942 216,38.956 217,38.916 218,38.842 219,37.471 220,38.931 221,38.833 222,38.952 223,38.903
224,38.95 225,38.921 226,38.904 227,39.018 228,38.936 230,38.974 231,38.909 232,38.911 233,38.964 235,37.851 236,38.919 237,38.955 238,39.091 239,38.955 241,38.995 242,39.053 243,39.014 244,39.047 246,39.05 247,39.039 248,39.106 249,38.976 250,38.998 251,38.997 252,38.978 253,39.009 254,39.06 256,39.051 257,39.081 258,39.005 259,39.067 260,38.988 261,39.015 262,39.007 264,36.393 266,39.023 269,38.967 270,39.053 271,39.084 272,38.999 273,39.043 274,39.079 275,38.985 276,39.074 278,39.009 279,39.041 280,39.011 281,39.157 282,39.156 283,41.513 284,38.983 285,39.057 286,38.99 287,39.202 289,38.918 290,39.119 291,38.798 292,39.046 293,39.053 294,38.809 295,39.006 296,38.809 297,38.946 298,38.992 299,38.934 300,39.008 301,39.038 302,39.084 303,39.175 304,39.091 305,38.959 306,39.086 307,39.094 308,38.636 310,39.027 311,38.998 313,39.041 314,39.013 315,39.222 316,39.02 317,38.778 318,38.851 319,39.023 320,39.152 321,39.024 322,38.895 323,38.311 324,38.962 325,38.886 326,39.058 327,39.049 328,38.726 329,39.187 330,39.041 332,39.016 333,38.968 334,38.759 335,39.073 336,38.869 337,38.945 338,38.91 339,39.006 340,39.212 341,39.134 343,39.06 344,38.966 345,39.154 346,38.901 347,38.808 348,38.69 349,38.904 350,39.197 351,39.032 352,38.927 353,39.04 355,39.001 356,38.988 357,38.874 358,38.824 359,37.72 360,38.87 361,37.871 362,38.676 363,39.026 364,37.98 365,37.84 366,38.88 367,39.113 368,39.124 369,39.139 370,39.127 371,38.723 372,38.985 373,39.082 374,38.616 375,39.139 377,38.916 378,38.967 379,38.907 380,39.057 381,39.037 382,38.995 383,38.754 384,38.701 385,38.687 387,39.008 389,39.221 390,38.949 391,38.017 392,38.97 393,38.892 394,38.538 396,38.449 397,39.013 400,38.784 401,39.032 402,38.889 403,38.813 404,38.928 405,38.965 406,39.122 407,38.999 408,38.92 409,38.973 410,38.991 411,39.002 412,38.861 413,38.934 414,38.93 415,38.856 416,39.03 417,38.929 418,38.628 419,38.807 420,38.956 421,39.065 422,39.008 423,38.914 424,38.951 425,38.898 426,38.891 427,39.356 428,38.968 
429,39.026 430,38.925 431,39.212 432,39.183 433,39.049 434,39.079 435,39.091 436,39.071 437,38.724 438,38.879 439,38.987 440,39.019 441,38.945 442,39.182 443,39.125 444,39.138 445,39.078 446,38.825 447,39.001 448,39.011 449,39.084 450,39.024 451,39.026 452,39.102 453,39.102 454,39.317 455,38.936 457,38.969 458,38.936 459,38.536 460,38.852 461,39.107 462,38.637 463,38.867 464,37.063 465,38.035 466,39.064 467,37.437 468,38.874 469,38.475 470,38.836 471,38.971 472,38.827 473,38.908 474,38.567 475,38.749 476,37.969 477,38.855 478,38.348 479,38.876 481,38.769 482,38.675 483,38.891 484,38.649 485,38.919 486,38.937 487,38.922 488,38.842 490,38.813 491,38.83 492,38.809 493,38.739 494,38.811 495,39.013 496,39.08 497,38.892 498,38.868 499,38.879 501,38.87 502,38.848 503,38.665 504,39.06 505,38.696 506,38.948 507,38.792 508,38.896 509,38.855 510,38.963 511,38.926 513,38.674 514,38.741 515,38.793 516,38.851 517,38.964 518,38.83 519,38.846 520,39.073 522,38.81 523,37.493 524,38.948 525,38.704 526,37.456 527,38.716 529,38.941 530,38.828 531,38.909 532,38.829 534,38.795 535,38.757 537,38.699 538,38.982 539,38.983 540,38.932 541,38.808 542,38.988 543,38.933 544,39.06 545,39.134 546,38.651 547,38.839 548,39.132 549,38.911 550,38.503 551,38.785 552,38.763 554,38.671 555,38.51 556,38.936 557,38.559 558,38.701 559,38.693 560,38.562 561,38.889 562,38.91 563,38.441 564,38.701 566,38.772 568,38.681 569,38.509 570,38.785 571,38.799 572,38.79 573,38.833 575,38.656 576,38.677 577,38.854 578,38.439 579,38.849 582,38.094 583,38.871 584,38.502 585,38.818 586,38.627 587,38.685 588,38.812 589,38.752 590,38.601

I just want to find a combination with the smallest standard deviation.

Thanks for your help.


1 answer
User
#1 · Posted 2024-09-28 22:59:25

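Here is a partial solution that works for normally distributed data. The idea is that the subset of samples closest to the mean will have the smallest standard deviation.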
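Why this is only a partial solution can be seen on a small made-up example. The sketch below compares the closest-to-the-mean heuristic with an exhaustive search; the data and the helper names `closest_to_mean` and `brute_force` are hypothetical, and only the standard library is used (`statistics.pstdev` is the population standard deviation, the same quantity `np.std` returns by default).

```python
from itertools import combinations
from statistics import mean, pstdev


def closest_to_mean(values, k):
    """Heuristic: the k values nearest the global mean."""
    centre = mean(values)
    return sorted(values, key=lambda v: abs(v - centre))[:k]


def brute_force(values, k):
    """Exact answer by trying every k-subset (small inputs only)."""
    return min(combinations(values, k), key=pstdev)


# Hypothetical data: two tight clusters and one value near the global mean.
data = [1.0, 1.1, 1.2, 5.0, 9.0, 9.05, 9.1]

heuristic = closest_to_mean(data, 3)
exact = brute_force(data, 3)
print(sorted(heuristic), pstdev(heuristic))
print(sorted(exact), pstdev(exact))
```

On this data the heuristic picks one value from each cluster plus the middle value, while the exhaustive search picks the three values of the tighter cluster. With a single bell-shaped cluster the two are much more likely to agree.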

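If your data is not uniform (for example, it falls into several clusters), you can run k-means clustering first, then take the points closest to each cluster mean and see which of those subsets has the smallest standard deviation.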
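The clustering step could look roughly like the following. This is a sketch under assumptions, not a definitive implementation: it uses a plain one-dimensional Lloyd's iteration written with the standard library instead of a library k-means, it starts from evenly spaced sorted values instead of random ones, and `kmeans_1d` and `best_subset_near_centres` are hypothetical helper names.

```python
from statistics import mean, pstdev


def kmeans_1d(values, n_clusters, n_iter=50):
    """Plain Lloyd's algorithm on 1-D data; returns the cluster centres."""
    ordered = sorted(values)
    # Assumption: start from evenly spaced order statistics, not random picks.
    centres = [ordered[(2 * j + 1) * len(ordered) // (2 * n_clusters)]
               for j in range(n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        new_centres = [mean(g) if g else c for g, c in zip(groups, centres)]
        if new_centres == centres:
            break
        centres = new_centres
    return centres


def best_subset_near_centres(values, k, n_clusters):
    """For each centre take the k nearest values; keep the tightest subset."""
    candidates = [sorted(values, key=lambda v: abs(v - c))[:k]
                  for c in kmeans_1d(values, n_clusters)]
    return min(candidates, key=pstdev)


data = [1.0, 1.1, 1.2, 5.0, 9.0, 9.05, 9.1]  # hypothetical two-cluster data
print(sorted(best_subset_near_centres(data, 3, n_clusters=2)))
```

The number of clusters is an input here; choosing it is a separate question that this sketch does not address.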

Note that the sample data you provided has only 38 values, all on a single row, so I changed the way nda is created from the DataFrame.

import pandas as pd
import numpy as np

df = pd.read_csv('randint35-39.txt', header=None)
nda = df.values[0, :]

nda = nda[:113]
print(nda.shape)

# sort values by their distance from the global mean
normalised_vals = abs(nda-np.mean(nda))
indices = np.argsort(normalised_vals)
print(indices)

num_in_combo = 12

the_best = nda[indices[:num_in_combo]]

print("Global mean = {:10.3f} Global std_dev = {:10.3f}".format(np.mean(nda), np.std(nda)))
print("Subset mean = {:10.3f} Subset std_dev = {:10.3f}".format(np.mean(the_best), np.std(the_best)))

print(the_best)

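Edit: ways to improve this answer. If you suspect large outliers, use the median instead of the mean. Use the output of this answer to create a binary membership mask, create a population of masks with slight random variations, then use a genetic algorithm to find the best mask (with the standard deviation of the included samples as the fitness function). If the range of the samples is small, do a brute-force search along that range: at each point find the N samples closest to it (as above), and see which point along the range gives the set with the lowest standard deviation.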
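The brute-force search along the range can be sketched as a sliding window over the sorted values. For one-dimensional data, the N values nearest any given point are consecutive in sorted order, so checking every window of N consecutive sorted values covers every subset that search could produce. This is a minimal sketch using only the standard library; `tightest_window` is a hypothetical helper name and the data is made up.

```python
from statistics import pstdev


def tightest_window(values, k):
    """Return the k consecutive sorted values with the smallest std. dev."""
    ordered = sorted(values)
    if not 0 < k <= len(ordered):
        raise ValueError("k must be between 1 and len(values)")
    # Every candidate is a run of k neighbours in sorted order.
    windows = (ordered[i:i + k] for i in range(len(ordered) - k + 1))
    return min(windows, key=pstdev)


data = [1.0, 1.1, 1.2, 5.0, 9.0, 9.05, 9.1]  # hypothetical sample values
best = tightest_window(data, 3)
print(best, pstdev(best))
```

For the sizes in the question (108 out of 522) this examines 522 - 108 + 1 = 415 windows, so it should run almost instantly compared with enumerating combinations.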
