Adding values from a list to a database in Python, where the list's values correspond to a column in another database

Posted 2024-09-29 23:17:44


I have a database called _ucDB with 262 rows of data that looks like this:

     indexID  matchID  order userClean
         1       21      0     dirty
         1       32      1     dirty
         1      145      2     dirty
         4        5      3     clean
         4       43      4     dirty
         4      180      5     dirty
         4      184      6     dirty
         6        7      7     clean
         6       13      8     dirty
         6       93      9     dirty
         6      132     10     dirty
         6      153     11     dirty
         6      172     12     dirty
         6      196     13     dirty
         8        9     14     clean
         8      171     15     dirty
        12       13     16     clean
        12       93     17     dirty
        12      132     18     dirty
        12      153     19     dirty
        12      181     20     dirty
        12      196     21     dirty

I have a list of probabilities containing 131 values, as follows:

[0.99966824, 0.96239686, 0.99911624, 0.28857997, 0.003755328, 0.0046950155, 0.0044651907, 0.0047618235, 0.23484087, 0.962187, 3.0091974e-22, 8.1519043e-22, 0.9905359, 0.00011853044, 4.4233568e-14, 7.127504e-07, 1.864812e-17, 0.99703133, 3.17426e-16, 0.50278896, 0.55311096, 1.159942e-05, 0.53562385, 0.16331102, 1.5920829e-06, 7.9792744e-07, 5.823995e-07, 0.284861, 0.46748465, 0.46383706, 0.25041214, 0.99107516, 1.5370236e-11, 0.8576025, 0.0010161225, 0.58321816, 0.76292366, 0.00010934622, 0.72824544, 0.38391674, 0.0097409785, 4.3164547e-08, 1.7280547e-05, 0.7246928, 5.9006602e-08, 5.0709765e-05, 0.978512, 3.5036015e-12, 1.5390156e-11, 0.6185394, 0.017997066, 0.00023294186, 0.13520418, 6.6481048e-06, 0.00015752365, 7.000092e-06, 7.17631e-06, 0.07471306, 0.0015149566, 0.0012117986, 2.0014808e-12, 0.0013824155, 0.040859833, 0.14533857, 0.9288511, 4.464196e-09, 0.07058981, 0.8535712, 0.81062424, 3.734015e-05, 0.22207999, 4.903828e-21, 0.08622761, 0.041497793, 0.018137224, 0.019342968, 0.015368458, 0.41454336, 0.08082744, 0.004606869, 0.0035861062, 0.002696093, 0.8877732, 2.1275096e-06, 6.6134373e-07, 0.0008052338, 0.42654076, 0.17369142, 0.3299104, 1.858753e-18, 0.7474273, 0.14151353, 0.0010253238, 5.308538e-06, 3.493124e-06, 0.00033286438, 0.8685754, 0.7645787, 0.701938, 0.3150338, 2.9346756e-08, 7.83391e-12, 3.4358197e-10, 1.960794e-11, 8.5792645e-17, 0.9964175, 1.3673732e-14, 2.3826202e-14, 7.9876345e-14, 2.4482112e-14, 4.786919e-16, 0.15512297, 0.41997427, 0.25056317, 0.4547511, 0.29294935, 0.29281262, 1.3639165e-06, 2.9399953e-06, 0.6283169, 0.48729306, 6.892901e-06, 3.1108675e-06, 0.009136838, 2.9103248e-10, 5.8614324e-12, 0.6969736, 0.6400705, 0.0028972547, 0.27473485, 0.42833236]

Finally, I have another database column, also with 131 values, called _deMeta['evalID'], as follows:

[3, 14, 16, 27, 44, 46, 50, 61, 63, 70, 74, 81, 90, 126, 130, 154, 166, 177, 183, 197, 210, 220, 223, 226, 235, 252, 253, 261, 10, 19, 21, 25, 26, 30, 31, 32, 36, 37, 38, 41, 43, 45, 47, 49, 51, 52, 54, 55, 56, 57, 58, 59, 62, 65, 68, 73, 76, 77, 78, 79, 82, 83, 86, 88, 89, 92, 93, 94, 96, 101, 106, 107, 108, 110, 112, 116, 123, 124, 125, 127, 128, 131, 132, 134, 135, 140, 143, 144, 147, 148, 156, 157, 158, 162, 169, 172, 173, 175, 176, 181, 184, 185, 187, 191, 193, 198, 199, 201, 202, 203, 204, 205, 209, 212, 215, 216, 217, 218, 224, 225, 227, 230, 231, 233, 237, 238, 240, 245, 247, 257, 258]

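Basically, each probability reflects how likely it is that the data is clean. The "ID" of a probability is the "evalID" at the same position. That is, the first probability in the list, 0.99966824, corresponds to the first entry in the _deMeta['evalID'] column, which is 3. That value in turn corresponds to order in the _ucDB database, so it refers to the row with order 3, the fourth entry in _ucDB.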

I want to create a new database called _newucDB that adds one more column, called "Probability", holding the probability for each order.

For example, if the code correctly matches the first probability (evalID 3) to order 3, the new database should look like this:

  indexID  matchID  order userClean Probability
     1       21      0     dirty
     1       32      1     dirty
     1      145      2     dirty
     4        5      3     clean     0.99966824
     4       43      4     dirty
     4      180      5     dirty
     4      184      6     dirty
     6        7      7     clean

Note that not every row has a probability value. Rows without one should be left blank. Thanks.
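
The correspondence described above can be illustrated in plain Python. This is only a minimal sketch: it uses the first three values of each list and the first eight order values from the table, and the names probabilities, eval_ids, orders and prob_by_order are made up for the illustration.

```python
# First few values copied from the lists above; the full lists work the same way.
probabilities = [0.99966824, 0.96239686, 0.99911624]
eval_ids = [3, 14, 16]
orders = [0, 1, 2, 3, 4, 5, 6, 7]  # 'order' values of the first eight rows of _ucDB

# Pair each evalID with the probability at the same list position.
prob_by_order = dict(zip(eval_ids, probabilities))

# Look up every order value; rows whose order is not an evalID stay blank.
column = [prob_by_order.get(o, '') for o in orders]
print(column)
```

Only the row with order 3 receives a value here, which matches the expected table above.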


1 answer
Answer #1 · posted 2024-09-29 23:17:44

I assume you will read the data into Python (the code below treats new_data as a pandas DataFrame).

new_data

 indexID  matchID  order userClean
     1       21      0     dirty
     1       32      1     dirty
     1      145      2     dirty
     4        5      3     clean
     4       43      4     dirty
     4      180      5     dirty
     4      184      6     dirty
     6        7      7     clean
     6       13      8     dirty
     6       93      9     dirty
     6      132     10     dirty
     6      153     11     dirty
     6      172     12     dirty
     6      196     13     dirty
     8        9     14     clean
     8      171     15     dirty
    12       13     16     clean
    12       93     17     dirty
    12      132     18     dirty
    12      153     19     dirty
    12      181     20     dirty
    12      196     21     dirty

Code

l_prob = [0.99966824, 0.96239686, 0.99911624, 0.28857997, 0.003755328, 0.0046950155, 0.0044651907, 0.0047618235, 0.23484087, 0.962187, 3.0091974e-22, 8.1519043e-22, 0.9905359, 0.00011853044, 4.4233568e-14, 7.127504e-07, 1.864812e-17, 0.99703133, 3.17426e-16, 0.50278896, 0.55311096, 1.159942e-05, 0.53562385, 0.16331102, 1.5920829e-06, 7.9792744e-07, 5.823995e-07, 0.284861, 0.46748465, 0.46383706, 0.25041214, 0.99107516, 1.5370236e-11, 0.8576025, 0.0010161225, 0.58321816, 0.76292366, 0.00010934622, 0.72824544, 0.38391674, 0.0097409785, 4.3164547e-08, 1.7280547e-05, 0.7246928, 5.9006602e-08, 5.0709765e-05, 0.978512, 3.5036015e-12, 1.5390156e-11, 0.6185394, 0.017997066, 0.00023294186, 0.13520418, 6.6481048e-06, 0.00015752365, 7.000092e-06, 7.17631e-06, 0.07471306, 0.0015149566, 0.0012117986, 2.0014808e-12, 0.0013824155, 0.040859833, 0.14533857, 0.9288511, 4.464196e-09, 0.07058981, 0.8535712, 0.81062424, 3.734015e-05, 0.22207999, 4.903828e-21, 0.08622761, 0.041497793, 0.018137224, 0.019342968, 0.015368458, 0.41454336, 0.08082744, 0.004606869, 0.0035861062, 0.002696093, 0.8877732, 2.1275096e-06, 6.6134373e-07, 0.0008052338, 0.42654076, 0.17369142, 0.3299104, 1.858753e-18, 0.7474273, 0.14151353, 0.0010253238, 5.308538e-06, 3.493124e-06, 0.00033286438, 0.8685754, 0.7645787, 0.701938, 0.3150338, 2.9346756e-08, 7.83391e-12, 3.4358197e-10, 1.960794e-11, 8.5792645e-17, 0.9964175, 1.3673732e-14, 2.3826202e-14, 7.9876345e-14, 2.4482112e-14, 4.786919e-16, 0.15512297, 0.41997427, 0.25056317, 0.4547511, 0.29294935, 0.29281262, 1.3639165e-06, 2.9399953e-06, 0.6283169, 0.48729306, 6.892901e-06, 3.1108675e-06, 0.009136838, 2.9103248e-10, 5.8614324e-12, 0.6969736, 0.6400705, 0.0028972547, 0.27473485, 0.42833236]

eval_id = [3, 14, 16, 27, 44, 46, 50, 61, 63, 70, 74, 81, 90, 126, 130, 154, 166, 177, 183, 197, 210, 220, 223, 226, 235, 252, 253, 261, 10, 19, 21, 25, 26, 30, 31, 32, 36, 37, 38, 41, 43, 45, 47, 49, 51, 52, 54, 55, 56, 57, 58, 59, 62, 65, 68, 73, 76, 77, 78, 79, 82, 83, 86, 88, 89, 92, 93, 94, 96, 101, 106, 107, 108, 110, 112, 116, 123, 124, 125, 127, 128, 131, 132, 134, 135, 140, 143, 144, 147, 148, 156, 157, 158, 162, 169, 172, 173, 175, 176, 181, 184, 185, 187, 191, 193, 198, 199, 201, 202, 203, 204, 205, 209, 212, 215, 216, 217, 218, 224, 225, 227, 230, 231, 233, 237, 238, 240, 245, 247, 257, 258]


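# Start with a blank probability column; rows without a match stay blank.
new_data['probability'] = ''
# Object dtype lets the column hold both '' and floats.
new_data['probability'] = new_data['probability'].astype(object)

order = list(map(int, new_data['order']))
for i in range(len(eval_id)):
    try:
        # Position of the row whose order equals this evalID.
        pos = order.index(eval_id[i])
    except ValueError:
        # This evalID does not appear in the order column; skip it.
        continue
    # Assign through .loc rather than chained indexing, which may write to a copy.
    new_data.loc[new_data.index[pos], 'probability'] = l_prob[i]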

Another approach

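import pandas as pd

new_data['order'] = list(map(int, new_data['order']))
# Lookup table: one row per evalID, paired with its probability.
temp_data = pd.DataFrame()
temp_data['order'] = eval_id
temp_data['probability'] = l_prob

# A left merge keeps every row of new_data; rows without a match get NaN.
# pd.merge returns a new DataFrame, so the result has to be assigned.
new_data = pd.merge(new_data, temp_data[['order', 'probability']], how='left', on='order')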
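
A left merge leaves NaN in the rows that have no match, while the question asks for blanks. Here is a minimal sketch of one way to blank them out, assuming pandas is installed. It uses a small subset of the data above, and the name merged is made up for the illustration.

```python
import pandas as pd

# Small subset of the question's data, for illustration only.
new_data = pd.DataFrame({
    'indexID': [1, 1, 1, 4, 4],
    'matchID': [21, 32, 145, 5, 43],
    'order': [0, 1, 2, 3, 4],
    'userClean': ['dirty', 'dirty', 'dirty', 'clean', 'dirty'],
})
eval_id = [3, 14, 16]
l_prob = [0.99966824, 0.96239686, 0.99911624]

temp_data = pd.DataFrame({'order': eval_id, 'probability': l_prob})
merged = pd.merge(new_data, temp_data, how='left', on='order')

# Keep matched probabilities and replace NaN with an empty string.
# The column becomes object dtype because it now mixes floats and strings.
has_prob = merged['probability'].notna()
merged['probability'] = merged['probability'].astype(object).where(has_prob, '')
print(merged)
```

If the column is going to be used in further calculations, it is probably better to keep the NaN values and only blank them when displaying or exporting.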
