How to weight certain parameters (columns) in pandas so that the output is chosen by weight

Posted 2024-09-28 01:30:28


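In my dataset, all parameters are qualitative (categorical) variables.

When all the parameters (columns) of a row have different values, we fall back on a weight assigned to each variable: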

the Irrigated column is given a weight of 40%,
the Soil column is given a weight of 35%, and
the Seed Variety column is given a weight of 25%.

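  • So when all parameters give different values, the output is the value of the Irrigated column, because it carries the highest weight (40%).

  • If a value appears two or more times in a row, the output is that repeated value.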

Any suggestions would be helpful.

>>> import pandas as pd
>>> data = {'District':  ['Ahmednagar', 'Aurangabad','Jalna','Buldhana','Amravati','Nashik','Pune','Palghar'],
        'Soil': ['B','A','D','D','A','B','D','A' ],
    'Irrigated': ['B','B','D','A','A','B','C','A' ],
    'Seed Variety': ['A','B','B','B','A','A','A','D']
        }
>>> data
{'District': ['Ahmednagar', 'Aurangabad', 'Jalna', 'Buldhana', 'Amravati', 'Nashik', 'Pune', 'Palghar'], 'Soil': ['B', 'A', 'D', 'D', 'A', 'B', 'D', 'A'], 'Seed Variety': ['A', 'B', 'B', 'B', 'A', 'A', 'A', 'D'], 'Irrigated': ['B', 'B', 'D', 'A', 'A', 'B', 'C', 'A']}
>>> df = pd.DataFrame (data, columns = ['District','Soil','Irrigated','Seed Variety'])
>>> df
     District  ... Seed Variety
0  Ahmednagar  ...            A
1  Aurangabad  ...            B
2       Jalna  ...            B
3    Buldhana  ...            B
4    Amravati  ...            A
5      Nashik  ...            A
6        Pune  ...            A
7     Palghar  ...            D

[8 rows x 4 columns]

1 answer
User
#1 · Posted 2024-09-28 01:30:28

"So when all parameters give different values, the output is the value of the Irrigated column [...] if a value is repeated two or more times, the output is that repeated value."

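This means the output differs from "Irrigated" only when the other two columns, "Soil" and "Seed Variety", hold the same value.

So I first fill "Output" to match "Irrigated", and then, in a follow-up step, set it to the value of the other columns in the rows where those two columns agree: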

df['Output'] = df['Irrigated']
df.loc[df['Soil'] == df['Seed Variety'], 'Output'] = df['Soil']

That should do it.
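
For reference, the two lines above can be run end to end on the sample data from the question. This is a minimal sketch that only rebuilds the DataFrame and applies the same logic; the mask name `agree` is introduced here for readability and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'District': ['Ahmednagar', 'Aurangabad', 'Jalna', 'Buldhana',
                 'Amravati', 'Nashik', 'Pune', 'Palghar'],
    'Soil': ['B', 'A', 'D', 'D', 'A', 'B', 'D', 'A'],
    'Irrigated': ['B', 'B', 'D', 'A', 'A', 'B', 'C', 'A'],
    'Seed Variety': ['A', 'B', 'B', 'B', 'A', 'A', 'A', 'D'],
})

# Default: take the value of the highest-weighted column (Irrigated, 40%).
df['Output'] = df['Irrigated']

# Override only in rows where the other two columns agree with each other.
# `agree` is an illustrative name for the boolean mask.
agree = df['Soil'] == df['Seed Variety']
df.loc[agree, 'Output'] = df.loc[agree, 'Soil']

print(df[['District', 'Soil', 'Irrigated', 'Seed Variety', 'Output']])
```

In this particular sample, "Soil" and "Seed Variety" agree only for Amravati, where "Irrigated" has the same value too, so "Output" ends up equal to "Irrigated" in every row.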

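Later, if you want to compute the total percentage, you can compare the resulting "Output" with each source column and multiply each match by that column's weight: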

df['Output(%)'] = (
    (df['Output'] == df['Soil']) * 35.0 +
    (df['Output'] == df['Irrigated']) * 40.0 +
    (df['Output'] == df['Seed Variety']) * 25.0
)
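
The same rule can also be written as a weighted vote: for each row, add up the weights of the columns that share a value and pick the value with the largest total. The sketch below is an alternative formulation, not the method used in this answer; the `weights` dict and the `weighted_vote` helper are hypothetical names introduced for illustration, and the weights are the 40/35/25 split from the question.

```python
import pandas as pd

# Weights taken from the question; the dict itself is an illustrative construct.
weights = {'Irrigated': 40.0, 'Soil': 35.0, 'Seed Variety': 25.0}

# Sample data copied from the question.
df = pd.DataFrame({
    'District': ['Ahmednagar', 'Aurangabad', 'Jalna', 'Buldhana',
                 'Amravati', 'Nashik', 'Pune', 'Palghar'],
    'Soil': ['B', 'A', 'D', 'D', 'A', 'B', 'D', 'A'],
    'Irrigated': ['B', 'B', 'D', 'A', 'A', 'B', 'C', 'A'],
    'Seed Variety': ['A', 'B', 'B', 'B', 'A', 'A', 'A', 'D'],
})


def weighted_vote(row):
    """Return (winning value, its summed weight) for one row."""
    totals = {}
    for col, weight in weights.items():
        totals[row[col]] = totals.get(row[col], 0.0) + weight
    winner = max(totals, key=totals.get)
    return winner, totals[winner]


# Apply the vote row by row and store both results as new columns.
votes = [weighted_vote(row) for _, row in df.iterrows()]
df['Output'] = [value for value, _ in votes]
df['Output(%)'] = [total for _, total in votes]

print(df[['District', 'Output', 'Output(%)']])
```

With these three weights no two values in a row can tie, so on the sample data the vote gives the same "Output" and "Output(%)" as the two-step approach above. The row-wise loop is slower than the vectorized comparison, but it extends to more columns or different weights without changes to the logic.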
