Creating a new variable from a Series with duplicate entries

    Country.Name  Indicator.Name          2004          2005
0          World             GDP  5.590000e+13  5.810000e+13
1          World          Health  5.590000e+13  5.810000e+13
086     Zimbabwe  GDP per capita  8.681564e+02  8.082944e+02
089     Zimbabwe      Population  1.277751e+07  1.294003e+07

NameError                                 Traceback (most recent call last)
<ipython-input-37-d817ea1522fc> in <module>()
----> 1 for i in series(gdp1['Country.Name']):
      2     gdp1['Military Spending'] = 100 / gdp1['Military percent of GDP'] * gdp1['GDP']

NameError: name 'series' is not defined

1 answer

Anonymous user

Answer 1 · posted 2024-09-30 18:26:17

Suppose you have the following input DataFrame (note that the Military percent of GDP indicator does not appear in the sample data you posted):

>>> gdp
  Country.Name           Indicator.Name          2004          2005
0        World                      GDP  5.590000e+13  5.810000e+13
1        World  Military percent of GDP  2.100000e+00  2.300000e+00
2     Zimbabwe                      GDP  1.628900e+10  1.700000e+10
3     Zimbabwe  Military percent of GDP  2.000000e+00  2.100000e+00

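You can then build two helper DataFrames, df_gdp and df_mpgdp, holding the 2004 and 2005 values for GDP and for Military percent of GDP respectively. From these you create df_msp, which holds a new Indicator.Name called Military Spending, and finally append it to the original DataFrame. Note that reset_index is needed in places so that the calculation lines up on the expected index.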
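To see why reset_index matters here, a minimal sketch with made-up numbers (the Series names and values are hypothetical, chosen only to mimic rows filtered out of a larger frame): pandas arithmetic pairs values by index label, not by position, so two filtered slices that share no labels produce only NaN.

```python
import pandas as pd

# Hypothetical toy values; the index labels mimic rows filtered out of a larger frame.
gdp_vals = pd.Series([100.0, 200.0], index=[0, 2])
pct_vals = pd.Series([2.0, 4.0], index=[1, 3])

# Arithmetic aligns on index labels; no label is shared, so every result is NaN.
misaligned = gdp_vals * pct_vals

# After reset_index(drop=True) both Series are labelled 0..n-1 and pair up by position.
aligned = gdp_vals.reset_index(drop=True) * pct_vals.reset_index(drop=True)

print(misaligned)
print(aligned)
```

Pairing by position is only safe when both slices list the countries in the same order.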

The following code does what you want. It differs from the formula in your question in one respect: military spending is the percentage applied to GDP, so it is computed as percent / 100 * GDP rather than 100 / percent * GDP, which would give a figure larger than GDP itself. It also uses pd.concat, because DataFrame.append was removed in pandas 2.0.

import pandas as pd

# Sample input in long format: one row per (country, indicator) pair.
gdp = pd.DataFrame(
    [
        ["World", "GDP", 5.590000e+13, 5.810000e+13],
        ["World", "Military percent of GDP", 2.1, 2.3],
        ["Zimbabwe", "GDP", 16289e6, 17000e6],
        ["Zimbabwe", "Military percent of GDP", 2, 2.1],
    ],
    columns=["Country.Name", "Indicator.Name", "2004", "2005"],
)

# Helper frames, one per indicator, re-indexed 0..n-1 so rows pair up by position.
df_gdp = gdp[gdp["Indicator.Name"] == "GDP"].reset_index(drop=True)
df_mpgdp = gdp[gdp["Indicator.Name"] == "Military percent of GDP"].reset_index(drop=True)

# New indicator rows: spending = percent / 100 * GDP, for each year column.
df_msp = pd.DataFrame()
df_msp["Country.Name"] = df_gdp["Country.Name"]
df_msp["Indicator.Name"] = "Military Spending"
for year in ["2004", "2005"]:
    df_msp[year] = df_mpgdp[year] / 100 * df_gdp[year]

# Append the new rows, then sort so each country's indicators sit together.
gdp_out = pd.concat([gdp, df_msp])
gdp_out = gdp_out.sort_values(["Country.Name", "Indicator.Name"])
gdp_out = gdp_out.reset_index(drop=True)

The resulting DataFrame is:

>>> gdp_out
  Country.Name           Indicator.Name          2004          2005
0        World                      GDP  5.590000e+13  5.810000e+13
1        World        Military Spending  1.173900e+12  1.336300e+12
2        World  Military percent of GDP  2.100000e+00  2.300000e+00
3     Zimbabwe                      GDP  1.628900e+10  1.700000e+10
4     Zimbabwe        Military Spending  3.257800e+08  3.570000e+08
5     Zimbabwe  Military percent of GDP  2.000000e+00  2.100000e+00
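The code above pairs the GDP rows with the percentage rows by position, which relies on both helper frames listing the countries in the same order. As a hedged alternative, the sketch below pairs them by Country.Name with merge instead. The names df_pct and paired and the _gdp/_pct suffixes are illustrative choices, and the formula is assumed to be percent / 100 * GDP.

```python
import pandas as pd

gdp = pd.DataFrame(
    [
        ["World", "GDP", 5.59e13, 5.81e13],
        ["World", "Military percent of GDP", 2.1, 2.3],
        ["Zimbabwe", "GDP", 16289e6, 17000e6],
        ["Zimbabwe", "Military percent of GDP", 2.0, 2.1],
    ],
    columns=["Country.Name", "Indicator.Name", "2004", "2005"],
)
years = ["2004", "2005"]

# One slice per indicator, keeping only the key column and the year columns.
df_gdp = gdp.loc[gdp["Indicator.Name"] == "GDP", ["Country.Name"] + years]
df_pct = gdp.loc[
    gdp["Indicator.Name"] == "Military percent of GDP", ["Country.Name"] + years
]

# Pair each country's GDP row with its percentage row by name, not by position.
paired = df_gdp.merge(df_pct, on="Country.Name", suffixes=("_gdp", "_pct"))

df_msp = paired[["Country.Name"]].copy()
df_msp["Indicator.Name"] = "Military Spending"
for year in years:
    # Assumed formula: spending = percent / 100 * GDP.
    df_msp[year] = paired[f"{year}_pct"] / 100 * paired[f"{year}_gdp"]

print(df_msp)
```

With the default inner join, a country that lacks either indicator is dropped from paired rather than matched to the wrong row.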
