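<p>My question seems simple, but I can't find an answer.
I am trying to use <code>sum()</code> in pandas to count how many men and women attempted suicide in Albania, based on a dataset from Kaggle.</p>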
<p>Code:</p>
<pre><code>import pandas as pd
pd.options.mode.chained_assignment = None
#Create a dataframe
suicide = pd.read_csv('who_suicide_statistics.csv', header=None)
#Rename column names because it was int
suicide = suicide.rename(columns={0: 'country', 1: 'year', 2: 'sex', 3: 'age', 4:'suicides_no', 5: 'population'})
#Delete first row because it was a duplicate with column names
suicide = suicide.iloc[1: , :]
#Filter values only with Albania
albania_suicide = suicide.loc[(suicide['country'] == 'Albania')]
#Delete rows with Nan values
albania_suicide.dropna(subset=['suicides_no'], inplace=True)
# Is it more women or men who attempts suicide?
print(albania_suicide.loc[albania_suicide['sex'] == 'female', 'suicides_no'].sum())
</code></pre>
<p>Output:</p>
<pre><code>"144600185403252701074201010771206420121378222161091122116760232109160191351646350029412262147151411491309620111733300000000000013814093209191750000006612272"
</code></pre>
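<p>The numbers are printed one after another, as if they were being treated as strings and concatenated.
The result should be the arithmetic sum 14 + 4 + 6 + 0 + 0 + 18 and so on.</p>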
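<p>A minimal sketch that reproduces the symptom, under the assumption that <code>header=None</code> is the cause: the header line is read as a data row, so the <code>suicides_no</code> column holds text rather than integers and <code>sum()</code> concatenates it. The three sample rows are hypothetical stand-ins for the real file, and <code>pd.to_numeric</code> is shown only as one possible conversion, not as a confirmed fix for the full dataset.</p>

```python
import io
import pandas as pd

# Hypothetical sample in the same column layout as the CSV (not real data).
sample = (
    "country,year,sex,age,suicides_no,population\n"
    "Albania,1987,female,15-24 years,14,289700\n"
    "Albania,1987,female,25-34 years,4,257200\n"
    "Albania,1987,female,35-54 years,6,278800\n"
)

# Same loading steps as in the question: the header line is read as data,
# then dropped, which leaves the remaining values stored as text.
df = pd.read_csv(io.StringIO(sample), header=None).iloc[1:, :]
col = df[4]

# Summing a text column joins the strings instead of adding numbers.
concatenated = col.sum()

# Converting first gives an arithmetic sum; errors="coerce" turns
# unparseable entries into NaN, which sum() skips by default.
total = pd.to_numeric(col, errors="coerce").sum()

print(concatenated)
print(total)
```

<p>If this assumption holds, reading the file without <code>header=None</code> may also avoid the problem, since pandas would then take the column names from the first line and infer a numeric type for the column.</p>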