pandas KeyError: year column of a DataFrame read from a CSV file

Posted 2024-06-26 02:36:05


I have a DataFrame similar to this:

BirthYear    Sex    Area      Count
2015         W      Dhaka     6
2015         M      Dhaka     3
2015         W      Khulna    1
2015         M      Khulna    8
2014         M      Dhaka     13
2014         W      Dhaka     20
2014         M      Khulna    9
2014         W      Khulna    6
2013         W      Dhaka     11
2013         M      Dhaka     2
2013         W      Khulna    8
2013         M      Khulna    5
2012         M      Dhaka     12
2012         W      Dhaka     4
2012         W      Khulna    7
2012         M      Khulna    1

Now I want to create a bar chart in pandas that shows only the males and females born in 2015. Code:

^{pr2}$

After running it, IDLE shows the following error:

Traceback (most recent call last):
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\indexes\base.py", line 1945, in get_loc
    return self._engine.get_loc(key)
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4066)
  File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:3930)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 'BirthYear'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/sabid/Dropbox/Freelancing/data visualization python/pie.py", line 8, in <module>
    df=df.loc[df["StichtagDatJahr"]==2015]
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\internals.py", line 3290, in get
    loc = self.items.get_loc(item)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4066)
  File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:3930)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 'BirthYear'

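From this link I learned that this happens because there is some header text in front of the 'BirthYear' column name. But I don't know how to remove that header so that the code works. Is there an effective solution?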


2 answers

I think you want output like this:

Barplot

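I'm not sure about this, but I think using the pivot method is what is tripping you up. You don't need pivot, because agg_df is already essentially a pivot table. Here is the code I used to create the chart: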

import matplotlib.pyplot as plt
import pandas as pd

# I made this to approximate your CSV file.
table = {
    'BirthYear': [2015, 2015, 2015, 2015, 2014, 2014],
    'Sex': ['W', 'M', 'W', 'M', 'M', 'W'],
    'Area': ['Dhaka', 'Dhaka', 'Khulna', 'Khulna', 'Dhaka', 'Dhaka'],
    'Count': [6, 3, 1, 8, 13, 20],
}

df = pd.DataFrame(table)

# Select people born in 2015.
df = df.loc[df["BirthYear"] == 2015]

# Sum only the Count column per sex; this is basically a pivot table.
agg_df = df.groupby('Sex')[['Count']].sum()

# Make the plot. plt.show() is needed when running as a script, e.g. from IDLE.
agg_df['Count'].plot.bar()
plt.show()
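
As a quick check of the aggregation without drawing anything, the sketch below applies the same filter and group-by to the full table from the question. It assumes the data can be parsed as whitespace-separated text; the names `raw`, `born_2015` and `totals` are only illustrative.

```python
import io

import pandas as pd

# The table from the question, pasted as whitespace-separated text.
raw = """BirthYear Sex Area Count
2015 W Dhaka 6
2015 M Dhaka 3
2015 W Khulna 1
2015 M Khulna 8
2014 M Dhaka 13
2014 W Dhaka 20
2014 M Khulna 9
2014 W Khulna 6
2013 W Dhaka 11
2013 M Dhaka 2
2013 W Khulna 8
2013 M Khulna 5
2012 M Dhaka 12
2012 W Dhaka 4
2012 W Khulna 7
2012 M Khulna 1
"""

# Any run of whitespace acts as the column separator.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Keep only the rows for people born in 2015.
born_2015 = df.loc[df["BirthYear"] == 2015]

# Sum Count per sex: M is 3 + 8 = 11, W is 6 + 1 = 7.
totals = born_2015.groupby("Sex")["Count"].sum()
print(totals)
```

Calling `totals.plot.bar()` on this Series would then draw one bar per sex.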

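You can rename the columns. Note that rename expects a mapping from old names to new names, so to replace all the names at once, assign the list directly (it must have one entry per column):

df.columns = ["BirthYear", "Sex", "Area", "Count"]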
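
Before renaming, it can help to look at what the column names actually contain. The sketch below is a guess at the cause, not a diagnosis of the asker's file: it assumes the stray header text is a UTF-8 byte-order mark or padding whitespace, and it builds a small hypothetical frame with such names instead of reading the real CSV.

```python
import pandas as pd

# Hypothetical frame imitating a CSV whose header carries stray characters:
# a byte-order mark before the first name and padding around two others.
df = pd.DataFrame(
    [[2015, "W", "Dhaka", 6], [2015, "M", "Dhaka", 3], [2014, "M", "Dhaka", 13]],
    columns=["\ufeffBirthYear", " Sex", "Area ", "Count"],
)

# repr() makes otherwise invisible characters visible in the output.
print([repr(name) for name in df.columns])

# Remove the byte-order mark and surrounding whitespace from every name.
df.columns = df.columns.str.replace("\ufeff", "", regex=False).str.strip()

# The lookup that raised KeyError now succeeds.
born_2015 = df.loc[df["BirthYear"] == 2015]
print(born_2015)
```

If the printed names do show a byte-order mark, passing `encoding="utf-8-sig"` to `pd.read_csv` is another option worth trying, since that codec drops the mark while decoding.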
