How do I generate a bar chart with data from a CSV?

Posted 2024-05-19 12:03:09


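I have a CSV with several columns, one of which is a city column. It contains several cities, and the same city can appear many times. I want to build a bar chart showing how many times each city appears in the CSV. Example: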

Y   X
5   Belo Horizonte
1   Vespasiano
4   São Paulo

I wrote the code below, but it raises an error, which is shown right after the code.

Code:

import matplotlib.pyplot as plt; plt.rcdefaults()
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# read the file
tb_usuarios = 'tb_usuarios.csv'
usuarios = pd.read_csv(tb_usuarios,
header=0,
index_col=False
)
print(usuarios.head())
usuarios["vc_municipio"] = usuarios["vc_municipio"].dropna()
usuarios["vc_municipio"] = usuarios["vc_municipio"].str.upper()
municipio = usuarios.groupby(['vc_municipio'])
print(municipio)
y_pos = usuarios.groupby(['vc_municipio'])['vc_municipio'].count()
print(y_pos)

plt.bar(y_pos, municipio, align='center', alpha=0.5)
plt.xticks(y_pos, municipio)
plt.ylabel('Qtd')
plt.title('Municipio')

plt.show()

Error:

Traceback (most recent call last):
  File "C:/Users/Henrique Mendes/PycharmProjects/emprestimo/venv1/emprestimo.py", line 20, in <module>
    plt.bar(y_pos, municipio, align='center', alpha=0.5)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\pyplot.py", line 2440, in bar
    **({"data": data} if data is not None else {}), **kwargs)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\__init__.py", line 1601, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\axes\_axes.py", line 2348, in bar
    self._process_unit_info(xdata=x, ydata=height, kwargs=kwargs)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\axes\_base.py", line 2126, in _process_unit_info
    kwargs = _process_single_axis(ydata, self.yaxis, 'yunits', kwargs)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\axes\_base.py", line 2108, in _process_single_axis
    axis.update_units(data)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\axis.py", line 1493, in update_units
    default = self.converter.default_units(data, self)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\category.py", line 115, in default_units
    axis.set_units(UnitData(data))
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\category.py", line 181, in __init__
    self.update(data)
  File "C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\lib\site-packages\matplotlib\category.py", line 215, in update
    for val in OrderedDict.fromkeys(data):
TypeError: unhashable type: 'numpy.ndarray'

My output:

"C:\Users\Henrique Mendes\PycharmProjects\emprestimo\venv1\Scripts\python.exe" "C:/Users/Henrique Mendes/PycharmProjects/emprestimo/venv1/emprestimo.py"
   pr_usuario  bl_administrador dt_nascimento  ... dt_cheque es_anexo dt_anexo
0           2                 0    24/02/1980  ...       NaN      NaN      NaN
1           3                 0    05/09/1985  ...       NaN      NaN      NaN
2           4                 1    20/03/1984  ...       NaN      NaN      NaN
3           5                 1    20/01/1982  ...       NaN      NaN      NaN
4           6                 0    25/05/1985  ...       NaN      NaN      NaN

[5 rows x 30 columns]
{'BELO HORIZONTE': Int64Index([0, 1, 2, 3, 6, 9, 10, 14, 17, 20, 22, 25], dtype='int64'), 'BRASILIA': Int64Index([4], dtype='int64'), 'CONTAGEM': Int64Index([23], dtype='int64'), 'CURITIBA': Int64Index([5, 7, 15, 18, 19], dtype='int64'), 'SANTA LUZIA': Int64Index([21], dtype='int64'), 'VESPASIANO': Int64Index([24], dtype='int64')}
vc_municipio
BELO HORIZONTE    12
BRASILIA           1
CONTAGEM           1
CURITIBA           5
SANTA LUZIA        1
VESPASIANO         1
Name: vc_municipio, dtype: int64

How can I make this chart?


2 Answers

Using pandas:

Your data:

  • Assuming your data is a .csv in the following format:
0.0,BELO HORIZONTE
1.0,BELO HORIZONTE
2.0,BELO HORIZONTE
3.0,BELO HORIZONTE
6.0,BELO HORIZONTE
9.0,BELO HORIZONTE
10.0,BELO HORIZONTE
14.0,BELO HORIZONTE
17.0,BELO HORIZONTE
20.0,BELO HORIZONTE
22.0,BELO HORIZONTE
25.0,BELO HORIZONTE
4.0,BRASILIA
23.0,CONTAGEM
5.0,CURITIBA
7.0,CURITIBA
15.0,CURITIBA
18.0,CURITIBA
19.0,CURITIBA
21.0,SANTA LUZIA
24.0,VESPASIANO

Create the DataFrame:

import pandas as pd
import matplotlib.pyplot as plt


df = pd.read_csv('test.csv', header=None)
df.columns = ['value', 'city']

    value            city
0     0.0  BELO HORIZONTE
1     1.0  BELO HORIZONTE
2     2.0  BELO HORIZONTE
3     3.0  BELO HORIZONTE
4     6.0  BELO HORIZONTE
5     9.0  BELO HORIZONTE
6    10.0  BELO HORIZONTE
7    14.0  BELO HORIZONTE
8    17.0  BELO HORIZONTE
9    20.0  BELO HORIZONTE
10   22.0  BELO HORIZONTE
11   25.0  BELO HORIZONTE
12    4.0        BRASILIA
13   23.0        CONTAGEM
14    5.0        CURITIBA
15    7.0        CURITIBA
16   15.0        CURITIBA
17   18.0        CURITIBA
18   19.0        CURITIBA
19   21.0     SANTA LUZIA
20   24.0      VESPASIANO

Group and plot the data:

# groupby & count
city_count = df.groupby('city').count()

                value
city                 
BELO HORIZONTE     12
BRASILIA            1
CONTAGEM            1
CURITIBA            5
SANTA LUZIA         1
VESPASIANO          1

# plot
city_count.plot.bar()
plt.ylabel('Qtd')
plt.title('Municipio')
plt.show()


Plotting with seaborn:

import seaborn as sns

sns.barplot(x=city_count.index, y='value', data=city_count)
plt.xticks(rotation=45)
plt.show()


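municipio = usuarios.groupby(['vc_municipio']) returns a pandas GroupBy object, and that is what causes the error, because matplotlib cannot handle such an object.

plt.bar takes the x values first, followed by the bar heights (see the docs).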

matplotlib.pyplot.bar(x, height, width=0.8, bottom=None, *, align='center', data=None, **kwargs)

Fortunately, when you aggregate a groupby in pandas, the x values (the categories) automatically become the index of the result.
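A small sketch of that behavior. The in-memory DataFrame below is made up for illustration and stands in for the vc_municipio column of tb_usuarios.csv:

```python
import pandas as pd

# Hypothetical sample rows standing in for the real CSV (an assumption).
usuarios = pd.DataFrame(
    {"vc_municipio": ["BELO HORIZONTE", "CURITIBA", "BELO HORIZONTE", "BRASILIA"]}
)

# groupby on its own returns a GroupBy object, which is not plottable data.
municipio = usuarios.groupby("vc_municipio")
print(type(municipio).__name__)  # DataFrameGroupBy

# Aggregating it yields a Series: the index holds the city names (x values),
# the values hold the counts (bar heights).
counts = usuarios.groupby("vc_municipio")["vc_municipio"].count()
print(list(counts.index))
print(counts.tolist())
```

So the index of the counted result is what should be passed as x, and the counts themselves as the heights.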

Assuming municipio is meant to be a list of categories (you want a count per city?), the following will work.

Replace

plt.bar(y_pos, municipio, align='center', alpha=0.5)

with

plt.bar(y_pos.index, y_pos, align='center', alpha=0.5)
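Put together, the asker's script might look roughly like the sketch below. The sample rows are hypothetical and replace the pd.read_csv call, and the names cidades and contagem are introduced here for illustration. The plt.xticks(y_pos, municipio) line is left out, since passing the city names as x already labels the ticks. Note also that assigning the result of dropna() back to the column does not remove any rows, so the sketch drops missing values on the Series itself.

```python
import matplotlib
matplotlib.use("Agg")  # only needed for headless runs; omit in an interactive session
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows standing in for pd.read_csv('tb_usuarios.csv').
usuarios = pd.DataFrame(
    {"vc_municipio": ["Belo Horizonte", "Curitiba", None, "belo horizonte", "Vespasiano"]}
)

# Drop missing cities and normalize case before counting.
cidades = usuarios["vc_municipio"].dropna().str.upper()
contagem = cidades.groupby(cidades).count()

fig, ax = plt.subplots()
# x = city names taken from the index, height = number of occurrences.
bars = ax.bar(list(contagem.index), contagem.tolist(), align="center", alpha=0.5)
ax.set_ylabel("Qtd")
ax.set_title("Municipio")
# plt.show()  # uncomment when running with an interactive backend
```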

Alternatively, you can use the pandas version of plt.bar (pandas builds on matplotlib), which natively handles some of the DataFrame quirks.
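One way this could look, as a sketch only: the Series below is made-up sample data, and value_counts is used here as a shortcut for the groupby-and-count step rather than being taken from the answers above.

```python
import matplotlib
matplotlib.use("Agg")  # only needed for headless runs; omit in an interactive session
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical city column (an assumption standing in for the CSV data).
cidades = pd.Series(["BELO HORIZONTE", "CURITIBA", "BELO HORIZONTE", "VESPASIANO"])

# Count occurrences per city and order the bars alphabetically.
contagem = cidades.value_counts().sort_index()

# pandas draws the bars and uses the index as tick labels automatically.
ax = contagem.plot.bar(alpha=0.5, rot=45)
ax.set_ylabel("Qtd")
ax.set_title("Municipio")
plt.tight_layout()
# plt.show()  # uncomment when running with an interactive backend
```

Because the plot method reads the labels from the index, there is no need to pass x positions or call plt.xticks separately.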
