How do I filter a DataFrame by the values in a column?

Posted 2024-09-27 19:21:44

In Python 3 with pandas, I have this DataFrame:

nao_eleitos.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1549 entries, 5 to 5174
Data columns (total 15 columns):
SG_UF                            1549 non-null object
DS_CARGO                         1549 non-null object
SQ_CANDIDATO                     1549 non-null object
NM_CANDIDATO                     1549 non-null object
NM_URNA_CANDIDATO                1549 non-null object
NR_CPF_CANDIDATO                 1549 non-null object
SG_PARTIDO                       1549 non-null object
DT_NASCIMENTO                    1549 non-null object
NR_IDADE_DATA_POSSE              1549 non-null int64
NR_TITULO_ELEITORAL_CANDIDATO    1549 non-null object
DS_GENERO                        1549 non-null object
DS_SIT_TOT_TURNO                 1549 non-null object
QT_VOTOS_NOMINAIS                1549 non-null int64
VR_RECEITA_FUNDOS                1549 non-null float64
custo_por_voto                   1549 non-null float64
dtypes: float64(2), int64(2), object(11)
memory usage: 193.6+ KB

The "custo_por_voto" column holds monetary values. I need to keep only the rows where this value is greater than or equal to 1904.

nao_eleitos[['custo_por_voto']].head()
custo_por_voto
    1.9940
    3.6092
    35,500.0000
    1.1461
    30,000.0000

So I tried filtering with a boolean variable:

seleciona = nao_eleitos['custo_por_voto'] >= 1,904
valores_altos = nao_eleitos[seleciona]

But I got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-87-ffe42c00da2f> in <module>
      1 seleciona = nao_eleitos['custo_por_voto'] >= 1,904
      2 
----> 3 valores_altos = nao_eleitos[seleciona]

~/Documentos/Code/laranjas/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

~/Documentos/Code/laranjas/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2654                                  'backfill or nearest lookups')
   2655             try:
-> 2656                 return self._engine.get_loc(key)
   2657             except KeyError:
   2658                 return self._engine.get_loc(self._maybe_cast_indexer(key))

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(5        True
6        True
7        True
8        True
10       True
11       True
27       True
28       True
34      False
35       True
37       True
40       True
47       True
51       True
52       True
55       True
57       True
61       True
62       True
67       True
73       True
77       True
84       True
86      False
88       True
89       True
91      False
92      False
94       True
98       True
        ...  
5065     True
5067     True
5070     True
5074    False
5081     True
5084     True
5098     True
5099     True
5100     True
5104     True
5107     True
5111     True
5112     True
5113     True
5114    False
5123    False
5126     True
5136     True
5147    False
5149     True
5150     True
5155     True
5158     True
5162     True
5166     True
5167     True
5168     True
5170     True
5172     True
5174     True
Name: custo_por_voto, Length: 1549, dtype: bool, 904)' is an invalid key

This condition should make the boolean variable True where the value is greater than or equal to 1904, and False otherwise, right?

After that, the boolean variable can be used to filter the DataFrame.

Does anyone know what causes the error? Or is there a better way to filter?
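
A minimal sketch of the boolean-mask approach described above, using a small made-up frame in place of nao_eleitos (the five values and index labels are only illustrative). The last line of the traceback shows the key as a tuple ending in ", 904)", which suggests the comma in 1,904 is the likely culprit: Python reads `x >= 1, 904` as the tuple `(x >= 1, 904)`, not as a comparison with 1904. Writing the number without a comma should give a plain boolean Series that can index the frame. If the intended threshold was actually the decimal 1.904, a dot would be needed instead.

```python
import pandas as pd

# Hypothetical stand-in for nao_eleitos; values and index are illustrative only.
nao_eleitos = pd.DataFrame(
    {"custo_por_voto": [1.9940, 3.6092, 35500.0, 1.1461, 30000.0]},
    index=[5, 6, 7, 8, 10],
)

# With the comma, Python builds a tuple: (Series >= 1, 904).
com_virgula = nao_eleitos["custo_por_voto"] >= 1, 904
print(type(com_virgula))  # <class 'tuple'>

# Without the comma, the comparison yields a boolean Series (a mask).
seleciona = nao_eleitos["custo_por_voto"] >= 1904

# Indexing with the mask keeps only the rows where it is True.
valores_altos = nao_eleitos[seleciona]
print(valores_altos)
```

The same filter can also be written in one step as `nao_eleitos[nao_eleitos["custo_por_voto"] >= 1904]`.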

