How do I filter column values in a pandas DataFrame under specific conditions?

Posted 2024-10-06 12:45:09


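I have created a pandas DataFrame and want to filter out some values. The DataFrame contains 4 columns, namely currency, port, supplier_id and value, and I want to keep only the rows that satisfy the following conditions: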

* port – expressed as a portcode, a 5-letter string uniquely identifying a port. Portcodes consist of 2-letter country code and 3-letter city code.
* supplier_id - integer, uniquely identifying the provider of the information
* currency - 3-letter string identifying the currency
* value - a floating-point number

df =  df[ (len(df['port']) == 5 & isinstance(df['port'], basestring)) & \
  isinstance(df['supplier_id'], int) & \
  (len(df['currency']) == 3 & isinstance(df['currency'], basestring))\
  isinstance(df['value'], float) ]

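The code snippet should be self-explanatory: it tries to implement the conditions listed above, but it does not work. Printing df gives the following: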

^{pr2}$

What is the right way to write this?


1 Answer
User
#1 · Posted 2024-10-06 12:45:09

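If the columns contain mixed values (numbers alongside strings), you can build one boolean mask per column and combine them: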

import pandas as pd

df = pd.DataFrame({'port':['aa789',2,3],
                   'supplier_id':[4,'s',6],
                   'currency':['USD',8,9],
                   'value':[1.7,3,5]})

print (df)
  currency   port supplier_id  value
0      USD  aa789           4    1.7
1        8      2           s    3.0
2        9      3           6    5.0

# For Python 2, change str to basestring
# port: a string of exactly 5 characters
m1 = (df.port.astype(str).str.len() == 5) & df.port.apply(lambda x: isinstance(x, str))
# supplier_id: an integer
m2 = df.supplier_id.apply(lambda x: isinstance(x, int))
# currency: a string of exactly 3 characters
m3 = (df.currency.astype(str).str.len() == 3) & df.currency.apply(lambda x: isinstance(x, str))
# value: a floating-point number
m4 = df.value.apply(lambda x: isinstance(x, float))
mask = m1 & m2 & m3 & m4
print (mask)
0     True
1    False
2    False
dtype: bool

print (df[mask])
  currency   port supplier_id  value
0      USD  aa789           4    1.7
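
The attempt in the question fails because len(df['port']) returns the number of rows and isinstance(df['port'], ...) tests the Series object itself, so neither check is applied per element. Also, & binds more tightly than ==, so each comparison needs its own parentheses. Below is a minimal sketch that wraps the per-element checks from the answer above into reusable functions. The names is_str_of_len and filter_valid_rows are hypothetical helpers made up for illustration, and excluding bool from the integer check is an extra assumption not stated in the question.

```python
import pandas as pd


def is_str_of_len(series: pd.Series, n: int) -> pd.Series:
    """Hypothetical helper: True where the element is a str of exactly n characters."""
    return series.map(lambda x: isinstance(x, str) and len(x) == n)


def filter_valid_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep rows where all four columns pass their check."""
    mask = (
        is_str_of_len(df["port"], 5)
        # Assumption: bool is a subclass of int, so exclude it explicitly.
        & df["supplier_id"].map(
            lambda x: isinstance(x, int) and not isinstance(x, bool)
        )
        & is_str_of_len(df["currency"], 3)
        & df["value"].map(lambda x: isinstance(x, float))
    )
    return df[mask]


# Same sample data as in the answer above.
df = pd.DataFrame({'port': ['aa789', 2, 3],
                   'supplier_id': [4, 's', 6],
                   'currency': ['USD', 8, 9],
                   'value': [1.7, 3, 5]})

result = filter_valid_rows(df)
print(result)
```

Only the first row should survive here, matching the result of df[mask] above. Checking type and length in a single lambda avoids the astype(str) round trip, at the cost of one Python-level call per element.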
