Conditional filtering of a CSV file in Python

id   gender  ses     schtyp  prog      write
70   male    low     public  general   52
121  female  middle  public  vocation  68
86   male    high    public  general   33
141  male    high    public  vocation  63
172  male    middle  public  academic  47
113  male    middle  public  academic  44
50   male    middle  public  general   59
11   male    middle  public  academic  34
84   male    middle  public  general   57
48   male    middle  public  academic  57
75   male    middle  public  vocation  60
60   male    middle  public  academic  57

2 answers

User

Answer 1 · edited 2024-06-01 18:45:25

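I agree with Ramon: Pandas is definitely the best option, and once you get used to it, its filtering/subsetting features are excellent. Wrapping your head around it at first can be hard, though (at least it was for me!), so I dug some examples of the kind of subsetting you need out of my old code. The variable itu below is a Pandas DataFrame containing data for different countries.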

# Subsetting by using True/False:
subset = itu['CntryName'] == 'Albania'  # returns True/False values
itu[subset]  # returns 1x144 DataFrame of only data for Albania
itu[itu['CntryName'] == 'Albania']  # one-line command, equivalent to the above two lines

# Pandas has many built-in functions like .isin() to provide params to filter on    
itu[itu.cntrycode.isin(['USA','FRA'])]  # returns where itu['cntrycode'] is 'USA' or 'FRA'
itu[itu.year.isin([2000,2001,2002])]  # Returns all of itu for only years 2000-2002
# Advanced subsetting can include logical operations:
itu[itu.cntrycode.isin(['USA','FRA']) & itu.year.isin([2000,2001,2002])]  # Both of above at same time

# Use .loc with two elements to simultaneously select by row/index & column:
itu.loc['USA','CntryName']
itu.iloc[204,0]
itu.loc[['USA','BHS'], ['CntryName', 'Year']]
itu.iloc[[204, 13], [0, 1]]

# Can do many operations at once, but this reduces "readability" of the code
itu[itu.cntrycode.isin(['USA','FRA']) & 
    itu.year.isin([2000,2001,2002])].loc[:, ['cntrycode','cntryname','year','mpen','fpen']]

# Finally, if you're comfortable with using map() and list comprehensions,
# you can do some advanced subsetting that includes evaluations & functions
# to determine what elements you want to select from the whole, such as all
# countries whose name begins with "United":
criterion = itu['CntryName'].map(lambda x: x.startswith('United'))
itu[criterion]['CntryName']  # gives us UAE, UK, & US
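
Since itu itself isn't shown here, this is a minimal self-contained sketch of the same ideas that you can run as-is. The DataFrame below is a made-up stand-in (the rows, values, and the lowercase year column are my assumptions, not the real data), and the last line uses the vectorized .str accessor as an alternative to map() with a lambda.

```python
import pandas as pd

# Hypothetical stand-in for `itu`: the real data is not shown in this answer,
# so these rows and values are invented purely for illustration.
itu = pd.DataFrame(
    {
        "cntrycode": ["USA", "USA", "FRA", "GBR", "ALB"],
        "CntryName": ["United States", "United States", "France",
                      "United Kingdom", "Albania"],
        "year": [2000, 2005, 2001, 2002, 2000],
    }
)

# A comparison on a column yields a boolean Series (one True/False per row);
# indexing the DataFrame with it keeps only the True rows.
mask = itu["CntryName"] == "Albania"
albania = itu[mask]

# Combine conditions with & (and) or | (or), wrapping each comparison in parentheses.
usa_fra_early = itu[itu["cntrycode"].isin(["USA", "FRA"]) & (itu["year"] <= 2002)]

# Vectorized string test, equivalent here to map(lambda x: x.startswith('United')).
united = itu[itu["CntryName"].str.startswith("United")]

print(albania)
print(usa_fra_early)
print(united)
```

Unlike the lambda version, .str.startswith() does not raise on missing values in the column, which may matter on real data with gaps (you would still need to decide how to treat the resulting NaN entries in the mask).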

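User

Answer 2 · edited 2024-06-01 18:45:25

Look at pandas. I think it will make short work of your CSV parsing and give you the subsetting functionality you're asking for...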

import pandas as pd
# sep=r'\s+' splits on runs of whitespace; delim_whitespace=True is deprecated in recent pandas versions
data = pd.read_csv('fileName.txt', sep=r'\s+')

# get all of the male students
data[data['gender'] == 'male']
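
To show how this extends to more than one condition, here is a small sketch that runs against the first few rows from the question. Reading from an in-memory string via io.StringIO is my stand-in for the real file, and the cut-off of 50 for write is just an example value, not something the question asked for.

```python
import io

import pandas as pd

# First six rows of the sample in the question, embedded so the example runs
# without a file on disk (the real code would pass a file name instead).
raw = """id gender ses schtyp prog write
70 male low public general 52
121 female middle public vocation 68
86 male high public general 33
141 male high public vocation 63
172 male middle public academic 47
113 male middle public academic 44
"""
data = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Single condition: all male students.
males = data[data["gender"] == "male"]

# Two conditions: male students with a write score of at least 50 (example threshold).
male_high_write = data[(data["gender"] == "male") & (data["write"] >= 50)]

# Membership test: students in either of two programs.
academic_or_vocation = data[data["prog"].isin(["academic", "vocation"])]

# The same two-condition filter written as a query string.
by_query = data.query("gender == 'male' and write >= 50")

print(male_high_write)
```

The bracket form and query() select the same rows here; query() tends to read better once the condition gets long.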
