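I have a dataset with ten columns, and I am only interested in analyzing six of them (0, 1, 2, 4, 6, 7). I want to label them with the headers time, mode, event, xcoord, ycoord, phi. Here is a sample of the data: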
1385940076332 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076336 2 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076339 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076342 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076346 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076350 2 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076353 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076356 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
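When I use the code below to parse the data into columns, it seems to only count the data, but I want to be able to list the data for further analysis. Here is the code I am using, from @alko: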
import pandas as pd

# Copy only the relevant rows into a separate file
with open('/Users/Lab/Desktop/test.txt', 'r') as infile:
    f = infile.readlines()
with open('filtered.txt', 'w') as outfile:
    for line in f:
        if 'subject_avatar' in line:  # this line is just to take the relevant rows
            outfile.write(line)

# Parse the whitespace-separated file and keep six of the ten columns
df = pd.read_csv('filtered.txt', header=None, false_values=None, sep=r'\s+')[[0, 1, 2, 4, 6, 7]]
df.columns = ['time', 'mode', 'event', 'xcoord', 'ycoord', 'phi']
print(df)
Here is what the code returns:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 115534 entries, 0 to 115533
Data columns (total 6 columns):
time 115534 non-null values
mode 115534 non-null values
event 115534 non-null values
xcoord 115534 non-null values
ycoord 115534 non-null values
phi 115534 non-null values
dtypes: float64(3), int64(2), object(1)
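I checked the pandas documentation and tried df.values and df.index, but neither prints the data correctly (i.e. six columns with the correct headers).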
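
One possible way forward, as a sketch only: the output above looks like the summary view that older pandas versions print when a DataFrame is too large to display in full, so the data has probably been parsed correctly and only the display is abbreviated. The example below assumes that is the case. It uses io.StringIO with three of the sample rows as a stand-in for filtered.txt, and it passes usecols to read_csv instead of selecting the columns afterwards; both are illustrative choices, not part of the original code.

```python
import io

import pandas as pd

# Three sample rows from the question, standing in for 'filtered.txt'
# (assumption: the real file has the same whitespace-separated layout).
sample = """\
1385940076332 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076336 2 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
1385940076339 3 M subject_avatar -30.000000 1.000000 -59.028107 180.000000 0.000000 0.000000
"""

# Read only the six columns of interest, then attach the headers.
df = pd.read_csv(
    io.StringIO(sample),
    header=None,
    sep=r"\s+",
    usecols=[0, 1, 2, 4, 6, 7],
)
df.columns = ['time', 'mode', 'event', 'xcoord', 'ycoord', 'phi']

# Show the first rows as a table rather than a summary.
print(df.head())

# Render every row as text; on a frame with 100,000+ rows this is long,
# so slicing first (for example df.iloc[:20]) is usually more practical.
print(df.to_string())

# Pull columns out as a NumPy array for further numeric analysis.
coords = df[['xcoord', 'ycoord']].values
```

With the real file, replacing io.StringIO(sample) with 'filtered.txt' should give the same six labeled columns. df.head() and df.to_string() only change how the frame is displayed; the underlying data is the same in either case.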