I have tried this many times, but it just does not work.
Input:
condor t airline airline
eight n 0 flightnumber
nine n 0 flightnumber
five n 0 flightnumber
hallo t 0 sentence
turn t com turn_heading
left t 0 direction
heading t com turn_heading
three n 0 degree_absolute
two n 0 degree_absolute
zero n 0 degree_absolute
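Expected output:
^{pr2}$
Every time I try to read the input, the tabs get in the way of tokenizing the strings, whether I read it as a list or as a string. This is what I get when I try to strip the tabs: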
['condor\tt\tairline\tairline\n', 'eight\tn\t \tflightnumber\n', 'nine\tn\t \tflightnumber\n', 'five\tn\t \tflightnumber\n', 'hallo\tt\t \tsentence\n', 'turn\tt\tcom\tturn_heading\n', 'left\tt\t \tdirection\n', 'heading\tt\tcom\tturn_heading\n', 'three\tn\t \tdegree_absolute\n', 'two\tn\t \tdegree_absolute\n', 'zero\tn\t \tdegree_absolute\n', '\n', 'aeh\tt\t \tsentence\n', 'two\tn\t \tflightnumber\n', 'eight\tn\t \tflightnumber\n', 'november\tt\tflightnumber\tflightnumber\n', 'hallo\tt\t \tsentence\n', 'reduce\tt\tcom\treduce\n', 'two\tn\t \tspeed\n', 'two\tn\t \tspeed\n', 'zero\tn\t \tspeed\n', 'knots\tt\t \treduce\n', '\n', 'condor\tt\tairline\tairline\n', 'eight\tn\t \tflightnumber\n', 'nine\tn\t \tflightnumber\n', 'five\tn\t \tflightnumber\n', 'descend\tt\tcom\tdescend\n', 'three\tn\t \taltitude\n', 'thousand\tn\t \taltitude\n', 'feet\tt\t \tdescend\n', 'turn\tt\tcom\tturn_heading\n', 'left\tt\t \tdirection\n', 'heading\tt\tcom\tturn_heading\n', 'two\tn\t \tdegree_absolute\n', 'six\tn\t \tdegree_absolute\n', 'zero\tn\t \tdegree_absolute\n', 'cleared\tt\tcom\tcleared_ils\n', 'ils\tt\t \tcleared_ils\n', 'runway\tt\t \tcleared_ils\n', 'two\tn\t \trunway\n', 'three\tn\t \trunway\n', 'left\tt\t \trunway\n', 'turn\tt\tcom\tturn_heading\n', 'left\tt\t \tdirection\n', 'heading\tt\tcom\tturn_heading\n', 'two\tn\t \tdegree_absolute\n', 'five\tn\t \tdegree_absolute\n', 'zero\tn\t \tdegree_absolute\n']
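Any help on how I can strip the tabs, tokenize the lines, and convert them into the tagged format?
The code I used to remove the control characters: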
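# Read the file as one string and replace newlines, tabs and carriage returns with spaces.
with open('input.txt', 'r') as file1:
    text = file1.read()
# str.maketrans needs both arguments to have the same length: three characters in, three spaces out.
print(text.translate(str.maketrans("\n\t\r", "   ")))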
This is easy if you use the ^{} module.
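Note that I specified delimiter='\t' to indicate a tab-delimited input file (rather than the default comma-delimited one), and used ^{} to automatically turn each line into a dictionary of {fieldname: value, ...}. You can then process these dictionaries into whatever format you want.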
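The answer's original code did not survive, so here is a minimal sketch of the approach it describes. It assumes the module in question is Python's standard `csv` module and that the dictionary-producing reader is `csv.DictReader`. The column names (`word`, `kind`, `command`, `label`) are hypothetical, since the input has no header row, and the two sample rows are copied from the question.

```python
import csv
import io

# Two sample rows taken from the question; in practice you would pass
# open('input.txt', newline='') instead of this in-memory file.
sample = io.StringIO(
    "condor\tt\tairline\tairline\n"
    "eight\tn\t \tflightnumber\n"
)

# Hypothetical column names: the input file has no header row,
# so DictReader needs them supplied explicitly.
FIELDNAMES = ["word", "kind", "command", "label"]


def read_rows(handle):
    """Return each tab-separated line as a {fieldname: value} dict."""
    reader = csv.DictReader(handle, fieldnames=FIELDNAMES, delimiter="\t")
    return [dict(row) for row in reader]


rows = read_rows(sample)
for row in rows:
    # Each field is now available by name, with the tabs already removed.
    print(row["word"], row["label"])
```

One thing to keep in mind: `csv.DictReader` skips completely empty lines, so if the blank lines between utterances matter for the tagged format, they would need to be handled separately.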