Python: How do I extract floating-point numbers from a text file with mixed content?

import csv
from itertools import islice

results = csv.reader(open('test', 'r'), delimiter="\n")
list(islice(results, 3))  # skip the three text lines of the first record
print(next(results))
print(next(results))
list(islice(results, 3))  # skip the three text lines of the second record
print(next(results))
print(next(results))

3 Answers

User

#1 · Edited 2024-05-18 22:28:58

Maybe this helps. It groups the rows coming out of results five at a time:

zip(*[results]*5)
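To illustrate the grouping idiom, here is a minimal sketch. It assumes the file is made of five-line records like the sample data quoted in answer 3; the names lines, it and records are invented for this illustration and do not come from the original post.

```python
# Sample lines standing in for the file contents (assumed five-line records).
lines = [
    "ahi1", "b/se", "ahi", "test    -2.435953", "1.218364",
    "ahi2", "b/se", "ahi", "test    -2.001858", "1.303935",
]

# zip(*[it] * 5) hands the SAME iterator to zip five times, so every
# output tuple consumes five consecutive items.
it = iter(lines)
records = list(zip(*[it] * 5))

for record in records:
    # In this assumed layout the last two entries of a record hold the numbers.
    first = float(record[3].split()[-1])
    second = float(record[4])
    print(first, second)
```

Note that zip stops at the shortest input, so leftover lines that do not fill a complete group of five are silently dropped.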

User

#2 · Edited 2024-05-18 22:28:58

A somewhat tricky but more succinct, step-by-step solution using shell tools:

$ grep -v "ahi" myFileName | grep -v se | tr -d "test\" " | awk 'NR%2{printf "%s, ", $0; next}1'
-2.435953, 1.218364
-2.001858, 1.303935

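How it works: it first drops the unwanted text lines, then strips the unwanted characters from the remaining lines, and finally joins each pair of lines into one. I added the comma only to make the output prettier; if you don't need it, remove it from awk's printf.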
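For readers who would rather stay in Python, the same filter, strip and join steps can be sketched roughly as follows. The function name join_number_lines and the sample list are made up for this illustration, and the input lines are assumed to have their trailing newlines already removed.

```python
def join_number_lines(lines):
    """Roughly mirror the grep/tr/awk pipeline: filter, strip, join pairs."""
    kept = []
    for line in lines:
        # grep -v "ahi" | grep -v se: drop lines containing either string
        if "ahi" in line or "se" in line:
            continue
        # tr -d: delete the characters t, e, s, double quote and space
        cleaned = "".join(ch for ch in line if ch not in 'tes" ')
        kept.append(cleaned)
    # awk: join every odd-numbered line with the line that follows it
    return [", ".join(kept[i:i + 2]) for i in range(0, len(kept), 2)]


sample = [
    "    ahi1", "    b/se", "ahi ", "test    -2.435953", "        1.218364",
    "    ahi2", "    b/se", "ahi ", "test    -2.001858", "        1.303935",
]

for row in join_number_lines(sample):
    print(row)
```

Like the shell version, this relies on the unwanted lines being recognisable by fixed substrings, so it is only as robust as that assumption.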

User

#3 · Edited 2024-05-18 22:28:58

Here is code that does this:

import re

# this is the same data just copy/pasted from your question
data = """    ahi1
    b/se
ahi 
test    -2.435953
        1.218364
    ahi2
    b/se
ahi 
test    -2.001858
        1.303935"""

# what we're gonna do, is search through it line-by-line
# and parse out the numbers, using regular expressions

# the pattern first skips any number of characters that aren't
# digits or '-': [^-\d]*  (^ inside the brackets means NOT)
# then it captures an optional '-' followed by one or more digits,
# a dot, and one or more digits again: [\-]{0,1}\d+\.\d+
# and finally it skips non-digit, non-'-' characters once more
pattern = re.compile(r"[^-\d]*([\-]{0,1}\d+\.\d+)[^-\d]*")

results = []
for line in data.split("\n"):
    match = pattern.match(line)
    if match:
        results.append(match.groups()[0])

pairs = []
i = 0
end = len(results)
while i < end - 1:
    pairs.append((results[i], results[i+1]))
    i += 2

for p in pairs:
    print("%s, %s" % (p[0], p[1]))

Output:

-2.435953, 1.218364
-2.001858, 1.303935

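Instead of printing the numbers, you can save them in a list and zip them together afterwards. I used Python's regular expression module (re) to parse the text. If you don't know regular expressions yet, I can only recommend picking them up. I find them very useful for parsing text and all kinds of machine-generated output files.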
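A minimal sketch of that list-and-zip variant, assuming the numbers always come in consecutive pairs. It swaps in re.findall over the whole text with a shorter pattern instead of matching line by line, so unlike the code above it would pick up every number on a line, not just the first.

```python
import re

data = """    ahi1
    b/se
ahi 
test    -2.435953
        1.218364
    ahi2
    b/se
ahi 
test    -2.001858
        1.303935"""

# Optional sign, digits, a dot, digits: the same number shape as above.
number = re.compile(r"-?\d+\.\d+")

# Convert every match to a float and keep the values in one flat list.
values = [float(text) for text in number.findall(data)]

# Pair even-indexed values with the odd-indexed values that follow them.
pairs = list(zip(values[0::2], values[1::2]))
print(pairs)
```

Keeping the values as floats rather than strings means they can be fed directly into further calculations.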

Edit:

Oh, by the way, if you're worried about performance: I tested this on my slow old 2 GHz IBM T60 laptop, and the regex parses a megabyte in about 200 milliseconds.
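A rough way to check the speed on your own machine is sketched below. The input is synthetic: it repeats one sample record until the text is about one megabyte, which is an assumption about what a typical file looks like, and the timing you get will differ from the figure quoted above.

```python
import re
import time

pattern = re.compile(r"[^-\d]*([\-]{0,1}\d+\.\d+)[^-\d]*")

# One five-line record in the same shape as the sample data.
record = "    ahi1\n    b/se\nahi \ntest    -2.435953\n        1.218364\n"

# Repeat the record until the text is roughly one megabyte long.
text = record * (1000000 // len(record) + 1)

start = time.perf_counter()
found = [m.group(1) for m in map(pattern.match, text.split("\n")) if m]
elapsed = time.perf_counter() - start

print("parsed %d characters, found %d numbers in %.1f ms"
      % (len(text), len(found), elapsed * 1000))
```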

Update: I was in a good mood, so I did the last step for you :P
