Downloading only certain lines from an online .txt file

2 Answers

User

#1 · Edited 2024-10-05 10:58:40

You can use curl and grep. You still have to download the whole file, unless the ebi.ac.uk server API offers server-side filtering.

curl 'https://www.ebi.ac.uk/ena/data/view/FO203355&display=text' | grep '^FT' > lines.txt

User

#2 · Edited 2024-10-05 10:58:40

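Since you ultimately intend to use pandas, all you need to do is stream the data into your script and filter out the lines you want. The simplest way is to use the requests module in streaming mode and treat the remote data as a file stream, i.e.: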

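import requests

url = "https://www.ebi.ac.uk/ena/data/view/FO203355&display=text"

with requests.get(url, stream=True) as r:  # open a streaming request
    r.raise_for_status()  # stop early on an HTTP error
    # iter_lines() yields whole lines as bytes; iterating over `r` itself yields fixed-size chunks
    for line in r.iter_lines():
        if line.startswith(b"FT"):  # check if a line begins with `FT`
            print(line.decode("utf-8", errors="replace"))  # or do whatever you want with the line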

If you only want to save the data, you can forward the filtered lines to a file output stream:

import requests

url = "https://www.ebi.ac.uk/ena/data/view/FO203355&display=text"

with requests.get(url, stream=True) as r, open("output.dat", "wb") as f:
    r.raise_for_status()  # stop early on an HTTP error
    for line in r.iter_lines():  # iterate over the stream line by line (as bytes)
        if line.startswith(b"FT"):  # check if a line begins with `FT`
            f.write(line + b"\n")  # iter_lines() strips the newline, so add it back
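
The filtering step can also be pulled out into a small generator, which makes it easy to try without a network connection. This is only a sketch: `iter_prefixed_lines` is a hypothetical helper name, the sample lines are made up, and it assumes the lines arrive as bytes (as `r.iter_lines()` yields them) and decode as UTF-8.

```python
def iter_prefixed_lines(raw_lines, prefix="FT", encoding="utf-8"):
    """Yield decoded lines from an iterable of bytes that start with `prefix`."""
    marker = prefix.encode(encoding)
    for raw in raw_lines:
        if raw.startswith(marker):
            # undecodable bytes are replaced rather than raising an error
            yield raw.decode(encoding, errors="replace")


# Made-up stand-in for r.iter_lines(); real input would come from the streaming request.
sample = [
    b"ID   example record",
    b"FT   source          1..100",
    b"XX",
    b"FT   gene            5..50",
]
ft_lines = list(iter_prefixed_lines(sample))
```

With a live request, passing `r.iter_lines()` in place of `sample` should give the same kind of result, one decoded `FT` line at a time.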

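You may want to create a DataFrame and parse the lines directly into it, but that depends on how you want to parse the data, so I'll leave that as an exercise for you.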
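
One possible starting point for that exercise, as a rough sketch rather than a full parser. It assumes that a line which starts a feature has the feature key close to the `FT` marker followed by a location, and that deeply indented lines are qualifiers belonging to the previous feature. The names `parse_ft_lines`, `to_frame`, `KEY_INDENT_MAX`, the column names, and the sample lines are all made up for illustration and should be checked against the real file layout.

```python
KEY_INDENT_MAX = 5  # assumption: feature keys sit within a few spaces of the "FT" marker


def parse_ft_lines(lines):
    """Group FT lines into one record per feature: key, location, qualifiers."""
    rows = []
    for line in lines:
        body = line[2:]  # drop the leading "FT" marker
        text = body.strip()
        if not text:
            continue
        indent = len(body) - len(body.lstrip())
        parts = text.split(None, 1)
        if indent <= KEY_INDENT_MAX and len(parts) == 2:
            # shallow indent plus two fields: treat as the start of a new feature
            rows.append({"key": parts[0], "location": parts[1], "qualifiers": []})
        elif rows:
            # anything else is attached to the most recent feature
            rows[-1]["qualifiers"].append(text)
    return rows


def to_frame(rows):
    """Build a DataFrame from the parsed records."""
    import pandas as pd  # imported lazily so the parser itself works without pandas

    return pd.DataFrame(rows, columns=["key", "location", "qualifiers"])


# Made-up sample lines in the assumed layout.
sample = [
    "FT   source          1..100",
    'FT                   /organism="Example organism"',
    "FT   gene            5..50",
    'FT                   /gene="abc"',
]
rows = parse_ft_lines(sample)
```

Calling `to_frame(rows)` then gives one row per feature, with the qualifiers kept as a list in a single column; splitting them into separate columns is a design choice that depends on what you need from the data.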
