Split a list into arrays using a delimiter element

Published 2024-10-01 13:25:49


I have a file containing data in the following form:

Foo
http://url.com
http://url2.com

FooBar
http://url3.com

FooBarBar
http://url9.com

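I want to treat each group of n lines as a single element. That is, after every line that contains only \n, I want to process the string and the URLs that follow it (the number of URLs varies). I create a folder named after the first string and then download the files from the URLs.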

I use the following line to get a list of the lines.

elements = list(open('C:\\filename.txt'))

Now I want to turn this into a list of lists, with the \n entries acting as delimiter elements.

How can I achieve that?


3 answers

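Because the one-liner never closes the file, it is not a good fit for this kind of problem. Open the file in a with block instead: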

with open('C:\\filename.txt', 'r') as f:

    result = []  # Final output: one joined string per group
    entry = []   # Lines of the group currently being collected

    for line in f:
        line = line.strip()  # Remove the trailing newline and surrounding whitespace
        if not line:
            # A blank line ends the current group; ignore it if the group is empty
            if entry:
                result.append(' '.join(entry))
                entry = []
        else:
            entry.append(line)
    if entry:
        result.append(' '.join(entry))  # Add the last group

print(result)

Output:

['Foo http://url.com http://url2.com', 'FooBar http://url3.com', 'FooBarBar http://url9.com']
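
The question asks for a list of lists rather than joined strings. One possible way to get that shape is itertools.groupby from the standard library. The following is only a sketch: the function name split_on_blank and the inline sample list are assumptions made for illustration, and in practice the open file object could be passed in place of the sample list.

```python
from itertools import groupby


def split_on_blank(lines):
    """Group consecutive non-blank lines; blank lines act as separators."""
    stripped = (line.strip() for line in lines)
    # bool('') is False, so groupby alternates between runs of blank
    # lines (key False) and runs of content lines (key True).
    return [list(group) for has_text, group in groupby(stripped, key=bool) if has_text]


# Hypothetical sample mirroring the file shown in the question
sample = [
    "Foo\n", "http://url.com\n", "http://url2.com\n", "\n",
    "FooBar\n", "http://url3.com\n", "\n",
    "FooBarBar\n", "http://url9.com\n",
]
groups = split_on_blank(sample)
print(groups)
```

Each inner list then starts with the folder name, followed by its URLs. Runs of several blank lines collapse into a single separator, so no empty groups are produced.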

An iterative approach that, as requested, creates a folder named after the first string and then downloads the files from the URLs:

import os

with open('input.txt') as f:
    folder_name = None
    folder_failed = False

    for line in f:
        line = line.strip()
        if line:
            if not line.startswith('http'):
                # A non-URL line names the folder for the URLs that follow
                try:
                    os.mkdir(os.path.join(os.getcwd(), line))
                    folder_name = line
                except OSError:
                    print(f"Creation of the directory `{line}` failed")
                    folder_failed = True
                else:
                    folder_failed = False
            elif folder_name is not None and not folder_failed:
                # Download the file; download_file_from_url is a placeholder,
                # replace it with your own function
                new_file = download_file_from_url(line)
                # Save the file into the folder `folder_name`
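
The download step above is left as a placeholder. As a rough sketch of what such a function might look like, the standard library's urllib.request.urlretrieve can save a URL to a local path. The name download_to_folder, the choice of the last URL path segment as the file name, and the index.html fallback are all assumptions for illustration, not part of the answer.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_to_folder(url, folder_name):
    """Save the resource at `url` inside `folder_name`; return the local path."""
    os.makedirs(folder_name, exist_ok=True)  # no error if the folder already exists
    # Use the last path segment as the file name; fall back to a fixed
    # name for URLs such as http://url.com that have no path component.
    file_name = os.path.basename(urlparse(url).path) or "index.html"
    target = os.path.join(folder_name, file_name)
    urllib.request.urlretrieve(url, target)
    return target
```

Calling download_to_folder(line, folder_name) in the elif branch would be one way to plug it in. Network errors propagate as exceptions, so real use would probably want a try/except around the call.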

You should be able to iterate over the lines in the file and handle each case separately.

def urlsFromFile(path):
    files = {}
    with open(path) as f:  # Use with so the file is closed after reading
        fileName = None
        for line in f:
            line = line.rstrip('\n')  # Remove \n from the end of the line
            if not line:  # A blank line ends the current group, so reset fileName
                fileName = None
            elif fileName is None:  # First line of a new group: it is the new fileName
                fileName = line
                files[fileName] = []
            else:  # Any further line in the group is a URL
                files[fileName].append(line)
    return files

print(urlsFromFile('filename.txt'))

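Output (on Python 3.7+, where dicts preserve insertion order; older versions may print the keys in a different order):

{'Foo': ['http://url.com', 'http://url2.com'], 'FooBar': ['http://url3.com'], 'FooBarBar': ['http://url9.com']}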

This lets you use the result to create the directories and download the files in each list, for example:

for folder, urls in urlsFromFile('filename.txt').items():
    print('create folder {}'.format(folder))
    for url in urls:
        print('download {} to folder {}'.format(url, folder))

Output (on Python 3.7+, where dicts preserve insertion order):

create folder Foo
download http://url.com to folder Foo
download http://url2.com to folder Foo
create folder FooBar
download http://url3.com to folder FooBar
create folder FooBarBar
download http://url9.com to folder FooBarBar
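
The print calls above only stand in for the real work. As a minimal sketch of the directory-creation half, pathlib can create one folder per key of such a mapping. The helper name create_folders, its base_dir parameter, and the temporary directory used in the demo are assumptions for illustration; the download half is left out here.

```python
import tempfile
from pathlib import Path


def create_folders(files, base_dir="."):
    """Create one directory per key of `files` under `base_dir`; return the paths."""
    created = []
    for folder in files:
        path = Path(base_dir) / folder
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    files = {
        "Foo": ["http://url.com", "http://url2.com"],
        "FooBar": ["http://url3.com"],
        "FooBarBar": ["http://url9.com"],
    }
    names = [p.name for p in create_folders(files, tmp)]
    print(names)
```

Using exist_ok=True means a rerun over the same input does not fail on folders that were created earlier.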
