Extracting specific text from a list in Python

Posted 2024-09-28 21:49:01


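I am trying to extract certain pieces of information from a long string of text so that I can display them cleanly, but I can't work out exactly how to tackle this.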

My text looks like this:

"(Craw...Crawley\n\n\n\n\n\n\n08:00\n\n\n\n\n\n\n**Hotstage**\n **248236**\n\n\n\n\n\n\n\n\n\n\n\n\n\nCosta Collect...Costa Coffee (Bedf...Bedford\n\n\n\n\n\n\n08:00\n\n\n\n  \n\n\n**Hotstage**\n **247962**\n\n\n\n\n\n\n\n\n\n\n\n\n\nKFC - Acrelec Deployment...KFC - Sheffield Qu...Sheffield\n\n\n\n\n\n\n08:00\n\n\n\n\n\n\nHotstage\n 247971\n\n\n\n\n\n\n\n\n\n\n\n\n\nKFC - Acrelec Deployment...KFC - Brentford...BRENTFORD\n\n\n\n\n\n\n08:00\n\n\n\n\n\n\nHotstage\n 248382\n\n\n\n\n\n\n\n\n\n\n\n\n\nKFC - Acrelec Deployment...KFC - Newport"

I want to extract the highlighted content (the values wrapped in ** markers).

I think the solution is simple; I am probably just not storing or extracting the information correctly.

Here is my code:

from bs4 import BeautifulSoup
import requests
import re


import time


def main():

    url = "http://antares.platinum-computers.com/schedule.htm"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")

    response.close()
    # Get
    tech_count = 0
    technicians = [] #List to hold technicians names
    xcount = 0
    test = 0
    name_links = soup.find_all('td', {"class": "resouce_on"}) #Get all table data with class name "resource on".
    # iterate through html data and add them to "technicians = []"
    for i in name_links:
        technicians.append(str(i.text.strip()))  # append value to the technicians list
        tech_count += 1
    print("Found: " + str(tech_count) + " technicians + 1 default unallocated.")


    for t in technicians:
        print(xcount,t)
        xcount += 1
    test = int(input("choose technician: "))
    for link in name_links:
        if link.find(text=re.compile(technicians[test])):
            jobs = []
            numbers = []
            unique_cr = []
            jobs.append(link.parent.text.strip())
            for item in jobs:
                for subitem in item.split():
                    if(subitem.isdigit()):
                        numbers.append(subitem)
            for number in numbers:
                if number not in unique_cr:
                    unique_cr.append(number)
            print ("tasks for technician " + str(technicians[test]) + " are as follows")
            for cr in unique_cr:
                print (jobs)


if __name__ == '__main__':
    main()

1 answer
User
#1 · Posted 2024-09-28 21:49:01

This is fairly simple:

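myStr = "your complicated text"

# Split the text into lines; each highlighted value sits on its own line.
words = myStr.split("\n")
niceWords = []

for word in words:
    # Keep only the lines that carry the ** highlight markers.
    if "**" in word:
        # Drop the markers and any surrounding whitespace.
        niceWords.append(word.replace("**", "").strip())

print(niceWords)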
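
If you would rather not depend on each highlighted value being on its own line, a regular expression is another option. The following is only a sketch, not the original poster's method: the `text` sample is a shortened copy of the string from the question, and `extract_highlighted` is a hypothetical helper name introduced here. It also assumes the highlighted values always come as a label followed by a number, which holds for the sample but may not hold for every row.

```python
import re

# Shortened sample of the string from the question (assumption: the real
# input would be the full text taken from link.parent.text).
text = (
    "(Craw...Crawley\n\n\n08:00\n\n\n**Hotstage**\n **248236**\n\n\n"
    "Costa Collect...Costa Coffee (Bedf...Bedford\n\n\n08:00\n\n\n"
    "**Hotstage**\n **247962**\n\n\nKFC - Acrelec Deployment..."
)


def extract_highlighted(raw):
    """Return every substring wrapped in ** ... ** markers, in order."""
    # .+? is non-greedy and does not cross newlines, so each match
    # stays inside a single pair of markers.
    return re.findall(r"\*\*(.+?)\*\*", raw)


highlighted = extract_highlighted(text)

# Assumption: values alternate label, number, label, number, ...
pairs = list(zip(highlighted[0::2], highlighted[1::2]))

print(highlighted)
print(pairs)
```

Note that in the text from the question only the first two jobs carry the ** markers; the later "Hotstage" entries are not highlighted, so neither this sketch nor the line-splitting approach will pick them up.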
