Looping through data efficiently with Scrapy

2 answers

User

Answer 1 · edited 2024-06-25 23:42:41

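In this case you need to locate each table separately, then iterate over its rows using selectors scoped to that specific element rather than to the whole response.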

def parse(self, response):
    # Select only the div.box blocks whose h2 header contains a league name:
    for match_table in [x for x in response.css("div.box") if x.css("div.table-header h2 a[name]")]:
        # Find the league name:
        competition_name = match_table.css("div.table-header img::attr(title)").extract_first()
        # Iterate over each tr tag inside the table that belongs to this div.box:
        for tr_tag in match_table.css("div.responsive-table tbody tr"):
            # Start the item with the league name:
            i = {"competition_name": competition_name}
            # Each row has a fixed number of cells, so list unpacking works here;
            # the two "_" targets discard cells that are not stored:
            i["Matchday"], i["Date"], i["Time"],_, i["Home team"],_, i["Away team"], i["System of play"], i["Coach"], i["Attendance"], i["Result"]\
                = [td.css("*::text").extract_first("").strip("\n\t") or td.css("a::text").extract_first("").strip("\n\t")
                   for td in tr_tag.css("td")]
            # Find the match report link and make it absolute:
            i["result_link"] = response.urljoin(tr_tag.css("td a[title='Match report']::attr(href)").extract_first())
            # Emit the item:
            yield i
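
To see the scoping idea in isolation, here is a minimal standalone sketch. It deliberately uses the standard library's xml.etree.ElementTree instead of Scrapy so it runs without a network or extra packages. The markup in SAMPLE is a hypothetical, heavily simplified stand-in for the real page, and FIELDS and rows_by_competition are names made up for this illustration. The point is only that every lookup inside the loop starts from the current box, not from the document root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: one box per competition, each with its own table.
SAMPLE = """
<html><body>
  <div class="box">
    <h2>Premier League</h2>
    <table><tbody>
      <tr><td>1</td><td>Newcastle</td><td>Man City</td></tr>
      <tr><td>2</td><td>Man City</td><td>Liverpool</td></tr>
    </tbody></table>
  </div>
  <div class="box">
    <h2>FA Cup</h2>
    <table><tbody>
      <tr><td>Third Round</td><td>Man City</td><td>Sheff Wed</td></tr>
    </tbody></table>
  </div>
</body></html>
"""

# Illustrative column names for the three cells of each sample row.
FIELDS = ("Matchday", "Home team", "Away team")


def rows_by_competition(markup):
    """Return one dict per table row, tagged with the name of its own box."""
    root = ET.fromstring(markup.strip())
    items = []
    for box in root.iter("div"):
        if box.get("class") != "box":
            continue
        # Relative lookup: only the h2 that is a child of this box is read.
        competition = box.findtext("h2", default="").strip()
        # Relative lookup again: only rows below this box are visited.
        for tr in box.findall(".//tbody/tr"):
            cells = [(td.text or "").strip() for td in tr.findall("td")]
            item = {"competition_name": competition}
            # zip pairs names with cells and stops at the shorter sequence.
            item.update(zip(FIELDS, cells))
            items.append(item)
    return items


if __name__ == "__main__":
    for row in rows_by_competition(SAMPLE):
        print(row)
```

Pairing names with cells through zip is a slightly more forgiving alternative to tuple unpacking: a row with an unexpected number of cells yields a partial item instead of raising ValueError. Whether that is preferable depends on whether you want malformed rows to fail loudly.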

Log output from running the spider:

2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'Group E', 'Date': 'Wed Sep 17, 2014', 'Time': '8:45 PM', 'Home team': 'Bayern Munich ', 'Away team': 'Man City', 'System of play': '4-4-1-1', 'Coach': 'Rubén Cousillas', 'Attendance': '68.000', 'Result': '1:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2490851'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'Group E', 'Date': 'Tue Sep 30, 2014', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'AS Roma', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '37.509', 'Result': '1:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2495295'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'Group E', 'Date': 'Tue Oct 21, 2014', 'Time': '6:00 PM', 'Home team': 'CSKA Moscow', 'Away team': 'Man City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': 'x', 'Result': '2:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2495310'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'Group E', 'Date': 'Wed Nov 5, 2014', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'CSKA Moscow', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '44.000', 'Result': '1:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2495334'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'Group E', 'Date': 'Tue Nov 25, 2014', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'Bayern Munich ', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '44.510', 'Result': '3:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2495344'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'Group E', 'Date': 'Wed Dec 10, 2014', 'Time': '8:45 PM', 'Home team': 'AS Roma', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '54.119', 'Result': '0:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2495366'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'last 16 1st leg', 'Date': 'Tue Feb 24, 2015', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'FC Barcelona', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.081', 'Result': '1:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2517513'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'UEFA Champions League', 'Matchday': 'last 16 2nd leg', 'Date': 'Wed Mar 18, 2015', 'Time': '8:45 PM', 'Home team': 'FC Barcelona', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '92.551', 'Result': '1:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2517521'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '1', 'Date': 'Sun Aug 17, 2014', 'Time': '5:00 PM', 'Home team': 'Newcastle', 'Away team': 'Man City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '50.816', 'Result': '0:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2460301'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '2', 'Date': 'Mon Aug 25, 2014', 'Time': '9:00 PM', 'Home team': 'Man City', 'Away team': 'Liverpool', 'System of play': '4-4-1-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.471', 'Result': '3:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2478488'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '3', 'Date': 'Sat Aug 30, 2014', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Stoke City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.622', 'Result': '0:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486491'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '4', 'Date': 'Sat Sep 13, 2014', 'Time': '1:45 PM', 'Home team': 'Arsenal', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '60.003', 'Result': '2:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486499'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '5', 'Date': 'Sun Sep 21, 2014', 'Time': '5:00 PM', 'Home team': 'Man City', 'Away team': 'Chelsea', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.602', 'Result': '1:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486518'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '6', 'Date': 'Sat Sep 27, 2014', 'Time': '4:00 PM', 'Home team': 'Hull City', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '22.859', 'Result': '2:4 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486522'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '7', 'Date': 'Sat Oct 4, 2014', 'Time': '6:30 PM', 'Home team': 'Aston Villa', 'Away team': 'Man City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '32.964', 'Result': '0:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486535'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '8', 'Date': 'Sat Oct 18, 2014', 'Time': '1:45 PM', 'Home team': 'Man City', 'Away team': 'Spurs', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.549', 'Result': '4:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486539'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '9', 'Date': 'Sat Oct 25, 2014', 'Time': '1:45 PM', 'Home team': 'West Ham', 'Away team': 'Man City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '34.977', 'Result': '2:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486554'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '10', 'Date': 'Sun Nov 2, 2014', 'Time': '2:30 PM', 'Home team': 'Man City', 'Away team': 'Man Utd', 'System of play': '4-4-1-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.358', 'Result': '1:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486566'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '11', 'Date': 'Sat Nov 8, 2014', 'Time': '6:30 PM', 'Home team': 'QPR', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '18.005', 'Result': '2:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486572'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '12', 'Date': 'Sat Nov 22, 2014', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Swansea', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.448', 'Result': '2:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486582'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '13', 'Date': 'Sun Nov 30, 2014', 'Time': '2:30 PM', 'Home team': 'Southampton', 'Away team': 'Man City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '30.919', 'Result': '0:3 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486597'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '14', 'Date': 'Wed Dec 3, 2014', 'Time': '8:45 PM', 'Home team': 'Sunderland', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '41.152', 'Result': '1:4 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486608'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '15', 'Date': 'Sat Dec 6, 2014', 'Time': '6:30 PM', 'Home team': 'Man City', 'Away team': 'Everton', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.603', 'Result': '1:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486612'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '16', 'Date': 'Sat Dec 13, 2014', 'Time': '4:00 PM', 'Home team': 'Leicester', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '31.643', 'Result': '0:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486624'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '17', 'Date': 'Sat Dec 20, 2014', 'Time': '1:45 PM', 'Home team': 'Man City', 'Away team': 'Crystal Palace', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.302', 'Result': '3:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486632'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '18', 'Date': 'Fri Dec 26, 2014', 'Time': '4:00 PM', 'Home team': 'West Brom', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '26.040', 'Result': '1:3 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486648'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '19', 'Date': 'Sun Dec 28, 2014', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Burnley', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.608', 'Result': '2:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486652'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '20', 'Date': 'Thu Jan 1, 2015', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Sunderland', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.367', 'Result': '3:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486662'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '21', 'Date': 'Sat Jan 10, 2015', 'Time': '4:00 PM', 'Home team': 'Everton', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '39.499', 'Result': '1:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486673'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '22', 'Date': 'Sun Jan 18, 2015', 'Time': '5:00 PM', 'Home team': 'Man City', 'Away team': 'Arsenal', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.596', 'Result': '0:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486683'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '23', 'Date': 'Sat Jan 31, 2015', 'Time': '6:30 PM', 'Home team': 'Chelsea', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '41.620', 'Result': '1:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486690'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '24', 'Date': 'Sat Feb 7, 2015', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Hull City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.233', 'Result': '1:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486703'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '25', 'Date': 'Wed Feb 11, 2015', 'Time': '8:45 PM', 'Home team': 'Stoke City', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '27.011', 'Result': '1:4 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486717'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '26', 'Date': 'Sat Feb 21, 2015', 'Time': '6:30 PM', 'Home team': 'Man City', 'Away team': 'Newcastle', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.602', 'Result': '5:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486724'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '27', 'Date': 'Sun Mar 1, 2015', 'Time': '1:00 PM', 'Home team': 'Liverpool', 'Away team': 'Man City', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '44.590', 'Result': '2:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486732'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '28', 'Date': 'Wed Mar 4, 2015', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'Leicester', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.000', 'Result': '2:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486745'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '29', 'Date': 'Sat Mar 14, 2015', 'Time': '6:30 PM', 'Home team': 'Burnley', 'Away team': 'Man City', 'System of play': '4-4-2', 'Coach': 'Manuel Pellegrini', 'Attendance': '21.216', 'Result': '1:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486750'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '30', 'Date': 'Sat Mar 21, 2015', 'Time': '1:45 PM', 'Home team': 'Man City', 'Away team': 'West Brom', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.018', 'Result': '3:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486762'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '31', 'Date': 'Mon Apr 6, 2015', 'Time': '9:00 PM', 'Home team': 'Crystal Palace', 'Away team': 'Man City', 'System of play': '4-4-2', 'Coach': 'Manuel Pellegrini', 'Attendance': '24.718', 'Result': '2:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486772'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '32', 'Date': 'Sun Apr 12, 2015', 'Time': '5:00 PM', 'Home team': 'Man Utd', 'Away team': 'Man City', 'System of play': '4-4-1-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '75.313', 'Result': '4:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486781'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '33', 'Date': 'Sun Apr 19, 2015', 'Time': '2:30 PM', 'Home team': 'Man City', 'Away team': 'West Ham', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.041', 'Result': '2:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486796'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '34', 'Date': 'Sat Apr 25, 2015', 'Time': '6:30 PM', 'Home team': 'Man City', 'Away team': 'Aston Villa', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.036', 'Result': '3:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486803'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '35', 'Date': 'Sun May 3, 2015', 'Time': '5:00 PM', 'Home team': 'Spurs', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '35.784', 'Result': '0:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486817'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '36', 'Date': 'Sun May 10, 2015', 'Time': '2:30 PM', 'Home team': 'Man City', 'Away team': 'QPR', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '44.564', 'Result': '6:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486826'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '37', 'Date': 'Sun May 17, 2015', 'Time': '2:30 PM', 'Home team': 'Swansea', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '20.670', 'Result': '2:4 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486835'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Premier League', 'Matchday': '38', 'Date': 'Sun May 24, 2015', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Southampton', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '45.919', 'Result': '2:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2486846'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'Community Shield', 'Matchday': 'Final', 'Date': 'Sun Aug 10, 2014', 'Time': '4:00 PM', 'Home team': 'Arsenal', 'Away team': 'Man City', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '71.523', 'Result': '3:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2458586'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'FA Cup', 'Matchday': 'Third Round', 'Date': 'Sun Jan 4, 2015', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Sheff Wed', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '44.309', 'Result': '2:1 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2517326'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'FA Cup', 'Matchday': 'Fourth Round', 'Date': 'Sat Jan 24, 2015', 'Time': '4:00 PM', 'Home team': 'Man City', 'Away team': 'Middlesbrough', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '44.836', 'Result': '0:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2527276'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'EFL Cup', 'Matchday': 'Third Round', 'Date': 'Wed Sep 24, 2014', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'Sheff Wed', 'System of play': '4-2-3-1', 'Coach': 'Manuel Pellegrini', 'Attendance': '32.346', 'Result': '7:0 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2490248'}
2020-03-10 11:45:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.transfermarkt.com/manchester-city/spielplan/verein/281/saison_id/2014/plus/1>
{'competition_name': 'EFL Cup', 'Matchday': 'Round of 16', 'Date': 'Wed Oct 29, 2014', 'Time': '8:45 PM', 'Home team': 'Man City', 'Away team': 'Newcastle', 'System of play': '4-4-2 double 6', 'Coach': 'Manuel Pellegrini', 'Attendance': '40.752', 'Result': '0:2 ', 'result_link': 'https://www.transfermarkt.com/spielbericht/index/spielbericht/2500670'}
2020-03-10 11:45:26 [scrapy.core.engine] INFO: Closing spider (finished)
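
The values in the log are still raw strings: some carry trailing spaces ('Bayern Munich ', '1:0 '), attendance is written with a dot ('68.000'), and one row has 'x' instead of a number. If you want typed values, a small post-processing step is one option. The sketch below is not part of the original answer. clean_item and the extra "kickoff" key are hypothetical names, and it assumes the dot is a thousands separator and that 'x' means the attendance is unknown.

```python
from datetime import datetime


def clean_item(item):
    """Return a copy of a scraped item with trimmed strings and typed fields."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in item.items()}

    # Assumption: "." is a thousands separator, and anything that is not
    # purely numeric afterwards (such as "x") means "unknown".
    digits = str(cleaned.get("Attendance") or "").replace(".", "")
    cleaned["Attendance"] = int(digits) if digits.isdigit() else None

    # "kickoff" is a hypothetical extra key combining Date and Time.
    # %a and %b match English names under Python's default C locale.
    stamp = f"{cleaned.get('Date', '')} {cleaned.get('Time', '')}"
    try:
        cleaned["kickoff"] = datetime.strptime(stamp, "%a %b %d, %Y %I:%M %p")
    except ValueError:
        cleaned["kickoff"] = None
    return cleaned


# A shortened copy of the first item from the log output.
sample = {
    "competition_name": "UEFA Champions League",
    "Date": "Wed Sep 17, 2014",
    "Time": "8:45 PM",
    "Home team": "Bayern Munich ",
    "Away team": "Man City",
    "Attendance": "68.000",
    "Result": "1:0 ",
}

if __name__ == "__main__":
    print(clean_item(sample))
```

In a Scrapy project this kind of cleanup would more naturally live in an item pipeline than in the parse method, so the spider stays focused on extraction.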

User

Answer 2 · edited 2024-06-25 23:42:41

A loop-free solution in R:

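library(XML)
library(stringr)

# Parse the saved HTML page.
data=htmlParse("path to your data")

# Text of every table cell that does not contain an image, reshaped into 9-column rows.
raw=str_trim(xpathSApply(data,"//div[@class='responsive-table']/table/tbody/tr/td[not(.//img)]",xmlValue),side = 'both')
df=as.data.frame(matrix(raw, ncol = 9,  byrow = TRUE), stringsAsFactors = FALSE)
# Use the header cells of the first table as column names.
names(df)=xpathSApply(data,"(//div[@class='responsive-table'])[1]//th",xmlValue)

# The same cells again, this time together with the competition headers (h2/a[@name]).
dummy=str_trim(xpathSApply(data,"//div[@class='responsive-table']/table/tbody/tr/td[not(.//img)]|//h2/a[@name]",xmlValue),side = 'both')
comp=str_trim(xpathSApply(data,"//h2/a[@name]",xmlValue),side = 'both')

# Position of each header in the combined vector, plus a sentinel one past the end.
sta=c(match(comp,dummy),length(dummy)+1)
# Rows per competition: cells between consecutive headers divided by 9.
stb=(diff(sta)-1)/9
# Repeat each competition name once per row and attach it as a column.
Competition=rep(comp,stb)
df=cbind(df,Competition)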
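
For readers who do not use R, the index arithmetic behind sta, stb and rep can be sketched in Python. This is an illustration of the same idea rather than a translation of the whole answer: competition_column is a hypothetical name, the input is a plain list standing in for the dummy vector, and the demo uses 2 cells per row instead of 9 to keep the data short.

```python
from itertools import chain, repeat


def competition_column(stream, competitions, ncol=9):
    """Return one competition name per table row.

    stream is the flat sequence of header names and cell texts in document
    order; competitions lists the header names in the same order.
    """
    # Position of each header in the stream, plus a sentinel at the end
    # (the counterpart of `sta` in the R code, but 0-based).
    starts = [stream.index(name) for name in competitions] + [len(stream)]
    # Cells between two consecutive headers, divided by the row width,
    # give the number of rows per competition (the counterpart of `stb`).
    counts = [(b - a - 1) // ncol for a, b in zip(starts, starts[1:])]
    # Repeat each name once per row (the counterpart of `rep(comp, stb)`).
    return list(chain.from_iterable(repeat(n, c) for n, c in zip(competitions, counts)))


if __name__ == "__main__":
    # Two cells per row here, purely to keep the example small.
    stream = [
        "Premier League", "1", "Newcastle", "2", "Man City",
        "FA Cup", "Third Round", "Man City",
    ]
    print(competition_column(stream, ["Premier League", "FA Cup"], ncol=2))
```

Like match in the R version, list.index returns the first occurrence, so the approach assumes no table cell has exactly the same text as a competition header that appears later in the stream.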
