Scraping the Fed speech calendar with Python

for day in dates:
    # find the title of each entry in the date
    supertable = table.find_all("dt", text=day)
    # print(supertable)
    for i in range(len(supertable)):
        # for each date-entry, find description
        subtable = supertable[i].findNext("dd")
        print(supertable[i])
        print(subtable)
        bm = subtable.find("span", class_="boardMember")
        ti = subtable.find("span", class_="time")
        ev = subtable.find("span", class_="event")

<div id="ecb-content-col" >
  <main>
    <h1>Weekly schedule of public speaking engagements and other activities</h1>
    <h3>Friday, 17 July 2020 - Sunday, 26 July 2020</h3>
    <dl class="ecb-basicList">
      <dt >Monday, 20 Jul 2020</dt>
      <dd>
        Event:Euro area monthly balance of payments (Dataset: BP6)
        Time:10:00 CET
        Info website:<a class="arrow" href="https://www.ecb.europa.eu/press/pr/stats/bop/html/index.en.html" target="_self">https://www.ecb.europa.eu/press/pr/stats/bop/html/index.en.html</a>
        Last modified: 20 July 2020, 11:05 CET
      </dd>
      <dt >Monday, 20 Jul 2020</dt>
      <dd>
        Board member:Luis de Guindos
        Event:Participation by Mr de Guindos in the panel "La respuesta europea frente a la crisis" organised by Universidad Complutense de Madrid as part of the Cursos de verano de El Escorial
        Time:10:00 CET
        Venue:Real Colegio Universitario María Cristina. Paseo de los Alamillos, 2, 28200 San Lorenzo de El Escorial, Madrid, Spain
        Contact:Esther Tejedor - ECB Global Media Relations - Tel: +49 69 1344 95596 - Mob: +49 172 5171280
        E-mail:<a class="mail" href="mailto:esther.tejedor@ecb.europa.eu">esther.tejedor@ecb.europa.eu</a>
        Info website:<a class="external" href="www.pp.es" target="_blank">www.pp.es</a>
        Text:No text will be made available.
        Notes:The event will be streamed in Spanish via the above-mentioned link.
        Last modified: 20 July 2020, 11:05 CET
      </dd>
    </dl>
    <script type="text/javascript"> var currentContentsUrl = "/press/calendars/weekly/html/index_content.en.html"; </script>
  </main>
</div>

1 answer

User

#1 · Posted 2024-09-29 21:27:46

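The Fed calendar page is loaded dynamically with JavaScript, so parsing the page's HTML directly will not find the events and a different approach is needed. In your browser's developer tools, you can see the link to the resource that actually contains the data. Once you have that link and send a request to it, things become much simpler: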

import requests
import json
import pandas as pd

cookies = {
    'BIGipServerwww.federalreserve.gov_hsts.app~www.federalreserve.gov_hsts_pool': '!XzbhBUzoOQRgHRNSiGDasURiAFpsPA28LjvywchJo0mMdcFUyd/2zqN601BqfWI2JmSmmNuETixO1A==',
}

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive',
    'Referer': 'https://www.federalreserve.gov/newsevents/calendar.htm',
    'Cache-Control': 'max-age=0',
}

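response = requests.get('https://www.federalreserve.gov/json/calendar.json', headers=headers, cookies=cookies)
response.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
cal = json.loads(response.text)
df = pd.DataFrame(cal['events'])  # one row per calendar event
print(df)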

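The output is the table you are looking for. You may need to clean it up a little (drop irrelevant columns and so on) to get it into the final shape you want.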
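As a rough illustration of that cleanup step, here is a minimal sketch that filters the rows and keeps only a few columns. The helper `tidy_events` and the sample field names (`title`, `month`, `days`, `time`, `type`, `link`) are assumptions made up for the example, not the feed's confirmed schema, so inspect `df.columns` on the real data and adjust accordingly.

```python
import pandas as pd

# Hypothetical sample shaped like the parsed response: a dict holding an
# "events" list. The field names are illustrative assumptions only.
sample_cal = {
    "events": [
        {"title": "Speech A", "month": "2020-07", "days": "20",
         "time": "10:00 a.m.", "type": "Speeches", "link": "/a.htm"},
        {"title": "Data release B", "month": "2020-07", "days": "21",
         "time": "4:15 p.m.", "type": "Other", "link": "/b.htm"},
        {"title": "Speech C", "month": "2020-07", "days": "22",
         "time": "1:00 p.m.", "type": "Speeches", "link": "/c.htm"},
    ]
}


def tidy_events(cal, keep, event_type=None, type_column="type"):
    """Build a DataFrame from cal["events"], optionally filter rows by
    event type, and keep only the requested columns that actually exist."""
    df = pd.DataFrame(cal["events"])
    if event_type is not None and type_column in df.columns:
        df = df[df[type_column] == event_type]
    # Silently skip requested columns that are absent from the data.
    present = [col for col in keep if col in df.columns]
    return df[present].reset_index(drop=True)


speeches = tidy_events(
    sample_cal,
    keep=["title", "month", "days", "time", "not_a_real_column"],
    event_type="Speeches",
)
print(speeches)
```

With the real response, the same call would take `cal` in place of `sample_cal`. Skipping missing columns rather than raising keeps the helper usable while you are still exploring which fields the feed provides.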
