python：在数据帧中排列

1条回答

网友

1楼 · 发布于 2024-10-04 09:25:57

Pandas有一个解析HTML的函数：^{}

您可以尝试以下操作：

# Modules
import pandas as pd
import requests

# HTML content
finviz = requests.get('https://finviz.com/screener.ashx?v=152&o=ticker&c=0,1,2,3,4,5,6,7,10,11,12,14,16,17,19,21,22,23,24,25,31,32,33,38,41,48,65,66,67&r=1')

# Convert to dataframe
df = pd.read_html(finviz.content)[-2]

# Set 1st row to columns names
df.columns = df.iloc[0]
# Drop 1st row
df = df.drop(df.index[0])

# df = df.set_index('No.')
print(df)
# 0  No. Ticker                                     Company            Sector                               Industry      Country  ... Debt/Eq Profit M  Beta   Price  Change    Volume
# 1    1      A                  Agilent Technologies, Inc.        Healthcare        Medical Laboratories & Research          USA  ...    0.51   14.60 % 1.20   72.47 - 0.28 % 177333
# 2    2     AA                           Alcoa Corporation   Basic Materials                               Aluminum          USA  ...    0.44 - 10.80 % 2.03    6.28   3.46 % 3021371
# 3    3   AAAU                Perth Mint Physical Gold ETF         Financial                   Exchange Traded Fund          USA  ... - - - 16.08 - 0.99 % 45991
# 4    4   AACG                       ATA Creativity Global          Services          Education & Training Services        China  ...    0.02 - 2.96    0.95 - 0.26 % 6177
# 5    5   AADR         AdvisorShares Dorsey Wright ADR ETF         Financial                   Exchange Traded Fund          USA  ... - - - 40.80   0.22 % 1605
# 6    6    AAL                American Airlines Group Inc.          Services                         Major Airlines          USA  ... - 3.70 % 1.83   12.81   4.57 % 16736506
# 7    7   AAMC     Altisource Asset Management Corporation         Financial                       Asset Management          USA  ... - -17.90 % 0.78   12.28   0.00 % 0
# 8    8   AAME               Atlantic American Corporation         Financial                         Life Insurance          USA  ...    0.28 - 0.40 % 0.29    2.20   3.29 % 26
# 9    9    AAN                               Aaron's, Inc.          Services              Rental & Leasing Services          USA  ...    0.20    0.80 % 1.23   22.47 - 0.35 % 166203
# 10  10   AAOI               Applied Optoelectronics, Inc.        Technology    Semiconductor - Integrated Circuits          USA  ...    0.49 - 34.60 % 2.02    7.80   2.63 % 61303
# 11  11   AAON                                  AAON, Inc.  Industrial Goods             General Building Materials          USA  ...    0.02   11.40 % 0.88   48.60   0.71 % 20533
# 12  12    AAP                    Advance Auto Parts, Inc.          Services                      Auto Parts Stores          USA  ...    0.21    5.00 % 1.04   95.94 - 0.58 % 165445
# 13  13   AAPL                                  Apple Inc.    Consumer Goods                   Electronic Equipment          USA  ...    1.22   21.50 % 1.19  262.39   2.97 % 11236642
# 14  14    AAT                 American Assets Trust, Inc.         Financial                          REIT - Retail          USA  ...    1.03   12.50 % 0.99   25.35   2.78 % 30158
# 15  15    AAU                       Almaden Minerals Ltd.   Basic Materials                                   Gold       Canada  ...    0.04 - 0.53    0.28 - 1.43 % 34671
# 16  16   AAWW          Atlas Air Worldwide Holdings, Inc.          Services                    Air Services, Other          USA  ...    1.33 - 10.70 % 1.65   22.79   2.70 % 56521
# 17  17   AAXJ  iShares MSCI All Country Asia ex Japan ETF         Financial                   Exchange Traded Fund          USA  ... - - - 60.13   1.18 % 161684
# 18  18   AAXN                       Axon Enterprise, Inc.  Industrial Goods  Aerospace/Defense Products & Services          USA  ...    0.00    0.20 % 0.77   71.11   2.37 % 187899
# 19  19     AB              AllianceBernstein Holding L.P.         Financial                       Asset Management          USA  ...    0.00   89.60 % 1.35   19.15   1.84 % 54588
# 20  20    ABB                                     ABB Ltd  Industrial Goods                  Diversified Machinery  Switzerland  ...    0.67    5.10 % 1.10   17.44   0.52 % 723739

# [20 rows x 29 columns]

如果HTML页面结构发生变化，我可以让您改进数据选择！父div id可能有用

解释“[-2]”：该read_html返回数据帧列表：

list_df = pd.read_html(finviz.content)
print(type(list_df))
# <class 'list'>

# Elements types in the lists
print(type(list_df [0]))
# <class 'pandas.core.frame.DataFrame' >

因此，为了获得所需的数据帧，我选择了结尾前的第二个元素[-2]。这个discussion解释了负索引

相关问题更多 >

编程相关推荐

热门问题

热门文章

python：在数据帧中排列

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >