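<p>You can actually simplify this code a great deal and still get the result you want (and make it easier to adjust in the future!).</p>
<p>The finished code is here, with a more detailed explanation below:</p>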
<pre><code>import pandas as pd
import requests

# Download the fundamentals for AAPL
content = requests.get(r'https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX')

# Income statement: one row per quarter, keep only the columns of interest
income_data = content.json()['Financials']['Income_Statement']['quarterly']
income = pd.DataFrame.from_dict(income_data).transpose().set_index("date")
income = income[['ebit']]

# Balance sheet: same pattern
balance_data = content.json()['Financials']['Balance_Sheet']['quarterly']
balance = pd.DataFrame.from_dict(balance_data).transpose().set_index("date")
balance = balance[['totalAssets', 'cash', 'totalCurrentAssets', 'totalCurrentLiabilities']]

# Join the two tables on the date index and replace missing values with 0
financials = income.merge(balance, left_index = True, right_index = True).fillna(0)
</code></pre>
<p>The <code>financials</code> DataFrame looks like this (only the 2005-2009 data is shown):</p>
<pre><code>| date | ebit | totalAssets | cash | totalCurrentAssets | totalCurrentLiabilities |
|:---|---:|---:|---:|---:|---:|
| 2009-12-26 | 4.758e+09 | 5.3926e+10 | 7.609e+09 | 3.3332e+10 | 1.3097e+10 |
| 2009-09-26 | 0 | 4.7501e+10 | 5.263e+09 | 3.1555e+10 | 1.1506e+10 |
| 2009-06-27 | 1.732e+09 | 4.814e+10 | 5.605e+09 | 3.517e+10 | 1.6661e+10 |
| 2009-03-31 | 0 | 4.3237e+10 | 4.466e+09 | 0 | 1.3751e+10 |
| 2008-12-31 | 0 | 4.2787e+10 | 7.236e+09 | 0 | 1.4757e+10 |
| 2008-09-30 | 0 | 3.9572e+10 | 1.1875e+10 | 0 | 1.4092e+10 |
| 2008-06-30 | 0 | 3.1709e+10 | 9.373e+09 | 0 | 9.218e+09 |
| 2008-03-31 | 0 | 3.0471e+10 | 9.07e+09 | 0 | 9.634e+09 |
| 2007-12-31 | 0 | 3.0039e+10 | 9.162e+09 | 0 | 1.0535e+10 |
| 2007-09-30 | 0 | 2.5347e+10 | 9.352e+09 | 0 | 9.299e+09 |
| 2007-06-30 | 0 | 2.1647e+10 | 7.118e+09 | 0 | 6.992e+09 |
| 2007-03-31 | 0 | 1.8711e+10 | 7.095e+09 | 0 | 5.485e+09 |
| 2006-12-31 | 0 | 1.9461e+10 | 7.159e+09 | 0 | 7.337e+09 |
| 2006-09-30 | 0 | 1.7205e+10 | 6.392e+09 | 0 | 6.471e+09 |
| 2006-06-30 | 0 | 1.5114e+10 | 0 | 0 | 5.023e+09 |
| 2006-03-31 | 0 | 1.3911e+10 | 0 | 0 | 4.456e+09 |
| 2005-12-31 | 0 | 1.4181e+10 | 0 | 0 | 5.06e+09 |
| 2005-09-30 | 0 | 1.1551e+10 | 3.491e+09 | 0 | 3.484e+09 |
| 2005-06-30 | 0 | 1.0488e+10 | 0 | 0 | 3.123e+09 |
| 2005-03-31 | 0 | 1.0111e+10 | 0 | 0 | 3.352e+09 |
</code></pre>
<hr/>
<p>The result of <code>content.json()['Financials']['Income_Statement']['quarterly']</code> is a dictionary in which each key is a date and each value is a second dictionary containing the column data:</p>
<pre><code>{'2005-03-31': {'date': '2005-03-31',
'filing_date': None,
'currency_symbol': 'USD',
'researchDevelopment': '120000000.00',
...},
'2005-06-30': {...},
...}
</code></pre>
<p>Because of this, you can build the DataFrame directly with</p>
<p><code>pd.DataFrame.from_dict(income_data).transpose().set_index("date")</code></p>
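<p>The transpose is necessary because of the structure of the JSON. Pandas expects a dictionary of the form <code>{'column name': data}</code>. Since the keys here are dates, you first get a DataFrame whose rows are labeled "totalAssets", "cash", and so on, and whose columns are dates. The <code>transpose()</code> call flips rows and columns to give the layout you need. <em>The final <code>.set_index("date")</code> call uses the "date" field rather than the initial key dates, for consistency and to give the index a name. It is entirely optional.</em></p>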
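<p>To see the orientation issue in isolation, here is a minimal sketch on a made-up two-quarter sample (the values are invented, not taken from the API). It also shows <code>orient="index"</code>, an alternative to the transpose that the code above does not use:</p>

```python
import pandas as pd

# Made-up sample that mimics the nested {date: {field: value}} layout shown above.
sample = {
    "2005-03-31": {"date": "2005-03-31", "ebit": "100.00", "currency_symbol": "USD"},
    "2005-06-30": {"date": "2005-06-30", "ebit": "150.00", "currency_symbol": "USD"},
}

# Default orientation: outer keys (dates) become columns, inner keys become rows.
raw = pd.DataFrame.from_dict(sample)

# transpose() flips the table so that each date is a row.
via_transpose = raw.transpose().set_index("date")

# orient="index" treats the outer keys as row labels, so no transpose is needed.
via_orient = pd.DataFrame.from_dict(sample, orient="index").set_index("date")

print(raw)
print(via_transpose)
```

<p>Both routes should give the same table, so which one you use is a matter of taste.</p>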
<p>This DataFrame now contains every column from the JSON file, but you are only interested in a few of them. The code</p>
<p><code>income = income[['ebit']]</code></p>
<p>selects only the relevant columns from the data.</p>
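<p>The double brackets matter here: passing a list of names returns a DataFrame, which has the <code>merge</code> method used later, while a single name returns a Series. A small sketch with made-up numbers (the <code>netIncome</code> column is only there for illustration):</p>

```python
import pandas as pd

# Made-up values, only to show the difference between the two selection styles.
income = pd.DataFrame(
    {"ebit": [100.0, 150.0], "netIncome": [80.0, 120.0]},
    index=pd.Index(["2005-03-31", "2005-06-30"], name="date"),
)

as_frame = income[["ebit"]]  # list of names -> DataFrame with one column
as_series = income["ebit"]   # single name   -> Series

print(type(as_frame).__name__, as_frame.shape)    # DataFrame (2, 1)
print(type(as_series).__name__, as_series.shape)  # Series (2,)
```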
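<p>Since you are pulling data from two different sources, you really do need to create two separate tables. This has the added benefit of making it clearer which columns come from the "Income Statement" and which come from the "Balance Sheet".</p>
<p>The final line</p>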
<p><code>financials = income.merge(balance, left_index = True, right_index = True).fillna(0)</code></p>
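<p>merges the two tables together using the index (in this case, the "date" column). <code>fillna(0)</code> ensures that any missing data is replaced with a zero value, as you requested.</p>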
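<p>One detail worth knowing: <code>merge</code> defaults to an inner join, so a date that appears in only one table is dropped. The sketch below uses made-up tables with partially overlapping dates to contrast that with <code>how="outer"</code>. Whether you want the outer join is an assumption about your use case, not something the original code does:</p>

```python
import pandas as pd

# Made-up tables whose dates only partially overlap.
income = pd.DataFrame(
    {"ebit": [1.0, 2.0]},
    index=pd.Index(["2009-06-27", "2009-09-26"], name="date"),
)
balance = pd.DataFrame(
    {"cash": [5.0, None]},
    index=pd.Index(["2009-09-26", "2009-12-26"], name="date"),
)

# Default how="inner": only dates present in both tables survive.
inner = income.merge(balance, left_index=True, right_index=True).fillna(0)

# how="outer": every date from either table is kept; gaps become NaN, then 0.
outer = income.merge(
    balance, left_index=True, right_index=True, how="outer"
).fillna(0)

print(inner)
print(outer)
```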
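<p>If you eventually need to add another table, for example "Cash Flow", you can create it and select the relevant columns with the same lines of code, then add a second merge line:</p>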
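<pre><code># The key and column names below are the ones used in this example;
# swap in the statement and columns you actually need
cashflow_data = content.json()['Financials']['Balance_Sheet']['quarterly']
cashflow = pd.DataFrame.from_dict(cashflow_data).transpose().set_index("date")
cashflow = cashflow[['accountsPayable', 'liabilitiesAndStockholdersEquity']]
...
# merge returns a new DataFrame, so assign the result back
financials = financials.merge(cashflow, left_index = True, right_index = True).fillna(0)
</code></pre>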
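<p>If the number of tables keeps growing, the repeated lines could be folded into a small function. The following is only a sketch: <code>statement_frame</code> is a hypothetical helper name, the payload is made up to imitate the nested layout of the response, and the <code>pd.to_numeric</code> conversion and <code>pd.concat</code> call are additions that the original code does not use.</p>

```python
import pandas as pd


def statement_frame(payload, statement, columns, period="quarterly"):
    """Hypothetical helper: build one table from an already-parsed payload."""
    data = payload["Financials"][statement][period]
    frame = pd.DataFrame.from_dict(data, orient="index").set_index("date")
    # The sample values are strings, so convert them to numbers
    # (an addition; the original code does not do this).
    return frame[columns].apply(pd.to_numeric)


# Made-up payload that imitates the nested layout of the response.
payload = {
    "Financials": {
        "Income_Statement": {"quarterly": {
            "2009-12-26": {"date": "2009-12-26", "ebit": "10"},
            "2009-09-26": {"date": "2009-09-26", "ebit": None},
        }},
        "Balance_Sheet": {"quarterly": {
            "2009-12-26": {"date": "2009-12-26", "cash": "7", "totalAssets": "50"},
            "2009-09-26": {"date": "2009-09-26", "cash": "5", "totalAssets": "40"},
        }},
    }
}

# Which columns to take from which statement.
wanted = {"Income_Statement": ["ebit"], "Balance_Sheet": ["totalAssets", "cash"]}
tables = [statement_frame(payload, name, cols) for name, cols in wanted.items()]

# concat(axis=1) aligns all tables on their shared "date" index in one step.
financials = pd.concat(tables, axis=1).fillna(0)
print(financials)
```

<p>With this shape, adding another statement means adding one entry to <code>wanted</code> rather than three more lines and another merge.</p>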
<hr/>
<p>As a bonus tip, there is quite a lot of data in the source JSON! To see the columns available in any given table, use the following command:</p>
<p><code>cashflow.columns.sort_values()</code></p>
<p>This returns the column names in alphabetical order:</p>
<pre><code> ['accountsPayable', 'accumulatedAmortization', 'accumulatedDepreciation',
'accumulatedOtherComprehensiveIncome', 'additionalPaidInCapital',
'capitalLeaseObligations', 'capitalSurpluse', 'cash',
'cashAndShortTermInvestments', 'commonStock',
'commonStockSharesOutstanding', 'commonStockTotalEquity',
'currency_symbol', 'deferredLongTermAssetCharges',
'deferredLongTermLiab', 'filing_date', 'goodWill', 'intangibleAssets',
'inventory', 'liabilitiesAndStockholdersEquity', 'longTermDebt',
'longTermDebtTotal', 'longTermInvestments', 'negativeGoodwill',
'netReceivables', 'netTangibleAssets', 'nonCurrentAssetsTotal',
'nonCurrentLiabilitiesOther', 'nonCurrentLiabilitiesTotal',
'nonCurrrentAssetsOther', 'noncontrollingInterestInConsolidatedEntity',
'otherAssets', 'otherCurrentAssets', 'otherCurrentLiab', 'otherLiab',
'otherStockholderEquity', 'preferredStockRedeemable',
'preferredStockTotalEquity', 'propertyPlantAndEquipmentGross',
'propertyPlantEquipment', 'retainedEarnings',
'retainedEarningsTotalEquity', 'shortLongTermDebt', 'shortTermDebt',
'shortTermInvestments',
'temporaryEquityRedeemableNoncontrollingInterests', 'totalAssets',
'totalCurrentAssets', 'totalCurrentLiabilities', 'totalLiab',
'totalPermanentEquity', 'totalStockholderEquity', 'treasuryStock',
'warrants']
</code></pre>
<p>This is also very useful when there are typos in the data, as in "capitalSurpluse" above.</p>
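<p>When the list is long, searching it can be quicker than reading it. A minimal sketch, using an empty made-up table that carries a few of the column names listed above. The fuzzy lookup relies on <code>difflib</code> from the standard library and is a suggestion, not part of the original code:</p>

```python
import difflib

import pandas as pd

# A few of the column names listed above, on an empty made-up table.
balance = pd.DataFrame(columns=["capitalSurpluse", "cash", "goodWill", "totalAssets"])

# Case-insensitive substring search.
matches = [col for col in balance.columns if "capital" in col.lower()]

# Fuzzy lookup: find the closest real column to the spelling you expected.
close = difflib.get_close_matches("capitalSurplus", list(balance.columns), n=1)

print(matches)  # ['capitalSurpluse']
print(close)    # ['capitalSurpluse']
```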