<p>In case anyone is interested, here is how I solved this problem.</p>
<p>First, we need to fetch all the data from the Sheets API.</p>
<pre class="lang-py prettyprint-override"><code># names of the tabs I want to get
ranges = ['tab1', 'tab2']
# call the Sheets API; `service` is the Sheets API client and
# `document` is the spreadsheet ID
request = service.spreadsheets().values().batchGet(spreadsheetId=document, ranges=ranges)
response = request.execute()
</code></pre>
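<p>For reference, here is a minimal sketch of what the <code>batchGet</code> response looks like. The values are made up for illustration. The point is that the Sheets API leaves out trailing empty cells, so a row can come back shorter than the header row.</p>

```python
# Hypothetical sample shaped like a values.batchGet response (data is invented).
sample_response = {
    "spreadsheetId": "example-id",
    "valueRanges": [
        {
            "range": "tab1!A1:C3",
            "majorDimension": "ROWS",
            "values": [
                ["name", "email", "notes"],   # header row
                ["Ann", "ann@example.com"],   # trailing empty cell omitted
                ["Bob"],                      # two trailing empty cells omitted
            ],
        }
    ],
}

# Row lengths per range: a mismatch with the header length means padding is needed.
row_lengths = {
    value_range["range"]: [len(row) for row in value_range["values"]]
    for value_range in sample_response["valueRanges"]
}
print(row_lengths)
```

<p>Passing rows of unequal length like these straight to <code>pd.DataFrame</code> with the header as <code>columns</code> is what causes trouble, hence the padding step.</p>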
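<p>Next, I want to go through each row and make sure every row's list contains the same number of elements as the first row, which holds the column headings.</p>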
<pre class="lang-py prettyprint-override"><code>import pandas as pd

# response is the response from the Google Sheets API call above.
# It contains the column headings and the data from every row.
# "valueRanges" is the key to access the data.
def extract_case_data(response, keyword):
    for obj in response["valueRanges"]:
        # pick the first range whose name contains the keyword
        if keyword in obj["range"]:
            values = pad_data(obj["values"])
            df = pd.DataFrame(values[1:], columns=values[0])
            return df
    return None
</code></pre>
<p>Finally, here is the function that pads the data.</p>
<pre class="lang-py prettyprint-override"><code>def pad_data(data: list):
    # build a new list starting with the column heading row;
    # this is the list which we will return
    return_data = [data[0]]
    for row in data[1:]:
        difference = len(data[0]) - len(row)
        # copy the row so the API response is not modified in place
        new_row = list(row)
        # append None to the rows which are shorter
        # than the column heading row
        for count in range(difference):
            new_row.append(None)
        return_data.append(new_row)
    return return_data
</code></pre>
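<p>A quick way to sanity-check the padding on made-up data. <code>pad_rows</code> is a hypothetical, more compact variant of <code>pad_data</code>, written out here so the snippet runs on its own. Treat it as a sketch, not a replacement.</p>

```python
def pad_rows(data):
    """Return a copy of data where every row is at least as long as the header row."""
    if not data:
        return []
    width = len(data[0])
    # [None] * n is an empty list when n <= 0, so longer rows are left as they are
    return [list(row) + [None] * (width - len(row)) for row in data]


# Invented sample rows: a header row followed by two short rows.
rows = [["name", "email", "notes"], ["Ann", "ann@example.com"], ["Bob"]]
padded = pad_rows(rows)
print(padded)
```

<p>The padded result can then be passed to <code>pd.DataFrame(padded[1:], columns=padded[0])</code>, as in <code>extract_case_data</code>.</p>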
<p>I am certainly not claiming this is the best or most elegant solution, but it did the job for me.</p>
<p>Hope this helps someone.</p>