NeuroMorpho.org: getting results from multiple API pages

Posted 2024-09-28 17:02:53


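Apologies in advance: this is my first post and I am new to coding in Python. I want to use the NeuroMorpho API (http://neuromorpho.org/apiReference.html) to find and retrieve information about certain neurons (with filters added to the query string).

I used the following code: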

import requests
import json
import csv
import pandas as pd
from pandas import DataFrame
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

response = requests.get("http://neuromorpho.org/api")
response

query = (
    "http://neuromorpho.org/api/neuron/select?q=species:rat&fq=brain_region:hippocampus, CA1&fq=experiment_condition:Control&fq=cell_type:Pyramidal, principal cell"
)

response = requests.get(query)
json_data = response.json()
rat_data = json_data
rat_data

I get a large amount of data back, and at the end it says:

'page': {'size': 50, 'totalElements': 1115, 'totalPages': 23, 'number': 0}}

Next, I wanted to build a dictionary from that data, using the following code:

df_dict = {}
df_dict['NeuronID'] = []
df_dict['Archive'] = []
df_dict['Strain'] = []
df_dict['Cell'] = []
df_dict['Region'] = []
for i in rat_data['_embedded']['neuronResources']:
    df_dict['NeuronID'].append(str(i['neuron_id']))
    df_dict['Archive'].append(str(i['archive']))
    df_dict['Strain'].append(str(i['strain']))
    df_dict['Cell'].append(str(i['cell_type']))
    df_dict['Region'].append(str(i['brain_region']))

rat_df = DataFrame(df_dict)
print(rat_df)

Finally, when I check the length of the resulting DataFrame:

len(rat_df)

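The output is 50.

So I eventually worked out that the program only retrieves the first 50 neurons (page 0). According to the output above, there are 23 pages in total. How can I put all of these results into one dictionary or a class, i.e. is there a way to loop over all of these pages? I tried several loop variants, but without success.

Sorry if this is a simple question or if I have made mistakes, but I have been trying everything for the past few days without getting anywhere.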


1 answer

Answer 1 · posted 2024-09-28 17:02:53

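Disclaimer: I am not an expert on HTTP or the requests library, and I have not used neuromorpho.org before, so take this with a grain of salt.

You can read the number of pages from the first response and then loop over the individual pages. Inside the loop, you have to pass the requested page as a parameter of the HTTP GET request, e.g. ?page=42&..., as shown here: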

import requests
import pandas as pd

url = 'http://neuromorpho.org/api/neuron/select'
params = {
        'page' : 0,
        'q' : 'species:rat',
        'fq' : [
            'brain_region:hippocampus,CA1',
            'experiment_condition:Control',
            'cell_type:Pyramidal,principal cell' ] }

# The first response reports how many pages the query spans
totalPages = requests.get(url, params=params).json()['page']['totalPages']

# One list per output column; rows are appended page by page
df_dict = {
        'NeuronID' : list(),
        'Archive' : list(),
        'Strain' :  list(),
        'Cell' : list(),
        'Region' : list() }

for pageNum in range(totalPages):
    params['page'] = pageNum    # request the next page of the same query
    response = requests.get(url, params=params)
    print('Querying page {} -> status code: {}'.format(
        pageNum, response.status_code))
    if response.status_code == 200:    # only parse successful requests
        data = response.json()
        # Iterate over the records actually returned on this page
        for row in data['_embedded']['neuronResources']:
            df_dict['NeuronID'].append(str(row['neuron_id']))
            df_dict['Archive'].append(str(row['archive']))
            df_dict['Strain'].append(str(row['strain']))
            df_dict['Cell'].append(str(row['cell_type']))
            df_dict['Region'].append(str(row['brain_region']))

rat_df = pd.DataFrame(df_dict)
print(rat_df)

The console output shows how the requested page number changes, followed by the resulting DataFrame:

Querying page 0 -> status code: 200
Querying page 1 -> status code: 200
Querying page 2 -> status code: 200
Querying page 3 -> status code: 200
Querying page 4 -> status code: 200
Querying page 5 -> status code: 200
Querying page 6 -> status code: 200
Querying page 7 -> status code: 200
Querying page 8 -> status code: 200
Querying page 9 -> status code: 200
Querying page 10 -> status code: 200
Querying page 11 -> status code: 200
Querying page 12 -> status code: 200
Querying page 13 -> status code: 200
Querying page 14 -> status code: 200
Querying page 15 -> status code: 200
Querying page 16 -> status code: 200
Querying page 17 -> status code: 200
Querying page 18 -> status code: 200
Querying page 19 -> status code: 200
Querying page 20 -> status code: 200
Querying page 21 -> status code: 200
Querying page 22 -> status code: 200
     NeuronID    Archive          Strain                             Cell                          Region
0         100     Turner     Fischer 344  ['pyramidal', 'principal cell']          ['hippocampus', 'CA1']
1         101     Turner     Fischer 344  ['pyramidal', 'principal cell']          ['hippocampus', 'CA1']
2        1016     Ascoli  Sprague-Dawley  ['pyramidal', 'principal cell']                 ['hippocampus']
3        1019     Ascoli  Sprague-Dawley  ['pyramidal', 'principal cell']                 ['hippocampus']
4         102     Turner     Fischer 344  ['pyramidal', 'principal cell']          ['hippocampus', 'CA1']
...       ...        ...             ...                              ...                             ...
1110    99614  Guizzetti  Sprague-Dawley  ['principal cell', 'pyramidal']  ['hippocampus', 'CA1', 'left']
1111    99615  Guizzetti  Sprague-Dawley  ['principal cell', 'pyramidal']  ['hippocampus', 'CA1', 'left']
1112    99616  Guizzetti  Sprague-Dawley  ['principal cell', 'pyramidal']  ['hippocampus', 'CA1', 'left']
1113    99617  Guizzetti  Sprague-Dawley  ['principal cell', 'pyramidal']  ['hippocampus', 'CA1', 'left']
1114    99618  Guizzetti  Sprague-Dawley  ['principal cell', 'pyramidal']  ['hippocampus', 'CA1', 'left']

[1115 rows x 5 columns]
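
The loop above requests page 0 twice: once to read totalPages and once inside the loop. A minimal sketch of a variant that reuses every response is shown below. Note that `iter_neurons`, `fetch_page`, `fake_fetch` and `FAKE_PAGES` are hypothetical names introduced only for this illustration, and the records are made up so the example runs without network access. With the real API, `fetch_page` would presumably wrap something like `requests.get(url, params=params).json()`.

```python
from typing import Callable, Iterator


def iter_neurons(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every neuron record across all pages.

    fetch_page is a hypothetical callable: page number -> decoded JSON dict
    shaped like the response shown in the question.
    """
    page_num = 0
    while True:
        data = fetch_page(page_num)
        # Iterate over the records actually present on this page
        yield from data.get('_embedded', {}).get('neuronResources', [])
        page_num += 1
        # Every response carries the page count, so no extra request is needed
        if page_num >= data['page']['totalPages']:
            break


# Stand-in for the real API (assumption: 3 pages of made-up records)
FAKE_PAGES = [
    [{'neuron_id': 1}, {'neuron_id': 2}],
    [{'neuron_id': 3}, {'neuron_id': 4}],
    [{'neuron_id': 5}],
]


def fake_fetch(page_num: int) -> dict:
    """Return a fake response with the same layout as the real one."""
    return {
        '_embedded': {'neuronResources': FAKE_PAGES[page_num]},
        'page': {'size': 2, 'totalElements': 5,
                 'totalPages': len(FAKE_PAGES), 'number': page_num},
    }


records = list(iter_neurons(fake_fetch))
neuron_ids = [str(r['neuron_id']) for r in records]
print(neuron_ids)  # ['1', '2', '3', '4', '5']
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP call, which makes it easy to try out offline before pointing it at the real endpoint.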

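Update #1:

I changed the code I posted and added a modified version of the code that parses the responses inside the loop. I think there is a small bug in the neuromorpho.org API, because the last page (number 22) reports size: 50 while its JSON response contains only 15 objects (indices 0-14). You can avoid this problem by iterating over the JSON objects and ignoring the reported size.

Update #2:

I realized that the GET parameters do not have to be encoded into the URL by hand; requests encodes them for us when they are passed as a dict (code updated).

I hope this helps.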
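
As a quick sanity check on the numbers, the page metadata from the question ('size': 50, 'totalElements': 1115) can be worked through by hand. This is only arithmetic on the reported values. It rests on the assumption, not confirmed by the API documentation quoted here, that size is the requested page size rather than the number of records actually returned.

```python
import math

size = 50              # page size reported by the API
total_elements = 1115  # total number of matching neurons

# Number of pages needed to hold all elements
total_pages = math.ceil(total_elements / size)

# The last page holds whatever is left after the full pages
last_page_items = total_elements - (total_pages - 1) * size

print(total_pages, last_page_items)  # 23 15
```

Under that assumption, 15 records on page 22 is exactly what the totals predict, so iterating over the returned objects is the safe choice either way.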
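
To see what such an encoded query string looks like without sending a request, here is a small sketch using only the standard library. It is an approximation for illustration and not the exact code path inside requests; the parameter values are the ones from the code above.

```python
from urllib.parse import urlencode, parse_qs

params = {
    'page': 0,
    'q': 'species:rat',
    'fq': [
        'brain_region:hippocampus,CA1',
        'experiment_condition:Control',
        'cell_type:Pyramidal,principal cell',
    ],
}

# doseq=True expands the list into one 'fq=...' pair per element
query_string = urlencode(params, doseq=True)
print(query_string)

# Decoding the string gives back the original filter values
decoded = parse_qs(query_string)
print(decoded['fq'])
```

The list under 'fq' becomes three separate fq parameters, and characters such as ':' and ',' are percent-encoded, which is why the filters do not need to be written into the URL manually.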
