Reducing search cost in a SQL database with Django queries

from rest_framework.response import Response
from rest_framework.views import APIView

# DrugBankDrugEPClass is the asker's model (its import is not shown in the post).


class GetSimilarDrugs(APIView):
    def get(self, request, format=None):
        # import pdb
        # pdb.set_trace()
        get_req = request.GET.get('drugid', '')
        simi_list = []
        # epc_ids of the requested drug
        comp_class = DrugBankDrugEPClass.objects.filter(
            drug_bank_id=get_req).values_list('epc_id', flat=True).distinct()
        # One extra query per distinct drug: this is the slow part
        for drg_id in DrugBankDrugEPClass.objects.values_list(
                'drug_bank_id', flat=True).distinct():
            classtocomp = DrugBankDrugEPClass.objects.filter(
                drug_bank_id=str(drg_id)).values_list('epc_id', flat=True).distinct()
            complist = list(comp_class)
            tolist = list(classtocomp)
            if complist == tolist:
                simi_list.append(drg_id)
        return Response({'result': simi_list})

+------+--------------+--------+
| id   | drug_bank_id | epc_id |
+------+--------------+--------+
|    1 | DB12789      |      1 |
|    2 | DB12788      |      2 |
|    3 | DB00596      |      3 |
|    4 | DB09161      |      4 |
|    5 | DB01178      |      5 |
|    6 | DB01177      |      6 |
|    7 | DB01177      |      6 |
|    8 | DB01174      |      7 |
|    9 | DB01175      |      8 |
|   10 | DB01172      |      9 |
|   11 | DB01173      |     10 |
|   12 | DB12257      |     11 |
|   13 | DB08167      |     12 |
|   14 | DB01551      |     13 |
|   15 | DB01006      |     14 |
|   16 | DB01007      |     15 |
|   17 | DB01007      |     16 |
|   18 | DB01004      |     17 |
|   19 | DB01004      |     18 |
|   20 | DB01004      |     17 |
|   21 | DB01004      |     18 |
|   22 | DB01004      |     19 |
|   23 | DB00570      |     20 |
|   24 | DB01008      |     21 |
|   25 | DB00572      |     22 |
|   26 | DB00575      |      7 |
|   27 | DB00577      |     23 |
|   28 | DB00577      |     24 |
|   29 | DB00577      |     25 |
|   30 | DB00576      |     26 |
|   31 | DB00751      |     27 |
|   32 | DB00751      |     28 |
|   33 | DB00750      |     29 |
|   34 | DB00753      |     30 |
|   35 | DB00752      |     31 |
|   36 | DB00755      |     32 |
|   37 | DB00755      |     32 |
|   38 | DB00757      |     33 |
|   39 | DB00756      |     34 |
|   40 | DB00759      |     35 |
|   41 | DB00759      |     36 |
|   42 | DB00759      |     36 |
+------+--------------+--------+

2 answers

User
Answer 1 · edited 2024-10-02 08:28:03

Based on what you need, I think you can do this:

get_req = request.GET.get('drugid', '')

# Fetch all epc_ids that belong to the requested drug_bank_id
comp_class = DrugBankDrugEPClass.objects.filter(
    drug_bank_id=get_req).values_list('epc_id', flat=True).distinct()

# Find all drug_bank_ids that match any of the requested epc_ids
classtocomp = DrugBankDrugEPClass.objects.filter(
    epc_id__in=comp_class).values_list('drug_bank_id', flat=True).distinct()

Updated version:

get_req = request.GET.get('drugid', '')

comp_class = DrugBankDrugEPClass.objects.filter(
     drug_bank_id=get_req).values_list('epc_id', flat=True).distinct()

class_to_comp = DrugBankDrugEPClass.objects.filter(
     epc_id__in=comp_class).values_list('drug_bank_id', 'epc_id')

# Compare as sets so duplicate rows and row order do not affect the match
target = set(comp_class)
d = {}
for k, v in class_to_comp:
    d.setdefault(k, set()).add(v)

simi_list = [k for k, v in d.items() if v == target]
print(simi_list)

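I think this will be somewhat faster than your code because, unlike your loop, it does not hit the database on every iteration. The loop here only filters data that has already been fetched.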
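To make the grouping step concrete, here is a minimal sketch in plain Python with no database involved. It assumes the rows arrive as (drug_bank_id, epc_id) pairs, which is the shape values_list('drug_bank_id', 'epc_id') returns; the sample rows are copied from the table in the question, and the helper name similar_drugs is hypothetical. One caveat the sketch sidesteps: it groups every row, not only rows filtered with epc_id__in, because after that filter a drug that has extra classes would look identical to the requested drug.

```python
from collections import defaultdict

# Sample rows shaped like values_list('drug_bank_id', 'epc_id');
# the values are copied from the table in the question.
rows = [
    ("DB01004", 17), ("DB01004", 18), ("DB01004", 17),
    ("DB01004", 18), ("DB01004", 19),
    ("DB00755", 32), ("DB00755", 32),
    ("DB01174", 7), ("DB00575", 7),
    ("DB01177", 6), ("DB01177", 6),
]


def similar_drugs(rows, drug_id):
    """Return the drugs whose set of epc_ids equals that of drug_id."""
    classes = defaultdict(set)
    for drug, epc in rows:        # single pass, no per-drug query
        classes[drug].add(epc)    # sets absorb duplicate rows
    target = classes.get(drug_id)
    if not target:                # unknown drug: nothing to compare
        return []
    return sorted(d for d, epcs in classes.items() if epcs == target)


print(similar_drugs(rows, "DB01174"))
```

As in the question's code, the requested drug itself appears in the result, since its class set trivially equals itself.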

User

Answer 2 · edited 2024-10-02 08:28:03

You can change

for drg_id in DrugBankDrugEPClass.objects.values_list('drug_bank_id', flat = True).distinct():
  classtocomp = DrugBankDrugEPClass.objects.filter(drug_bank_id = str(drg_id)).values_list('epc_id', flat=True).distinct()

to

drug_ids = DrugBankDrugEPClass.objects.values_list('drug_bank_id', flat = True).distinct()
comps = DrugBankDrugEPClass.objects.filter(drug_bank_id__in = drug_ids).values_list('epc_id', flat=True).distinct()

and then iterate over the comps result set.

Another optimization you should make is to add db_index=True to the fields you query on.
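Setting db_index=True on a model field asks Django to create a database index on that column. The sketch below illustrates what such an index changes, using Python's built-in sqlite3 module instead of the asker's database; the table name drug_epc and the index name idx_drug_bank_id are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drug_epc ("
    "id INTEGER PRIMARY KEY, drug_bank_id TEXT, epc_id INTEGER)"
)
conn.executemany(
    "INSERT INTO drug_epc (drug_bank_id, epc_id) VALUES (?, ?)",
    [("DB01004", 17), ("DB01004", 18), ("DB00575", 7), ("DB01174", 7)],
)

QUERY = "SELECT epc_id FROM drug_epc WHERE drug_bank_id = ?"


def query_plan(connection):
    """Return SQLite's plan description for QUERY as one string."""
    plan_rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("DB01004",)
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in plan_rows)


plan_before = query_plan(conn)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_drug_bank_id ON drug_epc (drug_bank_id)")
plan_after = query_plan(conn)   # the lookup now goes through the index

print(plan_before)
print(plan_after)
```

The same kind of check is worth running against the real database, since an index only helps when the planner actually chooses it for the query in question.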

If you use Postgres, you can pass field arguments to distinct():

On PostgreSQL only, you can pass positional arguments (*fields) in order to specify the names of fields to which the DISTINCT should apply. This translates to a SELECT DISTINCT ON SQL query. Here’s the difference. For a normal distinct() call, the database compares each field in each row when determining which rows are distinct. For a distinct() call with specified field names, the database will only compare the specified field names.

That lets you do the following:

comps = DrugBankDrugEPClass.objects.values_list('drug_bank_id', flat = True).distinct('drug_bank_id', 'epc_id')

Edited to add:

In addition, you can use profiling add-ons to analyze your queries and your application.
