Python: grouping keys with similar values in a dictionary

Posted 2024-06-26 08:12:31

I have a database of hotels stored in MongoDB. My goal is to group together hotels whose phone numbers overlap.

Example: if Hotel A has phone numbers 1, 2; B has phone numbers 2, 3; C has phone numbers 3, 4; D has phone numbers 5, 6; and E has phone numbers 6, 7, then A, B and C will be grouped together and D and E will be put in another group. That way, when a user searches for Hotel A, they will get Hotels B and C as recommended hotels.

The structure of a document in my database is:

{
"_id" : ObjectId("57bd5108f4733211b61217fa"),
"autoid" : 1,
"parentid" : "P01982.01982.110601173548.N2C5",
"companyname" : "Sheldan Holiday Home",
"latitude" : 34.169552,
"longitude" : 77.579315,
"state" : "JAMMU AND KASHMIR",
"city" : "LEH Ladakh",
"pincode" : 194101,
"phone_search" : "9419179870|253013",
"address" : "Sheldan Holiday Home|Changspa|Leh Ladakh-194101|LEH Ladakh|JAMMU AND KASHMIR",
"email" : "",
"website" : "",
"national_catidlineage_search" : "/10255012/|/10255031/|/10255037/|/10238369/|/10238380/|/10238373/",
"area" : "Leh Ladakh",
"data_city" : "Leh Ladakh"
}

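What I have achieved so far: I have been able to split the phone numbers apart and store each hotel's "parentid" in a dictionary keyed by the individual numbers from "phone_search".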

Example : u'9426029957': [u'P2772.2772.140207213142.C6X5'], u'9796603277': [u'P1991.1991.110710093157.Z8G1'], u'9447706927': [u'PX477.X477.160620184114.P7P3', u'PX484.X484.160620185334.E4G6'] and so on...

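What I plan to do: I plan to assign a group id to related hotels. So, in the example above, Hotels A, B and C would be given group id 1, while Hotels D and E would be given group id 2.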

But I am stuck on how to implement this efficiently.

Here is the code I have written so far. Any other suggestions are welcome too.

from pymongo import MongoClient #To import client for MongoDB
from pprint import pprint #Pretty print 

#Defining variables
hotels = []
hotelsByPhone = {}
phones = []

#Initializing MongoDB client
client = MongoClient()

#Connection
db = client.hotel
collection = db.hotelData

#Storing all hotels in a list 'hotels'
for post in collection.find():
    hotels.append(post)

#Splitting all the numbers and storing parent ids of hotels grouped together by shared phone numbers
for hotel in hotels:
    phone_search = hotel.get("phone_search")
    if not phone_search:
        continue #No phone number recorded for this hotel
    try:
        phones = phone_search.split("|")
    except AttributeError:
        #phone_search is not a string (e.g. a single number stored as an int)
        phones = [phone_search]
    for phone in phones:
        hotelsByPhone.setdefault(phone,[]).append(hotel["parentid"])
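
One possible way to get from the phone dictionary to group ids, as a minimal sketch rather than a tested solution for the real data: treat every phone number as a link between the hotels that share it, and merge linked hotels with a union-find (disjoint-set) structure. The function name assign_group_ids and the sample data below are hypothetical; the input is assumed to have the same shape as hotelsByPhone (phone number mapped to a list of parent ids). Group numbers follow dictionary insertion order, which is only guaranteed on Python 3.7 and later.

```python
def assign_group_ids(hotels_by_phone):
    """Map each parent id to a group id; hotels linked by shared phones share an id."""
    parent = {}  # union-find forest: hotel -> parent hotel

    def find(x):
        # Register unseen hotels as their own root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every hotel listed under the same phone number belongs to one group.
    for ids in hotels_by_phone.values():
        if not ids:
            continue
        first = ids[0]
        find(first)
        for other in ids[1:]:
            union(first, other)

    # Number the groups 1, 2, 3, ... in order of first appearance.
    group_of_root = {}
    group_ids = {}
    for hotel in list(parent):
        root = find(hotel)
        if root not in group_of_root:
            group_of_root[root] = len(group_of_root) + 1
        group_ids[hotel] = group_of_root[root]
    return group_ids


# Hypothetical data mirroring the A-E example from the question.
example = {
    "1": ["A"], "2": ["A", "B"], "3": ["B", "C"], "4": ["C"],
    "5": ["D"], "6": ["D", "E"], "7": ["E"],
}
groups = assign_group_ids(example)
# Hotels A, B and C end up in group 1; hotels D and E in group 2.
```

With path compression, the work is close to linear in the total number of (phone, hotel) pairs, so it should not require comparing every hotel against every other hotel. The resulting ids could then be written back to MongoDB, for example as a new field on each document.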
