<p>这个问题描述了“稳定婚姻问题”的一个变体。它有一个非常简单的算法;我在下面粘贴了最简单的版本-还有其他更严格的版本。未订婚的求婚者向对方求婚,对方拒绝或接受(解除其当前订婚)。这将一直持续到各方都参与进来</p>
<pre><code># Gale-Shapley algorithm:
algorithm stable_matching is
Initialize all m ∈ M and w ∈ W to free
while ∃ free man m who still has a woman w to propose to do
w := first woman on m`s list to whom m has not yet proposed
if w is free then
(m, w) become engaged
else some pair (m`, w) already exists
if w prefers m to m` then
m` becomes free
(m, w) become engaged
else
(m`, w) remain engaged
end if
end if
repeat
</code></pre>
<p>对稳定婚姻问题后的项目分配解决方案进行建模,我们可以将<code>Teammate</code>一方视为求婚者,另一方视为一夫多妻<code>Project</code>,可以与多个求婚者订婚。这看起来像是很多代码,但它相对较短且易于理解</p>
<p>如果严格遵循上述算法,项目分布可能包含一些队友的第3或第4个选择-这不是一个理想的情况。因此,我发现随机化作业顺序,并在跟踪最佳分布的同时重试几次,会产生最佳结果</p>
<p><em>每次运行时,可能会产生一个得分相等的项目分布,但队友的安排不同。因此,如果队友出于某种原因决定不喜欢1号或2号选择,只需生成一个新的分布</em></p>
<p>可运行代码位于:<a href="https://pastebin.com/kVj0FuJP" rel="nofollow noreferrer">https://pastebin.com/kVj0FuJP</a></p>
<pre><code>import io
import pandas as pd
from copy import deepcopy
from itertools import cycle
from random import randint, shuffle
from statistics import mean, stdev
class Teammate:
def __init__(self, identifier, projects, rankings):
self._id = identifier
self._projects = list(projects)
self._rankings = dict(zip(projects, rankings))
self._prefs = self.gen_project_preferences_cycle()
self._project = None
@property
def id(self):
return self._id
def rank(self, project):
return self._rankings[project]
@property
def state(self):
return 'ENGAGED' if self._project else 'FREE'
def propose(self, project):
return project.consider(self)
def engage(self, project):
self._project = project
project.add(self)
def disengage(self, project):
self._project = None
project.remove(self)
@property
def project_preferences(self):
return self._prefs
def gen_project_preferences_cycle(self):
""" Returns a generator for a cyclical list of preferences in
order.
"""
prefs = sorted(self._projects, key=self.rank)
if randint(0, 1):
prefs.insert(0, prefs[1])
return cycle(prefs)
def reset(self):
self._project = None
self._prefs = self.gen_project_preferences_cycle()
</code></pre>
<p>通过将第二个选择随机分配到一些<code>Teammate</code>的首选项列表的前面,我们在放置上获得了一个有趣的效果。通过让一些队友开始他们的第二选择,他们很可能会被淘汰出他们的第二选择,然后找到另一个他们评为1的项目。最终的效果是你得到的矩阵大多是1,2的一小部分,没有3和4</p>
<p><code>Project</code>类中更有趣的方法是<code>consider()</code>方法,其中<code>Project</code>考虑了<code>Teammate</code>的参与提议。这个变化无常的项目将抛弃现有求婚者中任何一个订婚的求婚者,如果另一个求婚者提供更好的报价的话</p>
<pre><code>class Project:
def __init__(self, identifier, seats):
self._id = identifier
self._seats = seats
self._team = []
@property
def id(self):
return self._id
@property
def score(self):
return sum(tm.rank(self) for tm in self._team)
def consider(self, teammate):
""" Consider a proposal, have the Teammate establish the engagement
if accepted, then return True; otherwise return False.
"""
rank = teammate.rank(self)
team = self._team
seats = self._seats
if len(team) < seats or any(rank < tm.rank(self) for tm in team):
# Either there's a seat available, or the newcomer has a better
# offer than one of the current suitors.
if len(team) >= seats:
shuffle(team)
tm_out = max(team, key=lambda tm: tm.rank(self))
tm_out.disengage(self) # Hit the road Jack...
teammate.engage(self) # What? and no ring!?...
return True
return False
def add(self, teammate):
self._team.append(teammate)
def remove(self, teammate):
self._team.remove(teammate)
def clear(self):
self._team = []
@property
def state(self):
return 'FREE' if len(self._team) < self._seats else 'ENGAGED'
@property
def roster(self):
self._team.sort(key=lambda tm: tm.id)
return f"{self.id}: {[(tm.id, tm.rank(self)) for tm in self._team]}"
</code></pre>
<p>当然,在现代世界中,每个人都在使用婚介服务,<code>Teammate</code>和<code>Projects</code>。。。不过,这项服务有点古怪,在某种程度上依赖于随机尝试,直到找到适合各方的最佳匹配</p>
<pre><code>class ProjectMatcher:
def __init__(self, projects, teammates):
self._projects = projects
self._teammates = teammates
def match(self, num_iterations=1000):
best_score = 99999
best_distrib = None
for _ in range(num_iterations):
# This is the piece that does the actual matching. You can
# see each line corresponds to a step in the Gale-Shapely
# algorithm. You can find the details of some steps within
# the Teammate and Project methods.
for teammate in self.gen_free_teammates_cycle():
for project in teammate.project_preferences:
accepted = teammate.propose(project)
if accepted:
break
# Determine if this distribution is the best so far.
if self.evaluate() < best_score:
best_score = self.evaluate()
best_distrib = deepcopy(self._projects)
# Get ready for another round of matchmaking.
self.reset()
# Print out the best distribution of assignments found.
print("PROJECT ASSIGNMENTS:")
for project in best_distrib:
print(f" {project.roster}")
def evaluate(self):
# Determine the quality of the project assignments. We want a
# low average score, preferably without outliers.
s = [p.score for p in self._projects]
m = mean(s)
d = stdev(s, m)
return m + d
def gen_free_teammates_cycle(self):
""" Returns a cyclical generator of the list of Teammates that are
still unengaged.
"""
teammates = list(self._teammates)
shuffle(teammates)
while any(tm.state == 'FREE' for tm in teammates):
for tm in teammates:
if tm.state == 'FREE':
yield tm
shuffle(teammates)
def reset(self):
[p.clear() for p in self._projects]
[t.reset() for t in self._teammates]
@staticmethod
def from_dataframe(df):
n_seats = len(df.values) // (len(df.columns) - 1)
projects = [Project(proj, n_seats) for proj in df.columns[1:]]
teammates = [Teammate(tm[0], projects, tm[1:]) for tm in df.values]
return ProjectMatcher(projects, teammates)
</code></pre>
<p>把这一切放在一起</p>
<pre><code>if __name__ == '__main__':
df = pd.read_csv(io.StringIO("""
employee proj_A proj_B proj_C proj_D proj_E
A1 1 5 3 4 2
B1 5 4 1 2 3
C1 2 3 4 1 5
A2 4 2 1 3 5
B2 4 5 3 2 1
C2 3 1 2 5 4
A3 1 2 4 3 5
B3 2 3 1 5 4
C3 5 3 4 1 2
A4 4 5 3 2 1
B4 5 3 4 2 1
C4 1 2 3 4 5
A5 1 3 2 5 4
B5 2 1 3 5 4
C5 2 1 4 5 4
"""), sep=r'\s+')
pm = ProjectMatcher.from_dataframe(df)
pm.match()
</code></pre>
<p>输出:</p>
<pre><code>$ python projectmatcher.py
PROJECT ASSIGNMENTS:
proj_A: [('A1', 1), ('A3', 1), ('C4', 1)]
proj_B: [('B5', 1), ('C2', 1), ('C5', 1)]
proj_C: [('A2', 1), ('A5', 2), ('B3', 1)]
proj_D: [('B1', 2), ('C1', 1), ('C3', 1)]
proj_E: [('A4', 1), ('B2', 1), ('B4', 1)]
</code></pre>
<p>结果始终是好的,可能有100次迭代,它默认为1K表示好的度量,这只需要眨眼的时间。不知何故,尽管部分基于偶然,它还是成功地制作出了高质量的配对</p>
<p>可以修改该程序以收集得分最低的所有唯一分发,并生成一个列表。每一项都有相同的最低分数,但分布不同:</p>
<pre><code>PROJECT ASSIGNMENTS:
option 1
proj_A: [('A3', 1), ('A5', 1), ('C4', 1)]
proj_B: [('B5', 1), ('C2', 1), ('C5', 1)]
proj_C: [('A2', 1), ('B1', 1), ('B3', 1)]
proj_D: [('B4', 2), ('C1', 1), ('C3', 1)]
proj_E: [('A1', 2), ('A4', 1), ('B2', 1)]
option 2
proj_A: [('A3', 1), ('A5', 1), ('C4', 1)]
proj_B: [('B5', 1), ('C2', 1), ('C5', 1)]
proj_C: [('A2', 1), ('B1', 1), ('B3', 1)]
proj_D: [('A4', 2), ('C1', 1), ('C3', 1)]
proj_E: [('A1', 2), ('B2', 1), ('B4', 1)]
option 3
proj_A: [('A1', 1), ('A3', 1), ('C4', 1)]
proj_B: [('B5', 1), ('C2', 1), ('C5', 1)]
proj_C: [('A2', 1), ('A5', 2), ('B3', 1)]
proj_D: [('B1', 2), ('C1', 1), ('C3', 1)]
proj_E: [('A4', 1), ('B2', 1), ('B4', 1)]
option 4
proj_A: [('A3', 1), ('A5', 1), ('C4', 1)]
proj_B: [('B5', 1), ('C2', 1), ('C5', 1)]
proj_C: [('A2', 1), ('B1', 1), ('B3', 1)]
proj_D: [('B2', 2), ('C1', 1), ('C3', 1)]
proj_E: [('A1', 2), ('A4', 1), ('B4', 1)]
</code></pre>