<p>For example, you can use the <a href="http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html" rel="nofollow noreferrer"><code>SGDClassifier</code></a> from scikit-learn (<code>sklearn</code>). A linear classifier computes its predictions as follows (see <a href="https://github.com/scikit-learn/scikit-learn/blob/5a74e2f1c8470c527018dba78f86557f40eaeb47/sklearn/linear_model/base.py#L272" rel="nofollow noreferrer">the source code</a>):</p>
<pre><code>def predict(self, X):
    scores = self.decision_function(X)
    if len(scores.shape) == 1:
        indices = (scores > 0).astype(np.int)
    else:
        indices = scores.argmax(axis=1)
    return self.classes_[indices]
</code></pre>
<p>where <a href="https://github.com/scikit-learn/scikit-learn/blob/5a74e2f1c8470c527018dba78f86557f40eaeb47/sklearn/linear_model/base.py#L278" rel="nofollow noreferrer"><code>decision_function</code></a> is given by:</p>
<pre><code>def decision_function(self, X):
    [...]
    scores = safe_sparse_dot(X, self.coef_.T,
                             dense_output=True) + self.intercept_
    return scores.ravel() if scores.shape[1] == 1 else scores
</code></pre>
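<p>Putting the two together: for a fitted binary classifier, <code>predict</code> amounts to thresholding <code>X @ coef_.T + intercept_</code> at zero. A minimal sketch that checks this on toy data (the data and labels here are made up for illustration):</p>
<pre><code>import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # toy, linearly separable labels

clf = SGDClassifier(random_state=0).fit(X, y)

# Reproduce decision_function by hand.
scores = X @ clf.coef_.T + clf.intercept_
assert np.allclose(scores.ravel(), clf.decision_function(X))

# In the binary case, predict is just a sign test on the scores.
manual = clf.classes_[(scores.ravel() > 0).astype(int)]
assert np.array_equal(manual, clf.predict(X))
</code></pre>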
<p>So for the two-dimensional case of your example, this means that a data point is classified as <code>+1</code> if</p>
<pre><code>x*w1 + y*w2 + i > 0
</code></pre>
<p>where</p>
<pre><code>x, y = X
w1, w2 = self.coef_
i = self.intercept_
</code></pre>
<p>and <code>-1</code> otherwise. So the decision depends on whether <code>x*w1 + y*w2 + i</code> is greater than, less than, or equal to zero. The “border” is thus found by setting <code>x*w1 + y*w2 + i == 0</code>: we are free to choose one of the two components, and the other one is then determined by this equation.</p>
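<p>For instance, fixing <code>x</code> and solving <code>x*w1 + y*w2 + i == 0</code> for <code>y</code> yields a point on the border. A small sketch (the <code>border_y</code> helper is hypothetical, not part of scikit-learn):</p>
<pre><code>import numpy as np
from sklearn.linear_model import SGDClassifier

def border_y(clf, x):
    # Solve x*w1 + y*w2 + i == 0 for y.
    w1, w2 = clf.coef_[0]
    return -(clf.intercept_[0] + w1 * x) / w2

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 1] > 0.5 * X[:, 0], 1, -1)
clf = SGDClassifier(random_state=1).fit(X, y)

# Points on the border score (numerically) zero.
pts = np.array([[x, border_y(clf, x)] for x in (-1.0, 0.0, 1.0)])
print(clf.decision_function(pts))  # close to [0, 0, 0]
</code></pre>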
<p>The following snippet fits a <code>SGDClassifier</code> and plots the resulting “border”. It assumes that the data points are scattered around the origin (<code>x, y = 0, 0</code>), i.e. that their mean is (approximately) zero. In practice, to obtain good results, one should first subtract the mean from the data points, then perform the fit, and afterwards add the mean back to the result. The snippet below simply scatters the points around the origin.</p>
<pre><code>import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import SGDClassifier

n = 100
x = np.random.uniform(-1, 1, size=(n, 2))

# We assume the points are scattered around the origin.
b = np.zeros(2)
d = np.random.uniform(-1, 1, size=2)
slope, intercept = (d[1] / d[0]), 0.

fig, ax = plt.subplots(figsize=(8, 8))
ax.scatter(x[:, 0], x[:, 1], color='black')
ax.plot([b[0], d[0]], [b[1], d[1]], 'b-', label='Ideal')

labels = []
for point in x:
    # Label points above the ideal line with +1, points below with -1.
    if point[1] > slope * point[0] + intercept:
        ax.annotate('+', xy=point, xytext=(0, -10), textcoords='offset points', color='blue', ha='center', va='center')
        labels.append(1)
    else:
        ax.annotate('-', xy=point, xytext=(0, -10), textcoords='offset points', color='red', ha='center', va='center')
        labels.append(-1)
labels = np.array(labels)

classifier = SGDClassifier()
classifier.fit(x, labels)

# Choose an arbitrary x1 and solve x1*w1 + x2*w2 + i == 0 for x2.
x1 = np.random.uniform(-1, 1)
x2 = (-classifier.intercept_ - x1 * classifier.coef_[0, 0]) / classifier.coef_[0, 1]

ax.plot([0, x1], [0, x2], 'g--', label='Fit')
plt.legend()
plt.show()
</code></pre>
<p>This figure shows the result for <code>n = 100</code> data points:</p>
<p><a href="https://i.stack.imgur.com/GeWvc.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/GeWvc.png" alt="Result for n=100"/></a></p>
<p>The following figure shows the results for different <code>n</code>, where the points were randomly chosen from a pool containing 1000 data points:</p>
<p><a href="https://i.stack.imgur.com/bN5ID.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/bN5ID.png" alt="Results for different n"/></a></p>
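<p>As noted above, when the data is not scattered around the origin, one should subtract the mean before fitting and shift the resulting border back afterwards. A minimal sketch of that centering step, under the same assumptions as the snippet above:</p>
<pre><code>import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(2)
# An off-center point cloud (shifted away from the origin).
X = rng.uniform(-1, 1, size=(200, 2)) + np.array([3.0, -2.0])
mean = X.mean(axis=0)
y = np.where(X[:, 1] - mean[1] > X[:, 0] - mean[0], 1, -1)

clf = SGDClassifier(random_state=2).fit(X - mean, y)  # fit on centered data

# Border in the original coordinates: solve in centered coordinates,
# then add the mean back.
w1, w2 = clf.coef_[0]
i = clf.intercept_[0]
xs = np.array([X[:, 0].min(), X[:, 0].max()])
ys = -(i + w1 * (xs - mean[0])) / w2 + mean[1]
</code></pre>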