<p><strong>Problem:</strong></p>
<p>I am designing a custom transformer for a scikit-learn pipeline, but I am running into a positional-argument mismatch. The class I defined is:</p>
<pre><code>class DataSubsetGenerator(BaseEstimator, TransformerMixin):
    def __init__(self, sub_percentage, random_state=42):
        self.sub_percentage = sub_percentage
        self.random_state = random_state

    def fit(self):
        return self

    def transform(self, X_train, X_test, y_train, y_test):
        # Do data processing stuff here, removed to simplify example here...
        return X_train_sub, X_test_sub, y_train_sub, y_test_sub
</code></pre>
<p>I then put it into a one-step custom pipeline to test it:</p>
<pre><code>reduce_pipeline = Pipeline([
    ('Prototype dataset', DataSubsetGenerator(0.5, random_state=random_state))
])
X_train, X_test, y_train, y_test = reduce_pipeline.transform(X_train, X_test, y_train, y_test)
</code></pre>
<p>I get this error message:</p>
<pre><code>TypeError                                 Traceback (most recent call last)
&lt;ipython-input-42-4b2a20eb8b63&gt; in &lt;module&gt;()
      3 ])
      4
----&gt; 5 X_train, X_test, y_train, y_test = reduce_pipeline.transform(X_train, X_test, y_train, y_test)

TypeError: _transform() takes 2 positional arguments but 5 were given
</code></pre>
<p>This makes no sense to me, because I defined the <code>transform()</code> method of the <code>DataSubsetGenerator</code> class to accept 4 arguments.</p>
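<p>One way to read the "2 positional arguments but 5 were given" wording: the method that raised the error is the pipeline's own transform, which takes <code>self</code> plus a single data argument, and not the transformer's. The sketch below reproduces the same arity error in plain Python. <code>StandInPipeline</code> and <code>Doubler</code> are hypothetical names invented for illustration; this is a heavily simplified stand-in, not scikit-learn's actual implementation.</p>

```python
class StandInPipeline:
    """Hypothetical, heavily simplified stand-in for a pipeline object."""

    def __init__(self, steps):
        self.steps = steps  # list of (name, transformer) pairs

    def transform(self, X):
        # Exactly one data argument is accepted and threaded through each step.
        for _name, step in self.steps:
            X = step.transform(X)
        return X


class Doubler:
    """Toy transformer that follows the single-argument convention."""

    def transform(self, X):
        return [2 * value for value in X]


pipe = StandInPipeline([("double", Doubler())])
result = pipe.transform([1, 2, 3])  # one data argument: works

try:
    # Four data arguments, as in the question: self + 4 = 5 positional arguments.
    pipe.transform([1], [2], [3], [4])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(result)
print(error_message)
```

<p>The call fails before any step is reached, so the signature of the step's own <code>transform()</code> never comes into play.</p>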
<p><strong>My testing:</strong></p>
<p>I tested this by instantiating <code>DataSubsetGenerator</code> and calling <code>transform()</code> directly, without sklearn's pipeline, and it works as designed:</p>
<pre><code>dsg = DataSubsetGenerator(0.5, random_state = random_state)
X_train, X_test, y_train, y_test = dsg.transform(X_train, X_test, y_train, y_test)
</code></pre>
<p><strong><em>My question: why does <code>transform()</code> not recognize the 4 arguments when it is used inside an sklearn pipeline?</em></strong></p>
<p><strong>Related Q&amp;A:</strong></p>
<p>I did some research, and the closest Q&amp;A thread I found is: <a href="https://stackoverflow.com/questions/40363650/transform-takes-2-positional-arguments-but-3-were-given">_transform() takes 2 positional arguments but 3 were given</a>. However, I could not understand the solution or how it applies to my scenario.</p>
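<p>For comparison, here is a minimal sketch of a transformer shaped around a single data argument, with the four arrays bundled into one tuple. It is written in plain Python so it runs on its own. In real use the class would presumably also inherit from <code>BaseEstimator</code> and <code>TransformerMixin</code> as in the question, and whether a tuple bundle suits a given pipeline is an assumption. The subsetting logic is a placeholder, since the original processing code was removed from the post.</p>

```python
import random


class DataSubsetGenerator:
    """Sketch only: takes one bundled argument instead of four separate ones."""

    def __init__(self, sub_percentage, random_state=42):
        self.sub_percentage = sub_percentage
        self.random_state = random_state

    def fit(self, X, y=None):
        # Accept X (and optionally y) even though nothing is learned here.
        return self

    def transform(self, X):
        # X is assumed to be a tuple: (X_train, X_test, y_train, y_test).
        X_train, X_test, y_train, y_test = X
        rng = random.Random(self.random_state)

        def subset(features, labels):
            # Placeholder logic: keep a random fraction of rows,
            # using the same row indices for features and labels.
            n_keep = int(len(features) * self.sub_percentage)
            keep = sorted(rng.sample(range(len(features)), n_keep))
            return [features[i] for i in keep], [labels[i] for i in keep]

        X_train_sub, y_train_sub = subset(X_train, y_train)
        X_test_sub, y_test_sub = subset(X_test, y_test)
        return X_train_sub, X_test_sub, y_train_sub, y_test_sub


# Toy data: features and labels are the same ranges, so pairing is easy to check.
bundle = (list(range(10)), list(range(4)), list(range(10)), list(range(4)))
generator = DataSubsetGenerator(0.5)
X_train_sub, X_test_sub, y_train_sub, y_test_sub = generator.fit(bundle).transform(bundle)
```

<p>Because <code>transform()</code> now takes one argument, the call shape matches what the error message in the question says the pipeline accepts.</p>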