<p>Here is what worked for me:</p>
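<pre><code>import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import FeatureUnion, Pipeline

# ItemSelector is assumed to be defined elsewhere (it selects one field by key).


class ArrayCaster(BaseEstimator, TransformerMixin):
    """Cast a 1-D array into a single-column 2-D matrix for FeatureUnion."""

    def fit(self, x, y=None):
        return self

    def transform(self, data):
        # Debug output: shape before and after the cast.
        print(data.shape)
        column = np.transpose(np.matrix(data))
        print(column.shape)
        return column


FeatureUnion([
    ('text', Pipeline([
        ('selector', ItemSelector(key='text')),
        ('vect', CountVectorizer(ngram_range=(1, 1), binary=True, min_df=3)),
        ('tfidf', TfidfTransformer()),
    ])),
    ('other data', Pipeline([
        ('selector', ItemSelector(key='has_foriegn_char')),
        ('caster', ArrayCaster()),
    ])),
])
</code></pre>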
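<p>To show why the caster is needed, here is a minimal end-to-end sketch. <code>FeatureUnion</code> stacks the branch outputs side by side, so each branch has to return a 2-D result with one row per sample. Several things here are my assumptions and not part of the code above: the <code>ItemSelector</code> shown is a bare-bones stand-in, the toy data is invented, <code>min_df</code> is lowered to 1 so the tiny corpus keeps its vocabulary, and the cast uses <code>np.asarray(...).reshape(-1, 1)</code> in place of <code>np.matrix</code>, which should give the same column shape for 1-D input.</p>
<pre><code>import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import FeatureUnion, Pipeline


class ItemSelector(BaseEstimator, TransformerMixin):
    """Assumed helper: pick one field out of a dict-like container."""

    def __init__(self, key):
        self.key = key

    def fit(self, x, y=None):
        return self

    def transform(self, data):
        return data[self.key]


class ArrayCaster(BaseEstimator, TransformerMixin):
    """Turn a 1-D sequence into an (n_samples, 1) column."""

    def fit(self, x, y=None):
        return self

    def transform(self, data):
        return np.asarray(data).reshape(-1, 1)


# Toy data, invented for illustration only.
data = {
    "text": ["hello world", "bonjour le monde", "hello again"],
    "has_foriegn_char": [0, 1, 0],
}

union = FeatureUnion([
    ("text", Pipeline([
        ("selector", ItemSelector(key="text")),
        ("vect", CountVectorizer(binary=True, min_df=1)),  # min_df=1: assumption for the toy corpus
        ("tfidf", TfidfTransformer()),
    ])),
    ("other data", Pipeline([
        ("selector", ItemSelector(key="has_foriegn_char")),
        ("caster", ArrayCaster()),
    ])),
])

# One row per document: the TF-IDF columns followed by the single flag column.
features = union.fit_transform(data)
n_vocab = len(union.transformer_list[0][1].named_steps["vect"].vocabulary_)
print(features.shape, n_vocab)
</code></pre>
<p>The combined matrix should have one more column than the vectorizer's vocabulary, with the flag values in the last column. Without the caster, the second branch would hand back a flat 1-D sequence, which cannot be stacked next to the text features.</p>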