<p>You can do this with the open-source package Feature-engine:</p>
<pre><code>import pandas as pd
from sklearn.model_selection import train_test_split
# import path used by older Feature-engine releases; newer ones expose
# the same encoder as feature_engine.encoding.OneHotEncoder
from feature_engine.categorical_encoders import OneHotCategoricalEncoder

# load the Titanic data from OpenML
data = pd.read_csv('https://www.openml.org/data/get_csv/16826755/phpMYEkMl')

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    data[['sex', 'embarked']],  # predictors for this example
    data['survived'],           # target
    test_size=0.3,              # proportion of observations in the test set
    random_state=0)             # seed to ensure reproducibility

# one-hot encode both variables, dropping the last category of each
ohe_enc = OneHotCategoricalEncoder(
    top_categories=None,
    variables=['sex', 'embarked'],
    drop_last=True)

# learn the categories from the train set only, then encode both sets
ohe_enc.fit(X_train)
X_train = ohe_enc.transform(X_train)
X_test = ohe_enc.transform(X_test)

X_train.head()
</code></pre>
<p>You should see this output:</p>
<pre><code>      sex_female  embarked_S  embarked_C  embarked_Q
501            1           1           0           0
588            1           1           0           0
402            1           0           1           0
1193           0           0           0           1
686            1           0           0           1
</code></pre>
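<p>If it helps to see the fit/transform idea without the library, here is a minimal pandas-only sketch. It is not Feature-engine's implementation: the helper names <code>fit_one_hot</code> and <code>transform_one_hot</code> and the toy data are made up for illustration, and I am assuming categories are taken in order of first appearance in the train set, with the last one dropped.</p>

```python
import pandas as pd


def fit_one_hot(train, variables, drop_last=True):
    """Learn the categories to encode for each variable, from the train set only."""
    mapping = {}
    for var in variables:
        # assumption: categories are kept in order of first appearance
        categories = list(train[var].unique())
        if drop_last:
            # keep k-1 categories so the dummies are not redundant
            categories = categories[:-1]
        mapping[var] = categories
    return mapping


def transform_one_hot(df, mapping):
    """Replace each variable with one 0/1 column per learned category."""
    out = df.copy()
    for var, categories in mapping.items():
        for cat in categories:
            out[f"{var}_{cat}"] = (out[var] == cat).astype(int)
        out = out.drop(columns=var)
    return out


# hypothetical toy data, not the Titanic set
train = pd.DataFrame({"sex": ["female", "male", "female"],
                      "embarked": ["S", "C", "Q"]})
test = pd.DataFrame({"sex": ["male", "female"],
                     "embarked": ["Q", "X"]})

mapping = fit_one_hot(train, ["sex", "embarked"])
train_enc = transform_one_hot(train, mapping)
test_enc = transform_one_hot(test, mapping)
print(test_enc)
```

<p>Because the categories are learned from the train set, a value that only appears in the test set (<code>"X"</code> here) just gets zeros in every dummy column, and both sets end up with the same columns.</p>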
<p>For more details about Feature-engine, see:</p>
<p><a href="https://www.trainindata.com/feature-engine" rel="nofollow noreferrer">https://www.trainindata.com/feature-engine</a></p>
<p><a href="https://github.com/solegalli/feature_engine" rel="nofollow noreferrer">https://github.com/solegalli/feature_engine</a></p>
<p><a href="https://feature-engine.readthedocs.io/en/latest/" rel="nofollow noreferrer">https://feature-engine.readthedocs.io/en/latest/</a></p>