<p>In this case you can use the pandas <code>category</code> dtype to map the strings to integer indices (see <a href="https://pandas.pydata.org/pandas-docs/stable/categorical.html" rel="nofollow noreferrer">categorical data</a>). So there is no need to use <a href="https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html" rel="nofollow noreferrer">LabelEncoder</a> or <a href="https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html" rel="nofollow noreferrer">OneHotEncoder</a> from <a href="https://scikit-learn.org/stable/" rel="nofollow noreferrer">scikit-learn</a>.</p>
<pre><code>import pandas as pd

df = pd.read_csv('54055554.csv', header=0, dtype={
    'type': 'category',  # read this column as a pandas categorical
    'amount': float,
    'nameOrig': str,
    'oldbalanceOrg': float,
    'newbalanceOrig': float,
    'nameDest': str,
    'oldbalanceDest': float,
    'newbalanceDest': float,
    'isFraud': bool,
    'isFlaggedFraud': bool
})

# Mapping from integer code to category label
print(dict(enumerate(df['type'].cat.categories)))
# {0: 'PAYMENT', 1: 'TRANSFER'}

# Integer code of each row
print(list(df['type'].cat.codes))
# [0, 0, 1]
</code></pre>
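<p>The CSV used here has a header row and three data rows; their <code>type</code> values are <code>PAYMENT</code>, <code>PAYMENT</code> and <code>TRANSFER</code>, which is where the codes <code>[0, 0, 1]</code> come from.</p>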
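<p>If the data is already loaded, or you want to try this without the CSV file, the same mapping can be obtained with <code>astype('category')</code>. This is only a minimal sketch: the three-row DataFrame and the <code>type_code</code> column name are made up for illustration and are not part of the original data.</p>

```python
import pandas as pd

# Hypothetical in-memory stand-in for the CSV's "type" column.
df = pd.DataFrame({'type': ['PAYMENT', 'PAYMENT', 'TRANSFER']})

# Convert the string column to the categorical dtype after loading.
df['type'] = df['type'].astype('category')

# Mapping from integer code to category label.
code_to_label = dict(enumerate(df['type'].cat.categories))

# Integer code for each row, stored in a new (hypothetical) column.
df['type_code'] = df['type'].cat.codes

print(code_to_label)
# {0: 'PAYMENT', 1: 'TRANSFER'}
print(df['type_code'].tolist())
# [0, 0, 1]
```

<p>Categories inferred this way are sorted, so <code>PAYMENT</code> gets code 0 and <code>TRANSFER</code> gets code 1, matching the output from <code>read_csv</code> above.</p>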