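I have a data frame with two columns: one contains strings and the other contains integers. As expected, the data type of the integer column is int64. For the string column, however, it is object.
Now I want to convert the string column into an integer column by assigning a given integer to each string. I do this as follows: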
from pandas import DataFrame
# Create a data frame with two columns:
# - `catCol' represents categorical data and consists of strings
# - `intCol' represents numerical data and consists of integers
myList = {'catCol': ['NM', 'VT', 'VA', 'NY', 'VA'], 'intCol': [3, 6, 10, -1, 0]}
df = DataFrame(myList)
print('Before the mapping:')
print(df)
print('Data type of `catCol`:', df['catCol'].dtype)
print('Data type of a `catCol` element:', type(df['catCol'][3]))
print('Data type of `intCol`:', df['intCol'].dtype)
print('Data type of a `intCol` elements:', type(df['intCol'][3]))
# Replace the categorical columns with unique integers IDs.
fromList = df['catCol'].unique()
toList = list(range(len(fromList)))
for idx in range(len(fromList)):
    df.loc[df['catCol'] == fromList[idx], 'catCol'] = toList[idx]
print()
print('After the mapping:')
print(df)
print('Data type of `catCol`:', df['catCol'].dtype)
print('Data type of a `catCol` element:', type(df['catCol'][3]))
print('Data type of `intCol`:', df['intCol'].dtype)
print('Data type of a `intCol` elements:', type(df['intCol'][3]))
The output is:
Before the mapping:
  catCol  intCol
0     NM       3
1     VT       6
2     VA      10
3     NY      -1
4     VA       0
Data type of `catCol`: object
Data type of a `catCol` element: <class 'str'>
Data type of `intCol`: int64
Data type of a `intCol` elements: <class 'numpy.int64'>
After the mapping:
  catCol  intCol
0      0       3
1      1       6
2      2      10
3      3      -1
4      2       0
Data type of `catCol`: object
Data type of a `catCol` element: <class 'int'>
Data type of `intCol`: int64
Data type of a `intCol` elements: <class 'numpy.int64'>
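Here is the question: if catCol now contains only integers after the conversion, why is its data type still object? I need it to be an integer data type, just like intCol. How can I fix this without using any casts?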
In this case, I would use the map() function.
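A minimal sketch of what a map()-based version might look like, reusing the data from the question. The dictionary name `mapping` is my own choice for illustration. The likely reason the loop in the question keeps the object dtype is that `df.loc[...] = ...` writes values into the existing object column in place, whereas `map()` builds a new Series whose dtype pandas infers from the mapped values.

```python
import pandas as pd

df = pd.DataFrame({
    'catCol': ['NM', 'VT', 'VA', 'NY', 'VA'],
    'intCol': [3, 6, 10, -1, 0],
})

# Build a string -> integer lookup, numbering labels in order of first appearance.
# `mapping` is a hypothetical name used only for this sketch.
mapping = {label: idx for idx, label in enumerate(df['catCol'].unique())}

# map() returns a new Series; replacing the column with it lets pandas
# pick an integer dtype instead of keeping the old object dtype.
df['catCol'] = df['catCol'].map(mapping)

print(df)
print('Data type of `catCol`:', df['catCol'].dtype)
```

Note that any label missing from the dictionary would become NaN, which would turn the column into a float dtype, so this assumes every value in catCol appears in `mapping`. If the exact integer assigned to each string does not matter, `pandas.factorize` produces equivalent integer codes in one call.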