How do I read index data as strings with pandas.read_csv()?

2 Answers

User

Answer #1 · edited 2024-05-10 19:39:43

Pass the dtype parameter to specify the data type:

In [159]:
import pandas as pd
import io
t="""uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""
df = pd.read_csv(io.StringIO(t), dtype={'uid':str})
df.set_index('uid', inplace=True)
df.index

Out[159]:
Index(['01', '02', '03'], dtype='object', name='uid')

So in your case, the following should work:

df = pd.read_csv('sample.csv', dtype={'uid':str})
df.set_index('uid', inplace=True)

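The one-line equivalent does not work, because of a pandas bug that was still open when this answer was written: the dtype parameter is ignored for columns that are treated as the index: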

df = pd.read_csv('sample.csv', dtype={'uid':str}, index_col='uid')
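
Whether the one-liner honors dtype may depend on the installed pandas version, so a defensive wrapper can try it first and fall back to the two-step form. The following is a minimal sketch, not part of the original answer; the helper name read_with_string_index and the constant CSV_TEXT are made up for illustration.

```python
import io
import pandas as pd

# Same sample data as in the answer, inlined so the example is self-contained.
CSV_TEXT = """uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""


def read_with_string_index(csv_text, index_name):
    """Hypothetical helper: return a DataFrame whose index holds strings."""
    # Try the one-line form first.
    df = pd.read_csv(io.StringIO(csv_text),
                     dtype={index_name: str},
                     index_col=index_name)
    if all(isinstance(label, str) for label in df.index):
        return df
    # dtype was ignored for the index column, so use the two-step form.
    df = pd.read_csv(io.StringIO(csv_text), dtype={index_name: str})
    return df.set_index(index_name)


df = read_with_string_index(CSV_TEXT, "uid")
print(df.index.tolist())  # ['01', '02', '03']
```

Either branch ends with the leading zeros preserved, so the caller does not need to know which pandas behavior applied.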

If you assume that the first column is the index column, you can do this dynamically:

In [171]:
t="""uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""
cols = pd.read_csv(io.StringIO(t), nrows=1).columns.tolist()
index_col_name = cols[0]
dtypes = dict(zip(cols[1:], [float]* len(cols[1:])))
dtypes[index_col_name] = str
df = pd.read_csv(io.StringIO(t), dtype=dtypes)
df.set_index(index_col_name, inplace=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 3 entries, 01 to 03
Data columns (total 3 columns):
f1    3 non-null float64
f2    3 non-null float64
f3    3 non-null float64
dtypes: float64(3)
memory usage: 96.0+ bytes

In [172]:
df.index

Out[172]:
Index(['01', '02', '03'], dtype='object', name='uid')

Here we read only the header and the first data row (nrows=1) to get the column names:

cols = pd.read_csv(io.StringIO(t), nrows=1).columns.tolist()

Then we build a dict that maps each column name to the desired data type:

index_col_name = cols[0]
dtypes = dict(zip(cols[1:], [float]* len(cols[1:])))
dtypes[index_col_name] = str

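We take the index name, assuming it is the first entry, then build a dict from the remaining columns with float as the desired dtype, and add the index column with type str. The resulting dict can then be passed to read_csv as the dtype parameter.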
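
The steps described here can also be collected into one function. This is a sketch under the same assumption as the answer (the first column is the index and all other columns are floats); the function name read_first_column_as_string_index and the constant CSV_TEXT are hypothetical. It uses nrows=0 so that only the header is parsed, and a dict comprehension in place of dict(zip(...)).

```python
import io
import pandas as pd

# Same sample data as in the answer, inlined so the example is self-contained.
CSV_TEXT = """uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""


def read_first_column_as_string_index(csv_text):
    """Hypothetical helper: first column becomes a str index, the rest float."""
    # nrows=0 parses the header only, which is enough to get the column names.
    cols = pd.read_csv(io.StringIO(csv_text), nrows=0).columns.tolist()
    index_name, value_cols = cols[0], cols[1:]
    # Assumption: every non-index column should be read as float.
    dtypes = {name: float for name in value_cols}
    dtypes[index_name] = str
    df = pd.read_csv(io.StringIO(csv_text), dtype=dtypes)
    return df.set_index(index_name)


df = read_first_column_as_string_index(CSV_TEXT)
print(df.index.tolist())  # ['01', '02', '03']
print(df.dtypes)
```

Returning df.set_index(...) instead of using inplace=True keeps the helper free of side effects on intermediate objects.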

User

Answer #2 · edited 2024-05-10 19:39:43

If the result is not a string, you have to convert it to a string. Try:

result = [str(i) for i in result]

Or, in this case:

print([str(i) for i in df.index.values])
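
One caveat with converting after the fact: if the index was already parsed as integers, the leading zeros are gone before str() is applied. The short sketch below illustrates this with the sample data from the first answer; the names CSV_TEXT, as_text and padded are made up, and the zfill(2) step assumes every uid has a known fixed width of 2.

```python
import io
import pandas as pd

CSV_TEXT = """uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""

# Without dtype, uid is inferred as an integer column.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col="uid")

# Converting afterwards gives strings, but without the leading zeros.
as_text = df.index.astype(str)
print(as_text.tolist())  # ['1', '2', '3']

# Assumption: all ids are exactly 2 characters wide, so they can be re-padded.
padded = as_text.str.zfill(2)
print(padded.tolist())  # ['01', '02', '03']
```

When the original width is not fixed, the padding cannot be recovered, which is why reading the column as str in the first place is usually the safer route.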
