How do I fix data being loaded into a single column of a DataFrame?

import pandas as pd

file_path = 'https://archive.ics.uci.edu/ml/machine-learning-databases/voting-records/house-votes-84.data'
dataset2 = pd.read_csv(file_path, header=None, dtype=str)

v = dataset2.values
f = pd.factorize(v.ravel())[0].reshape(v.shape)
dataset1 = pd.DataFrame(f)
df = dataset1.astype('str')
dataset = df.values.tolist()

print(type(dataset))
print(type(dataset[1]))
print(type(dataset[1][1]))

1 Answer

User

#1 · Posted on 2024-05-20 17:10:26

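You need to understand the data you are working with. A quick print call on the loaded frame would have shown you that this file uses a different delimiter: it is separated by whitespace, not commas, so the default comma separator puts each whole line into one column.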
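As a rough illustration of that check, the sketch below prints the raw lines with `repr` so spaces and tabs are visible, then compares the shapes produced by the default separator and by a whitespace separator. The sample rows and the `peek` helper are made up for the example and are not taken from the actual file.

```python
import io
import pandas as pd

# Hypothetical sample that mimics a whitespace-delimited data file.
raw = "0 0 0 -3.639 0.418\n0 0 0 -3.327 0.496\n"

def peek(text_stream, n=2):
    """Return the first n raw lines, repr-quoted so whitespace is visible."""
    return [repr(line) for _, line in zip(range(n), text_stream)]

for line in peek(io.StringIO(raw)):
    print(line)

# Default separator is a comma: each line ends up in a single column.
wrong = pd.read_csv(io.StringIO(raw), header=None)
# Splitting on runs of whitespace recovers the real columns.
right = pd.read_csv(io.StringIO(raw), header=None, sep=r'\s+')

print(wrong.shape)  # (2, 1)
print(right.shape)  # (2, 5)
```

A frame with exactly one column is usually the first sign that the separator is wrong.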

Also, the data appears to be numeric, so you no longer need the str conversion.

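import pandas as pd

file_path = 'https://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/vowel/vowel-context.data'

# Split on runs of whitespace; sep=r'\s+' is equivalent to
# delim_whitespace=True, which is deprecated in recent pandas releases.
t = pd.read_csv(file_path, header=None, sep=r'\s+')
v = t.values
# Map every distinct value to an integer code, keeping the 2-D shape.
f = pd.factorize(v.ravel())[0].reshape(v.shape)

df = pd.DataFrame(f)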
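To make the factorize step concrete, here is a minimal sketch on a tiny made-up array (the values are illustrative, not from the dataset). It shows that codes are assigned in order of first appearance across the flattened array and then reshaped back.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 array standing in for the loaded values.
v = np.array([['y', 'n'], ['n', '?']], dtype=object)

# Flatten, encode each distinct value as an integer, then restore the shape.
codes, uniques = pd.factorize(v.ravel())
f = codes.reshape(v.shape)

print(f.tolist())     # [[0, 1], [1, 2]]
print(list(uniques))  # ['y', 'n', '?']
```

Because the whole array is factorized at once, the same value gets the same code in every column.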

If you want pandas to guess the delimiter, you can pass sep=None:

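# Delimiter sniffing is only supported by the Python engine; naming it
# explicitly avoids the fallback ParserWarning.
t = pd.read_csv(file_path, header=None, sep=None, engine='python')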

I don't recommend this, because pandas can easily get it wrong when loading data with an inferred delimiter.
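One possible alternative to inference is to pass the separator explicitly and fail loudly if the column count is not what you expect. The sketch below assumes you know the expected number of columns in advance; the helper name `read_checked`, its `expected_cols` parameter, and the sample data are hypothetical.

```python
import io
import pandas as pd

def read_checked(source, sep, expected_cols):
    """Read with an explicit separator and verify the column count."""
    df = pd.read_csv(source, header=None, sep=sep)
    if df.shape[1] != expected_cols:
        raise ValueError(
            f"expected {expected_cols} columns, got {df.shape[1]}; "
            f"check the separator {sep!r}"
        )
    return df

# Hypothetical whitespace-delimited sample with three columns.
sample = "1 2 3\n4 5 6\n"

df = read_checked(io.StringIO(sample), sep=r'\s+', expected_cols=3)
print(df.shape)  # (2, 3)
```

With the wrong separator (for example a comma on this sample), the helper raises a ValueError instead of silently returning a one-column frame.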
