Pandas DataFrame: splitting a list of predictions into DataFrame columns

Posted 2024-10-03 09:16:01


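I am making predictions with a scikit-learn model (KDTree).

Initially I only took the top three predictions. Given a fitted model (clf), a list of labels (labels), and test data (test), that was done with the following code: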

from collections import Counter
import pandas as pd
_, neighbors = clf.query(select_columns(test), k=30)
top3 = [Counter([labels[idx] for idx in neighborSet]).most_common(3) for neighborSet in neighbors]
predict = [[x for x, _ in idx] for idx in top3]
preds = pd.DataFrame()
# Fall back to the top label when a row has fewer than three distinct labels
preds['predict1'] = [x[0] for x in predict]
preds['predict2'] = [x[1] if len(x) > 1 else x[0] for x in predict]
preds['predict3'] = [x[2] if len(x) > 2 else x[0] for x in predict]
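The same top-three step can be tried without a fitted model. The sketch below is only an illustration on made-up data: `labels` and `neighbors` are hypothetical stand-ins for the label list and for the index array that `clf.query` returns, and the values are chosen so that the label counts have no ties.

```python
from collections import Counter

import pandas as pd

# Hypothetical toy data (not from the question): labels[i] is the label of
# training row i, and each row of `neighbors` lists the indices of the k=6
# nearest training rows for one test row.
labels = ["a", "a", "a", "b", "b", "c", "b", "b", "b", "b"]
neighbors = [
    [0, 1, 2, 3, 4, 5],  # labels a, a, a, b, b, c
    [3, 4, 6, 7, 8, 9],  # labels b only
]

# Up to three (label, count) pairs per test row, most frequent first.
top3 = [Counter(labels[idx] for idx in row).most_common(3) for row in neighbors]
# Keep only the labels and drop the counts.
predict = [[label for label, _count in row] for row in top3]

# Fall back to the top label when a row has fewer than three distinct labels.
preds = pd.DataFrame({
    "predict1": [x[0] for x in predict],
    "predict2": [x[1] if len(x) > 1 else x[0] for x in predict],
    "predict3": [x[2] if len(x) > 2 else x[0] for x in predict],
})
print(preds)
```

The second row shows why the fallback is needed: all of its neighbors share one label, so `most_common(3)` returns a single pair and the top label is repeated in the other two columns.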

I then tried to generalize this to n predictions.

print len(predict)
465
print predict[0:2]
[[111111111, 22222, 33333, 44444, 55555, ...],[123,233,466,557,886, ...]]

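What is the Pythonic way to split n predictions into n columns without naming each prediction column explicitly?

Ideally, I would write something like the following (the last line is not valid Python; it only shows the intent):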

_,neighbors = clf.query(select_columns(test),k=30)
topn = [Counter([labels[idx] for idx in neighborSet]).most_common(n) for neighborSet in neighbors]
predict = [[x for x,_ in  idx] for idx in topn]
preds = pd.DataFrame()
preds['predict'+str(i)] for i in range(n) = [x[i] if len(x) > i else x[0] for x in predict]

I am looking for an efficient way to split a list of items into separate DataFrame columns.


Edit: I was able to handle this with a dict comprehension and pd.DataFrame.from_dict:

_,neighbors = clf.query(select_columns(test),k=30)
topn = [Counter([labels[idx] for idx in neighborSet]).most_common(n) for neighborSet in neighbors]
predict = [[x for x,_ in  idx] for idx in topn]
preds = [{'predict'+str(i+1):x[i] if len(x) > i else x[0] for i in range(n)} for x in predict]
preds = pd.DataFrame.from_dict(preds)

Is this the right approach, or is there a better one?
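One possible alternative, offered as a sketch rather than a definitive answer: pad every row of `predict` to length n with its first element, then pass the list of lists straight to the `pd.DataFrame` constructor with generated column names. The helper name `predictions_to_frame` and the sample data are hypothetical, and the sketch assumes every row holds at least one prediction, which is the case when each neighbor set is non-empty.

```python
import pandas as pd


def predictions_to_frame(predict, n):
    """Build an n-column frame from ragged rows of predictions.

    Hypothetical helper: rows shorter than n are padded with their first
    element, mirroring `x[i] if len(x) > i else x[0]` from the question.
    Assumes every row is non-empty.
    """
    # [x[0]] * k is an empty list when k <= 0, so full rows are unchanged.
    rows = [list(x[:n]) + [x[0]] * (n - len(x)) for x in predict]
    columns = [f"predict{i + 1}" for i in range(n)]
    return pd.DataFrame(rows, columns=columns)


# Made-up sample input with rows of different lengths.
predict = [[111, 222, 333], [123], [7, 8]]
preds = predictions_to_frame(predict, n=3)
print(preds)
```

Compared with building one dict per row, this sets the column order explicitly and keeps the padding rule in one place; whether it is noticeably faster on 465 rows is not something this sketch measures.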

